Applied Analytics - New York Accident Mitigation
Analytics as a Hobby
As part of my day-to-day work, I leverage a suite of analytic tools such as SQL (Teradata/BigQuery), Tableau, Splunk, Python, and Spark. My work centers on Product Analytics, Business Strategy, and Process Optimization within Media & Technology.
I enjoy my career, but I still find it important to stay relevant in the marketplace by applying my skills in unfamiliar areas. Practicing use-cases with public data sets is a perfect way to improve technical skills & strategic problem solving.
Taking on New York City Traffic
Having lived in New York City, I saw firsthand how dense traffic could be. I was baffled by the number of construction sites, the diversity of people passing through at all hours of the day, and the sheer number of cars on the road. Driving through the city was, quite simply, intimidating. That being said, I often pondered what contributed the most to accidents in New York, one of the biggest cities in the world. If I had to, how would I propose solutions to City Council?
I recently discovered public data sets offered by Google Cloud to answer these common questions. These sets provide a great opportunity to exercise necessary business skills. I aimed to answer my traffic question by leveraging new_york_mv_collisions from BigQuery and visualizing the data through Tableau.
Key Insights
Accidents are Highest During Business Hours
Public Transportation & Human Error are the Primary Contributors to Accidents
Accident Rate is Correlated to Per Capita Income
Improvements to City Planning can Reduce the Majority of Accidents
City Council Should Allocate Budget Towards Adding Designated Taxi/Bus/Bike lanes to Manhattan Avenues
Accidents are Highest During Business Hours
The number of daily car accidents fluctuates from winter to summer as a result of external driving conditions. Looking beyond the oscillations, accidents increase only 1% Y/Y on average. Cycling- and pedestrian-related accidents are also seasonal, but their cycle is out of phase with car accidents because preferred modes of transportation shift with the seasons. Pulling these trends together, accident frequency appears to track a combination of the number of vehicles on the road and the prevalence of external hazards.
This supposition is strengthened when we dive into weekly and hourly averages. Accidents are highest during business hours. This runs counter to the expectation that accidents would be highest at night, when driving ability is compromised by fatigue, human error, and environmental factors. We see some lift around 3am that supports that notion, but not to the extent we would expect.
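As a rough illustration of the hourly breakdown described above, here is a minimal Python sketch. It is not the Tableau workflow used for this analysis. The sample timestamps are made up, and the definition of "business hours" (weekdays, 8am to 6pm) is my own assumption for the example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical collision timestamps for illustration only (not real data).
SAMPLE_TIMESTAMPS = [
    "2017-03-06 08:15:00",  # Monday morning
    "2017-03-06 08:40:00",  # Monday morning
    "2017-03-06 17:05:00",  # Monday evening rush
    "2017-03-07 03:10:00",  # Tuesday, middle of the night
    "2017-03-11 14:30:00",  # Saturday afternoon
]

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def hourly_counts(timestamps):
    """Return a dict mapping each hour of day (0-23) to a collision count."""
    counts = Counter(
        datetime.strptime(ts, TIMESTAMP_FORMAT).hour for ts in timestamps
    )
    return {hour: counts.get(hour, 0) for hour in range(24)}


def business_hours_share(timestamps, start_hour=8, end_hour=18):
    """Fraction of collisions on weekdays with start_hour <= hour < end_hour.

    The 8-to-18 weekday window is an assumed definition of business hours.
    """
    if not timestamps:
        return 0.0
    in_window = 0
    for ts in timestamps:
        moment = datetime.strptime(ts, TIMESTAMP_FORMAT)
        is_weekday = moment.weekday() < 5  # Monday=0 ... Friday=4
        if is_weekday and start_hour <= moment.hour < end_hour:
            in_window += 1
    return in_window / len(timestamps)
```

Running `hourly_counts` over the full collision table gives the hour-of-day profile, and `business_hours_share` condenses it into a single number that can be compared against the overnight share.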
What Other Factors Contributed to Accidents?
The vast majority of accidents involved Public Transportation. Of these, roughly 1 in 5 accidents resulted in injuries and/or fatalities. Contrary to popular belief, far fewer accidents resulted from Human Error (30% fewer than Public Transportation). Even fewer accidents resulted from aggressive driving/DUIs (58% fewer than Public Transportation).
Subdividing Accidents into Preventative Solutions
Logic for Consolidating Primary Reason for Collision:
Accidents included as many as 5 causes. After consolidating causes into distinct categories, I ranked and assigned a primary reason based on inherent risk & degree of prevalence.
Logic for Consolidating Primary Vehicle for Collision:
Accidents included as many as 5 vehicles. After consolidating vehicle types into categories, I ranked and assigned a primary vehicle based on inherent risk & degree of prevalence.
Logic for Aggregating Collision Variables into Solutions:
After isolating primary vehicle and primary cause for each accident, I subdivided accidents into solution categories: Law Enforcement, City Planning, Training, and Other. The derived solutions enable us to better understand how city budget should be divided.
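To make the consolidation logic above more concrete, here is a minimal Python sketch of the "rank and pick a primary cause, then map it to a solution" step. The factor names, category groupings, rank order, and category-to-solution mapping below are all hypothetical stand-ins; the actual groupings used in this analysis are not reproduced here.

```python
# Hypothetical mapping from raw contributing factors to consolidated categories.
FACTOR_CATEGORY = {
    "Alcohol Involvement": "Aggressive Driving/DUI",
    "Unsafe Speed": "Aggressive Driving/DUI",
    "Pavement Slippery": "Infrastructure",
    "Driver Inattention/Distraction": "Human Error",
    "Unspecified": "Other",
}

# Hypothetical priority order (lower number wins), standing in for a ranking
# based on inherent risk and prevalence.
CATEGORY_RANK = {
    "Aggressive Driving/DUI": 1,
    "Infrastructure": 2,
    "Human Error": 3,
    "Other": 4,
}

# Hypothetical mapping from consolidated category to solution bucket.
CATEGORY_SOLUTION = {
    "Aggressive Driving/DUI": "Law Enforcement",
    "Infrastructure": "City Planning",
    "Human Error": "Training",
    "Other": "Other",
}


def primary_category(factors):
    """Pick the highest-priority category among up to five listed factors.

    Missing values (None or empty strings) are skipped, and unknown factors
    fall back to the "Other" category.
    """
    categories = [FACTOR_CATEGORY.get(f, "Other") for f in factors if f]
    if not categories:
        return "Other"
    return min(categories, key=lambda category: CATEGORY_RANK[category])


def solution_for(factors):
    """Map an accident's listed factors to a single solution bucket."""
    return CATEGORY_SOLUTION[primary_category(factors)]
```

For example, `solution_for(["Unspecified", "Unsafe Speed", None])` resolves to the Law Enforcement bucket under these assumed rankings, because the speeding factor outranks the unspecified one. The same pattern applies to picking a primary vehicle.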
What Solutions Should the City Focus On?
Once we categorize accidents by city planning, law enforcement, and training solutions, we are able to gain insight into what solutions should be prioritized.
Across all the boroughs, deficiencies in city planning contribute to roughly 50% of all accidents. In contrast, law enforcement issues (i.e. aggressive driving/DUI) contribute to only ~15%. Considering that NYPD already has ~40K active officers, supplementing law enforcement with additional manpower would be costly. Moreover, predicting the return on investment involves multiple variables and cannot be guaranteed.
Improvements to city planning require updates to infrastructure and would yield sustainable benefits. Manhattan & Brooklyn would benefit the most from such improvements, but the outstanding question is where improvements should be concentrated.
Solutions by Borough
The average income per zip code strongly correlates with the number of accidents in Brooklyn & Queens. Budget limitations in low-income neighborhoods result in deteriorating roads & fewer resources for traffic control.
Accidents in Manhattan are less correlated with income, largely due to the high influx of tourists & commuters that pass through. Accidents are concentrated along Avenues 1-9.
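The income relationship described above can be checked with a Pearson correlation coefficient. Below is a minimal Python sketch using only the standard library. The income and accident figures are invented for illustration, and computing the coefficient per zip code is my assumption about how such a check might be set up, not a record of the original Tableau calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical per-zip-code figures (not real data): average income in
# dollars and accident counts, constructed to fall on a straight line.
SAMPLE_INCOME = [30000, 40000, 50000, 60000]
SAMPLE_ACCIDENTS = [400, 300, 200, 100]

r = pearson_r(SAMPLE_INCOME, SAMPLE_ACCIDENTS)
```

A value of `r` near -1 would indicate that accidents fall as income rises, a value near 0 would indicate little linear relationship (as I would expect for Manhattan), and correlation alone says nothing about causation.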
Infrastructure & per capita income are contributing factors, but we require additional analysis to prioritize how to allocate budget.
What solutions are the most feasible and realistic?
Which problems are part of larger systemic issues?
The Bottom Line
Focus on adding Taxi/Bus/Bike lanes to Manhattan Avenues to reduce the number of accidents in NYC.
By filtering for the 15 streets in NYC with the highest number of hazardous accidents (those involving injuries and/or fatalities), we observe that the majority of accidents result from weaknesses in Public Transportation, Infrastructure, and Cyclist accommodation, and that ~50% of all accidents are specific to Manhattan. Other boroughs consist of denser networks of roads, so accidents are more broadly distributed and harder to address through a single project; multiple projects would need to be implemented across high-risk zones.
Our goal is to make the most of the budget and to reduce collisions in New York. To achieve this, the city should target the streets most responsible for collisions and redesign infrastructure around the most common factors. Adding dedicated lanes for public transportation and cyclists along Avenues 1-9 would do just that.
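The street filter described above can be sketched in a few lines of Python. The record layout (`street`, `injured`, `killed` keys) and the sample rows are hypothetical and only meant to show the shape of the calculation: keep accidents with at least one injury or fatality, count them per street, and take the top N.

```python
from collections import Counter

# Hypothetical accident records for illustration only (not real data).
SAMPLE_RECORDS = [
    {"street": "2 AVENUE", "injured": 1, "killed": 0},
    {"street": "2 AVENUE", "injured": 0, "killed": 0},
    {"street": "2 AVENUE", "injured": 2, "killed": 0},
    {"street": "ATLANTIC AVENUE", "injured": 1, "killed": 0},
    {"street": "3 AVENUE", "injured": 0, "killed": 0},
]


def top_hazardous_streets(records, n=15):
    """Return the n streets with the most hazardous accidents.

    An accident counts as hazardous when it has at least one injury or
    fatality. Records without a street name are skipped.
    """
    hazardous = Counter(
        record["street"]
        for record in records
        if record.get("street")
        and (record.get("injured", 0) + record.get("killed", 0)) > 0
    )
    return hazardous.most_common(n)


top_two = top_hazardous_streets(SAMPLE_RECORDS, n=2)
```

With the sample rows, `top_two` lists the street with two hazardous accidents first and the street with one second; the street with no injuries or fatalities is excluded.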