This week I analyzed a HUGE dataset provided by Maven Analytics, which contained records of 28 million Green Taxi trips in New York City. It was perfect for a data wrangling and visualization task. I used my favorite BI tool, Power BI, to complete the entire project.
The brief: build a dashboard to assist the Lead Dispatcher with weekly planning and logistics, based on Green Taxi trips in New York City (2017-2020).
The dashboard should answer the following questions:
- What’s the average number of trips we can expect this week?
- What’s the average fare per trip we expect to collect?
- What’s the average distance traveled per trip?
- How do we expect trip volume to change, relative to last week?
- Which days of the week and times of the day will be busiest?
- What will likely be the most popular pick-up and drop-off locations?
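The brief doesn't say how "expected" values should be computed. One naive reading is to use historical averages, and here is a minimal Python sketch under that assumption. The function name `weekly_summary` and the `(pickup, fare, distance)` tuple layout are hypothetical, and the week-over-week figure simply compares the two most recent ISO weeks present in the data.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def weekly_summary(trips):
    """Summarize trips given as (pickup: datetime, fare: float, distance: float).

    Assumption: historical averages stand in for "expected" values.
    """
    trips = list(trips)
    # Count trips per (ISO year, ISO week)
    per_week = Counter(tuple(p.isocalendar()[:2]) for p, _, _ in trips)
    weeks = sorted(per_week)
    last = per_week[weeks[-1]]
    prev = per_week[weeks[-2]] if len(weeks) > 1 else None
    by_day = Counter(DAYS[p.weekday()] for p, _, _ in trips)
    by_hour = Counter(p.hour for p, _, _ in trips)
    return {
        "avg_trips_per_week": mean(list(per_week.values())),
        "avg_fare": mean([f for _, f, _ in trips]),
        "avg_distance": mean([d for _, _, d in trips]),
        # Relative change of the latest week versus the one before it
        "wow_change": None if not prev else (last - prev) / prev,
        "busiest_day": by_day.most_common(1)[0][0],
        "busiest_hour": by_hour.most_common(1)[0][0],
    }
```

The most popular pick-up and drop-off locations could be found the same way, with one more `Counter` keyed on the zone.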
The provided dataset contained records of 28 million Green Taxi trips in New York City for four fiscal years, 2017 to 2020. The raw data had some issues, so it required the following adjustments and assumptions to clean and prepare it for further analysis.
- Included trips that were NOT sent via “store and forward”
- Included street-hailed trips paid by card or cash, with a standard rate
- Removed any trips with dates before 2017 or after 2020, along with any trips with pickups or drop-offs in unknown zones
- Assumed any trips with no recorded passengers had 1 passenger
- Swapped the pickup and drop-off date/times if the pickup date/time was AFTER the drop-off date/time
- Removed trips lasting longer than a day, and any trips which show both a distance and a fare amount of 0
- Changed any records where the fare, taxes, and surcharges were ALL negative to positive values
- Calculated the distance for any trips that had a fare amount but a trip distance of 0, using this formula: (Fare amount – 2.5) / 2.5
- Calculated the fare for any trips that had a trip distance but a fare amount of 0, using this formula: 2.5 + (trip distance x 2.5)
The data cleaning and preparation was completed using Power Query.
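The project itself did this in Power Query, but the rules are easier to check when written as code. Below is a minimal Python sketch of the same steps, applied to one trip at a time. It is not the Power Query implementation: the `Trip` fields, the numeric codes (1 = street-hail, 1/2 = card/cash, 1 = standard rate), and the unknown-zone IDs 264 and 265 are assumptions loosely based on the public TLC data dictionary, and the date filter is assumed to apply to both pickup and drop-off.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import Optional

BASE_FARE = 2.5             # flat charge from the formulas above
PER_DISTANCE = 2.5          # per-distance rate from the formulas above
UNKNOWN_ZONES = {264, 265}  # assumption: IDs of the "unknown" zones


@dataclass
class Trip:
    # Hypothetical field names; codes are assumptions, see lead-in
    store_and_fwd: str          # "Y" or "N"
    trip_type: int              # 1 = street-hail
    payment_type: int           # 1 = card, 2 = cash
    rate_code: int              # 1 = standard rate
    pickup: datetime
    dropoff: datetime
    pu_zone: int
    do_zone: int
    passengers: Optional[int]
    distance: float
    fare: float
    tax: float
    surcharge: float


def clean_trip(t: Trip) -> Optional[Trip]:
    """Return a cleaned copy of the trip, or None if it should be dropped."""
    if t.store_and_fwd != "N":
        return None
    if t.trip_type != 1 or t.payment_type not in (1, 2) or t.rate_code != 1:
        return None
    if not all(2017 <= d.year <= 2020 for d in (t.pickup, t.dropoff)):
        return None
    if t.pu_zone in UNKNOWN_ZONES or t.do_zone in UNKNOWN_ZONES:
        return None

    t = replace(t)  # copy, so the input record is left untouched
    if not t.passengers:              # missing or 0 passengers -> assume 1
        t.passengers = 1
    if t.pickup > t.dropoff:          # swap reversed timestamps
        t.pickup, t.dropoff = t.dropoff, t.pickup
    if t.dropoff - t.pickup > timedelta(days=1):
        return None
    if t.distance == 0 and t.fare == 0:
        return None
    if t.fare < 0 and t.tax < 0 and t.surcharge < 0:
        t.fare, t.tax, t.surcharge = -t.fare, -t.tax, -t.surcharge
    if t.distance == 0:               # fare known, distance missing
        t.distance = (t.fare - BASE_FARE) / PER_DISTANCE
    elif t.fare == 0:                 # distance known, fare missing
        t.fare = BASE_FARE + t.distance * PER_DISTANCE
    return t
```

A dataset would then be cleaned with something like `[c for c in map(clean_trip, trips) if c is not None]`. At 28 million rows a vectorized tool is the realistic choice; the row-level form here is only meant to make each rule explicit.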
Check out an interactive report here: https://bit.ly/3Oj42S1