Week 8 is the last week of the formal curriculum at Galvanize. It covers visualization, web app development with Flask and Bootstrap, and a two-day challenge to build a full end-to-end data pipeline for predicting fraud on EventBrite's dataset.
On to the big-ticket item for the week: a two-day challenge to build a full end-to-end pipeline for predicting fraud based on EventBrite's dataset. The deliverable was a web app dashboard to track and flag fraudulent events in real time (for real-time analysis, an instructor set up a server to ping events every few seconds). We worked in teams of four. The first day consisted mainly of feature engineering and modeling. In building an application like this, we were very aware that our models, vectorizers, and pipeline had to scale and align, which is a difficult task on a team of four. After careful planning and a lot of whiteboarding, we had a working vectorizer and a model that rated each event as a low, medium, or high likelihood of fraud based on the model's probability output.
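As a rough illustration of turning a model's fraud probability into the low, medium, and high bands described above, here is a minimal Python sketch. The cutoffs (0.3 and 0.7) and the names `risk_band`, `LOW_CUTOFF`, and `HIGH_CUTOFF` are assumptions made up for this example; they are not the thresholds our team actually used.

```python
# Hypothetical cutoffs, chosen only for illustration.
LOW_CUTOFF = 0.3   # below this, an event is treated as low risk
HIGH_CUTOFF = 0.7  # at or above this, an event is treated as high risk


def risk_band(fraud_probability: float) -> str:
    """Map a fraud probability in [0, 1] to 'low', 'medium', or 'high'."""
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be between 0 and 1")
    if fraud_probability < LOW_CUTOFF:
        return "low"
    if fraud_probability < HIGH_CUTOFF:
        return "medium"
    return "high"
```

In practice the cutoffs would be tuned against the cost of missing a fraudulent event versus the cost of reviewing a legitimate one, so treat the numbers here as placeholders.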
The second day focused on building a web app and dashboard, along with a backend Postgres database to store risk predictions. We connected to the stream of events from the server using the requests module, then vectorized each event and predicted its risk. Risk calculations and event details were stored in the Postgres database and accessed via a Flask web app dashboard that displayed the events most likely to be fraudulent.
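The score-and-store step can be sketched roughly as follows. To keep the example self-contained, it swaps in Python's built-in `sqlite3` for Postgres and takes events as plain dicts rather than fetching them with requests. The table name, column names, the `event_id` and `name` keys, and the `predict_proba` callable (a stand-in for the vectorizer plus model) are all hypothetical, not what our team actually built.

```python
import sqlite3
from typing import Callable, Iterable


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) predictions table if it does not exist."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS risk_predictions (
               event_id TEXT PRIMARY KEY,
               name TEXT,
               fraud_probability REAL NOT NULL
           )"""
    )


def score_and_store(
    conn: sqlite3.Connection,
    events: Iterable[dict],
    predict_proba: Callable[[dict], float],
) -> int:
    """Score each event and upsert the result; returns the number of rows written."""
    rows = [
        (str(e["event_id"]), e.get("name", ""), float(predict_proba(e)))
        for e in events
    ]
    with conn:  # commits on success, rolls back on error
        # SQLite upsert syntax; Postgres would use INSERT ... ON CONFLICT instead.
        conn.executemany(
            "INSERT OR REPLACE INTO risk_predictions VALUES (?, ?, ?)", rows
        )
    return len(rows)


def flagged_events(conn: sqlite3.Connection, threshold: float = 0.7) -> list:
    """Return events at or above the threshold, riskiest first (what a dashboard would show)."""
    cur = conn.execute(
        "SELECT event_id, name, fraud_probability FROM risk_predictions "
        "WHERE fraud_probability >= ? ORDER BY fraud_probability DESC",
        (threshold,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    demo_events = [
        {"event_id": 1, "name": "Demo A", "toy_score": 0.9},
        {"event_id": 2, "name": "Demo B", "toy_score": 0.2},
    ]
    # The lambda stands in for the real vectorizer and model.
    score_and_store(conn, demo_events, lambda e: e["toy_score"])
    print(flagged_events(conn))
```

A Flask route would then only need to call something like `flagged_events` and render the rows, which keeps the dashboard decoupled from the model code.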
Building this full end-to-end product was a great experience on several fronts:
- Working with a team to build a scalable end-to-end product
- Building a working dashboard that could be used by executives to make decisions
- Attacking all parts of the data pipeline (exploratory analysis, feature engineering, model building, web app development, backend database engineering) to create a single data product
Next week begins three weeks of personal capstone projects. I will write a single post about my project once it is complete. Can't wait!