Matthew B. Burrell

Riverside, CA · (951) 452-6856 · matthew_burrell@outlook.com

I am a performance-driven professional accustomed to the rigors of fast-paced, highly regulated environments that require sharp attention to detail, accuracy, and strong analytical skills. I work to deliver results while partnering with staff to accelerate the achievement of immediate and long-range goals.

I am a highly driven engineering professional with hands-on experience gathering and analyzing data and using the resulting insights to drive complex technology projects. I can create machine learning-based tools and processes, enhance data collection procedures, and process and verify the integrity of data. I use collaborative development methodologies to deliver breakthrough technologies and ensure optimal performance on technology initiatives. I also excel at meeting exacting standards, quality objectives, and timelines. I am skilled in collaborating with all members of an organization to achieve business and technological objectives.

I am consistently recognized by management for outstanding technical, interpersonal, leadership, and communication skills. I have a flexible and stable work ethic that is underscored by a positive team-building attitude. Throughout my career, I have been acknowledged for rational decision making: I weigh the facts and relay the most pertinent information.


Projects

Ames Housing Data: Kaggle Challenge

The Ames Housing Data: Kaggle Challenge was a project to develop a machine learning model to predict housing prices. With accuracy as the main goal, I had a secondary goal of determining which features affect the price. For example, what is the effect of the quality of the house or its square footage? The secondary goal was important because of the business case of determining which features could be improved to increase a home's price. The model will help homeowners and home flippers make better decisions. To achieve my goals, I decided to use an OLS (Ordinary Least Squares) model. OLS is a white box model, which means I can use it to predict home prices, and I can also use the coefficients to gauge which features affect home prices.
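As a minimal sketch of the white box idea (not the project's actual code), the snippet below fits OLS with NumPy on synthetic data. The two features, the "true" coefficients, and the prices are made-up assumptions for illustration only and do not come from the Ames data.

```python
import numpy as np

# Synthetic, made-up data for illustration only (not the Ames data set).
rng = np.random.default_rng(0)
n = 200
quality = rng.integers(1, 11, size=n).astype(float)  # hypothetical 1-10 quality score
sqft = rng.uniform(800, 3500, size=n)                # hypothetical living area in sq ft
noise = rng.normal(0, 5000, size=n)
price = 20000 + 15000 * quality + 60 * sqft + noise  # assumed "true" relationship

# Design matrix with an intercept column, then the least-squares fit.
X = np.column_stack([np.ones(n), quality, sqft])
coef, *_ = np.linalg.lstsq(X, price, rcond=None)
intercept, b_quality, b_sqft = coef

# Each coefficient is the estimated price change per one-unit increase
# in that feature, holding the other feature fixed.
print(f"per quality point: {b_quality:,.0f}")
print(f"per square foot:   {b_sqft:,.2f}")

# The same coefficients also produce predictions for a new home.
new_home = np.array([1.0, 7.0, 2000.0])  # intercept term, quality, sqft
predicted_price = float(new_home @ coef)
print(f"predicted price:   {predicted_price:,.0f}")
```

The fitted coefficients land close to the assumed values used to generate the data, which is what makes them readable as "price per quality point" or "price per square foot".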

Classifying MLB Reddit Posts

The project was inspired by a July 2020 New York Times article about the decline of baseball's popularity. During the Covid-19 pandemic, MLB teams took a big hit from lost game-attendance revenue. As a result, MLB teams need more creative ways to find their fans and relieve the financial pain of the pandemic. The project's goal was to classify Reddit posts about MLB teams. The idea was to train a model to determine whether a post belongs to a certain MLB fan base. The data was collected using Reddit's ... With that knowledge, teams can then market games and merchandise directly to actual fans based on social media posts across different platforms. The model developed was a random forest classifier with an accuracy of 85.89%.

How to Improve SAT and ACT Participation Rates

The project was the first GA DSI bootcamp project. The problem was to determine where the College Board could spend money to improve the SAT participation rate. To solve the business problem, I used Exploratory Data Analysis to identify areas of opportunity to improve SAT and ACT participation rates.


Education

California State Polytechnic University, Pomona

Master of Science
Economics - Econometrics and Quantitative Economics Track
May 2019

California State University, San Bernardino

Bachelor of Arts
Economics - Economics and Mathematical Economics
June 2016

Riverside City College

Associate of Science
Math and Science
December 2013

Skills

Programming Languages & Tools
Statistical Modeling: classification, regression, clustering, feature engineering, neural networks, ARIMA
Statistical Tools: time series, regression models, hypothesis testing, confidence intervals, principal component analysis, dimensionality reduction, A/B testing, data visualization, data mining, data analysis
Software and Programming Languages: Python (Scikit-Learn, NumPy, SciPy, Pandas, TensorFlow, Keras), R, SQL (MySQL, PostgreSQL, Oracle), Hadoop (Hive, MapReduce), Microsoft Excel, LaTeX, HTML, Scala, MATLAB
Cloud Computing: AWS (EC2, S3, Redshift), IBM Cloud
Toolkit: Jupyter Notebook, Tableau, GitHub, RStudio, VS Code, Stata, Microsoft Excel, Git

Workflow
  • Problem Framing: Goals
  • Data Preparation: Acquire Data, Clean Data, Explore Data
  • Analysis: Baseline Modeling, Secondary Modeling
  • Reflection: Comparison, Notes
  • Explore Alternatives
  • Dissemination: DS Product, DS Report, Sharing Experiments

Interests

Apart from being a data scientist, I enjoy gaming, exploring the kitchen (I am no chef, but I can make a mean lobster mac and cheese), going on hikes, and traveling. When forced indoors, I follow a number of sci-fi and fantasy movies and television shows. My favorite sci-fi/fantasy franchise is Star Wars, and my favorite characters are the Mandalorians because of their samurai-esque culture and appeal.