Master Thesis Defense by Noah Thomas Bloss
Title: Predicting the outcome of a pitch in baseball with machine learning
Abstract: This thesis investigates Major League Baseball data in order to predict a pitcher’s future Earned Run Average. The data cover the years 2017 to 2023 and were collected by Statcast. In the search for a more accurate predictor, three different approaches have been studied using machine learning techniques. For each approach, multiple neural networks have been trained with hyperparameters assigned by a grid search algorithm, which systematically evaluates the configurations in a pre-defined hyperparameter space. Each individual pitch is given a predicted pitch score. In each approach, the correlation coefficient between a pitcher’s average pitch score and that pitcher’s subsequent Earned Run Average is determined. For an approach to be described as a better predictor, this correlation has to be higher than the correlation coefficient of Fielding Independent Pitching.
The resulting correlation coefficients for the predictions of all approaches are lower than the coefficients of Fielding Independent Pitching, meaning that the trained neural networks cannot be considered a better predictor of a pitcher’s Earned Run Average.
Supervisor:
- Charles Steinhardt, University of Copenhagen, Niels Bohr Institute
Censor:
- Georgios Magdis, Technical University of Denmark (DTU)