Welcome to the Baseball Machine Learning Workbench

The Baseball Machine Learning Workbench is an interactive web application for exploring analytics, decision intelligence, and Machine Intelligence techniques using historical baseball data.

Prediction breakdown
Variable Response
Prediction Matrix
Available Scenarios:

What-If Analysis - Rules Engine
This scenario showcases how a simple rules engine can be used to attempt to predict baseball Hall of Fame Induction. No Machine Intelligence is used; instead, a single rule is applied:
If sum of career HRs >= 500 then Hall of Fame Induction is True (else Hall of Fame Induction is False).
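
As a minimal sketch of the rule above, assuming a hypothetical SeasonBatting record that holds one season of home runs per player (the record shape and the sample numbers are illustrative and not taken from the application):

```python
from dataclasses import dataclass
from typing import Iterable

HR_THRESHOLD = 500  # threshold stated in the rule above


@dataclass
class SeasonBatting:
    """One season of batting data for a player (hypothetical record shape)."""
    year: int
    home_runs: int


def predict_induction_by_rule(seasons: Iterable[SeasonBatting]) -> bool:
    """Return True when the summed career home runs meet the 500 HR rule."""
    career_hr = sum(season.home_runs for season in seasons)
    return career_hr >= HR_THRESHOLD


# Made-up season lines, for illustration only.
example_career = [SeasonBatting(2001, 40), SeasonBatting(2002, 35)]
print(predict_induction_by_rule(example_career))  # 75 career HRs, so the rule is not met
```

The rule returns only True or False, so a player with 499 career home runs is treated the same as a player with none.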

What-If Analysis - Single Model
This scenario showcases how a Machine Intelligence model can be used to attempt to predict baseball Hall of Fame Induction. The model classifies the batter data.
The key difference from the rules engine approach is that a probability is returned, allowing a decision to be made based on a probability threshold and other statistical metrics.
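
To illustrate how a probability threshold changes the decision, here is a small sketch. The decide function, the 0.62 probability, and both thresholds are assumptions made for this example; they are not outputs or settings of the application's model.

```python
def decide(probability: float, threshold: float = 0.5) -> bool:
    """Turn a model probability into a yes/no decision at a chosen threshold."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability >= threshold


# Hypothetical model output for one player.
induction_probability = 0.62

# The same probability leads to different decisions under different thresholds.
print(decide(induction_probability, threshold=0.5))  # lenient threshold
print(decide(induction_probability, threshold=0.8))  # strict threshold
```

A stricter threshold trades fewer false positives for more missed inductees, which is a choice the rules engine cannot expose.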

What-If Analysis - Multiple Models
This scenario showcases how multiple Machine Intelligence models can be used to attempt to predict appearing on a Hall of Fame Ballot and Hall of Fame Induction. The models classify the batter data.
The multiple models implementation showcases progression to Hall of Fame Induction. First, the player must be considered for the Hall of Fame Ballot, and then for Hall of Fame Induction. This gives the decision maker multiple supporting conclusions, each provided by a machine learning model acting as an expert.
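
One way to read the two model outputs in ballot-then-induction order is sketched below. The StagedPrediction class, the summarize function, the shared 0.5 threshold, and the sample probabilities are all hypothetical; the application may present or combine the two conclusions differently.

```python
from dataclasses import dataclass


@dataclass
class StagedPrediction:
    ballot_probability: float     # output of a hypothetical ballot model
    induction_probability: float  # output of a hypothetical induction model


def summarize(pred: StagedPrediction, threshold: float = 0.5) -> str:
    """Read the two model outputs in ballot-then-induction order."""
    if pred.ballot_probability < threshold:
        return "unlikely to appear on the ballot"
    if pred.induction_probability < threshold:
        return "likely on the ballot, unlikely to be inducted"
    return "likely on the ballot and likely to be inducted"


# Made-up probabilities, for illustration only.
print(summarize(StagedPrediction(ballot_probability=0.91, induction_probability=0.34)))
```

Keeping the two probabilities separate, rather than folding them into one score, lets the decision maker see which stage of the progression is the weak point.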

Machine Learning Probability Statement characteristics:
  • Baseball data used: MLB batter data aggregated at the season level from 1876 to 2019. (Note: Only players who were predominantly position players are included; pitchers' data has been omitted.)
  • The prediction of Hall of Fame Ballot or Induction is surfaced as a probability percentage between 0% and 100%.
  • Hall of Fame Ballot is defined as the presence of the candidate batter in any of the yearly vote totals for the Hall of Fame.
  • Hall of Fame Induction is defined as the candidate receiving the necessary 75% of the vote from the BBWAA electors or in special BBWAA sessions. Note: This explicitly excludes candidates who entered the Hall of Fame by other means (e.g. the Veterans Committee). More info: https://baseballhall.org/hall-of-famers/rules/bbwaa-rules-for-election
  • The machine learning models have been built with the Generalized Additive Models (GAM) algorithm in ML.NET.
  • The following batting features were used to build the ML models: Years Played, At Bats, Runs, Hits, Doubles, Triples, Home Runs, RBIs, Stolen Bases, Batting Average, Slugging Percentage, All-Star Appearances, MVPs, Triple Crowns, Gold Gloves, Total Bases, Total Player Awards.
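
Several of the features listed above follow from the counting statistics by standard baseball definitions. The sketch below sums season rows into career totals and derives Total Bases, Batting Average, and Slugging Percentage. The dictionary keys and sample seasons are assumptions for illustration; how the application computes its features is not stated here.

```python
from typing import Dict, Iterable

COUNTING_STATS = ("at_bats", "hits", "doubles", "triples", "home_runs")


def career_totals(seasons: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Sum season-level counting stats into career totals."""
    totals = {stat: 0 for stat in COUNTING_STATS}
    for season in seasons:
        for stat in COUNTING_STATS:
            totals[stat] += season[stat]
    return totals


def total_bases(t: Dict[str, int]) -> int:
    """Singles count once, doubles twice, triples three times, home runs four."""
    singles = t["hits"] - t["doubles"] - t["triples"] - t["home_runs"]
    return singles + 2 * t["doubles"] + 3 * t["triples"] + 4 * t["home_runs"]


def batting_average(t: Dict[str, int]) -> float:
    """Hits divided by at bats (0.0 when there are no at bats)."""
    return t["hits"] / t["at_bats"] if t["at_bats"] else 0.0


def slugging_percentage(t: Dict[str, int]) -> float:
    """Total bases divided by at bats (0.0 when there are no at bats)."""
    return total_bases(t) / t["at_bats"] if t["at_bats"] else 0.0


# Two made-up seasons, for illustration only.
seasons = [
    {"at_bats": 500, "hits": 150, "doubles": 30, "triples": 5, "home_runs": 20},
    {"at_bats": 500, "hits": 150, "doubles": 30, "triples": 5, "home_runs": 20},
]
career = career_totals(seasons)
print(total_bases(career), batting_average(career), slugging_percentage(career))
```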