Importing Retrosheet Game Logs

So, I recently ran into a case where I needed to import the Retrosheet Game Logs, and I'd figure I'd share the steps I used to do that.

First, download the .zip archives for the Game Logs (I used 1900-2013) from here:

Retrosheet Game Logs

Then, extract all of them into the same folder.

After that, create the game_logs table (in an already created retrosheet scheme) using the following script:

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";

CREATE TABLE IF NOT EXISTS `games` (
`date` date NOT NULL DEFAULT '0000-00-00',
`game_number` char(1) NOT NULL,
`day` char(3) NOT NULL,
`visitor` char(3) NOT NULL,
`visitor_lg` char(2) NOT NULL,
`visitor_game_number` smallint(3) NOT NULL DEFAULT 0,
`home` char(3) NOT NULL,
`home_lg` char(2) NOT NULL,
`home_game_number` smallint(3) NOT NULL DEFAULT 0,
`visitor_score` tinyint(3) NOT NULL DEFAULT 0,
`home_score` tinyint(3) NOT NULL DEFAULT 0,
`outs` smallint(5) NOT NULL DEFAULT 0,
`daynight` char(1) NOT NULL,
`completion` varchar(255) NOT NULL,
`forfeit` varchar(1) NOT NULL,
`protest` varchar(2) NOT NULL,
`park` varchar(5) NOT NULL,
`attendance` int(10) NOT NULL DEFAULT 0,
`game_minutes` smallint(5) NOT NULL DEFAULT 0,
`visitor_linescore` varchar(255) NOT NULL,
`home_linescore` varchar(255) NOT NULL,
`visitor_ab` tinyint(3) NOT NULL DEFAULT 0,
`visitor_h` tinyint(3) NOT NULL DEFAULT 0,
`visitor_2b` tinyint(3) NOT NULL DEFAULT 0,
`visitor_3b` tinyint(3) NOT NULL DEFAULT 0,
`visitor_hr` tinyint(3) NOT NULL DEFAULT 0,
`visitor_rbi` tinyint(3) NOT NULL DEFAULT 0,
`visitor_sh` tinyint(3) NOT NULL DEFAULT 0,
`visitor_sf` tinyint(3) NOT NULL DEFAULT 0,
`visitor_hbp` tinyint(3) NOT NULL DEFAULT 0,
`visitor_bb` tinyint(3) NOT NULL DEFAULT 0,
`visitor_ibb` tinyint(3) NOT NULL DEFAULT 0,
`visitor_so` tinyint(3) NOT NULL DEFAULT 0,
`visitor_sb` tinyint(3) NOT NULL DEFAULT 0,
`visitor_cs` tinyint(3) NOT NULL DEFAULT 0,
`visitor_gidp` tinyint(3) NOT NULL DEFAULT 0,
`visitor_ci` tinyint(3) NOT NULL DEFAULT 0,
`visitor_lob` tinyint(3) NOT NULL DEFAULT 0,
`visitor_pitchers_used` tinyint(3) NOT NULL DEFAULT 0,
`visitor_individual_er` tinyint(3) NOT NULL DEFAULT 0,
`visitor_team_er` tinyint(3) NOT NULL DEFAULT 0,
`visitor_wp` tinyint(3) NOT NULL DEFAULT 0,
`visitor_balks` tinyint(3) NOT NULL DEFAULT 0,
`visitor_putouts` tinyint(3) NOT NULL DEFAULT 0,
`visitor_assists` tinyint(3) NOT NULL DEFAULT 0,
`visitor_errors` tinyint(3) NOT NULL DEFAULT 0,
`visitor_passed_balls` tinyint(3) NOT NULL DEFAULT 0,
`visitor_double_plays` tinyint(3) NOT NULL DEFAULT 0,
`visitor_triple_plays` tinyint(3) NOT NULL DEFAULT 0,
`home_ab` tinyint(3) NOT NULL DEFAULT 0,
`home_h` tinyint(3) NOT NULL DEFAULT 0,
`home_2b` tinyint(3) NOT NULL DEFAULT 0,
`home_3b` tinyint(3) NOT NULL DEFAULT 0,
`home_hr` tinyint(3) NOT NULL DEFAULT 0,
`home_rbi` tinyint(3) NOT NULL DEFAULT 0,
`home_sh` tinyint(3) NOT NULL DEFAULT 0,
`home_sf` tinyint(3) NOT NULL DEFAULT 0,
`home_hbp` tinyint(3) NOT NULL DEFAULT 0,
`home_bb` tinyint(3) NOT NULL DEFAULT 0,
`home_ibb` tinyint(3) NOT NULL DEFAULT 0,
`home_so` tinyint(3) NOT NULL DEFAULT 0,
`home_sb` tinyint(3) NOT NULL DEFAULT 0,
`home_cs` tinyint(3) NOT NULL DEFAULT 0,
`home_gidp` tinyint(3) NOT NULL DEFAULT 0,
`home_ci` tinyint(3) NOT NULL DEFAULT 0,
`home_lob` tinyint(3) NOT NULL DEFAULT 0,
`home_pitchers_used` tinyint(3) NOT NULL DEFAULT 0,
`home_individual_er` tinyint(3) NOT NULL DEFAULT 0,
`home_team_er` tinyint(3) NOT NULL DEFAULT 0,
`home_wp` tinyint(3) NOT NULL DEFAULT 0,
`home_balks` tinyint(3) NOT NULL DEFAULT 0,
`home_putouts` tinyint(3) NOT NULL DEFAULT 0,
`home_assists` tinyint(3) NOT NULL DEFAULT 0,
`home_errors` tinyint(3) NOT NULL DEFAULT 0,
`home_passed_balls` tinyint(3) NOT NULL DEFAULT 0,
`home_double_plays` tinyint(3) NOT NULL DEFAULT 0,
`home_triple_plays` tinyint(3) NOT NULL DEFAULT 0,
`hp_ump_id` varchar(8) NOT NULL,
`hp_ump_name` varchar(255) NOT NULL,
`1b_ump_id` varchar(8) NOT NULL,
`1b_ump_name` varchar(255) NOT NULL,
`2b_ump_id` varchar(8) NOT NULL,
`2b_ump_name` varchar(255) NOT NULL,
`3b_ump_id` varchar(8) NOT NULL,
`3b_ump_name` varchar(255) NOT NULL,
`lf_ump_id` varchar(8) NOT NULL,
`lf_ump_name` varchar(255) NOT NULL,
`rf_ump_id` varchar(8) NOT NULL,
`rf_ump_name` varchar(255) NOT NULL,
`visitor_manager_id` varchar(8) NOT NULL,
`visitor_manager_name` varchar(255) NOT NULL,
`home_manager_id` varchar(8) NOT NULL,
`home_manager_name` varchar(255) NOT NULL,
`winning_pitcher_id` varchar(8) NOT NULL,
`winning_pitcher_name` varchar(255) NOT NULL,
`losing_pitcher_id` varchar(8) NOT NULL,
`losing_pitcher_name` varchar(255) NOT NULL,
`saving_pitcher_id` varchar(8) NOT NULL,
`saving_pitcher_name` varchar(255) NOT NULL,
`gwrbi_batter_id` varchar(8) NOT NULL,
`gwrbi_batter_name` varchar(255) NOT NULL,
`visitor_starting_pitcher_id` varchar(8) NOT NULL,
`visitor_starting_pitcher_name` varchar(255) NOT NULL,
`home_starting_pitcher_id` varchar(8) NOT NULL,
`home_starting_pitcher_name` varchar(255) NOT NULL,
`visitor_batter_1_id` varchar(8) NOT NULL,
`visitor_batter_1_name` varchar(255) NOT NULL,
`visitor_batter_1_pos` tinyint(3) NOT NULL DEFAULT 0,
`visitor_batter_2_id` varchar(8) NOT NULL,
`visitor_batter_2_name` varchar(255) NOT NULL,
`visitor_batter_2_pos` tinyint(3) NOT NULL DEFAULT 0,
`visitor_batter_3_id` varchar(8) NOT NULL,
`visitor_batter_3_name` varchar(255) NOT NULL,
`visitor_batter_3_pos` tinyint(3) NOT NULL DEFAULT 0,
`visitor_batter_4_id` varchar(8) NOT NULL,
`visitor_batter_4_name` varchar(255) NOT NULL,
`visitor_batter_4_pos` tinyint(3) NOT NULL DEFAULT 0,
`visitor_batter_5_id` varchar(8) NOT NULL,
`visitor_batter_5_name` varchar(255) NOT NULL,
`visitor_batter_5_pos` tinyint(3) NOT NULL DEFAULT 0,
`visitor_batter_6_id` varchar(8) NOT NULL,
`visitor_batter_6_name` varchar(255) NOT NULL,
`visitor_batter_6_pos` tinyint(3) NOT NULL DEFAULT 0,
`visitor_batter_7_id` varchar(8) NOT NULL,
`visitor_batter_7_name` varchar(255) NOT NULL,
`visitor_batter_7_pos` tinyint(3) NOT NULL DEFAULT 0,
`visitor_batter_8_id` varchar(8) NOT NULL,
`visitor_batter_8_name` varchar(255) NOT NULL,
`visitor_batter_8_pos` tinyint(3) NOT NULL DEFAULT 0,
`visitor_batter_9_id` varchar(8) NOT NULL,
`visitor_batter_9_name` varchar(255) NOT NULL,
`visitor_batter_9_pos` tinyint(3) NOT NULL DEFAULT 0,
`home_batter_1_id` varchar(8) NOT NULL,
`home_batter_1_name` varchar(255) NOT NULL,
`home_batter_1_pos` tinyint(3) NOT NULL DEFAULT 0,
`home_batter_2_id` varchar(8) NOT NULL,
`home_batter_2_name` varchar(255) NOT NULL,
`home_batter_2_pos` tinyint(3) NOT NULL DEFAULT 0,
`home_batter_3_id` varchar(8) NOT NULL,
`home_batter_3_name` varchar(255) NOT NULL,
`home_batter_3_pos` tinyint(3) NOT NULL DEFAULT 0,
`home_batter_4_id` varchar(8) NOT NULL,
`home_batter_4_name` varchar(255) NOT NULL,
`home_batter_4_pos` tinyint(3) NOT NULL DEFAULT 0,
`home_batter_5_id` varchar(8) NOT NULL,
`home_batter_5_name` varchar(255) NOT NULL,
`home_batter_5_pos` tinyint(3) NOT NULL DEFAULT 0,
`home_batter_6_id` varchar(8) NOT NULL,
`home_batter_6_name` varchar(255) NOT NULL,
`home_batter_6_pos` tinyint(3) NOT NULL DEFAULT 0,
`home_batter_7_id` varchar(8) NOT NULL,
`home_batter_7_name` varchar(255) NOT NULL,
`home_batter_7_pos` tinyint(3) NOT NULL DEFAULT 0,
`home_batter_8_id` varchar(8) NOT NULL,
`home_batter_8_name` varchar(255) NOT NULL,
`home_batter_8_pos` tinyint(3) NOT NULL DEFAULT 0,
`home_batter_9_id` varchar(8) NOT NULL,
`home_batter_9_name` varchar(255) NOT NULL,
`home_batter_9_pos` tinyint(3) NOT NULL DEFAULT 0,
`additional_info` varchar(255) NOT NULL,
`acquisition` char(1) NOT NULL,
PRIMARY KEY (date, game_number, visitor, home)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Once the table is created, you can import all of the text files using a batch file like this:

FOR /R ".\Retrosheet Game Logs" %%G IN (*.TXT) DO (
    MOVE "%%G" ".\Retrosheet Game Logs\game_logs.txt"
    mysqlimport --port=3306 --host=localhost -u root -pPASSWORD --local --verbose --force --fields-terminated-by=, --fields-enclosed-by=\" --lines-terminated-by=\n retrosheet ".\Retrosheet Game Logs\game_logs.txt"
    MOVE ".\game_logs.txt" "%%G"
)

When that is completed, you will have the game log data in a game_logs table inside of your schema.

doyler on Githubdoyler on Twitter
doyler
Ray Doyle is an avid pentester/security enthusiast/beer connoisseur who has worked in IT for almost 16 years now. From building machines and the software on them, to breaking into them and tearing it all down; he's done it all. To show for it, he has obtained an OSCE, OSCP, eCPPT, GXPN, eWPT, eWPTX, SLAE, eMAPT, Security+, ICAgile CP, ITIL v3 Foundation, and even a sabermetrics certification!

He currently serves as a Senior Staff Adversarial Engineer for Avalara, and his previous position was a Principal Penetration Testing Consultant for Secureworks.

When he's not figuring out what cert to get next or side project to work on, he enjoys playing video games, traveling, and watching sports.

As an Amazon Associate I earn from qualifying purchases.

Common passed on this blog, I made it to a jam.

4 Comments

Filed under Stealing First

4 Responses to Importing Retrosheet Game Logs

  1. John Signore

    I was able to create the table but I’m running into issues at the very end importing the data.

    I copied and pasted that into a batch file called gamelogs.bat. I ran the batch file but I still have nothing in my database.

    • Hmm, are your game log text files in “Retrosheet Game Logs/game_logs.txt” relative to the batch file?

      If so, are your MySQL username and password correct in the query in the batch file?

  2. John Signore

    I figured it out using a different method but ran into something else that has me confused.

    I downloaded all the game log files into a folder. I then used the command type *.TXT >> AllGames.Txt to get the data into one large file.

    Then I used to script at the top to create the table. Lastly, I imported the data file.

    So now I have data but it appears as if I don’t have complete data. I should have 2,430 games per season. If I run a query on each season, I’m getting 1,215. It appears I’m only getting half of each season. If I look at my text file, the data is completely there for the missing games so something is happening when I try to import it.

    • Are you running into any errors when you attempt to do the import?

      Additionally, is it possibly an issue with the DBMS/your settings? Some databases only allow you to select a limited number of rows by default without modifying some settings.

      Is it always the same half of games that don’t show up in your tables, or is it seemingly random?

      I haven’t run into that issue yet myself, but I’d be glad to try to help you work through it.

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.