The Struggles of Small Sample Sizes

A month of baseball has been played. The White Sox are off to a shockingly good start for a team that went into the season labeled as a rebuilding team. Most batters have had about 100 plate appearances. Fans and analysts alike are itching to draw conclusions from the month of baseball we’ve seen so far. After all, a month seems like a long time at this juncture of the season. We literally have to talk about the fact that Avisail Garcia is still hitting .370 with a 1.029 OPS, Tommy Kahnle is striking out 57.6 percent of batters faced, Anthony Swarzak has allowed just four base runners and no runs, and Leury Garcia has a wRC+ of 107. So let’s do that.

But wait! A wrinkle! All of these things have occurred in a small sample size. I won’t get into the disgusting math details because, well, they’re disgusting, but a stat needs a certain amount of playing time behind it before it starts to come into focus. That was vague, wasn’t it? Some of this can be derived intuitively. Batting average is just a fraction in decimal form: hits divided by at-bats. Early in the season, a single at-bat changes the numerator and denominator by amounts that are large relative to the totals themselves, so one hit or one out can swing the average noticeably. The denominator needs to grow large enough to keep the fraction from fluctuating so easily. The same applies to more complicated metrics. There’s a certain point that must be passed before the metric really provides analytical value.
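For anyone who wants to see the fraction problem in action, here is a minimal Python sketch. The function names and the three hypothetical .300 hitters are made up for illustration; none of them are real players or real stat lines.

```python
def batting_average(hits, at_bats):
    """Batting average is hits divided by at-bats."""
    return hits / at_bats


def swing_from_next_at_bat(hits, at_bats):
    """Gap between the average after a hit and after an out in the next at-bat."""
    after_hit = batting_average(hits + 1, at_bats + 1)
    after_out = batting_average(hits, at_bats + 1)
    return after_hit - after_out


# Hypothetical .300 hitters at different points in a season (illustrative only).
for hits, at_bats in [(3, 10), (30, 100), (150, 500)]:
    swing = swing_from_next_at_bat(hits, at_bats)
    print(f"{hits}-for-{at_bats}: next at-bat can move the average by {swing:.3f}")
```

The gap works out to one divided by the new at-bat total, so it shrinks steadily as the denominator grows. That is the whole intuition: the same single at-bat matters far more in April than in September.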

The math behind finding those points is incredibly complicated and often requires consideration of variables regarding the player himself. In other words, it’s way above my head.

So how do we know if Avisail Garcia’s hot start reflects a changed man or is just a month-long anomaly? How do we know if Kahnle is the next Andrew Miller? Honestly, it’s not easy. It is a struggle dealing with these small sample sizes.

Avisail Garcia is posting a slash line of .370/.420/.609 and had a 199 wRC+ entering Wednesday. That translates to an ISO of .247, which is nearly double the mark he set last year (.140). He also has an unreal BABIP of .460 that is bound to drop at least a hundred points at some point. But he’s also lowered his O-Swing% by nearly two percentage points while raising his Z-Swing% by almost eight percentage points and overall contact rate by about five percentage points. There are good signs. There are bad signs. There are great numbers. There are regression concerns. What remains constant in all of this is that it’s a small sample size.

Swarzak has allowed just four base runners this season. He hasn’t given up a single run. He’s pitched just 13 ⅓ innings. Kahnle has struck out 57.6 percent of the batters he’s faced, but just 33 batters have stepped to the plate against him. Leury Garcia has a 107 wRC+, but in a nice little sample size of just 69 plate appearances entering Wednesday. Jose Quintana has allowed three or more runs in half of his starts this season, but there have only been six starts for the White Sox ace. I could go on and on and make you press that little “x” in the corner of this tab. Instead, I think the point has been made.
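To put a rough number on how little 33 batters can tell us, here is a hedged Python sketch using a Wilson score interval, a standard textbook formula for a proportion. It is not the stabilization math alluded to earlier, and it leans on two assumptions: that 57.6 percent of 33 batters means 19 strikeouts (inferred from the numbers above, not taken from a box score), and that each plate appearance is an independent coin flip with the same strikeout chance, which is a simplification.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


# Assumption: 19 strikeouts in 33 batters faced, inferred from the 57.6 percent figure.
low, high = wilson_interval(19, 33)
print(f"Observed rate: {19 / 33:.3f}, plausible range: {low:.3f} to {high:.3f}")
```

Under those assumptions the plausible range runs from roughly 41 percent to roughly 73 percent, which is a long way of saying that 33 batters don’t pin down much.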

These sample sizes — they are too small. Intuition, past experiences, and physical cues can sometimes lead us in the direction of a conclusion about a player based on these small sample sizes. However, that sort of analysis tends to come up short.

Dealing with small sample sizes is hard. There’s a strong desire to draw conclusions from the first month of the season. After all, it’s been a whole month! While it’s certainly fine to attempt to analyze the baseball that has been played to this point in the season, we must remember the struggle of small sample sizes. Any analysis we make must be taken with a grain of salt. Will Avisail Garcia have a career year that makes him a viable starter on a good team? Maybe, but probably not. Will Quintana pitch so poorly that the White Sox are incapable of trading him? Maybe, but probably not. Let’s wait for the sample size to grow a little before we attempt to answer these questions.

Lead Photo Credit: Raj Mehta-USA TODAY Sports
