Does Team Clutch Matter in Baseball?

by Cyril Morong

How teams hit and pitch in the clutch certainly matters. But perhaps the better question is whether clutch performance matters any more than non-clutch performance. Probably not. I have three main points:

1. To be among the best teams in baseball does not necessarily require that a team perform better in the clutch (relative to its opposition) than it does in the non-clutch (relative to its opposition).

2. Splitting a team’s performance into clutch and non-clutch helps very little, if at all, in explaining winning percentage.

3. Non-clutch performance has a greater statistical impact on winning than does clutch performance.

Note: A glossary at the end of this handout defines terms not defined in the body of the paper.

Point #1: To be among the best teams in baseball does not necessarily require that a team perform better in the clutch (relative to its opposition) than it does in the non-clutch (relative to its opposition).

The tables below show how the best and worst teams (by winning percentage) performed compared to their opposition during the seasons 1989-2002 (394 teams).

Table 1
“Negative” means a team had a bigger advantage in OPS over its opposition in close and late situations than it did in NON-close and late situations. “Positive” means a team had a bigger advantage in OPS over its opposition in NON-close and late situations than it did in close and late situations.

Now certainly more of the top 25 teams had a bigger advantage when it was close and late than when it was non-close and late. But 10 of the best 25 teams of the last 14 seasons actually had a bigger advantage over their opposition in non-close and late situations than in close and late situations. In other words, 40% of the best 25 teams actually saw their relative performance level fall when it was close and late. Moving to the top 50 teams, almost as many teams had a bigger advantage in non-close and late situations as in close and late situations.

Table 2
“Negative” means a team had a bigger advantage in OPS over its opposition in runners in scoring position situations than it did in NON-runners in scoring position situations. “Positive” means a team had a bigger advantage in OPS over its opposition in NON-runners in scoring position situations than it did in runners in scoring position situations.

Table 2’s pattern is similar to Table 1’s. It is clearly not necessary for a team to increase its relative performance in the clutch to be good.

Point #2: Splitting a team’s performance into clutch and non-clutch helps very little, if at all, in explaining winning percentage.

This section uses information from linear regression analysis, where the computer estimates an equation that shows the relationship between a dependent variable and one or more independent variables.

Regression 1

PCT = 0.49 + 1.27*OPS - 1.26*OPPOPS
R-squared = 0.798 (N = 394)

PCT = a team’s winning percentage
OPS = a team’s hitting OPS
OPPOPS = the OPS of a team’s opponents

(Regressions involving independent variables other than OPS are discussed in the Appendix. For more information on the regressions, such as t-values and standard errors, contact the author.)

The R-squared of 0.798 means that 79.8% of the variation in winning percentage across teams is explained by a team’s OPS and its opponents’ OPS. But what if each of these independent variables, OPS and OPPOPS, is replaced by its counterparts for close and late situations and for non-close and late situations?

Regression 2

PCT = 0.501 + 0.918*NONCLOPS + 0.345*CLOPS - 0.845*OPPNONCLOPS - 0.421*OPPCLOPS
R-squared = 0.831

NONCLOPS = a team’s OPS in non-close and late situations
CLOPS = a team’s OPS in close and late situations
OPPNONCLOPS = an opponent’s OPS in non-close and late situations
OPPCLOPS = an opponent’s OPS in close and late situations

Breaking down teams’ performances into clutch and non-clutch improves R-squared by only 0.033 (0.831 - 0.798). So instead of explaining 79.8% of the variation in team winning percentage, the independent variables now explain 83.1%. Explanatory power increases very little.
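As a rough illustration of the comparison between Regression 1 and Regression 2, the following Python sketch fits a pooled model and a split model by ordinary least squares and reports the gain in R-squared. It does not use the paper's data: every number is simulated, the 84%/16% weights used to build the pooled OPS are a simplifying assumption, and the helper name r_squared is hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """Fit y = a + X @ b by ordinary least squares and return R-squared."""
    A = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n = 394  # same count of team-seasons as the paper, but the data are simulated

# Hypothetical split OPS figures for teams and their opponents.
noncl = rng.normal(0.75, 0.045, n)
cl = rng.normal(0.72, 0.053, n)
opp_noncl = rng.normal(0.75, 0.048, n)
opp_cl = rng.normal(0.72, 0.059, n)

# Assumption: pooled OPS is a fixed 84/16 blend of the two situations.
ops = 0.84 * noncl + 0.16 * cl
opp_ops = 0.84 * opp_noncl + 0.16 * opp_cl

# Simulated winning percentage; the coefficients and noise level are made up.
pct = (0.5 + 0.9 * (noncl - opp_noncl) + 0.4 * (cl - opp_cl)
       + rng.normal(0, 0.03, n))

r2_pooled = r_squared(np.column_stack([ops, opp_ops]), pct)
r2_split = r_squared(np.column_stack([noncl, cl, opp_noncl, opp_cl]), pct)
print(f"pooled R^2 = {r2_pooled:.3f}, split R^2 = {r2_split:.3f}, "
      f"gain = {r2_split - r2_pooled:.3f}")
```

Because the pooled variables here are exact linear blends of the split variables, the split model can never fit worse than the pooled one. The question the paper asks is how much better it fits.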
Regression 3

PCT = 0.501 + 0.848*NONRISPOPS + 0.432*RISPOPS - 0.799*OPPNONRISPOPS - 0.462*OPPRISPOPS
R-squared = 0.801

NONRISPOPS = a team’s OPS in situations without runners in scoring position
RISPOPS = a team’s OPS in situations with runners in scoring position
OPPNONRISPOPS = an opponent’s OPS in situations without runners in scoring position
OPPRISPOPS = an opponent’s OPS in situations with runners in scoring position

In this case, breaking performance down into runners in scoring position (RISP) situations and non-RISP situations also adds very little explanatory power. The R-squared increases by just 0.003 (0.801 - 0.798) compared to Regression 1.

Point #3: Non-clutch performance has a greater statistical impact on winning than does clutch performance.

Looking at Regression 2 again

PCT = 0.501 + 0.918*NONCLOPS + 0.345*CLOPS - 0.845*OPPNONCLOPS - 0.421*OPPCLOPS

notice that the coefficient estimates for the non-close and late situations are greater than for the close and late situations. For example, a 0.010 increase in NONCLOPS will increase a team’s winning percentage by 0.00918, or 1.49 wins a season (0.010*0.918*162), while the same increase in CLOPS is worth only 0.559 wins (0.010*0.345*162). A similar point can be made about OPPNONCLOPS and OPPCLOPS.

Of course, non-close and late situations comprise a much greater percentage (84%) of the average team’s plate appearances than do close and late situations (16%). But the close and late situations are supposed to be more important, coming at times when the game might be more likely to hang in the balance.

The same point can be made about Regression 3, which involves runners in scoring position situations (23.6% of plate appearances) versus non-runners in scoring position situations. The coefficient estimates on the non-RISP situations are greater than the coefficient estimates on the RISP situations, again showing that the non-clutch situation is more important. It seems that the greater quantity of non-clutch situations outweighs the quality of clutch situations.
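The conversion from a regression coefficient to wins per season used above can be sketched in a few lines of Python. The function name wins_per_season is illustrative, and the 162-game season is taken from the arithmetic in the text.

```python
GAMES = 162  # games in a full season, as used in the text's arithmetic

def wins_per_season(coef, delta_ops=0.010, games=GAMES):
    """Wins gained per season from a change of delta_ops in one OPS variable."""
    return coef * delta_ops * games

# Coefficient estimates from Regression 2.
noncl_wins = wins_per_season(0.918)  # non-close and late
cl_wins = wins_per_season(0.345)     # close and late

print(f"NONCLOPS: {noncl_wins:.2f} wins, CLOPS: {cl_wins:.3f} wins")
```

The same function applies to the opponent coefficients (0.845 and 0.421), with the sign reversed because a higher opponent OPS costs wins.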
(Beta coefficients and elasticities were calculated for Regression 2; see the Appendix. Those, too, show that the non-close and late situations have a greater statistical impact on winning. There are also results from a regression in which team winning percentage depended on a team’s advantage over its opposition in both close and late and non-close and late situations. Partial correlation coefficients were calculated for that regression, and again the non-close and late situation was more important.)

Appendix

Regression 2 with beta coefficients

PCT = 0.600*NONCLOPS + 0.263*CLOPS - 0.584*OPPNONCLOPS - 0.362*OPPCLOPS

“Beta coefficients are occasionally used to make statements about the relative importance of the independent variables in a multiple regression model. To determine beta coefficients, one simply performs a linear regression in which each variable is normalized by subtracting its mean and dividing by its estimated standard deviation.” (PR, p. 90) They allow for a direct comparison of the independent variables (PR, p. 91). The beta coefficient estimates on the non-close and late variables are higher than on the close and late variables, again showing that non-clutch situations have a greater impact on winning.

Elasticities for Regression 2:

NONCLOPS = 1.37
CLOPS = 0.497
OPPNONCLOPS = -1.26
OPPCLOPS = -0.608

If a team increases its OPS in non-close and late situations by 1%, it increases its winning percentage by 1.37%. If it increases its OPS in close and late situations by 1%, its winning percentage would go up by only 0.497%. So it is more important for a team to improve its performance in non-clutch situations. As mentioned earlier, this is probably because there are more of such situations.
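The beta coefficients above can be approximately recovered from the raw Regression 2 coefficients and the sample standard deviations listed later in this appendix, because a beta coefficient equals the raw coefficient times the ratio of the independent variable's standard deviation to the dependent variable's. The Python sketch below is a consistency check under that identity, not the author's original computation, and the names are illustrative.

```python
# Raw coefficient estimates from Regression 2.
raw_coef = {
    "NONCLOPS": 0.918,
    "CLOPS": 0.345,
    "OPPNONCLOPS": -0.845,
    "OPPCLOPS": -0.421,
}

# Sample standard deviations reported in the appendix.
std_dev = {
    "NONCLOPS": 0.0451,
    "CLOPS": 0.0527,
    "OPPNONCLOPS": 0.0477,
    "OPPCLOPS": 0.0594,
}
STD_PCT = 0.069  # standard deviation of winning percentage

def beta_coefficient(coef, sd_x, sd_y):
    """Standardized (beta) coefficient: raw coefficient scaled by sd_x / sd_y."""
    return coef * sd_x / sd_y

betas = {name: beta_coefficient(raw_coef[name], std_dev[name], STD_PCT)
         for name in raw_coef}

for name, value in betas.items():
    print(f"{name}: {value:.4f}")
```

Small differences from the reported values are expected, since the inputs are rounded to three or four digits.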
Partial correlation coefficients

Regression 4

PCT = 0.500 + 0.878*NONCLADV + 0.388*CLADV

NONCLADV = a team’s OPS in non-close and late situations minus its opponent’s OPS in non-close and late situations (or its advantage)
CLADV = a team’s OPS in close and late situations minus its opponent’s OPS in close and late situations (or its advantage)

Partial correlation coefficient for NONCLADV = 0.807
Partial correlation coefficient for CLADV = 0.676

Partial correlation coefficients allow us to “see if whether the dependent variable and one independent variable are related after netting out the effect of any other independent variables in the model.” (PR, p. 92) Also, “Partial correlation coefficients are often used to determine the relative importance of different variables in multiple regressions models.” (PR, p. 93) So again, the non-clutch situation is more important, as shown by its higher partial correlation coefficient. Notice that the normal coefficient estimate for the NONCLADV variable is also higher than for the CLADV variable.

Normal correlations for Regression 4

PCT & NONCLADV = 0.829
PCT & CLADV = 0.717
NONCLADV & CLADV = 0.46

Normal correlations for Regression 2

PCT & NONCLOPS = 0.429
PCT & CLOPS = 0.430
PCT & OPPNONCLOPS = -0.464
PCT & OPPCLOPS = -0.533
CLOPS & NONCLOPS = 0.580
OPPCLOPS & OPPNONCLOPS = 0.556

Sample standard deviations for Regression 2

NONCLOPS = 0.0451
CLOPS = 0.0527
OPPNONCLOPS = 0.0477
OPPCLOPS = 0.0594
PCT = 0.069

Notice that close and late performance varies more than non-close and late performance. This may be because close and late situations make up only 16% of plate appearances (walks + at bats) for the average team.
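With only two independent variables, the partial correlations reported for Regression 4 can be reproduced from the three normal correlations using the standard first-order partial correlation formula. The Python sketch below is offered as a check on the reported figures. It is an assumption that the author's software computed them this way, and the function name is illustrative.

```python
from math import sqrt

def partial_corr(r_yx, r_yz, r_xz):
    """Correlation of y and x after netting out z (first-order partial)."""
    return (r_yx - r_yz * r_xz) / sqrt((1.0 - r_yz ** 2) * (1.0 - r_xz ** 2))

# Normal correlations reported for Regression 4.
R_PCT_NONCL = 0.829   # PCT & NONCLADV
R_PCT_CL = 0.717      # PCT & CLADV
R_NONCL_CL = 0.46     # NONCLADV & CLADV

partial_noncl = partial_corr(R_PCT_NONCL, R_PCT_CL, R_NONCL_CL)
partial_cl = partial_corr(R_PCT_CL, R_PCT_NONCL, R_NONCL_CL)

print(f"NONCLADV: {partial_noncl:.3f}, CLADV: {partial_cl:.3f}")
```

Both results agree with the reported 0.807 and 0.676 to within rounding of the inputs.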
Expanded Regressions - Regression 6

PCT = 0.49 + 1.04*XB% + 2.72*HIT% + 2.53*BB% - 1.00*OPPXB% - 2.7*OPPHIT% - 2.54*OPPBB%
R-squared = 0.820

XB% = a team’s extra bases divided by plate appearances (at bats + walks)
HIT% = a team’s hits divided by plate appearances
BB% = a team’s walks divided by plate appearances
OPPXB% = an opponent’s extra bases divided by plate appearances
OPPHIT% = an opponent’s hits divided by plate appearances
OPPBB% = an opponent’s walks divided by plate appearances

I ran this regression because of deficiencies with OPS, namely that it might double count batting average. Since OPS adds on-base percentage to slugging percentage, it is adding two stats that may be highly correlated. This is less of a problem for the three stats used here.

Expanded Regressions - Regression 7

If a team’s performance is again broken down into clutch and non-clutch performance (in this case using close and late (CL)), again there is very little gain in R-squared. It rises by only 0.033 over Regression 6.

PCT = 0.53 + 0.89*NONCLXB% + 1.83*NONCLHIT% + 1.71*NONCLBB% - 0.78*OPPNONCLXB% - 1.74*OPPNONCLHIT% - 1.63*OPPNONCLBB% + 0.21*CLXB% + 0.74*CLHIT% + 0.71*CLBB% - 0.27*OPPCLXB% - 0.96*OPPCLHIT% - 0.85*OPPCLBB%
R-squared = 0.853

Expanded Regressions - Regression 8

The next regression breaks down performance using RISP.

PCT = 0.49 + 0.86*NONRISPXB% + 1.66*NONRISPHIT% + 1.77*NONRISPBB% - 0.66*OPPNONRISPXB% - 1.64*OPPNONRISPHIT% - 1.82*OPPNONRISPBB% + 0.16*RISPXB% + 1.13*RISPHIT% + 0.76*RISPBB% - 0.31*OPPRISPXB% - 1.13*OPPRISPHIT% - 0.76*OPPRISPBB%
R-squared = 0.828

Notice that this regression increases the R-squared by only 0.008 over Regression 6.

Glossary

OPS = On-base percentage + slugging percentage

On-base percentage = (Hits + Walks)/(Walks + At bats). These are the only data available from “Stats Fantasy Advantage” for situations like close and late and runners in scoring position. Times hit by pitch and sacrifices are not given there for those situations.
Slugging percentage = total bases divided by at bats.

Close and Late = Situations when the game is in the 7th inning or later and the batting team is leading by one run, tied, or has the potential tying run on base, at bat or on deck.

Sources and References

Robert S. Pindyck and Daniel L. Rubinfeld, Econometric Models and Economic Forecasts, second edition
Sabermetric Baseball Encyclopedia, created by Lee Sinins
STATS, Inc., “Stats Fantasy Advantage” website

Clutch Hitting Research

Harold Brooks, “The Statistical Mirage of Clutch Hitting,” Baseball Research Journal, 1989
Tom Conlon, “Or Does Clutch Ability Exist?” By the Numbers, March 1990
Richard D. Cramer, “Do Clutch Hitters Exist?” Baseball Research Journal, 1977
Gary Gillette, “Much Ado About Nothing,” Sabermetric Review, July 1986
Tom Hanrahan, “Clutch Teams in 1999,” By the Numbers, May 2000
Tom Hanrahan, “What Makes a ‘Clutch’ Situation?” By the Numbers, February 2001
Keith Karcher, “The Power of Statistical Tests,” By the Numbers, June 1991
Eldon G. Mills and Harlan D. Mills, Player Win Averages, 1970. A.S. Barnes.
Cyril Morong, “Clutch Hitting and Experience,” By the Numbers, November 2000
Pete Palmer, “Clutch Hitting One More Time,” By the Numbers, March 1990
Willie Runquist, Baseball by the Numbers, 1995. McFarland.
Willie Runquist, “Clutch Hitters and Other Mythological Animals,” By the Numbers, March 1994
Rob Wood, “Clutch Ability: Myth or Reality?” By the Numbers, December 1989

Note: By the Numbers is the newsletter of the SABR Statistical Analysis Committee. The Baseball Research Journal is also published by SABR.