Last week, I introduced the topic of Free Agent position-player salaries in a post that can be found here. Today, I will address two questions:
1. How do teams weigh positional differences?
Some studies on MLB simply group players by broad categories: catchers, infielders, and outfielders. Having played and watched baseball extensively, I believe that there is an immense distinction between a shortstop and a first-baseman even though both are technically infielders, and thus I felt it unjust to group them under the same heading. As such, I divided players into more specific clusters, similar to how WAR calculations group players by position.
Catcher, shortstop, and center field are the three most demanding positions on the defensive spectrum, and I hope to discover how these positions are weighted in the Free Agent market. Second-basemen and third-basemen form a logical pair (IF), as do first-basemen and designated hitters, and the two pairs are clearly unlike each other, so I gave each pairing its own cluster. Left and right fielders typically profile differently from center fielders, so I split the outfielders into corner outfielders (COF) and center fielders (CF). A player is placed into a cluster depending on where he logged the most innings during the prior season. For example, while Torii Hunter appears as a CF in 2008 after making 155 starts there in ’07, he is grouped as a COF in 2013 because he transitioned to RF (134 games in RF, 1 in CF) towards the end of his tenure in Anaheim. My initial hypothesis is that ‘up-the-middle’ players (C, SS, and CF) will be valued more than players at other positions.
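The clustering rule above can be sketched in a few lines of Python. This is only an illustration: the mapping table and the function name are my own, and games stand in for innings because the Hunter example in the text is given in games.

```python
# Hypothetical mapping from fielding position to the clusters used in this post.
CLUSTER_OF = {
    "C": "C", "SS": "SS", "CF": "CF",
    "2B": "IF", "3B": "IF",
    "LF": "COF", "RF": "COF",
    "1B": "1B/DH", "DH": "1B/DH",
}

def assign_cluster(playing_time):
    """Return the cluster of the position with the most playing time.

    playing_time maps a position to time logged there in the prior season
    (innings in the post; games are used below as a stand-in).
    """
    primary = max(playing_time, key=playing_time.get)
    return CLUSTER_OF[primary]

# Torii Hunter's prior season as described in the text: 134 games in RF, 1 in CF.
hunter_prior_season = {"RF": 134, "CF": 1}
print(assign_cluster(hunter_prior_season))
```

Note that the rule looks at the single position with the most time, not the cluster with the most combined time, which is how I read the description above.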
Using these classifications, I will show how different position clusters are valued in the FA market after controlling for performance and other factors. In the ensuing analysis in Table 2, the coefficient on each position variable represents the difference in salary between an average first-baseman/designated hitter and a player at the given position. For example, the coefficient of 0.472 on C means that, relative to a first-baseman or designated hitter, a catcher earns a 47.2% higher salary, on average.
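A note on reading these coefficients: if the dependent variable is the natural log of salary (an assumption on my part, since the tables are not reproduced here), then treating a coefficient as a percentage is an approximation that works best for small values. The sketch below compares the direct reading with the exact conversion; the function names are illustrative only.

```python
import math

def pct_linear(coef):
    """Percent difference read straight off the coefficient (the reading used in this post)."""
    return 100 * coef

def pct_exact(coef):
    """Exact percent difference implied by a coefficient in a log-salary model (assumed form)."""
    return 100 * (math.exp(coef) - 1)

coef_c = 0.472  # coefficient on C quoted in the text
print(f"C: linear {pct_linear(coef_c):.1f}%, exact {pct_exact(coef_c):.1f}%")
```

Under that assumption, the exact premium for catchers would come out closer to 60% than 47%, so the percentages quoted in this post are best read as conservative approximations for the larger coefficients.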
As expected, ‘up-the-middle’ players (catchers, shortstops, and centerfielders) command the greatest premium on the Free Agent market. Shortstops command the most of all positional groupings, 56% more than the average first-baseman. In other words, if a SS can produce the typical offensive output of a 1B, he will earn more, because there are fewer shortstops who can hit well. The coefficient on IF is intriguing, because it shows that second- and third-basemen are paid a premium as well. Though the effect is smaller in magnitude and less statistically significant, the regression implies that they earn roughly 20% more than a typical first-baseman once performance is controlled for. Lastly, the regression finds no distinct difference in pay between corner outfielders and first-basemen. In other words, players who are low on the defensive spectrum are paid primarily for their production on offense, with little regard for what defensive value they may provide.
I referenced the scarcity of position-players relative to pitchers in MLB today, and the positional scarcity concept applies to these results as well. The up-the-middle positions are the most difficult to play at an MLB-caliber level, so consequently there are fewer players who are capable of playing these positions effectively, and even fewer who are available in Free Agency. Thus, if a top player at any of these positions hits Free Agency, more teams than usual will become bidders for his services.
For example, CF Gary Matthews Jr. entered Free Agency in 2007 after a career year in 2006, posting an above-average OBP and SLG after spending 8 seasons proving he was no more than an adequate player. Nonetheless, because he was a CF who had shown he could produce offensively, the Angels rewarded him with a 5-year, $50M contract, and they were not the only team lining up to pay him. Unsurprisingly, Matthews flamed out shortly thereafter, but his case exemplifies the sky-high demand for premier offensive talent at challenging defensive positions. If Ian Desmond reaches Free Agency this upcoming offseason, teams will be tripping over each other to offer him $20M+ per season, because elite shortstops hardly ever reach the open market. A shortstop who posts three straight seasons with at least 4 fWAR, and can hit for power, has enormous value in today’s MLB.
One further question I sought to examine with regard to positional differences was: is defensive ability weighted differently by position? Some of this effect may have been picked up in the coefficients on the position variables, but to check for this specific effect, I created interaction terms between each position variable and DRS, and then regressed the percentage change in salary on these terms. The only significant coefficient that resulted came on IF*DRS (coefficient of 0.022), which was surprising, as I expected C and SS defense to perhaps be more valued. After looking at the dataset, I believe the driving force behind this coefficient may be that 5 of the top 7 DRS seasons in the sample were posted by 2B or 3B: Chone Figgins, Adrian Beltre (2x), Mark Ellis, and Juan Uribe. It would be fascinating to see whether this result holds in a larger sample.
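For readers unfamiliar with interaction terms, here is a minimal sketch of how the position-by-DRS variables could be built. The column names follow the IF*DRS label used above; the function itself and the choice of 1B/DH as the omitted baseline are assumptions for illustration, not a reproduction of my dataset code.

```python
# Position clusters that get their own interaction term; 1B/DH is assumed
# to be the omitted baseline, as in the position regression.
POSITIONS = ["C", "SS", "CF", "IF", "COF"]

def interaction_terms(cluster, drs):
    """Return one position*DRS value per cluster for a single player-season.

    The term equals the player's DRS for his own cluster and 0 for all others,
    so each coefficient captures how DRS is paid at that position only.
    """
    return {f"{pos}*DRS": (drs if pos == cluster else 0) for pos in POSITIONS}

# Hypothetical player-season: a third baseman (IF cluster) with +12 DRS.
print(interaction_terms("IF", 12))
```

A baseline player (1B/DH) gets zeros in every interaction column, so his defense enters the model only through any standalone DRS term.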
2. Which statistics matter most in determining salaries, and have we seen front offices place appropriate value on sabermetric statistics (including defensive stats)?
In this section of the analysis, I first propose a model of strictly traditional statistics, then offer a model with only advanced statistics, and finish with a model that incorporates significant statistics from each category. (Please refer to Part I for clarification of traditional vs. advanced stats.) Each of the following models features controls for four factors: career statistics, year-to-year variation, the player’s age/years of MLB service, and positional discrepancies. I will begin with a strictly ‘traditional’ statistics model, mainly to serve as a baseline for further discussion.
The regression netted 9 statistically significant results, and the results are similar to those found in the literature. Of the performance-related variables, games started (Gs), offensive production (OBP, SLG, R), and past performance (AVG_career) all have the expected positive signs and are significant at the 5% level. The coefficient on Gs suggests that for a 10-game increase in games started, a FA can expect to receive a 4% increase in salary, holding all else equal. The coefficient on R is surprisingly large, and implies that if a player scores 1 standard deviation more runs (25.35) in a given season, his salary will increase by 25.4%. Using the mean AAV, this represents an increase in salary from $4.55M to $5.70M, certainly more than a marginal increase. Observing significant positive coefficients on both OBP and SLG confirms that each is considered a ‘winning’ statistic. At the same time, FPct and SB were not significant in the model, as expected, because both have been shown to be imprecise measures of fielding and baserunning ability, respectively, which MLB general managers realize.
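The arithmetic behind these figures can be checked with a short script. All inputs below are taken from the paragraph above; the per-run and per-start figures are my own back-of-the-envelope scaling, which assumes the percentage effect scales linearly.

```python
MEAN_AAV = 4.55          # mean average annual value in $M (from the text)
SD_RUNS = 25.35          # one standard deviation of runs scored (from the text)
PCT_PER_SD_RUNS = 25.4   # % salary increase for +1 SD of runs (from the text)
PCT_PER_10_GS = 4.0      # % salary increase for +10 games started (from the text)

def apply_pct(salary, pct):
    """Raise a salary by pct percent."""
    return salary * (1 + pct / 100)

new_aav = apply_pct(MEAN_AAV, PCT_PER_SD_RUNS)
pct_per_run = PCT_PER_SD_RUNS / SD_RUNS    # assumes a linear effect per run
pct_per_start = PCT_PER_10_GS / 10         # assumes a linear effect per start

print(f"AAV after +1 SD of runs: ${new_aav:.2f}M")
print(f"Implied premium per extra run scored: {pct_per_run:.2f}%")
print(f"Implied premium per extra game started: {pct_per_start:.2f}%")
```

Put differently, each additional run scored is worth roughly a 1% raise in this model, which is about two and a half times the premium for each additional game started.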
When comparing career statistics across players, it is best to use statistics that measure a player’s typical season as opposed to his aggregate performance. This provides us with a better gauge of a player’s annual production, and avoids giving undeserved additional merit to players who have longer careers. In the regression, there are four new variables (HR_peryear, RBI_peryear, R_peryear, and SB_peryear) that reflect a player’s average output per season in each category. Though none of these added metrics has a distinct significant effect, replacing the 4 career totals with these per-season versions raises the model’s R² by 2%. Of the career statistics, AVG was the only one with a significant effect. On average, a player with a .302 lifetime AVG (1 std. dev. above the mean) can expect to earn a 39% increase in salary over the average Free Agent.
Once you control for performance, teams tend to pay a premium for a player with more years of MLB service. The logic behind this may be that veterans bring unquantifiable traits to a club, like leadership skills that may help improve young players or an established name that may improve a team’s legitimacy in the eyes of fans. The square of YearsinMLB has a negative coefficient, which shows that the effect of increased experience is non-linear. Much literature suggests that a player peaks between his age-27 and 29 seasons, and this model implies that teams understand the age curve. It seems teams view veteran players as something of a necessary evil: they may exhibit a decline in production relative to their career peak, but they can also help the team off the field.
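To illustrate what a positive coefficient on YearsinMLB combined with a negative coefficient on its square implies, here is a small sketch. The two coefficients are purely hypothetical placeholders (the actual estimates are in the regression table, which is not reproduced here); only the shape of the curve is the point.

```python
def experience_effect(years, b1, b2):
    """Combined contribution of YearsinMLB and its square to the fitted value."""
    return b1 * years + b2 * years ** 2

def peak_years(b1, b2):
    """Years of service at which the experience premium tops out (needs b2 < 0)."""
    if b2 >= 0:
        raise ValueError("the squared term must be negative for a peak to exist")
    return -b1 / (2 * b2)

# HYPOTHETICAL coefficients, chosen only to show the inverted-U shape.
B1, B2 = 0.08, -0.004

print(f"Premium peaks at about {peak_years(B1, B2):.1f} years of service")
for yrs in (5, 10, 15):
    print(yrs, round(experience_effect(yrs, B1, B2), 3))
```

With any such pair of signs, the premium for experience rises at first, flattens, and eventually turns downward, which is the non-linearity described above.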
The advanced statistics included in this next model help to comprise fWAR, so in addition to other insight the model may provide, it will also demonstrate which components of fWAR are deemed most important by teams. Here, I include OPS, ISO, wOBA, K, BB, and DRS, and the OLS estimates can be found in Table 4.
After tinkering with different combinations, I determined that this collection of stats provided a sufficient representation of advanced statistics. DRS trumped DEF and both measures of UZR in terms of how teams project a player’s defensive ability, and its coefficient in this model is both positive and statistically significant. The impact of DRS in economic terms is not immense, but it certainly indicates that teams are now willing to pay a premium for better defenders. Per the model, if a player saves 5 more runs – the difference between a below-average and average defender – in the prior season, he can expect his FA salary to increase by 6.4%, all else equal.
On the offensive side of the ball, wOBA and K% were the two advanced stats with significant coefficients. wOBA proved to be a better indicator of FA salary than wRC+, perhaps because wRC+ tends to be used more often for cross-era comparison than for evaluating prior season performance. The model predicts a 24.5% increase in salary for a 20-point increase in wOBA; for reference, a jump from a .320 to a .340 wOBA is equivalent to going from an average hitter to an above-average one. So, given that result, it seems that teams still weight offensive contributions about four times more than they do defensive performance. This may be for two reasons. First, despite the improvements made in terms of quantifying defense, offensive metrics are still more reliable even in small samples, and can thus be more easily interpreted. Second, offense puts fans in the seats, so teams will pay a premium to add talented offensive contributors.
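The "about four times" comparison comes from setting the two effects side by side, as in this quick check. The inputs are the figures quoted in this section; treating a 20-point wOBA gain and a 5-run DRS gain as comparable one-tier improvements is my framing, not a property of the model.

```python
WOBA_STEP_PCT = 24.5   # % salary change for +20 points of wOBA (from the text)
DRS_STEP_PCT = 6.4     # % salary change for +5 runs saved (from the text)

# Ratio of the offensive premium to the defensive premium for a one-tier jump.
ratio = WOBA_STEP_PCT / DRS_STEP_PCT

# Per-unit effects, assuming each effect scales linearly within the step.
pct_per_woba_point = WOBA_STEP_PCT / 20
pct_per_run_saved = DRS_STEP_PCT / 5

print(f"Offense-to-defense ratio: {ratio:.1f}x")
print(f"Per wOBA point: {pct_per_woba_point:.3f}%, per run saved: {pct_per_run_saved:.2f}%")
```

The ratio depends on the step sizes chosen for each metric, so it is best read as a rough guide to relative emphasis rather than an exact exchange rate.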
K% had a strong negative coefficient, which suggests that teams still value players who can put the ball in play, even though Moneyball showed that a strikeout is essentially just another out. This coefficient may help to partially explain why Paul Konerko (K% = 17.10 in 2011) earned $2M more in AAV than Adam LaRoche (K% = 21.30 in 2013) despite otherwise similar profiles. Now, it seems that players are receiving too great a penalty for a high K% relative to what their strikeouts actually cost a team.
It is striking that this model’s R² is less than that of the traditional model, which means that either not all teams rely on sabermetric analysis in their Free Agent evaluations, or (more likely) teams rely on both advanced and traditional statistics when determining player salaries. Therefore, I will build a final model that integrates significant metrics from both of the prior models. Results for the combined model can be found in Table 5. Using a medley of statistics garners the highest R² yet, at 0.74, which means that 74% of the variation in the dependent variable (salary) can be explained by the model.
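As a reminder of what that figure measures, R² is one minus the ratio of unexplained variation to total variation in the outcome being predicted. The sketch below computes it from scratch on made-up numbers; the data are placeholders, not values from my sample.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that is explained by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_tot = sum((y - mean) ** 2 for y in actual)                  # total variation
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))  # unexplained part
    return 1 - ss_res / ss_tot

# Made-up outcome values and model fits, for illustration only.
outcome = [0.5, 1.0, 1.5, 2.0]
fitted = [0.7, 0.9, 1.6, 1.8]

print(f"R^2 = {r_squared(outcome, fitted):.2f}")
```

An R² of 0.74 therefore leaves about a quarter of the variation in salaries unexplained by the variables in the combined model.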
Gs and R may serve somewhat as proxies for measuring position players who not only are starters, but starters who stay healthy throughout the prior season. Both of these are counting stats, meaning they increase with playing time, so while Gs and R certainly measure productivity, their positive coefficients may also represent that healthier players earn higher salaries. Only R produces a statistically significant coefficient, though; does this mean that teams overvalue how many runs a player scored in the prior season? The statistic is highly dependent on four factors: playing time, how often you get on base, where you hit in the lineup, and the quality of surrounding players. Teams may place too much emphasis on the value of a player who is an established leadoff hitter, or on a player who played on a high-scoring team. Given that R_peryear is also significant, perhaps the coefficient on R should partially be interpreted as the effect of a player being a consistent starter rather than just the output he produces.
HR is also a counting stat, and while its coefficient is not significant at the 10% level, the variable’s inclusion in the model increased the R² by .36, which argues for its importance. Although much of the value of home runs is included in wOBA, teams may find a player who hits many home runs particularly appealing in terms of fan interest.
wOBA remains statistically significant in the updated model, though its importance in determining FA salary is dampened by the inclusion of R and HR. Now, a 20-point wOBA increase results in only around a 10% increase in salary, as opposed to nearly 25% in the model with only advanced statistics. DRS, on the other hand, maintains a similar effect to the prior model, as a 5-run increase translates to a 5.3% increase in salary.
As for career statistics, the two most important predictors of FA salary are career OPS (OPS_career) and career runs per season (R_peryear). FanGraphs classifies an average OPS as .730 and an above-average OPS as .800; a 70-point increase in career OPS in the model indicates a 30.7% increase in expected FA salary. This is an encouraging finding, because it contradicts some literature stating that teams do not adequately weigh past performance in their analysis. Thus, it seems that teams rely on both a player’s prior season and his career track record when offering him a salary in Free Agency.
The combined model of both traditional and advanced statistics had the greatest explanatory power of all the models put forth, which suggests that teams include both types of analysis when determining a player’s worth. Though including fWAR in the model might have generated a higher R², I felt it was important to break down the statistic into its various components (offense and defense) to show their independent effects. While offensive performance is still the primary predictor of FA salary, I found evidence that teams also pay premiums for players with better objective defensive metrics.
In conclusion, the results show that teams have shifted away from only using traditional statistics, which do not perfectly measure a player’s worth. Instead, teams refer to a combination of advanced and traditional statistics that gauge performance in the most recent season. Career performance also has a significant impact when it comes to FA salaries, because the larger sample size of a player’s entire career has more predictive power than just the most recent season. With the refinements that have been made to improve the accuracy of quantifying a player’s marginal revenue product, Free Agent salaries have become superior representations of a player’s value.
The post Determinants of Free Agent Position-Player Salaries: Part II appeared first on Batting Leadoff.