Archive for June, 2014

PAR for the Course

Saturday, June 14th, 2014


I take a lot of pride in this web site I have developed and enhanced over the years.  Although it lacks some of the bells and whistles you’ll find on the mainstream fantasy baseball sites, this is a fully functional site that handles all of the key elements of fantasy baseball game management.  Just in the last few years, I’ve added a couple new features that I believe have greatly enhanced the overall user experience, namely the live stats tracker and the recent addition of injury notices.  But there is one key element of most fantasy sites that this one has always lacked:  some sort of player rater to help users analyze where their players stack up against the rest of the league.  After several years of spit-balling ideas, I’m happy to announce the addition of a new league statistic that will show up on pages throughout the site:  PAR (Points Above Replacement).

As the name somewhat suggests, PAR is loosely based on the Sabermetric statistic that is now the de facto #1 stat for rating baseball players, WAR – Wins Above Replacement.  Most of you are probably at least somewhat familiar with WAR, but to put it in very simple terms, it is a stat designed to determine player value based on how many additional wins a player helps his team earn compared to how many wins they would have earned with a replacement level player in that roster spot instead.  So let’s say Player A earns a 4.0 WAR for a 90 win team.  The theory is that the team would have instead won 86 games if Player A had been replaced by a “replacement level” player.  If you want to see a full description of WAR, including the definition of what “replacement level” means, I recommend checking out this page.

If determining the number of wins a player is worth is widely accepted as the best way of determining a player’s value in real baseball, what about fantasy baseball?  Well, we don’t care about “wins” (team wins, that is).  But we do care about points.  The goal is to earn as many points in the standings as possible.  So wouldn’t it be useful to know approximately how many points a player is worth?  Specifically, how many points in the standings a player helps a team earn compared to a replacement level player.  Enter PAR.  This method of player valuation is something I’ve been working on for quite some time.  It was my goal to complete this effort this past winter and get it up and running before Opening Day.  I made decent progress, but hit a major snag:  my numbers just weren’t adding up.  Specifically, I wasn’t getting anywhere near the expected results in the ratio categories (AVG, ERA and WHIP).  This led to flawed numbers across the board, so I had decided to table it until next winter.  But then a couple weeks ago, I was reading a fantasy baseball article on FanGraphs which contained a link to a second article and then a link to another site.  Suddenly, before my eyes was the exact solution I had been looking for, particularly for fixing my problem with ratio categories.  On one hand, I was a little upset that I hadn’t found (or really even looked for) this site before.  But on the other hand, it was extremely rewarding to discover that fantasy gurus had come up with methods of player valuation that were nearly identical to what I had come up with on my own.  So I think this should give my new stat a little credibility.

Here’s the page I discovered on a site called Smart Fantasy Baseball.  It’s definitely worth checking out because it probably describes the concepts a lot more clearly than I will be able to.  But what I’ve come up with is not exactly the same, so I will describe PAR in all of its gory details in just a bit.  The linked page describes a concept of player valuation called “Standings Gain Points” or SGP.  So I could have called this new stat SGP as well, but I had already picked PAR before I ever saw anything about SGP.  The concept is the same though.  SGP is the number of points in the standings that a particular player earns for his team.  There is a replacement level concept built into it as well, but that is where my formula is a little different.  One thing to keep in mind is that SGP, and probably other stats like it, are primarily designed to assign values to players to assist with draft preparation or to set future performance projections.  Most of the big sites that develop pre-season player rankings probably incorporate these ideas into their rankings and dollar value assignments.  But that’s not what I’m looking to do here.  I don’t intend to use PAR in pre-season projections or rankings, partly because I assume you all have your own methods of draft preparation (or lack thereof) that you do on your own anyway.

My intent with PAR is to assign a specific value to the numbers actually accumulated throughout the season in this league.  The formula is based solely on numbers (past and present) from this league.  Many of you probably occasionally glance at player ratings on other sites, which certainly have some value.  But they are usually based on default league settings on those specific sites.  PAR is completely based on our league’s settings and historical results.

So let’s get to it.  Here is my best attempt to describe what PAR is.  I’ll leave it to you to determine if it is a worthwhile metric, or completely useless information.  I’m not going to go through all of the math involved, but will provide enough information that you could “check my work” if you so desire.  Or if you are a very trusting person, you can immediately buy into this new stat as gospel truth and stop reading now.

As mentioned, the idea behind PAR is to determine how many points a player helps a team gain in the standings.  For now, only the raw total PAR is displayed on this site, but it is made up of 5 sub-parts:  one for each of the five categories a player helps contribute towards, which are obviously different for hitters than pitchers.  For each of those sub-parts there are two key numbers involved:  the “replacement level” stat total and the number of units in the category necessary to gain a point in the standings.  But before we dive into those, let’s talk sample size, which is applicable to all that follows.

All of the numbers that feed into PAR come straight out of this league.  At first I thought about only using numbers from the specific season for which I was calculating a player’s PAR.  After all, this would seem to be the best true measure of a player’s value in a given season.  But then I decided this was much too small of a sample size and could be totally thrown out of whack by teams that decided to punt certain categories.  Also, I didn’t want a player who puts up identical numbers in consecutive seasons to potentially have a significantly different PAR for those two years.  So I decided to expand it to a five year sample size.  I chose five years, and not the entire league history, because as you are well aware, there have been major peaks and valleys in offensive production in baseball during the two decades this league has existed.  If I were to use the same numbers to calculate PAR in 2014 as 2001, very few offensive players would have positive value now, while most pitchers would have accumulated negative value during the heart of the “steroid era”, which doesn’t really make sense since there are just as many points to be gained in offensive categories now as there were then.  So I picked five years to produce a decent sample size that wouldn’t be totally ruined by seismic era shifts.  I’ll have more to say about this later, but now let’s start looking at how players earn points.  I’m going to focus mostly on the counting categories for now, but I’ll get to a separate discussion about the ratio categories (AVG, ERA, WHIP) later.

To determine what it takes to earn a full point in a given category, I came up with a method of calculating the “average” gap between teams in the standings in that category.  Average is in quotes because it is not exactly the mathematical average, which would only rely on teams that finish in first and last place to calculate this gap.  Initially, that’s exactly what I did.  But then I found that Smart Fantasy Baseball article, which recommended calculating this gap using a slope formula to create a linear distribution.  Check out that article again for the full details, but the main reason to use a slope formula is because it lessens the impact of outliers and includes all teams in the calculation, not just the first and last place squads.  The slope is calculated in each category and averaged over the five year period.  The table a little further down the page displays the calculated values for each category that were used for the 2013 PAR numbers, and will be used in-season for 2014 as well.  So, for example, the calculated gap of 10.05 for home runs means that it generally takes 10 home runs to gain a point in the standings.  Therefore, a player who hits 10 home runs above replacement level will earn a full point towards his PAR for home runs alone.  And for every additional 10 home runs he hits above that, he earns yet another full point.
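For anyone who wants to check my work, here is a rough Python sketch of the slope idea for a counting category.  It assumes an ordinary least-squares slope of team totals against standings points (1 for last place, 10 for first), which is my reading of the approach rather than the exact code behind the site.  The function names and the sample totals are made up for illustration, and ties in the standings are ignored.

```python
from statistics import mean


def standings_gap(team_totals):
    """Least-squares slope of category totals against standings points.

    Assumes a counting category where more is better and ignores ties.
    """
    ranked = sorted(team_totals)            # worst total first
    points = range(1, len(ranked) + 1)      # 1 point for last place, N for first
    mean_x = mean(points)
    mean_y = mean(ranked)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(points, ranked))
    denominator = sum((x - mean_x) ** 2 for x in points)
    return numerator / denominator


def average_gap(seasons):
    """Average the per-season slopes over the sample (five seasons in the post)."""
    return mean(standings_gap(totals) for totals in seasons)


# Hypothetical season in which each team is exactly 10 home runs apart.
example_season = [200 + 10 * i for i in range(10)]
print(standings_gap(example_season))
```

Because every team total feeds into the slope, one runaway leader or one team punting the category moves the result much less than it would if only first and last place were used.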

Now let’s dive into the replacement level discussion, which is where my method is actually quite different than what I found on other sites.  Replacement level is one of the more controversial aspects of WAR because not everyone agrees on what it should mean.  In fact, prior to a year ago, the two mainstream producers of WAR (FanGraphs and Baseball-Reference) used completely different formulas to determine replacement level.  They have since unified, but it is still far from a 100% agreed upon standard.  This is also a challenge in fantasy player valuation.  Smart Fantasy Baseball’s approach was to set the replacement level baseline based on the projections of players who would just miss being drafted, so basically the best remaining players in the post-draft free agent pool.  This totally makes sense since those guys would be the true replacements for injured/under-performing players.  But keep in mind that I’m not creating a projection system.  I want to use real stats.  And trying to identify who the best available players are in the free agent pool at any given time is not really doable programmatically, especially since we use such a limited pool of players.  So I decided to go a different route.

The definition of a replacement level player that I came up with is an average player on a team that will finish in last place in any given category.  So basically, for the counting stats (HR, RBI, R, SB, W, SV, K), I determined the typical last place team total in each category and divided that by 14 for the hitting categories and 9 for the pitching categories (14 and 9 being the number of active hitter/pitcher roster spots).  This produces the units that a “replacement level” player would be expected to accumulate in each category assuming he was on the major league roster for the full season.  Wait a second, what does “typical last place team” mean?  Well, I could have just taken the five year average of last place teams in each category.  But again, I didn’t want my numbers to be drastically swayed by teams that intentionally tanked categories.  So instead, I calculated the average team total in each category over a five year span and subtracted from that the “gap” value described above times 4.5.  4.5 because an “average” team would earn 5.5 points in the category, but I’m looking for a total for the team that finished with 1 point (last place).  So this replacement level value is what a last place team would accumulate if all of the team totals truly formed a linear distribution.

The calculated point differences and replacement level numbers for each category are in the table below.  These were the numbers I used for the 2013 PAR calculations and will be used in-season for 2014 as well.

Category   Point Diff. Gap   Replacement Level
AVG        .0029             .2583
HR         10.05             15.74
RBI        24.55             65.24
R          25.35             66.52
SB         11.02             7.36
ERA        0.109             4.065
WHIP       0.0195            1.3162
W          2.91              7.46
SV         10.54             5.02
K          30.60             113.25

Now let’s do some math and examine exactly how these numbers came about in one category, home runs.  From 2009 through 2013, the average team total in home runs was 265.66.  Using the slope function on each set of 10 team totals (one for each season), the average of those five results is 10.05, which becomes the point difference gap you see in the above table.  The formula to determine the replacement level is:  ((total team average) – (4.5 x point diff gap)) / 14.  So the calculation for the home run replacement level is:  (265.66 – (4.5 x 10.05)) / 14 = 15.74.
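The same arithmetic as a small Python sketch.  The function name and parameters are mine, not anything from the site's code, and the 4.5 is written as a function of a 10-team league so the assumption is visible.

```python
def replacement_level(avg_team_total, gap, roster_slots, num_teams=10):
    """Per-player replacement level for a counting category.

    An average team sits at (num_teams + 1) / 2 standings points, so the
    idealized last-place team (1 point) is (num_teams - 1) / 2 gaps below it.
    """
    steps_below_average = (num_teams - 1) / 2        # 4.5 in a 10-team league
    last_place_total = avg_team_total - steps_below_average * gap
    return last_place_total / roster_slots


# Home runs, using the numbers from the post: 14 active hitter slots.
hr_replacement = replacement_level(265.66, 10.05, 14)
print(round(hr_replacement, 2))
```

For a pitching category the only change would be passing 9 roster slots instead of 14.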

Besides all of that, there are other numbers involved in calculating PAR.  Obviously, a player’s actual stats are included.  But also a new stat that I needed to start tracking in order to make this work:  number of weeks on the active roster.  This is important because I wanted to make PAR a cumulative stat, like WAR, meaning that a player will “earn” value throughout the season towards an end of the year total, but only while on the active roster.  Without tracking weeks on the roster, players who only spend a short period of time on the roster would post a PAR way below zero since they would likely fall well short of the full season replacement level totals.  But this would be misleading because their contribution is not necessarily negative for the team if they produce good numbers during that brief stint.  So another aspect of the PAR formula is multiplying the replacement value by a ratio of the number of weeks a player is on the active roster over 26.  26 is the full number of weeks in the baseball season.  Therefore, a player on the active roster for exactly half the season (13 weeks) would only need to accumulate half of the replacement level total in order to start earning positive value.

Here is an example of the home run part of the PAR calculation for Jose Bautista in 2013.  He hit 28 home runs in 21 active weeks on the roster, so that’s why those two numbers appear:  (28 – (15.74 x (21/26))) / 10.05 = 1.52.  So Bautista earned 1.52 “points” for HR, which was then added with the four other parts to create a total of 3.6 PAR for the 2013 season.
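Here is that calculation as a short Python sketch, with the weeks-on-roster proration pulled out so it is easy to see.  The names are hypothetical; only the numbers come from the post.

```python
SEASON_WEEKS = 26  # full number of weeks in the season, per the post


def counting_par(stat_total, replacement, gap, active_weeks):
    """PAR for one counting category.

    The replacement level is prorated by the share of the season the
    player spent on the active roster, then the surplus is converted to
    standings points by dividing by the point difference gap.
    """
    prorated_replacement = replacement * active_weeks / SEASON_WEEKS
    return (stat_total - prorated_replacement) / gap


# Jose Bautista, 2013: 28 HR in 21 active weeks.
bautista_hr_par = counting_par(28, 15.74, 10.05, 21)
print(round(bautista_hr_par, 2))
```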

I’ve kind of been glossing over the ratio categories to this point.  The number of weeks on the active roster is not used for these categories because we have a better way of determining how much of an impact a player has on those categories:  their actual number of at bats or innings pitched.  In batting average, the first thing needed is the average number of at bats per player over the 5 year span.  This was calculated by taking the total number of at bats in the league over those five years and dividing it by 700 (50 team totals and 14 slots per team).  This came to a total of 531.04 at bats.  Next, the previously calculated replacement level batting average was used to find the replacement level hits:  (.2583 x 531.04) = 137.28.  So our replacement level hitter has about 137 hits and 531 at bats.  The individual player AVG PAR is calculated by taking a team full of replacement level players plus the player being examined.  That’s 13 replacement players plus the examined player to fill up the full 14 slots:  ((137 x 13) + player’s hits) / ((531 x 13) + player’s at bats) = adjusted batting average.  The adjusted batting average will show how much of an effect the player had on the team batting average.  The rest of the calculation is the same as the other categories.  The concept for ERA and WHIP is similar, except the replacement level innings, earned runs, and walks plus hits are calculated and used instead.  This whole paragraph probably makes zero sense, so I once again refer you to the Smart Fantasy Baseball article to get a better grasp on this.  Just keep in mind that I’m using replacement level players instead of average players.  The concept is more or less the same though.
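If the paragraph above is hard to follow, a rough Python sketch of the batting average piece may help.  It assumes that "the rest of the calculation is the same" means subtracting the replacement level average from the adjusted team average and dividing by the AVG gap of .0029.  That last step is my interpretation, and the function name and sample hitters are made up.  Replacement hits are computed here from the rounded table values, so they come out slightly different from the 137.28 quoted above, presumably because the post works from unrounded numbers.

```python
def avg_par(hits, at_bats, repl_avg=0.2583, repl_ab=531.04,
            gap=0.0029, hitter_slots=14):
    """Batting average PAR: the player's effect on a replacement-level team."""
    repl_hits = repl_avg * repl_ab               # hits for one replacement hitter
    others = hitter_slots - 1                    # 13 replacement-level teammates
    adjusted_avg = (repl_hits * others + hits) / (repl_ab * others + at_bats)
    return (adjusted_avg - repl_avg) / gap       # assumed final step, see above


# Hypothetical hitters: .300 over 600 at bats and .200 over 500 at bats.
print(avg_par(180, 600))
print(avg_par(100, 500))
```

A player with no at bats leaves the team average at the replacement level, so his AVG PAR is zero no matter how many weeks he was on the roster.  That is why weeks on the roster are not needed for the ratio categories.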

Now that I’ve described how PAR is calculated, let’s see if the numbers add up.  On a team-by-team basis, you would expect the total batting PAR to be approximately the team’s batting total minus 5 since a team full of replacement level players would still “earn” 5 batting points.  The same applies for pitching.  But looking at individual team PAR totals can be misleading since some teams might win a category convincingly, earning more than the necessary nine points above replacement, in turn skewing the overall numbers.  So a better way to analyze the results is to add up league-wide totals in each sub-part (category) of PAR.  You would expect the league wide total PAR earned in each category to be somewhere around 45 (9 + 8 + 7 + 6 … + 1).  My calculations for the 2013 season produced the following total PAR in each category:

  • Average:  38.27
  • Home Runs:  30.28
  • Runs Batted In:  22.72
  • Runs Scored:  28.12
  • Stolen Bases:  34.84
  • Earned Run Average:  50.73
  • WHIP Ratio:  58.04
  • Wins:  38.26
  • Saves:  49.17
  • Strike Outs:  49.92

In summary, some categories came closer to the expected result than others.  But even the ones that aren’t close are explainable and not necessarily a sign of a flawed system.  In particular, the league totals in HR, RBI and R were significantly lower in 2013 than over the course of the five year span we examined.  Therefore, I would actually expect these numbers to be well below 45.  To what degree is hard to calculate, but overall, I am satisfied with the results.  Just keep in mind that when I start releasing the PAR numbers for earlier seasons, we should start to see the opposite situation where offensive points earned exceed the expected totals.  I really won’t know for sure how iron clad this formula is until I complete this task for the full league history, and that is going to take a while.  There is a decent chance I will tweak the formula as I proceed.
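To put numbers on "closer than others", here is a quick Python sketch that compares each 2013 category total with the expected 45.  The totals are copied from the list above; the variable names are just for illustration.

```python
NUM_TEAMS = 10

# Points above last place available in one category: 9 + 8 + ... + 1 + 0.
expected_total = sum(range(NUM_TEAMS))

totals_2013 = {
    "AVG": 38.27, "HR": 30.28, "RBI": 22.72, "R": 28.12, "SB": 34.84,
    "ERA": 50.73, "WHIP": 58.04, "W": 38.26, "SV": 49.17, "K": 49.92,
}

# Positive means more PAR was handed out than the standings had available.
deviation = {cat: round(total - expected_total, 2)
             for cat, total in totals_2013.items()}

for cat, diff in sorted(deviation.items(), key=lambda item: item[1]):
    print(f"{cat:5s} {diff:+.2f}")
```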

Next, I’m going to explain a little about how you should interpret these PAR numbers and possibly add a few words of warning to clear up some potential misconceptions.  First, and in my opinion most importantly, keep in mind that there is no positional adjustment included in these ratings.  PAR is calculated using the same numbers for catchers as outfielders.  Positional strength plays no role.  Since it is much more difficult to get great value out of certain positions, you shouldn’t simply decide Player A is more valuable than Player B based on a higher PAR if they play different positions.  A catcher with a 3.0 PAR is probably more valuable than an outfielder with the same PAR.  Down the road, I intend to come up with a second new stat, closely related to PAR, which will include a positional adjustment.  But that’s not going to happen anytime soon.

This lack of a positional adjustment is especially noticeable for pitchers.  Relief pitchers, due to their reduced innings and lack of win opportunities, are going to have a tough time earning positive value.  Almost all non-closers are going to have negative PAR.  This may seem like a huge red flag and a flaw in the system.  But I don’t think it is.  These numbers accurately reflect how much more of an impact starting pitchers have on a team’s total stats compared to relievers.  This is not to say relief pitchers have no value though.  A 0.0 PAR player still helps a team more than a -2.0 player.

Similarly, it is a mistake to make direct comparisons between hitters and pitchers based on PAR.  In general, pitchers are going to have higher PAR than hitters.  The reason for this is because there are just as many points to be gained in the standings in pitching categories as hitting, yet there are far fewer pitchers earning those points so there are more points to go around to each player.  I considered adding an adjustment to pitchers’ PAR to make the average pitcher’s PAR equivalent to an average hitter.  But I decided against it because I wanted to maintain the goal of total league-wide PAR matching the numbers of points actually available in the league standings.  So keep this in mind when comparing the value of a hitter to a pitcher.

One false impression you could receive from PAR is that your team would be better off with an empty roster spot than playing a guy who is earning negative value.  This is not the case.  A negative value means that the player is providing less value than a replacement player, but a replacement player is more valuable than no player at all.  To illustrate this, let’s say you decide to go the full season with just one healthy catcher and a second catcher who misses the entire season with an injury.  A hypothetical player who puts up zeros in all five categories for the full season would earn a -7.5 PAR.  It would be nearly impossible for any real player to put up a PAR worse than that.  Same goes with pitching.  A pitcher with no stats for a full season would accumulate a -6.7 PAR.  Keep that in mind when determining if it makes sense to play a man short rather than using the below replacement level player on your bench.
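Those two floor values can be reproduced from the table earlier in this post.  The Python sketch below assumes a full 26 weeks on the active roster and treats the ratio categories as contributing exactly zero, since a player with no at bats or innings has no effect on the team ratios.  The dictionary layout and function name are mine.

```python
# (point difference gap, replacement level) for the counting categories
HITTING = {"HR": (10.05, 15.74), "RBI": (24.55, 65.24),
           "R": (25.35, 66.52), "SB": (11.02, 7.36)}
PITCHING = {"W": (2.91, 7.46), "SV": (10.54, 5.02), "K": (30.60, 113.25)}


def zero_stat_par(categories):
    """Total PAR for a player with zeros across a full season."""
    return sum((0 - replacement) / gap for gap, replacement in categories.values())


print(round(zero_stat_par(HITTING), 1))   # hitter floor
print(round(zero_stat_par(PITCHING), 1))  # pitcher floor
```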

This may be obvious, but simply accumulating the highest team PAR does not guarantee you a championship.  It is very possible to accumulate a category PAR total that is more than the full nine points necessary to finish first in that category.  Ideally, you want to accumulate close to nine points in each of the categories you intend to win.  Of course, it’s not really possible to see what your PAR is in each category right now, but this is something I hope to add in the future.

Finally, I suggest you pay little attention to the PAR values that are included in the “MLB” lines of a player’s stats for the current season.  Since I don’t have a good way of determining how many weeks a player has been on an active MLB roster, I’m assuming they have been active the full season, which is obviously not the case for a great number of players.  I thought about not calculating these numbers at all, but decided the information could be useful to see how valuable your bench players or free agents have been.  For now, I’m not calculating PAR for the weekly stat lines, but I may add that later.

So what comes next?  At the moment, the web site contains PAR numbers for the 2013 and 2014 seasons.  The 2014 numbers will be updated every morning as part of the daily stats update.  One thing to keep in mind is that at the beginning of each week most active players’ PAR will take a slight hit as the number of weeks value that is included in the calculations is incremented by one.  This will be barely noticeable later in the season, but you might see some guys drop a tenth of a point or two right now simply for that reason.  I’m going to take a closer look at the year-by-year results in separate posts as I release those numbers to the site.  I’ll analyze the 2013 numbers in greater detail very soon.  Then I will start working my way backwards starting with 2012.  I don’t expect to finish this project until next winter.  I’m definitely going to need to make some changes to the formula as I approach the early seasons of this league when there were fewer teams and fewer points available.  I have no idea how I’m going to do that right now, but I have plenty of time to think about that.

Wow, that’s one of the longest things I’ve written since college.  I hope you find some of this information helpful in understanding the new stat.  More importantly, I hope you find PAR to be a useful tool in analyzing players’ value in this league.  This is definitely a work in progress and I am very willing to make adjustments.  So if you find flaws in my system or think there are ways I can improve it, don’t hesitate to let me know.  Also, I’m sure there is much of what I described that is not clear to you at the moment.  Please leave me feedback on any questions or comments you have.  Enjoy!

DTBL May Awards

Wednesday, June 4th, 2014


Once again, it’s time to check out the best of the best in DTBL through the month of May. There were some massive months, particularly from the hitters as you’ll see below. Unfortunately, the biggest loss from the list has been Jose Fernandez due to his UCL tear and subsequent Tommy John surgery. Hopefully the injury epidemic is over for now, but this being baseball in 2014, no one seems safe.

On a more positive note, the players below are a decent mix of guys powered almost solely by an incredible May, and guys who have been consistent year round. As the season plays out, it will be interesting to watch if the streaky guys can finally maintain their play over an entire season, or if the steady mashers will rise, and stay, at the top.

All stats below are through May 31, and cover Rookie of the Year (ROY), Cy Young, and Most Valuable Player (MVP).

ROY:

1. Josh Donaldson, Moonshiners – .280 BA, 48 R, 15 HR, 46 RBI, 1 SB
2. Yasiel Puig, Jackalope – .344 BA, 32 R, 11 HR, 40 RBI, 5 SB
3. Julio Teheran, Darkhorses – 0.932 WHIP, 1.83 ERA, 5 W, 0 SV, 66 Ks
4. Michael Wacha, Gators – 1.064 WHIP, 2.45 ERA, 4 W, 0 SV, 75 Ks
5. Sonny Gray, Jackalope – 1.122 WHIP, 2.31 ERA, 5 W, 0 SV, 60 Ks

There’s not much change in the rookie listings, as Donaldson, Teheran, and Gray are all carryovers from April. Yasiel Puig finally returned to his 2013 form, mashing 8 homers and driving in 25 runs while chipping in 4 steals, proving himself truly worthy of being the number one overall pick this year. The other newcomer, Michael Wacha, almost made this list in April, but strong consistency vaults him past Sonny Gray in these rankings. Meanwhile, Josh Donaldson and Julio Teheran continued their stellar play from April, with Donaldson putting up almost identical numbers in May, and Teheran upping his strikeout totals to go with slightly depressed ratios.

Other rookies of note include Anthony Rendon, whose slow May dropped him off the leaderboard, Gerrit Cole, Evan Gattis, Shelby Miller, and Brian Dozier, all of whom have decent to excellent numbers in certain categories, but lack that overall excellence exhibited by the top 5.

Cy Young:

1. Johnny Cueto, Demigods – 0.758 WHIP, 1.68 ERA, 5 W, 0 SV, 92 Ks
2. Adam Wainwright, Cougars – 0.914 WHIP, 2.32 ERA, 8 W, 0 SV, 81 Ks
3. Felix Hernandez, Jackalope – 1.024 WHIP, 2.57 ERA, 7 W, 0 SV, 83 Ks
4. Zack Greinke, Naturals – 1.121 WHIP, 2.18 ERA, 8 W, 0 SV, 76 Ks
5. Julio Teheran, Darkhorses – 0.932 WHIP, 1.83 ERA, 5 W, 0 SV, 66 Ks

It’s hard to come up with words that can adequately express just how awesome the top pitchers are this year. All five of these guys are bringing it in every category, tossing up video game style ratios with absurd strikeout totals. Johnny Cueto, Adam Wainwright, and Zack Greinke, the April carryovers, have shown that their hot starts are no flukes. Felix Hernandez continues to show why everyone calls him King Felix, while Julio Teheran’s surprising rookie season is enough to vault him into the top five overall for pitchers.

Unfortunately, everyone could see that Francisco Rodriguez would come back down to earth after his impeccable start. But even so, there is no shortage of pitchers waiting in the wings. Tim Hudson seems to have found the fountain of youth, Yu Darvish is dealing again after neck issues, and Chris Sale, Kyle Lohse, and others are all dealing. The two big surprises, though, are Mark Buehrle, who’s spinning a top 10 season from the free agent list, and Jeff Samardzija, who was leading the majors in ERA through May but only had one win to show for it.

MVP:

1. Nelson Cruz, Gators – .315 BA, 39 R, 20 HR, 52 RBI, 0 SB
2. Giancarlo Stanton, Jackalope – .316 BA, 40 R, 16 HR, 51 RBI, 4 SB
3. Troy Tulowitzki, Naturals – .352 BA, 45 R, 14 HR, 37 RBI, 1 SB
4. Josh Donaldson, Moonshiners – .280 BA, 48 R, 15 HR, 46 RBI, 1 SB
5. Yasiel Puig, Jackalope – .344 BA, 32 R, 11 HR, 40 RBI, 5 SB

Nelson Cruz had a ridiculous May to jump him to the top of the MVP race. A .339 average, 13 homers, 27 RBI; all fantastic numbers. Giancarlo Stanton continues to smash the cover off the ball; one only wonders if he can stay healthy. Troy Tulowitzki continues to rake as well, with his .352 batting average still leading the majors to go along with solid stats all around.

Then come the two big surprises on this list – Josh Donaldson and Yasiel Puig. Both DTBL rookies are putting up numbers that not only lead their draft class, but compete with the numbers of the established veterans. After this point in the season, it would be no surprise to see them challenging for bragging rights as the best of the best the rest of the year.

However, there is no shortage of competition for this race. The only thing keeping Edwin Encarnacion off this list was a slow April; his 16 home runs and 33 RBI in May were incredible. Carlos Gomez is the only significant power and speed guy, with 11 homers and 11 steals to go with other solid all around numbers. And there’s more, with Alexei Ramirez, Jose Bautista, Michael Brantley, and Paul Goldschmidt all waiting in the wings. In another version of the MVP list, any of those guys could be on it and they wouldn’t look out of place. Finally, Miguel Cabrera has remembered how to hit, and may soon take his accustomed place as a member of this elite company.

Questions? Comments? Grievances that your player got left out? Feel free to leave your comments below.