Several studies have documented increased life expectancy and reduced mortality for Major League Baseball players, but none has yet provided a complete analysis of baseball player mortality patterns over time. We selected all baseball players who debuted between 1900 and 1999, modeled numbers of deaths, calculated standardized mortality ratios (SMRs) with 95% confidence intervals, and calculated life expectancies for baseball players and the general population in each decade from 1930-1939 to 1990-1999. Mortality risk for MLB players increased with age and decreased over time, with plateaus in the 1950-1969 and 1980-1999 periods. Ballplayers had greater life expectancy than the general population in all periods, though the differences were small in the 1950s and 1960s. SMRs revealed that baseball players experienced fewer deaths than expected from general population rates in most decades from 1930 onward. This research shows that baseball players experienced lower mortality than the general population throughout the last century.
The relationship between athleticism and mortality risk has been of interest to scientists for over 150 years. A number of researchers have studied mortality in collegiate and professional athletes, including rowing teams (Hartley and Llewellyn 1939), professional skiers (Karvonen 1976), American football players (Abel and Kruger 2006b; Baron and others 2012), and Olympic athletes (Gajewski and Poznanska 2008), among others (Ruiz and others 2011). Though utilizing various methods, subjects, times, and places, these studies all reach similar conclusions: mortality rates of athletes are generally lower than those of age- and sex-matched general populations (Ruiz and others 2011).
Major League Baseball (MLB) affords an excellent opportunity for mortality investigations. Baseball is a statistics-rich sport and MLB has a long history, with near-complete performance and biographical information documented back to 1871 (Holtz 2005). The large number and diversity of MLB players (over 17,000 current and former players from more than 50 countries) provide a wealth of data to analyze, and several mortality studies have been published using these data.
The first published study of the mortality of MLB players appeared in the Bulletin of the Metropolitan Life Insurance Company (MetLife) in 1975. The study presented standardized mortality ratios (SMRs) for professional baseball players from 1876 through 1973. The authors reported reduced mortality for players debuting before 1900, with continuing improvement over time: the SMR was 97% for ballplayers with careers beginning between 1876 and 1900, 64% for those whose careers began 1901-1930, and 55% for those who debuted in 1931 or later (MetLife 1975).
Waterbor et al. (1988) studied ballplayers who entered the Majors between 1911 and 1915 and who were still living as of 1925. Here too the authors used SMRs to compare the ballplayers to the US general population, and calculated both cause-specific and all-cause SMRs. In seeming contradiction to the MetLife study, the authors found no significant difference in the all-cause SMRs in any of the 5-year intervals between 1960 and 1984. The only statistically significant reduction in the cause-specific SMRs was for infectious and parasitic diseases.
Studies of the healthy worker effect among MLB players reported increases in life expectancy of up to 8 years for a cohort of ballplayers who debuted in the Majors between 1900 and 1939 (Abel and Kruger 2006a) and 4 to 6 years for a cohort that debuted between 1900 and 1919 (Abel and Kruger 2005). Another study estimated an advantage of 4.9 years for life expectancy of MLB players aged 20-24 years in the interval 1970 to 2004. Logistic regression models from that study also indicated that mortality risk for ballplayers increased with age and decreased over time between 1902 and 2004 (Saint Onge, Rogers, Krueger 2008).
These studies together suggest that there has been a decline in ballplayer mortality rates over time, and that baseball players experienced greater longevity than the general population for select periods early in the last century. However, these studies do not provide a complete picture of MLB mortality – they have not used all available data; they have not examined the changes in life expectancy over time; and they do not include any comparison of MLB and general population life expectancies over time.
In the present study we have used data for all players who debuted 1900–1999 coupled with general population mortality rates for males in the US to test the following hypotheses: (1) that age-specific mortality rates for MLB players improved in each decade from 1900-1909 to 1990-1999; (2) that the age-specific numbers of deaths observed for baseball players were significantly less than expected in the general population in all decades from 1930-1939 to 1990-1999 (a comparison with the general population would not be possible for the earliest three decades because the MLB data are too thin); and (3) that ballplayer life expectancy was greater than that of the general population at all ages in all decades between 1930-1939 and 1990-1999.
The data used in this study were provided by the Baseball Almanac (www.baseball-almanac.com), an online interactive baseball encyclopedia with more than 500,000 pages of baseball facts, statistics, and original research. The database also contains biographical information on every player, including date of birth, debut date in MLB, date of retirement, and, for those who died, dates of death.
Of primary concern for this study was the correct ascertainment of deaths. Waterbor and colleagues confirmed the vital status of over 97% of Major League players through December 31, 1984 by contacting living players or by obtaining death certificates for those known to have died (Waterbor and others 1988b). The results of those efforts were fed back to the Baseball Hall of Fame and integrated into the Baseball Almanac database. Since that time, vital status has been regularly updated using news media articles, obituary searches, and reports from ballplayers’ fans and families. Vital data are cross-checked with the Biographical Research Committee at the Society for American Baseball Research (SABR), whose staff investigates vital data for Major League Baseball players on an ongoing basis using genealogy websites and the Social Security Death Index (Carle 2005).
From the Almanac we selected all ballplayers born in the United States who played their first games in MLB between January 1, 1900 and December 31, 1999. Each member of the cohort was observed until death or the study closing date of December 31, 1999, whichever came first. Those still alive at the end of the observation period were counted as still living, and their follow-up was terminated at that time.
In accordance with federal regulations concerning the protection of human subjects in research, this study was exempt from institutional review, as the authors made no contact with the ballplayers under study and the data used in this research are freely distributed to the public on the internet.
All data processing and statistical analyses were performed using the SAS System for Windows V9.2 and Microsoft Excel 2007.
For purposes of comparison we used general population mortality rates from the Human Mortality Database (HMD), a joint project of the Department of Demography at the University of California at Berkeley and the Max Planck Institute for Demographic Research in Germany. The database provides all-cause mortality rates for more than sixty countries, derived from official records of live births, deaths, and census information. Currently the HMD has complete mortality tables from 1933 through 2007 (University of California at Berkeley and Max Planck Institute for Demographic Research, 2010).
Standardized Mortality Ratios
Standardized mortality ratios were computed by indirect standardization with the general population as follows:
1. We tabulated observed numbers of deaths for ballplayers within 5-year age groups and within each decade from 1930 to 1999. We chose 1930 as the starting point because the 1930s is the earliest decade for which general population mortality rates are published by the HMD. In addition, the data in our sample are sparse before 1930, so any SMRs would be unreliable even if a reasonable set of general population mortality rates were available.
2. Time at risk for each player within each age group and decade was determined based on the player’s debut date in MLB and the end of follow-up (either the end of the study period, December 31, 1999, or the ballplayer’s date of death). We then summed exposure time within each age group and decade.
3. Mortality rates for males in the US general population were extracted from the HMD dataset for equivalent age groups and decades.
4. Expected numbers of deaths for each age and decade were determined by multiplying the general population mortality rates (step 3) by the exposure times (step 2). The resulting numbers of expected deaths were then summed by decade without regard to age.
5. Standardized mortality ratios (SMRs) were determined by dividing the observed numbers of deaths (step 1) by the corresponding expected numbers (step 4).
6. We determined confidence intervals for the SMRs based on the assumption that the observed numbers of deaths follow a Poisson distribution in each period.
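The computation in steps 1-6 can be sketched in code. The following is a minimal illustration only (the study's actual analyses were performed in SAS): it computes an SMR and an exact Poisson 95% confidence interval via the chi-square link, and the observed and expected counts below are made-up numbers, not figures from the study's tables.

```python
from scipy.stats import chi2

def smr_with_ci(observed, expected, alpha=0.05):
    """SMR = observed / expected deaths, with an exact Poisson CI.

    Exact limits use the chi-square link to the Poisson distribution:
      lower = chi2.ppf(alpha/2, 2*O) / (2*E)
      upper = chi2.ppf(1 - alpha/2, 2*(O + 1)) / (2*E)
    """
    smr = observed / expected
    lower = chi2.ppf(alpha / 2, 2 * observed) / (2 * expected) if observed > 0 else 0.0
    upper = chi2.ppf(1 - alpha / 2, 2 * (observed + 1)) / (2 * expected)
    return smr, lower, upper

# Illustrative counts only
smr, lo, hi = smr_with_ci(observed=870, expected=1000.0)
print(f"SMR = {smr:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An SMR whose upper confidence limit falls below 1.0, as in this toy example, indicates significantly fewer deaths than expected under general population rates.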
Negative Binomial Regression
To test the hypothesis of improving mortality rates between 1900 and 1999, we modeled the death rate using a Poisson model. However, significant over-dispersion made the Poisson model a poor fit, so we fit a negative binomial model instead. Regression diagnostics (not reported here) confirmed that the negative binomial regression was a more appropriate model for handling the over-dispersion and was an adequate fit for the data.
We fit terms for each decade between 1900 and 1999, as well as terms for age in 5-year intervals. We also tested for age-by-period interaction, though this proved to be statistically insignificant. We tested the significance of the parameters in the model using likelihood ratio tests and Wald statistics at the 5% level of significance.
We constructed period life tables for the general population and MLB using standard methods (Anderson 1999). The resulting tables provide decennial life expectancies from 1930 to 1999 for MLB players and for the general population. As the HMD provides mortality rates starting in 1933, the 1933-1939 rates were used for the 1930-1939 comparison. As a gauge of comparative mortality we computed the differences in life expectancies between MLB and the general population for several ages in each period.
In many early decades of follow-up, the mortality schedules for baseball players were incomplete at old ages as all players were still relatively young and thus few deaths were observed. In the 1930s, stable hazard rates were available for all ages up to 70 years, by 1950 hazards were available for ages up to 90 years, and by 1970 data were complete up to 100 years of age. Wherever hazards were missing we used the general population mortality rates to complete the schedule of rates for the MLB. This approach assumes, for the purpose of calculating life expectancies of MLB players, that missing or unstable mortality rates for MLB players are, at worst, equal to those of the general population. Thus in the 1930s and 1940s general population rates were used for ages 80 and above, in the 1950s general population rates were used for ages 90 and above, and so on, until the 1990s when MLB rates were available for all ages.
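The life-table mechanics, including the completion of missing old-age rates with general population hazards, can be illustrated with a simplified single-year sketch. This is not the authors' implementation, and the hazard schedule below is a toy Gompertz curve standing in for real rates.

```python
import numpy as np

def life_expectancy(mx):
    """Life expectancy at the first age of the schedule, from single-year
    age-specific mortality rates mx, assuming deaths fall mid-interval.
    The table is simply truncated at the top age of the schedule."""
    mx = np.asarray(mx, dtype=float)
    qx = mx / (1.0 + 0.5 * mx)                               # rate -> death probability
    lx = np.concatenate([[1.0], np.cumprod(1.0 - qx)])[:-1]  # survivors at exact age x
    Lx = lx - 0.5 * lx * qx                                  # person-years per interval
    return Lx.sum() / lx[0]                                  # T / l at the first age

# Toy Gompertz hazards for ages 20-109 standing in for general-population rates
ages = np.arange(20, 110)
genpop_mx = 0.0003 * np.exp(0.085 * (ages - 20))

# Hypothetical player schedule missing above age 80 (NaN), completed with
# general-population rates as described in the text
mlb_mx = np.where(ages < 80, 0.85 * genpop_mx, np.nan)
mlb_mx = np.where(np.isnan(mlb_mx), genpop_mx, mlb_mx)

e20_genpop = life_expectancy(genpop_mx)
e20_mlb = life_expectancy(mlb_mx)
```

Because the borrowed general population rates are at worst equal to the players' true rates, the resulting player life expectancy is, if anything, conservative.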
The mean age at debut for all players was 24.3 years (SD = 2.8 years), and they were followed for 34.0 years on average (SD = 18.5 years). The mean age at death was 68.0 years (SD = 15.0 years) while the mean age of those still alive at the end of follow-up was 49.7 years (SD = 17.0 years). The full dataset comprised 401,763.4 person-years and 5,530 deaths contributed by 14,360 ballplayers, all of whom debuted in the Majors between January 1, 1900 and December 31, 1999.
We present the results of the negative binomial regression model as Table 1. The table displays the mortality rate ratios (MRRs) and corresponding 95% confidence intervals for both period and age. The MRRs are in comparison to 1980–1999 (for the calendar period comparison) and ages under 30 years (for the age comparison).
As expected, mortality rates for ballplayers increased with age. The increase in MRRs begins with the 30 to 34 age group and rises exponentially thereafter.
Initial modelling showed that mortality rates were greater in each decade from 1900 to 1979 (compared with the final two decades, 1980 to 1999) and that MRRs declined steadily from 1900 to 1999, with a few consecutive pairs of decades having stable MRRs. These decades were subsequently combined for the final calendar period analysis. The model shows that the MRR was greatest (6.388) for the period from 1900 to 1919 (compared to the reference period from 1980 to 1999) and declined steadily throughout the period of the study, with stability across the consecutive decades within 1900-1919, 1950-1969, and 1980-1999.
Mortality rates for MLB players were lower than those of age- and calendar-period-matched general populations throughout the study period. The Figure displays the number of person-years of follow-up in each decade (purple bars) and the corresponding SMRs with their 95% confidence intervals (overlaid white bands). The horizontal categories represent time periods between 1930 and 1999. The white point within each bar is the point estimate of the SMR for the respective period, and the vertical band through each white point shows its 95% confidence interval. The dotted line across the Figure represents an SMR of 1.0. SMRs whose confidence intervals lie completely below this line indicate statistically significantly fewer deaths than expected for ballplayers based on general population mortality rates. SMRs with confidence intervals that cross the dotted line cannot be said to differ significantly from unity; in such cases there is no statistically significant evidence that MLB players' mortality rates differ from those of the general population for that period (at p = 0.05).
Between the 1930s and the 1990s, the SMRs follow a parabolic pattern, starting at their lowest in the 1930s, rising to a peak in the 1960s, and then declining again through the 1980s. The SMR for the entire 1930–1999 period, 0.87 (95% CI 0.85-0.89), was significantly below 1.0.
Table 2 presents the life expectancies for MLB players and the US general population, as well as the differences between them, from the 1930s to the 1990s. Life expectancies are presented for 10-year age intervals between ages 20 and 60 for each decade. In general, the ballplayers had greater life expectancy than males in the general population for all combinations of age and period. The life expectancies suggest a pattern of “protective fade”, where the differences are largest at the youngest ages (when the players are actively playing baseball) and diminish with increasing age.
Life expectancies in both the MLB and the general population increased over time, but at different rates. Up to the 1950s, the differences in life expectancy narrowed as the general population life expectancy increased faster than that of the MLB players. Starting in the 1960s however, life expectancy for ballplayers increased more quickly than it did for the general population, leading to wider differences in later years. By the 1990s, baseball players could expect as much as 4.4 additional years of life.
This study provides evidence that age-specific mortality rates for Major League Baseball players declined over most of the 20th Century (as did those of age-matched males in the general population). Furthermore, the results support the hypothesis that mortality rates for MLB were less than those for the general population of US males in that period. Finally, differences in life expectancy suggest that ballplayers had greater longevity than the general population in most years and at most ages. Unlike previous studies which relied on cohort methods, the findings here provide longitudinal measures of change by using period life tables and period-specific SMRs.
The inclusion of both relative and absolute measures of baseball player mortality is one of the strengths of this study. The absolute measures (life expectancies and the negative binomial model of death counts) confirm that ballplayer mortality rates declined over time, while the comparative measures (SMRs and life expectancy differences) show that longevity of MLB players has been greater than that of the general population across the century. Combining these two approaches we have demonstrated that the rate of decline in mortality rates for MLB players was less than that of the general population up to 1960, and greater than that of the general population throughout the remainder of the century. We have also demonstrated that MLB mortality rates have remained lower than those of the general population consistently from 1930 to 1999.
One limitation of the present work is the potential for unmeasured confounding in the data owing to player race. Though initially all white, the Big Leagues experienced drastic demographic change following racial integration in 1947. However, racial and ethnic information was incomplete for the players, so the SMRs are not adjusted for race or ethnicity. Because the MLB cohort was all white prior to 1947, the expected numbers of deaths during that period were too high, and thus the SMRs too low. By the mid-1970s the percentage of black players in the MLB exceeded that in the general population (Helyar 2007), making the expected numbers of deaths too low and the SMRs too high. It is conceivable that the removal of these potential sources of bias could result in a leveling out of the SMRs shown in the Figure for the period up to 1969, and in a steeper decline of SMRs after 1969.
Another limitation is the lack of information concerning other occupational exposures for players who had off-season jobs or other occupations after baseball. In the first half of the century in particular many players worked off-season jobs as baseball did not yet pay well enough to be a year-round profession. In addition, most players retire from baseball at a relatively young age, leaving them to pursue second careers. As noted previously, exposures related to these other occupations could have a significant impact on the mortality rates and life expectancy of the ballplayers (Waterbor and others 1988a).
The results of the negative binomial regression on the MLB hazards largely agree with the logistic models presented by Saint Onge, Rogers and Krueger (2008). The models in both studies show mortality risk increasing with age and decreasing by period, and both studies show that the hazards were flat from the 1950-1959 to the 1960-1969 period. Saint Onge modeled the mortality rates as flat from 1970 onward, while we found the risk to decline from 1960 onward, beginning to show some evidence of flattening only between 1980 and 1999.
SMRs reported by MetLife (1975) are substantially lower than either the decade-specific SMRs or the combined 1930–1999 SMR that we present here. It seems likely that a number of deaths were not identified in the MetLife data; summing the deaths in our dataset over the period of the MetLife study yielded 3,247 deaths whereas MetLife reported only 2,698 for the same period. Missing deaths make the observed count of deaths artificially low and the total exposure time artificially high. The net result is an SMR that is biased downward.
Waterbor and colleagues (1988) reported that the SMRs for baseball players compared to the general population of white males rose until 1960, and then were non-significant through the mid-1980s. The current work shows a similar pattern until 1960, but displays significant reduction in the SMRs from 1970 onward. Much of this difference may stem from a difference in the choice of reference populations, as the Waterbor study compared the ballplayers to white males in the US general population. The lack of significance may also be in part due to the size of the intervals over which the SMRs were calculated. The Waterbor article calculated the SMRs for narrower intervals, which yields SMRs that are more time-specific, but lower in statistical power.
While previous estimates of life expectancy agree that MLB players lived longer on average than the general population, the reported life expectancy gains were as much as 2 years higher than we report here. The differences may be due to methodological differences and the choice of comparison rates.
An unanswered question is what the relative impact of access to healthcare, socioeconomic status, and physical fitness may have on the mortality rates of MLB players. These factors are highly correlated for professional athletes and it is unclear whether they influence mortality independently or collectively. To adequately address this question, future research should include detailed information about player salaries, the use of team doctors and other healthcare by players, and longitudinal measures of player fitness – particularly since this is a major requirement of baseball players in the modern era.
Future research should also gather information concerning extraneous occupational exposures. While these data are likely irretrievable for some historical cohorts, post-MLB occupations and off-season activities could be tracked for contemporary cohorts.
Professional baseball players and other athletes offer the opportunity to explore the relationship between peak athleticism and mortality as well as the incidence and natural course of common chronic diseases. Because studies such as this can point the way toward further valuable research, a concerted effort should be made to fully leverage the natural experiment that professional athletes provide.
1. Abel EL and Kruger ML. 2006a. The healthy worker effect in major league baseball revisited. Research in Sports Medicine 14:83-7.
2. Abel EL and Kruger ML. 2006b. The healthy worker effect in professional football. Res Sports Med 14(4):239-43.
3. Abel EL and Kruger ML. 2005. Longevity of major league baseball players. Research in Sports Medicine 13:1-5.
4. Anderson RN. 1999. Method for constructing complete annual US life tables. National Center for Health Statistics. Report nr Vital Health Stat 2(129); (PHS) 99-1329.
5. Baron SL, Hein MJ, Lehman E, Gersic CM. 2012. Body mass index, playing position, race, and the cardiovascular mortality of retired professional football players. Am J Cardiol 109(6):889-96.
6. Carle W. 2005. Personal communication.
7. Gajewski AK and Poznanska A. 2008. Mortality of top athletes, actors and clergy in Poland: 1924-2000 follow-up study of the long term effect of physical activity. Eur J Epidemiol 23(5):335-40.
8. Hartley PH and Llewellyn GF. 1939. Longevity of oarsmen. Br Med J 1(4082):657-62.
9. Helyar J. 2007. Robinson would have mixed view of today's game. ESPN.com, April 9, 2007. <http://sports.espn.go.com/mlb/jackie/news/story?id=2828584>. Accessed 2012 May 15.
10. Holtz S. 2005. Baseball almanac - www.baseball-almanac.com.
11. Karvonen MJ. 1976. Sports and longevity. Adv Cardiol 18(0):243-8.
12. MetLife. 1975. Longevity of major league baseball players. Statistical Bulletin of the Metropolitan Life Insurance Company 56:2-4.
13. Ruiz JR, Moran M, Arenas J, Lucia A. 2011. Strenuous endurance exercise improves life expectancy: It's in our genes. Br J Sports Med 45(3):159-61.
14. Saint Onge JM, Rogers RG, Krueger PM. 2008. Major league baseball players' life expectancies. Soc Sci Q 89(3):817-30.
15. University of California at Berkeley (USA) and Max Planck Institute for Demographic Research (Germany). 2010. Human mortality database. Accessed 2011 April 9.
16. Waterbor J, Cole P, Delzell E, Andjelkovich D. 1988a. The mortality experience of major-league baseball players (letter). N Engl J Med 319(15):1014-5.
17. Waterbor J, Cole P, Delzell E, Andjelkovich D. 1988b. The mortality experience of major-league baseball players. N Engl J Med 318(19):1278-80.
Source(s) of Funding
No external funding was provided for this study.
The authors declare that they do not have any competing interests.
This article has been downloaded from WebmedCentral.