Increasing very elderly populations (ages 85+) have potentially major implications for the cost of income support, aged care, and healthcare. The availability of accurate estimates for this population age group, not only at a national level but also at a state or regional scale, is vital for policy development, budgeting, and planning for services. At the highest ages, census-based population estimates are well known to be problematic, and previous studies have demonstrated that more accurate estimates can be obtained indirectly from death data. This paper assesses indirect estimation methods for estimating state-level very elderly populations from death counts. A method for incorporating internal migration is also proposed. The results confirm that the accuracy of official estimates deteriorates rapidly with increasing age from 95, that the survivor ratio method can be successfully applied at subnational level, and that the effect of internal migration is minor. It is shown that the simpler alternative of applying the survivor ratio method at a national level and apportioning the estimates between the states produces very accurate estimates for most states and years. This is the recommended method. While the methods are applied at a state level in Australia, the principles are generic and are applicable to other subnational geographies.
1. Introduction
The increase in the size of very elderly populations, defined here as those aged 85 years and above, is a demographic phenomenon occurring throughout the world [1] and one which has widespread economic and social implications. The consequences for the costs of income support, aged care, and healthcare are becoming a major focus for policymakers in Australia, as they are in many countries [2]. Given the challenges posed by increasing very elderly numbers, accurate estimates and projections are vital for policy development, budgeting, and strategic planning for services. The composition and geographical distribution of the very elderly impact on service delivery and the distribution of government funds to state, territory, and local governments [3]. The availability of accurate and detailed data on this population age group, not only at a national level but also at a subnational level, is thus becoming increasingly important to allow effective planning and budgeting for their unique needs.
Estimated Resident Populations (ERPs) published by the Australian Bureau of Statistics (ABS) constitute Australia’s official population estimates and are derived from census counts. Such estimates tend to overestimate very elderly numbers, and the degree of overestimation is typically higher for males than females and tends to increase with age [4–9]. According to Wilson and Bell [10], official Australian population estimates provided by the ABS for ages 90 and older appear to be too high and fluctuate implausibly over time. This is a well-documented issue and not unique to Australia [11]. In order to address it, indirect estimation methods have been employed to derive population estimates for the very elderly from death counts [12]. Because date of birth is typically verified at death, age at death is considered more reliable than age recorded on the census form, and these indirect estimates have been shown to be very accurate when compared with population numbers in countries with population registers [5, 11, 13, 14].
The extinct cohort method, which sums up future cohort deaths, can be used to estimate historical population numbers for cohorts when all members have died [15]. However, for cohorts where there are members still alive, more approximate methods, referred to as nearly extinct cohort estimation methods, need to be applied. The Human Mortality Database (HMD) applies a variant of one of these methods, the survivor ratio method, based on the outcome of a study which retrospectively tested a number of alternatives at a national level across nine large low-mortality countries [16]. Further tests, also at a national level, were performed by Andreev [17]. More recently, Terblanche and Wilson [18] evaluated the accuracy of different variants of nearly extinct cohort methods to estimate very elderly populations in Australia and New Zealand. However, such methods have not been tested or widely applied at a subnational level.
Kannisto [19] demonstrated that the extinct cohort method can be successfully applied at subnational level and to lower ages by applying it to derive estimates for the provinces of Finland from age 65. While this study made an important contribution to the literature, it was performed more than 20 years ago and applied to a much earlier era, namely, the late 19th and early 20th century. Furthermore, it was applied to provinces in Finland where migration rates are much lower than in Australia. Kannisto’s study was thus applied in a very different context. In the UK, the survivor ratio method has been used since 2010 to derive official population estimates for ages 90 to 105+ at a subnational level, namely, for England and Wales (combined), Scotland, and Northern Ireland [20]. However, it is not known how the accuracy of the extinct cohort and nearly extinct cohort methods compares to that of official estimates based on census counts. While a number of methods have been assessed for accuracy at the national level, no such assessment has been undertaken at a subnational level. Given the importance of very elderly populations at a subnational level for policy and research purposes, better methods are needed to estimate these fast-growing populations.
The aim of this paper is to assess the accuracy of a number of nearly extinct cohort estimation methods, previously tested only at a national level, to estimate very elderly numbers for Australian states. The accuracy of state-level ERPs is also assessed. When applying the extinct cohort and nearly extinct cohort methods at a national level, international migration is ignored as it is considered negligible at the highest ages [8]. Ignoring internal migration may not be appropriate when applying these methods at a subnational level, however. This paper proposes a method of incorporating internal migration into the extinct cohort and nearly extinct cohort estimation methods and considers the impact of such an allowance. The objective of the paper is to discover a reliable method for creating population estimates at single ages 85–109 at a state level for Australia, where much of the planning and budgeting for aged care and healthcare services and infrastructure takes place. Although applied to Australian data and at a state level, the methods are relevant for other subnational geographies. Accurate population estimates will enable the calculation of more accurate mortality rates and assist in the preparation of better projections. Population estimates determine the denominators in the calculation of death rates. Accurate death rates in the fitting period and jump-off year are essential for creating accurate projections of death rates and very elderly populations.
The rest of the paper is organised as follows. Section 2 contains a description of data and methods. Section 3 presents results from the evaluation of alternative nearly extinct cohort estimation methods applied at a state level. This includes an assessment of the impact of net internal migration on the accuracy of the estimates. Section 4 consists of a summary and recommendations.
2. Data and Methods
The relative accuracy of various nearly extinct cohort estimation methods was assessed by retrospectively applying them to extinct cohorts to create estimates for 31 December each year from 1981 to 1996 and comparing the results against those obtained from applying the extinct cohort method at 31 December 2012. Methods tested in this study include variants of the survivor ratio method [8, 21], the Das Gupta method [22], and the Survivor Ratio Advanced method [18]. Section 2.1 provides an overview of the methods and variants tested in this study and Section 2.2 sets out the error measures that were used. Death data, official estimates, and interstate migration data are described in Section 2.3. The estimation methods, including a method for incorporating internal migration for application at a subnational level, are described in Section 2.4.
2.1. Estimation Methods Tested at State Level
The methods tested in this study at a state level are shown in Table 1. Different variants assessed in this paper relate to different age ranges used for estimating survivorship and constraining population age groups. An evaluation of a number of variants at a national level for Australia by Terblanche and Wilson [18] showed that results did not differ significantly whether 3 or 5 cohorts were used to average survivor ratios or death ratios. For this purpose, therefore, only variants based on 5 cohorts were assessed.
Table 1: Nearly extinct cohort methods tested at a state level.

| Constraint | SR variants | DG variants | SRA variants | Proportional variants |
|---|---|---|---|---|
| No constraining | SR (5, 5, NC) | DG (1, 5, NC) | SRA (5, 5, NC, 10) | Prop SR NC |
| Constrained to total official estimates for ages 85+ | SR (5, 5, 85+) | DG (1, 5, 85+) | | Prop SR 85+ |
| Constrained to total official estimates for ages 90+ | SR (5, 5, 90+) | DG (1, 5, 90+) | | Prop SR 90+ |

Note. The notation is method (age range, number of cohorts for averaging survivor ratio or death ratio, and official population estimate used as a constraint). In the case of the SRA method, the last parameter refers to the number of older cohorts over which survivor ratio change is measured.
In addition to testing the above methods, the accuracy of the ABS’s ERPs was also assessed. Population estimates were separately assessed for males and females and for the states of New South Wales (NSW), Victoria (Vic), Queensland (Qld), South Australia (SA), and Western Australia (WA), and a combination of Tasmania, Northern Territory, and Australian Capital Territory (Tas-NT-ACT).
2.2. Error Measures
Error is defined as the population estimate ($E$) minus the actual population ($A$), where “population estimate” refers to numbers obtained by applying nearly extinct cohort methods and “actual population” describes the populations calculated by the extinct cohort method. The accuracies of different estimation methods and variants were compared based on the Weighted Mean Absolute Percentage Error (WMAPE) measure [23], calculated as
$$\mathrm{WMAPE}_t = \frac{\sum_x \left| E_x - A_x \right|}{\sum_x A_x} \times 100\%, \tag{1}$$
where $x$ refers to age. Lower WMAPEs indicate greater accuracy. WMAPEs were calculated for each state and year and summed over ages 90–94, ages 95–99, and ages 100+, respectively. Because the highest age at which ERPs are published is 100+, a more appropriate measure for comparing the accuracy of ERPs and nearly extinct cohort methods at these ages is Percentage Error (PE), calculated as
$$\mathrm{PE}_t = \frac{\sum_x E_x - \sum_x A_x}{\sum_x A_x} \times 100\%. \tag{2}$$
In order to compare the accuracy of final open-ended age group ERPs with those of the other methods over the study period, the Mean Absolute Percentage Error (MAPE) was used, calculated as
$$\mathrm{MAPE} = \frac{\sum_t \left| \mathrm{PE}_t \right|}{n}, \tag{3}$$
where $n$ refers to the number of years in the study.
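As an illustration of the three error measures, the following minimal Python sketch implements (1) to (3) directly. The function names and the input numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the study data.

```python
def wmape(estimates, actuals):
    """Weighted mean absolute percentage error over ages, following (1)."""
    abs_error = sum(abs(e - a) for e, a in zip(estimates, actuals))
    return 100.0 * abs_error / sum(actuals)


def pe(estimates, actuals):
    """Percentage error of an aggregated age group, following (2)."""
    return 100.0 * (sum(estimates) - sum(actuals)) / sum(actuals)


def mape(percentage_errors):
    """Mean absolute percentage error across years, following (3)."""
    return sum(abs(p) for p in percentage_errors) / len(percentage_errors)


# Made-up estimates and "actual" populations at three single ages.
est = [110.0, 95.0, 40.0]
act = [100.0, 100.0, 50.0]

print(wmape(est, act))    # absolute errors 10 + 5 + 10 over a total of 250
print(pe(est, act))       # signed errors partly cancel in the aggregate
print(mape([-2.0, 4.0]))  # average of absolute yearly PEs
```

Note that PE lets over- and underestimates at individual ages offset each other, which is why it suits the open-ended 100+ group, whereas WMAPE penalises every single-age error.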
2.3. Data
2.3.1. Death Data and Estimated Resident Populations
Death data at a state level from 1971 to 2012 by sex and single years of age to 109 and 110+ were obtained from the ABS [24]. Death data at ages above 95 were adjusted for randomisation applied by the ABS, such that death counts at individual ages summed across states reconciled to the national number of deaths, and at both national and state scales the sum of deaths at single ages 100+ equalled the 100+ totals. In order to apply the extinct cohort and nearly extinct cohort estimation methods, deaths by age and year were split into cohorts by applying triangle factors. The derivation of these triangle factors is described in the Appendix.
The accuracy of official estimates, ERPs, was also assessed. Furthermore, some variants of the nearly extinct cohort estimation methods involve constraining to ERP totals for ages 85+ or 90+. ERPs were available from the ABS from 1971 by state, sex, and single year of age to 99 and in aggregate for ages 100+ [25]. The 30 June ERPs were interpolated to be consistent with extinct cohort and nearly extinct cohort estimates created at 31 December for each year.
Data for ten earlier years are required when determining death ratios over 5-year age ranges and for 5 older cohorts. The accuracy of nearly extinct cohort methods at state level was thus assessed by retrospectively applying them to create estimates for 31 December from 1981 to 1996 and comparing them against the extinct cohort estimates. The extinct cohort method was used to produce estimates for cohorts born up to 1906. Cohorts born up to 1902 were completely extinct by 31 December 2012 while cohorts 1903 to 1906 include a small degree of indirect estimation by “completing” the number of deaths. Given the tiny numbers involved, the possible extent of error is very small.
2.3.2. Interstate Migration Data
Before applying the nearly extinct cohort methods at a state scale, it was determined whether an explicit allowance for interstate migration was required. A proposed approach to incorporating adjustments for interstate migration into the extinct cohort and nearly extinct cohort methods is set out in Section 2.4.5. This makes use of ABS census migration data. For each current state of usual residence, the census provides counts of people living at a different address one and five years prior to the census. Interstate migration data were obtained from the 1981, 1986, 1991, 1996, 2001, 2006, and 2011 censuses. Data for interstate migration over five-year periods were used due to their lower volatility. For censuses up to 1996 data were available by single age up to the age of 98 and in aggregate for ages 99+ and for censuses from 2001 by single age to 99 and in aggregate for ages 100+. Respondents who did not provide their state of usual residence five years ago (“not stated”) were allocated proportionally between the numbers who did not move, moved from other states, or moved from overseas.
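The proportional allocation of “not stated” responses can be sketched as follows. This is only an illustration under simple assumptions: the function name, the category labels, and the counts are hypothetical, and the allocation is done for a single state and age.

```python
def allocate_not_stated(stated_counts, not_stated):
    """Spread 'not stated' responses across categories in proportion
    to the counts of respondents who did state a prior residence."""
    total_stated = sum(stated_counts.values())
    return {
        category: count + not_stated * count / total_stated
        for category, count in stated_counts.items()
    }


# Made-up census counts for one state and age, for illustration only.
stated = {"did_not_move": 900, "from_other_states": 60, "from_overseas": 40}
adjusted = allocate_not_stated(stated, not_stated=100)
print(adjusted)
```

The adjusted counts sum to the stated total plus the number of “not stated” responses, so no respondents are lost or double counted.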
Using these data, net interstate migration ratios ($nm_{x,t}$) by age over the previous 5 years were calculated for each census date as the number of people who moved in from other states ($IM_{t-i}$) minus the number of people who moved away to other states ($OM_{t-i}$) during this period, expressed as a percentage of the end-of-period census population aged $x$ ($CensusP_{x,t}$):
$$nm_{x,t} = \frac{\sum_{i=0}^{k-1} \left( IM_{t-i} - OM_{t-i} \right)}{CensusP_{x,t}}. \tag{4}$$
By expressing net migration from census data as a percentage of census population, it is assumed that any age misstatement inherent in census data occurs to a similar extent in migration counts and population counts. Given that both are derived from the same data source this is considered a reasonable assumption. It is acknowledged that the more common population balancing equation assumes movements across borders and that census data is not movement data but rather reflects transitions only between two points in time and the net effect of movements over a period of time [26]. However, in this particular instance, it is assumed that net migration from the movement and transition perspectives is approximately the same.
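A minimal sketch of the ratio in (4) is given below. The function name and the numbers are hypothetical. Because the census reports five-year transitions rather than annual moves, the in- and out-flow lists here hold a single five-year total each; annual flows could be passed in the same way if they were available.

```python
def net_migration_ratio(in_moves, out_moves, census_population):
    """Net interstate migration over the period divided by the
    end-of-period census population at the same age, following (4)."""
    net = sum(i - o for i, o in zip(in_moves, out_moves))
    return net / census_population


# Made-up five-year totals for one state and age.
ratio = net_migration_ratio(in_moves=[50], out_moves=[45], census_population=1000)
print(ratio)  # a small positive ratio: slightly more arrivals than departures
```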
Figure 1 shows the five-yearly net interstate migration ratios for each state for ages 85+ for females and males, respectively. From these graphs it is clear that net migration at these high ages is small and does not vary significantly between intercensal periods. Ratios do not vary appreciably by gender. For NSW, net migration ratios were negative indicating that moves out exceeded moves in. More people moved into Qld and Tas-NT-ACT than those who moved out, while net migration ratios varied between −0.5% and +0.5% for Vic, SA, and WA. An analysis of net five-yearly migration ratios by state and age for each intercensal period indicates that there are no clear patterns by age and time.
Figure 1: Five-yearly net interstate migration ratios for females and males at ages 85+ by state and census period.
2.4. Population Estimation Methods
2.4.1. Extinct Cohort Method
The extinct cohort method was used to calculate the “actual population” against which the accuracy of nearly extinct estimation methods was assessed. According to this method, a cohort’s population for any year and age is estimated by summing subsequent cohort deaths [6]. The population of cohort $c$, people born in the year $t-x$, aged $x$ last birthday on 31 December of year $t$ is thus
$$P_{x,t} = \sum_{i=1}^{\omega - x} D^{c}_{t+i}, \tag{5}$$
where $\omega$ is the age of extinction, estimated as the highest age at which there was expected to be only one survivor [16], and $D^{c}_{t+i}$ is the number of deaths in year $t+i$ from cohort $c$. This is illustrated by the Lexis diagram in Figure 2, which shows age along the vertical axis, time along the horizontal axis, and cohort along the diagonal axis. Death data were obtained by single year of age and calendar year and were allocated to cohorts using the triangle factors described in the Appendix.
Figure 2: The extinct cohort method of estimation. Source: based on Kannisto [5]. Note: numbers in shaded parallelograms are period-cohort deaths ($D^{c}_{t+i}$); the thick vertical line represents population ($P_{x,t}$) at time $t$, which is the sum of cohort deaths in later years.
When applying the extinct cohort and nearly extinct cohort estimation methods at a national level, an implicit assumption is made that deaths are the only source of population flows and therefore that international migration at these ages is negligible and can be ignored [8]. However, interstate migration may not be negligible so that when applying these methods at a state level, explicit allowance may need to be made for movements into and out of the states. A method for allowing for interstate migration is set out in Section 2.4.5.
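The summation in (5) can be illustrated with a short Python sketch. The function name and the death counts are hypothetical; the input is assumed to be the period-cohort deaths of one fully extinct cohort, already allocated to the cohort, with the first entry being deaths in year $t+1$. Migration is ignored here.

```python
def extinct_cohort_populations(cohort_deaths):
    """Return the cohort's population at 31 December of years t, t+1, ...
    where cohort_deaths[0] is the number of deaths in year t+1.
    Each population is the sum of all later cohort deaths, following (5)."""
    populations = []
    remaining = sum(cohort_deaths)
    for deaths in cohort_deaths:
        populations.append(remaining)
        remaining -= deaths
    return populations


# Made-up deaths for a cohort that dies out after four years.
pops = extinct_cohort_populations([40, 30, 20, 10])
print(pops)  # population at end of year t, t+1, t+2, t+3
```

Working backwards from extinction in this way means the estimate at each date relies only on death registrations, not on census counts.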
2.4.2. Survivor Ratio Method
Developed by Dépoid [21], the survivor ratio (SR) method is an extension of the extinct cohort method to estimate populations of nearly extinct cohorts. The size of a nearly extinct cohort at a certain age is estimated from the survival of older cohorts to the same age. A survivor ratio, defined as the ratio of a cohort’s population at the calculation date to its size $k$ years ago, may be expressed as
$$R^{c}_{x} = \frac{P_{x,t}}{P_{x-k,t-k}}. \tag{6}$$
In line with the extinct cohort method, the number of survivors from a particular cohort $k$ years earlier can be written as
$$P_{x-k,t-k} = P_{x,t} + \sum_{i=0}^{k-1} D^{c}_{t-i} \tag{7}$$
so that the survivor ratio for this cohort over $k$ years can be expressed as
$$R^{c}_{x} = \frac{P_{x,t}}{P_{x,t} + \sum_{i=0}^{k-1} D^{c}_{t-i}}. \tag{8}$$
The estimated population aged $x$ at 31 December of year $t$ can be obtained by solving for $P_{x,t}$:
$$P_{x,t} = \frac{R^{c}_{x}}{1 - R^{c}_{x}} \sum_{i=0}^{k-1} D^{c}_{t-i}. \tag{9}$$
The cohort’s population in earlier years and at younger ages may then be obtained by summing deaths as in the extinct cohort method. To smooth out random variations, the estimated survivor ratio in (6) is usually based on the average experience of $m$ older cohorts:
$$R_{x} = \frac{\sum_{j=1}^{m} P_{x,t-j}}{\sum_{j=1}^{m} P_{x-k,t-k-j}}. \tag{10}$$
The SR method is denoted by SR ($k$, $m$, constraint), where $k$ refers to the age range over which the survivor ratio is measured and $m$ refers to the number of older cohorts over which it is averaged. Thatcher et al. [16] and Wilmoth et al. [12] indirectly allowed for mortality improvements by increasing the estimated populations produced by a factor such that the total estimated population at ages 90+ equals the total official estimates at 90+ at the calculation date. For the purpose of this study, “constraint” indicates whether a constraint was applied and, if so, the age range over which the ERPs are aggregated. It is either “NC,” “85+,” or “90+,” referring to no constraint, constraining to ERPs for ages 85+, or constraining to ERPs for ages 90+.
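The following Python sketch works through (9) and (10) and the constraining step for a single cohort. It is a simplified illustration, not the full procedure used in the study: function names and all numbers are hypothetical, the older-cohort populations are taken as given, and the constraint is applied as a single scaling factor across the ages in the constrained range.

```python
def pooled_survivor_ratio(pops_at_age_x, pops_k_years_earlier):
    """Survivor ratio pooled over m older cohorts, following (10)."""
    return sum(pops_at_age_x) / sum(pops_k_years_earlier)


def sr_population(ratio, cohort_deaths_last_k_years):
    """Population of a nearly extinct cohort at the calculation date,
    following (9): P = R / (1 - R) * (cohort deaths over the last k years)."""
    return ratio / (1.0 - ratio) * sum(cohort_deaths_last_k_years)


def constrain_to_total(estimates, official_total):
    """Scale single-age estimates so they sum to an official total
    (for example the ERP for ages 90+)."""
    factor = official_total / sum(estimates)
    return [e * factor for e in estimates]


# Made-up populations of m = 5 older cohorts at age x and k = 5 years earlier.
ratio = pooled_survivor_ratio([10, 12, 8, 10, 10], [40, 48, 32, 40, 40])

# Made-up deaths of the target cohort over the last 5 years.
population = sr_population(ratio, [8, 7, 6, 5, 4])
print(ratio, population)

# Made-up single-age estimates scaled to a made-up official total.
print(constrain_to_total([10.0, 6.0, 4.0], official_total=22.0))
```

In this example the estimated population plus the 30 deaths gives the cohort's size five years earlier, and the ratio of the two reproduces the pooled survivor ratio, which is the consistency that (7) and (8) express.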
2.4.3. Das Gupta’s Method
Another method used to estimate survivors of nearly extinct cohorts was developed by Das Gupta [22]. Das Gupta’s method estimates future cohort deaths for nearly extinct cohorts based on death ratios, rather than estimating the proportion alive based on survivor ratios [27]. A death ratio is the number of deaths experienced by a cohort at a particular age relative to deaths at the previous age and is typically averaged over a number of older cohorts, as illustrated in Figure 3. The death ratio at age $x$, based on observed deaths among $m$ older cohorts, is calculated as
$$dr_{x} = \frac{\sum_{i=1}^{m} D^{c-i}_{t-i}}{\sum_{i=1}^{m} D^{c-i}_{t-i-1}}. \tag{11}$$
The numbers of deaths from a cohort at future dates and at higher ages are then derived by applying these death ratios successively:
$$D^{c}_{t} = dr_{x} D^{c}_{t-1}. \tag{12}$$
Cohort population estimates are then derived in the same way as the extinct cohort method by summing cohort deaths (5). Terblanche and Wilson [18] showed that the DG method produces the same results as the SR method when $k=1$. For consistency, the DG method will be denoted by DG (1, $m$, constraint), where $m$ refers to the number of older cohorts over which the death ratio is averaged and “constraint” indicates whether a constraint was applied and, if so, the age range over which the ERPs are aggregated.
Figure 3: Das Gupta’s death ratios. Source: Andreev [17]. Note: numbers in shaded parallelograms are period-cohort deaths. A death ratio for a single cohort is calculated as the ratio of deaths in year $t$ to deaths in year $t-1$. Shown above are deaths for three cohorts ($c-3$ to $c-1$) used to estimate deaths for cohort $c$.
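The chaining of death ratios in (11) and (12) can be sketched as below. The function names and numbers are hypothetical; in practice a separate death ratio would be estimated from older cohorts for each future age, whereas here the later ratios are simply supplied as made-up values.

```python
def death_ratio(deaths_at_age, deaths_at_previous_age):
    """Death ratio pooled over m older cohorts, following (11)."""
    return sum(deaths_at_age) / sum(deaths_at_previous_age)


def project_cohort_deaths(last_observed_deaths, death_ratios):
    """Apply death ratios successively, following (12), to project the
    cohort's deaths in each future year until extinction."""
    projected = []
    deaths = last_observed_deaths
    for ratio in death_ratios:
        deaths = ratio * deaths
        projected.append(deaths)
    return projected


# Made-up deaths of m = 5 older cohorts at age x and at age x - 1.
dr = death_ratio([8, 8, 8, 8, 8], [10, 10, 10, 10, 10])

# Project future deaths from 100 observed deaths in the latest year,
# using dr followed by two further made-up ratios for higher ages.
future_deaths = project_cohort_deaths(100.0, [dr, 0.5, 0.25])

# As in the extinct cohort method, the current population is the sum
# of all (projected) future cohort deaths.
print(future_deaths, sum(future_deaths))
```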
2.4.4. Survivor Ratio Advanced Method
Mortality rates at high ages are declining in many countries [1, 5]. Applying survivor ratios or death ratios based on the observed deaths among older cohorts without adjustment may therefore result in an understatement of population estimates [17]. The decline in mortality rates can be explicitly taken into account by adjusting the survivor ratios or death ratios or implicitly by adjusting the resulting population estimates. Methods where explicit allowance is made for declining mortality include the mortality decline and Das Gupta Advanced methods, variants of the SR and DG methods, respectively [27]. Terblanche and Wilson [18] evaluated two variants of a new, simpler method for Australia and New Zealand termed the Survivor Ratio Advanced (SRA) method. The SRA method is a variation of the SR method whereby survivor ratios are adjusted to reflect average mortality decline as measured over a number of older cohorts before applying the SR method. One of the variants of the SRA method tested, using survivor ratios calculated over a five-year age range, was found to produce very accurate estimates and was also tested at a state level. The SRA method is described below.
According to (6), the survivor ratio at a particular age for cohort $c$ is calculated as
$$R^{c}_{x} = \frac{P_{x,t}}{P_{x-k,t-k}}. \tag{13}$$
The average improvement in the survivor ratios between subsequent cohorts can be written as
$$r_{x} = \frac{\sum_{j=1}^{n-1} \left( R^{c-j}_{x} / R^{c-j-1}_{x} - 1 \right)}{n-1}, \tag{14}$$
where $n$ is the number of older cohorts for which changes in survivor ratios are measured. The survivor ratio based on $m$ older cohorts (10) is then increased with this average improvement as follows:
$$R'_{x} = \frac{\sum_{j=1}^{m} P_{x,t-j}}{\sum_{j=1}^{m} P_{x-k,t-k-j}} \times \left( 1 + r_{x} \right)^{\frac{1}{2}(m+1)}. \tag{15}$$
The method is denoted by SRA ($k$, $m$, NC, $n$), where $k$ and $m$ are as defined for the SR method. Due to the volatility of state survivor ratios at very high ages, combined improvement factors were calculated based on aggregate ratios for ages 100+. Where the adjusted survivor ratios exceeded 1 due to a high rate of improvement, further adjustment was required to avoid negative population estimates. For the purpose of this study, no survivor ratios were allowed to exceed 0.9 as this avoided unrealistic numbers.
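A minimal sketch of the adjustment in (14) and (15), including the 0.9 cap, is shown below. The function names and numbers are hypothetical. The list of cohort survivor ratios is assumed to be ordered from cohort $c-1$ to cohort $c-n$, and the pooling of improvement factors at ages 100+ is not shown.

```python
def average_improvement(cohort_ratios):
    """Average proportional change in survivor ratios between successive
    cohorts, following (14). cohort_ratios[0] belongs to cohort c-1 and
    cohort_ratios[-1] to the oldest cohort c-n."""
    n = len(cohort_ratios)
    changes = [
        cohort_ratios[j] / cohort_ratios[j + 1] - 1.0 for j in range(n - 1)
    ]
    return sum(changes) / (n - 1)


def adjusted_survivor_ratio(pooled_ratio, improvement, m, cap=0.9):
    """Pooled survivor ratio increased by the average improvement,
    following (15), and capped so that it cannot exceed 0.9."""
    adjusted = pooled_ratio * (1.0 + improvement) ** ((m + 1) / 2)
    return min(adjusted, cap)


# Made-up survivor ratios for three older cohorts (youngest first),
# each 10% higher than the ratio of the next older cohort.
r_x = average_improvement([0.242, 0.22, 0.2])

# Adjust a made-up pooled ratio based on m = 5 older cohorts.
print(r_x, adjusted_survivor_ratio(0.2, r_x, m=5))
```

The exponent $(m+1)/2$ reflects that a ratio pooled over $m$ older cohorts is centred roughly $(m+1)/2$ cohorts before the cohort being estimated.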
2.4.5. Method to Allow for Interstate Migration
The method set out below was used to adjust the extinct cohort and nearly extinct cohort population estimates to allow for interstate migration. The population for a particular cohort in a state can be estimated from the population at an earlier date by subtracting cohort deaths and out-migrants and adding in-migrants in the interim period:
$$P_{x,t} = P_{x-k,t-k} - \sum_{i=0}^{k-1} D^{c}_{t-i} + \sum_{i=0}^{k-1} \left( IM^{c}_{t-i} - OM^{c}_{t-i} \right), \tag{16}$$
where $P_{x,t}$ is the number of people aged $x$ last birthday on 31 December of year $t$, $D^{c}_{t}$ is the number of cohort deaths during year $t$, and $IM^{c}_{t}$ and $OM^{c}_{t}$ represent interstate in- and out-migration during year $t$.
As shown in Section 2.3.2 (4), five-yearly net interstate migration ratios were calculated from census data as the number of people who moved in from other states ($IM_{t-i}$) minus the number of people who moved away to other states ($OM_{t-i}$) expressed as a percentage of the end-of-period census population. Using the calculated net migration ratios, net interstate migration is estimated from estimated populations aged $x$ at time $t$ as
$$nm_{x,t} P_{x,t} = \sum_{i=0}^{k-1} \left( IM_{t-i} - OM_{t-i} \right). \tag{17}$$
Combining (16) and (17) and rearranging the terms give
$$P_{x,t} - nm_{x,t} P_{x,t} = P_{x-k,t-k} - \sum_{i=0}^{k-1} D^{c}_{t-i}. \tag{18}$$
Estimating the population aged $x$ at time $t$ by way of the SR method (9), (18) can be rewritten as follows:
$$P_{x,t} - nm_{x,t} P_{x,t} = \frac{R_{x}}{1 - R_{x}} \sum_{i=0}^{k-1} D^{c}_{t-i}. \tag{19}$$
Thus the state population aged $x$ at time $t$, allowing for internal migration, is
$$P_{x,t} = \frac{\dfrac{R_{x}}{1 - R_{x}} \sum_{i=0}^{k-1} D^{c}_{t-i}}{1 - nm_{x,t}}. \tag{20}$$
Therefore, before applying the estimation methods to state-level data, deaths in each period-cohort space were adjusted as follows:
$$D'^{c}_{t} = \frac{D^{c}_{t}}{1 - nm'_{x,t}}, \tag{21}$$
where $nm'_{x,t}$ is the net migration ratio over a single year, derived from the $k$-yearly factors:
$$nm'_{x,t} = \left( 1 + nm_{x,t} \right)^{1/k} - 1. \tag{22}$$
The five-yearly net migration ratios described in Section 2.3.2 were converted to annual ratios (22) and used to adjust numbers of deaths before applying the estimation methods described earlier in this section. The results of testing different estimation methods presented in Section 3 incorporate adjustments for net interstate migration. Net migration has been included in both the estimates tested and the extinct cohort estimates. The impact of this allowance is discussed in Section 3.2.1.
Before applying net migration ratios, checks were performed to ensure that net migration across states summed to zero. Furthermore, after splitting state-level and national level deaths into age-period-cohort triangles and adjusting state deaths to allow for interstate migration, state data were proportionally adjusted to ensure that deaths in each age-period-cohort triangle, when added across states, added up to national level death counts. Only minor adjustments were required.
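The adjustment steps described in this section can be sketched in Python as follows. This is an illustration under simplifying assumptions: function names, state labels, and numbers are hypothetical, one death count per state stands in for a single age-period-cohort triangle, and the reconciliation to national deaths is shown as a single proportional scaling.

```python
def annual_ratio(k_year_ratio, k=5):
    """Convert a k-yearly net migration ratio to an annual ratio, following (22)."""
    return (1.0 + k_year_ratio) ** (1.0 / k) - 1.0


def adjust_deaths(deaths, annual_nm):
    """Adjust period-cohort deaths for net interstate migration, following (21)."""
    return deaths / (1.0 - annual_nm)


def rescale_to_national(state_deaths, national_deaths):
    """Proportionally rescale state deaths so that they sum to the
    national death count for the same age-period-cohort triangle."""
    factor = national_deaths / sum(state_deaths.values())
    return {state: d * factor for state, d in state_deaths.items()}


# Made-up five-yearly ratio of +5% converted to an annual ratio.
nm_annual = annual_ratio(0.05, k=5)
print(nm_annual)

# Made-up deaths for two states with opposite net migration.
adjusted = {
    "A": adjust_deaths(100.0, nm_annual),
    "B": adjust_deaths(100.0, -nm_annual),
}
reconciled = rescale_to_national(adjusted, national_deaths=200.0)
print(reconciled)
```

A state with positive net migration has its deaths scaled up slightly, and one with negative net migration has them scaled down, before the state figures are reconciled to the national total.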
2.4.6. Proportional Method
For the purpose of this study, a new approach, referred to as the proportional method, was also tested at state level. According to this method, variants of the SR method are applied at the national level, and resulting estimates at single ages are apportioned between the states based on the state-level proportions of ERPs in the 85+ age group relative to the national ERP for these ages. The proportional method is set out in algebraic form below. The ratio of state-level ERPs at ages 85+ relative to the national level ERP at ages 85+ in year $t$ is
$$prop^{S}_{t} = \frac{\sum_{x=85}^{100+} ERP^{S}_{x,t}}{\sum_{S} \sum_{x=85}^{100+} ERP^{S}_{x,t}}, \tag{23}$$
where $ERP^{S}_{x,t}$ is the ERP at 31 December of year $t$ for age $x$ last birthday and state $S$ and the summation across states equals the total ERP at national level. Following the estimation of national level estimates using a variant of the SR method, this ratio is applied to the estimates at single ages in year $t$, to derive state-level estimates at single ages in year $t$:
$$P^{S}_{x,t} = prop^{S}_{t} P^{nat}_{x,t}, \tag{24}$$
where $P^{nat}_{x,t}$ is the population estimate for age $x$ at 31 December of year $t$ at national level and $P^{S}_{x,t}$ is the corresponding estimate at state level. For the purpose of this study, the variants SR (5, 5, 85+), SR (5, 5, 90+), or SR (5, 5, NC) were applied at national level and corresponding state-level proportional variants were denoted by Prop SR 85+, Prop SR 90+, and Prop SR NC, respectively.
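The apportionment in (23) and (24) can be illustrated with the following sketch. Function names and all figures are hypothetical; the ERP inputs are assumed to be already aggregated over ages 85+ for each state, and the national single-age estimates are assumed to come from an SR variant applied beforehand.

```python
def state_shares(state_erp_85_plus):
    """Each state's share of the national ERP at ages 85+, following (23)."""
    national_total = sum(state_erp_85_plus.values())
    return {state: erp / national_total for state, erp in state_erp_85_plus.items()}


def apportion(national_by_age, shares):
    """Apply the same state share to every single-age national estimate,
    following (24)."""
    return {
        state: {age: share * pop for age, pop in national_by_age.items()}
        for state, share in shares.items()
    }


# Made-up ERPs at ages 85+ and made-up national single-age estimates.
erp = {"NSW": 350.0, "Vic": 270.0, "Other": 380.0}
national = {85: 2000.0, 86: 1800.0}

shares = state_shares(erp)
state_estimates = apportion(national, shares)
print(state_estimates["NSW"])
```

Because a single share per state is applied at every age, the state estimates inherit the national age profile, and they sum back to the national estimate at each age.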
3. Accuracy of Population Estimation Methods
In this section we present the results of the evaluation of alternative nearly extinct cohort population estimation methods at the state scale. Section 3.1 sets out the errors and relative rankings of the different methods tested for each state averaged over the period 1981 to 1996. Section 3.2 describes how errors for the different methods for each state changed over the study period.
It is expected that estimation methods will be less effective for smaller populations due to greater volatility in the numbers of deaths. For this reason, errors are generally higher for males than females and for smaller states and also tend to increase with age. Before discussing the results, it is informative to consider the relative sizes of male and female very elderly populations in the different states. At the end of 1996, the last year in the study period, 35% of the very elderly population was in NSW, 27% in Vic, 16% in Qld, 10% in SA, 9% in WA, and 4% in Tas-NT-ACT. These proportions were similar for males and females. Female very elderly populations varied from 51,000 (NSW) to 5,300 (Tas-NT-ACT) and males from 21,000 (NSW) to 2,300 (Tas-NT-ACT).
3.1. Best Method on Average over the Period 1981–1996 by State
Females. Figure 4 shows WMAPEs for each method tested for females aged 90–94 and 95–99 by state. These are average WMAPEs over the period 1981 to 1996. Consistent with the findings of Terblanche and Wilson [18] at a national level, unconstrained variants and variants of the DG method produced the least accurate estimates. More accurate estimates were obtained when constraining results for either the SR or proportional methods to 90+ or 85+ ERPs than explicitly allowing for survivor ratio improvement as with the SRA (5, 5, NC, 10) method.
Figure 4: Weighted Mean Absolute Percentage Errors for different methods tested for females at ages 90–94 and 95–99, by state.
At ages 90–94 the ABS ERPs proved more accurate than nearly extinct cohort estimates for all states, with WMAPEs varying from 1.9% (Vic) to 3.8% (WA). At ages 95–99 ABS ERPs were less accurate with WMAPEs of between 7.6% (NSW) and 15.0% (Tas-NT-ACT). The accuracy of ERPs deteriorated more with increasing age compared to the nearly extinct cohort estimates so that their relative ranking dropped. Across the states at ages 95–99 errors for the most accurate nearly extinct cohort estimates were on average 2.8% lower than for ERPs.
While no single method consistently performed best for all the states, at ages 90–94 the SR variants and proportional methods constrained to 90+ and 85+ ERPs were consistently among the top four methods for all states. This was also true at ages 95–99 with the exception of WA. In contrast with the other states, the proportional variants performed relatively poorly for WA, and the top four methods were the three SR variants and the SRA method. The relative ranking of the SR variants and proportional methods constrained to 90+ and 85+ ERPs varied between the states but their accuracy generally differed only marginally. At ages 90–94 the errors produced by the best four methods were generally below 5% for all states except Tas-NT-ACT. At ages 95–99 the average errors were below 5% for NSW and Qld; between 5% and 10% for Vic, SA, and WA; and between 10% and 15% for Tas-NT-ACT. On average for ages 90–99, the variant SR (5, 5, 90+) was the most accurate for NSW and Vic. In line with national level findings by Terblanche and Wilson [18], this was also the best performing method across the states with an average error of 3.8%, followed by Prop SR 90+ with 4.0%, and Prop SR 85+ with 4.2%.
The accuracy of ABS ERPs for ages 100+ was generally fairly low, as can be seen in Table 2, which shows MAPEs for each method and state for females aged 100+. MAPEs for ERPs varied between 14.3% (SA) and 27.8% (WA), compared to MAPEs produced by the best method in each state of between 6.2% (NSW) and 22.5% (WA). On average over the study period, the Prop SR NC method produced the most accurate estimates for females in aggregate at ages 100+ for NSW, Vic, and Qld. The Prop SR 90+ and Prop SR 85+ methods produced marginally higher MAPEs for these states and the most accurate estimates for SA and Tas-NT-ACT. For WA, the best performing methods were the unconstrained SR and proportional methods and SR (5, 5, 90+).
Table 2: Mean Absolute Percentage Error for females at ages 100+, by state.

| Method | NSW | Vic | Qld | SA | WA | Tas-NT-ACT |
|---|---|---|---|---|---|---|
| Prop SR NC | **6.2%** | **10.1%** | **5.7%** | 8.7% | 23.4% | 18.3% |
| Prop SR 90+ | 6.8% | 11.0% | 8.0% | **7.5%** | 27.3% | 14.0% |
| SR (5, 5, NC) | 6.8% | 14.7% | 9.3% | 14.6% | **22.5%** | 39.9% |
| Prop SR 85+ | 6.9% | 11.5% | 8.6% | 7.8% | 27.6% | **13.9%** |
| SR (5, 5, 90+) | 8.6% | 12.7% | 13.5% | 15.2% | 23.6% | 34.1% |
| SR (5, 5, 85+) | 9.4% | 13.0% | 14.6% | 16.5% | 24.0% | 36.1% |
| SRA (5, 5, NC, 10) | 13.0% | 17.2% | 27.5% | 33.2% | 29.8% | 60.3% |
| DG (1, 5, 90+) | 13.2% | 14.8% | 27.4% | 28.1% | 41.5% | 55.4% |
| DG (1, 5, 85+) | 14.9% | 16.6% | 30.6% | 30.1% | 43.3% | 58.8% |
| DG (1, 5, NC) | 15.1% | 17.7% | 33.0% | 28.2% | 43.5% | 62.6% |
| ABS ERP | *16.5%* | *16.5%* | *18.6%* | *14.3%* | *27.8%* | *27.2%* |

Note: MAPEs for the most accurate method for each state have been highlighted in bold and MAPEs for ABS ERPs are highlighted in italics.
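The error measures reported in these tables can be illustrated with a short sketch. It assumes the conventional definitions: a percentage error is the signed difference between an estimate and its benchmark relative to the benchmark, MAPE averages the absolute percentage errors, and WMAPE weights each absolute percentage error by the benchmark population. The exact definitions used in this study are given in its methods section; the function names and numbers below are hypothetical.

```python
def percentage_error(estimate, actual):
    """Signed percentage error of an estimate against its benchmark value."""
    return 100.0 * (estimate - actual) / actual


def mape(estimates, actuals):
    """Mean Absolute Percentage Error over paired observations (e.g. years)."""
    errors = [abs(percentage_error(e, a)) for e, a in zip(estimates, actuals)]
    return sum(errors) / len(errors)


def wmape(estimates, actuals):
    """Absolute percentage errors weighted by the benchmark population.

    Weighting each |PE| by its benchmark value reduces to
    sum(|estimate - actual|) / sum(actual).
    """
    total_abs_error = sum(abs(e - a) for e, a in zip(estimates, actuals))
    return 100.0 * total_abs_error / sum(actuals)


# Hypothetical benchmark and estimated populations (not data from the study).
benchmark = [100, 120, 150]
estimated = [110, 114, 150]

print(round(mape(estimated, benchmark), 1))
print(round(wmape(estimated, benchmark), 1))
```

Under these assumed definitions, WMAPE gives less influence than MAPE to cells with very small populations, where percentage errors tend to be volatile.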
Males. Figure 5 shows WMAPEs for each method for males aged 90–94 and 95–99 by state. Consistent with the results for females, the SR variants performed better than the DG variants and the variants constrained to 85+ or 90+ ERPs performed better than unconstrained variants. Constraining to ERPs also generally produced more accurate estimates than explicitly allowing for survival improvement. Similar to females, ABS ERPs for males at ages 90–94 were the most accurate estimates in NSW, Vic, SA, and WA and the second most accurate in Qld and Tas-NT-ACT. WMAPEs for ERPs varied from 4.6% (NSW) to 7.6% (Tas-NT-ACT), higher than for females. However, ERPs for ages 95–99 were significantly less accurate in both absolute terms and relative to the nearly extinct cohort estimates, with WMAPEs varying between 19.4% for SA and 32.9% for Tas-NT-ACT.
Figure 5: Weighted Mean Absolute Percentage Errors for different methods tested for males at ages 90–94 and 95–99, by state.
As was the case for females, no single method consistently produced the lowest errors for all states for all age groups. However, at ages 90–94 the SR (5, 5, 85+) and the Prop SR 85+ methods were generally the best performers. At ages 95–99 the Prop SR NC method was the most accurate for NSW, Vic, WA, and Tas-NT-ACT and the Prop SR 85+ method was among the top three in Vic, Qld, SA, and Tas-NT-ACT. On average for ages 90–99 the Prop SR 85+ variant was the most accurate for NSW, Vic, Qld, and Tas-NT-ACT. This was also the best performing method across all states with an average error of 6.9%, followed by SR (5, 5, 85+) with 7.3%.
As shown in Table 3, MAPEs for ERPs for males aged 100+ were high, varying from 27.4% (SA) to 71.4% (WA). For NSW and Qld ERPs for ages 100+ were less accurate than all the nearly extinct cohort estimates with MAPEs of 53.2% and 47.3%, respectively. The Prop SR NC method produced the most accurate estimates in aggregate at ages 100+ for NSW, Vic, Qld, WA, and Tas-NT-ACT. Variants of the proportional methods generally produced the most accurate estimates for males at single ages 100 and above. These errors are generally higher than for females, probably as a result of the smaller numbers and greater volatility in death numbers.
Table 3: Mean Absolute Percentage Error for males at ages 100+, by state.

| Method | NSW | Vic | Qld | SA | WA | Tas-NT-ACT |
|---|---|---|---|---|---|---|
| Prop SR NC | **15.4%** | **11.1%** | **14.6%** | 20.1% | **13.9%** | **31.7%** |
| SR (5, 5, NC) | 16.0% | 24.1% | 22.6% | 30.3% | 53.1% | 44.9% |
| SR (5, 5, 85+) | 19.3% | 27.7% | 25.4% | 33.0% | 55.1% | 45.4% |
| SR (5, 5, 90+) | 20.4% | 27.5% | 26.1% | 34.0% | 52.8% | 44.6% |
| SRA (5, 5, NC, 10) | 21.4% | 32.6% | 38.1% | 36.5% | 132.1% | 57.3% |
| Prop SR 85+ | 24.4% | 12.5% | 22.4% | 17.8% | 15.6% | 33.0% |
| Prop SR 90+ | 27.1% | 13.6% | 24.7% | **16.7%** | 16.4% | 32.1% |
| DG (1, 5, NC) | 32.9% | 33.7% | 34.8% | 89.1% | 59.5% | 76.9% |
| DG (1, 5, 85+) | 33.2% | 42.8% | 35.5% | 85.2% | 55.5% | 72.2% |
| DG (1, 5, 90+) | 34.9% | 43.3% | 35.7% | 80.9% | 50.9% | 74.4% |
| ABS ERP | *53.2%* | *39.7%* | *47.3%* | *27.4%* | *71.4%* | *55.3%* |

Note: MAPEs for the most accurate method for each state have been highlighted in bold and MAPEs for ABS ERPs are highlighted in italics.
To summarise, ERPs for all the states proved very accurate at ages 90–94 for both males and females, but their accuracy decreased rapidly with increasing age, and significantly more accurate estimates were obtained for ages 95+ using nearly extinct cohort estimation methods. Across the states, variants of both the SR and proportional methods constrained to either 85+ or 90+ ERPs produced reasonably accurate estimates for both sexes at ages 90–99. At ages 100+ the unconstrained proportional method performed best, closely followed by the constrained proportional variants.
3.2. Changes over the Period 1981–1996 by State
Females Aged 90–99. Figure 6 shows the WMAPEs in each year from 1981 to 1996 for females aged 90–94 and 95–99 in NSW, Vic, and Qld. Figure 7 shows the same information for SA, WA, and Tas-NT-ACT. Only WMAPEs for ERPs and the SR variants and proportional methods constrained to 85+ and 90+ ERPs are shown as these were the best performing methods on average over the period. Errors for the other methods were volatile and at times very high and are not shown.
Figure 6: Weighted Mean Absolute Percentage Errors for selected methods for females at ages 90–94 and 95–99 in NSW, Vic, and Qld.
Figure 7: Weighted Mean Absolute Percentage Errors for selected methods for females at ages 90–94 and 95–99 in SA, WA, and Tas-NT-ACT.
WMAPEs for ERPs for females aged 90–94 were consistently below 5% throughout the period 1981 to 1996 for all states. The nearly extinct cohort estimation methods also performed well, producing errors below 5% in most years for NSW, Vic, and Qld and below 10% for SA, WA, and Tas-NT-ACT. The exceptions were the proportional variants, which generated errors above 10% for WA from 1986 to 1988, and SR (5, 5, 85+), which produced errors in excess of 10% for Tas-NT-ACT from 1990 to 1994.
At ages 95–99 ERPs and nearly extinct cohort estimates were less accurate on average and errors were more volatile from year to year. WMAPEs for ERPs at ages 95–99 in the larger states of NSW, Vic, and Qld were more often higher than those from the SR and proportional variants, resulting in higher average errors. The proportional methods fared particularly well in Qld. In NSW, however, WMAPEs for the proportional methods showed an increasing trend, with WMAPEs for Prop SR 85+ rising from 1.5% in 1985 to 8.0% in 1995, whereas in Vic they showed a decreasing trend, from 14.9% in 1983 to 2.1% in 1994. In WA, the proportional methods produced significantly higher errors than ERPs from 1989 to 1994 and the SR variants constrained to either 85+ or 90+ ERPs were the most accurate.
Males Aged 90–99. Figure 8 shows the WMAPEs in each year from 1981 to 1996 for males in NSW, Vic, and Qld aged 90–94 and 95–99 for ERPs and the SR variants and proportional methods constrained to 85+ and 90+ ERPs. Figure 9 shows the same information for SA, WA, and Tas-NT-ACT. At ages 90–94 ERPs were generally the most accurate, with errors below 10% for all states, although from 1990 their relative ranking dropped in most states. During most of the last few years of the period the Prop SR 85+ and SR (5, 5, 85+) methods produced more accurate estimates than ERPs in all states.
Figure 8: Weighted Mean Absolute Percentage Errors for selected methods for males at ages 90–94 and 95–99 in NSW, Vic, and Qld.
Figure 9: Weighted Mean Absolute Percentage Errors for selected methods for males at ages 90–94 and 95–99 in SA, WA, and Tas-NT-ACT.
At ages 95–99, WMAPEs for ERPs were significantly higher and more volatile than for the other methods. The accuracy of ERPs for males at ages 95–99 also varied significantly more than for females and errors were much higher. Following a period of steady increase from around 1990 to 1992, WMAPEs for ERPs in most states reached their highest levels in 1995 or 1996, with 45.8% in NSW, 38.8% in Vic, 37.4% in Qld, 30.8% in SA, 55.2% in WA, and 38.2% in Tas-NT-ACT. Differences in errors between ERPs and the other methods were large while differences between SR and proportional variants were relatively small.
Ages 100+. Figure 10 shows Percentage Errors (PEs) for females and males aged 100+ in NSW and WA. Trends for Vic, Qld, and SA were similar to NSW and are not shown.
Figure 10: Percentage Errors for females and males at ages 100+ in NSW and WA.
It is clear from these graphs that ABS ERPs at ages 100+ exhibit more volatile error patterns compared to nearly extinct cohort estimates. ERPs for females were either significantly underestimated or significantly overestimated, while for males they were mostly overstated and to a greater degree. Much more accurate estimates could be obtained using either the SR or the proportional variants.
3.2.1. Results of Not Allowing for Interstate Migration
The previous two sections discussed the results of population estimation methods explicitly incorporating interstate migration. This section compares some of the results of the tests where no allowance was made for interstate migration and population estimates are based on deaths only. Table 4 compares the MAPEs for a number of estimation methods for males at ages 100+ with and without adjustments for interstate migration.
Table 4: Mean Absolute Percentage Error for males at ages 100+, allowing for interstate migration (a), not allowing for interstate migration (b), and the difference (c).

(a) MAPE for methods based on deaths and interstate migration

| Method | NSW | Vic | Qld | SA | WA | Tas-NT-ACT |
|---|---|---|---|---|---|---|
| SR (5, 5, NC) | 16.0% | 24.1% | 22.6% | 30.3% | 53.1% | 44.9% |
| SR (5, 5, 90+) | 20.4% | 27.5% | 26.1% | 34.0% | 52.8% | 44.6% |
| SR (5, 5, 85+) | 19.3% | 27.7% | 25.4% | 33.0% | 55.1% | 45.4% |
| DG (1, 5, NC) | 32.9% | 33.7% | 34.8% | 89.1% | 59.5% | 76.9% |
| DG (1, 5, 90+) | 34.9% | 43.3% | 35.7% | 80.9% | 50.9% | 74.4% |
| DG (1, 5, 85+) | 33.2% | 42.8% | 35.5% | 85.2% | 55.5% | 72.2% |

(b) MAPE for methods based on deaths only

| Method | NSW | Vic | Qld | SA | WA | Tas-NT-ACT |
|---|---|---|---|---|---|---|
| SR (5, 5, NC) | 16.1% | 24.2% | 22.6% | 30.2% | 53.1% | 44.8% |
| SR (5, 5, 90+) | 20.4% | 27.6% | 26.2% | 33.9% | 52.3% | 44.4% |
| SR (5, 5, 85+) | 19.3% | 27.7% | 25.6% | 33.0% | 54.6% | 45.3% |
| DG (1, 5, NC) | 32.9% | 33.7% | 34.8% | 89.1% | 59.6% | 76.8% |
| DG (1, 5, 90+) | 34.8% | 43.3% | 34.5% | 80.7% | 50.8% | 74.1% |
| DG (1, 5, 85+) | 33.1% | 42.8% | 35.2% | 85.1% | 55.4% | 72.1% |

(c) (a) − (b): difference in errors with and without interstate migration

| Method | NSW | Vic | Qld | SA | WA | Tas-NT-ACT |
|---|---|---|---|---|---|---|
| SR (5, 5, NC) | −0.1% | −0.1% | 0.0% | 0.1% | 0.0% | 0.1% |
| SR (5, 5, 90+) | 0.0% | 0.0% | −0.1% | 0.1% | 0.5% | 0.2% |
| SR (5, 5, 85+) | 0.0% | −0.1% | −0.1% | 0.0% | 0.5% | 0.2% |
| DG (1, 5, NC) | −0.1% | 0.0% | 0.0% | 0.0% | −0.1% | 0.1% |
| DG (1, 5, 90+) | 0.0% | 0.0% | 1.2% | 0.2% | 0.1% | 0.3% |
| DG (1, 5, 85+) | 0.0% | 0.0% | 0.3% | 0.1% | 0.0% | 0.1% |
It is clear that the results of accuracy tests differ very little whether interstate migration is allowed for or not. As was clear from Figure 1 in Section 2.3.2 the net migration ratios are small. Although the small size of the errors is partly a function of both the extinct cohort and nearly extinct cohort estimates consistently including or excluding migration, the difference in population estimates at single ages with or without migration never exceeds 0.3% at any age for any state. This suggests that while it would be theoretically more accurate to allow for interstate migration, it is probably not worth the effort. This is consistent with the findings of Kannisto [19].
4. Summary and Conclusion
This paper has assessed the accuracy of a number of indirect estimation methods for creating population estimates at ages 85 and above at a state level in Australia. Nearly extinct cohort estimation methods have been evaluated at a national level for a number of European countries, Japan, and the US by Thatcher et al. [16] and Andreev [17] and more recently for Australia and New Zealand by Terblanche and Wilson [18]. The Human Mortality Database contains very elderly population estimates for almost 40 countries at a national level based on the method found to be most accurate by Thatcher et al. [16], namely, the SR (5, 5, 90+) method. These estimation methods have not, however, been widely applied to estimate very elderly populations at a subnational level and we are not aware of any evaluations undertaken to check the accuracy of such methods at a subnational level. Accurate subnational level estimates are vital, though, as many budgeting and planning decisions to cater for the needs of the very elderly are made at this level.
This study has shown that the nearly extinct cohort methods can be successfully applied at a subnational level to produce reasonably accurate estimates by single years of age at the highest ages. These methods can be used to produce significantly more accurate estimates for ages 95 and above compared to official estimates. In addition, the availability of such estimates at single ages above 100 will facilitate calculation of more accurate and detailed mortality rates and better projections at these high ages. While we applied nearly extinct cohort methods at a state level in Australia, the methods and principles should be applicable in other contexts and for other subnational geographies, subject to the availability of sufficient volumes of data. The methods appear to become less effective for smaller populations due to the increased volatility of survivor and death ratios.
Consistent with Australian national level results [18], state-level estimates based on five-year age ranges, as with the SR variants, were found to be more accurate than those based on one-year age ranges, as is implicit in the DG method. The approach of adjusting results by constraining them to official 85+ or 90+ ERPs was also found to produce more accurate estimates than explicitly allowing for improvements in survival. It was furthermore found that the level of interstate migration at ages 85+ is small and allowing for this does not significantly improve the estimates.
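The core idea behind the SR variants can be illustrated with a minimal sketch of the survivor ratio approach in the general form described by Thatcher et al. [16]. It is not the exact specification tested in this study: it ignores migration, the constraint to official 85+ or 90+ ERPs, and any adjustment for survival improvement. The function names and all numbers are hypothetical.

```python
def survivor_ratio(older_now, older_k_years_ago):
    """Pooled survivor ratio over m older cohorts.

    older_now[i]         -- population of older cohort i when it was aged x
    older_k_years_ago[i] -- the same cohort k years earlier, aged x - k
    """
    return sum(older_now) / sum(older_k_years_ago)


def sr_estimate(cohort_deaths_last_k_years, ratio):
    """Current population of the target cohort from its recent deaths.

    With no migration, P_then = P_now + D, where D is the cohort's deaths
    over the last k years. If R = P_now / P_then, then P_now = D * R / (1 - R).
    """
    if not 0.0 <= ratio < 1.0:
        raise ValueError("survivor ratio must lie in [0, 1)")
    return cohort_deaths_last_k_years * ratio / (1.0 - ratio)


# Hypothetical inputs with k = 5 and m = 5 (not data from the study).
older_at_age_x = [40, 42, 38, 41, 39]             # five older cohorts at age x
older_at_age_x_minus_5 = [100, 105, 95, 102, 98]  # same cohorts five years earlier

ratio = survivor_ratio(older_at_age_x, older_at_age_x_minus_5)
estimate = sr_estimate(60, ratio)  # 60 deaths in the target cohort over five years
print(round(ratio, 3), round(estimate, 1))
```

Pooling over several older cohorts and a multi-year age range smooths the ratio, which is one way to read the finding that five-year variants outperform one-year variants in small state populations.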
State-level ERPs proved very accurate at ages 90–94 but their accuracy deteriorated more rapidly with increasing age than nearly extinct cohort methods. For both sexes at ages 95 and above the accuracy of ERPs varied significantly from year to year and more accurate estimates could be derived by applying nearly extinct cohort methods to state-level death counts or by apportioning estimates obtained by applying nearly extinct cohort methods at national level to states. While no single method proved to be the most accurate for all states and age ranges, reasonably accurate estimates for most states, at most ages 90 and above and for both sexes, could be produced by either the Prop SR 85+ or the SR (5, 5, 85+) methods. Terblanche and Wilson [18] recommended the use of the SR (5, 5, 85+) to create very elderly population estimates for both sexes for Australia. Similarly and based on the results of this study, we recommend the simpler Prop SR 85+ method be used to allocate the resulting national level age distributions to the state scale.
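One plausible reading of the recommended apportionment step is sketched below: the national single-year age distribution at ages 85+ is applied to a state's official 85+ total. This reading is an assumption made for illustration only; the precise definition of the Prop SR 85+ method is the one given in the methods section of this paper. The function name and all numbers are hypothetical.

```python
def apportion_to_state(national_by_age, state_total_85_plus):
    """Distribute a state's 85+ total across single ages.

    Uses the national age distribution as the allocation key, so the state
    estimates sum to the state total by construction.
    """
    national_total = sum(national_by_age.values())
    return {
        age: state_total_85_plus * count / national_total
        for age, count in national_by_age.items()
    }


# Hypothetical national estimates by single year of age (not study data).
national = {85: 500, 86: 300, 87: 200}
state = apportion_to_state(national, state_total_85_plus=100)
print(state)
```

In practice the calculation would be carried out separately for males and females, since the results above differ markedly by sex.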
In conclusion, it is acknowledged that the results of this study are dependent on the study period used and that consideration of a different period may yield different average results. However, the general ranking of methods and hence the conclusions drawn are not expected to be significantly different.
Appendix: Splitting of Death Counts by Age and Year into Age-Period-Cohort Triangles
Triangle factors applied by the HMD are obtained from regression equations based on multiple regression analysis of the data for three countries (Sweden, Japan, and France) [12]. In order to derive triangle factors for the purpose of this study, a new, simpler method was applied, described below.
The number of deaths at age $x$ in year $t$ from cohort $c$ can be expressed as the number of births in year $t-x$, multiplied by their survival to age $x$, their probability of death between exact age $x$ and exact age $x+1$, and the proportion of deaths occurring in the earlier triangle of the age-cohort parallelogram. For cohort $c$ and cohort $c-1$, respectively, this is expressed as follows:

$$D_{x,t}^{c} = B^{c}\,{}_{x}p_{0}^{c}\,q_{x}^{c}\,y, \qquad D_{x,t}^{c-1} = B^{c-1}\,{}_{x}p_{0}^{c-1}\,q_{x}^{c-1}\,(1-y), \tag{A.1}$$

where $D_{x,t}^{c}$ is the number of deaths from cohort $c$ at age $x$ in year $t$, cohort $c$ being those people born in year $t-x$; $B^{c}$ is the number of births in year $t-x$; ${}_{x}p_{0}^{c}$ is the probability of survival, or the survival ratio, from birth to age $x$ for cohort $c$; $q_{x}^{c}$ is the probability of death between exact age $x$ and exact age $x+1$ for cohort $c$; and $y$ is the proportion of a cohort's deaths at age $x$ occurring in the earlier triangle of the age-cohort parallelogram, so that $1-y$ is the proportion occurring in the later triangle. By assuming deaths are evenly distributed throughout the year between exact age $x$ and exact age $x+1$, $y$ is taken as 0.5.

If $z$ is the ratio of deaths in the lower triangle relative to those in the upper triangle, that is, the number of deaths at age $x$ in year $t$ from cohort $c$ relative to those from cohort $c-1$,

$$z = \frac{D_{x,t}^{c}}{D_{x,t}^{c-1}} = \frac{B^{c}}{B^{c-1}} \cdot \frac{{}_{x}p_{0}^{c}}{{}_{x}p_{0}^{c-1}} \cdot \frac{q_{x}^{c}}{q_{x}^{c-1}} \cdot \frac{y}{1-y}, \tag{A.2}$$

then the proportion of deaths in the lower triangle equals $z/(1+z)$.
This relationship was used to split national and state death counts by age and year into deaths by age, year, and cohort. Survival ratios and probabilities of death applied at both national and state level are those measured at a national level, while birth numbers are state-specific. Triangle factors varied by state and between males and females. National level survival ratios and probabilities of death up to 2009 were obtained from the HMD and are thus a function of the triangle factors applied in the HMD. Survival ratios and death probabilities for 2010–2012 were calculated based on the regression equations used in the HMD as set out in the HMD Methods Protocol [12].
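The split described by (A.1) and (A.2) can be sketched in a few lines. This is an illustration of the stated relationship only, with hypothetical function names and inputs; it is not the code used to produce the results in this paper.

```python
def lower_triangle_share(births_c, births_prev, surv_c, surv_prev,
                         q_c, q_prev, y=0.5):
    """Proportion of deaths at age x in year t belonging to cohort c.

    z is the ratio of cohort c deaths (lower triangle) to cohort c-1 deaths
    (upper triangle), as in (A.2); the share is then z / (1 + z).
    """
    z = (births_c / births_prev) * (surv_c / surv_prev) \
        * (q_c / q_prev) * (y / (1.0 - y))
    return z / (1.0 + z)


def split_deaths(total_deaths, share_lower):
    """Split deaths at age x in year t into (lower, upper) triangle counts."""
    lower = total_deaths * share_lower
    return lower, total_deaths - lower


# Hypothetical example: cohort c had 10% more births than cohort c-1, with
# identical survival and death probabilities (not data from the study).
share = lower_triangle_share(1100, 1000, 0.02, 0.02, 0.3, 0.3)
lower, upper = split_deaths(210, share)
print(round(share, 4), round(lower, 1), round(upper, 1))
```

With equal cohort sizes and equal survival and death probabilities, z equals 1 and the deaths are split evenly between the two triangles, which matches the assumption y = 0.5.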
Disclosure
This paper was completed while the first author was a Ph.D. student at the University of Queensland.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
The first author gratefully acknowledges receipt of a UQ Scholarship.
References
[1] Rau R., Soroko E., Jasilionis D., Vaupel J. W. Continued reductions in mortality at advanced ages.
[2] Australian Government.
[3] ABS.
[4] Coale A. J., Kisker E. E. Mortality crossovers: reality or bad data?
[5] Kannisto V.
[6] Kannisto V., Lauritsen J., Thatcher A. R., Vaupel J. W. Reductions in mortality at advanced ages: several decades of evidence from 27 countries.
[7] Preston S. H., Elo I. T., Stewart Q. Effects of age misreporting on mortality estimates at older ages.
[8] Thatcher A. R. Trends in numbers and mortality at high ages in England and Wales.
[9] Thatcher A. R. The demography of centenarians in England and Wales.
[10] Wilson T., Bell M. Australia's uncertain demographic future.
[11] Jdanov D. A., Jasilionis D., Soroko E. L., Rau R., Vaupel J. W. Beyond the Kannisto-Thatcher database on old age mortality: an assessment of data quality at advanced ages.
[12] Wilmoth J. R., Andreev K. F., Jdanov D. A., Glei D. A., Boe C., Bubenheim M., Philipov D., Shkolnikov V., Vachon P.
[13] Condran G. A., Himes C. L., Preston S. H. Old-age mortality patterns in low-mortality countries: an evaluation of population and death data at advanced ages, 1950 to the present.
[14] Bourbeau R., Lebel A. Mortality statistics for the oldest-old.
[15] Vincent P. La mortalité des vieillards.
[16] Thatcher R., Kannisto V., Andreev K. F. The survivor ratio method for estimating numbers at high ages.
[17] Andreev K. F. A method for estimating size of population aged 90 and over with application to the U.S. Census 2000 data.
[18] Terblanche W., Wilson T. An evaluation of nearly-extinct cohort methods for estimating the very elderly populations of Australia and New Zealand.
[19] Kannisto V.
[20] ONS.
[21] Dépoid F. La mortalité des grands vieillards.
[22] das Gupta P.
[23] Siegel J. S.
[24] Australian Bureau of Statistics (ABS).
[25] ABS.
[26] Rees P., Willekens F., Rogers A., Willekens F. J. Data and accounts.
[27] Andreev K. F.