Tag Archives: epidemiology

Life expectancy and transition in cause of death patterns

By: Karl Pettersson

Re-posted from: https://www.dusty-test.klpn.se/posts/2022-03-27-transition.html

Posted on 2022-03-27

This week, the Swedish statistical agency has published life tables for
Sweden 2021 (Statistics Sweden 2023). With the first waves of the COVID pandemic, life
expectancy at birth decreased from 84.73 years for females and 81.34
years for males in 2019 to 84.29/80.60 years in 2020. For 2021, the
numbers had recovered to 84.82/81.21 years. This reflects, of course, the
decreased COVID mortality due to vaccination. Moreover, the flu A(H3N2)
wave, which peaked around Christmas with rather high rates of illness
among young people, did not cause substantial excess mortality, which
may, in part, be due to people with respiratory symptoms having fewer
contacts than usual with older people and other risk groups.

The increase in life expectancy in Sweden, and many other countries, up
until the mid-20th century was largely driven by decreasing childhood
mortality, which also caused changes in the cause of death patterns,
with directly communicable diseases becoming less common relative to
age-related diseases, such as circulatory diseases and cancer. In
contrast, the continued increase in rich countries after that, which was
temporarily interrupted by the pandemic, is largely due to decreased
mortality at older ages.

Vishnevsky (2017) discusses the development in life expectancy and causes
of death after 1960 in Russia, compared to high-income countries, in
particular Western European countries. In the EU-15 countries,
age-standardised mortality rates from circulatory, external and
respiratory causes have decreased greatly since 1970, while cancer
mortality has decreased modestly. The proportion of deaths from
circulatory causes has also decreased (from nearly 50 percent to about
30 percent), while the proportion of deaths from cancer has increased
(from about 20 percent to about 30 percent). No such changes have
occurred in Russia, where life expectancy has not improved much since
the 1960s (although it has improved relative to the dramatic increases
in mortality during the 1990s).

From this, one might conclude that the increased life expectancy in rich
countries has largely been about decreased circulatory mortality.
However, Vishnevsky points out that focusing on standardised rates for
all ages hides a significant increase in life expectancy also for those
dying of non-circulatory causes. In Sweden, for example, the life
expectancy for people dying of cancer or other neoplasms increased 8.2
years for females and 7.6 years for males during the period 1960–2010.
The corresponding increase for circulatory diseases (where life
expectancy was higher than for cancer already in 1960) is 8.0/6.8 years.
It is clear that this reflects a marked decrease in cancer mortality at
young ages, a point similar to what has been made earlier by researchers
like Riggs (1994).
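
To make the measure concrete, here is a minimal sketch in Julia. It reads "life expectancy for people dying of a cause" as the mean age at death among life-table deaths ascribed to that cause, which is an assumption about the measure and not taken from Vishnevsky's text. All numbers and names (midage, deaths, cancer, circ, meanage) are hypothetical.

```julia
# Hypothetical life-table deaths by age group and cause shares (not real data).
midage = [2.5, 30.0, 60.0, 85.0]   # assumed mean age at death within each age group
deaths = [0.01, 0.04, 0.25, 0.70]  # life-table deaths per age group (sum to 1)
cancer = [0.05, 0.20, 0.40, 0.20]  # share of deaths ascribed to cancer
circ   = [0.02, 0.10, 0.30, 0.50]  # share of deaths ascribed to circulatory causes

# Mean age at death among those dying of a cause:
# ages weighted by the cause-specific life-table deaths.
meanage(midage, deaths, share) =
    sum(midage .* deaths .* share) / sum(deaths .* share)

println("cancer:      ", round(meanage(midage, deaths, cancer), digits = 1))
println("circulatory: ", round(meanage(midage, deaths, circ), digits = 1))
```

If cancer deaths shift from younger to older age groups, this mean rises even when the overall proportion of cancer deaths stays the same.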

One factor not discussed by Vishnevsky is the impact of changing
practices in reporting causes of death over a long time. For example,
the increase in life expectancy has been particularly strong for the
residual category, other diseases, in Sweden, with 20.0 years for
females and 17.8 years for males. This category includes dementia, which
was a rare underlying cause of death in 1960. Back then, most people
with dementia probably had circulatory or respiratory causes reported
instead, and the other category was dominated by other causes, with a
much lower life expectancy.

In light of this, it may be interesting to compare in more detail the
correlation between general life expectancy and the proportion of deaths
ascribed to different causes in different countries. I made a Julia
package, MortIntl, which can be used to analyse such trends, based on
cause-specific mortality data from WHO (2025) and life tables from
University of California, Berkeley and Max Planck Institute for
Demographic Research (2022). It uses a configuration similar to my
earlier Mortchartgen, which I have used to generate Mortality Charts,
but extracts data directly from the data files using AWK instead of
relying on a SQL database.

Fig. 1 and Fig. 2 show female and male life expectancy at birth in
relation to proportion of deaths from circulatory causes (as defined for
Mortality Charts) for the Nordic and Baltic countries, with Iceland
excluded due to small population.1

Figure 1: Circulatory deaths vs. life expectancy, females, Nordic and Baltic countries.
Figure 2: Circulatory deaths vs. life expectancy, males, Nordic and Baltic countries.

The charts clearly show that improvements in life expectancy continued
for a long time among, for example, females in Finland and Sweden, after
circulatory causes became dominant, without any substantial change in
the proportion of deaths ascribed to these causes. That proportion
really started decreasing after the 1980s, when dementia became more
commonly reported (see Mortality Charts).

The Baltic countries, especially Estonia, have in recent years attained
a female life expectancy close to the Nordic countries, but the
proportion of circulatory deaths there is higher than it has been in the
Nordic countries at any point in time. In contrast, Denmark has had a
lower proportion of circulatory deaths than the other Nordic countries,
a pattern which has been more pronounced in recent decades. The
difference in circulatory deaths between Denmark and Estonia in recent
years, when both have had similar life expectancy among females, is
greater than the temporal variation, over nearly 70 years, in any of the
Nordic countries.

From this, it seems clear that great caution is warranted in
drawing any epidemiological conclusions from trends for officially
reported circulatory mortality over all ages.

References

Riggs, J. E. 1994. “The cohort mortality perspective: The emperor’s new clothes of epidemiology, an illustration using cancer mortality.” Regulatory Toxicology and Pharmacology 19 (2): 202–210. doi:10.1006/rtph.1994.1018.
Statistics Sweden. 2023. “Life table by sex and age.” https://www.statistikdatabasen.scb.se/goto/en/ssd/LivslangdEttariga.
University of California, Berkeley and Max Planck Institute for Demographic Research. 2022. “Human Mortality Database.” https://www.mortality.org.
Vishnevsky, Anatoly. 2017. “Mortality in Russia: The second epidemiological revolution that never was.” Demographic Review 2 (5): 4–33. doi:10.17323/demreview.v2i5.5581.
WHO. 2025. “WHO Mortality Database.” https://www.who.int/data/data-collection-tools/who-mortality-database.

  1. The charts can be generated by cloning the blog repository,
    installing MortIntl with the relevant data files, as described in
    the documentation, and running circall_e0_baltnord.jl in the
    subdirectory postdata/2022-03-27-transition.↩︎

A website with mortality charts built using Julia

By: Karl Pettersson

Re-posted from: https://www.dusty-test.klpn.se/posts/2017-03-01-mcsite.html

Posted on 2017-03-01

Since 2015, I have run a website with cause-specific mortality trends.
The idea is to have a static site, which gives fast and easy access to
information about international mortality trends, using open data
available from WHO (2025), which, for many countries, covers the time
period from 1950 up until recent times. The website is inspired by
Whitlock (2012), which contains comprehensible charts with mortality trends
based on these data, but has been unmaintained since 2013, when its
creator died. Other sites with international cause-specific mortality
trends I have seen tend to be slower, due to dynamic chart generation,
and to cover only shorter time periods.

My implementation of the site generator, which was written in Python and
R, had become rather messy, and the chart tools I used (matplotlib and
ggplot2) are not really suited for making interactive web charts. I
decided to rewrite the routines to generate the charts and the site
files in Julia (albeit with the help of some non-Julia tools, as
described below). These routines are now available as a GitHub repo, and
I use them to generate the site in both English and Swedish versions.

The site is built as follows with the Julia package (see the README in
the repo for instructions). The whole process is controlled with a JSON
configuration file. YAML, using some non-JSON features, might be less
cumbersome, and will perhaps be used once there is full YAML write
support implemented in Julia. Julia functions mentioned are in the main
Mortchartgen.jl file, if not otherwise stated.

  1. The WHO (2025) data files are downloaded and read into a MySQL
    database, using the functions in the Download.jl file.
  2. These data files contain cause of death codes from many different
    versions of the ICD classifications for different time periods and
    countries, and the codes are also often at a much more detailed
    level than I use in the charts. Therefore, the data on deaths is
    grouped using regular expressions defined in the configuration file.
    To avoid repeating this time-consuming regular expression matching,
    the resulting DataFrames can be saved in CSV files. There are still
    some issues with unsupported datatypes in the MySQL.jl package,
    which mean that grouping cannot be done at the SQL level and that
    prepared SQL statements cannot be used.
  3. The charts themselves are generated from the DataFrames created in
    step 2, using the Python Bokeh library, which is well-suited for
    interactive web visualizations. I call Bokeh directly using PyCall,
    instead of using the Bokeh.jl package, which is unmaintained. There
    is a batchplot function to generate all the charts for the site
    using the settings in the configuration file.
  4. The writeplotsite function generates the charts as well as HTML
    tables with links to the charts, a documentation file in Markdown
    format, and navigation menus for a given language, and copies these
    to a given output location. To generate the site files, except for
    the charts themselves, templates processed with Mustache.jl are
    used.
  5. The final generation of the site is done using Hakyll, a static
    site generator written in Haskell. In the output directory generated
    in step 4, there will be a Haskell source file, site.hs, which,
    provided that a Haskell compiler and the Hakyll libraries are
    installed, can be compiled to an executable file. This file can then
    be run as ./site build to generate the site, which can then be
    uploaded to a web server. The resulting site is static in the sense
    that it has no code running on the server side (but rendering the
    charts requires JavaScript on the client side).
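
The regular expression grouping in step 2 can be sketched as follows. This is only an illustration of the idea, not the code in Mortchartgen.jl: the group names, the ICD-10 patterns, the records and the functions causegroup and groupdeaths are all hypothetical, and the real patterns in the configuration file have to cover several ICD versions.

```julia
# Hypothetical cause groups with illustrative ICD-10 patterns.
groups = [
    "circulatory" => r"^I\d{2}",
    "neoplasms"   => r"^(C\d{2}|D[0-3]\d|D4[0-8])",
    "respiratory" => r"^J\d{2}",
]

# Return the name of the first group whose pattern matches the code,
# or "other" if no pattern matches.
function causegroup(code, groups)
    for (name, pattern) in groups
        occursin(pattern, code) && return name
    end
    return "other"
end

# Hypothetical records: (ICD-10 code, number of deaths).
records = [("I219", 120), ("C349", 80), ("J189", 30),
           ("I64", 60), ("F03", 40), ("D126", 1)]

# Sum the deaths per cause group.
function groupdeaths(records, groups)
    totals = Dict{String,Int}()
    for (code, deaths) in records
        g = causegroup(code, groups)
        totals[g] = get(totals, g, 0) + deaths
    end
    return totals
end

totals = groupdeaths(records, groups)
println(totals)
```

Because every detailed code has to be matched against every pattern, this step is slow for large data files, which is why caching the grouped result pays off.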

References

Whitlock, Gary. 2012. “Mortality Trends [archived 21 December 2014].” http://web.archive.org/web/20141221203103/http://www.mortality-trends.org/.
WHO. 2025. “WHO Mortality Database.” https://www.who.int/data/data-collection-tools/who-mortality-database.

Calculating lifetime cancer risk in a population

By: Karl Pettersson

Re-posted from: https://www.dusty-test.klpn.se/posts/2016-11-06-secanc.html

Posted on 2016-11-06

It is common to hear statements such as “one in three persons will
develop cancer during their lifetime”, “one in nine women will develop
breast cancer” and so on. Most often, such statements are based on a
simple calculation of cumulative risk, i.e. age-specific incidence rates
for a given year and cancer diagnosis are summed up to a chosen maximum
age, e.g. 75 years, and the resulting cumulative incidence rate \(r\) is
then converted into a probability using the formula \(1-\exp(-r)\).
However, if lifetime cancer risk is interpreted as the proportion of
the population which will be diagnosed with cancer during their
lifetime, this method gives incorrect results, because it does not take
the following into account:

  1. Future changes in cancer rates.
  2. People who die before they reach the maximum age, due to causes
    unrelated to cancer.
  3. People who develop cancer at ages above the maximum age.
  4. People who are diagnosed with multiple primary cancers during their
    lifetime.
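
For reference, the simple cumulative risk calculation criticised above can be sketched like this. The rates are hypothetical, not real data, and the sketch assumes 5-year age groups, so that each rate is multiplied by the width of its age group before summing.

```julia
# Hypothetical age-specific cancer incidence rates (cases per person-year)
# for the 5-year age groups 0–4, 5–9, ..., 70–74. Not real data.
rates = [0.0002, 0.0001, 0.0001, 0.0002, 0.0003, 0.0005, 0.0008, 0.0012,
         0.0020, 0.0035, 0.0055, 0.0085, 0.0120, 0.0160, 0.0200]
width = 5  # years per age group

# Cumulative incidence rate up to age 75:
# each rate applies for `width` years.
r = sum(width .* rates)

# Convert the cumulative rate into a probability.
cumrisk = 1 - exp(-r)

println("cumulative rate: ", round(r, digits = 3))
println("cumulative risk to age 75: ", round(100 * cumrisk, digits = 1), " %")
```

Nothing in this calculation removes people who die of other causes before age 75, or counts cancers diagnosed after that age, which is what items 2 and 3 in the list refer to.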

The first problem will not be further discussed in this post, as dealing
with it obviously would require projections into the future. The other
problems can be assessed with a method described by Sasieni et al. (2011), which
they call AMP (adjusted for multiple primaries), and which only
requires routinely available data. Their idea is to build a life table
where it is possible to be eliminated from the population either by
being diagnosed with cancer or by dying from something other than
cancer. It is then possible to calculate the proportions eliminated in
these different ways. The AMP method hinges on the independence
assumption that primary cancer incidence and mortality from causes other
than cancer are the same among people who have had cancer as in the
general population, because these groups cannot normally be
differentiated in official statistics. Only the following data are
required:

  1. Age-specific population size, in order to calculate incidence and
    mortality rates.
  2. Age-specific number of cancer cases.
  3. Age-specific number of deaths due to all causes.
  4. Age-specific number of deaths due to cancer. Note that official
    statistics normally report so-called underlying causes of death,
    which means that this should include complications of cancer and
    cancer treatment (otherwise, the independence assumption given above
    would be violated).
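
Before turning to the package-based implementation, the life table idea can be sketched in plain Julia. This is a simplified illustration under stated assumptions, not the LifeTable package's algorithm: it uses four coarse, hypothetical age groups, assumes a constant elimination rate within each interval, and the function name lifetime_risk and all counts are made up.

```julia
# Hypothetical counts for four coarse age groups (not real data).
# The last age group is open-ended.
width         = [25, 25, 25, Inf]                        # interval lengths in years
pop           = [100_000.0, 100_000.0, 80_000.0, 30_000.0]  # mean population
all_deaths    = [50.0, 300.0, 2_000.0, 4_000.0]          # deaths from all causes
cancer_deaths = [10.0, 100.0, 700.0, 800.0]              # deaths from cancer
cases         = [30.0, 400.0, 1_600.0, 1_000.0]          # cancer cases

function lifetime_risk(width, pop, all_deaths, cancer_deaths, cases)
    # Eliminations: non-cancer deaths plus cancer diagnoses.
    elim = all_deaths .- cancer_deaths .+ cases
    m = elim ./ pop  # elimination rate per person-year
    # Probability of elimination within each interval, assuming a constant
    # rate; everyone is eliminated in the open last interval.
    q = 1 .- exp.(-width .* m)
    cprop = cases ./ elim  # share of eliminations that are cancer diagnoses
    # Proportion not yet eliminated at the start of each interval.
    l = cumprod(vcat(1.0, 1 .- q[1:end-1]))
    d = l .* q  # proportion of the cohort eliminated in each interval
    # Risk of a cancer diagnosis from the start of each interval onward,
    # conditional on reaching that interval without being eliminated.
    return reverse(cumsum(reverse(d .* cprop))) ./ l
end

risk = lifetime_risk(width, pop, all_deaths, cancer_deaths, cases)
println("lifetime risk at birth: ", round(100 * risk[1], digits = 1), " %")
```

The first element is the lifetime risk at birth. In the open last interval, the risk reduces to the share of eliminations that are cancer diagnoses.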

Using my LifeTable package, the AMP method can be easily implemented in
Julia. I will give examples with calculations for Sweden 2014, using
data from Statistics Sweden (2016) for population size, National Board
of Health and Welfare (2015) for cancer cases and National Board of
Health and Welfare (2025) for deaths. The data are given in 5-year age
intervals from 0–4 to 80–84 years, with an open interval for ages 85
years and above. The files used in the example are available via a gist.
The Julia file contains the following code:

using LifeTable, DataFrames

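function AmpLt(inframe, sex, rate = "inc")
    age = inframe[1]   # age group
    pop = inframe[2]   # population size
    acd = inframe[3]   # deaths from all causes
    cd = inframe[4]    # deaths from cancer
    cc = inframe[5]    # cancer cases
    if rate == "inc"
        # Eliminated by a cancer diagnosis or by a non-cancer death
        ncol = cc
        dcol = acd .- cd .+ cc
    elseif rate == "mort"
        # Eliminated by death from any cause, with cancer as the cause studied
        ncol = cd
        dcol = acd
    else
        error("rate must be \"inc\" or \"mort\"")
    end
    df = DataFrame(age = age, pop = pop, dcol = dcol)
    cprop = ncol ./ dcol  # proportion of eliminations due to cancer
    lt = PeriodLifeTable(df, sex)
    return CauseLife(lt, cprop)
end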

Assuming the LifeTable package is installed and the files have been
downloaded, you can calculate tables with lifetime cancer risk for
Swedish females and males, at a given age:

include("amplt.jl")
fse14 = readtable("fse14.csv")
mse14 = readtable("mse14.csv")
ampfse14 = AmpLt(fse14, 2)
ampmse14 = AmpLt(mse14, 1)

The first row in the f column in a frame returned by AmpLt gives the
lifetime cancer risk at birth, which should be about 45.7 percent for
females and 49.3 percent for males. It is also possible to calculate
lifetime risk for cancer mortality, rather than incidence:

mampfse14 = AmpLt(fse14, 2, "mort")
mampmse14 = AmpLt(mse14, 1, "mort")

The first row in these frames should be about 22.3 and 26.2 percent for
females and males. With PyPlot, the frames can be plotted:

using PyPlot

plot(ampfse14[:age], ampfse14[:f], label = "incidence, females")
plot(ampfse14[:age], ampmse14[:f], label = "incidence, males")
plot(ampfse14[:age], mampfse14[:f], label = "mortality, females")
plot(ampfse14[:age], mampmse14[:f], label = "mortality, males")
title("Lifetime cancer risk Sweden 2014")
xlim(0, 85)
ylim(0, 0.5)
legend(loc=3)
grid(1)

Lifetime probability of cancer incidence and mortality for Swedish females and males 2014

As the chart shows, the probabilities tend to decrease with age,
especially after age 60, which is due to increasing competition from
other causes of death, e.g. circulatory disorders.

If cancer incidence and mortality are changed, this might also influence
mortality from some non-cancer causes. For example, decreased smoking
tends to decrease lung cancer incidence and mortality, as well as
mortality from nonmalignant respiratory diseases and atherosclerotic
diseases.1 One might ask how such risk factor changes would influence
lifetime cancer risk, which might decrease, remain unchanged, or even
increase, due to diminished competition. The following function
recalculates a frame with cancer cases, as well as cancer deaths and
non-cancer deaths, changed by the same factor for all age groups:

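function RateChange(inframe, changefac)
    # Scale cancer deaths and cancer cases by changefac
    ncd = inframe[4] .* changefac
    ncc = inframe[5] .* changefac
    # Scale non-cancer deaths by the same factor and add the new cancer deaths
    nacd = (inframe[3] .- inframe[4]) .* changefac .+ ncd
    return DataFrame(age = inframe[1], pop = inframe[2],
        acd = nacd, cd = ncd, cc = ncc)
end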

To calculate lifetime cancer risk for Swedish females with all three
rates reduced by one third, call AmpLt(RateChange(fse14, 2/3), 2). The
frame returned by this call gives a lifetime risk at birth of 37.5
percent. For males, the corresponding risk is 42.6 percent. For
mortality, the risks would be 19.2 and 24.0 percent for females and
males respectively. If age-specific cancer incidence and mortality and
non-cancer mortality are reduced by the same factor, in a society such
as Sweden, which already has high life expectancy, this tends to reduce
the lifetime risk of getting cancer or dying from it, because more
people will survive to higher ages where the probability of getting
cancer before succumbing to something else is lower (with greater
reductions, the risks at lower ages asymptotically approach the
corresponding risks at the highest age, i.e. 85 years in this example).

References

National Board of Health and Welfare. 2015. “Cancer.” http://www.socialstyrelsen.se/statistics/statisticaldatabase/cancer.
———. 2025. “Cause of death.” https://sdb.socialstyrelsen.se/if_dor/val_eng.aspx.
Sasieni, P. D., J. Shelton, N. Ormiston-Smith, C. S. Thomson and P. B. Silcocks. 2011. “What is the lifetime risk of developing cancer?: The effect of adjusting for multiple primaries.” British Journal of Cancer 105. doi:10.1038/bjc.2011.250.
Statistics Sweden. 2016. “Mean population by region, marital status, age and sex.” http://www.statistikdatabasen.scb.se/goto/en/ssd/MedelfolkHandelse.

  1. This shared risk factor can be expected to violate the
    independence assumption in the AMP method to some extent. However,
    as noted by Sasieni et al. (2011), these effects should not be serious when all
    cancers are studied, because there are few lung cancer survivors in
    the population.↩︎