Cancer diagnoses
Population size
To study geographical variation, the Dutch Cancer Atlas uses the number of new cancer diagnoses (incidence) over the period 2011-2022. Because some areas have more residents than others, simply comparing the absolute numbers of cancer diagnoses between areas is not informative: any geographical differences in such a comparison would mainly reflect differences in population size. The analyses therefore standardise the estimates for population size, so that the differences in diagnosis rates between areas in the atlas are not caused by differences in population size.
For reference, the statistics panel on the right displays the 'average number of new diagnoses per year' for the entire Netherlands over the period 2011 to 2022. In addition to the absolute number of new diagnoses per year, the number of new diagnoses per 100,000 persons per year is also shown. This is the so-called incidence rate (or diagnosis rate), which also takes into account the population size.
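As a minimal sketch of how an incidence rate per 100,000 persons per year relates to the absolute number of diagnoses, the Python snippet below divides the yearly number of new diagnoses by the population size. The function name and the numbers are made up for illustration; they are not atlas data.

```python
def incidence_rate_per_100k(new_diagnoses_per_year: float, population: float) -> float:
    """Return the number of new diagnoses per 100,000 persons per year."""
    if population <= 0:
        raise ValueError("population must be positive")
    return new_diagnoses_per_year / population * 100_000


# Made-up example: 1,200 new diagnoses per year among 400,000 residents.
rate = incidence_rate_per_100k(new_diagnoses_per_year=1_200, population=400_000)
print(round(rate, 1))  # 300.0
```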
Age
Cancer is most common in older individuals. If an area has a relatively higher proportion of older residents compared to another area, it would not be very informative to compare just the diagnosis rate between these areas. Differences between these areas would then largely be caused by differences in age distribution. In the statistical analyses, the estimates were standardised by age by taking age at the time of diagnosis into account in the modelling. Differences between areas in the atlas are therefore not caused by differences in age distribution between areas.
Outcome measure
For each area in the Dutch Cancer Atlas the Standardized Incidence Ratio (SIR) is estimated using Bayesian statistical modelling. The SIR is the ratio between the observed and expected number of diagnoses.
The observed number of cancer diagnoses is simply the total number of cancer cases in a particular area.
The expected number of diagnoses (during the studied period) is the number of diagnoses expected in the area if the risk of cancer is equal to the average risk in the Netherlands. This means the analysis has to take into account how many people live in a certain area and what their age distribution is. Areas with more people usually have more cancer diagnoses than areas with fewer people. Also, areas with relatively more older people tend to have more cancer diagnoses than areas with more young people. The estimates in the atlas should not reflect differences in age and population size.
To calculate the expected number of cases, first, the age-specific diagnosis rates of the Netherlands are calculated. Assuming no variation in diagnosis rates in the Netherlands, these age-specific diagnosis rates would be the same in every area. So, knowing how many residents a particular area has and how old they are, one can calculate how many cancer cases there would be if the age-specific diagnosis rates of the Netherlands applied to that area. This is done by first calculating the expected number of diagnoses per age category by multiplying the Dutch age-specific diagnosis rates by the number of people per age category in that particular area. The total number of expected cancer diagnoses in that area is then derived by summing the expected number of diagnoses per age category.
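The calculation described above can be sketched in a few lines of Python. The age categories, rates and population counts are hypothetical, and the rates are assumed to be expressed per 100,000 persons over the whole study period; the atlas itself may use different age categories.

```python
# Hypothetical national age-specific diagnosis rates (per 100,000 persons
# over the study period) and a hypothetical area population per age category.
national_rates_per_100k = {"0-49": 100.0, "50-69": 800.0, "70+": 2000.0}
area_population = {"0-49": 6_000, "50-69": 3_000, "70+": 1_000}


def expected_diagnoses(rates_per_100k, population_by_age):
    """Apply national age-specific rates to the age structure of one area."""
    expected_by_age = {
        age: rates_per_100k[age] / 100_000 * persons
        for age, persons in population_by_age.items()
    }
    # The total expected number is the sum over all age categories.
    return expected_by_age, sum(expected_by_age.values())


expected_by_age, expected_total = expected_diagnoses(
    national_rates_per_100k, area_population
)
for age, expected in expected_by_age.items():
    print(f"{age}: {expected:.1f} expected diagnoses")
print(f"total: {expected_total:.1f} expected diagnoses")
```

In this made-up area, the oldest age category contributes 20 of the 50 expected diagnoses although it holds only a tenth of the residents, which is why the age distribution has to be taken into account.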
Dividing the observed number of diagnoses by the expected number of diagnoses provides the SIR. For instance, if we observe 25 cancer cases in a particular area, but we expected, based on the age-specific diagnosis rates of the Netherlands, to observe 20 cancer cases in that area, the SIR would be 25/20 = 1.25. This means the diagnosis rate is 25% higher than the Dutch average.
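The same arithmetic, written as a small Python sketch with the numbers from the example above. The function name is chosen for illustration only, and the result is the plain (unsmoothed) ratio, not the modelled estimate shown in the atlas.

```python
def standardised_incidence_ratio(observed: float, expected: float) -> float:
    """Return the ratio of observed to expected diagnoses."""
    if expected <= 0:
        raise ValueError("expected number of diagnoses must be positive")
    return observed / expected


sir = standardised_incidence_ratio(observed=25, expected=20)
percent_difference = (sir - 1) * 100

print(sir)  # 1.25
print(f"{percent_difference:.0f}% above the Dutch average")
```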
However, in the atlas, the estimation of the SIR is a bit more complicated, because the modelling includes not only the expected and observed cases in an area but also information from neighbouring areas to ensure spatial smoothing. More explanation on the statistical modelling and smoothing can be found here.
What do the colours mean?
The range of colours in the atlas indicates whether the diagnosis rate is lower than (blue), equal to (yellow) or higher than (red) the Dutch average. The higher (or lower) the standardized incidence ratio (SIR), the darker the colour. But the colour is also affected by the certainty of the SIR. To illustrate, the diagnosis rate could be 15% higher or lower than the Dutch average while the area is still coloured yellow. This is done on purpose: if the estimated cancer rate differs by 15% but the probability that it truly differs from the Dutch average is low, the area will be coloured yellow. A more detailed explanation of certainty and colour use can be found here.
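The idea that the colour depends on both the size and the certainty of the SIR can be illustrated with a rough Python sketch. The function, the probability input and the cut-off of 0.8 are all assumptions made for this illustration; they are not the actual classification rules of the atlas, and the sketch ignores the colour shades.

```python
def colour_category(sir: float, prob_differs: float, min_probability: float = 0.8) -> str:
    """Pick a colour from an SIR and the probability that it differs from 1.

    min_probability is a hypothetical cut-off, not the atlas's real rule.
    """
    if prob_differs < min_probability:
        # Too uncertain to call the area different from the Dutch average.
        return "yellow"
    if sir < 1:
        return "blue"
    if sir > 1:
        return "red"
    return "yellow"


# The same 15% higher estimate, with low and with high certainty.
print(colour_category(sir=1.15, prob_differs=0.55))  # yellow
print(colour_category(sir=1.15, prob_differs=0.95))  # red
```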
What is important to note is that a blue area does not necessarily have a lower absolute number of cancer diagnoses than a red area. Also, a blue area does not necessarily mean that cancer is rare in that particular area. The blue area simply means that it has a lower diagnosis rate than expected, based on its population size, age and gender distribution and the average age-specific diagnosis rates of the Netherlands. Similarly, a red area has a higher diagnosis rate than expected but can still contain few (absolute) cancer diagnoses.
To illustrate, in a blue area the observed number of diagnoses may be 50, while based on its population size, age and gender distribution we may expect 60 cases. The SIR in that area would then be 50/60 = 0.83. A red area could have 30 observed diagnoses, while the expected number would be 20. The SIR in that area would be 30/20 = 1.50. In this case, the absolute number of diagnoses in the red area is lower than the absolute number of diagnoses in the blue area.
This also means that the estimates of one area cannot directly be compared to the estimates of another area. They have meaning only in comparison to the Dutch average cancer diagnosis rate. It is important to realize that the estimated SIR of an area represents the average diagnosis rate for the whole area and does not give information on individual risks of cancer.