Global burden of chikungunya virus infections and the potential benefit of vaccination campaigns

Countries and territories considered

We considered countries and territories (as defined by the United Nations) with a population size of over 200,000 inhabitants. We included territories to avoid problems with some (especially island) territories being in different CHIKV risk zones than the remainder of the country (for example, French Guiana and mainland France). This resulted in a total of 180 countries and territories.

Literature review

Taking each country and territory in turn, we used the following resources:

Google (search term ‘chikungunya AND [COUNTRY]’)
Google scholar (search term ‘chikungunya AND [COUNTRY]’)
PubMed (search term ‘chikungunya AND [COUNTRY]’)
GIDEON
WHO/PAHO websites
ProMED (search term ‘chikungunya AND [COUNTRY]’)

We identified Ministry of Health websites where possible. For each country and territory, we identified whether there was evidence of CHIKV autochthonous transmission. We considered evidence of transmission as either the detection of cases (with at least one case confirmed via polymerase chain reaction (PCR) or IgM) or from seroprevalence studies (IgG or IgM). We considered outbreaks from any year prior to 2023. We recorded the specific year of the outbreak when they occurred between the years 2011 and 2022 (Supplementary Table 1).

Seroprevalence studies

As part of the literature review process, we specifically highlighted seroprevalence studies for CHIKV for data extraction. Our inclusion criteria were studies in healthy individuals from the general population that had tested for CHIKV IgG. We excluded seroprevalence studies in suspected cases (Supplementary Table 2). From each detected study, we identified the location of the study (from which we subsequently identified coordinates), the number of individuals per age group and the number of individuals who tested positive for CHIKV. We subsequently contacted the authors of all identified seroprevalence studies to obtain finer-scale data on age and location. From this process, we obtained 49 age-stratified seroprevalence datasets from research groups covering 97 locations across 29 countries.

Relationship between Aedes mosquito distribution and CHIKV

A previous estimate modeled the presence of A. aegypti and A. albopictus in 5-km × 5-km grid cells around the world using climate data and a large global repository of mosquito trap data¹⁹. We explored the extent to which these estimates of A. aegypti and A. albopictus were associated with CHIKV presence in a country.

We first extracted the human population-weighted average presence (MosqWeighted) of A. albopictus and A. aegypti in a country j using the estimated distribution of population density in the same grid cells as the mosquito data from WorldPop^19,38.

$${\rm{MosqWeighted}}_{j}=\sum _{i}{\rm{pop}}_{i,\;j}{\rm{mosq}}_{i,\;j}/{\rm{POP}}_{j}$$

(1)

where ${\rm{pop}}_{i,\,j}$ is the human population in cell i in country j; ${\rm{mosq}}_{i,\,j}$ is the mosquito presence probability from Kraemer et al.¹⁸ for that same cell i; and ${\rm{{PO}{P}}}_{j}$ is the total population size in country j. We used this approach as mosquito levels in places with large population sizes are more relevant to CHIKV burden than places with few inhabitants.

We then compared these estimates of mosquito presence to the probability of that country having ever reported CHIKV transmission by calculating the proportion of countries that had ever reported outbreaks among those with population-weighted mosquito levels of increasing sizes (Fig. 1b). To quantify the relationship between population-weighted mosquito levels and presence of CHIKV, we used logistic regression:

$${\rm{{logit}({presenc}{e}}}_{j})={\beta }_{0}+{\beta }_{1}\cdot {\rm{{MosqWeighte}{d}}}_{j}$$

(2)

We conducted separate analyses for A. aegypti and A. albopictus and then a separate one where we just used the greater one of the two species within each cell.

Estimating the population at risk

For each 5-km × 5-km grid cell of a given country, we used the fitted relationship between mosquito occurrence and the probability of CHIKV outbreaks from logistic regression (Fig. 1b) and the number of people living in that cell to obtain a weighted average of the size of the population at risk in that location.

$${\rm{{PopAtRis}{k}}}_{j}=\sum _{i}{\rm{{po}{p}}}_{i,\;j}\cdot {\rm{{ProbOutbrea}{k}}}_{i,\;j}$$

(3)

where ${\rm{ProbOutbreak}}_{i,\;j}$ is the estimated probability of observing an outbreak in cell i in country j based on the fitted logistic model in equation (2), where we use the greater of A. albopictus and A. aegypti presence within that cell. For India, China and the United States, which have large populations, we added an additional mask that assumes CHIKV transmission occurs only in areas where there is good evidence of sustained transmission, from either national seroprevalence studies (India) or good case reporting (China and the United States)¹¹. In masked areas, ${\rm{ProbOutbreak}}_{i,\,j}$ is set to 0.

Assigning of epidemic status by country

To estimate the reliability of case data for each country, we used the Healthcare Access and Quality (HAQ) Index that is measured on a scale from 0 (worst) to 100 (best) based on amenable mortality³⁹. Countries in the first two deciles of the index (HAQ Index > 82.2) were classified as having a good surveillance system³⁹. We next used the data on case occurrence, seroprevalence studies and mosquito distribution to categorize the epidemic status of all countries and territories as set out in Table 1. Each country and territory was assigned to ‘endemic’ (that is, with evidence of sustained transmission each year), ‘epidemic’ (that is, evidence of transmission but not sustained across years) or ‘no transmission’. We also provide an assessment of the strength of the evidence for the categorization of each country (Table 1).

Estimating CHIKV transmission dynamics

We used the collected data to inform different models and get estimates on parameters that capture the global dynamics of CHIKV. These models rely on the definition of FOI, which represents the rate at which the susceptible members of the population in a community get infected. For endemic countries, we estimated an FOI using a serocatalytic model in a Markov chain Monte Carlo (MCMC) framework. In epidemic countries, we considered that FOI was not constant and estimated an annual probability of an outbreak occurring as well as the annual FOI when an outbreak occurs.

We subsetted the serological datasets based on the status classification of the country that they originate from. The epidemic-prone countries were used to fit a single epidemic model to estimate an overall probability of outbreak occurrence (μ) and average outbreak size (λ). The endemic countries were used to fit a single endemic model to estimate the distribution of time-constant FOIs across locations characterized by its mean (λ_endemic) and standard deviation (σ_endemic).

Epidemic model

We assume that outbreaks have a yearly probability μ of occurring with an average size of λ and a standard deviation σ that are the same across locations. In each location, we simulated outbreak patterns and retained the global parameters that generated the most likely patterns. An outbreak pattern in a given location is suitable if it approximates well age-specific cumulative FOIs inferred from the age-stratified seroprevalence in that location.

For a given location r and a given age group A, we call ${N}_{{pos},{rA}}$ the number of samples that tested positive for IgG antibodies against CHIKV and ${N}_{{tot},{rA}}$ the total number of samples tested. Note that the size of the age groups differed from one location to another. We use a binomial likelihood to fit our model to the data:

$${N}_{{pos},{rA}} \sim {\rm{Binomial}}({N}_{{\rm{{tot}}},{rA}},{p}_{{rA}})$$

with ${p}_{{rA}}$ being the probability that an individual from a given age group A will have been infected by the time the sample was collected. By definition, the probability p of being infected at age a follows an exponential law of rate equal to the FOI (escaping infection from birth to age a). And the p_rA (proportion of people who have been infected from birth to age a) is the cumulative distribution function of p. Taking into account grouping by age, we get:

$${p}_{rA}=1-{e}^{-{\underline{\varGamma }}_{rA}}$$

with ${\underline{\varGamma }}_{rA}$ being the average cumulative FOI for an age group A that we estimated to be the mean value of the cumulative FOI Γ_rN for all ages included in the year group:

$${\underline{\varGamma}}_{rA}={\rm{mean}}_{N\in [{\rm{age}}_{\rm{min}};{\rm{age}}_{\rm{max}}]}({\varGamma}_{rN})$$

From their birth to the study year, any individual who is N years old will have been exposed to the cumulative FOI:

$${\varGamma }_{{rN}}=\mathop{\sum }\limits_{i=0}^{N}{\lambda }_{{ri}}$$

with λ_ri being the annual FOI in location r at year (Y − i) where Y is the year where the serum samples were collected in this location.

λ_ri was simulated during an iterated filtering process drawn from Beta(α, β) with a probability μ and set to 0 otherwise. α and β were calculated such that the mean and standard deviation of the distribution were equal to λ and σ. For parameter identifiability purposes, σ was fixed at 0.0025. The parameters estimated are μ and λ using a sequential Monte Carlo (SMC) Bayesian framework embedded in C and accessed using R and the ‘pomp’⁴⁰ package. Uninformative uniform priors U(0,1) were used for λ and σ. In total, 1,000 particles divided into equal-sized blocks (one by location) were used to simulate outbreak patterns and explore the parameter space. Each particle filtering instance was replicated 20 times to estimate the likelihood and associated uncertainty.

Endemic model

Following the previous notations, we assumed that outbreaks occur every year and estimated the average FOI resulting from that endemic pattern. We assumed all the λ_ri to be equal to λ_r such that

$${\varGamma }_{{rN}}=N* {\lambda }_{r}$$

where λ_r is the time-constant FOI estimated for location r. λ_r is drawn from a beta distribution Beta(α, β) of mean λ_endemic and standard deviation σ_endemic. The hyperparameters λ_endemic and σ_endemic were also estimated in our model using uninformative uniform priors U(0,1) for both. We used an MCMC Bayesian framework using Stan and R. The posterior distribution of considered parameters was sampled using four chains of 4,500 iterations each including a burn-in phase of 500 iterations.

Vaccine simulation framework

For each country, we simulated transmission of chikungunya. If a country has more than 10 million inhabitants, we considered that it was structured with several 10-million population subdistricts with independent FOI patterns from one to another.

The population has an age structure derived from the 2020 United Nations World Population Prospects⁴¹. They were grouped in the 12 following age groups: 0–5, 6–10, 11–12, 13–18, 19–20, 21–30, 31–40, 41–50, 51–60, 61–70, 71–80 and 81+. We used these age groups to allow for age-specific vaccine policies (especially where we consider 12+ and 18+ vaccination strategies) as well as to allow sufficient granularity to have age-specific mortality.

Each age group has six different compartments that individuals can be allocated to depending on their infection and vaccination status:

S: unvaccinated, never-infected individuals
I: unvaccinated, infectious individuals
R: unvaccinated, recovered individuals (seropositives)
V: vaccinated never-infected individuals
IV: vaccinated, infectious individuals
RV: vaccinated, recovered individuals (seropositives)

We ran the simulation for a period of 20 years, during which we measured the impact estimates.

Prior to the first year, the FOIs for each year were drawn. If the country is endemic, a time-constant FOI was drawn from a Beta distribution of mean λ_endemic and standard deviation σ_endemic. If the country is epidemic, for each year, the annual FOI was set to 0 (no transmission event) with probability (1 − μ) and to a location-specific FOI drawn from a Beta distribution of mean λ_endemic and standard deviation σ_endemic with probability μ.

Every year, the following events occur:

Loss of vaccination protection
Running a deterministic SIRV model with seasonal transmissibility calculated from the FOI previously drawn
Aging of the population

Loss of vaccination

The proportion of vaccinated people in the population exponentially decays with a decay rate ${v}_{\rm{dur}}^{-1}$ where v_dur (base case 5 years) is the vaccine duration of protection. This is translated by a flow of individuals from compartment V to S and from compartment RV to R.

Vaccination schedule

In endemic countries, the vaccination campaign occurs every 5 years and vaccinates 50% of the unvaccinated population (S + R) over 180 d. Vaccinations occur at a fixed daily rate over the duration of the vaccination campaign.

In epidemic countries, vaccination starts as soon as a threshold number of cases is detected (500 cases per million) and also aims at vaccinating 50% of the unvaccinated population (S + R) over 180 d. However, vaccination can stop prematurely if the outbreak dies out reaching 50 daily cases per million.

We assume the vaccine to be delivered in a single dose, with a time to reach protective immunity of 14 d. The vaccine provides a protection against infection of 40% and protection against disease of 70%.

SIRV model

The chikungunya transmission and vaccination of the population are simulated simultaneously using an age-stratified SIRV model described by the following system of differential equations for a given age group a (Extended Data Fig. 3):

$$\frac{{\rm{d{S}}}_{a}}{{\rm{d}}{t}}=-{St}{I}_{a}-{St}{V}_{a}$$

$$\frac{{\rm{d}}{V}_{a}}{{\rm{d}}{t}}={St}{V}_{a}-{VtI}{V}_{a}$$

$$\frac{{I}_{a}}{{\rm{d}}{t}}={St}{I}_{a}-{It}{R}_{a}$$

$$\frac{{\rm{d}}{{IV}}_{a}}{{\rm{{d}}}{t}}={Vt}{{IV}}_{a}-{IVt}{{RV}}_{a}$$

$$\frac{{\rm{d}}{R}_{a}}{{\rm{d}}{t}}={It}{R}_{a}-{RtR}{V}_{a}$$

$$\frac{{\rm{d}}{{RV}}_{a}}{{\rm{{d}}}{t}}={IVt}{{RV}}_{a}+{RtR}{V}_{a}$$

with:

$S{{tI}}_{a}=\beta (t)* \frac{{I}_{\rm{tot}}}{N}* {S}_{a}$
${VtI}{V}_{a}=\beta (t)* (1-{v}_{{ei}})* \frac{{I}_{\rm{tot}}}{N}* {V}_{a}$
${It}{R}_{a}=\sigma * {I}_{a}$
${IVt}{{RV}}_{a}=\sigma * {{IV}}_{a}$
${St}{V}_{a}$ and ${RtR}{V}_{a}$ being dictated by the daily rates of vaccination as described previously
${I}_{\rm{tot}}=\sum _{a}({I}_{a}+I{V}_{a})$, the total number of infected individuals
$N=\sum _{a}({S}_{a}+{V}_{a}+{I}_{a}+I{V}_{a}+{R}_{a}+R{V}_{a})$, the total population size
v_ei is the vaccine-induced protection against infection
1/σ is the mean duration of infectiousness
β(t) is the seasonal transmission rate, linearly decreasing over time (sawtooth pattern with an offsetted baseline)

Population aging

Each year, individuals from compartments of age a go to compartments of age a + 1. A new birth cohort of completely susceptible individuals is introduced in the population at age 0, and individuals in the last age compartment (100+) are removed from the population. To keep the age structure constant through time, each compartment is then proportionally adjusted.

Model parameters

As there is considerable uncertainty in the vaccine characteristics and feasibility of different deployment strategies, there was a meeting of experts convened, with representatives from WHO, CEPI, Gavi and academia, where a broad consensus was reached. These are summarized in Extended Data Table 2.

Unvaccinated infections have a 50% chance of being symptomatic (expert opinion), and vaccinated infections have 50% × (1-vaccine protection against disease) / (1-vaccine protection against infection) = 50% × (1−0.7) / (1−0.4) = 25% chance of developing symptoms. We assume that all symptomatic infections are detected by the surveillance systems (number of cases). (Extended Data Fig. 4).

Symptomatic infections have a 88% chance of having a mild acute phase and a 12% chance of having a severe one^6,42,43. Severe acute phases have then a 40% chance of developing chronic symptoms (arthralgia)^44,45,46,47. We assume the acute phases to last for 7 d on average and chronic symptoms to last for 1 year^3,37,48.

The probability of death given symptomatic infection (case fatality rate (CFR)) is age dependant and was based on clinical data from Paraguay⁶.

Years lived with a disability (YLDs) were computed by assuming a disability weight of 0.006 for mild acute symptoms, 0.133 for severe acute symptoms and 0.233 for arthralgia (chronic symptoms)^{49,50,51,52,53,54}. Years of life lost (YLLs) were measured as being the difference between the age of death and the mean life expectancy of the country. Individuals dying after the mean life expectancy of the country did not contribute to the YLL calculation (Extended Data Fig. 4).

By definition, DALYs lost to chikungunya were calculated as being the sum of YLLs and YLDs.

Statistical analysis

Correlation coefficients between the probability of local transmission and the probability of vector presence were computed using the Pearson correlation coefficient. Seroprevalence estimates in a specific group were computed as the proportion of seropositive individual in that group, and CIs were computed as a binomial proportion CI using the Wilson score interval. Information on the Bayesian frameworks used, including choice of priors and algorithm parameters, are detailed in specific sections of the Methods. The estimates and 95% CI of the serocatalytic models are the median of the sampled posteriors along with their 2.5% and 97.5% quantiles used for the 95% CIs. Results and 95% CI on the burden estimates and simulated vaccine impact are the mean and the 2.5% and the 97.5% quantiles calculated from the outcome of 1,000 bootstrapped simulations.

Inclusion and ethics statement

This project is a result of a collaboration among several institutions, including from many countries that are impacted by CHIKV and are, hence, directly concerned by the presented results. This study will be relevant by providing estimates of how much of this burden can be avoided locally by implementing vaccination campaigns. All co-authors contributed right from the early stages of the project. All models were developed using anonymized datasets. The data were either extracted from existing publications or provided by the underlying data collectors without any personal identifiers. The datasets are credited to the local research groups that generated them. Local and regional research was duly acknowledged in citations. As there are currently no existing measured estimates of the efficacy of the IXCHIQ vaccine, consensus estimates of key characteristics of the vaccine were obtained from an expert panel consisting of individuals from academia, WHO, CEPI and Gavi.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Post Views: 14

Top Categories

(66) Yoga & Exercise

(3) Weight Loss

(30) Seasonal Wellness

(26) Studies & Findings

Popular News

Hershey’s Just Launched a Wild New Protein Bar...

Gold Bond’s $17 Firming Body Lotion Improves Crepey...

Yoga vs. Pilates: Exploring the Key Differences Explained

The Shape Summer 2025 Sweat Issue