
Transfer learning for mortality risk: A case study on the United Kingdom

  • Asmik Nalmpatian,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    asmik.nalmpatian@campus.lmu.de

    Affiliation Department of Statistics, LMU Munich, Munich, Bavaria, Germany

  • Christian Heumann,

    Roles Conceptualization, Formal analysis, Methodology, Supervision, Validation, Writing – review & editing

    Affiliation Department of Statistics, LMU Munich, Munich, Bavaria, Germany

  • Levent Alkaya,

    Roles Data curation, Formal analysis, Methodology, Software, Writing – review & editing

    Affiliation Independent Researcher, Munich, Bavaria, Germany

  • William Jackson

    Roles Data curation, Investigation, Methodology, Resources, Validation, Writing – review & editing

    Affiliation Independent Researcher, Munich, Bavaria, Germany

Abstract

This study introduces a transfer learning framework to address data scarcity in mortality risk prediction for the UK, where local mortality data is unavailable. By leveraging a pretrained model built from data across eight countries (excluding the UK) and incorporating synthetic data from the countries most similar to the UK, our approach extends beyond national boundaries. This framework reduces reliance on local datasets while maintaining strong predictive performance. We evaluate the model using the Continuous Mortality Investigation (CMI) dataset and a Drift model to address discrepancies arising from local demographic differences. Our research bridges machine learning and actuarial science, enhancing mortality risk prediction and pricing strategies, particularly in data-poor settings.

Introduction

In life insurance, accurate mortality risk prediction is essential for pricing and managing risks. However, this process is often hindered by data scarcity, particularly in underrepresented demographic segments or smaller niches of the market. Mortality events are infrequent, meaning data accumulates slowly, making it difficult for insurers to build robust predictive models. This lack of data can lead to unreliable risk assessments and pricing strategies, ultimately affecting profitability and customer affordability.

Transfer learning offers a promising solution to these challenges by leveraging models trained on data-rich countries and adapting them to data-poor environments. This allows insurers to generate reliable mortality predictions even when local data is unavailable. Previous studies, such as those by [1] and [2], have laid the groundwork for transfer learning in mortality risk prediction, but have primarily focused on scenarios with small volumes of target data. Additionally, much of the research has relied on deep neural networks (DNNs), which, while powerful, can be computationally intensive and require extensive fine-tuning, especially for small datasets [3,4].

In contrast, gradient boosting machines (GBMs) offer a more efficient and interpretable alternative for transfer learning, particularly in cases where no target data is available. Despite their potential, GBMs have received less attention in the context of mortality risk prediction. Inspired by the success of machine learning (ML) models in clinical research [5–7], this study introduces a GBM-based transfer learning framework for predicting mortality rates in the UK, where no local life insurance data is available. By incorporating synthetic data from countries most similar to the UK, this approach demonstrates high predictive accuracy while reducing dependence on local datasets. To further enhance the model, we introduce a Drift model to evaluate and correct discrepancies arising from demographic differences between countries.

This research not only extends the boundaries of transfer learning in actuarial science but also has broader implications for improving mortality risk prediction and pricing strategies in data-poor markets. Our study, whose workflow is illustrated in Fig 1, is guided by three primary research questions:

Fig 1. Graphical abstract to summarize our workflow and value-added.

https://doi.org/10.1371/journal.pone.0313378.g001

  (i) How can we estimate mortality rates in a country with no internal life portfolio data? This involves implementing an ML-based transfer method, focusing on the UK, and constructing a Country similarity index using external data to identify relevant source countries.
  (ii) How accurate is the model, and how can a Drift model address discrepancies between predicted and expected mortality? The accuracy of the transfer learning method is assessed using various metrics, with a Drift model employed to explore factors contributing to discrepancies between transferred mortality tables and expected outcomes from the CMI dataset.
  (iii) How can additional variables beyond age and gender improve mortality risk predictions? We investigate how the inclusion of additional variables can enhance the baseline mortality predictions, providing an application case to demonstrate improvements.

Related work

Previous research has predominantly focused on leveraging DNNs to model mortality data. A notable example is the work by [8], which discussed the integration of Generalized Linear Models (GLMs) within residual networks to capture both linear and nonlinear effects. Despite their potential, these Combined Actuarial Neural Networks (CANNs) face challenges in enforcing monotonicity, which is crucial for mortality data [9]. In contrast, our study explores the use of GBMs, which offer a more flexible, interpretable and computationally efficient alternative, particularly in data-scarce environments. GBMs have shown promise in various actuarial applications, providing a transparent framework for mortality prediction. Our approach extends previous methodologies by incorporating a Drift model to explicitly address demographic discrepancies, enhancing the model’s adaptability to different population characteristics.

Numerous studies have aimed to compare health care systems, financing mechanisms and health outcomes across countries. Bauer and Ameringer [10] emphasize the difficulty of collecting comprehensive data from different countries due to logistical and financial constraints. However, incorporating statistical data from credible sources like the World Health Organization (WHO) and the Organization for Economic Co-operation and Development (OECD), along with a proposed multivariate statistics framework, serves as a valuable supplement. The significance of conducting cross-national research on healthcare system performance is underscored since it is considered crucial for guiding public policy [11]. For example, [12] revealed that the difference in spending between the United States and European countries can be traced back to disparities in diagnosis and treatment rates for certain chronic conditions. Another study explored the impact of culture in forecasting a country’s population health, gauged through life expectancy and healthcare spending [13]. Hofstede’s influential study on cross-cultural research argues that comprehending a nation’s culture demands exploring dimensions like Power Distance, Individualism-Collectivism, Masculinity-Femininity, and Uncertainty Avoidance [14]. To tackle the issue of determining and gauging population health, [15] suggested two models. The descriptive model assesses population health through indicators like life expectancy, categorized by markers such as socio-economic status or race. Various indices measure similarities between countries across a range of dimensions, yet there is currently a gap in addressing both mortality and life insurance specifics. Our approach involves constructing and optimizing a distance-based index for country similarity, based solely on external sources. The forthcoming sections outline our proposed method in a reproducible manner.

Database and methodology

Data

In our study, we rely on the open source Human Mortality Database (HMD) as our primary external data source. HMD offers age and gender-specific mortality rates for the overall population across various countries. However, our primary focus is not on estimating the mortality of the overall population in the UK. Instead, our goal is to estimate the mortality rates within the company’s own life insurance portfolio in the UK. It’s important to note that there are often differences between overall mortality rates and those within a specific portfolio, particularly due to the underwriting process in life insurance. To address this limitation, we leverage data from eight countries and establish connections to capture this discrepancy between overall and portfolio mortality rates. To ensure that the analysis accurately reflects the mortality patterns across different countries and within the company’s life insurance portfolio, our approach involves three different populations, as illustrated in Fig 2: the global overall population, the global insured population of the company, and the insured population of the company within a particular country.

Fig 2. Illustration of targeted population segments across different datasets and models.

https://doi.org/10.1371/journal.pone.0313378.g002

Overall population: Age- and gender-specific overall population mortality rates from the HMD are retrieved for all countries. While these represent total population mortality, not insured population mortality, they bridge the gap between total and insured mortality, as they are the only feature we have available for the target country. To minimize yearly artifacts, mortality rates from 2008 to 2018 were projected one year ahead using the Lee-Carter model [16] (see Methodology section and S2 Appendix).

Insured population: We utilize a pooled internal portfolio dataset from different countries to pretrain a GBM model [17] for predicting mortality rates for the insured population globally. This dataset incorporates common global characteristics shared across different countries, such as age, gender, and sum assured, allowing for cross-country data comparison, and integrates the overall population mortality, yielding a total of 9 global features without any country indicator (see Methodology section and S1 Appendix).

The dataset includes policy data from a global primary insurer that was active during the specified period, totaling almost 10 million life-years of exposure and recording nearly 10,000 insurance claims (deaths). The data analysis was conducted in an aggregated form, grouped into distinct combinations of feature values, summarizing the deaths $D_j$ and exposure $E_j$ for each unique combination of features across all countries (here K = 8), the names of which have been withheld to maintain confidentiality. Four of the countries are located in Western Europe, three in Latin America, and one in Central and Eastern Europe.

Table 1 provides a detailed overview of $D_j$, $E_j$ and the total number of years $T_j$ in country j, showing the main characteristics and distribution of the pooled dataset. This paper analyses age and gender as internal features, while keeping the other features used in the modeling anonymous for privacy reasons.

Table 1. Overview of death counts $D_j$, exposure in life years $E_j$, and total number of years $T_j$ in country j.

https://doi.org/10.1371/journal.pone.0313378.t001

Insured population in specific countries: In addition to the global features, including the overall population mortality of these countries, we include 12-16 local features from each country j, depending on local data availability, such as occupational class, which are not comparable across regions. After retraining the specialized GBM models on a total of 21-25 features, initialized by the pretrained model, we predict mortality rates for the portfolio of country M using a synthetic dataset tailored specifically for M. Our method for creating the synthetic dataset combines stochastic and rule-based techniques to bootstrap by resampling from the internal portfolio of K countries, while introducing variations to account for uncertainty [18] (see Methodology section and S1 Appendix).

Mortality of UK’s insurance population for evaluation:

We utilize the ’16’ series mortality tables from Working Paper 154 [19] for the evaluation purposes and the Drift model, given the absence of an actual UK portfolio for comparison. These tables, derived from data from different UK life insurance companies, offer detailed insights into age, gender, smoking status, and curtate duration. To guarantee an impartial assessment and prevent undue complication, we consolidate the tables according to age and gender categories that correspond to population proportions.

External data for the Country similarity index:

The Country similarity index seeks to measure the similarity between the target country M and the K (= 8) source countries in the internal dataset in terms of mortality and life insurance characteristics. We develop this by considering various indicators, selected based on prior research and expert input, adaptable to specific contexts. These indicators are categorized into three dimensions: Life Insurance Performance Indicators, Healthcare Statistics, and Overall Population Mortality. The details of these indicators are outlined in Table 2, with the methodology for their construction discussed in the subsequent subsection.

Table 2. Dimensions and items obtained from external sources for the construction of a Country similarity index related to mortality in life insurance.

https://doi.org/10.1371/journal.pone.0313378.t002

Methodology

General setup

Consider a scenario where K source datasets with aggregated sample sizes $n_j$ are collected from countries $j = 1, \dots, K$, each representing a life insurance portfolio. The pooled dataset has total aggregated sample size $n = \sum_{j=1}^{K} n_j$. The objective is to estimate death counts relative to exposure. The feature set comprises global features, which are comparable and available across countries and include the overall population mortality from the HMD, and local features, which are specific to each country. The challenge lies in estimating the death counts $D_M$ for the target country M, for which no internal data are available. However, we do have access to external data that provide information about mortality rates in different countries, including M. We therefore combine this external data with internal data that are not specific to M in order to estimate mortality rates specifically for country M. Fig 3 is a visual representation of the transfer learning framework: from the pretrained global model to the refined mortality rate predictions for the target country M based on a synthetic dataset.

Fig 3. Framework sketch: Synthetic-data-based mortality predictions for target country M using a pretrained global mortality risk model.

https://doi.org/10.1371/journal.pone.0313378.g003

Pretrained model

Consider a broad category of risk prediction models in which fitting the model amounts to minimizing a loss function. The estimated parameter vector of the GBM determines the predicted outcome. Specifically, we employ the negative Poisson log-likelihood as the loss function, under a Poisson distributional assumption for the death counts. Minimizing the expected loss yields the parameter estimate and thus the predicted number of deaths. A detailed methodology for the GBM model is provided in S1 Appendix. Up to this point, a benchmark model has been developed without considering the country M. Previous work such as [4,33,34] and [35] characterizes the similarity between the target model and the source models by a certain distance measure. Building on this idea, we generate a synthetic portfolio dataset $X_M$ for country M, leveraging the similarity of the external data between the target population M and the source populations 1 to K (excluding M).
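As a minimal sketch of the loss described above, the negative Poisson log-likelihood can be written for aggregated cells as follows. The function name, the log link, and the use of exposure as a multiplicative offset are assumptions made for illustration; the authors' exact specification is given in S1 Appendix and is not reproduced here.

```python
import math

def poisson_nll(deaths, exposure, log_rate):
    """Negative Poisson log-likelihood, dropping the constant log(D!) term.

    deaths[i]   : observed death count in cell i
    exposure[i] : life-years of exposure in cell i
    log_rate[i] : model score f(x_i); expected deaths are E_i * exp(f(x_i))
                  (log link with exposure offset is an assumption here)
    """
    total = 0.0
    for d, e, f in zip(deaths, exposure, log_rate):
        mu = e * math.exp(f)            # expected deaths in the cell
        total += mu - d * math.log(mu)  # per-cell contribution
    return total

# Toy cell: 5 deaths on 1000 life-years; the loss is smallest at rate 5/1000.
at_mle = poisson_nll([5], [1000.0], [math.log(0.005)])
too_low = poisson_nll([5], [1000.0], [math.log(0.002)])
too_high = poisson_nll([5], [1000.0], [math.log(0.010)])
```

A boosting library with a Poisson objective minimizes a loss of this form by adding trees to the score `f`.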

Country similarity index

To measure how similar the target country M is to the K source countries, we create a Country similarity index based on external insurance and mortality data comprising Q items for each of the K source countries and the 1 target country. In our application case, Q is equal to 13, larger than K + 1 = 9. These Q items, which are given in Table 2, apply to the entire population of a country, in contrast to the internal data X, which specifically characterize the country’s insured population. After centering and scaling, the Manhattan distance between the item vector $z_j$ of each source country j and the item vector $z_M$ of the target country M is calculated as the sum of the absolute differences between corresponding components of the vectors: $d_j = \sum_{q=1}^{Q} |z_{jq} - z_{Mq}|$ [36]. This results in a K-dimensional vector, each entry of which is the sum of item-wise distances between a source country j and M across all Q items. These distances are then transformed into the normed similarity score using the exponential function, so that the value range changes from $[0, \infty)$ to $(0, 1]$. This transformation allows a similarity comparison rather than an absolute measure of distance, and becomes important later in the resampling stage to define the variance of the Gaussian distribution.
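The construction of the index can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the toy indicator values, the use of the population standard deviation for scaling, and the specific transform `exp(-distance)` are all hypothetical choices that are merely consistent with the description (centering and scaling, Manhattan distance, exponential mapping to a bounded score).

```python
import math
import statistics

def standardize(items_by_country):
    """Center and scale each of the Q items across all K + 1 countries."""
    countries = list(items_by_country)
    n_items = len(items_by_country[countries[0]])
    scaled = {c: [] for c in countries}
    for q in range(n_items):
        column = [items_by_country[c][q] for c in countries]
        mean = statistics.mean(column)
        spread = statistics.pstdev(column) or 1.0  # guard against constant items
        for c in countries:
            scaled[c].append((items_by_country[c][q] - mean) / spread)
    return scaled

def similarity_scores(items_by_country, target):
    """Manhattan distance to the target, mapped to (0, 1] via exp(-distance)."""
    scaled = standardize(items_by_country)
    scores = {}
    for country, items in scaled.items():
        if country == target:
            continue
        distance = sum(abs(a - b) for a, b in zip(items, scaled[target]))
        scores[country] = math.exp(-distance)  # assumed form of the transform
    return scores

# Toy data with Q = 2 hypothetical indicators; "M" plays the target country.
toy = {"A": [1.0, 10.0], "B": [3.0, 30.0], "M": [1.5, 12.0]}
scores = similarity_scores(toy, target="M")
```

In this toy example country A lies closer to M on both indicators, so it receives the higher score and would contribute more exposure in the resampling stage.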

Synthetic portfolio data for country M

In countries with no mortality data at all due to portfolio characteristics and size, synthetic data generation offers an efficient solution to address data limitations [37]. The process of producing mortality datasets that closely mimic actual data may comprise stochastic techniques [38], rule-based approaches set by human experts [39] or deep generative models (e.g., [40,41]).

Assuming a known age and gender distribution for M, we resample feature combinations (rows) from the K datasets, encompassing both global and local features along with the number of deaths, in proportion to the similarity score of each source country $j = 1, \dots, K$. The overall population mortality of those countries is replaced with that of country M obtained from the HMD.

To address potential unknown heterogeneity between j and M, we use a data augmentation technique with noise drawing inspiration from established practices (e.g., [42,43]):

  1. Metric Data: We introduce Gaussian noise with a mean of 0 and a standard deviation $\sigma_j$ that is inversely proportional to the similarity score $s_j$ of source country j: $\sigma_j \propto 1/s_j$. A higher similarity score corresponds to a lower standard deviation, implying that less noise is added to metric data.
  2. Categorical Data: For categorical data, a noise level is drawn from $N(0, \sigma_j^2)$, where again $\sigma_j \propto 1/s_j$. If the drawn value falls within a predefined interval around 0, the original value is retained; otherwise, a new value is drawn from the uniform distribution.

Finally, the synthetic dataset for the target country M is generated; it contains the global features (including the HMD population mortality) and the local features, as well as the exposure $E_M$ for country M. The death counts for M remain to be estimated. More details on the workflow of synthetic data generation are available in S5 Appendix.
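The two noise rules above could look roughly like this in code. The sketch is hedged: `scale` (the proportionality constant linking similarity to standard deviation) and `keep_width` (the half-width of the interval around 0 within which a category is retained) are hypothetical parameters, since the text gives neither value; the actual workflow is documented in S5 Appendix.

```python
import random

def augment_metric(value, similarity, rng, scale=1.0):
    """Add Gaussian noise whose standard deviation is scale / similarity."""
    sigma = scale / similarity          # higher similarity -> less noise
    return value + rng.gauss(0.0, sigma)

def augment_categorical(value, categories, similarity, rng,
                        scale=1.0, keep_width=1.0):
    """Keep the category if the drawn noise is near 0, else redraw uniformly."""
    sigma = scale / similarity
    noise = rng.gauss(0.0, sigma)
    if abs(noise) <= keep_width:        # inside the predefined interval
        return value
    return rng.choice(categories)       # uniform draw over all categories

rng = random.Random(0)
# With a very high similarity the noise is negligible and values are retained.
kept_class = augment_categorical("office", ["office", "manual", "other"], 1e9, rng)
near_age = augment_metric(45.0, 1e9, rng)
# With a low similarity the same rules perturb the rows much more strongly.
noisy_age = augment_metric(45.0, 0.05, rng)
```

Note that the uniform redraw may return the original category again, so the effective change rate for categorical features is somewhat lower than the probability of leaving the interval.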

Transfer model

Since the pretrained model excludes local factors like occupation class, which cannot be compared across countries but may have a significant impact on mortality, we fit the specialized models with the local data on top. Each specialized model takes the output of the pretrained model from the first step and makes it more precise for that country. We find that incorporating local attributes during the latter phase of training offers optimal adaptability; this approach allows local nuances to be effectively integrated and, in cases where they are not applicable or transferable to the target country, they can be subsequently adjusted or mitigated. Initially, we utilize the global features of the synthetic dataset to generate preliminary predictions using the pretrained model. Subsequently, we enhance these predictions by employing the specialized GBM models tailored for countries $j = 1, \dots, K$. Through iterative boosting, the specialized model adjusts to the characteristics of the countries according to their similarity, thereby refining the mortality rate predictions. The final mortality rate predictions are determined by combining the specialized predictions and the pretrained predictions for all countries, as elaborated in the following Algorithm 1 and detailed in S1 Appendix.

Algorithm 1. Algorithmic representation of the transfer framework

  1: Train the global GBM model on the pooled dataset, which contains the datasets from countries $j = 1, \dots, K$, using only the global features, i.e. the features common across all countries.
  2: For each country $j = 1, \dots, K$, train a local GBM model using country j’s dataset, which includes both global features and local features specific to country j. These models are initialised using the output predictions of the pretrained global GBM model (as opposed to more conventional, i.e. random, initialisation).
  3: For country M (= UK), calculate the similarity scores with each country $j = 1, \dots, K$, based on external data and a predefined similarity metric, which can include life insurance, economic, and mortality factors.
  4: For each country $j = 1, \dots, K$, perform the following steps to create the synthetic dataset for country M (UK):
  • Use the calculated similarity scores to proportionally resample exposures from each country j’s dataset to contribute to country M’s synthetic dataset. Ensure the total exposure for country M, $E_M$, is equal to the sum of resampled exposures from each country j, which in total should amount to 100,000,000.
  • Apply data augmentation by adding noise to the features to generate variability and improve the robustness of the model.
  • Replace the population mortality variable in the dataset with that from country M, aligning the dataset with the mortality conditions of country M.
  • Compile the resampled and augmented data to form the synthetic dataset $X_M$ for country M. This dataset will be a row-block matrix where each block corresponds to data from a specific country j with different dimension, containing both global and local features. The first columns consist of global features to which the global model will be applied. Record the origin of each row to ensure that the specialized GBM model trained for that country can be subsequently applied.
  5: Use the global model and the respective local models to predict the expected value for the synthetic dataset of country M: (1)
     The left-hand side of Eq (1) is the expected mortality rate for the synthetic data of country M derived from country j. One term on the right-hand side is the global model’s prediction using the global features of the synthetic dataset for country M derived from country j. The other term represents the adjustment made by the local model of country j, applied to the portion of the synthetic dataset $X_M$ that originated from country j. This ensures that the global model’s predictions are fine-tuned to reflect the specific characteristics of the countries j that are similar to country M, as determined by the similarity scores.
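The proportional resampling in step 4 can be illustrated with a small sketch of how the total exposure might be split across source countries. The function name and the toy similarity scores are hypothetical; only the total of 100,000,000 and the proportionality to the similarity scores come from the algorithm above.

```python
def allocate_exposure(similarity, total_exposure=100_000_000):
    """Split the target exposure across source countries by similarity share."""
    total_score = sum(similarity.values())
    return {
        country: total_exposure * score / total_score
        for country, score in similarity.items()
    }

# Toy similarity scores for three hypothetical source countries.
shares = allocate_exposure({"A": 0.6, "B": 0.3, "C": 0.1})
total_allocated = sum(shares.values())
```

Each share then determines how much exposure is resampled from the corresponding country block of the synthetic dataset, and the recorded row origin decides which local model is applied to that block in step 5.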

Agreement metrics

Using several metrics, we evaluate the agreement of the transferred mortality rates with the CMI mortality rates, which serve as a proxy for expected UK mortality. Specifically, we employ Spearman correlation, cosine similarity and R-squared with centered expected versus predicted mortality rates. These metrics are defined as follows:

1. Spearman correlation:

2. Cosine similarity:

3. R-squared with centered actuals versus centered predicted vectors:

The selection of Spearman correlation, cosine similarity, and centered R-squared is justified by their complementary insights into the evaluation of age- and gender-specific mortality predictions. Spearman correlation is robust for assessing rank-order relationships, making it ideal for capturing the alignment of predicted and actual mortality rankings, especially in the presence of non-linear trends and outliers. Cosine similarity focuses on the directional consistency of the predicted mortality distribution, ensuring that the shape and pattern of predictions align with CMI benchmarks, independent of magnitude differences. Centered R-squared evaluates the variance alignment between predictions and observed rates, emphasizing the model’s goodness of fit to capture demographic-specific fluctuations. For instance, consider a scenario where the model accurately predicts mortality rates for males aged 30-50 but underestimates rates for older females (e.g., 70+). In this case, Spearman correlation would remain high if the ranking within age groups is preserved, even if predictions for older ages deviate in magnitude. Meanwhile, cosine similarity would decline due to a directional mismatch in the mortality profile for older females, reflecting a flatter or inconsistent trend compared to CMI rates. Lastly, the underprediction for older females would reduce the overall explained variance in the R-squared metric, highlighting that the model struggles with these demographic subgroups. Together, these metrics provide a fair assessment of the predictions, ensuring that ranking, shape, and variance are all considered, which is essential for accurately comparing predictions with CMI tables and understanding demographic trends.
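The three agreement metrics can be sketched in plain Python as below. Spearman correlation and cosine similarity follow their standard definitions; the centered R-squared shown here (one minus the residual sum of squares between the two mean-centered vectors, divided by the total sum of squares of the centered expected vector) is an assumed reading of the description, since the paper's formula is not reproduced in this text. All function names and toy rates are hypothetical.

```python
import math

def _ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def _center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def spearman(a, b):
    """Pearson correlation of the ranks, i.e. cosine of the centered ranks."""
    return cosine_similarity(_center(_ranks(a)), _center(_ranks(b)))

def centered_r2(expected, predicted):
    """Assumed definition: R-squared computed on mean-centered vectors."""
    e, p = _center(expected), _center(predicted)
    ss_res = sum((x - y) ** 2 for x, y in zip(e, p))
    ss_tot = sum(x * x for x in e)
    return 1.0 - ss_res / ss_tot

# Toy mortality rates by age band (hypothetical values).
cmi = [0.001, 0.002, 0.004, 0.008]
scaled = [0.5 * q for q in cmi]      # same shape, half the level
shifted = [q + 0.001 for q in cmi]   # same shape, constant offset
```

In this toy setting a purely multiplicative level difference leaves Spearman correlation and cosine similarity at 1, while a constant additive offset leaves Spearman correlation and the centered R-squared at 1, which illustrates why these metrics focus on shape rather than on the average level of mortality.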

Drift model evaluation

We propose a Drift model to evaluate the remaining disagreement by identifying and quantifying the drift drivers between the target country’s expected mortality and the mortality rates transferred from other countries to M.

We assume a Poisson distribution for the mortality counts in country M. Our analysis focuses on examining the discrepancy between the predicted mortality rate and the actual rate across various features or feature categories. This discrepancy serves as an indicator of the quality of transfer learning. We adopt the two-stage or residual model proposed by [44] to estimate it:

(2)

A GLM is used with a new exposure, a target, and the model specification as follows [45]:

(3)

In the Poisson case, [46] demonstrated that the method is mathematically equivalent to using the ratio of the target to the new exposure as target and the new exposure as weights:

(4)
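The equivalence between the offset formulation and the weighted-ratio formulation can be checked numerically in the simplest, intercept-only case, where both Poisson fits have closed-form solutions. This is a sketch under stated assumptions: the variable names and toy counts are hypothetical, and treating the transfer-model deaths as target and the CMI-expected deaths as new exposure is an assumption based on the Fig 8 caption ("ratio of transfer to CMI"), not a statement of the authors' exact setup.

```python
import math

def offset_intercept(target, new_exposure):
    """Intercept of a Poisson GLM with log(new_exposure) as offset.

    The score equation sum(t_i - e_i * exp(b0)) = 0 gives
    exp(b0) = sum(t) / sum(e).
    """
    return math.log(sum(target) / sum(new_exposure))

def weighted_ratio_intercept(target, new_exposure):
    """Intercept of a Poisson GLM on the ratio t_i / e_i with weights e_i.

    The score equation sum(e_i * (r_i - exp(b0))) = 0 gives
    exp(b0) = weighted mean of the ratios.
    """
    ratios = [t / e for t, e in zip(target, new_exposure)]
    weighted_mean = (
        sum(w * r for w, r in zip(new_exposure, ratios)) / sum(new_exposure)
    )
    return math.log(weighted_mean)

# Hypothetical cells: deaths predicted by the transfer model (target)
# and deaths expected under the benchmark table (new exposure).
transfer_deaths = [4.0, 9.0, 2.0]
benchmark_deaths = [10.0, 15.0, 5.0]

b0_offset = offset_intercept(transfer_deaths, benchmark_deaths)
b0_ratio = weighted_ratio_intercept(transfer_deaths, benchmark_deaths)
average_ratio = math.exp(b0_offset)
```

Both formulations return the same intercept, and its exponential is the exposure-weighted average ratio. With covariates such as age and gender the same equivalence holds coefficient by coefficient, but the fit then requires an iterative GLM solver rather than a closed form.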

The validation of our approach, presented in the Results section, includes a comprehensive evaluation, such as its application to the UK insurance population and drift analysis from CMI mortality tables.

Due to exclusive usage of publicly available anonymized data (CMI and HMD) and aggregated, anonymized insurance data for model pretraining, there was no direct interaction with human participants, and no personally identifiable information was accessed. The insurance company data used for pretraining was provided in an aggregated and anonymized form, with no possibility of tracing back to any individual policyholder. No UK-specific data from the insurance company was used. The UK-specific results were derived entirely from publicly available data and a synthetic dataset generated for this study, with no real UK life insurance data being used. Therefore, this study does not involve new data collection from human participants and participant consent was not applicable.

Results

Transfer learning application in the UK

The following section introduces the application of the transfer learning framework to the UK, where internal mortality data is unavailable. This analysis establishes the foundation for subsequent discussions and demonstrates a high level of agreement with expected outcomes.

The point of Fig 4 is to show the plausible transfer of knowledge from the countries to the UK, according to their similarity. It is clear that the degree of proximity is more pronounced in Europe, and therefore it makes more sense to resample from there than from the Latin American countries.

Fig 4. The composition of the exposure drawn from the countries for the synthetic dataset for the UK, proportional to the similarity score.

https://doi.org/10.1371/journal.pone.0313378.g004

Fig 5 shows the predicted number of deaths for the UK based on the transfer model for age and gender. The remaining variables are not disclosed as they are considered to be insurer-specific and require confidential background information for proper interpretation. The categories with the highest exposure and claims are based on more original data, indicating greater reliability of the estimation and deserving of our focus.

Fig 5. Exposure (bars) and predicted death counts (lines) by age and gender, derived from the synthetic-data-based transfer model. Age groups are defined retrospectively, and modeling is conducted using a metric scale. A. Age. B. Gender.

https://doi.org/10.1371/journal.pone.0313378.g005

Furthermore, to address the second research question, we evaluate the transfer model’s accuracy in matching the expected age-gender mortality rates using agreement metrics. CMI stands in for expected mortality rates in the UK’s insured population, given the lack of access to internal portfolio data. Despite differences in datasets and modeling, we regard CMI as a reliable proxy for UK policyholders’ actual mortality rates. The analysis focused on transferring insights about the relative mortality impact of different features in the data, as opposed to producing an accurate estimate of the overall rate of mortality. This decision was made in part because it is expected that data will be available in the receiving country to estimate the overall rate of mortality, either from publicly available resources or, more likely, from internal data that better reflect the specifics of the cohort being considered. Therefore, for evaluation purposes, we use Spearman correlation, cosine similarity, and R-squared as agreement metrics. These metrics do not consider the agreement of the difference in average mortality, ensuring objectivity in our evaluation.

Table 3 provides these measures not only for the UK but also for 8 other countries in the pooled dataset, as the transfer model’s predictive performance was also quantitatively examined for each of the 8 countries by pretraining the global model on the remaining seven. Given that the highest possible score is 1 for all metrics, we are within the highest acceptable range for the UK, as well as for the extended experiment.

While the table indicates a high level of precision in estimating the age-gender mortality using the transfer learning framework, the following section proposes using the Drift model to identify the cause of any remaining marginal discrepancies.

Fig 6 offers an initial insight into the disparities between the predictions of the transfer model and the CMI mortality rates, specifically examining age and gender. Despite an overall trend of underestimation in our estimates compared to CMI, our attention shifts to understanding the specific impacts of various features. Subsequently, we examine age and gender as the overlapping features present in both the predicted (transfer) and expected benchmark (CMI) mortality rates. To ensure monotonicity, it may be desirable to smooth the curves, for example when they are to be used directly in pricing. We present our proposal for this in S3 Appendix, but in the main body we continue with the original version in order to remain faithful to the portfolio context and not to lose its specificity. Additionally, we provide confidence intervals obtained through a bootstrapping method as a validation technique, to give a more detailed assessment of the uncertainty associated with the predictions. From Fig 7 it becomes evident that the confidence interval mostly contains the CMI, particularly for males, supporting the reliability of the results, especially given the reliance on synthetic datasets. The methodology details are documented in S4 Appendix.

Fig 6. Comparison of UK mortality rates between Transfer Learning and CMI by age and gender. While transfer weighted by similarity score shows the above approach in black, the blue line shows the alternative of resampling only from the most similar country (MSC), which leads to a less accurate prediction.

https://doi.org/10.1371/journal.pone.0313378.g006

Fig 7. Bootstrap validation for 95% confidence interval of (weighted) transfer results.

https://doi.org/10.1371/journal.pone.0313378.g007

Fig 8 illustrates the exponentiated coefficients of the Drift model, offering insights into the relationship between the two mortality tables by quantifying deviations from the average ratio. The red dashed line at approximately 0.5 represents the exponentiated intercept, indicating the average ratio across all features. An exponentiated effect of 1 for a specific feature implies no impact on the ratio, suggesting effective capture of pattern differences between the source and target countries for that feature.
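The reading of exponentiated intercept and effects can be illustrated with a deliberately simplified sketch. It assumes a log-linear model of the transfer-to-CMI ratio with sum-to-zero (effect) coding, reduced to a single factor with equally weighted levels; the Drift model itself is specified elsewhere in the paper and may be fitted differently. The ratios below are hypothetical.

```python
import math

# Hypothetical ratios of transfer-model rate to CMI rate, one per gender.
ratio = {"male": 0.55, "female": 0.45}

# On the log scale, the intercept is the mean log-ratio and each effect
# is the deviation of a level from that mean (sum-to-zero coding).
log_ratio = {g: math.log(r) for g, r in ratio.items()}
intercept = sum(log_ratio.values()) / len(log_ratio)
effects = {g: lr - intercept for g, lr in log_ratio.items()}

# Exponentiating gives multiplicative quantities:
# exp(intercept) is the average ratio, exp(effect) the deviation from it.
exp_intercept = math.exp(intercept)
exp_effects = {g: math.exp(e) for g, e in effects.items()}

print(f"average ratio: {exp_intercept:.3f}")
for g, e in exp_effects.items():
    print(f"{g}: multiplicative effect {e:.3f}")
```

In this toy setting an effect above 1 means the transfer model sits higher relative to CMI for that level than on average, and an effect of exactly 1 means the level follows the average ratio.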

Fig 8. Exponentiated effects of age and gender on the ratio of transfer to CMI. The gray line represents the no-effect line, while the red dashed line is the exponentiated intercept.

https://doi.org/10.1371/journal.pone.0313378.g008

The multiplicative effect of age in relation to the average ratio is approximately 1, indicating that age does not significantly influence the relationship between the transfer model and CMI. Although slight differences may exist in the age curve and average values, this suggests that the transfer learning framework effectively captured the differences in shape between the other K (= 8) countries and the UK by age, resulting in a close replication of CMI. This successful matching of age curves is a critical finding for insurance purposes, and lends confidence to subsequent analyses. Although the source data come from other countries, the methodology achieves a close match to the expected age curve, providing a strong basis for further analysis.

Regarding gender-specific mortality risks, while both the transferred results and CMI indicate higher mortality rates for males than females, the transferred estimations may show slight discrepancies: males are slightly overestimated and females underestimated compared to the average mortality risk. However, these deviations appear minor and likely stem from cohort distinctions between CMI and internal data, as well as cultural differences between the primary reference countries and the UK’s insurance mortality data, possibly reflecting subtle cultural influences and evolving gender roles in different countries.

Building upon the strong alignment observed in the transfer learning process, the subsequent section investigates additional variables.

Improving baseline mortality through additional variables in the transfer model

The Drift model goes beyond age and gender: it also examines additional variables found in portfolio datasets but not included in the CMI. With the CMI serving as the insurer’s base table, the exponentiated effects estimated by the Drift model for additional variables provide direct insight to insurers. This allows them to assess the potential impact of including these variables in the pricing model, and to determine possible loadings or discounts accordingly.

For example, considering Feature A with values A1, A2, A3, A4, A5, A6, absent from the CMI, Fig 9a shows that the predicted mortality rates increase from A1 to A6. Consequently, the Drift model’s exponentiated effects reveal that policies falling under A1 have a 33% lower mortality ratio compared to the average, while those under A6 exhibit a 24% higher ratio, both ceteris paribus. Therefore, a UK insurer may include an extra risk factor in their pricing strategy due to the relative risk of A1 being approximately 54% (67/124) of A6. This justifies a 33% loading for A6 policyholders. It is suggested that selection effects would significantly impact the risk profile. The estimation of all other variables is presented in S3 Appendix.
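The relativity quoted above follows directly from the two exponentiated effects. The short check below uses only the percentages stated in the text; the variable names are illustrative.

```python
# Exponentiated Drift effects for Feature A as quoted in the text,
# both expressed relative to the average ratio.
effect_a1 = 1 - 0.33  # A1: 33% lower than average
effect_a6 = 1 + 0.24  # A6: 24% higher than average

# Relative risk of A1 compared with A6, and the reverse comparison.
a1_vs_a6 = effect_a1 / effect_a6
a6_vs_a1 = effect_a6 / effect_a1

print(round(a1_vs_a6, 2))  # 0.54
print(round(a6_vs_a1, 2))  # 1.85
```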

Fig 9. Feature A (with values A1-A6) evaluation as a risk factor for mortality.

Transfer model results and evaluation of drift from CMI. (A) The mortality rates for the UK are displayed on a logarithmic scale, segmented by Feature A. Red line represents CMI mortality rates. (B) Exponentiated effects of Feature A on the ratio of transfer to CMI. The red line represents the exponentiated intercept, while the gray line represents the no-effect line.

https://doi.org/10.1371/journal.pone.0313378.g009

In summary, the transfer learning framework effectively provides mortality risk predictions for the UK, leveraging a pretrained model from 8 other countries due to a lack of local mortality portfolio data, while refining the model using open-source UK total population mortality rates and data synthesized from the available countries according to their degree of similarity. While the model performs well for less culture-specific risk factors, discrepancies with CMI mortality tables highlight the need for evaluation using the Drift model. This is essential for comprehensive risk assessment and to inform pricing strategies, particularly in scenarios where data is not available.

Limitation of generalizability

Overall, the transfer model results provide notable advantages for generalizability, especially when data for a new country is entirely absent. The approach allows us to leverage existing models trained on data from other regions, thereby circumventing the need for extensive local data collection and reducing both time and resource requirements. By utilizing knowledge from a previously trained model, transfer learning can enhance performance in target countries that share similar characteristics with the source countries, effectively applying pretrained insights.

However, the absence of local data presents unique challenges. Transfer learning is most effective when the source and target countries exhibit substantial similarity. As the disparity between these countries increases, the effectiveness of transfer learning diminishes. For example, applying a model trained only on South American countries to predict outcomes in Asian countries may not be successful due to demographic, cultural, and economic differences.

To address these challenges, we have implemented several mechanisms. The Country similarity index considers external demographic, insurance, and mortality-specific characteristics, capturing the degree of similarity between countries. This index aids in selecting appropriate source countries, minimizing the risk of misaligned data transfer. Additionally, the Drift model helps analyze discrepancies between source and target countries, offering a tool to understand the limits of generalizability and the extent to which transfer learning can be applied. Bootstrapped confidence intervals provide an additional layer of validation, helping to understand potential biases and offering robust insights into model performance and reliability in regions lacking local data.

In practical terms, while the transfer learning framework holds considerable promise, its generalizability in the absence of local data depends on the similarity between source and target countries. By incorporating mechanisms such as the Country similarity index, the Drift model, and bootstrapped confidence intervals, we facilitate more informed and reliable applications in regions with differing cultural, demographic, or economic conditions, even when local data is completely missing.
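To make the role of the similarity scores more concrete, the sketch below contrasts two ways of drawing synthetic records from source countries: allocation in proportion to a similarity score, and drawing only from the most similar country (MSC). It is a simplified illustration under stated assumptions, not the synthetic data workflow of the study: the country labels, the scores, and both function names are hypothetical, and proportional allocation is only one possible way to weight by similarity.

```python
def allocate_by_similarity(similarity, n_total):
    """Split n_total synthetic records across countries in proportion to similarity."""
    total = sum(similarity.values())
    raw = {c: n_total * s / total for c, s in similarity.items()}
    counts = {c: int(r) for c, r in raw.items()}
    # Largest-remainder rounding so that the counts add up to n_total.
    leftover = n_total - sum(counts.values())
    by_remainder = sorted(raw, key=lambda c: raw[c] - counts[c], reverse=True)
    for c in by_remainder[:leftover]:
        counts[c] += 1
    return counts

def allocate_most_similar(similarity, n_total):
    """Draw every synthetic record from the single most similar country."""
    best = max(similarity, key=similarity.get)
    return {c: (n_total if c == best else 0) for c in similarity}

# Hypothetical similarity scores of three source countries to the target.
similarity = {"Country A": 0.9, "Country B": 0.6, "Country C": 0.3}

weighted = allocate_by_similarity(similarity, 1000)
msc_only = allocate_most_similar(similarity, 1000)
print(weighted)
print(msc_only)
```

The weighted allocation keeps information from all source countries while favouring the closest ones, whereas the MSC allocation discards everything except a single country.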

Summary and outlook

This research presents a novel transfer learning framework designed to provide accurate mortality risk predictions for the UK, despite the complete absence of local mortality portfolio data. By leveraging pretrained and specialized models from eight other countries, along with UK population mortality rates obtained from open sources and synthesized data, we refine predictions for this data-scarce environment.

The framework establishes a solid foundation for mortality risk estimation and pricing, particularly benefiting small countries with insufficient data. Our predictive model shows strong agreement with the CMI mortality tables for age and gender, with only slight deviations detected via the Drift model. Expert validation further supports the inclusion of additional variables to enhance mortality risk estimation.

The approach offers several practical benefits, including strong predictive performance, reduced reliance on local data, and lower computational demands, making it efficient for multi-centre studies. It simplifies the development and deployment of ML models by eliminating the need for extensive training data in each new country. Our findings suggest that transfer learning is particularly effective for factors that are less influenced by cultural differences, although it may experience drift when capturing local specificities.

While the reliance on synthetic data helps overcome data scarcity, it may introduce uncertainties, particularly when source countries differ demographically or economically from the target country. The effectiveness of the Drift model also depends on the quality and similarity of external data used in the transfer learning process.

Future research could focus on addressing uncertainties in predictions by incorporating additional socio-economic and regional factors that may further improve mortality predictions. Expanding the framework to other regions and markets, especially those lacking sufficient local data, would provide valuable insights into its broader applicability. Testing the model in different settings could refine its use for life insurance product development in underserved demographic segments and emerging markets.

Supporting information

S1 Appendix. Methodology details of pretraining and specialization.

https://doi.org/10.1371/journal.pone.0313378.s001

(PDF)

S3 Appendix. Additional results of the Drift model.

https://doi.org/10.1371/journal.pone.0313378.s003

(PDF)

S4 Appendix. Bootstrap validation for confidence intervals.

https://doi.org/10.1371/journal.pone.0313378.s004

(PDF)

S5 Appendix. Workflow for synthetic data generation.

https://doi.org/10.1371/journal.pone.0313378.s005

(PDF)

References

  1. 1. Vincelli M. A machine learning approach to incorporating industry mortality table features into a company’s insured mortality analysis. Soc Actuar Res Rep. 2019;2019(Sept):1–53.
  2. 2. Lim HB, Shyamalkumar ND. Incorporating industry stylized facts into mortality tables: Transfer learning with monotonicity constraints. 2024. Available from: https://papers.ssrn.com/abstract=3964181
  3. 3. Yosinski J, Clune J, Bengio Y, Lipson H. How transferable are features in deep neural networks? Advances in Neural Information Processing Systems 27 (NIPS 2014). vol. 27. 2014.
  4. 4. Tian Y, Feng Y. Transfer learning under high-dimensional generalized linear models. J Am Stat Assoc. 2022;118(544):1–14.
  5. 5. Gong JJ, Sundt TM, Rawn JD, Guttag JV. Instance weighting for patient-specific risk stratification models. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2015. pp. 369–78.
  6. 6. Wiens J, Guttag J, Horvitz E. A study in transfer learning: leveraging data from multiple hospitals to enhance hospital-specific predictions. J Am Med Inform Assoc. 2014;21(4):699–706. pmid:24481703
  7. 7. Desautels T, Calvert J, Hoffman J, Mao Q, Jay M, Fletcher G, et al. Using transfer learning for improved mortality prediction in a data-scarce hospital setting. Biomed Inform Insights. 2017;9:1178222617712994. pmid:28638239
  8. 8. Lim HB, Shyamalkumar N. Incorporating industry stylized facts into mortality tables: transfer learning with monotonicity constraints. SSRN. 2021. Available from: https://ssrn.com/abstract=3964181
  9. 9. Schelldorfer J, Wüthrich MV. Nesting classical actuarial models into neural networks. SSRN. 2019. Available from: https://ssrn.com/abstract=3320525.
  10. 10. Bauer DT, Ameringer CF. A framework for identifying similarities among countries to improve cross-national comparisons of health systems. Health Place. 2010;16(6):1129–35. pmid:20675180
  11. 11. Murray CJL, Frenk JD. Ranking 37th–measuring the performance of the US health care system. N Engl J Med. 2010;362(2):98–99.
  12. 12. Thorpe RJ Jr, Koster A, Kritchevsky SB, Newman AB, Harris T, Ayonayon HN, et al. Race, socioeconomic resources, and late-life mobility and decline: findings from the health, aging, and body composition study. J Gerontol A Biol Sci Med Sci. 2011;66(10):1114–23. pmid:21743093
  13. 13. Matus JC. A comparison of country’s cultural dimensions and health outcomes. Healthcare (Basel). 2021;9(12):Article 12. Available from:
  14. 14. Hofstede G. National cultures revisited. Asia Pac J Manage 1984;2:22–28.
  15. 15. McDowell I, Spasoff RA, Kristjansson B. On the classification of population health measurements. Am J Public Health. 2004;94(3):388-93. pmid:14998801
  16. 16. Human Mortality Database (UK). Human Mortality Database. 2023. Available from: https://www.mortality.org/
  17. 17. Oram E, Dash PB, Naik B, Nayak J, Vimal S, Nataraj SK. Light gradient boosting machine-based phishing webpage detection model using phisher website features of mimic URLs. Pattern Recognit Lett. 2021;152:100–106.
  18. 18. Singh K, Xie M. Bootstrap: a statistical method. Rutgers University; USA: 2008. pp. 1–14. Available from: https://statweb.rutgers.edu/mxie/rcpapers/bootstrap.pdf.
  19. 19. Continuous Mortality Investigation. Working Paper 154: Final “16” Series term assurance mortality and accelerated critical illness tables. 2021. Available from: https://www.actuaries.org.uk/learn-and-develop/continuous-mortality-investigation/cmi-working-papers/assurances/cmi-working-paper-154.
  20. 20. OECD. Insurance Indicators – Life insurance share. 2023. Available from: https://stats.oecd.org/index.aspx?queryid=25443
  21. 21. OECD. Insurance Indicators – Density. 2023. Available from: https://stats.oecd.org/index.aspx?queryid=25445
  22. 22. OECD. Insurance Indicators – Penetration. 2023. Available from: https://stats.oecd.org/index.aspx?queryid=25444
  23. 23. OECD. Insurance Indicators – Total life gross premiums. 2023. Available from: https://stats.oecd.org/Index.aspx?DataSetCode=INSIND
  24. 24. OECD. Insurance Indicators – Life retention ratio. 2023. Available from: https://stats.oecd.org/index.aspx?queryid=25441
  25. 25. Global Residence Index. The Health Index. 2023. Available from: https://globalresidenceindex.com/hnwi-index/health-index/
  26. 26. The World Bank. Indicator Data – Retirement Pension. 2023. Available from: https://wbl.worldbank.org/en/data/exploretopics/getting-a-job#Retirement
  27. 27. The World Bank. World Health Organization’s Global Health Workforce Statistics - Physicians. 2023. Available from: https://data.worldbank.org/indicator/SH.MED.PHYS.ZS?most_recent_value_desc=true
  28. 28. The World Bank. The Global Health Observatory – Hospital beds. 2023. Available from: https://www.who.int/data/gho/data/indicators/indicator-details/GHO/hospital-beds-(per-10-000-population).
  29. 29. ChartsBin.com. Basic health services by country. 2023. Available from: http://chartsbin.com/view/41517.
  30. 30. The World Bank. World Health Organization Global Health Expenditure database – Current health expenditure. 2023. Available from: https://data.worldbank.org/indicator/SH.XPD.CHEX.PC.CD.
  31. 31. The World Bank. The Program in Global Surgery and Social Change – Risk of impoverishing expenditure for surgical care. 2023. Available from: https://data.worldbank.org/indicator/SH.XPD.CHEX.PC.CD.
  32. 32. Human Mortality Database. HMD - United Kingdom Total Population. 2023. Available from: https://www.mortality.org/Country/Country?cntr=GBR_NP/.
  33. 33. Li S, Cai TT, Li H. Transfer learning for high-dimensional linear regression: prediction, estimation and minimax optimality. J R Stat Soc Ser B Stat Methodol. 2022;84(1):149–73.
  34. 34. Li S, Cai T, Duan R. Targeting underrepresented populations in precision medicine: a federated transfer learning approach. Ann Appl Stat. 2023;17(4):2970–92.
  35. 35. Xu K, Bastani H. Learning across bandits in high dimension via robust statistics. arXiv, preprint, arXiv:2112.14233. 2021.
  36. 36. Perlibakas V. Distance measures for PCA-based face recognition. Pattern Recognit Lett. 2004;25(6):711–24.
  37. 37. Reps JM, Rijnbeek PR, Ryan PB. Identifying the DEAD: development and validation of a patient-level model to predict death status in population-level claims data. Drug Saf. 2019;42(11):1377–86. pmid:31054141
  38. 38. Carmona R, Delarue F, et al. Probabilistic theory of mean field games with applications I–II. Springer; 2018.
  39. 39. Gunay EE, Kula U. A two-stage stochastic rule-based model to determine pre-assembly buffer content. J Ind Eng Int. 2018;14:655–63.
  40. 40. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial networks. Commun ACM. 2020;63(11):139–44.
  41. 41. Bonabeau E. Agent-based modeling: Methods and techniques for simulating human systems. Proc Natl Acad Sci U S A. 2002;99:7280–7.
  42. 42. Simard PY, Steinkraus D, Platt JC, et al. Best practices for convolutional neural networks applied to visual document analysis. In: Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings. Edinburgh, UK: 2003. pp. 958–63. https://doi.org/10.1109/ICDAR.2003.1227801
  43. 43. Bishop CM, Nasrabadi NM. Pattern recognition and machine learning. Springer; 2006.
  44. 44. Levantesi S, Pizzorusso V. Application of machine learning to mortality modeling and forecasting. Risks. 2019;7(1):26.
  45. 45. Fahrmeir L, Kneib T, Lang S, Marx B. Regression models. Springer; 2013.
  46. 46. Yan J, Guszcza J, Flynn M, Wu CS. Applications of the offset in property-casualty predictive modeling. Casualty Actuarial Soc E-Forum. 2009;1(1):366–85.