Figures
Abstract
In Adult Social Care, UK local authorities have statutory responsibilities for assessing needs and delivering services to ensure adults’ wellbeing. Administrative data collected during this process may help local authorities’ compliance with these duties. We developed and internally validated predictive models for older people (>- 60 years) receiving social care for whether they remained at home or were admitted to care homes, two years after index assessments, using administrative data from one English City authority. We enquired whether the right data, to predict older people’s vulnerability to adverse outcome (care home) or evidence their strengths (remaining at home), were present in local authority systems, and if accurate datasets for large numbers of older people could be constructed to allow robust modelling. Logistic regression models were created with binary outcome (remaining at home/entering care homes). Sample size calculations determined the maximum number of candidate predictors we could consider for model development. 20,218 older people in the data cohort indicated we could consider a maximum of 46 candidate predictor parameters. In our final analyses we considered 31 candidate predictor parameters, in the areas of: age, sex, ethnicity, deprivation, housing tenure, abilities in activities of daily living, access to carer, Primary Support Reason, diagnosis of dementia. We used all data for model development and internal validation using cross-validation. Models were robust as to assumptions, with no evidence of overfitting and good predictive accuracy. Circumstances predicting strongly that older people would not remain at home were that they were: from other (non-white) ethnic groups, privately rented tenants or other (unstable) tenancies, not able to eat or keep their home habitable without difficulty, with their Primary Support Reason for personal care, mental health, or support with cognition. With public involvement partners, we created a prototype index for local authority use to inform professional decisions and/or local planning.
Citation: Clarkson P, Lerigo F, Martin GP, Wall L, Davies S, Nenadic G, et al. (2026) Development and validation of prediction models for predicting social care strengths and vulnerability in older people: Cohort study using routine data in Adult Social Care. PLoS One 21(4): e0328330. https://doi.org/10.1371/journal.pone.0328330
Editor: Belal Hossain, The University of British Columbia, CANADA
Received: July 9, 2025; Accepted: March 2, 2026; Published: April 15, 2026
Copyright: © 2026 Clarkson et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The authors declare that the minimal data set for this study cannot be shared publicly due to ethical and legal restrictions on sharing a de-identified data set which contains potentially identifying and sensitive participant information. Data are available from the Project Support Team via the University of Manchester by email (scsenquiries@manchester.ac.uk) for researchers who meet the criteria for access to confidential de-identified data. Requests for access to the underlying dataset will be forwarded to the relevant Data Controller for consideration.
Funding: This study was funded by the National Institute for Health and Care Research (NIHR), School for Social Care Research (grant number: 102645/CM/UMCR-P194) to PC, GPM, LW, SD, GN, PH and CR; https://sscr.nihr.ac.uk/ This report is independent research by the National Institute for Health and Care Research, School for Social Care Research. The views expressed in this publication are those of the author(s) and not necessarily those of the NIHR SSCR, the National Institute for Health and Care Research or the Department of Health and Social Care. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
In the United Kingdom (UK), Adult Social Care safeguards and protects individuals and provides practical support and care to vulnerable people. Responsibility for Adult Social Care in England lies with 152 local authorities, which have a range of statutory responsibilities to ensure the wellbeing of adults who access and receive social care services [1]. Authorities respond to this by assessing needs and delivering services to hopefully meet those needs. Older people form a large part of the recipients of social care services. Most older people want to stay independent for as long as they can, but many older people underreport their difficulties, often accepting societal assumptions that their circumstances cannot be improved [2]. They may not reveal difficulties to professionals who assess them for the care they might need because of uncertainties over what the responses might be [3]. Likewise, local authority professional assessors may under-detect circumstances driving social care needs amongst older people, sometimes due to inadequacy of assessment documentation or their reticence to probe about circumstances, such as, for example, low mood or incontinence [4]. In both these scenarios, crucial pieces of information might help improve the assessment of future risk to aid decision-making. Ways to gather and use this information unobtrusively, alongside or apart from the face-to-face encounter between older person and assessor, may be beneficial in planning and monitoring the care required and in local authorities’ complying with their legislative duties to promote citizens’ wellbeing.
The process of assessment and service provision in Adult Social Care generates a wealth of routine administrative data on older people’s circumstances, their available resources (such as family carer support) and strengths. However, these data are often underutilised in terms of investigating potential or unmet needs, risks of adverse outcomes or to assist in the longer-term planning of care. This is largely because authorities differ in the sophistication of the data that is generated and held and how it could be used in local evaluation, care planning and research.
There is increasing interest, however, in using the data generated from routine Adult Social Care services for secondary purposes in developing prediction models [5] that can use information about an individual and their environment to estimate their risk of adverse outcomes [6]. Such analyses are relatively more common in healthcare than social care and have been undertaken to inform professional decisions [7,8]. These healthcare prediction models often conceive of predictor variables in the context of deficits or frailty. In the context of Adult Social Care, however, such prediction models could be used to help inform a more positive social vulnerability approach [9], where prediction could identify older people’s daily living difficulties or circumstances that can be mitigated by environmental changes as well as through their own strengths.
The aim of this study was to use routine, administrative data accessed and analysed within Manchester City Council, a large English local authority, to develop and validate a predictive model of social care strengths and vulnerability, two years from initial index assessments as part of the Adult Social Care system. We aimed to discover if and how a prediction model for social care outcomes, using routine data, could be developed, and used to identify older people’s needs and potentially judge threats to their wellbeing. The paper also presents some of the lessons learned from undertaking such predictive modelling in social care, a sector where routine administrative data are not generally used for the purposes of research. We aimed to discover if:
- the right data, to potentially predict older people’s vulnerability to adverse outcomes or evidence their strengths in lessening these, were present in local authority systems.
- accurate, individual-level, datasets for large numbers of older people, representative of those receiving social care, could be constructed to allow robust statistical modelling.
- robust predictive models, statistically predicting outcomes and tested for assumptions, missing data and model fit, could be developed and validated.
- findings could be used to enable a prediction algorithm [10,11] to be made available, as an index for social care, that local authorities could use in planning more appropriate care around older people’s circumstances, resources, and strengths.
Materials and methods
This study was funded by the National Institute for Health and Care Research (NIHR), School for Social Care Research in England as a proof of concept [12]. The reporting of this study adheres to best practice recommendations [13] and followed the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) guidelines [14]. The concept, data collation and analysis plan had the input of staff from Performance, Research and Intelligence, Manchester City Council who helped develop the application for funding, discussed data availability and access with the research team and helped draft the analysis plan. Our public involvement and engagement partners at Made by Mortals CIC [15], designed innovative, inclusive methods for deliberating on the proposed analyses and development of a predictive index and helped in sharing our learning to other local authority Adult Social Care departments.
Source of data and participants
Data for the development of the prediction model was from the Adult Social Care department of Manchester City Council, administrative data owned by the local authority as a public body, not research data. The Performance, Research and Intelligence team of the council harness the data generated from adult service users’ contacts with the social care system, alongside data from other council departments and local National Health Service (NHS), able to be accessed by them under devolved arrangements between health and social care [16]. There are around 70,000 contacts with Adult Social Care per-year in the city, from a population of over 424,992 adult residents. The data contain the social care information on all adults who access services in the Manchester City local authority area, 32 constituency wards comprising the central urban district within the Greater Manchester region, which covers 10 local authorities in Northwest England.
The study was reviewed and approved by an institutional review board (ethics committee) before the study began (Health Research Authority, Social Care REC: Ref: 23/IEC08/0012). University research staff were granted access to the dataset (from 31/08/2023) in the council’s secure environment for analysis (researchers went to the data, the data did not come to them), complying with established practice in this type of data access for research [17]. Authors had no access to information that could identify individual participants (data subjects) during or after data collection.
We identified a cohort of older people, aged 60 years and above, from the dataset who had initial contact, care assessments [1], or reviews from the council between 1st April 2021 and 31st March 2023, inclusive. Those under Deprivation of Liberty Safeguards [18] were excluded, due to their non-capacity to consent to council privacy notices concerning access to their data. The index date was taken as the date of first contact assessment for all individuals, with their data followed up until they had the outcome of interest or the end of the cohort period, of 31 March 2023, whichever occurred first.
Outcomes
The primary outcome was whether older people remained at home (i.e., they were not permanently admitted to a care home) at the end of the cohort period, i.e., as of 31 March 2023. Remaining at home and not entering permanent residential or nursing home care is a key marker of success for social care at home [19] and thus was used as an indicator of older people’s strengths in mitigating any threats to their wellbeing. We also re-ran this model with this outcome reversed, investigating predictors of permanent placement in a care home, an indicator of vulnerability and threats to wellbeing, through the cohort period. This variable was the obverse of the primary outcome, in that it reflects an adverse outcome, albeit entering a care home can be a positive choice for older people and their families. Of all care home admissions in the council dataset, those where the admission was permanent and not temporary, i.e., for respite after hospital discharge or during COVID-19 aftercare arrangements, were used to indicate this outcome of interest.
Candidate predictor variables
We selected, a-priori, potential candidate predictor variables that we hypothesised would be linked to adverse outcomes, like care home admission, or to be indicative of potential threats to older people’s wellbeing, and from the literature [9,20–23]. These 27 candidate data items (86 predictor parameters) were ones for which we first aimed to investigate whether variables were available in the local authority systems. They are summarised in Box 1. We defined these variables according to data recorded at, or prior to the index date of each individual (i.e., those variables that are available for the first assessments individuals have with the local authority Adult Social Care department). We determined which of these variables were available and could be extracted into a database for analysis, along with reasons for non-availability, through discussions with council data analysts. Additionally, we collapsed categories of some variables in cases of low levels of a categorical variable, and to permit more parsimonious model estimation. Our choice of variables to take forward into the final model analyses were also governed by whether they would be available, or relevant, at the point the prediction model would be applied in practice. For example, service use during the cohort period is a response to circumstances identified at index assessment (in other words, an outcome), so is not relevant here in predicting future needs. In Box 1 we also identify the items that were taken forward as candidates for final model development/analysis, after we had completed this process of combining categories and removing variables irrelevant to prediction.
Box 1. Candidate predictor variables evaluated during model development.
Demographic
Age, from date of birth, b in years (continuous), range 60 < 121 [a]
Sex * in four categories (Male, Female, Other, Unknown) [a]
Ethnicity * in 21 categories (Asian or Asian British – Indian, Asian or Asian British – Pakistani, Asian or Asian British – Bangladeshi, Asian or Asian British – Chinese, Asian or Asian British - Any other Asian background, Black, Black British, Caribbean or African – Caribbean, Black, Black British, Caribbean or African – African, Black, Black British, Caribbean or African - Any other Black, Black British or Caribbean background, Mixed or multiple ethnic groups - White and Black Caribbean, Mixed or multiple ethnic groups - White and Black African, Mixed or multiple ethnic groups - White and Asian, Mixed or multiple ethnic groups - Any other Mixed or multiple ethnic background, White - English, Welsh, Scottish, Northern Irish or British, White – Irish, White - Gypsy or Irish Traveller, White – Roma, White - Any other White background, Other ethnic group – Arab, Other ethnic group - Any other ethnic group, No data – Refused, No data - Undeclared or Not known) [a]
Index of Multiple Deprivation (IMD) [24] (continuous) based on Lower Super Output Area, derived from postcode b [a]
Living Situation
Living Situation in 2 categories (Alone, with Carer/other family)
Accommodation/Living Situation * in 20 categories (Owner occupier or shared ownership scheme, Tenant, Tenant - private landlord, Settled mainstream housing with family/ friends, Supported accommodation/ supported lodgings/ supported group home, Shared Lives scheme, Approved premises for offenders, Sheltered housing/ extra care housing/ other sheltered housing, Mobile accommodation for Gypsy/ Roma and Traveller communities, Rough sleeper/ squatting, Night shelter/ emergency hostel/ direct access hostel, Refuge, Placed in temporary accommodation by the council (inc. homelessness resettlement), Staying with family/ friends as a short-term guest, Acute/ long-term healthcare residential facility or hospital, Registered care home, Registered nursing home, Prison/ Young offenders institution/ detention centre, Other temporary accommodation, Unknown) [a]
Has unpaid carer b in 3 categories (Yes, No, unknown) [a]
Functioning
Activities of daily living; Eating & Drinking in two categories, whether have difficulties in (yes, no) [a]
Activities of daily living; Maintaining Personal Hygiene in two categories, whether have difficulties in (yes, no) [a]
Activities of daily living; Managing Toilet Needs in two categories, whether have difficulties in (yes, no) [a]
Activities of daily living; Being Appropriately Clothed in two categories, whether have difficulties in (yes, no) [a]
Activities of daily living; Maintaining a Habitable Home Environment in two categories, whether have difficulties in (yes, no) [a]
Primary Support Reason, support reasons in Care Act National Eligibility Criteria b in 13 categories (Physical Support: Access & mobility only, Physical Support: Personal care support, Sensory Support: Support for visual impairment, Sensory Support: Support for hearing impairment, Sensory Support: Support for dual impairment, Support with Memory & Cognition, Learning Disability Support, Mental Health Support, Social Support: Substance misuse support, Social Support: Asylum seeker support, Social Support: Support for Social Isolation/Other, Social Support: Support to Carer, Unknown) [a]
Being safe in and around your home in 2 categories (Yes/No)
Making use of facilities and services in the local community in 2 categories (met/unmet)
Health conditions likely to incur risk or threats to wellbeing
Visual Impairment b in 5 categories (Blind/severely sight impaired, Partial sight/sight impaired, No visual impairment, Visual impairment - severity unknown, Unknown
Hearing Impairment b in 6 categories (Deaf with speech, Deaf without speech, Hard of hearing, No hearing impairment, Hearing impairment - severity unknown, Unknown)
Dementia b in 3 categories (Yes, No, Unknown) [a]
Falls in 2 categories (Yes, No)
Service use, likely indicative of greater need, or to control for previous service use
Home care in hours per week (continuous)
Day care in hours per week (continuous)
Delivered meals in units per week (continuous)
Delivered equipment/aids in 2 categories (Yes, No)
Transport in units per week (continuous)
Professional support, occupational therapy, social work in units per week (continuous)
Hospital admissions, number within period (continuous)
Accident & Emergency attendances, number within period (continuous)
a considered as candidates for final model development/analysis after undertaking process of category-combining and removing irrelevant variables.
b Items included in the Adult Social Care Client Level Data (CLD); the mandatory data set that local authorities began to submit to NHS England from July 2023 [25]
Sample size
As the dataset created was a fixed sample size, calculations were used to determine the maximum number of candidate predictors (prediction parameters) that could be considered for model development (i.e., prior to any data driven predictor selection) [26]. These calculations were performed in the R package pmsampsize [27]. For inputs into these calculations, we used a conservatively proposed c-statistic of 0.7 (i.e., level of discrimination predictive performance that we expected the models to reach), a target shrinkage of 0.9, and an outcome prevalence of 0.96 (remaining at home) and 0.04 (those in care homes) taken from our development data.
For our fixed available sample size (20,218) the sample size calculations indicated we could consider a maximum of 46 candidate predictor parameters (which results in a minimum required sample size of 19,770 and 20,064 for the primary and secondary outcome, respectively). In our final analyses (Tables 1 and 2) we eventually considered 31 candidate predictor parameters.
Missing data
The percentage of missing values across the variable categories included in analysis varied between 0 and 76%. S1 Table in the Supplemental Materials gives the missing data rates of each variable. Many older people had no scores for some variables, as unless the local authority progressed their initial contact to a full Care Act [1] assessment, these variables were not measured. Data was inspected using graphical plots generated by the VIM 6.2.2 package [28], no pattern of missing data differentiated any socio-demographic group of older people. We used multiple imputation to analyse five multiple imputed datasets. The multiple imputation procedure was repeated in each iteration of the cross-validation loops. Methodologists currently regard multiple imputation as the preferred technique for management of missing data because it improves accuracy and statistical power relative to alternative missing data techniques. Graphical checks were performed on continuous variables to observe the relationship between each predictor and outcome variable. Incomplete variables were imputed under fully conditional specification, using the default settings of the mice 3.0 package [29]. Fully conditional specification was used, as in the mice package an imputation method is chosen dependent on the measurement level of each variable and a fully conditional specification is considered suitable where linear relationships are observed between each continuous predictor variable and the outcome variable. Visual plots were inspected to ensure each variable had achieved acceptable imputation. The parameters of substantive interest were estimated in each imputed dataset separately and combined using Rubin’s rules [30].
Statistical analysis methods
As recommended [26], we used all the data for model development and internal validation using cross-validation. This is an iterative process where the full dataset was split into five random folds of equal size. In each step, a fold was left out with the models developed on the remaining folds that was then validated on the excluded fold; this process then repeats at each stage.
For model development, two models were developed. First, we fitted a logistic regression model to the binary outcome of whether older people remained at home versus not (i.e., admitted permanently to care home or other care facility) within two years of the first index contact. Second, we fitted a logistic regression model to this outcome (reversed), treated as a binary outcome of placed in a care home (vs. not placed in care home) after first index contact assessment. For all modelling, we considered all covariates simultaneously as main effects and model assumptions were tested post-estimation.
For validation of each model (in each cycle of the cross-validation process), predictive performance was assessed using metrics of calibration (agreement between observed and expected outcomes) and discrimination (ability of the models to differentiate those who will experience the outcomes from those who will not). We provided calibration plots, assessments of calibration-in-the-large and calibration slope, and c-statistics for discrimination [31]. All performance results were summarised across the cycles of the cross-validation processes, including calculation of 95% confidence intervals. For graphical summaries of predictive performance (calibration plots), we report the calibration plots for each of five iterations in the cross-validation analyses separately. Calibration plots show the observed outcome proportion against the expected outcome proportion from the model, with a perfectly calibrated model having a calibration plot along the y = x line.
All analyses were performed using R version 4.3.1 “Beagle Scouts” [32], along with the packages: broom [33], car [34], dplyr [35], forcats [36], mice [37], predRupdate [38], readxl [39], tidyverse [40], VIM [41], and writexl [42].
Results
Overall study cohort
There were 20,218 older people overall in the study cohort who had received at least a contact assessment from the local authority with many (75%) going on to receive subsequent care needs assessments and reviews.
Data availability.
From the list of candidate predictor variables (Box 1) the following functioning variables were unavailable: ‘being safe in and around home’ and ‘making use of facilities and services in the local community’. These were items that would have been drawn from the council strengths and needs assessment, but which were not mandatorily populated in the routine administrative data. In terms of health conditions, falls were not available, being sometimes recorded in primary care, but not mandatorily as routine data available to the council. For service use, meals, transport, and professional support were not available, being not recorded in council systems for contracted agencies (meals, transport) or not systematically available in routine data (professional support).
Baseline characteristics
Table 3 shows the baseline characteristics of the cohort. The mean age of older service users was 77 years. The majority (58%) were female and White British/Irish (62%). Manchester City residents were living in areas of high deprivation (mean IMD of 44) with 40 per cent living as housing association tenants, and 38 per cent as owner occupiers. In terms of functioning at index assessment, there were significant difficulties with Activities of Daily Living experienced by older users in the domains of eating (78% had difficulties, unable to function without help), personal hygiene (81%), managing the toilet (57%), being appropriately clothed (70%), and maintaining the home (74%). Almost 30 percent had an unpaid family carer they could call on for support. In terms of the Primary Support Reason they approached the social care authority for, mostly this was for personal care (40%) and access and mobility (31%). Around 9 per cent of older users had a diagnosis of dementia.
Predictor variables
The variables eventually included in the models were therefore ones for which information was available and relevant to the concept of identifying risks of future harms (or strengths) in a prediction model. Therefore, those unavailable, above, were excluded from the models. Also, all service use variables (Box 1) were excluded from the modelling, as they were not relevant for predicting future outcomes, as individuals’ service use is itself an (intermediate) outcome. Tables 1 and 2 outline all eventual predictor variables and their candidate predictor parameters included in the final model analyses.
Model evaluation
Tables 1 and 2 show the coefficients of our final models for remaining at home or entering care homes respectively. For the model of remaining at home (Table 1) the following were significant predictors: Ethnicity Asian or Asian British, Black or Black British (more likely to remain at home); Living in area of high deprivation (more likely to remain at home); living in residential care or ‘other’ tenancy – including unstable accommodation (less likely to remain at home); No access to carer (less likely to remain at home)’ Primary Support Reason of needing physical or mental health support (less likely to remain at home) or support to carer (more likely to remain at home). For the model of entering care homes (Table 2) the following were significant predictors: Ethnicity Black or Black British (less likely to enter care homes) or other non-white ethnicity (more likely to enter care homes); living in private tenancy (more likely to enter care homes); living in residential care or other tenancy (more likely to enter care homes); Activities of Daily Living (ADL) unable to eat and keep the home habitable without difficulty (more likely to enter care homes); Primary Support Reason being need for physical, mental health support or support with memory and cognition (more likely to enter care homes).
Both models were robust as to assumptions for logistic regression, i.e., binary, independence of variables, large enough sample size, linearity of predictors and log odds, and multicollinearity.
Model validation
Tables 4 presents the predictive performance results from the internal cross-validation process for each model. It can be seen, in the data reflecting the full models, that calibration-in-the-large/pooled c-statistic (calibration intercept = 0) is good and no evidence of overfitting (calibration slope inside range 0.9-1.1). Low Brier scores indicated good predictive accuracy.
S1 Fig in supplementary information, presents calibration plots of observed vs predicted probability for the models of remaining at home in each of the five cycles of the cross-validation process. These show that the models were well calibrated across the full risk range.
Developing a predictive index
Results from these analyses were shared with the council and in an interactive, forum-theatre style immersive workshop event run by our public involvement and engagement partners [43]. An audience of stakeholders (council data intelligence officers, service managers, social workers, charity members, researchers, and older people drawing on social care) deliberated on the form a social care predictive index could take. It was envisaged that this could be to aid professional judgement in individual cases and/or support service planning or forecasting at a population level. It could also put the older person in control of their care, in conversations with professionals, by using the data from the index as a feedback mechanism to guide joint decision-making. An index was thought to be broadly acceptable to stakeholders, but with some caveats. Basing an index on the ‘right’ data was viewed as important. It was thought professionals must “read into” the tool the “right things”, which should include such data as: age, housing situation, how much support older people need and details about their health. There was concern and a little distrust of data and how it might be used, e.g., by central government. The idea of the ‘strengths’ part of the project was seen as worthwhile – “to build on what people can do, not what they can’t”. The data used for the index should also be accurate, and there was concern at data becoming dated: “would it reflect present circumstances or those six months ago?”
Comprehensive analysis of the data would produce a tool that would hopefully “give you an idea of what the outcomes might be and what you might ultimately be looking at facing in the future and being able to put plans into place in advance of reaching that point”. But assurances of data security were seen as important and the index should be seen only as a support to professional judgement, with “computers and data being only tools at the hands of human beings”
We developed the index with the feedback from the workshop event in mind. From our analyses, we determined the contributions of each predictor variable to the outcome from the model of risk of care home admission. We multiplied each covariate by the corresponding coefficient estimate and summed across these to get the linear predictor, then applied the inverse-logit function to obtain the predicted risks [44] for older people with different characteristics, in an exemplar interactive index. We used R and a Shiny package to create this [45]. Our exemplar index, predicting individuals’ risk of care home admission, is openly accessible (at: https://scs-10.shinyapps.io/VisableRev/), giving an output of predicted risk, interpreted as the degree of risk of being in a care home if the person has the circumstances considered. From this index, S2 and S3 Fig in the Supplemental Materials show two examples of potential use of the index in future practice by council team leaders or social workers, highlighting the probability of care home admission, for hypothetical cases of relatively high and low risk respectively. These examples are given to adhere to reporting guidelines for prediction modelling (TRIPOD) that clearly state that authors should give examples of how to make predictions from the model.
Supplementary S2 Fig shows a hypothetical case of a 70-year woman from a relatively deprived area of the inner city (IMD = 46) living in a private tenancy, with ‘other’ (non-white) ethnicity. In terms of need, they would have been assessed as not able to feed themselves or look after their home, but were able to bathe and groom, use the toilet and dress themselves. Their records would show that they did not have access to a carer and their Primary Support Reason for accessing social care would be for ‘mental health support’. They would be known to have a dementia diagnosis. The Figure shows how one could make a prediction for such a hypothetical person, who would have entered the social care system, with these items of information available in their record. For this hypothetical case, there would be a 12% probability of them being admitted to a care home, two years after their initial social care assessment.
Supplementary S2 Fig shows a hypothetical case of a 61-year-old white man, an owner-occupier from a well-resourced outer-city area (IMD = 15.51). In terms of need, they would have been assessed as able to undertake all daily living activities but would have difficulties looking after their home. Their records would show that they did have access to a carer and their Primary Support Reason for accessing social care would be for ‘access and mobility’. They would be known to not have a dementia diagnosis. The Figure shows how one could make a prediction for such a hypothetical person, who would have entered the social care system, with these items of information available in their record. For this hypothetical case, there would only be a 1% probability of them being admitted to a care home, two years after their initial social care assessment.
It is not recommended that this prototype index yet be used to support operational decision making in individual cases in local authority Adult Social Care departments. The analyses presented here require further external validation in other local authorities before this is appropriate. Work is planned on an external validation in other local authorities in the UK, the results of which will be presented separately. Work is also underway to investigate how the index could potentially be reconfigured to assist service planning at a population level, rather than individual risk profiling.
Discussion
We developed and validated statistical models from routine administrative Adult Social Care data in a local authority in Greater Manchester, an urban, city authority with a diverse population of older people requiring access to social care support. From these, we provided robust, well-calibrated models for predicting which sorts of older people’s circumstances predicted the likelihood of them either remaining at home or entering care homes two-years after initial assessment. We found circumstances predicting strongly that older people would not remain at home, and enter care homes, were: that they belonged to other (non-white) ethnic groups; that they were privately rented tenants or living in other (unstable) tenancies; that, in terms of their activities of daily living, they were not able to eat or keep their home habitable without difficulty; that their Primary Support Reason was for personal care, mental health, or support with memory and cognition. There are potential variable interactions, not considered in the current model, which could be considered in the future. These include problems with living conditions, interacting with functional problems, like abilities in activities of daily living, and interactions between living alone (or without carer), cognitive and functional impairment [46].
These findings have broad face validity according to variables from research investigations into the factors most likely to predict the living circumstances of older people from initial assessments of their needs [20–23]. Particularly not having access to a carer (showing a significant negative relationship in our remaining at home model) and requiring personal care, mental health support and support with memory and cognition would be expected to be important in determining older people’s risks of entering care homes. Also, very important at initial assessment is the level of performance in the activities of daily living, whose relationship with older people’s later living situation is evidenced in previous research enquiry. Abilities in activities of daily living follow a set pattern through life, those reflecting basic functions essential to survival – such as feeding – are acquired first with more complex learned activities – such as dressing – acquired last [47]. In aging or as long-term conditions are acquired, this process is reversed; the more basic functions are retained the longest and so cases where older people experience difficulties with these should be signalled for special attention [48]. Our finding of activity difficulties with eating being a significant predictor of the likelihood of care home admission reinforces the use of such a data item as integral to assessing risk in social care.
However, although these findings appear more-or-less established from research studies, one advantage of our study is that the administrative data are from large numbers of real, routine cases in Adult Social Care departments, that have complied with eligibility criteria for support and exhibit a range of care needs, and from operational data of users’ journeys through the care system. The data are thus representative of the population of older social care users receiving support from the local authority. Our sample size was more than adequate and predictor-outcome relationships, on the whole, are as expected from the primary research evidence.
On the other hand, our study does have limitations. One limitation is our finding from area deprivation (drawn from anonymised post-code linkage), which appeared counter-intuitive, as those in areas of greater deprivation were more likely to remain at home. This could be explained by the distribution of the data, particularly with Manchester City having a preponderance of higher deprived small local areas. And, therefore, there is a much greater effect of higher deprivation on the likelihood of remaining at home; in Manchester, 19,505 (96%) older people remained at home two-years after initial assessment, as opposed to 713 (4%) entering care homes on a permanent basis. There is therefore a need to investigate the generalisability of our models for transportability across regions of the UK, where the prevalence of deprivation may be lower and so its impact on the likelihood of care home admission may be more pronounced. A further limitation is our omission of some relevant well-known, predictors of care home admission [22], for which the data were unavailable. For example, older people’s feelings of safety at home and the existence of falls are relevant predictors of entering care homes but were unavailable to include in our models. Other candidate predictors in our study, important in the research literature, for example making use of facilities in the local community (social participation), were not available in the Adult Social Care data. More detailed data of circumstances that may pose greater risk of care home admission, such as details of older people’s actual home or living situation, for example whether there are stairs or other impediments to independent living, were also not available and are not likely to be available in the future. Another limitation is that some variables had high levels of missingness, however this was seven variables from fifteen with missingness over 50%. The impact of this is likely to be relatively minimal. However, we do plan, in future, to externally validate this model.
We were able to develop a prototype index from these analyses and prediction estimates that the council and other local authorities, could potentially use to help inform future professional decisions, service configuration and/or local planning. User, practitioner and manager views on such an index were that it was broadly acceptable but that it must be based on the ‘right’ data, that the data and analysis was trusted, and that the data were secure. We provide examples from this index, which can be used to inform discussion about its future development. As stated, our study was a proof-of-concept to examine whether the right data in this setting could be available and that models could be run on the data that were well calibrated and could reliably predict our chosen outcomes. We would not recommend our prototype decision-making tool yet be used routinely by local authorities to judge risk in individual older people. Further external validation in other local authorities is needed, particularly those with differences, importantly, in deprivation levels, rurality, and ethnic mix that are different to those in Manchester. In addition, we would like to improve the look and useability of the tool for social care professionals and managers in terms of the information it can provide, for those largely unfamiliar with statistical language and outputs.
Adult social care practice in England does not presently capitalise enough on the wealth of routine data existing and generated within local authority Adult Social Care. Our study showed that much of the data most useful in predicting what might happen to older users in their care trajectories is now currently being collected by local authorities. These data items, including those of Primary Support Reason, carer access, accommodation-type, and ethnicity, are those now mandated in the national Client Level Data (CLD) set, and so will be collectable by local authority performance and intelligence teams. The CLD is now planned to be available nationally, for all local authorities, also for researchers to eventually access and use [49,50]. Therefore, this can be used in future to undertake more sophisticated analyses, on users’ journeys through the care system to inform evaluations of how councils are responding to their Care Act [1] responsibilities. Our analyses point the way to how this could successfully be done, in partnership with local authority Adult Social Care departments.
Supporting information
S1 Checklist. TRIPOD checklist: Prediction model development and validation.
https://doi.org/10.1371/journal.pone.0328330.s001
(PDF)
S1 Table. Missingness in variables used in the logistic models.
https://doi.org/10.1371/journal.pone.0328330.s002
(PDF)
S1 Fig. Calibration plots of observed vs predicted probability for the models of remaining at home.
https://doi.org/10.1371/journal.pone.0328330.s003
(PNG)
S2 Fig. Example risk calculation for relatively high-risk case.
https://doi.org/10.1371/journal.pone.0328330.s004
(PDF)
S3 Fig. Example risk calculation for relatively low risk case.
https://doi.org/10.1371/journal.pone.0328330.s005
(PDF)
References
- 1.
UK Government. Care Act 2014, c.23. Available from: http://www.legislation.gov.uk/ukpga/2014/23/pdfs/ukpga_20140023_en.pdf
- 2.
Philp I. No time to lose. In: Goodwin J, Rossall P, editors. Improving Later Life. London: Age UK. 2012. p. 58–61.
- 3. Howse K, Ebrahim S, Gooberman-Hill R. Help-avoidance: why older people do not always seek help. Rev Clin Gerontol. 2005;14:63–70.
- 4. Clarkson P, Abendstern M, Sutcliffe C, Hughes J, Challis D. Reliability of needs assessments in the community care of older people: impact of the single assessment process in England. J Public Health (Oxf). 2009;31(4):521–9. pmid:19372147
- 5.
Riley RD, van der Windt D, Croft P, Moons KGM. Prognosis research in healthcare concepts, methods, and impact. Oxford: Oxford University Press; 2019.
- 6.
NHS England Transformation Directorate. AI in adult social care: Innovation to improve outcomes and increase choices for adult social care service users. 2021. Available from: https://www.nhsx.nhs.uk/ai-lab/explore-all-resources/understand-ai/ai-adult-social-care/
- 7. Sessler DI, Sigl JC, Manberg PJ, Kelley SD, Schubert A, Chamoun NG. Broadly applicable risk stratification system for predicting duration of hospitalization and mortality. Anesthesiology. 2010;113(5):1026–37. pmid:20966661
- 8. Zhao J, Guo Z, Liu X, Glaser DL, Elkassabany NM, Liu J. The development and application of a risk stratification index system for outpatient shoulder arthroscopy patient management-a single academic center’s experience. JSES Open Access. 2017;1(1):1–4. pmid:30675530
- 9. Andrew MK, Mitnitski AB, Rockwood K. Social vulnerability, frailty and mortality in elderly people. PLoS One. 2008;3(5):e2232. pmid:18493324
- 10. Clift AK, Coupland CAC, Keogh RH, Diaz-Ordaz K, Williamson E, Harrison EM, et al. Living risk prediction algorithm (QCOVID) for risk of hospital admission and mortality from coronavirus 19 in adults: national derivation and validation cohort study. BMJ. 2020;371:m3731. pmid:33082154
- 11. Clegg A, Bates C, Young J, Ryan R, Nichols L, Ann Teale E, et al. Development and validation of an electronic frailty index using routine primary care electronic health record data. Age Ageing. 2016;45(3):353–60. pmid:26944937
- 12. Kush R, Alschuler L, Ruggeri R, Cassells S, Gupta N, Bain L, et al. Implementing Single Source: the STARBRITE proof-of-concept study. J Am Med Inform Assoc. 2007;14(5):662–73. pmid:17600107
- 13.
Steyerberg EW. Clinical prediction models. New York: Springer; 2009.
- 14. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015;162(1):55–63. pmid:25560714
- 15.
Made by Mortals: Bringing lived experience to life. 2025. Available from: https://www.madebymortals.org/
- 16. Malik B, Wells J, Hughes J, Clarkson P, Keady J, Young A, et al. Complex care needs and devolution in Greater Manchester: a pilot study to explore social care innovation in newly integrated service arrangements for older people. Aust Health Rev. 2020;44(6):838–46. pmid:32788034
- 17. Desai T, Ritchie F, Welpton R. Five Safes: designing data access for research. Bristol Business School Working Papers in Economics. 2016. 1601 p. Available from: https://www2.uwe.ac.uk/faculties/BBS/Documents/1601.pdf
- 18.
UK Government. Deprivation of liberty safeguards: resources. 2024. Available from: https://www.gov.uk/government/publications/deprivation-of-liberty-safeguards-forms-and-guidance
- 19. Wiles JL, Leibing A, Guberman N, Reeve J, Allen RES. The meaning of “aging in place” to older people. Gerontologist. 2012;52(3):357–66. pmid:21983126
- 20. Hildon ZJ-L, Tan CS, Shiraz F, Ng WC, Deng X, Koh GCH, et al. The theoretical and empirical basis of a BioPsychoSocial (BPS) risk screener for detection of older people’s health related needs, planning of community programs, and targeted care interventions. BMC Geriatr. 2018;18(1):49. pmid:29454316
- 21. Clarkson P, Venables D, Hughes J, Burns A, Challis D. Integrated specialist assessment of older people and predictors of care-home admission. Psychol Med. 2006;36(7):1011–21. pmid:16563208
- 22. Luppa M, Luck T, Weyerer S, König H-H, Brähler E, Riedel-Heller SG. Prediction of institutionalization in the elderly. A systematic review. Age Ageing. 2010;39(1):31–8. pmid:19934075
- 23. Bardsley M, Billings J, Dixon J, Georghiou T, Lewis GH, Steventon A. Predicting who will use intensive social care: case finding tools based on linked health and social care data. Age Ageing. 2011;40(2):265–70. pmid:21252036
- 24.
UK Government. Ministry of Housing, Communities and Local Government. The English Indices of Deprivation 2019. Frequently Asked Questions. 2019. Available from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/853811/IoD2019_FAQ_v4.pdf
- 25.
NHS Digital. Collection of Client Level Adult Social Care Data (No 3) Directions. 2023. Available from: https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/secretary-of-state-directions/collection-of-client-level-adult-social-care-data-no-3
- 26. Riley RD, Ensor J, Snell KIE, Harrell FE, Martin GP, Reitsma JB. Calculating the sample size required for developing a clinical prediction model. BMJ. 2020;m441.
- 27.
Package ‘pmsampsize’, October 14, 2022, Version 1.1.2; The Comprehensive R Archive Network. Available from: https://cran.r-project.org/
- 28. Kowarik A, Templ M. Imputation with the R Package VIM. J Stat Soft. 2016;74(7):1–16.
- 29. van Buuren S, Groothuis-Oudshoorn K. mice: Multivariate Imputation by Chained Equations inR. J Stat Soft. 2011;45(3):1–67.
- 30.
Rubin DB. Multiple imputation for nonresponse in surveys. New York: John Wiley & Sons, Inc; 1987.
- 31. Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology. 2010;21(1):128–38. pmid:20010215
- 32.
R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2023. Available from: https://www.R-project.org/
- 33. Robinson D, Hayes A, Couch S. broom: Convert statistical objects into tidy tibbles. R package version 1.0.7. 2024. Available from: https://CRAN.R-project.org/package=broom
- 34.
Fox J, Weisberg S. An R Companion to Applied Regression. 3 ed. Thousand Oaks (CA): Sage. 2019. Available from: https://www.john-fox.ca/Companion/
- 35. Wickham H, François R, Henry L, Müller K, Vaughan D. dplyr: A grammar of data Manipulation. R package version 1.1.4. 2023. Available from: https://CRAN.R-project.org/package=dplyr
- 36. Wickham H. forcats: Tools for Working with Categorical Variables (Factors). R package version 1.0.0. 2023. Available from: https://CRAN.R-project.org/package=forcats
- 37. van Buuren S, Groothuis-Oudshoorn K. mice: Multivariate Imputation by Chained Equations in R. J Stat Soft. 2011;45(3).
- 38. Martin G, Jenkins D, Sperrin M. predRupdate: Prediction model validation and updating. R package version 0.2.0. 2024. Available from: https://CRAN.R-project.org/package=predRupdate
- 39. Wickham H, Bryan J. readxl: Read Excel Files. R package version 1.4.5. 2025. Available from: https://CRAN.R-project.org/package=readxl
- 40. Wickham H, Averick M, Bryan J, Chang W, McGowan L, François R, et al. Welcome to the Tidyverse. JOSS. 2019;4(43):1686.
- 41. Templ M, Kowarik A, Alfons A, Cillia G, Prantner B, Rannetbauer W. VIM: Visualization and imputation of missing values. R package version 6.2.2. 2025. Available from: https://cran.r-project.org/web/packages/VIM/index.html
- 42. Ooms J, McNamara J. writexl: Export data frames to Excel ‘xlsx’ format. R package version 1.5.4. 2025. Available from: https://cran.r-project.org/web/packages/writexl/index.html
- 43.
Made by Mortals. For Data’s Sake: An interactive event exploring a new social care index. 2025. Available from: https://www.madebymortals.org/for-datas-sake-an-interactive-event-exploring-a-new-social-care-index/
- 44. Moons KGM, Harrell FE, Steyerberg EW. Should scoring rules be based on odds ratios or regression coefficients?. J Clin Epidemiol. 2002;55(10):1054–5. pmid:12464384
- 45. Ji X, Kattan MW. Tutorial: development of an online risk calculator platform. Ann Transl Med. 2018;6(3):46. pmid:29610738
- 46.
Manchester City Council. Deprivation: data and intelligence. 2024. Available from: https://www.manchester.gov.uk/info/200088/statistics_and_intelligence/2168/deprivation
- 47. Williams RG, Johnston M, Willis LA, Bennett AE. Disability: a model and measurement technique. Br J Prev Soc Med. 1976;30(2):71–8. pmid:953379
- 48.
Clarkson P. What research tells social workers about their work with older people. In: Davies M, editors. Social work with adults. London: Palgrave, Macmillan; 2012. p. 300–13.
- 49.
Social Care Institute for Excellence (SCIE). How data can support better outcomes for people needing care and support, using Client Level Data (CLD). 2024. Available from: https://www.scie.org.uk/innovation/showcase-webinars/client-level-data-cld/#:~:text=Client%2DLevel%20Data%20(CLD),under%20the%20Care%20Act%202014
- 50.
Arden&GEM. Adult Social Care Client Level Data (CLD), NHS Digital. Available from: https://www.ardengemcsu.nhs.uk/adult-social-care-client-level-data/