Data sources
Cebr has utilised publicly available datasets and identified five main themes within which the key indicators have been established. The data has been collected and consolidated into one rich dataset from which the analysis has been conducted.
The timing of this research means that the 2021 Census is the best source of data for several variables given that it covers the whole UK population and is very recent. Any future updates would need to draw on alternative sources based on regular research like the Annual Population Survey. Moreover, Census data is heavily influenced by COVID-19; whilst this is no bad thing given its relevance to the recent increase in economic inactivity, in the future it will become less salient.
Employment and labour market
Economic activity data has been obtained from two sources. First, the Annual Population Survey 2023 has the most up-to-date data by local authority. However, as it is a sample survey, data on inactivity by reason is incomplete. Second, to supplement the APS data, we have included economic inactivity data from the Census 2021, and this is broken down by category of economic inactivity. In addition, claimant count and highest level of qualification data (e.g., no qualification, NVQ4+) from the Census 2021 have also been utilised.
The number of claims made under the Coronavirus Job Retention Scheme, provided by HMRC, was also included in the dataset. This data illustrates the scale of the employment that was most affected by COVID-19 and the resulting lockdowns. In addition, to test the possibility that the structure of the economy and the sector of employment may be affecting inactivity, we used the Business Register and Employment Survey (BRES), which provides recent data on UK employment by sector. The BRES data on the split between private and public sector employment was also included.
To understand how the local economy and business base affect economic inactivity, several indicators were included. From the ONS data, both local Gross Value Added estimates and job density, which measures the number of jobs per resident aged 16 to 64, were incorporated into the larger dataset. Business Demography 2021 provides estimates of the number of UK business births and deaths at local authority level, which have been included in our dataset. Finally, the Annual Survey of Hours and Earnings (ASHE) measures the percentage of employees earning below the Living Wage Foundation's rates at local authority level, which has been included to give an understanding of job quality at a local level.
Deprivation and poverty
We have included a range of indicators from different data sources to highlight this theme. Firstly, we used the 2019 Index of Multiple Deprivation (IMD) from the Ministry of Housing, Communities and Local Government. This provides data on a range of different domains of deprivation which are combined and weighted to calculate the IMD. Secondly, to supplement the IMD data, the Census 2021 data on the dimensions of deprivation based on four household characteristics – education, employment, health and housing – was integrated.
Thirdly, the Local Authority Indicator data from the Office for National Statistics (ONS) was incorporated into the dataset, from which deprivation and poverty statistics were extracted. Finally, the Cost-of-Living Vulnerability Index was included to complement the IMD data. This index goes further with respect to the cost-of-living crisis by focusing on indicators of poverty that correspond with the specific cost pressures associated with it, such as food and fuel poverty.
LG Inform data was also utilised for a number of indicators, such as the number of individuals liable to UK income tax through pension income, the number of children living in relative low income and the proportion of individuals over 65 claiming pension credit.
Housing
Under this theme some key indicators such as the number of people claiming housing benefits and transport accessibility indicators, which could be driving economic inactivity, were included.
The Department for Work and Pensions commissions statistics on housing benefit claims annually. In addition, data on methods used to travel to work from the Census 2021 could shed some light on the transport accessibility indicators which could impact economic inactivity at a local level. The Local Authority Indicator also publishes data on transport links to employment for 2022. This has supplemented the 2021 Census data.
The ONS also provides data on dwellings by their tenure – outright ownership, ownership with a mortgage, private rent, and social rent. Census 2021 data on main language, ethnicity and lone parent single family households was also included.
Health and wellbeing
Poor health can be a major barrier to entering employment. This has been intensified by the pandemic, with long Covid keeping more people out of the workforce due to health issues. However, local authority data on long Covid is not available. We therefore proxied this with historic ONS data on infections. We have also considered the prevalence of unpaid care, which could be another key indicator driving economic inactivity, using Census 2021 data. Linking the provision of unpaid care, economic activity and other indicators will provide insights into which local authorities need the most support. In addition, ONS data on government-funded early education and childcare for children aged two to four years was included to understand how the prevalence of childcare demands might affect economic inactivity. The Census 2021 data was also utilised to provide estimates that classify usual residents by long-term health problems or disabilities and by the state of their general health in five categories from very good to very bad.
The Local Authority Indicator from the ONS provides data on key statistics for physical and mental health. Data from the Office for Health Improvement and Disparities (OHID) gives life expectancy figures, segmented by male and female and at birth or at 65. In addition, OHID gives data on inequality in life expectancy, broken down by gender and at birth or at 65. This data shows inequalities within local areas, enabling a focus on the deprivation that exists everywhere at small area level. We had also hoped to include healthy life expectancy, which captures how long a person can expect to be in good health; however, these figures were only available for upper-tier local authorities.
LG Inform data was also used for a range of health indicators, including PIP claims, loneliness, obese or physically inactive adults, under-75 mortality rates, Disability Living Allowance and the prevalence of long-term musculoskeletal conditions.
Financial vulnerability
The Good Credit Index is based on three sub-indices measuring different aspects of credit which were found to be important based on a literature review. These three strands are: the credit environment; credit scores; and credit need. The overall Good Credit Index was created by summing these three sub-indices, with an equal weighting given to each. We had hoped to include other indicators of financial vulnerability, including household debt and households below 60 per cent of median earnings. However, these figures are only available for shire and unitary local authorities and therefore could not be included in this analysis.
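The equal-weighted sum described above can be sketched as follows. This is an illustration only: the function name, the sub-index keys and the scores are hypothetical, and it assumes the three sub-indices are already expressed on a common scale, which the description above does not specify.

```python
from typing import Mapping, Optional


def good_credit_index(sub_indices: Mapping[str, float],
                      weights: Optional[Mapping[str, float]] = None) -> float:
    """Combine sub-index scores into one index (equal weights by default)."""
    if weights is None:
        # Equal weighting: every sub-index contributes with weight 1.
        weights = {name: 1.0 for name in sub_indices}
    return sum(weights[name] * score for name, score in sub_indices.items())


# Hypothetical scores for one local authority (not real data).
example_area = {
    "credit_environment": 0.40,
    "credit_scores": 0.55,
    "credit_need": 0.25,
}
overall_index = good_credit_index(example_area)
```

Passing a different `weights` mapping would show how sensitive an area's ranking is to the equal-weighting choice.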
Regression methodology
What is regression?
Regression modelling aims to explain differences in a factor of interest – in this case, economic inactivity – using a selection of other factors that are theorised to be associated with it. These associations may represent causal relationships, although regression analysis is itself unable to prove causality.
In addition, it should be borne in mind that a regression model only explains a proportion of the variation in the factor of interest, leaving a certain degree of variation unexplained. This remaining variation could be due to additional factors not included, random fluctuations or unique local circumstances, or a combination of all three. As such, this analysis does not imply that the metrics identified are the only ones which play a role in economic inactivity.
One further caveat is that factors are generally interrelated, and the relationships between them can be complex and multilayered. For example, if long-term musculoskeletal conditions are related to economic inactivity, the underlying causes of these conditions are the true drivers of inactivity, and their effect is mediated through the symptoms and diagnosis of these conditions. It is never possible for an analysis to capture all, or the deepest, of the drivers of a particular factor of interest, and some drivers can work together, or work differently under different circumstances.
The first step in our modelling is to develop a rich dataset with all the indicators that we believe may affect economic inactivity. These indicators are not provided in any single data source, and combining datasets from various sources has its challenges. A key aspect of the methodology was therefore to collate them into a robust view of all the indicators that may affect economic inactivity.
To ensure that data was comparable across local authorities, in some cases figures had to be taken as a proportion of the population or a subset thereof. For example, data on children in low-income households was available as a raw number. Using this would have told us about the size of the local authority and the proportion of its population who were children, as well as about the phenomenon of interest, namely the proportion of children in low-income households. The figures were therefore divided by the number of under-18s in each local authority. The datasets were cleaned and merged to produce a single dataset of all the indicators for each local authority.
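A minimal sketch of this normalisation step is given below. The area names and figures are invented for illustration, and the helper `as_proportion` is a hypothetical name rather than part of the actual pipeline.

```python
def as_proportion(counts, denominators):
    """Divide each area's raw count by its population denominator.

    Areas with a missing or zero denominator are skipped rather than guessed.
    """
    return {
        area: counts[area] / denominators[area]
        for area in counts
        if denominators.get(area)
    }


# Hypothetical raw counts: both areas have the same number of children
# in low-income households, but very different under-18 populations.
children_low_income = {"Area A": 3000, "Area B": 3000}
under_18s = {"Area A": 20000, "Area B": 60000}

rates = as_proportion(children_low_income, under_18s)
```

In this invented example the raw counts are identical, yet the rate in Area A is three times that in Area B, which is the distinction the regressions need.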
Once the single dataset was completed, we ran bivariate regressions. Bivariate analysis helps understand and establish the strength of the relationship between the dependent and independent variables. The two variables are frequently denoted as X and Y, with one being an independent variable (or explanatory variable), while the other is a dependent variable (or outcome variable).
The underlying idea is to quantify the relationship between each independent variable and the dependent variable, which includes testing simple hypotheses, particularly of association and causality. This can help to identify a subset of variables that are more important, and to understand which levels of particular variables matter.
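As an illustration of a bivariate regression, the sketch below fits Y = a + bX by ordinary least squares and reports the share of variation explained (R-squared). The data points and variable meanings are hypothetical, and the function is a plain textbook implementation rather than the software used for this analysis.

```python
def bivariate_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: 1 minus the ratio of residual to total variation in y.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical local-authority values: X could be a deprivation score,
# Y the economic inactivity rate (per cent). Not real data.
x_values = [10.0, 15.0, 20.0, 25.0, 30.0]
y_values = [18.2, 20.1, 23.4, 24.9, 28.3]
slope, intercept, r_squared = bivariate_ols(x_values, y_values)
```

The slope gives the direction and size of the association, while R-squared indicates how much of the variation in Y is left unexplained by X alone.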
From this, a forward stepwise regression was performed, which allows us to select important variables and obtain an easily interpretable model. This involves testing each variable as it is added to the model, then keeping those that are deemed most statistically significant.
The final regression includes those independent variables which:
- are statistically significant in driving economic inactivity according to our results, ideally to a high degree of confidence
- have a non-trivial impact on economic inactivity, whether positive or negative
- affect economic inactivity as implied by the regression through a plausible economic transmission mechanism.
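The selection procedure can be sketched roughly as below, using NumPy on synthetic data. Several details are assumptions made for illustration: the entry rule (add the candidate with the largest absolute t-statistic, stopping when none exceeds 1.96), the variable names and the simulated data are all hypothetical. The second and third criteria in the list above (size of effect and a plausible transmission mechanism) rely on analyst judgement and are not captured in code.

```python
import numpy as np


def t_stat_of_last(design, y):
    """OLS t-statistic for the coefficient on the last column of `design`."""
    n, k = design.shape
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - k)            # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta[-1] / np.sqrt(cov[-1, -1])


def forward_stepwise(X, y, names, t_threshold=1.96):
    """Greedy forward selection; the threshold is an assumed entry rule."""
    n = X.shape[0]
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        base = [np.ones(n)] + [X[:, j] for j in selected]
        # Score each candidate by its |t| when added to the current model.
        scores = {
            j: abs(t_stat_of_last(np.column_stack(base + [X[:, j]]), y))
            for j in remaining
        }
        best = max(scores, key=scores.get)
        if scores[best] < t_threshold:
            break                                # nothing significant left
        selected.append(best)
        remaining.remove(best)
    return [names[j] for j in selected]


# Synthetic data: two genuine drivers and one pure-noise indicator.
rng = np.random.default_rng(42)
X = rng.normal(size=(311, 3))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=311)
names = ["driver_a", "noise", "driver_b"]
selected = forward_stepwise(X, y, names)
```

A purely mechanical rule like this can occasionally admit a noise variable by chance, which is one reason the final model also applies the judgement-based criteria.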
This process was repeated for the key sub-categories of economic inactivity: retired, looking after home/family, and long-term sick or disabled. Running these models allows us to have more granular detail on the underlying causes of different economic inactivity types at the local level, and how the drivers of different causes vary.
It should be noted that although linear regression analysis is a reliable method, omitted variable bias cannot be completely ruled out, as it is not possible to include all relevant variables in the model. Omitted variable bias occurs when a statistical model fails to include one or more relevant variables. This should be taken into account when examining the results below.
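Omitted variable bias can be illustrated with a small simulation. The sketch below uses entirely synthetic data with assumed coefficients: Y depends on both X and Z, and Z is correlated with X. When Z is left out, the estimated effect of X absorbs part of the effect of Z.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic variables: z is correlated with x, and both drive y.
x = rng.normal(size=n)
z = 0.8 * x + rng.normal(size=n)
y = 1.0 * x + 1.0 * z + rng.normal(size=n)


def ols_coefs(columns, y):
    """OLS coefficients (intercept first) for the given regressor columns."""
    design = np.column_stack([np.ones(len(y))] + columns)
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


# Full model: the coefficient on x should be close to its true value of 1.0.
slope_x_full = ols_coefs([x, z], y)[1]

# Short model omitting z: in large samples the coefficient on x tends
# towards 1.0 + 1.0 * 0.8 = 1.8, overstating the direct effect of x.
slope_x_short = ols_coefs([x], y)[1]
```

The gap between the two estimates depends on how strongly the omitted variable is related to both the included variable and the outcome.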
In total, 311 local authorities were included in our regressions – all of those in England excluding the City of London, Isles of Scilly, and certain other authorities which were outliers and/or did not provide sufficient data.