Jyl Djumalieva and Dr. Cath Sleeman

Skill shortages are costly and can hamper growth, but we don't currently measure these shortages in a detailed or timely way. To address this challenge, we have developed the first data-driven skills taxonomy for the UK that is publicly available. A skills taxonomy provides a consistent way of measuring the demand and supply of skills. It can also help workers and students learn more about the skills that they need, and the value of those skills. The full research paper is available to read here.

Skill shortages are a costly unknown

Skill shortages are a major issue in the UK and arise because there are not enough people with particular skills to meet demand. The Open University estimates that skill shortages cost the UK £2bn a year in higher salaries, recruitment costs and temporary staffing bills. They can also significantly hamper growth. According to OECD research, the UK could boost its productivity by 5 per cent if it reduced the level of skill mismatch to OECD best practice levels.

Despite the importance of skill shortages, we don't currently measure them in a detailed and timely way. The best available estimates come from the Employer Skills Survey, which was released last week. While the survey is able to shed light on the various causes of skill shortages, it is only conducted once every two years and it focuses on broad rather than detailed groups of skills.

Looking ahead, skill mismatches may worsen because the skills we need are changing, owing both to short-term factors such as Brexit, and to longer-term trends such as automation. Last week the Chartered Institute of Personnel and Development (CIPD) found that for 40 per cent of employers it had become more difficult to fill vacancies over the past 12 months. And Nesta's own research found that one-fifth of workers are in occupations that will likely shrink over the next 10 to 15 years.

More than ever, we need to create an informed labour market. Informed in the sense that education providers, workers, students, employers and policy makers know how skills are changing and are empowered to react to these changes, thereby countering skill mismatches.

The UK needs a skills taxonomy

The first step to measuring shortages is to build a skills taxonomy, which shows the skill groups needed by workers in the UK today. The taxonomy can then be used as a framework by which to measure the demand for the skills by employers, the current supply of those skills from workers, and the potential supply based on courses offered by education providers and employers. At present we don't consistently measure these factors in the UK, and one reason may be that we don't have an accepted method of grouping skills. This is despite having well-established taxonomies for defining groups of occupations and industries.

How to build a taxonomy

We began with a list of just over 10,500 unique skills that had been mentioned within the descriptions of 41 million UK job adverts, collected between 2012 and 2017 and provided by Burning Glass Technologies. These skills included specific tasks (such as insurance underwriting), knowledge (biology), software programs (Microsoft Excel) and even personal attributes (positive disposition). Machine learning was used to hierarchically cluster the skills. The more frequently two skills appeared in the same advert, the more likely they were to end up in the same branch of the taxonomy. The taxonomy therefore captures the clusters of skills that we need for our jobs.
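To make the co-occurrence idea concrete, here is a minimal sketch in Python. It is not the method used in the research: the toy adverts, the Jaccard similarity measure, the average-linkage merging rule and all function names are assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Toy adverts (hypothetical data): each advert is the set of skills it mentions.
adverts = [
    {"python", "sql", "machine learning"},
    {"python", "sql"},
    {"sql", "machine learning"},
    {"caregiving", "first aid"},
    {"caregiving", "first aid", "patient care"},
    {"patient care", "first aid"},
]


def advert_index(adverts):
    """Map each skill to the set of advert ids that mention it."""
    index = defaultdict(set)
    for advert_id, skills in enumerate(adverts):
        for skill in skills:
            index[skill].add(advert_id)
    return index


def similarity(a, b, index):
    """Jaccard similarity: of the adverts mentioning either skill, the share mentioning both."""
    both = len(index[a] & index[b])
    either = len(index[a] | index[b])
    return both / either if either else 0.0


def cluster_skills(adverts, n_clusters):
    """Agglomerative clustering (average linkage) on co-occurrence similarity."""
    index = advert_index(adverts)
    clusters = [frozenset([skill]) for skill in sorted(index)]

    def linkage(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(similarity(a, b, index) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two clusters whose skills co-occur most often on average.
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


clusters = cluster_skills(adverts, n_clusters=2)
```

On this toy data the data skills and the care skills end up in separate clusters. Stopping the merging at different cluster counts would give coarser or finer layers, which is one way a tree-like structure can arise.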

The final taxonomy can be seen in the diagram below and has a tree-like structure with three layers. The first layer contains six broad clusters of skills; these split into 35 groups in the second layer, which split once more to give 143 clusters of specific skills in the third. Each of the approximately 10,500 skills sits within one of these 143 clusters. The same methodology could be used to create further layers.

A UK skills taxonomy

The demand for a given skill cluster is estimated using the total number of the cluster's skills mentioned across all adverts. The job titles shown when hovering over a cluster are the three most common job titles whose adverts most frequently mention skills from the cluster.
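As a rough illustration of the two calculations in this caption, the sketch below counts skill mentions per cluster and tallies which job titles account for them. The data, the `skill_to_cluster` mapping and the function names are hypothetical, and counting mentions per job title is only one reading of how the displayed titles are chosen.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from skills to third-layer clusters.
skill_to_cluster = {
    "caregiving": "Social work and caregiving",
    "first aid": "Social work and caregiving",
    "python": "Software development",
    "sql": "Software development",
}

# Hypothetical adverts: (job title, skills mentioned).
adverts = [
    ("Care assistant", ["caregiving", "first aid"]),
    ("Care assistant", ["caregiving"]),
    ("Nurse", ["first aid"]),
    ("Developer", ["python", "sql"]),
    ("Developer", ["python"]),
    ("Analyst", ["sql"]),
    ("Analyst", ["teamwork"]),  # skill outside the mapping, so it is ignored
]


def cluster_demand(adverts, skill_to_cluster):
    """Count skill mentions per cluster, overall and broken down by job title."""
    demand = Counter()
    by_title = defaultdict(Counter)
    for title, skills in adverts:
        for skill in skills:
            cluster = skill_to_cluster.get(skill)
            if cluster is None:
                continue
            demand[cluster] += 1
            by_title[cluster][title] += 1
    return demand, by_title


def top_titles(by_title, cluster, n=3):
    """Return up to n job titles with the most mentions of the cluster's skills."""
    return [title for title, _ in by_title[cluster].most_common(n)]


demand, by_title = cluster_demand(adverts, skill_to_cluster)
software_titles = top_titles(by_title, "Software development")
```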

Much more than a list of skills

The skills taxonomy was enriched to provide estimates of the demand for each skill cluster (based on the number of mentions within adverts), the change in demand over recent years and the value of each skill cluster (based on advertised salaries). The estimates of demand get us halfway to measuring skill shortages. Most importantly, a user can search the taxonomy by job title, and discover the skills needed for a wide range of jobs.

The ten clusters (at the third layer) containing the most demanded skills are:

  1. Social work and caregiving
  2. General sales
  3. Software development
  4. Office administration
  5. Driving and automotive maintenance
  6. Business management
  7. Accounting and financial management
  8. Business analysis and IT projects
  9. Accounting administration
  10. Retail

Two benefits of a data-driven taxonomy

Using job adverts to create the taxonomy ensures that it is based on the same 'skills language' used by UK employers, rather than the language of academics or policy makers. The other benefit of a data-driven approach relates to maintenance. Several existing skill taxonomies (such as O*Net and ESCO) rely, at least in part, on expert consultation which means that updating the taxonomies can be a long and costly process. In contrast, a data-driven taxonomy is easier to update: the same methodology can be applied to a new set of job adverts.

Valuing skills

The chart below shows the median annual salaries for each skill cluster in the third layer, as well as the lower and upper quartiles. These are estimates rather than precise figures, as only 61 per cent of adverts mention a salary. Despite this, these are the first estimates of skill values for the UK. To date, workers and students have had to decide between these skills without access to this information.

The value of skills

The annual salaries are based on all adverts that mention at least one skill in the cluster. The median, lower and upper salary quartiles were calculated using job adverts collected in 2015-2017 (inclusive). 'High demand' indicates clusters where the total mentions of skills within a cluster comprise at least 1 per cent of all skills mentioned. The figures shown by the names of the first layers are median salaries.
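The salary and 'high demand' rules in this caption can be sketched as follows. The adverts and function names are hypothetical, and the quartile method (Python's default in `statistics.quantiles`) is an assumption, as the caption does not say which one was used.

```python
from collections import defaultdict
from statistics import quantiles

HIGH_DEMAND_SHARE = 0.01  # 1 per cent of all skill mentions, as stated in the caption

# Hypothetical mapping and adverts: (advertised annual salary or None, skills mentioned).
skill_to_cluster = {
    "python": "Software development",
    "sql": "Software development",
    "caregiving": "Social work and caregiving",
}
adverts = [
    (20000, ["python"]),
    (24000, ["python", "sql"]),
    (28000, ["sql"]),
    (32000, ["python"]),
    (36000, ["sql"]),
    (None, ["python"]),        # no salary advertised, so it is skipped
    (18000, ["caregiving"]),
]


def salary_quartiles(adverts, skill_to_cluster):
    """Return {cluster: (lower quartile, median, upper quartile)} of advertised salaries."""
    salaries = defaultdict(list)
    for salary, skills in adverts:
        if salary is None:
            continue
        # An advert counts once for every cluster it mentions at least one skill from.
        for cluster in {skill_to_cluster[s] for s in skills if s in skill_to_cluster}:
            salaries[cluster].append(salary)
    # quantiles() needs at least two observations on older Python versions.
    return {c: tuple(quantiles(v, n=4)) for c, v in salaries.items() if len(v) >= 2}


def high_demand(mentions):
    """Return clusters whose mentions are at least 1 per cent of all mentions given."""
    total = sum(mentions.values())
    return {c for c, n in mentions.items() if n / total >= HIGH_DEMAND_SHARE}


quartiles_by_cluster = salary_quartiles(adverts, skill_to_cluster)
lower, median, upper = quartiles_by_cluster["Software development"]
```

Here `high_demand` expects a dictionary of mention counts per cluster, such as `{"a": 975, "b": 20, "c": 5}`, for which it would flag `a` and `b`.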

The five skill clusters at the third layer with the highest annual median salaries are:

  1. Data engineering
  2. Securities trading
  3. IT security operations
  4. IT security standards
  5. Mainframe programming

The five clusters with the lowest salaries are:

  1. Premises security
  2. Medical administration
  3. Dental assistance
  4. Office administration
  5. Logistics administration

Adding growth in demand for skills

The chart below ranks the clusters in the third layer of the taxonomy, both by their value (salary) and by the growth in demand for their skills between 2012-2014 and 2015-2017.

The clusters in the top right-hand corner of the chart have both relatively high salaries and high growth. They include data engineering, IT security implementation, IT security operations, marketing research, app development and web development. This may reflect a shortage of workers who have these relatively new skills.

In the opposite corner of the chart are skill clusters with low salaries and low growth. Several require engaging with digital technology in a rather routine way, such as in shipping and warehouse operations, medical coding and administration, and web content management. This finding is consistent with our recent study, which showed that the digital skills linked to occupations least likely to grow tended to use software for administrative purposes or in a routine way.

A cautious approach is recommended when interpreting these results, as a change in skill mentions within adverts may not always reflect a change in skill demands. For example, a rise in freelancing may lead to a fall in adverts, and a consequent decline in mentions of skills used by freelancers.

The value of skills and growth in their demand

The squares are equally spaced along both axes. They are ordered by median salary and growth in demand. The median salary for a skill cluster is calculated using salaries in job adverts that mention at least one skill from the cluster. The growth in demand for a skill cluster is estimated from the increase in mentions of the cluster's skills across job adverts, between 2012-2014 and 2015-2017. The area of a square reflects the total mentions of the cluster's skills across all years.
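A minimal sketch of the growth estimate and the rank ordering described in this caption is given below. The data and function names are hypothetical. Growth is computed here as the relative change in mentions between the two periods, which is an assumption: the caption does not say whether the increase is measured in absolute or relative terms.

```python
from collections import Counter

EARLY, LATE = "2012-2014", "2015-2017"

# Hypothetical mapping and adverts: (year collected, skills mentioned).
skill_to_cluster = {
    "python": "Software development",
    "sql": "Software development",
    "caregiving": "Social work and caregiving",
}
adverts = [
    (2012, ["python"]), (2013, ["python"]),
    (2016, ["python"]), (2016, ["python", "sql"]),
    (2017, ["python"]), (2017, ["sql"]),
    (2012, ["caregiving"]), (2014, ["caregiving"]), (2015, ["caregiving"]),
    (2011, ["python"]),  # outside both periods, so it is ignored
]


def period_of(year):
    """Assign a year to one of the two comparison periods, or None."""
    if 2012 <= year <= 2014:
        return EARLY
    if 2015 <= year <= 2017:
        return LATE
    return None


def growth_in_demand(adverts, skill_to_cluster):
    """Relative change in each cluster's skill mentions between the two periods."""
    mentions = {EARLY: Counter(), LATE: Counter()}
    for year, skills in adverts:
        period = period_of(year)
        if period is None:
            continue
        for skill in skills:
            if skill in skill_to_cluster:
                mentions[period][skill_to_cluster[skill]] += 1
    early, late = mentions[EARLY], mentions[LATE]
    # Clusters with no mentions in the early period are left out of the result.
    return {c: (late[c] - early[c]) / early[c] for c in early if early[c] > 0}


def rank_positions(values):
    """Map each cluster to its rank (0 = lowest), giving equally spaced positions."""
    ordered = sorted(values, key=values.get)
    return {cluster: position for position, cluster in enumerate(ordered)}


growth = growth_in_demand(adverts, skill_to_cluster)
growth_rank = rank_positions(growth)
```

Plotting ranks rather than raw values is what makes the squares equally spaced: each cluster's position depends only on its order, not on the size of the gap to its neighbours.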

Automatically identifying transferable skills

Transferable skills are valuable, both to students, who may not have decided on a career path, and to workers, who can use these skills to transition between jobs. The data-driven methodology automatically identifies a set of 66 transferable skills prior to clustering, and these sit outside of the taxonomy. The set includes broad technical skills (teaching, sales, research, budgeting, planning), interpersonal skills (customer service, teamwork/collaboration, leadership, building effective relationships), basic skills (mathematics, English, computer skills, writing) and personal qualities (detail-oriented, positive disposition, self-starter, quick learner). This set could be used as a starting point for defining the first accepted group of transferable skills in the UK.

Limitations to consider

No taxonomy will be truly comprehensive, whether it is derived from experts or created from job adverts, and moreover there is no single 'right way' to group skills. In this research, the most important limitation to remember is that not all work is advertised online. As a result, the demand for skills used predominantly by freelancers or by casual workers may be underestimated in the taxonomy. Despite this risk, the data-driven approach still creates the most detailed taxonomy of UK skills available to the public today. And it is more easily updated than expert-derived taxonomies.

Putting the skills taxonomy to work

Over the next year we will be showing a range of use cases for the skills taxonomy. This will include estimating skill shortages at a regional level, automatically detecting new and redundant sets of skills, and estimating the potential supply of skills based on available courses and training. The taxonomy itself will also continue to evolve, as we add a fourth layer and try to capture the lateral relationships between clusters.

Anyone is welcome to use the taxonomy. If you are interested, please get in touch by emailing [email protected].