OSR review of 2020 exam algorithms

The Office for Statistics Regulation has published its report into the use of algorithms in the 2020 school and college exam series.

To date we have seen reports into the 2020 qualifications published in Scotland, Northern Ireland and Wales; these reports have all considered the broader issues around policy, communication, oversight and delivery. [Declaration: I was a member of the independent panel for the review in Wales].

The OSR review has focused on the development of the statistical models used across the UK to award grades in 2020 and sought to identify lessons for public bodies considering the use of statistical models to support decisions.

‘Algorithm’ has become something of a damaged brand, with an unseemly rush to declare ‘no algorithms’ for 2021 qualifications and fatuous comments about mutant algorithms in 2020. The OSR has taken a calm look at the issues and has come up with some eminently sensible findings and recommendations.

The OSR review does not attempt to assess the merits of any of the algorithms used in the four nations of the UK for the 2020 qualification series. Instead it considers the events of 2020 and what lessons we can draw about public confidence in algorithms.

The report makes clear that the use of algorithms in 2020 was always going to be immensely challenging and that, while the organisations involved acted with honesty and integrity, they failed to communicate the scale of that challenge and to expose the models to sufficient technical scrutiny. The report notes that the failure to command public confidence was a critical factor in the ultimate collapse of the policy.

The OSR makes a number of recommendations for those who develop statistical models, for the policymakers who commission them, and for the centre of government, which has a critical role to play in improving public confidence in this area.

I find the position of the OSR in this debate very interesting. The review has considered the events of 2020 against the Code of Practice for Statistics and its three pillars of trustworthiness, quality and value. The code sets the standards that producers of official statistics should commit to and, while the three pillars are arguably broad enough to apply to algorithmic decision-making, I can’t help but feel that this is a rapidly emerging area that goes beyond what the code was designed for.

I have written previously about the particular challenges faced when attempting to use algorithms in decision-making and the need for appropriate governance over algorithms. The OSR recommendations for central government call for a lot of guidance and leadership in this area, but to my eyes this does not go far enough. We should have something specific – equivalent to the Code of Practice for Statistics – to guide and assess the use of predictive algorithms in decision-making.

The report provides a comprehensive and robust analysis of the use of algorithms in the 2020 qualifications series. It brings some very helpful clarity to terms and definitions (a notoriously murky area) and steps through a thorough analysis of the 2020 grade awarding experience. At every stage it presents a clear line of sight from the expression of principles and concepts to the successes and failures experienced in 2020.

The OSR has set out the lessons from 2020 with clarity and coherence. This report could and should be a milestone in the broader development of algorithmic decision-making in the public sector. The challenge now is to reflect, consider and then act.

The OSR report can be downloaded here.

Andy Youell
2021-03-02