Guest Editorial

The development of Artificial Intelligence systems has seen remarkable growth over recent months and years. Given the apparent all-pervading nature of these technologies, should the UK have a sovereign capability in this field?

A sovereign AI capability for the UK

Volume 23, Issue 5 - July 2023

Mark Girolami and Michael Wooldridge

Professor Mark Girolami FRSE FREng has been the Chief Scientist of The Alan Turing Institute, the UK’s national institute for data science and artificial intelligence, since October 2021. He was one of the original founding Executive Directors of the Institute and previously led the Turing’s Data-centric Engineering programme, which is principally funded by the Lloyd’s Register Foundation. Professor Michael Wooldridge is a programme director for AI at the Alan Turing Institute and a Professor of Computer Science at the University of Oxford. He has been an AI researcher for more than 30 years and is a Fellow of the ACM, AAAI and EurAI. From 2014-16 he was President of the European Association for AI, and from 2015-17 he was President of the International Joint Conference on AI (IJCAI).

SUMMARY

  • An enormously successful new class of AI systems – Foundation Models – is causing profound changes in the technology sector
  • Foundation models require enormous computational and data resources; because of this, they are currently owned by a small number of foreign-owned companies
  • There are many arguments in favour of a sovereign AI capability in foundation models, and in part to address these, the Government set up a taskforce in April 2023
  • There are many possibilities for a sovereign AI capability, ranging from a moonshot to develop UK foundation models from scratch down to simply licensing existing technology

We have seen a stream of advances in AI over the past decade, culminating in the release of ChatGPT in November 2022, which became the first mass-market general-purpose AI system. The success of ChatGPT is causing seismic changes in the big-tech industry: we are witnessing a technology watershed akin to the release of the World-Wide Web some 30 years ago.

ChatGPT is a Foundation Model – a very large AI system, built using vast quantities of data and requiring AI supercomputers to process that data. The resources required to build foundation models mean that their development has been restricted to a small number of foreign-owned companies.

While there are many applications for this technology in the UK public sector that would bring significant productivity gains, using it currently entails putting UK data on foreign-owned AI computers, which raises many concerns. Reliance on foreign-owned companies also raises concerns if the UK truly aspires to be a science and technology superpower: are we as a nation willing to accept that we will play no major part in the development of a technology as important as the World-Wide Web?

For these and other reasons, there has been much recent discussion around the possibility of the UK acquiring a sovereign AI capability in foundation models. Indeed, we have many existing organisations and assets well-placed to support such an endeavour, not least the UK’s national institute for data science and AI – the Alan Turing Institute.

Against this background, the UK Government made two important announcements in 2023. First, a £900 million investment in high-performance computing facilities for the UK was announced in the 2023 Spring Budget¹. Second, on 24 April 2023, the Prime Minister announced the intention to form a UK Foundation Model Taskforce², with an initial budget of £100 million and an emphasis on safe AI.

This article considers what a sovereign AI capability for the UK might look like: what are the options, and what advantages and disadvantages does each have?

Current position

The main challenge for the UK is that foundation model technology is developed and owned by a small number of foreign-owned companies. For the most part, these companies do not make program code or data open to inspection, and they control access to their systems. The UK academic sector, while having historic strengths in AI, does not remotely have the resources required to build such models, and UK universities are therefore greatly limited in the research they can do in this area.

This represents a serious national shortcoming if we indeed aspire to be a science superpower and believe that this technology represents a technological watershed. While the UK private sector has a flourishing AI culture, UK-owned companies currently do not have experience in building foundation models, nor the capability to do so – although foreign-owned companies operating in the UK do have such capabilities (notably DeepMind).

Key factors

A sovereign AI capability must involve establishing and sustaining an infrastructure around five different axes:

  • People and skills. Researchers and developers with skills in foundation models are in high demand. A sovereign AI capability will require ensuring that the UK has a sustainable pipeline of such individuals, with skills ranging from knowing how to apply foundation models down to understanding their scientific principles.

  • Data. Foundation models require huge quantities of data. To obtain sufficient data, the standard approach is to download much of the World-Wide Web. This raises multiple issues: the web contains enormous quantities of biased and toxic content; and there is the very real possibility of poisoned data (i.e. bad actors deliberately seeding public data sources with disinformation). A sovereign AI capability thus requires trusted data with transparent provenance, reflecting UK values, including regulation (a subject that cannot be adequately covered in this piece).
  • Hardware. Although the hardware issue might appear to have been resolved by the March 2023 Budget announcement of £900 million for UK compute, care will be required to ensure that the compute resources ultimately procured through it are fit for purpose.
  • Software. The open-source traditions of the international AI community mean that considerable quantities of relevant computer code are available (see the sketch after this list). However, the scale of Large Language Models (LLMs) means that building a new model is a substantial (and expensive) software development challenge.
  • Sovereignty. A sovereign AI capability must in some sense be owned by the UK. An extreme interpretation is that the UK controls the entire supply chain required to build such a model. This is not feasible for sovereign UK AI: for example, the UK does not have a suitable microprocessor fabrication capability. Any version of sovereign AI will involve some compromise against this standard. Purely private sector solutions are precarious in terms of sovereignty: if a UK company develops a successful AI technology, then what is to stop it being acquired by a foreign body? This suggests a sovereign AI capability would either have to be protected or else have a centre of gravity in the UK public sector.
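
To illustrate the open-source point made above: the short Python sketch below shows how readily an existing model can be loaded and queried using freely available code. This is a minimal sketch, assuming the open-source Hugging Face transformers library and a small, openly licensed model (GPT-2), both named here purely for illustration; building and training a new foundation model at scale is a vastly larger undertaking.

    # Minimal sketch: load and query a small open-source language model.
    # Assumes the open-source "transformers" library (and PyTorch) are installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # small, openly licensed model, used purely as an example
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = "A sovereign AI capability for the UK would"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))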

Against this background, there are a range of models for a sovereign AI capability. Here are just three, chosen to highlight some of the main choices and their implications.

1. Build from scratch

The most ambitious scenario would involve putting in place a major R&D effort to build a UK equivalent of ChatGPT from scratch. Irrespective of the involvement of public or private sectors, this would be a huge undertaking, beyond the £100 million envelope initially available to the taskforce. It would require an R&D team of something like 100 (highly paid) researchers and developers: staffing alone would take a year even in the most optimistic scenario (more realistically, 2-3 years to reach full capacity). The team would also need to be provisioned with suitable computing resources (lead time 6-18 months, even if funding is no obstacle).

The team would need to acquire suitable datasets and put in place processes to address the data concerns noted above (bias, toxicity, and poisoning), which would require coordination with (for example) defence and security partners – a likely timescale of at least a year, probably two. Once the team had all components in place (data, hardware, software), actually building a new foundation model takes months – and it is far from certain that the first attempt would succeed. The upshot is that the first new model would be at least 18 months from launch, even if funding were no obstacle; the likelihood is that it would take much longer.

The chief benefit of this scenario is that, if successful, it would resolve the concerns listed above. Downstream, there would be licensing and other commercial opportunities available. Overall, the project would represent a decisive UK investment in this extremely important area.

There are of course risks – the most obvious being that the project simply fails. However, it is unlikely that a project like this would deliver nothing, and there would be significant national benefits in establishing capacity in this domain. A related possibility is that the project delivers something substantially behind the state of the art.

2. Adapt existing software to UK needs

A more modest scenario would involve negotiating with trusted private sector partners to build models using their software, using data we provide, running on secure UK data centres. Thus, we would not own the program code – but the models would be built to our specification, with our data, on our computers.

Ultimately, this would amount to the UK licensing technology, rather than developing it from scratch. However, we would play a role in the configuration of the software, working alongside tech companies while models are being built, and having some freedom to adapt the technology to UK needs.

This approach is less risky than the first scenario and surely less costly; it could likely be done within the £100 million envelope of the taskforce. The biggest risk would come in negotiating suitable arrangements with private sector providers – in particular, putting UK data on foreign data centres should not be considered acceptable. We note that the Prime Minister recently secured agreements with several big-tech companies for preferential access to their foundation models, providing a starting point for negotiations.

Noting the requirement for a pipeline of skills, we again emphasise the importance of R&D programmes supporting research around the applications of foundation models in the public sector.

3. Off-the-shelf solutions

The least risky solution would involve simply licensing technology from existing suppliers on suitable terms. The UK would play no part in developing the software, and our expertise would in this case amount to nothing more than hosting it. R&D efforts would presumably be limited to finding applications of the technology in UK government bodies.

Such a solution is low risk, but very low ambition. It would likely deliver productivity benefits in Government Departments, which would have the benefit of working with polished state-of-the-art products. However, it is hard to see how this could be considered as delivering a truly sovereign AI capability. Crucially, it does not satisfy the skills, data, or sovereignty requirements listed above: the UK would not ‘own’ the technology in any meaningful sense.

Each of these choices involves trade-offs. What is clear is that we do not have the luxury of time to hold out for certainty – choices must be made now to keep the UK at the forefront.

1 Spring Budget 2023 www.gov.uk/government/publications/spring-budget-2023/spring-budget-2023-html 

2 Taskforce announcement www.gov.uk/government/news/initial-100-million-for-expert-taskforce-to-help-uk-build-and-adopt-next-generation-of-safe-ai

Acknowledgement

We are very grateful to Major Harden Bates for discussions around foundation models and a sovereign AI capability.