The Data Science Campus: data science for public good
The Data Science Campus vision is data science for public good. Our goals are to explore new data sources, perform cutting-edge research with new-generation tools and technology, and build data science capability across government. This will enable the UK to grasp the transformational opportunities offered by data science, inform our understanding of the UK, and support better decision-making.
Background
Worldwide, the economy and society are changing rapidly. Research and statistics need to be able to continuously adapt in order to remain relevant, and to support decision-making for government, the private sector and citizens. To do this, statistics institutions need to move away from traditional survey and analytical approaches, and embrace the innovative exploration of new data sources and new techniques. To address this need, the Office for National Statistics (ONS) officially launched the Data Science Campus (the Campus) in March 2017.
Its role is to:
• lead innovative research projects that explore new data science techniques (such as neural networks, text mining and image analysis) and new data sources (for example social media, real-time data from administrative and management systems, satellite images and the Internet of Things)
• build data science capability across government, through the creation of career pathways, developing training and trainers, collaboration and mentoring
• support the data science community across Government
• be recognized as the Government hub for data science, and as a world leader in data science for government.
This response focuses on an overview of the Campus’ innovation in research and working practices.
Research projects
The Campus project process is designed to foster innovation and, as such, experiments with a way of working that is new to ONS. Campus projects run for a maximum of 6 months, and must have a significant element of new data science. This could be the exploration of a new type of data, or the application of innovative machine learning techniques, or both. Work on the development of current ONS statistical publications using existing data sources is explicitly excluded from Campus projects, as is involvement in the implementation of new statistics into regular production. This allows the Campus data scientists the freedom and flexibility to explore cutting-edge ideas, outside the constraints of the statistical production timetable. It also allows for failure. The freedom to fail, in the sense of exploratory research producing a result which is not useful, is key to success. Without this freedom, research will be limited to safe options, and opportunities may be missed. Projects are run using an Agile approach, which allows them to adapt to research outcomes and, if necessary, to fail fast.
To lead these projects, the Campus has specifically recruited data scientists with excellent skills in data science, something which is rare – although growing – across the rest of ONS and government. Projects are run as close collaborations with a range of partners: policy-makers, to understand the big questions; other national statistical institutions, to share knowledge and experience; subject matter experts; and academic data scientists, to draw from their skills. Some examples of Campus projects are described below. They demonstrate how the Campus is improving and supplementing existing statistical outputs, and also addressing policy questions. Traditional techniques and data sources would have been inadequate to address these issues.
Projects so far include:
• a new indicator for the Natural Capital Account, using Google Street View images to map the urban forest, important in preventing flash floods, reducing air and noise pollution, and supporting native ecosystems
• a superfast GDP indicator, which takes the temperature of the UK’s economy faster than the official statistics, to inform economic policy in a timely way
• a tool for managing and exploring the data for monitoring the UK’s Sustainable Development Goals, in which there has been international interest
• analysis of ship transponder data to understand the pressure on UK ports in advance of the UK’s exit from the EU, working in collaboration with the Centre for Big Data Studies, Statistics Netherlands
• automated classification of the financial sector into sub-sectors by type of activity, to support the development of more granular financial statistics to inform financial policy, a priority for the Bank of England.
Other longer-term goals include: exploring the potential of visible, infra-red and LIDAR satellite images to improve or supplement our understanding of the UK; exploring the use of Blockchain in understanding supply chains and provenance; and understanding error and uncertainty in administrative data and big data.
What Makes Your Project Innovative?
Historically, data science within ONS has focused on applying data science approaches to existing operational challenges. A traditionally risk-averse public sector environment, compounded by the rigorous methodologies that the production of official statistics requires, was not conducive to experimentation or innovation. To change this, the Data Science Campus was created to combine the academic rigor of a university research environment with the agile and disruptive workspace of a tech start-up.
The Campus is designed from the start to operate outside the traditional practices of the public sector. Its staff are drawn from academia, industry and the public sector. It focuses on exploring new data sources, methodologies and technologies outside the normal boundaries of official statistics, with explicit permission to “fail fast”, and its physical environment is inspired by Silicon Valley start-up culture.
However, innovation is useless in isolation. An extensive range of capability programmes has been developed to disseminate learnings from the Campus across ONS and wider UK government. A 2-year apprenticeship programme in Data Analytics was launched in the Campus in 2016 to provide vocational training in key techniques, and was expanded in 2017 to local and national government. An MSc in Data Analytics for Government, developed for government analysts by the Campus and the Government Statistical Service, is launching in October 2017 with University College London, Oxford Brookes University and University of Southampton. A dedicated Campus training team provides a range of training courses across UK government to bridge the gap between these two programmes, with government data scientists from across the UK coming to the Campus to learn and collaborate. These collaborations are enhanced by a range of research partnerships with universities and the Alan Turing Institute, the UK’s national institute for data science, that enable the transfer of leading-edge techniques and tools from academia to the public sector through the Campus.
What is the current status of your innovation?
The Independent Review of UK Economic Statistics (Bean, 2016, ‘Bean Review’) stated: “Although better use of this data has the potential to transform the provision of economic statistics, ONS will need to build up its capability to handle such data. This will take some time and will require not only recruitment of a cadre of data scientists but also active learning and experimentation. That can be facilitated through collaboration with relevant partners – in academia, the private and public sectors, and internationally.”
The publication of the Bean Review provided the Office for National Statistics with an unparalleled opportunity to address the changing needs of a UK transformed by the Digital Economy. Measuring the modern economy requires new approaches, new tools and, above all else, a new mindset. The Campus has been set up to become a leader in the field of data science, and to enable this transformation. It is shaping the application and awareness of data science not only among its immediate stakeholders, but within wider UK society as a whole. The Bean Review has provided ONS with an opportunity not just to transform itself, but to have a significant and positive impact on the world around it. The Campus is that tool for change.
Funding for the Campus was agreed in March 2016, and its scope extended beyond supporting economic statistics to cover five themes: the evolving economy, the UK in a global context, sustainability, urban and rural, and society. Four phases of activity were planned for the first 24 months.
We are now in the fourth phase:
• Start-up Phase I (May – Jul 2016), which was focused on planning and recruitment
• Start-up Phase II (Aug – Sep 2016), which was focused on opening the Data Campus and initiating its first activities
• Launch Phase (Oct 2016 – Mar 2017), where the activities of the Campus were in Beta, with the formal launch held on 30 March 2017
• Main Phase (Apr 2017 – 2018), where the Campus is now delivering its first results, and establishing the structures and processes that will enable it to scale effectively.
Funding has been agreed for an initial two years, with further funding dependent on the Campus meeting the objectives set out in the business plan. The remit of the Campus is not only to carry out cutting-edge research, but also to build capability across the whole of UK government.
As described in more detail above, this is through a variety of approaches:
• collaboration on specific research projects
• the launch of the apprenticeship scheme
• the development of learning paths and good practice for data scientists, and the delivery of training across government
• the development of an MSc in Data Analytics for Government, with our academic partners
• funding for PhDs in data science
• running hackathons and data dives
• mentoring data science projects across ONS and government.
Today, 8 of the initial 12 research projects are drawing to a close, with excellent feedback from our project partners. Demand for research projects is high. We have a significant backlog to choose from for the next round of projects, all of which are high value in terms of public good impact, and all of which explore new data sources or new data science techniques in a novel way. These projects involve a wide range of stakeholders, and cover all five themes.
Collaborations & Partnerships
Collaboration and partnerships are key to the success of the Campus. One of the first people to be recruited to the Campus was the Head of Partnerships and Knowledge Transfer. Data science is a new area for ONS, requiring new relationships to be forged. Having a member of the senior leadership team, with the right experience, dedicated to pro-active engagement has been vital in rapidly establishing effective relationships with stakeholders across government, academia, the third sector and the private sector. As a result, Memoranda of Understanding have now been signed with 12 partner institutions, formalising our joint agreement to work together in a mutually beneficial way to promote data science. This has led to access to new data sources, collaboration with academia on developing the MSc and on PhD projects which will meet the remit of the Campus and ONS, and collaboration and cross-fertilisation of projects and research ideas.
Users, Stakeholders & Beneficiaries
Stakeholders are closely involved in research projects. Stakeholders, which include other government departments, ONS teams, private sector companies and the third sector, are invited to suggest initial ideas. A cross-disciplinary brainstorming workshop is then held to refine these ideas into potential projects. Invitees include the policy / question ‘owner’, data experts, subject matter experts and data scientists. The projects are then run in close collaboration with the relevant experts and final users. This multi-disciplinary approach allows Campus data scientists to benefit from subject matter expertise, and stakeholders to benefit from working alongside the Campus on real projects. Some stakeholders are unfamiliar with data science. The Campus addresses this by demonstrating projects and their benefits, and talking through stakeholder challenges to identify where data science can help. This builds the capability of our partners to exploit the opportunities of data science.
Results, Outcomes & Impacts
In the first 6 months since the formal launch of the Campus, 12 projects have been kicked off, with 8 now nearing completion. These include:
• mapping the urban forest – this is a new requirement for the Natural Capital Account, and prior to this work, there was no UK dataset available. This work will produce a map of the whole of the UK’s urban forest, and a methodology for updating this. The value of urban nature from pollution removal alone was estimated by ONS at over £211 million for 2015. With the addition of other services such as recreation, flood prevention and noise regulation the annual value to the UK economy is likely to be in excess of £1 billion.
• superfast GDP and the automatic classification of the financial sector into granular financial subsectors – these projects address a key economic challenge: how to identify any risk to the UK economy early, to inform financial and monetary policy and potentially mitigate or avoid any future financial crisis. The Bank of England estimated that the last financial crisis cost the UK £7.4 trillion. Mitigating this by only 0.01% would save the UK £740 million.
• understanding the gap between falling reported calorie intake and increasing obesity – this will inform health policy on obesity, a key issue for the National Health Service. Public Health England estimated the cost of obesity to the UK economy to be £27 billion per year (2007 estimate)
• a survey question bank – this project has converted PDF survey questionnaires into machine-readable format, and analysed question text to inform the harmonisation of business survey questions and administrative data definitions. This supports ONS’ Data Collection Transformation Programme, increasing efficiency, and reducing burden on survey respondents by identifying administrative data which could replace survey data.
Other results and impacts around building capability are described elsewhere in this submission.
Challenges and Failures
The key challenge has been in recruiting a sufficient number of highly-skilled data scientists. Government cannot generally compete with private sector salaries for data scientists. We have been successful in recruitment by emphasizing the public good aspect of our work, and the research freedom.
Conditions for Success
A number of conditions were required for success:
• commitment from the most senior levels, and inspirational leadership to deliver the Campus objectives
• commitment to a higher risk appetite for innovation, that allows – and indeed encourages – ‘failure’ of (some) projects. This enables the Campus to be truly innovative, rather than pursuing only projects that are certain to succeed
• new infrastructure to support big data storage and analysis
• a significant and protected allocation of learning time for the data scientists, to allow them to remain at the cutting edge of data science.
The Campus is the hub for the UK, and as such would not be replicated in the UK. However, we are sharing our experience in research projects, data science skills and capability building across government, and with some private sector companies, in order to support the development of UK data science capability. In addition, many national statistical institutions (NSIs) are facing a similar challenge to seize the opportunities offered by data science. We have shared our experience with a number of other NSIs across the world, to inform their own capability programmes, and to seek collaboration on joint projects. Campus staff are visiting Rwanda and the UN Economic Commission for Africa in September 2017 to scope long-term interventions, where the Campus will support the development of data science strategy and capability in Africa.
Establishing a true culture of innovation has been vital for the success of the Campus. The freedom to experiment, not just in research but in ways of working, with an expectation that some things may not work and will need to be re-evaluated, is challenging to those with an established belief in ‘right first time’. But it is this approach which has allowed the Campus to achieve so much, in such a new area for ONS, in a short space of time. The corollary is that ways of working need to be continuously assessed, with a willingness to change if necessary.