Mobile Networks Data for Official Statistics
The Mobile Networks Data for Official Statistics project seeks to provide a generic end-to-end production framework for using data generated by mobile networks, which can be adapted to different statistical needs and domains through a modular approach. It produces a robust, mathematically sound solution for improving the timeliness and relevance of official statistical products in order to meet a concrete need of statistical data users.
The course of action was determined in 2013 through the Scheveningen Memorandum, in which European Statistical System (ESS) members acknowledged the importance of Big Data for official statistics in the context of rapidly changing economic and social environments, where timely and relevant statistics are the cornerstones of any sound decision-making process. Traditional statistical production systems based on census and survey data are increasingly strained by a growing number of issues, ranging from cost, timeliness, and relevance to the response burden on statistical units. Mobile network data can provide an efficient and effective data source that circumvents at least some of the issues raised by traditional methods of data collection.
The objectives of this project are:
1. To provide a robust and modular methodological and production framework which can be configured and finely tuned to the needs of any statistical office.
2. To provide proof-of-concept experimental statistics in the social and economic domains.
The project's innovation lies in its new, modular approach to integrating mobile phone data into the production of official statistics. This approach makes it easy to change or update the technology-dependent modules within the proposed framework. It is one of the first end-to-end frameworks dealing with mobile phone data, starting from raw data and going up to the final statistical indicators. Because access to mobile phone data is problematic, micro-simulation software was developed to produce synthetic data sets similar to real ones. This tool was instrumental in developing the methodological framework. All of the project's tools (methodology and IT production framework) were implemented from scratch.
What Makes Your Project Innovative?
The main innovative breakthrough of the Mobile Network Data project is a truly modular and mathematically sound design, ready for use by any national statistical office with access to mobile network datasets.
The second aspect is an end-to-end, fully working, modular production pipeline, implemented from scratch and provided under a free and open-source software license.
Thirdly, the project created a network data micro-simulator which can be used under different scenarios, minimizing the risks of using real-world data and providing a playground for statisticians and researchers. It also provides the so-called "ground truth" which is needed to assess the quality of statistical models. Since real data cannot provide such information, the micro-simulation software is a key driver of the project's success.
What is the current status of your innovation?
- Given the modular approach embedded in the project, different deliverables are being pursued at the same time, with different partners and schedules.
- Negotiations and agreements with mobile network operators must still be pursued; establishing the conditions under which access will be granted is a clear objective for the project and for the whole ESS.
- A full, updated list of deliverables can be found on the dedicated project page on the European Commission's website (link above) under the section "milestones and deliverables".
Collaborations & Partnerships
The methodological framework and its software implementation are the result of a collaboration between INE Spain and INS Romania. Part of the work was done under the ESSnet Big Data II / Workpackage I, a project within the European Statistical System (ESS).
Users, Stakeholders & Beneficiaries
The primary beneficiaries of the project are statistical offices that strive to adopt mobile network data in their statistical production systems while safeguarding the governing principles of official statistics in terms of independence, quality, confidentiality and transparency. Telecommunication companies can also benefit from working with top-tier statisticians and improve their targeting strategies for covering sub-populations based on socio-demographic or economic characteristics.
Results, Outcomes & Impacts
The main results of the project are a methodological proposal for integrating mobile phone data into the production of official statistics and an open-source software implementation. The statistical methodology underlying the production framework comprises mathematical methods focusing on data quality, especially on the accuracy dimension. The use of agent-based micro-data simulations allows the statistician to compare the estimates with the ground truth at different stages of the end-to-end process, thus providing a continuous quality assessment of the methods. From a strategic point of view, the impact can be measured by the degree to which official statistics agencies adopt this methodological approach. Using this data source in different areas of official statistics (tourism, population statistics, etc.) will significantly improve the timeliness and relevance of official statistics products.
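The idea of checking estimates against a simulated ground truth can be illustrated with a minimal Python sketch. It is not the project's simulator or methodology: the function names, the three regions, and the single fixed detection rate are all assumptions made for illustration, and the estimator is a simple scale-up of observed counts.

```python
import random
from collections import Counter

def simulate_ground_truth(n_people, regions, seed=0):
    """Assign each synthetic person to a region; this list is the ground truth."""
    rng = random.Random(seed)
    return [rng.choice(regions) for _ in range(n_people)]

def observe_network_events(true_regions, detection_rate, seed=1):
    """Keep each person with probability `detection_rate`, mimicking the
    partial coverage of a single (hypothetical) mobile network operator."""
    rng = random.Random(seed)
    return [r for r in true_regions if rng.random() < detection_rate]

def estimate_counts(observed_regions, detection_rate):
    """Scale the observed counts up by the assumed detection rate."""
    observed = Counter(observed_regions)
    return {r: n / detection_rate for r, n in observed.items()}

def relative_errors(estimates, true_regions):
    """Relative error of each regional estimate against the ground truth."""
    truth = Counter(true_regions)
    return {r: (estimates.get(r, 0.0) - truth[r]) / truth[r] for r in truth}

# Hypothetical scenario: 30,000 synthetic people spread over three regions.
regions = ["A", "B", "C"]
truth = simulate_ground_truth(30_000, regions)
observed = observe_network_events(truth, detection_rate=0.4)
estimates = estimate_counts(observed, detection_rate=0.4)
errors = relative_errors(estimates, truth)

for region in regions:
    print(region, round(errors[region], 4))
```

Because the true counts are known by construction, the error of each estimate can be computed directly, which is exactly what real network data cannot offer.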
Challenges and Failures
The main challenge is data access, which is proving problematic under current European regulations. However, this can also be seen as an opportunity to develop and implement data privacy tools suited to official statistics, giving a strong boost to innovation and making statistical offices more robust and proactive in tackling changes in data collection ecosystems.
Conditions for Success
Conditions for success revolve around building supporting infrastructure and/or services for data sharing under very strict data privacy regulations. Working with this new data source strongly suggests that collaboration with, and integration of, private sector agents (mobile network operators) is necessary to build an efficient production process.
Modularity and adaptability, derived from the underlying mathematical approach, are key features of this production framework; they allow different organisations to adapt and fine-tune the framework to their concrete circumstances.
A number of organizations have made partial use of mobile network data to produce different socio-demographic aggregates and indicators. These are mainly one-off case studies showing the high potential of this new data source. This project instead focuses on a standardized, modular, and evolvable production process that fully integrates statistical accuracy assessment.
A modular design proved to be the best way to provide a robust methodological and production framework that could fit the needs of all national statistical offices. The methodological system can be decomposed into interchangeable parts, each of which can be replaced by a module that better suits user needs. The same goes for the production system, where software components can be replaced without breaking the entire architecture.
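As a rough illustration of what interchangeable modules can look like in code, the following Python sketch chains small functions into a pipeline and swaps one of them without touching the rest. The module names, the antenna-to-region lookup, and the data layout are hypothetical and are not taken from the project's software.

```python
from collections import Counter
from typing import Callable, Dict, List

# A module is any function that takes the pipeline state and returns a new one.
Module = Callable[[Dict], Dict]

def geolocate_by_antenna(data: Dict) -> Dict:
    """Hypothetical module: map each event's antenna to a region."""
    lookup = {"a1": "North", "a2": "South"}
    return {**data, "regions": [lookup[e["antenna"]] for e in data["events"]]}

def geolocate_coarse(data: Dict) -> Dict:
    """Hypothetical drop-in replacement: place every event in one region."""
    return {**data, "regions": ["All" for _ in data["events"]]}

def aggregate(data: Dict) -> Dict:
    """Count events per region; works with either geolocation module."""
    return {**data, "counts": dict(Counter(data["regions"]))}

def run_pipeline(modules: List[Module], data: Dict) -> Dict:
    """Apply the modules in order, passing each output to the next module."""
    for module in modules:
        data = module(data)
    return data

raw = {"events": [{"antenna": "a1"}, {"antenna": "a2"}, {"antenna": "a1"}]}

# Swapping the geolocation module leaves the aggregation step untouched.
detailed = run_pipeline([geolocate_by_antenna, aggregate], raw)
coarse = run_pipeline([geolocate_coarse, aggregate], raw)

print(detailed["counts"])
print(coarse["counts"])
```

The only contract between modules in this sketch is the shape of the dictionary they pass along, which is what lets one component be replaced without breaking the others.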
All results are publicly available. The source code is distributed freely on the GitHub platform, and the description of the methodological framework is also publicly available.