Big data: could mobile data contribute to the design of better public policies?

The Fourth Industrial Revolution has brought with it “an enormous volume of available data generated at high speed”, explains the global director of data research at Vodafone and chief data scientist at Data-Pop Alliance, Nuria Oliver, in an interview with Equal Times. It is big data, “unstructured data – not numbers stored in a database – such as images, texts, voice or video that require very different storage and processing techniques to those used 20 years ago”, clarifies Oliver.

“Part of this big data is generated by sensors such as particle accelerators or astronomical telescopes, but the part that we generate reflects aspects of our behaviour: such as Internet searches, banking transactions and interactions on social media or mobiles. There are more mobile phones than people, which is why they can be considered as sensors of a whole population or country. In addition, it is the technological device with the biggest uptake in the history of humanity, and the most powerful.

“As a species, we have never had a tool that allows us to measure human behaviour on a massive scale. And the mobile phone does that,” explains Oliver. In 2013, the MIT Technology Review described the idea of using mobiles as a “humanity sensor” as one of the breakthrough technologies of that year.

“Decisions with a global reach that have hitherto been taken without considering quantitative population data can now be taken by doing so,” explains Oliver. Such data allows us to understand global patterns of human behaviour that are very useful when addressing social issues. What are known as algorithms “for social good” can be used to optimize the allocation of resources for public goods such as health, security, access to education or fair employment.

“Based on my experience at the UN and the World Bank, the idea is to look at how we can take advantage of this data to improve the decisions we take, supporting the reasoning behind them with fewer biases and limitations. As human beings, we have conflicts of interest, weaknesses, hidden interests,” says the Spanish scientist.

Big data and sustainable development goals

The last UN World Data Forum revealed that 78 per cent of the global population aged between 18 and 44 have their smartphones within reach 22 hours a day, which makes mobile data a powerful tool for achieving the Sustainable Development Goals of the 2030 Agenda.

A pilot project in Nuremberg is using mobile data to measure CO2 emissions (SDG 13, climate action).

Using its own algorithms, the data analysis company Teralytics transformed anonymised data generated by the Telefónica Germany network into traffic flows, identifying over 1.2 million routes. The South Pole Group, a sustainability solutions provider, then modelled the pollution levels, taking into consideration information from the Federal Environment Ministry and meteorological data for Germany.

“Cities are responsible for 70 per cent of greenhouse gas emissions, which is why they are central to [tackling the issue of] environmental protection,” adds Renat Heuberger, executive director of South Pole Group, who sees “great potential in the use of data generated on a daily basis, such as mobile network data, to cut urban pollution levels”.

“Algorithms are considered to be very abstract. This joint pilot project shows how they can help us to tackle concrete social problems and challenges,” underlines Maximilian Groth, business developer at Teralytics. “The conclusions on how we can improve traffic management are particularly relevant for us. The results obtained could lead to a realistic assessment of these control mechanisms,” affirms Peter Pluschke, city councillor and head of the Health and Environmental Policy Department in Nuremberg.

Predicting crime hotspots

Mobile data can contribute to creating “sustainable cities and communities” (SDG 11), as illustrated by the crime prediction project in London jointly led by the MIT Media Lab, the research department of Telefónica Spain, the Bruno Kessler Foundation, Data-Pop Alliance and the University of Trento (Italy).

“We use demographic information combined with human mobility information provided by the anonymised and aggregated data from mobile telephone networks to predict whether a particular district of London will experience an increase in crime over the next month or not: whether there will be more or fewer crimes than the median by neighbourhood in the Metropolitan Area,” researcher Andrey Bogomolov, specialist in artificial intelligence, data mining and IT at the Social Sciences, Arts and Humanities Faculty of the University of Trento, explains to Equal Times.

“Our findings back up the hypothesis that the behavioural data captured by those networks, combined with basic demographic information, can be used to predict crime. Our combined machine learning model reaches a true-positive rate of 68.37 per cent when it uses demographic information and mobile data, and this rises to 69.54 per cent when we add census data (proportion of migrants, rate of employment, ethnicity, green space, crime figures, house prices, life expectancy, education levels, etc.),” says the scientist.
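As a rough illustration of the kind of measurement Bogomolov describes, the short Python sketch below labels each area as a hotspot when its crime count is above the median across areas, then computes the true-positive rate of a set of predictions. It is not the researchers' model: the function names, the area labels and every number are invented for the example, and the predictions are simply written in by hand rather than produced by a machine learning system.

```python
from statistics import median


def label_hotspots(crime_counts):
    """Mark an area True when its crime count is above the median across areas."""
    cutoff = median(crime_counts.values())
    return {area: count > cutoff for area, count in crime_counts.items()}


def true_positive_rate(actual, predicted):
    """Share of the real hotspots that the predictions also flagged."""
    positives = [area for area, is_hot in actual.items() if is_hot]
    if not positives:
        raise ValueError("no actual hotspots to evaluate")
    hits = sum(1 for area in positives if predicted.get(area, False))
    return hits / len(positives)


# Invented crime counts for five hypothetical areas in the following month.
next_month_crimes = {"A": 12, "B": 30, "C": 7, "D": 25, "E": 18}
actual = label_hotspots(next_month_crimes)

# Hand-written stand-in for a model's output (hypothetical).
predicted = {"A": False, "B": True, "C": False, "D": False, "E": True}

tpr = true_positive_rate(actual, predicted)
print(f"True-positive rate: {tpr:.2%}")
```

A true-positive rate only says how many of the real hotspots were caught; it says nothing about areas wrongly flagged, which is why such a figure is usually read alongside other measures.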

“The diversity and the regularity of aggregated human behaviour provide us with much more information than official statistics, which are less useful than human dynamics because their temporal granularity – accuracy of a measurement in relation to time – is lower,” he explains.

Bogomolov makes it clear that the project has nothing to do with the idea of ‘pre-crime’, as raised by Philip K. Dick’s short story The Minority Report. “The technology behind our research could contribute, rather, to the optimal distribution of police resources and local government policy initiatives.”

He adds that these artificial intelligence methods predicting crime hotspots do not, however, provide any guarantee of “safe” areas or “peace”. “It is up to governments to take action,” he underlines. “We hope that, if such technologies were made public and were developed in real time, criminals might modify their behaviour on the basis that the police may be anticipating more criminal activity in a given area at a given time. This is a matter requiring more in-depth research by behavioural scientists, from a ‘game theory’ perspective,” he explains to Equal Times.

Privacy and lack of transparency, the “dark side” of big data

“The main precautions to be taken are related to people’s privacy and the conflicting nature of machine learning systems that model human behaviour,” recalls Bogomolov.

It is what a study by Data-Pop Alliance and the Bruno Kessler Foundation refers to as the “dark side”: violations of privacy, information asymmetry, lack of transparency, discrimination, social exclusion, etc.

The study, however, points to the need to focus “on the potential of data-driven policies to lead to positive disruption, such that they reinforce and enable the powerful functions of algorithms as tools generating value while minimizing their dark side”.

“Here, opacity is confronted with transparency: in order to make more objective and fair decisions, we have to be conscious of the potential limitations of the algorithms. And if the data is biased – if it is not representative of the population concerned – the algorithms will also be biased,” indicates Oliver, one of its authors.

“The interpretability problem is substantial. Many of the current algorithms are like black boxes, and it is difficult to interpret what they are actually doing. They work very well, but we don’t know how to explain why. And in the context of decisions affecting people, it is essential that we are able to do so, not only so that we can make sure they are the right decisions but also so that we can explain why,” notes the expert.