Today’s urban environment is awash with data. Data become available from a wide range of origins, from fixed and mobile sensors to large-scale monitoring infrastructures, and can come from public, private, or industry sources. Making such data useful enables the development of novel Big Data applications that utilize massive urban data streams. Large efforts are underway on this front as we gradually move towards a smart city era. Technologies that discover knowledge from urban data, such as new techniques for detecting disastrous events, tracking health issues, monitoring crucial environmental factors, or improving energy efficiency, will have an impact on many aspects of citizens’ everyday lives.
The importance of data originating in the urban space is going to grow significantly in the future. In fact, the smartamerica.org project reports that United States cities alone are going to invest an estimated 41 trillion USD over the next 20 years, with the stated goal of upgrading their infrastructure to improve connectivity and the ability to collect data, and thereby to use city resources more efficiently and enhance the quality of life.
The toughest obstacles in using urban data are that such data are heterogeneous, noisy, and unlabeled. In addition, urban data can include massive, high-speed data streams (for example, video feeds). The combination of monitoring and networking technologies, including the proliferation of ever more powerful smartphones and the large-scale introduction of the Internet of Things (IoT), creates a space where a very large number of data producers generate high volumes of data that are available in real time. Urban data are by definition produced by monitoring human activities, and are therefore a projection of citizen-related actions and events onto the data space defined by the technologies employed for their collection. Actions and events that involve many citizens are very complex and can be difficult to interpret in themselves; understanding and utilizing the data related to them is even more challenging. Succinctly stated, urban data are difficult to understand.