Data Mining

Data Mining in PETRA focuses on the exploitation of (anonymous) individual mobility data, i.e. data describing the movement of single users or vehicles in a given period of time, such as GPS traces of vehicles or mobile phone call detail records (CDR). In particular, the analysis tools developed emphasize the notion of individual mobility behaviour, i.e. the mobility schedule followed by a single user in her daily life. Such a schedule includes regularities (daily routines, e.g. home-to-work trips) and specific objectives (the purpose of each trip), which can be exploited to build better predictive models, to understand urban traffic better, and to support innovative services.

Trip Activity Recognition
Human mobility is driven by people’s daily activities, such as going to work or school, shopping, transporting kids, and so on. The digital mobility traces collected through a variety of technologies, from navigation devices to smart phones, allow us to understand people’s movements in great detail. However, they generally fail to capture the purpose of such movements, i.e. the kind of activity behind each trip.
To address this, the ABC method was developed. It builds on the notion of the Individual Mobility Network (IMN), a network view of the overall mobility of an individual. The IMN clusters trips to and from the same locations (the nodes of the network) and combines the features of the single trips (the links of the network) with their mutual relations through network-based statistics.

Figure 1: IMN extracted from the mobility of an individual.
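As a rough illustration of the IMN idea, the sketch below builds a toy network from a list of trips. It assumes that locations have already been identified and named (in practice they would be derived from the raw traces), and the function names and the statistics attached to each link are hypothetical choices made for this example, not the features used by the ABC method.

```python
from collections import Counter

def build_imn(trips):
    """Build a toy Individual Mobility Network from (origin, destination) trips.

    Nodes are locations, counted once per trip endpoint; links are directed
    origin -> destination pairs weighted by the number of trips.
    """
    nodes = Counter()
    links = Counter()
    for origin, destination in trips:
        nodes[origin] += 1
        nodes[destination] += 1
        links[(origin, destination)] += 1
    return nodes, links

def link_features(nodes, links):
    """Attach simple, illustrative network-based statistics to each link."""
    total = sum(links.values())
    return {
        (o, d): {
            "trips": w,               # number of trips merged into this link
            "share": w / total,       # fraction of all trips on this link
            "dest_visits": nodes[d],  # how often the destination appears overall
        }
        for (o, d), w in links.items()
    }

# Hypothetical trips of one user between already-labelled locations.
trips = [("home", "work"), ("work", "home"), ("home", "work"),
         ("work", "gym"), ("gym", "home")]
nodes, links = build_imn(trips)
features = link_features(nodes, links)
```

Each link then carries both the aggregated trips and statistics that depend on the rest of the network, which is the kind of combined description a trip classifier can use.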

The resulting algorithm, the Activity-Based Cascade (ABC) classifier, classifies trips in a cascade: the trip categories are identified one after the other rather than all at once, and the knowledge gained from the classifications already performed is used to better classify the remaining categories.
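The cascade principle can be sketched as follows. This is only an illustration under strong assumptions: hand-written rules stand in for the trained classifiers of each stage, and the trip fields (`dest_rank`, `weekday`) and the category names are hypothetical.

```python
def cascade_classify(trips, stages, default="other"):
    """Label trips by running binary stages one after the other.

    `stages` is an ordered list of (label, predicate). Each predicate receives
    the trip, the labels assigned by earlier stages, and all trips, so later
    stages can build on earlier decisions.
    """
    labels = {}
    for label, predicate in stages:
        # Decide the whole stage first, then record it, so that a stage only
        # sees the results of the stages that came before it.
        accepted = [tid for tid, trip in trips.items()
                    if tid not in labels and predicate(trip, labels, trips)]
        labels.update((tid, label) for tid in accepted)
    for tid in trips:
        labels.setdefault(tid, default)
    return labels

def is_home(trip, labels, trips):
    return trip["dest_rank"] == 1  # trip ends at the most visited location

def is_work(trip, labels, trips):
    return trip["dest_rank"] == 2 and trip["weekday"]

def is_after_work(trip, labels, trips):
    # Uses earlier results: the trip starts where a "work" trip ended.
    work_places = {trips[t]["dest"] for t, l in labels.items() if l == "work"}
    return trip["origin"] in work_places

trips = {
    1: {"origin": "A", "dest": "B", "dest_rank": 2, "weekday": True},
    2: {"origin": "B", "dest": "C", "dest_rank": 5, "weekday": True},
    3: {"origin": "C", "dest": "A", "dest_rank": 1, "weekday": True},
    4: {"origin": "A", "dest": "D", "dest_rank": 4, "weekday": False},
}
stages = [("home", is_home), ("work", is_work), ("after_work", is_after_work)]
labels = cascade_classify(trips, stages)
```

Trip 2 can only be recognised in the third stage, because its rule depends on trip 1 having already been labelled as a work trip.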

Individual Mobility Profiles

Mobility Profiles describe the systematic component of the mobility of an individual. As such, they describe those movements that tend to repeat consistently over time, such as home-to-work routines, while disregarding occasional trips, which constitute noise in this context.

Figure 2: Individual history of a user and the process of Mobility Profile extraction.

Given the set of trajectories of a user, we extract the systematic movements by clustering the trajectories with a density-based clustering algorithm (specifically, a revised version of OPTICS), equipped with a suitable function to compare trajectories. The user’s mobility profile is described by a representative trip for each cluster, while the unclustered trips represent non-systematic mobility.
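The extraction step can be sketched in a few lines. Note the substitutions: a plain DBSCAN stands in for the revised OPTICS algorithm, the trajectory comparison simply averages the distances between start points and between end points, and the representative trip is taken to be the cluster medoid. All three are assumptions made for illustration, as are the function names and parameter values.

```python
import math

def trip_distance(a, b):
    """Assumed comparison function: mean of start-to-start and end-to-end distance."""
    return (math.dist(a[0], b[0]) + math.dist(a[-1], b[-1])) / 2

def dbscan(items, dist, eps, min_pts):
    """Plain DBSCAN; returns one cluster id per item, with -1 meaning noise."""
    labels = [None] * len(items)

    def neighbours(i):
        return [j for j in range(len(items)) if dist(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1            # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster   # border item previously marked as noise
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)
    return labels

def mobility_profile(trips, eps, min_pts):
    """Return (representative trip per cluster, unclustered occasional trips)."""
    labels = dbscan(trips, trip_distance, eps, min_pts)
    profile = []
    for c in sorted(set(labels) - {-1}):
        members = [t for t, l in zip(trips, labels) if l == c]
        # Medoid: the member with the smallest total distance to the others.
        profile.append(min(members,
                           key=lambda m: sum(trip_distance(m, o) for o in members)))
    occasional = [t for t, l in zip(trips, labels) if l == -1]
    return profile, occasional

trips = [
    [(0, 0), (5, 5), (10, 10)], [(0.1, 0), (5, 5), (10, 10.1)], [(0, 0.1), (4, 6), (10.1, 10)],
    [(0, 0), (20, -5)],                                   # a one-off trip
    [(10, 10), (5, 5), (0, 0)], [(10, 10.1), (5, 4), (0.1, 0)], [(10.1, 10), (0, 0.1)],
]
profile, occasional = mobility_profile(trips, eps=0.5, min_pts=3)
```

With these toy trips the profile contains two routines (outbound and return), and the one-off trip is left out as non-systematic mobility.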

Individual Mobility Prediction
Predicting the future locations of a mobile user is a flourishing research area with many useful applications, such as predicting traffic or pre-fetching possible services or suggestions for the user. For this reason we developed a system called MyWay to forecast the exact future positions visited by mobile users. MyWay employs predictors that exploit the individual systematic behaviour of a single user to predict her future behaviour. It requires that each individual computes an abstract representation of her systematic behaviour: the individual mobility profiles mentioned above. MyWay is based on two steps: a learning phase, which simply consists of the acquisition of mobility profiles; and a prediction phase, which, given the current trajectory of a user, predicts her future positions.

Figure 3: Individual Prediction Strategy Schema

The algorithm follows a hierarchical strategy that recognizes the specificity of the individual profile compared to the collective profiles. More specifically, it uses the user’s individual profile to make predictions for ongoing trips that appear to follow her personal routines and, if this fails, it applies the same approach to the collective profiles.
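The fallback logic of this hierarchical strategy can be sketched as below. The matching rule is a deliberately naive assumption (every observed point must lie within a tolerance of the routine, and travel direction is ignored), the first matching routine is returned rather than the best one, and all names and the tolerance value are hypothetical.

```python
import math

def match_routine(current, routine, tol):
    """Return the remaining points of `routine` if `current` follows it, else None."""
    # Every observed point must lie within `tol` of some point of the routine.
    for p in current:
        if min(math.dist(p, q) for q in routine) > tol:
            return None
    # Continue from the routine point closest to the last observed position.
    last = current[-1]
    idx = min(range(len(routine)), key=lambda i: math.dist(last, routine[i]))
    return routine[idx + 1:]

def predict(current, individual_profile, collective_profiles, tol):
    """Try the user's own routines first, then fall back to collective ones."""
    levels = (("individual", individual_profile),
              ("collective", collective_profiles))
    for source, routines in levels:
        for routine in routines:
            future = match_routine(current, routine, tol)
            if future:                # an empty remainder counts as no prediction
                return source, future
    return None, []

individual = [[(0, 0), (1, 0), (2, 0), (3, 0)]]
collective = [[(0, 0), (0, 1), (0, 2), (0, 3)]]

own = predict([(0, 0), (1.1, 0.1)], individual, collective, tol=0.5)
shared = predict([(0, 0), (0.1, 1.0)], individual, collective, tol=0.5)
unknown = predict([(5, 5)], individual, collective, tol=0.5)
```

The first ongoing trip is predicted from the individual profile, the second only matches a collective routine, and the third matches nothing, so no prediction is made.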

Classification of City Users based on Mobile Phone data
Mobile phone data, despite their limited geo-localization precision compared with GPS tracks, are of utmost interest due to their availability in virtually every country and their ability to portray mobility independently of the means of transportation. An emerging line of research in the data mining community is the use of mobile phone call records for urban demographics: classifying mobile phone users into the behavioural categories of city users (residents, commuters and visitors) based on their call profiles.
We build an analytical process, called Sociometer, which is based on a stylized representation of the call pattern of a user in a particular spatio-temporal selection. A classification model is trained to learn how to annotate profiles with a city user category (resident, commuter, or visitor), and then applied to every (unlabeled) profile in order to estimate the presence of people in the three categories.
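To make the train-then-annotate process concrete, here is a minimal sketch. It is not the Sociometer implementation: the profile layout (days with calls per day type and time slot), the slot boundaries, the nearest-centroid classifier and the labelled examples are all assumptions chosen to keep the example short.

```python
import math
from collections import defaultdict
from datetime import datetime

SLOTS = ("night", "day", "evening")

def slot_of(hour):
    """Assumed time slots: night before 08:00, day until 19:00, then evening."""
    if hour < 8:
        return "night"
    if hour < 19:
        return "day"
    return "evening"

def call_profile(call_times):
    """Number of distinct days with calls in the area, per (day type, time slot)."""
    days = defaultdict(set)
    for ts in call_times:
        day_type = "weekend" if ts.weekday() >= 5 else "weekday"
        days[(day_type, slot_of(ts.hour))].add(ts.date())
    return [len(days[(d, s)]) for d in ("weekday", "weekend") for s in SLOTS]

def train(labelled_profiles):
    """Nearest-centroid model: the mean profile of each category."""
    groups = defaultdict(list)
    for profile, label in labelled_profiles:
        groups[label].append(profile)
    return {label: [sum(col) / len(ps) for col in zip(*ps)]
            for label, ps in groups.items()}

def classify(profile, centroids):
    """Annotate a profile with the category of the closest centroid."""
    return min(centroids, key=lambda label: math.dist(profile, centroids[label]))

# Hypothetical labelled profiles, in the order:
# [weekday night, weekday day, weekday evening, weekend night, weekend day, weekend evening]
labelled = [
    ([5, 5, 5, 2, 2, 2], "resident"), ([4, 5, 5, 2, 2, 1], "resident"),
    ([0, 5, 1, 0, 0, 0], "commuter"), ([0, 4, 0, 0, 0, 0], "commuter"),
    ([0, 1, 1, 0, 0, 0], "visitor"),  ([0, 0, 0, 0, 1, 1], "visitor"),
]
centroids = train(labelled)

# An unlabelled user who calls from the area on four weekdays, during the day.
calls = [datetime(2015, 3, 2, 10), datetime(2015, 3, 3, 11),
         datetime(2015, 3, 4, 15), datetime(2015, 3, 5, 9)]
profile = call_profile(calls)
category = classify(profile, centroids)
```

Applying `classify` to every unlabelled profile and counting the resulting categories would give the estimated presence of each category of city user in the area.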

Figure 4: Aggregate representation of the call pattern of a user over an area.

This classification process makes it easy to provide up-to-date statistics about a territory in terms of resident population, systematic traffic flows across areas (represented by users who are residents in one area and commuters in another), and the attraction of external visitors.