Humboldt-Universität zu Berlin - High Dimensional Nonstationary Time Series

IRTG1792DP2018 013

Improving Crime Count Forecasts Using Twitter and Taxi Data

Lara Vomfell
Wolfgang Karl Härdle
Stefan Lessmann

Data from social media has created opportunities to understand how and why
people move through their urban environment and how this relates to criminal
activity. To aid resource allocation decisions in predictive policing, this
paper proposes an approach to predict weekly crime counts. The novel approach
captures the spatial dependency of criminal activity by approximating human
dynamics. It integrates point-of-interest data in the form of Foursquare
venues with Twitter activity and taxi trip data, and introduces a set of
approaches to create features from these data sources. Empirical results
demonstrate the explanatory and predictive power of the novel features. Analysis
of six months of real-world crime data for New York City shows that both
temporal and static features are necessary to account effectively for human
dynamics and to predict crime counts accurately. Furthermore, the results
provide new evidence on the underlying mechanisms of crime and have
implications for crime analysis and intervention.

Keywords: Predictive Policing, Crime Forecasting, Social Media Data, Spatial Econometrics

JEL classification: