
Deconstructing Crimes in Philadelphia

Objectives

The main objective of this project was to make Philadelphia attractive to a large company looking to relocate its office here. While this is a bigger question of fundamental economics and the company's comparative advantage, we were asked to explore datasets and use visualization tools to sell Philly!

While many of my peers offered exciting activities, events, and good places to eat in Philadelphia, I worked backwards and asked: why not Philly? A major reason many people mention is security. I chose to dig deeper into a couple of specific crimes in Philadelphia, as well as offer insights into some of the inner workings of crime in Philly.

Datasets & Tools

Crime


I primarily used a publicly available crime dataset provided by the City of Philadelphia. It covers 01/2006 to 02/2019 and contains more than 2.5 million rows. Data and details are available here.

Key pertinent variables of the data include:

  • Type of crime

  • Date and time officer was dispatched to scene

  • Location of crime (Latitude & Longitude)

  • Location block of Crime

Weather


I wanted to study the effect of weather on crimes, especially those that are perhaps committed with less intention. I chose to use daily weather data from the National Oceanic and Atmospheric Administration station at Philadelphia International Airport. The data is linked here.

Key pertinent variables of the data include:

  • Daily min and max temperature

  • Average daily precipitation

  • Average daily snowfall

  • Average daily wind speeds

Unemployment


Lastly, I wanted to visualize the effects of unemployment on crime. I obtained the "Local Area Unemployment Statistics" from the Bureau of Labor Statistics for 01/2006 to 12/2018. The data is updated monthly and is available here.

Key pertinent variables of the dataset include:

  • Labor Force numbers

  • Employed numbers

  • Unemployed numbers

  • Unemployment rate

Manipulation & Visualization

Using OpenRefine, I removed redundant variables from the incidents dataset and cleaned the remaining fields. The daily weather data was then reformatted so that its dates matched the date format of the monthly unemployment data.

Tableau was then used to join the datasets and build the visualizations that follow. The data was uploaded to Tableau's public server for use on this website.
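For readers who want to reproduce the date matching and join outside OpenRefine and Tableau, here is a minimal Python sketch. It is not the workflow used for this project: the `MM/DD/YYYY` input format, the row layout, the function names, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def month_key(date_text, fmt="%m/%d/%Y"):
    """Convert a daily date string to a 'YYYY-MM' key (input format is assumed)."""
    return datetime.strptime(date_text, fmt).strftime("%Y-%m")


def monthly_mean_temperature(daily_rows):
    """Average the daily midpoint (t_min + t_max) / 2 per month.

    daily_rows is an iterable of (date_text, t_min, t_max) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0])  # month -> [sum of midpoints, day count]
    for date_text, t_min, t_max in daily_rows:
        bucket = totals[month_key(date_text)]
        bucket[0] += (t_min + t_max) / 2
        bucket[1] += 1
    return {month: total / days for month, (total, days) in totals.items()}


def join_on_month(monthly_weather, monthly_unemployment):
    """Inner join two {month: value} dicts on the months they share."""
    shared = sorted(monthly_weather.keys() & monthly_unemployment.keys())
    return [(m, monthly_weather[m], monthly_unemployment[m]) for m in shared]


# Made-up sample values, for illustration only.
daily = [("01/01/2018", 20, 30), ("01/02/2018", 30, 40), ("02/01/2018", 40, 50)]
unemployment = {"2018-01": 5.9, "2018-02": 5.8}
joined = join_on_month(monthly_mean_temperature(daily), unemployment)
print(joined)
```

Reducing every date to a shared `YYYY-MM` key is what lets daily and monthly sources line up; the same key could be derived from the crime dispatch dates.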


Analysis

The Bottom Line (beginning at the top!)

[1]

Crime in Philadelphia is at an all-time low. In almost every month, each subsequent year shows a lower crime rate than the one before. As mentioned in the Philadelphia Inquirer article here, Philadelphia's total crime rate is at its lowest level since the 1970s.

In fact, at the time of this web page's publication in February, we're probably at the lowest point of crime for the year, as shown by the sharp "v" shape that consistently appears in February.

Driving Crime Down

Below, I take a deeper look at a number of the largest contributors to the decline in crime. (The main one, "All Other Offenses", has been removed for better granularity.) Feel free to play around with the checkboxes and filters!

[2]

Let's look at the data in a colored table to get a better overview of the underlying numbers. It's clear that while driving factors such as "Driving Under Influence" and "Robbery Firearm" decreased, there was also an increase in "Thefts from Vehicles" and "Weapon Violations" in recent times.

[3]

Where Should I Watch Out?

I plotted a heat map of crimes in Philadelphia. Because the data was specific to the address, it was difficult to draw insight from it. In Tableau, I created a new calculated field that rounds the latitude and longitude to the closest hundredth. This creates the orderly arranged round balls, each of which shows the prevalence of crimes within a cell of roughly 0.01 degrees. I also placed a household income overlay on the map for better context.

Using a pure sum, it seems that Center City has the highest density of crime, but this may be driven by the large number of thefts and fraud cases concentrated around the city center. This is the case when we change "Crime" to fraud and thefts, which are highly concentrated in Center City. When we shift to rape, the distribution is wider, with the highest numbers in North Philly, West Philly and Center City. Homicide, too, has its highest incidence in North Philly. Interestingly, prostitution is centered in specific hotspots in North Philly.

Explore the crime map by changing "Crime" from all to Fraud, Thefts, Homicide - Criminal, Rape, Prostitution and more!
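The rounding trick can be sketched in Python as follows. This only illustrates the idea behind the Tableau calculated field, not the field itself; the function names and the sample coordinates are assumptions.

```python
from collections import Counter


def grid_cell(latitude, longitude, decimals=2):
    """Snap a coordinate to the nearest hundredth of a degree."""
    return (round(latitude, decimals), round(longitude, decimals))


def crimes_per_cell(points):
    """Count how many (latitude, longitude) points fall in each grid cell."""
    return Counter(grid_cell(lat, lon) for lat, lon in points)


# Made-up coordinates near Philadelphia, for illustration only.
points = [(39.9526, -75.1652), (39.9531, -75.1668), (39.9812, -75.1553)]
counts = crimes_per_cell(points)
for cell, n in counts.most_common():
    print(cell, n)
```

Each cell is about 0.01 degrees on a side, so nearby incidents collapse into a single mark whose size or color can encode the count. Python's `round` may treat exact halfway values differently from Tableau's `ROUND`, which only matters for coordinates that sit exactly on a cell boundary.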

[4]

When Should I Watch Out?

Thinking along the time dimension of visualization [2], I asked how I could translate this into actionable decision making. Below is a summary of the percentage of total crimes by day of the week, aimed at increasing awareness of crimes on specific days. The percentage is calculated within each specific crime.

We can see a high time correlation between burglary and thefts, indicating a similar preference among criminals to act during the week, while pedestrians are usually busy. Fraud and forgery show a similar, but stronger, difference between weekdays and weekends. Prostitution shows the strongest effect, with only 1% happening on Sunday and 11% on Saturday, and the rest spread across the week.

Conversely, driving under the influence and public drunkenness are most prevalent during the weekend, most probably correlated with high alcohol consumption and a higher number of "Happy Hour" promotions on those days. Interestingly, homicides also show a dip during the weekdays and a rise during the weekends.
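To make the "percentage within each crime" calculation concrete, here is a small Python sketch. The timestamp format, the function name, and the sample incidents are assumptions; in the project itself this was a Tableau table calculation.

```python
from collections import Counter, defaultdict
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def weekday_share(incidents, fmt="%Y-%m-%d %H:%M:%S"):
    """Return {crime: {day_name: percent}}; each crime's percentages sum to 100.

    incidents is an iterable of (crime, dispatch_timestamp_text) tuples.
    """
    counts = defaultdict(Counter)
    for crime, dispatch_text in incidents:
        day = DAYS[datetime.strptime(dispatch_text, fmt).weekday()]
        counts[crime][day] += 1
    shares = {}
    for crime, by_day in counts.items():
        total = sum(by_day.values())
        shares[crime] = {day: 100 * by_day[day] / total for day in DAYS}
    return shares


# Made-up incidents, for illustration only.
incidents = [
    ("Burglary", "2019-02-04 09:15:00"),  # Monday
    ("Burglary", "2019-02-04 16:00:00"),  # Monday
    ("Burglary", "2019-02-05 14:30:00"),  # Tuesday
    ("Burglary", "2019-02-02 11:00:00"),  # Saturday
    ("DUI", "2019-02-03 02:45:00"),       # Sunday
]
shares = weekday_share(incidents)
print(shares["Burglary"]["Monday"])
```

Normalizing within each crime, rather than across all crimes, keeps rare offenses comparable with common ones: every row of the table sums to 100% regardless of how many incidents it contains.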

[5]

Let's look a little deeper: are there specific times these crimes occur?

Most interestingly, homicides occur more often at night, peaking around midnight, with the percentage of records dropping slowly afterwards.

Driving under the influence is most prominent after midnight, peaking around 3 AM, most probably because bars begin to close then. Public drunkenness follows a similar pattern, but has a higher incidence during the day than driving under the influence.

While some prostitution happens during the day, the majority occurs, unsurprisingly, past 8 PM, with a peak at 10 PM. There is also a minor increase centered around 10 AM.

Residential burglary tends to happen after 8 AM, with the majority happening past 3 PM. It also follows the pattern set by forgery and fraud, which happen earlier in the day, with small declines during lunch hour and towards the end of the day.

However, we have to admit the limitations of this data. The times reported are those when officers are dispatched. For crimes that occur in the middle of the night, victims may choose to delay reporting until later in the day.

Play around with the data, click through crimes to see their different rates throughout the day!

[6]

Heated Relationships

Visualization [1] showed that the smallest number of crimes is consistently reported in February. Could this be correlated with temperature? Numerous studies find a strong correlation between ambient temperature and crime (including this cool paper that studies the correlation at the granularity of day-to-day crime).

Generally, crimes increase during the summer months and decrease during the winter months. There could be several reasons for this; seasonal unemployment is one possibility. The next sections attempt to visualize the correlation between temperature and crime in Philadelphia.

Below is a bar graph of the count of crimes over time, overlaid with a line graph of the average temperature throughout the year.

The line representing the average monthly temperature stays the same, but click the different crimes to filter them!
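The charts in this section show the relationship visually. A reader who wants a single number to go with them could compute a Pearson correlation coefficient between monthly average temperature and monthly crime counts. The sketch below is not part of the original analysis, and the sample values are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)


# Made-up monthly values, for illustration only.
avg_temp_f = [33, 35, 44, 55, 65, 74]
crime_count = [900, 850, 1000, 1100, 1200, 1250]
print(round(pearson(avg_temp_f, crime_count), 3))
```

A value near +1 would mean crime counts rise with temperature and a value near -1 the opposite. Either way, the coefficient describes association only and says nothing about cause.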

[7]

Let's look at a zoomed-in version of the graph above, now plotting crime and temperature against weeks of the year.

[8]

Plotting the number of crimes against temperature produced a stretched bell-shaped curve for all crimes, but I wanted to get a better look at what was driving this. I decided instead to graph each crime's percentage of total crimes against average temperature.

 

My rationale: as total crime increases, each specific crime should scale with it, so its percentage of the total should stay roughly constant. Graphing the differences allows us to see which crimes take up a lower or higher percentage of total crime at different temperatures.
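One way to read this calculation is as each crime's share of all crimes within a temperature bin. The Python sketch below follows that reading; the 10-degree bin width, the function names, and the sample records are assumptions rather than the settings used in Tableau.

```python
from collections import Counter, defaultdict


def temperature_bin(temp_f, width=10):
    """Return the lower edge of the bin containing temp_f (bin width is assumed)."""
    return int(temp_f // width) * width


def crime_share_by_bin(records, width=10):
    """Return {bin: {crime: percent of that bin's crimes}}.

    records is an iterable of (crime, average_temperature_f) tuples.
    """
    counts = defaultdict(Counter)
    for crime, temp_f in records:
        counts[temperature_bin(temp_f, width)][crime] += 1
    return {
        b: {crime: 100 * n / sum(by_crime.values()) for crime, n in by_crime.items()}
        for b, by_crime in sorted(counts.items())
    }


# Made-up records, for illustration only.
records = [
    ("Thefts", 31), ("Thefts", 35), ("Thefts", 38), ("Vandalism", 36),
    ("Thefts", 72), ("Vandalism", 78),
]
share = crime_share_by_bin(records)
print(share)
```

If a crime simply scaled with total crime, its share would be flat across bins. In this toy data, vandalism moves from 25% of crimes in the 30s bin to 50% in the 70s bin, which is the kind of shift the charts are meant to surface.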

 

The data below indicates that crimes such as narcotic violations, vandalism, thefts, robbery without firearms and weapon violations are less prevalent at lower temperatures.

 

Motor vehicle theft and robbery with firearms, however, seem to have a negative correlation with temperature across the spectrum.

 

Graphs for crimes such as embezzlement may suffer from a lack of samples, with only 5 crimes recorded in the smallest temperature bin.

Scroll through and find the percentages of crimes broken down across temperature bins. Look particularly at the difference between vandalism, thefts and narcotic violations on one hand, and motor vehicle theft and robbery with firearms on the other!

[9]

Washing away the crimes?

I also wanted to investigate the correlation between rain (measured using precipitation) and specific crimes. Unlike temperature, which is fairly consistent over a period of time, precipitation is very erratic, usually lasting less than a day. To better understand its effect on crime, I first plotted the number of records against daily precipitation.

A quick look at crime during Q2 of 2012 shows large dips in crime during high-precipitation periods. However, there also seem to be large dips in crime that weren't associated with higher precipitation. This required further investigation.

[10]

Moreover, looking at specific crimes was limiting because of the low counts once the dataset was stretched across specific days of the year. In an attempt to better understand the data, I plotted crimes against precipitation bins.

This is similar to Visualization [9], just inverted, with crimes on the x-axis and precipitation bins on the y-axis.

[11]

Visualizations [11] and [12] show a lower incidence of thefts on rainy days, and of robbery on the rainiest days.

Interestingly, homicides make up an increasing percentage of crime in higher precipitation bins. However, the lack of data is a concern. (Not many homicides occur, particularly on days with high precipitation, so the evidence is not strong enough to conclude that homicides are more prevalent when precipitation is high.)

[12]

Snow doesn't do much to stop criminals.

It seems that in winter months, crimes like thefts take a dip. We've investigated temperature's role in this, but could it be snow? Would there be a similar effect?

[13]

[14]

Committing Crimes Can Be A Full-time Job.

Unemployment, much like temperature, is cyclical throughout the year. Is it correlated with crimes throughout the year?

We can see the seasonal unemployment that causes the hikes and drops every year, but we can also see the effects of the '08 recession.

[15]

Crimes that show a decreased prevalence during times of higher unemployment include vagrancy, weapon violations, residential, 

Conversely, homicides, while dropping in number, showed a higher prevalence during times of low unemployment. As expected, fraud and embezzlement also increased in percentage during times of low unemployment.

[16]

It could be that companies coming here will have spillover effects in the domain of crime as well; we'll just have to see. For now, crime in Philly has certainly dropped over the past few years. Let's hope we've gained some insight into how to drive it down even further. I hope this data visualization project has enabled you to stay a little safer! Cheers!

Imran Idzqandar is a Sophomore studying Business Analytics and Behavioral Economics at the Wharton School, University of Pennsylvania.

 

This was a project for Prasanna Tambe's class, OIDD 245: Analytics & The Digital Economy. The focus of this project was exploratory data visualization.