Can I Show The Different Methods Of Death Penalties As Well As Predict Future Years
I would like to be able to predict the rise/fall in death penalties for this dataset below This is USA 1976 death penalty data found at: https://www.kaggle.com/usdpic/execution-dat
Solution 1:
It sounds like what you want is to reshape your data so that you have a time series for each "method", which you can then use in a predictive model. It's probably worth pointing out that the distribution of "Method" is really skewed (values are from 1999 onwards), so it will be very difficult/impossible to forecast most of them:
df['Method'].value_counts()
# Lethal Injection 923# Electrocution 17# Gas Chamber 1# Firing Squad 1
Here is a solution that will help you reshape your data to get time series data for each "Method" (I've added a bit more of an explanation at the end):
df['Date'] = pd.to_datetime(df['Date'])
df = df[df['Date'].dt.year >= 1999]
df = df.set_index('Date')
df2 = df.groupby('Method').resample('1M').agg('count')['Name'].to_frame()
df2 = df2.reset_index().pivot(index='Date',columns='Method',values='Name').fillna(0)
df2.plot()
We can check that the new shape of the data gives us the correct number of "Method" counts:
df2.sum()
# Method# Electrocution 17.0# Firing Squad 1.0# Gas Chamber 1.0# Lethal Injection 923.0
Explained
df['Date'] = pd.to_datetime(df['Date'])
# Filter out rows where date values where the year is less than 1999df = df[df['Date'].dt.year >= 1999]
# Set the index to be the datetimedf = df.set_index('Date')
# This bit gets interesting - we're grouping by each method and then resampling# within each group so that we get a row per month, where each month now has a# count of all the previous rows associated with that month. As the dataframe is# now filled with the same count value for each column, we arbitrarily take the # first one which is 'Name'# Note: you can change the resampling frequency to any time period you want, # I've just chosen month as it is granular enough to cover the whole period
df2 = df.groupby('Method').resample('1M').agg('count')['Name'].to_frame()
# Name# Method Date # Electrocution 1999-06-30 1# 1999-07-31 1# 1999-08-31 1# 1999-09-30 0# 1999-10-31 0# ... ...# Lethal Injection 2016-08-31 0# 2016-09-30 0# 2016-10-31 2# 2016-11-30 1# 2016-12-31 2
df2 = df2.reset_index().pivot(index='Date',columns='Method',values='Name').fillna(0)
# Method Electrocution Firing Squad Gas Chamber Lethal Injection# Date # 1999-01-31 0.0 0.0 0.0 10.0# 1999-02-28 0.0 0.0 0.0 12.0# 1999-03-31 0.0 0.0 1.0 7.0# 1999-04-30 0.0 0.0 0.0 10.0# 1999-05-31 0.0 0.0 0.0 6.0# ... ... ... ... ...# 2016-08-31 0.0 0.0 0.0 0.0# 2016-09-30 0.0 0.0 0.0 0.0# 2016-10-31 0.0 0.0 0.0 2.0# 2016-11-30 0.0 0.0 0.0 1.0# 2016-12-31 0.0 0.0 0.0 2.0
Post a Comment for "Can I Show The Different Methods Of Death Penalties As Well As Predict Future Years"