Pandas Groupby Date Range Depending On Each Row
I would like to do the following: for a data frame that looks like this: df = pd.DataFrame({'ID':['A', 'A', 'C' ,'B', 'B'], 'date':['06/24/2014','06/25/2014','06/23/2014','07/02/19
Solution 1:
You could sort the rows by date, then take the difference between consecutive dates.
Flag each row where that difference is greater than 2 days. Each flag marks the start of a new group, so taking the cumulative sum of the flags assigns the desired group numbers:
import pandas as pd
df = pd.DataFrame({"ID": ["A", "A", "C", "B", "B"], "date": ["06/24/2014", "06/25/2014", "06/23/2014", "07/02/1999", "07/02/1999"], "value": ["3", "5", "1", "7", "8"]})
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(by='date')
df['group'] = (df['date'].diff() > pd.Timedelta(days=2)).cumsum()
print(df)
yields
  ID       date value  group
3  B 1999-07-02     7      0
4  B 1999-07-02     8      0
2  C 2014-06-23     1      1
0  A 2014-06-24     3      1
1  A 2014-06-25     5      1
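If you then want to aggregate per group, the same idea can be wrapped in a small helper. This is only a sketch: the function name `assign_gap_groups`, its `max_gap_days` parameter, and the summary column names are made up for illustration, and `value` is stored as integers here (rather than the strings used above) so that it can be summed. A stable sort is requested so that rows sharing a date keep their original relative order.

```python
import pandas as pd


def assign_gap_groups(df, date_col="date", max_gap_days=2):
    """Label runs of rows whose consecutive dates are at most max_gap_days apart.

    Hypothetical helper, not part of pandas.
    """
    out = df.copy()
    out[date_col] = pd.to_datetime(out[date_col])
    # Stable sort: rows with equal dates keep their original order.
    out = out.sort_values(by=date_col, kind="mergesort")
    # diff() is NaT on the first row; NaT > Timedelta is False, so numbering starts at 0.
    starts_new_group = out[date_col].diff() > pd.Timedelta(days=max_gap_days)
    out["group"] = starts_new_group.cumsum()
    return out


df = pd.DataFrame({
    "ID": ["A", "A", "C", "B", "B"],
    "date": ["06/24/2014", "06/25/2014", "06/23/2014", "07/02/1999", "07/02/1999"],
    "value": [3, 5, 1, 7, 8],  # assumption: numeric values, so they can be summed
})

grouped = assign_gap_groups(df)

# One row per group: first date, last date, and the summed value.
summary = grouped.groupby("group").agg(
    start=("date", "min"),
    end=("date", "max"),
    total=("value", "sum"),
)
print(summary)
```

Raising or lowering `max_gap_days` changes how far apart two consecutive dates may be before a new group starts. Note that this groups on dates only; if the groups should also be separated by `ID`, the difference would have to be computed within each `ID` instead.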