Align Data In One Column With Another Row, Based On The Last Time Some Condition Was True
I’m trying to parse millions of lines of log files that suffer from an unfortunate deficiency. Data relating to a single event can be split across log entries but there is no dir
Solution 1:
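IIUC, you can do this with two masks. The first mask is True on the rows that currently hold the Iterations value, i.e. the rows where thing_I_care_about, A, B and C are all null. The second mask is True on the rows you want the Iterations value moved to, i.e. the rows where those columns are all non-null. Group on the reversed cumulative sum of the first mask, so that each group ends at a row holding a value, broadcast the last Iterations value of each group to every row in that group, and then use where with the second mask to keep the value only on the target rows.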
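# Rows that currently hold the Iterations value: all other columns are null.
mask = (df['thing_I_care_about'].isnull() &
        df['A'].isnull() &
        df['B'].isnull() &
        df['C'].isnull())
# Rows the value should move to: all columns are populated.
fmask = (df['thing_I_care_about'].notnull() &
         df['A'].notnull() &
         df['B'].notnull() &
         df['C'].notnull())
# Reversed cumsum labels each block of rows ending at a holder row;
# take the last value in each block and keep it only where fmask is True.
df.assign(Iterations=df.groupby(mask[::-1].cumsum())['Iterations']
                       .transform(lambda x: x.iloc[-1])
                       .where(fmask))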
Output:
  thing_I_care_about  thread_num    A    B    C  Iterations
0            thing_1           2    X    X    X       110.0
1                NaN           2    X    X  NaN         NaN
2            thing_2           3  NaN    X    X         NaN
3                NaN           2  NaN  NaN  NaN         NaN
4            thing_3           7    X    X    X       150.0
5            thing_4           5    X    X  NaN         NaN
6                NaN           7  NaN  NaN  NaN         NaN
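To try the approach end to end, here is a minimal self-contained sketch. The input frame is an assumption: it is reconstructed from the output above, with the Iterations values placed on the all-null rows (3 and 6), since the original input is not shown in the answer. The `cols`, `groups`, `carried` and `result` names are only for illustration, and the `sort_index()` call is added here to make the index alignment explicit.

```python
import numpy as np
import pandas as pd

# Hypothetical input, reconstructed from the output shown above
# (an assumption; the original log data is not part of the answer).
df = pd.DataFrame({
    "thing_I_care_about": ["thing_1", np.nan, "thing_2", np.nan,
                           "thing_3", "thing_4", np.nan],
    "thread_num": [2, 2, 3, 2, 7, 5, 7],
    "A": ["X", "X", np.nan, np.nan, "X", "X", np.nan],
    "B": ["X", "X", "X", np.nan, "X", "X", np.nan],
    "C": ["X", np.nan, "X", np.nan, "X", np.nan, np.nan],
    "Iterations": [np.nan, np.nan, np.nan, 110.0, np.nan, np.nan, 150.0],
})

cols = ["thing_I_care_about", "A", "B", "C"]
mask = df[cols].isnull().all(axis=1)    # rows currently holding Iterations
fmask = df[cols].notnull().all(axis=1)  # rows that should receive it

# Reversed cumulative sum: every row gets the label of the next holder row
# at or below it. sort_index() restores the original row order.
groups = mask[::-1].cumsum().sort_index()

# Broadcast the last Iterations value of each group to the whole group,
# then blank it out everywhere except the target rows.
carried = df.groupby(groups)["Iterations"].transform(lambda x: x.iloc[-1])
result = df.assign(Iterations=carried.where(fmask))
print(result)
```

Note that this groups purely by row order and never consults thread_num, so it assumes the row holding the value is the next all-null row after the row it belongs to.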