How To Ignore Null Values In Data Frame And Build A New Data Frame Based On A Column
How do I ignore null and None values in a data frame, grouped by id, and build a new data frame from the values that remain?

id  A       B       C
A   []      []      []
A   [aaaa]  None    []
A   []      [bbbb]  None
A
Solution 1:
If every column other than id holds lists and None values (NoneType), set id as the index, take the first value of each list by indexing with str[0], replace None with NaN, and finally aggregate with GroupBy.first:
print (df.applymap(type))
              id                   A                   B                   C
0  <class 'int'>      <class 'list'>      <class 'list'>      <class 'list'>
1  <class 'int'>      <class 'list'>  <class 'NoneType'>      <class 'list'>
2  <class 'int'>      <class 'list'>      <class 'list'>  <class 'NoneType'>
3  <class 'int'>      <class 'list'>      <class 'list'>      <class 'list'>
4  <class 'int'>  <class 'NoneType'>      <class 'list'>      <class 'list'>
5  <class 'int'>      <class 'list'>      <class 'list'>      <class 'list'>
6  <class 'int'>      <class 'list'>  <class 'NoneType'>      <class 'list'>
7  <class 'int'>      <class 'list'>  <class 'NoneType'>  <class 'NoneType'>
8  <class 'int'>      <class 'list'>      <class 'list'>      <class 'list'>
9  <class 'int'>  <class 'NoneType'>      <class 'list'>  <class 'NoneType'>
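import numpy as np

df1 = (df.set_index('id')
         .apply(lambda x: x.str[0])          # first element of each list
         .mask(lambda x: x.isna(), np.nan)   # normalize None to NaN
         .groupby('id')
         .first())                           # first non-missing value per id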
print (df1)
       A     B      C
id
1   aaaa  bbbb  ccccc
2   xxxx  yyyy   zzzz
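A minimal, self-contained sketch of Solution 1 follows. The sample data and the helper name first_value_per_id are hypothetical, since the question does not show the full frame. The sketch leaves out the mask step on the assumption that str[0] already returns NaN for both None and empty lists.

```python
import pandas as pd

# Hypothetical sample data: each cell is a list (possibly empty) or None.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "A": [[], ["aaaa"], [], ["xxxx"], None, []],
    "B": [[], None, ["bbbb"], [], ["yyyy"], None],
    "C": [["ccccc"], [], None, None, [], ["zzzz"]],
})


def first_value_per_id(frame, key="id"):
    """Collapse rows sharing `key`, keeping the first non-missing list element per column."""
    return (
        frame.set_index(key)
             .apply(lambda col: col.str[0])  # first list element; missing cells become NaN
             .groupby(level=key)
             .first()                        # first non-missing value in each group
    )


result = first_value_per_id(df)
print(result)
```

If an id has several non-empty lists in the same column, GroupBy.first keeps only the earliest one in row order.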
Another idea:
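df1 = (df.set_index('id')
         # turn empty lists into NaN so they count as missing
         .applymap(lambda x: np.nan if x == [] else x)
         # relies on stack() dropping NaN and None, its default in the pandas version used here
         .stack()
         # back to one row per id and one column per original column
         .unstack()
         # take the first element of each remaining list
         .apply(lambda x: x.str[0])
      )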
print (df1)
       A     B      C
id
1   aaaa  bbbb  ccccc
2   xxxx  yyyy   zzzz
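The stack/unstack reshape only works when each id has at most one non-empty list per column, because unstack cannot place two values in the same cell. The sketch below expresses the same idea in long format with melt and pivot. This is a swapped-in technique, not the answer's own code, and the sample data and the helper name is_filled are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: one non-empty list per id and column.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "A": [[], ["aaaa"], [], ["xxxx"], None, []],
    "B": [[], None, ["bbbb"], [], ["yyyy"], None],
    "C": [["ccccc"], [], None, None, [], ["zzzz"]],
})


def is_filled(cell):
    """True for a non-empty list, False for None or an empty list."""
    return isinstance(cell, list) and len(cell) > 0


# Long format: one row per (id, column) cell.
tidy = df.melt(id_vars="id", var_name="column", value_name="cell")
tidy = tidy[tidy["cell"].map(is_filled)].copy()  # keep only non-empty lists
tidy["value"] = tidy["cell"].str[0]              # first element of each list

# pivot raises ValueError if an (id, column) pair occurs more than once.
wide = tidy.pivot(index="id", columns="column", values="value")
print(wide)
```

When duplicates per id and column are possible, the GroupBy.first approach from Solution 1 is the safer choice, since it picks one value instead of failing.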