How To Ignore Null Values In Data Frame And Build A New Data Frame Based On A Column

How do I ignore null and None values in a data frame and build a new data frame with one row per id?

id  A       B       C
A   []      []      []
A   [aaaa]  None    []
A   []      [bbbb]  None
A

Solution 1:

If every column other than id holds lists and None values (NoneType), set id as the index, take the first element of each list by indexing with str[0], replace None with NaN, and finally aggregate with GroupBy.first, which returns the first non-missing value per group:

# applymap is deprecated since pandas 2.1; DataFrame.map is the replacement
print(df.applymap(type))
              id                   A                   B                   C
0  <class 'int'>      <class 'list'>      <class 'list'>      <class 'list'>
1  <class 'int'>      <class 'list'>  <class 'NoneType'>      <class 'list'>
2  <class 'int'>      <class 'list'>      <class 'list'>  <class 'NoneType'>
3  <class 'int'>      <class 'list'>      <class 'list'>      <class 'list'>
4  <class 'int'>  <class 'NoneType'>      <class 'list'>      <class 'list'>
5  <class 'int'>      <class 'list'>      <class 'list'>      <class 'list'>
6  <class 'int'>      <class 'list'>  <class 'NoneType'>      <class 'list'>
7  <class 'int'>      <class 'list'>  <class 'NoneType'>  <class 'NoneType'>
8  <class 'int'>      <class 'list'>      <class 'list'>      <class 'list'>
9  <class 'int'>  <class 'NoneType'>      <class 'list'>  <class 'NoneType'>

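import numpy as np

df1 = (df.set_index('id')
         .apply(lambda x: x.str[0])          # first element of each list; [] and None become missing
         .mask(lambda x: x.isna(), np.nan)   # normalize every missing value to NaN
         .groupby('id')
         .first())                           # first non-missing value per id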
print(df1)
       A     B      C
id
1   aaaa  bbbb  ccccc
2   xxxx  yyyy   zzzz
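
As a self-contained sketch of the same idea, the snippet below builds a small frame and collapses it to one row per id. The sample rows and the helper name first_item are assumptions made up for illustration (the original frame is not shown in full), and a plain Python helper stands in for str[0] so that the handling of None and empty lists is explicit.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: an assumption, chosen so that every id has
# one non-empty list per column scattered across several rows.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "A": [[], ["aaaa"], [], ["xxxx"], None, []],
    "B": [[], None, ["bbbb"], [], ["yyyy"], None],
    "C": [["ccccc"], [], None, None, [], ["zzzz"]],
})


def first_item(cell):
    """Return the first element of a non-empty list, otherwise NaN."""
    if isinstance(cell, list) and cell:
        return cell[0]
    return np.nan


value_cols = ["A", "B", "C"]

# Flatten each cell to a scalar (or NaN), column by column.
flat = df[value_cols].apply(lambda col: col.map(first_item))

# GroupBy.first keeps the first non-missing value per id and column.
result = flat.groupby(df["id"]).first()
print(result)
```

If an id has several non-empty lists in the same column, only the first one in row order survives, so this suits data where each id carries at most one real value per column.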

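Another idea, which only works when each id has at most one non-empty list per column (otherwise unstack fails on duplicate index entries):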

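df1 = (df.set_index('id')
         .applymap(lambda x: np.nan if x == [] else x)  # empty lists -> NaN
         .stack()                                       # long format; missing values are dropped (dropna=True)
         .unstack()                                     # back to wide: one row per id
         .apply(lambda x: x.str[0])                     # first element of each list
       )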
print(df1)
       A     B      C
id
1   aaaa  bbbb  ccccc
2   xxxx  yyyy   zzzz
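
The reshape approach can be tried end to end with the sketch below. The sample rows and the helper list_or_nan are hypothetical, and the explicit dropna() is an addition that is not in the answer above; it is there so the uniqueness check does not depend on whether stack() drops missing values on its own.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption): exactly one non-empty list
# per id and column, which is what the stack/unstack round trip needs.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "A": [[], ["aaaa"], [], ["xxxx"], None, []],
    "B": [[], None, ["bbbb"], [], ["yyyy"], None],
    "C": [["ccccc"], [], None, None, [], ["zzzz"]],
})


def list_or_nan(cell):
    """Keep non-empty lists; turn None and empty lists into NaN."""
    return cell if isinstance(cell, list) and cell else np.nan


# Long format: one entry per (id, column) pair that holds a real list.
stacked = (df.set_index("id")
             .apply(lambda col: col.map(list_or_nan))
             .stack()
             .dropna())

# unstack() raises on repeated (id, column) pairs, so check first.
assert not stacked.index.duplicated().any()

# Back to wide format, then take the first element of each list.
wide = stacked.unstack().apply(lambda col: col.str[0])
print(wide)
```

When the uniqueness check fails, the GroupBy.first variant is the safer choice, because it picks one value per group instead of raising.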
