How To Take Items In An Index As Columns In Pandas
Solution 1:
You could create a grouping variable that marks where each record starts, then reshape using pivot:
df.assign(grp=df.iloc[:,0].str.contains('address').cumsum()).pivot(index='grp', columns='INDEX', values='INFO')
Out:
INDEX address name phone type website
grp
1 2. 123 APPLE STREET APPLE STORE 555-5555 BUSINESS APPLE.COM
2 456 peach ave PEACH STORE 777-7777 BUSINESS PEACH.COM
3 789 banana rd banana store 999-9999 BUSINESS NaN
This assumes your df looks like this:
INDEX INFO
0 address 2. 123 APPLE STREET
1 phone 555-5555
2 name APPLE STORE
3 website APPLE.COM
4 type BUSINESS
5 address 456 peach ave
6 phone 777-7777
7 name PEACH STORE
8 website PEACH.COM
9 type BUSINESS
10 address 789 banana rd
11 phone 999-9999
12 name banana store
13 type BUSINESS
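For anyone who wants to run this end to end, here is a minimal sketch of the same idea. It assumes INDEX is a regular column, as in the frame above; the sample values are retyped from the example (with the first address written as 123 APPLE STREET), and the names grp and wide are only illustrative.

```python
import pandas as pd

# Sample data retyped from the example frame; INDEX is assumed to be a regular column.
df = pd.DataFrame({
    "INDEX": ["address", "phone", "name", "website", "type",
              "address", "phone", "name", "website", "type",
              "address", "phone", "name", "type"],
    "INFO": ["123 APPLE STREET", "555-5555", "APPLE STORE", "APPLE.COM", "BUSINESS",
             "456 peach ave", "777-7777", "PEACH STORE", "PEACH.COM", "BUSINESS",
             "789 banana rd", "999-9999", "banana store", "BUSINESS"],
})

# Every 'address' row starts a new record, so a running count of
# those rows labels the records 1, 2, 3.
grp = df["INDEX"].eq("address").cumsum()

# One row per record, one column per distinct INDEX label.
wide = df.assign(grp=grp).pivot(index="grp", columns="INDEX", values="INFO")
print(wide)
```

Because the grouping key comes from the position of the 'address' rows, a record that lacks a field (such as the missing website in the last one) gets NaN in that column and does not disturb the other records. This does rely on every record starting with an 'address' row.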
Solution 2:
This is a pivoting problem, and I use unstack for it. Going by your comment, I guess your INDEX isn't a column but the index of the dataframe, so I changed the code accordingly.
s = df.groupby('INDEX').cumcount()
df_out = df.set_index(s, append=True).INFO.unstack(0, fill_value='None')
Out[111]:
INDEX address name phone type website
0 123 APPLE STREET APPLE STORE 555-5555 BUSINESS APPLE.COM
1 456 peach ave PEACH STORE 777-7777 BUSINESS PEACH.COM
2 789 banana rd banana store 999-9999 BUSINESS None
Note: since you want None, I fill NaN with the string 'None'. If you want Python's None, you should just leave it as NaN, because pandas treats both as missing values. If you want NaN, take out the option fill_value='None'.
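A self-contained sketch of this approach, assuming INDEX really is the index of the dataframe. The sample values are retyped from the question's data, and the names occurrence and wide are only illustrative. It leaves the missing value as NaN rather than filling it.

```python
import pandas as pd

labels = ["address", "phone", "name", "website", "type",
          "address", "phone", "name", "website", "type",
          "address", "phone", "name", "type"]
values = ["123 APPLE STREET", "555-5555", "APPLE STORE", "APPLE.COM", "BUSINESS",
          "456 peach ave", "777-7777", "PEACH STORE", "PEACH.COM", "BUSINESS",
          "789 banana rd", "999-9999", "banana store", "BUSINESS"]

# INDEX is the (non-unique) index here, not a column.
df = pd.DataFrame({"INFO": values}, index=pd.Index(labels, name="INDEX"))

# Number each occurrence of a label: the first 'address' gets 0, the second 1, and so on.
occurrence = df.groupby(level="INDEX").cumcount()

# Append the occurrence number as a second index level,
# then move the INDEX level (level 0) into the columns.
wide = df.set_index(occurrence, append=True)["INFO"].unstack(0)
print(wide)
```

One caveat with counting occurrences: rows are matched purely by how many times a label has been seen so far. Here the missing website belongs to the last record, so everything lines up. If a record in the middle lacked a field, the later values for that field would shift up by one row.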
Solution 3:
This should do the trick:
import pandas as pd
INDEX = ['address', 'phone', 'name', 'website', 'type', 'address', 'phone', 'name', 'website', 'type', 'address', 'phone', 'name', 'type']
INFO = ['123 APPLE STREET', '555-5555', 'APPLE STORE', 'APPLE.COM', 'BUSINESS', '456 peach ave', '777-7777', 'PEACH STORE', 'PEACH.COM', 'BUSINESS', '789 banana rd', '999-9999', 'banana store', 'BUSINESS']
df = pd.DataFrame(index=INDEX, data=INFO, columns=['INFO'])
df.index.name = 'INDEX'
df2 = df.groupby('INDEX').agg(INFO=('INFO', list))
pd.DataFrame(df2['INFO'].to_list(), index=df2.index).transpose()
Here's the output you get:
Out[132]:
INDEX address name phone type website
0 123 APPLE STREET APPLE STORE 555-5555 BUSINESS APPLE.COM
1 456 peach ave PEACH STORE 777-7777 BUSINESS PEACH.COM
2 789 banana rd banana store 999-9999 BUSINESS None
Solution 4:
I figured out the issue. Most of the answers can accomplish this task; however, there was a bug in my dataframe. It kept giving me an error about a list no matter what I did, so I did something unorthodox in Python: I saved the PDF as an Excel sheet and read it back into a pandas dataframe. Once I did that, the traceback disappeared. Weird, huh? The bigger question is how to prevent it from happening. But thank you for all your responses.