How To Take Items In An Index As Columns In Pandas
Solution 1:
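You could create a grouping variable that marks where each record starts, then reshape using pivot. Pass the arguments by keyword, since recent pandas versions no longer accept them positionally:
df.assign(grp=df.iloc[:,0].str.contains('address').cumsum()).pivot(index='grp', columns='INDEX', values='INFO')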
Out:
INDEX              address          name     phone      type    website
grp
1      2. 123 APPLE STREET   APPLE STORE  555-5555  BUSINESS  APPLE.COM
2            456 peach ave   PEACH STORE  777-7777  BUSINESS  PEACH.COM
3            789 banana rd  banana store  999-9999  BUSINESS        NaN
This assumes your df looks like this:
      INDEX                 INFO
0   address  2. 123 APPLE STREET
1     phone             555-5555
2      name          APPLE STORE
3   website            APPLE.COM
4      type             BUSINESS
5   address        456 peach ave
6     phone             777-7777
7      name          PEACH STORE
8   website            PEACH.COM
9      type             BUSINESS
10  address        789 banana rd
11    phone             999-9999
12     name         banana store
13     type             BUSINESS
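For anyone who wants to run this end to end, here is a minimal sketch of the grouping-variable approach. It assumes INDEX and INFO are ordinary columns and that every record begins with an 'address' row; the sample values are borrowed from the lists in Solution 3, and the names grp and wide are only illustrative.

```python
import pandas as pd

# Sample data in long form: one (field, value) pair per row.
df = pd.DataFrame({
    "INDEX": ["address", "phone", "name", "website", "type",
              "address", "phone", "name", "website", "type",
              "address", "phone", "name", "type"],
    "INFO": ["123 APPLE STREET", "555-5555", "APPLE STORE", "APPLE.COM", "BUSINESS",
             "456 peach ave", "777-7777", "PEACH STORE", "PEACH.COM", "BUSINESS",
             "789 banana rd", "999-9999", "banana store", "BUSINESS"],
})

# Assumption: each 'address' row starts a new record, so the running
# count of 'address' rows labels every row with its record number.
grp = df["INDEX"].eq("address").cumsum()

# Reshape to one row per record and one column per field.
wide = df.assign(grp=grp).pivot(index="grp", columns="INDEX", values="INFO")
print(wide)
```

Because the record number comes from the position of the 'address' rows, a field that is missing from a record (here, the third store's website) simply ends up as a missing value in that record's row.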
Solution 2:
This is a pivot operation, so I use unstack. Based on your comment, I assume INDEX isn't a column but the index of the dataframe, so I changed the code accordingly.
s = df.groupby('INDEX').cumcount()
df_out = df.set_index(s, append=True).INFO.unstack(0, fill_value='None')
Out[111]:
INDEX           address          name     phone      type    website
0      123 APPLE STREET   APPLE STORE  555-5555  BUSINESS  APPLE.COM
1         456 peach ave   PEACH STORE  777-7777  BUSINESS  PEACH.COM
2         789 banana rd  banana store  999-9999  BUSINESS       None
Note: since you want None, I fill the missing value with the string 'None'. If you want Python's None instead, just leave it as NaN, because pandas treats both as missing values. If you want NaN, remove the fill_value='None' option.
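To make the difference between NaN, Python's None, and the string 'None' concrete, here is a small sketch under the same assumption that INDEX is the dataframe's index. It uses groupby(level="INDEX"), which is just a more explicit spelling of the grouping above, and it omits fill_value so the gap stays a real missing value; the variable names are illustrative.

```python
import pandas as pd

labels = ["address", "phone", "name", "website", "type",
          "address", "phone", "name", "website", "type",
          "address", "phone", "name", "type"]
values = ["123 APPLE STREET", "555-5555", "APPLE STORE", "APPLE.COM", "BUSINESS",
          "456 peach ave", "777-7777", "PEACH STORE", "PEACH.COM", "BUSINESS",
          "789 banana rd", "999-9999", "banana store", "BUSINESS"]
df = pd.DataFrame({"INFO": values}, index=pd.Index(labels, name="INDEX"))

# Number each occurrence of a label: first 'address' is 0, second is 1, ...
s = df.groupby(level="INDEX").cumcount()

# Add the counter as a second index level, then move INDEX into the columns.
wide = df.set_index(s, append=True)["INFO"].unstack(0)

# The missing website is a real missing value that pandas can detect.
print(wide.isna().sum())

# Replacing it with the string 'None' turns it into ordinary text.
filled = wide.fillna("None")
print(pd.isna(None), pd.isna(float("nan")), pd.isna("None"))
```

The last line shows why the string version is usually the least convenient: pd.isna reports both None and NaN as missing, but not the text 'None', so later calls such as dropna or fillna will not see it.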
Solution 3:
This should do the trick:
import pandas as pd
INDEX = ['address', 'phone', 'name', 'website', 'type', 'address', 'phone', 'name', 'website', 'type', 'address', 'phone', 'name', 'type']
INFO = ['123 APPLE STREET', '555-5555', 'APPLE STORE', 'APPLE.COM', 'BUSINESS', '456 peach ave', '777-7777', 'PEACH STORE', 'PEACH.COM', 'BUSINESS', '789 banana rd', '999-9999', 'banana store', 'BUSINESS']
df = pd.DataFrame(index=INDEX, data=INFO, columns=['INFO'])
df.index.name = 'INDEX'
df2 = df.groupby('INDEX').agg(INFO=('INFO', list))
pd.DataFrame(df2['INFO'].to_list(), index=df2.index).transpose()
Here's the output you get:
Out[132]:
INDEX           address          name     phone      type    website
0      123 APPLE STREET   APPLE STORE  555-5555  BUSINESS  APPLE.COM
1         456 peach ave   PEACH STORE  777-7777  BUSINESS  PEACH.COM
2         789 banana rd  banana store  999-9999  BUSINESS       None
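One thing worth checking before relying on this: collecting each label's values into a list and transposing lines the values up by position, not by record. The sketch below uses made-up data (hypothetical addresses and sites, not from the question) in which the second record has no website, to show what that means in practice.

```python
import pandas as pd

# Hypothetical data: the SECOND record has no 'website' row.
labels = ["address", "website", "address", "address", "website"]
values = ["1 A St", "A.COM", "2 B St", "3 C St", "C.COM"]
df = pd.DataFrame({"INFO": values}, index=pd.Index(labels, name="INDEX"))

# Same idea as above: one list of values per label, then transpose.
lists = df.groupby(level="INDEX")["INFO"].agg(list)
wide = pd.DataFrame(lists.to_list(), index=lists.index).transpose()
print(wide)

# The shorter 'website' list is padded at the end, so C.COM lands on
# row 1 (next to '2 B St') and row 2 is left empty.
print(wide.loc[1, "website"])
```

In the question's data the only gap is in the last record, so the padding happens to fall in the right place. If a gap can occur in the middle, an approach that labels records explicitly, such as the grouping variable in Solution 1, is probably the safer choice.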
Solution 4:
I figured out the issue. Most of the answers can accomplish this task; however, there was a bug in the dataframe. It kept giving me an error about a list no matter what I did, so I did something unorthodox: I saved the PDF as an Excel sheet and read it back into a pandas dataframe. Once I did that, the traceback disappeared. Weird, huh? The bigger question is how to prevent it from happening. But thank you for all your responses.