How To Create A New Column For Transposed Data
I'm attempting to transpose a row into a new column using the pandas dataframe. Visit ID is the unique identifier. I used df.pivot and df.melt but df.melt seems to do the opposite.
Solution 1:
You can use merge
:
out = pd.merge(df[df['Primary or Secondary'] == 'Primary'],
df[df['Primary or Secondary'] == 'Secondary'],
on='Visit ID', suffixes=('', '2'))
The rest is just reformatting:
out = out[['Visit ID', 'DX Code', 'DX Code2', 'Insurance', 'Insurance2']] \
.rename(columns={'Insurance': 'Primary', 'Insurance2': 'Secondary'})
>>>df
Visit ID DX Code DX Code2 Primary Secondary
0 1 123 234 Aetna Affinity
1 2 789 456 Medicare VNS
Solution 2:
You can use datar
, which uses pandas as backend but implements dplyr
-like syntax:
>>> from datar.allimport c, f, tribble, tibble, rep, paste0, pivot_wider
>>> >>> df = tribble(
... f.Visit_ID, f.DX_Code, f.Insurance, f.Primary_or_Secondary,
... 1, 123, "Aetna", "Primary",
... 1, 234, "Affinity", "Secondary",
... 2, 456, "VNS", "Secondary",
... 2, 789, "Medicare", "Primary",
... )
>>> df
Visit_ID DX_Code Insurance Primary_or_Secondary
<int64> <int64> <object> <object>
01123 Aetna Primary
11234 Affinity Secondary
22456 VNS Secondary
32789 Medicare Primary
>>> # Create a new df with names and values>>> df2 = tibble(
... Visit_ID=rep(df.Visit_ID, 2),
... name=c(paste0("DX Code", rep(c("", "2"), 2)), df.Primary_or_Secondary),
... value=c(df.DX_Code, df.Insurance)
... )
>>> >>> df2
Visit_ID name value
<int64> <object> <object>
01 DX Code 12311 DX Code2 23422 DX Code 45632 DX Code2 78941 Primary Aetna
51 Secondary Affinity
62 Secondary VNS
72 Primary Medicare
>>> df2 >> pivot_wider()
Visit_ID DX Code DX Code2 Primary Secondary
<int64> <object> <object> <object> <object>
01123234 Aetna Affinity
12456789 Medicare VNS
Disclaimer: I am the author of the datar package.
Post a Comment for "How To Create A New Column For Transposed Data"