Python Pandas Conditional Replace String Based On Column Values
Given these data frames...: DF = pd.DataFrame({'COL1': ['A', 'B', 'C', 'D','D','D'], 'COL2': [11032, 1960, 11400, 11355, 8, 7], 'year': ['20
Solution 1:
This looks like you want to update DF with data from DF2. Assuming that DF2 has at most one row for any given pair of values in ColX and ColY:
DF = DF.merge(DF2.set_index(['ColX', 'ColY'])[['ColZ']],
              how='left',
              left_on=['COL1', 'year'],
              right_index=True)
DF.COL2.update(DF.ColZ)
del DF['ColZ']
>>> DF
  COL1   COL2  year
0    A  11032  2016
1    B   1960  2017
2    C  11400  2018
3    D  11355  2019
4    D      8  2020
5    D    100  2021
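The uniqueness assumption stated before the code can be checked up front. Here is a minimal sketch, assuming a hypothetical DF2 (the question's actual data is truncated here) and using DataFrame.duplicated on the two key columns. Keeping the last row per pair is just one possible policy, not something the answer prescribes.

```python
import pandas as pd

# Hypothetical lookup table; the real DF2 from the question is not shown in full.
DF2 = pd.DataFrame({'ColX': ['A', 'D', 'D'],
                    'ColY': ['2016', '2021', '2021'],
                    'ColZ': [1, 100, 200]})

# True for every row whose (ColX, ColY) pair already appeared in an earlier row.
dup_mask = DF2.duplicated(subset=['ColX', 'ColY'])
has_duplicates = bool(dup_mask.any())

# One possible policy (an assumption): keep only the last row for each pair.
DF2_unique = DF2.drop_duplicates(subset=['ColX', 'ColY'], keep='last')
```

If duplicate pairs are left in place, the left merge returns one output row per matching DF2 row, so DF would grow.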
I merge a temporary dataframe (DF2.set_index(['ColX', 'ColY'])[['ColZ']]) into DF, which adds the values from ColZ wherever its index (ColX and ColY) matches the values of COL1 and year in DF. All non-matching rows are filled with NA.
I then use update to overwrite the values in DF.COL2 with the non-null values in DF.ColZ.
I then delete DF['ColZ'] to clean up.
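The three steps described above can be sketched end to end as follows. The data is an assumption: DF follows the question as far as it is visible (the year strings are guessed from the printed output), and DF2 is a hypothetical one-row table chosen so that it reproduces that output. The sketch also swaps the in-place update call for fillna plus an explicit assignment, so that it does not depend on whether DF.COL2 is a view of DF in the installed pandas version.

```python
import pandas as pd

# Assumed data: years guessed from the printed output, DF2 is hypothetical.
DF = pd.DataFrame({'COL1': ['A', 'B', 'C', 'D', 'D', 'D'],
                   'COL2': [11032, 1960, 11400, 11355, 8, 7],
                   'year': ['2016', '2017', '2018', '2019', '2020', '2021']})
DF2 = pd.DataFrame({'ColX': ['D'], 'ColY': ['2021'], 'ColZ': [100]})

# Step 1: left-merge the lookup; rows without a match get NaN in ColZ.
lookup = DF2.set_index(['ColX', 'ColY'])[['ColZ']]
merged = DF.merge(lookup, how='left', left_on=['COL1', 'year'], right_index=True)
n_missing = int(merged['ColZ'].isna().sum())

# Step 2: take ColZ where it is present, otherwise keep the old COL2 value.
merged['COL2'] = merged['ColZ'].fillna(merged['COL2']).astype(DF['COL2'].dtype)

# Step 3: drop the helper column.
result = merged.drop(columns='ColZ')
```

Only the row for D and 2021 has a match in this DF2, so it is the only COL2 value that changes.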
If ColZ matches an existing column name in DF, the merge would add suffixes to the clashing columns, so you would need to adjust the code (for example, by renaming ColZ before merging).
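One possible adjustment for a name clash is to rename the value column before merging. The sketch below assumes a hypothetical DF2 whose value column is also called COL2, and uses a made-up temporary name, _replacement; neither comes from the original answer.

```python
import pandas as pd

# Assumed data; here the lookup's value column clashes with DF's COL2.
DF = pd.DataFrame({'COL1': ['A', 'B', 'C', 'D', 'D', 'D'],
                   'COL2': [11032, 1960, 11400, 11355, 8, 7],
                   'year': ['2016', '2017', '2018', '2019', '2020', '2021']})
DF2 = pd.DataFrame({'ColX': ['D'], 'ColY': ['2021'], 'COL2': [100]})

# Rename the clashing column to a temporary (hypothetical) name before merging,
# so the merge does not produce COL2_x / COL2_y.
lookup = (DF2.set_index(['ColX', 'ColY'])[['COL2']]
             .rename(columns={'COL2': '_replacement'}))
merged = DF.merge(lookup, how='left', left_on=['COL1', 'year'], right_index=True)

# Prefer the replacement where present, then drop the helper column.
merged['COL2'] = merged['_replacement'].fillna(merged['COL2']).astype(DF['COL2'].dtype)
result = merged.drop(columns='_replacement')
```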
An alternative solution is as follows:
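DF = DF.set_index(['COL1', 'year'])
# update() works in place and returns None, and it matches columns by name
DF.update(DF2.set_index(['ColX', 'ColY']).rename(columns={'ColZ': 'COL2'}))
DF.reset_index(inplace=True)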
The resulting values are the same as above, although reset_index places COL1 and year before COL2.
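A self-contained sketch of this index-aligned approach, again with assumed data (a hypothetical one-row DF2). It relies on two documented behaviours of DataFrame.update: it modifies the caller in place and returns None, and it aligns on column labels as well as on the index, which is why ColZ is renamed to COL2 here.

```python
import pandas as pd

# Assumed data: years guessed from the printed output, DF2 is hypothetical.
DF = pd.DataFrame({'COL1': ['A', 'B', 'C', 'D', 'D', 'D'],
                   'COL2': [11032, 1960, 11400, 11355, 8, 7],
                   'year': ['2016', '2017', '2018', '2019', '2020', '2021']})
DF2 = pd.DataFrame({'ColX': ['D'], 'ColY': ['2021'], 'ColZ': [100]})

# Put the key columns into the index on both sides.
left = DF.set_index(['COL1', 'year'])
right = DF2.set_index(['ColX', 'ColY']).rename(columns={'ColZ': 'COL2'})

# update() changes `left` in place; its return value is None.
returned = left.update(right)

# Move the keys back to ordinary columns (they now come before COL2).
result = left.reset_index()
```

Because update returns None, assigning its result back to DF would discard the frame.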