Replace 1s in one hot columns with values from another column
I have a data frame that looks like this:
df = pd.DataFrame({"value": [4, 5, 3], "item1": [0, 1, 0], "item2": [1, 0, 0], "item3": [0, 0, 1]})
df
   value  item1  item2  item3
0      4      0      1      0
1      5      1      0      0
2      3      0      0      1
I want to replace each 1 in the one-hot encoded columns with the value from the "value" column of the same row, and then delete the "value" column. The resulting data frame should look like this:
df_out = pd.DataFrame({"item1": [0, 5, 0], "item2": [4, 0, 0], "item3": [0, 0, 3]})
   item1  item2  item3
0      0      4      0
1      5      0      0
2      0      0      3
Tags: python, pandas, dataframe
asked yesterday by Gorjan Radevski (new contributor), edited yesterday by coldspeed
I think this can be solved if you just use df["columNameToReplace"] = df["value"] and then delete the "value" column from the data frame?
– Vaibhav gusain
yesterday
4 Answers
Accepted answer (12 votes)
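Why not just multiply? Reshape the values into a column with [:, None] so that each row is scaled by its own value; a plain 1-D array would be matched against the columns instead.
df.pop('value').values[:, None] * df
   item1  item2  item3
0      0      4      0
1      5      0      0
2      0      0      3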
DataFrame.pop has the nice effect of removing a column in place and returning it, so you can do this in a single step.
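As a rough illustration of what pop does and why the direction of the multiplication matters, here is a minimal sketch. The two-row frame is a hypothetical example (not from the question), chosen so that rows and columns cannot be confused; `DataFrame.mul(..., axis=0)` is shown as an equivalent spelling of the row-wise product.

```python
import pandas as pd

# Hypothetical frame with 2 rows and 3 item columns
df = pd.DataFrame({"value": [4, 5], "item1": [0, 1], "item2": [1, 0], "item3": [0, 0]})

value = df.pop("value")  # returns the column and removes it from df in place
assert "value" not in df.columns

# Reshape to shape (2, 1) so each row is scaled by its own value
by_reshape = value.values[:, None] * df

# Equivalent: align the Series with the row index
by_mul = df.mul(value, axis=0)

print(by_reshape)
```

Both spellings should give the same frame; `mul(axis=0)` aligns on index labels, while the reshaped array relies purely on position.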
If the "item*" columns have anything besides 1 in them, then you can multiply with bools:
df.pop('value').values[:, None] * df.astype(bool)
   item1  item2  item3
0      0      4      0
1      5      0      0
2      0      0      3
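To make the bool variant concrete, the sketch below assumes a hypothetical frame whose indicator columns hold counts (2 and 3) instead of strict 0/1 flags. It uses `mul(axis=0)` for the row-wise product; that choice is an assumption of this example, not part of the answer.

```python
import pandas as pd

# Hypothetical frame whose indicator columns hold counts rather than strict 0/1
df = pd.DataFrame({"value": [4, 5, 3], "item1": [0, 2, 0], "item2": [3, 0, 0], "item3": [0, 0, 1]})

value = df.pop("value")

# astype(bool) maps every non-zero entry to True, so each marked cell
# becomes exactly the row's value rather than value * count
masked = df.astype(bool).mul(value, axis=0)
print(masked)
```

Without the cast, the cell holding 2 in the second row would come out as 10 rather than 5.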
If your DataFrame has other columns, then do this:
df
   value  name  item1  item2  item3
0      4  John      0      1      0
1      5  Mike      1      0      0
2      3  Stan      0      0      1
# cols = df.columns[df.columns.str.startswith('item')]
cols = df.filter(like='item').columns
df[cols] = df.pop('value').values[:, None] * df[cols]
df
   name  item1  item2  item3
0  John      0      4      0
1  Mike      5      0      0
2  Stan      0      0      3
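If you would rather not mutate the input, the same idea can be wrapped in a small helper. This is only a sketch: the function name `scale_one_hot` and its parameters are hypothetical, and it assumes string column names and that the one-hot columns share a common prefix.

```python
import pandas as pd

def scale_one_hot(df, value_col="value", prefix="item"):
    """Return a copy where one-hot columns starting with `prefix` carry the row's value."""
    out = df.drop(columns=value_col)              # new frame, input left untouched
    cols = [c for c in out.columns if c.startswith(prefix)]
    out[cols] = out[cols].mul(df[value_col], axis=0)  # row-wise scaling
    return out

df_in = pd.DataFrame({
    "value": [4, 5, 3],
    "name": ["John", "Mike", "Stan"],
    "item1": [0, 1, 0],
    "item2": [1, 0, 0],
    "item3": [0, 0, 1],
})
df_out = scale_one_hot(df_in)
print(df_out)
```

Columns that do not match the prefix, such as "name", pass through unchanged.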
answered yesterday by coldspeed (edited yesterday)
Most elegant answer so far
– horro
yesterday
I like it, but I should have been more specific in my question. Here is what my data frame actually looks like: df_in = pd.DataFrame({"value": [4, 5, 3], "name": ["John", "Mike", "Stan"], "item1": [0, 1, 0], "item2": [1, 0, 0], "item3": [0, 0, 1]})
And the output df should be: df_out = pd.DataFrame({"name": ["John", "Mike", "Stan"], "item1": [0, 5, 0], "item2": [4, 0, 0], "item3": [0, 0, 3]})
– Gorjan Radevski
yesterday
@GorjanRadevski Let me know if the edit does it for you.
– coldspeed
yesterday
I feel so stupid after seeing this answer :}
– Mohit Motwani
yesterday
That works! I have no idea why it worked on the sample data frame without the addition. Thank you!
– Gorjan Radevski
yesterday
Answer (1 vote)
You could do something like:
df = pd.DataFrame({'item1': df['value']*df['item1'], 'item2': df['value']*df['item2'], 'item3': df['value']*df['item3']})
EDIT:
As @coldspeed comments, this will not scale well to many columns, so it is better to loop over the columns:
cols = ['item1','item2','item3']
for c in cols:
    df[c] *= df['value']
df.drop('value',axis=1,inplace=True)
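One way the constructor-based idea could be kept while avoiding a hand-written list of columns is a dict comprehension. This is a sketch under the assumption that every column other than "value" is a one-hot column.

```python
import pandas as pd

df = pd.DataFrame({"value": [4, 5, 3], "item1": [0, 1, 0], "item2": [1, 0, 0], "item3": [0, 0, 1]})

# Collect the indicator columns by name instead of listing them by hand
item_cols = [c for c in df.columns if c != "value"]

# A dict keeps each product as a column; a list of Series would become rows
result = pd.DataFrame({c: df["value"] * df[c] for c in item_cols})
print(result)
```

Building from a dict matters here: passing a list of Series to the constructor lays each Series out as a row, which transposes the result.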
answered yesterday by horro (edited yesterday)
This won't scale well to many columns.
– coldspeed
yesterday
Fair point, it should be done by iterating in a loop.
– horro
yesterday
Answer (0 votes)
You need:
col = ['item1','item2','item3']
for c in col:
    df[c] = df[c] * df['value']
df.drop(['value'], axis=1, inplace=True)
answered yesterday by Sociopath (edited yesterday)
Surely you can think of something better than iteration...
– coldspeed
yesterday
Answer (0 votes)
You can use np.where:
items = ['item1', 'item2', 'item3']
for item in items:
    df[item] = np.where(df[item]==1, df['value'], 0)
df.drop(columns=['value'], inplace=True)
df
   item1  item2  item3
0      0      4      0
1      5      0      0
2      0      0      3
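The same np.where idea can probably be applied to all item columns at once, without the loop. The sketch below assumes the question's sample frame and relies on NumPy broadcasting a (3, 1) column of values against the (3, 3) condition.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [4, 5, 3], "item1": [0, 1, 0], "item2": [1, 0, 0], "item3": [0, 0, 1]})
items = ["item1", "item2", "item3"]

# Condition has shape (3, 3); the value column is reshaped to (3, 1)
# so NumPy broadcasts each row's value across that row
mask = df[items].to_numpy() == 1
values = df["value"].to_numpy()[:, None]
df[items] = np.where(mask, values, 0)

df = df.drop(columns="value")
print(df)
```

This keeps the explicit `== 1` test from the loop version, so entries other than 1 are set to 0.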
answered yesterday by Mohit Motwani (edited yesterday)
Again, can you think of something better than iteration? Don't forget to drop the "value" column once done.
– coldspeed
yesterday
@coldspeed You're right
– Mohit Motwani
yesterday