Pandas measure elapsed time since a condition
I have the following dataframe:
Time Work
2018-12-01 10:00:00 Off
2018-12-01 10:00:02 On
2018-12-01 10:00:05 On
2018-12-01 10:00:06 On
2018-12-01 10:00:07 On
2018-12-01 10:00:09 Off
2018-12-01 10:00:11 Off
2018-12-01 10:00:14 On
2018-12-01 10:00:16 On
2018-12-01 10:00:18 On
2018-12-01 10:00:20 Off
I would like to create a new column with the elapsed time since the device started working.
Time Work Elapsed Time
2018-12-01 10:00:00 Off 0
2018-12-01 10:00:02 On 2
2018-12-01 10:00:05 On 5
2018-12-01 10:00:06 On 6
2018-12-01 10:00:07 On 7
2018-12-01 10:00:09 Off 0
2018-12-01 10:00:11 Off 0
2018-12-01 10:00:14 On 3
2018-12-01 10:00:16 On 5
2018-12-01 10:00:18 On 7
2018-12-01 10:00:20 Off 0
How can I do it?
python pandas time timedelta
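For reference, a frame like the one above can be rebuilt with something along these lines (the values are copied from the table; the exact construction is an assumption, not part of the original question):
import pandas as pd
df = pd.DataFrame({
    'Time': pd.to_datetime([
        '2018-12-01 10:00:00', '2018-12-01 10:00:02', '2018-12-01 10:00:05',
        '2018-12-01 10:00:06', '2018-12-01 10:00:07', '2018-12-01 10:00:09',
        '2018-12-01 10:00:11', '2018-12-01 10:00:14', '2018-12-01 10:00:16',
        '2018-12-01 10:00:18', '2018-12-01 10:00:20']),
    'Work': ['Off', 'On', 'On', 'On', 'On', 'Off', 'Off', 'On', 'On', 'On', 'Off'],
})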
asked yesterday
Rafael
Welcome to Stack Overflow, Rafael! I definitely came here just because the title seemed amusing, but left learning what Pandas actually means in this context.
– zarose
yesterday
5 Answers
You can use groupby:
# df['Time'] = pd.to_datetime(df['Time'], errors='coerce') # Uncomment if needed.
sec = df['Time'].dt.second
df['Elapsed Time'] = (
    sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))
df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0
1 2018-12-01 10:00:02 On 2
2 2018-12-01 10:00:05 On 5
3 2018-12-01 10:00:06 On 6
4 2018-12-01 10:00:07 On 7
5 2018-12-01 10:00:09 Off 0
6 2018-12-01 10:00:11 Off 0
7 2018-12-01 10:00:14 On 3
8 2018-12-01 10:00:16 On 5
9 2018-12-01 10:00:18 On 7
10 2018-12-01 10:00:20 Off 0
The idea is to extract the seconds portion and subtract from it the seconds value at the start of each group, i.e. the last "Off" reading before the device turns "On". This is done using transform and first. cumsum is used to identify the groups:
df.Work.eq('Off').cumsum()
0 1
1 1
2 1
3 1
4 1
5 2
6 3
7 3
8 3
9 3
10 4
Name: Work, dtype: int64
If there's a possibility your device can stay "On" across minute boundaries, initialise sec as:
sec = pd.Series(df['Time'].values.astype(np.int64) // 10e8)  # epoch seconds; wrapped in a Series so .groupby is available
df['Elapsed Time'] = (
    sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))
df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0.0
1 2018-12-01 10:00:02 On 2.0
2 2018-12-01 10:00:05 On 5.0
3 2018-12-01 10:00:06 On 6.0
4 2018-12-01 10:00:07 On 7.0
5 2018-12-01 10:00:09 Off 0.0
6 2018-12-01 10:00:11 Off 0.0
7 2018-12-01 10:00:14 On 3.0
8 2018-12-01 10:00:16 On 5.0
9 2018-12-01 10:00:18 On 7.0
10 2018-12-01 10:00:20 Off 0.0
edited yesterday
answered yesterday
coldspeed
@Rafael Yeah, the assumption here is that your row starts in the "Off" condition. Can you append a row at the beginning of your frame?
– coldspeed
yesterday
@Rafael Okay, and regarding your second point, does df['Time'].values.astype(np.int64) // 10e8 work?
– coldspeed
yesterday
The code worked fine for seconds. However, when the first cell of the column Work was 'On', the elapsed time did not begin at zero. Besides, when the time changed to the next minute, the elapsed time was negative. I tried using sec = df['Time'].astype(int) but I got the error: cannot astype a datetimelike from [datetime64[ns]] to [int32];
– Rafael
yesterday
@Rafael Can you read my comments just above yours again, please?
– coldspeed
yesterday
I deleted the comment and posted it again so I could edit it. Regarding your answers: I receive the data every day, and it begins 'On' and ends 'On', so I am not sure if I can append a row, but I will try using the date change as a condition. The code df['Time'].values.astype(np.int64) // 10e8 did work.
– Rafael
yesterday
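Following up on the append-a-row suggestion from the comments, a minimal sketch of prepending a synthetic 'Off' row when the data starts in the 'On' state (this assumes a default integer index and that the first timestamp should count as the start of the initial 'On' run; it is not part of the original answer):
if df['Work'].iloc[0] == 'On':
    first_row = pd.DataFrame({'Time': [df['Time'].iloc[0]], 'Work': ['Off']})
    # Prepend the synthetic row so the leading 'On' run gets its own group starting at elapsed time 0.
    df = pd.concat([first_row, df], ignore_index=True)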
IIUC, first with transform:
(df.Time-df.Time.groupby(df.Work.eq('Off').cumsum()).transform('first')).dt.seconds
Out[1090]:
0 0
1 2
2 5
3 6
4 7
5 0
6 0
7 3
8 5
9 7
10 0
Name: Time, dtype: int64
answered yesterday
W-B
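To store this as the integer column shown in the question, one might assign it back; a sketch (not part of the original answer) that uses .dt.total_seconds() rather than .dt.seconds, since .dt.seconds is only the seconds component and total_seconds() also counts days for longer gaps:
df['Elapsed Time'] = (df.Time - df.Time.groupby(df.Work.eq('Off').cumsum())
                      .transform('first')).dt.total_seconds().astype(int)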
You could use two groupbys. The first calculates the time difference within each group. The second then takes the cumulative sum of those differences within each group.
s = (df.Work=='Off').cumsum()
df['Elapsed Time'] = df.groupby(s).Time.diff().dt.total_seconds().fillna(0).groupby(s).cumsum()
Output
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0.0
1 2018-12-01 10:00:02 On 2.0
2 2018-12-01 10:00:05 On 5.0
3 2018-12-01 10:00:06 On 6.0
4 2018-12-01 10:00:07 On 7.0
5 2018-12-01 10:00:09 Off 0.0
6 2018-12-01 10:00:11 Off 0.0
7 2018-12-01 10:00:14 On 3.0
8 2018-12-01 10:00:16 On 5.0
9 2018-12-01 10:00:18 On 7.0
10 2018-12-01 10:00:20 Off 0.0
answered yesterday
ALollz
The code worked fine. However, when the first work cell of the dataframe was 'On', the elapsed time was not zero.
– Rafael
yesterday
@Rafael Good point. There might be a neat way to fix it in the calculation, but you can just fix it after the fact with df.loc[df.index < s[s==1].idxmax(), 'Elapsed Time'] = 0. I guess there's still an issue if the machine never turns on, but that can be fixed or handled too.
– ALollz
yesterday
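Putting the answer and the fix from the comment together, a sketch: rows before the first 'Off' carry group label 0, so masking on s == 0 is equivalent to the df.index < s[s==1].idxmax() expression above. Zeroing those rows out is an assumption carried over from the comment, not something stated in the question:
s = (df.Work == 'Off').cumsum()
df['Elapsed Time'] = df.groupby(s).Time.diff().dt.total_seconds().fillna(0).groupby(s).cumsum()
# Zero out any leading 'On' run that precedes the first 'Off' row.
df.loc[s == 0, 'Elapsed Time'] = 0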
Using a groupby, you can do this:
df['Elapsed Time'] = (df.groupby(df.Work.eq('Off').cumsum()).Time
                        .transform(lambda x: x.diff()
                                              .dt.total_seconds()
                                              .cumsum())
                        .fillna(0))
>>> df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0.0
1 2018-12-01 10:00:02 On 2.0
2 2018-12-01 10:00:05 On 5.0
3 2018-12-01 10:00:06 On 6.0
4 2018-12-01 10:00:07 On 7.0
5 2018-12-01 10:00:09 Off 0.0
6 2018-12-01 10:00:11 Off 0.0
7 2018-12-01 10:00:14 On 3.0
8 2018-12-01 10:00:16 On 5.0
9 2018-12-01 10:00:18 On 7.0
10 2018-12-01 10:00:20 Off 0.0
answered yesterday
sacul
A numpy slicy approach
# f holds the index of the first row in each 'Off'-delimited group; i maps every row back to its group.
u, f, i = np.unique(df.Work.eq('Off').values.cumsum(), True, True)
t = df.Time.values
df['Elapsed Time'] = t - t[f[i]]  # subtract each group's first timestamp
df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 00:00:00
1 2018-12-01 10:00:02 On 00:00:02
2 2018-12-01 10:00:05 On 00:00:05
3 2018-12-01 10:00:06 On 00:00:06
4 2018-12-01 10:00:07 On 00:00:07
5 2018-12-01 10:00:09 Off 00:00:00
6 2018-12-01 10:00:11 Off 00:00:00
7 2018-12-01 10:00:14 On 00:00:03
8 2018-12-01 10:00:16 On 00:00:05
9 2018-12-01 10:00:18 On 00:00:07
10 2018-12-01 10:00:20 Off 00:00:00
We can nail down the integer bit with
df['Elapsed Time'] = (t - t[f[i]]).astype('timedelta64[s]').astype(int)
df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0
1 2018-12-01 10:00:02 On 2
2 2018-12-01 10:00:05 On 5
3 2018-12-01 10:00:06 On 6
4 2018-12-01 10:00:07 On 7
5 2018-12-01 10:00:09 Off 0
6 2018-12-01 10:00:11 Off 0
7 2018-12-01 10:00:14 On 3
8 2018-12-01 10:00:16 On 5
9 2018-12-01 10:00:18 On 7
10 2018-12-01 10:00:20 Off 0
answered yesterday
piRSquared
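For readability, the two positional True arguments in that np.unique call are return_index and return_inverse; a sketch spelling them out as keywords, which should behave identically:
u, f, i = np.unique(df.Work.eq('Off').values.cumsum(),
                    return_index=True, return_inverse=True)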