I don’t know how to do this bit. Not that I don’t know technically; I mean I have no awareness of the tools and practices for looking for events in a data stream, categorising them, and presenting them.
My opening gambit is:
- Look through each `weight` sample, comparing it to the last (or the last few).
- If the current value is higher or lower (over a certain threshold) than it was, then:
  - Record this as a significant event by putting it into another list with the same timestamp (`events`); see the rough sketch after this list
- Combine the `events` stream with the main data frame
- Present the raw `weights` data in a graph, and:
  - Show the `events` overlaid
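To make that concrete, here is a minimal, loop-based sketch of the plan. It assumes the data frame has a `time` column and a weight column (`weight` here is a placeholder name), and picks an arbitrary 5% threshold:

```python
import pandas as pd

THRESHOLD = 0.05  # placeholder: fractional change that counts as "significant"

events = []
previous = None
for _, row in df.iterrows():
    current = row['weight']  # 'weight' is a stand-in for the real column name
    if previous is not None and previous != 0:
        change = (current - previous) / previous
        if abs(change) > THRESHOLD:
            # Record the jump with the same timestamp as the sample
            events.append({'time': row['time'], 'change': change})
    previous = current

events = pd.DataFrame(events)
```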
I can iterate through each row just using iterators and Python loops, like that sketch, but it feels like a pandas anti-pattern. From reading around (how do I even describe this problem for Google?), it seems like it’s best to do things in pandas en masse, operating on whole columns at once, rather than by examining each record individually. I think that’s the whole point of pandas.
# Percent change between one weight sample and the next
df['pct_change'] = df[large_window['name']].pct_change()
# Plot the raw values, with the percent change on a secondary axis
df.plot(x='time', y=[large_window['name'], 'pct_change'], secondary_y=['pct_change'])

That’s a bit like what I’m looking for. The pct_change (percent change: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.pct_change.html?highlight=pct_change#pandas.Series.pct_change) measures the scale of the change from one sample to the next. It hovers around 0 most of the time, but where the big jumps in weight happen, the percent change is also big.
This also uses the `secondary_y` option, so the percent change gets its own axis rather than being squashed flat against the scale of the raw weights.
A negative percent change means the coffee machine got lighter (i.e. the pot was lifted or a cup was taken). A positive percent change means the machine got heavier (i.e. the pot was replaced or the water was refilled).
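A tiny made-up series shows the behaviour (the weights here are invented, nominally grams):

```python
import pandas as pd

# Invented weights: small jitter, a cup poured, then a refill
weights = pd.Series([2000, 2001, 1999, 1700, 1701, 2000])

changes = weights.pct_change()
# roughly: NaN, 0.0005, -0.001, -0.15, 0.0006, 0.18
# near zero except the big negative jump (cup poured)
# and the big positive jump (refill)
```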
I can look through those percent changes and spot ones bigger than [a certain value], and mark those cases on the plot or save them out somehow for further analysis.
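Something along these lines ought to do it, building on the pct_change column from above; the 0.1 threshold, the 'direction' label, and the output filename are all placeholders I’ve made up:

```python
import matplotlib.pyplot as plt

THRESHOLD = 0.1  # placeholder: minimum |percent change| that counts as an event

# Keep only the rows where the weight jumped by more than the threshold
events = df[df['pct_change'].abs() > THRESHOLD].copy()
# Label the direction: heavier = refill / pot replaced, lighter = pour / pot lifted
events['direction'] = events['pct_change'].apply(
    lambda change: 'heavier' if change > 0 else 'lighter')

# Mark the events on top of the raw weights...
ax = df.plot(x='time', y=large_window['name'])
ax.scatter(events['time'], events[large_window['name']], color='red', marker='x')
plt.show()

# ...or save them out for further analysis
events.to_csv('weight_events.csv', index=False)
```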