Coffee Boss day 6.5: Combining a scatter with a line

So based on what I spotted in the source code of matplotlib’s _axes.py (https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/axes/_axes.py#L1469-L1493):

    def plot(self, *args, scalex=True, scaley=True, data=None, **kwargs):
        """
        Plot y versus x as lines and/or markers.

        Call signatures::

            plot([x], y, [fmt], *, data=None, **kwargs)
            plot([x], y, [fmt], [x2], y2, [fmt2], ..., **kwargs)

        The coordinates of the points or line nodes are given by *x*, *y*.

        The optional parameter *fmt* is a convenient way for defining basic
        formatting like color, marker and linestyle. It's a shortcut string
        notation described in the *Notes* section below.

        >>> plot(x, y)        # plot x and y using default line style and color
        >>> plot(x, y, 'bo')  # plot x and y using blue circle markers
        >>> plot(y)           # plot y using x as index array 0..N-1
        >>> plot(y, 'r+')     # ditto, but with red plusses

I saw that I could use my axes object to simply plot the positions of the indicators in two dimensions. I got this:

Which is pretty much exactly what I want right now. I did some fairly dirty mucking around with the data to get it to do this, essentially looking for where the row-to-row weight difference crosses a threshold from low to high.

# median filter with a rolling window: low pass filter
df['rolling4'] = df['weight'].rolling(4).median()

# difference between each sample and the one 8 rows later
# (negative periods look ahead, so this is row[i] - row[i+8])
df['diff'] = df['rolling4'].diff(periods=-8)

# Tag with True where the change is over 300g
threshold = 300.0
df['thresholded'] = (df['diff'] > threshold)

# Produce 'highlight' boolean where the threshold is True, AND
# the threshold for the previous row was False. This feels pretty clunky.
df['highlight'] = (df['thresholded'] == True) & (df['thresholded'].shift(1) == False)

# Now create a new dataframe with just the highlights in, and only the interesting columns
highlights = df[df['highlight']][['datetime', 'rolling4']]
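
On the clunky highlight step: since 'thresholded' is already a boolean column, the same rising-edge test can be written with the boolean operators directly. This is only a sketch on a made-up Series standing in for df['thresholded']. The fill_value=True argument is a choice made here so that the first row is never flagged, which matches what the original comparison does.

```python
import pandas as pd

# Hypothetical stand-in for the df['thresholded'] column built above.
thresholded = pd.Series([False, False, True, True, False, True, True])

# Rising edge: True on this row, and not True on the previous row.
# fill_value=True treats the row "before" the first one as already True,
# so the first row is never flagged (same as the == True / == False version).
rising_edge = thresholded & ~thresholded.shift(1, fill_value=True)

# The original formulation, kept here for comparison.
original = (thresholded == True) & (thresholded.shift(1) == False)

print(rising_edge.tolist())
```

Both versions flag only the rows where the column flips from False to True.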

That’s good isn’t it?
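
The plotting call itself isn't shown above, so here is a minimal sketch of how the overlay might look. It assumes a dataframe with 'datetime' and 'rolling4' columns like the one above. The data values and the choice of which rows count as highlights are invented for illustration. The point is that the same Axes.plot method draws both layers: with no fmt string you get a line, and with a marker-only fmt string such as 'ro' you get red circles with no connecting line.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data standing in for the real weight log.
df = pd.DataFrame({
    "datetime": pd.date_range("2020-01-01", periods=6, freq="min"),
    "rolling4": [1000.0, 1000.0, 700.0, 700.0, 400.0, 400.0],
})
# Pretend rows 2 and 4 were the ones flagged as highlights.
highlights = df.iloc[[2, 4]][["datetime", "rolling4"]]

fig, ax = plt.subplots()
# The line: smoothed weight against time, default line style.
ax.plot(df["datetime"], df["rolling4"], label="rolling4")
# The "scatter": same method, but 'ro' means red circle markers, no line.
ax.plot(highlights["datetime"], highlights["rolling4"], "ro", label="highlight")
ax.legend()
fig.canvas.draw()  # render once; use fig.savefig(...) or plt.show() to view
```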

Coffee Boss day 6: Horizontal line

I’ve been trying to get a horizontal line to show a threshold. It never worked. It gave me a mean-spirited error message that I couldn’t understand. I spent the last few days trying. I got this one:

Traceback (most recent call last):
  File "C:/Users/sandy_000/PycharmProjects/coffee_boss/viz/viz.py", line 65, in <module>
    df.plot(y=['diff'], ax=ax3)
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_core.py", line 794, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_matplotlib\__init__.py", line 62, in plot
    plot_obj.generate()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_matplotlib\core.py", line 284, in generate
    self._adorn_subplots()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_matplotlib\core.py", line 472, in _adorn_subplots
    sharey=self.sharey,
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 316, in _handle_shared_axes
    _remove_labels_from_axis(ax.xaxis)
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 281, in _remove_labels_from_axis
    for t in axis.get_majorticklabels():
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\axis.py", line 1252, in get_majorticklabels
    ticks = self.get_major_ticks()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\axis.py", line 1407, in get_major_ticks
    numticks = len(self.get_majorticklocs())
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\axis.py", line 1324, in get_majorticklocs
    return self.major.locator()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\dates.py", line 1431, in __call__
    self.refresh()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\dates.py", line 1451, in refresh
    dmin, dmax = self.viewlim_to_dt()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\dates.py", line 1202, in viewlim_to_dt
    .format(vmin))
ValueError: view limit minimum 0.0 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units

So what I did was change

ax3.axhline(threshold, linewidth=1, color='r')

df.plot(y=[small_window['name'], 'thresholded'], secondary_y=['thresholded'], ax=ax1)
df.plot(y=[small_window['name']], ax=ax2)
df.plot(y=['diff'], ax=ax3)

to

df.plot(y=[small_window['name'], 'thresholded'], secondary_y=['thresholded'], ax=ax1)
df.plot(y=[small_window['name']], ax=ax2)
df.plot(y=['diff'], ax=ax3)

ax3.axhline(threshold, linewidth=1, color='r')

Yes. Same, but the hline happens after the plot. OK, I can make the intuitive leap for why this works and not be cross about it, but I wish I’d tried this a week ago.
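
A guess at why the order matters, consistent with the error text: calling axhline on an empty axes leaves plain numeric limits (0 to 1) on the x axis, and when pandas then switches that axis to datetime units, 0.0 is not a valid date for the matplotlib version in the traceback. Newer matplotlib releases may not raise the same error. The sketch below shows only the working order, on invented data, and assumes the dataframe is indexed by datetime (the traceback suggests a date axis, but the original index isn't shown).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

threshold = 300.0

# Made-up 'diff' values on an assumed datetime index.
df = pd.DataFrame(
    {"diff": [0.0, 50.0, 350.0, 400.0, 20.0]},
    index=pd.date_range("2020-01-01", periods=5, freq="min"),
)

fig, ax3 = plt.subplots()
# Plot the data first, so the x axis already has datetime units and limits...
df.plot(y=["diff"], ax=ax3)
# ...then add the horizontal threshold line on top.
ax3.axhline(threshold, linewidth=1, color="r")
fig.canvas.draw()  # force a render to confirm the tick locator is happy
```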