Histogram() doesn't display the values at the edge of defined range


I can't get Julia to display edge values on histograms when I define a range for the bins. Here is a minimal example:

using Plots
x = [0,0.5,1]
plot(histogram(x, bins=range(0,1,length=3)))

Defining them explicitly doesn't help (bins=[0,0.3,0.7,1]). It seems that histogram() excludes the limits of the range. I can extend the range to make it work:

plot(histogram(x, bins=[0,0.3,0.7,1.01]))

But I really don't think that should be the way to go. Surprisingly, fixing the number of bins does work (nbins=3) but I need to keep the width of all the bins the same and constant across different runs for comparison purposes.

I have tried with Plots, PlotlyJS and StatsBase (with fit() and its closed attribute) to no avail. Maybe I'm missing something, so I wanted to ask: is it possible to do what I want?


There are 2 answers

Dan Getz (BEST ANSWER)

Try:

plot(histogram(x, bins=range(0,nextfloat(1.0),length=3)))

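Although this extends the range, it does so by the smallest possible amount: nextfloat(1.0) is the next representable floating-point value above 1.0, so the value 1.0 falls inside the last bin and the right end of the histogram behaves as if it were closed.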
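To see why the nudge works, here is a minimal sketch in Python (standard library only, not the Plots or StatsBase code) of left-closed binning. The helper `count_left_closed` is hypothetical, the edges are written out explicitly rather than generated by a range, and `math.nextafter(1.0, math.inf)` (Python 3.9+) plays the role of `nextfloat(1.0)`.

```python
import math
from bisect import bisect_right

def count_left_closed(values, edges):
    """Count values into half-open bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect_right(edges, v) - 1  # index of the last edge <= v
        if 0 <= i < len(counts):        # values outside every bin are dropped
            counts[i] += 1
    return counts

x = [0, 0.5, 1]

# Upper edge exactly 1.0: the last bin is [0.5, 1.0), so the value 1 is dropped.
plain = count_left_closed(x, [0.0, 0.5, 1.0])

# Upper edge one float above 1.0: the value 1 now falls inside the last bin.
nudged = count_left_closed(x, [0.0, 0.5, math.nextafter(1.0, math.inf)])

print(plain)
print(nudged)
```

The only difference between the two calls is the final edge, which moves by a single representable step.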

As for equal widths: with floating-point numbers, "equal width" can mean different things. It can mean equal in terms of real numbers (which are not always representable), or equal in terms of, for example, the number of representable values per bin, and that count differs between [0.0,1.0] and [1.0,2.0].
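The point about representable values can be checked directly. The sketch below is in Python with only the standard library, and it assumes IEEE 754 doubles, for which the bit patterns of non-negative values sort in the same order as the values themselves. The helper names are hypothetical.

```python
import struct

def float_to_ordinal(x):
    """Integer value of the IEEE 754 bit pattern of a non-negative double."""
    return struct.unpack("<q", struct.pack("<d", x))[0]

def representable_between(a, b):
    """Number of doubles in the half-open interval [a, b), for 0 <= a <= b."""
    return float_to_ordinal(b) - float_to_ordinal(a)

n01 = representable_between(0.0, 1.0)  # includes zero and the subnormals
n12 = representable_between(1.0, 2.0)

print(n01, n12, n01 // n12)
```

Both intervals have the same width as real numbers, yet [0.0, 1.0) contains far more doubles than [1.0, 2.0).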

So hopefully, this scratches the itch in the OP.

jling

https://juliastats.org/StatsBase.jl/latest/empirical/#StatsBase.Histogram

Most importantly, from that documentation:

closed: A symbol with value :right or :left indicating on which side bins (half-open intervals or higher-dimensional analogues thereof) are closed. See below for an example.

This is very common in histogram implementations, for example NumPy:

In [8]:  np.histogram([0], bins=[0, 1, 2])
Out[8]: (array([1, 0]), array([0, 1, 2]))

In [9]:  np.histogram([1], bins=[0, 1, 2])
Out[9]: (array([0, 1]), array([0, 1, 2]))

NumPy has the inconsistency that its last bin is closed on both sides, but it's perfectly normal for every bin to be closed on only one side.
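The three conventions can be compared side by side. This is a minimal sketch in Python (standard library only) that mimics the behaviours described above rather than calling NumPy or StatsBase. `bin_counts` and its `include_last_edge` flag are hypothetical names introduced for illustration.

```python
from bisect import bisect_left, bisect_right

def bin_counts(values, edges, closed="left", include_last_edge=False):
    """Count values into bins that are closed on one side only."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if closed == "left":
            i = bisect_right(edges, v) - 1   # bins are [lo, hi)
            if include_last_edge and v == edges[-1]:
                i = len(counts) - 1          # last bin becomes [lo, hi]
        else:
            i = bisect_left(edges, v) - 1    # bins are (lo, hi]
        if 0 <= i < len(counts):             # values outside every bin are dropped
            counts[i] += 1
    return counts

x = [0, 0.5, 1]
edges = [0, 0.5, 1]

left = bin_counts(x, edges, closed="left")    # drops the value 1
right = bin_counts(x, edges, closed="right")  # drops the value 0
numpy_like = bin_counts(x, edges, closed="left", include_last_edge=True)

print(left, right, numpy_like)
```

With the data from the question, each one-sided convention loses one of the two edge values, which fits the observation that changing the closed side alone did not help. Only the variant with a doubly closed last bin keeps all three points.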