
How can I synchronize the axes of a frequency polygon on top of a histogram in R?


In a previous question, I asked how to superimpose a frequency polygon on top of a histogram. That problem was solved. I have a different problem now. I want the frequency polygon's class marks to be in the middle of each histogram bar. The class mark is the value exactly in the middle of a class and is found by averaging the upper and lower boundary of the histogram bars (a.k.a., "classes"). If I were drawing the frequency polygon, I would simply draw a dot in the middle of each histogram class (or bar) and connect the dots. However, when I execute the following code, the frequency polygon is "spread out," and does not have the same axis values as the histogram.

# declare your variable
data <- c(10, 7, 8, 4, 5, 6, 6, 9, 5, 6, 3, 8,
          4, 6, 10, 5, 9, 7, 6, 2, 6, 5, 4, 8, 7, 5, 6)

# find the range
range(data)

# establish the class boundaries (each class is 2 units wide)
class_width = seq(1, 11, by=2)
class_width

# create a frequency table
data.cut = cut(data, class_width, right=FALSE)
data.freq = table(data.cut)
cbind(data.freq)

# histogram of this data
hist(data, axes=TRUE,
     breaks=class_width, col="slategray3",
     border = "dodgerblue4", right=FALSE,
     xlab = "Scores", xaxp=c(1, 11, 10),
     yaxp=c(0, 12, 12), main = "Histogram and Frequency Polygon")

# paint the frequency polygon over the histogram
par(new=TRUE)

# create a frequency polygon for the data
plot(data.freq, axes=FALSE, type="b", ann=FALSE)

Here is an image of what RGui produces. I have used MS Paint to draw red lines showing what I want R to draw. The two plots seem to have the same y-axis values. How can I get the two plots to share the same x-axis values? Thanks!

Histogram, Edited by MS Paint


Solution

  • If you look at the (invisible) return value of your hist(...) call, you'll see several components that might be useful, notably $mids. (They are documented in ?hist.)

    Using your data up until the hist call:

    h <- hist(data, axes=TRUE,
        breaks=class_width, col="slategray3",
        border = "dodgerblue4", right=FALSE, 
        xlab = "Scores", xaxp=c(1, 11, 10), 
        yaxp=c(0, 12, 12), main = "Histogram and Frequency Polygon")
    str(h)
    # List of 6
    #  $ breaks  : num [1:6] 1 3 5 7 9 11
    #  $ counts  : int [1:5] 1 4 12 6 4
    #  $ density : num [1:5] 0.0185 0.0741 0.2222 0.1111 0.0741
    #  $ mids    : num [1:5] 2 4 6 8 10
    #  $ xname   : chr "data"
    #  $ equidist: logi TRUE
    #  - attr(*, "class")= chr "histogram"
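
    As a sanity check, the class marks can also be computed by hand from the breaks, the way the question describes (averaging each bar's lower and upper boundary), and compared with h$mids. This is only a minimal sketch, assuming the data and class_width from the question; manual_mids is a name made up for illustration, and plot = FALSE just suppresses the drawing:

    ```r
    data <- c(10, 7, 8, 4, 5, 6, 6, 9, 5, 6, 3, 8,
              4, 6, 10, 5, 9, 7, 6, 2, 6, 5, 4, 8, 7, 5, 6)
    class_width <- seq(1, 11, by = 2)

    # compute the histogram object without drawing it
    h <- hist(data, breaks = class_width, right = FALSE, plot = FALSE)

    # manual_mids is a hypothetical name: average each lower boundary
    # with the matching upper boundary
    manual_mids <- (head(h$breaks, -1) + tail(h$breaks, -1)) / 2
    manual_mids

    # compare the hand-computed class marks with the ones hist() reports
    all.equal(manual_mids, h$mids)
    ```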
    

    No need for par(new=TRUE) to just add a line:

    lines(h$mids, data.freq)
    

    If this is a simplified example where you must use par(new=TRUE), then you need to do two things: set the xlim/ylim of the second plot to match the histogram, and give it proper x values. (You may not realize it, but the second plot is inferring x values of 1:5 here. Run plot(data.freq) on its own to see.)

    par(new = TRUE)
    xlim <- range(class_width)
    ylim <- c(0, max(h$counts))
    plot(h$mids, data.freq, axes = FALSE, type = "b", ann = FALSE,
         xlim = xlim, ylim = ylim)
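
    To see why the original overlay looked "spread out", it may help to compare the x positions each plot works with. The following is a rough sketch, again assuming the question's data; implied_x is a hypothetical name for the positions that plot(data.freq) falls back to when the table's names are not numeric:

    ```r
    data <- c(10, 7, 8, 4, 5, 6, 6, 9, 5, 6, 3, 8,
              4, 6, 10, 5, 9, 7, 6, 2, 6, 5, 4, 8, 7, 5, 6)
    class_width <- seq(1, 11, by = 2)
    data.freq <- table(cut(data, class_width, right = FALSE))
    h <- hist(data, breaks = class_width, right = FALSE, plot = FALSE)

    # the table's names are interval labels such as "[1,3)", not numbers,
    # so the points are placed at consecutive integer positions instead
    names(data.freq)
    implied_x <- seq_along(data.freq)

    # x range of the original second plot versus the histogram's x range
    range(implied_x)
    range(class_width)

    # the counts themselves agree, so only the x values needed fixing
    all(h$counts == as.vector(data.freq))
    ```

    Because the two ranges differ, par(new=TRUE) stretched the second plot over the same device region with a different x scale, which is what passing h$mids and an explicit xlim corrects.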