Tags: r, 3d, histogram, plot3d

This is my goal: Plot the average of z according to bins formed by x and y in R


I came across this answer here. My question is: if I have three variables and want to use x and y to create bins (using cut and table, as in the other answer), how can I then plot z as the average of all the z values that fall into each bin?

This is what I have:

library(plot3D)

x <- data$OPEXMKUP_PT_1d
y <- data$prod_opex


z <- data$ab90_ROIC_wogw3

x_c <- cut(x, 20)
y_c <- cut(y, 20)
cutup <- table(x_c, y_c)
mat <- data.frame(cutup)


hist3D(z = cutup, border="black", bty ="g",
       main = "Data", xlab = "Markup",
       ylab ="Omega", zlab = "Star")

But this shows z as the frequency, and when I try

hist3D(x, y, z, phi = 0, bty = "g",  type = "h", main = 'NEWer',
       ticktype = "detailed", pch = 19, cex = 0.5,
       xlim=c(0,3),
       ylim=c(-10,20),
       zlim=c(0,1))

it runs for a long time and then throws an error:

Error: protect(): protection stack overflow
Graphics error: Plot rendering error

The 3D scatter plot works, but the result doesn't make sense because the z variable is a ratio that falls mostly between 0 and 1, so you get a bunch of tall lines and a bunch of short lines. I would like the values averaged by bin, to show how the average ratio changes as x and y change. Please let me know if there is a way to do this.


Solution

  • I'm not sure exactly what your data looks like, so I made some up. You should be able to adjust this to your needs. It's a bit hacky/brute-force, but it should work fine as long as your data isn't large enough to make the loop slow.

    library(plot3D)
    
    # Fake it til you make it
    n = 5000
    x = runif(n)
    y = runif(n)
    z = x + 2*y + sin(x*2*pi)
    
    # Divide into bins
    x_c = cut(x, 20) 
    y_c = cut(y, 20) 
    x_l = levels(x_c)
    y_l = levels(y_c)
    
    # Compute the mean of z within each x,y bin
    # (a bin with no points gets NaN, since mean() of an empty vector is NaN)
    z_p = matrix(NA_real_, length(x_l), length(y_l))
    for (i in seq_along(x_l)){
        for (j in seq_along(y_l)){
            z_p[i,j] = mean(z[x_c == x_l[i] & y_c == y_l[j]])
            }
        }
    
    # Get the middle of each bin by parsing the "(a,b]" labels
    # (cut() rounds its labels, so these midpoints are approximate)
    x_p = sapply(strsplit(gsub('\\(|]', '', x_l), ','), function(x) mean(as.numeric(x)))
    y_p = sapply(strsplit(gsub('\\(|]', '', y_l), ','), function(x) mean(as.numeric(x)))
    
    # Plot
    hist3D(x_p, y_p, z_p, bty = "g",  type = "h", main = 'NEWer',
           ticktype = "detailed", pch = 19, cex = 0.5)
    

    Basically, we're just manually computing the average bin height z by looping over the bins. There may be a better way to do the computation.
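
    One possible alternative to the loop, as a sketch rather than a definitive version: base R's tapply can compute the per-bin means in one call. The example below reuses the made-up data from above. The explicit break vectors x_b and y_b and the bin count nb are my own additions (not in the code above); they let the midpoints be computed from the bin edges instead of parsed from the labels. Empty bins come back as NA.

    ```r
    library(plot3D)

    set.seed(1)  # made-up data, same form as above
    n <- 5000
    x <- runif(n)
    y <- runif(n)
    z <- x + 2 * y + sin(x * 2 * pi)

    # Assumption: evenly spaced bin edges over the range of each variable
    nb <- 20
    x_b <- seq(min(x), max(x), length.out = nb + 1)
    y_b <- seq(min(y), max(y), length.out = nb + 1)
    x_c <- cut(x, breaks = x_b, include.lowest = TRUE)
    y_c <- cut(y, breaks = y_b, include.lowest = TRUE)

    # Mean of z in every (x bin, y bin) cell; cells with no points are NA
    z_p <- tapply(z, list(x_c, y_c), mean)

    # Bin midpoints, computed from the edges
    x_p <- (head(x_b, -1) + tail(x_b, -1)) / 2
    y_p <- (head(y_b, -1) + tail(y_b, -1)) / 2

    # Plot the averaged surface as a 3D histogram
    hist3D(x_p, y_p, z_p, bty = "g", border = "black",
           ticktype = "detailed", xlab = "x", ylab = "y", zlab = "mean z")
    ```

    With real data, swap in your own x, y and z vectors; rows with missing values would need handling first (for example, passing na.rm = TRUE through tapply to mean).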
