I have a weather dataset, i found a simple linear model for two columns Temperature and Humidity and plotted the histogram of its residuals and calculated the mean and std.
model <- lm(Temperature..C. ~ Humidity, data = inputData)
model.res = resid(model)
hist(model.res)
mean(model.res)
sd(model.res)
I should Plot QQ-plot of residuals versus a zero-mean normal distribution with estimated std. I used Kolmogorov-Smirnov to compare a sample with a reference probability distribution but i don't know how to plot it together:
ks<-ks.test(model.res, "pnorm", mean=0, sd=sd(model.res))
qqnorm(model.res, main="qqnorm")
qqline(model.res)
Data example:
Temperature..C. Humidity
1 9.472222 0.89
2 9.355556 0.86
3 9.377778 0.89
4 8.288889 0.83
5 8.755556 0.83
6 9.222222 0.85
7 7.733333 0.95
8 8.772222 0.89
9 10.822222 0.82
10 13.772222 0.72
11 16.016667 0.67
12 17.144444 0.54
13 17.800000 0.55
14 17.333333 0.51
15 18.877778 0.47
16 18.911111 0.46
17 15.388889 0.60
18 15.550000 0.63
19 14.255556 0.69
20 13.144444 0.70
Here is a solution using ggplot2
ggplot(model, aes(sample = rstandard(model))) +
geom_qq() +
stat_qq_line(dparams=list(sd=sd(model.res)), color="red") +
stat_qq_line()
The red line represents the qqline with your sample sd, the blackline a sd of 1.
You did not ask for that, but you could also add a smoothed qqplot:
data_model <- model
data_model$theo <- unlist(qqnorm(data_model$residuals)[1])
ggplot(data_model, aes(sample = rstandard(data_model))) +
geom_qq() +
stat_qq_line(dparams=list(sd=sd(model.res)), color="red") +
geom_smooth(aes(x=data_model$theo, y=data_model$residuals), method = "loess")