I have some 5-minute price data on stock securities and, as a proof of concept, I have summarised them into daily price tables.
Over a 2-week period, for a security coded 603666, if the underlying security is bought and sold daily as in the table below, then tidyquant::geom_candlestick() works nicely.
The problem appears when a security has very low trading volume or is traded only infrequently, as in the table below, where you see lots of zeros. geom_candlestick() then plots something weird: I expected just a horizontal line for the days with no trading volume, since open=high=low=0, but it gave me a bar plot. Is it because the close price has been recorded and carried over as $78.49, when in fact it should be zero for the zero-volume trading days?
And a follow-up question on geom_candlestick(): can I overlay volume data on top of it? Say the left y axis is "close price"; I would like to add a right y axis so I can plot a bar plot of the trading volume and highlight only the big buy or sell volumes.
Thank you very much. Somehow, in a world of ChatGPT, I still like StackOverflow.
For the above data:
dput(daily.close.data)
structure(list(symbol = c(127654L, 127654L, 127654L, 127654L,
127654L, 127654L, 127654L, 127654L, 127654L, 127654L, 127654L,
127654L, 127654L, 127654L, 127654L, 127654L, 127654L), date = structure(c(18753,
18754, 18757, 18758, 18759, 18760, 18761, 18764, 18766, 18767,
18768, 18771, 18772, 18773, 18774, 18775, 18778), class = "Date"),
UpdateTime = new("Period", .Data = c(44, 40, 13, 5, 10, 41,
43, 8, 7, 35, 13, 34, 8, 2, 38, 20, 2), year = c(0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), month = c(0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), day = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), hour = c(15,
15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
15), minute = c(40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
40, 40, 40, 40, 40, 40)), PreCloPrice = c(78.3, 78.3, 78.49,
78.49, 78.49, 78.49, 78.49, 78.49, 78.49, 78.49, 78.49, 78.49,
78.49, 78.49, 78.49, 78.49, 78.49), OpenPrice = c(0, 78.49,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 78.49, 0, 78.49), HighPrice = c(0,
78.49, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 78.49, 0, 78.49
), LowPrice = c(0, 78.49, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 78.49, 0, 78.49), LastPrice = c(0, 78.49, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 78.49, 0, 78.49), close = c(78.3,
78.49, 78.49, 78.49, 78.49, 78.49, 78.49, 78.49, 78.49, 78.49,
78.49, 78.49, 78.49, 78.49, 78.49, 78.49, 78.49), volume = c(0L,
49L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 143L,
0L, 96L)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names =
c(NA, -17L), groups = structure(list(date = structure(c(18753,
18754, 18757, 18758, 18759, 18760, 18761, 18764, 18766, 18767,
18768, 18771, 18772, 18773, 18774, 18775, 18778), class = "Date"),
.rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L), ptype = integer(0), class =
c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -17L), .drop = TRUE, class =
c("tbl_df",
"tbl", "data.frame")))
The plotting code is pretty standard:
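library(tidyverse)   # ggplot2, stringr, and the %>% pipe
library(tidyquant)   # geom_candlestick(), theme_tq()

daily.close.data %>%
  ggplot(aes(x = date, y = close)) +
  geom_candlestick(aes(open = OpenPrice, high = HighPrice, low = LowPrice, close = close)) +
  # `name` is not defined in this snippet (presumably the source CSV file name)
  labs(title = paste0(str_remove_all(name, ".csv"), " Candlestick Chart"),
       subtitle = "From sample market",
       y = "Closing Price", x = "") +
  theme_tq()   # the "+" on the previous line was missing, so the theme was never applied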
I think the expected result will be obtained when open=low=high=close. Here, even though open=low=high=0, the variable close has nonzero values (78.3 or 78.49), which is why bars are plotted instead.
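As a rough check of this explanation, the sketch below takes four rows copied from the dput() output and computes the candle body height, |close - open|, under both mappings. The data frame `ohlc` and the columns `body_with_close` and `body_with_last` are names made up for this illustration, and the body-height formula is only my reading of how a candlestick is drawn, not tidyquant's internal code.

```r
# Four rows copied from the question's dput() output (rows 1, 2, 3 and 15).
ohlc <- data.frame(
  date      = as.Date(c(18753, 18754, 18757, 18774), origin = "1970-01-01"),
  OpenPrice = c(0, 78.49, 0, 78.49),
  LastPrice = c(0, 78.49, 0, 78.49),
  close     = c(78.3, 78.49, 78.49, 78.49),
  volume    = c(0L, 49L, 0L, 143L)
)

# Body height when `close` is mapped to the close aesthetic:
# zero-volume days still span from 0 up to the carried-over close.
ohlc$body_with_close <- abs(ohlc$close - ohlc$OpenPrice)

# Body height when `LastPrice` is mapped instead:
# open and last agree on every day, so the body collapses to a flat line.
ohlc$body_with_last <- abs(ohlc$LastPrice - ohlc$OpenPrice)

print(ohlc[, c("date", "volume", "body_with_close", "body_with_last")])
```

On the zero-volume rows the first column is as tall as the carried-over close, which matches the bars seen in the plot, while the second column is zero everywhere.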
If we use the variable LastPrice instead of close and add a small jitter (just for the sake of plotting), we obtain the following figure (note that on the zero-volume days all the values remain close to 0, whereas on the three days with trades all the values stay close to 80):
set.seed(1)  # make the jitter reproducible
nrows <- nrow(daily.close.data)
pcols <- c('OpenPrice', 'HighPrice', 'LowPrice', 'LastPrice')
# add standard-normal noise to the price columns (for plotting only)
daily.close.data[, pcols] <- daily.close.data[, pcols] +
  matrix(rnorm(nrows * length(pcols)), nrow = nrows) # jitter
# map LastPrice (not close) to the close aesthetic
daily.close.data %>%
  ggplot(aes(x = date, y = close)) +
  geom_candlestick(aes(open = OpenPrice, high = HighPrice, low = LowPrice,
                       close = LastPrice))
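For the follow-up question about volume, one possible approach is ggplot2's secondary axis. The sketch below is only an illustration under assumptions: it uses a plain geom_line() as a stand-in for the candlestick layer so that it depends on ggplot2 alone, and `vol_data`, `big_volume`, and `k` are hypothetical names. The cutoff of 100 and the 0.5 scaling factor are arbitrary choices.

```r
library(ggplot2)

# Four rows copied from the question's dput() output (rows 14 to 17).
vol_data <- data.frame(
  date   = as.Date(c(18773, 18774, 18775, 18778), origin = "1970-01-01"),
  close  = c(78.49, 78.49, 78.49, 78.49),
  volume = c(0L, 143L, 0L, 96L)
)

# Hypothetical cutoff for a "big" volume day; pick whatever suits the data.
big_volume <- 100

# ggplot2 has a single y scale, so volume is rescaled into price units
# (here the tallest bar reaches half of the highest close) and the
# right-hand axis undoes that rescaling for its labels.
k <- 0.5 * max(vol_data$close) / max(vol_data$volume)

p <- ggplot(vol_data, aes(x = date)) +
  geom_col(aes(y = volume * k, fill = volume >= big_volume), alpha = 0.6) +
  geom_line(aes(y = close)) +  # stand-in for the candlestick layer
  scale_y_continuous(
    name     = "Close price",
    sec.axis = sec_axis(~ . / k, name = "Volume")
  ) +
  scale_fill_manual(
    values = c(`FALSE` = "grey70", `TRUE` = "firebrick"),
    name   = "Big volume"
  ) +
  labs(x = "")
```

Printing `p` draws the chart. In principle the geom_line() layer could be swapped for the geom_candlestick() call from the question, though I have not checked how the two layers interact.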