I have downloaded historical US stock prices from the CRSP database via the WRDS website.
I can import the .csv file but my approach to properly fit it into a xts-object is at least unhandy. With longer time periods and more data, after splitting the original data frame according to each ID, the list of data frames get up to dozens of Gigabyte. Therefore, I am looking for a more efficient approach to covert the data frame, which consists of a simple list, to a ready to use xts-object.
Structure of data frame after import: (stocks are listed one below the other)
> head(dataf)
ï..Names.Date PERMNO Price.adjusted
1 31/01/2014 10104 36.90
2 28/02/2014 10104 39.11
3 31/03/2014 10104 40.91
Desired format in the xts-object:
> dat[1:3,1:19]
X10104 X10107 X11308 X11587 X11628 X11850 X12060 X12072 X12400
Jan 2014 36.90 37.84 37.82 267.18 18.35 92.16 25.13 17.74 53.53
Feb 2014 39.11 38.31 38.20 289.43 19.73 96.27 25.47 18.43 53.68
Mar 2014 40.91 40.99 38.66 306.14 20.20 97.68 25.89 18.25 52.54
My approach:
#read CSV into a data frame
dataf <- read.csv(file = "us-data14-16.csv", header = TRUE, sep = ";", fill = TRUE)
#data preprocessing, deletes objects with price = 0
dataf <- dataf[dataf[, 3] != 0, ]
#split list according to ticker in a list of data frames
dataf <- split(dataf, f= dataf[,2])
#get identifier
id <- names(dataf)
#convert data frames into xts objects
datax <- lapply(dataf, function(x) xts(x$Price.adjusted, as.yearmon(x[,1], "%d/%m/%Y")))
#set column name according to ticker (loop through every element in the list)
sapply(seq_along(datax), function(x) colnames(datax[[x]]) <<- id[x])
#merge list of xts objects in one xts object
dat <- do.call(merge, datax)
You need to use the reshape function with direction as wide.
df = data.frame(Names.Date= rep(seq(1:10), 10), PERMNO=sort(rep(seq(from= 101, to=110), 10)), Price.adjusted=rnorm(100))
reshape(df, idvar = "Names.Date", timevar = "PERMNO", direction = "wide")