I have multiple xts objects stored in a list each with 1000+ rows. They represent stock rolling window regression data. Each element has it's unique Ticker name. Here they are called Stock1, 2 ...etc for testing purposes. Rows are named by date as is the xts format. Each element is equal in dimensions. Each one looks like this:
> tail(testlist$Stock1, n = 3)
(Intercept) rmrf smb hml rmw cma
2014-12-29 0.0003223177 1.010215 -0.02164844 -0.3322500 0.07819563 1.106934
2014-12-30 0.0002631315 1.002356 -0.02351438 -0.3465390 0.05954400 1.118506
2014-12-31 0.0002837304 1.000084 -0.01619536 -0.3494401 0.06121434 1.124845
> tail(testlist$Stock2, n = 3)
(Intercept) rmrf smb hml rmw cma
2014-12-29 0.0003308951 0.7503819 -0.1967255 -0.10242616 -0.2264914 0.8329570
2014-12-30 0.0003051495 0.7409709 -0.1899856 -0.07461764 -0.2240448 0.7921883
2014-12-31 0.0002614874 0.7478099 -0.1833077 -0.06197362 -0.2056615 0.7550211
> tail(testlist$Stock3, n = 3)
(Intercept) rmrf smb hml rmw cma
2014-12-29 -0.0003803988 0.8363603 -0.4153470 0.7459769 -0.7981382 -0.2839360
2014-12-30 -0.0004121386 0.8352243 -0.4224404 0.7405976 -0.8114066 -0.2790438
2014-12-31 -0.0004660716 0.8355641 -0.4343012 0.7571033 -0.8057412 -0.3026019
> tail(testlist$Stock4, n = 3)
(Intercept) rmrf smb hml rmw cma
2014-12-29 -0.0008295692 0.9296299 -0.07776571 0.007084297 -0.1377356 0.8038542
2014-12-30 -0.0007734696 0.9383387 -0.08941983 0.011685507 -0.1092656 0.7863335
2014-12-31 -0.0007591168 0.9391670 -0.08782070 0.015619229 -0.1083707 0.7924232
What i need to do: Merge the rows by name and by aggregating all data in my list to obtain a new set of data. Each should look like this:
Name Date (Intercept) rmrf smb hml rmw cma
Stock1 2014-12-29 0.0003223177 1.010215 -0.02164844 -0.3322500 0.07819563 1.106934
Stock2 2014-12-29 0.0003308951 0.7503819 -0.1967255 -0.10242616 -0.2264914 0.8329570
Stock3 2014-12-29 -0.0003803988 0.8363603 -0.4153470 0.7459769 -0.7981382 -0.2839360
Stock4 2014-12-29 -0.0008295692 0.9296299 -0.07776571 0.007084297 -0.1377356 0.8038542
Each such element should not be a time-series any more. but a static one, with each stock representing it's coeffiecient values at time "t". In terms of size each element should have a number of rows equal to the number of Stocks in the original list.
EDIT as asked by josilber
> dput(list(Stock1=tail(testlist$Stock1, n = 3), Stock2=tail(testlist$Stock2, n = 3)))
structure(list(Stock1 = structure(c(0.000322317700198485, 0.000263131488679374,
0.000283730373928844, 1.01021497011709, 1.00235580055438, 1.00008407331697,
-0.0216484434660844, -0.023514378867335, -0.0161953614672028,
-0.332250031553704, -0.346538978804535, -0.349440052163927, 0.078195628743663,
0.0595439997647003, 0.0612143446991752, 1.1069343396633, 1.11850626745067,
1.12484530131584), class = c("xts", "zoo"), .indexCLASS = "Date", tclass = "Date", .indexTZ = "UTC", tzone = "UTC", index = structure(c(1419811200,
1419897600, 1419984000), tzone = "UTC", tclass = "Date"), .Dim = c(3L,
6L), .Dimnames = list(NULL, c("(Intercept)", "rmrf", "smb", "hml",
"rmw", "cma"))), Stock2 = structure(c(0.000330895099805035, 0.000305149500450527,
0.000261487411574969, 0.750381906747217, 0.740970893865186, 0.747809929767095,
-0.1967254672836, -0.189985607343021, -0.183307667378927, -0.10242615734439,
-0.0746176364711423, -0.0619736225998069, -0.226491384004977,
-0.224044849587752, -0.205661480898329, 0.832956994676299, 0.792188348360969,
0.755021100668421), class = c("xts", "zoo"), .indexCLASS = "Date", tclass = "Date", .indexTZ = "UTC", tzone = "UTC", index = structure(c(1419811200,
1419897600, 1419984000), tzone = "UTC", tclass = "Date"), .Dim = c(3L,
6L), .Dimnames = list(NULL, c("(Intercept)", "rmrf", "smb", "hml",
"rmw", "cma")))), .Names = c("Stock1", "Stock2"))
I am completely in the dark. I have looked at some functions that may come to use: lapply
/ also the merge
function seems to be suitable but it only works on 2 elements.
I will continue to update this post as i search for answers. If anyone has any leads or has done this before and can point in the right direction, thank you!
EDIT
#Flatten data add one more name and put into one data frame
all_coef_data<- do.call(rbind,Map(cbind,
Name=names(testlist),
Date=lapply(testlist,function(x) as.Date(as.POSIXct(c(attr(x,'index')),origin='1970-01-01'))),
lapply(testlist, as.data.frame)
))
A common denominator in each row that i need to get out is the Date. I now split the dataframe by Date using this. The output is a list.
out <- split(all_coef_data , f = all_coef_data$Date )
output:
> head(out$'2011-05-23', n=3)
Name Date (Intercept) rmrf smb hml rmw cma
Stock1.2011-05-23 Stock1 2011-05-23 -4.376389e-04 1.103582 -0.21747611 -0.1879211 -0.05849794 -0.1949192
Stock2.2011-05-23 Stock2 2011-05-23 1.115140e-04 1.198622 0.05422819 0.9998529 0.92141407 -0.8565260
Stock3.2011-05-23 Stock3 2011-05-23 5.457214e-05 1.303025 0.04705294 0.6897673 -0.19708983 -0.8247877
> tail(out$'2011-05-23', n=3)
Name Date (Intercept) rmrf smb hml rmw cma
Stock48.2011-05-23 Stock48 2011-05-23 0.0007354997 0.505054 0.1774544 -0.38934089 0.71775909 0.5189329
Stock49.2011-05-23 Stock49 2011-05-23 0.0004224351 1.304719 0.4511903 -0.64937062 -0.08872941 0.1545058
Stock50.2011-05-23 Stock50 2011-05-23 0.0003851261 1.020434 -0.1107910 -0.03964192 0.09526658 -0.4961902
Sounds like you want to
as.data.frame()
,cbind()
new columns Name
(from list component names) and Date
(from xts row names, which actually come from the index
attribute), and finallyrbind()
everything together into a single data.frame.do.call(rbind,Map(cbind,
Name=names(testlist),
Date=lapply(testlist,function(x) as.Date(as.POSIXct(c(attr(x,'index')),origin='1970-01-01'))),
lapply(testlist,as.data.frame)
));
## Name Date (Intercept) rmrf smb hml rmw cma
## Stock1.2014-12-29 Stock1 2014-12-29 0.0003223177 1.0102150 -0.02164844 -0.33225003 0.07819563 1.1069343
## Stock1.2014-12-30 Stock1 2014-12-30 0.0002631315 1.0023558 -0.02351438 -0.34653898 0.05954400 1.1185063
## Stock1.2014-12-31 Stock1 2014-12-31 0.0002837304 1.0000841 -0.01619536 -0.34944005 0.06121434 1.1248453
## Stock2.2014-12-29 Stock2 2014-12-29 0.0003308951 0.7503819 -0.19672547 -0.10242616 -0.22649138 0.8329570
## Stock2.2014-12-30 Stock2 2014-12-30 0.0003051495 0.7409709 -0.18998561 -0.07461764 -0.22404485 0.7921883
## Stock2.2014-12-31 Stock2 2014-12-31 0.0002614874 0.7478099 -0.18330767 -0.06197362 -0.20566148 0.7550211
If you don't like the new row names, you can wrap the line in `rownames<-`(...,NULL)
.