Write data frame to Elastic Search with @timestamp


I am exploring the elastic R package to write a data frame to Elasticsearch, using the docs_bulk function.

One of the columns in my data frame is @timestamp, which is in POSIXct format, but the field is saved in Elasticsearch as a string. How can I get the column saved as a date/time type?

I also tried manually creating the index mapping with the proper data type definition, but it didn't work.

Please suggest.

Version:

R: 3.3.1

Elastic Search - 2.4.1

OS - Redhat


Solution

  • elastic doesn't try to capture data types from the data.frame or list you pass to docs_bulk(). We could think about doing that, but I imagine R data types wouldn't map exactly to Elasticsearch types; I might play around with mapping data types. In the meantime, create the mapping yourself before loading the data. Here's how I'd do it:

    library('elastic')
    connect()
    

    Dummy data.frame

    df <- data.frame(
      date = as.POSIXct(seq(from = as.Date("2016-10-01"), 
                            to = as.Date("2016-10-31"), by = 'day')),
      num = 1:31
    )
    

    Create a mapping, either as a list or JSON string

    mapping <- list(
      world = list(properties = list(
        date = list(
          type = "date",
          format = "yyyy-MM-dd HH:mm:ss"  # MM = month of year; mm = minute of hour
        ),
        num = list(type = "long")
    )))
    

    Make the index

    index_create(index = "hello")
    

    Create the mapping in the index

    mapping_create(index = "hello", type = "world", body = mapping)
    

    Get the mapping

    mapping_get("hello")
    #> $hello
    #> $hello$mappings
    #> $hello$mappings$world
    #> $hello$mappings$world$properties
    #> $hello$mappings$world$properties$date
    #> $hello$mappings$world$properties$date$type
    #> [1] "date"
    #> 
    #> $hello$mappings$world$properties$date$format
    #> [1] "yyyy-MM-dd HH:mm:ss"
    #> 
    #> 
    #> $hello$mappings$world$properties$num
    #> $hello$mappings$world$properties$num$type
    #> [1] "long"
    

    Bulk load data.frame

    docs_bulk(df, index = "hello", type = "world")
    

    Search on the index

    Search("hello")
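For the @timestamp column from the question, the same approach should carry over. The following is a minimal sketch, not a tested recipe against a live cluster: it assumes the timestamps are converted to ISO 8601 strings in UTC before loading, and that the field is mapped with Elasticsearch's built-in `strict_date_optional_time` format. The sample values are hypothetical.

```r
# Hypothetical data frame with an "@timestamp" column.
# check.names = FALSE keeps the "@" in the column name.
df <- data.frame(
  `@timestamp` = as.POSIXct(c("2016-10-01 12:30:00", "2016-10-02 08:15:00"),
                            tz = "UTC"),
  num = 1:2,
  check.names = FALSE
)

# Convert POSIXct to ISO 8601 strings in UTC, so the text that is sent
# matches the date format declared in the mapping below.
df[["@timestamp"]] <- format(df[["@timestamp"]], "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

# Mapping that declares "@timestamp" as a date field.
# Backticks are needed because "@timestamp" is not a syntactic R name.
mapping <- list(
  world = list(properties = list(
    `@timestamp` = list(type = "date", format = "strict_date_optional_time"),
    num = list(type = "long")
  ))
)
```

With `df` and `mapping` built this way, the remaining steps are the same as above: index_create(), then mapping_create(index = "hello", type = "world", body = mapping), then docs_bulk(df, index = "hello", type = "world"). The mapping has to exist before the first document is indexed, otherwise Elasticsearch infers the field type from the data.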