r, standard-deviation, volatility

Compute standard deviation by sector in R


I'm new to R, and I'm sorry if this question has already been answered. This is an example of my dataset:

idnumber SIC(1-digit) Year   Ebit
198       A           2019   2344
196       A           2019   6383
374       A           2019   5628
281       A           2019   2672
274       A           2018   2792
196       A           2018   3802
374       A           2018   3892
468       B           2019   6372
389       B           2019   3829
493       C           2019   2718
928       C           2019   2628
278       C           2019   3672

I want to compute the standard deviation of "Ebit" for each industrial sector "SIC(1-digit)". The goal is a volatility measure of operating income ("Ebit") by industry.

Thanks in advance for your kind answer.


Solution

  • Let's load your data to reproduce your example:

    dat <- data.frame(
      idnumber = c(198, 196, 374, 281, 274, 196, 374, 468, 389, 493, 928, 278),
      `SIC(1-digit)` =  c('A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'),
      Year = c(2019, 2019, 2019, 2019, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019),
      Ebit = c(2344, 6383, 5628, 2672, 2792, 3802, 3892, 6372, 3829, 2718, 2628, 3672),
      check.names = FALSE
    )
    

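    Note that SIC(1-digit) is wrapped in backticks and that check.names = FALSE is passed to data.frame(). Both are needed because the column name contains the characters (, ) and -, which are not allowed in a syntactic R name. Without check.names = FALSE, data.frame() would replace each of those characters with a dot; you can read more about this in ?make.names and ?Quotes.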
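    To make the effect of check.names concrete, here is a minimal sketch using only base R. The object names with_check and without_check are made up for this illustration and are not part of your data:

```r
# Default behaviour: data.frame() passes column names through make.names(),
# which replaces characters that are invalid in a syntactic name with "."
with_check <- data.frame(`SIC(1-digit)` = c("A", "B"))
names(with_check)

# check.names = FALSE keeps the column name exactly as written
without_check <- data.frame(`SIC(1-digit)` = c("A", "B"), check.names = FALSE)
names(without_check)

# A non-syntactic name must then be quoted with backticks after $,
# or passed as a string to [[ ]]
without_check$`SIC(1-digit)`
without_check[["SIC(1-digit)"]]
```

    If you would rather avoid backticks altogether, renaming the column to something like SIC1 right after loading should work just as well.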


    Once your data is loaded, you can use dplyr:

    library(dplyr)
    
    dat %>% 
      group_by(`SIC(1-digit)`) %>%
      summarise(standard_deviation = sd(Ebit))
    
    
    # A tibble: 3 x 2
      `SIC(1-digit)` standard_deviation
    * <chr>                       <dbl>
    1 A                           1544.
    2 B                           1798.
    3 C                            579.
    

    Or data.table:

    library(data.table)
    
    setDT(dat)
    dat[, .(standard_deviation = sd(Ebit)), by = `SIC(1-digit)`]
    
      
       SIC(1-digit) standard_deviation
    1:            A          1544.4116
    2:            B          1798.1725
    3:            C           578.5257
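
    If you prefer not to load any package, the same grouping can probably be done with base R's tapply(). This is only a sketch: dat is rebuilt here with the two relevant columns so the snippet runs on its own, and the names sd_by_sector, n_by_sector and result are my own choices for illustration.

```r
# Rebuild the example data (only the columns needed here)
dat <- data.frame(
  `SIC(1-digit)` = c("A", "A", "A", "A", "A", "A", "A", "B", "B", "C", "C", "C"),
  Ebit = c(2344, 6383, 5628, 2672, 2792, 3802, 3892, 6372, 3829, 2718, 2628, 3672),
  check.names = FALSE
)

# tapply() splits Ebit by sector and applies sd() to each group;
# the result is indexed by the sector codes
sd_by_sector <- tapply(dat$Ebit, dat[["SIC(1-digit)"]], sd)
sd_by_sector

# Count the observations per sector, since a standard deviation
# based on very few rows is not very informative
n_by_sector <- tapply(dat$Ebit, dat[["SIC(1-digit)"]], length)

# Collect both into one data frame
result <- data.frame(
  sector = names(sd_by_sector),
  n = as.vector(n_by_sector),
  standard_deviation = as.vector(sd_by_sector)
)
result
```

    Keep in mind that sd() computes the sample standard deviation (denominator n - 1), and that sector B has only two observations in this example, so its value should be read with caution.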