rpanel-data

Creating and transposing panel data in R


Currently I have the following data set for every country in the world from Afghanistan to Zimbabwe for the years 1996 until 2021 (full data set was difficult to show with a picture): Data 1

I would like to have the data in a panel data form as follows:

Country     Year    Central Government Debt
Afghanistan 1996    34,69009
Afghanistan ...     ...
Afghanistan 2021    value for 2021
....
Zimbabwe    1996    value for 1996
Zimbabwe    ...     ...
Zimbabwe    2021    value for 2021

So, I would like to have the variables Country, Year and Central Government Debt as the columns. Then all the countries (from Afghanistan to Zimbabwe as one could see in the picture) as rows with the corresponding values of the government debt from every year.

I hope that I have explained the problem clear enough.


Solution

  • You can use pivot_longer().

    For example, here is an example data similar to yours:

    your_data = structure(list(country = c("Country1", "Country2", "Country3", 
        "Country4", "Country5"), `1996` = c(43.5759781114757, 39.4892847444862, 
            31.1313527473249, 22.3277196078561, 13.8667897786945), `1997` = c(64.1469739656895, 
                39.3858637614176, 185.817262185737, 97.6506256405264, 81.9881041860208
            ), `1998` = c(53.3410886977799, 42.1991529292427, 46.6682229144499, 
                54.6986216097139, 34.4061564910226)), class = "data.frame", row.names = c(NA, 
                    -5L))
    your_data
    
    # country     1996      1997     1998
    # 1 Country1 43.57598  64.14697 53.34109
    # 2 Country2 39.48928  39.38586 42.19915
    # 3 Country3 31.13135 185.81726 46.66822
    # 4 Country4 22.32772  97.65063 54.69862
    # 5 Country5 13.86679  81.98810 34.40616
    

    Use tidyverse package:

    library(tidyverse)
    
    new_data = your_data |> 
               pivot_longer(cols = 2:4,  # These are column positions
                            names_to = "year", 
                            values_to = "central_government_debt")
    
    new_data
    # A tibble: 15 × 3
    # country  year  central_government_debt
    # <chr>    <chr>                   <dbl>
    #   1 Country1 1996                     43.6
    # 2 Country1 1997                     64.1
    # 3 Country1 1998                     53.3
    # 4 Country2 1996                     39.5
    # 5 Country2 1997                     39.4
    # 6 Country2 1998                     42.2
    # 7 Country3 1996                     31.1
    # 8 Country3 1997                    186. 
    # 9 Country3 1998                     46.7
    # 10 Country4 1996                     22.3
    # 11 Country4 1997                     97.7
    # 12 Country4 1998                     54.7
    # 13 Country5 1996                     13.9
    # 14 Country5 1997                     82.0
    # 15 Country5 1998                     34.4