Suppose the data is:
data <- structure(list(country = c("Poland", "Poland", "Poland", "Poland",
"Poland", "Poland", "Portugal", "Portugal", "Portugal", "Portugal",
"Portugal", "Portugal", "Spain", "Spain", "Spain", "Spain", "Spain",
"Spain"), Code = c("POL", "POL", "POL", "POL", "POL", "POL",
"PRT", "PRT", "PRT", "PRT", "PRT", "PRT", "ESP", "ESP", "ESP",
"ESP", "ESP", "ESP"), year = c(1950, 1951, 1952, 1953, 1954,
1955, 1950, 1951, 1952, 1953, 1954, 1955, 1950, 1951, 1952, 1953,
1954, 1955), IV = c(3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1)), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L), class = "data.frame")
How to difference IV values within each panel (country)? In other words, I want to substract IV values for the year of 1951 from 1950; 1952-1951; 1953-1952; 1954-1953; 1955-1954. And so for every country. In the outcome dataset (let's name it 'newdata'), IV value for each year has to show its difference with an IV value from a previous year; 1950 year should be just empty.
Any suggestions, guys? Hope my question is not confusing.
Using diff
in ave
perhaps.
> transform(data, dif=ave(IV, Code, FUN=\(x) c(NA, diff(x))))
country Code year IV dif
1 Poland POL 1950 3 NA
2 Poland POL 1951 3 0
3 Poland POL 1952 1 -2
4 Poland POL 1953 3 2
5 Poland POL 1954 3 0
6 Poland POL 1955 1 -2
7 Portugal PRT 1950 1 NA
8 Portugal PRT 1951 1 0
9 Portugal PRT 1952 1 0
10 Portugal PRT 1953 1 0
11 Portugal PRT 1954 1 0
12 Portugal PRT 1955 1 0
13 Spain ESP 1950 1 NA
14 Spain ESP 1951 1 0
15 Spain ESP 1952 3 2
16 Spain ESP 1953 1 -2
17 Spain ESP 1954 3 2
18 Spain ESP 1955 1 -2
To eliminate 1950 thereafter, pipe in a subset
.
> transform(data, dif=ave(IV, Code, FUN=\(x) c(NA, diff(x)))) |>
+ subset(year != 1950)
country Code year IV dif
2 Poland POL 1951 3 0
3 Poland POL 1952 1 -2
4 Poland POL 1953 3 2
5 Poland POL 1954 3 0
6 Poland POL 1955 1 -2
8 Portugal PRT 1951 1 0
9 Portugal PRT 1952 1 0
10 Portugal PRT 1953 1 0
11 Portugal PRT 1954 1 0
12 Portugal PRT 1955 1 0
14 Spain ESP 1951 1 0
15 Spain ESP 1952 3 2
16 Spain ESP 1953 1 -2
17 Spain ESP 1954 3 2
18 Spain ESP 1955 1 -2
Data:
set.seed(42)
data$IV <- sample(data$IV) ## to better demonstrate the result