I'm stuck with dplyr (again!) and trying to solve my problem without dying in the attempt.
The first lines of my df look like this:
df <- structure(list(fecha = c(1990, 1990, 1990, 1990, 1990, 1990,
1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990), cientifico = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "Argentina sphyraena", class = "factor"),
dem_sect = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L), .Label = c("AB", "EP", "FE", "MF",
"PA"), class = "factor"), sector = c("EPb", "EPc", "EPc",
"EPb", "EPa", "EPa", "EPb", "EPc", "EPb", "EPb", "EPb", "EPb",
"EPb", "EPb", "EPa"), md_area = c(3010.44, 665.88, 665.88,
3010.44, 1273.65, 1273.65, 3010.44, 665.88, 3010.44, 3010.44,
3010.44, 3010.44, 3010.44, 3010.44, 1273.65), md_peso = c(1.42957605985037,
1.04499099099099, 1.04499099099099, 1.42957605985037, 1.24025925925926,
1.24025925925926, 1.42957605985037, 1.04499099099099, 1.42957605985037,
1.42957605985037, 1.42957605985037, 1.42957605985037, 1.42957605985037,
1.42957605985037, 1.24025925925926), dummy = c(4303.65295361596,
695.838601081081, 695.838601081081, 4303.65295361596, 1579.65620555556,
1579.65620555556, 4303.65295361596, 695.838601081081, 4303.65295361596,
4303.65295361596, 4303.65295361596, 4303.65295361596, 4303.65295361596,
4303.65295361596, 1579.65620555556)), row.names = c(NA, -15L
), class = "data.frame")
I'm trying to "translate" this into dplyr: sumsect <- tapply(md_peso * md_area, as.factor(substr(names(sector), 1, 2)), sum)
I have tried many approaches without success. In an attempt to solve the problem I added a column ("dem_sect") that holds the result of as.factor(substr(names(sector), 1, 2)), but that did not get me there either.
The desired output is a data frame with a new column, "sumsect", holding the same value on every row: in this case 6579.148, the sum of md_peso * md_area over the sectors (1579.6562 + 4303.6530 + 695.8386):
fecha cientifico dem_sect sector md_area md_peso dummy sumsect
1 1990 Argentina sphyraena EP EPb 3010.44 1.429576 4303.6530 6579.148
2 1990 Argentina sphyraena EP EPc 665.88 1.044991 695.8386 6579.148
3 1990 Argentina sphyraena EP EPc 665.88 1.044991 695.8386 6579.148
4 1990 Argentina sphyraena EP EPb 3010.44 1.429576 4303.6530 6579.148
5 1990 Argentina sphyraena EP EPa 1273.65 1.240259 1579.6562 6579.148
6 1990 Argentina sphyraena EP EPa 1273.65 1.240259 1579.6562 6579.148
7 1990 Argentina sphyraena EP EPb 3010.44 1.429576 4303.6530 6579.148
8 1990 Argentina sphyraena EP EPc 665.88 1.044991 695.8386 6579.148
9 1990 Argentina sphyraena EP EPb 3010.44 1.429576 4303.6530 6579.148
10 1990 Argentina sphyraena EP EPb 3010.44 1.429576 4303.6530 6579.148
11 1990 Argentina sphyraena EP EPb 3010.44 1.429576 4303.6530 6579.148
12 1990 Argentina sphyraena EP EPb 3010.44 1.429576 4303.6530 6579.148
13 1990 Argentina sphyraena EP EPb 3010.44 1.429576 4303.6530 6579.148
14 1990 Argentina sphyraena EP EPb 3010.44 1.429576 4303.6530 6579.148
15 1990 Argentina sphyraena EP EPa 1273.65 1.240259 1579.6562 6579.148
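For reference, the 6579.148 in the sumsect column can be reproduced in base R. This is only a sketch: the named vectors `area` and `peso` are hypothetical stand-ins (one entry per sector, values copied from the data above) for the named vectors that the original tapply() call appears to assume.

```r
# Hypothetical named vectors, one entry per sector (values taken from df above)
area <- c(EPa = 1273.65, EPb = 3010.44, EPc = 665.88)
peso <- c(EPa = 1.24025925925926, EPb = 1.42957605985037, EPc = 1.04499099099099)

# Per-sector product, then summed by the first two letters of the sector name
per_sector <- peso * area
sumsect <- tapply(per_sector, as.factor(substr(names(per_sector), 1, 2)), sum)

round(sumsect[["EP"]], 3)
```

Each sector contributes exactly once here, which is the behaviour the dplyr version has to reproduce even though df repeats every sector on several rows.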
Any hint will be more than welcome. Thanks in advance.
Update: As @Jahi Zamy's answer (+1) shows, this is also possible without any grouping (grouping is still useful if the real data set contains several groups that need separate sums):
library(dplyr)

df %>%
  mutate(sumsect = sum(unique(md_peso * md_area)))
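If the real data set does contain several groups, the same idea might be applied per group. The following is a minimal sketch on a made-up data frame `toy` (its values are illustrative, not taken from the question), grouping by dem_sect:

```r
library(dplyr)

# Hypothetical toy data: two dem_sect groups, sectors repeated on several rows
toy <- data.frame(
  dem_sect = c("EP", "EP", "EP", "AB", "AB"),
  sector   = c("EPa", "EPa", "EPb", "ABa", "ABb"),
  md_area  = c(10, 10, 20, 5, 8),
  md_peso  = c(2, 2, 3, 4, 1)
)

res <- toy %>%
  group_by(dem_sect) %>%
  # within each dem_sect, add up each distinct product once
  mutate(sumsect = sum(unique(md_peso * md_area))) %>%
  ungroup()

res
```

In this toy example the EP rows get 80 (20 + 60) and the AB rows get 28 (20 + 8).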
First answer:
You can do it this way with dplyr. The trick is to use group_by and then ungroup(), and to sum only the unique values. If you want the sum for specific groups, use group_by on the desired group instead of ungroup:
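df %>%
  group_by(sector) %>%
  mutate(y = md_peso * md_area) %>%   # per-row product (constant within a sector)
  ungroup() %>%
  # add up each distinct product once; .keep = "unused" drops the helper column y
  mutate(sumsect = sum(unique(y)), .keep = "unused")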
fecha cientifico dem_sect sector md_area md_peso dummy sumsect
<dbl> <fct> <fct> <chr> <dbl> <dbl> <dbl> <dbl>
1 1990 Argentina sphyraena EP EPb 3010. 1.43 4304. 6579.
2 1990 Argentina sphyraena EP EPc 666. 1.04 696. 6579.
3 1990 Argentina sphyraena EP EPc 666. 1.04 696. 6579.
4 1990 Argentina sphyraena EP EPb 3010. 1.43 4304. 6579.
5 1990 Argentina sphyraena EP EPa 1274. 1.24 1580. 6579.
6 1990 Argentina sphyraena EP EPa 1274. 1.24 1580. 6579.
7 1990 Argentina sphyraena EP EPb 3010. 1.43 4304. 6579.
8 1990 Argentina sphyraena EP EPc 666. 1.04 696. 6579.
9 1990 Argentina sphyraena EP EPb 3010. 1.43 4304. 6579.
10 1990 Argentina sphyraena EP EPb 3010. 1.43 4304. 6579.
11 1990 Argentina sphyraena EP EPb 3010. 1.43 4304. 6579.
12 1990 Argentina sphyraena EP EPb 3010. 1.43 4304. 6579.
13 1990 Argentina sphyraena EP EPb 3010. 1.43 4304. 6579.
14 1990 Argentina sphyraena EP EPb 3010. 1.43 4304. 6579.
15 1990 Argentina sphyraena EP EPa 1274. 1.24 1580. 6579.
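One caveat with sum(unique(...)): it assumes that no two different sectors ever produce exactly the same product, otherwise one of them would be dropped from the sum. A possible alternative, sketched here on a hypothetical data frame `toy2`, keeps the first row of each sector via duplicated() instead of relying on distinct values:

```r
library(dplyr)

# Hypothetical toy data: EPa and EPc happen to share the same product (20)
toy2 <- data.frame(
  dem_sect = c("EP", "EP", "EP", "EP"),
  sector   = c("EPa", "EPa", "EPb", "EPc"),
  md_area  = c(10, 10, 20, 5),
  md_peso  = c(2, 2, 3, 4)
)

res2 <- toy2 %>%
  mutate(y = md_peso * md_area) %>%
  group_by(dem_sect) %>%
  # count each sector once, even if two sectors have equal products
  mutate(sumsect = sum(y[!duplicated(sector)])) %>%
  ungroup() %>%
  select(-y)

# For comparison: summing unique products would miss one of the tied sectors
by_unique <- sum(unique(toy2$md_peso * toy2$md_area))
```

Here res2$sumsect is 100 (20 + 60 + 20), whereas by_unique is 80. For the data in the question both approaches give the same result, since the three sector products differ.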