i have a data frame with the following columns
Vendor | Product | Price |
---|---|---|
101 | 1001 | 1.50 |
101 | 1001 | 1.50 |
101 | 1001 | 1.80 |
102 | 1001 | 2.10 |
102 | 1001 | 2.10 |
101 | 1002 | 10.10 |
101 | 1002 | 10.10 |
101 | 1002 | 10.10 |
103 | 1002 | 10.10 |
104 | 1002 | 20.10 |
104 | 1002 | 20.10 |
104 | 1002 | 10.10 |
i need to count how many times each price shows up for each product by each vendor and create a new dataframe df2 like the following:
Product | Vendor | Price | count of Price |
---|---|---|---|
1001 | 101 | 1.5 | 2 |
1001 | 101 | 1.8 | 1 |
1001 | 102 | 2.1 | 2 |
1002 | 101 | 10.10 | 3 |
1002 | 103 | 10.10 | 1 |
1002 | 104 | 10.10 | 1 |
1002 | 104 | 20.10 | 2 |
i am new in R programming and everything i tried so far is not working. Any help would be greatly appreciated
Generating fake data:
vendor <- sample(x = 1:5,size = 100,replace = TRUE)+100
product <- sample(x = 1:8,size = 100,replace = TRUE)+1000
price <- (product-1000) + sample(x = -8:8,size = 100,replace = TRUE)/10
df <- data.frame(vendor,product,price)
Solution with dplyr :
library(dplyr)
df %>% group_by(vendor,product,price) %>% count %>% ungroup
Result :
# A tibble: 88 x 4
vendor product price n
<dbl> <dbl> <dbl> <int>
1 101 1001 0.4 1
2 101 1001 0.8 1
3 101 1001 1 1
4 101 1001 1.5 1
5 101 1001 1.7 2
6 101 1002 1.2 1
7 101 1002 1.4 2
8 101 1004 3.5 1
9 101 1004 3.8 1
10 101 1004 3.9 2