r adjustment

Majority Adjustment/Matching for Ground Truth


I am building a ground truth for my machine learning algorithm in R. Five people classified pictures for me; see the data frame in the picture below.

![Dataframe](https://stackoverflow.com/Dataframe.PNG)

The results seem to depend quite strongly on the classifier. What I want to do now is find a common ground truth, so I want to adjust by majority in most cases: if at least 3 of the 5 classifiers assigned the same condition to a picture, the remaining classifiers that differ should be switched to this majority condition. In the rare case where there is no majority condition, I want to recheck the picture and make the decision myself.

Below is a reproducible example:

data <- data.frame(Pic = character(10), class1 = numeric(10), class2 = numeric(10), class3 = numeric(10), class4 = numeric(10), 
                   class5 = numeric(10), check = numeric(10))

data$Pic <- 1:10
set.seed(1234)
data$class1 <- sample(1:5,5)
data$class2 <- sample(1:5,5)
data$class3 <- sample(1:5,5)
data$class4 <- sample(1:5,5)
data$class5 <- sample(1:5,5)

data[5,2:6] <- 5
data[1,5] <- 4
data[9,5] <- 2

data$check <- ifelse(data$class1 == data$class2 & data$class1 == data$class3 &
                     data$class1 == data$class4 & data$class1 == data$class5, "Good", "delta")

Any help is warmly welcome.


Solution

  • I found the answer myself; this does the trick:

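    library(data.table)
    # For each group in `Variable`, tabulate all values in the other columns (.SD)
    # and store the most frequent value and how often it occurs
    setDT(df)[, c("Most_Frequent", "Count") := {tbl <- table(unlist(.SD))
    .(names(tbl)[which.max(tbl)], max(tbl))}, by = Variable]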
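
For readers who want the full workflow from the question (take the majority, overwrite the dissenting ratings, and flag pictures without a majority for manual review), here is a minimal base R sketch. It is not the code from the answer above. The toy data frame `ratings` and the added columns `majority`, `n_agree` and `needs_review` are made up for illustration; only `Pic` and `class1` to `class5` follow the question's layout. Note that `which.max()` returns the first label on a tie, so the count must still be checked against the "at least 3 of 5" rule.

```r
# Hypothetical toy ratings: 4 pictures, 5 raters (not the asker's real data)
ratings <- data.frame(
  Pic    = 1:4,
  class1 = c(1, 2, 5, 1),
  class2 = c(1, 2, 5, 2),
  class3 = c(1, 3, 5, 3),
  class4 = c(2, 3, 5, 4),
  class5 = c(3, 4, 5, 4)
)
rater_cols <- paste0("class", 1:5)
rows <- seq_len(nrow(ratings))

# Most frequent label per picture (first label wins on a tie)
ratings$majority <- vapply(rows, function(i) {
  tbl <- table(unlist(ratings[i, rater_cols]))
  as.numeric(names(tbl)[which.max(tbl)])
}, numeric(1))

# Number of raters who chose that label
ratings$n_agree <- vapply(rows, function(i) {
  as.integer(max(table(unlist(ratings[i, rater_cols]))))
}, integer(1))

# Fewer than 3 of 5 agreeing means no majority: recheck by hand
ratings$needs_review <- ratings$n_agree < 3

# Switch dissenting ratings to the majority label only where a majority exists
has_majority <- !ratings$needs_review
for (col in rater_cols) {
  ratings[[col]][has_majority] <- ratings$majority[has_majority]
}

print(ratings)
```

Rows with `needs_review == TRUE` keep their original ratings, so they can be inspected and decided manually as described in the question.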