ralternate

Is there a way in R or Excel to check that values alternate?


I have a data frame with 60000 rows. In one of the column, the values alternate between "landing" and "take off", as such.

Movements Flight number Destination
take-off 011 Paris
landing 011 Paris
take-off 053 Ibiza
landing 053 Ibiza
take-off 067 Mans
take-off 123 Geneva
landing 123 Geneva

But there are some mistakes, like here in row 5. In this instance, I have the take-off row, but not the landing one.

Is there any way in R or Excel to highlight or select those rows? To be specific: flight numbers are not unique and appear multiple times throughout it.

Thanks!

I tried this:

data() <- read.csv("Flight data.csv")

# Tried to find the positions where values don't alternate correctly
incorrect_positions <- which(data$event[-1] == data$event[-length(data$event)]) + 1

if (length(incorrect_positions) == 0) {
  print("The values in the 'event' column alternate correctly.\n")
} else {
  print("The values in the 'event' column do not alternate correctly at positions:", incorrect_positions, "\n")
}

It did not give me an error, but apparently doesn't correspond to what I want - it kept printing that all the values alternate correctly, when I know for a fact that they don't.


Solution

  • Using as.factor then as.integer to strip of the levels. differences should be -1 or 1; so taking abs will help to identify which are bad ones.

    > f <- \(x) c(1L, abs(diff(as.integer(as.factor(x)))))
    > f(dat$Movements)
    [1] 1 1 1 1 1 0 1
    > which(f(dat$Movements) != 1)
    [1] 6
    

    Data:

    > dput(dat)
    structure(list(Movements = c("take-off", "landing", "take-off", 
    "landing", "take-off", "take-off", "landing"), Flight.number = c(11L, 
    11L, 53L, 53L, 67L, 123L, 123L), Destination = c("Paris", "Paris", 
    "Ibiza", "Ibiza", "Mans", "Geneva", "Geneva")), class = "data.frame", row.names = c(NA, 
    -7L))