r time-series data-wrangling eye-tracking

Calculating time to first fixation, first fixation duration, and visit duration from series of fixations (eye tracking; R)

I hope everyone is doing well.

I am currently working with an eye-tracking dataset. I have processed the fixations using the R package "gazepath", which gave me an output of fixations at particular coordinates on an x/y plane.

My goal is to calculate first fixation duration, time to first fixation, and total visit duration for a series of areas of interest (AOIs), each of which corresponds to an x-y coordinate range for each trial.

For this study, I have two main Areas of Interest: eyes and mouth. For example, say the eyes were located from x1 = .200 to x2 = .300 and y1 = .500 to y2 = .600, and the face was located from x1 = .100 to x2 = .500 and y1 = .100 to y2 = .800.

So in the example below, for trial 1 looking at the face, it should output something like: time to first fixation = 1; first fixation duration = 250; total fixation duration = 2116.667

I would want to do this for each trial and each AOI. Help with creating a loop for a series of subject files and saving the output for each individual subject would also be greatly appreciated.

Thank you for your time and consideration! Take care, Caroline

df1 <- data.frame(Participant = c('A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A' ), 
Trial = c(1,1,1,1,2, 2,2,2,2,2), 
FixationDuration = c(250, 950, 250, 666.6666667, 216.6666667, 383.3333333, 433.3333333, 500, 383.3333333, 550),
StartTimeforFixation= c(1, 301, 1284, 1584, 2301, 2567, 3001, 3484, 4034, 4451), 
EndTimeforFixation = c(250, 1250, 1533, 2250, 2516, 2950, 3433, 3983, 4416, 5000),
mean_x = c(0.464453,  0.499141, 0.491302, 0.496063, 0.491435, 0.494063, 0.498093, 0.487845, 0.492093, 0.497614),
mean_y = c(0.638584, 0.515769, 0.604171, 0.685817, 0.546331, 0.70222,0.528106, 0.615643, 0.551993, 0.661424),
POGsdSacAmp = c(4.84E-05, 0.000103, 6.69E-05, 0.000111, 0.000118, 0.000108, 
7.15E-05, 7.31E-05, 6.76E-05, 7.10E-05),
RMS = c(6.61E-05, 0.000128, 7.89E-05, 8.27E-05, 0.000156, 0.000151, 7.85E-05, 6.91E-05,  8.86E-05, 9.17E-05))

Solution

Using dplyr, this can be achieved pretty easily by grouping.

library(tidyverse)
df1 <- tibble(Participant = c('A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A' ), 
                  Trial = c(1,1,1,1,2, 2,2,2,2,2), 
                  FixationDuration = c(250, 950, 250, 666.6666667, 216.6666667, 383.3333333, 433.3333333, 500, 383.3333333, 550),
                  StartTimeforFixation= c(1, 301, 1284, 1584, 2301, 2567, 3001, 3484, 4034, 4451), 
                  EndTimeforFixation = c(250, 1250, 1533, 2250, 2516, 2950, 3433, 3983, 4416, 5000),
                  mean_x = c(0.464453,  0.499141, 0.491302, 0.496063, 0.491435, 0.494063, 0.498093, 0.487845, 0.492093, 0.497614),
                  mean_y = c(0.638584, 0.515769, 0.604171, 0.685817, 0.546331, 0.70222,0.528106, 0.615643, 0.551993, 0.661424),
                  POGsdSacAmp = c(4.84E-05, 0.000103, 6.69E-05, 0.000111, 0.000118, 0.000108, 
                                  7.15E-05, 7.31E-05, 6.76E-05, 7.10E-05),
                  RMS = c(6.61E-05, 0.000128, 7.89E-05, 8.27E-05, 0.000156, 0.000151, 7.85E-05, 6.91E-05,  8.86E-05, 9.17E-05))

First, we need to compute the individual durations:

df1 %>%
  mutate(fix_time = EndTimeforFixation - StartTimeforFixation)
# A tibble: 10 x 10
#   Participant Trial FixationDuration StartTimeforFixat~ EndTimeforFixati~ mean_x mean_y POGsdSacAmp     RMS fix_time
#   <chr>       <dbl>            <dbl>              <dbl>             <dbl>  <dbl>  <dbl>       <dbl>   <dbl>    <dbl>
# 1 A               1             250                   1               250  0.464  0.639   0.0000484 6.61e-5      249
# 2 A               1             950                 301              1250  0.499  0.516   0.000103  1.28e-4      949
# 3 A               1             250                1284              1533  0.491  0.604   0.0000669 7.89e-5      249
# 4 A               1             667.               1584              2250  0.496  0.686   0.000111  8.27e-5      666
...

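Note that I get 249 ms for the first fixation where you had 250 ms. The start and end timestamps appear to be inclusive, so EndTimeforFixation - StartTimeforFixation comes out 1 ms short; adding 1, or using the existing FixationDuration column directly, gives values in line with yours.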

Then you can define groups; here the participant and the trial are the relevant ones. Within each group, you can then compute whatever statistic you want:

df1 %>%
  mutate(fix_time = EndTimeforFixation - StartTimeforFixation) %>%
  group_by(Participant, Trial) %>%
  summarize(tot_duration = sum(fix_time))
# A tibble: 2 x 3
# Groups:   Participant [1]
#  Participant Trial tot_duration
#  <chr>       <dbl>        <dbl>
#1 A               1         2113
#2 A               2         2460

Of course, in the summarize statement you can also compute mean(), var(), sd(), or anything else you're interested in.

Now, what should you do to compute the statistics only for fixations that fall inside a given area? You can use filter before computing:

df1 %>%
  mutate(fix_time = EndTimeforFixation - StartTimeforFixation,
         AOI_face = (mean_x >= .100 & mean_x <= .500 & mean_y >= .100 & mean_y <= .800),
         AOI_eyes = (mean_x >= .200 & mean_x <= .300 & mean_y >= .500 & mean_y <= .600)) %>%
  filter(AOI_face) %>%
  group_by(Participant, Trial) %>%
  summarize(tot_duration = sum(fix_time))
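
The same filter-then-group pattern can also give the three measures asked for in the question. The following is only a sketch under a few assumptions: it uses the existing FixationDuration column (the expected values 250 and 2116.667 in the question are based on it), it takes "time to first fixation" to be the raw StartTimeforFixation of the first fixation inside the AOI, and the names face_metrics, time_to_first_fixation, first_fixation_duration and total_fixation_duration are made up for illustration.

```r
library(dplyr)

# Example data from the question (only the columns needed here)
df1 <- data.frame(
  Participant = rep("A", 10),
  Trial = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  FixationDuration = c(250, 950, 250, 666.6666667, 216.6666667,
                       383.3333333, 433.3333333, 500, 383.3333333, 550),
  StartTimeforFixation = c(1, 301, 1284, 1584, 2301, 2567, 3001, 3484, 4034, 4451),
  mean_x = c(0.464453, 0.499141, 0.491302, 0.496063, 0.491435,
             0.494063, 0.498093, 0.487845, 0.492093, 0.497614),
  mean_y = c(0.638584, 0.515769, 0.604171, 0.685817, 0.546331,
             0.70222, 0.528106, 0.615643, 0.551993, 0.661424)
)

face_metrics <- df1 %>%
  # keep only fixations inside the face AOI (bounds taken from the question)
  filter(mean_x >= .100, mean_x <= .500, mean_y >= .100, mean_y <= .800) %>%
  # make sure "first" really means earliest in time within each trial
  arrange(Participant, Trial, StartTimeforFixation) %>%
  group_by(Participant, Trial) %>%
  summarize(
    time_to_first_fixation  = first(StartTimeforFixation),  # assumption: raw timestamp
    first_fixation_duration = first(FixationDuration),
    total_fixation_duration = sum(FixationDuration),
    .groups = "drop"
  )

print(face_metrics)
```

For trial 1 this reproduces the values from the question. In the example data the timestamps keep running across trials (trial 2 starts at 2301), so if time to first fixation should be relative to trial onset, the onset time would have to be subtracted; it is not part of the example data, so that step is left out here.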

Here I'm assuming one fixation can fall in several AOIs (for instance, the eyes lie inside the face). If you instead assign a single AOI to each fixation, you would want to create a single AOI column with values "face", "eyes", ..., and group_by(Participant, Trial, AOI) to compute the statistics for each.
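
One possible way to get every AOI in a single pass, while still letting a fixation count for several overlapping AOIs, is to reshape the logical AOI columns into long format and then group by the AOI name. This is a sketch rather than the only way to do it; it assumes tidyr is available, and the names aoi_metrics, in_aoi and the summary columns are illustrative.

```r
library(dplyr)
library(tidyr)

# Example data from the question (only the columns needed here)
df1 <- data.frame(
  Participant = rep("A", 10),
  Trial = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  FixationDuration = c(250, 950, 250, 666.6666667, 216.6666667,
                       383.3333333, 433.3333333, 500, 383.3333333, 550),
  StartTimeforFixation = c(1, 301, 1284, 1584, 2301, 2567, 3001, 3484, 4034, 4451),
  mean_x = c(0.464453, 0.499141, 0.491302, 0.496063, 0.491435,
             0.494063, 0.498093, 0.487845, 0.492093, 0.497614),
  mean_y = c(0.638584, 0.515769, 0.604171, 0.685817, 0.546331,
             0.70222, 0.528106, 0.615643, 0.551993, 0.661424)
)

aoi_metrics <- df1 %>%
  # one logical column per AOI (bounds taken from the question)
  mutate(
    face = mean_x >= .100 & mean_x <= .500 & mean_y >= .100 & mean_y <= .800,
    eyes = mean_x >= .200 & mean_x <= .300 & mean_y >= .500 & mean_y <= .600
  ) %>%
  # one row per fixation and AOI, so a fixation can belong to several AOIs
  pivot_longer(c(face, eyes), names_to = "AOI", values_to = "in_aoi") %>%
  filter(in_aoi) %>%
  arrange(Participant, Trial, AOI, StartTimeforFixation) %>%
  group_by(Participant, Trial, AOI) %>%
  summarize(
    time_to_first_fixation  = first(StartTimeforFixation),
    first_fixation_duration = first(FixationDuration),
    total_fixation_duration = sum(FixationDuration),
    .groups = "drop"
  )

print(aoi_metrics)
```

In the example data no fixation lands inside the eyes rectangle, so only "face" rows come back; an AOI with no fixations in a trial simply does not appear in the result.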

Finally, to save the results to disk, I would recommend write_csv() (from readr, which is loaded with the tidyverse).
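
For the loop over subject files, a minimal sketch could look like the following. Everything about the file layout is an assumption: the folder names, the "subject_A.csv" file name, the "metrics_" prefix and the helper summarise_face_aoi() are hypothetical, and the demo file is written only so that the example runs on its own.

```r
library(dplyr)
library(readr)

# Hypothetical helper: face-AOI metrics for one subject's fixation table
summarise_face_aoi <- function(fixations) {
  fixations %>%
    filter(mean_x >= .100, mean_x <= .500, mean_y >= .100, mean_y <= .800) %>%
    arrange(Participant, Trial, StartTimeforFixation) %>%
    group_by(Participant, Trial) %>%
    summarize(
      time_to_first_fixation  = first(StartTimeforFixation),
      first_fixation_duration = first(FixationDuration),
      total_fixation_duration = sum(FixationDuration),
      .groups = "drop"
    )
}

# Hypothetical folders; replace with the real input and output locations
in_dir  <- file.path(tempdir(), "fixations")
out_dir <- file.path(tempdir(), "aoi_metrics")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)

# Demo input (trial 1 from the question) so the sketch is self-contained
demo <- data.frame(
  Participant = rep("A", 4),
  Trial = c(1, 1, 1, 1),
  FixationDuration = c(250, 950, 250, 666.6666667),
  StartTimeforFixation = c(1, 301, 1284, 1584),
  mean_x = c(0.464453, 0.499141, 0.491302, 0.496063),
  mean_y = c(0.638584, 0.515769, 0.604171, 0.685817)
)
write_csv(demo, file.path(in_dir, "subject_A.csv"))

# Loop over all subject files and save one result file per subject
files <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)
for (f in files) {
  result <- summarise_face_aoi(read_csv(f, show_col_types = FALSE))
  write_csv(result, file.path(out_dir, paste0("metrics_", basename(f))))
}
```

Keeping the summary in its own function means the same code can be tested on one data frame first and then applied unchanged to every file.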