My dataset, named ds, is a matrix with three columns and 4000+ observations. The three columns in ds are:
name v2 f1
I want to find the position of the minimum of v2 for level x of the factor f1. I tried to use tapply as follows:
tapply(ds$v2, ds$f1 == x, which.min)
The answer I get is something like this:
FALSE TRUE
2821 19
I presumed that 19 is the absolute position in my dataset, so that to find the name of the observation all I need to do is:
ds[19, 1]
But apparently that is incorrect. As far as I can tell, 19 is the relative position, i.e. the 19th observation among those where f1 equals x.
So my question is: how can I find the absolute position of the minimum value of v2 for factor x?
tapply applies the function to each unique value of its second argument, so you shouldn't pass ds$f1 == x but probably just ds$f1, so that it looks like:
tapply(ds$v2, ds$f1, which.min)
Here is an example with the iris data set that comes with R:
tapply(iris$Sepal.Length, iris$Species, which.min)
EDIT:
However, as you noted, this will give you the position within the subsetted data and not the absolute position.
I don't think it's possible to get the absolute position from tapply used this way, because the function only ever sees a single vector of values for each group. If you want to work with multiple columns at once, you can use this kind of approach:
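# Split the data frame into one data frame per species; row names are preserved
d <- split(iris, iris$Species)
# For each piece, take the row name of the row with the smallest Sepal.Length
row_positions <- sapply(d, function(x) rownames(x[which.min(x$Sepal.Length), ]))
# Look the rows up in the original data by row name
iris[row_positions, ]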
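If you only need the answer for one level, as in the question, a shorter route is to map the relative position back through which(). The sketch below is only an illustration: the small ds data frame and its values are made up to stand in for your data, and "x" is assumed to be a level stored as a character string. The last part shows one possible workaround if you would rather stay with tapply: hand it row indices instead of values, so the function can return an absolute position.

```r
# Toy stand-in for the asker's data; names and values are invented for illustration.
ds <- data.frame(
  name = c("a", "b", "c", "d", "e", "f"),
  v2   = c(5, 3, 9, 1, 4, 2),
  f1   = c("x", "y", "x", "y", "x", "x"),
  stringsAsFactors = FALSE
)

# Absolute row positions of the observations at level "x"
idx <- which(ds$f1 == "x")

# which.min() returns a position within idx; indexing idx with it maps back
# to the absolute row position in ds
abs_pos <- idx[which.min(ds$v2[idx])]
ds[abs_pos, "name"]   # "f"

# Same idea for every level at once: tapply() over row indices rather than values
abs_by_group <- tapply(seq_len(nrow(iris)), iris$Species,
                       function(i) i[which.min(iris$Sepal.Length[i])])
iris[as.vector(abs_by_group), ]
```

Unlike the rownames approach, this does not rely on the row names matching the row numbers, which matters if the data has been subsetted or reordered beforehand.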