An intuitive way to understand MARGIN in sweep and apply


I have always been confused by how MARGIN seems to mean two different things in sweep and apply. Consider:

m <- matrix(1:6, ncol = 2)
# The "- 1" operation is applied to all cells in each row
sweep(m, MARGIN = 1, 1, "-")
# The sum operation is applied to each column
apply(m, MARGIN = 1, sum)

Do you have a mnemonic device to understand this seemingly contradictory meaning of MARGIN?


Solution

  • The MARGIN argument means exactly the same thing in both functions: with MARGIN = 1, the operation is row-wise. I have been confused by sweep many times in the past, but I think in this case you are confused by apply.

    I am printing the matrix below so that it is easy to visually compare with apply and sweep later on:

    > m
         [,1] [,2]
    [1,]    1    4
    [2,]    2    5
    [3,]    3    6
    

    First of all, the sweep function performs a row-wise operation when MARGIN is 1. I will slightly change the third argument (from 1 to 1:3) so that this is more obvious:

    > sweep(m, MARGIN = 1, 1:3, "-")
         [,1] [,2]
    [1,]    0    3
    [2,]    0    3
    [3,]    0    3
    

    In the above case, 1 was subtracted from row 1, 2 from row 2 and 3 from row 3. So this is clearly a row-wise operation.
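
    To make the row-wise reading concrete, here is a minimal sketch that repeats the subtraction with an explicit loop over rows and compares it with the sweep result. The names by_row and manual are only illustrative:

    ```r
    m <- matrix(1:6, ncol = 2)

    # MARGIN = 1: STATS supplies one value per row (length 3)
    by_row <- sweep(m, MARGIN = 1, STATS = 1:3, FUN = "-")

    # The same subtraction written out one row at a time
    manual <- m
    for (i in seq_len(nrow(m))) {
      manual[i, ] <- m[i, ] - i
    }

    # TRUE if every cell agrees
    all(by_row == manual)
    ```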

    Now let's look at the apply function:

    > apply(m, MARGIN = 1, sum)
    [1] 5 7 9
    

    The matrix has 3 rows and 2 columns. It is easy to infer that this is also a row-wise operation, since we get 3 results, i.e. the same as the number of rows. Checking the numbers confirms this: row 1 sums to 5, row 2 to 7 and row 3 to 9.
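
    As a quick cross-check, the sketch below compares apply with MARGIN = 1 against base R's rowSums and against a hand-written pass over the rows. The variable names are only illustrative:

    ```r
    m <- matrix(1:6, ncol = 2)

    # One sum per row via apply
    via_apply <- apply(m, MARGIN = 1, sum)

    # The same sums via the dedicated base function
    via_rowsums <- rowSums(m)

    # The same sums computed by indexing each row explicitly
    via_index <- sapply(seq_len(nrow(m)), function(i) sum(m[i, ]))

    # Both comparisons are TRUE when all three approaches agree
    all(via_apply == via_rowsums)
    all(via_apply == via_index)
    ```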

    So, clearly MARGIN in both cases implies a row-wise operation.
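
    One possible mnemonic, offered as a suggestion rather than as the official definition: MARGIN names the dimension that gets one value each. In sweep that is the expected length of STATS, and in apply (with a function that returns a single number) it is the length of the result. The sketch below contrasts MARGIN = 1 with MARGIN = 2 on the same matrix; the col_stats values c(1, 4) are chosen arbitrarily for illustration:

    ```r
    m <- matrix(1:6, ncol = 2)   # 3 rows, 2 columns

    # MARGIN = 1: one value per row, in both functions
    row_stats  <- 1:3                         # length equals nrow(m)
    row_swept  <- sweep(m, MARGIN = 1, STATS = row_stats, FUN = "-")
    row_totals <- apply(m, MARGIN = 1, sum)   # length equals nrow(m)

    # MARGIN = 2: one value per column, in both functions
    col_stats  <- c(1, 4)                     # length equals ncol(m)
    col_swept  <- sweep(m, MARGIN = 2, STATS = col_stats, FUN = "-")
    col_totals <- apply(m, MARGIN = 2, sum)   # length equals ncol(m)

    # The result length follows dim(m)[MARGIN] in each case
    length(row_totals) == dim(m)[1]
    length(col_totals) == dim(m)[2]
    ```

    With MARGIN = 2, col_swept subtracts 1 from every cell in column 1 and 4 from every cell in column 2, and col_totals holds the two column sums.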