
Why use loc in Pandas?


Why do we use loc for pandas DataFrames? It seems the following code works either way, and runs at a similar speed with or without loc:

%timeit df_user1 = df.loc[df.user_id=='5561']

100 loops, best of 3: 11.9 ms per loop

or

%timeit df_user1_noloc = df[df.user_id=='5561']

100 loops, best of 3: 12 ms per loop
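To reproduce the comparison outside IPython, here is a minimal sketch. The original df is not shown, so the small DataFrame below is a hypothetical stand-in (its column names and values are assumptions); it only checks that both forms select the same rows.

```python
import pandas as pd

# Hypothetical stand-in for the question's df, which is not shown.
df = pd.DataFrame({
    "user_id": ["5561", "1234", "5561", "9999"],
    "time": [10, 20, 30, 40],
})

# Boolean mask used by both selection styles.
mask = df.user_id == "5561"

df_user1 = df.loc[mask]      # row selection via loc
df_user1_noloc = df[mask]    # row selection via plain __getitem__

# True when both results hold the same rows, columns and values.
print(df_user1.equals(df_user1_noloc))
```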

So why use loc?

Edit: This has been flagged as a duplicate question. But although "pandas iloc vs ix vs loc explanation?" does mention that

you can do column retrieval just by using the data frame's __getitem__:

df['time']    # equivalent to df.loc[:, 'time']

it does not say why we use loc, even though it explains many of loc's features. My specific question is: why not just omit loc altogether? I have accepted a very detailed answer to this question below.
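The column-retrieval equivalence quoted above can be checked with a short sketch. The DataFrame here is again a hypothetical example, not data from the original post.

```python
import pandas as pd

# Hypothetical example data; only the 'time' column name comes from the quote.
df = pd.DataFrame({
    "user_id": ["5561", "1234"],
    "time": [10, 20],
})

by_getitem = df["time"]      # plain __getitem__ column access
by_loc = df.loc[:, "time"]   # all rows, the 'time' column, via loc

# Both return a Series with the same index and values.
print(by_getitem.equals(by_loc))
```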

Also, in the post above, the answer (which I do not think is really an answer) is well hidden in the discussion. Anyone searching for what I was looking for would find it hard to locate there, and would be much better served by the answer to my question here.


Solution