vaex is a library similar to pandas that provides a DataFrame class. I'm looking for a way to access a specific cell by row and column.
For example:
import vaex
df = vaex.from_dict({'a': [1,2,3], 'b': [4,5,6]})
df.a[0] # this works in pandas but not in vaex
In this specific case you could do df.a.values[0], but if this were a virtual column, it would lead to the whole column being evaluated. What would be faster (say, with more than 1 billion rows and a virtual column) is:
df['r'] = df.a + df.b
df.evaluate('r', i1=2, i2=3)[0]
This evaluates the virtual column/expression r from row 2 up to (but not including) row 3, which gives an array of length 1, and then takes its first element.
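If you need this in more than one place, the evaluate call above can be wrapped in a small helper. This is only a sketch: get_cell is a hypothetical name, not part of vaex, and it assumes nothing beyond len(df) and the df.evaluate(expression, i1=..., i2=...) call already shown.

```python
def get_cell(df, expression, row):
    """Return the value of `expression` at a single row of a vaex DataFrame.

    Hypothetical helper, not part of vaex: it only wraps the
    df.evaluate(expression, i1=..., i2=...) call shown above.
    """
    n = len(df)
    if not 0 <= row < n:
        raise IndexError(f"row {row} is out of range for {n} rows")
    # Evaluate only the half-open range [row, row + 1), so a virtual
    # column is computed for one row instead of the whole column.
    return df.evaluate(expression, i1=row, i2=row + 1)[0]
```

With the DataFrame from the example, get_cell(df, 'r', 2) would make the same call as df.evaluate('r', i1=2, i2=3)[0]. Negative indices are rejected rather than wrapped around, which is a choice made for this sketch only.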
This is rather clunky, and there is an issue open on this: https://github.com/vaexio/vaex/issues/238
Maybe you are surprised that vaex does not have something as 'basic' as this, but vaex is often used for really large datasets, where you don't access individual rows that often, so we don't run into this a lot.