In R (thanks to magrittr) you can now perform operations with a more functional piping syntax via %>%. This means that instead of coding this:
> as.Date("2014-01-01")
> as.character(sqrt(12)^2)
You could also do this:
> "2014-01-01" %>% as.Date
> 12 %>% sqrt %>% .^2 %>% as.character
To me this is more readable, and it extends to use cases beyond the data frame. Does Python have support for something similar?
Pipes are a new feature in Pandas 0.16.2, available as the DataFrame.pipe method.
Example:
import pandas as pd
from sklearn.datasets import load_iris
x = load_iris()
x = pd.DataFrame(x.data, columns=x.feature_names)
def remove_units(df):
    # Strip the " (cm)" suffix from every column name (mutates df).
    df.columns = [name.replace(" (cm)", "") for name in df.columns]
    return df
def length_times_width(df):
    # Add the two product columns to df in place; nothing is returned.
    df['sepal length*width'] = df['sepal length'] * df['sepal width']
    df['petal length*width'] = df['petal length'] * df['petal width']
x.pipe(remove_units).pipe(length_times_width)
x
NB: The Pandas version retains Python's reference semantics. That's why length_times_width doesn't need a return value; it modifies x in place. Because it returns nothing, it has to be the last step in the chain.
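If you would rather not rely on in-place mutation, each step can return a new DataFrame instead, so the steps can be chained in any order and the original stays untouched. The following is a minimal sketch under that assumption; strip_suffix and add_product are hypothetical helper names, and the two-row frame is made-up illustration data rather than the iris set. It also shows that pipe forwards extra positional arguments to the function.

```python
import pandas as pd

def strip_suffix(df, suffix):
    """Return a copy of df with `suffix` removed from every column name."""
    return df.rename(columns=lambda name: name.replace(suffix, ""))

def add_product(df, left, right):
    """Return a copy of df with a new column holding df[left] * df[right]."""
    return df.assign(**{f"{left}*{right}": df[left] * df[right]})

# Made-up illustration data (hypothetical values, not the iris dataset).
raw = pd.DataFrame({
    "sepal length (cm)": [5.1, 4.9],
    "sepal width (cm)": [3.5, 3.0],
})

# Each step returns a DataFrame, so the chain can continue after any step.
result = (
    raw.pipe(strip_suffix, " (cm)")
       .pipe(add_product, "sepal length", "sepal width")
)
```

Here raw keeps its original column names, and result carries the renamed columns plus the new product column.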