I find the glimpse() function in R/dplyr very useful. But as someone who is used to R and is now working with Python, I haven't found anything as useful for pandas DataFrames.
In Python, I've tried .describe(), .info() and .head(), but none of these gives me the compact snapshot that R's glimpse() provides.
Nice features which I'm quite accustomed to having in glimpse() include:
Here is some simple code you could work with:
R
library(dplyr)
test <- data.frame(column_one = c("A", "B", "C", "D"),
                   column_two = c(1:4))
glimpse(test)
# The output is as follows
Rows: 4
Columns: 2
$ column_one <chr> "A", "B", "C", "D"
$ column_two <int> 1, 2, 3, 4
Python
import pandas as pd
test = pd.DataFrame({'column_one': ['A', 'B', 'C', 'D'],
                     'column_two': [1, 2, 3, 4]})
Is there a single Python function that mirrors these capabilities closely (not several functions combined, and not a partial match)? If not, how would you write a function that does the job precisely?
Here is one way to do it:
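def glimpse(df):
    # Print the dimensions first, as R's glimpse() does.
    print(f"Rows: {df.shape[0]}")
    print(f"Columns: {df.shape[1]}")
    # One line per column: name, dtype, and the first few values.
    for col in df.columns:
        print(f"$ {col} <{df[col].dtype}> {df[col].head().values}")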
Then:
import pandas as pd
df = pd.DataFrame(
{"column_one": ["A", "B", "C", "D"], "column_two": [1, 2, 3, 4]}
)
glimpse(df)
# Output
Rows: 4
Columns: 2
$ column_one <object> ['A' 'B' 'C' 'D']
$ column_two <int64> [1 2 3 4]
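If you want the output to look a bit closer to R's (comma-separated values, aligned column names, long lines cut to a fixed width), here is a rough sketch of a variant. The name `glimpse_str`, the `width` parameter and the 20-value preview limit are my own choices for illustration, not part of pandas or dplyr, and the dtype shown for string columns depends on your pandas version.

```python
import pandas as pd


def glimpse_str(df: pd.DataFrame, width: int = 80) -> str:
    """Return a glimpse-style summary of df as a string (hypothetical helper)."""
    lines = [f"Rows: {df.shape[0]}", f"Columns: {df.shape[1]}"]
    # Pad column names so the dtype tags line up, as in R's output.
    name_width = max((len(str(col)) for col in df.columns), default=0)
    # df.items() yields (label, Series) pairs, so duplicate labels still work.
    for col, series in df.items():
        prefix = f"$ {str(col):<{name_width}} <{series.dtype}> "
        # repr() quotes strings, roughly like glimpse() does for <chr> columns.
        # The 20-value preview limit is an arbitrary assumption.
        values = ", ".join(repr(v) for v in series.head(20).tolist())
        line = prefix + values
        # Cut long rows to the requested width and mark the truncation.
        if len(line) > width:
            line = line[: width - 3] + "..."
        lines.append(line)
    return "\n".join(lines)


df = pd.DataFrame(
    {"column_one": ["A", "B", "C", "D"], "column_two": [1, 2, 3, 4]}
)
print(glimpse_str(df))
```

Returning a string instead of printing makes it easier to log or test. Note that Python's repr() uses single quotes where R shows double quotes, so this is close to glimpse() but not identical.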