[SOLVED] Equivalent for R / dplyr's glimpse() function in Python for pandas DataFrames?

I find the glimpse() function very useful in R/dplyr. But as someone who is used to R and is now working with Python, I haven't found anything as useful for pandas DataFrames.

In Python, I've tried .describe(), .info(), and .head(), but none of these gives me the useful snapshot that R's glimpse() does.

Nice features which I'm quite accustomed to having in glimpse() include:

All variables/column names as rows in the output
All variable/column data types
The first few observations of each column
Total number of observations
Total number of variables/columns

Here is some simple code you could work with:

library(dplyr)

test <- data.frame(column_one = c("A", "B", "C", "D"),
           column_two = c(1:4))

glimpse(test)

# The output is as follows

Rows: 4
Columns: 2
$ column_one <chr> "A", "B", "C", "D"
$ column_two <int> 1, 2, 3, 4

Python

import pandas as pd

test = pd.DataFrame({'column_one':['A', 'B', 'C', 'D'],
                     'column_two':[1, 2, 3, 4]})

Is there a single Python function that mirrors these capabilities closely (not several functions combined, and not just some of the features)? If not, how would you write a function that does the job precisely?

Solution

Here is one way to do it:

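def glimpse(df):
    """Print a compact, column-wise overview of a DataFrame."""
    # df.shape is (number of rows, number of columns)
    print(f"Rows: {df.shape[0]}")
    print(f"Columns: {df.shape[1]}")
    # One line per column: name, dtype, and the first five values
    for col in df.columns:
        print(f"$ {col} <{df[col].dtype}> {df[col].head().values}")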

Then:

import pandas as pd

df = pd.DataFrame(
    {"column_one": ["A", "B", "C", "D"], "column_two": [1, 2, 3, 4]}
)

glimpse(df)

# Output
Rows: 4
Columns: 2
$ column_one <object> ['A' 'B' 'C' 'D']
$ column_two <int64> [1 2 3 4]
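
If you want the output to look a little closer to R's, here is a minimal sketch of a variant. It pads the column names so the dtype tags line up, prints the values comma-separated instead of as a NumPy array, and truncates long lines. The name glimpse_aligned and the n and width parameters are my own choices for illustration, not part of pandas, and the dtype label shown for text columns depends on your pandas version.

```python
import pandas as pd


def glimpse_aligned(df: pd.DataFrame, n: int = 5, width: int = 80) -> str:
    """Return a glimpse()-style summary as a string (hypothetical helper)."""
    lines = [f"Rows: {df.shape[0]}", f"Columns: {df.shape[1]}"]
    # Pad column names to a common width so the dtype tags line up.
    name_width = max((len(str(c)) for c in df.columns), default=0)
    for i, col in enumerate(df.columns):
        # Select by position so duplicate column names still work.
        series = df.iloc[:, i]
        # repr() keeps quotes around strings, so text and numbers differ visibly.
        values = ", ".join(repr(v) for v in series.head(n).tolist())
        line = f"$ {str(col):<{name_width}} <{series.dtype}> {values}"
        # Cut long lines so wide frames stay readable in a terminal.
        if len(line) > width:
            line = line[: width - 3] + "..."
        lines.append(line)
    return "\n".join(lines)


df = pd.DataFrame(
    {"column_one": ["A", "B", "C", "D"], "column_two": [1, 2, 3, 4]}
)
print(glimpse_aligned(df))
```

Returning a string rather than printing directly makes the helper easier to log or test; wrap the call in print() as above for interactive use.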