I would like to print out the name of a variable that is supplied when a function is used in Python.
The goal is to use the supplied name in some printed text.
There are multiple questions dealing with similar topics, but I am unable to find an exact solution or approach.
I am running Python 3 on Windows.
(Python version: 3.12.3 (tags/v3.12.3:f6650f9, Apr 9 2024, 14:05:25) [MSC v.1938 64 bit (AMD64)])
Previous questions and answers:
How to print original variables name
Example:
import pandas as pd
data_ok = {'col_a': [1, 1, 1, 1],
           'col_b': [2, 2, 2, 2],
           'col_c': [3, 3, 3, 3]}
data_no = {'col_a': [1, 1, 1, 1],
           'col_b': [2, 2, 2, 2],
           'col_d': [4, 4, 4, 4]}
df_ok = pd.DataFrame(data_ok)
df_no = pd.DataFrame(data_no)
print(df_ok)
print(df_no)
def df_check(df_in):
    need_cols = ['col_a', 'col_b', 'col_c']
    have_cols = df_in.columns.tolist()
    check_cols = all(e in have_cols for e in need_cols)
    assert check_cols == True, f"---------- Import Error - Check dataframe {df_in} columns in file. The columns must include : {need_cols} ----------"
    if check_cols == True:
        print("\n\n", "-"*80, '(required columns found!' , need_cols)
        print("\n\n", "-" * 80, '(continuing with analysis)' )
df_check(df_ok)
df_check(df_no)
-------------------------------------------------------------------------------- (required columns found! ['col_a', 'col_b', 'col_c'] )
-------------------------------------------------------------------------------- (continuing with analysis)
#...
assert check_cols == True, f"---------- Import Error - Check dataframe {df_in} columns in file. The columns must include : {need_cols} ----------"
^^^^^^^^^^^^^^^^^^
AssertionError: ---------- Import Error - Check dataframe col_a col_b col_d
0 1 2 4
1 1 2 4
2 1 2 4
3 1 2 4 columns in file. The columns must include : ['col_a', 'col_b', 'col_c'] ----------
Goal:
Is there a way to have the printed message report the name of the provided dataframe as opposed to printing the data frame itself?
AssertionError: ---------- Import Error - Check dataframe "df_no" columns in file. The columns must include : ['col_a', 'col_b', 'col_c'] ----------
You can do some checking against the globals() builtin to find the global name whose object has the same id as the object passed into df_check.
import pandas as pd
... # code omitted for brevity
df_ok = pd.DataFrame()
df_no = pd.DataFrame()
def df_check(df_in) -> None:
    ...  # insert other code here
    df_name = (
        [k for k, v in globals().items() if  # get all names defined in the global scope
         isinstance(v, pd.DataFrame) and  # check if the object is a DataFrame
         id(v) == id(df_in)]  # check if that object matches the one passed to df_in
    )[0]  # get the 0th (and only) element from this list
    print(df_name)  # just an example so we can see the result
df_check(df_no)
# >>> df_no
df_check(df_ok)
# >>> df_ok
Note: you can skip checking whether the object is a DataFrame (comment out the second line in that list comprehension), since only a name bound to the passed-in object will match df_in no matter what type it is, but it never hurts to be more explicit.
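One caveat: the [0] index assumes exactly one global name is bound to the object. The rough sketch below shows what the lookup returns when that does not hold. The helper names_for is hypothetical, an explicit dict stands in for globals(), and plain lists stand in for DataFrames (the identity comparison behaves the same way for any object).

```python
def names_for(obj, namespace):
    """Return every name in `namespace` that is bound to exactly this object."""
    return [name for name, value in namespace.items() if value is obj]


# Plain lists stand in for DataFrames here (an assumption for brevity).
df_no = [1, 2, 3]
alias = df_no  # a second name bound to the same object

# Explicit dict standing in for globals(); "other" is equal but not identical.
namespace = {"df_no": df_no, "alias": alias, "other": [1, 2, 3]}

# Two names match, so [0] would simply return whichever was defined first.
print(names_for(df_no, namespace))      # ['df_no', 'alias']

# An object with no name in the namespace matches nothing, so [0] would
# raise IndexError.
print(names_for([1, 2, 3], namespace))  # []
```

The same empty result would occur if the DataFrame were created inside another function or passed as a temporary expression, since it would never appear in globals().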
All said, this works but it's not something I'd generally advise doing. This is likely more trouble than necessary if all you're after is an error message that amounts to "This DataFrame doesn't have the required column format".
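A simpler alternative, sketched here as one possibility rather than the definitive fix, is to have the caller pass a label explicitly and to report which columns are missing. The name parameter and the choice of ValueError are assumptions added for illustration. Raising an exception instead of using assert also keeps the check active when Python runs with -O, which strips assert statements.

```python
def df_check(df_in, name="dataframe"):
    """Raise ValueError if df_in lacks any required column.

    `name` is a caller-supplied label (hypothetical parameter) used only
    in the error message.
    """
    need_cols = ['col_a', 'col_b', 'col_c']
    have_cols = list(df_in.columns)  # works for a pandas Index or a plain list
    missing = [col for col in need_cols if col not in have_cols]
    if missing:
        raise ValueError(
            f"Import Error - Check dataframe {name!r} columns in file. "
            f"Missing: {missing}. The columns must include: {need_cols}"
        )
    # All required columns are present; nothing to report.
```

With the frames from the question, df_check(df_no, name="df_no") would raise a ValueError that names df_no and lists col_c as missing, while df_check(df_ok, name="df_ok") would return without error.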