Not able to use explode from pandas df.
I want to "explode" a named column that has named sub-columns in a pandas DataFrame, working in a Jupyter Notebook.
Here is the data frame:
  State or territory Census population[8][9][a]
  State or territory        July 1, 2024 (est.) April 1, 2020
0         California                 39431263.0      39538223
1              Texas                 31290831.0      29145505
2            Florida                 23372215.0      21538187
3           New York                 19867248.0      20201249
4       Pennsylvania                 13078751.0      13002700
I want to explode "Census population[8][9][a]" and then delete "April 1, 2020", leaving only "State or territory" and "July 1, 2024 (est.)".
import pandas as pd
tables1 = pd.read_html("https://en.wikipedia.org/wiki/Fortune_500")
tables2 = pd.read_html(
"https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population")
df1 = tables1[1]
df2 = tables2[0]
df1copy = df1.drop(["Rank"], axis=1)
df2copy = df2.drop(
["Change, 2010–2020[9][a]",
"House seats[b]",
"Pop. per elec. vote (2020)[c]",
"Pop. per seat (2020)[a]",
"% US (2020)",
"% EC (2020)"],
axis=1)
print(df1copy.head())
print(df2copy.head())
df2.drop(["July 1, 2024 (est.)"], axis=1)
print(df2.head())
Here is the result:
KeyError Traceback (most recent call last)
File ~/Library/Python/3.9/lib/python/site-packages/pandas/core/indexes/base.py:3805, in Index.get_loc(self, key)
3804 try:
-> 3805 return self._engine.get_loc(casted_key)
3806 except KeyError as err:
KeyError: 'July 1, 2024 (est.)'
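The columns of this table are a MultiIndex with two levels, and drop looks for a plain label in the top level (level 0) by default, which is why it raises the KeyError.
You need to specify the level of the column index that you want to drop.
Since the date column you want to drop is in level 1, you need to mention it explicitly.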
df2 = df2.drop(columns=["July 1, 2024 (est.)"], level=1)
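As a rough, self-contained sketch of the same idea, the snippet below builds a small stand-in for the Wikipedia table by hand (the frame and the names df and trimmed are assumptions for illustration, not the real scraped data). It drops "April 1, 2020", which is the column the question says it wants removed, and then flattens the two-level header down to the inner labels, which is probably what "explode" means here.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: two-level column headers.
columns = pd.MultiIndex.from_tuples([
    ("State or territory", "State or territory"),
    ("Census population[8][9][a]", "July 1, 2024 (est.)"),
    ("Census population[8][9][a]", "April 1, 2020"),
])
df = pd.DataFrame(
    [
        ["California", 39431263.0, 39538223],
        ["Texas", 31290831.0, 29145505],
        ["Florida", 23372215.0, 21538187],
    ],
    columns=columns,
)

# Drop the unwanted sub-column by naming the level it lives on.
# drop() returns a new DataFrame, so the result has to be assigned.
trimmed = df.drop(columns=["April 1, 2020"], level=1)

# Flatten the header by keeping only the inner (level 1) labels.
trimmed.columns = trimmed.columns.get_level_values(1)

print(trimmed.columns.tolist())
print(trimmed.head())
```

If you would rather not pass a level, giving drop the full column key as a tuple, such as ("Census population[8][9][a]", "April 1, 2020"), should also work.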