python pandas

How to explode a column


I am not able to use explode on a pandas DataFrame.

I want to "explode" a named column that has named sub-columns in a DataFrame, working in a Jupyter Notebook.

Here is the DataFrame:

   State or territory  Census population[8][9][a]               
   State or territory         July 1, 2024 (est.)  April 1, 2020
0          California                  39431263.0       39538223
1               Texas                  31290831.0       29145505
2             Florida                  23372215.0       21538187
3            New York                  19867248.0       20201249
4        Pennsylvania                  13078751.0       13002700

I want to explode "Census population" and then delete "April 1, 2020", leaving only "State or territory" and "July 1, 2024 (est.)".

import pandas as pd

tables1 = pd.read_html("https://en.wikipedia.org/wiki/Fortune_500")
tables2 = pd.read_html(
    "https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population")
df1 = tables1[1]
df2 = tables2[0]
df1copy = df1.drop(["Rank"], axis=1)
df2copy = df2.drop(
    ["Change, 2010–2020[9][a]",
     "House seats[b]",
     "Pop.  per elec. vote (2020)[c]",
     "Pop. per seat (2020)[a]",
     "% US (2020)",
     "% EC (2020)"],
    axis=1)
print(df1copy.head())
print(df2copy.head())
df2.drop(["July 1, 2024 (est.)"], axis=1)
print(df2.head())

Here is the result:

KeyError  Traceback (most recent call last)
  File ~/Library/Python/3.9/lib/python/site-packages/pandas/core/indexes/base.py:3805, in Index.get_loc(self, key)
    3804 try:
 -> 3805     return self._engine.get_loc(casted_key)
    3806 except KeyError as err:
KeyError: 'July 1, 2024 (est.)'

Solution

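  • The columns form a two-level index, so you need to specify the level that holds the label you want to drop.

    Since the date column you want to drop is in level 1, you need to pass level=1 explicitly. Note that drop returns a new DataFrame, so assign the result.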

    df2 = df2.drop(columns=["July 1, 2024 (est.)"], level=1)
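
For reference, here is a minimal sketch of the same idea on a small hand-built frame. The frame is an assumption: it copies the two-level header and the first three rows from the question instead of scraping Wikipedia, and the names `df`, `trimmed` and `flat` are made up for illustration. It drops "April 1, 2020" (the column the question says should go) and then, as an optional extra step, flattens the header to the level-1 labels.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: same two-level header,
# first three rows copied from the question.
columns = pd.MultiIndex.from_tuples([
    ("State or territory", "State or territory"),
    ("Census population[8][9][a]", "July 1, 2024 (est.)"),
    ("Census population[8][9][a]", "April 1, 2020"),
])
df = pd.DataFrame(
    [
        ["California", 39431263.0, 39538223],
        ["Texas", 31290831.0, 29145505],
        ["Florida", 23372215.0, 21538187],
    ],
    columns=columns,
)

# The date labels live in level 1 of the column index, not level 0.
print(df.columns.nlevels)  # 2
print(list(df.columns.get_level_values(1)))

# drop() returns a new frame, so keep the result.
trimmed = df.drop(columns=["April 1, 2020"], level=1)

# Optional: collapse the two-level header down to the level-1 labels.
flat = trimmed.copy()
flat.columns = flat.columns.get_level_values(1)
print(flat)
```

If the flat header is all you need, `flat` can then be used like any single-level DataFrame, e.g. `flat["July 1, 2024 (est.)"]`.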