to use the geo-plot animation frame I want all my groups to end on the same date. This would avoid the last frame to grey out certain countries. Currently, the newest data point according to date is 'Timestamp('2021-05-13 00:00:00')'.
So in the next step I want to add new rows according to all countries so that they have rows until the newest date in the df. Columns 'people_vaccinated_per_hundred' and 'people_fully_vaccinated_per_hundred' can be populated using ffill.
So ideally, if e.g. norway has 1 day less than the newest data point '2021-05-13', then it should add a new row as shown below. And this should be done for all other countries in the df.
Example
country iso_code date people_vaccinated_per_hundred people_fully_vaccinated_per_hundred
12028 Norway NOR 2021-05-02 0.00 NaN
12029 Norway NOR 2021-05-03 0.00 NaN
12188 Norway NOR ... ... ...
12188 Norway NOR 2021-05-11 27.81 9.55
12189 Norway NOR 2021-05-12 28.49 10.42
Add new row
12189 Norway NOR 2021-05-13 28.49 10.42
A blunt approach to this might be to create a cartesian product of the countries and dates, then join on this to create empty values for every missing combination of dates and country.
countries = df.loc[:, ['country', 'iso_code']].drop_duplicates()
dates = df.loc[:, 'date'].drop_duplicates()
all_countries_dates = countries.merge(dates, how='cross')
df.merge(all_countries_dates, how='right', on=['country', 'iso_code', 'date'])
With a dataset like:
country iso_code date people_vaccinated people_fully_vaccinated
Norway NOR 2021-05-09 0.00 1.00
Norway NOR 2021-05-10 0.00 3.00
Norway NOR 2021-05-11 27.81 9.55
Norway NOR 2021-05-12 28.49 10.42
Norway NOR 2021-05-13 28.49 10.42
United States USA 2021-05-09 23.00 3.00
United States USA 2021-05-10 23.00 3.00
This transformation would give you:
country iso_code date people_vaccinated people_fully_vaccinated
Norway NOR 2021-05-09 0.00 1.00
Norway NOR 2021-05-10 0.00 3.00
Norway NOR 2021-05-11 27.81 9.55
Norway NOR 2021-05-12 28.49 10.42
Norway NOR 2021-05-13 28.49 10.42
United States USA 2021-05-09 23.00 3.00
United States USA 2021-05-10 23.00 3.00
United States USA 2021-05-11 NaN NaN
United States USA 2021-05-12 NaN NaN
United States USA 2021-05-13 NaN NaN
After this you could use fillna to change the empty values for the added rows.