python pandas literate-programming

How to create new rows in a dataframe based on column values


I have a table of SKUs that need to be placed in locations. The volume of a SKU determines how many locations it needs. There are a limited number of locations, so I need to prioritize by how much volume will be in each location and then assign the locations in that order. When a location is full, its volume should be the location volume; the last location for a SKU should hold the remainder volume. Current table setup

So the end result should look like this.

(image of the expected result table; not reproduced here)

I was hoping to iterate based on the number of locations needed and create a row in a new table while reducing the number of listed locations by row. Something like this.

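import pandas as pd

new_locs = []
for _, row in df.iterrows():
    # every location except the last one for a SKU is full
    for _ in range(int(row['locations_needed']) - 1):
        new_locs.append((row['SKU'], row['location_amount']))
    # the last location for a SKU holds the remainder volume
    new_locs.append((row['SKU'], row['remainder_volume']))
new_df = pd.DataFrame(new_locs, columns=['SKU', 'location_amount'])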

Solution

  • Building on the repeat approach suggested by Corralien: repeat each row once per location needed, then set the value on the last row of each SKU group to the remainder volume. Then reorder and reset the index again. So,

    # create one row per potential location for each SKU
    df = df.loc[df.index.repeat(df['locations_needed'])]
    # reset index
    df = df.reset_index(drop=True)
    # fill the last row in each group (SKU) with the remainder volume
    not_last = df['SKU'].duplicated(keep='last')
    df.loc[~not_last, 'location_amount'] = df['remainder_volume']
    # each row now represents a single location
    df['locations_needed'] = 1
    # reorder by volume and reset index
    df = df.sort_values(by=['location_amount'], ascending=False)
    df = df.reset_index(drop=True)
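To see the steps above end to end, here is a minimal self-contained sketch. The sample values are made up for illustration (the original tables were only posted as images), and it assumes every SKU needs at least one location and that `remainder_volume` is the volume left for that SKU's last location.

```python
import pandas as pd

# Hypothetical sample data; column names follow the snippet above.
df = pd.DataFrame({
    "SKU": ["A", "B", "C"],
    "locations_needed": [3, 1, 2],
    "location_amount": [100, 100, 100],
    "remainder_volume": [50, 30, 80],
})

# One row per location each SKU needs.
out = df.loc[df.index.repeat(df["locations_needed"])].reset_index(drop=True)

# The last row of each SKU is the partially filled location.
is_last = ~out["SKU"].duplicated(keep="last")
out.loc[is_last, "location_amount"] = out.loc[is_last, "remainder_volume"]

# Each row now represents exactly one location.
out["locations_needed"] = 1

# Fullest locations first.
out = out.sort_values("location_amount", ascending=False).reset_index(drop=True)

print(out[["SKU", "location_amount"]])
```

A quick sanity check is that `out.groupby("SKU")["location_amount"].sum()` gives back each SKU's total volume (250, 30 and 180 for the sample data). If locations are limited, something like `out.head(n_locations)` would keep only the fullest ones, where `n_locations` is a hypothetical count of available locations.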