pandas startswith isin

Get a list of rows starting with the same value as the current row in a pandas DataFrame


I have a DataFrame that I'd like to extend with a new column. For each row, the new column should hold the list of ids of all other rows whose string_value fully contains that row's string_value.

id  string_value
1   The quick brown fox 
2   The quick brown fox jumps  
3   The quick brown fox jumps over 
4   The quick brown fox jumps over the lazy dog
5   The slow 
6   The slow brown fox 

Desired output

id  string_value                                new_columns
1   The quick brown fox                         [2, 3, 4]
2   The quick brown fox jumps                   [3, 4]
3   The quick brown fox jumps over              [4]
4   The quick brown fox jumps over the lazy dog []
5   The slow                                    [6]
6   The slow brown fox                          []

Thanks


Solution

  • Here's another custom function you can consider. Assume df is:

       id                                 string_value
    0   1                          The quick brown fox
    1   2                    The quick brown fox jumps
    2   3               The quick brown fox jumps over
    3   4  The quick brown fox jumps over the lazy dog
    4   5                                     The slow
    5   6                           The slow brown fox
    

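    The custom function is:

    def match_string(string_value):
        # Collect the ids of all other rows whose string_value contains the given string.
        # Note: this reads the module-level df shown above.
        idx_list = []
        for idx, strg in zip(df['id'], df['string_value']):
            # Skip rows whose text is identical to the input (including the row itself).
            if strg == string_value:
                continue
            # Substring test: matches anywhere in strg, not only at the start.
            if string_value in strg:
                idx_list.append(idx)
        return idx_list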
    

    Then apply it to the column (no lambda wrapper is needed):

    df['new_columns'] = df['string_value'].apply(match_string)
    print(df)
    
       id                                 string_value new_columns
    0   1                          The quick brown fox   [2, 3, 4]
    1   2                    The quick brown fox jumps      [3, 4]
    2   3               The quick brown fox jumps over         [4]
    3   4  The quick brown fox jumps over the lazy dog          []
    4   5                                     The slow         [6]
    5   6                           The slow brown fox          []
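
The title asks for rows that start with the current value, while the `in` test above matches the string anywhere in the other row. If a strict prefix match is what you need, a variant along these lines might work. This is only a sketch: the helper name `ids_with_prefix` is made up for illustration, the sample data is retyped from the question, and it assumes the strings carry no stray leading or trailing whitespace.

```python
import pandas as pd

# Sample data retyped from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "string_value": [
        "The quick brown fox",
        "The quick brown fox jumps",
        "The quick brown fox jumps over",
        "The quick brown fox jumps over the lazy dog",
        "The slow",
        "The slow brown fox",
    ],
})


def ids_with_prefix(frame, prefix):
    """Hypothetical helper: ids of rows whose string_value starts with
    `prefix`, excluding rows whose string_value equals `prefix` exactly."""
    values = frame["string_value"]
    # str.startswith treats `prefix` as a literal string, not a regex.
    mask = values.str.startswith(prefix) & (values != prefix)
    return frame.loc[mask, "id"].tolist()


# Passing the frame explicitly avoids relying on a global df.
df["new_columns"] = [ids_with_prefix(df, s) for s in df["string_value"]]
print(df)
```

Like the loop-based function, this compares every row against every other row, so it should be fine for small frames but will get slow as the row count grows.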