python dataframe split drop isin

How to split a tolist() result into two groups (new lists)?


I have a problem: I have tried many things and can't get this to work. The code checks whether the delivery numbers appear in our sales report.

Here is the code:

dfn = dfn.astype('str')
diff = dfn[~dfn['Numero de Envio'].isin(dfn['Unnamed: 13'])].dropna()[['Numero de Envio','Fecha']].values.tolist() 
# The message below is Spanish for "There are sales that might not be ours - review:"
print(f"\n Hay ventas que podrían no ser nuestras: \U0001F611 - Revisar :" '\n')
diff 

The result is:

[['piedras blancas', '2022-12-01'],
 ['41845010982', '2022-12-01'],
 ['carrasco norte', '2022-12-05'],
 ['41855309788', '2022-12-05'],
 ['villa española ', '2022-12-07'],
 ['aires puros', '2022-12-08'],
 ['monica villa borges', '2022-12-08'],
 ['nicolas diego ', '2022-12-08'],
 ['enrique permuy', '2022-12-08'],
 ['natalia blanco', '2022-12-08'],
 ['laurita', '2022-12-09'],
 ['hugo carrion', '2022-12-10'],
 ['4187289809', '2022-12-12'],
 ['mariana', '2022-12-12'],
 ['amelia vignolo', '2022-12-14'],
 ['leonardo saucedo', '2022-12-14'],
 ['14891993727', '2022-12-17'],
 ['maria noel dottone', '2022-12-19'],
 ['41899599250', '2022-12-19'],
 ['41898783286', '2022-12-19'],
 ['corpo pilates ltd', '2022-12-19'],
 ['serrana bentancour', '2022-12-20'],
 ['fabiana lima', '2022-12-21'],
 ['41916589225', '2022-12-26'],
 ['41917845465', '2022-12-26'],
 ['41916895866', '2022-12-26'],
 ['41917386564', '2022-12-26'],
 ['41917285884', '2022-12-26'],
 ['41900719115', '2022-12-27'],
 ['mauro gonzalez', '2022-12-27']]

I want to split this into two groups: the entries that start with numbers, and the entries that are names. My expected result is:

['41845010982', '2022-12-01'],
['41899599250', '2022-12-19'],
['41898783286', '2022-12-19'],
['41916589225', '2022-12-26'],
['41917845465', '2022-12-26'],
['41916895866', '2022-12-26'],
['41917386564', '2022-12-26'],
['41917285884', '2022-12-26'],
['41900719115', '2022-12-27'],


['villa española ', '2022-12-07'],
['aires puros', '2022-12-08'],
['monica villa borges', '2022-12-08'],
['nicolas diego ', '2022-12-08'],
['enrique permuy', '2022-12-08'],
['natalia blanco', '2022-12-08'],
['laurita', '2022-12-09'],
['hugo carrion', '2022-12-10'],

Solution

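  • A possible solution is to wrap the list in a pandas Series, group it with Series.groupby, and use pandas.Series.str.isnumeric as the grouping key (diff is a plain list after tolist(), so it has no groupby method of its own):

    import pandas as pd

    s = pd.Series(diff)  # each element is one [value, date] pair
    # Key is True when the first item of the pair is all digits
    for m, g in s.groupby(s.apply(lambda x: x[0]).str.isnumeric()):
        if m:
            listOfNumbers = g.tolist()
        else:
            listOfStrings = g.tolist()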
    

    Output:

    print(listOfNumbers)
    
    [['41845010982', '2022-12-01'],
     ['41855309788', '2022-12-05'],
     ['4187289809', '2022-12-12'],
     ...
    
    
    print(listOfStrings)
    
    [['piedras blancas', '2022-12-01'],
     ['carrasco norte', '2022-12-05'],
     ['villa española ', '2022-12-07'],
     ...
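
Because diff is already a plain Python list after tolist(), the same split can probably also be done without pandas. The following is a minimal sketch, not a definitive implementation: the sample data is a shortened copy of the list from the question, and is_number_row is a hypothetical helper name introduced here for illustration. It assumes the first item of every pair is a string and treats a row as "numeric" only when that item, after stripping whitespace, consists entirely of digits.

```python
# Shortened copy of the [value, date] pairs from the question (illustration only).
diff = [
    ['piedras blancas', '2022-12-01'],
    ['41845010982', '2022-12-01'],
    ['carrasco norte', '2022-12-05'],
    ['41855309788', '2022-12-05'],
    ['villa española ', '2022-12-07'],
]


def is_number_row(row):
    """Hypothetical helper: True when the first field contains only digits."""
    # strip() guards against stray spaces such as 'villa española '.
    return row[0].strip().isdigit()


# Two passes over the list, one per group; order within each group is preserved.
list_of_numbers = [row for row in diff if is_number_row(row)]
list_of_strings = [row for row in diff if not is_number_row(row)]

print(list_of_numbers)
print(list_of_strings)
```

If only the first character should decide the group (entries that merely start with a digit), the check could be changed to row[0][:1].isdigit() instead.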