python pandas text feature-extraction non-alphanumeric

How to count non-alphanumeric characters in a pandas DataFrame


Here's my data

No  Body
1   DaTa, Analytics 2
2   StackOver. 67%

Here's my expected output

No  Body                 Non Alphanumeric   
1   DaTa, Analytics 2    1       
2   StackOver. 67%       2  

I only want to count non-alphanumeric characters such as ! @ # & ( ) % – [ { } ] : ; ' , ? / *. Spaces and digits should not be counted.


Solution

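  • You can use `str.findall` with a negated character class (everything except ASCII letters, digits, and spaces) and take the length of each result: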

    df['Non Alphanumeric'] = df['Body'].str.findall(r'[^a-zA-Z0-9 ]').str.len()
    

    Or count the matches directly with `str.count`:

    df['Non Alphanumeric'] = df['Body'].str.count(r'[^a-zA-Z0-9 ]')
    
    print(df)
       No               Body  Non Alphanumeric
    0   1  DaTa, Analytics 2                 1
    1   2     StackOver. 67%                 2
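For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies both variants. The names `pattern` and `via_findall` are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "No": [1, 2],
    "Body": ["DaTa, Analytics 2", "StackOver. 67%"],
})

# Negated character class: match any character that is NOT
# an ASCII letter, an ASCII digit, or a space
pattern = r"[^a-zA-Z0-9 ]"

# Variant 1: count the regex matches per row
df["Non Alphanumeric"] = df["Body"].str.count(pattern)

# Variant 2: collect the matches per row, then take the list length
via_findall = df["Body"].str.findall(pattern).str.len()

print(df)
print(via_findall.tolist())
```

Note that the character class is ASCII-only, so accented letters, tabs, and newlines would also be counted as non-alphanumeric. If that is not what you want, the pattern would need to be adjusted for your data.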