
The loop reports that every row contains '|'


import pandas as pd
import numpy as np
import re

df_test = pd.DataFrame(np.array([['a|b', 'b', 'c|r'], [ 'e', 'f', 'g']]), columns=['First', 'Second', 'Third'])

for elem in df_test.get('First'):
    x = bool(re.search('|', elem))
    if x == True:
        print(elem)

Output:

a|b
e

Why is this the output? I expected only the first row to be printed, since it is the only one that contains '|'.


Solution

  • The problem is the regular expression pattern. In a regular expression, the vertical bar | is a special character meaning alternation (OR). The pattern '|' therefore means "the empty string or the empty string", and the empty string matches at the start of every input, so re.search('|', elem) succeeds for every row. To match the literal character |, escape it with a backslash: \|. Writing the pattern as a raw string, r'\|', also keeps Python from treating \| as an invalid string escape.

    import pandas as pd
    import numpy as np
    import re
    
    df_test = pd.DataFrame(np.array([['a|b', 'b', 'c|r'], ['e', 'f', 'g']]), columns=['First', 'Second', 'Third'])
    
    for elem in df_test.get('First'):
        # r'\|' matches a literal '|'; the raw string avoids an invalid-escape warning
        if re.search(r'\|', elem):
            print(elem)
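
If it helps, here is a small sketch of why the unescaped pattern matches everything, along with a few alternatives that avoid manual escaping. It reuses df_test from the question; the variable names (empty_match, with_in, with_escape, with_pandas) are made up for illustration, and building the DataFrame without np.array is just a simplification.

```python
import re
import pandas as pd

df_test = pd.DataFrame(
    [['a|b', 'b', 'c|r'], ['e', 'f', 'g']],
    columns=['First', 'Second', 'Third'],
)

# Unescaped '|' is an alternation of two empty patterns, so it matches
# the empty string at position 0 of any input, including 'e'.
empty_match = re.search('|', 'e')
print(repr(empty_match.group()))

# Option 1: plain substring test, no regex involved.
with_in = [elem for elem in df_test['First'] if '|' in elem]

# Option 2: let re.escape build the escaped pattern.
with_escape = [
    elem for elem in df_test['First']
    if re.search(re.escape('|'), elem)
]

# Option 3: vectorized pandas filter; regex=False treats '|' literally.
mask = df_test['First'].str.contains('|', regex=False)
with_pandas = df_test.loc[mask, 'First'].tolist()

print(with_in, with_escape, with_pandas)
```

All three options keep only 'a|b' for this data. For a fixed single character, the plain `in` test is probably the simplest; str.contains with regex=False is worth considering when you want to filter the whole column without a Python loop.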