python pandas string dataframe drop

Drop rows that contain a specific string in a column


I have a large pandas DataFrame in the format below:

col1
11111112322
15211114821
25482136522
45225625656
11125648121

I would like to drop all rows whose value contains 1111 (four consecutive ones), to get the result below:

25482136522
45225625656
11125648121

I tried this, but it did not work:

data = df[df["col1"].str.contains("1111")==False]
Traceback (most recent call last):
  File "<pyshell#17>", line 1, in <module>
    data1_1 = section1[section1["col1"].str.contains("111111")==False]
  File "C:\Users\henry\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\generic.py", line 5575, in __getattr__
    return object.__getattribute__(self, name)
  File "C:\Users\henry\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\accessor.py", line 182, in __get__
    accessor_obj = self._accessor(obj)
  File "C:\Users\henry\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\strings\accessor.py", line 177, in __init__
    self._inferred_dtype = self._validate(data)
  File "C:\Users\henry\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\strings\accessor.py", line 231, in _validate
    raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!. Did you mean: 'std'?

Solution

  • The issue, as the error message states, is that the column does not hold strings (it is most likely a numeric dtype), so the .str accessor is not available:

    AttributeError: Can only use .str accessor with string values!. Did you mean: 'std'?

    To use string methods on it, first convert the column to strings with astype(str); after that, your code works:

    df[df["col1"].astype(str).str.contains("1111")==False]
    

    Output:

              col1
    2  25482136522
    3  45225625656
    4  11125648121
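
For reference, here is a minimal, self-contained sketch of the same approach. It assumes the column was loaded as integers (the question does not show the dtype, so that is a guess), and the names `mask` and `filtered` are only illustrative. It uses `~` to negate the mask rather than `== False`, and passes `regex=False` so that "1111" is treated as a literal substring.

```python
import pandas as pd

# Sample data from the question; assumed here to be stored as integers
df = pd.DataFrame(
    {"col1": [11111112322, 15211114821, 25482136522, 45225625656, 11125648121]}
)

# Inspect the dtype first: .str only works on string (object/string) columns
print(df["col1"].dtype)

# Convert to strings, then flag rows that contain four consecutive ones
mask = df["col1"].astype(str).str.contains("1111", regex=False)

# Keep only the rows that do NOT match
filtered = df[~mask]
print(filtered)
```

Note that `astype(str)` is applied only when building the mask, so `col1` in `filtered` keeps its original dtype. If the column should stay a string permanently, assigning `df["col1"] = df["col1"].astype(str)` first is one option.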