I am pulling data from a table that changes often using Python, and the method I am using is not ideal. What I would like is a way to pull all strings that contain at most one letter and leave out anything that has two or more.
An example of data I might get:
115
19A6
HYS8
568
In this example, I would like to pull 115, 19A6, and 568.
Currently I am using the isdigit() method to determine whether a string is all digits, but this also filters out the numbers that contain one letter, which works for some purposes but is less than ideal.
This is an excellent case for regular expressions (regex), which are available through the built-in re module.
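The code below follows this logic: compile a pattern that accepts digits with at most one uppercase letter among them, then use the filter function to detect matches in the data list and output the result as a list. For example: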
import re
data = ['115', '19A6', 'HYS8', '568', 'H', 'HI']
# Raw string so that \d is not treated as an invalid string escape
rexp = re.compile(r'^\d*[A-Z]?\d*$')
result = list(filter(rexp.match, data))
print(result)
Output:
['115', '19A6', '568', 'H']
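Two things to be aware of with this pattern: it only accepts uppercase A-Z, and because every part of it is optional it would also match an empty string. If either matters for your table, one option is to count the letters directly instead of using a regex. The following is only a minimal sketch, assuming the values are plain alphanumeric strings; the helper name at_most_one_letter and the extra sample values 'a7' and '' are made up for illustration:

```python
def at_most_one_letter(value):
    """Return True for a non-empty alphanumeric string with at most one letter."""
    # Reject empty strings and anything containing non-alphanumeric characters.
    if not value.isalnum():
        return False
    # Count the alphabetic characters; digits do not contribute to the count.
    letters = sum(ch.isalpha() for ch in value)
    return letters <= 1


# Sample data from above plus a lowercase case and an empty string (assumed inputs).
data = ['115', '19A6', 'HYS8', '568', 'H', 'HI', 'a7', '']
result = [s for s in data if at_most_one_letter(s)]
print(result)
```

Output:
['115', '19A6', '568', 'H', 'a7']

Note that isalpha() and isalnum() also accept non-ASCII letters and digits, so if the table can contain such characters and you want to reject them, the regex approach with an explicit character class may be the better fit.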