I have a large text file containing many thousands of lines; a short example that shows the same idea is:
vapor dust -2C pb
x14 71 hello! 42.42
100,000 lover baby: -2
There is a mixture of integers, alphanumerics, and floats.
ATTEMPT AT SOLUTION: I've done this to create a single list of strings, but I am unable to isolate each item based on whether it is numeric or alphanumeric:
with open('file.txt', 'r') as f:
    data = f.read().split()
#dirty = [x for x in data if x.isnumeric()]
print(data)
The line #dirty fails.
I have had luck constructing a list-of-lists containing almost all required values using the code as follows:
with open('benzene_SDS.txt', 'r') as f:
    for word in f:
        data = word.split()
        clean = [x for x in data if x.isnumeric()]
        res = list(set(data).difference(clean))
        print(clean)
But it doesn't return a single list; it prints one list per line of the file, most of which are empty [].
A hint was given that the "try" statement is useful in solving the problem, but I don't see how to use it.
Any help would be greatly appreciated! Thanks.
If you're mainly asking how one would use try to check for validity, this is what you're after:
values = []
with open('benzene_SDS.txt', 'r') as f:
    for word in f.read().split():
        try:
            # float() raises ValueError for anything that is not a number
            values.append(float(word))
        except ValueError:
            pass
print(values)
Output:
[71.0, 42.42, -2.0]
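Since the question also asks to separate the numeric items from the alphanumeric ones, the same try idea can be extended to sort every token into one of three lists. This is only a sketch: the function name classify is made up for illustration, and the sample text is inlined as a string instead of being read from the file so the snippet runs on its own.

```python
def classify(tokens):
    """Split tokens into (ints, floats, others) using try/except."""
    ints, floats, others = [], [], []
    for tok in tokens:
        # Try the strictest conversion first: int() rejects '42.42'
        try:
            ints.append(int(tok))
            continue
        except ValueError:
            pass
        # Fall back to float(); anything that still fails is kept as text
        try:
            floats.append(float(tok))
        except ValueError:
            others.append(tok)
    return ints, floats, others


# Sample data from the question, inlined here instead of reading a file
sample = """vapor dust -2C pb
x14 71 hello! 42.42
100,000 lover baby: -2"""

ints, floats, others = classify(sample.split())
print(ints)    # [71, -2]
print(floats)  # [42.42]
print(others)  # everything else, including '100,000' and '-2C'
```

One caveat: float() also accepts words such as 'nan', 'inf' and 'infinity', so those would land in the floats list if they occur in the file.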
However, note that this does not parse '100,000' as either 100 or 100000.
This code would do that:
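import locale

# Raises locale.Error if this locale is not installed on the system
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
values = []
with open('benzene_SDS.txt', 'r') as f:
    for word in f.read().split():
        try:
            # atof() parses the number using the current locale's separators
            values.append(locale.atof(word))
        except ValueError:
            pass
print(values)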
Result:
[71.0, 42.42, 100000.0, -2.0]
Note that running the same code with this:
locale.setlocale(locale.LC_ALL, 'nl_NL.UTF-8')
yields a different result:
[71.0, 4242.0, 100.0, -2.0]
This is because the Netherlands uses ',' as the decimal separator and '.' as the thousands separator (so the '.' in 42.42 is simply ignored).
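If changing the process-wide locale is undesirable, or the required locale may not be installed, a locale-free alternative is to strip the commas before calling float(). This is a rough sketch that assumes ',' only ever appears as a thousands separator in the file; the helper name parse_number is hypothetical, and the sample data is inlined so the snippet is self-contained.

```python
def parse_number(token):
    """Return token as a float, or None if it is not a number.

    Assumption: ',' is only ever a thousands separator, so it is
    removed before conversion.
    """
    try:
        return float(token.replace(",", ""))
    except ValueError:
        return None


# Sample data from the question, inlined instead of reading the file
sample = "vapor dust -2C pb x14 71 hello! 42.42 100,000 lover baby: -2"

values = [n for n in map(parse_number, sample.split()) if n is not None]
print(values)  # [71.0, 42.42, 100000.0, -2.0]
```

The trade-off is that this is more permissive than locale.atof with an English locale is about intent: a token such as '1,2,3' would be read as 123.0, so it only fits data where commas are known to be grouping separators.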