I am new to Python and am learning it by using it for my data processing.
I have a large data file in the format shown below.
The number of rows and columns is not known in advance. This example shows two consecutive rows.
The first column is the time, and the nth column holds the relevant data, which is to be selected by an identifier ('abc' in the first line).
................
"2013-01-01 00:00:02" 228 227 15.65 15.84 14.85 14.68 14.53 13.75 12.45 12.55
"2013-01-02 00:01:03" 225 227 16.35 15.99 14.85 14.73 14.43 13.8 12.85 13.2
................
Desired output as
In my past attempts I ended up with a list, and so I was unable to convert either of the columns.
I searched through past questions and answers but, as a beginner, could not follow all of them. I would appreciate help reading the data into columns so that I can process it later. I believe the further processing will be manageable, as it is mostly mathematical operations.
Thank you for your help.
Regards
Gouri
CORRECTION-1:
I have learned that pandas offers a compact way to extract the column I needed. This was a good lesson, thanks to the suggestions from the group.
The code looks like this:
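import pandas as pd

# fp is the path (or an open file object) of the tab-separated data file
data = pd.read_csv(fp, sep='\t')
# select one column by its header name
entry = data['u90']
print(entry, '\n', entry[5])
# write the selected column to a file
entry.to_csv("out.txt")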
Regards
Gouri
As Hugo Honorem pointed out in a comment, you can use pandas.
If you do not want to add another dependency to your project, you could use a function like this:
from operator import itemgetter

def load_dataset(fp, columns, types=None, delimiter=' ', skip_header=True):
    # itemgetter returns a tuple only when given two or more indices,
    # so columns should list at least two column positions.
    get_columns = itemgetter(*columns)
    if skip_header:
        next(fp)  # discard the header line
    dataset = []
    for line in fp:
        parts = line.rstrip('\n').split(delimiter)
        row = get_columns(parts)
        if types is not None:
            # apply each converter to its matching column
            row = [convertor(col) for convertor, col in zip(types, row)]
        dataset.append(row)
    return dataset
columns should be a list of integers, and types is a list of callable objects that convert the selected columns into the types you want. For floats, just pass in float, and for your date you could pass a custom to_date function.
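One caveat with the sample rows in the question: the timestamp is quoted and itself contains a space, so splitting each line on a single space would cut it into two fields. A minimal sketch of one way around this, still using only the standard library, is to let csv.reader do the splitting. The names load_columns and to_date, the header names, and the timestamp format are assumptions made for illustration, not something given in the question.

```python
import csv
import io
from datetime import datetime


def to_date(text):
    # Assumed format, e.g. 2013-01-01 00:00:02 (csv.reader has removed the quotes).
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def load_columns(fp, columns, types=None, skip_header=True):
    # csv.reader keeps a quoted field together even if it contains the delimiter.
    reader = csv.reader(fp, delimiter=" ", quotechar='"')
    if skip_header:
        next(reader)
    dataset = []
    for parts in reader:
        if not parts:
            continue  # skip blank lines
        row = [parts[i] for i in columns]
        if types is not None:
            row = [convert(value) for convert, value in zip(types, row)]
        dataset.append(row)
    return dataset


# Hypothetical sample: the header names are made up for illustration.
sample = io.StringIO(
    'time abc def u90\n'
    '"2013-01-01 00:00:02" 228 227 15.65\n'
    '"2013-01-02 00:01:03" 225 227 16.35\n'
)
rows = load_columns(sample, columns=[0, 3], types=[to_date, float])
times = [row[0] for row in rows]
values = [row[1] for row in rows]
```

Here times holds datetime objects and values holds floats, so each can be processed as its own column. With a real file, pass an open file object instead of the io.StringIO sample.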