
Read and filter column names from file using Python


I am new to Python, and I need to fetch the column names from a file.

The file may contain content like the following:

OPTIONS ( SKIP=1)
LOAD DATA
TRAILING NULLCOLS
(
A_TEST                              NULLIF TEST=BLANKS,
B_TEST                          NULLIF TEST=BLANKS,
C_TEST                                  NULLIF TEST=BLANKS,
CREATE_DT       DATE 'YYYYMMDDHH24MISS' NULLIF OPENING_DT=BLANKS,
D_CST CONSTANT 'FNAMELOAD'
)

I need to read the lines after the second opening bracket and take the first non-empty token of each line, skipping any line where the token that follows is CONSTANT. So for the file above, my expected output is:

A_TEST,B_TEST,C_TEST,CREATE_DT.


Solution

  • You could do something like this:

    with open("data.txt", "r") as f:  # the context manager closes the file for us
        data = f.read()

    from_index = data.rfind('(')  # index of the last occurrence of the opening bracket
    data_sel = data[from_index + 1:]  # keep only the text after that bracket
    columns = []
    for line in data_sel.splitlines():
        tokens = line.split()  # split on any whitespace; a blank line gives []
        # skip blank lines, the closing bracket and CONSTANT definitions;
        # you may have to tweak these conditions, but this is the basic logic
        if not tokens or tokens[0] == ')' or "CONSTANT" in tokens:
            continue
        columns.append(tokens[0].rstrip(','))  # first token is the column name
    print(','.join(columns))  # A_TEST,B_TEST,C_TEST,CREATE_DT
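
  • One caveat: rfind('(') picks the last bracket anywhere in the file, so it could misfire if a column definition itself contains brackets. A rough sketch of a line-by-line variant is below. It assumes the column list begins at a line that contains only "(" and ends at a line starting with ")"; the function name extract_columns and the CHAR(10) field in the sample are made up for illustration and are not from the question.

```python
def extract_columns(text):
    """Return the column names found in the bracketed column list (hypothetical helper)."""
    columns = []
    in_columns = False
    for line in text.splitlines():
        tokens = line.split()  # whitespace split; a blank line gives []
        if not tokens:
            continue
        if not in_columns:
            # Assumption: the column list starts at a line holding only "(".
            in_columns = tokens == ["("]
            continue
        if tokens[0].startswith(")"):
            break  # end of the column list
        if len(tokens) > 1 and tokens[1] == "CONSTANT":
            continue  # skip constant definitions
        columns.append(tokens[0].rstrip(","))
    return columns


# Sample modelled on the file in the question; CHAR(10) is an invented addition.
sample = """OPTIONS ( SKIP=1)
LOAD DATA
TRAILING NULLCOLS
(
A_TEST                  NULLIF TEST=BLANKS,
B_TEST CHAR(10)         NULLIF TEST=BLANKS,
CREATE_DT DATE 'YYYYMMDDHH24MISS' NULLIF OPENING_DT=BLANKS,
D_CST CONSTANT 'FNAMELOAD'
)
"""

print(",".join(extract_columns(sample)))
```

    To use it on a real file, read the contents first, for example with open("data.txt") as f: names = extract_columns(f.read()). Checking only the second token for CONSTANT keeps a column whose later text merely contains that word.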