python numpy

Referencing columns by assigned name in numpy array


I am trying to create column names for easy reference, so that I can call a column by name from the rest of the program instead of having to know its position. The from_ column array is coming up empty. I am new to numpy, so I am wondering how this is done. Changing the data type for columns 5 and 6 was successful, though.

def array_setter():
        import os
        import glob
        import numpy as np
        os.chdir\
        ('C:\Users\U2970\Documents\Arcgis\Text_files\Data_exports\North_data_folder')
        for file in glob.glob('*.TXT'):
                reader = open(file)
                headerLine = reader.readlines()
        for col in headerLine:
                valueList = col.split(",")
                data = np.array([valueList])
                from_ = np.array(data[1:,[5]],dtype=np.float32)
                # trying to assign a name to columns for easy reference
                to = np.array(data[1:,[6]],dtype=np.float32)
                if data[:,[1]] == 'C005706N':
                        if data[:,[from_] < 1.0]:
                                print data[:,[from_]]
array_setter()

Solution

  • If you want to index array columns by name, I would recommend turning the array into a pandas DataFrame. For example,

    import pandas as pd
    import numpy as np
    arr = np.array([[1, 2], [3, 4]])
    df = pd.DataFrame(arr, columns=['f', 's'])
    print(df['f'])  # select the column by its name
    

    The nice part of this approach is that the data keeps its array structure while you also get the optimized indexing and slicing capabilities of pandas. For example, to find the elements of 'f' in the rows where 's' equals some value a, you could use loc:

    a = 2
    # rows where column 's' equals a, returning only column 'f'
    print(df.loc[df['s'] == a, 'f'])
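
    Applied to data shaped like the question's, the same idea might look like the sketch below. The sample text and the column names (site, from_, to and the c* placeholders) are assumptions made up for illustration, since the real files are not shown; only the positions (ID in column 1, from/to in columns 5 and 6) come from the question.

```python
import io
import pandas as pd

# Hypothetical stand-in for one exported .TXT file: a header row followed
# by comma-separated records. The real file contents are not shown.
sample = io.StringIO(
    u"A,B,C,D,E,F,G\n"
    u"a,C005706N,x,x,x,0.5,1.5\n"
    u"b,C005706N,x,x,x,2.0,3.0\n"
    u"c,OTHER,x,x,x,0.2,0.9\n"
)

# header=0 together with names= replaces the file's header row with our
# own (hypothetical) column names; dtype= sets float32 for the two columns.
df = pd.read_csv(
    sample,
    header=0,
    names=['c0', 'site', 'c2', 'c3', 'c4', 'from_', 'to'],
    dtype={'from_': 'float32', 'to': 'float32'},
)

# Combine both conditions into one boolean mask, then select by name.
mask = (df['site'] == 'C005706N') & (df['from_'] < 1.0)
result = df.loc[mask, 'from_']
print(result)
```

    Building one boolean mask with & takes the place of the nested if statements in the question, which compare whole arrays rather than individual rows.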
    

    Check out the pandas docs for different ways to use the DataFrame object. Or you could read the book by Wes McKinney (pandas creator), Python for Data Analysis. Even though it was written for an older version of pandas, it's a great starting point and will set you in the right direction.
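
    If you would rather stay within numpy, a structured array is one alternative that also gives you named columns. This is a different technique from the DataFrame approach above, shown only as a minimal sketch; the field names and sample rows are assumptions for illustration.

```python
import numpy as np

# Hypothetical rows: (site, from_, to). A structured array expects a
# list of tuples, one tuple per record.
rows = [
    ('C005706N', 0.5, 1.5),
    ('C005706N', 2.0, 3.0),
    ('OTHER', 0.2, 0.9),
]

# Each field gets a name and a type: an 8-character string and two float32s.
fields = [('site', 'U8'), ('from_', 'f4'), ('to', 'f4')]
data = np.array(rows, dtype=fields)

# Fields are referenced by name instead of by column position.
mask = (data['site'] == 'C005706N') & (data['from_'] < 1.0)
selected = data['from_'][mask]
print(selected)
```

    When reading from a file, np.genfromtxt with delimiter=',' and names=True builds a structured array whose field names come from the header row.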