pythondatatablecount

How to find which labeled rows from a table are at or above certain points from another table using Python?


I'm new to Python and am struggling to understand how to code this specific situation. I have included an Excel screenshot to better describe the tables and graphs I am working with. From Table 1, column headings 10-13 serve as the x-values. Row # Label provides which row between 1-6 is being affected. Table 2 provides 2 points: A and B. How can we determine which of the 1-6 rows intersect or are above point A? What about for point B?

Logically, I know point A should be at or below all 6 rows and B should be at or below rows 4 and 6. Python should print {1, 2, 3, 4, 5, 6} when asked about A and print {4, 6} when asked about B.

However, how do we translate this process to be done in Python where there are two tables set up just like these?

Table 1 and Table 2

I have tried something like this, but it not working and I think it would only output the total number of rows like 6 for point A and 2 for point B instead of the specific Row # Label that I am looking for as well.

# Iterate through the points in Table #1
for i in range(len(table_one)):
    x = table_one[i][0]
    y = table_one[i][1]

# Iterate through the matrices in Table #2
for j in range(len(table_two)):
    m = table_two[j]
# Calculate the x and y values of the matrix
    m_x = np.sum(m, axis=0) * [i]
    m_y = np.sum(m, axis=1) * [i]

# Compare the x and y values of the point with those of the matrix
if np.any((np.abs(x - m_x) <= 0.5) & (np.abs(y - m_y) <= 0.5)):

# Increment the counter variable
    intersection_count = intersection_count + 1

Solution

  • When working with tables in Python, I suggest that you use pandas. It is typically imported like this import pandas as pd.

    Let us consider the following example tables:

    data1= {'11': [0.2, 0.3, 0.1, 2, 0.6, 1.2], '12': [0.3, 0.33, 0.18, 2.5, 1, 1.4]}
    data2= {'Point': ["A","B"], 'X': [11,12], 'Y': [0.18, 1.24]}
    table1 = pd.DataFrame(data=data1)
    table2 = pd.DataFrame(data=data2)
    

    table1 is similar to the left-hand side table in your snapshot, table2 is the one is the right-hand side. If I understand your question correctly, for a given abscissa (X) in table1 you want to check which samples are above the ordinate (Y) in table2.

    Let's consider:

    abscissa = 11
    

    Then your threshold is:

    threshold = table2[table2.X==abscissa]['Y'].to_numpy()
    

    The corresponding column name in the first table is X = str(abscissa).

    To check which values in the corresponding pandas series are greater or equal than the threshold you can do the following:

    table1[X].ge(threshold[0])
    

    Of course, this returns "True" for all rows. If you repeat the same using abscissa = 12, this will return:

    0    False
    1    False
    2    False
    3     True
    4    False
    5     True
    Name: 12, dtype: bool
    

    EDIT: to answer your additional question in the comment section, pandas cannot handle columns sharing the same name. For instance, if you try importing a csv file containing multiple columns named 11, pandas will rename them to 11, 11.1, 11.2, etc... That being said you could easily extract the abscissa from these column labels:

    for column in table1.columns:
        abscissa = int(float(column))
        threshold = table2[table2.X==abscissa]['Y'].to_numpy()
        table1[column].ge(threshold[0])
    

    Then it is just a matter of concatenating the different series to form the table that you are after.