I have two series in python pandas.
One from values within a file called values.csv
. It looks like this:
time, value
0, 10312435
9, 45924523
11, 43423434
20, 42343552
...
Another one is called breaks.csv
and it looks like this:
time
5
18
...
Problem: I would like to split the values.csv
into separate frames based on the values in breaks.csv
.
In the example above the first breakpoint would be 5
, resulting in a file or set containing all entries within time \in [0, 5]
, therefore only the value 0, 10312435
. The second breakpoint is 18
, therefore the second batch of values should be within (5, 18]
i.e. 9, 45924523
and 11, 43423434
and so forth.
Is something like this possible in pandas (or maybe some other easy-to-use python package)?
I came up with the following, based on Pandas split DataFrame by column value
sim_dist_right = pandas.read_csv('sim/dist_right.csv', comment='#')
sim_round_indicator = pandas.read_csv('sim/round_indicator.csv', comment='#')
round_list = []
for index, row in sim_round_indicator.iterrows():
print("splitting at " + str(row['time']))
df_sep = sim_dist_right[sim_dist_right['time'] < row['time']]
round_list.append(df_sep)
print("separated a batch of " + str(len(df_sep)) + " elements")
df_over = sim_dist_right[sim_dist_right['time'] >= row['time']]
print(str(len(df_over)) + " elements over")
sim_dist_right = df_over
print("splitted values into " + str(len(round_list)) + " batches")