I know some basics of C++, but I am a beginner in Python.
I have a piece of working code (see below) and I'd like to add a constraint on the format of its output, but I cannot figure out how to do it.
Let me first explain what the program does.
I have an input file, colors.csv, that contains a list of colors, one color per line. Each color is defined by its name and its colorimetric coordinates X, Y and Z, like this:
Colorname, X1, Y1, Z1
Colorname2, X2, Y2, Z2
...etc.
Given a list of XYZ coordinates in a second input file, targets.csv, the program writes a list of solutions to an output file, output.txt.
Each solution is computed by first triangulating the point cloud with tetgen and then taking the barycentric coordinates of the target point in a tetrahedron (the details don't matter here).
Each solution has the form:
target, name0, density0, name1, density1, name2, density2, name3, density3
There are always exactly 4 names and their associated densities.
For example, the output looks like this:
122 ,PINKwA,0.202566115168,GB,0.718785775317,PINK,0.0647284446787,TUwA,0.0139196648363
123 ,PINKwA,0.200786239192,GB,0.723766147717,PINK,0.0673550497794,TUwA,0.00809256331169
124 ,PINKwA,0.19900636349,GB,0.72874651935,PINK,0.0699816544755,TUwA,0.00226546268446
125 ,OR0A,0.00155317194109,PINK,0.0716160265958,PINKwA,0.195962072115,GB,0.730868729348
126 ,OR0A,0.00409427478508,PINK,0.0726192660009,PINKwA,0.192113520109,GB,0.731172939105
127 ,OR0A,0.00663537762906,PINK,0.073622505406,PINKwA,0.188264968103,GB,0.731477148862
What would I like to do now?
For practical reasons, I would like my output to follow a certain order: a "priority list" should determine the order of the name, density pairs in the output.
My current program outputs the color names in an order that I don't understand. I need these color names to appear in a specific order: for example, PINK should always come first, PINKwA second, and so on.
Instead of:
127 ,OR0A,0.00663537762906,PINK,0.073622505406,PINKwA,0.188264968103,GB,0.731477148862
I want:
127 ,PINK,0.073622505406,PINKwA,0.188264968103,OR0A,0.00663537762906,GB,0.731477148862
Because my priority list says:
0, PINK
1, PINKwA
2, OR0A
3, GB
How can I add this behavior to the code below? Any ideas?
EDITED CODE (works...):
import tetgen, geometry
from pprint import pprint
import random, csv
import numpy as np

all_colors = [(name, float(X), float(Y), float(Z))
              for name, X, Y, Z in csv.reader(open('colors.csv'))]
priority_list = {name: int(i)
                 for i, name in csv.reader(open('priority.csv'))}

# background is marked SUPPORT
support_i = [i for i, color in enumerate(all_colors) if color[0] == 'SUPPORT']
if len(support_i) > 0:
    support = np.array(all_colors[support_i[0]][1:])
    del all_colors[support_i[0]]
else:
    support = None

tg, hull_i = geometry.tetgen_of_hull([(X, Y, Z) for name, X, Y, Z in all_colors])
colors = [all_colors[i] for i in hull_i]
print ("thrown out: "
       + ", ".join(set(zip(*all_colors)[0]).difference(zip(*colors)[0])))

targets = [(name, float(X), float(Y), float(Z), float(BG))
           for name, X, Y, Z, BG in csv.reader(open('targets.csv'))]

for target in targets:
    name, X, Y, Z, BG = target
    target_point = support + (np.array([X, Y, Z]) - support)/(1-BG)
    tet_i, bcoords = geometry.containing_tet(tg, target_point)
    output = open('output.txt', 'a')
    if tet_i is None:
        output.write(str(target[0]))
        output.write('\n')
    else:
        names = [colors[i][0] for i in tg.tets[tet_i]]
        sorted_indices = sorted(enumerate(names), key=lambda (i, name): priority_list[name])
        output.write(target[0])
        counting = 0
        for i, name in sorted(enumerate(names), key=lambda (i, name): priority_list[name]):
            output.write(',%s,%s' % (name, bcoords[i]))
            counting = counting + 1
            if counting > 3:
                output.write('\n')
                counting = 0
    output.close()
First, you'll need to encode your priority list directly in your Python code:
priority_list = {
    'PINK': 0,
    'PINKwA': 1,
    'OR0A': 2,
    'GB': 3,
}
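If you would rather keep the priorities in a file, as the priority.csv in your edited code does, here is a minimal sketch of loading them. It assumes the file holds lines like `0, PINK`, with a space after the comma (a guess based on the list in the question), and it is written for Python 3. The `io.StringIO` object is only a stand-in for the real file so that the snippet runs on its own.

```python
import csv
import io

# Stand-in for open('priority.csv'); the contents are copied from the
# priority list shown in the question (assumed file layout).
priority_file = io.StringIO("0, PINK\n1, PINKwA\n2, OR0A\n3, GB\n")

# skipinitialspace=True drops the blank after each comma, so the key
# becomes 'PINK' rather than ' PINK'.
priority_list = {name: int(rank)
                 for rank, name in csv.reader(priority_file, skipinitialspace=True)}

print(priority_list)
```

Without `skipinitialspace=True` (or an explicit `.strip()`), a space after the comma would end up inside the key, and a later lookup by color name would raise a KeyError.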
A dictionary like this lets you quickly look up the order for a given color name. Then, you can use the key argument of sorted to sort your names by their priority. Critically, though, what you need is not the sorted names themselves but the indices of the sorted names, much like numpy.argsort (http://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html).
sorted_indices = sorted(enumerate(names), key=lambda (i, name): priority_list[name])
The enumerate builtin annotates each name with its index in the original list of names, and then the sorted builtin sorts the resulting (i, name) pairs based on their rank in the priority list. Then we can write the names out to the file, each followed by the corresponding element (using the index value) from the bcoords array.
for i, name in sorted_indices:
    output.write(',%s,%s' % (name, bcoords[i]))
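To see the idea end to end, here is a small self-contained sketch that uses the names and densities from the target 127 row in the question as sample data. One caveat: the `lambda (i, name):` form only parses on Python 2, so the sketch indexes the pair instead, which should behave the same on Python 2 and Python 3.

```python
# Sample data copied from the row for target 127 in the question.
priority_list = {'PINK': 0, 'PINKwA': 1, 'OR0A': 2, 'GB': 3}
names = ['OR0A', 'PINK', 'PINKwA', 'GB']
bcoords = [0.00663537762906, 0.073622505406, 0.188264968103, 0.731477148862]

# Pair each name with its original index, then sort the pairs by priority.
# pair[1] is the name; indexing replaces the Python 2-only tuple-unpacking lambda.
sorted_indices = sorted(enumerate(names), key=lambda pair: priority_list[pair[1]])

# Each index still points at the matching density in bcoords.
ordered = [(name, bcoords[i]) for i, name in sorted_indices]

print(sorted_indices)
print(ordered)
```

The names come out as PINK, PINKwA, OR0A, GB, and each one keeps its own density because the lookup goes through the original index rather than the new position.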
So, here's what I'd make the final block in your code look like:
names = [colors[i][0] for i in tg.tets[tet_i]]
output.write(target[0])
for i, name in sorted(enumerate(names), key=lambda (i, name): priority_list[name]):
    output.write(',%s,%s' % (name, bcoords[i]))
output.write('\r\n')
output.close()
Here I changed your file output strategy to be a bit more Pythonic. In general, adding strings together is largely avoided; it's better to create a format string and fill in the variables (you can also use .format() on the string to do this). Also, you can make multiple calls to .write() and they will simply continue writing to the file, so there is no need to build one long string and write it out all at once. Finally, there is no need to call str on '\r\n', as it's already a string.
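Putting these points together, here is one possible sketch of the row-writing step, written for Python 3. The helper name `write_row` is hypothetical (it is not part of your code), `io.StringIO` stands in for the real output file so the snippet runs on its own, and the second call imitates a target for which no containing tetrahedron was found. It writes '\n' on the assumption that the file is opened in text mode, where Python translates the newline for the platform.

```python
import io

priority_list = {'PINK': 0, 'PINKwA': 1, 'OR0A': 2, 'GB': 3}

def write_row(output, target_name, names, bcoords):
    """Write the target name, then name,density pairs in priority order."""
    output.write(str(target_name))
    if names is not None:  # None means no containing tetrahedron was found
        ranked = sorted(enumerate(names), key=lambda pair: priority_list[pair[1]])
        for i, name in ranked:
            # '{},{}'.format(...) builds the same text here as '%s,%s' % (...)
            output.write(',{},{}'.format(name, bcoords[i]))
    output.write('\n')

# io.StringIO stands in for a real file, e.g. open('output.txt', 'a').
# The with block closes the file even if an error occurs part-way through.
with io.StringIO() as output:
    write_row(output, '127', ['OR0A', 'PINK', 'PINKwA', 'GB'],
              [0.00663537762906, 0.073622505406, 0.188264968103, 0.731477148862])
    write_row(output, '128', None, None)
    result = output.getvalue()

print(result)
```

Opening the file once in a with block, outside the loop over targets, would also avoid reopening and closing output.txt for every target.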