I have an Excel file with the following row/column structure:
ID English Spanish French
1 Hello Hilo Halu
2 Hi Hye Ghi
3 Bus Buzz Bas
I would like to read the Excel file, extract the row and column values, and create 3 new files based on the columns English, Spanish, and French.
So I would have something like:
English File:
"1" = "Hello"
"2" = "Hi"
"3" = "Bus"
I've been using xlrd. I can open, read, and print the contents of the file. However, this is what I get using this command (with the Excel file already open):
for index in xrange(0,2):
    theWord = '\n' + str(sh.col_values(index, start_rowx=index, end_rowx=1)) + '=' + str(sh.col_values(index+1, start_rowx=index, end_rowx = 1))
    print theWord
OUTPUT:
[u'Parameter/Variable/Key/String']=[u'ENGLISH'] <-- is this a list?, didn't the str() use to strip it out?
What's the u doing there? How can I remove the square brackets?
The u means it is a unicode string. It shows up because you are calling str() on a list, which prints the repr of each element inside the square brackets. If you write the string itself out to a file, it won't be there. What you are getting is one row from the column: because you are using end_rowx=1, col_values returns a list with one element.
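To illustrate the difference between converting the whole list and taking the element out of it, here is a minimal sketch. It uses a plain list as a stand-in for what col_values returns (an assumption, so that it runs without a workbook) and is written in Python 3 syntax, where the u prefix no longer appears but the brackets and quotes still do.

```python
# Stand-in for a one-element list such as col_values(..., end_rowx=1) returns.
cells = ['ENGLISH']

as_list_text = str(cells)  # str() of a list keeps the brackets and quotes
as_cell_text = cells[0]    # indexing gives the bare string

print(as_list_text)
print(as_cell_text)
```

Indexing (or iterating over) the list is what removes the brackets, not str().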
Try getting the column value lists:
ids = sh.col_values(0, start_rowx=1)
english = sh.col_values(1, start_rowx=1)
spanish = sh.col_values(2, start_rowx=1)
french = sh.col_values(3, start_rowx=1)
and then you can zip them into lists of tuples:
english_with_IDS = zip(ids, english)
spanish_with_IDS = zip(ids, spanish)
french_with_IDS = zip(ids, french)
Which are in the form:
("1", "Hello"), ("2", "Hi"), ("3", "Bus")
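A small sketch of the pairing step, with stand-in lists in place of the col_values results (the values come from the sample table in the question; treating the IDs as text is an assumption). Note that in Python 3, zip returns a lazy iterator, so list() is needed to see the pairs.

```python
# Stand-ins for sh.col_values(0, start_rowx=1) and sh.col_values(1, start_rowx=1).
ids = ["1", "2", "3"]
english = ["Hello", "Hi", "Bus"]

# Pair each ID with the word from the same row.
english_with_ids = list(zip(ids, english))

print(english_with_ids)
```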
If you want to print the pairs:
for id, word in english_with_IDS:
    # %s formatting also works if the ID cell is numeric rather than text
    print '"%s" = "%s"' % (id, word)
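To get the three files the question asks for, the same pairs can be written out instead of printed. The following is only a sketch in Python 3 syntax: the lists stand in for the col_values results, and the helper names (format_id, write_pairs), the output file names, and the temporary directory are all hypothetical. xlrd returns numeric cells as floats, so the sketch assumes the IDs arrive as 1.0, 2.0, 3.0 and renders them as whole numbers.

```python
import os
import tempfile

def format_id(value):
    """Render a float such as 1.0 as '1'; leave other values as str(value)."""
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)

def write_pairs(path, ids, words):
    """Write one '"id" = "word"' line per row to the given file."""
    with open(path, "w", encoding="utf-8") as handle:
        for id_, word in zip(ids, words):
            handle.write('"%s" = "%s"\n' % (format_id(id_), word))

# Stand-ins for the column lists read from the sheet.
ids = [1.0, 2.0, 3.0]
columns = {
    "English": ["Hello", "Hi", "Bus"],
    "Spanish": ["Hilo", "Hye", "Buzz"],
    "French": ["Halu", "Ghi", "Bas"],
}

out_dir = tempfile.mkdtemp()  # hypothetical output location
for name, words in columns.items():
    write_pairs(os.path.join(out_dir, name + ".txt"), ids, words)

# Read one file back to check the result.
with open(os.path.join(out_dir, "English.txt"), encoding="utf-8") as handle:
    english_text = handle.read()
print(english_text)
```

With a real workbook, ids and the three word lists would come from the col_values calls shown earlier.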
col_values returns a list of column values; if you want a single value, you can call sh.cell_value(rowx, colx).