I'm trying to generate a QuickSight analysis with a simple .csv file. The file contains some arbitrary data like
Yifei, 24, Male, 2
Joe, 30, Male, 3
Winston, 40, Male, 7
Emily, 18, Female, 5
Wendy, 32, Female, 4
I placed the file in an S3 bucket, and then use AWS Athena to parse that into a table. The table treats all columns as strings, and I can query it properly
SELECT * FROM users
returns
name age gender consumed
1 Yifei 24 Male 2
2 Joe 30 Male 3
3 Winston 40 Male 7
4 Emily 18 Female 5
5 Wendy 32 Female 4
Okay so far so good. Then in QuickSight, I import the table as dataset, and it's properly displayed under fields with the correct values. The only problem remaining is that age
and consumed
are treated as strings, not numbers. So, I created two calculated fields:
age_calc: parseInt({age})
consumed_calc: parseInt({consume})
Works just fine, now under the fields I can see the newly created fields with correct values. However, once I try to create actual visualization (For example, a pie chart with how much everyone consumed) using the field consumed_calc
, the value of consumed_calc
is just null.
I found the issue. Basically, csv does not work very well with spaces, so despite the calculated fields showing correct result in preview, when parsed the field " 23" gets an error. Removing the spaces in the original .csv file solved this issue