Init data:
CREATE TABLE `test.test_data`(
user VARCHAR(10),
amount FLOAT)
TBLPROPERTIES ('transactional'='true');
INSERT INTO `test.test_data`
SELECT 'sonnh', 300000000000
UNION ALL
SELECT 'sonnh1', 1000000000000
UNION ALL
SELECT 'sonnh2', 900000000000;
Get data:
SELECT * FROM `test.test_data`
Results:
test.test_data.user test.test_data.amount
sonnh 299999986000
sonnh1 1000000000000
sonnh2 899999990000
I have a problem when use float data type like above. Someone explain it to me.
i think this goes back to basics of how hive store float, double, and decimal.
So it is always advisable to use decimal in case you need financial applications and use float or double when you dont care about rounding issues.
Here is an experimentation with above three data types.
CREATE TABLE tmp( a string, f FLOAT, db double, dc decimal(5,4) ) ;
insert into tmp SELECT 'id', 900000000000000000,900000000000000000,123.123 ;
insert into tmp SELECT 'id', 90000000,99900000000000000000,99.999 ;
insert into tmp SELECT 'id', 99123.456,9000000123.456,9.9999999 ;
insert into tmp SELECT 'id', 9999999123.456,500000,123.123 ;
insert into tmp SELECT 'id', 9999999000.456,99900000000000000000999999999999999.12349,9990.099999 ;
insert into tmp SELECT 'id', 9999999000.456,9990000000000000000099999999999.12349,90.999 ;
select * from tmp;
tmp.a tmp.f tmp.db tmp.dc
id 900000020000000000 900000000000000000 123.12300 -- notice that float column cant handle the data and showing different data.
id 90000000 99900000000000000000 99.99900 -- everything is fine because they are within range.
id 99123.45 9000000123.456 10.00000 -- decimal approximated the data to 10 which is very close to actual value.
id 9999999000 500000 123.12300 --all good
id 9999999000 9.99e+34 NULL --Decimal put null because it cant handle the data. Double approximated the data to E34.
id 9999999000 9.99e+30 90.99900 --Decimal is good because data is within range. Double approximated the data to E34.