I'm using JDBC to execute query "LOAD DATA LOCAL INFILE" to load csv file into mysql table.
The csv file contains Accented characters like ä,ö,ü,ß.
My issue is that German characters couldn't insert into mysql table by executing query "LOAD DATA LOCAL INFILE" through either JDBC or terminal; however, I can insert German characters by executing "INSERT" or "UPDATE" statements.
I have been trying all different ways to figure it out as following, but still couldn't work:
For JDBC connection I did:
jdbc:mysql://${sequence.db.svr}/seq? connectTimeout=20000&useUnicode=yes&characterEncoding=utf8
For "LOAD DATA LOCAL INFILE" I did:
LOAD DATA LOCAL INFILE fileName
INTO TABLE tableName
CHARACTER SET UTF8
FIELDS TERMINATED BY ','
ENCLOSED BY '"';
String query = "LOAD DATA LOCAL INFILE '" + fileName +
"' INTO TABLE pde." + table +
" CHARACTER SET UTF8" +
" FIELDS TERMINATED BY ','" +
" ENCLOSED BY '\"'"
+ ";";
For mysql database property I did:
ALTER DATABASE databaseName CHARACTER SET utf8 COLLATE utf8_unicode_ci;
For mysql table property I did:
ALTER TABLE tableName CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Create table query:
CREATE TABLE `testTable` (
`value` varchar(255) DEFAULT NULL,
`mapped_value` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Create database query:
CREATE DATABASE `testDatabase` /*!40100 DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci */
Csv file:
amÜ,amman
amàn,amman
CSV file in hex:
616d dc2c 616d 6d61 6e0a 616d e06e 2c61 6d6d 616e 0a
It looks to me like your CSV file is coded in Latin-1 (ISO 8859-1) rather than utf-8. How can I tell this?
a m Ü , a m m a n \n a m à n , a m m a n \n
61 6d dc 2c 61 6d 6d 61 6e 0a 61 6d e0 6e 2c 61 6d 6d 61 6e 0a
See how Ü is coded by just one byte, dc
? That's Latin-1. If it were in utf-8 it would be coded by c39c
instead.
So change CHARACTER SET UTF8
in your LOAD DATA INFILE
command to CHARACTER SET latin1
and try again.