I am being sent a csv file that is tab delimited. Here is a sample of what I see:
Invoice: Invoice Date Account: Name Bill To: First Name Bill To: Last Name Bill To: Work Email Rate Plan Charge: Name Subscription: Device Serial Number
2021-03-10 Test Company Wally Kolcz test@test.com Sample plan A0H1234567890A
I wrote a script to open, read and loop over the values but I get weird stuff after:
if (($handle = fopen($user_file, "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, "\t")) !== FALSE) {
if($line >1 && isset($data[1])){
$user = [
'EmailAddress' => $data[4],
'Name' => $data[2].' '.$data[3],
];
}
$line++;
}
fclose($handle);
}
Here is what I get when I dump the first line.
array:7 [▼
0 => b"ÿþI\x00n\x00v\x00o\x00i\x00c\x00e\x00:\x00 \x00I\x00n\x00v\x00o\x00i\x00c\x00e\x00 \x00D\x00a\x00t\x00e\x00"
1 => "\x00A\x00c\x00c\x00o\x00u\x00n\x00t\x00:\x00 \x00N\x00a\x00m\x00e\x00"
2 => "\x00B\x00i\x00l\x00l\x00 \x00T\x00o\x00:\x00 \x00F\x00i\x00r\x00s\x00t\x00 \x00N\x00a\x00m\x00e\x00"
3 => "\x00B\x00i\x00l\x00l\x00 \x00T\x00o\x00:\x00 \x00L\x00a\x00s\x00t\x00 \x00N\x00a\x00m\x00e\x00"
4 => "\x00B\x00i\x00l\x00l\x00 \x00T\x00o\x00:\x00 \x00W\x00o\x00r\x00k\x00 \x00E\x00m\x00a\x00i\x00l\x00"
5 => "\x00R\x00a\x00t\x00e\x00 \x00P\x00l\x00a\x00n\x00 \x00C\x00h\x00a\x00r\x00g\x00e\x00:\x00 \x00N\x00a\x00m\x00e\x00"
6 => "\x00S\x00u\x00b\x00s\x00c\x00r\x00i\x00p\x00t\x00i\x00o\x00n\x00:\x00 \x00D\x00e\x00v\x00i\x00c\x00e\x00 \x00S\x00e\x00r\x00i\x00a\x00l\x00 \x00N\x00u\x00m\x00b\x00e\x00r\x00 ◀"
]
I tried adding:
header('Content-Type: text/html; charset=UTF-8');
$data = array_map("utf8_encode", $data);
setlocale(LC_ALL, 'en_US.UTF-8');
And when I dump mb_detect_encoding($data[2])
, I get 'ASCII'...
Any way to fix this so I don't have to manually update the file each time I receive it? Thanks!
Looks like the file is in UTF-16 (every other byte is null).
You probably need to convert the whole file with something like mb_convert_encoding($data, "UTF-8", "UTF-16");
But you can't really use fgetcsv() in that case…