I am using FileReader.readAsArrayBuffer(file) and converting the result into a Uint8Array.
If the text file input contains a pound sterling sign (£), then this single character results in two byte codes, one for Â and one for £. I understand that this is because £ is in the extended-ASCII set.
Is there a way to prevent this extra character? If not, will it always be an Â? If so, I can strip them out.
You didn't post your JavaScript, but this looks like a mismatch between the file's character encoding and how your code interprets the bytes. The file is most likely saved as UTF-8, where £ (U+00A3) is stored as two bytes, 0xC2 0xA3; if each byte is then treated as its own character, 0xC2 shows up as Â. Assuming you turn those bytes into text somewhere, decoding them as UTF-8 should fix it. Here is a small playground for reference; hopefully it helps you solve the problem.
function checkFile(file) {
  const fileReader = new FileReader();
  fileReader.onload = function (event) {
    const uint8Array = new Uint8Array(event.target.result);
    // Use TextDecoder to convert the Uint8Array into a string.
    // fatal: true makes decode() throw on invalid UTF-8 instead of inserting U+FFFD.
    const textDecoder = new TextDecoder('utf-8', { fatal: true });
    try {
      const result = textDecoder.decode(uint8Array);
      console.log(result); // For a UTF-8 file this shows £ without the stray Â.
    } catch (error) {
      console.error('Decoding failed:', error);
    }
  };
  fileReader.readAsArrayBuffer(file);
}
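function uploadFile(event) {
  // Take the event as a parameter instead of relying on the deprecated global window.event.
  const file = event.target.files[0];
  if (file) {
    checkFile(file);
  }
}
<input type="file" onchange="uploadFile(event)" />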
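As for whether the extra character will always be an Â: no, so stripping it is not a safe fix. Here is a rough sketch of why, assuming the file is UTF-8. The helper names utf8Hex and misreadAsLatin1 are made up for this illustration and are not part of any API; misreadAsLatin1 just mimics what happens when each byte is turned into its own character.

```javascript
// Hypothetical helper: the UTF-8 bytes of a string, as hex.
function utf8Hex(str) {
  const bytes = new TextEncoder().encode(str); // TextEncoder always produces UTF-8
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join(' ');
}

// Hypothetical helper: mimic the bug by mapping every UTF-8 byte to one
// character, which is what a Latin-1 style single-byte read does.
function misreadAsLatin1(str) {
  const bytes = new TextEncoder().encode(str);
  return String.fromCharCode(...bytes);
}

console.log(utf8Hex('£'));         // "c2 a3"
console.log(misreadAsLatin1('£')); // "Â£"

// The lead byte depends on the character, so it is not always Â (0xC2).
console.log(utf8Hex('é'));         // "c3 a9"
console.log(misreadAsLatin1('é')); // "Ã©"
console.log(utf8Hex('€'));         // "e2 82 ac" (three bytes)
```

So the two bytes are the correct UTF-8 form of a single character. Decode the whole buffer with TextDecoder rather than removing bytes, otherwise characters such as é or € will still come out wrong.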