I'm posting a large JSON string to a node express endpoint defined like this:
import express from 'express';
import cors from 'cors';
import bodyParser from 'body-parser';
import { MongoClient } from 'mongodb';

const app = express();
const jsonParser = bodyParser.json({ limit: '4mb' });
const databaseUri = '<db connection string>';
const databaseClient = new MongoClient(databaseUri);

app.use(cors());

app.post('/fillDatabase', jsonParser, async (request, response) => {
  const subjects = request.body;
  console.log(`Filling database with ${subjects.length} subjects`);
  const database = databaseClient.db('wanikani_db');
  const subjectsTable = database.collection('subjects');
  const result = await subjectsTable.insertMany(subjects).then((response) => {
    console.log('inserted ' + response.insertedCount + ' subjects');
  }).catch(() => {
    console.log('failed to insert subjects');
  }).finally(() => {
    console.log('finally after inserting subjects');
  });
});
When I post the JSON string, the endpoint function runs to completion (I see the "finally after inserting subjects" log line), but the request doesn't complete. I get this stack trace:
SyntaxError: Bad control character in string literal in JSON at position 289
    at JSON.parse (<anonymous>)
    at parse (/Users/maxc/Documents/repos/wanikani-flashcards/backend/node_modules/body-parser/lib/types/json.js:92:19)
    at /Users/maxc/Documents/repos/wanikani-flashcards/backend/node_modules/body-parser/lib/read.js:128:18
    at AsyncResource.runInAsyncScope (node:async_hooks:206:9)
    at invokeCallback (/Users/maxc/Documents/repos/wanikani-flashcards/backend/node_modules/raw-body/index.js:238:16)
    at done (/Users/maxc/Documents/repos/wanikani-flashcards/backend/node_modules/raw-body/index.js:227:7)
    at IncomingMessage.onEnd (/Users/maxc/Documents/repos/wanikani-flashcards/backend/node_modules/raw-body/index.js:287:7)
    at IncomingMessage.emit (node:events:514:28)
    at endReadableNT (node:internal/streams/readable:1376:12)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
The JSON object that is breaking is pasted here. The character that breaks is at data -> characters.
Hex of that character is here:
000001c0: 2020 2020 2020 2020 2020 2022 6368 6172 "char
000001d0: 6163 7465 7273 223a 2022 e4b8 8022 2c0a acters": "...",.
The character at position 289 is the Japanese character 一. How do I account for this character and others like it?
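For context, a multi-byte character such as 一 is legal inside a JSON string, so it may not be the real cause. `JSON.parse` rejects raw control characters (code points below U+0020, such as an unescaped newline or tab) inside string literals. The check below is a minimal sketch of that difference; `parseErrorName` is a hypothetical helper introduced only for illustration.

```javascript
// A multi-byte character is valid inside a JSON string literal.
const kanji = JSON.parse('{"characters": "一"}');

// Hypothetical helper: returns the error name if parsing fails, otherwise null.
function parseErrorName(text) {
  try {
    JSON.parse(text);
    return null;
  } catch (error) {
    return error.name;
  }
}

// '\n' in this JavaScript source puts a raw newline (U+000A) inside the JSON string.
const rawNewline = parseErrorName('{"characters": "a\nb"}');

// '\\n' puts the two characters backslash + n into the JSON text, a valid escape.
const escapedNewline = parseErrorName('{"characters": "a\\nb"}');

console.log(kanji.characters, rawNewline, escapedNewline);
```

If the payload is built with `JSON.stringify` on the client, control characters are escaped automatically. A body assembled by hand or pasted from a file may still contain raw ones.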
My problem ended up not being with the JSON at all. I was breaking up a large JSON dataset into ten requests to the database. Six of them would go through successfully, but four would not resolve. I saw the Bad Control Character errors and thought it was a JSON problem, but the actual problem was that I was not calling response.sendStatus() at the end of the /fillDatabase endpoint. Adding that made all requests complete successfully. The Bad Control Character errors still occur, but as far as I can tell all the data is getting in correctly.
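A rough sketch of the handler with the response added might look like the following. The factory name `makeFillDatabaseHandler` and the status codes 200 and 500 are my own assumptions, not something from the original code; the handler only needs an object that exposes `insertMany()`, so it can be tried without a live database.

```javascript
// Hypothetical factory: builds the route handler around any object that has insertMany().
function makeFillDatabaseHandler(subjectsTable) {
  return async function fillDatabase(request, response) {
    const subjects = request.body;
    try {
      const result = await subjectsTable.insertMany(subjects);
      console.log(`inserted ${result.insertedCount} subjects`);
      // Always end the request; without this the client waits indefinitely.
      response.sendStatus(200);
    } catch (error) {
      console.log('failed to insert subjects', error);
      // Assumed choice: report a server error so the client can retry the batch.
      response.sendStatus(500);
    }
  };
}

// Assumed wiring, reusing the names from the question:
// const subjectsTable = databaseClient.db('wanikani_db').collection('subjects');
// app.post('/fillDatabase', jsonParser, makeFillDatabaseHandler(subjectsTable));
```

Using `try`/`catch` with `await` instead of chaining `.then()` and `.catch()` keeps both outcomes in one place, so every path through the handler sends exactly one response.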