javascript express body-parser

Bad control character error when parsing Japanese characters with express body-parser


I'm posting a large JSON string to a Node.js Express endpoint defined like this:

import express from 'express';
import cors from 'cors';
import bodyParser from 'body-parser';
import { MongoClient } from 'mongodb';
const app = express();
const jsonParser = bodyParser.json({ limit: '4mb' });

const databaseUri = '<db connection string>';
const databaseClient = new MongoClient(databaseUri);

app.use(cors());

app.post('/fillDatabase', jsonParser, async (request, response) => {
    const subjects = request.body;
    console.log(`Filling database with ${subjects.length} subjects`);
    const database = databaseClient.db('wanikani_db');
    const subjectsTable = database.collection('subjects');
    const result = await subjectsTable.insertMany(subjects).then((response) => {
        console.log('inserted ' + response.insertedCount + ' subjects');
    }).catch(() => {
        console.log('failed to insert subjects');
    }).finally(() => {
        console.log('finally after inserting subjects');
    });
});

When I post the JSON string, the endpoint function runs to completion (I see the "finally after inserting subjects" log line), but the request never completes. I also get this stack trace:

SyntaxError: Bad control character in string literal in JSON at position 289
    at JSON.parse (<anonymous>)
    at parse (/Users/maxc/Documents/repos/wanikani-flashcards/backend/node_modules/body-parser/lib/types/json.js:92:19)
    at /Users/maxc/Documents/repos/wanikani-flashcards/backend/node_modules/body-parser/lib/read.js:128:18
    at AsyncResource.runInAsyncScope (node:async_hooks:206:9)
    at invokeCallback (/Users/maxc/Documents/repos/wanikani-flashcards/backend/node_modules/raw-body/index.js:238:16)
    at done (/Users/maxc/Documents/repos/wanikani-flashcards/backend/node_modules/raw-body/index.js:227:7)
    at IncomingMessage.onEnd (/Users/maxc/Documents/repos/wanikani-flashcards/backend/node_modules/raw-body/index.js:287:7)
    at IncomingMessage.emit (node:events:514:28)
    at endReadableNT (node:internal/streams/readable:1376:12)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

The JSON object that breaks parsing is pasted here. The character that breaks is at data -> characters.

Hex of that character is here:

000001c0: 2020 2020 2020 2020 2020 2022 6368 6172             "char
000001d0: 6163 7465 7273 223a 2022 e4b8 8022 2c0a  acters": "...",.

The character at position 289 is the Japanese character 一. How do I account for this character and others like it?
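
One way to check what JSON.parse is actually objecting to is to inspect the character at the reported position in the raw request text. The sketch below is not from the original post: describeCharAt is a hypothetical helper and the sample strings are made up for illustration. It relies on the fact that JSON forbids unescaped control characters (U+0000 through U+001F) inside string literals, while a character such as 一 (U+4E00) is perfectly valid there. As far as I know, the position in the error message is an index into the string being parsed, not a byte offset in a hex dump, so the two numbers need not line up once multi-byte characters are involved.

```javascript
// Hypothetical helper: report the code point at a given index of a string
// and whether it is a control character that JSON forbids inside strings.
function describeCharAt(text, position) {
    const codePoint = text.codePointAt(position);
    return {
        codePoint: 'U+' + codePoint.toString(16).toUpperCase().padStart(4, '0'),
        isControl: codePoint < 0x20,
    };
}

// A Japanese character inside a JSON string parses without trouble.
const withKanji = '{"characters": "一"}';
const parsed = JSON.parse(withKanji);

// A raw (unescaped) newline inside a JSON string does not parse.
// In this JavaScript literal, \n becomes a real newline character.
const withRawNewline = '{"characters": "a\nb"}';
let parseError = null;
try {
    JSON.parse(withRawNewline);
} catch (error) {
    parseError = error;
}
```

Running describeCharAt on the raw body at the position from the stack trace would show whether the character there really is the kanji or a stray control character such as a newline or tab.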


Solution

  • My problem ended up not being with the JSON at all. I was splitting a large JSON dataset into ten requests to the database. Six of them would go through successfully, but four would never resolve. I saw the Bad Control Character errors and assumed it was a JSON problem, but the actual problem was that I never called response.sendStatus() at the end of the /fillDatabase handler. Adding that call made all the requests complete successfully. The Bad Control Character errors still occur, but as far as I can tell all the data is getting in correctly.
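
A minimal sketch of that fix, assuming the handler is restructured slightly so it can be run without a real database. makeFillDatabaseHandler is a hypothetical name, and the 200 and 500 status codes are chosen for illustration; the original answer only says that response.sendStatus() was added.

```javascript
// Hypothetical factory so the handler can be exercised without a real database.
// `subjectsTable` is anything with an insertMany(docs) method that returns a
// promise, such as a MongoDB collection.
function makeFillDatabaseHandler(subjectsTable) {
    return async (request, response) => {
        const subjects = request.body;
        try {
            const result = await subjectsTable.insertMany(subjects);
            console.log('inserted ' + result.insertedCount + ' subjects');
            // Without a call like this, the client keeps waiting for a reply.
            response.sendStatus(200);
        } catch (error) {
            console.log('failed to insert subjects');
            // Status code chosen for illustration.
            response.sendStatus(500);
        }
    };
}
```

With the setup from the question, this could be registered as app.post('/fillDatabase', jsonParser, makeFillDatabaseHandler(subjectsTable)). The point is that every code path ends by sending some response, so no request is left hanging.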