Tags: javascript, mongodb, nestjs, heap-memory, event-stream

Reading and parsing files, then inserting documents with NestJS and MongoDB, causes "JavaScript heap out of memory"


My NestJS application has a simple purpose: read a set of files, parse each line, and insert the resulting documents into MongoDB.

The most important part of my code consists of this loop:

    for (const file of FILES) {
        result = await this.processFile(file);
        resultInsert += result;
    }

and the processFile() function:

    async processFile(fileName: string): Promise<number> {
            let count = 0;
    
            return new Promise((resolve, reject) => {
                let s = fs
                .createReadStream(BASE_PATH + fileName, {encoding: 'latin1'})
                .pipe(es.split())
                .pipe(
                    es
                        .mapSync(async (line: string) => {
                            
                            count++;
                            console.log(line);
                            let line_splited = line.split("@");                            
                            let user = {
                                name: line_splited[0],
                                age: line_splited[1],
                                address: line_splited[2],
                                job: line_splited[3],
                                country: line_splited[4]
                            }
                            
                            await this.userModel.updateOne(
                                user,
                                user,
                                { upsert: true }
                            );
                                    
                               
                        })
                        .on('end', () => {
                            resolve(count);
                        })
                        .on('error', err => {
                            reject(err);
                        })
                );    
            });
        }

The main problem is that around the 9th file the process fails with a memory error: Allocation failed - JavaScript heap out of memory. My problem looks similar to "Parsing huge logfiles in Node.js - read in line-by-line", but even with that approach the code still fails.

I suspect the cause is that when I open the next file, the inserts from the previous one are still in progress, but I don't know how to handle that.


Solution

  • I got it working by replacing updateOne() with insertMany().
    Quick explanation: instead of writing documents one at a time, I insert them in batches of 100k. I collect the users in an array and, once it reaches 100k documents, insert them all with a single insertMany() call.
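
The batching idea can be sketched roughly as follows. This is not the original code: it assumes Node's built-in readline module in place of event-stream, and insertBatch is a hypothetical callback standing in for something like docs => this.userModel.insertMany(docs). The names insertInBatches, parseLine and InsertBatch are illustrative only.

```typescript
import * as fs from 'fs';
import * as readline from 'readline';

// Shape of one parsed record, mirroring the fields built in processFile().
interface User {
  name: string;
  age: string;
  address: string;
  job: string;
  country: string;
}

// Hypothetical stand-in for `docs => this.userModel.insertMany(docs)`.
type InsertBatch = (docs: User[]) => Promise<unknown>;

// Split one '@'-delimited line into a User; missing fields become ''.
function parseLine(line: string): User {
  const p = line.split('@');
  return {
    name: p[0] ?? '',
    age: p[1] ?? '',
    address: p[2] ?? '',
    job: p[3] ?? '',
    country: p[4] ?? '',
  };
}

// Buffer parsed lines and flush every `batchSize` documents.
// Each flush is awaited before more lines are consumed, so reading
// cannot run far ahead of the database writes.
async function insertInBatches(
  lines: AsyncIterable<string>,
  insertBatch: InsertBatch,
  batchSize = 100000,
): Promise<number> {
  let batch: User[] = [];
  let count = 0;
  for await (const line of lines) {
    if (line.trim() === '') continue; // skip blank lines
    batch.push(parseLine(line));
    if (batch.length >= batchSize) {
      await insertBatch(batch);
      count += batch.length;
      batch = [];
    }
  }
  if (batch.length > 0) {
    // flush the final, partially filled batch
    await insertBatch(batch);
    count += batch.length;
  }
  return count;
}

// File-based wrapper: stream one file line by line through the batcher.
async function processFile(
  path: string,
  insertBatch: InsertBatch,
  batchSize = 100000,
): Promise<number> {
  const rl = readline.createInterface({
    input: fs.createReadStream(path, { encoding: 'latin1' }),
    crlfDelay: Infinity,
  });
  return insertInBatches(rl, insertBatch, batchSize);
}
```

Inside the service, this would be called as something like processFile(BASE_PATH + fileName, docs => this.userModel.insertMany(docs)). Keep in mind that insertMany() always inserts, so unlike the upsert in the question it does not skip documents that already exist.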