node.js amazon-ec2 knex.js nvm objection.js

Knex timeout - what is the best approach to fix it?


I have an application backend hosted on Amazon Linux AMI, which I set up approximately 6 months ago. Initially, I used Node.js 17.x without much consideration as it was part of my AWS AMI setup.

However, I've noticed a recurring issue: every time there's a long break in activity, any database query invocation (including SELECT statements) results in a Knex timeout error:

Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?

Strangely, upon the second consecutive call, the query executes successfully.

After researching, I found that this is a well-known problem (see "Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?"), and people have managed to resolve it either by changing the Objection/Knex version or by adjusting their Node.js version.

Initially, I attempted to fix it by upgrading to:

"knex": "^3.1.0",
"objection": "^3.1.4"

However, I encountered warnings from Objection and Knex during this process, and the issue persisted. Ultimately, I successfully resolved it by downgrading to Node.js v12.22.12 (Erbium) using NVM.

Although I haven't encountered any problems with this older Node.js version so far, I'm uncertain if it's a sustainable solution in the long term. I'm also unsure if there are any Knex/Objection configuration tweaks that I'm overlooking.

Here's my current production configuration:

{
  client: "postgresql",
  connection: {
    host: "<host>",
    database: "<dbname>",
    user: "<user>",
    password: "<password>"
  },
  pool: {
    min: 2,
    max: 10
  },
  migrations: {
    tableName: "knex_migrations"
  }
}

In summary:

  1. Should I create a new Amazon VM image with Node.js 12.22 to ensure stability, or should I experiment with Knex configuration changes to see if they work with the latest Node.js version?
  2. Note: My current AWS Linux setup prevents me from upgrading Node.js beyond 17.x for some reason, as explained in this Stack Overflow post. Unless I'm overlooking a solution, upgrading Node.js isn't an option.

Thanks in advance!


Solution

  • The issue actually lies with Knex's connection pool handler, tarn. There is a known problem where tarn doesn't properly handle the connections it is asked to keep alive (the pool.min amount): they don't get recycled correctly. I don't know the full technical details off the top of my head, but I ran into the same issue, and the solution is to set pool.min to 0. That way the pool closes idle connections properly instead of holding on to stale ones under the hood. The timeout error you're seeing typically shows up anywhere from 5 to 15 minutes after the last request to the Knex instance, so if you want to test min = 0, wait more than 15 minutes to confirm the behavior is gone.

    Alternative solutions proposed in other posts, such as "Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?", suggest using the propagateCreateError option. However, the Knex team highly discourages it, and although you can use it, I am not familiar enough with it to say what behavior to expect.

    I would strongly discourage downgrading Node.js as a solution to this problem.
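
    For reference, here is a minimal sketch of the pool.min change applied to the config from the question. The helper name withZeroMinPool is made up for illustration and is not part of Knex; the connection placeholders are kept exactly as posted. It only builds the config object, so it assumes you pass the result to knex yourself.

```javascript
// Hypothetical helper (not a Knex API): returns a copy of a Knex config
// whose pool is allowed to shrink to zero idle connections.
function withZeroMinPool(config) {
  return {
    ...config,
    // Keep any existing pool settings (such as max), override only min.
    pool: { ...(config.pool || {}), min: 0 },
  };
}

// The production config from the question, placeholders left untouched.
const productionConfig = {
  client: "postgresql",
  connection: {
    host: "<host>",
    database: "<dbname>",
    user: "<user>",
    password: "<password>",
  },
  pool: { min: 2, max: 10 },
  migrations: { tableName: "knex_migrations" },
};

const patchedConfig = withZeroMinPool(productionConfig);
console.log(patchedConfig.pool); // { min: 0, max: 10 }

// Assumed usage, with knex installed in the project:
// const knex = require("knex")(patchedConfig);
```

    The original object is not mutated, so the same pattern can be applied per environment if only production shows the idle timeout.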