Description
Summary
In our integration test suite, we tear down the LDS Live Source by calling schema.__pgLdsSource?.close(). This does not work consistently, however: from time to time (apparently when the server, which also runs the database in a Docker container, is under higher load) it fails to stop the LDS loop, and the test job (a GitHub Action) only ends by timing out, which fails the build step.
In the meantime, it keeps logging:
Error during LDS loop: Pool has been closed
I believe I tracked this down to a bug in graphile-engine/packages/lds/src/index.ts (lines 96 to 101 in 6692c2a):
async function loop() {
  try {
    const rows = await client.getChanges(null, 500);
    …
  } catch (e) {
    console.error("Error during LDS loop:", e.message);
    // Recovery time...
    loopTimeout = setTimeout(loop, sleepDuration * 10);
    return;
  }
  loopTimeout = setTimeout(loop, sleepDuration);
}
loop();
While the client.getChanges query is awaited, no timeout is scheduled, so the clearTimeout call is ineffective: once the query settles, the loop schedules its next run, and so it starts over even after .close() has been called.
Steps to reproduce
By setting sleepDuration to a minimal value, the loop spends most of its time in the query, so a close() call is much less likely to land during the sleep, which makes the failure easier to trigger.
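The race can also be sketched without a database. The following is only an illustration under assumptions: the fake client, the 20 ms query time, and the names makeFakeClient, startBuggyLoop and reproduce are hypothetical and not part of @graphile/lds, and the maxIterations cap exists solely so that the script terminates.

```typescript
// Hypothetical stand-in for the LDS client: getChanges() just takes a while.
interface FakeClient {
  calls: number;
  getChanges(): Promise<unknown[]>;
}

const delay = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

function makeFakeClient(queryMs: number): FakeClient {
  const client: FakeClient = {
    calls: 0,
    async getChanges() {
      client.calls++;
      await delay(queryMs);
      return [];
    },
  };
  return client;
}

// Same shape as the loop quoted in the summary: a timer handle only exists
// while the loop is sleeping, not while the query is in flight.
function startBuggyLoop(
  client: FakeClient,
  sleepDuration: number,
  maxIterations: number // demo-only cap so the script can exit
) {
  let loopTimeout: ReturnType<typeof setTimeout> | undefined;
  let iterations = 0;

  async function loop() {
    if (iterations >= maxIterations) return;
    iterations++;
    try {
      await client.getChanges();
    } catch (e) {
      loopTimeout = setTimeout(loop, sleepDuration * 10);
      return;
    }
    loopTimeout = setTimeout(loop, sleepDuration);
  }
  loop();

  return {
    close() {
      // Has no effect while the query is awaited: no timer is pending.
      clearTimeout(loopTimeout);
    },
  };
}

// Resolves to the number of queries that were issued after close().
async function reproduce(): Promise<number> {
  const client = makeFakeClient(20);
  const source = startBuggyLoop(client, 1, 5);
  await delay(5); // the first query is still in flight here
  source.close();
  const callsAtClose = client.calls;
  await delay(200);
  return client.calls - callsAtClose;
}
```

With these numbers, close() lands while the first query is pending, so reproduce() should resolve to a value greater than zero, meaning the loop kept querying after it was closed.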
Possible Solution
I can see a few approaches:
- just set a closed flag and check this before calling setTimeout again
- .unref() each timer? I think this might have worked in my case, but probably not a generic solution
- check whether the client.pool is closed before calling getChanges(), and breaking out of the loop
- check whether the client.pool is closed in the catch handler of the loop, breaking out from there
- a specific "client closed" exception that is handled differently in the catch of the loop, breaking out from there
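The first option (a closed flag) might look roughly like the sketch below. It is not the actual @graphile/lds code: ChangeSource, startGuardedLoop, pause and queriesAfterClose are hypothetical names, and the client is a fake whose query takes 20 ms.

```typescript
// Hypothetical minimal interface for whatever provides getChanges().
interface ChangeSource {
  getChanges(): Promise<unknown[]>;
}

function startGuardedLoop(client: ChangeSource, sleepDuration: number) {
  let closed = false;
  let loopTimeout: ReturnType<typeof setTimeout> | undefined;

  async function loop() {
    if (closed) return;
    try {
      await client.getChanges();
    } catch (e) {
      // After close(), errors such as "Pool has been closed" are expected.
      if (closed) return;
      console.error(
        "Error during LDS loop:",
        e instanceof Error ? e.message : e
      );
      // Recovery time...
      loopTimeout = setTimeout(loop, sleepDuration * 10);
      return;
    }
    // close() may have been called while the query was in flight.
    if (closed) return;
    loopTimeout = setTimeout(loop, sleepDuration);
  }
  loop();

  return {
    close() {
      closed = true; // stops a loop that is currently awaiting the query
      clearTimeout(loopTimeout); // stops a loop that is currently sleeping
    },
  };
}

const pause = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves to the number of queries that were issued after close().
async function queriesAfterClose(): Promise<number> {
  let calls = 0;
  const fakeClient: ChangeSource = {
    async getChanges() {
      calls++;
      await pause(20);
      return [];
    },
  };
  const source = startGuardedLoop(fakeClient, 1);
  await pause(5); // the first query is still in flight here
  source.close();
  const callsAtClose = calls;
  await pause(100);
  return calls - callsAtClose;
}
```

The flag is checked after the await as well as before it, because that is the window in which clearTimeout alone has nothing to cancel. In this sketch, queriesAfterClose() should resolve to 0.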