Skip to content

fix: fixed flaky/failing acceptance tests #3777

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

quiet-node
Copy link
Contributor

Description:

  • fixed batch request WS release test due to mismatch request ID
  • fixed flaky "eth_getBalance" with block number in the last 15 minutes for account that has performed contract deploys/calls test in batch2 due to race condition during initialization

Related issue(s):

Fixes #3665
Fixes #3666

Notes for reviewer:

Checklist

  • Documented (Code comments, README, etc.)
  • Tested (unit, integration, etc.)

Signed-off-by: Logan Nguyen <logan.nguyen@swirldslabs.com>
@quiet-node quiet-node added this to the 0.69.0 milestone May 20, 2025
@quiet-node quiet-node self-assigned this May 20, 2025
@quiet-node quiet-node requested review from a team as code owners May 20, 2025 23:42
@quiet-node quiet-node added the internal For changes that affect the project's internal workings but not its outward-facing functionality. label May 20, 2025
@quiet-node quiet-node linked an issue May 20, 2025 that may be closed by this pull request
@quiet-node quiet-node requested a review from acuarica May 20, 2025 23:42
@lfdt-bot
Copy link

lfdt-bot commented May 20, 2025

🎉 Snyk checks have passed. No issues have been found so far.

security/snyk check is complete. No issues have been found. (View Details)

license/snyk check is complete. No issues have been found. (View Details)

@quiet-node quiet-node force-pushed the 3665-fix-web-server-release-test-due-to-mismatch-in-request-id branch from e09476b to 23c5fb7 Compare May 20, 2025 23:43
Copy link

github-actions bot commented May 20, 2025

Test Results

 21 files  ±0  260 suites  ±0   34m 58s ⏱️ +17s
649 tests ±0  643 ✅ ±0  5 💤 ±0  1 ❌ ±0 
750 runs  ±0  744 ✅ ±0  5 💤 ±0  1 ❌ ±0 

For more details on these failures, see this check.

Results for commit 38b18bc. ± Comparison against base commit bcd473b.

♻️ This comment has been updated with latest results.

Signed-off-by: Logan Nguyen <logan.nguyen@swirldslabs.com>
@quiet-node quiet-node force-pushed the 3665-fix-web-server-release-test-due-to-mismatch-in-request-id branch from 23c5fb7 to 1199e06 Compare May 21, 2025 00:12
simzzz
simzzz previously approved these changes May 21, 2025
Comment on lines 122 to 125
await relay.pollForValidTransactionReceipt(createChildTx.hash);

// added delay to wait for balances properly settled on MN
await Utils.wait(1500);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a double wait? If createChildTx is finished, doesn't it mean that balances are already settled on the MN?

Copy link
Contributor Author

@quiet-node quiet-node May 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Luis, thanks for the question. Without the extra wait there still occasional balance mismatches on production networks. I believe that once a transaction reaches consensus, the balances are updated in the Merkle tree almost instantly however for the Mirror Node to reflect those updates, there’s still a slight delay. Without the wait it should work fine, but given the busy traffic on production networks, it seems the Mirror Node sometimes takes longer to catch up, leading the flakyness we're dealing with. That's why I added the extra wait to stablize the results. Not ideal but flexible for different production environemnt.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, really helpful. Maybe we should expand the comment to include a bit of this explanation?

Also, we don't remove the flakiness entirely, this is just a mitigation. What if we poll the MN to ensure the mentioned transaction is visible from it? This way, we are 100% sure to have solved the problem. I remember seeing this mechanism in tests in the Smart Contract repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you’re right—a comment would definitely help.

As for the extra wait doesn’t eliminate the flakiness entirely, You’re correct that it’s more of a mitigation to improve stability. If the production environment needs more than 1500 ms, the issue can still occur, though that’s rare—but still possible.

Regarding polling the Mirror Node to confirm the transaction is visible, that’s actually what’s happening under the hood with await relay.pollForValidTransactionReceipt(initialFundsTx.hash), which essentially calls eth_getTransactionReceipt to fetch the receipt from the Mirror Node.

Maybe we should consider polling the Mirror Node for updated balances instead? Kinda feel like I have done this before but it was quite cumbersome somehow but let me see what I can do.

You're right we should aim for a more robust fix rather than just a mitigation. Will push in an update for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Synced offline with @acuarica and discovered that the flakiness was caused by a cached value of the latest block number. The new solution implements a polling mechanism to ensure the block is correctly updated after contract execution, resulting in stable and consistent outcomes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

@quiet-node quiet-node requested a review from acuarica May 21, 2025 18:09
acuarica
acuarica previously approved these changes May 21, 2025
Copy link
Contributor

@acuarica acuarica left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, going to approve it.

Signed-off-by: Logan Nguyen <logan.nguyen@swirldslabs.com>
@quiet-node quiet-node dismissed stale reviews from acuarica and simzzz via 38b18bc May 22, 2025 15:30
@quiet-node quiet-node requested review from simzzz and acuarica May 22, 2025 16:48
Copy link
Contributor

@acuarica acuarica left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice stuff!

Comment on lines +129 to +135
for (let i = 0; i < 5; i++) {
if (blockNumAfterCreateChildTx > blockNumBeforeCreateChildTx) {
break;
}
await Utils.wait(1500);
blockNumAfterCreateChildTx = await await relay.call(RelayCalls.ETH_ENDPOINTS.ETH_BLOCK_NUMBER, [], requestId);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple of things: remove double await, include console.log have feedback when block number is properly updated, move the blockNumAfterCreateChildTx assignment to have it only once.

Suggested change
for (let i = 0; i < 5; i++) {
if (blockNumAfterCreateChildTx > blockNumBeforeCreateChildTx) {
break;
}
await Utils.wait(1500);
blockNumAfterCreateChildTx = await await relay.call(RelayCalls.ETH_ENDPOINTS.ETH_BLOCK_NUMBER, [], requestId);
}
for (let i = 0; i < 5; i++) {
blockNumAfterCreateChildTx = await relay.call(RelayCalls.ETH_ENDPOINTS.ETH_BLOCK_NUMBER, [], requestId);
if (blockNumAfterCreateChildTx > blockNumBeforeCreateChildTx) {
console.log("Block number updated succesfully")
break;
}
await Utils.wait(1500);
}

@@ -72,7 +72,7 @@ describe('@api-batch-2 RPC Server Acceptance Tests', function () {
const FEE_SCHEDULE_FILE_CONTENT_UPDATED =
'0a280a0a08541a061a0440a8953a0a0a08061a061a0440889d2d0a0a08071a061a0440b0b63c120208011200'; // Eth gas = 953000

let blockNumberAtStartOfTests = 0;
let blockNumAfterCreateChildTx = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move this declaration closer to the loop

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal For changes that affect the project's internal workings but not its outward-facing functionality.
Projects
None yet
4 participants