-
Notifications
You must be signed in to change notification settings - Fork 345
feat(ckbtc): Add get_utxos_cache to reduce latency of update_balance calls #4788
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this pull request affects the behavior of any canister owned by
the Governance team, remember to update the corresponding
unreleased_changes.md file(s).
To acknowldge this reminder (and unblock the PR), dismiss this
code review by going to the bottom of the pull request page, and
supply one of the following reasons:
-
Done.
-
No canister behavior changes.
c5fde3e
to
f5079ba
Compare
No NNS governance canister behavior changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice PR @ninegua! Some minor comments on my side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mducroux for a careful review! I should have addressed all the concerns, please let me know if the changes look good to you. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM @ninegua! Thanks for addressing my comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @ninegua for this PR! Mostly some minor comments and understanding questions, otherwise code looks very good!
@@ -208,6 +208,9 @@ type InitArgs = record { | |||
|
|||
/// The canister id of the KYT canister (deprecated, use btc_checker_principal instead). | |||
kyt_principal: opt principal; | |||
|
|||
/// The expiration duration (in nanoseconds) for cached entries in the get_utxos cache. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit1: any reason why to have this in nanoseconds? We don't have that precision, so I would say something like seconds should be good enough no? I think this would also help when verifying a proposal to ensure that we don't have an ungodly amount of zeroes to count.
nit2: for clarity when verifying for example a proposal, we could add the unit as a suffix to the field name, e.g. get_utxos_cache_expiration_seconds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! Changed to seconds.
rs/bitcoin/ckbtc/minter/src/lib.rs
Outdated
@@ -1340,3 +1352,96 @@ impl From<u64> for Timestamp { | |||
Self(timestamp) | |||
} | |||
} | |||
|
|||
#[derive(Clone, Eq, PartialEq, Ord, PartialOrd, Debug, CandidType, Deserialize, Serialize)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
understanding question: why do we need CandidType
and Serialize
+Deserialize
? I would expect this to just be stored on the heap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! They are not needed. Removed.
@@ -1340,3 +1352,96 @@ impl From<u64> for Timestamp { | |||
Self(timestamp) | |||
} | |||
} | |||
|
|||
#[derive(Clone, Eq, PartialEq, Ord, PartialOrd, Debug, CandidType, Deserialize, Serialize)] | |||
struct Timestamped<Inner> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
len | ||
} | ||
|
||
pub fn set_expiration(&mut self, expiration: Duration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method potentially invalidate a lot of entries (if the expiration is much shorter than the previous one). Shouldn't we call prune
inside the method?
s.get_utxos_cache.prune(*now); | ||
s.get_utxos_cache.insert(req, res.clone(), *now) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the Rust doc of CacheWithExpiration
/// More specifically, entries are inserted with a timestamp, and
/// then all existing entries with a timestamp less than `t - expiration` are removed before
/// the new entry is inserted.
I would have assumed that prune
is actually part of insert
. What about making it a single method? Right now I would need to remember to call prune
before doing an insert
which seems a bit error-prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially I had the prune inside insert. But later I removed it and forgot to change the documentation.
Although I understand where you are coming from, but decoupling insert and prune is more flexible. For example, in tests, I don't have to worry about the order of insertions, when making assertions on len()
.
Alternatively I could provide an additional method called insert_no_prune()
, and let the regular insert()
do the pruning before insertion. WDYT?
}); | ||
} | ||
} | ||
pub fn insert<T: Into<Timestamp>>(&mut self, key: Key, value: Value, now: T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as commented below I would make prune
to be called in insert
@@ -156,6 +156,7 @@ pub async fn get_utxos<R: CanisterRuntime>( | |||
runtime: &R, | |||
) -> Result<GetUtxosResponse, CallError> { | |||
async fn bitcoin_get_utxos<R: CanisterRuntime>( | |||
now: &mut u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: why having this parameter when now could be queried with runtime.time
within the method bitcoin_get_utxos
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In tests, we drive the time by mocking the runtime.time() function, and increment the time value in there. So whenever runtime.time() is called, time will change. I had to avoid unnecessary calls to runtime.time() here so as not to break existing tests. Frankly mocking time this way is quite fragile but I don't have a better idea. WDYT?
utxos.append(&mut response.utxos); | ||
num_pages += 1; | ||
} | ||
|
||
observe_get_utxos_latency(utxos.len(), num_pages, source, start_time, runtime.time()); | ||
observe_get_utxos_latency(utxos.len(), num_pages, source, start_time, now); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we even register the latency in case of a cache hit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! If we do register, the histogram graph will have a bar of latency 0, maybe also a good comparison?
Co-authored-by: gregorydemay <112856886+gregorydemay@users.noreply.github.com>
XC-314: ckBTC: Cache update_balance results
The call to the bitcoin canister's get_utxos method takes up the majority time of a update_balance call.
Sometimes a client may call update_balance repeatedly, which adds to the workload of the ckBTC minter canister.
The result of get_utxos would remain unchanged until the bitcoin canister receives a new block, which makes it a good candidate for caching.
This PR implements an internal get_utxos_cache that will reuse previous call results (if available and not expired) without making new get_utxos calls. Entries are expired based on a pre-configured get_utxos_cache_expiration setting.
Caching get_utxos results is safe for update_balance because: