|
| 1 | +# Execution Verkle State Network |
| 2 | + |
| 3 | +This document is the specification for the sub-protocol that supports on-demand availability of Verkle state data from the execution chain. The Verkle trie is the upcoming structure for storing Ethereum state. See [EIP-6800](https://eips.ethereum.org/EIPS/eip-6800) for more details. |
| 4 | + |
| 5 | +> 🚧 THE SPEC IS IN A STATE OF FLUX AND SHOULD BE CONSIDERED UNSTABLE 🚧 |
| 6 | +
|
| 7 | +## Overview |
| 8 | + |
| 9 | +The Verkle State Network subnetwork protocol is almost identical to the [State Network](./../state/state-network.md). The main difference is in the way that data is structured and encoded. Only differences will be provided below. |
| 10 | + |
| 11 | +### Portal Network version of the Verkle Trie |
| 12 | + |
| 13 | +A high-level overview and the reasoning behind the design can be found here: [ethresear.ch/t/portal-network-verkle/19339](https://ethresear.ch/t/portal-network-verkle/19339). |
| 14 | + |
| 15 | +Portal Network stores every trie node that ever existed. For optimization reasons, each trie node is split into a 2-layer mini trie, and each node of the mini trie is stored separately in the network. The exact encoding and the derivation of the content key are specified below. |
| 16 | + |
| 17 | +To represent a trie node, the Verkle trie uses a Pedersen commitment, which is calculated using the following formula: |
| 18 | + |
| 19 | +$$C = Commit(a_0, a_1, ..., a_{255}) = a_0B_0 + a_1B_1 + ... + a_{255}B_{255}$$ |
| 20 | + |
| 21 | +where: |
| 22 | +- $B_i$ is the basis of the Pedersen commitment |
| 23 | + - fixed elliptic curve points on the Banderwagon curve (a prime-order subgroup of [Bandersnatch](https://ethresear.ch/t/introducing-bandersnatch-a-fast-elliptic-curve-built-over-the-bls12-381-scalar-field/9957)) |
| 24 | +- $a_i$ are the values we are committing to |
| 25 | + - each is a value from the elliptic curve's scalar field $F_r$ (maximum value is less than $2^{253}$) |
| 26 | +- $C$ is the commitment to the $a_i$ values, which is itself a point on the elliptic curve |
| 27 | + - in order to commit to another commitment, we map commitment $C$ to the scalar field $F_r$; we call the result the **Pedersen Hash** or **hash commitment** |
| 28 | + - these two values are frequently used interchangeably, but the mapping between them is not one-to-one |
| 29 | + - in this document, we use $C$ to indicate a commitment expressed as an elliptic curve point, and $c$ when it is mapped to the scalar field (hash commitment) |
| 30 | + |
| 31 | +#### Trie Node |
| 32 | + |
| 33 | +The Verkle trie has 2 types of nodes: |
| 34 | + |
| 35 | +- branch (inner) node: up to 256 children nodes (either branch or leaf) |
| 36 | +- leaf (extension) node: up to 256 values (32 bytes each) |
| 37 | + |
| 38 | +##### Branch (Inner) node |
| 39 | + |
| 40 | +The branch node of the Verkle trie stores up to 256 values, each of which is a hash commitment of a child trie node. |
| 41 | + |
| 42 | +$$C = c_0B_0 + c_1B_1 + ... + c_{255}B_{255}$$ |
| 43 | + |
| 44 | +For optimization reasons, Portal Network splits the branch node into a 2-layer mini trie in the following way: |
| 45 | + |
| 46 | + |
| 47 | + |
| 48 | +Each `branch-fragment` node stores the hash commitments of 16 child nodes. The commitment to those 16 children represents that fragment node and is stored inside the `branch-bundle` node. |
| 49 | + |
| 50 | +$$ |
| 51 | +\begin{align*} |
| 52 | +C^\prime_0 &= c_0B_0 + c_1B_1 +…+ c_{15}B_{15} \\ |
| 53 | +C^\prime_1 &= c_{16}B_{16} + c_{17}B_{17} +…+ c_{31}B_{31} \\ |
| 54 | +&… \\ |
| 55 | +C^\prime_{15} &= c_{240}B_{240} + c_{241}B_{241} +…+ c_{255}B_{255} \\ |
| 56 | +\end{align*} |
| 57 | +$$ |
| 58 | + |
| 59 | +The commitment of the `branch-bundle` node ($C$) is calculated as the sum of the 16 `branch-fragment` node commitments ($C^\prime_i$). |
| 60 | + |
| 61 | +$$C = C^\prime_0 + C^\prime_1 +...+ C^\prime_{15}$$ |
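The decomposition works because the commitment is linear in the basis: each fragment covers a disjoint slice of $B_0 … B_{255}$, so the fragment commitments add up to the commitment of the full node. The sketch below illustrates this with a toy additive group (integers modulo a prime) standing in for Banderwagon points. The modulus, the basis values, and all function names are hypothetical and chosen only for illustration; this is not a real Pedersen commitment.

```python
# Toy model: integers modulo a prime stand in for Banderwagon points.
# P and BASIS are illustrative assumptions, not real curve parameters.
P = 2**61 - 1
BASIS = [pow(3, i + 1, P) for i in range(256)]  # stand-in for B_0..B_255


def commit(scalars, offset=0):
    """Return sum(scalars[k] * B_{offset + k}) in the toy group."""
    return sum(s * BASIS[offset + k] for k, s in enumerate(scalars)) % P


def fragment_commitments(children):
    """Split 256 child hash commitments into the 16 fragment commitments C'_i."""
    if len(children) != 256:
        raise ValueError("a branch node has exactly 256 child slots")
    return [
        commit(children[16 * i : 16 * (i + 1)], offset=16 * i)
        for i in range(16)
    ]


def bundle_commitment(children):
    """C = C'_0 + C'_1 + ... + C'_15."""
    return sum(fragment_commitments(children)) % P


# The bundle commitment equals the commitment over all 256 children.
children = [(7 * i + 1) % P for i in range(256)]
assert bundle_commitment(children) == commit(children)
```

Missing children are modelled as the scalar `0`, which contributes nothing to the sum.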
| 62 | + |
| 63 | +##### Leaf (extension) node |
| 64 | + |
| 65 | +The leaf node of the Verkle trie stores up to 256 values, each 32 bytes long. Because a 32-byte value doesn't fit into the scalar field, the commitment of the leaf node ($C$) is calculated in the following way: |
| 66 | + |
| 67 | +$$C = Commit(marker, stem, C_1, C_2)$$ |
| 68 | + |
| 69 | +$$ |
| 70 | +\begin{align*} |
| 71 | +C_1 &= Commit(v_0^{low+access}, v_0^{high}, v_1^{low+access}, v_1^{high}, ... , v_{127}^{low+access}, v_{127}^{high}) \\ |
| 72 | +C_2 &= Commit(v_{128}^{low+access}, v_{128}^{high}, v_{129}^{low+access}, v_{129}^{high}, ... , v_{255}^{low+access}, v_{255}^{high}) \\ |
| 73 | +\end{align*} |
| 74 | +$$ |
| 75 | + |
| 76 | +where: |
| 77 | + |
| 78 | +- $marker$ - currently only value $1$ is used |
| 79 | +- $stem$ - the path from the root of the trie (31 bytes) |
| 80 | +- $v_i^{low+access}$ - the lower 16 bytes of the value $v_i$, plus $2^{128}$ if the value is modified |
| 81 | + - note that if the value is not modified, it is equal to $0$ |
| 82 | +- $v_i^{high}$ - the higher 16 bytes of the value $v_i$ |
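As a minimal sketch of the split described above, the function below turns one 32-byte value into the two scalars $v_i^{low+access}$ and $v_i^{high}$. It assumes the first 16 bytes are the "lower" half and that both halves are interpreted as little-endian integers; this section does not state the byte order, so treat that as an assumption. The names `split_value` and `ACCESS_MARKER` are hypothetical.

```python
from typing import Optional, Tuple

ACCESS_MARKER = 2**128  # added to the low half when the value is modified


def split_value(value: Optional[bytes]) -> Tuple[int, int]:
    """Return (v_low_access, v_high) for one leaf value.

    `None` models a value that was never modified: both scalars are 0.
    Byte order (little-endian) is an assumption of this sketch.
    """
    if value is None:
        return 0, 0
    if len(value) != 32:
        raise ValueError("leaf values are exactly 32 bytes")
    low = int.from_bytes(value[:16], "little") + ACCESS_MARKER
    high = int.from_bytes(value[16:], "little")
    return low, high
```

Both results stay below $2^{129}$, so they fit in the scalar field, and a modified all-zero value (`2**128, 0`) remains distinguishable from an unmodified one (`0, 0`).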
| 83 | + |
| 84 | + |
| 85 | +For optimization reasons, Portal Network splits the leaf node into a 2-layer mini trie in the following way: |
| 86 | + |
| 87 | + |
| 88 | + |
| 89 | +Each of the `leaf-fragment` nodes stores up to 16 values (32 bytes each). |
| 90 | + |
| 91 | +The commitment to those 16 values ($C^\prime_i$) represents that fragment node and is stored inside the `leaf-bundle` node. |
| 92 | + |
| 93 | +$$ |
| 94 | +\begin{align*} |
| 95 | +C^\prime_0 &= v_0^{low+access}B_0 + v_0^{high}B_1 + v_1^{low+access}B_2 + v_1^{high}B_3 +…+ v_{15}^{low+access}B_{30} + v_{15}^{high}B_{31} \\ |
| 96 | +C^\prime_1 &= v_{16}^{low+access}B_{32} + v_{16}^{high}B_{33} + v_{17}^{low+access}B_{34} + v_{17}^{high}B_{35} +…+ v_{31}^{low+access}B_{62} + v_{31}^{high}B_{63} \\ |
| 97 | +&… \\ |
| 98 | +C^\prime_7 &= v_{112}^{low+access}B_{224} + v_{112}^{high}B_{225} + v_{113}^{low+access}B_{226} + v_{113}^{high}B_{227} +…+ v_{127}^{low+access}B_{254} + v_{127}^{high}B_{255} \\ |
| 99 | +\\ |
| 100 | +C^\prime_8 &= v_{128}^{low+access}B_{0} + v_{128}^{high}B_{1} + v_{129}^{low+access}B_{2} + v_{129}^{high}B_{3} +…+ v_{143}^{low+access}B_{30} + v_{143}^{high}B_{31} \\ |
| 101 | +&… \\ |
| 102 | +C^\prime_{15} &= v_{240}^{low+access}B_{224} + v_{240}^{high}B_{225} + v_{241}^{low+access}B_{226} + v_{241}^{high}B_{227} +…+ v_{255}^{low+access}B_{254} + v_{255}^{high}B_{255} \\ |
| 103 | +\end{align*} |
| 104 | +$$ |
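The intended index pattern is that value $v_i$ occupies the basis elements $B_{2(i \bmod 128)}$ and $B_{2(i \bmod 128)+1}$, with values 0..127 contributing to $C_1$ and values 128..255 to $C_2$. The helper below is a small sketch of that bookkeeping; the function name and the returned tuple layout are hypothetical.

```python
from typing import Tuple


def leaf_value_position(value_index: int) -> Tuple[int, int, int, int]:
    """Locate value v_i inside the leaf mini trie.

    Returns (fragment_index, suffix_commitment, low_basis, high_basis), where
    suffix_commitment is 1 for C_1 and 2 for C_2, and low_basis / high_basis
    are the indices of the basis elements multiplied by v_i^{low+access} and
    v_i^{high} respectively.
    """
    if not 0 <= value_index < 256:
        raise ValueError("value index must be in 0..255")
    fragment_index = value_index // 16               # which C'_i holds the value
    suffix_commitment = 1 if value_index < 128 else 2
    low_basis = 2 * (value_index % 128)              # basis index restarts for C_2
    return fragment_index, suffix_commitment, low_basis, low_basis + 1
```

For example, `leaf_value_position(129)` yields fragment 8, commitment $C_2$, and basis indices 2 and 3.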
| 105 | + |
| 106 | +The commitment of the `leaf-bundle` node ($C$) is calculated in the following way: |
| 107 | + |
| 108 | +$$ |
| 109 | +\begin{align*} |
| 110 | +C_1 &= C^\prime_0 + C^\prime_1 + … + C^\prime_7 \\ |
| 111 | +C_2 &= C^\prime_8 + C^\prime_9 + … + C^\prime_{15} |
| 112 | +\end{align*} |
| 113 | +$$ |
| 114 | + |
| 115 | +$$ |
| 116 | +C = marker \cdot B_0 + stem \cdot B_1 + c_1B_2 + c_2B_3 |
| 117 | +$$ |
| 118 | + |
| 119 | + |
| 120 | +## Specification |
| 121 | + |
| 122 | +### Protocol Identifier |
| 123 | + |
| 124 | +As specified in the [Protocol identifiers](./../portal-wire-protocol.md#protocol-identifiers) section of the Portal wire protocol, the `protocol` field in the `TALKREQ` message **MUST** contain the value of `0x500E`. |
| 125 | + |
| 126 | +### Helper Data Types |
| 127 | + |
| 128 | +#### Path and Stem |
| 129 | + |
| 130 | +The Path represents the trie path from the root to the branch node. The Stem represents the first 31 bytes of the Verkle Trie key. |
| 131 | + |
| 132 | +``` |
| 133 | +Path := List[uint8; 30] |
| 134 | +Stem := Bytes31 |
| 135 | +``` |
| 136 | + |
| 137 | +#### Commitment |
| 138 | + |
| 139 | +Both an elliptic curve point (commitment) and a scalar field value (hash commitment) can be encoded using 32 bytes. We define them separately in order to be explicit. |
| 140 | + |
| 141 | +``` |
| 142 | +EllipticCurvePoint := Bytes32 |
| 143 | +ScalarFieldValue := Bytes32 |
| 144 | +``` |
| 145 | + |
| 146 | +#### Bundle Commitment Proof |
| 147 | + |
| 148 | +**⚠️ This section needs more research and detailed specification. ⚠️** |
| 149 | + |
| 150 | +In order to prevent bad actors from creating false data for the `bundle` nodes of the mini tries, we have to create and include a proof that the fragment commitments are correct. The exact proof schema is being researched. |
| 151 | + |
| 152 | +``` |
| 153 | +BundleProof := Bytes1024 |
| 154 | +``` |
| 155 | + |
| 156 | +#### Trie Proof |
| 157 | + |
| 158 | +Using IPA and Multiproof, the same proving scheme that Verkle uses, we can prove in a memory-efficient way that any node or value is included in a trie. |
| 159 | + |
| 160 | +**⚠️ This section needs detailed specification. ⚠️** |
| 161 | + |
| 162 | +Exact details of the specification are yet to be decided. We provide only temporary types (based on the execution witness from the Verkle devnet). |
| 163 | + |
| 164 | +``` |
| 165 | +IpaProof := Container( |
| 166 | + cl: Vector[EllipticCurvePoint; 8], |
| 167 | + cr: Vector[EllipticCurvePoint; 8], |
| 168 | + final_evaluation: ScalarFieldValue, |
| 169 | + ) |
| 170 | +MultiPointProof := Container( |
| 171 | + open_proof: IpaProof, |
| 172 | + g_x: EllipticCurvePoint, |
| 173 | + ) |
| 174 | +
|
| 175 | +TrieProof := Container( |
| 176 | + commitments_by_path: List[EllipticCurvePoint; 32], |
| 177 | + multiproof: MultiPointProof, |
| 178 | + ) |
| 179 | +``` |
| 180 | + |
| 181 | +#### Children |
| 182 | + |
| 183 | +All our nodes store up to 16 children. We encode a bitmap of the present children, followed by the children themselves. |
| 184 | + |
| 185 | +``` |
| 186 | +Children[type] := Container(bitmap: Bitvector(16), children: List[type; 16]) |
| 187 | +``` |
| 188 | + |
| 189 | +Note that the count of set bits inside `bitmap` field MUST be equal to the length of the `children` field. The order of the children is from the lowest to the highest set bit. |
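A minimal sketch of the bitmap rule above, operating on plain Python lists rather than SSZ types: `pack_children` turns 16 optional slots into a `(bitmap, children)` pair, and `unpack_children` reverses it while enforcing that the number of set bits equals the number of children. Both function names are hypothetical, and SSZ serialization itself is out of scope here.

```python
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def pack_children(slots: Sequence[Optional[T]]) -> Tuple[List[bool], List[T]]:
    """Build (bitmap, children) from 16 slots; `None` marks a missing child."""
    if len(slots) != 16:
        raise ValueError("expected exactly 16 slots")
    bitmap = [slot is not None for slot in slots]
    # Children are listed from the lowest to the highest set bit.
    children = [slot for slot in slots if slot is not None]
    return bitmap, children


def unpack_children(bitmap: Sequence[bool], children: Sequence[T]) -> List[Optional[T]]:
    """Expand (bitmap, children) back into 16 slots, validating the counts."""
    if len(bitmap) != 16:
        raise ValueError("bitmap must have 16 bits")
    if sum(bitmap) != len(children):
        raise ValueError("set bits in bitmap must match number of children")
    remaining = iter(children)
    return [next(remaining) if bit else None for bit in bitmap]
```

A receiver would typically reject any `Children` value for which `unpack_children` raises.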
| 190 | + |
| 191 | +#### Trie node |
| 192 | + |
| 193 | +Each trie node has a different type and different proof. |
| 194 | + |
| 195 | +``` |
| 196 | +BranchBundleNode := Container( |
| 197 | + fragments: Children[EllipticCurvePoint], |
| 198 | + proof: BundleProof, |
| 199 | + ) |
| 200 | +BranchBundleNodeWithProof := Container( |
| 201 | + node: BranchBundleNode, |
| 202 | + block_hash: Bytes32, |
| 203 | + path: Path, |
| 204 | + proof: Union[None, TrieProof], |
| 205 | + ) |
| 206 | +
|
| 207 | +BranchFragmentNode := Container( |
| 208 | + fragment_index: uint8, |
| 209 | + children: Children[EllipticCurvePoint], |
| 210 | + ) |
| 211 | +BranchFragmentNodeWithProof := Container( |
| 212 | + node: BranchFragmentNode, |
| 213 | + block_hash: Bytes32, |
| 214 | + path: Path, |
| 215 | + proof: Union[None, TrieProof], |
| 216 | + ) |
| 217 | +
|
| 218 | +LeafBundleNode := Container( |
| 219 | + marker: uint32, |
| 220 | + stem: Stem, |
| 221 | + fragments: Children[EllipticCurvePoint], |
| 222 | + proof: BundleProof, |
| 223 | + ) |
| 224 | +LeafBundleNodeWithProof := Container( |
| 225 | + node: LeafBundleNode, |
| 226 | + block_hash: Bytes32, |
| 227 | + proof: TrieProof, |
| 228 | + ) |
| 229 | +
|
| 230 | +LeafFragmentNode := Container( |
| 231 | + fragment_index: uint8, |
| 232 | + values: Children[Bytes32], |
| 233 | + ) |
| 234 | +LeafFragmentNodeWithProof := Container( |
| 235 | + node: LeafFragmentNode, |
| 236 | + block_hash: Bytes32, |
| 237 | + proof: TrieProof, |
| 238 | + ) |
| 239 | +``` |
| 240 | + |
| 241 | +It's worth noting that `proof` is `None` for the branch-bundle and branch-fragments that correspond to the root of the trie (in which case `path` is empty as well). |
| 242 | + |
| 243 | +### Content types |
| 244 | + |
| 245 | +#### Content keys |
| 246 | + |
| 247 | +When looking up a bundle node, we don't know whether to expect a branch-bundle or a leaf-bundle node. For this reason, they use the same content key type. |
| 248 | + |
| 249 | +The branch-fragment key has to be different from the branch-bundle key, because they can have the same commitment (in case the other fragments from that bundle are zero). |
| 250 | + |
| 251 | +The leaf-fragment key should include the `stem` in order to avoid hot-spots. Other keys don't have to worry about hot-spots because they are built on top of leaf-bundle nodes, which already include the `stem` in their commitment (effectively guaranteeing uniqueness). |
| 252 | + |
| 253 | +``` |
| 254 | +bundle_node_key := EllipticCurvePoint |
| 255 | +bundle_content_key := 0x30 + SSZ.serialize(bundle_node_key) |
| 256 | +
|
| 257 | +branch_fragment_node_key := EllipticCurvePoint |
| 258 | +branch_fragment_content_key := 0x31 + SSZ.serialize(branch_fragment_node_key) |
| 259 | +
|
| 260 | +leaf_fragment_node_key := Container(stem: Stem, commitment: EllipticCurvePoint) |
| 261 | +leaf_fragment_content_key := 0x32 + SSZ.serialize(leaf_fragment_node_key) |
| 262 | +``` |
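As a sketch of how these keys could be assembled: SSZ serializes a fixed-size byte vector as its raw bytes and a container of fixed-size fields as the concatenation of those fields, so no SSZ library is needed for these three key types. The function names below are hypothetical.

```python
def _fixed(data: bytes, size: int) -> bytes:
    """Check that `data` has the exact length of the fixed-size SSZ type."""
    if len(data) != size:
        raise ValueError(f"expected {size} bytes, got {len(data)}")
    return data


def bundle_content_key(commitment: bytes) -> bytes:
    """0x30 + EllipticCurvePoint (shared by branch-bundle and leaf-bundle)."""
    return b"\x30" + _fixed(commitment, 32)


def branch_fragment_content_key(commitment: bytes) -> bytes:
    """0x31 + EllipticCurvePoint."""
    return b"\x31" + _fixed(commitment, 32)


def leaf_fragment_content_key(stem: bytes, commitment: bytes) -> bytes:
    """0x32 + Container(stem: Stem, commitment: EllipticCurvePoint)."""
    return b"\x32" + _fixed(stem, 31) + _fixed(commitment, 32)
```

The resulting keys are 33, 33, and 64 bytes long respectively. A branch-bundle and a branch-fragment with the same commitment still get distinct keys because the selector byte differs.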
| 263 | + |
| 264 | +#### Content values |
| 265 | + |
| 266 | +The OFFER/ACCEPT payloads need to be provable by their recipients, while FINDCONTENT/FOUNDCONTENT payloads have to be verifiable that they match the commitment that identifies them. |
| 267 | + |
| 268 | +The content value has to correspond to the content-key. |
| 269 | + |
| 270 | +``` |
| 271 | +content_value_for_offer := Union( |
| 272 | + BranchBundleNodeWithProof, |
| 273 | + BranchFragmentNodeWithProof, |
| 274 | + LeafBundleNodeWithProof, |
| 275 | + LeafFragmentNodeWithProof, |
| 276 | + ) |
| 277 | +
|
| 278 | +content_value_for_retrieval := Union( |
| 279 | + BranchBundleNode, |
| 280 | + BranchFragmentNode, |
| 281 | + LeafBundleNode, |
| 282 | + LeafFragmentNode, |
| 283 | + ) |
| 284 | +``` |
| 285 | + |
| 286 | +## Gossip |
| 287 | + |
| 288 | +At each block, the bridge is responsible for detecting and gossiping all created and updated trie nodes separately. The bridge should first compute all content-ids that should be gossiped, and then gossip them based on their distance to its own node-id, from closest to farthest. |
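The ordering step can be sketched as follows. This assumes that the content-id is the SHA-256 hash of the content key and that distance is the XOR of the two 256-bit identifiers, as in the State Network that this sub-protocol mirrors; this section does not restate either rule, so both are assumptions. The function names are hypothetical.

```python
import hashlib
from typing import Iterable, List


def content_id(content_key: bytes) -> bytes:
    """Assumed derivation: content_id = sha256(content_key)."""
    return hashlib.sha256(content_key).digest()


def distance(a: bytes, b: bytes) -> int:
    """Assumed metric: XOR of the two identifiers, read as big-endian integers."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def gossip_order(node_id: bytes, content_keys: Iterable[bytes]) -> List[bytes]:
    """Sort content keys from closest to farthest relative to `node_id`."""
    return sorted(content_keys, key=lambda key: distance(node_id, content_id(key)))
```

A bridge would call `gossip_order` once per block with the keys of all created and updated trie nodes, then offer the content in the returned order.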