Commit 5dcdaee

Create specification for the Verkle State Network (ethereum#295)
1 parent 542e26c commit 5dcdaee

4 files changed: +293 -2 lines changed

README.md

Lines changed: 5 additions & 2 deletions
@@ -190,7 +190,10 @@ This network is a pure gossip network and does not implement any form of content
190190
- [Beacon Chain Network](./beacon-chain/beacon-network.md)
191191
- [Canonical Transaction Index Network](./canonical-transaction-index-network.md)
192192
- Spec is preliminary.
193-
- Network design borrows heavily from history network.
194-
- [Transaction Gossip Network](./transaction-gossip/transaction-gossip.md):
193+
- Network design borrows heavily from history network
194+
- [Transaction Gossip Network](./transaction-gossip/transaction-gossip.md)
195195
- Spec is preliminary
196196
- Prior work: https://ethresear.ch/t/scalable-transaction-gossip/8660
197+
- [Verkle State Network](./verkle/verkle-state-network.md)
198+
- Spec is preliminary
199+
- Prior work: https://ethresear.ch/t/portal-network-verkle/19339

assets/verkle_branch_node.png

17.6 KB

assets/verkle_leaf_node.png

31.7 KB
Loading

verkle/verkle-state-network.md

Lines changed: 288 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,288 @@
1+
# Execution Verkle State Network
2+
3+
This document is the specification for the sub-protocol that supports on-demand availability of Verkle state data from the execution chain. The Verkle trie is the upcoming structure for storing Ethereum state. See [EIP-6800](https://eips.ethereum.org/EIPS/eip-6800) for more details.
4+
5+
> 🚧 THE SPEC IS IN A STATE OF FLUX AND SHOULD BE CONSIDERED UNSTABLE 🚧
6+
7+
## Overview
8+
9+
The Verkle State Network subnetwork protocol is almost identical to the [State Network](./../state/state-network.md). The main difference is in the way that data is structured and encoded. Only the differences are described below.
10+
11+
### Portal Network version of the Verkle Trie
12+
13+
The high-level overview and reasoning can be found here: [ethresear.ch/t/portal-network-verkle/19339](https://ethresear.ch/t/portal-network-verkle/19339).
14+
15+
The Portal Network stores every trie node that has ever existed. For optimization reasons, each trie node is split into a 2-layer mini-trie, and each node of the mini-trie is stored separately in the network. The exact encoding and the content key derivation are specified below.
16+
17+
To represent a trie node, the Verkle trie uses a Pedersen commitment, which is calculated using the following formula:
18+
19+
$$C = Commit(a_0, a_1, ..., a_{255}) = a_0B_0 + a_1B_1 + ... + a_{255}B_{255}$$
20+
21+
where:
22+
- $B_i$ is the basis of the Pedersen commitment
23+
  - already fixed elliptic curve points on Banderwagon (a prime-order subgroup over the [Bandersnatch](https://ethresear.ch/t/introducing-bandersnatch-a-fast-elliptic-curve-built-over-the-bls12-381-scalar-field/9957) curve)
24+
- $a_i$ are the values we are committing to
25+
  - each is a value from the elliptic curve's scalar field $F_r$ (maximum value is less than $2^{253}$)
26+
- $C$ is the commitment to the $a_i$ values, which is itself a point on the elliptic curve
27+
  - in order to commit to another commitment, we map the commitment $C$ to the scalar field $F_r$ and call the result the **Pedersen Hash** or **hash commitment**
28+
  - these two values are frequently used interchangeably, but they are not a one-to-one mapping
29+
  - in this document, we use $C$ for a commitment expressed as an elliptic curve point, and $c$ for the same commitment mapped to the scalar field (hash commitment)
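The linear structure of $Commit$ can be illustrated with a minimal sketch. It is a toy model under loud assumptions: integers modulo a prime stand in for Banderwagon points, and `TOY_MODULUS`, `toy_basis` and `commit` are hypothetical names that do not appear in the specification. It is not secure and does not perform real curve arithmetic; it only shows that a commitment is a weighted sum over a fixed basis, and that such sums are additive.

```python
# Toy model only: integers modulo a prime, under addition, stand in for
# Banderwagon points. This is NOT secure and NOT the real curve arithmetic.
TOY_MODULUS = 2**61 - 1  # hypothetical group order, chosen for illustration


def toy_basis(n: int = 256) -> list[int]:
    """Return arbitrary, fixed stand-ins for the basis points B_0..B_{n-1}."""
    return [pow(3, i + 1, TOY_MODULUS) for i in range(n)]


def commit(values: list[int], basis: list[int]) -> int:
    """Compute a_0*B_0 + a_1*B_1 + ... in the toy group."""
    if len(values) > len(basis):
        raise ValueError("more values than basis elements")
    return sum(a * b for a, b in zip(values, basis)) % TOY_MODULUS


BASIS = toy_basis()
a = [i + 1 for i in range(256)]
b = [2 * i for i in range(256)]

# Additivity: Commit(a) + Commit(b) equals Commit(a + b), element-wise.
lhs = (commit(a, BASIS) + commit(b, BASIS)) % TOY_MODULUS
rhs = commit([x + y for x, y in zip(a, b)], BASIS)
```

This additivity is the property that allows a commitment over 256 values to be assembled from commitments over smaller groups of values.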
30+
31+
#### Trie Node
32+
33+
The Verkle trie has 2 types of nodes:
34+
35+
- branch (inner) node: up to 256 children nodes (either branch or leaf)
36+
- leaf (extension) node: up to 256 values (32 bytes each)
37+
38+
##### Branch (Inner) node
39+
40+
The branch node of the Verkle trie stores up to 256 values, each of which is a hash commitment of a child trie node.
41+
42+
$$C = c_0B_0 + c_1B_1 + ... + c_{255}B_{255}$$
43+
44+
For optimization reasons, the Portal Network splits each branch node into a 2-layer mini-trie in the following way:
45+
46+
![Verkle Branch Node](./../assets/verkle_branch_node.png)
47+
48+
Each `branch-fragment` node stores the hash commitments of 16 child nodes. The commitment to those 16 children represents that fragment node and is stored inside the `branch-bundle` node.
49+
50+
$$
51+
\begin{align*}
52+
C^\prime_0 &= c_0B_0 + c_1B_1 +…+ c_{15}B_{15} \\
53+
C^\prime_1 &= c_{16}B_{16} + c_{17}B_{17} +…+ c_{31}B_{31} \\
54+
&… \\
55+
C^\prime_{15} &= c_{240}B_{240} + c_{241}B_{241} +…+ c_{255}B_{255} \\
56+
\end{align*}
57+
$$
58+
59+
The commitment of the `branch-bundle` node ($C$) is calculated as the sum of the 16 `branch-fragment` node commitments ($C^\prime_i$).
60+
61+
$$C = C^\prime_0 + C^\prime_1 +...+ C^\prime_{15}$$
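The fragment and bundle formulas can be checked against the original 256-term commitment with a short sketch. As an assumption for illustration only, integers modulo a prime replace Banderwagon points, and `BASIS`, `fragment_commitment`, `bundle_commitment` and `full_commitment` are hypothetical names. The sketch is not secure and is not real curve arithmetic.

```python
# Toy model only: integers modulo a prime stand in for curve points.
TOY_MODULUS = 2**61 - 1  # hypothetical, for illustration
BASIS = [pow(3, i + 1, TOY_MODULUS) for i in range(256)]  # stand-ins for B_0..B_255


def fragment_commitment(children: list[int], fragment_index: int) -> int:
    """C'_k = sum of c_i * B_i for i in [16k, 16k + 16)."""
    start = 16 * fragment_index
    window = children[start:start + 16]
    return sum(c * BASIS[i] for i, c in enumerate(window, start)) % TOY_MODULUS


def bundle_commitment(children: list[int]) -> int:
    """C = C'_0 + C'_1 + ... + C'_15."""
    return sum(fragment_commitment(children, k) for k in range(16)) % TOY_MODULUS


def full_commitment(children: list[int]) -> int:
    """C = c_0*B_0 + ... + c_255*B_255, computed without fragments."""
    return sum(c * b for c, b in zip(children, BASIS)) % TOY_MODULUS


children = [i * i + 1 for i in range(256)]  # arbitrary hash commitments c_i

# A node with a single non-empty fragment: bundle and fragment commitments coincide.
sparse = [0] * 256
sparse[5] = 42
```

Because each fragment uses its own slice of the basis, summing the 16 fragment commitments reproduces the commitment of the unsplit branch node. When only one fragment is non-zero, the bundle and that fragment share the same commitment.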
62+
63+
##### Leaf (extension) node
64+
65+
The leaf node of the Verkle trie stores up to 256 values, each 32 bytes long. Because a 32-byte value doesn't fit into the scalar field, the commitment of the leaf node ($C$) is calculated in the following way:
66+
67+
$$C = Commit(marker, stem, C_1, C_2)$$
68+
69+
$$
70+
\begin{align*}
71+
C_1 &= Commit(v_0^{low+access}, v_0^{high}, v_1^{low+access}, v_1^{high}, ... , v_{127}^{low+access}, v_{127}^{high}) \\
72+
C_2 &= Commit(v_{128}^{low+access}, v_{128}^{high}, v_{129}^{low+access}, v_{129}^{high}, ... , v_{255}^{low+access}, v_{255}^{high}) \\
73+
\end{align*}
74+
$$
75+
76+
where:
77+
78+
- $marker$ - currently only value $1$ is used
79+
- $stem$ - the path from the root of the trie (31 bytes)
80+
- $v_i^{low+access}$ - the lower 16 bytes of the value $v_i$, plus $2^{128}$ if the value is modified
81+
  - note that if the value is not modified, it will be equal to $0$
82+
- $v_i^{high}$ - the higher 16 bytes of the value $v_i$
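The split of a 32-byte value into two scalars can be sketched as follows. Two assumptions are made here that this document does not state: the first 16 bytes are treated as the "lower" half, and both halves are read as little-endian integers, which is how the EIP-6800 reference code is understood to work. The names `VALUE_MARKER` and `split_value` are hypothetical.

```python
from typing import Optional

VALUE_MARKER = 2**128  # added to the low half when the slot holds a value


def split_value(value: Optional[bytes]) -> tuple[int, int]:
    """Return (v_low_access, v_high) for one leaf slot.

    `None` models a slot that was never modified; both scalars are then 0.
    Byte order and half selection are assumptions (see the lead-in).
    """
    if value is None:
        return 0, 0
    if len(value) != 32:
        raise ValueError("leaf values must be exactly 32 bytes")
    low = int.from_bytes(value[:16], "little") + VALUE_MARKER
    high = int.from_bytes(value[16:], "little")
    return low, high


empty_slot = split_value(None)
zero_value = split_value(bytes(32))  # a stored zero differs from an empty slot
```

Both results stay far below $2^{253}$, so each fits into the scalar field, and the marker lets a stored all-zero value be told apart from a slot that was never written.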
83+
84+
85+
For optimization reasons, the Portal Network splits each leaf node into a 2-layer mini-trie in the following way:
86+
87+
![Verkle leaf node](./../assets/verkle_leaf_node.png)
88+
89+
Each of the `leaf-fragment` nodes stores up to 16 values (32 bytes each).
90+
91+
The commitment to those 16 values ($C^\prime_i$) represents that fragment node and is stored inside the `leaf-bundle` node.
92+
93+
$$
94+
\begin{align*}
95+
C^\prime_0 &= v_0^{low+access}B_0 + v_0^{high}B_1 + v_1^{low+access}B_2 + v_1^{high}B_3 +…+ v_{15}^{low+access}B_{30} + v_{15}^{high}B_{31} \\
96+
C^\prime_1 &= v_{16}^{low+access}B_{32} + v_{16}^{high}B_{33} + v_{17}^{low+access}B_{34} + v_{17}^{high}B_{35} +…+ v_{31}^{low+access}B_{62} + v_{31}^{high}B_{63} \\
97+
&… \\
98+
C^\prime_7 &= v_{112}^{low+access}B_{224} + v_{112}^{high}B_{225} + v_{113}^{low+access}B_{226} + v_{113}^{high}B_{227} +…+ v_{127}^{low+access}B_{254} + v_{127}^{high}B_{255} \\
99+
\\
100+
C^\prime_8 &= v_{128}^{low+access}B_{0} + v_{128}^{high}B_{1} + v_{129}^{low+access}B_{2} + v_{129}^{high}B_{3} +…+ v_{143}^{low+access}B_{30} + v_{143}^{high}B_{31} \\
101+
&… \\
102+
C^\prime_{15} &= v_{240}^{low+access}B_{224} + v_{240}^{high}B_{225} + v_{241}^{low+access}B_{226} + v_{241}^{high}B_{227} +…+ v_{255}^{low+access}B_{254} + v_{255}^{high}B_{255} \\
103+
\end{align*}
104+
$$
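The index pattern in these fragment commitments can be written compactly. The sketch below follows the pattern of the first rows (value $v_i$ uses two consecutive basis elements, and the basis index restarts at $0$ for the second half of the leaf). The function name `leaf_slot_layout` and the returned field names are hypothetical.

```python
def leaf_slot_layout(i: int) -> dict[str, int]:
    """Locate value v_i inside the leaf mini-trie.

    Returns the fragment that stores it, which half-commitment (C_1 or C_2)
    it contributes to, and the basis indices of its low and high scalars.
    """
    if not 0 <= i < 256:
        raise ValueError("value index must be in [0, 256)")
    base = 2 * (i % 128)  # basis indices restart at 0 for values 128..255
    return {
        "fragment_index": i // 16,        # 16 values per leaf-fragment
        "half": 1 if i < 128 else 2,      # C_1 covers v_0..v_127, C_2 the rest
        "low_basis": base,                # index j in v_i^{low+access} * B_j
        "high_basis": base + 1,           # index j in v_i^{high} * B_j
    }


layout_129 = leaf_slot_layout(129)
layout_241 = leaf_slot_layout(241)
```

For example, $v_{129}$ lands in fragment 8 and uses $B_2$ and $B_3$, while $v_{241}$ lands in fragment 15 and uses $B_{226}$ and $B_{227}$.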
105+
106+
The commitment of the `leaf-bundle` node ($C$) is calculated in the following way:
107+
108+
$$
109+
\begin{align*}
110+
C_1 &= C^\prime_0 + C^\prime_1 + … + C^\prime_7 \\
111+
C_2 &= C^\prime_8 + C^\prime_9 + … + C^\prime_{15}
112+
\end{align*}
113+
$$
114+
115+
$$
116+
C = marker \cdot B_0 + stem \cdot B_1 + c_1B_2 + c_2B_3
117+
$$
118+
119+
120+
## Specification
121+
122+
### Protocol Identifier
123+
124+
As specified in the [Protocol identifiers](./../portal-wire-protocol.md#protocol-identifiers) section of the Portal wire protocol, the `protocol` field in the `TALKREQ` message **MUST** contain the value of `0x500E`.
125+
126+
### Helper Data Types
127+
128+
#### Path and Stem
129+
130+
The Path represents the trie path from the root to the branch node. The Stem represents the first 31 bytes of the Verkle Trie key.
131+
132+
```
133+
Path := List[uint8; 30]
134+
Stem := Bytes31
135+
```
136+
137+
#### Commitment
138+
139+
Both an elliptic curve point (commitment) and a scalar field value (hash commitment) can be encoded using 32 bytes. We define them separately in order to be explicit.
140+
141+
```
142+
EllipticCurvePoint := Bytes32
143+
ScalarFieldValue := Bytes32
144+
```
145+
146+
#### Bundle Commitment Proof
147+
148+
**⚠️ This section needs more research and detailed specification. ⚠️**
149+
150+
In order to prevent bad actors from creating false data for the `bundle` nodes of the mini-tries, we have to create and include a proof that the fragment commitments are correct. The exact proof schema is being researched.
151+
152+
```
153+
BundleProof := Bytes1024
154+
```
155+
156+
#### Trie Proof
157+
158+
Using IPA and Multiproof, the same proving scheme that Verkle uses, we can prove that any node or value is included in a trie in a memory-efficient way.
159+
160+
**⚠️ This section needs detailed specification. ⚠️**
161+
162+
The exact details of the specification are yet to be decided. We provide only temporary types (based on the execution witness from the Verkle devnet).
163+
164+
```
165+
IpaProof := Container(
166+
cl: Vector[EllipticCurvePoint; 8],
167+
cr: Vector[EllipticCurvePoint; 8],
168+
final_evaluation: ScalarFieldValue,
169+
)
170+
MultiPointProof := Container(
171+
open_proof: IpaProof,
172+
g_x: EllipticCurvePoint,
173+
)
174+
175+
TrieProof := Container(
176+
commitments_by_path: List[EllipticCurvePoint; 32],
177+
multiproof: MultiPointProof,
178+
)
179+
```
180+
181+
#### Children
182+
183+
All our nodes store up to 16 children. We encode a bitmap of the present children, followed by the children themselves.
184+
185+
```
186+
Children[type] := Container(bitmap: Bitvector(16), children: List[type; 16])
187+
```
188+
189+
Note that the count of set bits inside `bitmap` field MUST be equal to the length of the `children` field. The order of the children is from the lowest to the highest set bit.
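The relationship between `bitmap` and `children` can be sketched with an in-memory model. This is not an SSZ encoder; it only models the two fields and the validity rule above. The helper names `encode_children` and `decode_children` are hypothetical.

```python
def encode_children(sparse: dict[int, bytes]) -> tuple[list[bool], list[bytes]]:
    """Turn {position: child} into (bitmap, children), lowest position first."""
    if any(not 0 <= pos < 16 for pos in sparse):
        raise ValueError("child positions must be in [0, 16)")
    bitmap = [pos in sparse for pos in range(16)]
    children = [sparse[pos] for pos in range(16) if pos in sparse]
    return bitmap, children


def decode_children(bitmap: list[bool], children: list[bytes]) -> dict[int, bytes]:
    """Invert encode_children, rejecting inconsistent inputs."""
    if len(bitmap) != 16:
        raise ValueError("bitmap must have exactly 16 bits")
    if sum(bitmap) != len(children):
        raise ValueError("set-bit count must equal the number of children")
    remaining = iter(children)
    return {pos: next(remaining) for pos, present in enumerate(bitmap) if present}


example = {3: b"\x01" * 32, 0: b"\x02" * 32, 15: b"\x03" * 32}
bitmap, children = encode_children(example)
```

Here `children` comes out ordered by position (0, 3, 15) regardless of insertion order, and `decode_children(bitmap, children)` recovers the original mapping.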
190+
191+
#### Trie node
192+
193+
Each trie node has a different type and different proof.
194+
195+
```
196+
BranchBundleNode := Container(
197+
fragments: Children[EllipticCurvePoint],
198+
proof: BundleProof,
199+
)
200+
BranchBundleNodeWithProof := Container(
201+
node: BranchBundleNode,
202+
block_hash: Bytes32,
203+
path: Path,
204+
proof: Union[None, TrieProof],
205+
)
206+
207+
BranchFragmentNode := Container(
208+
fragment_index: uint8,
209+
children: Children[EllipticCurvePoint],
210+
)
211+
BranchFragmentNodeWithProof := Container(
212+
node: BranchFragmentNode,
213+
block_hash: Bytes32,
214+
path: Path,
215+
proof: Union[None, TrieProof],
216+
)
217+
218+
LeafBundleNode := Container(
219+
marker: uint32,
220+
stem: Stem,
221+
fragments: Children[EllipticCurvePoint],
222+
proof: BundleProof,
223+
)
224+
LeafBundleNodeWithProof := Container(
225+
node: LeafBundleNode,
226+
block_hash: Bytes32,
227+
proof: TrieProof,
228+
)
229+
230+
LeafFragmentNode := Container(
231+
fragment_index: uint8,
232+
values: Children[Bytes32],
233+
)
234+
LeafFragmentNodeWithProof := Container(
235+
node: LeafFragmentNode,
236+
block_hash: Bytes32,
237+
proof: TrieProof,
238+
)
239+
```
240+
241+
It's worth noting that `proof` is `None` for the branch-bundle and branch-fragment nodes that correspond to the root of the trie (in which case `path` is empty as well).
242+
243+
### Content types
244+
245+
#### Content keys
246+
247+
When looking up a bundle node, we don't know whether to expect a branch-bundle or a leaf-bundle node. For this reason, they use the same content key type.
248+
249+
The branch-fragment key has to be different from the branch-bundle key, because the two nodes can have the same commitment (when all other fragments of that bundle are zero).
250+
251+
The leaf-fragment key should include the `stem`, in order to avoid hot-spots. Other keys don't have to worry about hot-spots because they are built on top of leaf-bundle nodes, which already include the `stem` in their commitment (effectively guaranteeing uniqueness).
252+
253+
```
254+
bundle_node_key := EllipticCurvePoint
255+
bundle_content_key := 0x30 + SSZ.serialize(bundle_node_key)
256+
257+
branch_fragment_node_key := EllipticCurvePoint
258+
branch_fragment_content_key := 0x31 + SSZ.serialize(branch_fragment_node_key)
259+
260+
leaf_fragment_node_key := Container(stem: Stem, commitment: EllipticCurvePoint)
261+
leaf_fragment_content_key := 0x32 + SSZ.serialize(leaf_fragment_node_key)
262+
```
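A minimal sketch of building these content keys is shown below. It relies on the fact that SSZ serializes a fixed-size byte vector as its raw bytes, and a container of fixed-size fields as their concatenation, so no SSZ library is needed for these particular types. The function names are hypothetical.

```python
def _fixed(data: bytes, size: int, name: str) -> bytes:
    """Check that a fixed-size SSZ byte vector has the expected length."""
    if len(data) != size:
        raise ValueError(f"{name} must be exactly {size} bytes")
    return data


def bundle_content_key(commitment: bytes) -> bytes:
    """0x30 + SSZ(EllipticCurvePoint); shared by branch- and leaf-bundle nodes."""
    return b"\x30" + _fixed(commitment, 32, "commitment")


def branch_fragment_content_key(commitment: bytes) -> bytes:
    """0x31 + SSZ(EllipticCurvePoint)."""
    return b"\x31" + _fixed(commitment, 32, "commitment")


def leaf_fragment_content_key(stem: bytes, commitment: bytes) -> bytes:
    """0x32 + SSZ(Container(stem: Stem, commitment: EllipticCurvePoint))."""
    return b"\x32" + _fixed(stem, 31, "stem") + _fixed(commitment, 32, "commitment")


point = bytes(range(32))  # placeholder bytes, not a valid curve point
stem = bytes(31)
bundle_key = bundle_content_key(point)
fragment_key = branch_fragment_content_key(point)
leaf_key = leaf_fragment_content_key(stem, point)
```

With the same commitment bytes, `bundle_key` and `fragment_key` differ only in the selector byte, which is what keeps a branch-bundle and a branch-fragment with equal commitments apart.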
263+
264+
#### Content values
265+
266+
The OFFER/ACCEPT payloads need to be provable by their recipients, while FINDCONTENT/FOUNDCONTENT payloads must be verifiable against the commitment that identifies them.
267+
268+
The content value has to correspond to the content-key.
269+
270+
```
271+
content_value_for_offer := Union(
272+
BranchBundleNodeWithProof,
273+
BranchFragmentNodeWithProof,
274+
LeafBundleNodeWithProof,
275+
LeafFragmentNodeWithProof,
276+
)
277+
278+
content_value_for_retrieval := Union(
279+
BranchBundleNode,
280+
BranchFragmentNode,
281+
LeafBundleNode,
282+
LeafFragmentNode,
283+
)
284+
```
285+
286+
## Gossip
287+
288+
At each block, the bridge is responsible for detecting and gossiping all created and updated trie nodes separately. The bridge should first compute all content-ids that should be gossiped, and then gossip them in order of their distance to its own node-id, from closest to farthest.
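The ordering step can be sketched as below. This document does not define the content-id derivation or the distance function; the sketch assumes they are inherited from the State Network, namely `content_id = sha256(content_key)` and XOR distance. The function names are hypothetical.

```python
import hashlib


def content_id(content_key: bytes) -> bytes:
    """Assumed derivation (as in the State Network): sha256 of the content key."""
    return hashlib.sha256(content_key).digest()


def xor_distance(a: bytes, b: bytes) -> int:
    """Assumed distance function: XOR of the two 32-byte ids, read as integers."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def gossip_order(node_id: bytes, content_keys: list[bytes]) -> list[bytes]:
    """Sort content keys so that the ids closest to node_id are gossiped first."""
    return sorted(content_keys, key=lambda key: xor_distance(node_id, content_id(key)))


node_id = bytes(32)  # placeholder bridge node-id
keys = [b"\x30" + bytes([i]) * 32 for i in range(8)]  # placeholder content keys
ordered = gossip_order(node_id, keys)
distances = [xor_distance(node_id, content_id(key)) for key in ordered]
```

The bridge would then walk `ordered` from front to back, so `distances` is non-decreasing along the gossip sequence.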
