EIP-4844 marks a significant milestone in Ethereum's journey to enhance the scalability of its data availability layer. This upgrade introduces blob data, a new resource with its own dedicated fee market, alongside crucial cryptographic components such as the KZG ceremony, commitments, and proofs essential for Data Availability Sampling (DAS). Currently, full nodes still handle all data, but the scalability improvements from EIP-4844 stem from two key developments:
An independent fee market helps separate the economics of blob data from blockchain execution, ensuring consistent data throughput regardless of the demand for layer 1 (L1) transaction processing.
The introduction of a pruning mechanism for blobs, which stabilizes additional storage costs by preventing them from accumulating indefinitely.
To further boost throughput while managing or even reducing resource demands, we would need to shift towards a model where nodes only download a portion of the data. This approach is central to the ambitious Danksharding proposal, which suggests a more granular implementation of Data Availability Sampling. Due to the complex nature of these changes, breaking them down into smaller, manageable steps seems prudent to minimize risks and simplify each upgrade phase.
This concept of gradual enhancement has been echoed in discussions around Proto’s PeerDHT over the past few years.
Proto’s Peer DHT Explained:
Ever wonder how vast networks manage to find and share information swiftly and effectively? This is where Proto's Distributed Hash Table (DHT) comes into play. Proto DHT acts as the backbone of a large digital network, facilitating efficient data exchanges between computers through a well-organized key-value lookup structure.
In Proto DHT, each node in the network holds a segment of the overall data, assigned specific keys that make locating and accessing information across the network incredibly efficient. This system isn't just smart; it's also highly scalable. As nodes join or depart, the network adjusts seamlessly, maintaining consistent data accessibility and timeliness.
Technically, nodes utilize a protocol named discv5 to find other nodes that hold the data they need, based on proximity within the network. This capability is crucial for maintaining an evenly distributed overlay network. Moreover, nodes evaluate each other based on how reliably and quickly they respond, ensuring the network remains efficient and trustworthy.
Through these innovations, Ethereum continues to refine its infrastructure, ensuring it can handle growing data demands while maintaining robustness and user trust.
The Role of PeerDAS in Enhancing Data Availability
The design philosophy behind PeerDAS is to leverage well-established, battle-tested peer-to-peer components that are already operational within Ethereum. The goal here is to significantly expand data availability (DA) capabilities beyond the enhancements introduced in EIP-4844, while ensuring that the workload for honest nodes remains manageable, akin to the requirements of 4844—where downloading less than 1MB per slot is the norm.
This exploration aims to unravel the potential scalability of a straightforward network structure, which utilizes a varied distribution of node types without depending on a more complex Distributed Hash Table (DHT)-like solution. By understanding these dynamics, we can better strategize enhancements in throughput and optimize resource usage.
To further advance throughput while efficiently using resources, it's necessary to integrate some form of Data Availability Sampling (DAS), as proposed in the Danksharding blueprint.
A Closer Look at DAS: DAS relies on various node types and the specific assumptions about their roles and capabilities:
Honest nodes are expected to download and serve specific samples of data from designated rows and columns. Validators, in particular, might be incentivized to custody data, though not necessarily to serve it actively.
Supernodes handle a significant portion of data beyond the baseline expectations set for honest nodes.
Some supernodes are a special case: high-capacity nodes that store and manage 100% of the data.
The distinctiveness of each DAS solution lies in how the network is organized, how peers for sampling are discovered, and how it either utilizes or under-utilizes nodes with higher capacities. This could mean supporting variability in node capacity, rather than expecting uniform capabilities across the network.
Configuration Simplification for Clarity: Here, we simplify the full parametrization required to illustrate the concept:
NUMBER_OF_ROWS_AND_COLUMNS
Specifies the total number of rows and columns within the data structure. This setting determines the grid size used for organizing and distributing data across the network.
Sample Value: 32
SAMPLES_PER_ROW_COLUMN
Indicates the number of samples taken from each row and column. This parameter is crucial for defining the granularity of data sampling and impacts the resolution of data that nodes access and verify.
Sample Value: 512
CUSTODY_REQUIREMENT
The minimum number of rows and columns that an honest node is responsible for custodying and serving. This requirement ensures that each node holds and can provide a baseline amount of data, supporting the network's robustness and redundancy.
Sample Value: 2
SAMPLES_PER_SLOT
Represents the number of random samples a node queries per slot. This parameter is essential for the data verification process, where nodes check the availability and integrity of data across the network.
Sample Value: 70
NUMBER_OF_PEERS
The minimum number of peer connections that a node must maintain. This ensures a well-connected network, facilitating efficient data exchange and robustness against node failures or network segmentation.
Sample Value: 70
NUM_COLUMNS
Determines the maximum number of columns in the data structure, which directly influences how much bandwidth is consumed by sampling activities. This configuration is integral for managing network load and optimizing data throughput.
NUM_COLUMN_SUBNETS
Refers to the total number of subnets or separate data channels within the network. This parameter defines how data is segmented and distributed among different network paths, enhancing data retrieval efficiency and system scalability.
Managing the complexity and risks associated with such upgrades involves breaking them down into smaller, more manageable steps. Initially, the plan is to start with a 1D DAS based on PeerDAS for a straightforward yet substantial throughput enhancement. As all necessary cryptographic components are fully integrated, the transition to a more complex 2D construction will occur. Additional features like distributed reconstruction will be incorporated as they become ready. The strategy also includes a gradual increase in throughput to reach maximum capacity as network capabilities enhance and system stability is verified. This phased upgrade approach is designed to seamlessly increase blob capacity, mirroring how gas limits are expanded on the execution layer, thereby making the transition smooth for both rollups and their users.
Here is the breakdown of each step:
Stage 0: EIP-4844.
Subnets are introduced for distributing blobs, but nodes have to participate in all of them, downloading all the data. No DAS.
Stage 1: 1D PeerDAS.
Introduction
Blobs are extended horizontally, in one dimension (1D). They are distributed through column subnets, which are segments of the network that each carry different columns of data. Not all nodes deal with every column; instead, each node participates in just a few specific subnets. This means that the older method of organizing data into row subnets might no longer be needed.
Essential networking components for peer sampling are also integrated. This includes finding peers for data sampling and conducting the sampling itself through a request/response (req/resp) protocol. This setup allows nodes to perform Data Availability Sampling (DAS) on columns. The selection of which and how many column subnets a node will participate in is carefully made, ensuring that each node’s group of peers can collectively access all necessary column data.
During this stage, a gradual increase in the total number of blobs that the system can handle is anticipated. It might start with supporting up to 32 blobs, then expand this capacity to 64, and potentially even up to 128 blobs as the network's capabilities grow. This expansion is designed to enhance the network’s efficiency and capacity in a controlled and reliable manner.
Blobs are individually extended horizontally, stacked vertically and subdivided in NUM_COLUMNS columns, which are used as the sampling unit. In the following, I am mostly going to consider the example parameter NUM_COLUMNS = 128, which implies a maximum sample size of 256 × b / 128 = 2b KB for MAX_BLOBS_PER_BLOCK = b (extended blobs are 256 KB).
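The sample-size formula can be checked with a few lines of Python. This is only an illustrative sketch: the function name `column_size_kb` is made up for this example, and the 256 KB extended blob size is the figure quoted in the text.

```python
EXTENDED_BLOB_KB = 256  # size of one horizontally extended blob, as stated in the text


def column_size_kb(max_blobs_per_block: int, num_columns: int = 128) -> float:
    """Size in KB of one column, which is the sampling unit in the 1D design."""
    return EXTENDED_BLOB_KB * max_blobs_per_block / num_columns


if __name__ == "__main__":
    # Candidate blob counts mentioned for Stage 1.
    for blobs in (32, 64, 128):
        print(f"MAX_BLOBS_PER_BLOCK={blobs}: column is {column_size_kb(blobs)} KB")
```

With NUM_COLUMNS = 128 the result is always 2 KB per blob in the block, so the sample grows linearly with the blob count.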
Distribution in GossipSub Topics
Let's move on to the data distribution aspect of the network. Data across the network is distributed via GossipSub Topics into "column subnets," which are specific channels or topics dedicated to sharing data. Each column of data is associated with one of these subnets, chosen from a predetermined total known as NUM_COLUMN_SUBNETS. To ensure data is both accessible and secure, each node must manage data from a minimum number of these subnets, defined by the CUSTODY_REQUIREMENT. This setup determines the least amount of data a node is responsible for, referred to as "minimum subnet density."
The fraction of data each node handles is calculated by dividing the CUSTODY_REQUIREMENT by the total NUM_COLUMN_SUBNETS. This ratio significantly influences the network's bandwidth consumption due to the amplification effects of broadcasting over the GossipSub topics used for column subnets. Unlike these, the data sampling through request/response protocols consumes far less bandwidth, which becomes increasingly efficient as the network evolves into a more complex 2D structure.
For practical settings, a subnet density ratio of 1/64 has proven safe based on past experiences, though a ratio of 1/128 might also be feasible. Nonetheless, a higher ratio, implying fewer subnets (for instance, starting with NUM_COLUMN_SUBNETS = 32 and CUSTODY_REQUIREMENT = 1), can enhance the effectiveness of peer sampling.
Suggested Starting Point: Starting with 32 column subnets and a CUSTODY_REQUIREMENT of 1 is recommended. This configuration means each node will be responsible for at least 1/32nd of the data, maintaining manageable network performance.
Flexibility and Security
Setting the CUSTODY_REQUIREMENT higher than 1 has several advantages:
Suppose we have 128 columns of data and 32 column subnets with a CUSTODY_REQUIREMENT of 1. Each subnet then carries 4 columns (128 columns / 32 subnets), so each node is responsible for a minimum of 4 columns.
If we increase the number of column subnets to 64 and set the CUSTODY_REQUIREMENT to 2, the minimum fraction of data each node handles remains 1/32 (since 2/64 = 1/32). However, the minimum unit of custody becomes smaller: each subnet now carries 2 columns (128 columns / 64 subnets), so a node still custodies at least 4 columns (2 subnets × 2 columns) but can add custody in steps of 2 columns.
This setup allows nodes to handle varying amounts of data more flexibly. For example, a node that wants to handle more than the minimum requirement can choose to manage 6, 8, or 10 columns instead of being restricted to multiples of 4 columns. This flexibility makes the network more adaptable to different nodes' capacities and preferences.
By requiring each node to handle multiple subnets (CUSTODY_REQUIREMENT > 1), the initial distribution of data across the network becomes more secure. Each node gets pieces of data from different parts of the dataset, making it harder for any single node to compromise the integrity of the data.
With a higher custody requirement, each piece of data is held by multiple nodes, increasing redundancy. This redundancy ensures that if some nodes go offline or are compromised, the data can still be reconstructed from other nodes.
Before more sophisticated security measures (like sampling) are implemented, having data held by multiple nodes adds a layer of protection. Even without sampling, the network can rely on the initial distribution to detect and correct discrepancies or errors.
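The two configurations compared above can be put side by side in a small sketch. The helper name `custody_profile` is hypothetical, and the sketch assumes columns are split evenly across subnets.

```python
from fractions import Fraction


def custody_profile(num_columns: int, num_column_subnets: int, custody_requirement: int):
    """Return (columns per subnet, minimum columns custodied, minimum data fraction).

    Assumes num_columns is an exact multiple of num_column_subnets.
    """
    columns_per_subnet = num_columns // num_column_subnets
    min_columns = custody_requirement * columns_per_subnet
    min_fraction = Fraction(custody_requirement, num_column_subnets)
    return columns_per_subnet, min_columns, min_fraction


if __name__ == "__main__":
    for subnets, custody in ((32, 1), (64, 2)):
        step, minimum, fraction = custody_profile(128, subnets, custody)
        print(f"{subnets} subnets, custody {custody}: "
              f"min {minimum} columns ({fraction} of data), step of {step} columns")
```

Both settings keep the minimum at 4 columns, or 1/32 of the data. Only the step size changes, from 4 columns to 2, and that is the source of the extra flexibility.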
Peer Sampling and Scaling Strategies
Continuing the exploration of Peer-to-Peer Data Availability Sampling (PeerDAS), the discussion now turns to the intricacies of peer sampling, particularly focusing on how entire data columns, encapsulated within "column sidecar objects," are sampled through a request/response mechanism. The selection of peers for sampling these columns is determined by specific data they manage, which is indicated in their Ethereum Node Record (ENR). This record details the columns a node is responsible for, influenced by factors such as their node ID, the number of subnets they are part of, and possibly the current epoch.
Scaling peer sampling in PeerDAS is not merely a function of available bandwidth but significantly depends on how nodes are grouped or their "peer sets". Each node’s set must comprehensively cover all the necessary data samples to ensure robustness against data loss or manipulation by unreliable or malicious nodes. To manage this effectively, keeping the number of column subnets (NUM_COLUMN_SUBNETS) relatively low is strategic; it simplifies the network's demands and ensures each node can handle its data responsibilities without excessive bandwidth usage. However, this does place a limit on how much data (or how many "blobs") the system can handle at once. As such, increasing the system's capacity—number of blobs—might need to be done gradually, aligned with enhancements in network capabilities.
Furthermore, scaling PeerDAS might potentially reach its theoretical capacity from the outset if the network includes a robust network of supernodes—nodes that manage more data than typically required. The real challenge lies in accommodating the network's diversity, as nodes vary widely in their data handling capabilities. In a network with a broad range of node capabilities (a highly heterogeneous network), maintaining lower overall numbers of required peer connections simplifies scaling. Moving forward, a setup with 32 column subnets and each node required to manage at least one (CUSTODY_REQUIREMENT = 1) could be a practical and efficient baseline for our PeerDAS infrastructure.
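To make the "peer set must cover all samples" requirement concrete, here is a minimal sketch. The subnet assignment rule below (hashing the node ID with a counter) is purely an assumption for illustration, not the actual derivation used by any client or specification. The names `custody_subnets` and `uncovered_subnets` are likewise hypothetical.

```python
import hashlib

NUM_COLUMN_SUBNETS = 32  # baseline suggested in the text


def custody_subnets(node_id: bytes, custody_count: int) -> set[int]:
    """Hypothetical deterministic mapping from a node ID to its column subnets."""
    if custody_count > NUM_COLUMN_SUBNETS:
        raise ValueError("cannot custody more subnets than exist")
    subnets: set[int] = set()
    counter = 0
    while len(subnets) < custody_count:
        digest = hashlib.sha256(node_id + counter.to_bytes(8, "little")).digest()
        subnets.add(int.from_bytes(digest[:8], "little") % NUM_COLUMN_SUBNETS)
        counter += 1
    return subnets


def uncovered_subnets(peers: dict[bytes, int]) -> set[int]:
    """Subnets that no peer in the set custodies. Keys are node IDs, values are custody counts."""
    covered: set[int] = set()
    for node_id, custody_count in peers.items():
        covered |= custody_subnets(node_id, custody_count)
    return set(range(NUM_COLUMN_SUBNETS)) - covered


if __name__ == "__main__":
    minimal_peers = {f"peer-{i}".encode(): 1 for i in range(70)}
    print("uncovered with 70 minimal peers:", sorted(uncovered_subnets(minimal_peers)))
    print("uncovered with one supernode:", sorted(uncovered_subnets({b"super": 32})))
```

A single peer that custodies every subnet covers the whole set on its own. That is why supernodes make scaling easier, while a peer set made only of minimal-custody nodes depends on peer count and luck.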
Balancing Subnet Sampling and Peer Sampling Trade-offs
Subnet Sampling
In subnet sampling, we set the CUSTODY_REQUIREMENT high enough to provide security during the initial data distribution phase, even before peer sampling begins. This means only a small number of nodes can be tricked into thinking that unavailable data is available, even if less than half of the data columns are actually present.
Example:
Imagine an attacker makes columns [0, 59] available, which is slightly less than half of the data. With 128 columns and 32 column subnets, each subnet carries 4 columns, so these 60 columns fill exactly 15 subnets. If our CUSTODY_REQUIREMENT is set to 1, honest nodes in those 15 subnets will receive all sidecar objects and vote for the block, causing almost half of the honest nodes (15/32) to vote for an unavailable block.
In general, with a CUSTODY_REQUIREMENT of k, an attacker can deceive up to 2^(-k) of all honest nodes. Therefore, to ensure meaningful security during distribution, the CUSTODY_REQUIREMENT should be at least 4.
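The 2^(-k) bound can be checked numerically for the non-adaptive example above. The sketch assumes each honest node custodies k distinct subnets chosen uniformly at random, which is a modelling assumption of this example. The function name `fraction_deceived` is hypothetical.

```python
from math import comb


def fraction_deceived(num_subnets: int, available_subnets: int, custody_requirement: int) -> float:
    """Probability that all of a node's custody subnets fall among the available ones."""
    return comb(available_subnets, custody_requirement) / comb(num_subnets, custody_requirement)


if __name__ == "__main__":
    # 32 subnets, attacker publishes 15 of them (just under half of the data).
    for k in range(1, 7):
        fooled = fraction_deceived(32, 15, k)
        print(f"k={k}: deceived fraction {fooled:.4f}, bound 2^-{k} = {2 ** -k:.4f}")
```

Under this model a node is deceived only if every subnet it custodies happens to be one the attacker published, so the fraction shrinks roughly by half with each additional subnet in custody.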
However, if the attacker is adaptive (observes node distribution and selectively makes data available), they can still trick a significant portion of nodes. For instance, with a CUSTODY_REQUIREMENT of 4 and 128 column subnets, around 20% of honest nodes can be deceived. Increasing the CUSTODY_REQUIREMENT to 6 reduces this fraction to less than 10%.
Increasing CUSTODY_REQUIREMENT and the number of column subnets together, while keeping the same ratio, maintains subnet density and bandwidth consumption but enhances security. However, this comes with a tradeoff: higher NUM_COLUMN_SUBNETS increases the number of honest peers a node needs to cover all subnets due to more overlaps. This can make it challenging to maintain effective peer sampling, especially with low peer counts. This impacts the network's fork choice and confirmation rules.
Optimizing Fork Choice Mechanisms
Continuing the discussion on Fork Choice within the Peer-to-Peer Data Availability Sampling (PeerDAS) framework, here is a simplified explanation of how the is_data_available function works and the considerations surrounding it:
For Current Slot (n): The availability of data for a block in the current slot n is checked by examining the column subnets a node is part of. If the node has received all the required "sidecar" data packets for these columns, then the data for that block is considered available.
For Previous Slots (<n): For blocks from previous slots, the decision is based on peer sampling results. If a node has successfully obtained all requested samples from its peers, the data for those earlier blocks is also considered available.
The proposer of the next slot (n+1) has about 8 seconds after the current slot’s attestation deadline to perform peer sampling and ensure the next block is built on verified data.
Attesters (validators that confirm block data) for the current slot just need to ensure they have received all required sidecars by their deadline.
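The two rules above can be sketched as follows. This is not the specification's `is_data_available`, only an illustration of the described logic. The `NodeView` class and its fields are hypothetical bookkeeping invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NodeView:
    """Hypothetical local state a node keeps for availability checks."""
    current_slot: int
    custody_columns: set[int]
    # block root -> column indices whose sidecars arrived over the column subnets
    received_sidecars: dict[bytes, set[int]] = field(default_factory=dict)
    # block root -> True once every requested peer sample was obtained
    sampling_ok: dict[bytes, bool] = field(default_factory=dict)


def is_data_available(view: NodeView, block_root: bytes, block_slot: int) -> bool:
    if block_slot == view.current_slot:
        # Current slot: rely only on the node's own column subnets.
        received = view.received_sidecars.get(block_root, set())
        return view.custody_columns <= received
    # Earlier slots: rely on the outcome of peer sampling.
    return view.sampling_ok.get(block_root, False)


if __name__ == "__main__":
    view = NodeView(current_slot=10, custody_columns={4, 5, 6, 7})
    view.received_sidecars[b"B"] = {4, 5, 6, 7}
    print(is_data_available(view, b"B", 10))  # all custody sidecars arrived
    print(is_data_available(view, b"A", 9))   # no sampling result recorded yet
```

The asymmetry is visible in the code: a current-slot block can pass the check on custody data alone, which is what the attack described further on exploits.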
Using this system, where data verification lags slightly behind block proposal, can lead to situations where validators might initially endorse a block even if not all of its data is actually verified at the time of voting. This vulnerability is mainly due to the low data custody requirement, meaning validators don't manage enough subnets to immediately confirm all data.
Example of a Potential Attack:
An attacker might release data selectively to trick validators into supporting a block with incomplete data availability. For example:
Block B is proposed at slot n with partial data secretly made available to sway validator votes.
For the next slot (n+1), the proposer finds out the data is missing and proposes a new block B' which doesn’t build on B.
Meanwhile, the attacker releases the missing data, making block B look valid, and validators now might ignore B' because B seems to have more support.
Mitigation Strategy: To mitigate such attacks, validators of slot n+1 should remember the results of their data availability checks from just before the end of slot n (say 10 seconds into the slot). They should use this information to inform their decisions in the next slot unless it conflicts with new blocks being proposed. This method echoes the "view-merge" technique used in Ethereum consensus research but simplifies the process as it doesn't necessitate additional messages from the proposer.
This approach to managing data verification and block validation aims to balance the need for timely processing with the integrity and security of blockchain operations. By closely managing the timing and criteria for data verification, the system strives to protect against both accidental errors and deliberate attacks, ensuring a robust and reliable blockchain network.
Bandwidth Management
Bandwidth management is effectively maintained, upholding the performance benchmarks set by EIP-4844. The targets are clear: aiming for an average of 3 MBs per slot and a maximum of 6 MBs per slot.
The network employs a system where each node handles data divided into subnets, specifically focusing on how much data each node is responsible for under varying loads.
Here’s a breakdown of the mathematical framework guiding the approach:
Each column of data is broken down into cells, where each cell is 2 KB (calculated as 256 KB divided by 128 equals 2 KB).
If a node manages a block where MAX_BLOBS_PER_BLOCK = 64, then each column is 128 KB (since 64×2 KB equals 128 KB).
Data propagation through GossipSub topics incurs an 8x amplification factor, analogous to the systems used in EIP-4844, making the propagation of each column equivalent to handling one entire blob.
After accounting for data distribution, the remaining bandwidth capacity allows for the equivalent of managing two additional blobs, equating to 2048 KB.
This capacity supports approximately 16 peer samples per slot (2048 KB divided by 128 KB/sample equals 16 samples), achieving a high level of network soundness with a probability of 2^(-16) that any single node could be misled about data availability by an attacker, assuming non-targeted attacks.
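The arithmetic in this breakdown can be reproduced with a short sketch. It makes three assumptions: the budget is taken to be the EIP-4844 maximum of 6 blobs of 128 KB each, multiplied by the 8x gossip amplification. The node is assumed to custody 4 columns (32 subnets, CUSTODY_REQUIREMENT = 1). Peer samples are counted without amplification. The function name `peer_sampling_budget` is hypothetical.

```python
BLOB_KB = 128             # unextended blob size
EXTENDED_BLOB_KB = 256    # horizontally extended blob size
GOSSIP_AMPLIFICATION = 8  # amplification factor quoted in the text
MAX_4844_BLOBS = 6        # EIP-4844 maximum, used here as the bandwidth budget


def peer_sampling_budget(max_blobs: int = 64, num_columns: int = 128,
                         num_column_subnets: int = 32, custody_requirement: int = 1):
    """Return (KB left for peer sampling, samples per slot, soundness bound)."""
    column_kb = EXTENDED_BLOB_KB * max_blobs / num_columns
    columns_custodied = custody_requirement * num_columns // num_column_subnets
    budget_kb = MAX_4844_BLOBS * BLOB_KB * GOSSIP_AMPLIFICATION
    distribution_kb = columns_custodied * column_kb * GOSSIP_AMPLIFICATION
    remaining_kb = budget_kb - distribution_kb
    samples = int(remaining_kb // column_kb)
    # In the 1D design an unavailable block has at most half of its columns
    # available, so each sample succeeds with probability at most 1/2.
    return remaining_kb, samples, 2.0 ** -samples


if __name__ == "__main__":
    remaining, samples, bound = peer_sampling_budget()
    print(f"{remaining} KB left, {samples} samples per slot, soundness bound {bound}")
```

Under these assumptions the defaults give 2048 KB, 16 samples, and a bound of 2^(-16), matching the figures in the text.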
The current setup, with MAX_BLOBS_PER_BLOCK set at 64, achieves ten times the throughput of the previous EIP-4844 standard without needing an excessive number of column subnets. This efficiency ensures that peer sampling remains robust without overly taxing the network.
Looking ahead, increasing MAX_BLOBS_PER_BLOCK to 128 may require doubling the number of column subnets to 64. This adjustment would depend heavily on the availability of supernodes and potentially higher peer counts to maintain network stability and security.
Moreover, optimizing the number of columns, like adjusting NUM_COLUMNS to 512, can significantly reduce the bandwidth impact of sampling. For instance, at MAX_BLOBS_PER_BLOCK of 64, sampling consumes bandwidth equivalent to half a blob, and at 128, one blob, which is eight times less than distribution. This reduction shows that strategic adjustments in network configurations can enhance efficiency, particularly as we transition towards more complex 2D data structures.
Network-Level Validation During Distribution
In the Ethereum improvement process, specifically PR#3531 under EIP-4844, there's a goal to perform network-level validation concurrently with the distribution phase. This allows for the parallel propagation of both the block and sidecars, eliminating the need for sidecars to wait until the block is fully received.
To ensure security, the system is designed so that proposers cannot distribute sidecars that conflict with the block's commitments without risking a penalty (known as "double signing"). This is ensured by including a kzg_commitment and a kzg_proof in the blob sidecar. These elements are accompanied by the block header and an `inclusion proof` against its body_root. This setup validates that the kzg_commitment is part of the blob_kzg_commitments list in the block body. By verifying the kzg_proof, it's confirmed that the commitment matches the blob, and the `inclusion proof` checks the list's presence in the block.
The strategy for blob sidecars can similarly be adapted for column sidecars, which are distributed across column subnets. Each column sidecar carries the necessary commitments for its verification, including the block header and an `inclusion proof` for the hash_tree_root(blob_kzg_commitments) in the body_root. This proof, which has a depth of 4, along with all `cell proofs` for the column, allows for batch verification.
This approach enables the column sidecar to be independently verified and forwarded immediately upon receipt, facilitating parallel propagation of the block and columns. It also allows for immediate sampling, with no dependencies on the receipt of the block or other column sidecars.
The main disadvantage is the redundancy in each column sidecar, as both the header and inclusion proof are replicated across each column. However, this redundancy adds only a minimal increase in bandwidth, roughly 5% per column, given that each column includes an additional 48 bytes per row for the commitments and another 48 bytes for the cell proofs. Given that there are 128 columns and each cell is approximately 2 KB, the increase is manageable, and the inclusion proof only needs to be verified once, not per column.
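The "roughly 5%" figure follows from the per-row sizes quoted above, as this small sketch shows. It counts only the 48-byte commitment and 48-byte cell proof per row against a 2 KB cell, and ignores the fixed cost of the header and inclusion proof, which is a simplification made for this example.

```python
COMMITMENT_BYTES = 48   # KZG commitment carried per row
CELL_PROOF_BYTES = 48   # cell proof carried per row
CELL_BYTES = 2 * 1024   # approximate cell size with NUM_COLUMNS = 128


def column_sidecar_overhead(cell_bytes: int = CELL_BYTES) -> float:
    """Extra bytes per row of a column sidecar, relative to the cell data itself."""
    return (COMMITMENT_BYTES + CELL_PROOF_BYTES) / cell_bytes


if __name__ == "__main__":
    print(f"overhead per column: {column_sidecar_overhead():.1%}")
```

Because both the payload and the overhead grow with the number of rows, the ratio stays the same regardless of the blob count.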
This method optimizes the efficiency and security of network-level validation in the Ethereum blockchain, enhancing the speed and reliability of block and sidecar propagation.
Transitioning from Stage 1 to Stage 2:
The move from Stage 1 to Stage 2 of PeerDAS leverages well-understood network components, such as GossipSub topics (subnets) for distribution and request/response (req/resp) protocols for peer sampling. Significantly, this shift doesn't necessitate a hard fork initially, as modifications are primarily within the networking architecture and fork-choice mechanisms. However, an increase in the blob count marks a pivotal change that will eventually require a hard fork. Initially, the 4844 standard blob count of 3/6 could be maintained while making other adjustments, like improving network protocols and testing new configurations. Eventually, the aim is to consolidate all significant updates—including increasing the blob count—into a single hard fork to streamline the transition and minimize disruptions.
Stage 2: 2D PeerDAS, or Full Danksharding
Full Danksharding is implemented in Stage 2, utilizing a refined sampling infrastructure where peer sampling is central. Here, blobs are extended in both dimensions—horizontally and vertically. This results in a lightweight peer sampling process that significantly reduces bandwidth consumption. The network may also support light sampling nodes that are not involved in distribution. If distributed reconstruction is implemented, it enhances the robustness against subnet failures. Initially, the blob count in this stage starts at 64 or potentially 128, with the goal to gradually increase to the maximum throughput of 256.
In this stage, blobs are extended horizontally and vertically, forming a matrix subdivided into several columns, determined by NUM_COLUMNS. Now, samples are not just columns but cells, which are intersections of rows and columns, similar to the structure seen in the Danksharding construction. This change in data format does not alter the overall distribution mechanism but does enhance the granularity and efficiency of data handling within the network.
Peer Sampling Innovations:
Peer sampling in Stage 2 builds on the established networking foundations from earlier stages. While the discovery of a diverse set of peers and the req/resp mechanisms remain the same, what changes significantly is the nature of the sampled objects. Instead of columns, nodes now request cells, which contain more granular and specific data points. Each cell comes with its proof, which must be verified against the KZG commitment of the respective row, enhancing the integrity and reliability of the data verification process.
Optimizing Bandwidth Through Advanced Configuration
This section looks at the practical implications and the underlying mathematics of pushing the limits with MAX_BLOBS_PER_BLOCK = 256. This setup is designed to maximize the system's throughput to 32 MB per slot (256 blobs × 128 KB), though the plan is to scale up to this level gradually.
As the network’s architecture is enhanced to support this growth, one scenario involves increasing NUM_COLUMN_SUBNETS to 64. This adjustment, feasible with a rise in reliable peer counts, leads to a new configuration where each subnet handles two columns of data. Despite each column holding about 1 MB of data, only 0.5 MBs need to be actively transmitted across the network. This efficiency is achieved by transmitting only the first half of the column, with the second half reconstructed locally at the receiver’s end using Fast Fourier Transform (FFT).
Fast Fourier Transform (FFT) Explained: The reconstruction process involves two critical FFT operations:
Conversion FFT: Transforms the data from its evaluation form to a coefficient form. This step simplifies data manipulation and prepares it for the next phase.
Recovery FFT: Converts the data back to its original form, fully reconstructing the column based on the half initially received.
This efficient use of FFTs, typically requiring about 4 milliseconds when executed with the arkworks cryptography library, underscores the system's capability to handle data robustly and swiftly.
With each subnet managing data transmission equivalent to 8 blobs, this configuration represents a slight increase over the previous 6 blobs per slot under EIP-4844 standards. The shift in data propagation dynamics, primarily due to the larger object sizes being managed, offers an interesting area for further optimization, possibly by adjusting NUM_COLUMNS.
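The "8 blobs per subnet" figure can be reconstructed from the numbers above. The sketch assumes 2 KB cells, a vertical extension that doubles the number of rows, and unextended blobs of 128 KB. The function name `stage2_subnet_load` is hypothetical.

```python
CELL_KB = 2    # cell size with NUM_COLUMNS = 128
BLOB_KB = 128  # unextended blob size


def stage2_subnet_load(max_blobs: int = 256, num_columns: int = 128,
                       num_column_subnets: int = 64):
    """Return (full column KB, transmitted KB per column, subnet load in blob equivalents)."""
    rows = 2 * max_blobs                # vertical extension doubles the rows
    column_kb = rows * CELL_KB          # full column, including the extension
    transmitted_kb = column_kb / 2      # only the first half is sent
    columns_per_subnet = num_columns // num_column_subnets
    subnet_kb = transmitted_kb * columns_per_subnet
    return column_kb, transmitted_kb, subnet_kb / BLOB_KB


if __name__ == "__main__":
    column, sent, blobs = stage2_subnet_load()
    print(f"column {column} KB, transmitted {sent} KB, subnet load of {blobs} blobs")
```

With the defaults, a column is 1024 KB, half of it (512 KB) is transmitted, and two columns per subnet add up to the equivalent of 8 blobs.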
Peer Sampling in the 2D Framework:
The bandwidth required for peer sampling in this context is remarkably low. Each data sample in the 2D setup weighs just 2 KB, a drastic reduction from previous setups where samples were significantly larger. With high-security settings, sampling k = 75 times, the theoretical soundness level achieved is 2^(-30), indicating an extremely low probability that any node could be deceived about data availability. Despite the rigorous sampling, the bandwidth used remains negligible compared to what is required for distribution.
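The quoted soundness level can be sanity-checked with a short calculation. It assumes that when 2D-extended data is unrecoverable, at most about 3/4 of the cells can be available, so each random sample succeeds with probability at most 3/4. That bound is an assumption brought in for this sketch and is not stated in the text. The function name `soundness_bits` is hypothetical.

```python
import math


def soundness_bits(samples: int, max_available_fraction: float) -> float:
    """Bits of soundness: -log2 of the chance that every sample succeeds on unavailable data."""
    return -samples * math.log2(max_available_fraction)


if __name__ == "__main__":
    # 1D case: at most 1/2 of the columns available, 16 samples.
    print(f"1D, 16 samples: {soundness_bits(16, 0.5):.1f} bits")
    # 2D case: at most ~3/4 of the cells available, 75 samples.
    print(f"2D, 75 samples: {soundness_bits(75, 0.75):.1f} bits")
    print(f"2D sampling bandwidth: {75 * 2} KB per slot")
```

Under that assumption, 75 samples of 2 KB each cost only 150 KB per slot and give slightly more than 30 bits of soundness.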
By incorporating these strategic and mathematical considerations into the design, the goal is not only to achieve high throughput but also to ensure robust security and efficiency. This holistic approach is critical as the transition towards fully realizing the capabilities of 2D PeerDAS and Danksharding in the blockchain environment progresses.
As the exploration into the network-level validation aspects of the PeerDAS system transitions from Stage 1 to more advanced stages, it's crucial to address how the vertical extension impacts data validation. Despite this new complexity, the approach for sending proofs and commitments with a column sidecar remains fundamentally unchanged from earlier stages. This stability is largely due to the homomorphic properties of both proofs and commitments, which allow for the second half of the column's proofs and commitments to be reconstructed from the first half through a linear combination.
Handling Proofs and Bandwidth: In theory, optimizing by only transmitting proofs for the first half of each column could significantly cut down on data transmission needs. However, to simplify the validation process and avoid the need for nodes to perform reconstructions, the choice might be to send all proofs anyway. If this approach is taken, a column sidecar would contain twice the amount of proofs, slightly increasing the bandwidth consumption by about 2.5%, assuming NUM_COLUMNS = 128. This increase is a manageable trade-off for reducing computational overhead on the nodes.
Transitioning to Stage 3:
Looking ahead to Stage 3, preparing for this transition involves refining the implementation of 2D cryptography. This advancement is crucial as it influences the block production process by changing how data is encoded and decoded. Additionally, there's a subtle yet significant shift in peer sampling: the objects exchanged through the req/resp protocol evolve from columns to cells. This change requires a nuanced approach in how data is requested and verified among peers.
The Role of Hard Forks:
In principle, the change in data structure alone does not necessitate a hard fork: blocks will still contain the same list of KZG commitments (those tied directly to actual blobs), and the extension rows introduced by the 2D construction can be reconstructed from existing data. A hard fork is needed instead to handle increased data throughput and to ensure all network upgrades are synchronized. Because these transitions must occur at a coordination point, in practice they would still likely be bundled with a hard fork for smooth implementation across the network.
This strategic planning and phased implementation help not only to manage bandwidth efficiently but also to ensure that the network remains robust and secure as more complex data structures and cryptography are introduced. This careful balance of technical innovation and practical network management is key to advancing the blockchain infrastructure into its next evolution with minimal disruption and maximized efficiency.
PeerDAS: From EIP-4844 to Full Danksharding
EIP-4844 marks a significant milestone in Ethereum's journey to enhance the scalability of its data availability layer. This upgrade introduces blob data, a new resource with its own dedicated fee market, alongside crucial cryptographic components such as the KZG ceremony, commitments, and proofs essential for Data Availability Sampling (DAS). Currently, full nodes still handle all data, but the scalability improvements from EIP-4844 stem from two key developments:
An independent fee market, which separates the economics of blob data from blockchain execution and keeps data throughput consistent regardless of the demand for layer 1 (L1) transaction processing.
A pruning mechanism for blobs, which bounds the additional storage cost by preventing blob data from accumulating indefinitely.
To further boost throughput while managing or even reducing resource demands, we would need to shift towards a model where nodes only download a portion of the data. This approach is central to the ambitious Danksharding proposal, which suggests a more granular implementation of Data Availability Sampling. Due to the complex nature of these changes, breaking them down into smaller, manageable steps seems prudent to minimize risks and simplify each upgrade phase.
This concept of gradual enhancement has been echoed in discussions around Proto’s PeerDHT over the past few years.
Proto’s Peer DHT Explained:
How do vast networks manage to find and share information swiftly and effectively? This is where Proto's Distributed Hash Table (DHT) comes into play. Proto DHT acts as the backbone of a large digital network, facilitating efficient data exchanges between computers through a well-organized lookup structure, the distributed hash table itself.
In Proto DHT, each node in the network holds a segment of the overall data, assigned specific keys that make locating and accessing information across the network incredibly efficient. This system isn't just smart; it's also highly scalable. As nodes join or depart, the network adjusts seamlessly, maintaining consistent data accessibility and timeliness.
Technically, nodes utilize a protocol named discv5 to find other nodes that hold the data they need, based on proximity within the network. This capability is crucial for maintaining an evenly distributed overlay network. Moreover, nodes evaluate each other based on how reliably and quickly they respond, ensuring the network remains efficient and trustworthy.
Through these innovations, Ethereum continues to refine its infrastructure, ensuring it can handle growing data demands while maintaining robustness and user trust.
The Role of PeerDAS in Enhancing Data Availability
The design philosophy behind PeerDAS is to leverage well-established, battle-tested peer-to-peer components that are already operational within Ethereum. The goal here is to significantly expand data availability (DA) capabilities beyond the enhancements introduced in EIP-4844, while ensuring that the workload for honest nodes remains manageable, akin to the requirements of 4844—where downloading less than 1MB per slot is the norm.
This exploration aims to unravel the potential scalability of a straightforward network structure, which utilizes a varied distribution of node types without depending on a more complex Distributed Hash Table (DHT)-like solution. By understanding these dynamics, we can better strategize enhancements in throughput and optimize resource usage.
To further advance throughput while efficiently using resources, it's necessary to integrate some form of Data Availability Sampling (DAS), as proposed in the Danksharding blueprint.
A Closer Look at DAS: DAS relies on several node types, each with specific assumptions about its role and capabilities:
Honest nodes are expected to download and serve specific samples of data from designated rows and columns. Validators, in particular, might be incentivized to custody data, though not necessarily to serve it actively.
High-capacity nodes handle a significant portion of data beyond the baseline expectations set for honest nodes.
A special case of high-capacity node stores and manages 100% of the data.
The distinctiveness of each DAS solution lies in how the network is organized, how peers for sampling are discovered, and how it either utilizes or under-utilizes nodes with higher capacities. This could mean supporting variability in node capacity, rather than expecting uniform capabilities across the network.
Configuration Simplification for Clarity: Here, we simplify the full parametrization required to illustrate the concept:
Number of rows and columns (sample value: 32). Specifies the total number of rows and columns within the data structure. This setting determines the grid size used for organizing and distributing data across the network.
Samples per row and column (sample value: 512). Indicates the number of samples taken from each row and column. This parameter defines the granularity of data sampling and the resolution of data that nodes access and verify.
Custody requirement, CUSTODY_REQUIREMENT (sample value: 2). The minimum number of rows and columns that an honest node must custody and serve. This requirement ensures that each node holds and can provide a baseline amount of data, supporting the network's robustness and redundancy.
Samples per slot (sample value: 70). The number of random samples a node queries per slot. Nodes use these queries to check the availability and integrity of data across the network.
Minimum peer count (sample value: 70). The minimum number of peer connections that a node must maintain. This ensures a well-connected network, facilitating efficient data exchange and robustness against node failures or network segmentation.
Number of columns, NUM_COLUMNS. Determines the number of columns in the data structure, which directly influences how much bandwidth is consumed by sampling activities. This configuration is integral for managing network load and optimizing data throughput.
Number of column subnets, NUM_COLUMN_SUBNETS. The total number of subnets, or separate data channels, within the network. This parameter defines how data is segmented and distributed among different network paths.
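As a rough illustration, the sample values listed above can be collected into a small configuration object. The field names below are placeholders chosen for this sketch, not identifiers taken from any specification, and the one derived quantity is simple arithmetic on the listed values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleDasConfig:
    # Field names are illustrative placeholders; the defaults are the
    # sample values quoted in the parameter list above.
    rows_and_columns: int = 32         # number of rows and of columns in the grid
    samples_per_row_column: int = 512  # samples taken from each row and column
    custody_requirement: int = 2       # minimum rows/columns an honest node custodies
    samples_per_slot: int = 70         # random samples queried per slot
    min_peers: int = 70                # minimum peer connections to maintain

    def custody_fraction(self) -> float:
        """Minimum share of the rows (or columns) that one honest node custodies."""
        return self.custody_requirement / self.rows_and_columns


config = SampleDasConfig()
print(f"minimum custody fraction: {config.custody_fraction():.4f}")
```

Keeping the parameters in one immutable object makes it easy to compare alternative settings side by side.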
Managing the complexity and risks associated with such upgrades involves breaking them down into smaller, more manageable steps. Initially, the plan is to start with a 1D DAS based on PeerDAS for a straightforward yet substantial throughput enhancement. Once all necessary cryptographic components are fully integrated, the transition to a more complex 2D construction will occur. Additional features like distributed reconstruction will be incorporated as they become ready. The strategy also includes a gradual increase in throughput to reach maximum capacity as network capabilities improve and system stability is verified. This phased upgrade approach is designed to seamlessly increase blob capacity, mirroring how gas limits are expanded on the execution layer, thereby making the transition smooth for both rollups and their users.
The stages break down as follows:
Stage 0: EIP-4844.
Subnets are introduced for distributing blobs, but nodes have to participate in all of them, downloading all the data. No DAS.
Stage 1: 1D PeerDAS.
Introduction
Blobs are expanded in a one-dimensional (1D) horizontal format. A method called sharding is introduced to distribute these blobs: specifically, column subnets, which are segments of the network that handle different columns of data. Not all nodes deal with every column; instead, each node participates in just a few specific subnets. This means that the older method of organizing data into row subnets might no longer be needed.
Essential networking components for peer sampling are also integrated. This includes finding peers for data sampling and conducting the sampling itself through a request/response (req/resp) protocol. This setup allows nodes to perform Data Availability Sampling (DAS) on columns. The selection of which and how many column subnets a node will participate in is carefully made, ensuring that each node’s group of peers can collectively access all necessary column data.
During this stage, a gradual increase in the total number of blobs that the system can handle is anticipated. It might start with supporting up to 32 blobs, then expand this capacity to 64, and potentially even up to 128 blobs as the network's capabilities grow. This expansion is designed to enhance the network’s efficiency and capacity in a controlled and reliable manner.
Blobs are individually extended horizontally, stacked vertically, and subdivided into NUM_COLUMNS columns, which are used as the sampling unit. In the following, I am mostly going to consider the example parameter NUM_COLUMNS = 128, which implies a maximum sample size of 256 * b / 128 = 2b KB for MAX_BLOBS_PER_BLOCK = b (extended blobs are 256 KB).
Distribution in GossipSub Topics
Let's move on to the data distribution aspect of the network. Data across the network is distributed via GossipSub Topics into "column subnets," which are specific channels or topics dedicated to sharing data. Each column of data is associated with one of these subnets, chosen from a predetermined total known as NUM_COLUMN_SUBNETS. To ensure data is both accessible and secure, each node must manage data from a minimum number of these subnets, defined by the CUSTODY_REQUIREMENT. This setup determines the least amount of data a node is responsible for, referred to as "minimum subnet density."
Minimum Subnet Density Calculation:
Minimum Subnet Density = CUSTODY_REQUIREMENT / NUM_COLUMN_SUBNETS
The fraction of data each node handles is calculated by dividing the CUSTODY_REQUIREMENT by the total NUM_COLUMN_SUBNETS. This ratio significantly influences the network's bandwidth consumption due to the amplification effects of broadcasting over the GossipSub topics used for column subnets. Unlike these, the data sampling through request/response protocols consumes far less bandwidth, which becomes increasingly efficient as the network evolves into a more complex 2D structure.
For practical settings, a subnet density ratio of 1/64 has proven safe based on past experiences, though a ratio of 1/128 might also be feasible. Nonetheless, a higher ratio, implying fewer subnets (for instance, starting with NUM_COLUMN_SUBNETS = 32 and CUSTODY_REQUIREMENT = 1), can enhance the effectiveness of peer sampling.
Suggested Starting Point: Starting with 32 column subnets and a CUSTODY_REQUIREMENT of 1 is recommended. This configuration means each node will be responsible for at least 1/32nd of the data, maintaining manageable network performance.
Flexibility and Security
Setting the CUSTODY_REQUIREMENT higher than 1 has several advantages:
Suppose we have 128 columns of data and 32 column subnets with a CUSTODY_REQUIREMENT of 1. Each subnet then carries 4 columns (128 columns / 32 subnets), so each node is responsible for a minimum of 4 columns.
If we increase the number of column subnets to 64 and set the CUSTODY_REQUIREMENT to 2, the minimum fraction of data each node handles remains 1/32 (since 2/64 = 1/32). However, the minimum unit of custody becomes smaller: each subnet now carries 2 columns (128 columns / 64 subnets), so a node still custodies at least 4 columns (2 subnets of 2 columns each) but can add custody in steps of 2 columns.
This setup allows nodes to handle varying amounts of data more flexibly. For example, a node that wants to handle more than the minimum requirement can choose to manage 6, 8, or 10 columns instead of being restricted to multiples of 4 columns. This flexibility makes the network more adaptable to different nodes' capacities and preferences.
By requiring each node to handle multiple subnets (CUSTODY_REQUIREMENT > 1), the initial distribution of data across the network becomes more secure. Each node gets pieces of data from different parts of the dataset, making it harder to convince a node that partially released data is fully available.
With a higher custody requirement, each piece of data is held by multiple nodes, increasing redundancy. This redundancy ensures that if some nodes go offline or are compromised, the data can still be reconstructed from other nodes.
Before more sophisticated security measures (like sampling) are implemented, having data held by multiple nodes adds a layer of protection. Even without sampling, the network can rely on the initial distribution to detect missing data.
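The custody arithmetic in this section can be checked with a few lines of Python. This is a minimal sketch that assumes columns are spread evenly across subnets, so a node's minimum custody is CUSTODY_REQUIREMENT subnets times the number of columns per subnet. The function name and the keys of the returned dictionary are made up for the illustration.

```python
from fractions import Fraction


def custody_profile(num_columns: int, num_column_subnets: int, custody_requirement: int) -> dict:
    """Derive density and custody sizes from the three parameters (illustrative helper)."""
    if num_columns % num_column_subnets != 0:
        raise ValueError("this sketch assumes columns divide evenly across subnets")
    columns_per_subnet = num_columns // num_column_subnets
    return {
        # Minimum Subnet Density = CUSTODY_REQUIREMENT / NUM_COLUMN_SUBNETS
        "min_subnet_density": Fraction(custody_requirement, num_column_subnets),
        # Smallest increment by which a node can grow its custody
        "custody_step_columns": columns_per_subnet,
        # Columns held by a node that custodies exactly the minimum
        "min_custody_columns": custody_requirement * columns_per_subnet,
    }


for subnets, custody in [(32, 1), (64, 2)]:
    print(subnets, custody, custody_profile(128, subnets, custody))
```

Both configurations share the same density of 1/32 and the same 4-column minimum, while the second one halves the custody step from 4 columns to 2.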
Peer Sampling and Scaling Strategies
Continuing the exploration of Peer-to-Peer Data Availability Sampling (PeerDAS), the discussion now turns to the intricacies of peer sampling, particularly focusing on how entire data columns, encapsulated within "column sidecar objects," are sampled through a request/response mechanism. The selection of peers for sampling these columns is determined by the specific data they manage, which is indicated in their Ethereum Node Record (ENR). This record details the columns a node is responsible for, influenced by factors such as their node ID, the number of subnets they are part of, and possibly the current epoch.
Scaling peer sampling in PeerDAS is not merely a function of available bandwidth but significantly depends on how nodes are grouped, that is, on their "peer sets". Each node’s set must comprehensively cover all the necessary data samples to ensure robustness against data loss or manipulation by unreliable or malicious nodes. To manage this effectively, keeping the number of column subnets (NUM_COLUMN_SUBNETS) relatively low is strategic; it simplifies the network's demands and ensures each node can handle its data responsibilities without excessive bandwidth usage. However, this does place a limit on how much data (or how many "blobs") the system can handle at once. As such, increasing the system's capacity (the number of blobs) might need to be done gradually, aligned with enhancements in network capabilities.
Furthermore, PeerDAS might reach its theoretical capacity from the outset if the network includes a robust set of supernodes (nodes that manage more data than typically required). The real challenge lies in accommodating the network's diversity, as nodes vary widely in their data handling capabilities. In a network with a broad range of node capabilities (a highly heterogeneous network), the number of peer connections required stays lower, which simplifies scaling. Moving forward, a setup with 32 column subnets and each node required to manage at least one (CUSTODY_REQUIREMENT = 1) could be a practical and efficient baseline for our PeerDAS infrastructure.
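To make the peer-set requirement concrete, the sketch below derives a node's custody columns from its node ID and then checks which columns a given peer set fails to cover. The hash-based derivation and the column-to-subnet mapping (column c on subnet c mod NUM_COLUMN_SUBNETS) are assumptions invented for this illustration. They are not the rule used by any client or specification, and all function names are hypothetical.

```python
import hashlib

NUM_COLUMNS = 128
NUM_COLUMN_SUBNETS = 32


def custody_subnets(node_id: int, custody_count: int) -> set:
    """Hypothetical rule: hash node_id, node_id + 1, ... until enough distinct subnets are drawn."""
    if not 0 < custody_count <= NUM_COLUMN_SUBNETS:
        raise ValueError("custody_count must be between 1 and NUM_COLUMN_SUBNETS")
    subnets = set()
    current = node_id
    while len(subnets) < custody_count:
        digest = hashlib.sha256((current % 2**256).to_bytes(32, "little")).digest()
        subnets.add(int.from_bytes(digest[:8], "little") % NUM_COLUMN_SUBNETS)
        current += 1
    return subnets


def custody_columns(node_id: int, custody_count: int) -> set:
    """Assumes column c is carried by subnet c % NUM_COLUMN_SUBNETS."""
    subnets = custody_subnets(node_id, custody_count)
    return {c for c in range(NUM_COLUMNS) if c % NUM_COLUMN_SUBNETS in subnets}


def missing_columns(peers) -> set:
    """Columns that no peer custodies; peers is an iterable of (node_id, custody_count)."""
    covered = set()
    for node_id, custody_count in peers:
        covered |= custody_columns(node_id, custody_count)
    return set(range(NUM_COLUMNS)) - covered


# A peer set of 40 minimal-custody nodes plus one node custodying 8 subnets.
peers = [(node_id, 1) for node_id in range(40)] + [(10_000, 8)]
print("columns not covered by this peer set:", len(missing_columns(peers)))
```

Because the derivation is deterministic, any node can recompute a peer's custody set from public information and look for peers that fill the gaps reported by `missing_columns`.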
Balancing Subnet Sampling and Peer Sampling Trade-offs
Subnet Sampling
In subnet sampling, we set the CUSTODY_REQUIREMENT high enough to provide security during the initial data distribution phase, even before peer sampling begins. This means only a small fraction of nodes can be tricked into treating data as available when fewer than half of the data columns are actually present.
Example:
Imagine an attacker makes columns [0, 59] available, which is slightly less than half of the data. If our CUSTODY_REQUIREMENT is set to 1 and we have 32 column subnets, honest nodes in the first 15 subnets will receive all sidecar objects and vote for the block, causing almost half of the honest nodes to vote for an unavailable block.
In general, with a CUSTODY_REQUIREMENT of k, an attacker can deceive up to 2^(-k) of all honest nodes. Therefore, to ensure meaningful security during distribution, the CUSTODY_REQUIREMENT should be at least 4.
However, if the attacker is adaptive (observes node distribution and selectively makes data available), they can still trick a significant portion of nodes. For instance, with a CUSTODY_REQUIREMENT of 4 and 128 column subnets, around 20% of honest nodes can be deceived. Increasing the CUSTODY_REQUIREMENT to 6 reduces this fraction to less than 10%.
Increasing CUSTODY_REQUIREMENT and the number of column subnets together, while keeping the same ratio, maintains subnet density and bandwidth consumption but enhances security. However, this comes with a tradeoff: a higher NUM_COLUMN_SUBNETS increases the number of honest peers a node needs to cover all subnets due to more overlaps. This can make it challenging to maintain effective peer sampling, especially with low peer counts. This impacts the network's fork choice and confirmation rules.
Optimizing Fork Choice Mechanisms
Continuing the discussion on Fork Choice within the Peer-to-Peer Data Availability Sampling (PeerDAS) framework, here is a simplified explanation of how the is_data_available function works and the considerations surrounding it:
For Current Slot (n): The availability of data for a block in the current slot n is checked by examining the column subnets a node is part of. If the node has received all the required "sidecar" data packets for these columns, then the data for that block is considered available.
For Previous Slots (<n): For blocks from previous slots, the decision is based on peer sampling results. If a node has successfully obtained all requested samples from its peers, the data for those earlier blocks is also considered available.
The proposer of the next slot (n+1) has about 8 seconds after the current slot’s attestation deadline to perform peer sampling and ensure the next block is built on verified data.
Attesters (validators that confirm block data) for the current slot just need to ensure they have received all required sidecars by their deadline.
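The two cases above can be sketched as a single function. Only the name is_data_available comes from the text. The NodeView container, its fields, and the handling of future slots are assumptions made for this illustration, not the structure used by any client.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class NodeView:
    # Hypothetical bookkeeping for this sketch.
    custody_columns: Set[int]
    # block root -> indices of column sidecars received over the column subnets
    received_columns: Dict[str, Set[int]] = field(default_factory=dict)
    # block root -> whether every requested peer sample was obtained
    sampling_passed: Dict[str, bool] = field(default_factory=dict)


def is_data_available(view: NodeView, block_root: str, block_slot: int, current_slot: int) -> bool:
    if block_slot > current_slot:
        # Assumption: blocks from future slots are never treated as available.
        return False
    if block_slot == current_slot:
        # Current slot: all sidecars for the node's custody columns must have arrived.
        received = view.received_columns.get(block_root, set())
        return view.custody_columns <= received
    # Previous slots: rely on the outcome of peer sampling.
    return view.sampling_passed.get(block_root, False)


view = NodeView(custody_columns={0, 1, 2, 3})
view.received_columns["block_b"] = {0, 1, 2, 3}
print(is_data_available(view, "block_b", block_slot=10, current_slot=10))
```

Splitting the rule by slot reflects the timing described above: attesters only need their custody sidecars by the deadline, while peer sampling finishes later and informs decisions about older blocks.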
Using this system, where data verification lags slightly behind block proposal, can lead to situations where validators might initially endorse a block even if not all of its data is actually verified at the time of voting. This vulnerability is mainly due to the low data custody requirement, meaning validators don't manage enough subnets to immediately confirm all data.
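To get a feel for how the custody requirement drives this exposure, the fraction of nodes whose custody subnets are all fully served by a partial release can be computed directly. The sketch assumes a non-adaptive attacker and nodes that pick their custody subnets uniformly at random without repetition. It reuses the earlier example, in which releasing columns [0, 59] fully serves 15 of 32 subnets. The function name is made up for the illustration.

```python
from math import comb


def deceived_fraction(num_subnets: int, available_subnets: int, custody: int) -> float:
    """Probability that every one of a node's `custody` subnets is fully available."""
    return comb(available_subnets, custody) / comb(num_subnets, custody)


for k in (1, 2, 4, 6):
    fraction = deceived_fraction(32, 15, k)
    print(f"custody={k}: fraction deceived = {fraction:.4f}, bound 2^-{k} = {2.0 ** -k:.4f}")
```

With a custody requirement of 1 this reproduces the "almost half" figure from the example, and each additional custody subnet roughly halves the exposed fraction under these assumptions.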
Example of a Potential Attack:
An attacker might release data selectively to trick validators into supporting a block with incomplete data availability. For example:
Block B is proposed at slot n with partial data secretly made available to sway validator votes.
For the next slot (n+1), the proposer finds out the data is missing and proposes a new block B' which doesn’t build on B.
Meanwhile, the attacker releases the missing data, making block B look valid, and validators now might ignore B' because B seems to have more support.
Mitigation Strategy: To mitigate such attacks, validators of slot n+1 should remember the results of their data availability checks from just before the end of slot n (say 10 seconds into the slot). They should use this information to inform their decisions in the next slot unless it conflicts with new blocks being proposed. This method echoes the "view-merge" technique used in Ethereum consensus research but simplifies the process as it doesn't necessitate additional messages from the proposer.
This approach to managing data verification and block validation aims to balance the need for timely processing with the integrity and security of blockchain operations. By closely managing the timing and criteria for data verification, the system strives to protect against both accidental errors and deliberate attacks, ensuring a robust and reliable blockchain network.
Bandwidth Management
Bandwidth management is effectively maintained, upholding the performance benchmarks set by EIP-4844. The targets are clear: aiming for an average of 3 MBs per slot and a maximum of 6 MBs per slot.
The network employs a system where each node handles data divided into subnets, specifically focusing on how much data each node is responsible for under varying loads.
Here’s a breakdown of the mathematical framework guiding the approach:
Each column of data is broken down into cells, where each cell is 2 KB (calculated as 256 KB divided by 128 equals 2 KB).
If a node manages a block where MAX_BLOBS_PER_BLOCK = 64, then each column is 128 KB (since 64 × 2 KB equals 128 KB).
Data propagation through GossipSub topics incurs an 8x amplification factor, analogous to the systems used in EIP-4844, making the propagation of each column equivalent to handling one entire blob.
After accounting for data distribution, the remaining bandwidth capacity allows for the equivalent of managing two additional blobs, equating to 2048 KB.
This capacity supports approximately 16 peer samples per slot (2048 KB divided by 128 KB/sample equals 16 samples), achieving a high level of network soundness with a probability of 2^(-16) that any single node could be misled about data availability by an attacker, assuming non-targeted attacks.
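The arithmetic in this breakdown can be reproduced in a few lines. The sketch takes the 2048 KB sampling budget as given and assumes that, when the data cannot be reconstructed, each sampled column is available with probability at most 1/2, which is what the 2^(-16) figure implies. The function and variable names are illustrative.

```python
def peer_sampling_budget(extended_blob_kb: int = 256,
                         num_columns: int = 128,
                         max_blobs_per_block: int = 64,
                         sampling_budget_kb: int = 2048):
    """Return (cell size in KB, column size in KB, samples per slot, soundness)."""
    cell_kb = extended_blob_kb // num_columns            # one blob's share of a column
    column_kb = cell_kb * max_blobs_per_block            # one full column = one sample
    samples_per_slot = sampling_budget_kb // column_kb   # samples that fit in the budget
    soundness = 0.5 ** samples_per_slot                  # chance every sample succeeds anyway
    return cell_kb, column_kb, samples_per_slot, soundness


cell_kb, column_kb, samples, soundness = peer_sampling_budget()
print(f"cell: {cell_kb} KB, column: {column_kb} KB, samples per slot: {samples}")
print(f"probability of being misled: {soundness:.3e}")
```

Changing `max_blobs_per_block` or `num_columns` shows how larger columns reduce the number of samples that fit into a fixed budget.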
The current setup, with MAX_BLOBS_PER_BLOCK set at 64, achieves ten times the throughput of the previous EIP-4844 standard without needing an excessive number of column subnets. This efficiency ensures that peer sampling remains robust without overly taxing the network.
Looking ahead, increasing MAX_BLOBS_PER_BLOCK to 128 may require doubling the number of column subnets to 64. This adjustment would depend heavily on the availability of supernodes and potentially higher peer counts to maintain network stability and security.
Moreover, optimizing the number of columns, like adjusting NUM_COLUMNS to 512, can significantly reduce the bandwidth impact of sampling. For instance, at MAX_BLOBS_PER_BLOCK of 64, sampling consumes bandwidth equivalent to half a blob, and at 128, one blob, which is eight times less than distribution. This reduction shows that strategic adjustments in network configurations can enhance efficiency, particularly as we transition towards more complex 2D data structures.
Network-Level Validation During Distribution
In the Ethereum improvement process, specifically PR#3531 under EIP-4844, there's a goal to perform network-level validation concurrently with the distribution phase. This allows for the parallel propagation of both the block and sidecars, eliminating the need for sidecars to wait until the block is fully received.
To ensure security, the system is designed so that proposers cannot distribute sidecars that conflict with the block's commitments without risking a penalty (known as "double signing"). This is ensured by including a kzg_commitment and a kzg_proof in the blob sidecar. These elements are accompanied by the block header and an `inclusion proof` against its body_root. This setup validates that the kzg_commitment is part of the blob_kzg_commitments list in the block body. By verifying the kzg_proof, it's confirmed that the commitment matches the blob, and the `inclusion proof` checks the list's presence in the block.
The strategy for blob sidecars can similarly be adapted for column sidecars, which are distributed across column subnets. Each column sidecar carries the necessary commitments for its verification, including the block header and an `inclusion proof` for the hash_tree_root(blob_kzg_commitments) in the body_root. This proof, which has a depth of 4, along with all `cell proofs` for the column, allows for batch verification.
This approach enables the column sidecar to be independently verified and forwarded immediately upon receipt, facilitating parallel propagation of the block and columns. It also allows for immediate sampling, with no dependencies on the receipt of the block or other column sidecars.
The main disadvantage is the redundancy in each column sidecar, as both the header and inclusion proof are replicated across each column. However, this redundancy adds only a minimal increase in bandwidth, roughly 5% per column, given that each column includes an additional 48 bytes per row for the commitments and another 48 bytes for the cell proofs. Given that there are 128 columns and each cell is approximately 2 KB, the increase is manageable, and the inclusion proof only needs to be verified once, not per column.
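The "roughly 5%" figure follows from the per-row sizes quoted above. The sketch below counts only the commitment and the cell proof carried alongside each 2 KB cell. The block header and inclusion proof are left out because their sizes are not stated here. The function name is illustrative.

```python
def per_cell_overhead(commitment_bytes: int = 48,
                      proof_bytes: int = 48,
                      cell_bytes: int = 2 * 1024) -> float:
    """Extra bytes per row of a column sidecar, relative to the 2 KB cell itself."""
    return (commitment_bytes + proof_bytes) / cell_bytes


print(f"overhead per cell: {per_cell_overhead():.2%}")
```

Because the overhead is expressed per cell, it does not depend on how many blobs the block carries.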
This method optimizes the efficiency and security of network-level validation in the Ethereum blockchain, enhancing the speed and reliability of block and sidecar propagation.
Transitioning from Stage 0 to Stage 1:
The move from Stage 0 to Stage 1 of PeerDAS leverages well-understood network components, such as GossipSub topics (subnets) for distribution and request/response (req/resp) protocols for peer sampling. Significantly, this shift doesn't necessitate a hard fork initially, as modifications are primarily within the networking architecture and fork-choice mechanisms. However, an increase in the blob count will eventually require a hard fork. Initially, the EIP-4844 blob count of 3/6 (target/maximum) could be maintained while making other adjustments, like improving network protocols and testing new configurations. Eventually, the aim is to consolidate all significant updates, including the blob count increase, into a single hard fork to streamline the transition and minimize disruptions.
Stage 2: 2D PeerDAS, or Full Danksharding
Full Danksharding is implemented in Stage 2, utilizing a refined sampling infrastructure where peer sampling is central. Here, blobs are extended in both dimensions—horizontally and vertically. This results in a lightweight peer sampling process that significantly reduces bandwidth consumption. The network may also support light sampling nodes that are not involved in distribution. If distributed reconstruction is implemented, it enhances the robustness against subnet failures. Initially, the blob count in this stage starts at 64 or potentially 128, with the goal to gradually increase to the maximum throughput of 256.
In this stage, blobs are extended horizontally and vertically, forming a matrix subdivided into several columns, determined by NUM_COLUMNS. Now, samples are not just columns but cells, which are intersections of rows and columns, similar to the structure seen in the Danksharding construction. This change in data format does not alter the overall distribution mechanism but does enhance the granularity and efficiency of data handling within the network.
Peer Sampling Innovations:
Peer sampling in Stage 2 builds on the established networking foundations from earlier stages. While the discovery of a diverse set of peers and the req/resp mechanisms remain the same, what changes significantly is the nature of the sampled objects. Instead of columns, nodes now request cells, which contain more granular and specific data points. Each cell comes with its proof, which must be verified against the KZG commitment of the respective row, enhancing the integrity and reliability of the data verification process.
Optimizing Bandwidth Through Advanced Configuration
This section works through the practical implications and the underlying arithmetic of pushing the limits to MAX_BLOBS_PER_BLOCK = 256. This setup is designed to maximize the system's throughput at 32 MB per slot, though the plan is to scale up to this level gradually.
As the network’s architecture is enhanced to support this growth, one scenario involves increasing NUM_COLUMN_SUBNETS to 64. This adjustment, feasible with a rise in reliable peer counts, leads to a new configuration where each subnet handles two columns of data. Despite each column holding about 1 MB of data, only 0.5 MBs need to be actively transmitted across the network. This efficiency is achieved by transmitting only the first half of the column, with the second half reconstructed locally at the receiver’s end using Fast Fourier Transform (FFT).
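The column sizes in this scenario can be derived from the parameters already introduced. The sketch assumes 2 KB cells, a vertical extension that doubles the number of rows, and an unextended blob size of 128 KB (half of the 256 KB extended blob). The function name and dictionary keys are made up for the illustration.

```python
def stage2_subnet_load(max_blobs_per_block: int = 256,
                       cell_kb: int = 2,
                       num_columns: int = 128,
                       num_column_subnets: int = 64,
                       blob_kb: int = 128) -> dict:
    rows = 2 * max_blobs_per_block        # vertical extension doubles the rows
    column_kb = rows * cell_kb            # full column after extension
    sent_kb = column_kb // 2              # only the first half is transmitted
    columns_per_subnet = num_columns // num_column_subnets
    subnet_kb = sent_kb * columns_per_subnet
    return {
        "column_kb": column_kb,
        "sent_per_column_kb": sent_kb,
        "columns_per_subnet": columns_per_subnet,
        "sent_per_subnet_kb": subnet_kb,
        "blob_equivalents_per_subnet": subnet_kb / blob_kb,
    }


for key, value in stage2_subnet_load().items():
    print(f"{key}: {value}")
```

Expressing the per-subnet load in blob equivalents makes it directly comparable with the per-slot blob counts of EIP-4844.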
Fast Fourier Transform (FFT) Explained: The reconstruction process involves two critical FFT operations:
Conversion FFT: Transforms the data from its evaluation form to a coefficient form. This step simplifies data manipulation and prepares it for the next phase.
Recovery FFT: Converts the data back to its original form, fully reconstructing the column based on the half initially received.
This efficient use of FFTs, typically requiring about 4 milliseconds when executed with the arkworks cryptography library, underscores the system's capability to handle data robustly and swiftly.
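The two transforms can be illustrated on a toy example. The sketch below uses the prime field modulo 17 with an 8-point domain instead of the much larger field used in production, and a plain recursive radix-2 transform. It also assumes the received half consists of the evaluations on the even powers of the root of unity. The actual ordering of cells within a column may differ, and all names are illustrative.

```python
P = 17      # toy prime field; stands in for the large field used in practice
OMEGA = 2   # primitive 8th root of unity modulo 17 (2**4 % 17 == 16, i.e. -1)


def ntt(values, root):
    """Recursive radix-2 FFT over the prime field (a number-theoretic transform)."""
    n = len(values)
    if n == 1:
        return list(values)
    even = ntt(values[0::2], root * root % P)
    odd = ntt(values[1::2], root * root % P)
    out = [0] * n
    w = 1
    for i in range(n // 2):
        t = w * odd[i] % P
        out[i] = (even[i] + t) % P
        out[i + n // 2] = (even[i] - t) % P
        w = w * root % P
    return out


def inverse_ntt(values, root):
    """Inverse transform: evaluations back to coefficients."""
    inv_n = pow(len(values), -1, P)
    return [v * inv_n % P for v in ntt(values, pow(root, -1, P))]


def recover_second_half(first_half):
    """Rebuild the missing evaluations from the received half."""
    # Conversion FFT: evaluation form -> coefficient form (on the half-size domain).
    coeffs = inverse_ntt(first_half, OMEGA * OMEGA % P)
    # Recovery FFT: evaluate the zero-padded coefficients on the full domain.
    full = ntt(coeffs + [0] * len(coeffs), OMEGA)
    return full[1::2]   # evaluations at the odd powers of OMEGA


received = [3, 1, 4, 1]
print("received half: ", received)
print("recovered half:", recover_second_half(received))
```

The receiver performs exactly one transform in each direction, which is why the cost stays small even for full-size columns.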
With each subnet managing data transmission equivalent to 8 blobs, this configuration represents a slight increase over the previous 6 blobs per slot under EIP-4844 standards. The shift in data propagation dynamics, primarily due to the larger object sizes being managed, offers an interesting area for further optimization, possibly by adjusting NUM_COLUMNS.
Peer Sampling in the 2D Framework:
The bandwidth required for peer sampling in this context is remarkably low. Each data sample in the 2D setup weighs just 2 KB, a drastic reduction from previous setups where samples were significantly larger. With high-security settings (sampling k = 75 times), the theoretical soundness level achieved is 2^(-30), indicating an extremely low probability that any node could be deceived about data availability. Despite the rigorous sampling, the bandwidth used remains negligible compared to what is required for distribution.
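The quoted soundness level can be sanity-checked with a short calculation. The sketch assumes that in the 2D construction the data can only be unrecoverable if at least a quarter of the cells are missing, so each uniformly random sample succeeds with probability at most 3/4. That threshold is an assumption stated for this example, and the function names are illustrative.

```python
from math import log2


def soundness_log2(samples: int, max_available_fraction: float) -> float:
    """log2 of the probability that every sample succeeds on unrecoverable data."""
    return samples * log2(max_available_fraction)


def sampling_bandwidth_kb(samples: int, sample_kb: int) -> int:
    """Download cost of one round of peer sampling."""
    return samples * sample_kb


k = 75
print(f"soundness: about 2^({soundness_log2(k, 0.75):.1f})")
print(f"bandwidth: {sampling_bandwidth_kb(k, 2)} KB per slot")
```

The same helper reproduces the 1D case by passing 16 samples and an available fraction of 0.5.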
By incorporating these strategic and mathematical considerations into the design, the goal is not only to achieve high throughput but also to ensure robust security and efficiency. This holistic approach is critical as the transition towards fully realizing the capabilities of 2D PeerDAS and Danksharding in the blockchain environment progresses.
As the exploration into the network-level validation aspects of the PeerDAS system transitions from Stage 1 to more advanced stages, it's crucial to address how the vertical extension impacts data validation. Despite this new complexity, the approach for sending proofs and commitments with a column sidecar remains fundamentally unchanged from earlier stages. This stability is largely due to the homomorphic properties of both proofs and commitments, which allow for the second half of the column's proofs and commitments to be reconstructed from the first half through a linear combination.
Handling Proofs and Bandwidth: In theory, optimizing by only transmitting proofs for the first half of each column could significantly cut down on data transmission needs. However, to simplify the validation process and avoid the need for nodes to perform reconstructions, the choice might be to send all proofs anyway. If this approach is taken, a column sidecar would contain twice the amount of proofs, slightly increasing the bandwidth consumption by about 2.5%, assuming NUM_COLUMNS = 128. This increase is a manageable trade-off for reducing computational overhead on the nodes.
Transitioning from Stage 1 to Stage 2:
Preparing for the transition to Stage 2 involves completing the implementation of the 2D cryptography. This affects block production, which must now extend and encode blob data in both dimensions. Additionally, there's a subtle yet significant shift in peer sampling: the objects exchanged through the req/resp protocol change from columns to cells. This changes how data is requested and verified among peers.
The Role of Hard Forks:
In principle, the transition to the 2D construction does not necessitate a hard fork: blocks still contain the same list of KZG commitments (those tied directly to actual blobs), and the extension rows introduced by the 2D construction can be reconstructed from existing data. A hard fork is therefore not needed for the change in data structure alone, but rather to raise data throughput and keep all network upgrades synchronized. Because these transitions should occur at a coordination point, in practice they would still likely be bundled with a hard fork for smooth implementation across the network.
This strategic planning and phased implementation help not only to manage bandwidth efficiently but also to ensure that the network remains robust and secure as more complex data structures and cryptography are introduced. This careful balance of technical innovation and practical network management is key to advancing the blockchain infrastructure into its next evolution with minimal disruption and maximized efficiency.