
Introduction

This book contains the Polkadot Fellowship Requests for Comments (RFCs) detailing proposed changes to the technical implementation of the Polkadot network.

polkadot-fellows/RFCs


RFC-0111: Pure Proxy Replication

Start Date: 12 Aug 2024
Description: Replication of pure proxy account ownership to a remote chain
Authors: @muharem @xlc

Summary

This RFC proposes a solution to replicate an existing pure proxy from one chain to others. The aim is to address the current limitations where pure proxy accounts, which are keyless, cannot have their proxy relationships recreated on different chains. This leads to issues where funds or permissions transferred to the same keyless account address on chains other than its origin chain become inaccessible.

Motivation

A pure proxy is a new account created by a primary account. The primary account is set as a proxy for the pure proxy account, managing it. Pure proxies are keyless and non-reproducible, meaning they lack a private key and have an address derived from a preimage determined by on-chain logic. More on pure proxies can be found in the Pure Proxy documentation linked at the end of this RFC.

For the purpose of this document, we define a keyless account as a "pure account", the controlling account as a "proxy account", and the entire relationship as a "pure proxy".

The relationship between a pure account (e.g., account ID: pure1) and its proxy (e.g., account ID: alice) is stored on-chain (e.g., on parachain A) and currently cannot be replicated to another chain (e.g., parachain B). Because the account pure1 is keyless and its proxy relationship with alice is not replicable from parachain A to parachain B, alice does not control the pure1 account on parachain B.

Although this behaviour is not promised, users and clients often mistakenly expect alice to control the same pure1 account on parachain B. As a result, assets transferred to the account or permissions granted to it are inaccessible. Several factors contribute to this misuse:

  • regular accounts on different parachains with the same account ID are typically accessible for the owner and controlled by the same private key (e.g., within System Parachains);
  • users and clients do not distinguish between keyless and regular accounts;
  • the same multisig account ID is used across different chains, even when one of the multisig members is a pure account;
  • users may prefer an account with a registered identity (e.g. for cross-chain treasury spend proposal), even if the account is keyless;

Given that these mistakes are likely, it is necessary to provide a solution to either prevent them or enable access to a pure account on a target chain.

Stakeholders

Runtime Users, Runtime Devs, wallets, cross-chain dApps.

Explanation

One possible solution is to allow a proxy to create or replicate a pure proxy relationship for the same pure account on a target chain. For example, Alice, as the proxy of the pure1 pure account on parachain A, should be able to set a proxy for the same pure1 account on parachain B.

To minimise security risks, parachain B should grant parachain A the least amount of permission necessary for the replication. First, parachain A claims to parachain B that the operation is commanded by the pure account, and thus by its proxy; second, it provides proof that the account is keyless.

The replication process will be facilitated by XCM, with the first claim made using the DescendOrigin instruction. The replication call on parachain A would require an origin signed by the pure account and would construct an XCM program for parachain B, where it first descends the origin, resulting in the ParachainA/AccountId32(pure1) origin location on the receiving side.

To prove that the pure account is keyless, the client must provide the initial preimage used by the chain to derive the pure account. Parachain A verifies it and sends it to parachain B with the replication request.

We can draft a pallet extension for the proxy pallet, which needs to be initialised on both sides to enable replication:

#![allow(unused)]
fn main() {
// Simplified version to illustrate the concept.
mod pallet_proxy_replica {
  /// The part of the pure account preimage that has to be provided by a client.
  struct Witness {
    /// The spawner of the pure proxy.
    spawner: AccountId,
    /// Disambiguation index
    index: u16,
    /// The block height at which the pure account was created.
    block_number: BlockNumber,
    /// The extrinsic index.
    ext_index: u32,
    // Part of the preimage, but constant.
    // proxy_type: ProxyType::Any,
  } 
  // ...
  
  /// The replication call to be initiated on the source chain.
  // Simplified version, the XCM part will be abstracted by the `Config` trait.
  fn replicate(origin: SignedOrigin, witness: Witness, proxy: xcm::Location) -> ... {
       let pure = ensure_signed(origin)?;
       ensure!(pure == proxy_pallet::derive_pure_account(witness), Error::NotPureAccount);
       let xcm = vec![
         DescendOrigin(AccountId32(pure)),
         Transact(
             // …
             origin_kind: OriginKind::Xcm,
             call: pallet_proxy_replica::create(witness, proxy).encode(),
         )
       ];
       XcmTransport::send(xcm)?;
  }
  // …
  
  /// The call initiated by the source chain on the receiving chain.
  // `Config::CreateOrigin` - generally open for whitelisted parachain IDs and
  // converts `Origin::Xcm(ParachainA/AccountId32(pure1))` to `AccountId(pure1)`.
  fn create(origin: Config::CreateOrigin, witness: Witness, proxy: xcm::Location) -> ... {
       let pure = T::CreateOrigin::ensure_origin(origin)?;
       ensure!(pure == proxy_pallet::derive_pure_account(witness), Error::NotPureAccount);
       proxy_pallet::create_pure_proxy(pure, proxy);
  }
}

}
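For reference, a sketch of what proxy_pallet::derive_pure_account(witness) could look like. This is illustrative rather than the exact pallet-proxy code: the constant prefix and field order mirror the pure-account derivation currently used by pallet-proxy, and Witness, AccountId and ProxyType are the simplified types from the snippet above.

// Sketch only: recompute the keyless pure account from the client-provided witness.
use codec::{Decode, Encode};
use sp_io::hashing::blake2_256;
use sp_runtime::traits::TrailingZeroInput;

fn derive_pure_account(witness: &Witness) -> AccountId {
    // The preimage: a constant module prefix, the witness fields, and the
    // constant `ProxyType::Any` part of the preimage (see `Witness` above).
    let entropy = (
        b"modlpy/proxy____",
        &witness.spawner,
        witness.block_number,
        witness.ext_index,
        ProxyType::Any,
        witness.index,
    )
        .using_encoded(blake2_256);
    // Interpret the 32-byte hash as an account ID (AccountId32 assumed).
    AccountId::decode(&mut TrailingZeroInput::new(entropy.as_ref()))
        .expect("32 bytes always decode into an AccountId32; qed")
}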

Drawbacks

There are two disadvantages to this approach:

  • The receiving chain has to trust the sending chain's claim that the account controlling the pure account has commanded the replication.
  • Clients must obtain witness data.

We could eliminate the first disadvantage by allowing only the spawner of the pure proxy to recreate the pure proxies, if they sign the transaction on a remote chain and supply the witness/preimage. Since the preimage of a pure account includes the account ID of the spawner, we can verify that the account signing the transaction is indeed the spawner of the given pure account. However, this approach would grant exclusive rights to the spawner over the pure account, which is not a property of pure proxies at present. This is why it's not an option for us.

As an alternative to requiring clients to provide witness data, we could label pure accounts on the source chain and trust that label on the receiving chain. However, this would require the receiving chain to place greater trust in the source chain. If the source chain is compromised, any type of account on the trusting chain could also be compromised.

A conceptually different solution would be to not implement replication of pure proxies and instead inform users that ownership of a pure proxy on one chain does not imply ownership of the same account on another chain. This solution seems complex, as it would require UIs and clients to adapt to this understanding. Moreover, mistakes would likely remain unavoidable.

Testing, Security, and Privacy

Each chain expressly authorizes another chain to replicate its pure proxies, accepting the inherent risk of that chain potentially being compromised. This authorization allows a malicious actor from the compromised chain to take control of any pure proxy account on the chain that granted the authorization. However, if pure-account preimages include a chain-specific seed, the damage is limited to pure proxies that originated from the compromised chain.

There is a security issue, not introduced by the proposed solution but worth mentioning. The same spawner can create pure accounts with the same address on different chains, controlled by different accounts. This is possible because the current preimage version of the proxy pallet does not include any non-reproducible, chain-specific data, and elements like block numbers and extrinsic indices can be reproduced with some effort. This issue could be addressed by adding a chain-specific seed to the preimages of pure accounts.

Performance, Ergonomics, and Compatibility

Performance

The replication is facilitated by XCM, which adds some additional load to the communication channel. However, since the number of replications is not expected to be large, the impact is minimal.

Ergonomics

The proposed solution does not alter any existing interfaces. It does require clients to obtain the witness data, which should not be an issue with the support of an indexer.

Compatibility

None.

Prior Art and References

None.

Unresolved Questions

None.

  • Pure Proxy documentation - https://wiki.polkadot.network/docs/learn-proxies-pure


RFC-0112: Compress the State Response Message in State Sync

Start Date: 14 August 2024
Description: Compress the state response message to reduce the data transfer during the state syncing
Authors: Liu-Cheng Xu

Summary

This RFC proposes compressing the state response message during the state syncing process to reduce the amount of data transferred.

Motivation

State syncing can require downloading several gigabytes of data, particularly for blockchains with large state sizes, such as Astar, which has a state size exceeding 5 GiB (https://github.com/AstarNetwork/Astar/issues/1110). This presents a significant challenge for nodes with slower network connections. Additionally, the current state sync implementation lacks a persistence feature (https://github.com/paritytech/polkadot-sdk/issues/4), meaning any network disruption forces the node to re-download the entire state, making the process even more difficult.

Stakeholders

This RFC benefits all projects utilizing the Substrate framework, specifically in improving the efficiency of state syncing.

  • Node Operators.
  • Substrate Users.

Explanation

The largest portion of the state response message consists of either CompactProof or Vec<KeyValueStateEntry>, depending on whether a proof is requested (source):

  • CompactProof: When proof is requested, compression yields a lower ratio but remains beneficial, as shown in warp sync tests in the Performance section below.
  • Vec<KeyValueStateEntry>: Without proof, this is theoretically compressible because the entries are generated by iterating the storage sequentially starting from an empty storage key, which means many entries in the message share the same storage prefix, making it ideal for compression.

Drawbacks

None identified.

Testing, Security, and Privacy

The code changes required for this RFC are straightforward: compress the state response on the sender side and decompress it on the receiver side. Existing sync tests should ensure functionality remains intact.
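As a rough illustration, assuming a general-purpose compressor such as the zstd crate (the function names and integration point are illustrative, not the actual sync code):

// Illustrative only: compress the SCALE-encoded state response before sending
// and decompress it on receipt; error handling and message plumbing omitted.
use std::io;

fn compress_state_response(encoded_response: &[u8]) -> io::Result<Vec<u8>> {
    // Level 0 lets zstd pick its default compression level.
    zstd::encode_all(encoded_response, 0)
}

fn decompress_state_response(compressed: &[u8]) -> io::Result<Vec<u8>> {
    zstd::decode_all(compressed)
}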

Performance, Ergonomics, and Compatibility

Performance

This RFC optimizes network bandwidth usage during state syncing, particularly for blockchains with gigabyte-sized states, while introducing negligible CPU overhead for compression and decompression. For example, compressing the state response during a recent Polkadot warp sync (around height #22076653) reduces the data transferred from 530,310,121 bytes to 352,583,455 bytes — a 33% reduction, saving approximately 169 MiB of data.

Performance data is based on this patch, with logs available here.

Ergonomics

None.

Compatibility

No compatibility issues identified.

Prior Art and References

None.

Unresolved Questions

None.



RFC-0117: The Unbrick Collective

Start Date: 22 August 2024
Description: The Unbrick Collective aims to help teams rescue a para once it stops producing blocks
Authors: Bryan Chen, Pablo Dorado

Summary

A follow-up to RFC-0014. This RFC proposes adding a new collective to the Polkadot Collectives Chain, the Unbrick Collective, as well as improvements to the mechanisms that allow teams operating paras that have stopped producing blocks to be assisted in restoring block production.

Motivation

Since the initial launch of Polkadot parachains, there have been many incidents causing parachains to stop producing new blocks (and therefore becoming bricked) and many occurrences that required Polkadot governance to update the parachain head state/wasm. These can be due to many reasons, ranging from incorrectly registering the initial head state, inability to use the sudo key, bad runtime migrations, and bad weight configurations, to bugs in the development of the Polkadot SDK.

Currently, when the para is not unlocked in the paras registrar[1], the Root origin is required to perform such actions, and invoking this origin through the governance process can be very resource-intensive for the teams. The long voting and enactment times could also result in significant damage to the parachain and its users.

Finally, other instances of governance that might enact a call using the Root origin (like the Polkadot Fellowship) are, due to the nature of their mission, not fit to carry out these kinds of tasks.

Consequently, the idea of an Unbrick Collective that can assist para teams when their paras brick, and provide further protection against future halts, is reasonable.

Stakeholders

  • Parachain teams
  • Parachain users
  • OpenGov users
  • Polkadot Fellowship

Explanation

The Collective

The Unbrick Collective is defined as an unranked collective of members, not paid by the Polkadot Treasury. Its main goal is to serve as a point of contact and assistance for enacting the actions needed to unbrick a para. Such actions are:

  • Updating the Parachain Verification Function (a.k.a. a new WASM) of a para.
  • Updating the head state of a para.
  • A combination of the above.

In order to ensure these changes are safe enough for the network, actions enacted by the Unbrick Collective must be whitelisted via mechanisms similar to those followed by collectives like the Polkadot Fellowship. This will prevent unintended, unreviewed changes to other paras from occurring.

Also, teams might opt in to delegate the handling of their para in the registry to the Collective. This allows similar actions to be performed using the paras registrar, providing a shorter path to unbrick a para.

Initially, the Unbrick Collective has powers similar to a parachain's own sudo, but permits more decentralized control. In the future, Polkadot shall provide functionality like SPREE or JAM that exceeds sudo permissions, so the Unbrick Collective will not be able to modify those state roots or code.

The Unbrick Process

flowchart TD
    A[Start] 

    A -- Bricked --> C[Request para unlock via Root]
    C -- Approved --> Y
    C -- Rejected --> A
    
    D[unbrick call proposal on WhitelistedUnbrickCaller]
    E[whitelist call proposal on the Unbrick governance]
    E -- call whitelisted --> F[unbrick call enacted]
    D -- unbrick called --> F
    F --> Y

    A -- Not bricked --> O[Opt-in to the Collective]
    O -- Bricked --> D
    O -- Bricked --> E

    Y[update PVF / head state] -- Unbricked --> Z[End]

Initially, a para team has two paths to handle a potential unbrick of their para in the case it stops producing blocks.

  1. Opt-in to the Unbrick Collective: This is done by delegating the handling of the para in the paras registrar to an origin related to the Collective. This doesn't require unlocking the para. This way, the Collective is enabled to perform changes in the paras module once the Unbrick Process has taken place.
  2. Request a Para Unlock: In case the para hasn't delegated its handling in the paras registrar, it will still be possible for the para team to submit a proposal to unlock the para, which can be assisted by the Collective. However, this involves submitting a proposal to the Root governance origin.

Belonging to the Collective

The collective will be initially created without members (no seeding). There will be additional governance proposals to set up the seed members.

The origins able to modify the members of the collective are:

  • The Fellows track in the Polkadot Fellowship.
  • Root track in the Relay.
  • More than two thirds of the existing Unbrick Collective.

The members are responsible for verifying the technical details of the unbrick requests (i.e. the hash of the new PVF being set). Therefore, they must have the technical capacity to perform such tasks.

Suggested requirements to become a member are the following:

  • Rank 3 or above in the Polkadot Fellowship.
  • Being a CTO or Technical Lead in a para team that has opted in to delegating the management of the para's PVF/head state to the Unbrick Collective.

Drawbacks

The ability to modify the Head State and/or the PVF of a para means the possibility of performing arbitrary modifications to it (e.g., taking control of the native parachain token or any bridged assets in the para).

This could introduce a new attack vector, and therefore, such great power needs to be handled carefully.

Testing, Security, and Privacy

The implementation of this RFC will be tested on testnets (Rococo and Westend) first.

An audit will be required to ensure the implementation doesn't introduce unwanted side effects.

There are no privacy related concerns.

Performance, Ergonomics, and Compatibility

Performance

This RFC should not introduce any performance impact.

Ergonomics

This RFC should improve the experience for new and existing parachain teams, lowering the barrier to unbrick a stalled para.

Compatibility

This RFC is fully compatible with existing interfaces.

Prior Art and References

Unresolved Questions

  • What are the parameters for the WhitelistedUnbrickCaller track?
  • Any other methods that shall be updated to accept Unbrick origin?
  • Any other requirements to become a member?
  • We would like to keep this simple, so no funding support from the Polkadot treasury. But do we want to compensate the members somehow? i.e. Allow parachain teams to donate to the collective.
  • We hope SPREE/JAM would be carefully audited for misuse risks before being provided to parachain teams, but could the Unbrick Collective have an election process that warrants trust beyond sudo powers?
  • An auditing framework/collective makes sense for parachain code upgrades, but could also strengthen the Unbrick Collective.
  • Do we want to have this collective offer additional technical support to help bricked parachains? i.e. help debug the code, create the rescue plan, create postmortem report, provide resources on how to avoid getting bricked
[1] The paras registrar refers to a pallet in the Relay, responsible for gathering the registration info of the paras, the locked/unlocked state, and the manager info.


RFC-0121: Iterable Referenda Tracks

Start Date: 17 September 2024
Description: Allow dynamic modifications of referenda tracks at runtime without the need for a full runtime upgrade.
Authors: Pablo Dorado, Daniel Olano

Summary

The protocol change introduces flexibility in the governance structure by enabling the referenda track list to be modified dynamically at runtime. This is achieved by replacing static slices in TracksInfo with iterators, facilitating storage-based track management. As a result, governance tracks can be modified or added based on real-time decisions and without requiring runtime upgrades.

Motivation

Polkadot's governance system is designed to be adaptive and decentralized, but modifying the referenda tracks (which determine decision-making paths for proposals) has historically required runtime upgrades. This poses an operational challenge, delaying governance changes until an upgrade is scheduled and executed. The new system provides the flexibility needed to adjust tracks dynamically, reflecting real-time changes in governance needs without the latency and risks associated with runtime upgrades. This reduces governance bottlenecks and allows for quicker adaptation to emergent scenarios.

Stakeholders

  • Network stakeholders: the change means reduced coordination effort for track adjustments.
  • Governance participants: this enables more responsive decision-making pathways.

Explanation

The protocol modification replaces the current static slice method used for storing referenda tracks with an iterator-based approach that allows tracks to be managed dynamically using chain storage. Governance participants can define and modify referenda tracks as needed, which are then accessed through runtime rather than being hardcoded in the protocol. This system ensures that tracks are adjustable at any time, reducing upgrade-related complexities and introducing agility in how governance tracks are applied. This modification does not disrupt existing governance mechanisms but rather enhances them by increasing adaptability.

In terms of technical structure, TracksInfo::tracks will now return iterators, making it possible to alter track configurations based on storage data rather than static definitions. This opens up possibilities for new track types and governance configurations to be deployed without the need for upgrades that might take weeks.
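A rough sketch of the shape of this change (not the exact pallet-referenda API; TrackInfo stands in for the existing per-track configuration type):

// Placeholder for the existing per-track configuration (name, periods, curves, ...).
pub struct TrackInfo {}

// Before: tracks are a compile-time constant slice baked into the runtime.
pub trait StaticTracksInfo {
    type Id;
    fn tracks() -> &'static [(Self::Id, TrackInfo)];
}

// After (proposed): tracks are yielded by an iterator, so an implementation is free
// to read them from chain storage, and governance can add or modify tracks at runtime.
pub trait IterableTracksInfo {
    type Id;
    fn tracks() -> impl Iterator<Item = (Self::Id, TrackInfo)>;
}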

Drawbacks

The most significant drawback is the increased complexity for developers managing track configurations via storage-based iterators, which require careful handling to avoid misuse or inefficiencies.

Additionally, this flexibility could introduce risks if track configurations are modified improperly during runtime, potentially leading to governance instabilities.

Testing, Security, and Privacy

To ensure security, the change must be tested in testnet environments first (Paseo, Westend), particularly in scenarios where multiple track changes happen concurrently. Potential vulnerabilities in governance adjustments must be addressed to prevent abuse.

The proposal doesn't introduce privacy risks but increases the need for ensuring that any runtime changes do not inadvertently lead to insecure governance structures.

Comprehensive tests should be conducted to validate correct track modifications in different governance scenarios.

Performance, Ergonomics, and Compatibility

Performance

The proposal optimizes governance track management by avoiding the overhead of runtime upgrades, reducing downtime, and eliminating the need for full consensus on upgrades. However, there is a slight performance cost related to runtime access to storage-based iterators, though this is mitigated by the overall system efficiency gains.

Ergonomics

Developers and governance actors benefit from simplified governance processes but must account for the technical complexity of managing iterator-based track configurations.

Tools may need to be developed to help streamline track adjustments in runtime.

Compatibility

The change is backward compatible with existing governance operations, and does not require developers to adjust how they interact with referenda tracks.

A migration is required to convert existing statically-defined tracks to dynamic storage-based configurations without disruption.

Prior Art and References

This dynamic governance track approach builds on previous work around Polkadot's on-chain governance and leverages standard iterator patterns in Rust programming to improve runtime flexibility. Comparable solutions in other governance networks were examined, but this proposal uniquely tailors them to Polkadot’s decentralized, runtime-upgradable architecture.

Unresolved Questions

  • How to handle governance transitions for currently ongoing referenda when changing configuration parameters of an existing track? Ideally, most tracks should not have to go through this change, but some tactics might be applied (like a proposal that reduces the ongoing queue before a major change and then executes the change, after a reasonable period of time has elapsed and no ongoing referenda exists for that track).

There are already two proposed solutions, covering both the implementation and its usage:

  • This Pull Request proposes changing pallet-referenda's TracksInfo to make tracks return an iterator.
  • There is already a proposed implementation of pallet-referenda-tracks, which stores the configurations, and implements TracksInfo using the iterator approach.


RFC-0123: Introduce :pending_code as intermediate storage key for the runtime code

Start Date: 14.10.2024
Description: Store a runtime upgrade in :pending_code before moving it to :code.
Authors: Bastian Köcher

Summary

The code of a runtime is stored in its own state, and when performing a runtime upgrade, this code is replaced. The new runtime can contain runtime migrations that adapt the state to the state layout as defined by the runtime code. This runtime migration is executed when building the first block with the new runtime code. Anything that interacts with the runtime state uses the state layout as defined by the runtime code. So, when trying to load something from the state in the block that applied the runtime upgrade, it will use the new state layout but will decode the data from the non-migrated state. In the worst case, the data is incorrectly decoded, which may lead to crashes or halting of the chain.

This RFC proposes to store the new runtime code under a different storage key when applying a runtime upgrade. This way, all the off-chain logic can still load the old runtime code under the default storage key and decode the state correctly. The block producer is then required to use this new runtime code to build the next block. While building the next block, the runtime is executing the migrations and moves the new runtime code to the default runtime code location. So, the runtime code found under the default location is always the correct one to decode the state from which the runtime code was loaded.

Motivation

While the issue of having undecodable state only exists for the one block in which the runtime upgrade was applied, it still impacts anything that reads state data, like block explorers, UIs, nodes, etc. For block explorers, the issue mainly results in indexing invalid data, and UIs may show invalid data to the user. For nodes, reading incorrect data may lead to a performance degradation of the network. There are also ways to prevent certain decoding issues from happening, but they require that developers are aware of this issue and introduce extra code, which could introduce further bugs down the line.

So, this RFC tries to solve these issues by fixing the underlying problem of having temporary undecodable state.

Stakeholders

  • Relay chain/Parachain node developers
  • Relay chain/Parachain node operators

Explanation

The runtime code is stored under the special key :code in the state. Nodes and other tooling read the runtime code under this storage key when they want to interact with the runtime, e.g., for building/importing blocks or getting the metadata to read the state. To update the runtime code, the runtime overwrites the value at :code, and from the next block on, the new runtime will be loaded.

This RFC proposes to first store the new runtime code under :pending_code in the state for one block. When the next block is being built, the block builder first needs to check if :pending_code is set, and if so, it needs to load the runtime from this storage key. While building the block, the runtime will move :pending_code to :code to have the runtime code at the default location. Nodes importing the block will also need to load :pending_code if it exists to ensure that the correct runtime code is used. By doing it this way, the runtime code found at :code in the state of a block will always be able to decode the state.

Furthermore, this RFC proposes to introduce system_version: 3. The system_version was introduced in RFC42. Version 3 would enable the usage of :pending_code when applying a runtime code upgrade. This way, the feature can be introduced first and enabled later when the majority of the nodes have upgraded.
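A simplified sketch of the selection logic a block builder or importer would apply once system_version: 3 is active (illustrative pseudo-code over a generic state backend, not the actual polkadot-sdk implementation):

// Illustrative only: pick the runtime code used to build/import the next block.
trait StateBackend {
    fn storage(&self, key: &[u8]) -> Option<Vec<u8>>;
}

fn runtime_code_for_next_block(state: &impl StateBackend) -> Vec<u8> {
    // If the previous block applied a runtime upgrade, the new code sits under
    // `:pending_code` and must be used for the next block; while executing that
    // block, the runtime moves it to `:code`.
    if let Some(pending) = state.storage(b":pending_code") {
        pending
    } else {
        state
            .storage(b":code")
            .expect("`:code` is always present in the state")
    }
}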

Drawbacks

Because the first block built with the new runtime code will move the runtime code from :pending_code to :code, the runtime code will need to be loaded. This means the runtime code will appear in the proof of validity of a parachain for the first block built with the new runtime code. Generally this is not a problem, as the runtime code is also loaded by the parachain when setting the new runtime code.

There is still the possibility of having state that is not migrated, even when following the proposal as presented by this RFC. The issue is that if the amount of data to be migrated is too big, not all of it can be migrated in one block, because either it takes more time than is assigned for a block, or, for example, parachains have a fixed budget for their proof of validity. To solve this issue there already exist multi-block migrations that can chunk the migration across multiple blocks. Consensus-critical data needs to be migrated in the first block to ensure that block production etc. can continue. For the other data being migrated by multi-block migrations, the migrations could, for example, expose which keys are being migrated so that they are not indexed until the migration is finished.

Testing, Security, and Privacy

Testing should be straightforward, and most of the existing testing should already be good enough. It should be extended with some checks that :pending_code is moved to :code.

Performance, Ergonomics, and Compatibility

Performance

The performance should not be impacted besides requiring loading the runtime code in the first block being built with the new runtime code.

Ergonomics

It only alters the way blocks are produced and imported after applying a runtime upgrade. This means that only nodes need to be adapted to the changes of this RFC.

Compatibility

The change will require that the nodes are upgraded before the runtime starts using this feature. Otherwise they will fail to import the block built with :pending_code. For Polkadot/Kusama this means that the parachain nodes also need to be running with a relay chain node version that supports this new feature. Otherwise the parachains will stop producing/finalizing blocks as they can not sync the relay chain any more.

Prior Art and References

The issue initially reported a bug that led to this RFC. It also discusses multiple solutions for the problem.

Unresolved Questions

None

  • Solve the issue of requiring loading the entire runtime code to move it into a different location by introducing a low-level move function. When using the V1 trie layout, every value bigger than 32 bytes is put into the db separately. This means a low-level move function would only need to move the hash of the runtime code from :pending_code to :code.


RFC-0124: Extrinsic version 5

Start Date: 18 October 2024
Description: Definition and specification of version 5 extrinsics
Authors: George Pisaltu

Summary

This RFC proposes the definition of version 5 extrinsics along with changes to the specification and encoding from version 4.

Motivation

RFC84 introduced the specification of General transactions, a new type of extrinsic besides the Signed and Unsigned variants available previously in version 4. Additionally, RFC99 introduced versioning of transaction extensions through an extra byte in the extrinsic encoding. Both of these changes require an extrinsic format version bump as both the semantics around extensions as well as the actual encoding of extrinsics need to change to accommodate these new features.

Stakeholders

  • Runtime users
  • Runtime devs
  • Wallet devs

Explanation

Changes to extrinsic authorization

The introduction of General transactions allows the authorization of any and all origins through extensions. This means that, with the appropriate extension, General transactions can replicate the same behavior as present-day v4 Signed transactions. Specifically for Polkadot chains, an example implementation for such an extension is VerifySignature, introduced in the Transaction Extension PR3685. Other extensions can be inserted into the extension pipeline to authorize different custom origins. Therefore, a Signed extrinsic variant is redundant with a General one strictly in terms of user functionality and could eventually be deprecated and removed.

Encoding format for version 5

As with version 4, the encoded extrinsic v5 is a SCALE encoded vector of bytes (u8), therefore starting with the encoded length of the following bytes in compact format. The leading byte after the length determines the version and type of extrinsic, as specified by RFC84. For reasons mentioned above, this RFC removes the Signed variant for v5 extrinsics.

For Bare extrinsics, the following bytes will just be the encoded call and nothing else.

For General transactions, as stated in RFC99, an extension version byte must be added to the extrinsic format. This byte should allow runtimes to expose more than one set of extensions which can be used for a transaction. As far as the v5 extrinsic encoding is concerned, this extension byte should be encoded immediately after the leading encoding byte. The extension version byte should be included in payloads to be signed by all extensions configured by runtime devs to ensure a user's extension version choice cannot be altered by third parties.

After the extension version byte, the extensions will be encoded next, followed by the call itself.

A quick visualization of the encoding:

  • Bare extrinsics: (extrinsic_encoded_len, 0b0000_0101, call)
  • General transactions: (extrinsic_encoded_len, 0b0100_0101, extension_version_byte, extensions, call)
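As a small illustration of the leading byte in the examples above (the split into a 2-bit extrinsic type and a 6-bit format version follows RFC84; the helper itself is hypothetical):

// Hypothetical helper: split the leading extrinsic byte, matching the examples
// above (0b0000_0101 = Bare v5, 0b0100_0101 = General v5).
fn split_leading_byte(byte: u8) -> (u8, u8) {
    let version = byte & 0b0011_1111; // low 6 bits: extrinsic format version (5)
    let kind = byte >> 6;             // high 2 bits: 0b00 = Bare, 0b01 = General
    (kind, version)
}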

Signatures on Polkadot in General transactions

In order to run a transaction with a signed origin in extrinsic version 5, a user must create the transaction with an instance of at least one extension responsible for authorizing Signed origins with a provided signature.

As stated before, PR3685 comes with a Transaction Extension which replicates the current Signed transactions in v5 extrinsics, namely VerifySignature. I will use this extension as an example of how to replicate current Signed transaction functionality in the new v5 extrinsic format, though the runtime logic is not constrained to this particular implementation.

This extension leverages the new inherited implication functionality introduced in TransactionExtension and creates a payload to be signed using the data of all extensions after itself in the extension pipeline. This extension can be configured to accept a MultiSignature, which makes it compatible with all signature types currently used in Polkadot.

In the context of using an extension such as VerifySignature, for example, to replicate current Signed transaction functionality, the steps to generate the payload to be signed would be:

  1. The extension version byte, call, extension and extension implicit should be encoded (by "extension" and its implicit we mean only the data associated with extensions that follow this one in the composite extension type);
  2. The result of the encoding should then be hashed using the BLAKE2_256 hasher;
  3. The result of the hash should then be signed with the signature type specified in the extension definition.
#![allow(unused)]
fn main() {
// Step 1: encode the bytes
let encoded = (extension_version_byte, call, transaction_extension, transaction_extension_implicit).encode();
// Step 2: hash them
let payload = blake2_256(&encoded[..]);
// Step 3: sign the payload
let signature = keyring.sign(&payload[..]);
}

Summary of changes in version 5

In order to minimize the number of changes to the extrinsic format version and also to help all consumers downstream in the transition period between these extrinsic versions, we should:

  • Remove the Signed variant starting with v5 extrinsics
  • Add the General variant starting with v5 extrinsics
  • Enable runtimes to support both v4 and v5 extrinsics

Drawbacks

The metadata will have to accommodate two distinct extrinsic format versions at a given point in time in order to provide the new functionality in a non-breaking way for users and tooling.

Although having to support multiple extrinsic versions in metadata involves extra work, the change is ultimately an improvement to metadata and the extra functionality may be useful in other future scenarios.

Testing, Security, and Privacy

There is no impact on testing, security or privacy.

Performance, Ergonomics, and Compatibility

This change makes the authorization through signatures configurable by runtime devs in version 5 extrinsics, as opposed to version 4 where the signing payload algorithm and signatures were hardcoded. This moves the responsibility of ensuring proper authentication through TransactionExtension to the runtime devs, but a sensible default which closely resembles the present day behavior will be provided in VerifySignature.

Performance

There is no performance impact.

Ergonomics

Tooling will have to adapt to be able to tell which authorization scheme is used by a particular transaction by decoding the extension and checking which particular TransactionExtension in the pipeline is enabled to do the origin authorization. Previously, this was done by simply checking whether the transaction is signed or unsigned, as there was only one method of authentication.

Compatibility

As long as extrinsic version 4 is still exposed in the metadata when version 5 is introduced, the changes will not break existing infrastructure. This should give enough time for tooling to support version 5 and to remove version 4 in the future.

Prior Art and References

This is a result of the work in Extrinsic Horizon and RFC99.

Unresolved Questions

None.

Following this change, extrinsic version 5 will be introduced as part of the Extrinsic Horizon effort, which will shape future work.


RFC-0125: XCM Asset Metadata

Start Date: 22 Oct 2024
Description: XCM Asset Metadata definition and a way of communicating it via XCM
Authors: Daniel Shiposha

Summary

This RFC proposes a metadata format for XCM-identifiable assets (i.e., for fungible/non-fungible collections and non-fungible tokens) and a set of instructions to communicate it across chains.

Motivation

Currently, there is no way to communicate metadata of an asset (or an asset instance) via XCM.

The ability to query and modify the metadata is useful for two kinds of entities:

  • Asset collections (both fungible and nonfungible).

    Any collection has some metadata, such as the name of the collection. The standard way of communicating metadata could help with registering foreign assets within a consensus system. Therefore, this RFC could complement or supersede the RFC for initializing fully-backed derivatives (note that this RFC is related to the old XCM RFC process; it's not the Fellowship RFC and hasn't been migrated yet).

  • NFTs (i.e., asset instances).

    The metadata is the crucial aspect of any nonfungible object since metadata assigns meaning to such an object. The metadata for NFTs is just as important as the notion of "amount" for fungibles (there is no sense in fungibles if they have no amount).

    An NFT is always a representation of some object. The metadata describes the object represented by the NFT.

    NFTs can be transferred to another chain via XCM. However, there are limitations due to the inability to communicate its metadata:

    1. Teleports are inherently impossible because they imply the complete transfer of an NFT, including its metadata, which can't be done via XCM now.
    2. Reserve-based transfers currently have limited use-case scenarios if the reserve chain provides a way of modifying metadata for its users (usually, the token's owner has privileged rights to modify a specific metadata subset). When a user transfers an NFT using this model to another chain, the NFT owner-related metadata can't be updated anymore because another chain's sovereign account owns the original token, and another chain cannot modify the metadata. However, if it were possible to update NFT metadata in the standard XCM way, another chain could offer additional metadata-related logic. For instance, it could provide a facade logic to metadata modification (i.e., provide permission-based modification authorization, new value format check, etc.).

Besides metadata modification, the ability to read it is also valuable. On-chain logic can interpret the NFT metadata, i.e., the metadata could have not only the media meaning but also a utility function within a consensus system. Currently, such a way of using NFT metadata is possible only within one consensus system. This RFC proposes making it possible between different systems via XCM so different chains can fetch and analyze the asset metadata from other chains.

Stakeholders

Runtime users, Runtime devs, Cross-chain dApps, Wallets.

Explanation

The Asset Metadata is information bound to an asset class (fungible or NFT collection) or an asset instance (an NFT). The Asset Metadata could be represented differently on different chains (or in other consensus entities). However, to communicate metadata between consensus entities via XCM, we need a general format so that any consensus entity can make sense of such information.

We can name this format "XCM Asset Metadata".

This RFC proposes:

  1. Using key-value pairs as XCM Asset Metadata since it is a general concept useful for both structured and unstructured data. Both key and value can be raw bytes with interpretation up to the communicating entities.

    The XCM Asset Metadata should be represented as a map whose SCALE encoding is equivalent to that of a BTreeMap.

    Let's call the type of the XCM Asset Metadata map MetadataMap.

  2. Communicating only the demanded part of the metadata, not the whole metadata.

    • A consensus entity should be able to query the values of interested keys to read the metadata. To specify the keys to read, we need a set-like type. Let's call that type MetadataKeys and make its SCALE encoding equivalent to that of a BTreeSet (a sketch of both auxiliary types follows this list).

    • A consensus entity should be able to write the values for specified keys.

  3. New XCM instructions to communicate the metadata.
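A sketch of the two auxiliary types mentioned above, with raw-byte keys and values (whether they should be bounded is left as an unresolved question below):

// Sketch only: keys and values are opaque bytes, interpreted by the communicating
// consensus entities.
use std::collections::{BTreeMap, BTreeSet};

type MetadataKey = Vec<u8>;
type MetadataValue = Vec<u8>;

/// SCALE-encodes the same way as a `BTreeMap<Vec<u8>, Vec<u8>>`.
type MetadataMap = BTreeMap<MetadataKey, MetadataValue>;

/// SCALE-encodes the same way as a `BTreeSet<Vec<u8>>`.
type MetadataKeys = BTreeSet<MetadataKey>;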

New instructions

ReportMetadata

The ReportMetadata is a new instruction to query metadata information. It can be used to query metadata key list or to query values of interested keys.

This instruction allows querying the metadata of:

  • a collection (fungible or nonfungible)
  • an NFT

If an asset (or an asset instance) for which the query is made doesn't exist, the Response::Null should be reported via the existing QueryResponse instruction.

The ReportMetadata can be used without origin (i.e., following the ClearOrigin instruction) since it only reads state.

Safety: The reporter origin should be trusted to hold the true metadata. If the reserve-based model is considered, the asset's reserve location must be viewed as the only source of truth about the metadata.

The use case for this instruction is when the metadata information of a foreign asset (or asset instance) is used in the logic of a consensus entity that requested it.

#![allow(unused)]
fn main() {
/// An instruction to query metadata of an asset or an asset instance.
ReportMetadata {
    /// The ID of an asset (a collection, fungible or nonfungible).
    asset_id: AssetId,

    /// The ID of an asset instance.
    ///
    /// If the value is `Undefined`, the metadata of the collection is reported.
    instance: AssetInstance,

    /// See `MetadataQueryKind` below.
    query_kind: MetadataQueryKind,

    /// The usual field for Report<something> XCM instructions.
    ///
    /// Information regarding the query response.
    /// The `QueryResponseInfo` type is already defined in the XCM spec.
    response_info: QueryResponseInfo,
}
}

Where the MetadataQueryKind is:

#![allow(unused)]
fn main() {
enum MetadataQueryKind {
    /// Query metadata key list.
    KeyList,

    /// Query values of the specified keys.
    Values(MetadataKeys),
}
}

The ReportMetadata works in conjunction with the existing QueryResponse instruction. The Response type should be modified accordingly: we need to add a new AssetMetadata variant to it.

#![allow(unused)]
fn main() {
/// The struct used in the existing `QueryResponse` instruction.
pub enum Response {
    // ... snip, existing variants ...

    /// The metadata info.
    AssetMetadata {
        /// The ID of an asset (a collection, fungible or nonfungible).
        asset_id: AssetId,

        /// The ID of an asset instance.
        ///
        /// If the value is `Undefined`, the reported metadata is related to the collection, not a token.
        instance: AssetInstance,

        /// See `MetadataResponseData` below.
        data: MetadataResponseData,
    }
}

pub enum MetadataResponseData {
    /// The metadata key list to be reported
    /// in response to the `KeyList` metadata query kind.
    KeyList(MetadataKeys),

    /// The values of the keys that were specified in the
    /// `Values` variant of the metadata query kind.
    Values(MetadataMap),
}
}

ModifyMetadata

The ModifyMetadata is a new instruction to request a remote chain to modify the values of the specified keys.

This instruction can be used to update the metadata of a collection (fungible or nonfungible) or of an NFT.

The remote chain handles the modification request and may reject it based on its internal rules. The request can only be executed or rejected in its entirety. It must not be executed partially.

To execute the ModifyMetadata, an origin is required so that the handling logic can authorize the metadata modification request from a known source. Since this instruction requires an origin, the assets used to cover the execution fees must be transferred in a way that preserves the origin. For instance, one can use the approach described in RFC #122 if the handling chain configured aliasing rules accordingly.

An example use case of this instruction is to ask the asset's reserve location to modify the metadata, so that the original asset's metadata is updated according to the reserve location's rules.

#![allow(unused)]
fn main() {
ModifyMetadata {
    /// The ID of an asset (a collection, fungible or nonfungible).
    asset_id: AssetId,

    /// The ID of an asset instance.
    ///
    /// If the value is `Undefined`, the modification request targets the collection, not a token.
    instance: AssetInstance,

    /// The map contains the keys mapped to the requested new values.
    modification: MetadataMap,
}
}

Repurposing AssetInstance::Undefined

As the new instructions show, this RFC reframes the purpose of the Undefined variant of the AssetInstance enum. This RFC proposes to use the Undefined variant of a collection identified by an AssetId as a synonym of the collection itself. I.e., an asset Asset { id: <AssetId>, fun: NonFungible(AssetInstance::Undefined) } is considered an NFT representing the collection itself.

As a singleton non-fungible instance is barely distinguishable from its collection, this convention shouldn't cause any problems.

Thus, the AssetInstance docs must be updated accordingly in the implementations.

Drawbacks

Regarding ergonomics, no drawbacks were noticed.

As for the user experience, it could discover new cross-chain use cases involving asset collections and NFTs, indicating a positive impact.

There are no security concerns except for the ReportMetadata instruction, which implies that the source of the information must be trusted.

In terms of performance and privacy, there will be no changes.

Testing, Security, and Privacy

The implementations must honor the contract for the new instructions. Namely, if the instance field has the value of AssetInstance::Undefined, the metadata must relate to the asset collection but not to a non-fungible token inside it.

Performance, Ergonomics, and Compatibility

Performance

No significant impact.

Ergonomics

Introducing a standard metadata format and a way of communicating it is a valuable addition to the XCM format that potentially increases cross-chain interoperability without the need to form ad-hoc chain-to-chain integrations via Transact.

Compatibility

This RFC proposes new functionality, so there are no compatibility issues.

Prior Art and References

RFC: XCM Asset Metadata

Unresolved Questions

Should the MetadataMap and MetadataKeys be bounded, or is it enough to rely on the fact that every XCM message is itself bounded?

The original RFC draft contained additional metadata instructions. Though they could be useful, they're clearly outside the basic logic. So, this RFC version omits them to make the metadata discussion more focused on the core things. Nonetheless, there is hope that metadata approval instructions might be useful in the future, so they are mentioned here.

You can read about the details in the original draft.


RFC-0126: Introduce XCQ (Cross Consensus Query)

Start Date: Oct 25 2024
Description: Introduce XCQ (Cross Consensus Query)
Authors: Bryan Chen, Jiyuan Zheng

Summary

This proposal introduces XCQ (Cross Consensus Query), which aims to serve as an intermediary layer between different chain runtime implementations and tools/UIs, to provide a unified interface for cross-chain queries. XCQ abstracts away concrete implementations across chains and supports custom query computations.

Use cases benefiting from XCQ include:

  • XCM bridge UI:
    • Query asset balances
    • Query XCM weight and fee from hop and dest chains
  • Wallets:
    • Query asset balances
    • Query weights and fees for operations across chains
  • Universal dApp that supports all the parachains:
    • Perform Feature discovery
    • Query pallet-specific features
    • Construct extrinsics by querying pallet index, call index, etc

Motivation

In Substrate, runtime APIs facilitate off-chain clients in reading the state of the consensus system. However, different chains may expose different APIs for a similar query or have varying data types, such as applying custom transformations to the underlying data, or differing AccountId types. This diversity also extends to the client side, which may require custom computations over runtime APIs in various use cases. Therefore, tools and UI developers often access storage directly and reimplement custom computations to convert data into user-friendly representations, leading to duplicated code between Rust runtime logic and UI JS/TS logic. This duplication increases workload and the potential for bugs.

Therefore, a system is needed to serve as an intermediary layer between concrete chain runtime implementations and tools/UIs, to provide a unified interface for cross-chain queries.

Stakeholders

  • Runtime Developers
  • Tools/UI Developers

Explanation

The overall query pattern of XCQ consists of three components:

  • Runtime: View-functions across different pallets are amalgamated through an extension-based system.
  • XCQ query: Custom computations over view-function results are encapsulated via PolkaVM programs.
  • XCQ query arguments: Query arguments like accounts to be queried are also passed together with the query program.

XCQ Runtime API

The runtime API for off-chain query usage includes two methods:

  • execute_query: Executes the query and returns the result. It takes the query, input, and weight limit as arguments. The query is the query program in PolkaVM program binary format. The input is the SCALE-encoded query arguments. The weight limit is the maximum weight allowed for the query execution.
  • metadata: Returns metadata of the supported extensions (introduced in a later section) and their methods, serving as feature-discovery functionality.

Example XCQ Runtime API:

#![allow(unused)]
fn main() {
decl_runtime_apis! {
    pub trait XcqApi {
        fn execute_query(query: Vec<u8>, input: Vec<u8>, weight_limit: u64) -> XcqResult;
        fn metadata() -> Vec<u8>;
    }
}
type XcqResult =  Result<XcqResponse, XcqError>;
type XcqResponse = Vec<u8>;
enum XcqError {
    Custom(String),
}
}

Example metadata:

#![allow(unused)]
fn main() {
Metadata {
    extensions: vec![
        ExtensionMetadata {
            name: "ExtensionCore",
            methods: vec![MethodMetadata {
                name: "has_extension",
                inputs: vec![MethodParamMetadata {
                    name: "id",
                    ty: XcqType::Primitive(PrimitiveType::U64)
                }],
                output: XcqType::Primitive(PrimitiveType::Bool)
            }]
        },
        ExtensionMetadata {
            name: "ExtensionFungibles",
            methods: vec![
                MethodMetadata {
                    name: "total_supply",
                    inputs: vec![MethodParamMetadata {
                        name: "asset",
                        ty: XcqType::Primitive(PrimitiveType::U32)
                    }],
                    output: XcqType::Primitive(PrimitiveType::U64)
                },
                MethodMetadata {
                    name: "balance",
                    inputs: vec![
                        MethodParamMetadata {
                            name: "asset",
                            ty: XcqType::Primitive(PrimitiveType::U32)
                        },
                        MethodParamMetadata {
                            name: "who",
                            ty: XcqType::Primitive(PrimitiveType::H256)
                        }
                    ],
                    output: XcqType::Primitive(PrimitiveType::U64)
                }
            ]
        }
    ]
}
}

Note: ty is represented by a meta-type system called xcq-types

xcq-types

xcq-types is a meta-type system similar to scale-info but simpler. It enables different chains with different type definitions to work via a common representation. Front-end code constructs call data for XCQ programs according to the metadata provided by different chains.
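As a rough sketch, such a meta-type system could look like the following (the variants are assumptions for illustration, consistent with the XcqType::Primitive(...) values used in the metadata example above):

// Illustrative only: a minimal meta-type vocabulary in the spirit of xcq-types.
pub enum PrimitiveType {
    Bool,
    U8,
    U32,
    U64,
    U128,
    H256,
}

pub enum XcqType {
    /// A primitive value, as used in the metadata example above.
    Primitive(PrimitiveType),
    /// A fixed sequence of possibly different types.
    Tuple(Vec<XcqType>),
    /// A homogeneous, variable-length sequence.
    Sequence(Box<XcqType>),
    /// A named-field composite.
    Struct(Vec<(String, XcqType)>),
}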

XCQ Executor

An XCQ executor is a runtime module that executes XCQ queries. It has a core method execute that takes a PolkaVM program binary, the name of an exported function in the PolkaVM program, input arguments, and the weight limit that the PolkaVM program can consume.

#![allow(unused)]
fn main() {
pub fn execute(
    &mut self,
    raw_blob: &[u8],
    method: &str,
    input: &[u8],
    weight_limit: u64,
) -> Result<Vec<u8>, XcqExecutorError> {...}
}

XCQ Extension

An extension-based design is essential for several reasons:

  • Different chains may have different data types for semantically similar queries, making it challenging to standardize function calls across them. An extension-based design with optional associated types allows these diverse data types to be specified and utilized effectively.
  • Function calls distributed across various pallets can be amalgamated into a single extension, simplifying the development process and ensuring a more cohesive and maintainable codebase.
  • New functionalities can be added without upgrading the core part of the XCQ.
  • Ensure the core part is in a minimal scope.

Essential components of an XCQ extension system include:

  • A hash-based extension ID generation mechanism for addressing and versioning. The hash value is derived from the extension name and its method sets. Any update to an extension is treated as a new extension.

  • decl_extension macro: Defines an extension as a Rust trait with optional associated types.

Example usage:

#![allow(unused)]
fn main() {
use xcq_extension::decl_extension;

pub trait Config {
    type AssetId: Codec;
    type AccountId: Codec;
    type Balance: Codec;
}
decl_extension! {
    pub trait ExtensionFungibles {
        type Config: Config;
        fn total_supply(asset: <Self::Config as Config>::AssetId) -> <Self::Config as Config>::Balance;
        fn balance(asset: <Self::Config as Config>::AssetId, who: <Self::Config as Config>::AccountId) -> <Self::Config as Config>::Balance;
    }
}
}
  • impl_extensions macro: Generates extension implementations and extension-level metadata.

Example Usage:

#![allow(unused)]
fn main() {
// ExtensionImpl is an aggregate struct to impl different extensions
struct ExtensionImpl;
impl extension_fungibles::Config for ExtensionImpl {
    type AssetId = u32;
    type AccountId = [u8; 32];
    type Balance = u64;
}
impl_extensions! {
    impl extension_core::ExtensionCore for ExtensionImpl {
        type Config = ExtensionImpl;
        fn has_extension(id: <Self::Config as extension_core::Config>::ExtensionId) -> bool {
            matches!(id, 0 | 1)
        }
    }

    impl extension_fungibles::ExtensionFungibles for ExtensionImpl {
        type Config = ExtensionImpl;
        #[allow(unused_variables)]
        fn total_supply(asset: <Self::Config as extension_fungibles::Config>::AssetId) -> <Self::Config as extension_fungibles::Config>::Balance {
            200
        }
        #[allow(unused_variables)]
        fn balance(asset: <Self::Config as extension_fungibles::Config>::AssetId, who: <Self::Config as extension_fungibles::Config>::AccountId) -> <Self::Config as extension_fungibles::Config>::Balance {
            100
        }
    }
}
}
  • ExtensionExecutor: Connects extension implementations and xcq-executor. All methods of all extensions that a chain supports are amalgamated into a single host_call entry. Then this entry is registered as a typed function entry in PolkaVM Linker within the xcq-executor. Given the extension ID and call data encoded in SCALE format, call requests from the guest XCQ program are dispatched to corresponding extensions:
#![allow(unused)]
fn main() {
linker
    .define_typed(
        "host_call",
        move |caller: Caller<'_, Self::UserData>,
              extension_id: u64,
              call_ptr: u32,
              call_len: u32|
              -> Result<u64, ExtensionError> {
                  ...
              });
}
  • PermissionController: Filters call requests from guest XCQ programs, letting host chains disable some queries based on the invoking source (a sample policy follows the code below).
#![allow(unused)]
fn main() {
pub trait PermissionController {
    fn is_allowed(extension_id: ExtensionIdTy, call: &[u8], source: InvokeSource) -> bool;
}
#[derive(Copy, Clone)]
pub enum InvokeSource {
    RuntimeAPI,
    XCM,
    Extrinsic,
    Runtime,
}
}
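
For illustration, a host chain might implement the trait above to reject queries invoked via XCM while allowing all other sources; AllowAllButXcm is a hypothetical policy type, not part of the proposal.

// Hypothetical policy: permit every source except XCM-invoked queries.
struct AllowAllButXcm;

impl PermissionController for AllowAllButXcm {
    fn is_allowed(_extension_id: ExtensionIdTy, _call: &[u8], source: InvokeSource) -> bool {
        // Reject only requests that arrive via XCM.
        !matches!(source, InvokeSource::XCM)
    }
}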

XCQ Program Structure

An XCQ program is structured as a PolkaVM program with the following key components:

  • Imported Functions:

    • host_call: Dispatches call requests to the XCQ Extension Executor.

      #![allow(unused)]
      fn main() {
      #[polkavm_derive::polkavm_import]
      extern "C" {
          fn host_call(extension_id: u64, call_ptr: u32, call_len: u32) -> u64;
      }
      }

      Results are SCALE-encoded bytes, with the pointer address (lower 32 bits) and length (upper 32 bits) packed into a u64 (see the unpacking sketch after this list).

    • return_ty: Returns the type of the function call result.

      #![allow(unused)]
      fn main() {
      #[polkavm_derive::polkavm_import]
      extern "C" {
          fn return_ty(extension_id: u64, call_index: u32) -> u64;
      }
      }

      Results are SCALE-encoded bytes, with the pointer address and length packed similarly to host_call.

  • Exported Functions:

    • main: The entry point of the XCQ program. It performs type checking, passes arguments and results, and executes the query.
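
As a sketch of the result-passing convention used by host_call and return_ty above, a guest program would unpack the returned u64 roughly as follows before reading the SCALE-encoded bytes from its memory:

// Split the packed u64: pointer in the lower 32 bits, length in the upper 32 bits.
fn unpack_ptr_len(packed: u64) -> (u32, u32) {
    let ptr = (packed & 0xffff_ffff) as u32; // lower 32 bits: pointer into guest memory
    let len = (packed >> 32) as u32;         // upper 32 bits: byte length of the result
    (ptr, len)
}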

XCQ Program Execution Flow

The interaction between an XCQ program and the XCQ Extension Executor follows these steps:

  1. Program Loading: The Executor loads the PolkaVM program binary.

  2. Environment Setup: The Executor configures the PolkaVM environment, registering host functions like host_call and return_ty.

  3. Main Function Execution: The Executor calls the program's main function, passing serialized query arguments.

  4. Program Execution:

    1. Type Checking: The program uses the return_ty function to ensure compatibility with supported chain extensions.
    2. Query Execution: The program executes the query using host_call and performs custom computations.
    3. Result Serialization: The program serializes the result, writes it to shared memory, and returns the pointer and length to the executor.
  5. Result Retrieval: The Executor reads the result from shared memory and returns it to the caller.

XCM integration

The integration of XCQ into XCM is achieved by adding a new instruction to XCM, as well as a new variant of the Response type in the QueryResponse message:

  • A new ReportQuery instruction
#![allow(unused)]
fn main() {
ReportQuery {
  query: SizeLimitedXcq,
  weight_limit: Option<Weight>,
  info: QueryResponseInfo,
}
}

Report the results of an XCQ query to a given destination. After the query executes, a QueryResponse message of type XcqResult is sent to the described destination.

Operands:

  • query: SizeLimitedXcq: The XCQ query, subject to a size limit (2 MB), consisting of:

    • program: Vec<u8>: A pre-built PVM program binary.
    • input: Vec<u8>: The arguments of the program.
  • weight_limit: WeightLimit: The maximum weight that the query should take. WeightLimit is an enum that can specify either Limit(Weight) or Unlimited.

  • info: QueryResponseInfo: Information for making the response.

    • destination: Location: The destination to which the query response message should be sent.
    • query_id: Compact<QueryId>: The query_id field of the QueryResponse message
    • max_weight: Weight: The max_weight field of the QueryResponse message
  • Add a new variant to the Response type in QueryResponse:

    • XcqResult = 6 (XcqResult): XcqResult is an enum:

      • Ok = 0 (Vec<u8>): XCQ executes successfully with a SCALE-encoded response.
      • Err = 1 (ErrorCode): XCQ fails with an error code. ErrorCode is an enum:

        • ExceedMaxWeight = 0
        • MemoryAllocationError = 1
        • MemoryAccessError = 2
        • CallError = 3
        • OtherPVMError = 4
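
A Rust sketch of these proposed types (variant indices as listed above); this is illustrative, not an existing API:

// Proposed payload of the new Response::XcqResult variant (index 6 of Response).
pub enum XcqResult {
    Ok(Vec<u8>),    // = 0: SCALE-encoded query response
    Err(ErrorCode), // = 1: failure with an error code
}

// Error codes returned when an XCQ query fails.
pub enum ErrorCode {
    ExceedMaxWeight = 0,
    MemoryAllocationError = 1,
    MemoryAccessError = 2,
    CallError = 3,
    OtherPVMError = 4,
}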

Errors

  • BadOrigin
  • DestinationUnsupported

Drawbacks

Performance issues

  • XCQ Query Program Size: The size of XCQ query programs should be optimized to ensure efficient storage and transmission via XCMP/HRMP. Some strategies to address this issue include:
    • Exploring modular program structures that allow for separate storage and transmission of core logic and supporting elements. PolkaVM supports splitting the program into multiple modules.
    • Establishing guidelines for optimizing dynamic memory usage within query programs.

User experience issues

  • Debugging: Currently, there are no full-fledged debuggers for PolkaVM programs. The only debugging approach is to set the PolkaVM backend to interpreter mode and log operations at the assembly level, which is too low-level to debug efficiently.
  • Gas computation: According to this issue, the gas cost model of PolkaVM is not yet accurate.

Testing, Security, and Privacy

  • Testing:

    • A comprehensive test suite should be developed to cover various scenarios:
      • Positive test cases:
        • Basic queries with various extensions, data types, return values, custom computations, etc.
        • Accurate conversion between given weight limit and the gas limit of PolkaVM
      • Negative test cases:
        • Queries exceeding weight limits
        • Invoking queries from unauthorized sources
      • Edge cases:
        • Queries with minimal or maximum allowed input sizes
    • Integration tests to ensure proper interaction with off-chain wallets/UI and on-chain XCM, including the use cases mentioned in the Motivation section.
  • Security:

    • The XCQ system must enforce a strict read-only policy for all query operations. A mechanism should be implemented to prevent any state-changing operations within XCQ queries: for example, running the query inside frame_support::storage::with_transaction and always rolling back ensures that storage is never changed (see the sketch after this list).
    • Clear guidelines and best practices should be provided for parachain developers to ensure secure implementation.
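
A minimal sketch of the rollback approach mentioned above, assuming FRAME's transactional storage API; run_query here is a hypothetical closure performing the actual XCQ execution:

use frame_support::storage::with_transaction;
use sp_runtime::{DispatchError, TransactionOutcome};

// Execute a query inside a storage transaction and always roll back, so any
// storage writes attempted by the guest program are discarded.
fn execute_read_only(
    run_query: impl FnOnce() -> Result<Vec<u8>, DispatchError>,
) -> Result<Vec<u8>, DispatchError> {
    with_transaction(|| {
        let result = run_query();
        // Return the result to the caller, but never commit storage changes.
        TransactionOutcome::Rollback(result)
    })
}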

Performance, Ergonomics, and Compatibility

Performance

This is new functionality that does not modify existing implementations.

Ergonomics

The proposal benefits wallet and dApp developers. They no longer need to examine every concrete implementation to support conceptually similar operations across different chains. Additionally, they gain a more modular development experience by encapsulating custom computations over the exposed APIs in PolkaVM programs.

Compatibility

The proposal defines new APIs and does not break compatibility with existing interfaces.

Prior Art and References

There are several discussions related to the proposal, including:

Unresolved Questions


RFC-0130: JAM Validity + DA Services for Ethereum Optimistic Rollups and Ethereum

Start Date26 October 2024 (Updated: November 5, 2024)
DescriptionJAM Service for Validating Optimistic Rollups and Ethereum
AuthorsSourabh Niyogi
AbstractJAM’s mission as a rollup host can be extended from validating Polkadot rollups to validating Ethereum optimistic rollups (ORUs) as well as Ethereum itself. We outline a design for a JAM Service to validate ORUs + Ethereum using a similar approach to the Polkadot rollups anticipated in the CoreChains service. The design involves verifying state witnesses against account balances, nonces, code, and storage, and then using this state to re-execute block transactions, all within the service's refine operation; then, these validated ETH L1+L2 blocks are stored on-chain in the service's accumulate operation. This JAM service is readily implementable with already available tools that fit seamlessly into JAM’s refine-accumulate service architecture: (a) Geth’s Consensus API, which outputs state witnesses for Ethereum and its dominant ORU ecosystems (OP Stack and Arbitrum Nitro), and (b) the Rust-based EVM interpreter revm, which should be compilable to PolkaVM with polkatool. This allows Ethereum+ETH L2 ORU users to benefit from JAM’s high computational and storage throughput. The JAM service enables rollup operators to choose JAM Services over other rollup service providers or to enhance their use of Ethereum for improved Web3 user experience, ultimately to provide validity proofs faster. Furthermore, Ethereum itself can be validated by the same JAM service, providing additional verification for ORU L2 commitments posted to Ethereum across all Ethereum forks.

Background

The Gray Paper suggests a design for applying the same audit protocol from Polkadot's parachain validation service to ETH rollups: "Smart-contract state may be held in a coherent format on the JAM chain so long as any updates are made through the 15kb/core/sec work results, which would need to contain only the hashes of the altered contracts’ state roots." This proposal concretely outlines a JAM service to do this for the two leading non-Polkadot optimistic rollup platforms, OP Stack and ArbOS, as well as, ambitiously, Ethereum itself.

Optimistic rollups use centralized sequencers and have no forks, creating an illusion of fast finality while actually relying on delayed fraud proofs. Optimistic rollups are termed "optimistic" because they assume transactions are valid by default, requiring fraud proofs on Ethereum L1 if a dispute arises. Currently, ORUs store L2 data on ETH L1, using EIP-4844's blob transactions or similar DA alternatives, just long enough to allow for fraud proof submission. This approach, however, incurs a cost: a 7-day exit window to accommodate fraud proofs. JAM Service can reduce the dependence on this long exit window by validating L2 optimistic rollups as well as the L1.

Motivation

JAM is intended to host rollups rather than serve end users directly.

A JAM service to validate optimistic rollups and Ethereum will expand JAM's service scope to non-Polkadot rollups and enhance their appeal with JAM's high-throughput capabilities for both DA and computational resources.

Increasing the total addressable market for rollups to include non-Polkadot rollups will increase CoreTime demand, making JAM attractive to both existing and new optimistic rollups with higher cross-validation.

A JAM Service that can certify ORU state transitions as valid and available can deliver ecosystem participants (e.g. CEXs, bridge operators, stablecoin issuers) potentially improved, marketable user experiences. With popular CEXes (Coinbase, Kraken) adopting OP Stack, any improvement is highly visible to retail users, making ETH ORUs "Secured by Polkadot JAM Chain".

Requirements

  1. Securing optimistic rollups with a JAM Service should be practical and require minimal changes by the ORU operator.
  2. Securing Polkadot rollups with a JAM Service should not be affected.

Stakeholders

  1. Optimistic Rollup Operators seeking low-latency high throughput validation services and very high throughput DA
  2. Web3 developers wanting to create applications on optimistic rollups secured by JAM
  3. DOT Holders aiming to increase CoreTime demand from non-Polkadot rollups

JAM Service Design for Validating ETH L2 ORUs + Ethereum

Overview

Ethereum L1 and ETH L2 Rollups produce a sequence of blocks ${\bf B}$ with headers ${\bf H}$. The header ${\bf H}$ contains a parent header hash ${\bf H}_p$ and a state root ${\bf H}_r$, which represents the global state after applying all block transactions. The transactions trie root is unnecessary; validating the state trie root alone is sufficient for validating rollups.

This JAM Service strategy, as hinted in the JAM Gray Paper, aggregates state witnesses in a chain of work packages. The refine operation takes these state witnesses of an optimistic rollup's blocks and verifies them against the prior block's state root ${\bf H}_r$.

The rollup operator submits headers, blocks and state witnesses in a sequence of work packages $..., p_{n-1}, p_{n}, p_{n+1}, ...$ corresponding to the rollup's blocks, which typically, but not necessarily, form a chain. The strategy advocated is to preemptively validate all blocks in possible forks under the assumption that cores (and CoreTime) are plentiful, rather than to seek a canonical finalized chain head. Instead of relying on a promise that each state root is correct unless proven otherwise with ORU fraud proofs, JAM validates each block using its state witnesses and validates the posterior state root ${\bf H}_r$ by re-executing the block's transactions.

In JAM, the 2 key operations are refine and accumulate:

  1. The refine operation happens "off-chain" with a small subset of validators (2 or 3) via an almost entirely stateless computation based on the contents of the work package:

    • (a) validating state witness proofs $\Pi$ against the prior state root ${\bf H}_r$:
      • account balances
      • account nonces
      • contract code
      • storage values
    • (b) given the block's transactions (and potential incoming deposits from L1 and withdrawals to L1), applying each transaction to generate a new posterior state root ${\bf H}_r$, which, if it matches that contained in ${\bf H}$, is a proof of validity.

    The Consensus API of geth, used in the popular optimistic rollup platforms OP Stack and Arbitrum Nitro, is well-suited to generate the state witnesses and enables 1(a).

    An EVM interpreter (revm) compiled to PolkaVM using polkatool enables 1(b).

    The results of refine are inputs to accumulate -- representing a set of valid blocks.

  2. The accumulate operation simply records which block and header hashes have been validated and solicits the block and header for storage in JAM DA.

The primary goal is to have L2 ORU and L1 Ethereum blocks fully available in JAM DA and modeled as "valid", even if those blocks are tentative / non-canonical. This enables JAM to certify blocks on any fork as valid as quickly as possible, finalized on the JAM Chain.

A secondary goal is to establish finality against L1 using the state roots posted from L2 on L1, but we put this goal aside for now as Ethereum finality (12.8 minutes) is generally slower than JAM finality (1-2 minutes). This secondary goal could be supported by Attestations from the Beacon chain: given ordered accumulation and these attestations, JAM could finalize the entirety of Ethereum's rollups handily.

Refine:

Key Input: Work Packages

Using the newPayloadWithWitnessV4 method of the Consensus API added in September 2024, the comprehensive state witnesses introduced in this commit enable JAM to validate blocks from full nodes of Ethereum, OP Stack and ArbOS in a stateless way.

Since OP Stack and ArbOS both build their execution engines on geth, the Consensus API of geth can be used to get a Stateless witness during tracing within the core/tracing package of StateDB. At a high level, this Consensus API has a set of state change hooks (OnBalanceChange, OnNonceChange, OnCodeChange, OnStorageChange) that are called during a replay of block execution, culminating in InsertBlockWithoutSetHead returning a set of state witness proofs. The end result is a set of state witnesses / proofs $\Pi$ of storage values that can be verified against the prior block's state root ${\bf H}_r$.

Then, given a block ${\bf B}$ and these proofs $\Pi$ (verified against the prior state root ${\bf H}_r$), a complete state transition to validate the new posterior state root contained within ${\bf B}$ can be thoroughly conducted. A well-tested and highly stable Rust EVM interpreter, revm, should be compilable to PolkaVM with polkatool (see Building JAM Services in Rust). Then, using the provided verified state_witnesses, the revm "in-memory" database can be set up with AccountInfo nonce, balance, and code (see here), as well as storage items, conceptually like:

use std::collections::HashMap;
use revm::{Database, EVM, AccountInfo, U256, B160};

fn initialize_evm_with_state(evm: &mut EVM<impl Database>, state_witnesses: HashMap<B160, AccountInfo>) {
    // Set up each account's state in the EVM's database.
    for (address, account_info) in state_witnesses {
        // Insert storage if available (before the account info is moved below).
        for (storage_key, storage_value) in &account_info.storage {
            evm.database().insert_storage(address, *storage_key, *storage_value);
        }
        // Insert nonce/balance/code.
        evm.database().insert_account(address, account_info);
    }
}

With the prior state fully initialized in memory via initialize_evm_with_state, we run the EVM execution of the block's transactions. Each transaction interacts with the pre-loaded state witnesses in the database. Because revm is very well maintained and stable, it already passes the latest Ethereum State Transition tests for individual transactions, and given a block of transactions (or indeed a chain of blocks) it can compute the posterior state root ${\bf H}_r$. The block is valid if executing it in revm results in the same state root ${\bf H}_r$ as contained in the header ${\bf H}$ and in the block ${\bf B}$. If it does, the refine code outputs both hashes, which will be solicited on-chain in accumulate (and can be supplied by anyone, whether the ORU or some third party).
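
The following conceptual sketch summarizes this validity check; Block, Header, InMemoryState, apply_transaction and compute_state_root are hypothetical placeholders for the revm execution and trie-commitment steps, not actual revm APIs:

struct Header { state_root: [u8; 32] }
struct Block { header: Header, transactions: Vec<Vec<u8>> }
struct InMemoryState; // state pre-loaded from the verified witnesses

// Hypothetical: execute one encoded transaction against the in-memory state.
fn apply_transaction(_state: &mut InMemoryState, _tx: &[u8]) -> Result<(), ()> { todo!() }
// Hypothetical: recompute the Merkle Patricia Trie root over the in-memory state.
fn compute_state_root(_state: &InMemoryState) -> [u8; 32] { todo!() }

fn validate_block(state: &mut InMemoryState, block: &Block) -> bool {
    for tx in &block.transactions {
        if apply_transaction(state, tx).is_err() {
            return false; // a failing transaction invalidates the block
        }
    }
    // Valid only if the recomputed posterior state root matches the header's claim.
    compute_state_root(state) == block.header.state_root
}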

In JAM's refine code, work package content is as follows:

JAM Refine content:

  • Work Package $p_n \in \mathbb{P}$: Data submitted by the ORU operator for validation of $N$ blocks, not necessarily in a chain, potentially multiple blocks at different block heights.
  • Payload ${\bf y}$: Chain ID (e.g., 10, 42161, etc.), and the start and end block numbers expected in the extrinsics.
  • Extrinsics $\bar{{\bf x}}$: Header hash $H({\bf H})$, block hash $H({\bf B})$, header ${\bf H}$, block ${\bf B}$, and state witnesses $\Pi$ against the prior state root ${\bf H}_r$, for each block.
  • Work Items ${\bf w} \in \mathbb{I}$: $N$ prior state roots ${\bf H}_r$, one for each extrinsic ${\bf x} \in \bar{\bf x}$.
  • Work Result ${\bf r}$: Tuples $(i,t,H({\bf H}),H({\bf B}))$ of block number $i$, timestamp $t$, header hash and block hash.

Refine's operations are as follows:

  1. Authorize. Check for authorization ${\bf o}$.
  2. Verify State Witnesses. For each element ${\bf x} \equiv (H({\bf H}), H({\bf B}), {\bf H}, {\bf B}, \Pi)$ in extrinsics $\bar{{\bf x}}$ and the prior state root ${\bf H}_r$ in the work item:
    • Verify each state witness $\pi \in \Pi$ and initialize AccountInfo objects for all verified proofs $a_{b}, a_{n}, a_c, a_{s}(k,v)$ for balance, nonce and storage proofs
    • If any proof $\pi \in \Pi$ fails verification, skip to the next extrinsic, considering the block invalid.
    • If all proofs $\pi \in \Pi$ pass verification, proceed to next step.
  3. Apply Transactions. Initialize revm with the verified AccountInfo values and apply all block transactions ${\bf B}_T$, deriving the chain by checking that each header's parent hash ${\bf H}_p$ matches the previous extrinsic's header hash $H({\bf H})$, except for the first header, which is checked against the work item of step 2.
    • Check that the resulting posterior state root matches the state root ${\bf H}_r$ contained in ${\bf H}$.
  4. Output Work Results: Verified Proof of Validity. For each validated block, output a tuple $(i, t, H({\bf H}), H({\bf B}))$ of block number $i$, the block timestamp $t$, and the two hashes in the extrinsic ${\bf x} \in \bar{\bf x}$: $H({\bf H})$ and $H({\bf B})$.

The refine process uses no host functions (not even historical_lookup) as chain-hood of the blocks is not pursued within refine. As detailed in GP, work results in guaranteed work reports from the above refine are assured, audited, and judged following the ELVES protocol. JAM finalizes its audited blocks via GRANDPA voting and can be exported to other chains with BEEFY.

Optimistic rollups, including OP Stack and ArbOS, use Ethereum's Merkle Patricia Trie to commit to state data in each block. See this architecture diagram for a refresher. The state root ${\bf H}_r$ is a cryptographic commitment to the entire rollup state at that block height. The state witnesses from the rollup's API enable the ORU to prove that specific storage values exist against the prior state root.

Because refine uses an EVM interpreter compiled to PolkaVM, this is essentially a re-execution of the block. The state witnesses $\Pi$ enable refine to be stateless, consistent with JAM/ELVES design principles.

For simplicity, we consider 100% of state witnesses from a single block, with no consideration of the extremely likely possibility that the blocks form a chain with no forks. Given a chain of blocks, a more compressed approach would be to include only the $\pi \in \Pi$ that actually introduce new storage values in the chain of blocks. This would reduce the work package size as well as the number of proof verifications in refine. We set this obvious optimization aside for now.

Accumulate

The accumulate operation is an entirely synchronous on-chain operation performed by all JAM Validators (rather than in-core as with refine). Computation is carried out first by the JAM Block Author, and the newly authored block is then sent to all Validators.

In this JAM Service Design, the accumulate operation indexes the tuple from refine by block number and solicits both the block and header data. When provided (by the ORU, but potentially by anyone), this ensures that the ETH L1+L2 data necessary to validate ETH L1 and ORU L2s is fully available in JAM DA. At this time, we do not concern ourselves with chain-hood and finality, and instead model all L1 + L2 forks. However, we use a simple $f$ parameter to limit the on-chain storage to blocks within a certain time range, putatively 28 days.

On-chain Storage of Rollup State

For simplicity, each blockchain has a separate service with a unique chain_id (e.g. Ethereum 1, Optimism 10, Base 8453, Arbitrum 42161) and its own service storage. The service stores the following key components in its on-chain service storage:

  • all blocks and headers, which are solicited to be added to JAM DA via solicit and, after a 28-day window, removed from JAM DA via forget. Blocks and headers are held in preimage storage ${\bf a}_{\bf p}$; preimage availability in JAM DA is represented in lookup storage ${\bf a}_{\bf l}$ via the preimage hash of both the block hash $H({\bf B})$ and the header hash $H({\bf H})$.
  • an index of block numbers $i$ into tuples of timestamp, header hash and block hash $(t, H({\bf H}), H({\bf B}))$, held in storage ${\bf a}_s$, potentially more than one per block number
  • a simple $f$ parameter to support forget operations

The function of the above on-chain service storage is to represent both of the following:

  • the refine validation has successfully completed for a set of blocks and
  • the block and headers are available in JAM's DA

as well as bound storage costs to a reasonable level using $f$.

Key Process

JAM Accumulate content:

  • Work Results: Tuples $(i, t, H({\bf H}), H({\bf B}))$, see refine.
  1. For each block number $i$ in the tuple, perform the following operations:

    • solicit is called for both hashes in the work results. The ORU operator is expected to provide both preimages but any community member could do so.
    • read is called for block number $i$ to check if there are any previous blocks stored
    • if there aren't, initialize the value for $i$ to be $(H({\bf H}), H({\bf B}))$
    • if there are, append an additional two hashes $(H({\bf H}), H({\bf B}))$
    • write the updated value, which may be multiple pairs of hashes if there are forks for the chain.
  2. To bound storage growth to just the validated blocks that are within a certain window of around 28 days, we also:

    • read $f$ from service key 0, which holds $(i,t)$, the oldest block $i$ that has been solicited and the time $t$ it was solicited, but may now be outside the window of 28 days
    • if $t$ is older than 28 days, then if ${\bf a}_l$ is Available (note there are 2 states) then issue forget and advance $(i,t)$
    • repeat the above until either the oldest block is less than 28 days ago or gas is nearly exhausted
    • write $f$ back to storage if changed
  3. Accumulation output is the newest blockhash $H({\bf B})$ solicited in step 1, which is included in the BEEFY root in JAM's Recent History.
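
A conceptual sketch of step 1 above (timestamps and the $f$ bookkeeping of step 2 omitted), using hypothetical wrappers solicit, read and write around the JAM host calls, whose actual interface is defined by the Gray Paper:

type Hash = [u8; 32];

// Hypothetical host-call wrappers.
fn solicit(_hash: &Hash) { /* request the preimage into JAM DA */ }
fn read(_key: &[u8]) -> Option<Vec<u8>> { None }
fn write(_key: &[u8], _value: &[u8]) {}

fn accumulate_block(block_number: u64, header_hash: Hash, block_hash: Hash) {
    // Solicit both preimages; the ORU operator (or anyone) may provide them later.
    solicit(&header_hash);
    solicit(&block_hash);

    // Append the pair to the (possibly forked) index kept under this block number.
    let key = block_number.to_le_bytes();
    let mut value = read(&key).unwrap_or_default();
    value.extend_from_slice(&header_hash);
    value.extend_from_slice(&block_hash);
    write(&key, &value);
}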

Under normal operations, the above process will result in a full set of validated blocks and header preimages being in JAM's DA layer. An external observer of the JAM Chain can, for the last 28 days, check for validity of the chain in any fork, whether finalized or unfinalized on ETH L1 or ETH L2 ORU.

Transfer

Transfer details concerning fees of storage balance are not considered at this time. A full model appears to necessitate a clear interface between JAM and CoreTime as well as a model of how this service can support a single vs. multiple Chain IDs. Naturally, all service fees would be in DOT rather than ETH on L2.

Multiple service instantiations under different CoreTime leases may be envisioned for each rollup, but a set of homogenous ORU rollups to support sync/async models by the same ORU operator could also be supported.

Beefy

Following the BEEFY protocol, JAM's own state trie aggregates work packages' accumulate roots into its own recent history $\beta$, signed by JAM validators. BEEFY’s model allows JAM's state to be verified on Ethereum and other external chains without needing all JAM Chain data on the external chain. All rollups, whether built on Substrate or optimistic turned cynical, are committed into this state, supporting cross-ecosystem proofs using Merkle trees as today.

To support this goal, the accumulate result is the last block hash, last header hash, or last state root as suitable to match Polkadot rollup behavior.

Service PoC Implementation

This JAM Service can be implemented in Rust using polkatool and tested on a JAM Testnet by early JAM implementers.

With a full node of Ethereum, Optimism, Base, and Arbitrum, real-world work packages can be developed and tested offline and then in real-time with a dozen cores at most.

Gas + Storage Considerations

JAM's gas parameters for PVM instructions and host calls have not been fully developed yet. Combining the CoreChain and this Ethereum+ORU validation services will provide valuable data for determining these parameters.

The gas required for refine is proportional to:

  • the size of $\Pi$ submitted in a work package
  • the amount of gas used by ${\bf B}_T$, which is embedded in the ${\bf B}$ itself

The proof generation of $\Pi$ is not part of refine and does not involve the sequencer directly; instead this would be done by a neighboring community validator node. In addition, the I/O of reading and writing state tries is believed to dominate ordinary ORU operation and is also a part of the refine operation.

The gas required for accumulate is proportional to the number of blocks verified in refine, which results in read, write, solicit and forget operations. It is believed that accumulate's gas cost is nominal and poses no significant issue. However, storage rent issues common to all blockchain storage apply to the preimages, which are explicitly tallied in the service account ${\bf a}$. Fortunately, this is upper-bounded by the number of blocks generated in a 28-day period.

Compatibility

CoreTime

Instead of ETH, rollups would require DOT for CoreTime to secure their rollup. However, rollups are not locked into JAM and may freely enter and exit the JAM ecosystem since work packages do not need to start at genesis.

Different rollups may need to scale their core usage based on rollup activity. JAM's connectivity to CoreTime is expected to handle this effectively.

Hashing

Currently, preimages are specified to use the Blake2b hash, while Ethereum rollup block hashes use Keccak256. This is an application-level concern, trivially solved by the preimage provider responding to preimage announcements keyed by Blake2b hash instead of Keccak256.

Testing, Security, and Privacy

The described service requires expert review from security experts familiar with JAM, ELVES, and Ethereum.

The ELVES and JAM protocols are expected to undergo audit with the 1.0 ratification of JAM.

It is believed that the use of revm is safe due to its extensive coverage of Ethereum State + Block tests, but this may require careful review.

The polkatool compiler has not been battle-tested by comparison.

The Consensus API generating state witnesses is likely mature but a relatively new addition to the geth code base.

The proposal introduces no new privacy concerns.

It is natural to bring in finality from Ethereum using Attestations from the Beacon chain to finalize validated blocks as they become available from Ethereum and the ETH ORU L2s enabled by this type of service. An "ETH Beacon Service" bringing in Altair Light Client data would enable the Ethereum Service and all ETH ORUs to compute the canonical chain. This would use the "Ordered Accumulation" capabilities of JAM and reduce the storage footprint to just those blocks that are actually finalized in the L2 against an Ethereum finalized checkpoint.

As JAM implementers move towards conformant implementations in 2025, which support gas modeling and justify performance improvements, a proper model of storage costs and fees will be necessary.

JAM enables a semi-coherent model for non-Polkadot rollups, starting with optimistic rollups as described here. A similar service may be envisioned for mature ZK rollup ecosystems, though refine would do little more than verify the ZKRU proof. A JAM messaging service between ORUs and ZKRUs may be highly desirable. This can be done in a separate service or simply by adding transfer code with read and write operations on incoming and outgoing mailboxes in service storage.

The growth of optimistic rollup platforms, led by OP Stack with CEXes (Coinbase and Kraken) and ArbOS, threatens JAM's viability as a rollup host. Achieving JAM's rollup host goals may require urgency to match the speed at which these network effects emerge.

On the other hand, if JAM Services are developed and put in production in 2025, JAM Services can validate all of Ethereum as well as Polkadot rollups.

Drawbacks, Alternatives, and Unknowns

Alongside Ethereum DA (via EIP-4844), numerous DA alternatives for rollups exist: Avail, Celestia, Eigenlayer. ORUs rely primarily on fraud proofs, which require a lengthy exit window, commonly 7 days. The cynical rollup model offers significant UX improvements by eliminating this exit window.

This JAM Service does not turn optimistic rollups into cynical rollups. A method to do so is not known.

Acknowledgements

We are deeply grateful for the ongoing encouragement and feedback from Polkadot heavyweights (Rob Habermeier, Alistair Stewart, Jeff Burdges, Bastian Kocher), Polkadot fellows, fellow JAM Implementers/Service Builders, and the broader community.


RFC-1: Agile Coretime

Start Date30 June 2023
DescriptionAgile periodic-sale-based model for assigning Coretime on the Polkadot Ubiquitous Computer.
AuthorsGavin Wood

Summary

This proposes a periodic, sale-based method for assigning Polkadot Coretime, the analogue of "block space" within the Polkadot Network. The method takes into account the need for long-term capital expenditure planning for teams building on Polkadot, yet also provides a means to allow Polkadot to capture long-term value in the resource which it sells. It supports the possibility of building rich and dynamic secondary markets to optimize resource allocation and largely avoids the need for parameterization.

Motivation

Present System

The Polkadot Ubiquitous Computer, or just Polkadot UC, represents the public service provided by the Polkadot Network. It is a trust-free, WebAssembly-based, multicore, internet-native omnipresent virtual machine which is highly resilient to interference and corruption.

The present system of allocating the limited resources of the Polkadot Ubiquitous Computer is through a process known as parachain slot auctions. This is a parachain-centric paradigm whereby a single core is long-term allocated to a single parachain which itself implies a Substrate/Cumulus-based chain secured and connected via the Relay-chain. Slot auctions are on-chain candle auctions which proceed for several days and result in the core being assigned to the parachain for six months at a time up to 24 months in advance. Practically speaking, we only see two year periods being bid upon and leased.

Funds behind the bids made in the slot auctions are merely locked; they are not consumed or paid, and become unlocked and returned to the bidder on expiry of the lease period. A means of sharing the deposit trustlessly known as a crowdloan is available, allowing token holders to contribute to the overall deposit of a chain without any counterparty risk.

Problems

The present system is based on a model of one-core-per-parachain. This is a legacy interpretation of the Polkadot platform and is not a reflection of its present capabilities. By restricting ownership and usage to this model, more dynamic and resource-efficient means of utilizing the Polkadot Ubiquitous Computer are lost.

More specifically, it is impossible to lease out cores at anything less than six months, and apparently unrealistic to do so at anything less than two years. This removes the ability to dynamically manage the underlying resource, and generally experimentation, iteration and innovation suffer. It bakes into the platform an assumption of permanence for anything deployed into it and restricts the market's ability to find a more optimal allocation of the finite resource.

There is no ability to determine capital requirements for hosting a parachain beyond two years from the point of its initial deployment onto Polkadot. While it would be unreasonable to have perfect and indefinite cost predictions for any real-world platform, not having any clarity whatsoever beyond "market rates" two years hence can be a very off-putting prospect for teams to buy into.

However, quite possibly the most substantial problem is both a perceived and often real high barrier to entry of the Polkadot ecosystem. By forcing innovators to either raise seven-figure sums through investors or appeal to the wider token-holding community, Polkadot makes it difficult for a small band of innovators to deploy their technology into Polkadot. While not being actually permissioned, it is also far from the barrierless, permissionless ideal which an innovation platform such as Polkadot should be striving for.

Requirements

  1. The solution SHOULD provide an acceptable value-capture mechanism for the Polkadot network.
  2. The solution SHOULD allow parachains and other projects deployed on to the Polkadot UC to make long-term capital expenditure predictions for the cost of ongoing deployment.
  3. The solution SHOULD minimize the barriers to entry in the ecosystem.
  4. The solution SHOULD work well when the Polkadot UC has up to 1,000 cores.
  5. The solution SHOULD work when the number of cores which the Polkadot UC can support changes over time.
  6. The solution SHOULD facilitate the optimal allocation of work to cores of the Polkadot UC, including by facilitating the trade of regular core assignment at various intervals and for various spans.
  7. The solution SHOULD avoid creating additional dependencies on functionality which the Relay-chain need not strictly provide for the delivery of the Polkadot UC.

Furthermore, the design SHOULD be implementable and deployable in a timely fashion; three months from the acceptance of this RFC should not be unreasonable.

Stakeholders

Primary stakeholder sets are:

  • Protocol researchers and developers, largely represented by the Polkadot Fellowship and Parity Technologies' Engineering division.
  • Polkadot Parachain teams both present and future, and their users.
  • Polkadot DOT token holders.

Socialization:

The essentials of this proposal were presented at Polkadot Decoded 2023 Copenhagen on the Main Stage. A small amount of socialization at the Parachain Summit preceded it and some substantial discussion followed it. The Parity Ecosystem team is currently soliciting views from ecosystem teams who would be key stakeholders.

Explanation

Overview

Upon implementation of this proposal, the parachain-centric slot auctions and associated crowdloans cease. Instead, Coretime on the Polkadot UC is sold by the Polkadot System in two separate formats: Bulk Coretime and Instantaneous Coretime.

When a Polkadot Core is utilized, we say it is dedicated to a Task rather than a "parachain". The Task to which a Core is dedicated may change at every Relay-chain block and while one predominant type of Task is to secure a Cumulus-based blockchain (i.e. a parachain), other types of Tasks are envisioned.

Bulk Coretime is sold periodically on a specialised system chain known as the Coretime-chain and allocated in advance of its usage, whereas Instantaneous Coretime is sold on the Relay-chain immediately prior to usage on a block-by-block basis.

This proposal does not fix what should be done with revenue from sales of Coretime and leaves it for a further RFC process.

Owners of Bulk Coretime are tracked on the Coretime-chain and the ownership status and properties of the owned Coretime are exposed over XCM as a non-fungible asset.

At the request of the owner, the Coretime-chain allows a single Bulk Coretime asset, known as a Region, to be used in various ways including transferal to another owner, allocated to a particular task (e.g. a parachain) or placed in the Instantaneous Coretime Pool. Regions can also be split out, either into non-overlapping sub-spans or exactly-overlapping spans with less regularity.

The Coretime-Chain periodically instructs the Relay-chain to assign its cores to alternative tasks as and when Core allocations change due to new Regions coming into effect.

Renewal and Migration

There is a renewal system which allows a Bulk Coretime assignment of a single core to be renewed unchanged with a known price increase from month to month. Renewals are processed in a period prior to regular purchases, effectively giving them precedence over a fixed number of cores available.

Renewals are only enabled when a core's assignment does not include an Instantaneous Coretime allocation and has not been split into shorter segments.

Thus, renewals are designed to ensure only that committed parachains get some guarantees about price for predicting future costs. This price-capped renewal system only allows cores to be reused for their same tasks from month to month. In any other context, Bulk Coretime would need to be purchased regularly.

As a migration mechanism, pre-existing leases (from the legacy lease/slots/crowdloan framework) are initialized into the Coretime-chain and cores assigned to them prior to Bulk Coretime sales. In the sale where the lease expires, the system offers a renewal, as above, to allow a priority sale of Bulk Coretime and ensure that the Parachain suffers no downtime when transitioning from the legacy framework.

Instantaneous Coretime

Processing of Instantaneous Coretime happens in part on the Polkadot Relay-chain. Credit is purchased on the Coretime-chain for regular DOT tokens, and this results in a DOT-denominated Instantaneous Coretime Credit account on the Relay-chain being credited for the same amount.

Though the Instantaneous Coretime Credit account records a balance for an account identifier (very likely controlled by a collator), it is non-transferable and non-refundable. It can only be consumed in order to purchase some Instantaneous Coretime with immediate availability.

The Relay-chain reports this usage back to the Coretime-chain in order to allow it to reward the providers of the underlying Coretime, either the Polkadot System or owners of Bulk Coretime who contributed to the Instantaneous Coretime Pool.

Specifically the Relay-chain is expected to be responsible for:

  • holding non-transferable, non-refundable DOT-denominated Instantaneous Coretime Credit balance information.
  • setting and adjusting the price of Instantaneous Coretime based on usage.
  • allowing collators to consume their Instantaneous Coretime Credit at the current pricing in exchange for the ability to schedule one PoV for near-immediate usage.
  • ensuring the Coretime-Chain has timely accounting information on Instantaneous Coretime Sales revenue.

Coretime-chain

The Coretime-chain is a new system parachain. It has the responsibility of providing the Relay-chain via UMP with information of:

  • The number of cores which should be made available.
  • Which tasks should be running on which cores and in what ratios.
  • Accounting information for Instantaneous Coretime Credit.

It also expects information from the Relay-chain via DMP:

  • The number of cores available to be scheduled.
  • Account information on Instantaneous Coretime Sales.

The specific interface is properly described in RFC-5.

Detail

Parameters

This proposal includes a number of parameters which need not necessarily be fixed. Their usage is explained below, but their values are suggested or specified in the later section Parameter Values.

Reservations and Leases

The Coretime-chain includes some governance-set reservations of Coretime; these cover every System-chain. Additionally, governance is expected to initialize details of the pre-existing leased chains.

Regions

A Region is an assignable period of Coretime with a known regularity.

All Regions are associated with a unique Core Index, identifying the core whose assignment is controlled by ownership of the Region.

All Regions are also associated with a Core Mask, an 80-bit bitmap, to denote the regularity at which it may be scheduled on the core. If all bits are set in the Core Mask value, it is said to be Complete. 80 is selected since this results in the datatype used to identify any Region of Polkadot Coretime being a very convenient 128 bits in size. Additionally, if TIMESLICE (the number of Relay-chain blocks in a Timeslice) is 80, then a single bit in the Core Mask bitmap represents exactly one Core for one Relay-chain block in one Timeslice.

All Regions have a span. Region spans are quantized into periods of TIMESLICE blocks; BULK_PERIOD is a whole multiple of TIMESLICE.

The Timeslice type is a u32 which can be multiplied by TIMESLICE to give a BlockNumber value representing the same quantity in terms of Relay-chain blocks.
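
For illustration only, using the TIMESLICE value from the Parameter Values section and assuming 6-second Relay-chain blocks (so 8 minutes equals 80 blocks):

type Timeslice = u32;
type BlockNumber = u32;

const TIMESLICE: BlockNumber = 80; // Relay-chain blocks per Timeslice (8 minutes of 6s blocks)

// Convert a Timeslice index into the equivalent Relay-chain block number.
fn timeslice_to_block_number(t: Timeslice) -> BlockNumber {
    t * TIMESLICE
}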

Regions can be tasked to a TaskId (aka ParaId) or pooled into the Instantaneous Coretime Pool. This process can be Provisional or Final. If done only provisionally or not at all then they are fresh and have an Owner which is able to manipulate them further including reassignment. Once Final, then all ownership information is discarded and they cannot be manipulated further. Renewal is not possible when only provisionally tasked/pooled.

Bulk Sales

A sale of Bulk Coretime occurs on the Coretime-chain every BULK_PERIOD blocks.

In every sale, a BULK_LIMIT of individual Regions are offered for sale.

Each Region offered for sale has a different Core Index, ensuring that they each represent an independently allocatable resource on the Polkadot UC.

The Regions offered for sale have the same span: they last exactly BULK_PERIOD blocks, and begin immediately following the span of the previous Sale's Regions. The Regions offered for sale also have the complete, non-interlaced, Core Mask.

The Sale Period ends as soon as the span of the Coretime Regions being sold begins. At this point, the next Sale Price is set according to the previous Sale Price together with the number of Regions sold, compared to the desired and maximum number of Regions to be sold. See Price Setting for additional detail on this point.

Following the end of the previous Sale Period, there is an Interlude Period lasting INTERLUDE_PERIOD blocks. After this period has elapsed, regular purchasing begins with the Purchasing Period.

This is designed to give at least two weeks worth of time for the purchased regions to be partitioned, interlaced, traded and allocated.

The Interlude

The Interlude period is a period prior to Regular Purchasing where renewals are allowed to happen. This has the effect of ensuring existing long-term tasks/parachains have a chance to secure their Bulk Coretime for a well-known price prior to general sales.

Regular Purchasing

Any account may purchase Regions of Bulk Coretime if they have the appropriate funds in place during the Purchasing Period, which runs from INTERLUDE_PERIOD blocks after the end of the previous sale until the beginning of the Region of the Bulk Coretime which is for sale, as long as there are Regions of Bulk Coretime left for sale (i.e. no more than BULK_LIMIT have already been sold in the Bulk Coretime Sale). The Purchasing Period is thus roughly BULK_PERIOD - INTERLUDE_PERIOD blocks in length.

The Sale Price varies during an initial portion of the Purchasing Period called the Leadin Period and then stays stable for the remainder. This initial portion is LEADIN_PERIOD blocks in duration. During the Leadin Period the price decreases towards the Sale Price, which it lands at by the end of the Leadin Period. The actual curve by which the price starts and descends to the Sale Price is outside the scope of this RFC, though a basic suggestion is provided in the Price Setting Notes, below.

Renewals

At any time when there are remaining Regions of Bulk Coretime to be sold, including during the Interlude Period, certain Bulk Coretime assignments may be Renewed. This is similar to a purchase in that funds must be paid and it consumes one of the Regions of Bulk Coretime which would otherwise be placed for purchase. However, there are two key differences.

Firstly, the price paid is the minimum of (a) RENEWAL_PRICE_CAP more than the price paid at the previous purchase/renewal, and (b) the current (or initial, if sales have yet to begin) regular Sale Price.

Secondly, the purchased Region comes preassigned with exactly the same workload as before. It cannot be traded, repartitioned, interlaced or exchanged. As such unlike regular purchasing the Region never has an owner.

Renewal is only possible for either cores which have been assigned as a result of a previous renewal, which are migrating from legacy slot leases, or which fill their Bulk Coretime with an unsegmented, fully and finally assigned workload which does not include placement in the Instantaneous Coretime Pool. The renewed workload will be the same as this initial workload.

Manipulation

Regions may be manipulated in various ways by their owner:

  1. Transferred in ownership.
  2. Partitioned into quantized, non-overlapping segments of Bulk Coretime with the same ownership.
  3. Interlaced into multiple Regions over the same period whose eventual assignments take turns to be scheduled.
  4. Assigned to a single, specific task (identified by TaskId aka ParaId). This may be either provisional or final.
  5. Pooled into the Instantaneous Coretime Pool, in return for a pro-rata amount of the revenue from the Instantaneous Coretime Sales over its period.

Enactment

Specific functions of the Coretime-chain

Several functions of the Coretime-chain SHALL be exposed through dispatchables and/or a nonfungible trait implementation integrated into XCM:

1. transfer

Regions may have their ownership transferred.

A transfer(region: RegionId, new_owner: AccountId) dispatchable shall have the effect of altering the current owner of the Region identified by region from the signed origin to new_owner.

An implementation of the nonfungible trait SHOULD include equivalent functionality. RegionId SHOULD be used for the AssetInstance value.

2. partition

Regions may be split apart into two non-overlapping interior Regions of the same Core Mask which together concatenate to the original Region.

A partition(region: RegionId, pivot: Timeslice) dispatchable SHALL have the effect of removing the Region identified by region and adding two new Regions of the same owner and Core Mask. One new Region will begin at the same point of the old Region but end at pivot timeslices into the Region, whereas the other will begin at this point and end at the end point of the original Region.

Also:

  • The owner field of region must be equal to the Signed origin.
  • pivot must equal neither the begin nor end fields of the region.

3. interlace

Regions may be decomposed into two Regions of the same span whose eventual assignments take turns on the core by virtue of having complementary Core Masks.

An interlace(region: RegionId, mask: CoreMask) dispatchable shall have the effect of removing the Region identified by region and creating two new Regions. The new Regions will each have the same span and owner of the original Region, but one Region will have a Core Mask equal to mask and the other will have Core Mask equal to the XOR of mask and the Core Mask of the original Region.

Also:

  • The owner field of region must be equal to the Signed origin.
  • mask must have some bits set AND must not equal the Core Mask of the old Region AND must only have bits set which are also set in the old Region's Core Mask.
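
To illustrate the rule above, the second new Region's Core Mask is the XOR of mask with the original Region's Core Mask, making the two new masks complementary within the original; the 80-bit Core Mask is modelled here as a simple byte array for illustration:

type CoreMask = [u8; 10]; // 80-bit bitmap

// Given the original Region's mask and the requested `mask`, return the two
// complementary masks of the Regions produced by `interlace`.
fn interlace_masks(original: CoreMask, mask: CoreMask) -> (CoreMask, CoreMask) {
    let mut other = [0u8; 10];
    for i in 0..10 {
        // `mask` must only set bits already set in `original` (validated elsewhere).
        other[i] = original[i] ^ mask[i];
    }
    (mask, other)
}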

4. assign

Regions may be assigned to a core.

An assign(region: RegionId, target: TaskId, finality: Finality) dispatchable shall have the effect of placing an item in the workplan corresponding to the region's properties and assigned to the target task.

If the region's end has already passed (taking into account any advance notice requirements) then this operation is a no-op. If the region's beginning has already passed, then it is effectively altered to become the next schedulable timeslice.

finality may have the value of either Final or Provisional. If Final, then the operation is free, the region record is removed entirely from storage and renewal may be possible: if the Region's span is the entire BULK_PERIOD, then the Coretime-chain records in storage that the allocation happened during this period in order to facilitate the possibility for a renewal. (Renewal only becomes possible when the full Core Mask of a core is finally assigned for the full BULK_PERIOD.)

Also:

  • The owner field of region must be equal to the Signed origin.

5. pool

Regions may be consumed in exchange for a pro rata portion of the Instantaneous Coretime Sales Revenue from its period and regularity.

A pool(region: RegionId, beneficiary: AccountId, finality: Finality) dispatchable shall have the effect of placing an item in the workplan corresponding to the region's properties and assigned to the Instantaneous Coretime Pool. The details of the region will be recorded in order to allow for a pro rata share of the Instantaneous Coretime Sales Revenue at the time of the Region relative to any other providers in the Pool.

If the region's end has already passed (taking into account any advance notice requirements) then this operation is a no-op. If the region's beginning has already passed, then it is effectively altered to become the next schedulable timeslice.

finality may have the value of either Final or Provisional. If Final, then the operation is free and the region record is removed entirely from storage.

Also:

  • The owner field of region must be equal to the Signed origin.

6. Purchases

A dispatchable purchase(price_limit: Balance) shall be provided. Any account may call purchase to purchase Bulk Coretime at the maximum price of price_limit.

This may be called successfully only:

  1. during the regular Purchasing Period;
  2. when the caller is a Signed origin and their account balance is reducible by the current sale price;
  3. when the current sale price is no greater than price_limit; and
  4. when the number of cores already sold is less than BULK_LIMIT.

If successful, the caller's account balance is reduced by the current sale price and a new Region item for the following Bulk Coretime span is issued with the owner equal to the caller's account.

7. Renewals

A dispatchable renew(core: CoreIndex) shall be provided. Any account may call renew to purchase Bulk Coretime and renew an active allocation for the given core.

This may be called during the Interlude Period as well as the regular Purchasing Period and has the same effect as purchase followed by assign, except that:

  1. The price of the sale is the Renewal Price (see next).
  2. The Region is allocated exactly as the given core is currently allocated for the present Region.

Renewal is only valid where a Region's span is assigned to Tasks (not placed in the Instantaneous Coretime Pool) for the entire unsplit BULK_PERIOD over all of the Core Mask and with Finality. There are thus three possibilities of a renewal being allowed:

  1. Purchased unsplit Coretime with final assignment to tasks over the full Core Mask.
  2. Renewed Coretime.
  3. A legacy lease which is ending.

Renewal Price

The Renewal Price is the minimum of the current regular Sale Price (or the initial Sale Price if in the Interlude Period) and:

  • If the workload being renewed came to be through the Purchase and Assignment of Bulk Coretime, then the price paid during that Purchase operation.
  • If the workload being renewed was previously renewed, then the price paid during this previous Renewal operation plus RENEWAL_PRICE_CAP.
  • If the workload being renewed is a migration from a legacy slot auction lease, then the nominal price for a Regular Purchase (outside of the Lead-in Period) of the Sale during which the legacy lease expires.
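
A sketch of the resulting computation for a previously renewed workload, assuming RENEWAL_PRICE_CAP is applied as a proportional increase on the previously paid price (Perbill::from_percent(2) in the suggested parameters):

// Renewal price: previous price plus the capped increase, but never more than the
// current (or initial) regular Sale Price.
fn renewal_price(previously_paid: u128, current_sale_price: u128) -> u128 {
    let capped = previously_paid + previously_paid * 2 / 100; // +2%, illustrative
    capped.min(current_sale_price)
}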

8. Instantaneous Coretime Credits

A dispatchable purchase_credit(amount: Balance, beneficiary: RelayChainAccountId) shall be provided. Any account with at least amount spendable funds may call this. This increases the Instantaneous Coretime Credit balance on the Relay-chain of the beneficiary by the given amount.

This Credit is consumable on the Relay-chain as part of the Task scheduling system and its specifics are out of the scope of this proposal. When consumed, revenue is recorded and provided to the Coretime-chain for proper distribution. The API for doing this is specified in RFC-5.

Notes on the Instantaneous Coretime Market

For an efficient market to form around the provision of Bulk-purchased Cores into the pool of cores available for Instantaneous Coretime purchase, it is crucial to ensure that price changes for the purchase of Instantaneous Coretime are reflected well in the revenues of private Coretime providers during the same period.

In order to ensure this, then it is crucial that Instantaneous Coretime, once purchased, cannot be held indefinitely prior to eventual use since, if this were the case, a nefarious collator could purchase Coretime when cheap and utilize it some time later when expensive and deprive private Coretime providers of their revenue.

It must therefore be assumed that Instantaneous Coretime, once purchased, has a definite and short "shelf-life", after which it becomes unusable. This incentivizes collators to avoid purchasing Coretime unless they expect to utilize it imminently and thus helps create an efficient market-feedback mechanism whereby a higher price will actually result in material revenues for private Coretime providers who contribute to the pool of Cores available to service Instantaneous Coretime purchases.

Notes on Economics

The specific pricing mechanisms are out of scope for the present proposal. Proposals on economics should be properly described and discussed in another RFC. However, for the sake of completeness, I provide some basic illustration of how price setting could potentially work.

Bulk Price Progression

The present proposal assumes the existence of a price-setting mechanism which takes into account several parameters:

  • OLD_PRICE: The price of the previous sale.
  • BULK_TARGET: the target number of cores to be purchased as Bulk Coretime Regions or renewed during the previous sale.
  • BULK_LIMIT: the maximum number of cores which could have been purchased/renewed during the previous sale.
  • CORES_SOLD: the actual number of cores purchased/renewed in the previous sale.
  • SELLOUT_PRICE: the price at which the most recent Bulk Coretime was purchased (not renewed) prior to selling more cores than BULK_TARGET (or immediately after, if none were purchased before). This may not have a value if no Bulk Coretime was purchased.

In general we would expect the price to increase the closer CORES_SOLD gets to BULK_LIMIT and to decrease the closer it gets to zero. If it is exactly equal to BULK_TARGET, then we would expect the price to remain the same.

In the edge case that no cores were purchased yet more cores were sold (through renewals) than the target, then we would also avoid altering the price.

A simple example of this would be the formula:

IF SELLOUT_PRICE == NULL AND CORES_SOLD > BULK_TARGET THEN
    RETURN OLD_PRICE
END IF
EFFECTIVE_PRICE := IF CORES_SOLD > BULK_TARGET THEN
    SELLOUT_PRICE
ELSE
    OLD_PRICE
END IF
NEW_PRICE := IF CORES_SOLD < BULK_TARGET THEN
    EFFECTIVE_PRICE * MAX(CORES_SOLD, 1) / BULK_TARGET
ELSE
    EFFECTIVE_PRICE + EFFECTIVE_PRICE *
        (CORES_SOLD - BULK_TARGET) / (BULK_LIMIT - BULK_TARGET)
END IF

This exists only as a trivial example to demonstrate a basic solution exists, and should not be intended as a concrete proposal.

Intra-Leadin Price-decrease

During the Leadin Period of a sale, the effective price starts higher than the Sale Price and falls to end at the Sale Price at the end of the Leadin Period. The price can thus be defined as a simple factor above one on which the Sale Price is multiplied. A function which returns this factor would accept a factor between zero and one specifying the portion of the Leadin Period which has passed.

Thus, assuming SALE_PRICE, we can define PRICE as:

PRICE := SALE_PRICE * FACTOR((NOW - LEADIN_BEGIN) / LEADIN_PERIOD)

We can define a very simple progression where the price decreases monotonically from double the Sale Price at the beginning of the Leadin Period.

FACTOR(T) := 2 - T
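As a minimal sketch of this progression, assuming the linear FACTOR above and using floating point purely for illustration (a runtime would use fixed-point arithmetic):

// Minimal sketch of the intra-Leadin price, assuming the linear FACTOR above.
// Floating point is used only for illustration; a runtime would use
// fixed-point arithmetic.
fn factor(t: f64) -> f64 {
    // `t` is the portion of the Leadin Period already elapsed, in [0, 1].
    2.0 - t
}

fn leadin_price(sale_price: f64, now: u32, leadin_begin: u32, leadin_period: u32) -> f64 {
    let t = (now - leadin_begin) as f64 / leadin_period as f64;
    sale_price * factor(t)
}

fn main() {
    // Double the Sale Price at the start of the Leadin Period...
    assert_eq!(leadin_price(100.0, 0, 0, 100), 200.0);
    // ...falling to the Sale Price at its end.
    assert_eq!(leadin_price(100.0, 100, 0, 100), 100.0);
}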

Parameter Values

Parameters are either suggested or specified. If suggested, it is non-binding and the proposal should not be judged on the value since other RFCs and/or the governance mechanism of Polkadot is expected to specify/maintain it. If specified, then the proposal should be judged on the merit of the value as-is.

  • BULK_PERIOD: 28 * DAYS (specified)
  • INTERLUDE_PERIOD: 7 * DAYS (specified)
  • LEADIN_PERIOD: 7 * DAYS (specified)
  • TIMESLICE: 8 * MINUTES (specified)
  • BULK_TARGET: 30 (suggested)
  • BULK_LIMIT: 45 (suggested)
  • RENEWAL_PRICE_CAP: Perbill::from_percent(2) (suggested)

Instantaneous Price Progression

This proposal assumes the existence of a Relay-chain-based price-setting mechanism for the Instantaneous Coretime Market which alters from block to block, taking into account several parameters: the last price, the size of the Instantaneous Coretime Pool (in terms of cores per Relay-chain block) and the amount of Instantaneous Coretime waiting for processing (in terms of Core-blocks queued).

The ideal situation is to have the size of the Instantaneous Coretime Pool be equal to some factor of the Instantaneous Coretime waiting. This allows all Instantaneous Coretime sales to be processed with some limited latency while giving limited flexibility over ordering to the Relay-chain apparatus which is needed for efficient operation.

If we set a factor of three, and thus aim to retain a queue of Instantaneous Coretime Sales which can be processed within three Relay-chain blocks, then we would increase the price if the queue goes above three times the amount of cores available, and decrease if it goes under.

Let us assume the values OLD_PRICE, FACTOR, QUEUE_SIZE and POOL_SIZE. A simple definition of the NEW_PRICE would be thus:

NEW_PRICE := IF QUEUE_SIZE < POOL_SIZE * FACTOR THEN
    OLD_PRICE * 0.95
ELSE
    OLD_PRICE / 0.95
END IF

This exists only as a trivial example to demonstrate a basic solution exists, and should not be intended as a concrete proposal.

Notes on Types

This exists only as a short illustration of a potential technical implementation and should not be treated as anything more.

Regions

This data schema achieves a number of goals:

  • Coretime can be individually traded at a level of a single usage of a single core.
  • Coretime Regions, of arbitrary span and up to 1/80th interlacing can be exposed as NFTs and exchanged.
  • Any Coretime Region can be contributed to the Instantaneous Coretime Pool.
  • Unlimited number of individual Coretime contributors to the Instantaneous Coretime Pool. (Effectively limited only in number of cores and interlacing level; with current values this would allow 80,000 individual payees per timeslice).
  • All keys are self-describing.
  • Workload to communicate core (re-)assignments is well-bounded and low in weight.
  • All mandatory bookkeeping workload is well-bounded in weight.
type Timeslice = u32; // 80 block amounts.
type CoreIndex = u16;
type CoreMask = [u8; 10]; // 80-bit bitmap.

// 128-bit (16 bytes)
struct RegionId {
    begin: Timeslice,
    core: CoreIndex,
    mask: CoreMask,
}
// 296-bit (37 bytes)
struct RegionRecord {
    end: Timeslice,
    owner: AccountId,
}

map Regions = Map<RegionId, RegionRecord>;

// 40-bit (5 bytes). Could be 32-bit with a more specialised type.
enum CoreTask {
    Off,
    Assigned { target: TaskId },
    InstaPool,
}
// 120-bit (15 bytes). Could be 14 bytes with a specialised 32-bit `CoreTask`.
struct ScheduleItem {
    mask: CoreMask, // 80 bit
    task: CoreTask, // 40 bit
}

/// The work we plan on having each core do at a particular time in the future.
type Workplan = Map<(Timeslice, CoreIndex), BoundedVec<ScheduleItem, 80>>;
/// The current workload of each core. This gets updated with workplan as timeslices pass.
type Workload = Map<CoreIndex, BoundedVec<ScheduleItem, 80>>;

enum Contributor {
    System,
    Private(AccountId),
}

struct ContributionRecord {
    begin: Timeslice,
    end: Timeslice,
    core: CoreIndex,
    mask: CoreMask,
    payee: Contributor,
}
type InstaPoolContribution = Map<ContributionRecord, ()>;

type SignedTotalMaskBits = i32;
type InstaPoolIo = Map<Timeslice, SignedTotalMaskBits>;

type PoolSize = Value<TotalMaskBits>;

/// Counter for the total CoreMask which could be dedicated to a pool. `u32` so we don't ever get
/// an overflow.
type TotalMaskBits = u32;
struct InstaPoolHistoryRecord {
    total_contributions: TotalMaskBits,
    maybe_payout: Option<Balance>,
}
/// Total InstaPool rewards for each Timeslice and the number of core Mask which contributed.
type InstaPoolHistory = Map<Timeslice, InstaPoolHistoryRecord>;

CoreMask tracks unique "parts" of a single core. It is used with interlacing in order to give a unique identifier to each component of any possible interlacing configuration of a core, allowing for simple self-describing keys for all core ownership and allocation information. It also allows for each core's workload to be tracked and updated progressively, keeping ongoing compute costs well-bounded and low.

Regions are issued into the Regions map and can be transferred, partitioned and interlaced as the owner desires. Regions can only be tasked if they begin after the current scheduling deadline (if they have missed this, then the region can be auto-trimmed until it is).

Once tasked, they are removed from the Regions map and a record is placed in Workplan. In addition, if they are contributed to the Instantaneous Coretime Pool, then an entry is placed in InstaPoolContribution and InstaPoolIo.

Each timeslice, InstaPoolIo is used to update the current value of PoolSize. A new entry in InstaPoolHistory is inserted, with the total_contributions field of InstaPoolHistoryRecord being informed by the PoolSize value. Each core has its Workload mutated according to its Workplan for the upcoming timeslice.
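A non-normative sketch of this per-timeslice bookkeeping, using in-memory maps in place of the storage items and string placeholders for schedule entries, might look as follows:

use std::collections::BTreeMap;

// Non-normative sketch of the per-timeslice bookkeeping described above,
// using in-memory maps in place of the storage items.
type Timeslice = u32;
type CoreIndex = u16;
type TotalMaskBits = u32;

struct InstaPoolHistoryRecord {
    total_contributions: TotalMaskBits,
    maybe_payout: Option<u128>,
}

struct State {
    pool_size: i64, // signed so that negative `InstaPoolIo` deltas can be applied
    insta_pool_io: BTreeMap<Timeslice, i64>,
    insta_pool_history: BTreeMap<Timeslice, InstaPoolHistoryRecord>,
    workplan: BTreeMap<(Timeslice, CoreIndex), Vec<&'static str>>,
    workload: BTreeMap<CoreIndex, Vec<&'static str>>,
}

impl State {
    /// Called once at each timeslice boundary.
    fn on_timeslice(&mut self, now: Timeslice, cores: u16) {
        // 1. Apply any queued pool-size delta for this timeslice.
        if let Some(delta) = self.insta_pool_io.remove(&now) {
            self.pool_size += delta;
        }
        // 2. Record the pool size as this timeslice's contribution total;
        //    the payout is filled in later when revenue is reported.
        self.insta_pool_history.insert(now, InstaPoolHistoryRecord {
            total_contributions: self.pool_size.max(0) as TotalMaskBits,
            maybe_payout: None,
        });
        // 3. Move each core's planned schedule into its current workload.
        for core in 0..cores {
            if let Some(items) = self.workplan.remove(&(now, core)) {
                self.workload.insert(core, items);
            }
        }
    }
}

fn main() {
    let mut state = State {
        pool_size: 0,
        insta_pool_io: BTreeMap::from([(150, 16), (200, -16)]),
        insta_pool_history: BTreeMap::new(),
        workplan: BTreeMap::from([((150, 0), vec!["InstaPool"])]),
        workload: BTreeMap::new(),
    };
    state.on_timeslice(150, 1);
    assert_eq!(state.pool_size, 16);
}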

When Instantaneous Coretime Market Revenues are reported for a particular timeslice from the Relay-chain, this information gets placed in the maybe_payout field of the relevant record of InstaPoolHistory.

Payment can be requested for any record in InstaPoolContribution whose begin is the key for a value in InstaPoolHistory whose maybe_payout is Some. In this case, the total_contributions is reduced by the ContributionRecord's mask and a pro rata amount is paid. The ContributionRecord is mutated by incrementing begin, or removed if begin becomes equal to end.
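A simplified, non-normative sketch of a single payout claim under these rules (the Balance alias, the mask_bits field standing in for the CoreMask bit count, and the omission of any adjustment to maybe_payout are assumptions of the sketch):

// Simplified sketch of a single payout claim; names and types are stand-ins
// for the storage items described above.
type Timeslice = u32;
type Balance = u128;

struct ContributionRecord {
    begin: Timeslice,
    end: Timeslice,
    mask_bits: u32, // number of set bits in the record's CoreMask
}

struct HistoryRecord {
    total_contributions: u32,
    maybe_payout: Option<Balance>,
}

/// Claims one timeslice of a contribution. Returns the amount paid and the
/// remaining contribution record, if any.
fn claim_one(
    mut contribution: ContributionRecord,
    history: &mut HistoryRecord,
) -> (Balance, Option<ContributionRecord>) {
    let payout = match history.maybe_payout {
        // Pro rata share of the reported revenue for this timeslice.
        Some(total) => total * contribution.mask_bits as Balance
            / history.total_contributions as Balance,
        // Revenue for this timeslice has not been reported yet.
        None => return (0, Some(contribution)),
    };
    // Reduce the remaining contributions counted for this timeslice.
    history.total_contributions -= contribution.mask_bits;
    // Advance the record by one timeslice; drop it once fully claimed.
    contribution.begin += 1;
    let remaining = if contribution.begin == contribution.end { None } else { Some(contribution) };
    (payout, remaining)
}

fn main() {
    let mut history = HistoryRecord { total_contributions: 16, maybe_payout: Some(1_600) };
    let record = ContributionRecord { begin: 150, end: 200, mask_bits: 16 };
    let (paid, rest) = claim_one(record, &mut history);
    assert_eq!(paid, 1_600);
    assert!(rest.is_some());
}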

Example:

// Simple example with a `u16` `CoreMask` and bulk sold in 100 timeslices.
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// First split @ 50
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_1111_1111u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// Share half of first 50 blocks
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_0000_0000u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 100, mask: 0b0000_0000_1111_1111u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// Sell half of them to Bob
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_0000_0000u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 100, mask: 0b0000_0000_1111_1111u16 } => { end: 150u32, owner: Bob };
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// Bob splits first 10 and assigns them to himself.
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_0000_0000u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 100, mask: 0b0000_0000_1111_1111u16 } => { end: 110u32, owner: Bob };
{ core: 0u16, begin: 110, mask: 0b0000_0000_1111_1111u16 } => { end: 150u32, owner: Bob };
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// Bob shares first 10 3 ways and sells smaller shares to Charlie and Dave
Regions:
{ core: 0u16, begin: 100, mask: 0b1111_1111_0000_0000u16 } => { end: 150u32, owner: Alice };
{ core: 0u16, begin: 100, mask: 0b0000_0000_1100_0000u16 } => { end: 110u32, owner: Charlie };
{ core: 0u16, begin: 100, mask: 0b0000_0000_0011_0000u16 } => { end: 110u32, owner: Dave };
{ core: 0u16, begin: 100, mask: 0b0000_0000_0000_1111u16 } => { end: 110u32, owner: Bob };
{ core: 0u16, begin: 110, mask: 0b0000_0000_1111_1111u16 } => { end: 150u32, owner: Bob };
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
// Bob assigns to his para B, Charlie and Dave assign to their paras C and D; Alice assigns first 50 to A
Regions:
{ core: 0u16, begin: 150, mask: 0b1111_1111_1111_1111u16 } => { end: 200u32, owner: Alice };
Workplan:
(100, 0) => vec![
    { mask: 0b1111_1111_0000_0000u16, task: Assigned(A) },
    { mask: 0b0000_0000_1100_0000u16, task: Assigned(C) },
    { mask: 0b0000_0000_0011_0000u16, task: Assigned(D) },
    { mask: 0b0000_0000_0000_1111u16, task: Assigned(B) },
]
(110, 0) => vec![{ mask: 0b0000_0000_1111_1111u16, task: Assigned(B) }]
// Alice assigns her remaining 50 timeslices to the InstaPool paying herself:
Regions: (empty)
Workplan:
(100, 0) => vec![
    { mask: 0b1111_1111_0000_0000u16, task: Assigned(A) },
    { mask: 0b0000_0000_1100_0000u16, task: Assigned(C) },
    { mask: 0b0000_0000_0011_0000u16, task: Assigned(D) },
    { mask: 0b0000_0000_0000_1111u16, task: Assigned(B) },
]
(110, 0) => vec![{ mask: 0b0000_0000_1111_1111u16, task: Assigned(B) }]
(150, 0) => vec![{ mask: 0b1111_1111_1111_1111u16, task: InstaPool }]
InstaPoolContribution:
{ begin: 150, end: 200, core: 0, mask: 0b1111_1111_1111_1111u16, payee: Alice }
InstaPoolIo:
150 => 16
200 => -16
// Actual notifications to relay chain.
// Assumes:
// - Timeslice is 10 blocks.
// - Timeslice 0 begins at block #1000.
// - Relay needs 10 blocks notice of change.
//
Workload: 0 => vec![]
PoolSize: 0

// Block 990:
Relay <= assign_core(core: 0u16, begin: 1000, assignment: vec![(A, 8), (C, 2), (D, 2), (B, 4)])
Workload: 0 => vec![
    { mask: 0b1111_1111_0000_0000u16, task: Assigned(A) },
    { mask: 0b0000_0000_1100_0000u16, task: Assigned(C) },
    { mask: 0b0000_0000_0011_0000u16, task: Assigned(D) },
    { mask: 0b0000_0000_0000_1111u16, task: Assigned(B) },
]
PoolSize: 0

// Block 1090:
Relay <= assign_core(core: 0u16, begin: 1100, assignment: vec![(A, 8), (B, 8)])
Workload: 0 => vec![
    { mask: 0b1111_1111_0000_0000u16, task: Assigned(A) },
    { mask: 0b0000_0000_1111_1111u16, task: Assigned(B) },
]
PoolSize: 0

// Block 1490:
Relay <= assign_core(core: 0u16, begin: 1500, assignment: vec![(Pool, 16)])
Workload: 0 => vec![
    { mask: 0b1111_1111_1111_1111u16, task: InstaPool },
]
PoolSize: 16
InstaPoolIo:
200 => -16
InstaPoolHistory:
150 => { total_contributions: 16, maybe_payout: None }

// Sometime after block 1500:
InstaPoolHistory:
150 => { total_contributions: 16, maybe_payout: Some(P) }

// Sometime after block 1990:
InstaPoolIo: (empty)
PoolSize: 0
InstaPoolHistory:
150 => { total_contributions: 16, maybe_payout: Some(P0) }
151 => { total_contributions: 16, maybe_payout: Some(P1) }
152 => { total_contributions: 16, maybe_payout: Some(P2) }
...
199 => { total_contributions: 16, maybe_payout: Some(P49) }

// Sometime later still Alice calls for a payout
InstaPoolContribution: (empty)
InstaPoolHistory: (empty)
// Alice gets rewarded P0 + P1 + ... P49.

Rollout

Rollout of this proposal comes in several phases:

  1. Finalise the specifics of implementation; this may be done through a design document or through a well-documented prototype implementation.
  2. Implement the design, including all associated aspects such as unit tests, benchmarks and any support software needed.
  3. Launch of any new parachain, if required.
  4. Formal audit of the implementation and any manual testing.
  5. Announcement to the various stakeholders of the imminent changes.
  6. Software integration and release.
  7. Governance upgrade proposal(s).
  8. Monitoring of the upgrade process.

Performance, Ergonomics and Compatibility

No specific considerations.

Parachains already deployed into the Polkadot UC must have a clear plan of action to migrate to an agile Coretime market.

While this proposal does not introduce documentable features per se, adequate documentation must be provided to potential purchasers of Polkadot Coretime. This SHOULD include any alterations to the Polkadot-SDK software collection.

Testing, Security and Privacy

Regular testing through unit tests, integration tests, manual testnet tests, zombie-net tests and fuzzing SHOULD be conducted.

A regular security review SHOULD be conducted prior to deployment through a review by the Web3 Foundation economic research group.

Any final implementation MUST pass a professional external security audit.

The proposal introduces no new privacy concerns.

RFC-3 proposes a means of implementing the high-level allocations within the Relay-chain.

RFC-5 proposes the API for interacting with Relay-chain.

Additional work should specify the interface for the instantaneous market revenue so that the Coretime-chain can ensure Bulk Coretime placed in the instantaneous market is properly compensated.

Drawbacks, Alternatives and Unknowns

Unknowns include the economic and resource parameterisations:

  • The initial price of Bulk Coretime.
  • The price-change algorithm between Bulk Coretime sales.
  • The price increase per Bulk Coretime period for renewals.
  • The price decrease graph in the Leadin period for Bulk Coretime sales.
  • The initial price of Instantaneous Coretime.
  • The price-change algorithm for Instantaneous Coretime sales.
  • The percentage of cores to be sold as Bulk Coretime.
  • The fate of revenue collected.

Prior Art and References

Robert Habermeier initially wrote on the subject of a blockspace-centric Polkadot in the article Polkadot Blockspace over Blockchains. While not going into detail, the article served as an early reframing piece for moving beyond one-slot-per-chain models and building out secondary market infrastructure for resource allocation.


RFC-5: Coretime Interface

Start Date: 06 July 2023
Description: Interface for manipulating the usage of cores on the Polkadot Ubiquitous Computer.
Authors: Gavin Wood, Robert Habermeier

Summary

In the Agile Coretime model of the Polkadot Ubiquitous Computer, as proposed in RFC-1 and RFC-3, it is necessary for the allocating parachain (envisioned to be one or more pallets on a specialised Brokerage System Chain) to communicate the core assignments to the Relay-chain, which is responsible for ensuring those assignments are properly enacted.

This is a proposal for the interface which will exist around the Relay-chain in order to communicate this information and instructions.

Motivation

The background motivation for this interface is splitting out coretime allocation functions and secondary markets from the Relay-chain onto System parachains. A well-understood and general interface is necessary for ensuring the Relay-chain receives coretime allocation instructions from one or more System chains without introducing dependencies on the implementation details of either side.

Requirements

  • The interface MUST allow the Relay-chain to be scheduled on a low-latency basis.
  • Individual cores MUST be schedulable, either in full to a single task (a ParaId or the Instantaneous Coretime Pool) or to many unique tasks in differing ratios.
  • Typical usage of the interface SHOULD NOT overload the VMP message system.
  • The interface MUST allow for the allocating chain to be notified of all accounting information relevant for making accurate rewards for contributing to the Instantaneous Coretime Pool.
  • The interface MUST allow for Instantaneous Coretime Market Credits to be communicated.
  • The interface MUST allow for the allocating chain to instruct changes to the number of cores which it is able to allocate.
  • The interface MUST allow for the allocating chain to be notified of changes to the number of cores which are able to be allocated by the allocating chain.

Stakeholders

Primary stakeholder sets are:

  • Developers of the Relay-chain core-management logic.
  • Developers of the Brokerage System Chain and its pallets.

Socialization:

The content of this RFC was discussed in the Polkadot Fellows channel.

Explanation

The interface has two sections: The messages which the Relay-chain is able to receive from the allocating parachain (the UMP message types), and messages which the Relay-chain is able to send to the allocating parachain (the DMP message types). These messages are expected to be able to be implemented in a well-known pallet and called with the XCM Transact instruction.

Future work may include these messages being introduced into the XCM standard.

UMP Message Types

request_core_count

Prototype:

fn request_core_count(
    count: u16,
)

Requests the Relay-chain to alter the number of schedulable cores to count. Under normal operation, the Relay-chain SHOULD send a notify_core_count(count) message back.

request_revenue_info_at

Prototype:

fn request_revenue_at(
    when: BlockNumber,
)

Requests that the Relay-chain send a notify_revenue message back at or soon after Relay-chain block number when, whose until parameter is equal to when.

The period into the past for which when is allowed to be may be limited; if so, the limit should be understood on a channel outside of this proposal. In the case that the request cannot be serviced because when is too old a block, then a notify_revenue message must still be returned, but its revenue field may be None.
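For illustration, a hedged sketch of how the Relay-chain side might service such a request, assuming a retention window of 100,000 blocks (the limit suggested later under "Realistic Limits of the Usage"); the function and storage names are invented:

// Hedged sketch of the Relay-chain side of `request_revenue_at`; the names,
// the closure standing in for storage access, and the retention window are
// assumptions for the illustration.
type BlockNumber = u32;
type Balance = u128;

/// Retention window suggested later under "Realistic Limits of the Usage".
const REVENUE_RETENTION: BlockNumber = 100_000;

/// Returns the `(until, revenue)` pair to be sent back via `notify_revenue`.
fn service_revenue_request(
    when: BlockNumber,
    now: BlockNumber,
    lookup_accumulated_revenue: impl Fn(BlockNumber) -> Balance,
) -> (BlockNumber, Option<Balance>) {
    if when.saturating_add(REVENUE_RETENTION) < now {
        // Too old: the information is no longer available, but a reply is
        // still sent, with `revenue` set to `None`.
        (when, None)
    } else {
        (when, Some(lookup_accumulated_revenue(when)))
    }
}

fn main() {
    let (until, revenue) = service_revenue_request(50, 200_000, |_| 123);
    assert_eq!((until, revenue), (50, None));
}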

credit_account

Prototype:

fn credit_account(
    who: AccountId,
    amount: Balance,
)

Instructs the Relay-chain to add the amount of DOT to the Instantaneous Coretime Market Credit account of who.

It is expected that Instantaneous Coretime Market Credit on the Relay-chain is NOT transferrable and only redeemable when used to assign cores in the Instantaneous Coretime Pool.

assign_core

Prototype:

type PartsOf57600 = u16;
enum CoreAssignment {
    InstantaneousPool,
    Task(ParaId),
}
fn assign_core(
    core: CoreIndex,
    begin: BlockNumber,
    assignment: Vec<(CoreAssignment, PartsOf57600)>,
    end_hint: Option<BlockNumber>,
)

Requirements:

assert!(core < core_count);
assert!(assignment.iter().map(|x| x.0).is_sorted());
assert_eq!(assignment.iter().map(|x| x.0).unique().count(), assignment.len());
assert_eq!(assignment.iter().map(|x| x.1).sum(), 57600);

Where:

  • core_count is assumed to be the sole parameter in the last received notify_core_count message.

Instructs the Relay-chain to ensure that the core indexed as core is utilised for a number of assignments in specific ratios given by assignment starting as soon after begin as possible. Core assignments take the form of a CoreAssignment value which can either task the core to a ParaId value or indicate that the core should be used in the Instantaneous Pool. Each assignment comes with a ratio value, represented as the numerator of the fraction with a denominator of 57,600.

If end_hint is Some and the inner is greater than the current block number, then the Relay-chain should optimize in the expectation of receiving a new assign_core(core, ...) message at or prior to the block number of the inner value. Specific functionality should remain unchanged regardless of the end_hint value.

On the choice of denominator: 57,600 is a very composite number which factors into: 2 ** 8, 3 ** 2, 5 ** 2. By using it as the denominator we allow for various useful fractions to be perfectly represented including thirds, quarters, fifths, tenths, 80ths, percent and 256ths.
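For illustration only, the following sketch mirrors the prototype above and splits a core evenly between a task and the Instantaneous Coretime Pool; the para ID, block number and core index are invented for the example.

// Illustrative only (not normative): splitting core 0 evenly between a task
// and the Instantaneous Coretime Pool. The types mirror the prototype above;
// the para id, block number and core index are invented for the example.
type PartsOf57600 = u16;
type ParaId = u32;
type CoreIndex = u16;
type BlockNumber = u32;

#[derive(Debug)]
enum CoreAssignment {
    InstantaneousPool,
    Task(ParaId),
}

fn assign_core(
    core: CoreIndex,
    begin: BlockNumber,
    assignment: Vec<(CoreAssignment, PartsOf57600)>,
    end_hint: Option<BlockNumber>,
) {
    // The parts of each assignment must sum to the full denominator.
    assert_eq!(assignment.iter().map(|(_, parts)| *parts as u32).sum::<u32>(), 57_600);
    println!("core {core} from block {begin}: {assignment:?} (end hint {end_hint:?})");
}

fn main() {
    assign_core(
        0,
        1_000_000,
        vec![
            (CoreAssignment::Task(2_000), 28_800),       // half the core to para 2000
            (CoreAssignment::InstantaneousPool, 28_800), // half to the pool
        ],
        None,
    );
}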

DMP Message Types

notify_core_count

Prototype:

fn notify_core_count(
    count: u16,
)

Indicate that from this block onwards, the range of acceptable values of the core parameter of assign_core message is [0, count). assign_core will be a no-op if provided with a value for core outside of this range.

notify_revenue_info

Prototype:

fn notify_revenue_info(
    until: BlockNumber,
    revenue: Option<Balance>,
)

Provide the amount of revenue accumulated from Instantaneous Coretime Sales from Relay-chain block number last_until to until, not including until itself. last_until is defined as being the until argument of the last notify_revenue message sent, or zero for the first call. If revenue is None, this indicates that the information is no longer available.

This explicitly disregards the possibility of multiple parachains requesting and being notified of revenue information. The Relay-chain must be configured to ensure that only a single revenue information destination exists.

Realistic Limits of the Usage

For request_revenue_info, a successful request should be possible if when is no less than the Relay-chain block number on arrival of the message less 100,000.

For assign_core, a successful request should be possible if begin is no less than the Relay-chain block number on arrival of the message plus 10 and workload contains no more than 100 items.

Performance, Ergonomics and Compatibility

No specific considerations.

Testing, Security and Privacy

Standard Polkadot testing and security auditing applies.

The proposal introduces no new privacy concerns.

RFC-1 proposes a means of determining allocation of Coretime using this interface.

RFC-3 proposes a means of implementing the high-level allocations within the Relay-chain.

Drawbacks, Alternatives and Unknowns

None at present.

Prior Art and References

None.


RFC-0007: System Collator Selection

Start Date: 07 July 2023
Description: Mechanism for selecting collators of system chains.
Authors: Joe Petrowski

Summary

As core functionality moves from the Relay Chain into system chains, so increases the reliance on the liveness of these chains for the use of the network. It is not economically scalable, nor necessary from a game-theoretic perspective, to pay collators large rewards. This RFC proposes a mechanism -- part technical and part social -- for ensuring reliable collator sets that are resilient to attempts to stop any subsystem of the Polkadot protocol.

Motivation

In order to guarantee access to Polkadot's system, the collators on its system chains must propose blocks (provide liveness) and allow all transactions to eventually be included. That is, some collators may censor transactions, but there must exist one collator in the set who will include a given transaction. In fact, all collators may censor varying subsets of transactions, but as long as no transaction is in the intersection of every subset, it will eventually be included. The objective of this RFC is to propose a mechanism to select such a set on each system chain.

While the network as a whole uses staking (and inflationary rewards) to attract validators, collators face different challenges in scale and have lower security assumptions than validators. Regarding scale, there exist many system chains, and it is economically expensive to pay collators a premium. Likewise, any staked DOT for collation is not staked for validation. Since collator sets do not need to meet Byzantine Fault Tolerance criteria, staking as the primary mechanism for collator selection would remove stake that is securing BFT assumptions, making the network less secure.

Another problem with economic scalability relates to the increasing number of system chains, and corresponding increase in need for collators (i.e., increase in collator slots). "Good" (highly available, non-censoring) collators will not want to compete in elections on many chains when they could use their resources to compete in the more profitable validator election. Such dilution decreases the required bond on each chain, leaving them vulnerable to takeover by hostile collator groups.

This RFC proposes a system whereby collation is primarily an infrastructure service, with the on-chain Treasury reimbursing costs of semi-trusted node operators, referred to as "Invulnerables". The system need not trust the individual operators, only that as a set they would be resilient to coordinated attempts to halt a single chain or to censor a particular subset of transactions.

In the case that users do not trust this set, this RFC also proposes that each chain always have available collator positions that can be acquired by anyone by placing a bond.

Requirements

  • System MUST have at least one valid collator for every chain.
  • System MUST allow anyone to become a collator, provided they reserve/hold enough DOT.
  • System SHOULD select a set of collators with reasonable expectation that the set will not collude to censor any subset of transactions.
  • Collators selected by governance SHOULD have a reasonable expectation that the Treasury will reimburse their operating costs.

Stakeholders

  • Infrastructure providers (people who run validator/collator nodes)
  • Polkadot Treasury

Explanation

This protocol builds on the existing Collator Selection pallet and its notion of Invulnerables. Invulnerables are collators (identified by their AccountIds) who will be selected as part of the collator set every session. Operations relating to the management of the Invulnerables are done through privileged, governance origins. The implementation should maintain an API for adding and removing Invulnerable collators.

In addition to Invulnerables, there are also open slots for "Candidates". Anyone can register as a Candidate by placing a fixed bond. However, with a fixed bond and fixed number of slots, there is an obvious selection problem: The slots fill up without any logic to replace their occupants.

This RFC proposes that the collator selection protocol allow Candidates to increase (and decrease) their individual bonds, sort the Candidates according to bond, and select the top N Candidates. The selection and changeover should be coordinated by the session manager.

A FRAME pallet already exists for sorting ("bagging") "top N" groups, the Bags List pallet. This pallet's SortedListProvider should be integrated into the session manager of the Collator Selection pallet.
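As a simplified, non-normative sketch of the intended selection -- all Invulnerables plus the top-N Candidates by bond -- ignoring the actual Collator Selection and Bags List pallet APIs:

// Simplified, non-normative sketch of the proposed selection: all
// Invulnerables plus the top-N Candidates by bond. The real implementation
// would use the Collator Selection pallet's storage and the Bags List
// pallet's `SortedListProvider`; names here are invented for the sketch.
type AccountId = u64;
type Balance = u128;

fn select_collators(
    invulnerables: Vec<AccountId>,
    mut candidates: Vec<(AccountId, Balance)>, // (who, bond)
    desired_candidates: usize,
) -> Vec<AccountId> {
    // Highest bond first; ties broken arbitrarily in this sketch.
    candidates.sort_by(|a, b| b.1.cmp(&a.1));
    invulnerables
        .into_iter()
        .chain(candidates.into_iter().take(desired_candidates).map(|(who, _)| who))
        .collect()
}

fn main() {
    let set = select_collators(
        vec![1, 2, 3],                         // Invulnerables
        vec![(10, 500), (11, 900), (12, 700)], // Candidates and their bonds
        2,                                     // elected slots
    );
    assert_eq!(set, vec![1, 2, 3, 11, 12]);
}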

Despite the lack of apparent economic incentives (i.e., inflation), several reasons exist why one may want to bond funds to participate in the Candidates election, for example:

  • They want to build credibility to be selected as Invulnerable;
  • They want to ensure availability of an application, e.g. a stablecoin issuer might run a collator on Asset Hub to ensure transactions in its asset are included in blocks;
  • They fear censorship themselves, e.g. a voter might think their votes are being censored from governance, so they run a collator on the governance chain to include their votes.

Unlike the fixed-bond mechanism that fills up its Candidates, the election mechanism ensures that anyone can join the collator set by placing the Nth highest bond.

Set Size

In order to achieve the requirements listed under Motivation, it is reasonable to have approximately:

  • 20 collators per system chain,
  • of which 15 are Invulnerable, and
  • five are elected by bond.

Drawbacks

The primary drawback is a reliance on governance for continued treasury funding of infrastructure costs for Invulnerable collators.

Testing, Security, and Privacy

The vast majority of cases can be covered by unit testing. Integration tests should ensure that the Collator Selection UpdateOrigin, which has permission to modify the Invulnerables and desired number of Candidates, can handle updates over XCM from the system's governance location.

Performance, Ergonomics, and Compatibility

This proposal has very little impact on most users of Polkadot, and should improve the performance of system chains by reducing the number of missed blocks.

Performance

As chains have strict PoV size limits, care must be taken in the PoV impact of the session manager. Appropriate benchmarking and tests should ensure that conservative limits are placed on the number of Invulnerables and Candidates.

Ergonomics

The primary group affected is Candidate collators, who, after implementation of this RFC, will need to compete in a bond-based election rather than a race to claim a Candidate spot.

Compatibility

This RFC is compatible with the existing implementation and can be handled via upgrades and migration.

Prior Art and References

Written Discussions

Prior Feedback and Input From

  • Kian Paimani
  • Jeff Burdges
  • Rob Habermeier
  • SR Labs Auditors
  • Current collators including Paranodes, Stake Plus, Turboflakes, Peter Mensik, SIK, and many more.

Unresolved Questions

None at this time.

There may exist in the future system chains for which this model of collator selection is not appropriate. These chains should be evaluated on a case-by-case basis.


RFC-0008: Store parachain bootnodes in relay chain DHT

Start Date: 2023-07-14
Description: Parachain bootnodes shall register themselves in the DHT of the relay chain
Authors: Pierre Krieger

Summary

The full nodes of the Polkadot peer-to-peer network maintain a distributed hash table (DHT), which is currently used for full nodes discovery and validators discovery purposes.

This RFC proposes to extend this DHT to be used to discover full nodes of the parachains of Polkadot.

Motivation

The maintenance of bootnodes has long been an annoyance for everyone.

When a bootnode is newly-deployed or removed, every chain specification must be updated in order to take the update into account. This has led to various non-optimal solutions, such as pulling chain specifications from GitHub repositories. When it comes to RPC nodes, UX developers often have trouble finding up-to-date addresses of parachain RPC nodes. With the ongoing migration from RPC nodes to light clients, similar problems would happen with chain specifications as well.

Furthermore, there exist multiple possible variants of a certain chain specification: with the non-raw storage, with the raw storage, with just the genesis trie root hash, with or without checkpoint, etc. All of this creates confusion. Removing the need for parachain developers to be aware of and manage these different versions would be beneficial.

Since the PeerId and addresses of bootnodes need to be stable, extra maintenance work is required from the chain maintainers. For example, they need to be extra careful when migrating nodes within their infrastructure. In some situations, bootnodes are put behind domain names, which also requires maintenance work.

Because the list of bootnodes in chain specifications is so annoying to modify, the consequence is that the number of bootnodes is rather low (typically between 2 and 15). In order to better resist downtimes and DoS attacks, a better solution would be to use every node of a certain chain as potential bootnode, rather than special-casing some specific nodes.

While this RFC doesn't solve these problems for relay chains, it aims at solving it for parachains by storing the list of all the full nodes of a parachain on the relay chain DHT.

Assuming that this RFC is implemented, and that light clients are used, deploying a parachain wouldn't require more work than registering it onto the relay chain and starting the collators. There wouldn't be any need for special infrastructure nodes anymore.

Stakeholders

This RFC has been opened on my own initiative because I think that this is a good technical solution to a usability problem that many people are encountering and that they don't realize can be solved.

Explanation

The content of this RFC only applies to parachains and parachain nodes that are "Substrate-compatible". It is in no way mandatory for parachains to comply with this RFC.

Note that "Substrate-compatible" is very loosely defined as "implements the same mechanisms and networking protocols as Substrate". The author of this RFC believes that "Substrate-compatible" should be very precisely specified, but there is controversy on this topic.

While a lot of this RFC concerns the implementation of parachain nodes, it makes use of the resources of the Polkadot chain, and as such it is important to describe them in the Polkadot specification.

This RFC adds two mechanisms: a registration in the DHT, and a new networking protocol.

DHT provider registration

This RFC heavily relies on the functionalities of the Kademlia DHT already in use by Polkadot. You can find a link to the specification here.

Full nodes of a parachain registered on Polkadot should register themselves onto the Polkadot DHT as the providers of a key corresponding to the parachain that they are serving, as described in the Content provider advertisement section of the specification. This uses the ADD_PROVIDER system of libp2p-kademlia.

This key is: sha256(concat(scale_compact(para_id), randomness)) where the value of randomness can be found in the randomness field when calling the BabeApi_currentEpoch function. For example, for a para_id equal to 1000, and at the time of writing of this RFC (July 14th 2023 at 09:13 UTC), it is sha256(0xa10f12872447958d50aa7b937b0106561a588e0e2628d33f81b5361b13dbcf8df708), which is equal to 0x483dd8084d50dbbbc962067f216c37b627831d9339f5a6e426a32e3076313d87.

In order to avoid downtime when the key changes, parachain full nodes should also register themselves as a secondary key that uses a value of randomness equal to the randomness field when calling BabeApi_nextEpoch.

Implementers should be aware that their implementation of Kademlia might already hash the key before XOR'ing it. The key is not meant to be hashed twice.

The compact SCALE encoding has been chosen in order to avoid problems related to the number of bytes and endianness of the para_id.
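For illustration, a sketch of this key derivation, assuming the parity-scale-codec and sha2 crates; the randomness value would be taken from BabeApi_currentEpoch (or BabeApi_nextEpoch for the secondary key):

// Illustrative sketch of the key derivation described above, assuming the
// `parity-scale-codec` and `sha2` crates.
use parity_scale_codec::{Compact, Encode};
use sha2::{Digest, Sha256};

/// `randomness` is taken from the `randomness` field of `BabeApi_currentEpoch`
/// (or `BabeApi_nextEpoch` for the secondary key).
fn paranode_dht_key(para_id: u32, randomness: &[u8]) -> Vec<u8> {
    // Compact SCALE encoding of the para_id; for 1000 this is 0xa10f.
    let mut preimage = Compact(para_id).encode();
    preimage.extend_from_slice(randomness);
    // The resulting key is used as-is with ADD_PROVIDER; it is not meant to be
    // hashed a second time by the Kademlia implementation.
    Sha256::digest(&preimage).to_vec()
}

fn main() {
    let randomness = [0u8; 32]; // placeholder; use the real epoch randomness
    let key = paranode_dht_key(1000, &randomness);
    assert_eq!(key.len(), 32);
}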

New networking protocol

A new request-response protocol should be added, whose name is /91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3/paranode (that hexadecimal number is the genesis hash of the Polkadot chain, and should be adjusted appropriately for Kusama and others).

The request consists of a SCALE-compact-encoded para_id. For example, for a para_id equal to 1000, this is 0xa10f.

Note that because this is a request-response protocol, the request is always prefixed with its length in bytes. While the body of the request is simply the SCALE-compact-encoded para_id, the data actually sent onto the substream is both the length and body.

The response consists in a protobuf struct, defined as:

syntax = "proto2";

message Response {
    // Peer ID of the node on the parachain side.
    bytes peer_id = 1;

    // Multiaddresses of the parachain side of the node. The list and format are the same as for the `listenAddrs` field of the `identify` protocol.
    repeated bytes addrs = 2;

    // Genesis hash of the parachain. Used to determine the name of the networking protocol to connect to the parachain. Untrusted.
    bytes genesis_hash = 3;

    // So-called "fork ID" of the parachain. Used to determine the name of the networking protocol to connect to the parachain. Untrusted.
    optional string fork_id = 4;
};

The maximum size of a response is set to an arbitrary 16kiB. The responding side should make sure to conform to this limit. Given that fork_id is typically very small and that the only variable-length field is addrs, this is easily achieved by limiting the number of addresses.

Implementers should be aware that addrs might be very large, and are encouraged to limit the number of addrs to an implementation-defined value.

Drawbacks

The peer_id and addrs fields are in theory not strictly needed, as the PeerId and addresses could always be equal to the PeerId and addresses of the node being registered as the provider and serving the response. However, the Cumulus implementation currently uses two different networking stacks, one for the parachain and one for the relay chain, using two separate PeerIds and addresses, and as such the PeerId and addresses of the other networking stack must be indicated. Asking them to use only one networking stack wouldn't be feasible in a realistic time frame.

The values of the genesis_hash and fork_id fields cannot be verified by the requester and are expected to be unused at the moment. Instead, a client that desires connecting to a parachain is expected to obtain the genesis hash and fork ID of the parachain from the parachain chain specification. These fields are included in the networking protocol nonetheless in case an acceptable solution is found in the future, and in order to allow use cases such as discovering parachains in a not-strictly-trusted way.

Testing, Security, and Privacy

Because not all nodes want to be used as bootnodes, implementers are encouraged to provide a way to disable this mechanism. However, it is very much encouraged to leave this mechanism on by default for all parachain nodes.

This mechanism doesn't add or remove any security by itself, as it relies on existing mechanisms. However, if the principle of chain specification bootnodes is entirely replaced with the mechanism described in this RFC (which is the objective), then it becomes important whether the mechanism in this RFC can be abused in order to make a parachain unreachable.

Due to the way Kademlia works, it would become the responsibility of the 20 Polkadot nodes whose sha256(peer_id) is closest to the key (described in the explanations section) to store the list of bootnodes of each parachain. Furthermore, when a large number of providers (here, a provider is a bootnode) are registered, only the providers closest to the key are kept, up to a certain implementation-defined limit.

For this reason, an attacker can abuse this mechanism by randomly generating libp2p PeerIds until they find the 20 entries closest to the key representing the target parachain. They are then in control of the parachain bootnodes. Because the key changes periodically and isn't predictable, and assuming that the Polkadot DHT is sufficiently large, it is not realistic for an attack like this to be maintained in the long term.

Furthermore, parachain clients are expected to cache a list of known good nodes on their disk. If the mechanism described in this RFC went down, it would only prevent new nodes from accessing the parachain, while clients that have connected before would not be affected.

Performance, Ergonomics, and Compatibility

Performance

The DHT mechanism generally has a low overhead, especially given that publishing providers is done only every 24 hours.

Doing a Kademlia iterative query then sending a provider record shouldn't take more than around 50 kiB in total of bandwidth for the parachain bootnode.

Assuming 1000 parachain full nodes, the 20 Polkadot full nodes corresponding to a specific parachain will each receive a sudden spike of a few megabytes of networking traffic when the key rotates. Again, this is relatively negligible. If this becomes a problem, one can add a random delay before a parachain full node registers itself to be the provider of the key corresponding to BabeApi_next_epoch.

Maybe the biggest uncertainty is the traffic that the 20 Polkadot full nodes will receive from light clients that desire knowing the bootnodes of a parachain. Light clients are generally encouraged to cache the peers that they use between restarts, so they should only query these 20 Polkadot full nodes at their first initialization. If this ever becomes a problem, this value of 20 is an arbitrary constant that can be increased for more redundancy.

Ergonomics

Irrelevant.

Compatibility

Irrelevant.

Prior Art and References

None.

Unresolved Questions

While it fundamentally doesn't change much to this RFC, using BabeApi_currentEpoch and BabeApi_nextEpoch might be inappropriate. I'm not familiar enough with good practices within the runtime to have an opinion here. Should it be an entirely new pallet?

It is possible that in the future a client could connect to a parachain without having to rely on a trusted parachain specification.


RFC-0010: Burn Coretime Revenue

Start Date: 19.07.2023
Description: Revenue from Coretime sales should be burned
Authors: Jonas Gehrlein

Summary

The Polkadot UC will generate revenue from the sale of available Coretime. The question then arises: how should we handle these revenues? Broadly, there are two reasonable paths – burning the revenue and thereby removing it from total issuance, or diverting it to the Treasury. This Request for Comment (RFC) presents arguments favoring burning as the preferred mechanism for handling revenues from Coretime sales.

Motivation

How to handle the revenue accrued from Coretime sales is an important economic question that influences the value of DOT and should be properly discussed before deciding for either of the options. Now is the best time to start this discussion.

Stakeholders

Polkadot DOT token holders.

Explanation

This RFC discusses potential benefits of burning the revenue accrued from Coretime sales instead of diverting it to the Treasury. The arguments for this follow.

It's in the interest of the Polkadot community to have a consistent and predictable Treasury income, because volatility in the inflow can be damaging, especially in situations when it is insufficient. As such, this RFC operates under the presumption of a steady and sustainable Treasury income flow, which is crucial for the Polkadot community's stability. The assurance of a predictable Treasury income, as outlined in a prior discussion here, or through other equally effective measures, serves as a baseline assumption for this argument.

Consequently, we need not concern ourselves with this particular issue here. This naturally begs the question - why should we introduce additional volatility to the Treasury by aligning it with the variable Coretime sales? It's worth noting that Coretime revenues often exhibit an inverse relationship with periods when Treasury spending should ideally be ramped up. During periods of low Coretime utilization (indicated by lower revenue), Treasury should spend more on projects and endeavours to increase the demand for Coretime. This pattern underscores that Coretime sales, by their very nature, are an inconsistent and unpredictable source of funding for the Treasury. Given the importance of maintaining a steady and predictable inflow, it's unnecessary to rely on another volatile mechanism. Some might argue that we could have both: a steady inflow (from inflation) and some added bonus from Coretime sales, but burning the revenue would offer further benefits as described below.

  • Balancing Inflation: While DOT as a utility token inherently profits from a (reasonable) net inflation, it also benefits from a deflationary force that functions as a counterbalance to the overall inflation. Right now, the only mechanism on Polkadot that burns fees is the one for underutilized DOT in the Treasury. Finding other, more direct targets for burns makes sense, and the Coretime market is a good option.

  • Clear incentives: By burning the revenue accrued from Coretime sales, prices paid by buyers are clearly costs. This removes distortion from the market that might arise when the paid tokens end up in some other place within the network. In that case, some actors might have secondary motives of influencing the price of Coretime sales, because they benefit down the line. For example, actors that actively participate in the Coretime sales are likely to also benefit from a higher Treasury balance, because they might frequently request funds for their projects. While those effects might appear far-fetched, they could accumulate. Burning the revenues makes sure that the prices paid are clearly costs to the actors themselves.

  • Collective Value Accrual: Following the previous argument, burning the revenue also generates some externality, because it reduces the overall issuance of DOT and thereby increases the value of each remaining token. In contrast to the aforementioned argument, this benefits all token holders collectively and equally. Therefore, I'd consider this the preferable option, because burning lets all token holders participate in Polkadot's success as Coretime usage increases.


RFC-0012: Process for Adding New System Collectives

Start Date: 24 July 2023
Description: A process for adding new (and removing existing) system collectives.
Authors: Joe Petrowski

Summary

Since the introduction of the Collectives parachain, many groups have expressed interest in forming new -- or migrating existing groups into -- on-chain collectives. While adding a new collective is relatively simple from a technical standpoint, the Fellowship will need to merge new pallets into the Collectives parachain for each new collective. This RFC proposes a means for the network to ratify a new collective, thus instructing the Fellowship to instate it in the runtime.

Motivation

Many groups have expressed interest in representing collectives on-chain. Some of these include:

  • Parachain technical fellowship (new)
  • Fellowship(s) for media, education, and evangelism (new)
  • Polkadot Ambassador Program (existing)
  • Anti-Scam Team (existing)

Collectives that form part of the core Polkadot protocol should have a mandate to serve the Polkadot network. However, as part of the Polkadot protocol, the Fellowship, in its capacity of maintaining system runtimes, will need to include modules and configurations for each collective.

Once a group has developed a value proposition for the Polkadot network, it should have a clear path to having its collective accepted on-chain as part of the protocol. Acceptance should direct the Fellowship to include the new collective with a given initial configuration into the runtime. However, the network, not the Fellowship, should ultimately decide which collectives are in the interest of the network.

Stakeholders

  • Polkadot stakeholders who would like to organize on-chain.
  • Technical Fellowship, in its role of maintaining system runtimes.

Explanation

The group that wishes to operate an on-chain collective should publish the following information:

  • Charter, including the collective's mandate and how it benefits Polkadot. This would be similar to the Fellowship Manifesto.
  • Seeding recommendation.
  • Member types, i.e. should members be individuals or organizations.
  • Member management strategy, i.e. how do members join and get promoted, if applicable.
  • How much, if at all, members should get paid in salary.
  • Any special origins this Collective should have outside itself. For example, the Fellowship can whitelist calls for referenda via the WhitelistOrigin.

This information could all be in a single document or, for example, a GitHub repository.

After publication, members should seek feedback from the community and Technical Fellowship, and make any revisions needed. When the collective believes the proposal is ready, they should bring a remark with the text APPROVE_COLLECTIVE("{collective name}, {commitment}") to a Root origin referendum. The proposer should provide instructions for generating commitment. The passing of this referendum would be unequivocal direction to the Fellowship that this collective should be part of the Polkadot runtime.

Note: There is no need for a REJECT referendum. Proposals that have not been approved are simply not included in the runtime.

Removing Collectives

If someone believes that an existing collective is not acting in the interest of the network or in accordance with its charter, they should likewise have a means to instruct the Fellowship to remove that collective from Polkadot.

An on-chain remark from the Root origin with the text REMOVE_COLLECTIVE("{collective name}, {para ID}, [{pallet indices}]") would instruct the Fellowship to remove the collective via the listed pallet indices on paraId. Should someone want to construct such a remark, they should have a reasonable expectation that a member of the Fellowship would help them identify the pallet indices associated with a given collective, whether or not the Fellowship member agrees with removal.

Collective removal may also come with other governance calls, for example voiding any scheduled Treasury spends that would fund the given collective.

Drawbacks

Passing a Root origin referendum is slow. However, given the network's investment (in terms of code maintenance and salaries) in a new collective, this is an appropriate step.

Testing, Security, and Privacy

No impacts.

Performance, Ergonomics, and Compatibility

Generally all new collectives will be in the Collectives parachain. Thus, performance impacts should strictly be limited to this parachain and not affect others. As the majority of logic for collectives is generalized and reusable, we expect most collectives to be instances of similar subsets of modules. That is, new collectives should generally be compatible with UIs and other services that provide collective-related functionality, with little modification needed to support new ones.

Prior Art and References

The launch of the Technical Fellowship, see the initial forum post.

Unresolved Questions

None at this time.


RFC-0013: Prepare Core runtime API for MBMs

Start Date: July 24, 2023
Description: Prepare the Core Runtime API for Multi-Block-Migrations
Authors: Oliver Tale-Yazdi

Summary

Introduces breaking changes to the Core runtime API by letting Core::initialize_block return an enum. The version of Core is bumped from 4 to 5.

Motivation

The main feature that motivates this RFC is Multi-Block-Migrations (MBMs); these make it possible to split a migration over multiple blocks.
Further, it would be nice not to hinder the possibility of implementing a new hook, poll, which runs at the beginning of the block when there are no MBMs and has access to AllPalletsWithSystem. This hook can then be used to replace the use of on_initialize and on_finalize for non-deadline-critical logic.
In a similar fashion, it should not hinder the future addition of a System::PostInherents callback that always runs after all inherents have been applied.

Stakeholders

  • Substrate Maintainers: They have to implement this, including tests, audit and maintenance burden.
  • Polkadot Runtime developers: They will have to adapt the runtime files to this breaking change.
  • Polkadot Parachain Teams: They have to adapt to the breaking changes but then eventually have multi-block migrations available.

Explanation

Core::initialize_block

This runtime API function is changed from returning () to ExtrinsicInclusionMode:

fn initialize_block(header: &<Block as BlockT>::Header) -> ExtrinsicInclusionMode;

With ExtrinsicInclusionMode defined as:

enum ExtrinsicInclusionMode {
  /// All extrinsics are allowed in this block.
  AllExtrinsics,
  /// Only inherents are allowed in this block.
  OnlyInherents,
}

A block author MUST respect the ExtrinsicInclusionMode that is returned by initialize_block. The runtime MUST reject blocks that have non-inherent extrinsics in them while OnlyInherents was returned.
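A non-normative sketch of how a block author might react to the returned value (the Extrinsic type and its is_inherent flag are assumptions for the illustration):

// Non-normative sketch of how a block author might react to the value
// returned by `initialize_block`. `Extrinsic::is_inherent` and the overall
// builder shape are assumptions for the illustration.
enum ExtrinsicInclusionMode {
    AllExtrinsics,
    OnlyInherents,
}

struct Extrinsic {
    is_inherent: bool,
}

fn select_extrinsics(mode: ExtrinsicInclusionMode, pool: Vec<Extrinsic>) -> Vec<Extrinsic> {
    match mode {
        // Normal operation: inherents plus transactions from the pool.
        ExtrinsicInclusionMode::AllExtrinsics => pool,
        // Lock-down (e.g. during a multi-block migration): only inherents may
        // be included; the runtime MUST reject blocks that violate this.
        ExtrinsicInclusionMode::OnlyInherents => {
            pool.into_iter().filter(|xt| xt.is_inherent).collect()
        }
    }
}

fn main() {
    let pool = vec![Extrinsic { is_inherent: true }, Extrinsic { is_inherent: false }];
    let included = select_extrinsics(ExtrinsicInclusionMode::OnlyInherents, pool);
    assert_eq!(included.len(), 1);
}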

Coming back to the motivations and how they can be implemented with this runtime API change:

1. Multi-Block-Migrations: The runtime is being put into lock-down mode for the duration of the migration process by returning OnlyInherents from initialize_block. This ensures that no user provided transaction can interfere with the migration process. It is absolutely necessary to ensure this, otherwise a transaction could call into un-migrated storage and violate storage invariants.

2. poll is possible by using apply_extrinsic as entry-point and not hindered by this approach. It would not be possible to use a pallet inherent like System::last_inherent to achieve this for two reasons: First is that pallets do not have access to AllPalletsWithSystem which is required to invoke the poll hook on all pallets. Second is that the runtime does currently not enforce an order of inherents.

3. System::PostInherents can be done in the same manner as poll.

Drawbacks

The previous drawback of cementing the order of inherents has been addressed and removed by redesigning the approach. No further drawbacks have been identified thus far.

Testing, Security, and Privacy

The new logic of initialize_block can be tested by checking that the block-builder will skip transactions when OnlyInherents is returned.

Security: n/a

Privacy: n/a

Performance, Ergonomics, and Compatibility

Performance

The performance overhead is minimal in the sense that no clutter was added after fulfilling the requirements. The only performance difference is that initialize_block also returns an enum that needs to be passed through the WASM boundary. This should be negligible.

Ergonomics

The new interface allows for more extensible runtime logic. In the future, this will be utilized for multi-block-migrations which should be a huge ergonomic advantage for parachain developers.

Compatibility

The advice here is OPTIONAL and outside of the RFC. To not degrade user experience, it is recommended to ensure that an updated node can still import historic blocks.

Prior Art and References

The RFC is currently being implemented in polkadot-sdk#1781 (formerly substrate#14275). Related issues and merge requests:

Unresolved Questions

Please suggest a better name for BlockExecutiveMode. We already tried: RuntimeExecutiveMode, ExtrinsicInclusionMode. The names of the modes Normal and Minimal were also called AllExtrinsics and OnlyInherents, so if you have naming preferences, please post them.
=> renamed to ExtrinsicInclusionMode

Is post_inherents more consistent instead of last_inherent? Then we should change it.
=> renamed to last_inherent

The long-term future here is to move the block building logic into the runtime. Currently there is a tight dance between the block author and the runtime; the author has to call into different runtime functions in quick succession and exact order. Any misstep causes the block to be invalid.
This can be unified and simplified by moving both parts into the runtime.


RFC-0014: Improve locking mechanism for parachains

Start Date: July 25, 2023
Description: Improve locking mechanism for parachains
Authors: Bryan Chen

Summary

This RFC proposes a set of changes to the parachain lock mechanism. The goal is to allow a parachain manager to self-service the parachain without root track governance action.

This is achieved by removing the existing lock conditions and only locking a parachain when:

  • A parachain manager explicitly locks the parachain
  • OR a parachain block is produced successfully

Motivation

The manager of a parachain has permission to manage the parachain when the parachain is unlocked. Parachains are by default locked when onboarded to a slot. This requires that the parachain wasm/genesis be valid; otherwise, a root track governance action on the relaychain is required to update the parachain.

The current reliance on root track governance actions for managing parachains can be time-consuming and burdensome. This RFC aims to address this technical difficulty by allowing parachain managers to take self-service actions, rather than relying on general public voting.

The key scenarios this RFC seeks to improve are:

  1. Rescue a parachain with invalid wasm/genesis.

While we have various resources and templates to build a new parachain, it is still not a trivial task. It is very easy to make a mistake, resulting in an invalid wasm/genesis. With the lack of tools to help detect those issues1, it is very likely that the issues are only discovered after the parachain is onboarded on a slot. In this case, the parachain is locked and the parachain team has to go through a lengthy governance process to rescue the parachain.

  2. Perform lease renewal for an existing parachain.

One way to perform lease renewal for a parachain is by doing a lease swap with another parachain with a longer lease. This requires that the other parachain be operational and able to perform an XCM Transact call into the relaychain to dispatch the swap call. Combined with the overhead of setting up a new parachain, this is a time-consuming and expensive process. Ideally, the parachain manager should be able to perform the lease swap call without having a running parachain2.

Requirements

  • A parachain manager SHOULD be able to rescue a parachain by updating the wasm/genesis without root track governance action.
  • A parachain manager MUST NOT be able to update the wasm/genesis if the parachain is locked.
  • A parachain SHOULD be locked when it successfully produces its first block.
  • A parachain manager MUST be able to perform lease swap without having a running parachain.

Stakeholders

  • Parachain teams
  • Parachain users

Explanation

Status quo

A parachain can either be locked or unlocked3. With the parachain locked, the parachain manager does not have any privileges. With the parachain unlocked, the parachain manager can perform the following actions via the paras_registrar pallet:

  • deregister: Deregister a Para Id, freeing all data and returning any deposit.
  • swap: Initiate or confirm lease swap with another parachain.
  • add_lock: Lock the parachain.
  • schedule_code_upgrade: Schedule a parachain upgrade to update parachain wasm.
  • set_current_head: Set the parachain's current head.

Currently, a parachain can be locked under any of the following conditions:

  • Via the add_lock call, which can be dispatched by the relaychain Root origin, the parachain, or the parachain manager.
  • When a parachain is onboarded on a slot4.
  • When a crowdloan is created.

Only the relaychain Root origin or the parachain itself can unlock the lock5.

This creates an issue: if the parachain is unable to produce blocks, the parachain manager is unable to do anything and has to rely on the relaychain Root origin to manage the parachain.

Proposed changes

This RFC proposes to change the lock and unlock conditions.

A parachain can be locked only under the following conditions:

  • Relaychain governance MUST be able to lock any parachain.
  • A parachain MUST be able to lock its own lock.
  • A parachain manager SHOULD be able to lock the parachain.
  • A parachain SHOULD be locked when it successfully produces a block for the first time.

A parachain can be unlocked only under the following conditions:

  • Relaychain governance MUST be able to unlock any parachain.
  • A parachain MUST be able to unlock its own lock.

Note that creating a crowdloan MUST NOT lock the parachain, and onboarding a parachain SHOULD NOT lock it until a new block is successfully produced.

Migration

A one-off migration is proposed in order to apply this change retrospectively, so that existing parachains can also benefit from this RFC. This migration will unlock parachains that satisfy all of the following conditions (a sketch follows the list):

  • The parachain is locked.
  • The parachain has never produced a block, including under expired leases.
  • The parachain manager has never explicitly locked the parachain.
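
A minimal, non-normative sketch of this migration, assuming hypothetical field names rather than the actual paras_registrar storage layout:

#![allow(unused)]
fn main() {
// Hypothetical shape of the per-parachain state relevant to this migration.
struct ParaInfo {
    locked: bool,
    ever_produced_block: bool,
    manager_explicitly_locked: bool,
}

// One-off migration: unlock parachains matching all three conditions above.
fn migrate(paras: &mut [ParaInfo]) {
    for para in paras.iter_mut() {
        if para.locked && !para.ever_produced_block && !para.manager_explicitly_locked {
            para.locked = false;
        }
    }
}
}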

Drawbacks

Parachain locks are designed in such a way as to ensure the decentralization of parachains. If a parachain is not locked when it should be, it could introduce a centralization risk for new parachains.

For example, one possible scenario is that a collective may decide to launch a parachain fully decentralized. However, if the parachain is unable to produce blocks, the parachain manager will be able to replace the wasm and genesis without the consent of the collective.

This risk is considered tolerable, as it requires the wasm/genesis to be invalid in the first place. It is not yet practically possible to develop a parachain without any centralization risk.

Another case is that a parachain team may decide to use a crowdloan to help secure a slot lease. Previously, creating a crowdloan would lock the parachain. This means crowdloan participants would know exactly the genesis of the parachain for the crowdloan they are participating in. However, this actually provides little assurance to crowdloan participants. For example, if the genesis block is determined before a crowdloan is started, it is not possible to have an on-chain mechanism to enforce reward distributions for crowdloan participants. They always have to rely on the parachain team to fulfill the promise after the parachain is live.

Existing operational parachains will not be impacted.

Testing, Security, and Privacy

The implementation of this RFC will be tested on testnets (Rococo and Westend) first.

An audit may be required to ensure the implementation does not introduce unwanted side effects.

There are no privacy-related concerns.

Performance

This RFC should not introduce any performance impact.

Ergonomics

This RFC should improve the developer experience for new and existing parachain teams.

Compatibility

This RFC is fully compatible with existing interfaces.

Prior Art and References

  • Parachain Slot Extension Story: https://github.com/paritytech/polkadot/issues/4758
  • Allow parachain to renew lease without actually run another parachain: https://github.com/paritytech/polkadot/issues/6685
  • Always treat parachain that never produced block for a significant amount of time as unlocked: https://github.com/paritytech/polkadot/issues/7539

Unresolved Questions

None at this stage.

This RFC is only intended to be a short-term solution. Slots will be removed in the future, and the lock mechanism is likely to be replaced with a more generalized parachain management & recovery system. Therefore, the long-term impacts of this RFC are not considered.

1: https://github.com/paritytech/cumulus/issues/377
2: https://github.com/paritytech/polkadot/issues/6685
3: https://github.com/paritytech/polkadot/blob/994af3de79af25544bf39644844cbe70a7b4d695/runtime/common/src/paras_registrar.rs#L51-L52C15
4: https://github.com/paritytech/polkadot/blob/994af3de79af25544bf39644844cbe70a7b4d695/runtime/common/src/paras_registrar.rs#L473-L475
5: https://github.com/paritytech/polkadot/blob/994af3de79af25544bf39644844cbe70a7b4d695/runtime/common/src/paras_registrar.rs#L333-L340

(source)

Table of Contents

RFC-0022: Adopt Encointer Runtime

Start DateAug 22nd 2023
DescriptionPermanently move the Encointer runtime into the Fellowship runtimes repo.
Authors@brenzi for Encointer Association, 8000 Zurich, Switzerland

Summary

Encointer has been a system chain on Kusama since January 2022 and has been developed and maintained by the Encointer Association. This RFC proposes to treat Encointer like any other system chain and include it in the fellowship repo with this PR.

Motivation

Encointer does not seek to be in control of its runtime repository. As a decentralized system, the fellowship has a more suitable structure to maintain a system chain runtime repo than the Encointer association does.

Also, Encointer aims to update its runtime in batches with other system chains in order to have consistency for interoperability across system chains.

Stakeholders

  • Fellowship: Will continue to take upon them the review and auditing work for the Encointer runtime, but the process is streamlined with other system chains and therefore less time-consuming compared to the separate repo and CI process we currently have.
  • Kusama Network: Tokenholders can easily see the changes of all system chains in one place.
  • Encointer Association: Further decentralization of Encointer Network necessities, such as devops.
  • Encointer devs: Being able to work directly in the Fellowship runtimes repo to streamline and synergize with other developers.

Explanation

Our PR has all details about our runtime and how we would move it into the fellowship repo.

Noteworthy: All Encointer-specific pallets will still be located in encointer's repo for the time being: https://github.com/encointer/pallets

It will still be the duty of the Encointer team to keep its runtime up to date and provide adequate test fixtures. Frequent dependency bumps with Polkadot releases would be beneficial for interoperability and could be streamlined with other system chains but that will not be a duty of fellowship. Whenever possible, all system chains could be upgraded jointly (including Encointer) with a batch referendum.

Further notes:

  • Encointer will publish all its crates on crates.io.
  • Encointer does not carry out external auditing of its runtime or pallets. It would be beneficial, but not a requirement from our side, if Encointer could join the auditing process of other system chains.

Drawbacks

Unlike other system chains, development and maintenance of the Encointer Network is mainly financed by the KSM Treasury and possibly the DOT Treasury in the future. Encointer is dedicated to maintaining its network and runtime code for as long as possible, but there is a dependency on funding which is not in the hands of the fellowship. The only risk in the context of funding, however, is that the Encointer runtime will see less frequent updates if there is less funding.

Testing, Security, and Privacy

No changes to the existing system are proposed. Only changes to how maintenance is organized.

Performance, Ergonomics, and Compatibility

No changes

Prior Art and References

Existing Encointer runtime repo

Unresolved Questions

None identified

More info on Encointer: encointer.org

(source)

Table of Contents

RFC-0026: Sassafras Consensus Protocol

Start DateSeptember 06, 2023
DescriptionSassafras consensus protocol specification
AuthorsDavide Galassi

Abstract

Sassafras is a novel consensus protocol designed to address the recurring fork-related challenges encountered in other lottery-based protocols.

The protocol aims to create a mapping between each epoch's slots and the authorities set while ensuring that the identity of authorities assigned to the slots remains undisclosed until the slot is actively claimed during block production.

1. Motivation

Sassafras Protocol has been rigorously described in a comprehensive research paper authored by the Web3 Foundation research team.

This RFC is primarily intended to detail the critical implementation aspects vital for ensuring interoperability and to clarify certain aspects that are left open by the research paper and thus subject to interpretation during implementation.

1.1. Relevance to Implementors

This RFC focuses on providing implementors with the necessary insights into the core protocol's operation.

In instances of inconsistency between this document and the research paper, this RFC should be considered authoritative to eliminate ambiguities and ensure interoperability.

1.2. Supporting Sassafras for Polkadot

Beyond promoting interoperability, this RFC also aims to facilitate the implementation of Sassafras within the greater Polkadot ecosystem.

Although the specifics of deployment strategies are beyond the scope of this document, it lays the groundwork for the integration of Sassafras.

2. Stakeholders

The protocol has a central role in the next generation block authoring consensus systems.

2.1. Blockchain Core Developers

Developers responsible for creating blockchains who intend to leverage the benefits offered by the Sassafras Protocol.

2.2. Polkadot Ecosystem Contributors

Developers contributing to the Polkadot ecosystem, both relay-chain and para-chains.

3. Notation

This section outlines the notation adopted throughout this document to ensure clarity and consistency.

3.1. Data Structures Definitions

Data structures are mostly defined using standard ASN.1 syntax with few exceptions.

To ensure interoperability of serialized structures, the order of the fields must match the definitions found within this specification.

3.2. Types Alias

  • Unsigned integer: Unsigned ::= INTEGER (0..MAX)
  • n-bit unsigned integer: Unsigned<n> ::= INTEGER (0..2^n - 1)
    • 8-bit unsigned integer (octet) Unsigned8 ::= Unsigned<8>
    • 32-bit unsigned integer: Unsigned32 ::= Unsigned<32>
    • 64-bit unsigned integer: Unsigned64 ::= Unsigned<64>
  • Non-homogeneous sequence (struct/tuple): Sequence ::= SEQUENCE
  • Variable length homogeneous sequence (vector): Sequence<T> ::= SEQUENCE OF T
  • Fixed length homogeneous sequence (array): Sequence<T,n> ::= Sequence<T> (SIZE(n))
  • Variable length octet-string: OctetString ::= Sequence<Unsigned8>
  • Fixed length octet-string: OctetString<n> ::= Sequence<Unsigned8, n>

3.3. Pseudo-Code

It is convenient to make use of code snippets as part of the protocol description. As a convention, the code is formatted in a style similar to Rust, and can make use of the following set of predefined procedures:

Sequences

  • CONCAT(x₀: OctetString, ..., xₖ: OctetString) -> OctetString: Concatenates the input octet-strings as a new octet string.

  • LENGTH(s: Sequence) -> Unsigned: The number of elements in the sequence s.

  • GET(s: Sequence<T>, i: Unsigned) -> T: The i-th element of the sequence s.

  • PUSH(s: Sequence<T>, x: T): Appends x as the new last element of the sequence s.

  • POP(s: Sequence<T>) -> T: Extracts and returns the last element of the sequence s.

Codec

  • ENCODE(x: T) -> OctetString: Encodes x as an OctetString according to SCALE codec.

  • DECODE<T>(x: OctetString) -> T: Decodes x as a type T object according to SCALE codec.

Other

  • BLAKE2(x: OctetString) -> OctetString<32>: Standard Blake2b hash of x with 256-bit digest.

3.4. Incremental Introduction of Types and Functions

More types and helper functions are introduced incrementally as they become relevant within the document's context.

4. Protocol Introduction

The timeline is segmented into a sequentially ordered sequence of slots. This entire sequence of slots is further partitioned into distinct segments known as epochs.

Sassafras aims to map each slot within a target epoch to the authorities scheduled for that epoch, utilizing a ticketing system.

The core protocol operation can be roughly divided into four phases.

4.1. Submission of Candidate Tickets

Each authority scheduled for the target epoch generates and shares a set of candidate tickets. Every ticket has an unbiasable pseudo random score and is bundled with an anonymous proof of validity.

4.2. Validation of Candidate Tickets

Each candidate ticket undergoes a validation process for the associated validity proof and compliance with other protocol-specific constraints. Valid tickets are persisted on-chain.

4.3. Tickets Slots Binding

After collecting all valid candidate tickets and before the beginning of the target epoch, a deterministic method is used to uniquely associate a subset of these tickets to the slots of the target epoch.

4.4. Claim of Ticket Ownership

During the block production phase of the target epoch, the author is required to prove ownership of the ticket associated to the block's slot. This step discloses the identity of the ticket owner.

5. Bandersnatch VRFs Cryptographic Primitives

This section is not intended to serve as an exhaustive exploration of the mathematically intensive foundations of the cryptographic primitive. Rather, its primary aim is to offer a concise and accessible explanation of the primitive's role and interface relevant within the scope of the protocol. For a more detailed explanation, refer to the Bandersnatch VRFs technical specification.

Bandersnatch VRF comes in two variants:

  • Bare VRF: Extension to the IETF ECVRF RFC 9381,
  • Ring VRF: Anonymous signatures leveraging zk-SNARK.

Together with the input, which determines the VRF output, both variants offer the capability to sign some arbitrary additional data (extra) which doesn't contribute to the VRF output.

5.1 Bare VRF Interface

VRF signature construction.

#![allow(unused)]
fn main() {
    fn vrf_sign(
        secret: SecretKey,
        input: OctetString,
        extra: OctetString,
    ) -> VrfSignature
}

VRF signature verification. Returns a Boolean indicating the validity of the signature (1 on success).

#![allow(unused)]
fn main() {
    fn vrf_verify(
        public: PublicKey,
        input: OctetString,
        extra: OctetString,
        signature: VrfSignature
    ) -> Unsigned<1>;
}

VRF output derivation from input and secret.

#![allow(unused)]
fn main() {
    fn vrf_output(
        secret: SecretKey,
        input: OctetString,
    ) -> OctetString<32>;
}

VRF output derivation from a VRF signature.

#![allow(unused)]
fn main() {
    fn vrf_signed_output(
        signature: VrfSignature,
    ) -> OctetString<32>;
}

The following condition is always satisfied:

#![allow(unused)]
fn main() {
    let signature = vrf_sign(secret, input, extra);
    vrf_output(secret, input) == vrf_signed_output(signature)
}

SecretKey, PublicKey and VrfSignature types are intentionally left undefined. Their definitions can be found in the Bandersnatch VRF specification and related documents.

5.2. Ring VRF Interface

Ring VRF signature construction.

#![allow(unused)]
fn main() {
    fn ring_vrf_sign(
        secret: SecretKey,
        prover: RingProver,
        input: OctetString,
        extra: OctetString,
    ) -> RingVrfSignature;
}

Ring VRF signature verification. Returns a Boolean indicating the validity of the signature (1 on success). Note that verification doesn't require the signer's public key.

#![allow(unused)]
fn main() {
    fn ring_vrf_verify(
        verifier: RingVerifier,
        input: OctetString,
        extra: OctetString,
        signature: RingVrfSignature,
    ) -> Unsigned<1>;
}

VRF output derivation from a ring VRF signature.

#![allow(unused)]
fn main() {
    fn ring_vrf_signed_output(
        signature: RingVrfSignature,
    ) -> OctetString<32>;
}

The following condition is always satisfied:

#![allow(unused)]
fn main() {
    let signature = vrf_sign(secret, input, extra);
    let ring_signature = ring_vrf_sign(secret, prover, input, extra);
    vrf_signed_output(signature) == ring_vrf_signed_output(ring_signature);
}

RingProver, RingVerifier, and RingVrfSignature are intentionally left undefined. Their definitions can be found in the Bandersnatch VRF specification and related documents.

6. Sassafras Protocol

6.1. Protocol Configuration

The ProtocolConfiguration type contains some parameters to tweak the protocol behavior and primarily influences certain checks carried out during tickets validation. It is defined as:

#![allow(unused)]
fn main() {
    ProtocolConfiguration ::= Sequence {
        epoch_length: Unsigned32,
        attempts_number: Unsigned8,
        redundancy_factor: Unsigned8,
    }
}

Where:

  • epoch_length: Number of slots for each epoch.
  • attempts_number: Maximum number of tickets that each authority is allowed to submit.
  • redundancy_factor: Expected ratio between the cumulative number of valid tickets which can be submitted by the scheduled authorities and the epoch's duration in slots.

The attempts_number influences the anonymity of block producers. As all published tickets have a public attempt number less than attempts_number, all the tickets which share the same attempt number value must belong to different block producers, which progressively reduces anonymity as we approach the epoch's tail. Bigger values guarantee more anonymity but also require more computation.

Details about how these parameters drive the tickets validity probability can be found in section 6.5.2.

6.2. Header Digest Log

Each block header contains a Digest log, which is defined as an ordered sequence of DigestItems:

#![allow(unused)]
fn main() {
    DigestItem ::= Sequence {
        id: OctetString<4>,
        data: OctetString
    }

    Digest ::= Sequence<DigestItem>
}

The Digest sequence is used to propagate information required for the correct protocol progress. Outside the protocol's context, the information within each DigestItem is opaque and maps to some SCALE-encoded protocol-specific structure.

For Sassafras-related items, the DigestItem id is set to the ASCII string "SASS".

Possible digest items for Sassafras:

  • Epoch change signal: Information about the next epoch. This is mandatory for the first block of a new epoch.
  • Epoch tickets signal: Sequence of tickets for claiming slots in the next epoch. This is mandatory for the first block in the epoch's tail.
  • Slot claim info: Additional data required for block verification. This is mandatory for each block and must be the second-to-last entry in the log.
  • Seal: Block signature added by the block author. This is mandatory for each block and must be the last entry in the log.

If any digest entry is unexpected, not found where mandatory or found in the wrong position, then the block is considered invalid.

6.3. On-Chain Randomness

A sequence of four randomness entries is maintained on-chain.

#![allow(unused)]
fn main() {
    RandomnessBuffer ::= Sequence<OctetString<32>, 4>
}

During epoch N:

  • The first entry is the current randomness accumulator and incorporates verifiable random elements from all previously executed blocks. The accumulation procedure is described in section 6.10.

  • The second entry is the snapshot of the accumulator before the execution of the first block of epoch N. This is the randomness used for tickets targeting epoch N+2.

  • The third entry is the snapshot of the accumulator before the execution of the first block of epoch N-1. This is the randomness used for tickets targeting epoch N+1 (the next epoch).

  • The fourth entry is the snapshot of the accumulator before the execution of the first block of epoch N-2. This is the randomness used for tickets targeting epoch N (the current epoch).

The buffer's entries are updated after each block execution.

6.4. Epoch Change Signal

The first block produced during epoch N must include a descriptor for some of the parameters to be used by the subsequent epoch (N+1).

This signal descriptor is defined as:

#![allow(unused)]
fn main() {
    NextEpochDescriptor ::= Sequence {
        randomness: OctetString<32>,
        authorities: Sequence<PublicKey>,
    }
}

Where:

  • randomness: Randomness accumulator snapshot relevant for validation of next epoch blocks. In other words, randomness used to construct the tickets targeting epoch N+1.
  • authorities: List of authorities scheduled for next epoch.

This descriptor is SCALE encoded and embedded in a DigestItem.

6.4.1. Startup Parameters

Some of the initial parameters used by the first epoch (#0) are set through the genesis configuration, which is defined as:

#![allow(unused)]
fn main() {
    GenesisConfig ::= Sequence {
        authorities: Sequence<PublicKey>,
    }
}

The on-chain RandomnessBuffer is initialized after the genesis block construction. The first buffer entry is set as the Blake2b hash of the genesis block; each of the other entries is set as the Blake2b hash of the previous entry.
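
As a non-normative illustration using the pseudo-code conventions introduced earlier (genesis_block is assumed to denote the encoded genesis block):

#![allow(unused)]
fn main() {
    // Illustrative only: initialization of the on-chain randomness buffer.
    let entry_0 = BLAKE2(genesis_block);   // hash of the genesis block
    let entry_1 = BLAKE2(entry_0);         // hash of the previous entry
    let entry_2 = BLAKE2(entry_1);
    let entry_3 = BLAKE2(entry_2);
    let randomness_buffer = [entry_0, entry_1, entry_2, entry_3];
}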

Since block #0 is generated by each node as part of the genesis process, the first block that an authority explicitly produces for epoch #0 is block #1. Therefore, block #1 is required to contain the NextEpochDescriptor for the following epoch.

NextEpochDescriptor for epoch #1:

  • randomness: Third entry (index 2) of the randomness buffer.
  • authorities: The same sequence as specified in the genesis configuration.

6.5. Tickets Creation and Submission

During epoch N, each authority scheduled for epoch N+2 constructs a set of tickets which may be eligible (6.5.2) for on-chain submission.

These tickets are constructed using the on-chain randomness snapshot taken before the execution of the first block of epoch N, together with other parameters, and aim to secure ownership of one or more slots of epoch N+2 (the target epoch).

Each authority is allowed to submit a maximum number of tickets, constrained by attempts_number field of the ProtocolConfiguration.

The ideal timing for the candidate authority to start constructing the tickets is subject to strategy. A recommended approach is to initiate tickets creation once the last block of epoch N-1 is either probabilistically or, even better, deterministically finalized. This delay is suggested to prevent wasting resources creating tickets that will be unusable if a different chain branch is chosen as canonical.

Tickets generated during epoch N are shared with the tickets relayers, which are the authorities scheduled for epoch N+1. Relayers validate and collect (off-chain) the tickets targeting epoch N+2.

When epoch N+1 starts, collected tickets are submitted on-chain by relayers as inherent extrinsics, a special type of transaction inserted by the block author at the beginning of the block's transactions sequence.

6.5.1. Ticket Identifier

Each ticket has an associated identifier defined as:

#![allow(unused)]
fn main() {
    TicketId ::= OctetString<32>;
}

The value of TicketId is completely determined by the output of Bandersnatch VRFs given the following unbiasable input:

#![allow(unused)]
fn main() {
    let ticket_vrf_input = CONCAT(
        BYTES("sassafras_ticket_seal"),
        target_epoch_randomness,
        BYTES(attempt)
    );

    let ticket_id = vrf_output(authority_secret_key, ticket_vrf_input);
}

Where:

  • target_epoch_randomness: element of RandomnessBuffer which contains the randomness for the epoch the ticket is targeting.
  • attempt: value going from 0 to the configured attempts_number - 1.

6.5.2. Tickets Threshold

A ticket is valid for on-chain submission if its TicketId value, when interpreted as a big-endian 256-bit integer normalized as a float within the range [0..1], is less than the ticket threshold computed as:

T = (r·s)/(a·v)

Where:

  • v: epoch's authorities number
  • s: epoch's slots number
  • r: redundancy factor
  • a: attempts number

In an epoch with s slots, the goal is to achieve an expected number of valid tickets equal to r·s.

It's crucial to ensure that the probability of having fewer than s winning tickets is very low, even in scenarios where up to 1/3 of the authorities might be offline. To accomplish this, we first define the winning probability of a single ticket as T = (r·s)/(a·v).

Let n be the actual number of participating authorities, where v·2/3 ≤ n ≤ v. Each of these n authorities makes a attempts, for a total of a·n attempts.

Let X be the random variable associated to the number of winning tickets, then its expected value is E[X] = T·a·n = (r·s·n)/v. By setting r = 2, we get s·4/3 ≤ E[X] ≤ s·2. Using Bernstein's inequality, we get Pr[X < s] ≤ e^(-s/21).

For instance, with s = 600 this results in Pr[X < s] < 4·10⁻¹³. Consequently, this approach offers considerable tolerance for offline nodes and ensures that all slots are likely to be filled with tickets.

For more details about the threshold formula, refer to the probabilities and parameters paragraph in the Web3 Foundation description of the protocol.
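
As a purely illustrative worked example (the parameter values are arbitrary and not normative), the threshold can be computed as follows:

#![allow(unused)]
fn main() {
    // T = (r * s) / (a * v)
    fn ticket_threshold(r: u64, s: u64, a: u64, v: u64) -> f64 {
        (r * s) as f64 / (a * v) as f64
    }

    // Example: r = 2, s = 600 slots, a = 30 attempts, v = 1000 authorities
    // => T = 1200 / 30000 = 0.04, i.e. about 4% of ticket identifiers
    //    (normalized to [0..1]) are expected to fall below the threshold.
    let t = ticket_threshold(2, 600, 30, 1000);
}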

6.5.3. Ticket Envelope

Each ticket candidate is represented by a TicketEnvelope:

#![allow(unused)]
fn main() {
    TicketEnvelope ::= Sequence {
        attempt: Unsigned8,
        extra: OctetString,
        signature: RingVrfSignature
    }   
}

Where:

  • attempt: Index associated to the ticket.
  • extra: Additional data available for user-defined applications.
  • signature: Ring VRF signature of the envelope data (attempt and extra).

Envelope data is signed using Bandersnatch Ring VRF (5.2).

#![allow(unused)]
fn main() {
    let signature = ring_vrf_sign(
        secret_key,
        ring_prover,
        ticket_vrf_input,
        extra,
    );
}

With ticket_vrf_input defined as in 6.5.1.

6.6. On-chain Tickets Validation

Validation rules:

  1. The Ring VRF signature is verified using the ring_verifier derived from the constant ring context parameters (SNARK SRS) and the next epoch authorities' public keys.

  2. The TicketId is locally computed from the RingVrfSignature, and its value is checked to be less than the tickets' threshold.

  3. On-chain tickets submission can't occur within a block that is part of the epoch's tail, which encompasses a configurable number of slots at the end of the epoch. This constraint is to give time for persisted on-chain tickets to be probabilistically (or, even better, deterministically) finalized and thus to further reduce fork chances at the beginning of the target epoch.

  4. All tickets which are proposed within a block must be valid, and all of them must end up being persisted on-chain. Because the total number of tickets persisted on-chain is limited by the epoch's length, this may require dropping some of the previously persisted tickets. Tickets with greater TicketId values are removed first.

  5. No duplicate tickets are allowed.

If at least one of the checks fails then the block must be considered invalid.

Pseudo-code for ticket validation for steps 1 and 2:

#![allow(unused)]
fn main() {
    let ticket_vrf_input = CONCAT(
        BYTES("sassafras_ticket_seal"),
        target_epoch_randomness,
        BYTES(envelope.attempt)
    );

    let result = ring_vrf_verify(
        ring_verifier,
        ticket_vrf_input,
        envelope.extra,
        envelope.signature
    );
    ASSERT(result == 1);

    let ticket_id = ring_vrf_signed_output(envelope.signature);
    ASSERT(ticket_id < ticket_threshold);
}

Valid tickets are persisted on-chain in a bounded sorted sequence of TicketBody objects. Items within this sequence are sorted according to their TicketId, interpreted as a 256-bit big-endian unsigned integer.

#![allow(unused)]
fn main() {
    TicketBody ::= Sequence {
        id: TicketId,
        attempt: Unsigned8,
        extra: OctetString,
    }

    Tickets ::= Sequence<TicketBody>
}

The on-chain tickets sequence length bound is set equal to the epoch length in slots according to the protocol configuration.

6.7. Ticket-Slot Binding

Before the beginning of the target epoch, the on-chain sequence of tickets must be associated to the epoch's slots such that there is at most one ticket per slot.

Given an ordered sequence of tickets [t₀, t₁, ..., tₙ], the tickets are associated according to the following outside-in strategy:

    slot_index  : [  0,  1,  2,  3 ,  ... ]
    tickets     : [ t₀, tₙ, t₁, tₙ₋₁, ... ]

Here slot_index is the slot number relative to the epoch's first slot: slot_index = slot - epoch_first_slot.
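
A non-normative sketch of this outside-in association, using the document's pseudo-code conventions:

#![allow(unused)]
fn main() {
    // Index, within the on-chain tickets sequence, of the ticket bound to `slot_index`.
    let count = LENGTH(tickets);
    let ticket_index = if slot_index % 2 == 0 {
        slot_index / 2                  // even slot indices take tickets from the front
    } else {
        count - 1 - slot_index / 2      // odd slot indices take tickets from the back
    };
    let ticket = GET(tickets, ticket_index);
}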

The association between tickets and slots is recorded on-chain and thus is public. What remains confidential is the ticket author's identity, and consequently, who is enabled to claim the corresponding slot. This information is known only to the ticket's author.

If the number of published tickets is less than the number of epoch slots, some orphan slots at the end of the epoch will remain unbound to any ticket. For the orphan slots claiming strategy, refer to 6.8.2. Note that this fallback situation always applies to the first two epochs after genesis.

6.8. Slot Claim

With tickets bound to the target epoch's slots, every designated authority acquires the information about the slots for which they are required to produce a block.

The procedure for slot claiming depends on whether a given slot has an associated ticket according to the on-chain state. If a slot has an associated ticket, then the primary authoring method is used. Conversely, the protocol resorts to the secondary method as a fallback.

6.8.1. Primary Method

An authority can claim a slot using the primary method if it is the legitimate owner of the ticket associated to the given slot.

Let target_epoch_randomness be the entry in RandomnessBuffer relative to the epoch the block is targeting, and attempt be the attempt index used to construct the ticket associated to the slot to claim. The VRF input for slot claiming is constructed as:

#![allow(unused)]
fn main() {
    let seal_vrf_input = CONCAT(
        BYTES("sassafras_ticket_seal"),
        target_epoch_randomness,
        BYTES(attempt)
    );
}

The seal_vrf_input, when signed with the correct authority secret key, must generate the same TicketId which has been associated to the target slot according to the on-chain state.

6.8.2. Secondary Method

Given that the authorities scheduled for the target epoch are kept on-chain in an ordered sequence, the index of the authority which has the privilege to claim an orphan slot is given by the following procedure:

#![allow(unused)]
fn main() {
    let hash_input = CONCAT(
        target_epoch_randomness,
        ENCODE(relative_slot_index),
    );
    let hash = BLAKE2(hash_input);
    let index_bytes = CONCAT(GET(hash, 0), GET(hash, 1), GET(hash, 2), GET(hash, 3));
    let index = DECODE<Unsigned32>(index_bytes) % LENGTH(authorities);
}

With relative_slot_index the slot offset relative to the target epoch's start and authorities the sequence of target epoch authorities.

#![allow(unused)]
fn main() {
    let seal_vrf_input = CONCAT(
        BYTES("sassafras_fallback_seal"),
        target_epoch_randomness
    );
}

6.8.3. Claim Data

ClaimData is a digest entry which contains additional information required by the protocol to verify the block:

#![allow(unused)]
fn main() {
    ClaimData ::= Sequence {
        slot: Unsigned32,
        authority_index: Unsigned32,
        randomness_source: VrfSignature,
    }
}

Where:

  • slot: The slot number.
  • authority_index: Block's author index relative to the on-chain authorities sequence.
  • randomness_source: VRF signature used to generate per-block randomness.

Given the seal_vrf_input constructed using the primary or secondary method, the randomness source signature is generated as follows:

#![allow(unused)]
fn main() {
    let randomness_vrf_input = CONCAT(
        BYTES("sassafras_randomness"),
        vrf_output(authority_secret_key, seal_vrf_input)
    );

    let randomness_source = vrf_sign(
        authority_secret_key,
        randomness_vrf_input,
        []
    );

    let claim = ClaimData {
        slot,
        authority_index,
        randomness_source
    };

    PUSH(block_header.digest, ENCODE(claim));
}

The ClaimData object is SCALE encoded and pushed as the second-to-last element of the header digest log.

6.8.4. Block Seal

A block is finally sealed as follows:

#![allow(unused)]
fn main() {
    let unsealed_header_bytes = ENCODE(block_header);

    let seal = vrf_sign(
        authority_secret_key,
        seal_vrf_input,
        unsealed_header_bytes
    );

    PUSH(block_header.digest, ENCODE(seal));
}

With block_header the block's header without the seal digest log entry.

The seal object is a VrfSignature, which is SCALE encoded and pushed as the last entry of the header digest log.

6.9. Slot Claim Verification

The last entry is extracted from the header digest log, and is SCALE decoded as a VrfSignature object. The unsealed header is then SCALE encoded in order to be verified.

The next entry is extracted from the header digest log, and is SCALE decoded as a ClaimData object.

The validity of the two signatures is assessed using the authority public key corresponding to the authority_index found in the ClaimData, together with the VRF input (which depends on the primary/secondary method) and the additional data used by the block author.

#![allow(unused)]
fn main() {
    let seal_signature = DECODE<VrfSignature>(POP(header.digest));
    let unsealed_header_bytes = ENCODE(header);
    let claim_data = DECODE<ClaimData>(POP(header.digest));

    let authority_public_key = GET(authorities, claim_data.authority_index);

    // Verify seal signature
    let result = vrf_verify(
        authority_public_key,
        seal_vrf_input,
        unsealed_header_bytes,
        seal_signature
    );
    ASSERT(result == 1);

    let randomness_vrf_input = CONCAT(
        BYTES("sassafras_randomness"),
        vrf_signed_output(seal_signature)
    );

    // Verify per-block entropy source signature
    let result = vrf_verify(
        authority_public_key,
        randomness_vrf_input,
        [],
        claim_data.randomness_source
    );
    ASSERT(result == 1);
}

With:

  • header: The block's header.
  • authorities: Sequence of authorities for the target epoch, as recorded on-chain.
  • seal_vrf_input: VRF input data constructed as specified in 6.8.

If signature verification is successful, then the verification process diverges based on whether the slot is associated with a ticket according to the on-chain state.

6.9.1. Primary Method

For slots tied to a ticket, the primary verification method is employed. This method verifies ticket ownership using the TicketId associated to the slot.

#![allow(unused)]
fn main() {
    let ticket_id = vrf_signed_output(seal_signature);
    ASSERT(ticket_id == expected_ticket_id);
}

With expected_ticket_id the ticket identifier committed on-chain in the associated TicketBody.

6.9.2. Secondary Method

If the slot doesn't have any associated ticket, then the authority_index contained in the ClaimData must match the one returned by the procedure outlined in section 6.8.2.

6.10. Randomness Accumulator

The randomness accumulator is updated using the randomness_source signature found within the ClaimData object. In particular, fresh randomness is derived and accumulated after block execution as follows:

#![allow(unused)]
fn main() {
    let fresh_randomness = vrf_signed_output(claim.randomness_source);  
    randomness_buffer[0] = BLAKE2(CONCAT(randomness_buffer[0], fresh_randomness));
}

7. Drawbacks

None

8. Testing, Security, and Privacy

It is critical that implementations of this RFC undergo thorough and rigorous testing. A security audit may be desirable to ensure the implementation does not introduce emergent side effects.

9. Performance, Ergonomics, and Compatibility

9.1. Performance

Adopting Sassafras consensus marks a significant improvement in reducing the frequency of short-lived forks, which are eliminated by design.

Forks may only result from network disruption or protocol attacks. In such cases, the choice of which fork to follow upon recovery is clear-cut, with only one valid option.

9.2. Ergonomics

No specific considerations.

9.3. Compatibility

The adoption of Sassafras affects the native client and thus can't be introduced via a "simple" runtime upgrade.

A deployment strategy should be carefully engineered for live networks. This subject is left open for a dedicated RFC.

10. Prior Art and References

11. Unresolved Questions

None

While this RFC lays the groundwork and outlines the core aspects of the protocol, several crucial topics remain to be addressed in future RFCs.

12.1. Interactions with On-Chain Code

  • Storage: Types, organization and genesis configuration.

  • Host interface: Interface that the hosting environment exposes to on-chain code (also known as host functions).

  • Unrecorded on-chain interface: Interface that on-chain code exposes to the hosting environment (also known as runtime API).

  • Transactional on-chain interface: Interface that on-chain code exposes to the world to alter the state (also known as transactions or extrinsics in the Polkadot ecosystem).

12.2. Deployment Strategies

  • Protocol Migration: Investigate how Sassafras can seamlessly replace an already operational instance of another protocol. Future RFCs may focus on deployment strategies to facilitate a smooth transition.

12.3. ZK-SNARK Parameters

  • Parameters Setup: Determine the setup procedure for the zk-SNARK SRS (Structured Reference String) initialization. Future RFCs may provide insights into whether this process should include an ad-hoc initialization ceremony or if we can reuse an SRS from another ecosystem (e.g. Zcash or Ethereum).

12.4. Anonymous Submission of Tickets

  • Mixnet Integration: Submitting tickets directly to the relay can pose a risk of potential deanonymization through traffic analysis. Subsequent RFCs may investigate the potential for incorporating mix network protocol or other privacy-enhancing mechanisms to address this concern.

(source)

Table of Contents

RFC-0032: Minimal Relay

Start Date20 September 2023
DescriptionProposal to minimise Relay Chain functionality.
AuthorsJoe Petrowski, Gavin Wood

Summary

The Relay Chain contains most of the core logic for the Polkadot network. While this was necessary prior to the launch of parachains and development of XCM, most of this logic can exist in parachains. This is a proposal to migrate several subsystems into system parachains.

Motivation

Polkadot's scaling approach allows many distinct state machines (known generally as parachains) to operate with common guarantees about the validity and security of their state transitions. Polkadot provides these common guarantees by executing the state transitions on a strict subset (a backing group) of the Relay Chain's validator set.

However, state transitions on the Relay Chain need to be executed by all validators. If any of those state transitions can occur on parachains, then the resources of the complement of a single backing group could be used to offer more cores. As in, they could be offering more coretime (a.k.a. blockspace) to the network.

By minimising state transition logic on the Relay Chain by migrating it into "system chains" -- a set of parachains that, with the Relay Chain, make up the Polkadot protocol -- the Polkadot Ubiquitous Computer can maximise its primary offering: secure blockspace.

Stakeholders

  • Parachains that interact with affected logic on the Relay Chain;
  • Core protocol and XCM format developers;
  • Tooling, block explorer, and UI developers.

Explanation

The following pallets and subsystems are good candidates to migrate from the Relay Chain:

  • Identity
  • Balances
  • Staking
    • Staking
    • Election Provider
    • Bags List
    • NIS
    • Nomination Pools
    • Fast Unstake
  • Governance
    • Treasury and Bounties
    • Conviction Voting
    • Referenda

Note: The Auctions and Crowdloan pallets will be replaced by Coretime, its system chain and interface described in RFC-1 and RFC-5, respectively.

Migrations

Some subsystems are simpler to move than others. For example, migrating Identity can be done by simply preventing state changes in the Relay Chain, using the Identity-related state as the genesis for a new chain, and launching that new chain with the genesis and logic (pallet) needed.

Other subsystems cannot experience any downtime like this because they are essential to the network's functioning, like Staking and Governance. However, these can likely coexist with a similarly-permissioned system chain for some time, much like how "Gov1" and "OpenGov" coexisted at the latter's introduction.

Specific migration plans will be included in release notes of runtimes from the Polkadot Fellowship when beginning the work of migrating a particular subsystem.

Interfaces

The Relay Chain, in many cases, will still need to interact with these subsystems, especially Staking and Governance. These subsystems will require making some APIs available either via dispatchable calls accessible to XCM Transact or possibly XCM Instructions in future versions.

For example, Staking provides a pallet-API to register points (e.g. for block production) and offences (e.g. equivocation). With Staking in a system chain, that chain would need to allow the Relay Chain to update validator points periodically so that it can correctly calculate rewards.

A pub-sub protocol may also lend itself to these types of interactions.

Functional Architecture

This RFC proposes that system chains form individual components within the system's architecture and that these components are chosen as functional groups. This approach allows synchronous composability where it is most valuable, but isolates logic in a way that provides flexibility for optimal resource allocation (see Resource Allocation). For the subsystems discussed in this RFC, namely Identity, Governance, and Staking, this would mean:

  • People Chain, for identity and personhood logic, providing functionality related to the attributes of single actors;
  • Governance Chain, for governance and system collectives, providing functionality for pluralities to express their voices within the system;
  • Staking Chain, for Polkadot's staking system, including elections, nominations, reward distribution, slashing, and non-interactive staking; and
  • Asset Hub, for fungible and non-fungible assets, including DOT.

The Collectives chain and Asset Hub already exist, so implementation of this RFC would mean two new chains (People and Staking), with Governance moving to the currently-known-as Collectives chain and Asset Hub being increasingly used for DOT over the Relay Chain.

Note that one functional group will likely include many pallets, as we do not know how pallet configurations and interfaces will evolve over time.

Resource Allocation

The system should minimise wasted blockspace. These three (and other) subsystems may not each consistently require a dedicated core. However, core scheduling is far more agile than functional grouping. While migrating functionality from one chain to another can be a multi-month endeavour, cores can be rescheduled almost on-the-fly.

Migrations are also breaking changes to some use cases, for example other parachains that need to route XCM programs to particular chains. It is thus preferable to do them a single time in migrating off the Relay Chain, reducing the risk of needing parachain splits in the future.

Therefore, chain boundaries should be based on functional grouping where synchronous composability is most valuable; and efficient resource allocation should be managed by the core scheduling protocol.

Many of these system chains (including Asset Hub) could often share a single core in a semi-round robin fashion (the coretime may not be uniform). When needed, for example during NPoS elections or slashing events, the scheduler could allocate a dedicated core to the chain in need of more throughput.

Deployment

Actual migrations should happen based on some prioritization. This RFC proposes to migrate Identity, Staking, and Governance as the systems to work on first. A brief discussion on the factors involved in each one:

Identity

Identity will be one of the simpler pallets to migrate into a system chain, as its logic is largely self-contained and it does not "share" balances with other subsystems. As in, any DOT is held in reserve as a storage deposit and cannot be simultaneously used the way locked DOT can be locked for multiple purposes.

Therefore, migration can take place as follows:

  1. The pallet can be put in a locked state, blocking most calls to the pallet and preventing updates to identity info.
  2. The frozen state will form the genesis of a new system parachain.
  3. Functions will be added to the pallet that allow migrating the deposit to the parachain. The parachain deposit is on the order of 1/100th of the Relay Chain's. Therefore, this will result in freeing up Relay State as well as most of each user's reserved balance.
  4. The pallet and any leftover state can be removed from the Relay Chain.

User interfaces that render Identity information will need to source their data from the new system parachain.

Note: In the future, it may make sense to decommission Kusama's Identity chain and do all account identities via Polkadot's. However, the Kusama chain will serve as a dress rehearsal for Polkadot.

Staking

Migrating the staking subsystem will likely be the most complex technical undertaking, as the Staking system cannot stop (the system MUST always have a validator set) nor run in parallel (the system MUST have only one validator set) and the subsystem itself is made up of subsystems in the runtime and the node. For example, if offences are reported to the Staking parachain, validator nodes will need to submit their reports there.

Handling balances also introduces complications. The same balance can be used for staking and governance. Ideally, all balances stay on Asset Hub, and only report "credits" to system chains like Staking and Governance. However, staking mutates balances by issuing new DOT on era changes and for rewards. Allowing DOT directly on the Staking parachain would simplify staking changes.

Given the complexity, it would be pragmatic to include the Balances pallet in the Staking parachain in its first version. Any other systems that use overlapping locks, most notably governance, will need to recognise DOT held on both Asset Hub and the Staking parachain.

There is more discussion about staking in a parachain in Moving Staking off the Relay Chain.

Governance

Migrating governance into a parachain will be less complicated than staking. Most of the primitives needed for the migration already exist. The Treasury supports spending assets on remote chains and collectives like the Polkadot Technical Fellowship already function in a parachain. That is, XCM already provides the ability to express system origins across chains.

Therefore, actually moving the governance logic into a parachain will be simple. It can run in parallel with the Relay Chain's governance, which can be removed when the parachain has demonstrated sufficient functionality. It's possible that the Relay Chain could maintain a Root-level emergency track for situations like parachains halting.

The only complication arises from the fact that both Asset Hub and the Staking parachain will have DOT balances; therefore, the Governance chain will need to be able to credit users' voting power based on balances from both locations. This is not expected to be difficult to handle.

Kusama

Although Polkadot and Kusama both have system chains running, they have to date only been used for introducing new features or bodies, for example fungible assets or the Technical Fellowship. There has not yet been a migration of logic/state from the Relay Chain into a parachain. Given its more realistic network conditions than testnets, Kusama is the best stage for rehearsal.

In the case of identity, Polkadot's system may be sufficient for the ecosystem. Therefore, Kusama should be used to test the migration of logic and state from Relay Chain to parachain, but these features may be (at the will of Kusama's governance) dropped from Kusama entirely after a successful migration on Polkadot.

For Governance, Polkadot already has the Collectives parachain, which would become the Governance parachain. The entire group of DOT holders is itself a collective (the legislative body), and governance provides the means to express voice. Launching a Kusama Governance chain would be sensible to rehearse a migration.

The Staking subsystem is perhaps where Kusama would provide the most value in its canary capacity. Staking is the subsystem most constrained by PoV limits. Ensuring that elections, payouts, session changes, offences/slashes, etc. work in a parachain on Kusama -- with its larger validator set -- will give confidence to the chain's robustness on Polkadot.

Drawbacks

These subsystems will have fewer resources available on their cores than they had on the Relay Chain. Staking in particular may require some optimizations to deal with these constraints.

Testing, Security, and Privacy

Standard audit/review requirements apply. More powerful multi-chain integration test tools would be useful in development.

Performance, Ergonomics, and Compatibility


Performance

This is an optimization. The removal of public/user transactions on the Relay Chain ensures that its primary resources are allocated to system performance.

Ergonomics

This proposal alters very little for coretime users (e.g. parachain developers). Application developers will need to interact with multiple chains, making ergonomic light client tools particularly important for application development.

Existing parachains that interact with these subsystems will need to configure their runtimes to recognize the new locations in the network.

Compatibility

Implementing this proposal will require some changes to pallet APIs and/or a pub-sub protocol. Application developers will need to interact with multiple chains in the network.

Prior Art and References

Unresolved Questions

There remain some implementation questions, like how to use balances for both Staking and Governance. See, for example, Moving Staking off the Relay Chain.

Ideally the Relay Chain becomes transactionless, such that not even balances are represented there. With Staking and Governance off the Relay Chain, this is not an unreasonable next step.

With Identity on Polkadot, Kusama may opt to drop its People Chain.

(source)

Table of Contents

RFC-0042: Add System version that replaces StateVersion on RuntimeVersion

Start Date25th October 2023
DescriptionAdd System Version and remove State Version
AuthorsVedhavyas Singareddi

Summary

At the moment, we have a state_version field on RuntimeVersion that determines which state version is used for the storage. We have a use case where we want the extrinsics root to be derived using StateVersion::V1. Rather than defining an additional field under RuntimeVersion, we would like to propose a system_version field (replacing state_version) that can be used to derive both the storage and extrinsic state versions.

Motivation

Since the extrinsic state version is always StateVersion::V0, deriving the extrinsics root requires the full extrinsic data. This is problematic when we need to verify the extrinsics root and the extrinsics are large. This problem is further explored in https://github.com/polkadot-fellows/RFCs/issues/19

For the Subspace project, we have an enshrined rollup called a Domain, with optimistic verification and fraud proofs used to detect malicious behavior. One of the fraud proof variants derives a Domain block's extrinsics root on Subspace's consensus chain. Since StateVersion::V0 requires the full extrinsic data, we are forced to pass all the extrinsics through the fraud proof. One of the main challenges here is that some extrinsics could be big enough that this variant of fraud proof may not be included in the consensus block due to the block's weight restriction. If the extrinsics root is derived using StateVersion::V1, then we do not need to pass the full extrinsic data but rather, at maximum, 32 bytes of extrinsic data.

Stakeholders

  • Technical Fellowship, in its role of maintaining system runtimes.

Explanation

In order to use a project-specific StateVersion for extrinsic roots, we proposed an implementation that introduced a parameter to frame_system::Config, but that unfortunately did not feel correct. So we would like to propose adding this change to the RuntimeVersion object. The system version, if introduced, will be used to derive both the storage and extrinsic state versions. If the system version is 0, then both the storage and extrinsic state versions use V0. If the system version is 1, then the storage state version uses V1 and the extrinsic state version uses V0. If the system version is 2, then both the storage and extrinsic state versions use V1. A sketch of this mapping is shown below.
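
A minimal, non-normative sketch of the proposed mapping (StateVersion here is a stand-in for the corresponding type in polkadot-sdk):

#![allow(unused)]
fn main() {
enum StateVersion {
    V0,
    V1,
}

// Returns (storage_state_version, extrinsic_state_version) for a given system_version.
fn state_versions(system_version: u8) -> (StateVersion, StateVersion) {
    match system_version {
        0 => (StateVersion::V0, StateVersion::V0),
        1 => (StateVersion::V1, StateVersion::V0),
        2 => (StateVersion::V1, StateVersion::V1),
        _ => panic!("unsupported system_version"),
    }
}
}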

If implemented, the new RuntimeVersion definition would look similar to:

#![allow(unused)]
fn main() {
/// Runtime version (Rococo).
#[sp_version::runtime_version]
pub const VERSION: RuntimeVersion = RuntimeVersion {
		spec_name: create_runtime_str!("rococo"),
		impl_name: create_runtime_str!("parity-rococo-v2.0"),
		authoring_version: 0,
		spec_version: 10020,
		impl_version: 0,
		apis: RUNTIME_API_VERSIONS,
		transaction_version: 22,
		system_version: 1,
	};
}
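
To illustrate the mapping described above, the following is a minimal, non-normative sketch; the StateVersion enum and the helper function are hypothetical names used purely for illustration:

#![allow(unused)]
fn main() {
// Hypothetical illustration of how the proposed system_version maps to the
// storage and extrinsic state versions described in the text.
enum StateVersion {
    V0,
    V1,
}

fn state_versions(system_version: u8) -> (StateVersion, StateVersion) {
    match system_version {
        0 => (StateVersion::V0, StateVersion::V0), // storage V0, extrinsics V0
        1 => (StateVersion::V1, StateVersion::V0), // storage V1, extrinsics V0
        _ => (StateVersion::V1, StateVersion::V1), // system version 2: both V1
    }
}
}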

Drawbacks

There should be no drawbacks, as it replaces state_version with the same behavior, but documentation should be updated so that chains know which system_version to use.

Testing, Security, and Privacy

As far as I know, this should not have any impact on security or privacy.

Performance, Ergonomics, and Compatibility

These changes should be compatible with existing chains if they use their current state_version value for system_version.

Performance

I do not believe there is any performance hit with this change.

Ergonomics

This does not break any exposed APIs.

Compatibility

This change should not break any compatibility.

Prior Art and References

We previously proposed a similar change by introducing a parameter to frame_system::Config, but did not feel that was the correct way of introducing this change.

Unresolved Questions

I do not have any specific questions about this change at the moment.

IMO, this change is pretty self-contained and there won't be any future work necessary.

(source)

Table of Contents

RFC-0043: Introduce storage_proof_size Host Function for Improved Parachain Block Utilization

Start Date30 October 2023
DescriptionHost function to provide the storage proof size to runtimes.
AuthorsSebastian Kunert

Summary

This RFC proposes a new host function for parachains, storage_proof_size. It shall provide the size of the currently recorded storage proof to the runtime. Runtime authors can use the proof size to improve block utilization by retroactively reclaiming unused storage weight.

Motivation

The number of extrinsics that are included in a parachain block is limited by two constraints: execution time and proof size. FRAME weights cover both concepts, and block-builders use them to decide how many extrinsics to include in a block. However, these weights are calculated ahead of time by benchmarking on a machine with reference hardware. The execution-time properties of the state-trie and its storage items are unknown at benchmarking time. Therefore, we make some assumptions about the state-trie:

  • Trie Depth: We assume a trie depth to account for intermediary nodes.
  • Storage Item Size: We make a pessimistic assumption based on the MaxEncodedLen trait.

These pessimistic assumptions lead to an overestimation of storage weight, negatively impacting block utilization on parachains.

In addition, the current model does not account for multiple accesses to the same storage items. While these repetitive accesses will not increase storage-proof size, the runtime-side weight monitoring will account for them multiple times. Since the proof size is completely opaque to the runtime, we can not implement retroactive storage weight correction.

A solution must provide a way for the runtime to track the exact storage-proof size consumed on a per-extrinsic basis.

Stakeholders

  • Parachain Teams: They MUST include this host function in their runtime and node.
  • Light-client Implementors: They SHOULD include this host function in their runtime and node.

Explanation

This RFC proposes a new host function that exposes the storage-proof size to the runtime. As a result, runtimes can implement storage weight reclaiming mechanisms that improve block utilization.

This RFC proposes the following host function signature:

#![allow(unused)]
fn main() {
fn ext_storage_proof_size_version_1() -> u64;
}

The host function MUST return an unsigned 64-bit integer value representing the current proof size. In block-execution and block-import contexts, this function MUST return the current size of the proof. To achieve this, parachain node implementors need to enable proof recording for block imports. In other contexts, this function MUST return 18446744073709551615 (u64::MAX), which represents disabled proof recording.

Performance, Ergonomics, and Compatibility

Performance

Parachain nodes need to enable proof recording during block import to correctly implement the proposed host function. Benchmarking conducted with balance transfers has shown a performance reduction of around 0.6% when proof recording is enabled.

Ergonomics

The host function proposed in this RFC allows parachain runtime developers to keep track of the proof size. Typical usage patterns would be to keep track of the overall proof size or the difference between subsequent calls to the host function.
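
To illustrate the second pattern, here is a minimal sketch; storage_proof_size and proof_size_of are hypothetical names standing in for however a runtime wraps the proposed host function:

#![allow(unused)]
fn main() {
// Hypothetical wrapper around the proposed host function.
fn storage_proof_size() -> u64 {
    unimplemented!()
}

// Measure how much storage proof a closure consumed by diffing two calls.
fn proof_size_of<R>(f: impl FnOnce() -> R) -> (R, u64) {
    let before = storage_proof_size();
    let result = f();
    let after = storage_proof_size();
    // With proof recording disabled, both calls return u64::MAX and the
    // saturating difference is simply zero.
    (result, after.saturating_sub(before))
}
}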

Compatibility

Parachain teams will need to include this host function to upgrade.

Prior Art and References

(source)

Table of Contents

RFC-0045: Lowering NFT Deposits on Asset Hub

Start Date2 November 2023
DescriptionA proposal to reduce the minimum deposit required for collection creation on the Polkadot and Kusama Asset Hubs.
AuthorsAurora Poppyseed, Just_Luuuu, Viki Val, Joe Petrowski

Summary

This RFC proposes changing the current deposit requirements on the Polkadot and Kusama Asset Hub for creating an NFT collection, minting an individual NFT, and lowering its corresponding metadata and attribute deposits. The objective is to lower the barrier to entry for NFT creators, fostering a more inclusive and vibrant ecosystem while maintaining network integrity and preventing spam.

Motivation

The current deposit of 10 DOT for collection creation (along with 0.01 DOT for item deposit and 0.2 DOT for metadata and attribute deposits) on the Polkadot Asset Hub and 0.1 KSM on Kusama Asset Hub presents a significant financial barrier for many NFT creators. By lowering the deposit requirements, we aim to encourage more NFT creators to participate in the Polkadot NFT ecosystem, thereby enriching the diversity and vibrancy of the community and its offerings.

The initial introduction of a 10 DOT deposit was an arbitrary starting point that does not consider the actual storage footprint of an NFT collection. This proposal aims to adjust the deposit first to a value based on the deposit function, which calculates a deposit based on the number of keys introduced to storage and the size of corresponding values stored.
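
For reference, deposit functions of this kind generally take the shape sketched below; the constants here are hypothetical placeholders, not the values used on Asset Hub:

#![allow(unused)]
fn main() {
type Balance = u128;

// Hypothetical per-item and per-byte prices; each runtime defines its own.
const ITEM_DEPOSIT: Balance = 1;
const BYTE_DEPOSIT: Balance = 1;

// A deposit proportional to the number of storage keys introduced and the
// total size of the values stored under them.
const fn deposit(items: u32, bytes: u32) -> Balance {
    items as Balance * ITEM_DEPOSIT + bytes as Balance * BYTE_DEPOSIT
}
}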

Further, it suggests a direction for a future of calculating deposits variably based on adoption and/or market conditions. There is a discussion on tradeoffs of setting deposits too high or too low.

Requirements

  • Deposits SHOULD be derived from the deposit function, adjusted by a corresponding pricing mechanism.

Stakeholders

  • NFT Creators: Primary beneficiaries of the proposed change, particularly those who found the current deposit requirements prohibitive.
  • NFT Platforms: As the facilitator of artists' relations, NFT marketplaces have a vested interest in onboarding new users and making their platforms more accessible.
  • dApp Developers: Making the blockspace more accessible will encourage developers to create and build unique dApps in the Polkadot ecosystem.
  • Polkadot Community: Stands to benefit from an influx of artists, creators, and diverse NFT collections, enhancing the overall ecosystem.

Previous discussions have been held within the Polkadot Forum, with artists expressing their concerns about the deposit amounts.

Explanation

This RFC proposes a revision of the deposit constants in the configuration of the NFTs pallet on the Polkadot Asset Hub. The new deposit amounts would be determined by a standard deposit formula.

As of v1.1.1, the Collection Deposit is 10 DOT and the Item Deposit is 0.01 DOT (see here).

Based on the storage footprint of these items, this RFC proposes changing them to:

#![allow(unused)]
fn main() {
pub const NftsCollectionDeposit: Balance = system_para_deposit(1, 130);
pub const NftsItemDeposit: Balance = system_para_deposit(1, 164);
}

This results in the following deposits (calculated using this repository):

Polkadot

| Name | Current Rate (DOT) | Calculated with Function (DOT) |
|------|--------------------|--------------------------------|
| collectionDeposit | 10 | 0.20064 |
| itemDeposit | 0.01 | 0.20081 |
| metadataDepositBase | 0.20129 | 0.20076 |
| attributeDepositBase | 0.2 | 0.2 |

Similarly, the prices for Kusama were calculated as:

Kusama:

| Name | Current Rate (KSM) | Calculated with Function (KSM) |
|------|--------------------|--------------------------------|
| collectionDeposit | 0.1 | 0.006688 |
| itemDeposit | 0.001 | 0.000167 |
| metadataDepositBase | 0.006709666617 | 0.0006709666617 |
| attributeDepositBase | 0.00666666666 | 0.000666666666 |

Enhanced Approach to Further Lower Barriers for Entry

This RFC proposes further lowering these deposits below the rate normally charged for such a storage footprint. This is based on the economic argument that sub-rate deposits are a subsidy for growth and adoption of a specific technology. If the NFT functionality on Polkadot gains adoption, it becomes more attractive for future entrants, who would be willing to pay the non-subsidized rate because of the existing community.

Proposed Rate Adjustments

#![allow(unused)]
fn main() {
parameter_types! {
	pub const NftsCollectionDeposit: Balance = system_para_deposit(1, 130);
	pub const NftsItemDeposit: Balance = system_para_deposit(1, 164) / 40;
	pub const NftsMetadataDepositBase: Balance = system_para_deposit(1, 129) / 10;
	pub const NftsAttributeDepositBase: Balance = system_para_deposit(1, 0) / 10;
	pub const NftsDepositPerByte: Balance = system_para_deposit(0, 1);
}
}

This adjustment would result in the following DOT and KSM deposit values:

| Name | Proposed Rate Polkadot | Proposed Rate Kusama |
|------|------------------------|----------------------|
| collectionDeposit | 0.20064 DOT | 0.006688 KSM |
| itemDeposit | 0.005 DOT | 0.000167 KSM |
| metadataDepositBase | 0.002 DOT | 0.0006709666617 KSM |
| attributeDepositBase | 0.002 DOT | 0.000666666666 KSM |

Short- and Long-Term Plans

The plan presented above is recommended as an immediate step to make Polkadot a more attractive place to launch NFTs, although one should note that a forty-fold reduction in the Item Deposit is just as arbitrary as the value it replaces. As explained earlier, this is meant as a subsidy to gain more momentum for NFTs on Polkadot.

In the long term, an implementation should account for what should happen to the deposit rates assuming that the subsidy is successful and attracts a lot of deployments. Many options are discussed in the Addendum.

The deposit should be calculated as a function of the number of existing collections with maximum DOT and stablecoin values limiting the amount. With asset rates available via the Asset Conversion pallet, the system could take the lower value required. A sigmoid curve would make sense for this application to avoid sudden rate changes, as in:

$$ minDeposit + \frac{\min(DotDeposit, StableDeposit) - minDeposit}{1 + e^{a - b \cdot x}} $$

where the constant a moves the inflection to lower or higher x values, the constant b adjusts the rate of the deposit increase, and the independent variable x is the number of collections or items, depending on application.
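
A non-normative sketch of this curve in code, using floating point for readability (an on-chain implementation would use fixed-point arithmetic); the parameters are the ones named above:

#![allow(unused)]
fn main() {
// min_deposit: the floor; dot_deposit / stable_deposit: the DOT- and
// stablecoin-denominated deposits; a, b: curve constants; x: the number of
// collections or items.
fn sigmoid_deposit(min_deposit: f64, dot_deposit: f64, stable_deposit: f64, a: f64, b: f64, x: f64) -> f64 {
    let cap = dot_deposit.min(stable_deposit);
    min_deposit + (cap - min_deposit) / (1.0 + (a - b * x).exp())
}
}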

Drawbacks

Modifying deposit requirements necessitates a balanced assessment of the potential drawbacks. Highlighted below are cogent points extracted from the Polkadot Forum conversation, which provide critical perspectives on the implications of such changes.

Adjusting NFT deposit requirements on Polkadot and Kusama Asset Hubs involves key challenges:

  1. State Growth and Technical Concerns: Lowering deposit requirements can lead to increased blockchain state size, potentially causing state bloat. This growth needs to be managed to prevent strain on the network's resources and maintain operational efficiency. As stated earlier, the deposit levels proposed here are intentionally low with the thesis that future participants would pay the standard rate.

  2. Network Security and Market Response: Adapting to the cryptocurrency market's volatility is crucial. The mechanism for setting deposit amounts must be responsive yet stable, avoiding undue complexity for users.

  3. Economic Impact on Previous Stakeholders: The change could have varied economic effects on previous (before the change) creators, platform operators, and investors. Balancing these interests is essential to ensure the adjustment benefits the ecosystem without negatively impacting its value dynamics. However, in the particular case of the Polkadot and Kusama Asset Hubs this does not pose a concern, since there are currently very few collections and thus previous stakeholders would not be much affected. As of 9 January 2024, there are 42 collections on the Polkadot Asset Hub and 191 on the Kusama Asset Hub, with relatively low volume.

Testing, Security, and Privacy

Security concerns

As noted above, state bloat is a security concern. In the case of abuse, governance could adapt by increasing deposit rates and/or using forceDestroy on collections agreed to be spam.

Performance, Ergonomics, and Compatibility

Performance

The primary performance consideration stems from the potential for state bloat due to increased activity from lower deposit requirements. It's vital to monitor and manage this to avoid any negative impact on the chain's performance. Strategies for mitigating state bloat, including efficient data management and periodic reviews of storage requirements, will be essential.

Ergonomics

The proposed change aims to enhance the user experience for artists, traders, and users of the Kusama and Polkadot Asset Hubs, making Polkadot and Kusama more accessible and user-friendly.

Compatibility

The change does not impact compatibility as a redeposit function is already implemented.

Unresolved Questions

If this RFC is accepted, there should not be any unresolved questions regarding how to adapt the implementation of deposits for NFT collections.

Addendum

Several innovative proposals have been considered to enhance the network's adaptability and manage deposit requirements more effectively. The RFC recommends a mixture of the function-based model and the stablecoin model, but some tradeoffs of each are maintained here for those interested.

Enhanced Weak Governance Origin Model

The concept of a weak governance origin, controlled by a consortium like a system collective, has been proposed. This model would allow for dynamic adjustments of NFT deposit requirements in response to market conditions, adhering to storage deposit norms.

  • Responsiveness: To address concerns about delayed responses, the model could incorporate automated triggers based on predefined market indicators, ensuring timely adjustments.
  • Stability vs. Flexibility: Balancing stability with the need for flexibility is challenging. To mitigate the issue of frequent changes in DOT-based deposits, a mechanism for gradual and predictable adjustments could be introduced.
  • Scalability: The model's scalability is a concern, given the numerous deposits across the system. A more centralized approach to deposit management might be needed to avoid constant, decentralized adjustments.

Function-Based Pricing Model

Another proposal is to use a mathematical function to regulate deposit prices, initially allowing low prices to encourage participation, followed by a gradual increase to prevent network bloat.

  • Choice of Function: A logarithmic or sigmoid function is favored over an exponential one, as these functions increase prices at a rate that encourages participation while preventing prohibitive costs.
  • Adjustment of Constants: To finely tune the pricing rise, one of the function's constants could correlate with the total number of NFTs on Asset Hub. This would align the deposit requirements with the actual usage and growth of the network.

Linking Deposit to USD(x) Value

This approach suggests pegging the deposit value to a stable currency like the USD, introducing predictability and stability for network users.

  • Market Dynamics: One perspective is that fluctuations in native currency value naturally balance user participation and pricing, deterring network spam while encouraging higher-value collections. Conversely, there's an argument for allowing broader participation if the DOT/KSM value increases.
  • Complexity and Risks: Implementing a USD-based pricing system could add complexity and potential risks. The implementation needs to be carefully designed to avoid unintended consequences, such as excessive reliance on external financial systems or currencies.

Each of these proposals offers unique advantages and challenges. The optimal approach may involve a combination of these ideas, carefully adjusted to address the specific needs and dynamics of the Polkadot and Kusama networks.

(source)

Table of Contents

RFC-0047: Assignment of availability chunks to validators

Start Date03 November 2023
DescriptionAn evenly-distributing indirection layer between availability chunks and validators.
AuthorsAlin Dima

Summary

Propose a way of permuting the availability chunk indices assigned to validators, in the context of recovering available data from systematic chunks, with the purpose of fairly distributing network bandwidth usage.

Motivation

Currently, the ValidatorIndex is always identical to the ChunkIndex. Since the validator array is only shuffled once per session, naively using the ValidatorIndex as the ChunkIndex would pose an unreasonable stress on the first N/3 validators during an entire session, when favouring availability recovery from systematic chunks.

Therefore, the relay chain node needs a deterministic way of evenly distributing the first ~(N_VALIDATORS / 3) systematic availability chunks to different validators, based on the relay chain block and core. The main purpose is to ensure fair distribution of network bandwidth usage for availability recovery in general and in particular for systematic chunk holders.

Stakeholders

Relay chain node core developers.

Explanation

Systematic erasure codes

An erasure coding algorithm is considered systematic if it preserves the original unencoded data as part of the resulting code. The implementation of the erasure coding algorithm used for polkadot's availability data is systematic. Roughly speaking, the first N_VALIDATORS/3 chunks of data can be cheaply concatenated to retrieve the original data, without running the resource-intensive and time-consuming reconstruction algorithm.

You can find the concatenation procedure of systematic chunks for polkadot's erasure coding algorithm here

In a nutshell, it performs a column-wise concatenation with 2-byte chunks. The output could be zero-padded at the end, so scale decoding must be aware of the expected length in bytes and ignore trailing zeros (this assertion is already being made for regular reconstruction).
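
As a rough, non-normative sketch of that procedure (the linked implementation is authoritative), assuming chunks holds the systematic chunks in order, each made of 2-byte pieces, and expected_len is the known byte length of the original data:

#![allow(unused)]
fn main() {
// Column-wise concatenation: take the i-th 2-byte piece of every chunk in
// order, then drop any trailing zero padding.
fn reconstruct_from_systematic(chunks: &[Vec<u8>], expected_len: usize) -> Vec<u8> {
    let mut out = Vec::with_capacity(expected_len);
    let chunk_len = chunks.first().map(|c| c.len()).unwrap_or(0);
    let mut i = 0;
    while i + 2 <= chunk_len && out.len() < expected_len {
        for chunk in chunks {
            out.extend_from_slice(&chunk[i..i + 2]);
        }
        i += 2;
    }
    out.truncate(expected_len);
    out
}
}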

Availability recovery at present

According to the polkadot protocol spec:

A validator should request chunks by picking peers randomly and must recover at least f+1 chunks, where n=3f+k and k in {1,2,3}.

For parity's polkadot node implementation, the process was further optimised. At this moment, it works differently based on the estimated size of the available data:

(a) for small PoVs (up to 128 KiB), sequentially try requesting the unencoded data from the backing group, in a random order. If this fails, fall back to option (b).

(b) for large PoVs (over 128 KiB), launch N parallel requests for the erasure-coded chunks (currently, N has an upper limit of 50), until enough chunks have been recovered. Validators are tried in a random order. Then, reconstruct the original data.

All options require that after reconstruction, validators then re-encode the data and re-create the erasure chunks trie in order to check the erasure root.

Availability recovery from systematic chunks

As part of the effort of increasing polkadot's resource efficiency, scalability and performance, work is under way to modify the Availability Recovery protocol by leveraging systematic chunks. See this comment for preliminary performance results.

In this scheme, the relay chain node will first attempt to retrieve the ~N/3 systematic chunks from the validators that should hold them, before falling back to recovering from regular chunks, as before.

A re-encoding step is still needed for verifying the erasure root, so the erasure coding overhead cannot be completely brought down to 0.

Not being able to retrieve even one systematic chunk would make systematic reconstruction impossible. Therefore, backers can be used as a backup to retrieve a couple of missing systematic chunks, before falling back to retrieving regular chunks.

Chunk assignment function

Properties

The function that decides the chunk index for a validator will be parameterized by at least (validator_index, core_index) and have the following properties:

  1. deterministic
  2. relatively quick to compute and resource-efficient.
  3. when considering a fixed core_index, the function should describe a permutation of the chunk indices
  4. the validators that map to the first N/3 chunk indices should have as little overlap as possible for different cores.

In other words, we want a uniformly distributed, deterministic mapping from ValidatorIndex to ChunkIndex per core.

It's desirable to not embed this function in the runtime, for performance and complexity reasons. However, this means that the function needs to be kept very simple and with minimal or no external dependencies. Any change to this function could result in parachains being stalled and needs to be coordinated via a runtime upgrade or governance call.

Proposed function

Pseudocode:

#![allow(unused)]
fn main() {
pub fn get_chunk_index(
  n_validators: u32,
  validator_index: ValidatorIndex,
  core_index: CoreIndex
) -> ChunkIndex {
  let threshold = systematic_threshold(n_validators); // Roughly n_validators/3
  let core_start_pos = core_index * threshold;

  (core_start_pos + validator_index) % n_validators
}
}
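
As a worked example: with n_validators = 10, the systematic threshold is 4 (f + 1, where n = 3f + 1), so for core_index = 2 the starting position is 8 and validator 5 would be assigned chunk index (8 + 5) % 10 = 3.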

Network protocol

The request-response /req_chunk protocol will be bumped to a new version (from v1 to v2). For v1, the request and response payloads are:

#![allow(unused)]
fn main() {
/// Request an availability chunk.
pub struct ChunkFetchingRequest {
	/// Hash of candidate we want a chunk for.
	pub candidate_hash: CandidateHash,
	/// The index of the chunk to fetch.
	pub index: ValidatorIndex,
}

/// Receive a requested erasure chunk.
pub enum ChunkFetchingResponse {
	/// The requested chunk data.
	Chunk(ChunkResponse),
	/// Node was not in possession of the requested chunk.
	NoSuchChunk,
}

/// This omits the chunk's index because it is already known by
/// the requester and by not transmitting it, we ensure the requester is going to use his index
/// value for validating the response, thus making sure he got what he requested.
pub struct ChunkResponse {
	/// The erasure-encoded chunk of data belonging to the candidate block.
	pub chunk: Vec<u8>,
	/// Proof for this chunk's branch in the Merkle tree.
	pub proof: Proof,
}
}

Version 2 will add an index field to ChunkResponse:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Encode, Decode)]
pub struct ChunkResponse {
	/// The erasure-encoded chunk of data belonging to the candidate block.
	pub chunk: Vec<u8>,
	/// Proof for this chunk's branch in the Merkle tree.
	pub proof: Proof,
	/// Chunk index.
	pub index: ChunkIndex
}
}

An important thing to note is that in version 1, the ValidatorIndex value is always equal to the ChunkIndex. Until the chunk rotation feature is enabled, this will also be true for version 2. However, after the feature is enabled, this will generally not be true.

The requester will send the request to validator with index V. The responder will map the V validator index to the C chunk index and respond with the C-th chunk. This mapping can be seamless, by having each validator store their chunk by ValidatorIndex (just as before).

The protocol implementation MAY check the returned ChunkIndex against the expected mapping to ensure that it received the right chunk. In practice, this is desirable during availability-distribution and systematic chunk recovery. However, regular recovery may not check this index, which is particularly useful when participating in disputes that don't allow for easy access to the validator->chunk mapping. See Appendix A for more details.

In any case, the requester MUST verify the chunk's proof using the provided index.

During availability-recovery, given that the requester may not know (if the mapping is not available) whether the received chunk corresponds to the requested validator index, it has to keep track of received chunk indices and ignore duplicates. Such duplicates should be considered the same as an invalid/garbage response (drop it and move on to the next validator - we can't punish via reputation changes, because we don't know which validator misbehaved).

Upgrade path

Step 1: Enabling new network protocol

In the beginning, both /req_chunk/1 and /req_chunk/2 will be supported, until all validators and collators have upgraded to use the new version. V1 will be considered deprecated. During this step, the mapping will still be 1:1 (ValidatorIndex == ChunkIndex), regardless of protocol. Once all nodes are upgraded, a new release will be cut that removes the v1 protocol. Only once all nodes have upgraded to this version will step 2 commence.

Step 2: Enabling the new validator->chunk mapping

Considering that the Validator->Chunk mapping is critical to para consensus, the change needs to be enacted atomically via governance, only after all validators have upgraded the node to a version that is aware of this mapping, functionality-wise. It needs to be explicitly stated that after the governance enactment, validators that run older client versions that don't support this mapping will not be able to participate in parachain consensus.

Additionally, an error will be logged when starting a validator with an older version, after the feature was enabled.

On the other hand, collators will not be required to upgrade in this step (but are still required to upgrade for step 1), as regular chunk recovery will work as before, granted that version 1 of the networking protocol has been removed. Note that collators only perform availability-recovery in rare, adversarial scenarios, so it is fine to not optimise for this case and let them upgrade at their own pace.

To support enabling this feature via the runtime, we will use the NodeFeatures bitfield of the HostConfiguration struct (added in https://github.com/paritytech/polkadot-sdk/pull/2177). Adding and enabling a feature with this scheme does not require a runtime upgrade, but only a referendum that issues a Configuration::set_node_feature extrinsic. Once the feature is enabled and new configuration is live, the validator->chunk mapping ceases to be a 1:1 mapping and systematic recovery may begin.

Drawbacks

  • Getting access to the core_index that used to be occupied by a candidate in some parts of the dispute protocol is very complicated (See appendix A). This RFC assumes that availability-recovery processes initiated during disputes will only use regular recovery, as before. This is acceptable since disputes are rare occurrences in practice and is something that can be optimised later, if need be. Adding the core_index to the CandidateReceipt would mitigate this problem and will likely be needed in the future for CoreJam and/or Elastic scaling. Related discussion about updating CandidateReceipt
  • It's a breaking change that requires all validators and collators to upgrade their node version at least once.

Testing, Security, and Privacy

Extensive testing will be conducted - both automated and manual. This proposal doesn't affect security or privacy.

Performance, Ergonomics, and Compatibility

Performance

This is a necessary data availability optimisation, as reed-solomon erasure coding has proven to be a top consumer of CPU time in polkadot as we scale up the parachain block size and number of availability cores.

With this optimisation, preliminary performance results show that CPU time used for reed-solomon coding/decoding can be halved and total PoV recovery time decreases by 80% for large PoVs. See more here.

Ergonomics

Not applicable.

Compatibility

This is a breaking change. See upgrade path section above. All validators and collators need to have upgraded their node versions before the feature will be enabled via a governance call.

Prior Art and References

See comments on the tracking issue and the in-progress PR

Unresolved Questions

Not applicable.

This enables future optimisations for the performance of availability recovery, such as retrieving batched systematic chunks from backers/approval-checkers.

Appendix A

This appendix details the intricacies of getting access to the core index of a candidate in parity's polkadot node.

Here, core_index refers to the index of the core that a candidate was occupying while it was pending availability (from backing to inclusion).

Availability-recovery can currently be triggered by the following phases in the polkadot protocol:

  1. During the approval voting process.
  2. By other collators of the same parachain.
  3. During disputes.

Getting the right core index for a candidate can be troublesome. Here's a breakdown of how different parts of the node implementation can get access to it:

  1. The approval-voting process for a candidate begins after observing that the candidate was included. Therefore, the node has easy access to the block where the candidate got included (and also the core that it occupied).

  2. The pov_recovery task of the collators starts availability recovery in response to noticing a candidate getting backed, which enables easy access to the core index the candidate started occupying.

  3. Disputes may be initiated on a number of occasions:

    3.a. is initiated by the validator as a result of finding an invalid candidate while participating in the approval-voting protocol. In this case, availability-recovery is not needed, since the validator already issued their vote.

    3.b is initiated by the validator noticing dispute votes recorded on-chain. In this case, we can safely assume that the backing event for that candidate has been recorded and kept in memory.

    3.c is initiated as a result of getting a dispute statement from another validator. It is possible that the dispute is happening on a fork that was not yet imported by this validator, so the subsystem may not have seen this candidate being backed.

A naive attempt of solving 3.c would be to add a new version for the disputes request-response networking protocol. Blindly passing the core index in the network payload would not work, since there is no way of validating that the reported core_index was indeed the one occupied by the candidate at the respective relay parent.

Another attempt could be to include in the message the relay block hash where the candidate was included. This information would be used in order to query the runtime API and retrieve the core index that the candidate was occupying. However, considering it's part of an unimported fork, the validator cannot call a runtime API on that block.

Adding the core_index to the CandidateReceipt would solve this problem and would enable systematic recovery for all dispute scenarios.

(source)

Table of Contents

RFC-0048: Generate ownership proof for SessionKeys

Start Date13 November 2023
DescriptionChange SessionKeys runtime api to support generating an ownership proof for the on chain registration.
AuthorsBastian Köcher

Summary

This RFC proposes changes to the SessionKeys::generate_session_keys runtime api interface. This runtime api is used by validator operators to generate new session keys on a node. The public session keys are then registered manually on chain by the validator operator. Before this RFC, it was not possible for the on chain logic to ensure that the account setting the public session keys is also in possession of the private session keys. To solve this, the RFC proposes to pass the account id of the account doing the registration on chain to generate_session_keys. Further, this RFC proposes to change the return value of the generate_session_keys function to return not only the public session keys, but also a proof of ownership for the private session keys. The validator operator will then need to send the public session keys and the proof together when registering new session keys on chain.

Motivation

When submitting the new public session keys to the on chain logic, there doesn't exist any verification of possession of the private session keys. This means that users can basically register any kind of public session keys on chain. While the on chain logic ensures that there are no duplicate keys, someone could try to prevent others from registering new session keys by setting them first. While this wouldn't bring the "attacker" any advantage, and rather only disadvantages (potential slashes on their account), it could prevent someone from e.g. changing their session keys in the event of a private session key leak.

After this RFC, this kind of attack would not be possible anymore, because the on chain logic can verify that the sending account is in possession of the private session keys.

Stakeholders

  • Polkadot runtime implementors
  • Polkadot node implementors
  • Validator operators

Explanation

We are first going to explain the proof format being used:

#![allow(unused)]
fn main() {
type Proof = (Signature, Signature, ..);
}

The proof is a SCALE-encoded tuple containing one signature from each private session key over the account_id. The actual type of each signature depends on the corresponding session key's cryptographic algorithm. The order of the signatures in the proof is the same as the order of the session keys in the SessionKeys type declared in the runtime.

The version of the SessionKeys runtime api needs to be bumped to 1 to reflect the changes to the signature of SessionKeys_generate_session_keys:

#![allow(unused)]
fn main() {
pub struct OpaqueGeneratedSessionKeys {
	pub keys: Vec<u8>,
	pub proof: Vec<u8>,
}

fn SessionKeys_generate_session_keys(account_id: Vec<u8>, seed: Option<Vec<u8>>) -> OpaqueGeneratedSessionKeys;
}

The default calling convention for runtime apis is applied, meaning the parameters are passed as a pointer to a SCALE-encoded array together with the length of that array. The return value is a u64 packing the pointer to the SCALE-encoded return value and its length (array_ptr | length << 32). So, the actual exported function signature looks like:

#![allow(unused)]
fn main() {
fn SessionKeys_generate_session_keys(array: *const u8, len: usize) -> u64;
}
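
For illustration, unpacking that u64 on the caller side could look like the following sketch (the function name is illustrative):

#![allow(unused)]
fn main() {
// Split the packed return value into the pointer to the SCALE-encoded data
// and its length, as per (array_ptr | length << 32).
fn unpack_return_value(ret: u64) -> (u32, u32) {
    ((ret & 0xffff_ffff) as u32, (ret >> 32) as u32)
}
}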

The on chain logic for setting the SessionKeys needs to be changed as well. It already gets the proof passed as Vec<u8>. This proof needs to be decoded to the actual Proof type as explained above. The proof and the SCALE encoded account_id of the sender are used to verify the ownership of the SessionKeys.
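
A minimal sketch of that verification, with Signature and Public as hypothetical placeholder types; the actual implementation iterates over the concrete key types declared in the runtime's SessionKeys:

#![allow(unused)]
fn main() {
// Placeholder types standing in for the concrete session key types.
struct Signature;
struct Public;

impl Public {
    fn verify(&self, _message: &[u8], _signature: &Signature) -> bool {
        unimplemented!()
    }
}

// The proof is valid if every registered public session key has a matching
// signature over the SCALE-encoded account id of the sender.
fn verify_ownership(proof: &[Signature], public_keys: &[Public], encoded_account_id: &[u8]) -> bool {
    proof.len() == public_keys.len()
        && proof
            .iter()
            .zip(public_keys)
            .all(|(sig, key)| key.verify(encoded_account_id, sig))
}
}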

Drawbacks

Validator operators need to pass their account id when rotating their session keys on a node. This will require updating some high-level docs and making users familiar with the slightly changed ergonomics.

Testing, Security, and Privacy

Testing of the new changes only requires passing an appropriate owner for the current testing context. The changes to the proof generation and verification got audited to ensure they are correct.

Performance, Ergonomics, and Compatibility

Performance

The session key generation is an offchain process and thus doesn't influence the performance of the chain. Verifying the proof is done on chain as part of the transaction logic for setting the session keys. The verification requires one signature check per individual session key. As setting the session keys happens quite rarely, it should not influence the overall system performance.

Ergonomics

The interfaces have been optimized to make it as easy as possible to generate the ownership proof.

Compatibility

Introduces a new version of the SessionKeys runtime api. Thus, nodes should be updated before a runtime is enacted that contains these changes otherwise they will fail to generate session keys. The RPC that exists around this runtime api needs to be updated to support passing the account id and for returning the ownership proof alongside the public session keys.

UIs would need to be updated to support the new RPC and the changed on chain logic.

Prior Art and References

None.

Unresolved Questions

None.

Substrate implementation of the RFC.

(source)

Table of Contents

RFC-0050: Fellowship Salaries

Start Date15 November 2023
DescriptionProposal to set rank-based Fellowship salary levels.
AuthorsJoe Petrowski, Gavin Wood

Summary

The Fellowship Manifesto states that members should receive a monthly allowance on par with gross income in OECD countries. This RFC proposes concrete amounts.

Motivation

One motivation for the Technical Fellowship is to provide an incentive mechanism that can induct and retain technical talent for the continued progress of the network.

In order for members to uphold their commitment to the network, they should receive support to ensure that their needs are met such that they have the time to dedicate to their work on Polkadot. Given the high expectations of Fellows, it is reasonable to consider contributions and requirements on par with a full-time job. Providing a livable wage to those making such contributions makes it pragmatic to work full-time on Polkadot.

Note: Goals of the Fellowship, expectations for each Dan, and conditions for promotion and demotion are all explained in the Manifesto. This RFC is only to propose concrete values for allowances.

Stakeholders

  • Fellowship members
  • Polkadot Treasury

Explanation

This RFC proposes agreeing on salaries relative to a single level, the III Dan. As such, changes to the amount or asset used would only be on a single value, and all others would adjust relatively. A III Dan is someone whose contributions match the expectations of a full-time individual contributor. The salary at this level should be reasonably close to averages in OECD countries.

| Dan | Factor |
|-----|--------|
| I | 0.125 |
| II | 0.25 |
| III | 1 |
| IV | 1.5 |
| V | 2.0 |
| VI | 2.5 |
| VII | 2.5 |
| VIII | 2.5 |
| IX | 2.5 |

Note that there is a sizable increase between II Dan (Proficient) and III Dan (Fellow). By the third Dan, it is generally expected that one is working on Polkadot as their primary focus in a full-time capacity.

Salary Asset

Although the Manifesto (Section 8) specifies a monthly allowance in DOT, this RFC proposes the use of USDT instead. The allowance is meant to provide members stability in meeting their day-to-day needs and recognize contributions. Using USDT provides more stability and less speculation.

This RFC proposes that a III Dan earn 80,000 USDT per year. The salary at this level is commensurate with average salaries in OECD countries (note: 77,000 USD in the U.S., with an average engineer at 100,000 USD). The other ranks would thus earn:

| Dan | Annual Salary |
|-----|---------------|
| I | 10,000 |
| II | 20,000 |
| III | 80,000 |
| IV | 120,000 |
| V | 160,000 |
| VI | 200,000 |
| VII | 200,000 |
| VIII | 200,000 |
| IX | 200,000 |

The salary levels for Architects (IV, V, and VI Dan) are typical of senior engineers.

Allowances will be managed by the Salary pallet.

Projections

Based on the current membership, the maximum yearly and monthly costs are shown below:

| Dan | Salary | Members | Yearly | Monthly |
|-----|--------|---------|--------|---------|
| I | 10,000 | 27 | 270,000 | 22,500 |
| II | 20,000 | 11 | 220,000 | 18,333 |
| III | 80,000 | 8 | 640,000 | 53,333 |
| IV | 120,000 | 3 | 360,000 | 30,000 |
| V | 160,000 | 5 | 800,000 | 66,667 |
| VI | 200,000 | 3 | 600,000 | 50,000 |
| > VI | 200,000 | 0 | 0 | 0 |
| Total | | | 2,890,000 | 240,833 |

Note that these are the maximum amounts; members may choose to take a passive (lower) level. On the other hand, more people will likely join the Fellowship in the coming years.

Updates

Updates to these levels, whether relative ratios, the asset used, or the amount, shall be done via RFC.

Drawbacks

By not using DOT for payment, the protocol relies on the stability of other assets and the ability to acquire them. However, the asset of choice can be changed in the future.

Testing, Security, and Privacy

N/A.

Performance, Ergonomics, and Compatibility

Performance

N/A

Ergonomics

N/A

Compatibility

N/A

Prior Art and References

Unresolved Questions

None at present.

(source)

Table of Contents

RFC-0056: Enforce only one transaction per notification

Start Date2023-11-30
DescriptionModify the transactions notifications protocol to always send only one transaction at a time
AuthorsPierre Krieger

Summary

When two peers connect to each other, they open (amongst other things) a so-called "notifications protocol" substream dedicated to gossiping transactions to each other.

Each notification on this substream currently consists of a SCALE-encoded Vec<Transaction>, where Transaction is defined in the runtime.

This RFC proposes to modify the format of the notification to become (Compact(1), Transaction). This maintains backwards compatibility, as this new format decodes as a Vec of length equal to 1.

Motivation

There are three motivations behind this change:

  • It is technically impossible to decode a SCALE-encoded Vec<Transaction> into a list of SCALE-encoded transactions without knowing how to decode a Transaction. That's because a Vec<Transaction> consists of several Transactions one after the other in memory, without any delimiter that indicates the end of a transaction and the start of the next. Unfortunately, the format of a Transaction is runtime-specific. This means that the code that receives notifications is necessarily tied to a specific runtime, and it is not possible to write runtime-agnostic code.

  • Notifications protocols are already designed to be optimized to send many items. Currently, when it comes to transactions, each item is a Vec<Transaction> that consists of multiple sub-items of type Transaction. This two-step hierarchy is completely unnecessary, and was originally written at a time when the networking protocol of Substrate didn't have proper multiplexing.

  • It makes the implementation much more straightforward by not having to repeat code related to back-pressure. See explanations below.

Stakeholders

Low-level developers.

Explanation

To give an example, if you send one notification with three transactions, the bytes that are sent on the wire are:

concat(
    leb128(total-size-in-bytes-of-the-rest),
    scale(compact(3)), scale(transaction1), scale(transaction2), scale(transaction3)
)

But you can also send three notifications of one transaction each, in which case it is:

concat(
    leb128(size(scale(transaction1)) + 1), scale(compact(1)), scale(transaction1),
    leb128(size(scale(transaction2)) + 1), scale(compact(1)), scale(transaction2),
    leb128(size(scale(transaction3)) + 1), scale(compact(1)), scale(transaction3)
)

Right now the sender can choose which of the two encodings to use. This RFC proposes to make the second encoding mandatory.

The format of the notification would become a SCALE-encoded (Compact(1), Transaction). A SCALE-compact encoded 1 is one byte of value 4. In other words, the format of the notification would become concat(&[4], scale_encoded_transaction). This is equivalent to forcing the Vec<Transaction> to always have a length of 1, and I expect the Substrate implementation to simply modify the sending side to add a for loop that sends one notification per item in the Vec.
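
As a sketch of what the sending side would produce after this change (the helper name is illustrative only):

#![allow(unused)]
fn main() {
// Encode one notification per transaction: the SCALE compact encoding of the
// length 1 (the single byte 0x04) followed by the SCALE-encoded transaction.
fn encode_notifications(scale_encoded_transactions: &[Vec<u8>]) -> Vec<Vec<u8>> {
    scale_encoded_transactions
        .iter()
        .map(|tx| {
            let mut notification = Vec::with_capacity(1 + tx.len());
            notification.push(4u8); // SCALE compact(1)
            notification.extend_from_slice(tx);
            notification
        })
        .collect()
}
}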

As explained in the motivation section, this allows extracting scale(transaction) items without having to know how to decode them.

By "flattening" the two-step hierarchy, an implementation only needs to back-pressure individual notifications rather than back-pressure notifications and transactions within notifications.

Drawbacks

This RFC chooses to maintain backwards compatibility at the cost of introducing a very small wart (the Compact(1)).

An alternative could be to introduce a new version of the transactions notifications protocol that sends one Transaction per notification, but this is significantly more complicated to implement and can always be done later in case the Compact(1) is bothersome.

Testing, Security, and Privacy

Irrelevant.

Performance, Ergonomics, and Compatibility

Performance

Irrelevant.

Ergonomics

Irrelevant.

Compatibility

The change is backwards compatible if done in two steps: modify the sender to always send one transaction per notification, then, after a while, modify the receiver to enforce the new format.

Prior Art and References

Irrelevant.

Unresolved Questions

None.

None. This is a simple isolated change.

(source)

Table of Contents

RFC-0059: Add a discovery mechanism for nodes based on their capabilities

Start Date2023-12-18
DescriptionNodes having certain capabilities register themselves in the DHT to be discoverable
AuthorsPierre Krieger

Summary

This RFC proposes to make the mechanism of RFC #8 more generic by introducing the concept of "capabilities".

Implementations can implement certain "capabilities", such as serving old block headers or being a parachain bootnode.

The discovery mechanism of RFC #8 is extended to be able to discover nodes of specific capabilities.

Motivation

The Polkadot peer-to-peer network is made of nodes. Not all these nodes are equal. Some nodes store only the headers of recent blocks, some nodes store all the block headers and bodies since the genesis, some nodes store the storage of all blocks since the genesis, and so on.

It is currently not possible to know ahead of time (without connecting to it and asking) which nodes have which data available, and it is not easily possible to build a list of nodes that have a specific piece of data available.

If you want to download for example the header of block 500, you have to connect to a randomly-chosen node, ask it for block 500, and if it says that it doesn't have the block, disconnect and try another randomly-chosen node. In certain situations such as downloading the storage of old blocks, nodes that have the information are relatively rare, and finding through trial and error a node that has the data can take a long time.

This RFC attempts to solve this problem by giving the possibility to build a list of nodes that are capable of serving specific data.

Stakeholders

Low-level client developers. People interested in accessing the archive of the chain.

Explanation

Reading RFC #8 first might help with comprehension, as this RFC is very similar.

Please keep in mind while reading that everything below applies for both relay chains and parachains, except mentioned otherwise.

Capabilities

This RFC defines a list of so-called capabilities:

  • Head of chain provider. An implementation with this capability must be able to serve to other nodes block headers, block bodies, justifications, calls proofs, and storage proofs of "recent" (see below) blocks, and, for relay chains, to serve to other nodes warp sync proofs where the starting block is a session change block and must participate in Grandpa and Beefy gossip.
  • History provider. An implementation with this capability must be able to serve to other nodes block headers and block bodies of any block since the genesis, and must be able to serve to other nodes justifications of any session change block since the genesis up until and including their currently finalized block.
  • Archive provider. This capability is a superset of History provider. In addition to the requirements of History provider, an implementation with this capability must be able to serve call proofs and storage proof requests of any block since the genesis up until and including their currently finalized block.
  • Parachain bootnode (only for relay chains). An implementation with this capability must be able to serve the network request described in RFC 8.

More capabilities might be added in the future.

In the context of the head of chain provider, the word "recent" means: any not-yet-finalized block that is equal to or an ancestor of a block that it has announced through a block announcement, and any finalized block whose height is greater than its current finalized block minus 16. This does not include blocks that have been pruned because they're not a descendant of its current finalized block. In other words, blocks that aren't a descendant of the current finalized block can be thrown away. A gap of blocks is required due to race conditions: when a node finalizes a block, it takes some time for its peers to be made aware of this, during which they might send requests concerning older blocks. The choice of the number of blocks in this gap is arbitrary.

Substrate is currently by default a head of chain provider. After it has finished warp syncing, it downloads the list of old blocks, after which it becomes a history provider. If Substrate is instead configured as an archive node, then it downloads all blocks since the genesis and builds their state, after which it becomes an archive provider, history provider, and head of chain provider. If block pruning is enabled and the chain is a relay chain, then Substrate unfortunately doesn't implement any of these capabilities, not even head of chain provider. This is considered a bug that should be fixed, see https://github.com/paritytech/polkadot-sdk/issues/2733.

DHT provider registration

This RFC heavily relies on the functionalities of the Kademlia DHT already in use by Polkadot. You can find a link to the specification here.

Implementations that have the history provider capability should register themselves as providers under the key sha256(concat("history", randomness)).

Implementations that have the archive provider capability should register themselves as providers under the key sha256(concat("archive", randomness)).

Implementations that have the parachain bootnode capability should register themselves as providers under the key sha256(concat(scale_compact(para_id), randomness)), as described in RFC 8.

"Register themselves as providers" consists in sending ADD_PROVIDER requests to nodes close to the key, as described in the Content provider advertisement section of the specification.

The value of randomness can be found in the randomness field when calling the BabeApi_currentEpoch function.

In order to avoid downtime when the key changes, nodes should also register themselves under a secondary key that uses a value of randomness equal to the randomness field when calling BabeApi_nextEpoch.

Implementers should be aware that their implementation of Kademlia might already hash the key before XOR'ing it. The key is not meant to be hashed twice.
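
A minimal sketch of the key derivation for the history capability, assuming the sha2 crate (the archive and parachain bootnode keys substitute their respective prefixes):

#![allow(unused)]
use sha2::{Digest, Sha256};

fn main() {
// Key under which history providers register themselves:
// sha256(concat("history", randomness)).
fn history_provider_key(randomness: &[u8]) -> [u8; 32] {
    let mut hasher = Sha256::new();
    hasher.update(b"history");
    hasher.update(randomness);
    hasher.finalize().into()
}
}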

Implementations must not register themselves if they don't fulfill the capability yet. For example, a node configured to be an archive node but that is still building its archive state in the background must register itself only after it has finished building its archive.

Secondary DHTs

Implementations that have the history provider capability must also participate in a secondary DHT that comprises only nodes with that capability. The protocol name of that secondary DHT must be /<genesis-hash>/kad/history.

Similarly, implementations that have the archive provider capability must also participate in a secondary DHT that comprises only nodes with that capability and whose protocol name is /<genesis-hash>/kad/archive.

Just like implementations must not register themselves if they don't fulfill their capability yet, they must also not participate in the secondary DHT if they don't fulfill their capability yet.

Head of the chain providers

Implementations that have the head of the chain provider capability do not register themselves as providers, but instead are the nodes that participate in the main DHT. In other words, they are the nodes that serve requests of the /<genesis_hash>/kad protocol.

Any implementation that isn't a head of the chain provider (read: light clients) must not participate in the main DHT. This is already presently the case.

Implementations must not participate in the main DHT if they don't fulfill the capability yet. For example, a node that is still in the process of warp syncing must not participate in the main DHT. However, assuming that warp syncing doesn't last more than a few seconds, it is acceptable to ignore this requirement in order to avoid complicating implementations too much.

Drawbacks

None that I can see.

Testing, Security, and Privacy

The content of this section is basically the same as the one in RFC 8.

This mechanism doesn't add or remove any security by itself, as it relies on existing mechanisms.

Due to the way Kademlia works, it would become the responsibility of the 20 Polkadot nodes whose sha256(peer_id) is closest to the key (described in the explanations section) to store the list of nodes that have specific capabilities. Furthermore, when a large number of providers are registered, only the providers closest to the key are kept, up to a certain implementation-defined limit.

For this reason, an attacker can abuse this mechanism by randomly generating libp2p PeerIds until they find the 20 entries closest to the key representing the target capability. They are then in control of the list of nodes with that capability. While doing this can in no way be actually harmful, it could lead to eclipse attacks.

Because the key changes periodically and isn't predictable, and assuming that the Polkadot DHT is sufficiently large, it is not realistic for an attack like this to be maintained in the long term.

Performance, Ergonomics, and Compatibility

Performance

The DHT mechanism generally has a low overhead, especially given that publishing providers is done only every 24 hours.

Doing a Kademlia iterative query then sending a provider record shouldn't take more than around 50 kiB in total of bandwidth for the parachain bootnode.

Assuming 1000 nodes with a specific capability, the 20 Polkadot full nodes corresponding to that capability will each receive a sudden spike of a few megabytes of networking traffic when the key rotates. Again, this is relatively negligible. If this becomes a problem, one can add a random delay before a node registers itself to be the provider of the key corresponding to BabeApi_next_epoch.

Maybe the biggest uncertainty is the traffic that the 20 Polkadot full nodes will receive from light clients that want to know the nodes with a capability. If this ever becomes a problem, this value of 20 is an arbitrary constant that can be increased for more redundancy.

Ergonomics

Irrelevant.

Compatibility

Irrelevant.

Prior Art and References

Unknown.

Unresolved Questions

While it fundamentally doesn't change much to this RFC, using BabeApi_currentEpoch and BabeApi_nextEpoch might be inappropriate. I'm not familiar enough with good practices within the runtime to have an opinion here. Should it be an entirely new pallet?

This RFC would make it possible to reliably discover archive nodes, which would make it possible to reliably send archive node requests, something that isn't currently possible. This could solve the problem of finding archive RPC node providers by migrating archive-related requests to the native peer-to-peer protocol rather than JSON-RPC.

If we ever decide to break backwards compatibility, we could divide the "history" and "archive" capabilities in two, between nodes capable of serving older blocks and nodes capable of serving newer blocks. We could even add to the peer-to-peer network nodes that are only capable of serving older blocks (by reading from a database) but do not participate in the head of the chain, and that just exist for historical purposes.

(source)

Table of Contents

RFC-0078: Merkleized Metadata

Start Date22 February 2024
DescriptionInclude merkleized metadata hash in extrinsic signature for trust-less metadata verification.
AuthorsZondax AG, Parity Technologies

Summary

To interact with chains in the Polkadot ecosystem it is required to know how transactions are encoded and how to read state. For doing this, Polkadot-SDK, the framework used by most of the chains in the Polkadot ecosystem, exposes metadata about the runtime to the outside. UIs, wallets, and others can use this metadata to interact with these chains. This makes the metadata a crucial piece of the transaction encoding as users are relying on the interacting software to encode the transactions in the correct format.

It gets even more important when the user signs the transaction on an offline wallet, as the device by its nature cannot get access to the metadata without relying on the online wallet to provide it. This means the offline wallet needs to trust an online party, rendering the security assumptions of the offline device moot.

This RFC proposes a way for offline wallets to leverage metadata within the constraints of these devices. The design idea is that the metadata is chunked and these chunks are put into a merkle tree. The root hash of this merkle tree represents the metadata. The offline wallets can use the root hash to decode transactions by getting proofs for the individual chunks of the metadata. This root hash is also included in the signed data of the transaction (but not sent as part of the transaction). The runtime then includes its known metadata root hash when verifying the transaction. If the metadata root hash known by the runtime differs from the one that the offline wallet used, it very likely means that the online wallet provided some fake data and the verification of the transaction fails.

Users depend on offline wallets to correctly display decoded transactions before signing. With merkleized metadata, they can be assured of the transaction's legitimacy, as incorrect transactions will be rejected by the runtime.

Motivation

Polkadot's innovative design (both relay chain and parachains) gives developers the ability to upgrade their networks as frequently as they need. These systems manage to keep integrations working after upgrades with the help of FRAME Metadata. This Metadata, which is in the order of half a MiB for most Polkadot-SDK chains, completely describes chain interfaces and properties. Securing this metadata is key for users to be able to interact with the Polkadot-SDK chain in the expected way.

On the other hand, offline wallets provide a secure way for blockchain users to hold their own keys (some do a better job than others). These devices seldom get upgraded, usually support one particular network, and have very small internal memory. Currently in the Polkadot ecosystem there is no secure way for these offline devices to know the latest Metadata of the Polkadot-SDK chain they are interacting with. This results in a plethora of similar yet slightly different offline wallets for all the different Polkadot-SDK chains, as well as the difficulty of keeping these regularly updated, thus not fully leveraging Polkadot-SDK's unique forkless upgrade feature.

The two main reasons why this is not possible today are:

  1. Metadata is too large for offline devices. Currently Polkadot-SDK metadata is on average 500 KiB, which is more than most widely adopted offline devices can hold.
  2. Metadata is not authenticated. Even if there were enough space on offline devices to hold the metadata, the user would be trusting the entity providing this metadata to the hardware wallet. In the Polkadot ecosystem, this is how Polkadot Vault currently works.

This RFC proposes a solution to make FRAME Metadata compatible with offline signers in a secure way. As it leverages FRAME Metadata, it not only ensures that offline devices can always keep up to date with every FRAME based chain, but also that every offline wallet will be compatible with all FRAME based chains, avoiding the need for per-chain implementations.

Requirements

  1. Metadata's integrity MUST be preserved. If any compromise were to happen, extrinsics sent with compromised metadata SHOULD fail.
  2. Metadata information that could be used in signable extrinsic decoding MAY be included in digest, yet its inclusion MUST be indicated in signed extensions.
  3. Digest MUST be deterministic with respect to metadata.
  4. Digest MUST be cryptographically strong against pre-image attacks, both first (finding an input that results in a given digest) and second (finding an input that results in the same digest as some other given input).
  5. Extra-metadata information necessary for extrinsic decoding and constant within runtime version MUST be included in digest.
  6. It SHOULD be possible to quickly withdraw offline signing mechanism without access to cold signing devices.
  7. Digest format SHOULD be versioned.
  8. Work necessary for proving metadata authenticity MAY be omitted at discretion of signer device design (to support automation tools).

Reduce metadata size

Metadata should be stripped of parts that are not necessary to parse a signable extrinsic, then it should be separated into a finite set of self-descriptive chunks. Thus, a subset of chunks necessary for signable extrinsic decoding and rendering could be sent, possibly in small portions (ultimately, one at a time), to cold devices together with the proof.

  1. A single chunk with its proof payload SHOULD fit within a few kB;
  2. Chunks handling mechanism SHOULD support chunks being sent in any order without memory utilization overhead;
  3. Unused enum variants MUST be stripped (this has great impact on transmitted metadata size; examples: era enum, enum with all calls for call batching).

Stakeholders

  • Runtime implementors
  • UI/wallet implementors
  • Offline wallet implementors

The idea for this RFC was brought up by runtime implementors and was extensively discussed with offline wallet implementors. It was designed in such a way that it can work easily with the existing offline wallet solutions in the Polkadot ecosystem.

Explanation

The FRAME metadata provides a wide range of information about a FRAME based runtime. It contains information about the pallets, the calls per pallet, the storage entries per pallet, runtime APIs, and type information about most of the types that are used in the runtime. For decoding extrinsics on an offline wallet, what is mainly required is type information. Most of the other information in the FRAME metadata is actually not required for decoding extrinsics and thus it can be removed. Therefore, the following is a proposal on a custom representation of the metadata and how this custom metadata is chunked, ensuring that only the needed chunks required for decoding a particular extrinsic are sent to the offline wallet. The necessary information to transform the FRAME metadata type information into the type information presented in this RFC will be provided. However, not every single detail on how to convert from FRAME metadata into the RFC type information is described.

First, the MetadataDigest is introduced. After that, ExtrinsicMetadata is covered and finally the actual format of the type information. Then the pruning of unrelated type information is covered, as well as how to generate the TypeRefs. In the last step, the merkle tree calculation is explained.

Metadata digest

The metadata digest is the compact representation of the metadata. The hash of this digest is the metadata hash. Below the type declaration of the Hash type and the MetadataDigest itself can be found:

#![allow(unused)]
fn main() {
type Hash = [u8; 32];

enum MetadataDigest {
    #[index = 1]
    V1 {
        type_information_tree_root: Hash,
        extrinsic_metadata_hash: Hash,
        spec_version: u32,
        spec_name: String,
        base58_prefix: u16,
        decimals: u8,
        token_symbol: String,
    },
}
}

The Hash is 32 bytes long and blake3 is used for calculating it. The hash of the MetadataDigest is calculated by blake3(SCALE(MetadataDigest)). Therefore, the MetadataDigest is first SCALE encoded, and then those bytes are hashed.

The MetadataDigest itself is represented as an enum. This is done to make it future proof, because a SCALE encoded enum is prefixed by the index of the variant. This index represents the version of the digest. As seen above, there is no index zero and it starts directly with one. Version one of the digest contains the following elements:

  • type_information_tree_root: The root of the merkleized type information tree.
  • extrinsic_metadata_hash: The hash of the extrinsic metadata.
  • spec_version: The spec_version of the runtime as found in the RuntimeVersion when generating the metadata. While this information can also be found in the metadata, it is hidden in a big blob of data. To avoid transferring this big blob of data, we directly add this information here.
  • spec_name: Similar to spec_version, but being the spec_name found in the RuntimeVersion.
  • base58_prefix: The base58 (SS58) prefix used for address encoding.
  • decimals: The number of decimals for the token.
  • token_symbol: The symbol of the token.

Extrinsic metadata

For decoding an extrinsic, more information on what types are being used is required. The actual format of the extrinsic is the format as described in the Polkadot specification. The metadata for an extrinsic is as follows:

#![allow(unused)]
fn main() {
struct ExtrinsicMetadata {
    version: u8,
    address_ty: TypeRef,
    call_ty: TypeRef,
    signature_ty: TypeRef,
    signed_extensions: Vec<SignedExtensionMetadata>,
}

struct SignedExtensionMetadata {
    identifier: String,
    included_in_extrinsic: TypeRef,
    included_in_signed_data: TypeRef,
}
}

To begin with, TypeRef is a unique identifier for a type as found in the type information. Using this TypeRef, it is possible to look up the type in the type information tree. More details on this process can be found in the section Generating TypeRef.

The actual ExtrinsicMetadata contains the following information:

  • version: The version of the extrinsic format. As of writing this, the latest version is 4.
  • address_ty: The address type used by the chain.
  • call_ty: The call type used by the chain. The call in FRAME based runtimes represents the type of transaction being executed on chain. It references the actual function to execute and the parameters of this function.
  • signature_ty: The signature type used by the chain.
  • signed_extensions: FRAME based runtimes can extend the base extrinsic with extra information. This extra information that is put into an extrinsic is called "signed extensions". These extensions offer the runtime developer the possibility to include data directly in the extrinsic, like the nonce or a tip, amongst others. This means that this data is sent alongside the extrinsic to the runtime. The other possibility these extensions offer is to include extra information only in the signed data that is signed by the sender. This means that this data needs to be known by both sides, the signing side and the verification side. An example of this kind of data is the genesis hash, which ensures that extrinsics are unique per chain. Another example is the metadata hash itself, which will also be included in the signed data. The offline wallets need to know which signed extensions are present on the chain, and this is communicated to them using this field.

The SignedExtensionMetadata provides information about a signed extension:

  • identifier: The identifier of the signed extension. An identifier is required to be unique in the Polkadot ecosystem, as otherwise extrinsics may be built incorrectly.
  • included_in_extrinsic: The type that will be included in the extrinsic by this signed extension.
  • included_in_signed_data: The type that will be included in the signed data by this signed extension.

Type Information

As SCALE, unlike JSON, is not self-descriptive, a decoder always needs to know the format of a type to decode it properly. This is where the type information comes into play. The format of the extrinsic is fixed as described above and ExtrinsicMetadata provides information on which type information is required for which part of the extrinsic. So, offline wallets only need access to the actual type information. It is a requirement that the type information can be chunked into logical pieces to reduce the amount of data that is sent to the offline wallets for decoding the extrinsics. So, the type information is structured in the following way:

#![allow(unused)]
fn main() {
struct Type {
    path: Vec<String>,
    type_def: TypeDef,
    type_id: Compact<u32>,
}

enum TypeDef {
    Composite(Vec<Field>),
    Enumeration(EnumerationVariant),
    Sequence(TypeRef),
    Array(Array),
    Tuple(Vec<TypeRef>),
    BitSequence(BitSequence),
}

struct Field {
    name: Option<String>,
    ty: TypeRef,
    type_name: Option<String>,
}

struct Array {
    len: u32,
    type_param: TypeRef,
}

struct BitSequence {
    num_bytes: u8,
    least_significant_bit_first: bool,
}

struct EnumerationVariant {
    name: String,
    fields: Vec<Field>,
    index: Compact<u32>,
}

enum TypeRef {
    Bool,
    Char,
    Str,
    U8,
    U16,
    U32,
    U64,
    U128,
    U256,
    I8,
    I16,
    I32,
    I64,
    I128,
    I256,
    CompactU8,
    CompactU16,
    CompactU32,
    CompactU64,
    CompactU128,
    CompactU256,
    Void,
    PerId(Compact<u32>),
}
}

The Type declares the structure of a type. The type has the following fields:

  • path: A path declares the position of a type locally to the place where it is defined. The path is not globally unique; this means that there can be multiple types with the same path.
  • type_def: The high-level type definition, e.g. the type is a composition of fields where each field has a type, the type is a composition of different types as tuple etc.
  • type_id: The unique identifier of this type.

Every Type is composed of multiple different types. Each of these "sub types" can reference either a full Type again or reference one of the primitive types. This is where TypeRef becomes relevant as the type referencing information. To reference a Type in the type information, a unique identifier is used. As primitive types can be represented using a single byte, they are not put as separate types into the type information. Instead, the primitive types are directly part of TypeRef to not require the overhead of referencing them in an extra Type. The special primitive type Void represents a type that encodes to nothing and can be decoded from nothing. As FRAME doesn't support Compact as a primitive type, it requires a more involved implementation to convert a FRAME type to a Compact primitive type. SCALE only supports u8, u16, u32, u64 and u128 as Compact, which maps onto the primitive type declarations in this RFC. One special case is a Compact that wraps an empty Tuple, which is expressed as the primitive type Void.

The TypeDef variants have the following meaning:

  • Composite: A struct like type that is composed of multiple different fields. Each Field can have its own type. The order of the fields is significant. A Composite with no fields is expressed as primitive type Void.
  • Enumeration: Stores an EnumerationVariant. An EnumerationVariant is a struct that is described by a name, an index and a vector of Fields, each of which can have its own type. Typically Enumerations have more than just one variant, and in those cases Enumeration will appear multiple times in the type information, each time with a different variant. Enumerations can become quite large, yet usually only one variant is required for decoding a type; therefore, this design brings optimizations and helps reduce the size of the proof. An Enumeration with no variants is expressed as the primitive type Void.
  • Sequence: A vector like type wrapping the given type.
  • BitSequence: A vector storing bits. num_bytes represents the size in bytes of the internal storage. If least_significant_bit_first is true the least significant bit is first, otherwise the most significant bit is first.
  • Array: A fixed-length array of a specific type.
  • Tuple: A composition of multiple types. A Tuple that is composed of no types is expressed as primitive type Void.

Using the type information together with the SCALE specification provides enough information on how to decode types.
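
To make this more tangible, the following is a hypothetical example that builds on the type definitions above; the path, type_id and PerId values are invented purely for illustration and are not taken from any real chain.

#![allow(unused)]
fn main() {
// Hypothetical example: a struct `Transfer { dest: [u8; 32], amount: Compact<u128> }`
// could roughly be described by the following `Type`. The ids are made up; real ids
// are assigned as described in "Generating TypeRef" below.
let transfer_ty = Type {
    path: vec!["pallet_example".into(), "Transfer".into()],
    type_def: TypeDef::Composite(vec![
        Field {
            name: Some("dest".into()),
            // Reference to a `[u8; 32]` array type stored elsewhere in the type tree.
            ty: TypeRef::PerId(Compact(7)),
            type_name: Some("[u8; 32]".into()),
        },
        Field {
            name: Some("amount".into()),
            // Compact primitives are inlined directly into `TypeRef`.
            ty: TypeRef::CompactU128,
            type_name: Some("Compact<u128>".into()),
        },
    ]),
    type_id: Compact(42),
};
}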

Prune unrelated Types

The FRAME metadata contains not only the type information for decoding extrinsics, but also type information about storage types. The scope of this RFC is only about decoding transactions on offline wallets. Thus, a lot of type information can be pruned. To know which type information is required to decode all possible extrinsics, ExtrinsicMetadata has been defined. The extrinsic metadata contains all the types that define the layout of an extrinsic. Therefore, all the types that are accessible from the types declared in the extrinsic metadata can be collected. To collect all accessible types, one needs to recursively iterate over all types starting from the types in ExtrinsicMetadata. Note that some types are accessible, but they don't appear in the final type information and thus can be pruned as well. These are for example the inner types of Compact or the types referenced by BitSequence. The result of collecting these accessible types is a list of all the types that are required to decode each possible extrinsic.

Generating TypeRef

Each TypeRef basically references one of the following types:

  • One of the primitive types. All primitive types can be represented by 1 byte and thus, they are directly part of the TypeRef itself to remove an extra level of indirection.
  • A Type using its unique identifier.

In FRAME metadata a primitive type is represented like any other type. So, the first step is to remove all the primitive-only types from the list of types that was generated in the previous section. The resulting list of types is sorted using the id provided by FRAME metadata. In the last step the TypeRefs are created. Each reference to a primitive type is replaced by one of the corresponding TypeRef primitive type variants and every other reference is replaced by the type's unique identifier. The unique identifier of a type is the index of the type in our sorted list. For Enumerations all variants have the same unique identifier, even though they are represented as multiple entries in the type information. All variants need to have the same unique identifier because the reference doesn't know which variant will appear in the actual encoded data.

#![allow(unused)]
fn main() {
let mut pruned_types = get_pruned_types();

// Remove all primitive-only types; they are represented inline in `TypeRef`.
pruned_types.retain(|ty| !ty.is_primitive_type());

// Sort by the id provided by FRAME metadata; `Enumeration` entries that share
// the same id are ordered by their variant index.
pruned_types.sort_by(|left, right| {
    left.frame_metadata_id()
        .cmp(&right.frame_metadata_id())
        .then_with(|| left.variant_index().cmp(&right.variant_index()))
});

fn generate_type_ref(ty, ty_list) -> TypeRef {
    if ty.is_primitive_type() {
        return TypeRef::primitive_from_ty(ty);
    }

    TypeRef::from_id(
        // Determine the id by using the position of the type in the
        // list of unique frame metadata ids.
        ty_list.position_by_frame_metadata_id(ty.frame_metadata_id())
    )
}

fn replace_all_sub_types_with_type_refs(ty, ty_list) -> Type {
    for sub_ty in ty.sub_types() {
        replace_all_sub_types_with_type_refs(sub_ty, ty_list);
        sub_ty = generate_type_ref(sub_ty, ty_list)
    }

    ty
}

let mut final_ty_list = Vec::new();
for ty in pruned_types {
    final_ty_list.push(replace_all_sub_types_with_type_refs(ty, ty_list))
}
}

Building the Merkle Tree Root

A complete binary merkle tree with blake3 as the hashing function is proposed. For building the merkle tree root, the initial data has to be hashed as a first step. This initial data is referred to as the leaves of the merkle tree. The leaves need to be sorted to make the tree root deterministic. The type information is sorted using its unique identifiers and, for Enumerations, the variants are sorted using their index. After sorting and hashing all leaves, two leaves are combined into one hash. The combination of two hashes is referred to as a node.

#![allow(unused)]
fn main() {
// `leaves` is a double-ended queue (e.g. a `VecDeque`) of the sorted and hashed leaves.
let mut nodes = leaves;
while nodes.len() > 1 {
    // Combine the two nodes at the back into a new node that is pushed to the front.
    let right = nodes.pop_back().unwrap();
    let left = nodes.pop_back().unwrap();
    nodes.push_front(blake3::hash(scale::encode((left, right))));
}

// The root is the last remaining node; an empty data set is represented by the all-zeros hash.
let merkle_tree_root = if nodes.is_empty() { [0u8; 32] } else { nodes.back().unwrap() };
}

The merkle_tree_root in the end is the last node left in the list of nodes. If there are no nodes left in the list, it means that the initial data set was empty. In this case, the all-zeros hash is used to represent the empty tree.

Building a tree with 5 leaves (numbered 0 to 4):

nodes: 0 1 2 3 4

nodes: [3, 4] 0 1 2

nodes: [1, 2] [3, 4] 0

nodes: [[3, 4], 0] [1, 2]

nodes: [[[3, 4], 0], [1, 2]]

The resulting tree visualized:

     [root]
     /    \
    *      *
   / \    / \
  *   0  1   2
 / \
3   4

Building a tree with 6 leaves (numbered 0 to 5):

nodes: 0 1 2 3 4 5

nodes: [4, 5] 0 1 2 3

nodes: [2, 3] [4, 5] 0 1

nodes: [0, 1] [2, 3] [4, 5]

nodes: [[2, 3], [4, 5]] [0, 1]

nodes: [[[2, 3], [4, 5]], [0, 1]]

The resulting tree visualized:

       [root]
      /      \
     *        *
   /   \     / \
  *     *   0   1
 / \   / \
2   3 4   5

Inclusion in an Extrinsic

To ensure that the offline wallet used the correct metadata to show the extrinsic to the user, the metadata hash needs to be included in the extrinsic. The metadata hash is generated by hashing the SCALE encoded MetadataDigest:

#![allow(unused)]
fn main() {
blake3::hash(SCALE::encode(MetadataDigest::V1 { .. }))
}

For the runtime the metadata hash is generated at compile time. Wallets will have to generate the hash using the FRAME metadata.

The signing side should control whether it wants to add the metadata hash or omit it. To accomplish this, one extra byte is added to the extrinsic itself. If this byte is 0 the metadata hash is not required, and if the byte is 1 the metadata hash is added using V1 of the MetadataDigest. This leaves room for future versions of the MetadataDigest format. When the metadata hash should be included, it is only added to the data that is signed. This brings the advantage of not having to include 32 bytes in the extrinsic itself, because the runtime knows the metadata hash as well and can add it to the signed data if required. This is similar to the genesis hash, although the genesis hash isn't added conditionally to the signed data. So, to recap:

  • Included in the extrinsic is a u8, the "mode". The mode is either 0, which means to not include the metadata hash in the signed data, or 1, to include the metadata hash in V1.
  • Included in the signed data is an Option<[u8; 32]>. Depending on the mode the value is either None or Some(metadata_hash).
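
As a minimal sketch (the helper below is illustrative and not part of the RFC), the rule can be expressed as follows; the runtime would use its compile-time metadata hash here, while wallets use the hash they computed from the FRAME metadata:

#![allow(unused)]
fn main() {
type Hash = [u8; 32];

// Map the mode byte to the value appended to the signed data.
fn metadata_hash_for_signed_data(mode: u8, known_hash: Hash) -> Result<Option<Hash>, ()> {
    match mode {
        0 => Ok(None),             // metadata hash not included in the signed data
        1 => Ok(Some(known_hash)), // include the hash of `MetadataDigest::V1`
        _ => Err(()),              // unknown mode: reject the transaction
    }
}
}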

Drawbacks

The chunking may not be the optimal case for every kind of offline wallet.

Testing, Security, and Privacy

All implementations are required to strictly follow the RFC to generate the metadata hash. This includes which hash function to use and how to construct the metadata types tree. So, all implementations follow the same security criteria. As the chains will calculate the metadata hash at compile time, the build process needs to be trusted. However, this is already a solved problem in the Polkadot ecosystem by using reproducible builds. So, anyone can rebuild a chain runtime to ensure that a proposal actually contains the changes as advertised.

Implementations can also be tested easily against each other by taking some metadata and ensuring that they all come to the same metadata hash.

Privacy of users should also not be impacted. This assumes that wallets will generate the metadata hash locally and don't leak any information to third party services about which chunks a user will send to their offline wallet. Besides that, there is no leak of private information as getting the raw metadata from the chain is an operation that is done by almost everyone.

Performance, Ergonomics, and Compatibility

Performance

There should be no measurable impact on performance to Polkadot or any other chain using this feature. The metadata root hash is calculated at compile time and at runtime it is optionally used when checking the signature of a transaction. This means that at runtime no performance heavy operations are done.

Ergonomics & Compatibility

The proposal alters the way a transaction is built, signed, and verified. So, this imposes some required changes to any kind of developer who wants to construct transactions for Polkadot or any chain using this feature. As the developer can pass 0 for disabling the verification of the metadata root hash, it can be easily ignored.

Prior Art and References

RFC 46 produced by the Alzymologist team is a previous work reference that goes in this direction as well.

On other ecosystems, there are other solutions to the problem of trusted signing. Cosmos, for example, has a standardized way of transforming a transaction into a textual representation, and this textual representation is included in the signed data. This basically achieves the same as what this RFC proposes, but it requires that for every transaction applied in a block, every node in the network has to generate this textual representation to ensure the transaction signature is valid.

Unresolved Questions

None.

Future Directions and Related Material

  • Does it work with all kinds of offline wallets?
  • Generic types currently appear multiple times in the metadata, once per instantiation. It may be useful to have each generic type appear only once in the metadata and declare the generic parameters at their instantiation.
  • The metadata doesn't contain any kind of semantic information. This means that the offline wallet, for example, doesn't know what a balance is. The current solution for this problem is to match on the type name, but this isn't a sustainable solution.
  • MetadataDigest only provides one token and decimal. However, a lot of chains support multiple tokens for paying fees etc. This is probably more a question of having semantic information, as mentioned above.

RFC-0084: General transactions in extrinsic format

Start Date: 12 March 2024
Description: Support more extrinsic types by updating the extrinsic format
Authors: George Pisaltu

Summary

This RFC proposes a change to the extrinsic format to incorporate a new transaction type, the "general" transaction.

Motivation

"General" transactions, a new type of transaction that this RFC aims to support, are transactions which obey the runtime's extensions and have according extension data yet do not have hard-coded signatures. They are first described in Extrinsic Horizon and supported in 3685. They enable users to authorize origins in new, more flexible ways (e.g. ZK proofs, mutations over pre-authenticated origins). As of now, all transactions are limited to the account signing model for origin authorization and any additional origin changes happen in extrinsic logic, which cannot leverage the validation process of extensions.

An example of a use case for such an extension would be sponsoring the transaction fee for some other user. A new extension would be put in place to verify that a part of the initial payload was signed by the author under whom the extrinsic should run, and to change the origin accordingly, while the payment for the whole transaction would be handled by a sponsor's account. A POC for this can be found in 3712.

The new "general" transaction type would coexist with both current transaction types for a while and, therefore, the current number of supported transaction types, capped at 2, is insufficient. A new extrinsic type must be introduced alongside the current signed and unsigned types. Currently, an encoded extrinsic's first byte indicates the type of extrinsic using the most significant bit - 0 for unsigned, 1 for signed - and the 7 following bits indicate the extrinsic format version, which has been equal to 4 for a long time.

By taking one bit from the extrinsic format version encoding, we can support 2 additional extrinsic types while also having a minimal impact on our capability to extend and change the extrinsic format in the future.

Stakeholders

  • Runtime users
  • Runtime devs
  • Wallet devs

Explanation

An extrinsic is currently encoded as one byte to identify the extrinsic type and version. This RFC aims to change the interpretation of this byte regarding the reserved bits for the extrinsic type and version. In the following explanation, bits represented using T make up the extrinsic type and bits represented using V make up the extrinsic version.

Currently, the bit allocation within the leading encoded byte is 0bTVVV_VVVV. In practice in the Polkadot ecosystem, the leading byte would be 0bT000_0100 as the version has been equal to 4 for a long time.

This RFC proposes for the bit allocation to change to 0bTTVV_VVVV. As a result, the extrinsic format version will be bumped to 5 and the extrinsic type bit representation would change as follows:

bits  type
00    unsigned
10    signed
01    reserved
11    reserved
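
As an illustration (the function below is a sketch, not part of the proposal itself), splitting the leading byte under the proposed allocation could look like this:

#![allow(unused)]
fn main() {
// Split the leading extrinsic byte under the proposed 0bTTVV_VVVV allocation.
fn split_leading_byte(byte: u8) -> (u8, u8) {
    let extrinsic_type = byte >> 6;           // two most significant bits (T)
    let format_version = byte & 0b0011_1111;  // remaining six bits (V)
    (extrinsic_type, format_version)
}

// Example: 0b1000_0101 is a signed (0b10) extrinsic with format version 5.
}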

Drawbacks

This change would reduce the maximum possible transaction version from the current 127 to 63. In order to bypass the new, lower limit, the extrinsic format would have to change again.

Testing, Security, and Privacy

There is no impact on testing, security or privacy.

Performance, Ergonomics, and Compatibility

This change would allow Polkadot to support new types of transactions, with the specific "general" transaction type in mind at the time of writing this proposal.

Performance

There is no performance impact.

Ergonomics

The impact on developers and end-users is minimal as it would just be a bitmask update on their part for parsing the extrinsic type along with the version.

Compatibility

This change breaks backwards compatibility because any transaction that is neither signed nor unsigned, but a new transaction type, would be interpreted as having a future extrinsic format version.

Prior Art and References

The design was originally proposed in the TransactionExtension PR, which is also the motivation behind this effort.

Unresolved Questions

None.

Future Directions and Related Material

Following this change, the "general" transaction type will be introduced as part of the Extrinsic Horizon effort, which will shape future work.

RFC-0091: DHT Authority discovery record creation time

Start Date: 2024-05-20
Description: Add creation time for DHT authority discovery records
Authors: Alex Gheorghe (alexggh)

Summary

Extend the DHT authority discovery records with a signed creation time, so that nodes can determine which record is newer and always decide to prefer the newer records to the old ones.

Motivation

Currently, we use the Kademlia DHT for storing records regarding the p2p address of an authority discovery key. The problem is that if a node decides to change its PeerId/network key, it will publish a new record; however, because of the distributed and replicated nature of the DHT, there is no way to tell which record is newer, so both the old PeerId and the new PeerId will live in the network until the old one expires (36h). That creates all sorts of problems and leads to the node changing its address not being properly connected for up to 36h.

With this RFC, nodes are extended to keep the newer record and propagate it to nodes that still store the old record, so in the end all nodes converge to the new record much faster (in the order of minutes, not 36h).

Implementation of the RFC: https://github.com/paritytech/polkadot-sdk/pull/3786.

Current issue without this enhancement: https://github.com/paritytech/polkadot-sdk/issues/3673

Stakeholders

Polkadot node developers.

Explanation

This RFC heavily relies on the functionalities of the Kademlia DHT already in use by Polkadot. You can find a link to the specification here.

In a nutshell, on a specific node the current authority-discovery protocol publishes Kademlia DHT records at startup and periodically. The records contain the full address of the node for each authority key it owns. The node also tries to find the full address of all authorities in the network by querying the DHT and picking up the first record it finds for each of the authority ids it found on chain.

The authority discovery DHT records use the protobuf protocol and the current format is specified here. This RFC proposes extending the schema in a backwards compatible manner by adding a new optional creation_time field to AuthorityRecord; nodes can use this information to determine which of the records is newer.

Diff of dht-v3.proto vs dht-v2.proto

@@ -1,10 +1,10 @@
 syntax = "proto3";

-package authority_discovery_v2;
+package authority_discovery_v3;

 // First we need to serialize the addresses in order to be able to sign them.
 message AuthorityRecord {
 	repeated bytes addresses = 1;
+	// Time since UNIX_EPOCH in nanoseconds, scale encoded
+	TimestampInfo creation_time = 2;
 }

 message PeerSignature {
@@ -13,11 +15,17 @@
 	bytes public_key = 2;
 }

+// Information regarding the creation data of the record
+message TimestampInfo {
+       // Time since UNIX_EPOCH in nanoseconds, scale encoded
+       bytes timestamp = 1;
+}
+

Each time a node wants to resolve an authority ID it will issue a query with a certain redundancy factor, and from all the results it receives it will decide to pick only the newest record. Additionally, in order to speed up the time until all nodes have the newest record, nodes can optionally implement logic where they send the new record to nodes that answered with the older record.
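
A sketch of this record selection is shown below; the type and field names are simplified assumptions and not the actual polkadot-sdk types:

#![allow(unused)]
fn main() {
// Simplified view of a record after deserialization and signature verification.
// `creation_time` is `None` for records published by nodes running the old protocol.
struct ResolvedRecord {
    addresses: Vec<Vec<u8>>,
    creation_time: Option<u128>, // nanoseconds since UNIX_EPOCH
}

// Out of all answers received for one authority id, keep the newest record,
// treating records without a timestamp as the oldest possible.
fn pick_newest(results: Vec<ResolvedRecord>) -> Option<ResolvedRecord> {
    results
        .into_iter()
        .max_by_key(|record| record.creation_time.unwrap_or(0))
}
}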

Drawbacks

In theory the new protocol creates a bit more traffic on the DHT network, because it waits for DHT records to be received from more than one node, while in the current implementation we just take the first record that we receive and cancel all in-flight requests to other peers. However, because the redundancy factor will be relatively small and this operation happens rarely, every 10 minutes, this cost is negligible.

Testing, Security, and Privacy

This RFC's implementation https://github.com/paritytech/polkadot-sdk/pull/3786 has been tested on various local test networks and on Versi.

With regard to security, the creation time is wrapped inside SignedAuthorityRecord, so it will be signed with the authority id key; hence, there is no way for malicious nodes to manipulate this field without the receiving node noticing.

Performance, Ergonomics, and Compatibility

Irrelevant.

Performance

Irrelevant.

Ergonomics

Irrelevant.

Compatibility

The changes are backwards compatible with the existing protocol, so nodes running both the old protocol and the newer protocol can coexist in the network. This is achieved by the fact that we use protobuf for serializing and deserializing the records: new fields will be ignored when deserializing with the older protocol and, vice versa, when deserializing an old record with the new protocol the new field will be None and the new code accepts such a record as being valid.

Prior Art and References

The enhancements have been inspired by the algorithm specified here.

Unresolved Questions

N/A

Future Directions and Related Material

N/A

RFC-0097: Unbonding Queue

Date: 19.06.2024
Description: This RFC proposes a safe mechanism to scale the unbonding time from staking on the Relay Chain proportionally to the overall unbonding stake. This approach significantly reduces the expected duration for unbonding, while ensuring that a substantial portion of the stake is always available for slashing of validators behaving maliciously within a 28-day window.
Authors: Jonas Gehrlein & Alistair Stewart

Summary

This RFC proposes a flexible unbonding mechanism for tokens that are locked from staking on the Relay Chain (DOT/KSM), aiming to enhance user convenience without compromising system security.

Locking tokens for staking ensures that Polkadot is able to slash tokens backing misbehaving validators. With changing the locking period, we still need to make sure that Polkadot can slash enough tokens to deter misbehaviour. This means that not all tokens can be unbonded immediately, however we can still allow some tokens to be unbonded quickly.

The new mechanism leads to a significantly reduced unbonding time on average, by queuing up new unbonding requests and scaling their unbonding duration relative to the size of the queue. New requests are executed within a minimum of 2 days, when the queue is comparatively empty, up to the conventional 28 days, if the sum of requests (in terms of stake) exceeds some threshold. In scenarios between these two bounds, the unbonding duration scales proportionately. The new mechanism will never be worse than the current fixed 28 days.

In this document we also present an empirical analysis by retrospectively fitting the proposed mechanism to the historic unbonding timeline and show that the average unbonding duration would drastically reduce, while still being sensitive to large unbonding events. Additionally, we discuss implications for UI, UX, and conviction voting.

Note: Our proposition solely focuses on the locks imposed by staking. Other locks, such as governance, remain unchanged. Also, this mechanism should not be confused with the already existing FastUnstake feature, which lets users immediately unstake tokens that have not received rewards for 28 days or longer.

As an initial step to gauge its effectiveness and stability, it is recommended to implement and test this model on Kusama before considering its integration into Polkadot, with appropriate adjustments to the parameters. In the following, however, we limit our discussion to Polkadot.

Motivation

Polkadot has one of the longest unbonding periods among all Proof-of-Stake protocols, because security is the most important goal. Staking on Polkadot is still attractive compared to other protocols because of its above-average staking APY. However the long unbonding period harms usability and deters potential participants that want to contribute to the security of the network.

The current length of the unbonding period imposes significant costs for any entity that even wants to perform basic tasks such as a reorganization / consolidation of their stashes, or updating their private key infrastructure. It also limits participation of users that have a large preference for liquidity.

The combination of long unbonding periods and high returns has led to the proliferation of liquid staking, where parachains or centralised exchanges offer users their staked tokens before the 28-day unbonding period is over, either in original DOT/KSM form or as derivative tokens. Liquid staking is harmless if few tokens are involved, but it could result in many validators being selected by a few entities if a large fraction of DOTs were involved. This may lead to centralization (see here for more discussion on threats of liquid staking) and an opportunity for attacks.

The new mechanism greatly increases the competitiveness of Polkadot, while maintaining sufficient security.

Stakeholders

  • Every DOT/KSM token holder

Explanation

Before diving into the details of how to implement the unbonding queue, we give readers context about why Polkadot has a 28-day unbonding period in the first place. The reason for it is to prevent long-range attacks (LRAs), which become theoretically possible if more than 1/3 of validators collude. In essence, an LRA describes the inability of users who disconnect from consensus at time t0 and reconnect later to realize that validators which were legitimate at t0, but dropped out in the meantime, are no longer to be trusted. That means, for example, a user syncing the state could be fooled into trusting validators that fell outside the active set of validators after t0 and are building a competing, malicious chain (fork).

LRAs of longer than 28 days are mitigated by the use of trusted checkpoints, which are assumed to be no more than 28 days old. A new node that syncs Polkadot will start at the checkpoint and look for proofs of finality of later blocks, signed by 2/3 of the validators. In an LRA fork, some of the validator sets may be different but only if 2/3 of some validator set in the last 28 days signed something incorrect.

If we detect an LRA of no more than 28 days with the current unbonding period, then we should be able to detect misbehaviour from over 1/3 of validators whose nominators are still bonded. The stake backing these validators is a considerable fraction of the total stake (empirically it is 0.287 or so). If we allowed more than this stake to unbond, without checking who it was backing, then the LRA attack might be free of cost for an attacker. The proposed mechanism allows up to half this stake to unbond within 28 days. This halves the amount of tokens that can be slashed, but it is still very high in absolute terms. For example, at the time of writing (19.06.2024) this would translate to around 120 million DOT.

Attacks other than an LRA, such as backing incorrect parachain blocks, should be detected and slashed within 2 days. This is why the mechanism has a minimum unbonding period.

In practice an LRA does not affect clients who follow consensus more frequently than every 2 days, such as running nodes or bridges. However, any time a node syncs Polkadot, it could be misled if an attacker is able to connect to it first.

In short, in light of the huge benefits obtained, we are fine with only keeping a fraction of the total stake of validators slashable against LRAs at any given time.

Mechanism

When a user (nominator or validator) decides to unbond their tokens, they don't become instantly available. Instead, they enter an unbonding queue. The following specification illustrates how the queue works, given a user wants to unbond some portion of their stake denoted as new_unbonding_stake. We also store a variable, max_unstake that tracks how much stake we allow to unbond potentially earlier than 28 eras (28 days on Polkadot and 7 days on Kusama).

To calculate max_unstake, we record for each era how much stake was used to back the lowest-backed 1/3 of validators. We store this information for the last 28 eras and let min_lowest_third_stake be the minimum of this over the last 28 eras. max_unstake is determined by MIN_SLASHABLE_SHARE x min_lowest_third_stake. In addition, we can use UPPER_BOUND and LOWER_BOUND as variables to scale the unbonding duration of the queue.
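
A sketch of this calculation, with assumed names and integer types (the share of 1/2 is the recommended MIN_SLASHABLE_SHARE from the "Proposed Parameters" section below):

#![allow(unused)]
fn main() {
// Stake backing the lowest-backed third of validators, recorded for each of the
// last 28 eras.
fn max_unstake(lowest_third_stake_last_28_eras: &[u128; 28]) -> u128 {
    let min_lowest_third_stake = *lowest_third_stake_last_28_eras.iter().min().unwrap();
    // MIN_SLASHABLE_SHARE = 1/2
    min_lowest_third_stake / 2
}
}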

At any time we store back_of_unbonding_queue_block_number which expresses the block number when all the existing unbonders have unbonded.

Let's assume a user wants to unbond some of their stake, i.e., new_unbonding_stake, and issues the request at some arbitrary block number denoted as current_block. Then:

unbonding_time_delta = new_unbonding_stake / max_unstake * UPPER_BOUND

This number needs to be added to the back_of_unbonding_queue_block_number under the conditions that it does not undercut current_block + LOWER_BOUND or exceed current_block + UPPER_BOUND.

back_of_unbonding_queue_block_number = max(current_block_number, back_of_unbonding_queue_block_number) + unbonding_time_delta

This determines at which block the user has their tokens unbonded, making sure that it is in the limit of LOWER_BOUND and UPPER_BOUND.

unbonding_block_number = min(UPPER_BOUND, max(back_of_unbonding_queue_block_number - current_block_number, LOWER_BOUND)) + current_block_number

Ultimately, the user's tokens are unbonded at unbonding_block_number.
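
Putting the formulas together, a minimal sketch of the queue update could look as follows; the names mirror the description above, the constants use the recommended values from the "Proposed Parameters" section below, and a runtime implementation would use the proper arithmetic types instead of plain integers:

#![allow(unused)]
fn main() {
const LOWER_BOUND: u64 = 28_800;  // ~2 days in blocks
const UPPER_BOUND: u64 = 403_200; // ~28 days in blocks

// Returns the block at which the new request finishes unbonding and updates the
// shared back-of-queue marker. Assumes `max_unstake > 0`.
fn unbond(
    new_unbonding_stake: u128,
    max_unstake: u128,
    current_block: u64,
    back_of_unbonding_queue_block_number: &mut u64,
) -> u64 {
    // How far this request pushes the back of the queue.
    let unbonding_time_delta =
        (new_unbonding_stake.saturating_mul(UPPER_BOUND as u128) / max_unstake) as u64;

    // Extend the queue from its current back or from the current block, whichever is later.
    *back_of_unbonding_queue_block_number =
        (*back_of_unbonding_queue_block_number).max(current_block) + unbonding_time_delta;

    // The individual unbonding time is clamped between LOWER_BOUND and UPPER_BOUND.
    let wait = (*back_of_unbonding_queue_block_number - current_block)
        .clamp(LOWER_BOUND, UPPER_BOUND);
    current_block + wait
}
}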

Proposed Parameters

There are a few constants to be exogenously set. They are up for discussion, but we make the following recommendation:

  • MIN_SLASHABLE_SHARE: 1/2 - This is the share of stake backing the lowest 1/3 of validators that is slashable at any point in time. It offers a trade-off between security and unbonding time. Half is a sensible choice. Here, we have sufficient stake to slash while allowing for a short average unbonding time.
  • LOWER_BOUND: 28800 blocks (or 2 eras): This value represents a minimum unbonding time of 2 days for any stake.
  • UPPER_BOUND: 403200 blocks (or 28 eras): This value represents the maximum unbonding time a user faces. It equals the current unbonding time and should be familiar to users.

Rebonding

Users that choose to unbond might want to cancel their request and rebond. There is no security loss in doing this, but with the scheme above, a large unbond could increase the unbonding time for everyone else later in the queue. When the large stake is rebonded, however, the participants later in the queue move forward and can unbond more quickly than originally estimated. It would require an additional extrinsic by the user, though.

Thus, we should store the unbonding_time_delta with the unbonding account. If it rebonds when it is still unbonding, then this value should be subtracted from back_of_unbonding_queue_block_number. So unbonding and rebonding leaves this number unaffected. Note that we must store unbonding_time_delta, because in later eras max_unstake might have changed and we cannot recompute it.
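
Correspondingly, rebonding simply undoes the stored delta (again a sketch with assumed names, matching the unbonding sketch above):

#![allow(unused)]
fn main() {
// Rebonding while still in the queue: removing the stored delta leaves the back of
// the queue as if the unbond request had never been made.
fn rebond(stored_unbonding_time_delta: u64, back_of_unbonding_queue_block_number: &mut u64) {
    *back_of_unbonding_queue_block_number =
        back_of_unbonding_queue_block_number.saturating_sub(stored_unbonding_time_delta);
}
}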

Empirical Analysis

We can use the proposed unbonding queue calculation, with the recommended parameters, and simulate the queue over the course of Polkadot's unbonding history. Instead of doing the analysis on a per-block basis, we calculate it on a daily basis. To simulate the unbonding queue, we require the ratio between the daily total stake of the lowest-third backed validators and the daily total stake (which determines max_unstake), as well as the sum of newly unbonded tokens per day. Due to the NPoS algorithm, the first number has only small variations and we used a constant as an approximation (0.287), determined by sampling a number of empirical eras. At this point, we want to thank Parity's Data team for allowing us to leverage their data infrastructure in these analyses.

The following graph plots said statistics.

Empirical Queue

The graph above combines two metrics into a single plot.

  • Unbonded Amount: The number of newly unbonded tokens per day over time, scaled to the y-axis of 28 days. In particular, it is normalized by daily_unbonded / max(daily_unbonded) * 28.
  • Unbonding Days: The daily expected unbonding days given the history of daily_unbonded.

We can observe that historical unbonds only trigger an unbonding time larger than LOWER_BOUND in situations with extensive and/or clustered unbonding amounts. The average unbonding time across the whole timeseries is ~2.67 days. We can, however, see the mechanism taking effect, pushing unbonding times up during large unbonding events. In the largest events, we hit a maximum of 28 days. This gives us reassurance that it is sufficiently sensitive, and it makes sense to match the UPPER_BOUND with the historically largest unbonds.

The main parameter affecting the situation is the max_unstake. The relationship is obvious: decreasing the max_unstake makes the queue more sensitive, i.e., having it spike more quickly and higher with unbonding events. Given that these events historically were mostly associated with parachain auctions, we can assume that, in the absence of major systemic events, users will experience drastically reduced unbonding times. The analysis can be reproduced or changed to other parameters using this repository.

Additional Considerations

Deferred slashing

Currently we defer applying many slashes until around 28 days have passed. This was implemented so we can conveniently cancel slashes via governance in case the slashing was due to a bug. While slashes are rare on Polkadot, such bugs have caused a significant fraction of them. This includes slashing for attacks other than LRAs, for which we've assumed that 2 days is enough to slash. But 2 days is not enough to cancel slashes via OpenGov.

Owing to the way exposures (i.e., which nominators back which validators with how many tokens) are stored, it is currently hard to check on chain whether a nominator has deferred slashes that still need to be applied to them. So we cannot simply check when a nominator attempts to withdraw their bond.

We can solve this by freezing the unbonding queue while there are pending slashes in the staking system. In the worst case, where the slash is applied, we would force all members of the queue to unbond within 28 days minus the days they have already spent in the queue (i.e., nobody ever needs to wait more than 28 days) and pause the unbonding queue until there are no deferred slashes in the system. This solution is potentially easier to implement but could cause disruptions for unbonding stakers that are not slashed, because they do not benefit from the queue. It is crucial to note that unbonding is still always possible for all stakers within the usual 28 days. Since slashes should occur rarely, this should not cause disruptions too often in practice. In addition, we could further complement the solution by adding a new extrinsic that allows any account to point out the unbonding accounts with deferred slashes. Then, the chain would set the unbonding_block_number of the affected accounts to after the time when the slash would be applied, which will be no more than 28 days from the time the staker unbonded. After removing the offenders from the queue, we could unfreeze the unbonding queue and restore operation for unslashed accounts immediately. To find nominators with deferred slashes it is required, however, to iterate through all nominators, which is only feasible to do off chain. There should be plenty of incentive to do so for the non-slashed unbonding accounts that seek to reduce the opportunity costs of being forced to wait potentially much longer than necessary.

This solution resolves the situation securely and, in the worst case where no user submits the extrinsic, no staker would exceed an unbonding duration of the usual 28 days while all slashes are applied as intended.

UX/UI

As per the nature of the unbonding queue, the more a user slices up the stake they want to unbond, the sooner the first portions become available. This, however, comes at the cost of creating more transactions, i.e., incurring higher transaction costs. We leave it to UI implementations to provide a good UX to inform users about this trade-off and help them find their individual willingness to pay to unbond even faster. For most users, splitting up their stake will not lead to any meaningful advantage because their effect on the queue is negligible.

Conviction voting

Changing the (expected) unbonding period has an indirect impact on conviction voting, because the governance locks do not stack with the staking locks. In other words, if a user's tokens are already locked in staking, they can, for free, choose a conviction vote with a lock that is lower than or equal to that locking time. Currently, with a fixed unbonding period of 28 days, that means the 3x conviction vote comes essentially for free. There have been discussions about rescaling the conviction weights to an improved parametrization, but the transition between the old locks and new locks poses significant challenges.

We argue that, under our unbonding queue, the current conviction voting scheme aligns better with its impact on governance, avoiding an expensive migration of existing locks to a new scheme. For example, if the average unbonding period from staking is around 2 days, locking tokens for an additional 26 days justifies a higher weight (in that regard, 3x). Voters that seek maximum liquidity are free to do so, but it is fair for them to be weighted less in governance decisions that naturally affect the long-term success of Polkadot.

Potential Extension

In addition to a simple queue, we could add a market component that lets users always unbond from staking at the minimum possible waiting time (== LOWER_BOUND, e.g., 2 days) by paying a variable fee. To achieve this, it is reasonable to split the total unbonding capacity into two chunks, with the first capacity for the simple queue and the remaining capacity for the fee-based unbonding. By doing so, we allow users to choose whether they want the quickest unbond by paying a dynamic fee or to join the simple queue. Setting a capacity restriction for both queues enables us to guarantee a predictable unbonding time in the simple queue, while allowing users with the respective willingness to pay to get out even earlier. The fees are dynamically adjusted and are proportional to the unbonding stake (and thereby expressed as a percentage of the requested unbonding stake). In contrast to a unified queue, this prevents the issue of users paying a fee jumping in front of other users not paying a fee, pushing their unbonding time back (which would be bad for UX). The revenue generated could be burned.

This extension and further specifications are left out of this RFC, because it adds further complexity and the empirical analysis above suggests that average unbonding times will already be close to the LOWER_BOUND, making a more complex design unnecessary. We advise to first implement the discussed mechanism and assess after some experience whether an extension is desirable.

Drawbacks

  • Lower security for LRAs: Without a doubt, the theoretical security against LRAs decreases. But, as we argue, the attack is still costly enough to deter attacks and the attack is sufficiently theoretical. Here, the benefits outweigh the costs.
  • Griefing attacks: A large holder could pretend to unbond a large amount of their tokens to prevent other users from exiting the network earlier. This would, however, be costly due to the fact that the holder loses out on staking rewards. The larger the impact on the queue, the higher the costs. In any case it must be noted that the UPPER_BOUND is still 28 days, which means that nominators are never left with a longer unbonding period than currently. There is not enough gain for the attacker to endure this cost.
  • Challenge for Custodians and Liquid Staking Providers: Changing the unbonding time, especially making it flexible, requires entities that offer staking derivatives to rethink and rework their products.

Testing, Security, and Privacy

NA

Performance, Ergonomics, and Compatibility

NA

Performance

The authors cannot see any potential impact on performance.

Ergonomics

The authors cannot see any potential impact on ergonomics for developers. We discussed potential impact on UX/UI for users above.

Compatibility

The authors cannot see any potential impact on compatibility. This should be assessed by the technical fellows.

Prior Art and References

RFC-0099: Introduce a transaction extension version

Start Date: 03 July 2024
Description: Introduce a versioning for transaction extensions.
Authors: Bastian Köcher

Summary

This RFC proposes a change to the extrinsic format to include a transaction extension version.

Motivation

The extrinsic format supports being extended with transaction extensions. These transaction extensions are runtime specific and can be different per chain. Each transaction extension can add data to the extrinsic itself or extend the signed payload. This means that adding a transaction extension breaks the chain-specific extrinsic format. A recent example was the introduction of the CheckMetadataHash extension to Polkadot and all its system chains. As the extension was adding one byte to the extrinsic, it broke a lot of tooling. By introducing an extra version for the transaction extensions it will be possible to introduce changes to these transaction extensions while still being backwards compatible. Based on the version of the transaction extensions, each chain runtime could decode the extrinsic correctly and also create the correct signed payload.

Stakeholders

  • Runtime users
  • Runtime devs
  • Wallet devs

Explanation

RFC84 introduced the extrinsic format 5. The idea is to piggyback onto this change of the extrinsic format to add the extra version for the transaction extensions. If required, this could also come as extrinsic format 6, but 5 is not yet deployed anywhere.

The extrinsic format supports the following types of transactions:

  • Bare: Does not add anything to the extrinsic.
  • Signed: (Address, Signature, Extensions)
  • General: Extensions

The Signed and General transaction would change to:

  • Signed: (Address, Signature, Version, Extensions)
  • General: (Version, Extensions)

The Version is a SCALE encoded u8 representing the version of the transaction extensions.

In the chain runtime the version can be used to determine which set of transaction extensions should be used to decode and to validate the transaction.
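
A sketch of the resulting layouts (the enum below is purely illustrative and not the actual polkadot-sdk type):

#![allow(unused)]
fn main() {
// Illustrative shape of an extrinsic format 5 transaction with the extra
// transaction extension version byte.
enum Transaction<Address, Signature, Extensions> {
    /// Adds nothing to the extrinsic.
    Bare,
    /// (Address, Signature, extension version, Extensions)
    Signed(Address, Signature, u8, Extensions),
    /// (extension version, Extensions)
    General(u8, Extensions),
}
}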

Drawbacks

This adds one byte more to each signed transaction.

Testing, Security, and Privacy

There is no impact on testing, security or privacy.

Performance, Ergonomics, and Compatibility

This will ensure that changes to the transactions extensions can be done in a backwards compatible way.

Performance

There is no performance impact.

Ergonomics

Runtime developers need to take care of the versioning and ensure it is bumped as required, so that there are no compatibility-breaking changes without a bump of the version. It will also add a little bit more code in the runtime to decode these old versions, but this should be negligible.

Compatibility

When introduced together with extrinsic format version 5 from RFC84, it can be implemented in a backwards compatible way. So, transactions can still be sent using the old extrinsic format and decoded by the runtime.

Prior Art and References

None.

Unresolved Questions

None.

Future Directions and Related Material

None.

RFC-0100: New XCM instruction: InitiateAssetsTransfer

Start Date: 11 July 2024
Description: Add new XCM instruction: InitiateAssetsTransfer for mixing asset transfer types in same XCM
Authors: Adrian Catangiu

Summary

This RFC proposes a new instruction that provides a way to initiate, on remote chains, asset transfers that mix multiple transfer types (teleport, local-reserve, destination-reserve) for the involved assets, using XCM alone.

The currently existing instructions are too opinionated and force each XCM asset transfer to a single transfer type (teleport, local-reserve, destination-reserve). This results in the inability to combine different types of transfers in a single transfer, which results in overall poor UX when trying to move assets across chains.

Motivation

XCM is the de-facto cross-chain messaging protocol within the Polkadot ecosystem, and cross-chain asset transfers are one of its main use cases. Unfortunately, the current spec does not support initiating, on a remote chain, transfers that combine assets with different transfer types.
For example, with the current XCM specification, ParachainA cannot instruct AssetHub to teleport ForeignAssetX to ParachainX alongside USDT (which has to be reserve-transferred).

There currently exist DepositReserveAsset, InitiateReserveWithdraw and InitiateTeleport instructions that initiate asset transfers on execution, but they are opinionated in the type of transfer to use. Combining them is also not possible, because as a result of their individual execution, a message containing a ClearOrigin instruction is sent to the destination chain, making subsequent transfers impossible after the first instruction is executed.

The new instruction proposed by this RFC allows an XCM program to describe multiple asset transfer types, then execute them in one shot with a single remote_xcm program sent to the target chain to effect the transfer and subsequently clear origin.

Multi-hop asset transfers will benefit from this change by allowing single XCM program to handle multiple types of transfers and reduce complexity.

Bridge asset transfers greatly benefit from this change, as it allows building XCM programs that transfer multiple assets across multiple hops in a single pseudo-atomic action.
For example, a single XCM program execution can transfer multiple assets from ParaK on Kusama, through Kusama Asset Hub, over the bridge through Polkadot Asset Hub, with final destination ParaP on Polkadot.

With current XCM, we are limited to doing multiple independent transfers, one for each individual hop, in order to move both the "interesting" assets and the "supporting" assets (used to pay fees).

Stakeholders

  • Runtime users
  • Runtime devs
  • Wallet devs
  • dApps devs

Explanation

A new instruction InitiateAssetsTransfer is introduced that initiates an assets transfer from the chain it is executed on, to another chain. The executed transfer is point-to-point (chain-to-chain) with all of the transfer properties specified in the instruction parameters. The instruction also allows specifying another XCM program to be executed on the remote chain. If a transfer requires going through multiple hops, an XCM program can compose this instruction to be used at every chain along the path, on each hop describing that specific leg of the transfer.

Note: Transferring assets that require different paths (chains along the way) is not supported within same XCM because of the async nature of cross chain messages. This new instruction, however, enables initiating transfers for multiple assets that take the same path even if they require different transfer types along that path.

The usage and composition model of InitiateAssetsTransfer is the same as with existing DepositReserveAsset, InitiateReserveWithdraw and InitiateTeleport instructions. The main difference comes from the ability to handle assets that have different point-to-point transfer type between A and B. The other benefit is that it also allows specifying remote fee payment and transparently appends the required remote fees logic to the remote XCM.

We can specify the desired transfer type for some asset(s) using:

#![allow(unused)]
fn main() {
/// Specify which type of asset transfer is required for a particular `(asset, dest)` combination.
pub enum AssetTransferFilter {
	/// teleport assets matching `AssetFilter` to `dest`
	Teleport(AssetFilter),
	/// reserve-transfer assets matching `AssetFilter` to `dest`, using the local chain as reserve
	ReserveDeposit(AssetFilter),
	/// reserve-transfer assets matching `AssetFilter` to `dest`, using `dest` as reserve
	ReserveWithdraw(AssetFilter),
}
}

This RFC proposes 1 new XCM instruction:

#![allow(unused)]
fn main() {
/// Cross-chain transfer matching `assets` in the holding register as follows:
///
/// Assets in the holding register are matched using the given list of `AssetTransferFilter`s,
/// they are then transferred based on their specified transfer type:
///
/// - teleport: burn local assets and append a `ReceiveTeleportedAsset` XCM instruction to
///   the XCM program to be sent onward to the `dest` location,
///
/// - reserve deposit: place assets under the ownership of `dest` within this consensus system
///   (i.e. its sovereign account), and append a `ReserveAssetDeposited` XCM instruction
///   to the XCM program to be sent onward to the `dest` location,
///
/// - reserve withdraw: burn local assets and append a `WithdrawAsset` XCM instruction
///   to the XCM program to be sent onward to the `dest` location,
///
/// The onward XCM is then appended a `ClearOrigin` to allow safe execution of any following
/// custom XCM instructions provided in `remote_xcm`.
///
/// The onward XCM also potentially contains a `BuyExecution` instruction based on the presence
/// of the `remote_fees` parameter (see below).
///
/// If a transfer requires going through multiple hops, an XCM program can compose this instruction
/// to be used at every chain along the path, describing that specific leg of the transfer.
///
/// Parameters:
/// - `destination`: The location of the transfer's next hop.
/// - `remote_fees`: If set to `Some(asset_xfer_filter)`, the single asset matching
///   `asset_xfer_filter` in the holding register will be transferred first in the remote XCM
///   program, followed by a `BuyExecution(fee)`, then rest of transfers follow.
///   This guarantees `remote_xcm` will successfully pass an `AllowTopLevelPaidExecutionFrom` barrier.
/// - `remote_xcm`: Custom instructions that will be executed on the `dest` chain. Note that
///   these instructions will be executed after a `ClearOrigin` so their origin will be `None`.
///
/// Safety: No concerns.
///
/// Kind: *Command*.
///
InitiateAssetsTransfer {
	destination: Location,
	assets: Vec<AssetTransferFilter>,
	remote_fees: Option<AssetTransferFilter>,
	remote_xcm: Xcm<()>,
}
}

An InitiateAssetsTransfer { .. } instruction shall transfer to dest all assets in the holding register that match the provided assets and remote_fees filters. These filters identify the assets to be transferred as well as the transfer type to be used for transferring them. It shall handle the local side of the transfer, then forward an onward XCM to dest for handling the remote side of the transfer.

It should do so using the same mechanisms as the existing DepositReserveAsset, InitiateReserveWithdraw and InitiateTeleport instructions, but combining all XCM instructions that need to be remotely executed into a single remote XCM program sent over to dest.

Furthermore, through remote_fees: Option<AssetTransferFilter>, it shall allow specifying a single asset to be used for fees on the dest chain. This single asset shall be remotely handled/received by the first instruction in the onward XCM and shall be followed by a BuyExecution instruction using it. If remote_fees is set to None, the first instruction in the onward XCM shall be an UnpaidExecution instruction. The rest of the assets shall be handled by subsequent instructions, thus finally also complying with the security recommendation that execution be bought using a single asset.

The BuyExecution appended to the onward XCM specifies WeightLimit::Unlimited, thus being limited only by the remote_fees asset "amount". This is a deliberate decision for enhancing UX - in practice, people/dApps care about limiting the amount of fee asset used and not the actually used weight.

In the onward XCM, following the asset transfer instructions, a ClearOrigin or DescendOrigin instruction shall be appended to stop acting on behalf of the source chain; the caller-provided remote_xcm shall then be appended, allowing the caller to control what to do with the transferred assets.
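For illustration, under the semantics described above, the onward XCM produced for dest might be shaped roughly as follows (a sketch only; asset and beneficiary names are placeholders, assuming remote_fees uses a reserve-deposit filter and the remaining assets mix teleports and reserve-withdrawals):

Xcm(vec![
    // 1. the single fee asset is received first and used to buy execution
    ReserveAssetDeposited(fee_asset.clone().into()),
    BuyExecution { fees: fee_asset, weight_limit: Unlimited },
    // 2. the rest of the transferred assets, grouped by their transfer type
    ReceiveTeleportedAsset(teleported_assets),
    WithdrawAsset(reserve_withdrawn_assets),
    // 3. stop acting on behalf of the source chain
    ClearOrigin,
    // 4. caller-provided `remote_xcm` is appended here and runs with `None` origin
    DepositAsset { assets: Wild(All), beneficiary },
])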

Example usage: transferring 2 different asset types across 3 chains

  • Transferring ROCs as the native asset of RococoAssetHub and PENs as the native asset of Penpal,
  • Transfer origin is Penpal (on Rococo) and the destination is WestendAssetHub (across the bridge),
  • ROCs are native to RococoAssetHub and are registered as trust-backed assets on Penpal and WestendAssetHub,
  • PENs are native to Penpal and are registered as teleportable assets on RococoAssetHub and as foreign assets on WestendAssetHub,
  • Fees on RococoAssetHub and WestendAssetHub are paid using ROCs.

We can transfer them from Penpal (Rococo), through RococoAssetHub, over the bridge to WestendAssetHub by executing a single XCM message, even though we'll be mixing multiple types of transfers along the path:

  1. 1st leg of the transfer: Penpal -> Rococo Asset Hub:
    • teleport PENs
    • reserve withdraw ROCs
  2. 2nd leg of the transfer: Rococo Asset Hub -> Westend Asset Hub:
    • reserve deposit both PENs and ROCs
#![allow(unused)]
fn main() {
Penpal::execute_with(|| {
    let destination = Location::new(2, (GlobalConsensus(Westend), Parachain(1000)).into());
    let rocs_id: AssetId = Parent.into();
    let rocs: Asset = (rocs_id.clone(), rocs_amount).into();
    let pens: Asset = (pens_id, pens_amount).into();
    let assets: Assets = vec![rocs.clone(), pens.clone()].into();

    // XCM to be executed at dest (Westend Asset Hub)
    let xcm_on_dest =
        Xcm(vec![DepositAsset { assets: Wild(All), beneficiary: beneficiary.clone() }]);

    // XCM to be executed at Rococo Asset Hub
    let context = PenpalUniversalLocation::get();
    let reanchored_assets = assets.clone().reanchored(&local_asset_hub, &context).unwrap();
    let reanchored_dest = destination.clone().reanchored(&local_asset_hub, &context).unwrap();
    let reanchored_rocs_id = rocs_id.clone().reanchored(&local_asset_hub, &context).unwrap();

    // from AHR, both ROCs and PENs are local-reserve transferred to Westend Asset Hub
    let assets_filter = vec![
        AssetTransferFilter::ReserveDeposit(reanchored_assets.clone().into())
    ];
    // we want to pay with ROCs on WAH
    let remote_fees = Some(AssetTransferFilter::ReserveDeposit(
        AssetFilter::Wild(AllOf { id: reanchored_rocs_id.into(), fun: WildFungibility::Fungible }))
    );
    let xcm_on_ahr = Xcm(vec![
        InitiateAssetsTransfer {
            destination: reanchored_dest,
            assets: assets_filter,
            remote_fees,
            remote_xcm: xcm_on_dest,
        },
    ]);

    // pay remote fees with ROCs
    let remote_fees = Some(
        AssetTransferFilter::ReserveWithdraw(
            AssetFilter::Wild(AllOf { id: rocs_id.into(), fun: WildFungibility::Fungible })
        )
    );
    // XCM to be executed locally
    let xcm = Xcm::<penpal_runtime::RuntimeCall>(vec![
        // Withdraw both ROCs and PENs from origin account
        WithdrawAsset(assets.clone().into()),
        // Execute the transfers while paying remote fees with ROCs
        InitiateAssetsTransfer {
            destination: local_asset_hub,
            assets: vec![
                // ROCs are reserve-withdrawn on AHR
                AssetTransferFilter::ReserveWithdraw(rocs.into()),
                // PENs are teleported to AHR
                AssetTransferFilter::Teleport(pens.into()),
            ],
            remote_fees,
            remote_xcm: xcm_on_ahr,
        },
    ]);

    <Penpal as PenpalPallet>::PolkadotXcm::execute(
        signed_origin,
        bx!(xcm::VersionedXcm::V4(xcm.into())),
        Weight::MAX,
    ).unwrap();
})
}

Drawbacks

No drawbacks identified.

Testing, Security, and Privacy

There should be no security risks related to the new instruction from the XCVM perspective. It follows the same pattern as with single-type asset transfers, only now it allows combining multiple types at once.

It improves security by enabling enforcement of a single asset for buying execution, which minimizes the potential free/unpaid work that a receiving chain has to do. It does so by making the required execution fee payment part of the instruction logic through the remote_fees: Option<AssetTransferFilter> parameter, which ensures the remote XCM starts with a single instruction loading the fee asset into holding, immediately followed by a BuyExecution using said asset.

Performance, Ergonomics, and Compatibility

This has no impact on the rest of the XCM spec. It is a new, independent instruction; no changes to existing instructions are required.

It enhances the functionality exposed by Polkadot: multi-chain transfers that are currently forced to happen in multiple programs per asset per "hop" become possible in a single XCM program.

Performance

No performance changes/implications.

Ergonomics

The proposal enhances developers' and users' cross-chain asset transfer capabilities. This enhancement is optimized for XCM programs transferring multiple assets, needing to run their logic across multiple chains.

Compatibility


This enhancement is compatible with all existing XCM programs and versions.

New (XCMv5) programs using this instruction shall be downgraded to older XCM versions on a best-effort basis, but success cannot be guaranteed. A program where the new instruction is used to initiate multiple types of asset transfers cannot be downgraded to older XCM versions, because there is no equivalent capability there. Such conversion attempts will explicitly fail.

Prior Art and References

None.

Unresolved Questions

None.


(source)

Table of Contents

RFC-0101: XCM Transact remove require_weight_at_most parameter

Start Date12 July 2024
DescriptionRemove require_weight_at_most parameter from XCM Transact
AuthorsAdrian Catangiu

Summary

The Transact XCM instruction currently forces the user to set a specific maximum weight allowed to the inner call and then also pay for that much weight regardless of how much the call actually needs in practice.

This RFC proposes improving the usability of Transact by removing that parameter and instead getting and charging the actual weight of the inner call from its dispatch info on the remote chain.

Motivation

The UX of using Transact is poor because of having to guess/estimate the require_weight_at_most weight used by the inner call on the target.

We've seen multiple Transact on-chain failures caused by guessing wrong values for this require_weight_at_most even though the rest of the XCM program would have worked.

In practice, this parameter only adds UX overhead with no real practical value. Use cases fall in one of two categories:

  1. Unpaid execution of Transacts - in these cases the require_weight_at_most is not really useful: the caller doesn't have to pay for it, and at the call site the inner call either fits the block or it doesn't;
  2. Paid execution of single Transact - the weight to be spent by the Transact is already covered by the BuyExecution weight limit parameter.

We've had multiple OpenGov root/whitelisted_caller proposals initiated by core-devs completely or partially fail because of incorrect configuration of require_weight_at_most parameter. This is a strong indication that the instruction is hard to use.

Stakeholders

  • Runtime Users,
  • Runtime Devs,
  • Wallets,
  • dApps,

Explanation

The proposed enhancement is simple: remove require_weight_at_most parameter from the instruction:

- Transact { origin_kind: OriginKind, require_weight_at_most: Weight, call: DoubleEncoded<Call> },
+ Transact { origin_kind: OriginKind, call: DoubleEncoded<Call> },

The XCVM implementation shall no longer use require_weight_at_most for weighing. Instead, it shall weigh the Transact instruction by decoding and weighing the inner call.
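For illustration, a sketch of how an XCVM implementation might weigh the new Transact, assuming a FRAME-style GetDispatchInfo trait on the decoded call (names are illustrative, not the exact polkadot-sdk API):

fn weigh_transact<Call: Decode + GetDispatchInfo>(
    mut encoded_call: &[u8],
) -> Result<Weight, XcmError> {
    // Decode the inner call and take its benchmarked weight from dispatch info,
    // instead of trusting a caller-supplied `require_weight_at_most`.
    let call = Call::decode(&mut encoded_call).map_err(|_| XcmError::FailedToDecode)?;
    Ok(call.get_dispatch_info().weight)
}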

Drawbacks

No drawbacks, existing scenarios work as before, while this also allows new/easier flows.

Testing, Security, and Privacy

Currently, an XCVM implementation can weigh a message just by looking at the decoded instructions without decoding the Transact's call, but assuming require_weight_at_most weight for it. With the new version it has to decode the inner call to know its actual weight.

But this does not actually change the security considerations, as can be seen below.

With the new Transact the weighing happens after decoding the inner call. The entirety of the XCM program containing this Transact needs to be either covered by enough bought weight using a BuyExecution, or the origin has to be allowed to do free execution.

The security considerations around how much can someone execute for free are the same for both this new version and the old. In both cases, an "attacker" can do the XCM decoding (including Transact inner calls) for free by adding a large enough BuyExecution without actually having the funds available.

In both cases, decoding is done for free, but in both cases execution fails early on BuyExecution.

Performance, Ergonomics, and Compatibility

Performance

No performance change.

Ergonomics

Ergonomics are slightly improved by simplifying Transact API.

Compatibility

Compatible with previous XCM programs.

Prior Art and References

None.

Unresolved Questions

None.


(source)

Table of Contents

RFC-0103: Introduce a CoreIndex commitment and a SessionIndex field in candidate receipts

Start Date15 July 2024
DescriptionConstrain parachain block validity to a specific core and session
AuthorsAndrei Sandu

Summary

Elastic scaling is not resilient against griefing attacks without a way for a PoV (Proof of Validity) to commit to the particular core index it was intended for. This RFC proposes a way to include core index information in the candidate commitments and the CandidateDescriptor data structure in a backward compatible way. Additionally, it proposes the addition of a SessionIndex field in the CandidateDescriptor to make dispute resolution more secure and robust.

Motivation

This RFC proposes a way to solve two different problems:

  1. For Elastic Scaling, it prevents anyone who has acquired a valid collation from DoSing the parachain by providing the same collation to all backing groups assigned to the parachain. This can happen before the next valid parachain block is authored and will prevent the chain of candidates from being formed, reducing the throughput of the parachain to a single core.
  2. The dispute protocol relies on validators trusting the session index provided by other validators when initiating and participating in disputes. It is used to look up validator keys and check dispute vote signatures. By adding a SessionIndex to the CandidateDescriptor, validators no longer have to trust the SessionIndex provided by the validator raising a dispute. The dispute may concern a relay chain block not yet imported by a validator. In this case, validators can safely assume the session index refers to the session the candidate appeared in; otherwise, the chain would have rejected the candidate.

Stakeholders

  • Polkadot core developers.
  • Cumulus node developers.
  • Tooling, block explorer developers.

This approach and alternatives have been considered and discussed in this issue.

Explanation

The approach proposed below was chosen primarily because it minimizes the number of breaking changes and the complexity, and requires less implementation and testing time. The proposal is to change the existing primitives while keeping binary compatibility with the older versions. We repurpose unused fields to introduce core index and session index information in the CandidateDescriptor and extend the UMP to transport non-XCM messages.

Reclaiming unused space in the descriptor

The CandidateDescriptor includes collator and signature fields. The collator signs the following descriptor fields: parachain id, relay parent, validation data hash, validation code hash, and the PoV hash.

However, in practice, having a collator signature in the receipt on the relay chain does not provide any benefits as there is no mechanism to punish or reward collators that have provided bad parachain blocks.

This proposal removes the collator signature and all the logic that checks the collator signatures of candidate receipts. We use the first 7 reclaimed bytes to represent the version (1 byte), the core index (2 bytes) and the session index (4 bytes), and fill the rest with zeroes. There is no change in the layout and length of the receipt, so the new primitive is binary-compatible with the old one.

UMP transport

CandidateCommitments remains unchanged, as we will store SCALE-encoded UMPSignal messages directly in the parachain UMP queue by outputting them in upward_messages.

The UMP queue layout is changed to allow the relay chain to receive both the XCM messages and UMPSignal messages. An empty message (empty Vec<u8>) is used to mark the end of XCM messages and the start of UMPSignal messages. The UMPSignal is optional and can be omitted by parachains not using elastic scaling.

This way of representing the new messages has been chosen over introducing an enum wrapper to minimize breaking changes of XCM message decoding in tools like Subscan for example.

Example:

#![allow(unused)]
fn main() {
[ XCM message1, XCM message2, ..., EMPTY message, UMPSignal::SelectCore ]
}
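A minimal sketch (names assumed, not the actual runtime API) of how the relay chain could split such a queue at the empty-message separator:

fn split_ump_messages(upward_messages: &[Vec<u8>]) -> (&[Vec<u8>], &[Vec<u8>]) {
    match upward_messages.iter().position(|m| m.is_empty()) {
        // Everything before the empty message is XCM; everything after is `UMPSignal`.
        Some(separator) => (&upward_messages[..separator], &upward_messages[separator + 1..]),
        // No separator present: the whole queue is plain XCM messages.
        None => (upward_messages, &[]),
    }
}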

UMPSignal messages

#![allow(unused)]
fn main() {
/// The selector that determines the core index.
pub struct CoreSelector(pub u8);

/// The offset in the relay chain claim queue.
///
/// The state of the claim queue is given by the relay chain block
/// that is used as context for the `PoV`. 
pub struct ClaimQueueOffset(pub u8);

/// Signals sent by a parachain to the relay chain.
pub enum UMPSignal {
    /// A message sent by a parachain to select the core the candidate is committed to.
    /// Relay chain validators, in particular backers, use the `CoreSelector` and `ClaimQueueOffset`
    /// to compute the index of the core the candidate has committed to.
    SelectCore(CoreSelector, ClaimQueueOffset),
}
}

The CoreSelector together with the ClaimQueueOffset are used to index the claim queue. This way the validators can compute the CoreIndex and ensure that the collator put the correct CoreIndex into the CandidateDescriptor.

Example:

cq_offset = 1 and core_selector = 3

The table below represents a snapshot of the claim queue:

         offset = 0   offset = 1   offset = 2
Core 1   Para A       Para A       Para A
Core 2   Para A       Para B       Para A
Core 3   Para B       Para A       Para A

The purpose of ClaimQueueOffset is to select a column from the above table. For cq_offset = 1 we get [Para A, Para B, Para A], which we use to build a sorted vec of the cores Para A is assigned to, [Core 1, Core 3], calling it para_assigned_cores. We then use core_selector to determine that the committed core index is Core 3, like this:

#![allow(unused)]
fn main() {
let committed_core_index = para_assigned_cores[core_selector % para_assigned_cores.len()];
}

Polkadot Primitive changes

New CandidateDescriptor

  • reclaim the 32 bytes of collator: CollatorId and the 64 bytes of signature: CollatorSignature and rename them to the reserved1 and reserved2 fields.
  • take 1 byte from reserved1 for a new version: u8 field.
  • take 2 bytes from reserved1 for a new core_index: u16 field.
  • take 4 bytes from reserved1 for a new session_index: u32 field.
  • the remaining reserved1 and reserved2 bytes are zeroed

The new primitive will look like this:

#![allow(unused)]
fn main() {
pub struct CandidateDescriptorV2<H = Hash> {
    /// The ID of the para this is a candidate for.
    para_id: ParaId,
    /// The hash of the relay-chain block this is executed in the context of.
    relay_parent: H,
    /// Version field. The raw value here is not exposed, instead, it is used
    /// to determine the `CandidateDescriptorVersion`
    version: InternalVersion,
    /// The core index where the candidate is backed.
    core_index: u16,
    /// The session in which the candidate is backed.
    session_index: SessionIndex,
    /// Reserved bytes.
    reserved1: [u8; 25],
    /// The blake2-256 hash of the persisted validation data. This is extra data derived from
    /// relay-chain state which may vary based on bitfields included before the candidate.
    /// Thus it cannot be derived entirely from the relay parent.
    persisted_validation_data_hash: Hash,
    /// The blake2-256 hash of the PoV.
    pov_hash: Hash,
    /// The root of a block's erasure encoding Merkle tree.
    erasure_root: Hash,
    /// Reserved bytes.
    reserved2: [u8; 64],
    /// Hash of the para header that is being generated by this candidate.
    para_head: Hash,
    /// The blake2-256 hash of the validation code bytes.
    validation_code_hash: ValidationCodeHash,
}
}

In future format versions, parts of the reserved1 and reserved2 bytes can be used to include additional information in the descriptor.

Backwards compatibility

Two flavors of candidate receipts are used in network protocols, runtime and node implementation:

  • CommittedCandidateReceipt which includes the CandidateDescriptor and the CandidateCommitments
  • CandidateReceipt which includes the CandidateDescriptor and just a hash of the commitments

We want to support both the old and new versions in the runtime and node, so the implementation must be able to detect the version of a given candidate receipt.

The version of the descriptor is detected by checking the reserved fields. If they are not zeroed, it is a version 1 descriptor. Otherwise, the version field is used to determine the version; it should be 0 for version 2 descriptors. If it is not, the descriptor has an unknown version and should be considered invalid.
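A hedged sketch of that detection rule (field layout taken from the struct above; the function and enum names are illustrative):

enum CandidateDescriptorVersion { V1, V2, Unknown }

fn descriptor_version(version: u8, reserved1: &[u8; 25], reserved2: &[u8; 64]) -> CandidateDescriptorVersion {
    // Non-zero reserved bytes mean these positions still hold a collator id and
    // signature, so this is a legacy (version 1) descriptor.
    if reserved1.iter().chain(reserved2.iter()).any(|b| *b != 0) {
        return CandidateDescriptorVersion::V1;
    }
    // Reserved bytes are zeroed: the explicit version field decides; `0` means version 2.
    match version {
        0 => CandidateDescriptorVersion::V2,
        _ => CandidateDescriptorVersion::Unknown,
    }
}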

Parachain block validation

If the candidate descriptor is version 1, there are no changes.

Backers must check the validity of core_index and session_index fields. A candidate must not be backed if any of the following are true:

  • the core_index in the descriptor does not match the core the backer is assigned to
  • the session_index is not equal to the session index the candidate is backed in
  • the core_index in the descriptor does not match the one determined by the UMPSignal::SelectCore message

On-chain backing

If the candidate descriptor is version 1, there are no changes.

For version 2 descriptors the runtime will determine the core_index using the same inputs as backers did off-chain. It currently stores the claim queue at the newest allowed relay parent corresponding to the claim queue offset 0. The runtime needs to be changed to store a claim queue snapshot at all allowed relay parents.

Drawbacks

The only drawback is that further additions to the descriptor are limited to the amount of remaining unused space.

Testing, Security, and Privacy

Standard testing (unit tests, CI zombienet tests) for functionality and mandatory security audit to ensure the implementation does not introduce any new security issues.

Backward compatibility of the implementation will be tested on testnets (Versi and Westend).

There is no impact on privacy.

Performance

Overall performance will be improved by not checking the collator signatures in runtime and nodes. The impact on the UMP queue and candidate receipt processing is negligible.

The ClaimQueueOffset along with the relay parent choice allows parachains to optimize their block production for either throughput or lower XCM message processing latency. A value of 0 with the newest relay parent provides the best latency while picking older relay parents avoids re-orgs.

Ergonomics

It is mandatory for elastic parachains to switch to the new receipt format and commit to a core by sending the UMPSignal::SelectCore message. It is optional but desired that all parachains switch to the new receipts for providing the session index for disputes.

The implementation of this RFC itself must not introduce any breaking changes for the parachain runtime or collator nodes.

Compatibility

The proposed changes are not fully backward compatible, because older validators verify the collator signature of candidate descriptors.

Additional care must be taken before enabling the new descriptors by waiting for at least 2/3 + 1 validators to upgrade. Validators that have not upgraded will not back candidates using the new descriptor format and will also initiate disputes against these candidates.

Relay chain runtime

The first step is to remove collator signature checking logic in the runtime but keep the node side collator signature checks.

The runtime must be upgraded to support the new primitives before any collator or node is allowed to use the new candidate receipts format.

Validators

To ensure a smooth launch, a new node feature is required. The feature acts as a signal for supporting the new candidate receipts on the node side and can only be safely enabled if at least 2/3 + 1 of the validators have upgraded. Node implementations need to decode the new candidate descriptor once the feature is enabled, otherwise they might raise disputes and get slashed.

Once the feature is enabled, the validators will skip checking the collator signature when processing the candidate receipts and verify the CoreIndex and SessionIndex fields if present in the receipt.

No new implementation of networking protocol versions for collation and validation is required.

Tooling

Any tooling that decodes UMP XCM messages needs an update to support or ignore the new UMP messages, but it should still be able to decode the regular XCM messages that come before the separator.

Prior Art and References

Forum discussion about a new CandidateReceipt format: https://forum.polkadot.network/t/pre-rfc-discussion-candidate-receipt-format-v2/3738

Unresolved Questions

N/A

The implementation is extensible and future-proof to some extent. With minimal or no breaking changes, additional fields can be added in the candidate descriptor until the reserved space is exhausted.

At this point, there is a simple way to determine the version of the receipt, by testing for zeroed reserved bytes in the descriptor. Future versions of the receipt can be implemented and identified by using the version field of the descriptor introduced in this RFC.

(source)

Table of Contents

RFC-0105: XCM improved fee mechanism

Start Date23 July 2024
DescriptionAllow multiple types of fees to be paid
AuthorsFrancisco Aguirre

Summary

XCM already handles execution fees in an effective and efficient manner using the BuyExecution instruction. However, other types of fees are not handled as effectively -- for example, delivery fees. Fees exist that can't be measured using Weight -- as execution fees can -- so a new method should be thought up for those cases. This RFC proposes making the fee handling system simpler and more general, by doing two things:

  • Adding a fees register
  • Deprecating BuyExecution and adding a new instruction PayFees with new semantics to ultimately replace it.

Motivation

Execution fees are handled correctly by XCM right now. However, the addition of extra fees, like those for message delivery, results in awkward ways of integrating them into the XCVM implementation. This is because these types of fees are not included in the language. The standard should have a way to correctly deal with these implementation-specific fees, which might not exist in every system that uses XCM. The new instruction moves the specified amount of fees from the holding register to a dedicated fees register that the XCVM can use in flexible ways depending on its implementation. The XCVM implementation is free to use these fees to pay for execution fees, transport fees, or any other type of fee that might be necessary. This moves the specifics of fees further away from the XCM standard and more into the actual underlying XCVM implementation, which is a good thing.

Stakeholders

  • Runtime Users
  • Runtime Devs
  • Wallets
  • dApps

Explanation

The new instruction that will replace BuyExecution is a much simpler and general version: PayFees. This instruction takes one Asset, takes it from the holding register, and puts it into a new fees register. The XCVM implementation can now use this Asset to make sure every necessary fee is paid for, this includes execution fees, delivery fees, and any other type of fee necessary for the program to execute successfully.

#![allow(unused)]
fn main() {
PayFees { asset: Asset }
}

This new instruction reserves the entirety of the asset operand for fee payment. There is no concept of returning the leftover fees to the holding register mid-execution; this allows the implementation to charge fees at different points during execution. Because of this, the asset passed in can't be used for anything else for the entirety of the program, which differs from the current semantics of BuyExecution.

If not all of the Asset in the fees register is used when execution ends, the remainder is trapped alongside any leftover assets from the holding register. RefundSurplus can be used to move all leftover fees from the fees register back to the holding register. Care must be taken to use it only after all instructions that might charge fees, else execution will fail.
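For illustration, a typical program shape under the new semantics (a sketch only; asset and beneficiary names are placeholders):

Xcm(vec![
    WithdrawAsset(assets),
    // Reserve the fee asset for the whole program; no implicit refund happens.
    PayFees { asset: fee_asset },
    // ... instructions that may charge execution or delivery fees ...
    // Only once nothing else can charge fees, recover the surplus and deposit it.
    RefundSurplus,
    DepositAsset { assets: Wild(All), beneficiary },
])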

Examples

Most XCM programs that pay for execution are written like so:

#![allow(unused)]
fn main() {
// Instruction that loads the holding register
BuyExecution { asset, weight_limit }
// ...rest
}

With this RFC, the structure would be the same, but using the new instruction, that has different semantics:

#![allow(unused)]
fn main() {
// Instruction that loads the holding register
PayFees { asset }
// ...rest
}

Drawbacks

There needs to be an explicit change from BuyExecution to PayFees, most often accompanied by a reduction in the assets passed in.

Testing, Security, and Privacy

Trapping leftover fees might become a security concern, since a large number of such leftovers is expected.

Performance, Ergonomics, and Compatibility

Performance

There should be no performance downsides to this approach. The fees register is a simplification that may actually result in better performance, in the case an implementation is doing a workaround to achieve what this RFC proposes.

Ergonomics

The interface is going to be very similar to the already existing one. Even simpler since PayFees will only receive one asset. That asset will allow users to limit the amount of fees they are willing to pay.

Compatibility

This RFC can't just change the semantics of the BuyExecution instruction, since that instruction accepts any funds, uses what it needs and returns the rest immediately. The newly proposed instruction, PayFees, doesn't return the leftover immediately; it keeps it in the fees register. In practice, the deprecated BuyExecution needs to be slowly phased out in favour of PayFees.

Prior Art and References

The closed RFC PR on the xcm-format repository, before XCM RFCs got moved to fellowship RFCs: https://github.com/polkadot-fellows/xcm-format/pull/53.

Unresolved Questions

None

This proposal would greatly benefit from an improved asset trapping system.

CustomAssetClaimer is also related, as it directly improves the ergonomics of this proposal.

LeftoverAssetsDestination execution hint would also similarly improve the ergonomics.

The removal of JIT fees is also related; they become useless with this proposal.

(source)

Table of Contents

RFC-0107: XCM Execution hints

Start Date23 July 2024
DescriptionAdd a mechanism for configuring particular XCM executions
AuthorsFrancisco Aguirre

Summary

A previous XCM RFC (https://github.com/polkadot-fellows/xcm-format/pull/37) introduced a SetAssetClaimer instruction. This idea of instructing the XCVM to change some implementation-specific behavior is useful. In order to generalize this mechanism, this RFC introduces a new instruction SetHints and makes SetAssetClaimer just one of many possible execution hints.

Motivation

There is a need to specify how certain implementation-specific things should behave, such as who can claim trapped assets or what to do with leftover assets instead of trapping them. Other ideas for hints:

  • AssetForFees: to signify to the executor what asset the user prefers to use for fees.
  • LeftoverAssetsDestination: for depositing leftover assets to a destination instead of trapping them

Stakeholders

  • Runtime devs
  • Wallets
  • dApps

Explanation

A new instruction, SetHints, will be added. This instruction takes a single parameter: a bounded vector of Hints, where Hint is an enumeration. The first variant of this enum is AssetClaimer, which allows specifying a location that should be able to claim trapped assets. The SetAssetClaimer instruction would therefore be removed in favor of this.

In Rust, the new definitions would look as follows:

#![allow(unused)]
fn main() {
enum Instruction {
  // ...snip...
  SetHints(BoundedVec<Hint, NumVariants>),
  // ...snip...
}

enum Hint {
  AssetClaimer(Location),
  // more can be added
}

type NumVariants = /* Number of variants of the `Hint` enum */;
}
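A usage sketch (the claimer location is a placeholder, and the bounded vector is assumed to be buildable from a Vec via try_into): hints go at the very top of the program so they take effect before anything can fail:

Xcm(vec![
    // Set hints first so they apply even if a later instruction errors and traps assets.
    SetHints(vec![Hint::AssetClaimer(claimer_location)].try_into().expect("within bound")),
    WithdrawAsset(assets),
    // ... rest of the program ...
])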

Drawbacks

The SetHints instruction might be hard to benchmark, since we would need to look at the actual hints being set to know how much weight to attribute to it.

Testing, Security, and Privacy

Hints are specified on a per-message basis, so they have to be specified at the beginning of a message. If they were to be specified at the end, hints like AssetClaimer would be useless if an error occurs beforehand and assets get trapped before ever reaching the hint.

The instruction takes a bounded vector of hints so as to not force barriers to allow an arbitrary number of SetHint instructions.

Performance, Ergonomics, and Compatibility

Performance

None.

Ergonomics

The SetHints instruction provides better integration with barriers. If we had to add one barrier for SetAssetClaimer and another for each new hint that's added, barriers would need to be changed all the time. This instruction also makes it simpler to write XCM programs: you only need to specify the hints you want in one single instruction at the top of your program.

Compatibility

None.

Prior Art and References

The previous RFC PR in the xcm-format repository before XCM RFCs moved to fellowship RFCs: https://github.com/polkadot-fellows/xcm-format/pull/59.

Unresolved Questions

None.


(source)

Table of Contents

RFC-0108: Remove XCM testnet NetworkIds

Start Date23 July 2024
DescriptionRemove the NetworkIds for testnets Westend and Rococo
Authors

Summary

This RFC aims to remove the NetworkIds of Westend and Rococo, arguing that testnets shouldn't go in the language.

Motivation

Plans to phase out Rococo have already been announced, and Paseo has appeared as a replacement. Instead of constantly changing the testnets included in the language, we should favor specifying them via their genesis hash, using NetworkId::ByGenesis.

Stakeholders

  • Runtime devs
  • Wallets
  • dApps

Explanation

Remove Westend and Rococo from the included NetworkIds in the language.

Drawbacks

This RFC will make it less convenient to specify a testnet, but not by a large amount.

Testing, Security, and Privacy

None.

Performance, Ergonomics, and Compatibility

Performance

None.

Ergonomics

It will very slightly reduce ergonomics for testnet developers but improve the stability of the language.

Compatibility

NetworkId::Rococo and NetworkId::Westend can just use NetworkId::ByGenesis, as can other testnets.
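A hedged sketch of how a testnet could then be referenced; WESTEND_GENESIS_HASH is a placeholder constant for the chain's genesis hash, not a value taken from this RFC:

// Hypothetical placeholder for Westend's 32-byte genesis hash.
const WESTEND_GENESIS_HASH: [u8; 32] = [0u8; 32]; // replace with the real genesis hash
// Refer to the testnet by genesis hash instead of a dedicated `NetworkId` variant.
let westend = NetworkId::ByGenesis(WESTEND_GENESIS_HASH);
let asset_hub_westend = Location::new(2, [GlobalConsensus(westend), Parachain(1000)]);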

Prior Art and References

A previous attempt to add NetworkId::Paseo: https://github.com/polkadot-fellows/xcm-format/pull/58.

Unresolved Questions

None.


(source)

Table of Contents

RFC-0122: Asset transfers can alias XCM origin on destination to original origin

Start Date01 Sep 2024.
DescriptionSingle and Multi-hop asset transfers should be able to carry over original origin
AuthorsAdrian Catangiu

Summary

XCM programs generated by the InitiateAssetsTransfer instruction shall have the option to carry over the original origin all the way to the final destination. They shall do so by internally making use of AliasOrigin or ClearOrigin depending on the given parameters.

This allows asset transfers to retain their original origin even across multiple hops.

Ecosystem chains would have to change their trusted aliasing rules to effectively make use of this feature.

Motivation

Currently, all XCM asset transfer instructions ultimately clear the origin in the remote XCM message by use of the ClearOrigin instruction. This is done for security considerations to ensure that subsequent (user-controlled) instructions cannot command the authority of the sending chain.

The problem with this approach is that it limits what can be achieved on remote chains through XCM. Most XCM operations require having an origin, and following any asset transfer the origin is lost, meaning not much can be done other than depositing the transferred assets to some local account or transferring them onward to another chain.

For example, we cannot transfer some funds for buying execution, then do a Transact (all in the same XCM message).

The above example is a basic, core building block for cross-chain interactions and we should support it.

Transact XCM programs today require a two-step process:

Transact Today

And we want to be able to do it using a single XCM program.

Stakeholders

Runtime Users, Runtime Devs, wallets, cross-chain dApps.

Explanation

In the case of XCM programs going from source-chain directly to dest-chain without an intermediary hop, we can enable scenarios such as above by using the AliasOrigin instruction instead of the ClearOrigin instruction.

Instead of clearing the source-chain origin, the destination chain shall attempt to alias source-chain to the "original origin" on the source chain. The most common such origin aliasing would be X1(Parachain(source-chain)) -> X2(Parachain(source-chain), AccountId32(origin-account)) for the case of a single-hop transfer where the initiator is a (signed/pure/proxy) account origin-account on source-chain. This is equivalent to using the DescendOrigin instruction in this case, but it is also usable in the multi-hop case.

This allows an actor on chain A to Transact on chain B without having to prefund its sovereign account on chain B; instead, it can simply transfer the required fees in the same XCM program as the Transact.

As long as the asset transfer has the same XCM route/hops as the rest of the program, this pattern of usage can be composed across multiple hops, to ultimately Transact on the final hop using the original origin on the source chain, effectively abstracting away any intermediary hops.

Trust assumptions

The model described above works between chains that configure certain aliasing rules. Origin aliasing is highly customizable at the runtime level, so that chains can define coarse filters or granular pairs of (source, target) locations aliasing.

This RFC suggests a coarse set of aliasing rules that chains can use to allow the vast majority of Transact use cases in a "one-click" manner (single user signature), without practically lowering their security posture.

Suggested Aliasing Rules:

  1. Any chain allows aliasing origin into a child location. Equivalent to DescendOrigin into an interior location.
  2. Parachains allow Asset Hub root location to alias into any other origin.

The first rule allows use of AliasOrigin with the same effect as doing a DescendOrigin, so it is absolutely not controversial.
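A minimal sketch of what rule 1 could look like as an origin-aliasing filter, assuming the standard ContainsPair trait and a starts_with helper on Location (a production filter would live in each chain's XCM config):

pub struct AliasChildLocation;
impl ContainsPair<Location, Location> for AliasChildLocation {
    fn contains(origin: &Location, target: &Location) -> bool {
        // Allow `origin` to alias only into locations interior to itself,
        // e.g. Parachain(A) -> Parachain(A)/AccountId32(account).
        target.starts_with(origin)
    }
}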

Now, the second rule as defined above in its most generic form might seem "dangerous" at first, but in practical terms if Asset Hub Root gets compromised and can access arbitrary sovereign accounts on Asset Hub and/or send arbitrary XCMs, the blast radius and potential damage to other chains is already so large that it is not relevantly impacted by this aliasing rule. A compromised system chain root would already be by itself an "apocalypse" scenario for the whole Polkadot Ecosystem.

It's important to note that the aliasing rules above are only a suggestion; ultimately they are chain-specific configuration. Therefore, each chain can tighten them to its own liking, for example by using a stricter range of locations that Asset Hub can alias, like:

  • "allow Asset Hub root to alias Ethereum locations" - which enables support for Transact over the Ethereum Snowbridge (but doesn't support sibling parachain to Transact through Asset Hub),
  • "allow Asset Hub root to alias Kusama locations"
  • "allow Asset Hub root to alias specific pallet or smart contract on Chain X"

Please note that Bridge Hub already does something similar today: Bridge Hub root is allowed/trusted to UniversalOrigin+DescendOrigin into any external location in order to impersonate/proxy external locations.

XCM InitiateAssetsTransfer instruction changes

A new parameter preserve_origin to be added to the InitiateAssetsTransfer XCM instruction that specifies if the original origin should be preserved or cleared.

InitiateAssetsTransfer {
	destination: Location,
	assets: Vec<AssetTransferFilter>,
	remote_fees: Option<AssetTransferFilter>,
+	preserve_origin: bool,
	remote_xcm: Xcm<()>,
}

This parameter is explicitly necessary because the instruction should be usable between any two chains regardless of their origin-aliasing trust relationship. Preserving the origin requires some level of trust, while clearing it works regardless of that relationship. Specifying preserve_origin: false will always work regardless of the configured alias filters of the involved chains.

Example scenarios

Transact within the ecosystem:

  • between two chains using an asset native to either one of them for paying for Transact,
  • between two chains using an Asset Hub asset (e.g. USDT) for paying for Transact,
Ecosystem Transact

Transact over Snowbridge (same for other bridges):

  • user on Ethereum calls function in Parachain A on Polkadot, pays with ETH,
  • user on ParaA on Polkadot calls function on Ethereum, pays with ETH,
Transact Over Bridge

Drawbacks

In terms of ergonomics and user experience, this support for combining an asset transfer with a subsequent action (like Transact) is a net positive.

In terms of performance and privacy, this is neutral, with no changes.

In terms of security, the feature by itself is also neutral because it allows preserve_origin: false usage for operating with no extra trust assumptions. When wanting to support preserving the origin, chains need to configure secure origin-aliasing filters. The rules suggested in this RFC should be the right choice for the majority of chains, but each chain will ultimately choose depending on its business model and logic (e.g. a chain may not plan to integrate with Asset Hub). It is up to the individual chains to configure this accordingly.

Testing, Security, and Privacy

Barriers should now allow AliasOrigin, DescendOrigin or ClearOrigin.

Normally, XCM program builders should audit their programs and eliminate assumptions of "no origin" on the remote side of this instruction. In this case, InitiateAssetsTransfer has not been released yet; it will be part of XCMv5, and we can make this change part of the same XCMv5 release so that there isn't even the possibility of someone in the wild having built XCM programs using this instruction under those wrong assumptions.

The working assumption going forward is that the origin on the remote side can either be cleared or it can be the local origin's reanchored location. This assumption is in line with the current behavior of remote XCM programs sent over using pallet_xcm::send.

The existing DepositReserveAsset, InitiateReserveWithdraw and InitiateTeleport cross-chain asset transfer instructions will not attempt to do origin aliasing and will always clear the origin, same as before, for compatibility reasons.

Performance, Ergonomics, and Compatibility

Performance

No impact.

Ergonomics

Improves ergonomics by allowing the local origin to operate on the remote chain even when the XCM program includes an asset transfer.

Compatibility

At the executor level this change is backwards and forwards compatible. Both types of programs can be executed on new and old versions of XCM with no changes in behavior.

The new version of the InitiateAssetsTransfer instruction acts the same as before when used with preserve_origin: false.

To use the new capabilities, the XCM builder has to verify that the involved chains have the required origin-aliasing filters configured, and use a new version of the barriers that accepts AliasOrigin as an allowed alternative to ClearOrigin.

For compatibility reasons, this RFC proposes this mechanism be added as an enhancement to the yet unreleased InitiateAssetsTransfer instruction, thus eliminating possibilities of XCM logic breakages in the wild. Following the same logic, the existing DepositReserveAsset, InitiateReserveWithdraw and InitiateTeleport cross-chain asset transfer instructions will not attempt to do origin aliasing and will always clear the origin, same as before, for compatibility reasons.

Any one of the DepositReserveAsset, InitiateReserveWithdraw and InitiateTeleport instructions can be replaced with an InitiateAssetsTransfer instruction, with or without origin aliasing, thus providing a clean and clear upgrade path for opting in to this new feature.

Prior Art and References

Unresolved Questions

None

(source)

Table of Contents

RFC-0000: Validator Rewards

Start DateDate of initial proposal
DescriptionRewards protocol for Polkadot validators
AuthorsJeff Burdges, ...

Summary

An off-chain approximation protocol should assign rewards based upon the approvals and availability work done by validators.

All validators track which approval votes they actually use, reporting the aggregate, after which an on-chain median computation gives a good approximation under byzantine assumptions. Approval checkers report aggregate information about which availability chunks they use too, but in availability we need a tit-for-tat game to enforce honesty, because approval committees could often bias results thanks to their small size.

Motivation

We want all Polkadot subsystems to be profitable for validators, because otherwise operators might profit from running modified code. In particular, almost all rewards in Kusama/Polkadot should come from work done securing parachains, primarily approval checking, but also backing, availability, and support of XCMP.

Among these tasks, our highest priorities must be approval checks, which ensure soundness, and sending availability chunks to approval checkers. We prove backers must be paid strictly less than approval checkers.

At present though, validators' rewards have relatively little relationship to validators' operating costs, in terms of bandwidth and CPU time. Worse, Polkadot's scaling makes us particularly vulnerable to "no-shows" caused by validators skipping their approval checks.

We're particularly concerned about the impact of hardware specs upon the number of parachain cores. We've requested relatively low-spec machines so far, only four physical CPU cores, although some validators run even lower specs, like only two physical CPU cores. Alone, rewards cannot fix our low-spec validator problem, but rewards and outreach together should have far more impact than either alone.

In future, we'll further increase validator spec requirements, which directly improves Polkadot's throughput and repeats this dynamic of purging under-spec nodes, except outreach becomes more important because, de facto, too many slow validators can "out-vote" the faster ones.

Stakeholders

We alter the validator rewards protocol, but with negligible impact upon rewards for honest validators who comply with hardware and bandwidth recommendations.

We shall still reward participation in relay chain consensus of course, which de facto means block production but not finality, but these current reward levels shall wind up greatly reduced. Any validators who manipulate block rewards now could lose rewards here, simply because of rewards being shifted from block production to availability, but this sounds desirable.

We've discussed roughly this rewards protocol in https://hackmd.io/@rgbPIkIdTwSICPuAq67Jbw/S1fHcvXSF and https://github.com/paritytech/polkadot-sdk/issues/1811 as well as related topics like https://github.com/paritytech/polkadot-sdk/issues/5122

Logic

Categories

We alter the current rewards scheme by reducing the existing reward categories to roughly these proportions of total rewards:

  • 15-20% - Relay chain block production and uncle logic
  • 5% - Anything else related to relay chain finality, primarily beefy proving, but maybe other tasks exist.
  • Any existing rewards for on-chain validity statements would only cover backers, so those rewards must be removed.

We add roughly these proportions of total rewards covering parachain work:

  • 70-75% - approval and backing validity checks, with the backing rewards being required to be less than approval rewards.
  • 5-10% - Availability redistribution from availability providers to approval checkers. We do not reward for availability distribution from backers to availability providers.

Collection

We track this data for each candidate during the approvals process:

/// Our subjective record of our availability transfers for this candidate.
pub struct CandidateRewards {
    /// Anyone who backed this parablock
    backers: [AuthorityId; NumBackers],
    /// Anyone who sent us chunks for this candidate
    downloaded_from: HashMap<AuthorityId, u16>,
    /// Anyone to whom we sent chunks for this candidate
    uploaded_to: HashMap<AuthorityId, u16>,
}

We no longer require this data during disputes.

After we approve a relay chain block, we collect all its CandidateRewards into an ApprovalsTally, with one ApprovalTallyLine for each validator. In this, we compute approval_usages from the final run of the approvals loop, plus 0.8 for each backer.

/// Our subjective record of what we used from, and provided to, all other validators on the finalized chain
pub struct ApprovalsTally(Vec<ApprovalTallyLine>);

/// Our subjective record of what we used from, and provided to, one other validator on the finalized chain
pub struct ApprovalTallyLine {
    /// Approvals by this validator which our approvals gadget used in marking candidates approved.
    approval_usages: u32,
    /// Availability chunks we downloaded from this validator for our approval checks we used.
    used_downloads: u32,
    /// Availability chunks we uploaded to this validator whose approval checks we used.
    used_uploads: u32,
}

At finality we sum these ApprovalsTallys into another ApprovalsTally covering the whole epoch so far. We can optionally sum them earlier at chain heads, but this requires mutability.
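As a rough illustration of that summation (a sketch only; the actual accumulation logic is not specified here), the per-block tallies could be folded field-wise into the epoch tally:

impl ApprovalsTally {
    /// Fold another tally (e.g. one finalized block's tally) into this epoch-wide tally.
    fn accumulate(&mut self, other: &ApprovalsTally) {
        for (epoch_line, block_line) in self.0.iter_mut().zip(other.0.iter()) {
            epoch_line.approval_usages += block_line.approval_usages;
            epoch_line.used_downloads += block_line.used_downloads;
            epoch_line.used_uploads += block_line.used_uploads;
        }
    }
}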

Messages

After the epoch is finalized, we share the first two fields of each line of its ApprovalsTally.

/// Our subjective record of what we used from some other validator on the finalized chain
pub struct ApprovalTallyMessageLine {
    /// Approvals by this validator which our approvals gadget used in marking candidates approved.
    approval_usages: u32,
    /// Availability chunks we downloaded from this validator for our approval checks we used.
    used_downloads: u32,
}

/// Our subjective record of what we used from all other validators on the finalized chain
pub struct ApprovalsTallyMessage(Vec<ApprovalTallyMessageLine>);

Rewards computation

We compute the approvals rewards by taking the median of the approval_usages fields for each validator across all validators' ApprovalsTallyMessages.

let mut approval_usages_medians = Vec::new();
for i in 0..num_validators {
    let mut v: Vec<u32> = approvals_tally_messages.iter().map(|atm| atm.0[i].approval_usages).collect();
    v.sort();
    approval_usages_medians.push(v[num_validators / 2]);
}

Assuming more than 50% honesty, these medians tell us how many approval votes from each validator were actually used.

We re-weight the used_downloads from the i-th validator by their median, times the expected f+1 chunks, divided by how many chunk downloads they claimed, and sum them:

#[cfg(offchain)]
let mut my_missing_uploads: Vec<u32> = my_approvals_tally.0.iter().map(|l| l.used_uploads).collect();
let mut reweighted_total_used_downloads = vec![0u64; num_validators];
for (mmu, atm) in my_missing_uploads.iter_mut().zip(approvals_tally_messages) {
    let d: u32 = atm.0.iter().map(|l| l.used_downloads).sum();
    for i in 0..num_validators {
        let atm_from_i = approval_usages_medians[i] * (f + 1) / d;
        #[cfg(offchain)]
        if i == me { *mmu -= atm_from_i };
        reweighted_total_used_downloads[i] += atm_from_i as u64;
    }
}

We distribute rewards on-chain using approval_usages_medians and reweighted_total_used_downloads. Approval checkers could later change from whom they download chunks, using my_missing_uploads.

Strategies

In theory, validators could adopt whatever strategy they like to penalize validators who stiff them on availability redistribution rewards, except they should not stiff back, only choose other availability providers. We discuss one good strategy below, but initially this could go unimplemented.

Explanation

Backing

Polkadot's efficiency creates subtle liveness concerns: any time one node cannot perform one of its approval checks, Polkadot loses in expectation 3.25 approval checks, or 0.10833 parablocks. This makes back pressure essential.

We cannot throttle approval checks securely either, so reactive off-chain back pressure only makes sense during or before the backing phase. In other words, if nodes feel overworked themselves, or perhaps believe others to be, then they should drop backing checks, never approval checks. It follows that backing work must be rewarded less well and less reliably than approvals, as otherwise validators could benefit from behavior that harms the network.

We propose that one backing statement be rewarded at 80% of one approval statement, so backers earn only 80% of what approval checkers earn. We omit rewards for availability distribution, so backers spend more on bandwidth too. Approval checkers always fetch chunks first from backers though, so good backers earn roughly 7% there, meaning backing checks earn roughly 13% less than approval checks. We should lower this 80% if we ever increase availability redistribution rewards.

Although imperfect, we believe this simplifies implementation and provides robustness against mistakes elsewhere, including governance mistakes, while incurring minimal risk. In principle, a backer might not distribute systemic chunks, but approval checkers fetch systemic chunks from backers first anyway, so this likely yields negligible gains.

As always we require that backers' rewards cover their operational costs plus some profit, but approval checks must be more profitable.

Approvals

In Polkadot, all validators run an approval assignment loop for each candidate, in which the validator listens to other approval checkers' assignments and approval statements/votes, with which it marks checkers no-show or done and marks candidates approved. This loop also determines and announces the validator's own approval checker assignments.

Any validator should always conclude whatever approval checks it begins, but our approval assignment loop ignores some approval checks, either because they were announced too soon or because an earlier no-show delivered its approval vote before the final approval. We say a validator $u$ uses an approval vote by a validator $v$ on a candidate $c$ if the approval assignment loop by $u$ counted the vote by $v$ towards approving the candidate $c$. We should not reward votes announced too soon, so we unavoidably omit rewards for some honest no-show replacements too. We expect the 80% discount for backing covers these losses, so approval checks remain more profitable than backing.

We propose a simple approximate solution based upon computing medians across validators for used votes.

  1. In an epoch $e$, each validator $u$ counts the number $\alpha_{u,v}$ of votes they used from each validator $v$, including themselves. Any time a validator marks a candidate approved, they increment these counts appropriately.

  2. After epoch $e$'s last block gets finalized, all validators of epoch $e$ submit an approvals tally message ApprovalsTallyMessage that reveals the number $\alpha_{u,v}$ of useful approvals they saw from each validator $v$ on candidates that became available in epoch $e$. We do not send $\alpha_{u,u}$ for tit-for-tat reasons discussed below, not for bias concerns. We record these approvals tally messages on-chain.

  3. After some delay, we compute on-chain the median $\alpha_v := \textrm{median} \{ \alpha_{u,v} : u \}$ of used approval statements for each validator $v$.

As discussed in https://hackmd.io/@rgbPIkIdTwSICPuAq67Jbw/S1fHcvXSF we could compute these medians using an online algorithm if Substrate had a nice priority queue.

We never achieve true consensus on approval checkers and their approval votes. Yet, our approval assignment loop gives a rough consensus, under our Byzantine assumption and some synchrony assumption. It then follows that misreporting by malicious validators should not appreciably alter the median $\alpha_v$ and hence rewards.

We never tally used approval assignments to candidate equivocations or other forks. Any validator should always conclude whatever approval checks it begins, even on other forks, but we expect relay chain equivocations should be vanishingly rare, and Sassafras should make forks uncommon.

Availability redistribution

As approval checkers could easily perform useless checks, we shall reward availability providers for the availability chunks they provide that resulted in useful approval checks. We enforce honesty using a tit-for-tat mechanism because chunk transfers are inherently subjective.

An approval checker reconstructs the full parachain block by downloading $f+1$ distinct chunks from other validators, where at most $f$ validators are byzantine, out of the $n \ge 3 f + 1$ total validators. In downloading chunks, validators prefer the $f+1$ systemic chunks over the non-systemic chunks, and prefer fetching from validators who already voted valid, like backing checkers. It follows some validators should receive credit for more than one chunk per candidate.

We expect a validator $v$ has actually performed more approval checks $\omega_v$ than the median $\alpha_v$ for which they actually received credit. In fact, approval checkers even ignore some of their own approval checks, meaning $\alpha_{v,v} \le \omega_v$ too.

Alongside the approvals count for epoch $e$, approval checker $v$ computes the counts $\beta_{u,v}$ of the chunks they downloaded from each availability provider $u$, excluding themselves, for which they perceive the approval check turned out useful, meaning their own approval counts in $\alpha_{v,v}$. Approval checkers publish $\beta_{u,v}$ alongside $\alpha_{u,v}$ in the approvals tally message ApprovalsTallyMessage. We originally proposed including the self availability usage $\beta_{v,v}$ here, but this should not matter, and excluding it simplifies the code.

Symmetrically, availability provider $u$ computes the counts $\gamma_{u,v}$ of the chunks they uploaded to each approval checker $v$, including themselves, again for which they perceive the approval check turned out useful. Availability provider $u$ never reveals its $\gamma_{u,v}$ however.

At this point, $\alpha_v$, $\alpha_{v,v}$, and $\alpha_{u,v}$ all potentially differ. We established consensus upon $\alpha_v$ above however, with which we avoid approval checkers printing unearned availability provider rewards:

After receiving "all" pairs $(\alpha_{u,v},\beta_{u,v})$, validator $w$ re-weights the $\beta_{u,v}$ and their own $\gamma_{w,v}$:

$$
\begin{aligned}
\beta'_{w,v} &= \frac{(f+1)\,\alpha_v}{\sum_u \beta_{u,v}}\, \beta_{w,v} \\
\gamma'_{w,v} &= \frac{(f+1)\,\alpha_w}{\sum_v \gamma_{w,v}}\, \gamma_{w,v}
\end{aligned}
$$

At this point, we compute $\beta'_w = \sum_v \beta'_{w,v}$ on-chain for each $w$ and reward $w$ proportionally.
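
A minimal sketch of the on-chain half of this computation, assuming the ApprovalsTallyMessage struct above; the function name, integer types, and the lack of fixed-point scaling are illustrative assumptions:

/// Sketch: compute the re-weighted download credit for every availability provider w,
/// i.e. the sum over checkers v of beta'_{w,v} from the formula above.
fn reweighted_download_credits(
    messages: &[ApprovalsTallyMessage], // one message per approval checker v
    medians: &[u32],                    // alpha_v for each checker v
    f: u64,
) -> Vec<u64> {
    let num_validators = medians.len();
    let mut credits = vec![0u64; num_validators];
    for (v, msg) in messages.iter().enumerate() {
        let total: u64 = msg.0.iter().map(|l| l.used_downloads as u64).sum();
        if total == 0 { continue; }
        for (w, line) in msg.0.iter().enumerate() {
            // beta'_{w,v} = (f+1) * alpha_v * beta_{w,v} / sum_u beta_{u,v}
            credits[w] += (f + 1) * medians[v] as u64 * line.used_downloads as u64 / total;
        }
    }
    credits
}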

Tit-for-tat

We employ a tit-for-tat strategy to punish validators who lie about from whom they obtain availability chunks. We only alter validators' future choices of from whom they obtain availability chunks, and never punish by lying ourselves, so nothing here breaks Polkadot, but not having roughly this strategy enables cheating.

An availability provider $w$ defines $\delta'_{w,v} := \gamma'_{w,v} - \beta'_{w,v}$ to be the re-weighted number of chunks by which $v$ stiffed $w$. Now $w$ increments their cumulative stiffing perception $\eta_{w,v}$ from $v$ by the value $\delta'_{w,v}$, so $\eta_{w,v} \mathrel{+}= \delta'_{w,v}$.

In future, any time $w$ seeks chunks in reconstruction, $w$ skips $v$ with probability proportional to $\eta_{w,v} / \sum_u \eta_{w,u}$, with each skip reducing $\eta_{w,v}$ by 1. We expect honest accidental availability stiffs have only small $\delta'_{w,v}$, so they clear out quickly, but intentional stiffs add up more quickly.
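
One possible way a validator might implement the skipping, as a sketch; the probability sampling and types are assumptions rather than a prescribed mechanism:

/// Sketch: decide whether to skip availability provider v this time, decaying the
/// grudge eta_{w,v} by one on each skip.
fn maybe_skip(eta: &mut [u64], v: usize, random_unit: f64) -> bool {
    let total: u64 = eta.iter().sum();
    if total == 0 {
        return false; // nobody stiffed us, never skip
    }
    let skip_probability = eta[v] as f64 / total as f64;
    if random_unit < skip_probability {
        eta[v] = eta[v].saturating_sub(1); // each skip reduces eta_{w,v} by 1
        true
    } else {
        false
    }
}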

We keep $\gamma_{w,v}$ and $\alpha_{u,u}$ secret so that approval checkers cannot really know others' stiffing perceptions, although $\alpha_{u,v}$ leaks some relevant information. We expect this secrecy keeps skips secret and thus prevents the tit-for-tat escalating beyond one round, which hopefully creates a desirable Nash equilibrium.

We favor fetching systemic chunks to reduce reconstruction costs, so we face costs when skipping their providers. We could however fetch systemic chunks from availability providers as well as backers, or even other approval checkers, so this might not become problematic in practice.

Concerns: Drawbacks, Testing, Security, and Privacy

We do not pay backers individually for availability distribution per se. We could only do so by including this information in the availability bitfields, which complicates on-chain computation. Also, if one of the two backers does not distribute then the availability core should remain occupied longer, meaning the lazy backer loses some rewards too. It's likely future protocol improvements change this, so we should monitor for lazy backers outside the rewards system.

Earlier drafts discussed approvals being considered by the tit-for-tat. An adversary who successfully manipulates the rewards median votes would have already violated Polkadot's security assumptions though, which requires a hard fork and correcting the DOT allocation. Incorrectly reported approval_usages remain interesting statistics though.

Adversarial validators could manipulate their availability votes though, even without being a supermajority. If they still download honestly, then this costs them more rewards than they earn. We do not prevent validators from preferentially obtaining their pieces from their friends though. We should analyze, or at least observe, the long-term consequences.

A priori, a whale nominator's validators could stiff other validators but then rotate quickly enough that they never suffer being skipped back. We discuss several possible solutions, and their difficulties, under "Rob's nominator-wise skipping" in https://hackmd.io/@rgbPIkIdTwSICPuAq67Jbw/S1fHcvXSF but overall less seems like more here. Also, frequent validator rotation could be penalized elsewhere.

Performance, Ergonomics, and Compatibility

We operate off-chain except for final rewards votes and median tallies. We expect lower overhead rewards protocols would lack information, thereby admitting easier cheating.

Initially, we designed the ELVES approval gadget to allow on-chain operation, in part for rewards computation, but doing so looks expensive. Also, on-chain rewards computation remains only an approximation too, and could even be biased more easily than the off-chain protocol presented here.

We already teach validators about missed parachain blocks, but we'll teach them more about approval checking going forwards, because current efforts focus more upon backing.

JAM's block exports should not complicate availability rewards, but could impact some alternative schemes.

Prior Art and References

None

Unresolved Questions


Synthetic parachain flag

Any rewards protocol could simply be "out voted" by too many slow validators: an increase in the number of parachain cores increases the workload, but this creates no-shows if too few validators can handle that workload.

We could add a synthetic parachain flag, only settable by governance, which treats no-shows as positive approval votes for that parachain, but without adding rewards. We should never enable this for real parachains, only for synthetic ones like gluttons. We should not enable the synthetic parachain flag long-term even for gluttons, because validators could easily modify their code. Yet, synthetic approval checks might enable pushing hardware upgrades more aggressively over the short term.

(source)

Table of Contents

RFC-0004: Remove the host-side runtime memory allocator

Start Date2023-07-04
DescriptionUpdate the runtime-host interface to no longer make use of a host-side allocator
AuthorsPierre Krieger

Summary

Update the runtime-host interface to no longer make use of a host-side allocator.

Motivation

The heap allocation of the runtime is currently controlled by the host using a memory allocator on the host side.

The API of many host functions consists in allocating a buffer. For example, when calling ext_hashing_twox_256_version_1, the host allocates a 32 bytes buffer using the host allocator, and returns a pointer to this buffer to the runtime. The runtime later has to call ext_allocator_free_version_1 on this pointer in order to free the buffer.

Even though no benchmark has been done, it is pretty obvious that this design is very inefficient. To continue with the example of ext_hashing_twox_256_version_1, it would be more efficient to instead write the output hash to a buffer that was allocated by the runtime on its stack and passed by pointer to the function. Allocating a buffer on the stack in the worst case scenario simply consists in decreasing a number, and in the best case scenario is free. Doing so would save many Wasm memory reads and writes by the allocator, and would save a function call to ext_allocator_free_version_1.

Furthermore, the existence of the host-side allocator has become questionable over time. It is implemented in a very naive way, and for determinism and backwards compatibility reasons it needs to be implemented exactly identically in every client implementation. Runtimes make substantial use of heap memory allocations, and each allocation needs to go twice through the runtime <-> host boundary (once for allocating and once for freeing). Moving the allocator to the runtime side, while it would increase the size of the runtime, would be a good idea. But before the host-side allocator can be deprecated, all the host functions that make use of it need to be updated to not use it.

Stakeholders

No attempt was made at convincing stakeholders.

Explanation

New host functions

This section contains a list of new host functions to introduce.

(func $ext_storage_read_version_2
    (param $key i64) (param $value_out i64) (param $offset i32) (result i64))
(func $ext_default_child_storage_read_version_2
    (param $child_storage_key i64) (param $key i64) (param $value_out i64)
    (param $offset i32) (result i64))

The signature and behaviour of ext_storage_read_version_2 and ext_default_child_storage_read_version_2 are identical to their version 1 counterparts, but the return value has a different meaning. The new functions directly return the number of bytes that were written in the value_out buffer. If the entry doesn't exist, a value of -1 is returned. Given that the host must never write more bytes than the size of the buffer in value_out, and that the size of this buffer is expressed as a 32-bit number, a 64-bit value of -1 is not ambiguous.

The runtime execution stops with an error if value_out is outside of the range of the memory of the virtual machine, even if the size of the buffer is 0 or if the amount of data to write would be 0 bytes.
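
For illustration, a runtime-side wrapper might look like the following sketch; the extern declaration and the pointer-size packing helper are written out here as assumptions, not taken from any existing crate:

extern "C" {
    fn ext_storage_read_version_2(key: i64, value_out: i64, offset: u32) -> i64;
}

// Assumed pointer-size packing: pointer in the low 32 bits, length in the high 32 bits.
fn pack_ptr_size(ptr: u32, len: u32) -> i64 {
    (ptr as i64) | ((len as i64) << 32)
}

/// Reads a storage value into `out`, returning the number of bytes written,
/// or `None` if the entry does not exist (the host returned -1).
fn storage_read(key: &[u8], out: &mut [u8], offset: u32) -> Option<u32> {
    let ret = unsafe {
        ext_storage_read_version_2(
            pack_ptr_size(key.as_ptr() as u32, key.len() as u32),
            pack_ptr_size(out.as_mut_ptr() as u32, out.len() as u32),
            offset,
        )
    };
    if ret == -1 { None } else { Some(ret as u32) }
}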

(func $ext_storage_next_key_version_2
    (param $key i64) (param $out i64) (return i32))
(func $ext_default_child_storage_next_key_version_2
    (param $child_storage_key i64) (param $key i64) (param $out i64) (return i32))

The behaviour of these functions is identical to their version 1 counterparts. Instead of allocating a buffer, writing the next key to it, and returning a pointer to it, the new version of these functions accepts an out parameter containing a pointer-size to the memory location where the host writes the output. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine, even if the function wouldn't write anything to out. These functions return the size, in bytes, of the next key, or 0 if there is no next key. If the size of the next key is larger than the buffer in out, the bytes of the key that fit the buffer are written to out and any extra byte that doesn't fit is discarded.

Some notes:

  • It is never possible for the next key to be an empty buffer, because an empty key has no preceding key. For this reason, a return value of 0 can unambiguously be used to indicate the lack of next key.
  • The ext_storage_next_key_version_2 and ext_default_child_storage_next_key_version_2 are typically used in order to enumerate keys that start with a certain prefix. Given that storage keys are constructed by concatenating hashes, the runtime is expected to know the size of the next key and can allocate a buffer that can fit said key. When the next key doesn't belong to the desired prefix, it might not fit the buffer, but given that the start of the key is written to the buffer anyway this can be detected in order to avoid calling the function a second time with a larger buffer. A sketch of this enumeration pattern is given below.
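
A sketch of that enumeration pattern, under the assumption that keys in the map being enumerated fit a fixed-size buffer; the helper names are illustrative:

extern "C" {
    fn ext_storage_next_key_version_2(key: i64, out: i64) -> i32;
}

// Assumed pointer-size packing: pointer in the low 32 bits, length in the high 32 bits.
fn pack_ptr_size(ptr: u32, len: u32) -> i64 { (ptr as i64) | ((len as i64) << 32) }

/// Enumerate all keys starting with `prefix` (sketch).
fn keys_with_prefix(prefix: &[u8]) -> Vec<Vec<u8>> {
    let mut keys = Vec::new();
    let mut current = prefix.to_vec();
    // Assumption: every key in the map we enumerate fits this buffer.
    let mut buf = [0u8; 128];
    loop {
        let size = unsafe {
            ext_storage_next_key_version_2(
                pack_ptr_size(current.as_ptr() as u32, current.len() as u32),
                pack_ptr_size(buf.as_mut_ptr() as u32, buf.len() as u32),
            )
        } as usize;
        if size == 0 || size > buf.len() {
            break; // no next key, or the next key cannot belong to our map
        }
        let next = &buf[..size];
        if !next.starts_with(prefix) {
            break; // walked past the prefix
        }
        keys.push(next.to_vec());
        current = next.to_vec();
    }
    keys
}
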
(func $ext_hashing_keccak_256_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_keccak_512_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_sha2_256_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_blake2_128_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_blake2_256_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_twox_64_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_twox_128_version_2
    (param $data i64) (param $out i32))
(func $ext_hashing_twox_256_version_2
    (param $data i64) (param $out i32))
(func $ext_trie_blake2_256_root_version_3
    (param $data i64) (param $version i32) (param $out i32))
(func $ext_trie_blake2_256_ordered_root_version_3
    (param $data i64) (param $version i32) (param $out i32))
(func $ext_trie_keccak_256_root_version_3
    (param $data i64) (param $version i32) (param $out i32))
(func $ext_trie_keccak_256_ordered_root_version_3
    (param $data i64) (param $version i32) (param $out i32))
(func $ext_default_child_storage_root_version_3
    (param $child_storage_key i64) (param $out i32))
(func $ext_crypto_ed25519_generate_version_2
    (param $key_type_id i32) (param $seed i64) (param $out i32))
(func $ext_crypto_sr25519_generate_version_2
    (param $key_type_id i32) (param $seed i64) (param $out i32) (return i32))
(func $ext_crypto_ecdsa_generate_version_2
    (param $key_type_id i32) (param $seed i64) (param $out i32) (return i32))

The behaviour of these functions is identical to their version 1 or version 2 counterparts. Instead of allocating a buffer, writing the output to it, and returning a pointer to it, the new version of these functions accepts an out parameter containing the memory location where the host writes the output. The output is always of a size known at compilation time. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine.
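
As an illustration, hashing into a stack buffer might look like this sketch; the extern declaration and packing helper are assumptions:

extern "C" {
    fn ext_hashing_twox_256_version_2(data: i64, out: u32);
}

// Assumed pointer-size packing: pointer in the low 32 bits, length in the high 32 bits.
fn pack_ptr_size(ptr: u32, len: u32) -> i64 { (ptr as i64) | ((len as i64) << 32) }

/// Hash `data` into a stack-allocated 32-byte buffer; no host-side allocation is involved.
fn twox_256(data: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    unsafe {
        ext_hashing_twox_256_version_2(
            pack_ptr_size(data.as_ptr() as u32, data.len() as u32),
            out.as_mut_ptr() as u32,
        );
    }
    out
}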

(func $ext_default_child_storage_root_version_3
    (param $child_storage_key i64) (param $out i32))
(func $ext_storage_root_version_3
    (param $out i32))

The behaviour of these functions is identical to their version 1 and version 2 counterparts. Instead of allocating a buffer, writing the output to it, and returning a pointer to it, the new versions of these functions accept an out parameter containing the memory location where the host writes the output. The output is always of a size known at compilation time. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine.

I have taken the liberty to take the version 1 of these functions as a base rather than the version 2, as a PPP deprecating the version 2 of these functions has previously been accepted: https://github.com/w3f/PPPs/pull/6.

(func $ext_storage_clear_prefix_version_3
    (param $prefix i64) (param $limit i64) (param $removed_count_out i32)
    (return i32))
(func $ext_default_child_storage_clear_prefix_version_3
    (param $child_storage_key i64) (param $prefix i64)
    (param $limit i64)  (param $removed_count_out i32) (return i32))
(func $ext_default_child_storage_kill_version_4
    (param $child_storage_key i64) (param $limit i64)
    (param $removed_count_out i32) (return i32))

The behaviour of these functions is identical to their version 2 and 3 counterparts. Instead of allocating a buffer, writing the output to it, and returning a pointer to it, the version 3 and 4 of these functions accept a removed_count_out parameter containing the memory location of an 8-byte buffer where the host writes, in little endian, the number of keys that were removed. The runtime execution stops with an error if removed_count_out is outside of the range of the memory of the virtual machine. The functions return 1 to indicate that there are keys remaining, and 0 to indicate that all keys have been removed.
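
A sketch of how a runtime might read the removed count; the wrapper name is an assumption, and the limit parameter is passed through unchanged with whatever encoding the version 2 function already uses:

extern "C" {
    fn ext_storage_clear_prefix_version_3(prefix: i64, limit: i64, removed_count_out: u32) -> i32;
}

// Assumed pointer-size packing: pointer in the low 32 bits, length in the high 32 bits.
fn pack_ptr_size(ptr: u32, len: u32) -> i64 { (ptr as i64) | ((len as i64) << 32) }

/// Clears keys under `prefix`; returns (keys_remaining, number_of_keys_removed).
/// `limit` is forwarded as-is, using the same encoding as the previous versions.
fn clear_prefix(prefix: &[u8], limit: i64) -> (bool, u64) {
    let mut removed = [0u8; 8]; // the host writes the removed count here, little endian
    let more = unsafe {
        ext_storage_clear_prefix_version_3(
            pack_ptr_size(prefix.as_ptr() as u32, prefix.len() as u32),
            limit,
            removed.as_mut_ptr() as u32,
        )
    };
    (more == 1, u64::from_le_bytes(removed))
}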

Note that there is an alternative proposal to add new host functions with the same names: https://github.com/w3f/PPPs/pull/7. This alternative doesn't conflict with this one except for the version number. One proposal or the other will have to use versions 4 and 5 rather than 3 and 4.

(func $ext_crypto_ed25519_sign_version_2
    (param $key_type_id i32) (param $key i32) (param $msg i64) (param $out i32) (return i32))
(func $ext_crypto_sr25519_sign_version_2
    (param $key_type_id i32) (param $key i32) (param $msg i64) (param $out i32) (return i32))
(func $ext_crypto_ecdsa_sign_version_2
    (param $key_type_id i32) (param $key i32) (param $msg i64) (param $out i32) (return i32))
(func $ext_crypto_ecdsa_sign_prehashed_version_2
    (param $key_type_id i32) (param $key i32) (param $msg i64) (param $out i32) (return i64))

The behaviour of these functions is identical to their version 1 counterparts. The new versions of these functions accept an out parameter containing the memory location where the host writes the signature. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine, even if the function wouldn't write anything to out. The signatures are always of a size known at compilation time. On success, these functions return 0. If the public key can't be found in the keystore, these functions return 1 and do not write anything to out.

Note that the return value is 0 on success and 1 on failure, while the previous version of these functions write 1 on success (as it represents a SCALE-encoded Some) and 0 on failure (as it represents a SCALE-encoded None). Returning 0 on success and non-zero on failure is consistent with common practices in the C programming language and is less surprising than the opposite.

(func $ext_crypto_secp256k1_ecdsa_recover_version_3
    (param $sig i32) (param $msg i32) (param $out i32) (return i64))
(func $ext_crypto_secp256k1_ecdsa_recover_compressed_version_3
    (param $sig i32) (param $msg i32) (param $out i32) (return i64))

The behaviour of these functions is identical to their version 2 counterparts. The new versions of these functions accept an out parameter containing the memory location where the host writes the output. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine, even if the function wouldn't write anything to out. The output is always of a size known at compilation time. On success, these functions return 0. On failure, these functions return a non-zero value and do not write anything to out.

The non-zero value written on failure is:

  • 1: incorrect value of R or S
  • 2: incorrect value of V
  • 3: invalid signature

These values are equal to the values returned on error by the version 2 (see https://spec.polkadot.network/chap-host-api#defn-ecdsa-verify-error), but incremented by 1 in order to reserve 0 for success.

(func $ext_crypto_ed25519_num_public_keys_version_1
    (param $key_type_id i32) (return i32))
(func $ext_crypto_ed25519_public_key_version_2
    (param $key_type_id i32) (param $key_index i32) (param $out i32))
(func $ext_crypto_sr25519_num_public_keys_version_1
    (param $key_type_id i32) (return i32))
(func $ext_crypto_sr25519_public_key_version_2
    (param $key_type_id i32) (param $key_index i32) (param $out i32))
(func $ext_crypto_ecdsa_num_public_keys_version_1
    (param $key_type_id i32) (return i32))
(func $ext_crypto_ecdsa_public_key_version_2
    (param $key_type_id i32) (param $key_index i32) (param $out i32))

These functions supersede the ext_crypto_ed25519_public_key_version_1, ext_crypto_sr25519_public_key_version_1, and ext_crypto_ecdsa_public_key_version_1 host functions.

Instead of calling ext_crypto_ed25519_public_key_version_1 in order to obtain the list of all keys at once, the runtime should instead call ext_crypto_ed25519_num_public_keys_version_1 in order to obtain the number of public keys available, then ext_crypto_ed25519_public_key_version_2 repeatedly. The ext_crypto_ed25519_public_key_version_2 function writes the public key of the given key_index to the memory location designated by out. The key_index must be between 0 (included) and n (excluded), where n is the value returned by ext_crypto_ed25519_num_public_keys_version_1. Execution must trap if key_index is out of range.
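
For illustration, enumerating ed25519 keys with the two new functions might look like this sketch; the extern declarations are assumptions:

extern "C" {
    fn ext_crypto_ed25519_num_public_keys_version_1(key_type_id: u32) -> u32;
    fn ext_crypto_ed25519_public_key_version_2(key_type_id: u32, key_index: u32, out: u32);
}

/// Collect every ed25519 public key of the given key type (sketch).
fn ed25519_public_keys(key_type_id: u32) -> Vec<[u8; 32]> {
    let n = unsafe { ext_crypto_ed25519_num_public_keys_version_1(key_type_id) };
    let mut keys = Vec::with_capacity(n as usize);
    for key_index in 0..n {
        let mut out = [0u8; 32]; // ed25519 public keys are 32 bytes
        unsafe {
            ext_crypto_ed25519_public_key_version_2(key_type_id, key_index, out.as_mut_ptr() as u32);
        }
        keys.push(out);
    }
    keys
}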

The same explanations apply for ext_crypto_sr25519_public_key_version_1 and ext_crypto_ecdsa_public_key_version_1.

Host implementers should be aware that the list of public keys (including their ordering) must not change while the runtime is running. This is most likely done by copying the list of all available keys either at the start of the execution or the first time the list is accessed.

(func $ext_offchain_http_request_start_version_2
  (param $method i64) (param $uri i64) (param $meta i64) (result i32))

The behaviour of this function is identical to its version 1 counterpart. Instead of allocating a buffer, writing the request identifier in it, and returning a pointer to it, the version 2 of this function simply returns the newly-assigned identifier to the HTTP request. On failure, this function returns -1. An identifier of -1 is invalid and is reserved to indicate failure.

(func $ext_offchain_http_request_write_body_version_2
  (param $method i64) (param $uri i64) (param $meta i64) (result i32))
(func $ext_offchain_http_response_read_body_version_2
  (param $request_id i32) (param $buffer i64) (param $deadline i64) (result i64))

The behaviour of these functions is identical to their version 1 counterpart. Instead of allocating a buffer, writing two bytes in it, and returning a pointer to it, the new version of these functions simply indicates what happened:

  • For ext_offchain_http_request_write_body_version_2, 0 on success.
  • For ext_offchain_http_response_read_body_version_2, 0 or a non-zero number of bytes on success.
  • -1 if the deadline was reached.
  • -2 if there was an I/O error while processing the request.
  • -3 if the identifier of the request is invalid.

These values are equal to the values returned on error by the version 1 (see https://spec.polkadot.network/chap-host-api#defn-http-error), but tweaked in order to reserve positive numbers for success.

When it comes to ext_offchain_http_response_read_body_version_2, the host implementers must not read too much data at once in order to not create ambiguity in the returned value. Given that the size of the buffer is always inferior or equal to 4 GiB, this is not a problem.

(func $ext_offchain_http_response_wait_version_2
    (param $ids i64) (param $deadline i64) (param $out i32))

The behaviour of this function is identical to its version 1 counterpart. Instead of allocating a buffer, writing the output to it, and returning a pointer to it, the new version of this function accepts an out parameter containing the memory location where the host writes the output. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine.

The encoding of the response code is also modified compared to its version 1 counterpart and each response code now encodes to 4 little endian bytes as described below:

  • 100-999: the request has finished with the given HTTP status code.
  • -1 if the deadline was reached.
  • -2 if there was an I/O error while processing the request.
  • -3 if the identifier of the request is invalid.

The buffer passed to out must always have a size of 4 * n where n is the number of elements in the ids.

(func $ext_offchain_http_response_header_name_version_1
    (param $request_id i32) (param $header_index i32) (param $out i64) (result i64))
(func $ext_offchain_http_response_header_value_version_1
    (param $request_id i32) (param $header_index i32) (param $out i64) (result i64))

These functions supersede the ext_offchain_http_response_headers_version_1 host function.

Contrary to ext_offchain_http_response_headers_version_1, only one header indicated by header_index can be read at a time. Instead of calling ext_offchain_http_response_headers_version_1 once, the runtime should call ext_offchain_http_response_header_name_version_1 and ext_offchain_http_response_header_value_version_1 multiple times with an increasing header_index, until a value of -1 is returned.

These functions accept an out parameter containing a pointer-size to the memory location where the header name or value should be written. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine, even if the function wouldn't write anything to out.

These functions return the size, in bytes, of the header name or header value. If the request doesn't exist or is in an invalid state (as documented for ext_offchain_http_response_headers_version_1) or the header_index is out of range, a value of -1 is returned. Given that the host must never write more bytes than the size of the buffer in out, and that the size of this buffer is expressed as a 32-bit number, a 64-bit value of -1 is not ambiguous.

If the buffer in out is too small to fit the entire header name or value, only the bytes that fit are written and the rest are discarded.

(func $ext_offchain_submit_transaction_version_2
    (param $data i64) (return i32))
(func $ext_offchain_http_request_add_header_version_2
    (param $request_id i32) (param $name i64) (param $value i64) (result i32))

Instead of allocating a buffer, writing 1 or 0 in it, and returning a pointer to it, the version 2 of these functions return 0 or 1, where 0 indicates success and 1 indicates failure. The runtime must interpret any non-0 value as failure, but the client must always return 1 in case of failure.

(func $ext_offchain_local_storage_read_version_1
    (param $kind i32) (param $key i64) (param $value_out i64) (param $offset i32) (result i64))

This function supersedes the ext_offchain_local_storage_get_version_1 host function, and uses an API and logic similar to ext_storage_read_version_2.

It reads the offchain local storage key indicated by kind and key starting at the byte indicated by offset, and writes the value to the pointer-size indicated by value_out.

The function returns the number of bytes that were written in the value_out buffer. If the entry doesn't exist, a value of -1 is returned. Given that the host must never write more bytes than the size of the buffer in value_out, and that the size of this buffer is expressed as a 32-bit number, a 64-bit value of -1 is not ambiguous.

The runtime execution stops with an error if value_out is outside of the range of the memory of the virtual machine, even if the size of the buffer is 0 or if the amount of data to write would be 0 bytes.

(func $ext_offchain_network_peer_id_version_1
    (param $out i64))

This function writes the PeerId of the local node to the memory location indicated by out. A PeerId is always 38 bytes long. The runtime execution stops with an error if out is outside of the range of the memory of the virtual machine.

(func $ext_input_size_version_1
    (return i64))
(func $ext_input_read_version_1
    (param $offset i64) (param $out i64))

When a runtime function is called, the host uses the allocator to allocate memory within the runtime where to write some input data. These two new host functions provide an alternative way to access the input that doesn't make use of the allocator.

The ext_input_size_version_1 host function returns the size in bytes of the input data.

The ext_input_read_version_1 host function copies some data from the input data to the memory of the runtime. The offset parameter indicates the offset within the input data where to start copying, and must be inferior or equal to the value returned by ext_input_size_version_1. The out parameter is a pointer-size containing the buffer where to write to. The runtime execution stops with an error if offset is strictly superior to the size of the input data, or if out is outside of the range of the memory of the virtual machine, even if the amount of data to copy would be 0 bytes.
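
A sketch of reading the whole input without the host allocator; the names and the packing helper are illustrative assumptions:

extern "C" {
    fn ext_input_size_version_1() -> i64;
    fn ext_input_read_version_1(offset: i64, out: i64);
}

// Assumed pointer-size packing: pointer in the low 32 bits, length in the high 32 bits.
fn pack_ptr_size(ptr: u32, len: u32) -> i64 { (ptr as i64) | ((len as i64) << 32) }

/// Copy the whole call input into a runtime-allocated buffer.
fn read_input() -> Vec<u8> {
    let size = unsafe { ext_input_size_version_1() } as usize;
    let mut input = vec![0u8; size];
    unsafe {
        ext_input_read_version_1(
            0, // start copying at the beginning of the input data
            pack_ptr_size(input.as_mut_ptr() as u32, input.len() as u32),
        );
    }
    input
}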

Other changes

In addition to the new host functions, this RFC proposes two changes to the runtime-host interface:

  • The following function signature is now also accepted for runtime entry points: (func (result i64)).
  • Runtimes no longer need to expose a constant named __heap_base.

All the host functions that are being superseded by new host functions are now considered deprecated and should no longer be used. The following other host functions are similarly also considered deprecated:

  • ext_storage_get_version_1
  • ext_default_child_storage_get_version_1
  • ext_allocator_malloc_version_1
  • ext_allocator_free_version_1
  • ext_offchain_network_state_version_1

Drawbacks

This RFC might be difficult to implement in Substrate due to the internal code design. It is not clear to the author of this RFC how difficult it would be.

Prior Art

The API of these new functions was heavily inspired by the APIs used in the C programming language.

Unresolved Questions

The changes in this RFC would need to be benchmarked. This involves implementing the RFC and measuring the speed difference.

It is expected that most host functions are faster or equal speed to their deprecated counterparts, with the following exceptions:

  • ext_input_size_version_1/ext_input_read_version_1 is inherently slower than obtaining a buffer with the entire data due to the two extra function calls and the extra copying. However, given that this only happens once per runtime call, the cost is expected to be negligible.

  • The ext_crypto_*_public_keys, ext_offchain_network_state, and ext_offchain_http_* host functions are likely slightly slower than their deprecated counterparts, but given that they are used only in offchain workers this is acceptable.

  • It is unclear how replacing ext_storage_get with ext_storage_read and ext_default_child_storage_get with ext_default_child_storage_read will impact performances.

  • It is unclear how the changes to ext_storage_next_key and ext_default_child_storage_next_key will impact performances.

Future Possibilities

After this RFC, we can, in a future version, remove the allocator from the host source code altogether by removing support for all the deprecated host functions. This would remove the possibility of synchronizing older blocks, which is probably controversial and requires some preparation that is out of scope of this RFC.

(source)

Table of Contents

RFC-0006: Dynamic Pricing for Bulk Coretime Sales

Start DateJuly 09, 2023
DescriptionA dynamic pricing model to adapt the regular price for bulk coretime sales
AuthorsTommi Enenkel (Alice und Bob)
LicenseMIT

Summary

This RFC proposes a dynamic pricing model for the sale of Bulk Coretime on the Polkadot UC. The proposed model updates the regular price of cores for each sale period, by taking into account the number of cores sold in the previous sale, as well as a limit of cores and a target number of cores sold. It ensures a minimum price and limits price growth to a maximum price increase factor, while also giving governance control over the steepness of the price change curve. It allows governance to address challenges arising from changing market conditions and should offer predictable and controlled price adjustments.

Accompanying visualizations are provided at [1].

Motivation

RFC-1 proposes periodic Bulk Coretime Sales as a mechanism to sell continuous regions of blockspace (suggested to be 4 weeks in length). A number of Blockspace Regions (compare RFC-1 & RFC-3) are provided for sale to the Broker-Chain each period and shall be sold in a way that provides value-capture for the Polkadot network. The exact pricing mechanism is out of scope for RFC-1 and shall be provided by this RFC.

A dynamic pricing model is needed. A limited number of Regions are offered for sale each period. The model needs to find the price for a period based on supply and demand of the previous period.

The model shall give Coretime consumers predictability about upcoming price developments and confidence that Polkadot governance can adapt the pricing model to changing market conditions.

Requirements

  1. The solution SHOULD provide a dynamic pricing model that increases price with growing demand and reduces price with shrinking demand.
  2. The solution SHOULD have a slow rate of change for price if the number of Regions sold is close to a given sales target and increase the rate of change as the number of sales deviates from the target.
  3. The solution SHOULD provide the possibility to always have a minimum price per Region.
  4. The solution SHOULD provide a maximum factor of price increase should the limit of Regions sold per period be reached.
  5. The solution SHOULD allow governance to control the steepness of the price function.

Stakeholders

The primary stakeholders of this RFC are:

  • Protocol researchers and developers
  • Polkadot DOT token holders
  • Polkadot parachains teams
  • Brokers involved in the trade of Bulk Coretime

Explanation

Overview

The dynamic pricing model sets the new price based on supply and demand in the previous period. The model is a function of the number of Regions sold, piecewise-defined by two power functions.

  • The left side ranges from 0 to the target. It represents situations where demand was lower than the target.
  • The right side ranges from the target to the limit. It represents situations where demand was higher than the target.

The curve of the function forms a plateau around the target and then falls off to the left and rises up to the right. The shape of the plateau can be controlled via a scale factor for the left side and right side of the function respectively.

Parameters

From here on, we will also refer to Regions sold as 'cores' to stay congruent with RFC-1.

Name                      | Suggested Value | Description                                      | Constraints
BULK_LIMIT                | 45              | The maximum number of cores being sold           | 0 < BULK_LIMIT
BULK_TARGET               | 30              | The target number of cores being sold            | 0 < BULK_TARGET <= BULK_LIMIT
MIN_PRICE                 | 1               | The minimum price a core will always cost        | 0 < MIN_PRICE
MAX_PRICE_INCREASE_FACTOR | 2               | The maximum factor by which the price can change | 1 < MAX_PRICE_INCREASE_FACTOR
SCALE_DOWN                | 2               | The steepness of the left side of the function   | 0 < SCALE_DOWN
SCALE_UP                  | 2               | The steepness of the right side of the function  | 0 < SCALE_UP

Function

P(n) = \begin{cases} 
    (P_{\text{old}} - P_{\text{min}}) \left(1 - \left(\frac{T - n}{T}\right)^d\right) + P_{\text{min}} & \text{if } n \leq T \\
    ((F - 1) \cdot P_{\text{old}} \cdot \left(\frac{n - T}{L - T}\right)^u) + P_{\text{old}} & \text{if } n > T 
\end{cases}
  • $P_{\text{old}}$ is the old_price, the price of a core in the previous period.
  • $P_{\text{min}}$ is the MIN_PRICE, the minimum price a core will always cost.
  • $F$ is the MAX_PRICE_INCREASE_FACTOR, the factor by which the price maximally can change from one period to another.
  • $d$ is the SCALE_DOWN, the steepness of the left side of the function.
  • $u$ is the SCALE_UP, the steepness of the right side of the function.
  • $T$ is the BULK_TARGET, the target number of cores being sold.
  • $L$ is the BULK_LIMIT, the maximum number of cores being sold.
  • $n$ is cores_sold, the number of cores being sold.

Left side

The left side is a power function that describes an increasing concave downward curvature that approaches old_price. We realize this by using the form $y = a(1 - x^d)$, usually used as a downward sloping curve, but in our case flipped horizontally by letting the argument $x = \frac{T-n}{T}$ decrease with $n$, doubly inversing the curve.

This approach is chosen over a decaying exponential because it lets us better control the shape of the plateau, especially allowing us to get a straight line by setting SCALE_DOWN to $1$.

Right side

The right side is a power function of the form $y = a(x^u)$.

Pseudo-code

NEW_PRICE := IF CORES_SOLD <= BULK_TARGET THEN
    (OLD_PRICE - MIN_PRICE) * (1 - ((BULK_TARGET - CORES_SOLD)^SCALE_DOWN / BULK_TARGET^SCALE_DOWN)) + MIN_PRICE
ELSE
    ((MAX_PRICE_INCREASE_FACTOR - 1) * OLD_PRICE * ((CORES_SOLD - BULK_TARGET)^SCALE_UP / (BULK_LIMIT - BULK_TARGET)^SCALE_UP)) + OLD_PRICE
END IF
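
For illustration, a direct Rust transcription of this pseudo-code, using floating point purely as a sketch (an implementation would presumably use fixed-point arithmetic):

fn new_price(
    old_price: f64, min_price: f64, max_price_increase_factor: f64,
    scale_down: f64, scale_up: f64,
    bulk_target: f64, bulk_limit: f64, cores_sold: f64,
) -> f64 {
    if cores_sold <= bulk_target {
        // Left side: concave curve rising from MIN_PRICE towards OLD_PRICE at the target.
        (old_price - min_price)
            * (1.0 - ((bulk_target - cores_sold) / bulk_target).powf(scale_down))
            + min_price
    } else {
        // Right side: power curve rising from OLD_PRICE towards F * OLD_PRICE at the limit.
        (max_price_increase_factor - 1.0) * old_price
            * ((cores_sold - bulk_target) / (bulk_limit - bulk_target)).powf(scale_up)
            + old_price
    }
}

// With the baseline parameters given below (OLD_PRICE = 1000): selling 0 cores gives 1
// (MIN_PRICE), 15 cores gives 750.25, 30 cores (the target) gives 1000, and 45 cores
// (the limit) gives 2000.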

Properties of the Curve

Minimum Price

We introduce MIN_PRICE to control the minimum price.

The left side of the function shall be allowed to come close to 0 if cores sold approaches 0. The rationale is that if there are actually 0 cores sold, the previous sale price was too high and the price needs to adapt quickly.

Price forms a plateau around the target

If the number of cores is close to BULK_TARGET, less extreme price changes might be sensible. This ensures that a drop in sold cores or an increase doesn’t lead to immediate price changes, but rather slowly adapts. Only if more extreme changes in the number of sold cores occur, does the price slope increase.

We introduce SCALE_DOWN and SCALE_UP to control for the steepness of the left and the right side of the function respectively.

Max price increase factor

We introduce MAX_PRICE_INCREASE_FACTOR as the factor that controls how much the price may increase from one period to another.

Introducing this variable gives governance an additional control lever and avoids the necessity for a future runtime upgrade.

Example Configurations

Baseline

This example proposes the baseline parameters. If not mentioned otherwise, other examples use these values.

The minimum price of a core is 1 DOT, the price can double every 4 weeks. Price change around BULK_TARGET is dampened slightly.

BULK_TARGET = 30
BULK_LIMIT = 45
MIN_PRICE = 1
MAX_PRICE_INCREASE_FACTOR = 2
SCALE_DOWN = 2
SCALE_UP = 2
OLD_PRICE = 1000

More aggressive pricing

We might want to have a more aggressive price growth, allowing the price to triple every 4 weeks and have a linear increase in price on the right side.

BULK_TARGET = 30
BULK_LIMIT = 45
MIN_PRICE = 1
MAX_PRICE_INCREASE_FACTOR = 3
SCALE_DOWN = 2
SCALE_UP = 1
OLD_PRICE = 1000

Conservative pricing to ensure quick corrections in an affluent market

If governance considers the risk that a sudden surge in the DOT price might price chains out of bulk coretime markets, it can ensure the model reacts quickly to a sudden drop in demand by setting 0 < SCALE_DOWN < 1 and choosing the max price increase factor more conservatively.

BULK_TARGET = 30
BULK_LIMIT = 45
MIN_PRICE = 1
MAX_PRICE_INCREASE_FACTOR = 1.5
SCALE_DOWN = 0.5
SCALE_UP = 2
OLD_PRICE = 1000

Linear pricing

By setting the scaling factors to 1 and potentially adapting the max price increase, we can achieve a linear function.

BULK_TARGET = 30
BULK_LIMIT = 45
MIN_PRICE = 1
MAX_PRICE_INCREASE_FACTOR = 1.5
SCALE_DOWN = 1
SCALE_UP = 1
OLD_PRICE = 1000

Drawbacks

None at present.

Prior Art and References

This pricing model is based on the requirements from the basic linear solution proposed in RFC-1, which is a simple dynamic pricing model intended only as a proof of concept. The present model adds additional considerations to make it more adaptable under real conditions.

Future Possibilities

This RFC, if accepted, shall be implemented in conjunction with RFC-1.

References

(source)

Table of Contents

RFC-0009: Improved light client requests networking protocol

Start Date2023-07-19
DescriptionModify the networking storage read requests to solve some problems with the existing one
AuthorsPierre Krieger

Summary

Improve the networking messages that query storage items from the remote, in order to reduce the bandwidth usage and number of round trips of light clients.

Motivation

Clients on the Polkadot peer-to-peer network can be divided into two categories: full nodes and light clients. So-called full nodes are nodes that store the content of the chain locally on their disk, while light clients are nodes that don't. In order to access for example the balance of an account, a full node can do a disk read, while a light client needs to send a network message to a full node and wait for the full node to reply with the desired value. This reply is in the form of a Merkle proof, which makes it possible for the light client to verify the exactness of the value.

Unfortunately, this network protocol is suffering from some issues:

  • It is not possible for the querier to check whether a key exists in the storage of the chain except by querying the value of that key. The reply will thus include the value of the key, only for that value to be discarded by the querier that isn't interested in it. This is a waste of bandwidth.
  • It is not possible for the querier to know whether a value in the storage of the chain has been modified between two blocks except by querying this value for both blocks and comparing them. Only a few storage values get modified in a block, and thus most of the time the comparison will be equal. This leads to a waste of bandwidth as the values have to be transferred.
  • While it is possible to ask for multiple specific storage keys at the same time, it is not possible to ask for a list of keys that start with a certain prefix. Due to the way FRAME works, storage keys are grouped by "prefix", for example all account balances start with the same prefix. It is thus a common necessity for a light client to obtain the list of all keys (and possibly their values) that start with a specific prefix. This is currently not possible except by performing multiple queries serially that "walk down" the trie.

Once Polkadot and Kusama have transitioned to state_version = 1, which modifies the format of the trie entries, it will be possible to generate Merkle proofs that contain only the hashes of values in the storage. Thanks to this, it is already possible to prove the existence of a key without sending its entire value (only its hash), or to prove that a value has changed or not between two blocks (by sending just their hashes). Thus, the only reason why the aforementioned issues exist is that the existing networking messages don't give the possibility for the querier to query this. This is what this proposal aims at fixing.

Stakeholders

This is the continuation of https://github.com/w3f/PPPs/pull/10, which itself is the continuation of https://github.com/w3f/PPPs/pull/5.

Explanation

The protobuf schema of the networking protocol can be found here: https://github.com/paritytech/substrate/blob/5b6519a7ff4a2d3cc424d78bc4830688f3b184c0/client/network/light/src/schema/light.v1.proto

The proposal is to modify this protocol in this way:

@@ -11,6 +11,7 @@ message Request {
                RemoteReadRequest remote_read_request = 2;
                RemoteReadChildRequest remote_read_child_request = 4;
                // Note: ids 3 and 5 were used in the past. It would be preferable to not re-use them.
+               RemoteReadRequestV2 remote_read_request_v2 = 6;
        }
 }
 
@@ -48,6 +49,21 @@ message RemoteReadRequest {
        repeated bytes keys = 3;
 }
 
+message RemoteReadRequestV2 {
+       required bytes block = 1;
+       optional ChildTrieInfo child_trie_info = 2;  // Read from the main trie if missing.
+       repeated Key keys = 3;
+       optional bytes onlyKeysAfter = 4;
+       optional bool onlyKeysAfterIgnoreLastNibble = 5;
+}
+
+message ChildTrieInfo {
+       enum ChildTrieNamespace {
+               DEFAULT = 1;
+       }
+
+       required bytes hash = 1;
+       required ChildTrieNamespace namespace = 2;
+}
+
 // Remote read response.
 message RemoteReadResponse {
        // Read proof. If missing, indicates that the remote couldn't answer, for example because
@@ -65,3 +81,8 @@ message RemoteReadChildRequest {
        // Storage keys.
        repeated bytes keys = 6;
 }
+
+message Key {
+       required bytes key = 1;
+       optional bool skipValue = 2; // Defaults to `false` if missing
+       optional bool includeDescendants = 3; // Defaults to `false` if missing
+}

Note that the field names aren't very important as they are not sent over the wire. They can be changed at any time without any consequence. I would invite people to not discuss these field names as they are implementation details.

This diff adds a new type of request (RemoteReadRequestV2).

The new child_trie_info field in the request makes it possible to specify which trie is concerned by the request. The current networking protocol uses two different structs (RemoteReadRequest and RemoteReadChildRequest) for main trie and child trie queries, while this new request would make it possible to query either. This change doesn't fix any of the issues mentioned in the previous section, but is a side change that has been done for simplicity. An alternative could have been to specify the child_trie_info for each individual Key. However this would make it necessary to send the child trie hash many times over the network, which leads to a waste of bandwidth, and in my opinion makes things more complicated for no actual gain. If a querier would like to access more than one trie at the same time, it is always possible to send one query per trie.

If skipValue is true for a Key, then the value associated with this key isn't important to the querier, and the replier is encouraged to replace the value with its hash provided that the storage item has a state_version equal to 1. If the storage value has a state_version equal to 0, then the optimization isn't possible and the replier should behave as if skipValue was false.

If includeDescendants is true for a Key, then the replier must also include in the proof all keys that are descendant of the given key (in other words, its children, children of children, children of children of children, etc.). It must do so even if key itself doesn't have any storage value associated to it. The values of all of these descendants are replaced with their hashes if skipValue is true, similarly to key itself.

The optional onlyKeysAfter and onlyKeysAfterIgnoreLastNibble fields can provide a lower bound for the keys contained in the proof. The responder must not include in its proof any node whose key is strictly inferior to the value in onlyKeysAfter. If onlyKeysAfterIgnoreLastNibble is provided, then the last 4 bits for onlyKeysAfter must be ignored. This makes it possible to represent a trie branch node that doesn't have an even number of nibbles. If no onlyKeysAfter is provided, it is equivalent to being empty, meaning that the response must start with the root node of the trie.

If onlyKeysAfterIgnoreLastNibble is missing, it is equivalent to false. If onlyKeysAfterIgnoreLastNibble is true and onlyKeysAfter is missing or empty, then the request is invalid.

For the purpose of this networking protocol, it should be considered as if the main trie contained an entry for each default child trie whose key is concat(":child_storage:default:", child_trie_hash) and whose value is equal to the trie root hash of that default child trie. This behavior is consistent with what the host functions observe when querying the storage. This behavior is present in the existing networking protocol, in other words this proposal doesn't change anything to the situation, but it is worth mentioning. Also note that child tries aren't considered as descendants of the main trie when it comes to the includeDescendants flag. In other words, if the request concerns the main trie, no content coming from child tries is ever sent back.
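
To make the request shape concrete, here is an illustrative hand-written Rust mirror of the proposed messages (not generated from the schema; field and type names are assumptions), building a query for every key under a prefix with values replaced by their hashes:

struct ChildTrieInfo { hash: Vec<u8>, namespace: u32 } // namespace 1 = DEFAULT
struct Key { key: Vec<u8>, skip_value: Option<bool>, include_descendants: Option<bool> }
struct RemoteReadRequestV2 {
    block: Vec<u8>,
    child_trie_info: Option<ChildTrieInfo>, // None = read from the main trie
    keys: Vec<Key>,
    only_keys_after: Option<Vec<u8>>,
    only_keys_after_ignore_last_nibble: Option<bool>,
}

fn example_request(block_hash: [u8; 32], prefix: Vec<u8>) -> RemoteReadRequestV2 {
    RemoteReadRequestV2 {
        block: block_hash.to_vec(),
        child_trie_info: None,
        keys: vec![Key {
            key: prefix,
            skip_value: Some(true),          // we only need hashes, not the values
            include_descendants: Some(true), // include every key below the prefix
        }],
        only_keys_after: None, // start from the trie root; set this to resume a truncated reply
        only_keys_after_ignore_last_nibble: None,
    }
}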

This protocol keeps the same maximum response size limit as currently exists (16 MiB). It is not possible for the querier to know in advance whether its query will lead to a reply that exceeds the maximum size. If the reply is too large, the replier should send back only a limited number (but at least one) of requested items in the proof. The querier should then send additional requests for the rest of the items. A response containing none of the requested items is invalid.

The server is allowed to silently discard some keys of the request if it judges that the number of requested keys is too high. This is in line with the fact that the server might truncate the response.

Drawbacks

This proposal doesn't handle one specific situation: what if a proof containing a single specific item would exceed the response size limit? For example, if the response size limit was 1 MiB, querying the runtime code (which is typically 1.0 to 1.5 MiB) would be impossible as it's impossible to generate a proof less than 1 MiB. The response size limit is currently 16 MiB, meaning that no single storage item must exceed 16 MiB.

Unfortunately, because it's impossible to verify a Merkle proof before having received it entirely, parsing the proof in a streaming way is also not possible.

A way to solve this issue would be to Merkle-ize large storage items, so that a proof could include only a portion of a large storage item. Since this would require a change to the trie format, it is not realistically feasible in a short time frame.

Testing, Security, and Privacy

The main security consideration concerns the size of replies and the resources necessary to generate them. It is for example easily possible to ask for all keys and values of the chain, which would take a very long time to generate. Since responses to this networking protocol have a maximum size, the replier should truncate proofs that would lead to the response being too large. Note that it is already possible to send a query that would lead to a very large reply with the existing network protocol. The only thing that this proposal changes is that it would make it less complicated to perform such an attack.

Implementers of the replier side should be careful to detect early on when a reply would exceed the maximum reply size, rather than unconditionally generating a reply, as this could take a very large amount of CPU, disk I/O, and memory. Existing implementations might currently be accidentally protected from such an attack thanks to the fact that requests have a maximum size, and thus that the list of keys in the query is bounded. After this proposal, this accidental protection would no longer exist.

Malicious server nodes might truncate Merkle proofs even when they don't strictly need to, and it is not possible for the client to (easily) detect this situation. However, malicious server nodes can already do undesirable things such as throttle down their upload bandwidth or simply not respond. There is no need to handle unnecessarily truncated Merkle proofs any differently than a server simply not answering the request.

Performance, Ergonomics, and Compatibility

Performance

It is unclear to the author of the RFC what the performance implications are. Servers are supposed to have limits to the amount of resources they use to respond to requests, and as such the worst that can happen is that light client requests become a bit slower than they currently are.

Ergonomics

Irrelevant.

Compatibility

The prior networking protocol is maintained for now. The older version of this protocol could be removed in the distant future.

Prior Art and References

None. This RFC is a clean-up of an existing mechanism.

Unresolved Questions

None

Future Possibilities

The current networking protocol could be deprecated in the distant future. Additionally, the current "state requests" protocol (used for warp syncing) could also be deprecated in favor of this one.

(source)

Table of Contents

RFC-0015: Market Design Revisit

Start Date05.08.2023
DescriptionThis RFC refines the previously proposed mechanisms involving the various Coretime markets and presents an integrated framework for harmonious interaction between all markets.
AuthorsJonas Gehrlein

Summary

This document is a proposal for restructuring the bulk markets in the Polkadot UC's coretime allocation system to improve efficiency and fairness. The proposal suggests separating the BULK_PERIOD into MARKET_PERIOD and RENEWAL_PERIOD, allowing for a market-driven price discovery through a clearing price Dutch auction during the MARKET_PERIOD followed by renewal offers at the MARKET_PRICE during the RENEWAL_PERIOD. The new system ensures synchronicity between renewal and market prices, fairness among all current tenants, and efficient price discovery, while preserving price caps to provide security for current tenants. It seeks to start a discussion about the possibility of long-term leases.

Motivation

While the initial RFC-1 has provided a robust framework for Coretime allocation within the Polkadot UC, this proposal builds upon its strengths and uses many provided building blocks to address some areas that could be further improved.

In particular, this proposal introduces the following changes:

  • It introduces a RESERVE_PRICE that anchors all markets, promoting price synchronicity within the Bulk markets (flexible + renewals).
    • This reduces complexity.
    • This makes sure all consumers pay a closely correlated price for coretime within a BULK_PERIOD.
  • It reverses the order of the market and renewal phase.
    • This allows fine-tuning the price through market forces.
  • It exposes renewal prices more to market forces, while still being beneficial for long-term tenants.
  • It removes the LeadIn period and introduces a (from the perspective of the coretime system chain) passive Settlement Phase, which allows the secondary market to exert its force.

The premise of this proposal is to reduce complexity by introducing a common price (that develops relative to the capacity consumption of Polkadot UC), while still allowing for market forces to add efficiency. Long-term lease owners still receive priority IF they can pay (close to) the market price. This prevents a situation where the renewal price significantly diverges from the market price, which would allow for core capture. While maximum price increase certainty might seem contradictory to efficient price discovery, the proposed model aims to balance these elements, utilizing market forces to determine the price and allocate cores effectively within certain bounds. It must be stated that potential price increases remain predictable (in the worst case) but could be higher than in the originally proposed design. The argument remains, however, that we need to allow market forces to affect all prices for efficient Coretime pricing and allocation.

Ultimately, the framework proposed here adheres to all requirements stated in RFC-1.

Stakeholders

Primary stakeholder sets are:

  • Protocol researchers and developers, largely represented by the Polkadot Fellowship and Parity Technologies' Engineering division.
  • Polkadot Parachain teams both present and future, and their users.
  • Polkadot DOT token holders.

Explanation

Bulk Markets

The BULK_PERIOD has been restructured into two primary segments: the MARKET_PERIOD and RENEWAL_PERIOD, along with an auxiliary SETTLEMENT_PERIOD. This latter period doesn't necessitate any actions from the coretime system chain, but it facilitates a more efficient allocation of coretime in secondary markets. A significant departure from the original proposal lies in the timing of renewals, which now occur post-market phase. This adjustment aims to harmonize renewal prices with their market counterparts, ensuring a more consistent and equitable pricing model.

Market Period (14 days)

During the market period, core sales are conducted through a well-established clearing price Dutch auction that features a RESERVE_PRICE. The price initiates at a premium, designated as PRICE_PREMIUM (for instance, 30%) and descends linearly to the RESERVE_PRICE throughout the duration of the MARKET_PERIOD. Each bidder is expected to submit both their desired price and the quantity (that is, the amount of Coretime) they wish to purchase. To secure these acquisitions, bidders must make a deposit equivalent to their bid multiplied by the chosen quantity, in DOT.

The market achieves resolution once all quantities have been sold or the RESERVE_PRICE has been reached. The MARKET_PRICE is then determined either by the lowest bid that was successful in clearing the entire market or by the RESERVE_PRICE. This mechanism yields a uniform price, shaped by market forces: all buyers pay the same price (per unit of Coretime). The benefits of this variant of a Dutch auction are discussed further below.

Note: In cases where some cores remain unsold in the market, all buyers are obligated to pay the RESERVE_PRICE.
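
For illustration, the resolution logic could be sketched as follows. This is not a proposed implementation; Bid, cores_for_sale and the integer types are purely illustrative:

// Sketch: determine the uniform MARKET_PRICE from the collected bids.
struct Bid {
    price: u128,   // bid price per unit of Coretime
    quantity: u32, // amount of Coretime requested
}

fn market_price(mut bids: Vec<Bid>, cores_for_sale: u32, reserve_price: u128) -> u128 {
    // Highest bids are served first.
    bids.sort_by(|a, b| b.price.cmp(&a.price));
    let mut remaining = cores_for_sale;
    let mut lowest_winning = None;
    for bid in &bids {
        if remaining == 0 {
            break;
        }
        remaining -= bid.quantity.min(remaining);
        lowest_winning = Some(bid.price);
    }
    match lowest_winning {
        // Everyone pays the lowest bid that still cleared the entire market
        // (bids cannot be below the reserve in the descending auction).
        Some(p) if remaining == 0 => p.max(reserve_price),
        // If some cores remained unsold, all buyers pay the RESERVE_PRICE.
        _ => reserve_price,
    }
}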

Renewal Period (7 days)

As the RENEWAL_PERIOD commences, all current tenants are granted the opportunity to renew their cores at a slight discount of MARKET_PRICE * RENEWAL_DISCOUNT (for instance, 10%). This provision affords marginal benefits to existing tenants, balancing out the non-transferability aspect of renewals.

At the end of the period, all available cores are allocated to the current tenants who have opted for renewal and the participants who placed bids during the market period. If the demand for cores exceeds supply, the cores left unclaimed from renewals may be awarded to bidders who placed their bids early in the auction, thereby subtly incentivizing early participation. If the supply exceeds the demand, all unsold cores are transferred to the Instantaneous Market.

Reserve Price Adjustment

After all cores are allocated, the RESERVE_PRICE is adjusted following the process described in RFC-1 and serves as baseline price in the next BULK_PERIOD.

Note: The particular price curve is outside the scope of the proposal. The MARKET_PRICE (as a function of RESERVE_PRICE), however, is able to capture higher demand very well while being capped downwards. That means, the curve that adjusts the RESERVE_PRICE should be more sensitive to undercapacity.

Price Predictability

Tasks that are in the "renewal pipeline" can determine the upper bound for the price they will pay in any future period. The main driver of any price increase over time is the adjustment of the RESERVE_PRICE, which occurs at the end of each BULK_PERIOD after determining the capacity utilisation of Polkadot UC. To calculate the maximum price in some future period, a task could assume maximum capacity in all upcoming periods and track the resulting increase of the RESERVE_PRICE. In the final period, that price can carry at most a premium of PRICE_PREMIUM; after deducting a potential RENEWAL_DISCOUNT, the maximum price can be determined.
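
As an illustrative calculation of that bound (floating-point numbers and the max_increase_per_period parameter are simplifications for the sketch, not part of the proposal):

// Sketch: worst-case renewal price a task could face `periods_ahead` BULK_PERIODs from now.
fn max_future_price(
    current_reserve_price: f64,
    max_increase_per_period: f64, // largest factor the adjustment curve can apply under full capacity
    periods_ahead: u32,
    price_premium: f64,           // e.g. 0.30
    renewal_discount: f64,        // e.g. 0.10
) -> f64 {
    // Assume maximum capacity in every upcoming period.
    let worst_reserve = current_reserve_price * max_increase_per_period.powi(periods_ahead as i32);
    // In the final period the MARKET_PRICE is at most RESERVE_PRICE * (1 + PRICE_PREMIUM),
    // and renewals are granted a RENEWAL_DISCOUNT on that.
    worst_reserve * (1.0 + price_premium) * (1.0 - renewal_discount)
}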

Settlement Period (7 days)

During the settlement period, participants have ample time to trade Coretime on secondary markets before the onset of the next BULK_PERIOD. This allows for trading with full Coretime availability. Trading transferrable Coretime naturally continues during each BULK_PERIOD, albeit with cores already in use.

Benefits of this system

  • The introduction of a single price, the RESERVE_PRICE, provides an anchor for all Coretime markets. This is a preventative measure against the possible divergence and mismatch of prices, which could inadvertently lead to a situation where existing tenants secure cores at significantly below-market rates.
  • With a more market-responsive pricing system, we can achieve a more efficient price discovery process. Any price increases will be less arbitrary and more dynamic.
  • The ideal strategy for existing tenants is to maintain passivity, i.e., refrain from active market participation and simply accept the offer presented to them during the renewal phase. This approach lessens the organizational overhead for long-term projects.
  • In the two-week market phase, the maximum price increase is known well in advance, providing ample time for tenants to secure necessary funds to meet the potential price escalation.
  • All existing tenants pay an equal amount for Coretime, reflecting our intent to price the Coretime itself and not the relative timing of individual projects.

Discussion: Clearing Price Dutch Auctions

Having all bidders pay the market clearing price has both advantages and disadvantages.

  • Advantages:
    • Fairness: All bidders pay the same price.
    • Active participation: Because bidders are protected from overbidding (winner's curse), they are more likely to engage and reveal their true valuations.
    • Simplicity: A single price is easier to work with for pricing renewals later.
    • Truthfulness: There is no need to try to game the market by delaying bids; bidders can simply bid their true valuations.
  • Disadvantages:
    • (Potentially) Lower Revenue: While the theory predicts revenue-equivalence between a uniform price and pay-as-bid type of auction, slightly lower revenue for the former type is observed empirically. Arguably, revenue maximization (i.e., squeezing out the maximum willingness to pay from bidders) is not the priority for Polkadot UC. Instead, it is interested in efficient allocation and the other benefits illustrated above.
    • (Technical) Complexity: Instead of making a final purchase within the auction, the bid is only a deposit. Some refunds might happen after the auction is finished. This might pose additional challenges from the technical side (e.g., storage requirements).

Further Discussion Points

  • Long-term Coretime: The Polkadot UC is undergoing a transition from two-year leases without an instantaneous market to a model encompassing instantaneous and one-month leases. This shift seems to pivot from one extreme to another. While the introduction of short-term leases, both instantaneous and for one month, is a constructive move to lower barriers to entry and promote experimentation, it seems to be the case that established projects might benefit from more extended lease options. We could consider offering another product, such as a six-month Coretime lease, using the same mechanism described herein. Although the majority of leases would still be sold on a one-month basis, the addition of this option would enhance market efficiency as it would strengthen the impact of a secondary market.

Drawbacks

There are trade-offs that arise from this proposal compared to the initial model. The most notable one is that here, I prioritize requirement 6 over requirement 2. The price, in the very "worst case" (meaning a huge explosion in demand for Coretime), could lead to a much larger increase in Coretime prices. From an economic perspective, this (rare edge case) would also mean that we'd vastly underprice Coretime in the original model, leading to highly inefficient allocations.

Prior Art and References

This RFC builds extensively on the available ideas put forward in RFC-1.

Additionally, I want to express a special thanks to Samuel Haefner and Shahar Dobzinski for fruitful discussions and helping me structure my thoughts.

Unresolved Questions

The technical feasibility needs to be assessed.

(source)

Table of Contents

RFC-34: XCM Absolute Location Account Derivation

Start Date05 October 2023
DescriptionXCM Absolute Location Account Derivation
AuthorsGabriel Facco de Arruda

Summary

This RFC proposes changes that enable the use of absolute locations in AccountId derivations, which allows protocols built using XCM to have static account derivations in any runtime, regardless of its position in the family hierarchy.

Motivation

These changes would allow protocol builders to leverage absolute locations to maintain the exact same derived account address across all networks in the ecosystem, thus enhancing user experience.

One such protocol, that is the original motivation for this proposal, is InvArch's Saturn Multisig, which gives users a unifying multisig and DAO experience across all XCM connected chains.

Stakeholders

  • Ecosystem developers

Explanation

This proposal aims to make it possible to derive accounts for absolute locations, enabling protocols that require the ability to maintain the same derived account in any runtime. This is done by deriving accounts from the hash of described absolute locations, which are static across different destinations.

The same location can be represented in relative form and absolute form like so:

#![allow(unused)]
fn main() {
// Relative location (from own perspective)
{
    parents: 0,
    interior: Here
}

// Relative location (from perspective of parent)
{
    parents: 0,
    interior: [Parachain(1000)]
}

// Relative location (from perspective of sibling)
{
    parents: 1,
    interior: [Parachain(1000)]
}

// Absolute location
[GlobalConsensus(Kusama), Parachain(1000)]
}

Using DescribeFamily, the above relative locations would be described like so:

#![allow(unused)]
fn main() {
// Relative location (from own perspective)
// Not possible.

// Relative location (from perspective of parent)
(b"ChildChain", Compact::<u32>::from(*index)).encode()

// Relative location (from perspective of sibling)
(b"SiblingChain", Compact::<u32>::from(*index)).encode()

}

The proposed description for absolute location would follow the same pattern, like so:

#![allow(unused)]
fn main() {
(
    b"GlobalConsensus",
    network_id,
    b"Parachain",
    Compact::<u32>::from(para_id),
    tail
).encode()
}
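
For illustration, an account can then be derived by hashing the encoded description, in the spirit of HashedDescription. In this sketch, blake2_256 as the hash function is an assumption, and network_id and tail are simplified to raw bytes:

use parity_scale_codec::{Compact, Encode};
use sp_core::hashing::blake2_256;

// Sketch: derive a 32-byte account from the description of an absolute location.
fn derive_account(network_id: &[u8], para_id: u32, tail: &[u8]) -> [u8; 32] {
    let description = (
        b"GlobalConsensus",
        network_id,
        b"Parachain",
        Compact::<u32>::from(para_id),
        tail,
    )
        .encode();
    blake2_256(&description)
}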

This proposal requires the modification of two XCM types defined in the xcm-builder crate: The WithComputedOrigin barrier and the DescribeFamily MultiLocation descriptor.

WithComputedOrigin

The WtihComputedOrigin barrier serves as a wrapper around other barriers, consuming origin modification instructions and applying them to the message origin before passing to the inner barriers. One of the origin modifying instructions is UniversalOrigin, which serves the purpose of signaling that the origin should be a Universal Origin that represents the location as an absolute path prefixed by the GlobalConsensus junction.

In its current state the barrier transforms locations with the UniversalOrigin instruction into relative locations, so the proposed changes aim to make it return absolute locations instead.

DescribeFamily

The DescribeFamily location descriptor is part of the HashedDescription MultiLocation hashing system and exists to describe locations in an easy format for encoding and hashing, so that an AccountId can be derived from this MultiLocation.

This implementation contains a match statement that does not match against absolute locations, so changes to it involve matching against absolute locations and providing appropriate descriptions for hashing.

Drawbacks

No drawbacks have been identified with this proposal.

Testing, Security, and Privacy

Tests can be done using simple unit tests, as this is not a change to XCM itself but rather to types defined in xcm-builder.

Security considerations should be taken with the implementation to make sure no unwanted behavior is introduced.

This proposal does not introduce any privacy considerations.

Performance, Ergonomics, and Compatibility

Performance

Depending on the final implementation, this proposal should not introduce much overhead to performance.

Ergonomics

The ergonomics of this proposal depend on the final implementation details.

Compatibility

Backwards compatibility should remain unchanged, although that depends on the final implementation.

Prior Art and References

  • DescribeFamily type: https://github.com/paritytech/polkadot-sdk/blob/master/polkadot/xcm/xcm-builder/src/location_conversion.rs#L122
  • WithComputedOrigin type: https://github.com/paritytech/polkadot-sdk/blob/master/polkadot/xcm/xcm-builder/src/barriers.rs#L153

Unresolved Questions

Implementation details and overall code is still up to discussion.

(source)

Table of Contents

RFC-0035: Conviction Voting Delegation Modifications

Start DateOctober 10, 2023
DescriptionConviction Voting Delegation Modifications
AuthorsChaosDAO

Summary

This RFC proposes to make modifications to voting power delegations as part of the Conviction Voting pallet. The changes being proposed include:

  1. Allow a Delegator to vote independently of their Delegate if they so desire.
  2. Allow nested delegations – for example Charlie delegates to Bob who delegates to Alice – when Alice votes then both Bob and Charlie vote alongside Alice (in the current implementation Charlie will not vote when Alice votes).
  3. Make a change so that when a delegate votes abstain their delegated votes also vote abstain.
  4. Allow a Delegator to delegate/ undelegate their votes for all tracks with a single call.

Motivation

It has become clear since the launch of OpenGov that there are a few common tropes which pop up time and time again:

  1. The frequency of referenda is often too high for network participants to have sufficient time to review, comprehend, and ultimately vote on each individual referendum. This means that these network participants end up being inactive in on-chain governance.
  2. There are active network participants who are reviewing every referendum and are providing feedback in an attempt to help make the network thrive – but oftentimes these participants do not control enough voting power to influence the network with their positive efforts.
  3. Delegating votes for all tracks currently requires long batched calls which result in high fees for the Delegator - resulting in a reluctance from many to delegate their votes.

We believe (based on feedback from token holders with a larger stake in the network) that if there were some changes made to delegation mechanics, these larger stake holders would be more likely to delegate their voting power to active network participants – thus greatly increasing the support turnout.

Stakeholders

The primary stakeholders of this RFC are:

  • The Polkadot Technical Fellowship who will have to research and implement the technical aspects of this RFC
  • DOT token holders in general

Explanation

This RFC proposes to make 4 changes to the convictionVoting pallet logic in order to improve the user experience of those delegating their voting power to another account.

  1. Allow a Delegator to vote independently of their Delegate if they so desire – this would empower network participants to more actively delegate their voting power to active voters, removing the tedious steps of having to undelegate across an entire track every time they do not agree with their delegate's voting direction for a particular referendum.

  2. Allow nested delegations – for example Charlie delegates to Bob who delegates to Alice – when Alice votes then both Bob and Charlie vote alongside Alice (in the current runtime Charlie will not vote when Alice votes) – This would allow network participants who control multiple (possibly derived) accounts to be able to delegate all of their voting power to a single account under their control, which would in turn delegate to a more active voting participant. Then if the delegator wishes to vote independently of their delegate they can control all of their voting power from a single account, which again removes the pain point of having to issue multiple undelegate extrinsics in the event that they disagree with their delegate.

  3. Have delegated votes follow their delegate's abstain votes – there are times when delegates may vote abstain on a particular referendum and adding this functionality will increase the support of a particular referendum. It has a secondary benefit of meaning that Validators who are delegating their voting power do not lose points in the 1KV program in the event that their delegate votes abstain (another pain point which may be preventing those network participants from delegating).

  4. Allow a Delegator to delegate/ undelegate their votes for all tracks with a single call - in order to delegate votes across all tracks, a user must batch 15 calls - resulting in high costs for delegation. A single call for delegate_all/ undelegate_all would reduce the complexity and therefore costs of delegations considerably for prospective Delegators.
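
As a sketch of what change 4 could look like - the call names, parameters and internals here are hypothetical, not an agreed-upon interface:

// Hypothetical extrinsics added to the convictionVoting pallet (illustrative only).
pub fn delegate_all(
	origin: OriginFor<T>,
	to: AccountIdLookupOf<T>,
	conviction: Conviction,
	balance: BalanceOf<T, I>,
) -> DispatchResult {
	// Would apply the same logic as the existing `delegate` call to every track
	// in a single transaction.
	/* ... */
	Ok(())
}

pub fn undelegate_all(origin: OriginFor<T>) -> DispatchResult {
	// Would remove the caller's delegation on every track.
	/* ... */
	Ok(())
}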

Drawbacks

We do not foresee any drawbacks by implementing these changes. If anything we believe that this should help to increase overall voter turnout (via the means of delegation) which we see as a net positive.

Testing, Security, and Privacy

We feel that the Polkadot Technical Fellowship would be the most competent collective to identify the testing requirements for the ideas presented in this RFC.

Performance, Ergonomics, and Compatibility

Performance

This change may add extra chain storage requirements on Polkadot, especially with respect to nested delegations.

Ergonomics & Compatibility

The change to add nested delegations may affect governance interfaces such as Nova Wallet, which will have to apply changes to their indexers to support nested delegations. It may also affect the Polkadot Delegation Dashboard as well as Polkassembly & SubSquare.

We want to highlight how important it is for ecosystem builders to provide a mechanism (such as increasing the pallet version) that allows indexers and wallets to understand that changes have occurred.

Prior Art and References

N/A

Unresolved Questions

N/A

Additionally we would like to re-open the conversation about the potential for there to be free delegations. This was discussed by Dr Gavin Wood at Sub0 2022 and we feel like this would go a great way towards increasing the amount of network participants that are delegating: https://youtu.be/hSoSA6laK3Q?t=526

Overall, we strongly feel that delegations are a great way to increase voter turnout, and the ideas presented in this RFC would hopefully help in that aspect.

(source)

Table of Contents

RFC-0044: Rent based registration model

Start Date6 November 2023
DescriptionA new rent based parachain registration model
AuthorsSergej Sakac

Summary

This RFC proposes a new model for a sustainable on-demand parachain registration, involving a smaller initial deposit and periodic rent payments. The new model considers that on-demand chains may be unregistered and later re-registered. The proposed solution also ensures a quick startup for on-demand chains on Polkadot in such cases.

Motivation

With the support of on-demand parachains on Polkadot, there is a need to explore a new, more cost-effective model for registering validation code. In the current model, the parachain manager is responsible for reserving a unique ParaId and covering the cost of storing the validation code of the parachain. These costs can escalate, particularly if the validation code is large. We need a better, sustainable model for registering on-demand parachains on Polkadot to help smaller teams deploy more easily.

This RFC suggests a new payment model to create a more financially viable approach to on-demand parachain registration. In this model, a lower initial deposit is required, followed by recurring payments upon parachain registration.

This new model will coexist with the existing one-time deposit payment model, offering teams seeking to deploy on-demand parachains on Polkadot a more cost-effective alternative.

Requirements

  1. The solution SHOULD NOT affect the current model for registering validation code.
  2. The solution SHOULD offer an easily configurable way for governance to adjust the initial deposit and recurring rent cost.
  3. The solution SHOULD provide an incentive to prune validation code for which rent is not paid.
  4. The solution SHOULD allow anyone to re-register validation code under the same ParaId without the need for redundant pre-checking if it was already verified before.
  5. The solution MUST be compatible with the Agile Coretime model, as described in RFC#0001
  6. The solution MUST allow anyone to pay the rent.
  7. The solution MUST prevent the removal of validation code if it could still be required for disputes or approval checking.

Stakeholders

  • Future Polkadot on-demand Parachains

Explanation

This RFC proposes a set of changes that will enable the new rent based approach to registering and storing validation code on-chain. The new model, compared to the current one, will require periodic rent payments. The parachain won't be pruned automatically if the rent is not paid, but by permitting anyone to prune the parachain and rewarding the caller, there will be an incentive for the removal of the validation code.

On-demand parachains should still be able to utilize the current one-time payment model. However, given the size of the deposit required, it's highly likely that most on-demand parachains will opt for the new rent-based model.

Importantly, this solution doesn't require any storage migrations in the current system nor does it introduce any breaking changes. The following provides a detailed description of this solution.

Registering an on-demand parachain

In the current implementation of the registrar pallet, there are two constants that specify the necessary deposit for parachains to register and store their validation code:

#![allow(unused)]
fn main() {
trait Config {
	// -- snip --

	/// The deposit required for reserving a `ParaId`.
	#[pallet::constant]
	type ParaDeposit: Get<BalanceOf<Self>>;

	/// The deposit to be paid per byte stored on chain.
	#[pallet::constant]
	type DataDepositPerByte: Get<BalanceOf<Self>>;
}
}

This RFC proposes the addition of three new constants that will determine the payment amount and the frequency of the recurring rent payment:

#![allow(unused)]
fn main() {
trait Config {
	// -- snip --

	/// Defines how frequently the rent needs to be paid.
	///
	/// The duration is set in sessions instead of block numbers.
	#[pallet::constant]
	type RentDuration: Get<SessionIndex>;

	/// The initial deposit amount for registering validation code.
	///
	/// This is defined as a proportion of the deposit that would be required in the regular
	/// model.
	#[pallet::constant]
	type RentalDepositProportion: Get<Perbill>;

	/// The recurring rental cost defined as a proportion of the initial rental registration deposit.
	#[pallet::constant]
	type RentalRecurringProportion: Get<Perbill>;
}
}

Users will be able to reserve a ParaId and register their validation code for a proportion of the regular deposit required. However, they must also make additional rent payments at intervals of T::RentDuration.

For registering using the new rental system we will have to make modifications to the paras-registrar pallet. We should expose two new extrinsics for this:

#![allow(unused)]
fn main() {
mod pallet {
	// -- snip --

	pub fn register_rental(
		origin: OriginFor<T>,
		id: ParaId,
		genesis_head: HeadData,
		validation_code: ValidationCode,
	) -> DispatchResult { /* ... */ }

	pub fn pay_rent(origin: OriginFor<T>, id: ParaId) -> DispatchResult {
		/* ... */ 
	}
}
}

A call to register_rental will require the reservation of only a percentage of the deposit that would otherwise be required to register the validation code when using the regular model. As described later in the Quick para re-registering section below, we will also store the code hash of each parachain to enable faster re-registration after a parachain has been pruned. For this reason the total initial deposit amount is increased to account for that.

#![allow(unused)]
fn main() {
// The logic for calculating the initial deposit for parachain registered with the 
// new rent-based model:

let validation_code_deposit = per_byte_fee.saturating_mul((validation_code.0.len() as u32).into());

let head_deposit = per_byte_fee.saturating_mul((genesis_head.0.len() as u32).into());
let hash_deposit = per_byte_fee.saturating_mul(HASH_SIZE);

let deposit = T::RentalDepositProportion::get().mul_ceil(validation_code_deposit)
	.saturating_add(T::ParaDeposit::get())
	.saturating_add(head_deposit)
	.saturating_add(hash_deposit);
}

Once the ParaId is reserved and the validation code is registered the rent must be periodically paid to ensure the on-demand parachain doesn't get removed from the state. The pay_rent extrinsic should be callable by anyone, removing the need for the parachain to depend on the parachain manager for rent payments.

On-demand parachain pruning

If the rent is not paid, anyone has the option to prune the on-demand parachain and claim a portion of the initial deposit reserved for storing the validation code. This type of 'light' pruning only removes the validation code, while the head data and validation code hash are retained. The validation code hash is stored to allow anyone to register it again as well as to enable quicker re-registration by skipping the pre-checking process.

The moment the rent is no longer paid, the parachain won't be able to purchase on-demand access, meaning no new blocks are allowed. This stage is called the "hibernation" stage, during which all the parachain-related data is still stored on-chain, but new blocks are not permitted. The reason for this is to ensure that the validation code is available in case it is needed in the dispute or approval checking subsystems. Waiting for one entire session will be enough to ensure it is safe to deregister the parachain.

This means that anyone can prune the parachain only once the "hibernation" stage is over, which lasts for an entire session after the moment that the rent is not paid.

The pruning described here is a light form of pruning, since it only removes the validation code. As with all parachains, the parachain or para manager can use the deregister extrinsic to remove all associated state.
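
A sketch of what such a pruning extrinsic could look like - the name and the exact checks are illustrative only:

/// Prune the validation code of an on-demand parachain whose rent has not been paid and
/// whose "hibernation" stage (one full session) has passed.
///
/// Callable by anyone; the caller receives a portion of the deposit held for storing the
/// validation code.
pub fn prune_unpaid(origin: OriginFor<T>, id: ParaId) -> DispatchResult {
	let who = ensure_signed(origin)?;
	// 1. Ensure the rent for `id` is overdue and the hibernation session has elapsed.
	// 2. Remove only the validation code; keep the head data and the code hash.
	// 3. Release the code deposit, paying a reward share to `who`.
	/* ... */
	Ok(())
}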

Ensuring rent is paid

The paras pallet will be loosely coupled with the paras-registrar pallet. This approach enables all the pallets tightly coupled with the paras pallet to have access to the rent status information.

Once the validation code is stored without having its rent paid the assigner_on_demand pallet will ensure that an order for that parachain cannot be placed. This is easily achievable given that the assigner_on_demand pallet is tightly coupled with the paras pallet.

On-demand para re-registration

If the rent isn't paid on time, and the parachain gets pruned, the new model should provide a quick way to re-register the same validation code under the same ParaId. This can be achieved by skipping the pre-checking process, as the validation code hash will be stored on-chain, allowing us to easily verify that the uploaded code remains unchanged.

#![allow(unused)]
fn main() {
/// Stores the validation code hash for parachains that successfully completed the 
/// pre-checking process.
///
/// This is stored to enable faster on-demand para re-registration in case its pvf has been earlier
/// registered and checked.
///
/// NOTE: During a runtime upgrade where the pre-checking rules change this storage map should be
/// cleared appropriately.
#[pallet::storage]
pub(super) type CheckedCodeHash<T: Config> =
	StorageMap<_, Twox64Concat, ParaId, ValidationCodeHash>;
}

To enable parachain re-registration, we should introduce a new extrinsic in the paras-registrar pallet that allows this. The logic of this extrinsic will be the same as regular registration, with the distinction that it can be called by anyone, and the required deposit will be smaller since it only has to cover the storage of the validation code.
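
A sketch of what this extrinsic could look like - the name and parameters are illustrative only:

/// Re-register previously pruned validation code under the same `ParaId`, skipping
/// pre-checking when the hash of `validation_code` matches the stored `CheckedCodeHash`.
///
/// Callable by anyone; the deposit only has to cover the validation code storage.
pub fn re_register_rental(
	origin: OriginFor<T>,
	id: ParaId,
	validation_code: ValidationCode,
) -> DispatchResult {
	let who = ensure_signed(origin)?;
	// 1. Ensure `CheckedCodeHash` for `id` matches the hash of `validation_code`.
	// 2. Hold the (smaller) deposit covering only the code storage.
	// 3. Store the code without going through pre-checking again.
	/* ... */
	Ok(())
}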

Drawbacks

This RFC does not alter the process of reserving a ParaId, and therefore, it does not propose reducing it, even though such a reduction could be beneficial.

This RFC doesn't delve into the specifics of the configuration values for parachain registration but rather focuses on the mechanism; nonetheless, configuring these values carelessly could lead to problems.

Since the validation code hash and head data are not removed when the parachain is pruned but only when the deregister extrinsic is called, the T::DataDepositPerByte must be set to a higher value to create a strong enough incentive for removing it from the state.

Testing, Security, and Privacy

The implementation of this RFC will be tested on Rococo first.

Proper research should be conducted on setting the configuration values of the new system since these values can have great impact on the network.

An audit is required to ensure the implementation's correctness.

The proposal introduces no new privacy concerns.

Performance, Ergonomics, and Compatibility

Performance

This RFC should not introduce any performance impact.

Ergonomics

This RFC does not affect the current parachains, nor the parachains that intend to use the one-time payment model for parachain registration.

Compatibility

This RFC does not break compatibility.

Prior Art and References

Prior discussion on this topic: https://github.com/paritytech/polkadot-sdk/issues/1796

Unresolved Questions

None at this time.

As noted in this GitHub issue, we want to raise the per-byte cost of on-chain data storage. However, a substantial increase in this cost would make it highly impractical for on-demand parachains to register on Polkadot. This RFC offers an alternative solution for on-demand parachains, ensuring that the per-byte cost increase doesn't overly burden the registration process.

(source)

Table of Contents

RFC-0054: Remove the concept of "heap pages" from the client

Start Date2023-11-24
DescriptionRemove the concept of heap pages from the client and move it to the runtime.
AuthorsPierre Krieger

Summary

Rather than enforce a limit to the total memory consumption on the client side by loading the value at :heappages, enforce that limit on the runtime side.

Motivation

From the early days of Substrate up until recently, the runtime was present in two forms: the wasm runtime (wasm bytecode passed through an interpreter) and the native runtime (native code directly run by the client).

Since the wasm runtime has a lower amount of available memory (4 GiB maximum) compared to the native runtime, and in order to ensure that the wasm and native runtimes always produce the same outcome, it was necessary to clamp the amount of memory available to both runtimes to the same value.

In order to achieve this, a special storage key (a "well-known" key) :heappages was introduced and represents the number of "wasm pages" (one page equals 64kiB) of memory that are available to the memory allocator of the runtimes. If this storage key is absent, it defaults to 2048, which is 128 MiB.

The native runtime has since been removed, but the concept of "heap pages" still exists. This RFC proposes a simplification to the design of Polkadot by removing the concept of "heap pages" as it is currently known, and proposes alternative ways to achieve the goal of limiting the amount of memory available.

Stakeholders

Client implementers and low-level runtime developers.

Explanation

This RFC proposes the following changes to the client:

  • The client no longer considers :heappages as special.
  • The memory allocator of the runtime is no longer bounded by the value of :heappages.

With these changes, the memory available to the runtime is now only bounded by the available memory space (4 GiB), and optionally by the maximum amount of memory specified in the Wasm binary (see https://webassembly.github.io/spec/core/bikeshed/#memories%E2%91%A0). In Rust, the latter can be controlled during compilation with the flag -Clink-arg=--max-memory=....

Since the client-side change is strictly more tolerant than before, we can perform the change immediately after the runtime has been updated, and without having to worry about backwards compatibility.

This RFC proposes three alternative paths (different chains might choose to follow different paths):

  • Path A: add back the same memory limit to the runtime, like so:

    • At initialization, the runtime loads the value of :heappages from the storage (using ext_storage_get or similar), and sets a global variable to the decoded value.
    • The runtime tracks the total amount of memory that it has allocated using its instance of #[global_allocator] (https://github.com/paritytech/polkadot-sdk/blob/e3242d2c1e2018395c218357046cc88caaed78f3/substrate/primitives/io/src/lib.rs#L1748-L1762). This tracking should also be added around the host functions that perform allocations.
    • If an allocation is attempted that would go over the value in the global variable, the memory allocation fails.
  • Path B: define the memory limit using the -Clink-arg=--max-memory=... flag.

  • Path C: don't add anything to the runtime. This is effectively the same as setting the memory limit to ~4 GiB (compared to the current default limit of 128 MiB). This solution is viable only because we're compiling for 32-bit wasm rather than, for example, 64-bit wasm. If we ever compile for 64-bit wasm, this would need to be revisited.

Each parachain can choose the option that they prefer, but the author of this RFC strongly suggests either option C or B.
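
For path A, the tracking could look roughly like the following sketch, which wraps the existing allocator and refuses allocations past a limit (the limit would be the decoded value of :heappages; all names here are illustrative):

use core::alloc::{GlobalAlloc, Layout};
use core::sync::atomic::{AtomicUsize, Ordering};

// Limit in bytes, e.g. set at initialization from the decoded value of `:heappages`.
static HEAP_LIMIT: AtomicUsize = AtomicUsize::new(128 * 1024 * 1024);
// Total bytes currently allocated; host-function allocations would be added here too.
static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

/// Wraps the inner allocator and fails allocations that would exceed the limit.
struct TrackingAllocator<A>(A);

unsafe impl<A: GlobalAlloc> GlobalAlloc for TrackingAllocator<A> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let previous = ALLOCATED.fetch_add(layout.size(), Ordering::SeqCst);
        if previous + layout.size() > HEAP_LIMIT.load(Ordering::SeqCst) {
            ALLOCATED.fetch_sub(layout.size(), Ordering::SeqCst);
            return core::ptr::null_mut(); // allocation failure
        }
        self.0.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        ALLOCATED.fetch_sub(layout.size(), Ordering::SeqCst);
        self.0.dealloc(ptr, layout)
    }
}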

Drawbacks

In case of path A, there is one situation where the behaviour pre-RFC is not equivalent to the one post-RFC: when a host function that performs an allocation (for example ext_storage_get) is called, without this RFC this allocation might fail due to reaching the maximum heap pages, while after this RFC this will always succeed. This is most likely not a problem, as storage values aren't supposed to be larger than a few megabytes at the very maximum.

In the unfortunate event where the runtime runs out of memory, path B would make it more difficult to relax the memory limit, as we would need to re-upload the entire Wasm, compared to updating only :heappages in path A or before this RFC. In the case where the runtime runs out of memory only in the specific event where the Wasm runtime is modified, this could brick the chain. However, this situation is no different from the thousands of other ways that a bug in the runtime can brick a chain, and there's no reason to be particularly worried about this situation.

Testing, Security, and Privacy

This RFC would reduce the chance of a consensus issue between clients. The :heappages are a rather obscure feature, and it is not clear what happens in some corner cases such as the value being too large (error? clamp?) or malformed. This RFC would completely erase these questions.

Performance, Ergonomics, and Compatibility

Performance

In the case of path A, it is unclear how performance would be affected. Path A consists of moving client-side operations to the runtime without changing these operations, and as such performance differences are expected to be minimal. Overall, we're talking about one addition/subtraction per malloc and per free, so this is more than likely completely negligible.

In the case of paths B and C, the performance impact would be a net positive, as this RFC strictly removes things.

Ergonomics

This RFC would isolate the client and runtime more from each other, making it a bit easier to reason about the client or the runtime in isolation.

Compatibility

Not a breaking change. The runtime-side changes can be applied immediately (without even having to wait for changes in the client), then as soon as the runtime is updated, the client can be updated without any transition period. One can even consider updating the client before the runtime, as it corresponds to path C.

Prior Art and References

None.

Unresolved Questions

None.

This RFC follows the same path as https://github.com/polkadot-fellows/RFCs/pull/4 by scoping everything related to memory allocations to the runtime.

(source)

Table of Contents

RFC-0070: X Track for @kusamanetwork

Start DateJanuary 29, 2024
DescriptionAdd a governance track to facilitate posts on the @kusamanetwork's X account
AuthorAdam Clay Steeber

Summary

This RFC proposes adding a trivial governance track on Kusama to facilitate X (formerly known as Twitter) posts on the @kusamanetwork account. The technical aspect of implementing this in the runtime is inconsequential and straightforward, though it might get more technical if the Fellowship wants to regulate this track with a permission set that does not yet exist. If this is implemented it would need to be followed up with:

  1. the establishment of specifications for proposing X posts via this track, and
  2. the development of tools/processes to ensure that the content contained in referenda enacted in this track would be automatically posted on X.

Motivation

The overall motivation for this RFC is to decentralize the management of the Kusama brand/communication channel to KSM holders. This is necessary in my opinion primarily because of the inactivity of the account in recent history, with posts spanning weeks or months apart. I am currently unaware of who/what entity manages the Kusama X account, but if they are affiliated with Parity or W3F this proposed solution could also offload some of the legal ramifications of making (or not making) announcements to the public regarding Kusama. While centralized control of the X account would still be present, it could become totally moot if this RFC is implemented and the community becomes totally autonomous in the management of Kusama's X posts.

This solution does not cover every single communication front for Kusama, but it does cover one of the largest. It also establishes a precedent for other communication channels that could be offloaded to openGov, provided this proof-of-concept is successful.

Finally, this RFC is the epitome of experimentation that Kusama is ideal for. This proposal may spark newfound excitement for Kusama and help us realize Kusama's potential for pushing boundaries and trying new unconventional ideas.

Stakeholders

This idea has not been formalized by any individual (or group of) KSM holder(s). To my knowledge the socialization of this idea is contained entirely in my recent X post here, but it is possible that an idea like this one has been discussed in other places. It appears to me that the ecosystem would welcome a change like this which is why I am taking action to formalize the discussion.

Explanation

The implementation of this idea can be broken down into 3 primary phases:

Phase 1 - Track configurations

First, we begin with this RFC to ensure all feedback can be discussed and implemented in the proposal. After the Fellowship and the community come to a reasonable agreement on the changes necessary to make this happen, the Fellowship can merge changes into Kusama's runtime to include this new track with appropriate track configurations. As a starting point, I recommend the following track configurations:

const APP_X_POST: Curve = Curve::make_linear(7, 28, percent(50), percent(100));
const SUP_X_POST: Curve = Curve::make_reciprocal(?, ?, percent(?), percent(?), percent(?));

// I don't know how to configure the make_reciprocal variables to get what I imagine for support,
// but I recommend starting at 50% support and sharply decreasing such that 1% is sufficient a quarter of the way
// through the decision period and hitting 0% at the end of the decision period, or something like that.

	(
		69,
		pallet_referenda::TrackInfo {
			name: "x_post",
			max_deciding: 50,
			decision_deposit: 1 * UNIT,
			prepare_period: 10 * MINUTES,
			decision_period: 4 * DAYS,
			confirm_period: 10 * MINUTES,
			min_enactment_period: 1 * MINUTES,
			min_approval: APP_X_POST,
			min_support: SUP_X_POST,
		},
	),

I also recommend restricting permissions of this track to only submitting remarks or batches of remarks - that's all we'll need for its purpose. I'm not sure how easy that is to configure, but it is important since we don't want such an agile track to be able to make highly consequential calls.

Phase 2 - Establish Specs for X Post Track Referenda

It is important that we establish the specifications of referenda that will be submitted in this track to ensure that whatever automation tool is built can easily make posts once a referendum is enacted. As stated above, we really only need a system.remark (or batch of remarks) to indicate the contents of a proposed X post. The most straightforward way to do this is to require remarks to adhere to X's requirements for making posts via their API.

For example, if I wanted to propose a post that contained the text "Hello World!" I would propose a referendum in the X post track that contains the following call data: 0x0000607b2274657874223a202248656c6c6f20576f726c6421227d (i.e. system.remark('{"text": "Hello World!"}')).

At first, we could support text posts only to prove the concept. Later on we could expand this spec to add support for media, likes, retweets, replies, polls, and whatever other X features we want.

Phase 3 - Release, Tooling, & Documentation

Once we agree on track configurations and specs for referenda in this track, the Fellowship can move forward with merging these changes into Kusama's runtime and include them in its next release. We could also move forward with developing the necessary tools that would listen for enacted referenda to post automatically on X. This would require coordination with whoever controls the X account; they would either need to run the tools themselves or add a third party as an authorized user to run the tools to make posts on the account's behalf. This is a bottleneck for decentralization, but as long as the tools are run by the X account manager or by a trusted third party it should be fine. I'm open to more decentralized solutions, but those always come at a cost of complexity.

For the tools themselves, we could open a bounty on Kusama for developers/teams to bid on. We could also just ask the community to step up with a Treasury proposal to have anyone fund the build. Or, the Fellowship could make the release of these changes contingent on their endorsement of developers/teams to build these tools. Lots of options! For the record, my team and I could develop all the necessary tools, but just because I'm proposing these changes doesn't entitle me to funds to build the tools needed to implement them. Here's what would be needed:

  • a listener tool that would listen for enacted referenda in this track, verify the format of the remark(s), and submit to X's API with authenticating credentials
  • a UI to allow layman users to propose referenda on this track

After everything is complete, we can update the Kusama wiki to include documentation on the X post specifications and include links to the tools/UI.

Drawbacks

The main drawback to this change is that it requires a lot of off-chain coordination. It's easy enough to include the track on Kusama but it's a totally different challenge to make it function as intended. The tools need to be built and the auth tokens need to be managed. It would certainly add an administrative burden to whoever manages the X account since they would either need to run the tools themselves or manage auth tokens.

This change also introduces on-going costs to the Treasury since it would need to compensate people to support the tools necessary to facilitate this idea. The ultimate question is whether these on-going costs would be worth the ability for KSM holders to make posts on Kusama's X account.

There's also the risk of misconfiguring the track to make referenda too easy to pass, potentially allowing a malicious actor to get content posted on X that violates X's ToS. If that happens, we risk getting Kusama banned on X!

This change might also be outside the scope of the Fellowship/openGov. Perhaps the best solution for the X account is to have the Treasury pay for a professional agency to manage posts. It wouldn't be decentralized but it would probably be more effective in terms of creating good content.

Finally, this solution is merely pseudo-decentralization since the X account manager would still have ultimate control of the account. It's decentralized insofar as the auth tokens are given to people actually running the tools; a house of cards is required to facilitate X posts via this track. Not ideal.

Testing, Security, and Privacy

There's major precedent for configuring tracks on openGov given the amount of power tracks have, so it shouldn't be hard to come up with a sound configuration. That's why I recommend restricting permissions of this track to remarks and batches of remarks, or something equally inconsequential.

Building the tools for this implementation is really straight-forward and could be audited by Fellowship members, and the community at large, on Github.

The largest security concern would be the management of Kusama's X account's auth tokens. We would need to ensure that they aren't compromised.

Performance, Ergonomics, and Compatibility

Performance

If a track on Kusama promises users that compliant referenda enacted therein would be posted on Kusama's X account, users would expect that track to perform as promised. If the house of cards tumbles down and a compliant referendum doesn't actually get anything posted, users might think that Kusama is broken or unreliable. This could be damaging to Kusama's image and cause people to question the soundness of other features on Kusama.

As mentioned in the drawbacks, the performance of this feature would depend on off-chain coordinations. We can reduce the administrative burden of these coordinations by funding third parties with the Treasury to deal with it, but then we're relying on trusting these parties.

Ergonomics

By adding a new track to Kusama, governance platforms like Polkassembly or Nova Wallet would need to include it on their applications. This shouldn't be too much of a burden or overhead since they've already built the infrastructure for other openGov tracks.

Compatibility

This change wouldn't break any compatibility as far as I know.

References

One reference to a similar feature requiring on-chain/off-chain coordination would be the Kappa-Sigma-Mu Society. Nothing on-chain necessarily enforces the rules or facilitates bids, challenges, defenses, etc. However, the Society has managed to maintain itself with integrity to its rules. So I don't think this is totally out of Kusama's scope. But it will require some off-chain effort to maintain.

Unresolved Questions

  • Who will develop the tools necessary to implement this feature? How do we select them?
  • How can this idea be better implemented with on-chain/substrate features?

(source)

Table of Contents

RFC-0073: Decision Deposit Referendum Track

Start Date12 February 2024
DescriptionAdd a referendum track which can place the decision deposit on any other track
AuthorsJelliedOwl

Summary

The current size of the decision deposit on some tracks is too high for many proposers. As a result, those needing to use it have to find someone else willing to put up the deposit for them - and a number of legitimate attempts to use the root track have timed out. This track would provide a more affordable (though slower) route for these holders to use the root track.

Motivation

There have been recent attempts to use the Kusama root track which have timed out with no decision deposit placed. Usually, these referenda have been related to parachain registration issues.

Explanation

I propose to address this by adding a new referendum track [22] Referendum Deposit which can place the decision deposit on another referendum. This would require the following changes:

  • [Referenda Pallet] Modify the placeDecisionDeposit function to additionally allow it to be called by root, with a root call bypassing the requirements for a deposit payment (see the sketch after this list).
  • [Runtime] Add a new referendum track which can only call referenda->placeDecisionDeposit and the utility functions.
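
A sketch of the pallet modification - illustrative only; the exact shape would be decided during implementation:

// Sketch: allow `place_decision_deposit` to also be called by root, without holding a deposit.
pub fn place_decision_deposit(origin: OriginFor<T>, index: ReferendumIndex) -> DispatchResult {
	match ensure_signed_or_root(origin)? {
		// A signed caller has the track's decision deposit held, as today.
		Some(who) => { /* existing logic: hold the decision deposit from `who` */ },
		// A root call, dispatched by the proposed track, bypasses the deposit payment.
		None => { /* mark the decision deposit as placed without holding any funds */ },
	}
	Ok(())
}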

Referendum track parameters - Polkadot

  • Decision deposit: 1000 DOT
  • Decision period: 14 days
  • Confirmation period: 12 hours
  • Enactment period: 2 hours
  • Approval & Support curves: As per the root track, timed to match the decision period
  • Maximum deciding: 10

Referendum track parameters - Kusama

  • Decision deposit: 33.333333 KSM
  • Decision period: 7 days
  • Confirmation period: 6 hours
  • Enactment period: 1 hour
  • Approval & Support curves: As per the root track, timed to match the decision period
  • Maximum deciding: 10

Drawbacks

This track would provide a route to starting a root referendum with a much-reduced slashable deposit. This might be undesirable but, assuming the decision deposit cost for this track is still high enough, slashing would still act as a disincentive.

An alternative to this might be to reduce the decision deposit size on some of the more expensive tracks. However, part of the purpose of the high deposit - at least on the root track - is to prevent spamming the limited queue with junk referenda.

Testing, Security, and Privacy

Additional test cases will be needed for the modified pallet and runtime. No security or privacy issues.

Performance, Ergonomics, and Compatibility

Performance

No significant performance impact.

Ergonomics

Only changes related to adding the track. Existing functionality is unchanged.

Compatibility

No compatibility issues.

Prior Art and References

Unresolved Questions

Feedback on whether my proposed implementation of this is the best way to address the issue - including which calls the track should be allowed to make. Are the track parameters correct or should we use something different? Alternatives would be welcome.

(source)

Table of Contents

RFC-0074: Stateful Multisig Pallet

Start Date15 February 2024
DescriptionAdd Enhanced Multisig Pallet to System chains
AuthorsAbdelrahman Soliman (Boda)

Summary

A pallet to facilitate enhanced multisig accounts. The main enhancement is that we store a multisig account in the state with related info (signers, threshold, etc.). The module affords enhanced control over administrative operations such as adding/removing signers, changing the threshold, account deletion, and cancelling an existing proposal. Each signer can approve/reject a proposal while it still exists. The proposal is not intended for migrating away from or getting rid of the existing multisig pallet; it's to allow both options to coexist.

For the rest of the RFC we use the following terms:

  • proposal to refer to an extrinsic that is to be dispatched from a multisig account after getting enough approvals.
  • Stateful Multisig to refer to the proposed pallet.
  • Stateless Multisig to refer to the current multisig pallet in polkadot-sdk.

Motivation

Problem

Entities in the Polkadot ecosystem need to have a way to manage their funds and other operations in a secure and efficient way. Multisig accounts are a common way to achieve this. Entities by definition change over time, members of the entity may change, threshold requirements may change, and the multisig account may need to be deleted. For even more enhanced hierarchical control, the multisig account may need to be controlled by other multisig accounts.

Current native solutions for multisig operations are less optimal, performance-wise (as we'll explain later in the RFC), and lack fine-grained control over the multisig account.

Stateless Multisig

We refer to the current multisig pallet in polkadot-sdk as stateless because the multisig account is only derived and not stored in the state. Deriving the account is deterministic, as it relies on the exact (sorted) set of users and the threshold, but this does not allow for any control over the multisig account, and it is tightly coupled to the exact users and threshold. This makes it hard for an organization to manage existing accounts and to change the threshold or add/remove signers.
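
For context, the existing stateless derivation looks roughly like the following sketch (simplified; see Multisig::multi_account_id in polkadot-sdk for the real code, and note that the prefix constant is reproduced from memory):

use parity_scale_codec::Encode;
use sp_core::hashing::blake2_256;

// Sketch: the stateless multisig account is a hash over a fixed prefix, the sorted
// signer set and the threshold. Changing either the signers or the threshold yields a
// different account, which is the limitation described above.
fn stateless_multisig_account(sorted_signers: &[[u8; 32]], threshold: u16) -> [u8; 32] {
    (b"modlpy/utilisuba", sorted_signers, threshold).using_encoded(blake2_256)
}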

We believe as well that the stateless multisig is not efficient in terms of block footprint as we'll show in the performance section.

Pure Proxy

Pure proxies can achieve a stored and deterministic multisig account created from different users, but this is unneeded complexity used as a workaround for the limitations of the current multisig pallet. It also doesn't offer the same fine-grained control over the multisig account.

Other points mentioned by @tbaut

  • pure proxies aren't (yet) a thing cross chain
  • the end user complexity is much much higher with pure proxies, also for new users smart contract multisig are widely known while pure proxies are obscure.
  • you can shoot yourself in the foot by deleting the proxy, effectively losing access to funds with pure proxies.

Requirements

Basic requirements for the Stateful Multisig are:

  • The ability to have concrete and permanent (unless deleted) multisig accounts in the state.
  • The ability to add/remove signers from an existing multisig account by the multisig itself.
  • The ability to change the threshold of an existing multisig account by the multisig itself.
  • The ability to delete an existing multisig account by the multisig itself.
  • The ability to cancel an existing proposal by the multisig itself.
  • Signers of multisig account can start a proposal on behalf of the multisig account which will be dispatched after getting enough approvals.
  • Signers of multisig account can approve/reject a proposal while still exists.

Use Cases

  • Corporate Governance: In a corporate setting, multisig accounts can be employed for decision-making processes. For example, a company may require the approval of multiple executives to initiate significant financial transactions.

  • Joint Accounts: Multisig accounts can be used for joint accounts where multiple individuals need to authorize transactions. This is particularly useful in family finances or shared business accounts.

  • Decentralized Autonomous Organizations (DAOs): DAOs can utilize multisig accounts to ensure that decisions are made collectively. Multiple key holders can be required to approve changes to the organization's rules or the allocation of funds.

and much more...

Stakeholders

  • Polkadot holders
  • Polkadot developers

Explanation

I created the stateful multisig pallet during my studies at the Polkadot Blockchain Academy under supervision from @shawntabrizi and @ank4n. I have since enhanced it to be fully functional; this is available as draft PR#3300 in polkadot-sdk. I'll list all the details and design decisions in the following sections. Note that the PR does not map 1-1 to the current RFC, as the RFC is a more polished version of the PR, updated based on feedback and discussions.

Let's start with a sequence diagram to illustrate the main operations of the Stateful Multisig.

multisig operations

Notes on above diagram:

  • It's a 3-step process to execute a proposal (Start Proposal --> Approvals --> Execute Proposal).
  • Execute is an explicit extrinsic for a simpler API. It can be optimized to be executed automatically after getting enough approvals.
  • Any user can create a multisig account; they don't need to be part of it. (Alice in the diagram)
  • A proposal is any extrinsic, including control extrinsics (e.g. add/remove signer, change threshold, etc.).
  • Any multisig account signer can start a proposal on behalf of the multisig account. (Bob in the diagram)
  • Any multisig account signer can execute a proposal if it's approved by enough signers. (Dave in the diagram)

State Transition Functions

We use the following enum to store either the call or its hash:

#![allow(unused)]
fn main() {
enum CallOrHash<T: Config> {
	Call(<T as Config>::RuntimeCall),
	Hash(T::Hash),
}
}
  • create_multisig - Create a multisig account with a given threshold and initial signers. (Needs Deposit)
#![allow(unused)]
fn main() {
		/// Creates a new multisig account and attach signers with a threshold to it.
		///
		/// The dispatch origin for this call must be _Signed_. It is expected to be a normal AccountId and not a
		/// Multisig AccountId.
		///
		/// T::BaseCreationDeposit + T::PerSignerDeposit * signers.len() will be held from the caller's account.
		///
		/// # Arguments
		///
		/// - `signers`: Initial set of accounts to add to the multisig. These may be updated later via `add_signer`
		/// and `remove_signer`.
		/// - `threshold`: The threshold number of accounts required to approve an action. Must be greater than 0 and
		/// less than or equal to the total number of signers.
		///
		/// # Errors
		///
		/// * `TooManySignatories` - The number of signatories exceeds the maximum allowed.
		/// * `InvalidThreshold` - The threshold is greater than the total number of signers.
		pub fn create_multisig(
			origin: OriginFor<T>,
			signers: BoundedBTreeSet<T::AccountId, T::MaxSignatories>,
			threshold: u32,
		) -> DispatchResult 
}
  • start_proposal - Start a multisig proposal. (Needs Deposit)
#![allow(unused)]
fn main() {
		/// Starts a new proposal for a dispatchable call for a multisig account.
		/// The caller must be one of the signers of the multisig account.
		/// T::ProposalDeposit will be held from the caller's account.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		/// * `call_or_hash` - The enum having the call or the hash of the call to be approved and executed later.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `UnAuthorizedSigner` - The caller is not a signer of the multisig account.
		/// * `TooManySignatories` - The number of signatories exceeds the maximum allowed. (shouldn't really happen as it's the first approval)
		pub fn start_proposal(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		) -> DispatchResult
}
  • approve - Approve a multisig proposal.
#![allow(unused)]
fn main() {
		/// Approves a proposal for a dispatchable call for a multisig account.
		/// The caller must be one of the signers of the multisig account.
		///
		/// If a signer did approve -> reject -> approve, the proposal will be approved.
		/// If a signer did approve -> reject, the proposal will be rejected.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		/// * `call_or_hash` - The enum having the call or the hash of the call to be approved.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `UnAuthorizedSigner` - The caller is not a signer of the multisig account.
		/// * `TooManySignatories` - The number of signatories exceeds the maximum allowed.
		/// This shouldn't really happen as it's an approval, not an addition of a new signer.
		pub fn approve(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		) -> DispatchResult
}
  • reject - Reject a multisig proposal.
#![allow(unused)]
fn main() {
		/// Rejects a proposal for a multisig account.
		/// The caller must be one of the signers of the multisig account.
		///
		/// Between approving and rejecting, last call wins.
		/// If a signer did approve -> reject -> approve, the proposal will be approved.
		/// If a signer did approve -> reject, the proposal will be rejected.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		/// * `call_or_hash` - The enum having the call or the hash of the call to be rejected.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `UnAuthorizedSigner` - The caller is not a signer of the multisig account.
		/// * `SignerNotFound` - The caller has not approved the proposal.
		#[pallet::call_index(3)]
		#[pallet::weight(Weight::default())]
		pub fn reject(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		) -> DispatchResult
}
  • execute_proposal - Execute a multisig proposal. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Executes a proposal for a dispatchable call for a multisig account.
		/// Proposal needs to be approved by enough signers (meeting or exceeding the multisig threshold) before it can be executed.
		/// The caller must be one of the signers of the multisig account.
		///
		/// This function does an extra check to make sure that all approvers still exist in the multisig account.
		/// That is to make sure that the multisig account is not compromised by removing a signer during an active proposal.
		///
		/// Once finished, the withheld deposit will be returned to the proposal creator.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		/// * `call_or_hash` - We should have gotten the RuntimeCall (preimage) and stored it in the proposal by the time the extrinsic is called.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `UnAuthorizedSigner` - The caller is not a signer of the multisig account.
		/// * `NotEnoughApprovers` - approvers don't exceed the threshold.
		/// * `ProposalNotFound` -  The proposal does not exist.
		/// * `CallPreImageNotFound` -  The proposal doesn't have the preimage of the call in the state.
		pub fn execute_proposal(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		) -> DispatchResult
}
  • cancel_proposal - Cancel a multisig proposal. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Cancels an existing proposal for a multisig account.
		/// Proposal needs to be rejected by enough signers (meeting or exceeding the multisig threshold) before it can be canceled.
		/// The caller must be one of the signers of the multisig account.
		///
		/// This function does an extra check to make sure that all rejectors still exist in the multisig account.
		/// That is to make sure that the multisig account is not compromised by removing a signer during an active proposal.
		///
		/// Once finished, the withheld deposit will be returned to the proposal creator.
		///
		/// # Arguments
		///
		/// * `origin` - The origin multisig account who wants to cancel the proposal.
		/// * `call_or_hash` - The call or hash of the call to be canceled.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `ProposalNotFound` - The proposal does not exist.
		pub fn cancel_proposal(
		origin: OriginFor<T>, 
		multisig_account: T::AccountId, 
		call_or_hash: CallOrHash) -> DispatchResult
}
  • cancel_own_proposal - Cancel a multisig proposal started by the caller in case no other signers approved it yet. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Cancels an existing proposal for a multisig account, only if the proposal doesn't have approvers other than
		/// the proposer.
		///
		/// This function needs to be called with the proposer of the proposal as the origin.
		///
		/// The withheld deposit will be returned to the proposal creator.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		/// * `call_or_hash` - The hash of the call to be canceled.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `ProposalNotFound` - The proposal does not exist.
		pub fn cancel_own_proposal(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		) -> DispatchResult
}
  • cleanup_proposals - Cleanup proposals of a multisig account. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Cleanup proposals of a multisig account. This function will iterate over a max limit per extrinsic to ensure
		/// we don't have unbounded iteration over the proposals.
		///
		/// The withheld deposit will be returned to the proposal creator.
		///
		/// # Arguments
		///
		/// * `multisig_account` - The multisig account ID.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `ProposalNotFound` - The proposal does not exist.
		pub fn cleanup_proposals(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
		) -> DispatchResult
}

Note: The following functions need to be called from the multisig account itself. Deposits are reserved from the multisig account as well.

  • add_signer - Add a new signer to a multisig account. (Needs Deposit)
#![allow(unused)]
fn main() {
		/// Adds a new signer to the multisig account.
		/// This function needs to be called from a Multisig account as the origin.
		/// Otherwise it will fail with MultisigNotFound error.
		///
		/// T::PerSignerDeposit will be held from the multisig account.
		///
		/// # Arguments
		///
		/// * `origin` - The origin multisig account who wants to add a new signer to the multisig account.
		/// * `new_signer` - The AccountId of the new signer to be added.
		/// * `new_threshold` - The new threshold for the multisig account after adding the new signer.
		///
		/// # Errors
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `InvalidThreshold` - The threshold is greater than the total number of signers or is zero.
		/// * `TooManySignatories` - The number of signatories exceeds the maximum allowed.
		pub fn add_signer(
			origin: OriginFor<T>,
			new_signer: T::AccountId,
			new_threshold: u32,
		) -> DispatchResult
}
  • remove_signer - Remove a signer from a multisig account. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Removes a signer from the multisig account.
		/// This function needs to be called from a Multisig account as the origin.
		/// Otherwise it will fail with MultisigNotFound error.
		/// If only one signer exists and is removed, the multisig account and any pending proposals for this account will be deleted from the state.
		///
		/// # Arguments
		///
		/// * `origin` - The origin multisig account who wants to remove a signer from the multisig account.
		/// * `signer_to_remove` - The AccountId of the signer to be removed.
		/// * `new_threshold` - The new threshold for the multisig account after removing the signer. Accepts zero if
		/// the signer is the only one left.
		///
		/// # Errors
		///
		/// This function can return the following errors:
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `InvalidThreshold` - The new threshold is greater than the total number of signers or is zero.
		/// * `UnAuthorizedSigner` - The caller is not a signer of the multisig account.
		pub fn remove_signer(
			origin: OriginFor<T>,
			signer_to_remove: T::AccountId,
			new_threshold: u32,
		) -> DispatchResult
}
  • set_threshold - Change the threshold of a multisig account.
#![allow(unused)]
fn main() {
		/// Sets a new threshold for a multisig account.
		///	This function needs to be called from a Multisig account as the origin.
		/// Otherwise it will fail with MultisigNotFound error.
		///
		/// # Arguments
		///
		/// * `origin` - The origin multisig account who wants to set the new threshold.
		/// * `new_threshold` - The new threshold to be set.
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		/// * `InvalidThreshold` - The new threshold is greater than the total number of signers or is zero.
		pub fn set_threshold(origin: OriginFor<T>, new_threshold: u32) -> DispatchResult
}
  • delete_account - Delete a multisig account. (Releases Deposit)
#![allow(unused)]
fn main() {
		/// Deletes a multisig account and all related proposals.
		///
		///	This function needs to be called from a Multisig account as the origin.
		/// Otherwise it will fail with MultisigNotFound error.
		///
		/// # Arguments
		///
		/// * `origin` - The origin multisig account who wants to cancel the proposal.
		///
		/// # Errors
		///
		/// * `MultisigNotFound` - The multisig account does not exist.
		pub fn delete_account(origin: OriginFor<T>) -> DispatchResult
}
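
To illustrate the intended end-to-end flow, here is a rough usage sketch in test style. It is not the exact API of the draft PR: the mock runtime, the test accounts, and the helper multisig_account_id() are assumptions, while the extrinsic names follow the signatures above.

// Hypothetical happy-path test: Alice creates a 2-of-3 multisig for Bob, Charlie and Dave;
// Bob opens a proposal, Charlie approves it, and Dave executes it.
let signers = BoundedBTreeSet::try_from(BTreeSet::from([bob, charlie, dave])).unwrap();
assert_ok!(Multisig::create_multisig(RuntimeOrigin::signed(alice), signers, 2));

// Assumed helper returning the account ID derived for the multisig created above.
let multisig = multisig_account_id();
let call = CallOrHash::Call(RuntimeCall::System(frame_system::Call::remark { remark: vec![42] }));

// Bob opens the proposal and, as its creator, counts as the first approver.
assert_ok!(Multisig::start_proposal(RuntimeOrigin::signed(bob), multisig.clone(), call.clone()));
// Charlie's approval reaches the threshold of 2.
assert_ok!(Multisig::approve(RuntimeOrigin::signed(charlie), multisig.clone(), call.clone()));
// Any signer may now trigger execution; the proposal deposit is released to Bob.
assert_ok!(Multisig::execute_proposal(RuntimeOrigin::signed(dave), multisig, call));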

Storage/State

  • Use 2 main storage maps to store multisig accounts and proposals.
#![allow(unused)]
fn main() {
#[pallet::storage]
  pub type MultisigAccount<T: Config> = StorageMap<_, Twox64Concat, T::AccountId, MultisigAccountDetails<T>>;

/// The set of open multisig proposals. A proposal is uniquely identified by the multisig account and the call hash.
/// (maybe a nonce as well in the future)
#[pallet::storage]
pub type PendingProposals<T: Config> = StorageDoubleMap<
    _,
    Twox64Concat,
    T::AccountId, // Multisig Account
    Blake2_128Concat,
    T::Hash, // Call Hash
    MultisigProposal<T>,
>;
}

As for the values:

#![allow(unused)]
fn main() {
pub struct MultisigAccountDetails<T: Config> {
	/// The signers of the multisig account. This is a BoundedBTreeSet to ensure faster operations (add, remove).
	/// As well as lookups and faster set operations to ensure approvers are always a subset of signers (e.g. in case a signer is removed during an active proposal).
	pub signers: BoundedBTreeSet<T::AccountId, T::MaxSignatories>,
	/// The threshold of approvers required for the multisig account to be able to execute a call.
	pub threshold: u32,
	pub deposit: BalanceOf<T>,
}
}
#![allow(unused)]
fn main() {
pub struct MultisigProposal<T: Config> {
    /// Proposal creator.
    pub creator: T::AccountId,
    pub creation_deposit: BalanceOf<T>,
    /// The extrinsic when the multisig operation was opened.
    pub when: Timepoint<BlockNumberFor<T>>,
    /// The approvers achieved so far, including the depositor.
    /// The approvers are stored in a BoundedBTreeSet to ensure faster lookup and operations (approve, reject).
    /// It's also bounded to ensure that the size doesn't go over the limit required by the Runtime.
    pub approvers: BoundedBTreeSet<T::AccountId, T::MaxSignatories>,
    /// The rejectors for the proposal so far.
    /// The rejectors are stored in a BoundedBTreeSet to ensure faster lookup and operations (approve, reject).
    /// It's also bounded to ensure that the size doesn't go over the limit required by the Runtime.
    pub rejectors: BoundedBTreeSet<T::AccountId, T::MaxSignatories>,
    /// The block number until which this multisig operation is valid. None means no expiry.
    pub expire_after: Option<BlockNumberFor<T>>,
}
}

For optimization we're using BoundedBTreeSet to allow for efficient lookups and removals. Especially in the case of approvers, we need to be able to remove an approver from the list when they reject a proposal they previously approved (which we handle lazily when execute_proposal is called).

There's an extra storage map for the deposits held per added signer of each multisig account. This ensures that we can release the deposits when the multisig removes a signer, even if the constant deposit per signer is changed in the runtime later on.
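
A minimal sketch of what that extra map could look like (the name SignerDeposits and the exact key layout are assumptions, not the final design):

#[pallet::storage]
pub type SignerDeposits<T: Config> = StorageDoubleMap<
    _,
    Twox64Concat,
    T::AccountId, // Multisig Account
    Blake2_128Concat,
    T::AccountId, // Signer the deposit was taken for
    BalanceOf<T>, // Amount held at the time the signer was added
>;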

Considerations & Edge cases

Removing a signer from the multisig account during an active proposal

We need to ensure that the approvers are always a subset of the signers. This is partly why we're using BoundedBTreeSet for signers and approvers. Once execute_proposal is called, we ensure that the proposal is still valid and that the approvers are still a subset of the current signers.
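
As a simplified, self-contained sketch of that execute-time check (plain std collections instead of the pallet's bounded types; all names are illustrative only):

use std::collections::BTreeSet;

// Approvals only count if the approver is still a current signer, and the
// surviving approvals must still meet the latest threshold.
fn approvals_still_sufficient(
    signers: &BTreeSet<&str>,
    approvers: &BTreeSet<&str>,
    threshold: usize,
) -> bool {
    approvers.intersection(signers).count() >= threshold
}

fn main() {
    let signers: BTreeSet<_> = ["alice", "bob", "charlie"].into_iter().collect();
    // "dave" approved earlier but has since been removed from the signer set.
    let approvers: BTreeSet<_> = ["alice", "dave"].into_iter().collect();
    assert!(!approvals_still_sufficient(&signers, &approvers, 2));
}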

Multisig account deletion and cleaning up existing proposals

Once the last signer of a multisig account is removed, or the multisig approves the account deletion, we delete the multisig account from the state and keep the proposals until someone calls cleanup_proposals (possibly multiple times), which iterates over at most a fixed limit per extrinsic. This ensures we don't have unbounded iteration over the proposals. Users are already incentivized to call cleanup_proposals to get their deposits back.

Multisig account deletion and existing deposits

We currently just delete the account without checking for deposits (we would like to hear your thoughts here). We can either:

  • Don't make deposits to begin with and make it a fee.
  • Transfer to treasury.
  • Error on deletion. (don't like this)

Approving a proposal after the threshold is changed

We always use the latest threshold and don't store a separate threshold per proposal. This allows the following:

  • If the threshold is lower than the number of approvers, the proposal is still valid.
  • If the threshold is higher than the number of approvers, we catch it during execute_proposal and return an error.

Drawbacks

  • New pallet to maintain.

Testing, Security, and Privacy

Standard audit/review requirements apply.

Performance, Ergonomics, and Compatibility

Performance

A back-of-the-envelope calculation to show that the stateful multisig is more efficient than the stateless multisig, given its smaller footprint on blocks.

A quick review of the extrinsics for both, as they affect the block size:

Stateless Multisig: Both as_multi and approve_as_multi have similar parameters:

#![allow(unused)]
fn main() {
origin: OriginFor<T>,
threshold: u16,
other_signatories: Vec<T::AccountId>,
maybe_timepoint: Option<Timepoint<BlockNumberFor<T>>>,
call_hash: [u8; 32],
max_weight: Weight,
}

Stateful Multisig: We have the following extrinsics:

#![allow(unused)]
fn main() {
pub fn start_proposal(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		)
}
#![allow(unused)]
fn main() {
pub fn approve(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		)
}
#![allow(unused)]
fn main() {
pub fn execute_proposal(
			origin: OriginFor<T>,
			multisig_account: T::AccountId,
			call_or_hash: CallOrHash,
		)
}

The main takeaway is that we don't need to pass the threshold and the other signatories in the extrinsics, because the threshold and signatories are already in the state (stored only once).

So now for the calculations, given the following:

  • K is the number of multisig accounts.
  • N is number of signers in each multisig account.
  • For each proposal we need to have 2N/3 approvals.

The table calculates, assuming each of the K multisig accounts has one proposal that gets approved by 2N/3 of the signers and then executed, how much the total block and state sizes have increased by the end of the day.

Note: We're not calculating the cost of the proposal itself, as in both the stateful and stateless multisig it's almost the same and gets cleaned up from the state once the proposal is executed or canceled.

Stateless effect on block sizes = 2/3 * K * N^2 (each of the 2N/3 approving users needs to call approve_as_multi with all the other signatories (N) in the extrinsic body)

Stateful effect on block sizes = K * N (each approving user needs to call approve with only the multisig account in the extrinsic body)

Stateless effect on state sizes = Nil (the multisig account is not stored in the state)

Stateful effect on state sizes = K * N (each of the K multisig accounts is stored with all of its N signers in the state)

| Pallet    | Block Size    | State Size |
|-----------|:-------------:|-----------:|
| Stateless | 2/3 * K * N^2 | Nil        |
| Stateful  | K * N         | K * N      |

Simplified table removing K from the equation:

| Pallet    | Block Size | State Size |
|-----------|:----------:|-----------:|
| Stateless | N^2        | Nil        |
| Stateful  | N          | N          |

So even though the stateful multisig has a larger state size, it's still more efficient in terms of block size and total footprint on the blockchain.
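
As a concrete, purely hypothetical illustration of the formulas above, counting only the account IDs that have to be passed in extrinsics or stored in state:

// Hypothetical numbers: K = 100 multisig accounts, N = 10 signers each.
fn main() {
    let (k, n) = (100u64, 10u64);
    let stateless_block = 2 * k * n * n / 3; // 2/3 * K * N^2 account IDs in blocks
    let stateful_block = k * n;              // K * N account IDs in blocks
    let stateful_state = k * n;              // K * N account IDs in state
    println!("stateless, IDs in blocks: {stateless_block}"); // 6666
    println!("stateful,  IDs in blocks: {stateful_block}");  // 1000
    println!("stateful,  IDs in state:  {stateful_state}");  // 1000
}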

Ergonomics

The Stateful Multisig will have better ergonomics for managing multisig accounts for both developers and end-users.

Compatibility

This RFC is compatible with the existing implementation and can be handled via upgrades and migration. It's not intended to replace the existing multisig pallet.

Prior Art and References

multisig pallet in polkadot-sdk

Unresolved Questions

  • On account deletion, should we transfer remaining deposits to the treasury, or drop signer-addition deposits entirely and consider them fees to start with?
  • Batch addition/removal of signers.
  • Add expiry to proposals. After a certain time, proposals will not accept any more approvals or executions and will be deleted.
  • Implement call filters. This will allow multisig accounts to only accept certain calls.

(source)

Table of Contents

RFC-0077: Increase maximum length of identity PGP fingerprint values from 20 bytes

Start Date20 Feb 2024
DescriptionIncrease the maximum length of identity PGP fingerprint values from 20 bytes
AuthorsLuke Schoen

Summary

This proposes to increase the maximum length of PGP Fingerprint values from a 20 bytes/chars limit to a 40 bytes/chars limit.

Motivation

Background

Pretty Good Privacy (PGP) Fingerprints are shorter versions of their corresponding Public Key that may be printed on a business card.

They may be used by someone to validate the correct corresponding Public Key.

It should be possible to add PGP Fingerprints to Polkadot on-chain identities.

GNU Privacy Guard (GPG) is compliant with PGP and the two acronyms are used interchangeably.

Problem

When setting a Polkadot on-chain identity, users may provide a PGP Fingerprint value in the "pgpFingerprint" field. Such values may be longer than 20 bytes/chars (e.g. PGP Fingerprints are 40 bytes/chars long); however, that field can only store a maximum of 20 bytes/chars of information.

Possible disadvantages of the current 20 bytes/chars limitation:

  • Discourages users from using the "pgpFingerprint" field.
  • Discourages users from using Polkadot on-chain identities for Web2 and Web3 dApp software releases where the latest "pgpFingerprint" field could be used to verify the correct PGP Fingerprint that has been used to sign the software releases so users that download the software know that it was from a trusted source.
  • Encourages dApps to link to Web2 sources to allow their users to verify the correct fingerprint associated with software releases, rather than to use the Web3 Polkadot on-chain identity "pgpFingerprint" field of the releaser of the software, since it may be the case that the "pgpFingerprint" field of most on-chain identities is not widely used due to the maximum length of 20 bytes/chars restriction.
  • Discourages users from setting an on-chain identity by creating an extrinsic using Polkadot.js with identity > setIdentity(info), since if they try to provide their 40 character long PGP Fingerprint or GPG Fingerprint, which is longer than the maximum length of 20 bytes/chars, they will encounter an error.
  • Discourages users from using on-chain Web3 registrars to judge on-chain identity fields, where the shortest value they are able to generate for a "pgpFingerprint" is not less than or equal to the maximum length of 20 bytes.

Solution Requirements

The maximum length of identity PGP Fingerprint values should be increased from the current 20 bytes/chars limit to at least a 40 bytes/chars limit to support PGP Fingerprints and GPG Fingerprints.

Stakeholders

  • Any Polkadot account holder wishing to use a Polkadot on-chain identity for their:
    • PGP Fingerprints that are longer than 32 characters
    • GPG Fingerprints that are longer than 32 characters

Explanation

If a user tries to set an on-chain identity by creating an extrinsic using Polkadot.js with identity > setIdentity(info), and they provide their 40 character long PGP Fingerprint or GPG Fingerprint, which is longer than the maximum length of 20 bytes/chars [u8;20], then they will encounter this error:

createType(Call):: Call: failed decoding identity.setIdentity:: Struct: failed on args: {...}:: Struct: failed on pgpFingerprint: Option<[u8;20]>:: Expected input with 20 bytes (160 bits), found 40 bytes

Increasing maximum length of identity PGP Fingerprint values from the current 20 bytes/chars limit to at least a 40 bytes/chars limit would overcome these errors and support PGP Fingerprints and GPG Fingerprints, satisfying the solution requirements.
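
As a minimal sketch, the change amounts to widening the fixed-size field in pallet-identity's IdentityInfo (other fields elided; the 40-byte width is what this RFC asks for, not an implemented change):

pub struct IdentityInfo {
    // ...other identity fields unchanged...

    /// Before: `Option<[u8; 20]>`. Proposed:
    pub pgp_fingerprint: Option<[u8; 40]>,
}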

Drawbacks

No drawbacks have been identified.

Testing, Security, and Privacy

Implementations would be tested for adherence by checking that 40 bytes/chars PGP Fingerprints are supported.

No effect on security or privacy has been identified beyond what already exists.

No implementation pitfalls have been identified.

Performance, Ergonomics, and Compatibility

Performance

It would be an optimization, since the associated interfaces exposed to developers and end-users could start being used.

To minimize additional overhead the proposal suggests a 40 bytes/chars limit since that would at least provide support for PGP Fingerprints, satisfying the solution requirements.

Ergonomics

No potential ergonomic optimizations have been identified.

Compatibility

Updates to Polkadot.js Apps, API and its documentation and those referring to it may be required.

Prior Art and References

No prior articles or references.

Unresolved Questions

No further questions at this stage.

Relates to RFC entitled "Increase maximum length of identity raw data values from 32 bytes".

(source)

Table of Contents

RFC-0088: Add slashable locked deposit, purchaser reputation, and reserved cores for on-chain identities to broker pallet

Start Date25 Apr 2024
DescriptionAdd slashable locked deposit, purchaser reputation, and reserved cores for on-chain identities to broker pallet
AuthorsLuke Schoen

Summary

This proposes to require a slashable deposit in the broker pallet when initially purchasing or renewing Bulk Coretime or Instantaneous Coretime cores.

Additionally, it proposes to record a reputational status based on the behavior of the purchaser, as it relates to their use of Kusama Coretime cores that they purchase, and to possibly reserve a proportion of the cores for prospective purchasers that have an on-chain identity.

Motivation

Background

There are sales of Kusama Coretime cores scheduled to occur later this month by the Coretime Marketplace Lastic.xyz, initially in limited quantities, and potentially also by RegionX in the future, subject to their Polkadot referendum #582. This poses a risk in that some purchasers may buy Kusama Coretime cores with no intention of actually placing a workload on them or leasing them out, which would prevent those that wish to purchase and actually use Kusama Coretime cores from being able to use any cores at all.

Problem

The types of purchasers may include:

  • Collectors (e.g. purchase a significant core such as the first core that is sold just to increase their likelihood of receiving an NFT airdrop for being one of the first purchasers).
  • Resellers (e.g. purchase a core that may be used at a popular period of time to resell closer to the date to realise a profit)
  • Market makers (e.g. buy cores just to change the floor price or volume).
  • Anti-competitive (e.g. competitor to Polkadot ecosystem purchases cores possibly in violation of anti-trust laws just to restrict access to prospective Kusama Coretime sales cores by the Kusama community that wish to do business in the Polkadot ecosystem).

Chaotic repercussions could include the following:

  • Generation of "white elephant" Kusama Coretime cores, similar to "white elephant" properties in the real-estate industry that never actually get used, leased or tenanted.
  • Kusama Coretime core resellers scalping the core time faster than the average core time consumer, and then choosing to use dynamic pricing that causes prices to fluctuate based on demand.
  • Resellers that own the Kusama Coretime scalping organisations may actually turn out to be the Official Kusama Coretime sellers.
  • Official Kusama Coretime sellers may establish a monopoly on the market and abuse that power by charging exorbitant additional fees for each purchase, since they could then increase their floor prices even more, pretending that there are fewer cores available and more demand in order to make extra profits from their scalping organisations, similar to how it occurred in these concert ticket sales. This could cause Kusama Coretime costs to no longer be affordable to the Kusama community.
  • Official Kusama Coretime sellers may run pre-sale events, but their websites may be unable to handle the traffic and crash multiple times, causing them to cancel those pre-sales and the pre-sale registrants to miss out on getting a core that way, which would then cause available Kusama Coretime cores to be bought and resold at a higher price on third-party sites.
  • The scalping activity may be illegal in some jurisdictions and raise anti-trust issues similar to the Taylor Swift debacle over concert tickets.

Solution Requirements

  1. On-chain identity. It may be possible to circumvent bots and scalpers to an extent by requiring a proportion of Kusama Coretime purchasers to have an on-chain identity. As such, a possible solution could be to allow the configuration of a threshold in the Broker pallet that reserves a proportion of the cores for accounts that have an on-chain identity, and reverts to a waiting list of anonymous account purchasers if the reserved proportion of cores remains unsold.

  2. Slashable deposit. A viable solution could be to require a slashable deposit to be locked prior to the purchase or renewal of a core, similar to how decision deposits are used in OpenGov to prevent spam, but where, if you buy a Kusama Coretime core, you could be challenged by one or more collectives of fishermen to provide proof against certain criteria of how you used it, and if you fail to provide adequate evidence in response to that scrutiny, then you would lose a proportion of that deposit and face restrictions on purchasing or renewing cores in future, which may also be configured on-chain.

  3. Reputation. To disincentivise certain behaviours, a reputational status indicator could be used to record the historic behavior of the purchaser and whether on-chain judgement has determined they have adequately rectified that behaviour, as it relates to their usage of Kusama Coretime cores that they purchase.
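
Purely as a rough illustration of the three requirements above, the knobs could look something like the following; all names and types are hypothetical and nothing here exists in the broker pallet today:

/// Hypothetical sale-protection parameters; values would be set via on-chain configuration.
pub struct SaleProtectionParams {
    /// Deposit locked on purchase/renewal, slashable if a challenge is not answered with proof of use.
    pub slashable_deposit: u128,
    /// Percentage of cores per sale reserved for purchasers with an on-chain identity
    /// (released to a waiting list of anonymous purchasers if unsold).
    pub identity_reserved_percent: u8,
    /// Number of blocks during which a purchase can be challenged by fishermen collectives.
    pub challenge_period_blocks: u32,
}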

Stakeholders

  • Any Kusama account holder wishing to use the Broker pallet in any upcoming Kusama Coretime sales.
  • Any prospective Kusama Coretime purchaser, developer, and user.
  • KSM holders.

Drawbacks

Performance

The slashable deposit, if set too high, may have an economic impact where fewer Kusama Coretime cores are purchased.

Testing, Security, and Privacy

Lack of a slashable deposit in the Broker pallet is a security concern, since it exposes Kusama Coretime sales to potential abuse.

Reserving a proportion of Kusama Coretime sales cores for those with on-chain identities should not be to the exclusion of accounts that wish to remain anonymous or cause cores to be wasted unnecessarily. As such, if cores that are reserved for on-chain identities remain unsold then they should be released to anonymous accounts that are on a waiting list.

No implementation pitfalls have been identified.

Performance, Ergonomics, and Compatibility

Performance

It should improve performance, as it reduces the potential for state bloat: there would be less risk of the undesirable Kusama Coretime sales activity that could occur with no slashable deposit requirement and no reputational risk to purchasers that waste or misuse Kusama Coretime cores.

The solution proposes to minimize the risk of some Kusama Coretime cores not even being used or leased to perform any tasks at all.

It will be important to monitor and manage the slashable deposits, purchaser reputations, and utilization of the proportion of cores that are reserved for accounts with an on-chain identity.

Ergonomics

The mechanism for setting a slashable deposit amount should avoid undue complexity for users.

Compatibility

Updates to Polkadot.js Apps, API and its documentation and those referring to it may be required.

Prior Art and References

Prior Art

No prior articles.

Unresolved Questions

None

(source)

Table of Contents

RFC-0001: Secondary Market for Regions

Start Date2024-06-09
DescriptionImplement a secondary market for region listings and sales
AuthorsAurora Poppyseed, Philip Lucsok

Summary

This RFC proposes the addition of a secondary market feature to either the broker pallet or as a separate pallet maintained by Lastic, enabling users to list and purchase regions. This includes creating, purchasing, and removing listings, as well as emitting relevant events and handling associated errors.

Motivation

Currently, the broker pallet lacks functionality for a secondary market, which limits users' ability to freely trade regions. This RFC aims to introduce a secure and straightforward mechanism for users to list regions they own for sale and allow other users to purchase these regions.

While integrating this functionality directly into the broker pallet is one option, another viable approach is to implement it as a separate pallet maintained by Lastic. This separate pallet would have access to the broker pallet and add minimal functionality necessary to support the secondary market.

Adding smart contracts to the Coretime chain could also address this need; however, this process is expected to be lengthy and complex. We cannot afford to wait for this extended timeline to enable basic secondary market functionality. By proposing either integration into the broker pallet or the creation of a dedicated pallet, we can quickly enhance the flexibility and utility of the broker pallet, making it more user-friendly and valuable.

Stakeholders

Primary stakeholders include:

  • Developers working on the broker pallet.
  • Secondary Coretime marketplaces.
  • Users who own regions and wish to trade them.
  • Community members interested in enhancing the broker pallet’s capabilities.

Explanation

This RFC introduces the following key features:

  1. Storage Changes:

    • Addition of Listings storage map to keep track of regions listed for sale and their prices.
  2. New Dispatchable Functions (see the sketch after this list):

    • create_listing: Allows a region owner to list a region for sale.
    • purchase_listing: Allows a user to purchase a listed region.
    • remove_listing: Allows a region owner to remove their listing.
  3. Events:

    • ListingCreated: Emitted when a new listing is created.
    • RegionSold: Emitted when a region is sold.
    • ListingRemoved: Emitted when a listing is removed.
  4. Error Handling:

    • ExpiredRegion: The region has expired and cannot be listed or sold.
    • UnknownListing: The listing does not exist.
    • InvalidPrice: The listing price is invalid.
    • NotOwner: The caller is not the owner of the region.
  5. Testing:

    • Comprehensive tests to verify the correct functionality of the new features, including listing creation, purchase, removal, and handling of edge cases such as expired regions and unauthorized actions.
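
A rough sketch of how the storage item and dispatchables listed above could look, reusing the broker pallet's RegionId type (signatures only; names and exact parameters are not final):

#[pallet::storage]
pub type Listings<T: Config> = StorageMap<_, Blake2_128Concat, RegionId, BalanceOf<T>>; // region -> asking price

/// List a region owned by the caller for sale at `price`.
pub fn create_listing(origin: OriginFor<T>, region_id: RegionId, price: BalanceOf<T>) -> DispatchResult

/// Purchase a listed region, paying the asking price to the current owner.
pub fn purchase_listing(origin: OriginFor<T>, region_id: RegionId) -> DispatchResult

/// Remove a listing previously created by the caller.
pub fn remove_listing(origin: OriginFor<T>, region_id: RegionId) -> DispatchResult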

Drawbacks

The main drawback of adding the additional complexity directly to the broker pallet is the potential increase in maintenance overhead. Therefore, we propose adding the additional functionality as a separate pallet on the Coretime chain. To take the pressure off of implementing these features, the implementation along with unit tests would be taken care of by Lastic (Aurora Makovac, Philip Lucsok).

There are potential risks of security vulnerabilities in the new market functionalities, such as unauthorized region transfers or incorrect balance adjustments. Therefore, extensive security measures would have to be implemented.

Testing, Security, and Privacy

Testing

  • Comprehensive unit tests need to be provided to ensure the correctness of the new functionalities.
  • Scenarios tested should include successful and failed listing creation, purchases, and removals, as well as edge cases like expired regions and non-owner actions.

Security

  • Security audits should be performed to identify any vulnerabilities.
  • Ensure that only region owners can create or remove listings.
  • Validate all inputs to prevent invalid operations.

Privacy

  • The proposal does not introduce new privacy concerns as it only affects region trading functionality within the existing framework.

Performance, Ergonomics, and Compatibility

Performance

  • This feature is expected to introduce minimal overhead since it primarily involves read and write operations to storage maps.
  • Efforts will be made to optimize the code to prevent unnecessary computational costs.

Ergonomics

  • The new functions are designed to be intuitive and easy to use, providing clear feedback through events and errors.
  • Documentation and examples will be provided to assist developers and users.

Compatibility

  • This proposal does not break compatibility with existing interfaces or previous versions.
  • No migrations are necessary as it introduces new functionality without altering existing features.

Prior Art and References

  • All related discussions are going to be under this PR.

Unresolved Questions

  • Are there additional security measures needed to prevent potential abuses of the new functionalities?
  • Integration with external NFT marketplaces for more robust trading options.
  • Development of user interfaces to interact with the new marketplace features seamlessly.
  • Exploration of adding smart contracts to the Coretime chain, which would provide greater flexibility and functionality for the secondary market and other decentralized applications. This would require a longer time for implementation, so this proposes an intermediary solution.

(source)

Table of Contents

RFC-0002: Smart Contracts on the Coretime Chain

Start Date2024-06-09
DescriptionImplement smart contracts on the Coretime chain
AuthorsAurora Poppyseed, Phil Lucksok

Summary

This RFC proposes the integration of smart contracts on the Coretime chain to enhance flexibility and enable complex decentralized applications, including secondary market functionalities.

Motivation

Currently, the Coretime chain lacks the capability to support smart contracts, which limits the range of decentralized applications that can be developed and deployed. By enabling smart contracts, the Coretime chain can facilitate more sophisticated functionalities such as automated region trading, dynamic pricing mechanisms, and other decentralized applications that require programmable logic. This will enhance the utility of the Coretime chain, attract more developers, and create more opportunities for innovation.

Additionally, while there is a proposal (#885) to allow EVM-compatible contracts on Polkadot’s Asset Hub, the implementation of smart contracts directly on the Coretime chain will provide synchronous interactions and avoid the complexities of asynchronous operations via XCM.

Stakeholders

Primary stakeholders include:

  • Developers working on the Coretime chain.
  • Users who want to deploy decentralized applications on the Coretime chain.
  • Community members interested in expanding the capabilities of the Coretime chain.
  • Secondary Coretime marketplaces.

Explanation

This RFC introduces the following key components:

  1. Smart Contract Support:

    • Integrate support for deploying and executing smart contracts on the Coretime chain.
    • Use a well-established smart contract platform, such as Ethereum’s Solidity or Polkadot's Ink!, to ensure compatibility and ease of use.
  2. Storage and Execution:

    • Define a storage structure for smart contracts and their associated data.
    • Ensure efficient and secure execution of smart contracts, with proper resource management and gas fee mechanisms.
  3. Integration with Existing Pallets:

    • Ensure that smart contracts can interact with existing pallets on the Coretime chain, such as the broker pallet.
    • Provide APIs and interfaces for seamless integration and interaction.
  4. Security and Auditing:

    • Implement robust security measures to prevent vulnerabilities and exploits in smart contracts.
    • Conduct thorough security audits and testing before deployment.
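
For a flavour of what this would enable, here is a minimal ink! contract (a trivial listing flag, chosen only as an illustration; it is not part of the proposal and assumes a pallet-contracts style execution environment on the Coretime chain):

#![cfg_attr(not(feature = "std"), no_std, no_main)]

#[ink::contract]
mod listing_flag {
    /// Toy contract: a single boolean that could mark a (hypothetical) region as listed.
    #[ink(storage)]
    pub struct ListingFlag {
        listed: bool,
    }

    impl ListingFlag {
        /// Deploy with the flag cleared.
        #[ink(constructor)]
        pub fn new() -> Self {
            Self { listed: false }
        }

        /// Toggle the listing flag.
        #[ink(message)]
        pub fn toggle(&mut self) {
            self.listed = !self.listed;
        }

        /// Read the current flag.
        #[ink(message)]
        pub fn is_listed(&self) -> bool {
            self.listed
        }
    }
}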

Drawbacks

There are several drawbacks to consider:

  • Complexity: Adding smart contracts introduces significant complexity to the Coretime chain, which may increase maintenance overhead and the potential for bugs.
  • Performance: The execution of smart contracts can be resource-intensive, potentially affecting the performance of the Coretime chain.
  • Security: Smart contracts are prone to vulnerabilities and exploits, necessitating rigorous security measures and continuous monitoring.

Testing, Security, and Privacy

Testing

  • Comprehensive unit tests and integration tests should be developed to ensure the correct functionality of smart contracts.
  • Test scenarios should include various use cases and edge cases to validate the robustness of the implementation.

Security

  • Security audits should be performed to identify and mitigate vulnerabilities.
  • Implement best practices for smart contract development to minimize the risk of exploits.
  • Continuous monitoring and updates will be necessary to address new security threats.

Privacy

  • The proposal does not introduce new privacy concerns as it extends existing functionalities with programmable logic.

Performance, Ergonomics, and Compatibility

Performance

  • The introduction of smart contracts may impact performance due to the additional computational overhead.
  • Optimization techniques, such as efficient gas fee mechanisms and resource management, should be employed to minimize performance degradation.

Ergonomics

  • The new functionality should be designed to be intuitive and easy to use for developers, with comprehensive documentation and examples.
  • Provide developer tools and SDKs to facilitate the creation and deployment of smart contracts.

Compatibility

  • This proposal should maintain compatibility with existing interfaces and functionalities of the Coretime chain.
  • Ensure backward compatibility and provide migration paths if necessary.

Prior Art and References

  • Ethereum’s implementation of smart contracts using Solidity.
  • Polkadot’s Ink! smart contract platform.
  • Existing decentralized applications and use cases on other blockchain platforms.
  • Proposal #885: EVM-compatible contracts on Asset Hub, which highlights the community's interest in integrating smart contracts within the Polkadot ecosystem.

Unresolved Questions

  • What specific security measures should be implemented to prevent smart contract vulnerabilities?
  • How can we ensure optimal performance while supporting complex smart contracts?
  • What are the best practices for integrating smart contracts with existing pallets on the Coretime chain?
  • Further enhancements could include advanced developer tools and SDKs for smart contract development.
  • Integration with external decentralized applications and platforms to expand the ecosystem.
  • Continuous updates and improvements to the smart contract platform based on community feedback and emerging best practices.
  • Exploration of additional use cases for smart contracts on the Coretime chain, such as decentralized finance (DeFi) applications, voting systems, and more.

By enabling smart contracts on the Coretime chain, we can significantly expand its capabilities and attract a wider range of developers and users, fostering innovation and growth in the ecosystem.

(source)

Table of Contents

RFC-0000: Feature Name Here

Start Date13 July 2024
DescriptionImplement off-chain parachain runtime upgrades
Authorseskimor

Summary

Change the upgrade process of a parachain runtime upgrade to become an off-chain process with regards to the relay chain. Upgrades are still contained in parachain blocks, but will no longer need to end up in relay chain blocks nor in relay chain state.

Motivation

Having parachain runtime upgrades go through the relay chain has always been seen as a scalability concern. Due to optimizations in statement distribution and asynchronous backing it became less crucial and got de-prioritized, the original issue can be found here.

With the introduction of Agile Coretime and our general efforts to further reduce the barrier to entry for Polkadot, the issue becomes more relevant again: We would like to reduce the required storage deposit for PVF registration, with the aim not only to make it cheaper to run a parachain (bulk + on-demand coretime), but also to reduce the amount of capital required for the deposit. With this we would hope for far more parachains to get registered, thousands, potentially even tens of thousands. With so many PVFs registered, updates are expected to become more frequent, and even attacks on the service quality of other parachains would become a higher risk.

Stakeholders

  • Parachain Teams
  • Relay Chain Node implementation teams
  • Relay Chain runtime developers

Explanation

The issues with on-chain runtime upgrades are:

  1. Needlessly costly.
  2. A single runtime upgrade more or less occupies an entire relay chain block, thus it might also affect other parachains, especially if their candidates are also not negligible in size (e.g. due to messages) or they want to upgrade their runtime at the same time.
  3. The signalling of the parachain to notify the relay chain of an upcoming runtime upgrade already contains the upgrade. Therefore the only way to rate limit upgrades is to drop an already distributed update in the size of megabytes, with the result that the parachain misses a block and, more importantly, will try again with the very next block until it finally succeeds. If we imagine reducing the capacity of runtime upgrades to, let's say, 1 every 100 relay chain blocks, this results in lots of wasted effort and lost blocks.

We discussed introducing a separate signalling step before submitting the actual runtime, but I think we should just go one step further and make upgrades fully off-chain. This also helps bring down deposit costs in a secure way, as we are actually reducing costs for the network.

Introduce a new UMP message type RequestCodeUpgrade

As part of elastic scaling we are already planning to increase flexibility of UMP messages, we can now use this to our advantage and introduce another UMP message:

#![allow(unused)]
fn main() {
enum UMPSignal {
  // For elastic scaling
  OnCore(CoreIndex),
  // For off-chain upgrades
  RequestCodeUpgrade(Hash),
}
}

We could also make that new message a regular XCM, calling an extrinsic on the relay chain, but we will want to look into that message right after validation on the backers on the node side, making a straightforward semantic message more apt for the purpose.

Handle RequestCodeUpgrade on backers

We will introduce a new request/response protocol for both collators and validators, with the following request/response:

#![allow(unused)]
fn main() {
struct RequestBlob {
  blob_hash: Hash,
}

struct BlobResponse {
  blob: Vec<u8>
}
}

This protocol will be used by backers to request the PVF from collators in the following conditions:

  1. They received a collation sending RequestCodeUpgrade.
  2. They received a collation, but they don't yet have the code that was previously registered on the relay chain (e.g. disk pruned, new validator).

In case they received the collation via PoV distribution instead of from the collator itself, they will use the exact same message to fetch it from the validator they got the PoV from.

Get the new code to all validators

Once the candidate issuing RequestCodeUpgrade got backed on chain, validators will start fetching the code from the backers as part of availability distribution.

To mitigate attack vectors we should make sure that serving requests for code can be treated as low priority requests. Thus I am suggesting the following scheme:

Validators will notice via a runtime API (TODO: Define) that a new code has been requested, the API will return the Hash and a counter, which starts at some configurable value e.g. 10. The validators are now aware of the new hash and start fetching, but they don't have to wait for the fetch to succeed to sign their bitfield.

Then, on each further candidate from that chain, the counter gets decremented. Validators which have not yet succeeded in fetching will now try again. This game continues until the counter reaches 0. From then on it is mandatory to have the code in order to sign a 1 in the bitfield.

PVF pre-checking will happen after the candidate which brought the counter to 0 has been successfully included, and can thus also assume that 2/3 of the validators have the code.
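
A rough sketch of the validator-side rule just described (all names are placeholders, not an existing API):

// The runtime reports a pending code hash plus a countdown; holding the new code only
// becomes mandatory for signing the availability bit once the countdown is exhausted.
struct PendingUpgrade {
    code_hash: [u8; 32],
    candidates_until_mandatory: u32, // starts at e.g. 10, decremented per further candidate
}

fn may_sign_available(
    pending: Option<&PendingUpgrade>,
    have_chunk: bool,
    have_code: impl Fn(&[u8; 32]) -> bool,
) -> bool {
    match pending {
        // No upgrade in flight: only the chunk matters.
        None => have_chunk,
        // Upgrade in flight but counter not yet at zero: keep fetching in the background.
        Some(p) if p.candidates_until_mandatory > 0 => have_chunk,
        // Counter exhausted: the validator must also hold the new code.
        Some(p) => have_chunk && have_code(&p.code_hash),
    }
}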

This scheme serves two purposes:

  1. Fetching can happen over a longer period of time with low priority. E.g. if we waited for the PVF at the very first availability distribution, this might actually affect the liveness of other chains on the same core. Distributing megabytes of data to a thousand validators might take a bit. Thus this helps isolate parachains from each other.
  2. By configuring the initial counter value we can affect how much an upgrade costs. E.g. forcing the parachain to produce 10 blocks, means 10x the cost for issuing an update. If too frequent upgrades ever become a problem for the system, we have a knob to make them more costly.

On-chain code upgrade process

First, when a candidate is backed, we need to make the new hash available (together with a counter) via a runtime API so validators in availability distribution can check for it and fetch it if it changed (see previous section). For performance reasons, I think we should not do an additional call, but replace the existing one with one containing the new additional information (Option<(Hash, Counter)>).

Once the candidate gets included (counter 0), the hash is given to pre-checking and only after pre-checking succeeded (and a full session passed) it is finally enacted and the parachain can switch to the new code. (Same process as it used to be.)

Handling new validators

Backers

If a backer receives a collation for a parachain it does not yet have the code as enacted on chain (see "On-chain code upgrade process"), it will use above request/response protocol to fetch it from whom it received the collation.

Availability Distribution

Validators in availability distribution will be changed to only sign a 1 in the bitfield of a candidate if they not only have the chunk, but also the currently active PVF. They will fetch it from backers in case they don't have it yet.

How do other parties get hold of the PVF?

Two ways:

  1. Discover collators via the relay chain DHT and request from them: the preferred way, as it puts less load on validators.
  2. Request from validators, which will serve on a best effort basis.

Pruning

We covered how validators get hold of new code, but when can they prune old ones? In principle it is not an issue if some validators prune code, because:

  1. We changed it so that a candidate is not deemed available if validators were not able to fetch the PVF.
  2. Backers can always fetch the PVF from collators as part of the collation fetching.

But the majority of validators should always keep the latest code of any parachain and only prune the previous one, once the first candidate using the new code got finalized. This ensures that disputes will always be able to resolve.

Drawbacks

The major drawback of this solution is the same as for any solution that moves work off-chain: it adds complexity to the node. E.g. nodes needing the PVF need to store it separately, together with their own pruning strategy.

Testing, Security, and Privacy

Implementations adhering to this RFC will respond to PVF requests with the actual PVF, if they have it. Requesters will persist received PVFs on disk until they are replaced by a new one. Implementations must not be lazy here: if validators only fetched the PVF when needed, they could be prevented from participating in disputes.

Validators should treat incoming requests for PVFs in general with rather low priority, but should prefer fetches from other validators over requests from random peers.

Given that we are altering what set bits in the availability bitfields mean (not only chunk available, but also PVF available), it is important to have enough validators upgraded before we allow collators to make use of the new runtime upgrade mechanism. Otherwise we would risk disputes not being able to succeed.

This RFC has no impact on privacy.

Performance, Ergonomics, and Compatibility

Performance

This proposal lightens the load on the relay chain and is thus in general beneficial for the performance of the network, this is achieved by the following:

  1. Code upgrades are still propagated to all validators, but only once, not twice (first via statements, then via the containing relay chain block).
  2. Code upgrades are only communicated to validators and other nodes which are interested, not to every full node as has been the case before.
  3. Relay chain block space is preserved. Previously we could only do one runtime upgrade per relay chain block, occupying almost all of the blockspace.
  4. Signalling an upgrade no longer contains the upgrade, hence if we need to push back on an upgrade for whatever reason, no network bandwidth and core time gets wasted because of this.

Ergonomics

End users are only affected by better performance and more stable block times. Parachains will need to implement the introduced request/response protocol and adapt to the new signalling mechanism via a UMP message, instead of sending the code upgrade directly.

For parachain operators we should emit events on an initiated runtime upgrade and on each block, reporting the current counter and how many blocks to go until the upgrade gets passed to pre-checking. This is especially important for on-demand chains or bulk users not occupying a full core. Furthermore, the behaviour of requiring multiple blocks to fully initiate a runtime upgrade needs to be well documented.

Compatibility

We will continue to support the old mechanism for code upgrades for a while, but will start to impose stricter limits over time as the number of registered parachains goes up. With those limits in place, parachains not migrating to the new scheme might have a harder time upgrading and will miss more blocks. I guess we can be lenient for a while still, so the upgrade path for parachains should be rather smooth.

In total the protocol changes we need are:

For validators and collators:

  1. New request/response protocol for fetching PVF data from collators and validators.
  2. New UMP message type for signalling a runtime upgrade.

Only for validators:

  1. New runtime API for determining to-be-enacted code upgrades.
  2. Different behaviour of bitfields (only sign a 1 bit if the validator has the chunk + "hot" PVF).
  3. Altered behaviour in availability-distribution: fetch missing PVFs.

Prior Art and References

Off-chain runtime upgrades have been discussed before; the architecture described here is simpler though, as it piggybacks on already existing features, namely:

  1. availability-distribution: no separate "I have code" messages anymore.
  2. Existing pre-checking.

https://github.com/paritytech/polkadot-sdk/issues/971

Unresolved Questions

  1. What about the initial runtime, shall we make that off-chain as well?
  2. Good news: at least after the first upgrade, no code will be stored on chain any more. This means that we also have to redefine the storage deposit: we no longer charge for chain storage, but for validator disk storage, which should be cheaper. Solution to this: not only store the hash on chain, but also the size of the data. Then define a price per byte and charge that (see the sketch after this list), but:
    • how do we charge? The deposit probably has to be provided via other means; the runtime upgrade fails if it is not provided.
    • how do we signal to the chain that the code is too large, so it can reject the upgrade? Easy: make the code available and vote nay in pre-checking.
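
A minimal sketch of that per-byte deposit calculation, with a placeholder price (the constant value is an assumption of this example):

fn code_storage_deposit(code_size_in_bytes: u128) -> u128 {
    // Placeholder price per byte of validator disk storage; the real value is to be defined.
    const DEPOSIT_PER_BYTE: u128 = 100_000;
    code_size_in_bytes.saturating_mul(DEPOSIT_PER_BYTE)
}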

TODO: Fully resolve these questions and incorporate in RFC text.

Further Hardening

By no longer having code upgrades go through the relay chain, occupying a full relay chain block, the impact on other parachains is already greatly reduced, provided we make distribution and PVF pre-checking low-priority processes on validators. The only thing attackers might be able to do is delay upgrades of other parachains.

That seems like a problem to be solved once we actually see it in the wild (and it can already be mitigated by adjusting the counter). The good thing is that we have all the ingredients to go further if need be. Signalling no longer actually includes the code, hence there is no need to reject the candidate: the parachain can make progress even if we choose not to immediately act on the request, and no relay chain resources are wasted either.

We could, for example, introduce another UMP signalling message RequestCodeUpgradeWithPriority which not just requests a code upgrade, but also offers some DOT to get ranked up in a queue.
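
As a rough illustration (variant and field names are assumptions, not a finalized format):

enum UmpSignal {
    // Request a code upgrade for the given new code hash.
    RequestCodeUpgrade { code_hash: [u8; 32] },
    // Same, but bid some DOT (in plancks) to be ranked higher in the upgrade queue.
    RequestCodeUpgradeWithPriority { code_hash: [u8; 32], bid: u128 },
}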

Generalize this off-chain storage mechanism?

Making this storage mechanism more general purpose is worth thinking about. E.g. by resolving the above "fee" question, we might also be able to resolve the pruning question in a more generic way and thus could indeed open this storage facility for other purposes as well, e.g. smart contracts: the PoV would only need to reference contracts by hash, and the actual contract code would be stored on validators and collators and thus would no longer need to be part of the PoV.

A possible avenue would be to change the response to:

#![allow(unused)]
fn main() {
enum BlobResponse {
  // A single blob, e.g. just the PVF.
  Blob(Vec<u8>),
  // A merkle tree of blob hashes (hashes only, no payloads).
  Blobs(MerkleTree),
}
}

With this, the hash specified in the request can also be a merkle root, and the responder will respond with the entire merkle tree (only hashes, no payload). The requester can then traverse the leaf hashes and use the same request/response protocol to request any locally missing blobs in that tree.
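
A hedged sketch of that requester-side traversal follows; BlobResponse is repeated from above for self-containment, while MerkleTree, BlobNetwork and BlobStore are illustrative placeholders rather than an existing API.

type Hash = [u8; 32];

struct MerkleTree {
    leaf_hashes: Vec<Hash>,
}

enum BlobResponse {
    Blob(Vec<u8>),
    Blobs(MerkleTree),
}

trait BlobNetwork {
    fn request(&mut self, hash: Hash) -> BlobResponse;
}

trait BlobStore {
    fn contains(&self, hash: &Hash) -> bool;
    fn insert(&mut self, hash: Hash, data: Vec<u8>);
}

fn fetch_missing_blobs(root: Hash, net: &mut impl BlobNetwork, store: &mut impl BlobStore) {
    match net.request(root) {
        // The hash referenced a single blob (e.g. just the PVF): store it directly.
        BlobResponse::Blob(data) => store.insert(root, data),
        // The hash was a merkle root: walk the leaf hashes and fetch only what is missing.
        BlobResponse::Blobs(tree) => {
            for leaf in tree.leaf_hashes {
                if !store.contains(&leaf) {
                    if let BlobResponse::Blob(data) = net.request(leaf) {
                        store.insert(leaf, data);
                    }
                }
            }
        }
    }
}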

One leaf would, for example, be the PVF; others could be smart contracts. With a properly specified format (e.g. which leaf is the PVF?), a parachain can not only update its PVF but also additional data, incrementally. E.g. adding another smart contract does not require resubmitting the entire PVF to validators: only the root hash on the relay chain gets updated, then validators fetch the merkle tree and only fetch any missing leaves. That additional data could be made available to the PVF via a to-be-added host function. The nice thing about this approach is that, while we can upgrade incrementally, lifetime is still tied to the PVF and we get all the same guarantees. Assuming the validators store blobs by hash, we even get disk sharing if multiple parachains use the same data (e.g. the same smart contracts).

(source)

Table of Contents

RFC-0106: Remove XCM fees mode

Start Date23 July 2024
DescriptionRemove the SetFeesMode instruction and fees_mode register from XCM
AuthorsFrancisco Aguirre

Summary

The SetFeesMode instruction and the fees_mode register allow for the existence of JIT withdrawal. JIT withdrawal complicates the fee mechanism and leads to bugs and unexpected behaviour. The proposal is to remove said functionality, as another effort to simplify fee handling in XCM.

Motivation

The JIT withdrawal mechanism creates bugs such as not being able to get fees when all assets are put into holding and none are left in the origin location. This is confusing behaviour, since there are funds for fees, just not where the XCVM wants them. The XCVM should have only one entrypoint for fee payment: the holding register. That way there is also less surface for bugs.

Stakeholders

  • Runtime Users
  • Runtime Devs
  • Wallets
  • dApps

Explanation

The SetFeesMode instruction will be removed. The Fees Mode register will be removed.

Drawbacks

Users will have to make sure to put enough assets in WithdrawAsset, where previously some fees might have been charged directly from their accounts. This leads to more predictable behaviour though, so it will only be a drawback for a minority of users.
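
For illustration, the sketch below shows a program that funds its own fees explicitly instead of relying on SetFeesMode { jit_withdraw: true }. The type and instruction names follow recent XCM versions (Asset, Location, the staging-xcm prelude) as an assumption of this example and may need adapting to the XCM version in use.

use xcm::latest::prelude::*;

fn explicit_fee_program(fee_asset: Asset, transfer_asset: Asset, beneficiary: Location) -> Xcm<()> {
    Xcm(vec![
        // Put both the fee asset and the transferred asset into holding up front.
        WithdrawAsset(vec![fee_asset.clone(), transfer_asset].into()),
        // Pay for execution out of holding: the only fee entrypoint after this RFC.
        BuyExecution { fees: fee_asset, weight_limit: Unlimited },
        // Use whatever remains in holding as intended.
        DepositAsset { assets: AssetFilter::Wild(WildAsset::AllCounted(1)), beneficiary },
    ])
}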

Testing, Security, and Privacy

Implementations and benchmarking must change for most existing pallet calls that send XCMs to other locations.

Performance, Ergonomics, and Compatibility

Performance

Performance will be improved since unnecessary checks will be avoided.

Ergonomics

JIT withdrawal was a way of side-stepping the regular flow of XCM programs. By removing it, the spec is simplified, but old use-cases now have to work with the originally intended behaviour, which may result in more implementation work.

Ergonomics for users will undoubtedly improve since the system is more predictable.

Compatibility

Existing programs in the ecosystem will break. The instruction should be deprecated as soon as this RFC is approved (but still fully supported), then removed in a subsequent XCM version (probably deprecate in v5, remove in v6).

Prior Art and References

The previous RFC PR on the xcm-format repo, before XCM RFCs were moved to fellowship RFCs: https://github.com/polkadot-fellows/xcm-format/pull/57.

Unresolved Questions

None.

The new generic fees mechanism is related to this proposal and further motivates it, as the JIT fee withdrawal mechanism will become redundant anyway.

(source)

Table of Contents

RFC-0114: Introduce secp256r1_ecdsa_verify_prehashed Host Function to verify NIST-P256 elliptic curve signatures

Start Date16 August 2024
DescriptionHost function to verify NIST-P256 elliptic curve signatures.
AuthorsRodrigo Quelhas

Summary

This RFC proposes a new host function, secp256r1_ecdsa_verify_prehashed, for verifying NIST-P256 signatures. The function takes as input the message hash, r and s components of the signature, and the x and y coordinates of the public key. By providing this function, runtime authors can leverage a more efficient verification mechanism for "secp256r1" elliptic curve signatures, reducing computational costs and improving overall performance.

Motivation

The “secp256r1” elliptic curve is standardized by NIST and uses the same kind of arithmetic as the “secp256k1” curve, only with different curve parameters. The cost of combined attacks and the security conditions are almost the same for both curves. Adding a host function that provides “secp256r1” signature verification to the runtime brings multi-faceted benefits. One important factor is that this curve is widely used and supported in many modern devices such as Apple’s Secure Enclave, WebAuthn and the Android Keystore, which demonstrates broad user adoption. Additionally, the introduction of this host function could enable valuable account-abstraction features, allowing more efficient and flexible management of accounts via transactions signed on mobile devices. Most modern devices and applications rely on the “secp256r1” elliptic curve, and the addition of this host function enables more efficient verification of device-native transaction signing mechanisms. For example:

  1. Apple's Secure Enclave: a separate “Trusted Execution Environment” in Apple hardware which can sign arbitrary messages and can only be accessed via biometric identification.
  2. WebAuthn: Web Authentication (WebAuthn) is a web standard published by the World Wide Web Consortium (W3C). WebAuthn aims to standardize an interface for authenticating users to web-based applications and services using public-key cryptography. It is supported by almost all modern web browsers.
  3. Android Keystore: an API that manages private keys and signing methods. Private keys are never exposed to the application when Keystore is used for signing, and signing can be performed inside the hardware's “Trusted Execution Environment”.
  4. Passkeys: passkeys build on FIDO Alliance and W3C standards. They replace passwords with cryptographic key pairs, which are typically based on elliptic curve cryptography.

Stakeholders

  • Runtime Authors

Explanation

This RFC proposes a new host function for runtime authors to leverage a more efficient verification mechanism for "secp256r1" elliptic curve signatures.

Proposed host function signature:

#![allow(unused)]
fn main() {
fn ext_secp256r1_ecdsa_verify_prehashed_version_1(
    sig: &[u8; 64],     // signature as `r || s`
    msg: &[u8; 32],     // 32-byte message hash (pre-hashed by the caller)
    pub_key: &[u8; 64], // public key as affine coordinates `x || y`
) -> bool;
}

The host function MUST return true if the signature is valid or false otherwise.
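
As a rough illustration of these semantics, the sketch below shows one possible host-side implementation using the p256 crate; the crate choice, API details and error handling are assumptions of this example, not a prescribed implementation.

use p256::ecdsa::{signature::hazmat::PrehashVerifier, Signature, VerifyingKey};

fn secp256r1_ecdsa_verify_prehashed(sig: &[u8; 64], msg: &[u8; 32], pub_key: &[u8; 64]) -> bool {
    // Re-encode the raw affine coordinates as an uncompressed SEC1 point: 0x04 || x || y.
    let mut sec1 = [0u8; 65];
    sec1[0] = 0x04;
    sec1[1..].copy_from_slice(pub_key);

    // Any malformed input results in `false`, matching the host function contract.
    let Ok(verifying_key) = VerifyingKey::from_sec1_bytes(&sec1) else { return false };
    let Ok(signature) = Signature::from_slice(sig) else { return false };
    verifying_key.verify_prehash(msg, &signature).is_ok()
}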

Drawbacks

N/A

Testing, Security, and Privacy

Security

The changes do not directly affect protocol security; parachains are not forced to use the host function.

Performance, Ergonomics, and Compatibility

Performance

N/A

Ergonomics

The host function proposed in this RFC allows parachain runtime developers to use a more efficient verification mechanism for "secp256r1" elliptic curve signatures.

Compatibility

To make use of the host function, parachain teams will need to upgrade to node versions that include it.

Prior Art and References

(source)

Table of Contents

RFC-0120: Referenda Confirmation by Candle Mechanism

Start Date22 March 2024
DescriptionProposal to decide polls after confirm period via a mechanism similar to a candle auction
AuthorsPablo Dorado, Daniel Olano

Summary

In an attempt to mitigate risks derived from unwanted behaviours around long decision periods on referenda, this proposal describes how to finalize and decide a result of a poll via a mechanism similar to candle auctions.

Motivation

The referenda protocol provides permissionless and efficient mechanisms that enable governance actors to decide the future of the blockchains in the Polkadot network. However, it poses a series of risks from a game-theoretic perspective. One of them is an actor using the public nature of a poll's tally to determine the best point in time to alter the poll in a meaningful way.

While this behaviour is expected given the current design of the referenda logic, the recent extension of ongoing times (up to 1 month) increases the incentives for a bad actor to cause losses to a proposer, reflected as a wasted opportunity cost. Thus, this otherwise reasonable outcome becomes an attack vector, a potential risk to mitigate, especially when such an attack can compromise critical guarantees of the protocol (such as its upgradeability).

To mitigate this, the underlying referenda mechanisms should incentivize actors to cast their votes on a poll as early as possible. This proposal suggests using a mechanism similar to a candle auction, whose outcome is determined right after the confirm period finishes, thus decreasing the chances of actors altering the results of a poll in the confirming state, and instead incentivizing them to cast their votes earlier, in the deciding state.

Stakeholders

  • Governance actors: Tokenholders and Collectives that vote on polls that have this mechanism enabled should be aware this change affects the outcome of failing a poll on its confirm period.
  • Runtime Developers: This change requires runtime developers to change configuration parameters for the Referenda Pallet.
  • Tooling and UI developers: Applications that interact with referenda must update to reflect the new Finalizing state.

Explanation

Currently, the process of a referendum/poll is defined as a sequence of periods within an ongoing state (where accounts can vote): a preparation period, a decision period, and a confirm period. If the poll is passing before the decision period ends, it is possible to push forward to the confirm period, and still go back in case the poll fails. Once the decision period ends, a failure of the poll in the confirm period will lead to the poll ultimately being rejected.

stateDiagram-v2
    sb: Submission
    pp: Preparation Period
    dp: Decision Period
    cp: Confirmation Period
    state dpd <<choice>>
    state ps <<choice>>
    cf: Approved
    rj: Rejected

    [*] --> sb
    sb --> pp
    pp --> dp: decision period starts
    dp --> cp: poll is passing
    dp --> ps: decision period ends
    ps --> cp: poll is passing
    cp --> dpd: poll fails
    dpd --> dp: decision period not deadlined
    ps --> rj: poll is failing
    dpd --> rj: decision period deadlined
    cp --> cf
    cf --> [*]
    rj --> [*]

This specification proposes three changes to implement this candle mechanism:

  1. This mechanism MUST be enabled via a configuration parameter. Once enabled, the referenda system MAY record the next poll ID from which to start enabling this mechanism. This is to preserve backwards compatibility with currently ongoing polls.

  2. A record of the poll status (whether it is passing or not) is stored once the decision period is finished.

  3. Including a Finalization period as part of the ongoing state. From this point, the poll MUST be immutable.

    This period begins the moment the confirm period ends, and extends the decision for a couple of blocks, until the VRF seed used to determine the candle block can be considered "good enough", that is, one that could not have been known before the ongoing period (decision/confirmation) was over.

    Once that happens, a random block within the confirm period is chosen, and the decision to approve or reject the poll is based on the recorded status immediately before the block where the candle was "lit-off" (see the sketch below).
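
A minimal sketch of how the candle block could be selected is given below; the function, its parameters, and the uniform mapping are assumptions of this example, not part of the referenda pallet's API.

// Pick the "candle" block within [confirm_start, confirm_end] (inclusive, with
// confirm_end >= confirm_start) using randomness that only becomes available after
// the confirmation period ends.
fn candle_block(confirm_start: u32, confirm_end: u32, vrf_seed: [u8; 32]) -> u32 {
    // Interpret the first 8 bytes of the seed as an integer and map it onto the
    // confirmation period (modulo bias is ignored for brevity).
    let raw = u64::from_le_bytes(vrf_seed[..8].try_into().expect("slice is 8 bytes; qed"));
    let period_len = u64::from(confirm_end - confirm_start) + 1;
    confirm_start + (raw % period_len) as u32
}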

When enabled, the state diagram for the referenda system is the following:

stateDiagram-v2
    sb: Submission
    pp: Preparation Period
    dp: Decision Period
    cp: Confirmation Period
    cds: Finalization
    state dpd <<choice>>
    state ps <<choice>>
    state cd <<choice>>
    cf: Approved
    rj: Rejected

    [*] --> sb
    sb --> pp
    pp --> dp: decision period starts
    dp --> cp: poll is passing
    ps --> cp: poll is passing
    dp --> ps: decision period ends
    ps --> rj: poll is failing
    cp --> dpd: poll fails
    dpd --> cp: decision period over
    dpd --> dp: decision period not over
    cp --> cds: confirmation period ends
    cds --> cd: define moment when candle lit-off
    cd --> cf: poll passed
    cd --> rj: poll failed
    cf --> [*]
    rj --> [*]

Drawbacks

This approach doesn't include a mechanism to determine whether a change of the poll status in the confirming period is due to a legitimate change of mind of the voters, or an exploitation of its aforementioned vulnerabilities (like a sniping attack), instead treating all of them as potential attacks.

This is an issue that can be addressed by additional mechanisms, and heuristics that can help determine the probability of a change of poll status to happen as a result of a legitimate behaviour.

Testing, Security, and Privacy

The implementation of this RFC will be tested on testnets (Paseo and Westend) first. Furthermore, it should be enabled on a canary network (like Kusama) to ensure the behaviours it is trying to address are indeed avoided.

An audit will be required to ensure the implementation doesn't introduce unwanted side effects.

There are no privacy related concerns.

Performance, Ergonomics, and Compatibility

Performance

The added steps imply a necessary pessimization to achieve the expected behaviour. An implementation MUST exit the Finalization period as early as possible to minimize this impact.

Ergonomics

This proposal does not alter the interfaces already exposed to developers or end users. However, they must be aware of the additional overhead the new period might incur (which depends on the implemented VRF).

Compatibility

This proposal does not break compatibility with existing interfaces or older versions, but it alters the previous implementation of the referendum processing algorithm.

An acceptable upgrade strategy that can be applied is defining a point in time (block number, poll index) from which to start applying the new mechanism, thus, not affecting the already ongoing referenda.

Prior Art and References

Unresolved Questions

  • How to determine in a statistically meaningful way that a change in the poll status corresponds to an organic behaviour, and not an unwanted, malicious behaviour?

A proposed implementation of this change can be seen on this Pull Request.

(source)

Table of Contents

RFC-114: Adjust Tipper Track Confirmation Periods

Start Date17-Aug-24
DescriptionBig and Small Tipper Track Confirmation Period Modification
AuthorsLeemo / ChaosDAO

Summary

This RFC proposes to change the duration of the Confirmation Period for the Big Tipper and Small Tipper tracks in Polkadot OpenGov:

  • Small Tipper: 10 Minutes -> 12 Hours

  • Big Tipper: 1 Hour -> 1 Day

Motivation

Currently, these are the confirmation period durations of the treasury tracks in Polkadot OpenGov. Confirmation periods for the Spender tracks were adjusted based on RFC20 and its related conversation.

Track Description | Confirmation Period Duration
Treasurer | 7 Days
Big Spender | 7 Days
Medium Spender | 4 Days
Small Spender | 2 Days
Big Tipper | 1 Hour
Small Tipper | 10 Minutes

You can see that there is a general trend on the Spender tracks: as the privilege level (the amount the track can spend) increases, the confirmation period approximately doubles.

I believe that the Big Tipper and Small Tipper tracks' confirmation periods should be adjusted to match this trend.

In the current state it is possible to somewhat positively snipe these tracks, and whilst the power/privilege level of these tracks is very low (they cannot spend a large amount of funds), I believe we should increase their confirmation periods. This is backed up by recent sentiment in the greater community regarding referenda submitted on these tracks. The parameters of Polkadot OpenGov can be adjusted based on the general sentiment of token holders when necessary.

Stakeholders

The primary stakeholders of this RFC are:

  • DOT token holders – as this affects the protocol's treasury
  • Entities wishing to submit a referendum on these tracks – as this affects the referendum's timeline
  • Projects with governance app integrations – see the Performance, Ergonomics and Compatibility section below

Explanation

This RFC proposes to change the duration of the confirmation period for both the Big Tipper and Small Tipper tracks. To achieve this the confirm_period parameter for those tracks should be changed.

You can see the lines of code that need to be adjusted here:

  • Big Tipper: https://github.com/polkadot-fellows/runtimes/blob/f4c5d272d4672387771fb038ef52ca36f3429096/relay/polkadot/src/governance/tracks.rs#L245

  • Small Tipper: https://github.com/polkadot-fellows/runtimes/blob/f4c5d272d4672387771fb038ef52ca36f3429096/relay/polkadot/src/governance/tracks.rs#L231

This RFC proposes to change the confirm_period for the Big Tipper track to DAYS (i.e. 1 Day) and the confirm_period for the Small Tipper track to 12 * HOURS (i.e. 12 Hours).
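
For illustration, the sketch below spells out the current and proposed values as block counts, assuming Polkadot's 6-second block time; the constant names mirror the runtime's conventions, but the snippet itself is not the runtime code.

// Block-count constants for a 6-second block time.
const MINUTES: u32 = 10;          // 60 s / 6 s per block
const HOURS: u32 = 60 * MINUTES;  // 600 blocks
const DAYS: u32 = 24 * HOURS;     // 14_400 blocks

// Current confirm periods.
const SMALL_TIPPER_CONFIRM_CURRENT: u32 = 10 * MINUTES; // 100 blocks (~10 minutes)
const BIG_TIPPER_CONFIRM_CURRENT: u32 = HOURS;          // 600 blocks (~1 hour)

// Confirm periods proposed by this RFC.
const SMALL_TIPPER_CONFIRM_PROPOSED: u32 = 12 * HOURS;  // 7_200 blocks (~12 hours)
const BIG_TIPPER_CONFIRM_PROPOSED: u32 = DAYS;          // 14_400 blocks (~1 day)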

Drawbacks

The drawback of changing these confirmation periods is that the lifecycle of referenda submitted on those tracks would ultimately be longer, and it would add a greater potential to negatively "snipe" referenda on those tracks by knocking a referendum out of its confirmation period once the decision period has ended. This can be a good or a bad thing depending on your outlook on positive vs negative sniping.

Testing, Security, and Privacy

This referendum will enhance the security of the protocol as it relates to its treasury. The confirmation period is one of the last lines of defense for the Polkadot token holder DAO to react to a potentially bad referendum and vote NAY in order for its confirmation period to be aborted.

Performance, Ergonomics, and Compatibility

Performance

This is a simple change (code wise) that should not affect the performance of the Polkadot protocol, outside of increasing the duration of the confirmation periods for these 2 tracks.

Ergonomics & Compatibility

As per the implementation of changes described in RFC-20, it was identified that governance UIs automatically update to meet the new parameters:

  • Nova Wallet - directly uses on-chain data, and change will be automatically reflected.
  • Polkassembly - directly uses on-chain data via rpc to fetch trackInfo so the change will be automatically reflected.
  • SubSquare - scan script will update their app to the latest parameters and it will be automatically reflected in their app.

Prior Art and References

N/A

Unresolved Questions

Some token holders may want these confirmation periods to remain as they are and not increase. If the Polkadot Technical Fellowship considers this an issue for implementing in a runtime upgrade, then I can create a Wish For Change referendum to obtain token holder approval.

The parameters of Polkadot OpenGov will likely continue to change over time; there are additional discussions in the community regarding adjusting the min_support for some tracks so that it does not trend towards 0%, similar to the current state of the Whitelisted Caller track. This is outside the scope of this RFC and requires a lot more discussion.

(source)

Table of Contents

RFC-TODO: Stale Nomination Reward Curve

Start Date10 July 2024
DescriptionIntroduce a decaying reward curve for stale nominations in staking.
AuthorsShawn Tabrizi

Summary

This is a proposal to reduce the impact of stale nominations in the Polkadot staking system. With this proposal, nominators are incentivized to update or renew their selected validators once per time period. Nominators that do not update or renew their selected validators would be considered stale, and a decaying multiplier would be applied to their nominations, reducing the weight of their nomination and rewards.
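
Since the RFC does not yet specify a curve, the following is only a hypothetical sketch of what a decaying multiplier could look like; the grace period, decay rate, and fixed-point scale are all assumptions of this example.

// Returns the weight multiplier (in parts per billion) applied to a nomination that
// was last updated or renewed `eras_since_renewal` eras ago.
fn stale_nomination_multiplier(eras_since_renewal: u32) -> u64 {
    const GRACE_ERAS: u32 = 28;                 // assumed window in which a nomination counts in full
    const DECAY_PER_ERA_PPB: u64 = 10_000_000;  // assumed 1% decay per era past the grace window
    const FULL: u64 = 1_000_000_000;            // full weight, expressed in parts per billion

    if eras_since_renewal <= GRACE_ERAS {
        return FULL;
    }
    let stale_eras = u64::from(eras_since_renewal - GRACE_ERAS);
    FULL.saturating_sub(stale_eras.saturating_mul(DECAY_PER_ERA_PPB))
}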

Motivation

Longer motivation behind the content of the RFC, presented as a combination of both problems and requirements for the solution.

One of Polkadot's primary utilities is providing a high quality security layer for applications built on top of it. To achieve this, Polkadot runs a Nominated Proof-of-Stake system, allowing nominators to vote on who they think are the best validators for Polkadot.

This system functions best when nominators and validators are active participants in the network. Nominators should consistently evaluate the quality and preferences of validators, and adjust their nominations accordingly.

Unfortunately, many Polkadot nominators do not play an active role in the NPoS system. Many set their nominations once and then seldom look back at them.

This can lead to many negative behaviors:

  • Incumbents who received early nominations basically achieve tenure.
  • Validator quality and performance can decrease without recourse.
  • The validator set is not optimal for Polkadot.
  • New validators have a harder time entering the active set.
  • Validators are able to "sneakily" increase their commission.

Stakeholders

Primary stakeholders are:

  • Nominators
  • Validators

Explanation

Detail-heavy explanation of the RFC, suitable for explanation to an implementer of the changeset. This should address corner cases in detail and provide justification behind decisions, and provide rationale for how the design meets the solution requirements.

Drawbacks

Description of recognized drawbacks to the approach given in the RFC. Non-exhaustively, drawbacks relating to performance, ergonomics, user experience, security, or privacy.

Testing, Security, and Privacy

Describe the the impact of the proposal on these three high-importance areas - how implementations can be tested for adherence, effects that the proposal has on security and privacy per-se, as well as any possible implementation pitfalls which should be clearly avoided.

Performance, Ergonomics, and Compatibility

Describe the impact of the proposal on the exposed functionality of Polkadot.

Performance

Is this an optimization or a necessary pessimization? What steps have been taken to minimize additional overhead?

Ergonomics

If the proposal alters exposed interfaces to developers or end-users, which types of usage patterns have been optimized for?

Compatibility

Does this proposal break compatibility with existing interfaces, older versions of implementations? Summarize necessary migrations or upgrade strategies, if any.

Prior Art and References

Provide references to either prior art or other relevant research for the submitted design.

Unresolved Questions

Provide specific questions to discuss and address before the RFC is voted on by the Fellowship. This should include, for example, alternatives to aspects of the proposed design where the appropriate trade-off to make is unclear.

Describe future work which could be enabled by this RFC, if it were accepted, as well as related RFCs. This is a place to brain-dump and explore possibilities, which themselves may become their own RFCs.