U.S. patent application number 17/051083 was filed with the patent office on 2021-07-29 for partitioning a blockchain network.
The applicant listed for this patent is nChain Holdings Limited. Invention is credited to Bassem AMMAR, Dean KRAMER, Martin SEWELL.
Application Number | 20210233074 17/051083 |
Document ID | / |
Family ID | 1000005541553 |
Filed Date | 2021-07-29 |
United States Patent Application | 20210233074 |
Kind Code | A1 |
KRAMER; Dean; et al. | July 29, 2021 |
PARTITIONING A BLOCKCHAIN NETWORK
Abstract
A computer-implemented method for validating a blockchain
transaction is disclosed. The method comprises identifying at least
one shard comprising at least one UTXO referenced by at least one
respective input of the transaction, transmitting the transaction
to at least one member node of at least one shard, and performing a
validation check on at least one input using validity data of the
UTXO.
Inventors: | KRAMER; Dean; (London, GB); SEWELL; Martin; (London, GB); AMMAR; Bassem; (Lancaster, EN) |
Applicant: |
Name | City | State | Country | Type |
nChain Holdings Limited | St. John's | | AG | |
Family ID: | 1000005541553 |
Appl. No.: | 17/051083 |
Filed: | April 24, 2019 |
PCT Filed: | April 24, 2019 |
PCT NO: | PCT/IB2019/053382 |
371 Date: | October 27, 2020 |
Current U.S. Class: | 1/1 |
Current CPC Class: | H04L 2209/38 20130101; G06F 9/3836 20130101; H04L 9/0643 20130101; G06Q 20/401 20130101 |
International Class: | G06Q 20/40 20060101 G06Q020/40; H04L 9/06 20060101 H04L009/06; G06F 9/38 20060101 G06F009/38 |
Foreign Application Data
Date | Code | Application Number |
Apr 27, 2018 | GB | 1806907.0 |
Apr 27, 2018 | GB | 1806909.6 |
Apr 27, 2018 | GB | 1806911.2 |
Apr 27, 2018 | GB | 1806914.6 |
Apr 27, 2018 | GB | 1806930.2 |
Claims
1. A computer-implemented method for validating a blockchain
transaction, the method comprising: identifying at least one shard
comprising at least one UTXO referenced by at least one respective
input of the blockchain transaction; transmitting the blockchain
transaction to at least one member node of at least one shard; and
performing a validation check on at least one input using validity
data of the UTXO.
2. The method of claim 1, further comprising the step of
communicating a request for shard membership information of a node
to another node.
3. The method of claim 1, further comprising the step of
communicating shard membership information of a node to another
node.
4. The method of claim 2, wherein the step of communicating is
performed using a modified addr message.
5. A system, comprising: a processor; and memory including
executable instructions that, as a result of execution by the
processor, causes the system to perform the computer-implemented
method of claim 1.
6. A non-transitory computer-readable storage medium having stored
thereon executable instructions that, as a result of being executed
by a processor of a computer system, cause the computer system to
at least perform the computer-implemented method of claim 1.
7. A system, comprising: a processor; and memory including
executable instructions that, as a result of execution by the
processor, causes the system to perform the computer-implemented
method of claim 2.
8. A system, comprising: a processor; and memory including
executable instructions that, as a result of execution by the
processor, causes the system to perform the computer-implemented
method of claim 3.
9. A system, comprising: a processor; and memory including
executable instructions that, as a result of execution by the
processor, causes the system to perform the computer-implemented
method of claim 4.
10. A non-transitory computer-readable storage medium having stored
thereon executable instructions that, as a result of being executed
by a processor of a computer system, cause the computer system to
at least perform the computer-implemented method of claim 2.
11. A non-transitory computer-readable storage medium having stored
thereon executable instructions that, as a result of being executed
by a processor of a computer system, cause the computer system to
at least perform the computer-implemented method of claim 3.
12. A non-transitory computer-readable storage medium having stored
thereon executable instructions that, as a result of being executed
by a processor of a computer system, cause the computer system to
at least perform the computer-implemented method of claim 4.
13. The method of claim 1, wherein the validity data comprises an
unlocking script and a locking script.
14. The method of claim 1, wherein the shard is a partition in a
sharded blockchain network.
15. The method of claim 1, wherein performing the validity check
comprises determining that the blockchain transaction will not
result in a double spend.
16. The system of claim 5, wherein the shard is a partition of a
blockchain network.
17. The system of claim 5, wherein the validity check indicates
whether the blockchain transaction would result in a double spend
of the UTXO.
18. The system of claim 5, wherein performing the validity check
comprises executing a script of a stack-based scripting language to
determine whether the script evaluates to TRUE.
19. The non-transitory computer-readable storage medium of claim 6,
wherein the shard is a member in a blockchain network.
20. The non-transitory computer-readable storage medium of claim 6,
wherein the validity data comprises a set of scripts that are
evaluated to perform the validity check.
Description
[0001] The present disclosure relates to a method for partitioning
a blockchain network and a method for validating transactions of a
partitioned blockchain network, and relates particularly, but not
exclusively, to a method for partitioning the unspent transaction
output (UTXO) set of the Bitcoin blockchain and a method for
validating transactions of a partitioned UTXO set of the Bitcoin
blockchain.
[0002] In this document we use the term `blockchain` to include all
forms of electronic, computer-based, distributed ledgers. These
include consensus-based blockchain and transaction-chain
technologies, permissioned and un-permissioned ledgers, shared
ledgers and variations thereof. The most widely known application
of blockchain technology is the Bitcoin ledger, although other
blockchain implementations have been proposed and developed. While
Bitcoin may be referred to herein for the purpose of convenience
and illustration, it should be noted that the disclosure is not
limited to use with the Bitcoin blockchain and alternative
blockchain implementations and protocols fall within the scope of
the present disclosure. The term "user" may refer herein to a human
or a processor-based resource. The term "Bitcoin" is used herein to
include any version or variation that derives from or is based on
the Bitcoin protocol.
[0003] A blockchain is a peer-to-peer, electronic ledger which is
implemented as a computer-based decentralised, distributed system
made up of blocks which in turn are made up of transactions. Each
transaction is a data structure that encodes the transfer of
control of a digital asset between participants in the blockchain
system, and includes at least one input and at least one output.
Each block contains a hash of the previous block so that blocks
become chained together to create a permanent, unalterable record
of all transactions which have been written to the blockchain since
its inception. Transactions contain small programs known as scripts
embedded into their inputs and outputs, which specify how and by
whom the outputs of the transactions can be accessed. On the
Bitcoin platform, these scripts are written using a stack-based
scripting language.
[0004] In order for a transaction to be written to the blockchain,
it must be "validated". Network nodes (miners) perform work to
ensure that each transaction is valid, with invalid transactions
rejected from the network. Software clients installed on the nodes
perform this validation work on an unspent transaction output (UTXO)
by executing its locking and unlocking scripts. If execution of the
locking and unlocking scripts evaluates to TRUE, the transaction is
valid and the transaction is written to the blockchain. Thus, in
order for a transaction to be written to the blockchain, it must be
i) validated by the first node that receives the transaction--if
the transaction is validated, the node relays it to the other nodes
in the network; and ii) added to a new block built by a miner; and
iii) mined, i.e. added to the public ledger of past
transactions.
[0005] Although blockchain technology is most widely known for its
use in implementing cryptocurrencies, digital entrepreneurs have
begun exploring the use of both the cryptographic security system
Bitcoin is based on and the data that can be stored on the
Blockchain to implement new systems. It would be highly
advantageous if the blockchain could be used for automated tasks
and processes which are not limited to the realm of cryptocurrency.
Such solutions would be able to harness the benefits of the
blockchain (e.g. permanent, tamper-proof records of events,
distributed processing etc.) while being more versatile in their
applications.
[0006] As discussed above, a blockchain network, for example the
Bitcoin blockchain network, is a secure distributed computing
system. Full nodes of the system persist and manage a copy of the
entire blockchain, sending and receiving transactions, validating
them, and adding blocks to the blockchain based on a shared
decentralized consensus protocol. This approach, while secure, does
have scaling flaws related to the fact that each transaction is
validated and stored by every full node. In terms of validation,
this causes delays in network propagation of transactions, as each
transaction needs to be validated before it can be propagated
onwards towards a miner. Furthermore, delays attributed to
validation render the network susceptible to "double-spend"
related attacks, such as Sybil attacks.
[0007] The present disclosure aims to improve the scalability,
speed, reliability as well as security of the blockchain network
through the use of horizontal partitioning, also known as sharding,
as well as associated techniques or protocols for allocating and/or
validating transactions on a sharded blockchain network. Disclosed
herein are: [0008] A network structure for a sharded blockchain;
and [0009] a sharded UTXO and mempool structure.
[0010] Partitioning in the art considers two specific dimensions:
horizontal and vertical. In the partitioned sections of a
horizontally-partitioned database, known as shards, there are
effectively multiple instances of a specific database schema, with
data spread across each of these instances, discounting instance
redundancy. Vertical partitioning however is the splitting of a
given database schema across multiple nodes, whereby attributes of
a specific object are spread using normalisation.
[0011] Different parties wanting to be involved in blockchain
networks can possess a range of computing resources, ranging from
small low-powered machines, to server farms. Participating parties
are therefore limited by computing resources to a predetermined
level of involvement in the blockchain network.
[0012] In Bitcoin, the blockchain itself is a set of linked
transactions which mark specific changes in the ownership of coins
which are mined at the creation of a block. During transaction
validation, one of the checks required is to check that there has
not been a double spend. A double spend is when a transaction
output has been referenced in a transaction input either already in
the Bitcoin mempool, or confirmed on the blockchain. A mempool is
a memory pool or area for Bitcoin
transactions that each full node maintains for itself.
Traditionally, after a transaction is verified by a node, it waits
inside a mempool until inserted into a block. To make the
validation of transactions more efficient in terms of checking
transaction inputs, instead of validating the entire blockchain,
the current state of the network is kept within an independent
structure known as the UTXO set. This structure contains each
transaction output which is yet to be spent by a transaction, which
can include coinbase and standard transactions.
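The double-spend check against the UTXO set described above can be sketched as follows. This is a minimal illustration rather than Bitcoin client code: the names `Outpoint`, `is_double_spend`, `utxo_set` and `mempool_spent` are hypothetical, and both the UTXO set and the set of outputs already spent by mempool transactions are modelled as plain in-memory sets.

```python
from typing import Iterable, NamedTuple, Set


class Outpoint(NamedTuple):
    """Reference to a transaction output: (transaction id, output index)."""
    txid: str
    index: int


def is_double_spend(inputs: Iterable[Outpoint],
                    utxo_set: Set[Outpoint],
                    mempool_spent: Set[Outpoint]) -> bool:
    """Return True if any input references an output that is not spendable.

    An output is treated as unavailable if it is absent from the UTXO set
    (already spent on the blockchain, or never created), if a transaction
    waiting in the mempool already references it, or if the same output
    is referenced twice within this transaction.
    """
    seen: Set[Outpoint] = set()
    for outpoint in inputs:
        if outpoint not in utxo_set:
            return True
        if outpoint in mempool_spent:
            return True
        if outpoint in seen:
            return True
        seen.add(outpoint)
    return False
```

Checking membership in a set of unspent outputs avoids rescanning the whole blockchain for every input, which is the efficiency gain the paragraph above attributes to the UTXO set.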
[0013] According to an aspect of the present disclosure, there is
provided a computer-implemented method of partitioning a blockchain
network into shards. The method comprises: identifying a
transaction id of a blockchain transaction; and allocating the
transaction to a shard based on the transaction id.
[0014] Partitioning a blockchain network into shards enables users
to choose their own level of involvement with the blockchain
network. Each user can choose to be a member of one or more shards.
A user who is a member of fewer than all shards requires less
storage space to store all of the transactions allocated to the
shards of which the user is a member. Allocating a transaction to a
shard based on its transaction id provides the advantage that the
resulting shard sizes will be approximately equal, thereby avoiding
placing undue burden on members of a larger shard relative to
members of a smaller shard, while at the same time enabling the
transactions and associated verifications to be performed
accurately, and without any undue delays.
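Transaction-id-based allocation can be sketched as below. The sketch assumes that the operation applied to the transaction id is a modulo on the id read as an integer (a modulo operation is one option named in this disclosure). The function names `shard_for_txid` and `fake_txid` are hypothetical, and the transaction ids are stand-in hashes of arbitrary bytes, not real Bitcoin data.

```python
import hashlib
from collections import Counter


def shard_for_txid(txid_hex: str, num_shards: int) -> int:
    """Map a hexadecimal transaction id to a shard index in [0, num_shards)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return int(txid_hex, 16) % num_shards


def fake_txid(i: int) -> str:
    """Stand-in transaction id: double SHA-256 of arbitrary bytes (illustrative only)."""
    first = hashlib.sha256(str(i).encode()).digest()
    return hashlib.sha256(first).hexdigest()


# Count how many of 10,000 stand-in ids land in each of 4 shards.
counts = Counter(shard_for_txid(fake_txid(i), 4) for i in range(10_000))
```

Because the ids are hash outputs, the per-shard counts in `counts` come out close to one another, which illustrates the approximately equal shard sizes described above.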
[0015] The users referred to herein may be associated with one or more
nodes or computing devices, and these nodes may also be referred to
as client entities in the partitioned blockchain network.
Hereinafter, a reference to a user may be also understood to be a
reference to the node or entity associated with the user (that may
own or control the node or entity that is part of the sharded or
partitioned blockchain network). Each node may be communicatively
coupled with at least one or more other nodes in the partitioned
blockchain network.
[0016] These advantages discussed herein that are associated with
this as well as other aspects of the present disclosure (discussed
below) are attributed to the structure of the nodes and resulting
network topology and architecture of the sharded blockchain network
and protocols associated with the nodes of the network. Receiving,
storing and/or validating UTXOs in such a sharded network is
performed using the described and claimed methods, rules or
protocols for communication, data storage, data sharing as well as
validation techniques for nodes within each shard, as well as based
on the rules and protocols associated for communication with nodes
belonging to different shards.
[0017] These specific structures, methods of data flow, transaction
allocation and validation protocols will be further explained below
with respect to the various embodiments of the present disclosure.
Advantageously, the sharded network structure or architecture and
associated methods described herein for allocation of transactions
and validation of such allocated transactions within the sharded
blockchain network enable novel techniques for data flow, data
storage and UTXO validation checks. Furthermore, these techniques
advantageously prevent double spend attacks, such as Sybil attacks
in the Bitcoin blockchain in view of the structure and data
communication/validation protocols.
[0018] The method may further comprise the step of performing an
operation using the transaction id. The step of allocating the
transaction to a shard may be based on the result of the
operation.
[0019] This provides the advantage that the arrangement of the
shards can be tailored dependent on the choice of the
operation.
[0020] The operation may comprise a modulo operation.
[0021] This provides the advantage that a desired number of
equal-sized shards can be more easily generated. According to the
present disclosure, there is provided a further
computer-implemented method of partitioning a blockchain network
into shards. The method comprises: identifying a parent blockchain
transaction, the parent transaction defined by an output which
corresponds to an input of a child blockchain transaction; and
allocating the parent transaction and the child transaction to the
same shard.
[0022] Partitioning a blockchain network into shards enables users
to choose their own level of involvement with the blockchain
network. Each user can choose to be a member of one or more shards.
A user who is a member of fewer than all shards requires less
storage space to store all of the transactions allocated to the
shards of which the user is a member. Allocating a transaction to a
shard based on identifying a parent blockchain transaction, the
parent transaction defined by an output which corresponds to an
input of a child blockchain transaction, and allocating the parent
transaction and the child transaction to the same shard provides
the advantage that validation operations performed by users who are
members of a particular shard may be performed while requiring less
information to be transmitted to and from users who are members of
different shards, because a child transaction being validated will
always have a parent transaction which is a member of the same
shard.
[0023] A parent transaction may be identified using an input of a
plurality of inputs of the child blockchain transaction. The input
used may be selected on the basis of its index. The index may be 1,
in which case the input used is the first input of the plurality of
inputs.
[0024] This provides the advantage of enabling a child transaction
having multiple inputs to be allocated to a shard.
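A minimal sketch of selecting the parent by input index follows. It assumes a hypothetical lookup table `shard_of` that records the shard of every previously allocated transaction; the names `TxInput` and `shard_by_first_input` are likewise illustrative. The "index 1" of the text corresponds to list position 0 in Python.

```python
from typing import Dict, List, NamedTuple


class TxInput(NamedTuple):
    parent_txid: str   # id of the transaction whose output is being spent
    output_index: int  # which output of that transaction is referenced


def shard_by_first_input(inputs: List[TxInput],
                         shard_of: Dict[str, int]) -> int:
    """Allocate a child transaction to the shard of the parent named by its first input."""
    if not inputs:
        # Transactions without a spendable parent are outside this sketch.
        raise ValueError("transaction has no inputs")
    parent_txid = inputs[0].parent_txid
    return shard_of[parent_txid]
```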
[0025] A parent transaction may be identified using a largest
subset of inputs of a plurality of inputs of the child blockchain
transaction. For example, in the instance where a child transaction
has five inputs, where two of the five refer to two outputs of an
earlier transaction, and each of the remaining three inputs refer
to three different earlier transactions, the parent transaction is
defined as the earlier transaction to which the two inputs both
refer as they are the largest subset of inputs.
[0026] This provides the advantage that, for a child transaction
having multiple inputs, the amount of information required from
users who are members of different shards is reduced.
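The largest-subset rule can be sketched as a simple count of inputs per earlier transaction. The function name `parent_by_largest_subset` is hypothetical. Breaking ties in favour of the transaction that appears first is an assumption of this sketch, since the text does not specify a tie-break.

```python
from collections import Counter
from typing import List, Tuple


def parent_by_largest_subset(inputs: List[Tuple[str, int]]) -> str:
    """Return the id of the earlier transaction referenced by the most inputs.

    Each input is a (parent_txid, output_index) pair. With equal counts,
    Counter.most_common keeps first-encountered order (Python 3.7+).
    """
    if not inputs:
        raise ValueError("transaction has no inputs")
    counts = Counter(parent_txid for parent_txid, _ in inputs)
    return counts.most_common(1)[0][0]


# The five-input example from the text: two inputs spend outputs of "A",
# the other three spend outputs of three different transactions.
example = [("B", 0), ("A", 0), ("C", 0), ("A", 1), ("D", 0)]
chosen_parent = parent_by_largest_subset(example)
```

In this example `chosen_parent` is "A", so the child would be allocated to the shard holding transaction "A", and only the three remaining inputs would require data from other shards.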
[0027] According to the present disclosure, there is also provided
a computer-implemented method for validating a blockchain
transaction. The method comprises: requesting at least one UTXO
referenced by at least one respective input of the transaction from
a member node of at least one shard comprising at least one UTXO;
obtaining validity data of at least one UTXO from at least one
node; and performing a validation check on at least one input using
the validity data.
[0028] This method enables validation of a blockchain transaction
to take place in a sharded blockchain network. The advantage
provided by this method is that each user can choose to be a member
of one or more shards, and a user who is a member of fewer than all
shards requires less computing power to validate all of the
transactions allocated to the shards of which the user is a
member.
[0029] According to the present disclosure, there is further
provided a computer-implemented method for validating a blockchain
transaction. The method comprises: identifying at least one shard
comprising at least one UTXO referenced by at least one respective
input of the transaction; transmitting the transaction to at least
one member node of at least one shard; and performing a validation
check on at least one input using validity data of the UTXO.
[0030] This method enables validation of a blockchain transaction
to take place in a sharded blockchain network. The advantage
provided by this method is that each user can choose to be a member
of one or more shards, and a user who is a member of fewer than all
shards requires less computing power to validate all of the
transactions allocated to the shards of which the user is a
member.
[0031] Accordingly, the present disclosure relates to a
computer-implemented method for validating a blockchain
transaction, associated with a blockchain network, wherein the
blockchain network is partitioned into a plurality of shards, each
shard comprising at least one member node, and wherein each node in
the blockchain network is a member of at least one shard among the
plurality of shards, the method comprising the steps of: responsive
to receiving a given transaction at a node, determining that at
least one UTXO associated with the node is referenced by at least
one respective input of the transaction, wherein the node is
associated with a set of UTXOs that relate to one or more
transactions allocated to each shard that the node is a member
of; based on a determination that the at least one input of the
given transaction is associated with a shard that the node is a
member of, performing a validation check on at least one input
using validity data associated with the UTXO; based on a
determination that the transaction is valid, adding the transaction
to a mempool associated with the node; and propagating the
transaction to other member nodes of the at least one shard that
the node is a member of.
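The steps of this method can be sketched as a small node model. Everything named here (`Transaction`, `ShardNode`, `receive`, `outbox`) is hypothetical. The per-input validation check is passed in as a callable because its details (script execution, double-spend check) are outside this sketch, and propagation is modelled as appending to an outbox rather than performing network I/O.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Outpoint = Tuple[str, int]  # (txid, output index)


@dataclass
class Transaction:
    txid: str
    inputs: List[Outpoint]


@dataclass
class ShardNode:
    member_shards: Set[int]
    utxos: Dict[Outpoint, int]  # UTXOs held by this node, mapped to their shard
    validate_input: Callable[[Transaction, Outpoint], bool]
    mempool: Dict[str, Transaction] = field(default_factory=dict)
    outbox: List[Tuple[int, str]] = field(default_factory=list)  # (shard, txid)

    def receive(self, tx: Transaction) -> bool:
        """Validate the inputs this node is responsible for and relay if valid."""
        local = [op for op in tx.inputs
                 if self.utxos.get(op) in self.member_shards]
        if not local:
            return False  # no referenced UTXO belongs to a shard of this node
        if not all(self.validate_input(tx, op) for op in local):
            return False  # at least one checked input failed validation
        self.mempool[tx.txid] = tx
        for shard in sorted({self.utxos[op] for op in local}):
            self.outbox.append((shard, tx.txid))  # queue for fellow shard members
        return True
```

A node built this way only ever validates inputs whose UTXOs it holds, which reflects the reduced computing requirement for nodes that are members of fewer than all shards.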
[0032] Any of the above methods may further comprise the step of
communicating a request for shard membership information of a node
to another node.
[0033] This provides the advantage that the node seeking shard
membership information is provided with a mechanism for locating
that information more easily.
[0034] Any of the above methods may further comprise the step of
communicating shard membership information of a node to another
node.
[0035] This provides a mechanism for shard membership information
of nodes to be transferred between nodes, thereby providing the
advantage of decreasing the likelihood that a node performing a
validation operation will fail.
[0036] In some embodiments, the method includes the step of
communicating or broadcasting shard membership information of a
node to all other nodes in the shard associated with the node
and/or one or more other nodes in the network. The method may also
include that the communication is performed using a modified addr
message, wherein the modified addr message includes an indication
of one or more shards that the node is associated with.
[0037] The communication may be performed using a modified addr
message.
[0038] This provides the advantage of providing a more secure
mechanism for exchanging shard membership information between
nodes.
[0039] The disclosure also provides a system, comprising: [0040] a
processor; and [0041] memory including executable instructions
that, as a result of execution by the processor, causes the system
to perform any embodiment of the computer-implemented method
described herein.
[0042] The disclosure also provides a non-transitory
computer-readable storage medium having stored thereon executable
instructions that, as a result of being executed by a processor of
a computer system, cause the computer system to at least perform an
embodiment of the computer-implemented method described herein.
[0043] Preferred embodiments of the present disclosure are
described below, in a general and not in a limitative sense, with
reference to accompanying drawings, in which:
[0044] FIGS. 1a and 1b illustrate a comparison between a
traditional blockchain network (FIG. 1a) and a sharded blockchain
network (FIG. 1b) according to a first embodiment of the present
disclosure;
[0045] FIG. 2 illustrates node usage rotation used in relation to a
second embodiment of the present disclosure;
[0046] FIG. 3 illustrates a method of allocating a transaction to a
shard according to a third embodiment of the present
disclosure;
[0047] FIG. 4 illustrates a method of allocating a transaction to a
shard according to a fourth embodiment;
[0048] FIG. 5 illustrates a UTXO data structure of the prior
art;
[0049] FIG. 6 illustrates a fifth embodiment of the present
disclosure;
[0050] FIGS. 7a and 7b illustrate a sixth embodiment of the present
disclosure;
[0051] FIGS. 8a and 8b illustrate a seventh embodiment of the
present disclosure; and
[0052] FIG. 9 is a schematic diagram illustrating a computing
environment in which various embodiments of the present disclosure
can be implemented.
[0053] In the current blockchain network, different nodes are
connected peer-to-peer in a largely unstructured fashion (with the
exception of a number of hardcoded network seeds within the Bitcoin
client to aid node discovery). These nodes communicate to share
valid transactions, blocks, and information regarding other
nodes.
[0054] Structure of a Sharded Network
[0055] A first embodiment of the present disclosure can be seen in
FIG. 1b, which depicts a structure of a sharded blockchain network
according to the present disclosure. FIG. 1a on the other hand
shows a structure of the existing, i.e. prior art blockchain
network.
[0056] According to the present disclosure, to reduce the reliance
on having expensive and powerful computing resources for parties to
participate in a predetermined level of involvement in the
blockchain network, parties may be allowed to be members of any
number of shards of a sharded blockchain network. This means small
parties, including hobbyists, can choose to be members of a single
shard of the network shown in FIG. 1b, and large parties, such as
financial institutions, can choose to be members of many or even
all shards of the sharded blockchain network of FIG. 1b. This
approach accommodates parties that require transaction history
security: parties that want or need greater security can validate
and store every transaction in the blockchain, while parties that
do not require the same level of security, or that want a
lighter-weight involvement, can also participate in the same
sharded blockchain network of FIG. 1b and store just a subset of
the blockchain.
[0057] As can be seen in FIG. 1b, a particular node can be a member
of one or more shard groups. This is seen by the shaded lines shown
in this figure, where a node within the shaded area is a member of
both shard 2 and shard 3. For communication, in the current
Bitcoin network and Bitcoin SV (BSV) client, a list of available
peers, i.e. nodes in the network, holds information regarding nodes
it can connect to, distribute to, and receive from. In a sharded
blockchain according to the first embodiment, additional
information is held, including of which shard each node is a
member. In some implementations, for handling transaction
propagation across the network, each node shown in the sharded
network in FIG. 1b is arranged or configured such that it may
communicate with at least a single node from each shard to
propagate transactions destined for a different shard. In
some implementations, the information held by each node may be in
the form of a data structure to indicate nodes it can connect to,
distribute to, receive from, and the shard that it belongs to in
the sharded network seen in FIG. 1b. Other details pertaining to
the node, such as an identifier, entity association etc. may also
be held. This data structure may be held within a memory associated
with each node, or may be held in a memory associated with the
shard, for instance.
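One possible shape for such a data structure is sketched below. The names `PeerInfo`, `peers_by_shard` and `uncovered_shards` are hypothetical, the address is a documentation-range placeholder, and the sketch stores only an address and a shard set, whereas the text allows further details (identifier, entity association) to be held as well.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class PeerInfo:
    address: str            # e.g. "203.0.113.5:8333" (placeholder address)
    shards: FrozenSet[int]  # shards of which the peer is a member


def peers_by_shard(peers: List[PeerInfo]) -> Dict[int, List[PeerInfo]]:
    """Index known peers by shard so a node can find a contact in each shard."""
    table: Dict[int, List[PeerInfo]] = {}
    for peer in peers:
        for shard in sorted(peer.shards):
            table.setdefault(shard, []).append(peer)
    return table


def uncovered_shards(peers: List[PeerInfo], num_shards: int) -> List[int]:
    """Shards for which this node does not yet know any peer."""
    covered = peers_by_shard(peers)
    return [s for s in range(num_shards) if s not in covered]
```

A node could use `uncovered_shards` to decide for which shards it still needs to discover a contact before it can propagate transactions to every shard.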
[0058] In a second embodiment that relates to a sharded network as
seen in FIG. 1b, a technique where the nodes communicate with
multiple other nodes within a single shard is explained. This
technique advantageously prevents a "Sybil style" attack within a
blockchain network. A Sybil attack is an attack where a single
adversary or malign entity may be controlling multiple nodes on a
network, unknown to the network. For example, an adversary may
create multiple computers and IP addresses, and may also create
multiple accounts/nodes in an attempt to pretend that they all
exist. The manifestation of such an attack may be seen by the
following example implementations. If an attacker attempts to fill
the network with clients that they control, then a node may then be
very likely to connect only to attacker nodes. For example, the
attacker can refuse to relay blocks and transactions for a node,
effectively disconnecting that particular node from the network.
This can also be manifested by the attacker relaying blocks that
they create, effectively putting a node or entity on a separate
network, thereby leaving a node and transactions associated with
that node or the entity it represents open to double-spending
attacks. Sybil attacks are thus a problem for existing blockchain
networks.
[0059] To prevent Sybil attacks in a sharded blockchain network
such as seen in FIG. 1b, nodes are configured to communicate with
multiple or all other nodes within a single shard, according to the
second embodiment. As discussed above, Sybil attacks can
effectively disregard transactions sent from a particular node,
preventing their propagation further through the network.
Therefore, in the second embodiment of the present disclosure, a
technique is provided by which nodes in a given shard can exchange
information regarding nodes in other shards, and rotate their usage,
as is seen in FIG. 2.
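Rotating the usage of nodes in another shard can be sketched as a per-shard queue of known peers. A round-robin policy is an assumption of this sketch, as the text does not fix a rotation policy, and the names `PeerRotation`, `learn` and `next_peer` are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Iterable


class PeerRotation:
    """Round-robin choice of relay peer for each destination shard."""

    def __init__(self) -> None:
        self._queues: Dict[int, Deque[str]] = {}

    def learn(self, shard: int, peers: Iterable[str]) -> None:
        """Record peers of `shard`, e.g. as shared by fellow shard members."""
        queue = self._queues.setdefault(shard, deque())
        for peer in peers:
            if peer not in queue:
                queue.append(peer)

    def next_peer(self, shard: int) -> str:
        """Return the next peer for `shard` and move it to the back of the queue."""
        queue = self._queues[shard]
        peer = queue[0]
        queue.rotate(-1)  # rotate left: the peer just used goes to the end
        return peer
```

Because successive transactions for the same shard go to different peers, a single attacker-controlled peer that refuses to relay cannot suppress all of a node's transactions to that shard.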
[0060] According to the second embodiment, each node in the sharded
network can broadcast every transaction to each other. If a given
node is not a member of the shard that is associated with a
transaction (this association is described below with reference to
the third and fourth embodiments), then instead of doing a full
transaction validation, it performs basic transaction-level checks
before propagating onwards. It is noted that in some
implementations, the protocols and rules discussed above in
relation to the second embodiment are related to and considered to
be part of one or more or all of the other embodiments of the
present disclosure discussed herein.
[0061] At different or specific times/instances, details about
other nodes can also be shared between nodes in specific shards.
This is performed according to the second embodiment of the present
disclosure using a modified version of addr protocol messages. An
implementation of addr messages that currently exist as part of the
Bitcoin protocol is used to list or identify one or more IP
addresses and ports. For example, a getaddr request may be used to
obtain an addr message containing a number of known-active peers
(for bootstrapping, for example). addr messages often contain only
one address, but sometimes contain many more, and in some examples,
up to 1000. In some examples, all nodes broadcast an addr
containing their own IP address periodically, e.g. every 24 hours.
Nodes may then relay these messages to their peers, and can store
the addresses relayed, if new to them. This way, the nodes in the
network may have a reasonably clear picture of which IPs are
connected to the network at the moment of, or after connecting to
the network. In most cases the IP address gets added to everyone's
address database because of an initial addr broadcast.
[0062] An implementation of a modified addr protocol according to
the present disclosure, in addition to the above may be capable of
transmitting additional information on the shard or shards to which a
particular node belongs. For instance, in the modified addr
protocol, when a node in a sharded network such as FIG. 1b joins a
particular shard in the network, then what is broadcast as part of
the addr message may also include a field identifying the one or
more shards it is a member of. This information is also therefore
returned in response to a getaddr request from a peer in the
network of FIG. 1b. As discussed in the first embodiment, such
information may be based on a data structure associated with each
node and/or each shard with which the node is associated. In
some embodiments, the modified addr protocol may also include the
status of the shards of which the node is a member and/or the
status of the node itself. For instance, details of the number of
nodes in each member shard may be identified, or if a particular
shard is active, or the number of active nodes in a given shard may
also be identified.
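The shard-aware addr exchange described above can be sketched as
follows. This is a minimal illustration only, not the wire format of
the disclosure or of the Bitcoin protocol: the names ShardAddrEntry,
AddressBook, shard_ids and node_active are hypothetical, and the
message is modelled as an in-memory object rather than a serialized
network message.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ShardAddrEntry:
    """One peer entry in a hypothetical shard-aware addr message."""
    ip: str
    port: int
    shard_ids: List[int] = field(default_factory=list)  # shards the node belongs to
    node_active: bool = True                             # optional node status


class AddressBook:
    """Per-node address database keyed by (ip, port)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, int], ShardAddrEntry] = {}

    def on_addr(self, entries: List[ShardAddrEntry]) -> List[ShardAddrEntry]:
        """Store entries; return those that were new or changed, for relay to peers."""
        to_relay = []
        for entry in entries:
            key = (entry.ip, entry.port)
            if self._entries.get(key) != entry:
                self._entries[key] = entry
                to_relay.append(entry)
        return to_relay

    def on_getaddr(self) -> List[ShardAddrEntry]:
        """Answer a getaddr request with every known entry, shard fields included."""
        return list(self._entries.values())

    def peers_in_shard(self, shard_id: int) -> List[ShardAddrEntry]:
        """List the known active peers that are members of the given shard."""
        return [e for e in self._entries.values()
                if shard_id in e.shard_ids and e.node_active]
```

A node that needs to reach a particular shard could then call
peers_in_shard with that shard's identifier to select peers from its
address database.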
[0063] Allocating Transactions to Shards in a Sharded Network
[0064] As described above, in a sharded blockchain network,
transactions are not validated and stored by every node, but
instead they are allocated to one or more specific shards.
Therefore, a strategy for allocating transactions to different
shards is required. Two possible embodiments will be described
below and are referred to as "transaction id-based" sharding
according to a third embodiment of the present disclosure and
"input-based" sharding, according to a fourth embodiment of the
present disclosure.
[0065] In some implementations, the existing Bitcoin protocol would
be likely to undergo a fork in order to initiate either scheme.
When nodes in a given shard subsequently
receive a transaction, they may check it has been sent to the
correct shard. This approach provides balancing of transactions
across shards.
[0066] Either sharding method may be applied to the blockchain
retroactively, and to any extent. That is, either method may be
applied such that a sharded network is defined as existing from the
time of the first block in the blockchain (the so-called genesis
block in the case of the Bitcoin blockchain) all the way through to
an arbitrarily chosen block number in the future.
[0067] The sharding methods described below may be applied multiple
times in sequence, and in any order. For
example, transaction-id sharding may be performed in the first
instance, and input-based sharding may be performed at a later
date. Furthermore, either one of the methods may be applied
retroactively, as described above, and further to this, either
method may be subsequently applied. The number of shards, n, may be
chosen each time a sharding method is applied and allows the
protocol to scale by increasing the number of nodes. The number of
shards may be chosen based on the total number of nodes on the
network, the size of the blockchain, and/or other
characteristics. For both of the sharding methods described below,
the manner in which the transactions are stored by each node once
sharding has taken place will also be described.
[0068] Transaction ID-Based Shard Distribution
[0069] In a horizontally-partitioned blockchain, as each shard does
not contain and handle all transactions on the network, a strategy
for allocating transactions to different shards is required.
Furthermore, any sharding method needs to be capable of performing
further sharding. In a third embodiment of the present disclosure,
as explained with the help of FIG. 3, transaction distribution
across shards is handled based on a transaction id (txid).
[0070] In step 302, the transaction id for a given transaction is
created, indicated as txid. In some implementations, this txid is
obtained as a result of applying a SHA256 function to the
transaction data.
[0071] In step 304, using this transaction id, an operation is
carried out based on the txid and the available number of shards in
the sharded network. In some implementations, the transaction id
is reduced modulo the number of shards currently active on the
blockchain network, i.e. shard number = txid mod n, where n is the
(desired or active) number of shards.
[0072] In step 306, the result of step 304 then corresponds to the
shard to which the given transaction is allocated.
[0073] In step 308, once allocated in step 306, the transaction is
distributed to the identified shard, i.e. the transaction is
distributed to the nodes of the shard identified in step 306.
[0074] Therefore, when nodes in a given shard receive a
transaction, they can easily check it has been sent to the correct
shard. In some embodiments, such checking may be facilitated based
on data structures associated with each node that include
information associated with the node, as discussed above in the
first embodiment. Advantageously, this approach provides an even
balancing of transactions across shards.
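Steps 302 to 308, together with the check performed by a receiving
node, can be sketched as below. This is an illustrative sketch under
stated assumptions: the function names are hypothetical, the txid is
taken as a single SHA256 of the transaction data as stated in step
302 (Bitcoin itself applies SHA256 twice), and interpreting the hash
as a big-endian integer is an assumption made for the example.

```python
import hashlib
from typing import Set


def txid_of(raw_tx: bytes) -> bytes:
    """Step 302: derive a transaction id by hashing the transaction data."""
    return hashlib.sha256(raw_tx).digest()


def shard_for(txid: bytes, n: int) -> int:
    """Steps 304 and 306: shard number = txid mod n."""
    if n <= 0:
        raise ValueError("number of shards must be positive")
    # Assumption: the txid bytes are read as a big-endian unsigned integer.
    return int.from_bytes(txid, "big") % n


def accepts(raw_tx: bytes, my_shards: Set[int], n: int) -> bool:
    """Receiving-node check: was the transaction sent to a shard this node is in?"""
    return shard_for(txid_of(raw_tx), n) in my_shards
```

Because the output of a cryptographic hash is effectively uniformly
distributed, taking it modulo n spreads transactions approximately
evenly over the n shards.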
[0075] Shard count on the network can be arbitrarily chosen on the
basis of a number of parameters, including:
[0076] Number of total nodes on the network; and
[0077] The size of the blockchain.
[0078] Input-Based Shard Distribution
[0079] A sharding method according to a fourth embodiment of the
present disclosure is explained with the help of the flow diagram
in FIG. 4.
[0080] In this embodiment, in step 402, an input of a given
transaction is identified. In some implementations, this is the
first input for the transaction.
[0081] In step 404, the output of an earlier transaction to which
the input identified in step 402 refers is identified.
[0082] In step 406, the results of steps 402 and 404, i.e. the
input and the output of the earlier transaction to which it refers,
are both allocated to the same shard in the sharded network as seen
in FIG. 1b. In some implementations, this step includes identifying
the shard to which both transactions are to be allocated. In one
example, this may be the shard that is associated with the earlier
transaction, in case this has already been allocated. In another
example, as discussed above, a modified addr broadcast or a
response to a getaddr request for either the given or earlier
transaction may be used to identify the shard. In other examples, a
shard may be selected on a random or a prescribed, e.g.
rotation-based, basis for both transactions, as long as both are
assigned to the same shard. This may be applied, for instance, if a
parent transaction is not identified, e.g. if the transaction
received is a coinbase transaction.
[0083] Step 408 shows that the above process in steps 402 to 406 is
iterated to generate chains of transactions linked by their first
inputs.
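The iteration of steps 402 to 408 can be sketched as follows. This
is a minimal sketch, assuming a transaction is reduced to its id and
a list of (earlier txid, output index) pairs; the class name
InputBasedAllocator and the rotation counter used when no parent is
known are hypothetical choices, rotation being only one of the
selection bases mentioned in step 406.

```python
from typing import Dict, List, Optional, Tuple

Outpoint = Tuple[str, int]  # (txid of the earlier transaction, output index)


class InputBasedAllocator:
    def __init__(self, n_shards: int) -> None:
        self.n_shards = n_shards
        self.shard_of_tx: Dict[str, int] = {}
        self._next = 0  # rotation counter for transactions with no known parent

    def allocate(self, txid: str, inputs: List[Outpoint]) -> int:
        """Steps 402 to 406: place the transaction in the shard of its first input's parent."""
        # Steps 402 and 404: first input, and the earlier transaction it refers to.
        parent: Optional[str] = inputs[0][0] if inputs else None
        if parent is not None and parent in self.shard_of_tx:
            shard = self.shard_of_tx[parent]
        else:
            # No allocated parent (e.g. a coinbase transaction): pick by rotation
            # and, where a parent is named, assign it to the same shard.
            shard = self._next % self.n_shards
            self._next += 1
            if parent is not None:
                self.shard_of_tx[parent] = shard
        self.shard_of_tx[txid] = shard
        return shard
```

Calling allocate repeatedly (step 408) builds chains of transactions
that share a shard because each follows the first input back to its
parent.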
[0084] A transaction whose output is referred to by a first input
of a subsequent transaction is referred to in this context as a
"parent" transaction, and the subsequent transaction is referred to
as a "child" transaction.
[0085] It is to be noted that the use of the first input to
define a parent in step 402 is not essential to the method, as any
input may be chosen to perform the method if a plurality of inputs
is present in a given transaction. For example, an earlier
transaction may be defined as a parent of a child transaction if a
particular number of inputs of the child transaction refer to
outputs of transactions in the same shard as the parent. This number
may be a majority of the inputs of the child
transaction. Thus, in some implementations discussed above in step
406 the shard that is allocated will be the same as that of the
identified parent based on either the number of inputs or indeed
the first or any other prescribed particular input to be
considered.
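The majority-of-inputs variant can be illustrated with the short
sketch below. The function name shard_by_majority and the use of a
strict majority (more than half of all inputs) are assumptions made
for the example; the disclosure leaves the particular number of
inputs open.

```python
from collections import Counter
from typing import Dict, List, Optional


def shard_by_majority(input_parents: List[str],
                      shard_of_tx: Dict[str, int]) -> Optional[int]:
    """Return the shard holding a strict majority of the inputs' parents, else None."""
    # Count, per shard, how many inputs refer to a transaction already in that shard.
    counts = Counter(shard_of_tx[p] for p in input_parents if p in shard_of_tx)
    if not counts:
        return None
    shard, votes = counts.most_common(1)[0]
    return shard if votes * 2 > len(input_parents) else None
```

When no shard reaches a majority, a caller could fall back to the
first input or to another prescribed input, as noted above.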
[0086] It is to be noted that the above two sharding methods of the
third and fourth embodiments may be performed sequentially, in any
order, and that the two methods may be performed multiple times as
desired. For example, a blockchain network may be forked in
accordance with Input-Based distribution of the fourth embodiment,
and subsequently one or more of the resulting forks may be sharded
in accordance with Transaction ID-Based distribution of the third
embodiment.
[0087] UTXO Set/Mempool Sharding
[0088] In the Bitcoin network currently, every node maintains its
own UTXO set, which is checked and updated during validation. An
example of a UTXO set is shown in FIG. 5.
[0089] In accordance with a fifth embodiment of the present
disclosure, in a sharded blockchain (such as shown in FIG. 1b),
each member node of one or more shards has a UTXO set related to
the transactions which are related to each shard of which the node
is a member. This is further illustrated in FIG. 6, which depicts
the nodes that are members of more than one shard. These are shown
by the overlapping, distinctly shaded regions in this figure. It
will be understood that in some implementations, such UTXO sets,
hereinafter referred to as sharded UTXOs, described in relation to
the fifth embodiment may relate to, and be considered part of, one
or more or all of the other embodiments of the present disclosure
discussed herein.
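A per-node sharded UTXO set of this kind can be sketched as below.
This is an illustrative data structure only: the class name
ShardedUtxoSet, the (txid, output index) key and the integer value
stored per output are hypothetical simplifications.

```python
from typing import Dict, Iterable, Optional, Tuple

Outpoint = Tuple[str, int]  # (txid, output index)


class ShardedUtxoSet:
    def __init__(self, member_shards: Iterable[int]) -> None:
        # One UTXO map per shard of which this node is a member.
        self._by_shard: Dict[int, Dict[Outpoint, int]] = {s: {} for s in member_shards}

    def add(self, shard: int, outpoint: Outpoint, value: int) -> bool:
        """Record an output only if the node is a member of that shard."""
        if shard not in self._by_shard:
            return False
        self._by_shard[shard][outpoint] = value
        return True

    def lookup(self, outpoint: Outpoint) -> Optional[Tuple[int, int]]:
        """Return (shard, value) for a locally held UTXO, or None."""
        for shard, utxos in self._by_shard.items():
            if outpoint in utxos:
                return shard, utxos[outpoint]
        return None

    def spend(self, outpoint: Outpoint) -> bool:
        """Remove a UTXO once spent; False means unknown here or already spent."""
        hit = self.lookup(outpoint)
        if hit is None:
            return False
        del self._by_shard[hit[0]][outpoint]
        return True
```

A node that is a member of two shards simply holds two such maps,
which corresponds to the overlapping regions of FIG. 6.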
[0090] Transaction Validation
[0091] For transactions to validate, the UTXO set needs to be
checked and updated in the Bitcoin network. The present disclosure
provides a new version of this process for implementing validation
for a sharded blockchain, when the UTXO set is sharded. As
described above, each node on the sharded blockchain, such as in
FIG. 1b, is associated with or maintains a list of nodes on the
network, including information on the shards of which they are
members.
This is discussed above, in relation to the first embodiment.
[0092] Two methods are described below for transaction validation
according to the present disclosure, which can be used for checking
the UTXO set. These are named Transaction Shard Validation,
according to the sixth embodiment; and UTXO Shard Validation,
according to the seventh embodiment of the disclosure,
respectively.
[0093] Transaction Shard Validation
[0094] In the sixth embodiment, transaction validation is carried
out by the shard to which the transaction is allocated. As
described above in relation to the third embodiment, transactions
are distributed to a shard using the result of a modulo function
applied to the transaction id. Because a transaction can have
inputs from different shards, validation nodes communicate with
other shards for UTXO checks.
[0095] Referring to FIG. 7a, the UTXO set check carried out between
nodes in different shards will now be described. This process is
also explained in relation to FIG. 7b.
[0096] According to the sixth embodiment, a node in Shard 4 makes a
request to the nodes in Shard 1 that are known to it to fetch the
UTXO. This is seen in step 702. The shard numbers are specified for
illustration only, and any given node associated with any given
shard may perform this request.
[0097] The validity of the response received is then assessed in
step 704. If none of the nodes have the UTXO, a null response is
given. In this case, the transaction in question is deemed invalid
in step 706. No further propagation of the transaction will take
place, in this case. In some cases, a transaction is also deemed
invalid if there is a script error, for instance, or any indication
that the UTXO is not available.
[0098] Where a UTXO of the given transaction is received, the
transaction input is deemed valid in step 708. As discussed in
the background section, it is known that software clients or
programs or applications installed on nodes may perform this
validation on a UTXO by executing its locking and unlocking
scripts. In some implementations, this is referred to as the
validity data for the transactions. If execution of the locking and
unlocking scripts evaluates to TRUE, the transaction is valid and
the transaction is written to the blockchain. Furthermore, as also
discussed above, one of the validity checks is to check that there
has not been a double spend. In some implementations, when a node
receives a transaction, it will look up the UTXOs that the
transaction spends in a data structure associated with the node, or
with the associated shard.
[0099] In step 710, the transaction in question is then added to
the mempool of shard 4, or of the node in shard 4.
[0100] In step 712, the transaction is then propagated to other
nodes in shard 4.
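The flow of steps 702 to 712 can be sketched as follows. This is a
simplified sketch rather than the method of the disclosure as such:
peers are modelled as plain callables that return the UTXO data or
None (the null response), script execution is abstracted behind a
hypothetical script_ok callable, and shard_of_outpoint stands in for
whatever lookup identifies the shard holding a given UTXO.

```python
from typing import Callable, Dict, List, Optional, Tuple

Outpoint = Tuple[str, int]
# A peer is modelled as a function returning the UTXO's data, or None if it has none.
Peer = Callable[[Outpoint], Optional[dict]]


def fetch_utxo(outpoint: Outpoint, peers: List[Peer]) -> Optional[dict]:
    """Step 702: ask the known nodes of the other shard for the UTXO."""
    for ask in peers:
        utxo = ask(outpoint)
        if utxo is not None:
            return utxo
    return None  # null response from every known peer


def validate_in_tx_shard(tx: dict,
                         peers_by_shard: Dict[int, List[Peer]],
                         shard_of_outpoint: Callable[[Outpoint], int],
                         script_ok: Callable[[dict, dict], bool],
                         mempool: List[dict]) -> bool:
    """Validate a transaction in its own shard, fetching UTXOs from other shards."""
    for outpoint in tx["inputs"]:
        shard = shard_of_outpoint(outpoint)
        utxo = fetch_utxo(outpoint, peers_by_shard.get(shard, []))  # step 702
        if utxo is None or not script_ok(tx, utxo):                 # step 704
            return False                                            # step 706: invalid
    mempool.append(tx)                                              # step 710
    return True  # the caller then propagates within the shard (step 712)
```

A return value of False corresponds to step 706, in which case the
transaction is neither added to the mempool nor propagated.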
[0101] UTXO Shard-Based Validation
[0102] In the seventh embodiment, transactions are propagated to
the shards (of a sharded network as seen in FIG. 1b) containing the
UTXOs of a given transaction.
[0103] FIG. 8a illustrates a spending transaction (Tx) created by a
node in Shard 4 being propagated to each of the shards containing
the UTXOs of that transaction. In this embodiment, the node sends
the transaction to both Shard 1 and Shard 2. The process is further
illustrated in FIG. 8b.
[0104] When a node within a shard receives a given transaction in
step 802, it proceeds to validate the transaction based on those
inputs which are within the same shard in this embodiment. In some
implementations, responsive to receiving a given transaction at a
node, this step may further include determining that at least one
UTXO, comprised in at least one shard associated with the node, is
referenced by at least one respective input of the given
transaction. As discussed above, the node is associated with a set
of UTXOs that relate to one or more transactions allocated to each
shard of which the node is a member.
[0105] In step 804, it is checked whether the given transaction's
input is associated with the same shard as the node. The
transaction may be allocated to a shard as discussed above,
according to the fourth embodiment. As discussed above, if the node
is a member of more than one shard, then the "same" shard check in
this step will apply to any such shard.
[0106] Inputs that relate to UTXOs in a different shard are not
validated, as seen in step 806b. Otherwise, the node proceeds to
validate the transaction in step 806a. In some implementations, the
validation of each input may be carried out in much the same way as
validations are currently carried out in the Bitcoin network.
[0107] Further to step 806a, the validity of an input associated
with the given transaction is checked in step 808. As discussed
above, and also in steps 706 and 708 of FIG. 7b, validation of an
unspent transaction output (UTXO) may be performed by executing its
locking and unlocking scripts. In some implementations, this is
referred to as the validity data for the transactions. If execution
of the locking and unlocking scripts evaluates to TRUE, the
transaction is valid and is written to the blockchain. Furthermore, as
also discussed above, one of the validity checks is to check that
there has not been a double spend. In some implementations, when a
node receives a transaction, it will look up the UTXOs that the
transaction spends in a data structure associated with the node, or
with the shard of the node.
[0108] In the case where an input is invalid, such as when the UTXO
does not exist, the value is greater than that of the UTXO, or
there is a script error, the given transaction is dropped as seen in
step 810b. In this case, the given transaction is not propagated to
other nodes in the same shard.
[0109] If the input is deemed valid, the transaction is identified
as being valid in step 810a.
[0110] The transaction is then added to the node's mempool in step
812.
[0111] The transaction is propagated to other nodes in the shard
associated with the node, in step 814.
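Steps 802 to 814 can be sketched as follows. This is a minimal
sketch under stated assumptions: each input is a dictionary with a
hypothetical "outpoint" key and a "value" key giving the amount the
input claims to spend, script execution is abstracted behind a
hypothetical script_ok callable, and a transaction with no input in
any of the node's shards is simply not accepted, which is a choice
made for the example.

```python
from typing import Callable, Dict, List, Set, Tuple

Outpoint = Tuple[str, int]


def validate_in_utxo_shard(tx: dict,
                           my_shards: Set[int],
                           shard_of_outpoint: Callable[[Outpoint], int],
                           local_utxos: Dict[Outpoint, int],
                           script_ok: Callable[[dict, Outpoint], bool],
                           mempool: List[dict]) -> bool:
    """Validate only those inputs whose UTXOs belong to this node's shards."""
    checked = 0
    for inp in tx["inputs"]:
        outpoint = inp["outpoint"]
        if shard_of_outpoint(outpoint) not in my_shards:  # step 804
            continue                                      # step 806b: left to the other shard
        utxo_value = local_utxos.get(outpoint)            # steps 806a and 808
        if (utxo_value is None                            # UTXO does not exist
                or inp["value"] > utxo_value              # value exceeds that of the UTXO
                or not script_ok(tx, outpoint)):          # script error
            return False                                  # step 810b: dropped
        checked += 1
    if checked == 0:
        return False  # assumption: nothing for this node to validate
    mempool.append(tx)                                    # steps 810a and 812
    return True  # the caller then relays within the shard (step 814)
```

Each shard that receives the transaction runs the same check on its
own inputs, so no cross-shard UTXO request is needed in this
embodiment.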
[0112] Turning now to FIG. 9, there is provided an illustrative,
simplified block diagram of a computing device 2600 that may be
used to practice at least one embodiment of the present disclosure.
In various embodiments, the computing device 2600 may be used to
implement a node or a combination of nodes in one or more shards of
the sharded blockchain network seen in FIG. 1b, and/or any of the
computer implemented systems, methods or protocols illustrated and
described above when taken alone or when communicatively coupled to
one or more other such nodes or systems.
[0113] For example, the computing device 2600 may be configured for
use as a data server, a web server, a portable computing device, a
personal computer, or any electronic computing device. As shown in
FIG. 9, the computing device 2600 may include one or more
processors with one or more levels of cache memory and a memory
controller (collectively labelled 2602) that can be configured to
communicate with a storage subsystem 2606 that includes main memory
2608 and persistent storage 2610. The main memory 2608 can include
dynamic random-access memory (DRAM) 2618 and read-only memory (ROM)
2620 as shown. The storage subsystem 2606 and the cache memory 2602
may be used for storage of information, such as details
associated with transactions and blocks as described in the present
disclosure. The processor(s) 2602 may be utilized to provide the
steps or functionality of any embodiment as described in the
present disclosure.
[0114] The processor(s) 2602 can also communicate with one or more
user interface input devices 2612, one or more user interface
output devices 2614, and a network interface subsystem 2616.
[0115] A bus subsystem 2604 may provide a mechanism for enabling
the various components and subsystems of computing device 2600 to
communicate with each other as intended. Although the bus subsystem
2604 is shown schematically as a single bus, alternative
embodiments of the bus subsystem may utilize multiple busses.
[0116] The network interface subsystem 2616 may provide an
interface to other computing devices and networks. The network
interface subsystem 2616 may serve as an interface for receiving
data from, and transmitting data to, other systems from the
computing device 2600. For example, the network interface subsystem
2616 may enable a data technician to connect the device to a
network such that the data technician may be able to transmit data
to the device and receive data from the device while in a remote
location, such as a data centre.
[0117] The user interface input devices 2612 may include one or
more user input devices such as a keyboard; pointing devices such
as an integrated mouse, trackball, touchpad, or graphics tablet; a
scanner; a barcode scanner; a touch screen incorporated into the
display; audio input devices such as voice recognition systems and
microphones; and other types of input devices. In general, use of
the term "input device" is intended to include all possible types
of devices and mechanisms for inputting information to the
computing device 2600.
[0118] The one or more user interface output devices 2614 may
include a display subsystem, a printer, or non-visual displays such
as audio output devices, etc. The display subsystem may be a
cathode ray tube (CRT), a flat-panel device such as a liquid
crystal display (LCD), light emitting diode (LED) display, or a
projection or other display device. In general, use of the term
"output device" is intended to include all possible types of
devices and mechanisms for outputting information from the
computing device 2600. The one or more user interface output
devices 2614 may be used, for example, to present user interfaces
to facilitate user interaction with applications performing
processes described and variations therein, when such interaction
may be appropriate.
[0119] The storage subsystem 2606 may provide a computer-readable
storage medium for storing the basic programming and data
constructs that may provide the functionality of at least one
embodiment of the present disclosure. The applications (programs,
code modules, instructions), when executed by one or more
processors, may provide the functionality of one or more
embodiments of the present disclosure, and may be stored in the
storage subsystem 2606. These application modules or instructions
may be executed by the one or more processors 2602. The storage
subsystem 2606 may additionally provide a repository for storing
data used in accordance with the present disclosure. For example,
the main memory 2608 and cache memory 2602 can provide volatile
storage for program and data. The persistent storage 2610 can
provide persistent (non-volatile) storage for program and data and
may include flash memory, one or more solid state drives, one or
more magnetic hard disk drives, one or more floppy disk drives with
associated removable media, one or more optical drives (e.g. CD-ROM,
DVD or Blu-ray drives) with associated removable media, and
other like storage media. Such program and data can include
programs for carrying out the steps of one or more embodiments as
described in the present disclosure as well as data associated with
transactions and blocks as described in the present disclosure.
[0120] The computing device 2600 may be of various types, including
a portable computer device, tablet computer, a workstation, or any
other device described below. Additionally, the computing device
2600 may include another device that may be connected to the
computing device 2600 through one or more ports (e.g., USB, a
headphone jack, Lightning connector, etc.). The device that may be
connected to the computing device 2600 may include a plurality of
ports configured to accept fibre-optic connectors. Accordingly,
this device may be configured to convert optical signals to
electrical signals that may be transmitted through the port
connecting the device to the computing device 2600 for processing.
Due to the ever-changing nature of computers and networks, the
description of the computing device 2600 depicted in FIG. 9 is
intended only as a specific example for purposes of illustrating
the preferred embodiment of the device. Many other configurations
having more or fewer components than the system depicted in FIG. 9
are possible.
[0121] It should be noted that the above-mentioned embodiments
illustrate rather than limit the disclosure, and that those skilled
in the art will be capable of designing many alternative
embodiments without departing from the scope of the disclosure as
defined by the appended claims. In the claims, any reference signs
placed in parentheses shall not be construed as limiting the
claims. The words "comprising" and "comprises", and the like, do
not exclude the presence of elements or steps other than those
listed in any claim or the specification as a whole. In the present
specification, "comprises" means "includes or consists of" and
"comprising" means "including or consisting of". The singular
reference of an element does not exclude the plural reference of
such elements and vice-versa. The disclosure may be implemented by
means of hardware comprising several distinct elements, and by
means of a suitably programmed computer. In a device claim
enumerating several means, several of these means may be embodied
by one and the same item of hardware. The mere fact that certain
measures are recited in mutually different dependent claims does
not indicate that a combination of these measures cannot be used to
advantage.
[0122] It is to be understood that the above description is
intended to be illustrative, and not restrictive. Many other
implementations will be apparent to those of skill in the art upon
reading and understanding the above description. Although the
disclosure has been described with reference to specific example
implementations, it will be recognized that the disclosure is not
limited to the implementations described but can be practiced with
modification and alteration within the scope of the appended
claims. Accordingly, the specification and drawings are to be
regarded in an illustrative sense rather than a restrictive sense.
The scope of the disclosure should, therefore, be determined with
reference to the appended claims, along with the full scope of
equivalents to which such claims are entitled.
* * * * *