<!– wp:paragraph –>
<p>BLOCKCHAIN ABBREVIATION Implemented by Message Passing and Shared Memory (Extended Abstract) Maxim Amelchenko Shlomi Dolev Ben Gurion University of the Negev maxam@post.
bgu.
ac.
il,
dolev@cs.
bgu.
ac.
il<br></p>
<!– /wp:paragraph –>
<!– wp:paragraph –>
<p><strong><em>Abstract</em>— Blockchain’s ever increasing size has become a major problem.
Bitcoin [7],
for example,
has grown to 115120 MB as of May 2017,
which is roughly 115 GB.
This uncontrollable growth of the Blockchain is bound to become an issue in the future,
as hard disks may become too small to store the entire Blockchain history and traversing the transactions databases may become increasingly slow.
Already,
there are lightweight clients in various Blockchain platforms (Bitcoin included),
who do not store the entire chain locally but rely on a third party to send them the blocks they need.
There are many issues with these clients,
mainly security problems,
since they go back to trusting a central authority rather than gaining trust from several distributed peers.
These clients’ knowledge of the Blockchain is solely based on some third party that should be trusted,
while the conceptual base for Blockchain is trust distributing.
In this paper we present two Blockchain abbreviation schemes.
The fifirst one is based on the Ethereum [8] project and proposes replacing the full Blockchain with a new Genesis block,
which summarizes everyone’s account balances at a certain point in time.
One possible benefifit is to use less communication while still storing the prefifix of the old Blockchain (or signature of the Blockchain that can validate a version archived by other participants) in a local archive.
Here we trade loss of transaction history for effificiency.
Our second contribution is a UNIX based architecture using the fifile system,
for implementing Blockchain.
We demonstrate a Blockchain abbreviation technique for this architecture too.
</strong>I.
INTRODUCTION Blockchain technology is a very promising futurist concept for constructing distributed trust structures.
The main idea is to control the schedule of actions in the entire distributed system to be serial with a very high probability.
The need to mine,
compute and search for a hard to fifind string that is hashed to a specifific pattern enforces even the Byzantine participants to obey the rate of new block creation that implies with high probability sequential new block creation in the entire system.
In rare cases,
several blocks are introduced in parallel and symmetry is broken over time by continuous gossip among the participants and comparison of chain lengths.
The gossip has a known (small) latency among all participants.
Thus,
allowing a short bound on the time it takes to have agreement on the fifinalization of a block inclusion in the block chain,
allowing to fifinalize the (possibly,
fifinancial) transactions associated with the block.
<br></p>
<!– /wp:paragraph –>
<!– wp:paragraph –>
<p>In more details,
the participants in implementing a Blockchain start with a commonly agreed genesis block.
This genesis block is publicly known to everyone and cannot be forged later by others since it will break the cryptographic (hash function) connection between the genesis block and the next block.
They constantly perform operations which involve collecting transactions from the entire network,
gathering them into a block and mining.
This process of mining (which is part of a Proof-of-Work scheme on which we will elaborate later),
involves repetitive hashing of the block contents (with an alternating nonce) until the hash value is smaller than an agreed upon threshold.
In each round of hashing,
the participants change a certain value in the block called nonce,
yielding a different hash value.
The fifirst node to reach the threshold publishes its block to the entire network.
Each node receiving a new block,
performs a series of checks to ensure the validity of the block and the associated transactions.
Once the verifification is completed,
the block is accepted and appended to the Blockchain in each node.
Blockchain platforms usually employ one of two methods to achieve consensus without a single point of authority.
The fifirst method is Proof-of-Work (PoW) and the other is Proof[1]of-Stake (PoS).
We will elaborate more about PoW as it is much more common,
it is easier to understand and we also utilized it in the implementation of our schemes.
The idea behind PoW is making participants in a system work and be able to prove that a certain amount of work has been done,
before they can use,
contribute and profifit from a system.
PoW is used to achieve multiple goals: <em>• </em>Prevent Sybil attacks.
In these attacks,
an attacker creates many fake identities and tries to inflfluence the network by posing as a majority or a large subcommunity within a community.
An attacker wanting to sabotage a system using PoW will have to physically own a very large amount of computational power.
<em>• </em>Achieve consensus using majority voting.
If we represent Blockchain as a voting system,
where participants vote on what is the currently active chain,
then the longest chain (or the chain which required the most work to produce) is the majority vote.
<br></p>
<!– /wp:paragraph –>
<!– wp:paragraph –>
<p><em>• </em>Set the data generation rate.
Using PoW where the diffificulty of puzzles is adjustable,
allows systems to adapt to changes of its clients and maintain a stable release rate of its product (a cryptocurrency or anything else).
For ex[1]ample,
if hardware becomes stronger,
or new algorithms are found to fifind faster solutions to puzzles,
the system can increase its PoW diffificulty so the data generation rate is uneffected.
No special synchronization is needed to adjust diffificulty,
since it is calculated independently at each node based on public information like average block generation time.
The most important assumption of PoW is that there is an honest majority.
More precisely,
the vast majority of CPU power is controlled by honest nodes.
Since the honest nodes hold the vast majority of the computational power,
their chain will always be the fastest to grow and no chain will be able to compete with it.
Over the years many papers were published to address the issue of increasing Blockchain size and scalability.
However,
most of them targeted the general Blockchain community growth,
and the scalability issues that come with it.
Among them,
are block size limit,
databases bottlenecks and overload of network traffific.
Mini-Blockchain project lead by J.
D.
Bruce [1] is one of the few that actually tackled the problem of the chain length.
The project details a solution that involves cutting off the strong cryptographic connections between transactions and maintaining an account tree to quickly check peer bal[1]ances.
This allows deleting transactions that are confifirmed,
effectively reducing the size of the chain to a constant at the expense of keeping a large account tree.
Our abbreviation scheme is conceptually different.
We propose an abbreviation process that cuts a prefifix of one or more blocks from the Blockchain,
without altering the connections between the remaining blocks.
A more formal defifinition of the abbreviation will be given for each one of the schemes we present.
II.
BLOCKCHAIN SYSTEM SETTINGS We defifine Blockchain as a distributed system of machines,
each running the same program consisting of executable statements.
We represent this distributed system by a set of <em>n </em>state machines called nodes that communicate with each other.
Every node can communicate with its neighbors either by exchanging messages,
or using shared memory.
When using shared memory,
all nodes must know one another,
forming a complete undirected graph of nodes.
When using message passing,
each node has to be connected to at least one other node,
and a (reliable,
non-Byzantine) path must exist between every two nodes,
forming an undirected tree graph.
Communication by writing in,
and reading from,
the shared memory usually fifits systems where the nodes are geograph[1]ically close to each other.
A message-passing distributed model fifits both nodes that are located close to each other and wide-area distributed systems such as communication networks.
In a message-passing model nodes communicate by sending and receiving messages over the wire.
In such a system,
network latency can have a big inflfluence over the </p>
<!– /wp:paragraph –>
<!– wp:paragraph –>
<p>synchronization model of the system.
This has been examined to some extent by Garay,
Kiayias and Leonardos [4].
We will further elaborate how it affects the Blockchain,
taking into account the abbreviation times.
III.
PRELIMINARY It has been proved that for any distributed system of size <em>n</em>,
for consensus to be reached,
no more than <em>n</em>3 nodes can be faulty [3] .
Faulty processes can exhibit a number of behaviors,
such as presenting different messages to different nodes,
dropping messages,
and altering messages on transit.
However,
Blockchain,
which is also a distributed system,
does not have these limitations.
The reason lies in the mining process,
and in the ability to verify the correctness of newly published blocks.
In order to distribute two different but legitimate blocks,
a node will have to beat the entire network twice in a row thus possessing an unrealistic amount of computational power.
Even if a node manages to ditribute two different but legitimate blocks,
causing a fork in the network,
eventually when more blocks are added the network will converge to a single consensus chain,
based on the diffificulty of all the blocks.
Once a block is published,
all other nodes can verify the transactions in it are valid,
and that enough work has been put into creating this block.
This guarantees that messages accepted in the Blockchain are legitimate.
IV.
MESSAGE PASSING BLOCKCHAIN ABBREVIATION <strong>Overview.
</strong>Message Passing Blockchain is a type of Blockchain where nodes send each other messages.
This Blockchain is usually cryptocurrency-based,
but other use[1]cases have also been demonstrated,
such as voting systems [6].
In this architecture,
not every node has to know every other node but there must exist a channel between each node to at least one other node.
Nodes joining the network have a number of ways of discovering their peers.
Over the years,
IRC (Internet Relay Chat) channels have been used to locate IP addresses of running nodes.
When joining IRC channels,
the client receives messages broadcasted from other clients already subscribed to that channel.
This method has been abandoned in favor of DNS seeds and hard-coded IP lists.
The DNS seeds resolve to a group of IP addresses which are known to be stable.
If a node is experiencing a DNS failure it will try the hard-coded IP addresses.
We will address the common situation where an asset-based Blockchain (such as cryptocurrency) uses the message-passing model for communication.
In this scheme,
every participant has an account which contains its balance of assets (for example,
number of coins).
We will use this feature in the abbreviation of the chain,
as capturing the balances of all participants at a certain point in time can serve as a reference point for all future transactions.
This point in time can become the new “beginning of time”,
or the new Genesis block.