Professor Aggelos Kiayias joins IOHK as Chief Scientist
21 October 2016 4 mins read
Professor Aggelos Kiayias joins IOHK as Chief Scientist
Prof Kiayias is the pre-eminent academic in the field of blockchain security. He has produced pioneering work providing the rigorous analysis necessary to secure confidence in this emerging technology.
Blockchain technology has the potential to significantly reshape the future of finance, and Prof Kiayias is the leading researcher promoting and studying the science behind blockchain systems. In one project, Prof Kiayias led a team in Athens last year that designed the first e-voting system, where voters could verify that votes cast actually went to the intended candidate, without relying on a trusted system component.
Prof Kiayias produced a key paper proving fundamental properties of bitcoin blockchain and played a lead role in writing IOHK’s paper on the first provably secure Proof-of-Stake protocol, published in the IACR Cryptology ePrint Archive and available in the IOHK research library.
Prof Kiayias said that IOHK’s rigorous standards made it stand out among cryptocurrency companies.
“I am very happy to engage with a team of very enthusiastic researchers in the area of blockchain systems,” he said.
“While other companies may consider taking shortcuts or focus on just engineering their products, IOHK makes long-term commitments in building basic tools that are available to the community as a whole and in advancing the science behind blockchain systems.”
Charles Hoskinson, chief executive and co-founder of IOHK, said: “Professor Kiayas is one of the most rigorous and creative cryptographers working in the cryptocurrency space, with a long pedigree of accomplishments from the GKL15 model to the first provably secure Proof-of-Stake algorithm.
“It’s a tremendous honor to have Aggelos on board as our Chief Scientist to help IOHK navigate the complex, yet always sublime world of cryptography. We look forward to the results this collaboration will bring as well as the great papers that will undoubtedly follow.”
Prof Aggelos Kiayias is the Chair in Cyber Security and Privacy at the University of Edinburgh. His research interests are in computer security, information security, applied cryptography and foundations of cryptography with a particular emphasis in blockchain technologies and distributed systems, e-voting and secure multiparty protocols as well as privacy and identity management. He joins IOHK as chief scientist through a long-term consulting agreement between IOHK and the University of Edinburgh, UK, where he is based and continues to do research and teach courses in cyber security and cryptography. Prof Kiayias is also Professor in Residence (gratis) at the University of Connecticut, USA, and Associate Professor of Cryptography and Security (on leave) at the National and Kapodistrian University of Athens, Greece.
Prof Kiayias’s cyber security research over the years has been funded by the Horizon 2020 programme (EU), the European Research Council (EU), the General Secretariat for Research and Technology (Greece), the National Science Foundation (USA), the Department of Homeland Security (USA), and the National Institute of Standards and Technology (USA). He has received an ERC Starting Grant, a Marie Curie fellowship, an NSF Career Award, and a Fulbright Fellowship. He holds a Ph.D. from the City University of New York and he is a graduate of the Department of Mathematics at the University of Athens. He has more than 100 publications in journals and conference proceedings in the area.
He currently serves as the program chair of the Financial Cryptography and Data Security 2017 conference.
IOHK | Latvia. Executive update
19 October 2016 3 mins read
For some of the IOHK team it was their first time meeting in person when we gathered in Latvia this month for an intensive working session to push forward several of our key projects. As a distributed company we usually collaborate remotely, across time zones, so coming together to exchange ideas was inspiring. For two weeks in the beautiful capital city of Riga, we advanced our groundbreaking work implementing the first provably secure proof-of-stake protocol. Our research fellow, Bernardo David, gave an earlier presentation on the protocol.
Our core developer team and partner, Serokell, is preparing a version of RS|coin to be used in a production system. We published our implementation of RS|coin - the central bank cryptocurrency conceived by University College London researchers - earlier this year.
It’s extremely exciting to know something we built will actually get used and be useful. Our team here in Latvia is working with our Japanese branch to help a client secure their information on the blockchain. The Serokell team, led by Arseniy Seroka and Jonn Mostovoy, includes expert developers and engineers drawn from Russia, Latvia, Croatia and Portugal. We've found exciting innovations that come from using Haskell. Serokell and the design team are putting the finishing touches to a block explorer for a major client.
Riga has also been very much about our ScoreX project. Alexander Chepurnoy, the director of the ScoreX project and Dmitry Meshkov, IOHK research fellow and ScoreX project developer, gave fascinating presentations on the latest developments of this pioneering framework: a prototype cryptocurrency that allows anyone to slot in and test parts of their own cryptocurrency code.
Both presentations covered the differences and considerations between a pedagogical implementation like ScoreX, meant to be used for research and education, and actually releasing a real cryptocurrency in the wild. In addition, many on the team took the opportunity to learn more about the ScoreX 2.0 release and what progress has been made in making it even more modular and flexible. Written in Scala, the project takes forward research in the area of authenticated data structures. Their findings are due to be published in an academic paper early next year.
Our creative team is here in Riga, and has been working to design the UI and UX for a big project coming next year. And our operations team has been hard at work laying the foundations for all our work. All in all, Riga has been extremely productive and very exciting, and we are very much looking forward to the launch of one of these projects for a major client in a few months.
A major technical challenge of the cryptocurrencies is to find a way to safely increase the throughput of the system in terms of number of transactions. An approach to tackle this limitation is to review the role of the blockchain, or even to take that data structure out of the picture completely. In this post, we will comment a paper by Boyen, Carr and Haines named Blockchain-Free Cryptocurrencies: A Rational Framework for Truly Decentralized Fast Transactions which proposes the latter approach.
Before describing the idea of the paper, and where the main contribution is, it is helpful to give an overview of how we can generally split the main components of a cryptocurrency protocol, or what is commonly known as the consensus protocol. The three main components are:
- the protocol itself, the routine that every node follows;
- the blockchain data structure; and
- the Proof of Work (PoW) paradigm.
Here we are clearly giving the emphasis on the Bitecoin like systems which rely on the PoW idea. The reader should keep in mind that not all systems rely on it (as for example Proof of Stake based protocol).
We start by reviewing the second item above: the blockchain.
The BlockchainThe reader familiar with decentralized cryptocurrencies is aware that the blockchain is a data-structure which is kept by the network players, which are often called *miners* in Bitcoin jargon.
Each smallest part of such structure is called block, hence the name, which is where the transactions are described. There are different ways of building the actual structure (for example, Bitcoin-NG and GHOST, however in the end, the pace of the the block generation (including its addition in the structure) is the heartbeat of the whole system.
That is, with each new added block, containing brand new transactions, we are sure that the system is "alive" and making progress. However here also lies a problem.
The Scalability of the BlockchainIn the pace of the whole system lies the fundamental constrain of the growth speed of the blockchain. Nakamoto was aware of it as well as its implications, as stated in the Bitcoin seminal [Nakamoto’s whitepaper](/research/library/#2VPKGNSS).
A major implication of this limitation is the regulated speed of block generation using the difficulty level. That is the hash value that the player needs to find in order to generate a new block in PoW. That value is carefully set within the Bitcoin network to keep just the right balance between the block generation time and the block spread time over the network.
In a different setting, say, with an arbitrarily low difficulty level, there would be a higher block generation rate for sure. However, due to the competitive nature of the PoW and the blockchain addition, the forks will become more frequent, therefore increasing the risk of attacks. One should expect this because the miners are competing against each other, in order to generate new valid blocks.
In other words, the scalability constraints derive from the blockchain centralized setting.
A Blockchain-free ApproachThe main idea behind the study of Boyen et al. is to substitute the blockchain by a directed graph of **transactions**, not blocks.
In that setting the system is not required to make progress by the pace of a single data structure like the blockchain. In fact, there is not a notion of block in their idea. The obvious direct consequence is that there is no race among the players of the system to generate the new block. The players are allowed to choose in which point in the graph they want to expand it.
Instead of the previous competitive blockchain environment, the players cooperatively expand the directed graph by adding new nodes, i.e., transactions, to it. We stress that every new node in the graph is a new transaction. And the action of attaching the transaction into the graph validates the transactions where the new one is attached to, i.e., the parent nodes (the new node has two parent nodes in the graph).
In order to incentivize the players to keep the progress of the system, every new transaction specifies a fee, which is decided by the issuer of the transaction. Consequently every new node which is later attached to that transaction (as described earlier) does two actions: (1) validate the transaction represented of its two parents (the number of parents is fixed by their framework) and (2) collects the fee from its parents.
Differently from the blockchain setting, there is not a notion of rounds (or epochs), i.e., the new block generation. However, due to the collection of fees, the players are incentivized to keep adding transactions to the system, therefore making the transaction history evolve. Another advantage is that the issuer of a transaction can increase the speed of the validation by offering a higher fee.
This framework reverses the competitive nature of the miners to a more cooperative approach to validate transactions in the system.
Other papers describing scalability issues within blockchain based protocols can be found here:
- On Scaling Decentralized Blockchains
- Inclusive Block Chain Protocols
Log-Structured-Merge trees (LSMT) are a good fit for modern SSD storage and offer good performance and reliability. LSMT are also a good fit for blockchain storage requirements (snapshots, consistency, proof of existence). This blog post describes a database designed specifically for blockchain storage, inspired by existing LSMT implementations (RocksDB, COLA tree).
The current state-of-the-art LSMT implementation is probably RocksDB, with in-memory write buffers, parallel compaction and snapshots. Another similar algorithm are COLA tree. That is a btree-like structure where each node has separate write buffer. Finally there is SSTable from Cassandra which is fairly simple, but offers great write performance.
For Scorex and other projects, we designed a storage with codename IODB, which combines features from all the above implementations. It has log-structured storage with fast binary search and multilevel-multithreaded compaction. It will also have snapshots, MVCC isolation, concurrent transactions, online backups etc...
Compared to RocksDB IODB it will have better concurrency, lesser write amplification and smaller disk overhead during compaction. IODB also runs on JVM, is easier to deploy and its functionality can be extended with plugins.
RocksDB is probably the current state-of-the-art LSMT implementation. It offers great performance and great concurrency. However it also has some trade-offs.
RocksDB (and other LSMT implementations) is using compaction to reclaim disk space, and to remove the old version of data from logs. The compaction process is not optimal:
- To perform full compaction all files have to be rewritten.
- Full compaction requires extra disk space to create new files, twice more than the data size.
- Compaction process operates over large files. It can take a long time before a single compaction task finishes. This makes it hard to temporarily pause the compaction process or close the database quickly.
- Write amplification is too big, the data entries are moved too many times during compaction process, even if they were not modified.
To solve that IODB took an inspiration from COLA tree (also known as Fractal Tree), which offers great performance under random updates. A COLA tree is a BTree like structure, where each node has its own write buffer. If the buffer becomes too large its content is propagated to a lower level by compaction process. A COLA tree is complex, but the basic idea of separate writer buffers is simple and great.
A COLA tree is difficult to implement; the code for compaction and rebalancing is complex, almost impossible in a concurrent environment. IODB avoids that by sharding data into non-overlapping intervals, rather than hierarchical BTree like nodes. Self-balancing code is simple; if an interval becomes too large (over 64MB), it is sliced into smaller intervals by the compaction process. If an interval becomes too small, it is merged into its neighbours.
Each interval in IODB is compacted with multi-level merge, the same way as RocksDB or any other LSMT implementation. So in practice IODB is composed of several small LSTM, one for each interval.
Background compaction is composed of several small atomic tasks. Each task operates on a single interval with limited size (bellow 64MB). Small tasks are easier to tune and run better in multiple threads. The limited size requires very little space overhead. Finally it is fast to close store or temporary pause compaction.
Multi source pre-sorted merge
Every commit (or write batch) is saved into a separate file. Over time you get multiple files sorted by time. To find an entry one starts from the newest file and traverses all files, until the required key is found. The compaction process merges those multiple files together.
RocksDB compaction can merge only two files at a time. IODB compaction can merge more than two files at a time. If the source data are presorted, the multi merge requires only one single pass over the source data. Creating the new merged file takes approximately
O(N*log(M)) time, where N is the total number of entries in interval, and M is the number of source files.
The number of source files (M) and the interval size (N) can be tuned. Large M means less IO and more CPU usage. Larger N reduces IO and CPU usage at the expense of larger memory usage. Configuring those parameters should make IODB usable in various workloads.
There is a number of optimizations. For example to reduce CPU overhead from comparing keys, IODB can be configured so that all keys in one interval share the same prefix. Only the key suffix is then compared (this trick comes from Cassandra).
Binary search over sorted table
Scorex uses fixed size keys. We can eliminate a file offset translation layer, and perform binary search directly over a memory-mapped file. This is very disk cache friendly and much faster than traditional hierarchy based structures, such as BTrees.
IODB is designed to be very easy to deploy. It is pure-Java and can run on any Java8 enabled JVM.
It is also written in a way which allows to integrate it into J2EE container or Spring Dependency Injection. It is a set of independent components wired together with the constructor parameters.
It also uses common Java components such as
ScheduledExecutorService to run background tasks. IODB can share thread pools with other Java libraries etc..
Finally IODB takes some features from RocksDB. It will support incremental backups and snapshots, created with hard file links...
MapDB will not use IODB directly, but will get a storage component based on IODB design. MapDB needs a different set of features (variable key size, 8-byte recids, less random data) and is written in a different language (Scala vs Kotlin), so it will have to get separate implementation.
MapDB will get append-only storage engine based on IODB design. Recids (keys) in MapDB store are 8-byte longs, which allows many optimizations by using primitive arrays and collections.
SortedTableMap could also use compaction similar to that of IODB design. So we will have a writable
on top of the append-only storage. It will use
SortedTableMap storage format for a single file, but it will support
updates, snapshots, compaction and all the features of IODB.