Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
저 시-비 리- 경 지 2.0 한민
는 아래 조건 르는 경 에 한하여 게
l 저 물 복제, 포, 전송, 전시, 공연 송할 수 습니다.
다 과 같 조건 라야 합니다:
l 하는, 저 물 나 포 경 , 저 물에 적 된 허락조건 명확하게 나타내어야 합니다.
l 저 터 허가를 면 러한 조건들 적 되지 않습니다.
저 에 른 리는 내 에 하여 향 지 않습니다.
것 허락규약(Legal Code) 해하 쉽게 약한 것 니다.
Disclaimer
저 시. 하는 원저 를 시하여야 합니다.
비 리. 하는 저 물 리 목적 할 수 없습니다.
경 지. 하는 저 물 개 , 형 또는 가공할 수 없습니다.
M.S. THESIS
A Data Retailer Based Decentralized Oracle
Design for Smart Contracts in Blockchain
블록체인 내 스마트 컨트랙트를 위한 데이터 리테일러 기반
분산형 오라클 디자인
FEBRUARY 2019
DEPARTMENT OF COMPUTER SCIENCE ENGINEERING
COLLEGE OF ENGINEERING
SEOUL NATIONAL UNIVERSITY
Park Soobin
M.S. THESIS
A Data Retailer Based Decentralized Oracle
Design for Smart Contracts in Blockchain
블록체인 내 스마트 컨트랙트를 위한 데이터 리테일러 기반
분산형 오라클 디자인
FEBRUARY 2019
DEPARTMENT OF COMPUTER SCIENCE ENGINEERING
COLLEGE OF ENGINEERING
SEOUL NATIONAL UNIVERSITY
Park Soobin
A Data Retailer Based Decentralized Oracle Design
for Smart Contracts in Blockchain
블록체인 내 스마트 컨트랙트를 위한 데이터 리테일러
기반 분산형 오라클 디자인
지도교수 권태경
이 논문을 공학석사학위논문으로 제출함
2018 년 11 월
서울대학교 대학원
컴퓨터 공학부
박수빈
박수빈의 석사학위논문을 인준함
2018 년 12 월
위 원 장 김종권 (인)
부위원장 권태경 (인)
위 원 전화숙 (인)
Abstract
A Data Retailer based Decentralized
Oracle Design for Smart Contracts In
Blockchain
Soobin Park
School of Computer Science Engineering
Collage of Engineering
The Graduate School
Seoul National University
Smart contracts, a program that enables the automated execution of terms
from the pre-defined condition establishment, became a core component of dis-
tributed applications (Dapps) leveraging the decentralized blockchain technol-
ogy. Although many researchers keep trying adopting them to solve problems
in the centralized systems such as the single point of failure (SPOF) and the
need for excessive trust, there are multiple technical challenges in the practical
use of them.
Amongst the challenges, we focus on “the oracle problem” in this paper,
which means how to feed external data with ensuring the trustworthiness to
the smart contracts. There are two primary technical considerations: (1)how
to make the smart contracts to access the external data (2)how can measure
the reliability of the data. In this paper, we propose a useful oracle design and
implementation with an investigation of target platforms and related works to
analyze the technical requirements and available design choices.
The proposed design focuses on the trustworthiness of the data sources
i
by separating the roles of the data provider into “publisher” and “retailer”.
The former, the data sources, can gain rewards by publishing data on their net-
works, and the latter gains rewards by relaying the data to the target blockchain
network. A reasonable incentive model motivates them to behave honestly to
maintain the oracle stable. Consumer contracts can receive the resolved data
from the multiple retailers by using the oracle deployed as an on-chain smart
contract. In the case of data inconsistency, the proposed reputation-based data
consensus protocol determines a value from the collected data pools. The se-
lected retailers and publishers can gain rewards from the deposit pre-paid by
the consumer contract, and the rejected ones get penalties.
The main goal of this design is providing sufficient high-quality data to the
blockchain by relieving the data sources’ burden of participating in blockchain
networks of potential consumers. The plenty of data supply may decrease the
data price, that leads to great inter-operability between blockchain systems
and the rest of the Internet. We also implement the protocols on Ethereum
network and prove the practicality of the proposal in cost and time aspects,
while most of the precedent researches are stuck with the protocol design. We
expect that the design to contribute to the practical use of Dapps as the first
working decentralized oracle through the code optimization process.
Keywords: SNU, Computer Science Engineering, thesis
Student Number: 2017-21931
ii
Contents
Abstract i
Contents iii
List of Figures v
List of Tables vi
Chapter 1 Introduction 1
Chapter 2 Background 3
2.1 Smart Contract Platforms . . . . . . . . . . . . . . . . . . . . . . 3
2.1.1 Bitcoin . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1.2 Ethereum . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.3 EOS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 The Oracle Problem . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3.1 Oraclize . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3.2 SchellingCoin . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3.3 Augur . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3.4 Witnet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
iii
Chapter 3 Design 10
3.1 Design Choices and Considerations . . . . . . . . . . . . . . . . . 10
3.1.1 Centralized vs. Decentralized . . . . . . . . . . . . . . . . 10
3.1.2 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.1.3 Service Scope . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2 Design Proposal . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Chapter 4 Protocol 16
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.1.1 Request phase . . . . . . . . . . . . . . . . . . . . . . . . 17
4.1.2 Delivery Phase . . . . . . . . . . . . . . . . . . . . . . . . 18
4.1.3 Consensus Phase . . . . . . . . . . . . . . . . . . . . . . . 20
4.2 Data Structure and Authentication . . . . . . . . . . . . . . . . 20
4.3 Data Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.4 Data Delivery Tag . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.5 Incentive and Penalty . . . . . . . . . . . . . . . . . . . . . . . . 24
4.6 Data Consensus . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Chapter 5 Proof of Concept Implementation 27
5.1 Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.2 Evaluations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.2.1 Experimental environment . . . . . . . . . . . . . . . . . . 29
5.2.2 Transaction Cost and Processing Time . . . . . . . . . . 29
5.2.3 Practicality . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Chapter 6 Conclusion 34
Bibliography 35
요약 37
iv
List of Figures
Figure 3.1 Data delivery from an off-chain network to consumers in
the target blockchain over the retailers. . . . . . . . . . . 14
Figure 4.1 Protocol overview . . . . . . . . . . . . . . . . . . . . . . 16
Figure 4.2 Flow of request phase . . . . . . . . . . . . . . . . . . . . 17
Figure 4.3 Flow of delivery phase . . . . . . . . . . . . . . . . . . . 18
Figure 4.4 Flow of consensus phase . . . . . . . . . . . . . . . . . . 19
Figure 4.5 Data structure for publishing . . . . . . . . . . . . . . . 21
Figure 4.6 Data tag verification during data delivery process . . . . 23
Figure 4.7 Reputation calculating formula . . . . . . . . . . . . . . 25
Figure 5.1 Memory/storage variables on the contract . . . . . . . . 28
Figure 5.2 Pseudocode of oracle interfaces(1/2) . . . . . . . . . . . . 28
Figure 5.3 Pseudocode of oracle interfaces(2/2) . . . . . . . . . . . . 28
Figure 5.4 Gas consumption to the numbers of deliveries in trans-
action for interface O-4 . . . . . . . . . . . . . . . . . . . 31
Figure 5.5 Total cost for an advertisement . . . . . . . . . . . . . . 31
Figure 5.6 The lower bound of processing time for an advertisement 32
Figure 5.7 Total data delivery transaction cost to number of deliveries 33
v
List of Tables
Table 5.1 Expected processing time and cost to the gas price in each
interface transaction (Simulated on Nov. 2018) . . . . . . 30
vi
Chapter 1
Introduction
Smart contracts are programs that enforce an execution of terms automatically
under a pre-defined condition. The concept was initially proposed by Nick Szabo
in 1997 [1], but could not be realized due to the lacks of necessary technologies.
Ever since Satoshi Nakamoto presented the idea of Bitcoin in 2008 [17], they
are typically executed on blockchains of which immutability property enables a
trustless consensus. Leveraging the strong integrity assurance of events, smart
contracts allow credible transactions without the trusted third parties which
require an excessive trust to them and have a single point of failure (SPOF)
threat. The advent of specialized platforms such as Ethereum [16] and EOS [13],
smart contracts became a core component of distributed applications (Dapps).
However, despite the potential of Dapps, there have been some reported
technical challenges in the practical use of them, e.g., lack of the transaction
privacy [2], irreversible insecure contracts [6], and the corruptibility of data
sources. Among the problems, we focus on “the oracle problem”. As the most
compelling applications of smart contracts (e.g., insurance, payment, and lo-
gistics applications) necessarily require access to data about real-world events,
ensuring the data trustworthiness is a critical technical requirement for Dapps.
1
There are two primary technical considerations in the oracle design. The first
one is building a data transfer channel between the network of data sources
to another of smart contracts. The latter is a blockchain, while the former
can be any off-chain network. Smart contracts cannot perform a self-triggering
by themselves or access external network. Therefore, an oracle is required to
perform the ‘aggregation’ and ‘pushing’ data to smart contracts.
The second concern is how the smart contracts measure the trustworthiness
of external data. In a centralized approach, a smart contract developer can use
a trusted third party service exists on the outside of the blockchain. However,
it can be a single point of data corruption. The decentralized approach seems
worthwhile in this sense, but it accompanies other complex design considera-
tions for robustness such as data consensus mechanism and an incentive model
enforcing the participants be honest. As the models should consider relation-
ships among the multiple stakeholders with possible security threats, a set of
built decentralized oracle protocols requires to be proven to work in an intended
way.
In this paper, we introduce and report the technical requirements details and
possible design choices for them from investigating the existing platforms and
services. Then we propose a scalable design of the decentralized oracle, which
runs as an on-chain smart contract. A new entity, called a retailer, plays the
role of setting up a “channel” between a network of a data source and another
of a consumer. The retailer seeks to relay the (external) data that can satisfy
the consumer’s data requirements. In this way, smart contracts become more
practical, and the sources do not need to participate in multiple blockchains
of potential consumers. We also formally analyze the total cost and expected
processing time models to evaluate. Our implementation of the design and run-
ning simulation using the models shows that the design is worthwhile as the
first practical decentralized oracle.
2
Chapter 2
Background
2.1 Smart Contract Platforms
2.1.1 Bitcoin
Bitcoin [17] is the first practical public blockchain network designed to serve as
the public ledger for the asset transferring transactions. The basic idea of public
blockchain is preventing illegal modification to digital records by maintaining
them in multiple copies with integrity validation. To achieve this goal, Bitcoin
is made up of three main components: transactions, consensus protocol, and
the peer-to-peer (P2P) communication network [7]. Bitcoin network maintains
a stable blockchain on these main components as followings.
• Participating nodes generate a transaction message to transfer their assets
using an ad-hoc script (i.e., OPCODE).
• The message contains a sender’s address and signature, receiver’s address,
and a locking script that only can be unlocked by the receiver.
• The transactions are propagated to peers along over the P2P communi-
cation network, then aggregated as a new block by each peer node.
3
• The decentralized peers make a consensus to decide a corresponding block
to be the latest block of the blockchain using the consensus protocol (i.e.,
the Proof-of-Work protocol for Bitcoin).
In this scenario, the script verification can be considered as a kind of smart
contract that allows the transactions which satisfy the signature verification
(i.e., condition) to be accepted by any nodes (i.e., terms). A reliable asset
transfer is carried out automatically since the blockchain guarantees the trust-
worthiness of input data. Bitcoin scripts are limited to perform a general task
as smart contracts due to the functional restrictions. For example, it does not
support the iterative loop or state storing to prevent the code from falling into
an unexpected state (e.g., infinite loop). Though, the Bitcoin is the first imple-
mentation which eliminates the need of a trusted third party of the untrusted
system entirely by introducing a concept of the smart contract. This invention
has significantly leveraged the decentralized blockchain technology.
2.1.2 Ethereum
Motivated by the Bitcoin’s technical potential but poor expandability, Vitalik
Buterin et al. presented Ethereum [16] in 2014, a decentralized platform to run
the custom-built smart contracts. The concept of “smart contracts” began to
be used commonly since Ethereum uses the word. Currently, Ethereum is the
largest eco-system of smart contracts built on the public blockchain.
The most significant progress in Ethereum is that they not only deal with
transactions but also perform general computational tasks including iterative
loop and storing the states. Therefore, developers became to make various types
of Dapps for their purposes. Ethereum provides a built-in Turing-complete pro-
gramming language “Solidity” that help the developers to handle the Ethereum
specific events, e.g., account handling, message authentication, token transfer-
ring, and so on. Furthermore, there are several newly introduced concepts in-
troduced by Ethereum. In the rest of this section, we introduce some essential
4
features among them since they significantly inspire the other succeeding smart
contract platforms.
Accounts: In Ethereum, not only the users but also the smart contracts
own accounts. The former is called a “externally owned account”, and the latter
is called an “contract account”. When a developer deploys a smart contract on
the Ethereum network, a contract account is assigned to the compiled binary
code and stored in the Ethereum blockchain.
Message and State: Message in Ethereum is a similar concept of “trans-
action” in Bitcoin, but it is used for more general purposes as they can transfer
additional data with or without the value transfer. Another important differ-
ence is that the contracts also can be senders or recipients of the message as
they have accounts. Ethereum introduces a concept of “state” and “state tran-
sition” on these characteristics, which can be widely adopted to the general
programming. Following features define the “state” in Ethereum.
• All accounts maintain a state such as Eth balance, variable states, and
storage contents.
• Only the message sent to the account can trigger a transition of the state.
The transition requires the corresponding amount of gas, which is relative
to the required amount of the task for the transition.
• As messages are recorded in the Ethereum blockchain like Bitcoin’s trans-
actions, state transitions are immutable and credible.
Consequently, a message is a “trigger” to functions of smart contracts, and
the transferred data is an input data to them. Contract-to-contract messaging
is also allowed in Ethereum. It helps the modulized implementation of a com-
plex system. In other words, the Ethereum smart contracts cannot trigger a
5
function of themselves and even cannot be triggered by an event occurred in
the blockchain without a message delivery. Therefore, typical Ethereum Dapps
are composed of two parts: the smart contract as a back-end core service, and
the web-based front-end part that can listen to events and interact with the
smart contract.
Ethereum virtual machine (EVM) EVM refers the whole Ethereum
network which works to build a consensus for the final state defined as a de-
scriptive tuple of all accounts like a state machine. Participating P2P nodes
act like the Bitcoin nodes under the consensus protocol. They are incentivized
by contributing to maintaining the network through the message propagation.
In comparison to the traditional centralized server-client model, EVM can be
regarded as a server infrastructure, but it provides free to maintain, non-stop,
trustworthy execution environment which a centralized system cannot guaran-
tee.
The biggest weakness of Ethereum as a platform is the limited scalability.
Only 20-30 transactions (assumed to be value-transferring transactions without
an additional data handling) are processed per second on the network currently.
Ethereum foundation is preparing to change its core consensus mechanism from
the PoW (proof of work) way to the PoS (proof of stake) way to improve the
transaction throughput.
2.1.3 EOS
EOS.IO [13] was introduced in 2017 for scaling of the Dapp ecosystem. The
major goal of EOS is making the development of Dapps easy by providing a set
of scalable services and functions that Dapps can make use of. Being compared
with Ethereum, it has two significant differences. The first is its consensus
algorithm, a deligated proof of stake (DPOS) mechanism. This choice is for
6
enabling the commercial scalability handling 10,000-100,000 transactions per
second.
The second difference is an economic model. In Ethereum, amount of gas is
required to all network resource usage including messaging, storage, and com-
putation power. Therefore, the users of Dapps should pay the fees as much
as they use the applications. In contrast, EOS utilizes an ownership model, in
which holding EOS tokens allows the contract operators (developers) a pro-
portional share in the network resources. For example, if someone owns N% of
the total EOS tokens, it has access to N% of network bandwidth to operate
its application. Consequently, the entry barriers to the normal users in EOS is
lower than theirs of Ethereum since they do not need to own enough stake, but
developers have higher barriers to deploy and run their Dapps.
2.2 The Oracle Problem
Regardless of the platform dependencies, data trustworthiness is critical to
smart contracts since data-related conditions trigger their executions which
are irreversible. There are two kinds of data : (1) internal data published in the
ledger, (2) external data coming from sources outside the blockchain. Trans-
actions inside the blockchain are internal, which are credible since they are
verifiable and trackable. The problem is how to ensure the trustworthiness of
the external data.
This issue is called “the oracle problem”, which means where the blockchain
can get the feed of trustworthy external data from. The blockchain network run
in a deterministic way so that the miner nodes should have the same view
for an event. Since web-like data sources are usually not guaranteed to be
consistent, a reliable oracle system to decide an exact data is necessary. It
is quite complicated to design a set of the oracle protocols because multiple
components in the off-chain and on-chain network operate the oracle without
trust. Technical challenges exist on the all the steps of data feeding as followings.
7
• Data source: Are the data sources trustworthy? How to authenticate the
trusted sources?
• Data delivery: How the obtained data can be delivered to the blockchain
network securely? Is there any possibility of forgery or corruption during
data transmission?
• Data reliability: How the smart contracts ensure the data is “fact” by
themselves?
Without being connected to the off-chain real-world data, use of smart con-
tracts is strongly restricted. In the following section, we introduce some related
works to solve the oracle problem.
2.3 Related Works
2.3.1 Oraclize
Oraclize [8] is a representative centralized oracle for smart contracts which is
currently working. Contracts willing to use Oraclize API need to send a query
to Oraclize’s contract on the same platform with paying the price corresponding
to the data size and retrieval complexity. The destination of a data source and
a ruleset to parse the response to get an exact part should be specified in the
query string, e.g., URL with the structure of the XML/JSON, InterPlanetary
File System (IPFS) address and file name. Once the Oraclize’s system detects
a requesting event emission, the server on the cloud (outside of the blockchain)
performs the delegated data retrieval using its network capacity to access the
sources. Then, the result is delivered to the requester contract via the call back
interface implemented on it.
8
2.3.2 SchellingCoin
Vitalik Buterin proposed the concept of “SchellingCoin” as a decentralized data
feeding protocol in Ethereum white paper [16]. It assumes that all the Ethereum
nodes will try to report correct data so that they can be included in the honest
group for a reward. Its median based value decision mechanism is not proved
to be stable yet, but the basic concept inspires the following studies.
2.3.3 Augur
Augur [10] built a prediction market on the Ethereum blockchain that provides
correct outcomes in a decentralized fashion. Users of Augur can participate in
the market by betting to an expected result for a future event such as weather,
political election, and growth of a company. The winners gain proper Ethereum
tokens as prize money, and also earn some reputations tokens which are re-
distributed from the losers’. Augur is valuable in that it is the first working
example of a collective intelligence based data decision protocol on a blockchain
network.
2.3.4 Witnet
Witnet is a design for decentralized oracle that is still under being implemented
and tested. Witnet runs on a side chain with a native protocol token (called
Wit), which miners and the witnesses earn by retrieving, attesting and deliver-
ing web contents for clients. A new entity “witness” is assigned to be delegated
to do a data retrieving task from a specified web URL destination. Another new
role in Witnet is “bridge” nodes which are the subset of the witness nodes and
specialized in fulfilling the observation events on other blockchain networks and
bring the result to Witnet network. Its consensus mechanism assigns the roles
based on their reputation. Therefore, the participants can have more opportu-
nity to earn the fees by gaining the higher reputation from behaving honestly.
9
Chapter 3
Design
From the investigation of target systems and the related works, the critical
design considerations in oracle are categorized. In this chapter, we first present
the organized design choices with trade-offs. On this basis, we propose a new
decentralized oracle model with an overview.
3.1 Design Choices and Considerations
3.1.1 Centralized vs. Decentralized
The advantages of using the centralized oracle model is a competitive price
and the relatively low system complexity. As the service flow requires only
a few interactions between the contracts, processing fee is low. However, it
requires an excessive trust to the oracle, to its infrastructure, and even to the
data source (i.e., URL destination) since the data is not authenticable in the
customer’s side. Introducing a trusted hardware (e.g., Intel’s Software Guard
Extensions) to the oracle server can relieve this tampering threats by securing
authentication [4]. Nevertheless, still there is a threat of a single point of failure
(SPOF) or compromise that defeats the ultimate purpose of using Dapps. The
10
SPOF threat may defeats the blockchain philosophy.
A decentralized approach resolves the centralized trust issue, but more costly
and complex since multiple entities participate in the oracle. There are different
kinds of critical considerations for implementing a oracle in the decentralized
fashion as listed below.
Participants and Roles
A decentralized oracle is built up with multiple participants. The Augur com-
munity is composed of the market creator and the reporters, and the Witnet
network is composed of the clients, witnesses, and bridges. Roles and respon-
sibilities of those participants significantly affects the overall system design
regarding to security model, cost model, rewarding system, and the consensus
design. From the nature of a decentralized network, the system will be more
robust as more participants are involved since a few adversaries cannot easily
harm the system. Though, the larger amount of fees and the more fine-grained
distribution of them are required to a customer (the market creator in Augur
and the clients in the Witnet) as much parties join the data resolution task.
Incentive and Punishment
Reasonable incentive and penalty are a core component of a decentralized or-
acle model that makes the oracle practical. It enforces the participants behave
honestly. The fundamental security requirement of it is making the sum of the
benefits from a fraud behavior must be less than a value he can gain from an
honest behavior. In addition to an economic system, introducing the concept of
reputation system is helping to prevent the predicted misbehaving. An adver-
sary who keeps behaving to blame the system becomes to lose the opportunities
to participate in the system gradually as they lose the reputation.
11
Consensus
Data consensus mechanism is used to determines the exact value of data from
the multiple ones submitted by multiple parties (like in Augur). In other words,
it is the consensus design that decides the honest and dishonest actors. There-
fore, the mechanism must to be designed carefully with enough consideration
of the incentive system. For the deterministic types of data such as ‘true or
false’, ‘yes or no’, and noun type of data, a standard or weighted majority rule
can be adapted as the Schelling coin propose. However, if the data content is
a numeric type, the consensus algorithms can be implemented in the various
ways: mean approach, median approach, voting based approach and so on.
3.1.2 Architecture
On-chain Architecture
Building an on-chain oracle application is the most straightforward way to
implement an oracle system that demands the less attention to securing the
untrusted communication channel between the environment of the blockchain
network and the off-chain network. All the oracle-related events happen on-
chain, thus the processes are transparent as they are trackable and verifiable.
Furthermore, the system is easy to deploy since it does not require a modifica-
tion to the platform protocols. One weakness of on-chain oracle is that it may
be expensive as all the computation tasks and data handling require network
resources which are quite costly.
Off-chain architecture
Off-chain architecture is more lightweight than the on-chain approach by del-
egating the tasks to the off-chain resources which are cheaper. However, the
design necessarily needs the additional trust and security robustness to the off-
chain infrastructures. The hybrid way to relieve the trust problem is a side-chain
12
approach which the Augur and the Witnet adopt. The side-chain approach is
reasonable because it is a kind of off-loading model widely used in traditional
computing resource management systems. In these case, it can be challenging to
integrate the ad-hoc currency to the target platforms one by one. Additionally,
as the participants gain the monetary rewards as the ad-hoc tokens assumed
to be exchangeable to the major currencies, the economic value of the tokens
must be maintained in a stable mood to prevent the participants leaving.
3.1.3 Service Scope
Service scope is defined by what methods the oracle supports to retrieve the
data. If it only delivers data from the specified source such as URL and file
system, a built-in data specifying grammar can be implemented in the protocol,
e.g., the query specification of Oraclize and Witnet. The responsibility of the
data submitter(s) is accessing the source, getting the data, and submitting it to
the oracle. Augur like a market-based model can serve data from various sources.
However, informing and understanding the exact market topic is critical, e.g.,
Augur users are actual human-beings who can understand the market title and
join the poll. An oracle also can be specialized in a specific topic and provide a
persistent service only for the use, e.g., an oracle for the real-time diesel price
feeding. This model is different from the market-based model in that the latter
is only valid during the market is running and only for the market creator. The
data submitters who are interested in the specialized topic can concentrate on
customers using the oracle.
3.2 Design Proposal
In this paper, we propose a new design and protocol for an on-chain decentral-
ized oracle. we introduce a new entity, called a retailer, plays the role of setting
up a “channel” between the external data sources and the smart contracts on
blockchain network. The retailers take over the overhead to cross the network
13
border which is data providers suffers in existing models. The motivation here
is that the data sources are considered as a critical participant in the proposed
design; other existing systems do not consider them as a dependent entity and
give no incentive to them. Attracting enough numbers of high-quality data
sources is expected to invigorate this kind of data market, and lower the cost
to use the data. In retailers’ view, the design is compatible to existing models
which of all nodes can freely submit data after getting it from the other sources
(like the Witnet model) or creating their own opinions (like the Augur model).
However, there are strong improvements for data providers and customers; (1)
the data providers in real-world can more effectively provide the data to vari-
ous blockchain networks for monetary reward (2) the data providers can easily
gain proper reputation by revealing their identity in the blockchain via trusted
off-chain channel (3) the customers (i.e., developers of Dapps) can specify a fine-
grained data retrieval condition that improves the reliability. Consequently, the
increase in high-quality data in oracle networks will lead to the decrease of the
Figure 3.1 Data delivery from an off-chain network to consumers in the target
blockchain over the retailers.
14
price.
Figure 3.1 illustrates how the proposed oracle model works. The data source
can be any off-chain network, e.g., Internet, IoT, or another blockchain. The
publishers, who publish data over their off-chain channels to one or more data
retailers, need to create identities on the target blockchain network in advance
to get the rewards. They are motivated by getting enough reward consists of
economic gains and the reputation that is related to the future incomes. A con-
sumer specifies the data requirements in her running smart contract. Then, the
retailers try to fetch the suitable data from the publisher’s off-chain channel.
For this, the retailers must have the capability to access both of the on-chain
and off-chain networks as they serve as a bridge. The data delivering transac-
tions will be propagated through P2P connections to be verified by miners as
same as the normal transactions (omitted in Figure 3.1). When the conditions
(i.e., the above transactions) of a smart contract are fulfilled, the contract au-
tomatically distributes the reward to the publishers and retailers using their
identity information attached to the data of interest.
15
Chapter 4
Protocol
4.1 Overview
Figure 4.1 Protocol overview
Figure 4.1 illustrates the entire data feeding process with some of main pro-
tocols. The whole process is divided into three phases: request phase, delivery
16
phase, and consensus phase. Consumer contracts can initiate the data retrieval
to oracle in the request phase. Then during the delivery phase, the retailer
nodes deliver suitable data values that can be fetched from off-chain networks.
The collected data will be used to figure out the correct answer in the consen-
sus phase. From the result, the publishers and retailers of ‘selected’ data gain
rewards, but ones of ‘rejected’ data get some penalties. More details of each
phase are provided in the next sections.
4.1.1 Request phase
Figure 4.2 Flow of request phase
A series of interactions among participants in the request phase is illustrated
in Figure 4.2. It is the first phase to retrieve external data from the consumer
contract on the blockchain. The phase starts with a message from a consumer
contract to the oracle contract that requests to open an advertisement for data
delivery. Enough monetary deposit (i.e. cryptocurrency tokens) to create the
advertisement must be enclosed in the message together. By getting the mes-
sage, the oracle contract will try to fetch requirements of the consumer contract
to decide whether it accepts the request or not. The consumer contract should
17
provide a defined callback interface for this. If the requirements make sense, a
new advertisement will be included in an “opened ad” list which is accessible
publicly. Retailers, who are expected to keep listening to or polling this event,
can obtain the consumers’ accounts and try to get the data requirements from
them using the same interface that oracle used. If the retailers judge that they
can deliver the satisfying data, they will participate in the next phase, a de-
livery phase. More details about the requirement statement and validation is
discussed in 4.3.
4.1.2 Delivery Phase
Figure 4.3 Flow of delivery phase
The retailer nodes in delivery phase have two choices: (1) accessing to a
proper data publisher to fetch data through the provided channel in off-chain
environment, e.g., open API, web page, or a trusted file storage (2) generating
corresponding data by themselves. By choosing the second option, retailers
can perform both roles of retailer and publisher since the roles are logically
defined. Whichever option they choose, the fetched or created data should be
formatted to defined structure which consists of two parts: a value part and an
18
authentication information part for the publisher. The format specification is
described in 4.2.
Retailers can submit the data to the oracle contract with consumer address
and the delivery tags via “interface O-3” in figure 4.3. The submission will be
accepted if its format is valid and it satisfies the consumer data requirements.
To report a successful delivery, retailers need to pass the tags to the consumer
contracts after transforming it into an unguessable value along with the defined
protocol described in section 4.4. The tags which additionally serves a kind of
“delivery proof” should be collected in the consumer contracts and presented
to oracle in the consensus phase. By introducing this deposit-based 2-way re-
porting process instead of direct data delivery to the consumer, the system can
prevent dishonest consumers from false reporting to lower or snatch the fees
of honest retailers after taking the data. Furthermore, in the implementation
aspect, such design allows the consumer can decide when to terminate the de-
livery phase by stating it in itself. As the tag is designed to be only verifiable by
the oracle with a message sender information, which is resistant to tampering in
platform level, an adversary who only steals the data value from an observation
cannot re-use it to incapacitate another retailer’s work by replay attack.
Figure 4.4 Flow of consensus phase
19
4.1.3 Consensus Phase
When a consumer’s pre-defined conditions are fulfilled, the consumer can ter-
minate the delivery phase and getting the finalized data value from the oracle
contract. Collected tags (i.e., the converted tag from the original value) from
the retailers should be presented to the oracle to close the advertisement and
to start the consensus phase. Each tag is paired with a delivery information
including publisher identity, retailer identity, and the data value. The oracle
can load all the data after validating the tags using tag’ and consumer account
information.
As the types of the data for each oracle depend on its category, the oracle
should use a proper consensus algorithm for its purpose. If an oracle is deployed
to provide a diesel price of the United States, for example, the consensus al-
gorithm must be able to handle the numeric types of data. The details of the
consensus algorithm is addressed in section 4.6. During the consensus-making
process, some of the deliveries are selected, but some of them are rejected to
adopt the reports since the algorithm regards them as incorrect ones. For both
cases, the oracle updates the reputation information at the end of this phase
by granting incentives to the former and giving a penalty to the latter. The
incentive and penalty are consist of a monetary reward and reputation as dis-
cussed in 4.5. After distributing the fees, the monetary rewards, amount of
remained consumer’s deposit is returned back to consumer’s account. Accord-
ingly, the consumer obtains the data which is estimated as a correct answer
over the explained three phases.
4.2 Data Structure and Authentication
From this section to the end of this chapter, we provide the protocol details not
covered in the overview section. As the first detail, the basic structure of data
for publishing is described in this section. Data authentication is essential in
20
the proposed model since the roles of data publishing and delivering are sepa-
rated. Publishing time information also should be included in the data object
to validate its freshness if consumers want. For the reasons, a data publisher
must include its identity information, a timestamp, and data value to publish.
Then she can sign the data object using her private key related to her account
address before publishing it1. An expected data structure in Ethereum platform
is illustrated in Figure 4.5. When the signed data is fetched by the retailers and
delivered to the oracle, the oracle can authenticate the publisher’s identity us-
ing a built-in address verification function. The protocol only requires a little
effort to since the most of the well-known platforms support the sign-and-verify
process with inherent functions.
Figure 4.5 Data structure for publishing
4.3 Data Requirements
As explained in the overview section, consumers can specify and present some
data requirements which are also used to state the restrictions for data collec-
tion. Available fields of data requirements are listed below.
• Blacklist: A list of addresses that a consumer does not want to retrieve
the data from. The oracle blocks any data objects published by one of the
list, and the retailers may not deliver them since their goal is earning the
reward.
• Whitelist: A list of addresses that are only allowed by the consumer to
retrieve the data from. If the publisher of the delivered data object is not
1In Ethereum, the address refers the first some bytes of the account’s public key.
21
included in the list, the delivery will be rejected.
• Time condition: A time requirement which is represented in a timestamp
format. A data object including an unreasonable timestamp (i.e., later
than the current timestamp obtained by the platform) or a timestamp
that not satisfying the freshness will be rejected.
• Source diversity: If the consumer wants to retrieve data from multiple pub-
lishers for cross-checking, it can specify this field in their requirements. It
will be useful to mention this field with the whitelist feature to guarantee
that the data is coming from the multiple trusted sources.
Data requirements specification helps the proposed model work in an in-
tended way in following two aspects: (1) the decentralized oracle system can
be maintained by the law of supply and demand as a kind of market for in-
formation (2) the consumers can explicitly block some publishers who are not
preferred or known as dishonest. This feature grants an advantage to the well-
known publishers who may gain a higher reputation in the real world as they
will can be referred much more times than the new entities. Since the publisher
reputation model is related to the recent number of successful delivery activi-
ties, the publishers can get a higher reputation in this market. This is a very
intentional direction for the oracle design as the publishers will be incentivized
to participate in the oracle, and they are expected to provide great quality of
data.
4.4 Data Delivery Tag
The delivery tag at [Interface O-3] and [Interface C-2] in Figure 4.3 is applied
as a delivery proof in the two-way delivery system, that is comparable to a
check in a traditional banking system. Figure 4.6 illustrates how the tag works
in data delivery process.
22
1. To generate a tag value, a retailer should make a “secret salt”, an unguess-
able random value.
2. By hashing a string from concatenating the salt value and the retailer’s
address, a tag prime is generated.
3. Then, he can obtain the original tag by hashing the tag prime value after
concatenating it with the consumer’s address.
4. This tag is passed to the oracle in data delivery through [Interface O-3].
5. The tag prime is passed to the consumer for noticing the delivery through
[Interface C-2].
6. All the tag primes from retailers are collected by the consumer and sub-
mitted to the oracle when she closes the advertisement.
7. Then the oracle can reproduce the tag value by hashing the tag primes
with the consumer’s address.
8. Finally, this tag value is used as a key for loading a mapped data object
from its data container.
As the oracle regenerate the value by referring the consumer authentication
information, only authorized entity can access to the data store to retrieve it.
Figure 4.6 Data tag verification during data delivery process
23
4.5 Incentive and Penalty
Fee as a monetary reward
The first incentive for the contributors to the oracle system is a monetary
reward, commonly called fee. The publishers gain the “publishing fee”, and the
retailers gain the “sales fee” as a reward. The amount of publishing fee may be
smaller than the sales fee since the publisher can collect fees from the multiple
deliveries by other retailers. The portions of publishing fees for each selected
publisher are related to the amount of reputation each publisher has. The sales
fee also must be bigger than the sum of the messaging fees that the retailers
spend during the delivery phase in advance. The messaging fees are decided by
the message size (i.e., data size) and the network status if the platform adopts
a competitive fee model for transaction processing. Even for the same size of
data, the amount of fees can be various as the price of the platform’s token
may fluctuate. Thus, the oracle should consider the exact prepaid price. For the
penalty, exclusion of the incorrect data contributors from the rewarding pool
serves as a penalty for the retailers since they will lose the prepaid expenses.
Thus, maintaining reliable and various fetching channels is crucial for retailers.
If the delivered data is not adopted by the oracle, they cannot collect their
investment.
In the consumer’s view, data consumption should be worth more than the
sum of the sales fee and the publishing fee.
Reputation
In addition to the monetary reward, a notion of the publisher reputation is
one of the design principles that encourage the publishers to report reliable
data. Proposed reputation system estimates each publisher’s reputation based
on his activities with three factors: the total number of submission during re-
cent N advertisement, the number of acceptance, and the number of rejection.
Reputation value calculation formula and notations are provided in Figure 4.7.
24
The first term of the right side reflects the data reliability of this publisher.
If all the data from this publisher were accepted, the result of this term al-
ways becomes 1. The second term of the right side indicates how much the
effective deliveries happened during N of recent advertisements. The number of
acceptance is only counted during recent N times, but the number of rejected
submissions per a advertisement is counted and accumulated from the first time
of oracle deployed. Therefore, a reputation value of publisher whose data have
been rejected more than threshold is converged to a low value. By controlling
the constant K, the oracle deployer can coordinate the balance between the
reliability and activeness. A publisher fully selected during the recent N ad-
vertisement may have K+1 times of reputation than a publisher who is just
joined.
In our design, each oracle contract maintains own reputation status to be
used for data consensus computation. This is from a fundamental idea that the
reliability is only valid in its service field. A similar approach is found in an
existing knowledge system that are comparable to a decentralized oracle for
humans beings, Quora [12]. A concept of reputation system in Quora estimates
how much the answerer is specialized for each field, the question category.
Figure 4.7 Reputation calculating formula
25
4.6 Data Consensus
We adopt a weighted majority-based consensus model to figure out the result
value. The reputation points are used as the weight of a publisher, the owner of
the reputation. By controlling the value of K, the oracle can decide how much
more weight it will give to the “active” publishers than the newcomers.
For the data as non-deterministic values, the numerics, we introduce a new
approach named clustering-based mean calculation. As our model run on the
on-chain environment, the median approach requires a sorting process which
requires much more computation resource in the blockchain network. Therefore,
we are motivated to mean based approach with maintaining the rule of excluding
out-of-range answers from the rewarding. The clustering-based mean calculation
uses a threshold value to group the similar submitted values into the same
cluster. The cluster core value is newly calculated in every time a new member
is added. The sum of the weight of publishers whose data is in the same cluster
is used to select the most trustworthy cluster. As a result, who deliver the
value close to the most influential publishers’ data have a high probability to
be rewarded.
26
Chapter 5
Proof of Concept Implementation
We implemented an example oracle on the proposed protocol as an Ethereum
smart contract to prove the concept. We suppose that the oracle is designed to
provide the Korean-won to US-dollar exchange rate (won per 1 dollar), which is
a numeric type of data. By implementing a working example on the Ethereum,
the most widely used smart contract platform, we proved that the feasibility
of the proposed model. Furthermore, we evaluate the practicality of the model
in two aspects, the first one is a reasonable total cost that should be paid by a
consumer, and the second is the total processing time.
5.1 Code
The code is written in the Solidity language and under 500 lines including four
interfaces, sixteen of helper functions, and one abstract contract for the con-
sumer contracts to implement to use the oracle. Figure 5.1 shows the variables
of the contract commonly used as the memory or storage spaces. As chang-
ing the variables’ state induces gas consumption, maintaining minimum spaces
is one of the implementation goals. ‘AdResult’, ‘PulblisherActivity’, ‘DataRe-
quirement’, ‘DataPacket’ refer the custom structures to handle the appropriate
27
data. There are relevant comments for each variable on the figure. The pseu-
docode of interface functions are described in figure 5.2 and figure 5.3.
Figure 5.1 Memory/storage variables on the contract
Figure 5.2 Pseudocode of oracle interfaces(1/2)
Figure 5.3 Pseudocode of oracle interfaces(2/2)
28
5.2 Evaluations
5.2.1 Experimental environment
We use a smart contract simulator “Truffle suites [14]” to estimate the amount
of gas usage for running the code. Based on the gained gas amount used to run
each function, we calculate the fee values in US dollar if it is consumed on the
main Ethereum network. The currency exchange is referred out in [15].
5.2.2 Transaction Cost and Processing Time
Total time consumed to process a data advertisement is composed of transaction
processing time for each interface. Since the processing time is related to the gas
price the senders pay for the transaction as Ethereum adopts the price racing
model, we calculate the cost and time consumption for three cases: the fastest
but most expensive case, the average case, and the cheapest case. Recall that the
estimation to the interface C-1 and O-2 are omitted as the “call” transactions
used to read state do not consume the gas in Ethereum. Table 5.1 lists the
results.
For the fastest case, we set the gas price as 21 Gwei (Giga wei). This means
that the transaction pays 21 nano Ether per 1 gas consumption to be processed.
The gas price is set as 2.4 Gwei and 2.3 Gwei for each of the average case and
cheapest case. The time is evaluated as an “expected time to be confirmed” con-
sidering the main network status at the evaluation time (i.e., Nov. 2018). The
table shows that the interface O-3 and O-4 require the more massive amount of
gas than other interfaces as they transfer the extended data. A transaction for
interface O-4 to close advertisement requires a particularly tremendous amount
of gas and processing time, as the function also carries out a complex compu-
tation including clustering and reputation handling. The time estimation from
the transaction results in about 4,000-5,000 seconds which means the consumer
can confirm the data after about 80 minutes when she pays the average price.
For interface O-1 that used to open advertisement and interface C-2 that for
29
Interface O-1 O-3 O-4 C-2
Gas consumption(per transaction)
148472 298594 5323497 48318
Fastest(21 Gwei)
Cost($) 0.327 0.659 11.742 0.107Time(s) 35 35 1773 35
Average(2.4 Gwei)
Cost($) 0.051 0.103 1.841 0.017Time(s) 2016 2016 4608 2016
Cheapest(2.3 Gwei)
Cost($) 0.047 0.094 1.675 0.015Time(s) 2533 2533 5345 2533
Table 5.1 Expected processing time and cost to the gas price in each interfacetransaction (Simulated on Nov. 2018)
delivery reporting, the required gas amount may not fluctuate as the message
sizes are fixed. The delivering data size is the main factor to decide the cost in
case of the interface O-3. In the evaluation, we set the pure data part (excluding
the signatures and timestamp parts) as a simple six bytes string to indicate the
exchange rate multiplied by one hundred (for handling the decimal point). As
the data length is longer, the cost required to send it and store it becomes
higher. Another main factor related to the cost is the number of deliveries
the consumer want to refer. In addition to that the retailing and publishing
fees that are directly correlated to the number, the total length of proofs and
computational tasks in interface O-4 increase so that the transaction becomes
costly. The amount of gas in table 5.1 is for two deliveries. Figure 5.4 shows
the relation between the number of deliveries and gas consumption required to
transfer the request message and make a consensus from them.
5.2.3 Practicality
We now demonstrate the proposal’s practicality by analyzing the cost and time
required to take the whole protocols. We build a model to calculate the total
cost and time required to the consumer in Ethereum network. A total cost
model described in figure 5.5 reflects the sum of transaction cost, sales fees,
publishing fees, and the oracle fee which can be optionally required. For the
30
0 1 2 3 4 5
5.2
5.4
5.6
5.8·106
Number of Deliveries (K)
Gas
con
sum
ed
Figure 5.4 Gas consumption to the numbers of deliveries in transaction forinterface O-4
total processing time, we provide a model is designed as a lower bound model
in figure 5.6 because the actual time consumption is decided how the retailers
actively participate in the advertisement. Therefore, we ignore the time con-
sumed for observing the ads and fetching data in the model and assume that
the retailers try to deliver data as soon as an advertisement opens.
Figure 5.5 Total cost for an advertisement
Figure 5.7 shows the simulation result using the models above. By setting
the most expensive gas price for all the transactions to accelerate them, the
31
total cost a consumer should pay for 15 deliveries is about 30$ and it may takes
more than 31 minutes to end up the process. If all the transactions pays the
gas price as an average value, the total price is decreased to under 5$ under
the same condition, but the lower bound of processing time goes up to about
215 minutes. To compromise between the price and time, the consumer can
optimize the cost model by setting gas prices selectively. If she allows all the
transactions to adopt the fastest option except the transaction for interface O-
4, the most expensive transaction to close her advertisement, she can use the
oracle protocol under 15$. The lower bound of the time delay is shortened to
about 177 minutes.
The prices are reasonable even if 0.3$ of the sales fees and publishing fees
are included. We suppose that the retailers and publishers to be implemented
in an automated system like the Oraclize, consequently the market price will
go down in the competitive environment.
Figure 5.6 The lower bound of processing time for an advertisement
32
0 1 2 3 4 5 6 7 8 9 10 1505
101520253035404550
Number of Deliveries (K)
Tot
alco
st[$
]
Fastest price(31.3 mins)
Optimized price(177.6 mins)
Average price(215.7 mins)
Figure 5.7 Total data delivery transaction cost to number of deliveries
33
Chapter 6
Conclusion
We have introduced a decentralized oracle which separates the data provider’s
role into two sub-roles: the publisher and the retailer. The publisher is in charge
of publishing a data in an authenticable format. The reatiler takes off the de-
livery overhead by relaying the data from the off to the on-chain oracle to gain
fees. In the design, the data trustworthiness is estimated based on the sum of
reputations of the publishers who submit the similar data values. As a result,
the publishers who have maintained the continuously high reputation have more
opportunity to be selected and to earn the money. Therefore, the existing pow-
erful data providers are motivated to provide a proper data for smart contracts
since their off-chain reputation can be reflected to the oracle from the consumer
requirements. Through an implementation on Ethereum environment, we prove
the practicality of proposed model. We will continue optimization of the code
to make it use less memory and reduce the calculation complexity to lower the
total cost consumption. Using a complete set of protocol, we expect to provide
the first working decentralized oracle to the typical smart contract platforms.
34
Bibliography
[1] Szabo, N. “The idea of smart contracts,” Nick Szabos Papers and Concise
Tutorials, vol. 6, 1997
[2] KOSBA, Ahmed, et al. “Hawk: The blockchain model of cryptography and
privacy-preserving smart contracts,” in Proceedings of IEEE symposium on
security and privacy, 2016. p. 839-858.
[3] Kalra, Sukrit and Goel, Seep and Dhawan, Mohan and Sharma, Subodh
“Zeus: Analyzing safety of smart contracts,” in Proceedings of The Network
and Distributed System Security Symposium (NDSS), 2018.
[4] Zhang, Fan and Cecchetti, et al. “Town crier: An authenticated data feed
for smart contracts,” in Proceedings of the 2016 aCM sIGSAC conference
on computer and communications security (CCS), 2016. p. 270-282.
[5] Haber, Stuart, et al. “How to time-stamp a digital document.” in Con-
ference on the Theory and Application of Cryptography, Springer, Berlin,
Heidelberg, 1990.
[6] Kalra, Sukrit et al. “Zeus: Analyzing safety of smart contracts.” in Proceed-
ings of The Network and Distributed System Security Symposium (NDSS),
2018.
35
[7] BONNEAU, Joseph, et al. “Sok: Research perspectives and challenges for
bitcoin and cryptocurrencies.” In: Security and Privacy (SP), 2015 IEEE
Symposium on. IEEE, 2015. p. 104-121.
[8] Oraclize - blockchain oracle service, enabling data-rich smart contracts,
http://www.oraclize.it/
[9] Reality Keys - Facts about the future, cryptographic proof when they come
true, https://www.realitykeys.com/
[10] Augur - A Decentralized Oracle and Prediction Market Protocol,
https://www.augur.net/
[11] Verity - Real-time decentralized oracle, https://verity.network/
[12] Quora - Quora is a place to gain and share knowledge,
https://www.quora.com/
[13] Ian Grigg, Dan Larimer and others, “EOS.IO Technical White Paper v2,”
https://github.com/EOSIO/Documentation/blob/master/TechnicalWhitePaper.md
[14] Truffle Suite, Sweet Tools for Smart Contracts,
https://truffleframework.com/
[15] Eth Gas Station — Consumer oriented metrics for the Ethereum gas mar-
ket https://ethgasstation.info/calculatorTxV.php
[16] Buterin, Vitalik and others, “A next-generation smart contract and decen-
tralized application platform,” white paper, 2014.
[17] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system.”, 2008.
36
요약
스마트 컨트랙트는 입력 값이 사전에 약속된 조건을 만족하면 자동으로 계약을
이행하는 일종의 프로그램으로서, 분산 블록체인 기술의 발달과 함께 분산 어플리
케이션(Distributed Applications, Dapps) 구현의 핵심 요소로 자리잡았다. 기존
의 중앙화된 시스템에 존재했던 단일 장애점 문제, 과도한 신뢰 요구 등의 문제를
Dapp의 도입을 통해 해결하고자 하는 노력들이 계속되었음에도 불구하고, 기존
시스템을 대체할만한 수준의 솔루션을 구현하기 위해 해결해야 하는 기술적 난제
들이 아직 상당수 존재한다.
본 논문에서는 이들 중 외부 데이터 연동 문제를 해결하기 위한 기법을 제안
한다. 이러한 문제는 일반적으로 “오라클 문제”로도 불리는데, 블록체인 내부에서
발생하는 데이터는 모두 검증 가능하므로 신뢰할 수 있는것에 비해 외부 네트
워크에서 생성된 데이터의 신뢰성은 내부에서 보장할 수 없기 때문에 신뢰할 수
있을만한 데이터 공급처인 오라클을 어떻게 구현할 것인지에 대한 문제로 귀결
된다. 오라클의 기술적 요구사항은 크게 두가지로 나누어 지는데, 외부 접근성이
없는스마트컨트랙트로하여금어떻게외부데이터에접근할수있게할것인지에
대한 문제와, 해당 데이터의 신뢰성을 어떻게 보장할 것인지에 대한 문제가 함께
고려되어야한다.본논문에서는대표적인스마트컨트랙트플랫폼및관련서비스
분석을 통해 오라클 디자인에서 고려되어야 하는 특성들과 디자인 결정사항들을
도출하고 이를 기반으로 효과적인 오라클 디자인을 제안 및 구현하고자 하였다.
제안된기법은기존관련연구들과달리데이터자체의신뢰성에집중하기위해
외부의데이터 “발행자(Publisher)”와이를블록체인네트워크로전달하고수수료
를얻는 “리테일러(Retailer)”로네트워크참여자의역할을분리하고,적절한수익
배분 모델을 구성하여 이들로 하여금 정직한 활동을 유도함으로서 오라클 시스템
을 동작케하도록 설계되었다. On-chain 스마트 컨트랙트로 배포된 오라클을 통해
소비자(Consumer) 컨트랙트는 원하는 정보를 다수의 리테일러로부터 전달받을
수 있다. 전달받은 데이터들이 값이 서로 다른 경우 이들은 발행자의 Reputation
37
에 따른 가중치를 고려한 제안된 합의 프로토콜을 통해 신뢰할 수 있는 하나의
값으로 결정된다. 정직한 데이터를 제공한 발행자와 리테일러는 오라클을 통해 소
비자가 미리 지불한 금액의 일부를 보상으로 획득할 수 있으며, 그렇지 않은 경우
합당한 패널티를 얻게 되어 향후 데이터 공급시 불이익을 받게 된다.
본 기법은 궁극적으로 데이터 발행자에게 보상을 지급하고 직접적인 블록체인
네트워크 참여 없이도 데이터를 발행할 수 있게 함으로써 기존 시스템들이 보유한
다수의 양질의 데이터가 안전하게 블록체인으로 공급될 수 있도록 설계되었다.
풍부한 데이터의 공급은 데이터 조달에 요구되는 비용의 감소를 통해 블록체인
네트워크와의 원활한 상호 운용성 보장으로 이어질 수 있다. 또한 현재까지 제
안된 기법들이 주로 Whitepaper를 통한 아이디어 중심의 제안이거나 단순 분산
투표를 구현한 집단 지성 플랫폼인 것에 반해, 본 연구에서는 실질적인 데이터를
전달하는 오라클을 이더리움 네트워크에서 구현함으로서 제안된 모델이 실제 운
용 가능하며 합당한 수준의 비용과 처리 시간으로 동작함을 증명하였다. 제안된
기법은 향후 내부 메모리 사용량과 연산량을 최소화하는 최적화 과정을 통해, 최
초의 상용 분산 오라클 모델로서 더욱 실용적인 Dapp들의 구현에 기여할 수 있을
것으로 기대된다.
주요어: 서울대학교, 컴퓨터 공학부, 졸업논문
학번: 2017-21931
38