
Journal of Cloud Computing: Advances, Systems and Applications
Brito et al. Journal of Cloud Computing: Advances, Systems and Applications (2019) 8:19
https://doi.org/10.1186/s13677-019-0141-z

RESEARCH Open Access

Secure end-to-end processing of smart metering data

Andrey Brito1*, Christof Fetzer2, Stefan Köpsell2*, Peter Pietzuch3, Marcelo Pasin4, Pascal Felber4, Keiko Fonseca5, Marcelo Rosa5, Luiz Gomes-Jr.5, Rodrigo Riella6, Charles Prado7, Luiz F. Rust7, Daniel E. Lucani8, Márton Sipos8, László Nagy8 and Marcell Fehér8

    Abstract

Cloud computing considerably reduces the costs of deploying applications through on-demand, automated and fine-granular allocation of resources. Even in private settings, cloud computing platforms enable agile and self-service management, which means that physical resources are shared more efficiently. Nevertheless, using shared infrastructures also creates more opportunities for attacks and data breaches. In this paper, we describe the SecureCloud approach. The SecureCloud project aims to enable confidentiality and integrity of data and applications running in potentially untrusted cloud environments. The project leverages technologies such as Intel SGX, OpenStack and Kubernetes to provide a cloud platform that supports secure applications. In addition, the project provides tools that help generate cloud-native, secure applications and services that can be deployed on potentially untrusted clouds. The results have been validated in a real-world smart grid scenario to enable a data workflow that is protected end-to-end: from the collection of data to the generation of high-level information such as fraud alerts.

    Keywords: Cloud computing, Security, Trusted execution, Smart grids, Privacy, Confidential computing

Introduction
Cloud computing has emerged as the main paradigm for managing and delivering services over the Internet. The automated, elastic and fine-granular provisioning of computing resources reduces the barrier for the deployment of applications for both small and large enterprises. Nevertheless, confidentiality, integrity and availability of applications and their data are of immediate concern to almost all organizations that use cloud computing. This is particularly true for organizations that must comply with strict confidentiality, availability and integrity policies, including society's most critical infrastructures, such as finance, utilities, health care and smart grids.

The SecureCloud project focused on removing technical impediments to dependable cloud computing, overcoming the barriers to a broader adoption of cloud computing. To this end, the project developed technologies that allow secure computing resources to be provisioned quickly, using familiar tools and paradigms, so that meaningful, actionable information can be derived from low-level data in a secure and efficient fashion. With this goal, the SecureCloud project makes use of state-of-the-art technologies such as OpenStack1, Kubernetes2, and Intel SGX [1], producing tools and extensions that enable the deployment of secure applications.

The validation of the developed technologies has been conducted in the general context of smart grids. This application domain, like many others, is affected by the increasing amount of data generated by a large range of device types (e.g., meters and sensors in the distribution or transmission systems) and by the sensitivity of such data (e.g., revealing habits of individual consumers or opening vulnerabilities in the operation of the power system).

*Correspondence: [email protected]; [email protected]
1 Universidade Federal de Campina Grande, Campina Grande, Brazil
2 Technische Universität Dresden, Dresden, Germany
Full list of author information is available at the end of the article

1 https://www.openstack.org
2 https://www.kubernetes.io

© The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.



The project considers several applications in the smart grids domain to demonstrate the feasibility and appropriateness of the SecureCloud platform for big data processing. In this paper we focus on two of them, which are part of a smart metering big data analytics use case:

• Data validation: computes a billing report, applying not only the billing logic, but also considering the completeness of the dataset and verifying the integrity of individual measurements;

• Fraud detection: estimates the likelihood that an individual consumer is involved in fraud.3

The rest of the paper is organized as follows. We introduce the use case scenario in the "Smart metering applications" section and discuss the SecureCloud approach in the following section, "The SecureCloud approach". The "SQL-to-KVS converter – CascaDB" and "Confidential batch job processor – Asperathos" sections discuss the components used in the two applications. The "Related work" section discusses other academic and industry consortia that address similar issues. The "Conclusion" section concludes the paper with some final remarks.

Smart metering applications
The smart metering scenario comprises the communication and data storage structure of Advanced Metering Infrastructure (AMI) systems, from the smart meters up to the Metering Data Collector (MDC), including the metering database. Figure 1 shows the complete communication, processing, and storage structure needed to validate a smart metering management scenario. Initially, this application was deployed using only real smart meters to validate the process. The smart meters communicate with an aggregator module using a mesh network based on a customized ZigBee stack for the transport layer [2]. As the application protocol, we used a modified version of the Brazilian protocol for electronic meter reading, ABNT NBR 14522 [3]. The modifications were necessary to enable the protocol to run over an asynchronous communication system, as well as to add features such as data encryption and commands for load connection and disconnection. With these modifications in the smart meter, all communication steps in the data workflow are encrypted. As shown in the figure, it is also possible to use different key lengths or algorithms.

3 Fraud in electrical metering systems is endemic in some regions of Brazil, being responsible for approximately 27 TWh of energy loss annually. In general, energy theft accounts for 6% of the total distributed energy in Brazil according to the Brazilian Association of Electrical Power Distributors (see http://www.abradee.com.br/setor-de-distribuicao/furto-e-fraude-de-energia/ – in Portuguese).

A smart meter simulator was then added to enable experimenting with an increasing number of connections to the MDC, modeling real communication demands more closely. It also provided flexibility when testing the whole AMI as well as the SecureCloud platform in terms of data volume. This software is able to simulate up to 65,000 smart meters communicating and sending data simultaneously. The simulated data is created based on energy consumption data acquired from real customers.

Secure data communication is required to gather the metering data, since the smart meters are located outside of the SecureCloud platform's scope. Data communication between the smart meters and the MDC, from the physical to the transport layer, must go through two different networks. In the pathway, aggregators work as bridges connecting smart meters to the authentication application in the cloud.

The smart metering management structure is inherently distributed, usually having hundreds of smart meters connected to one aggregator and several aggregators connected to authentication applications. Thus, the structure should enable several parallel authentication microservices to communicate with the aggregators acting as external clients and services.

The confidentiality and integrity of metering data must be protected. Confidentiality is important to maintain customers' privacy. Integrity is needed to protect the power utility against fraud attempts. Therefore, data processing within microservices in the cloud should be secure. The SecureCloud platform leverages SCONE [4] as the runtime for services and applications. As will be detailed in the next section, SCONE guarantees that applications are executed in protected regions of a processor's memory, shielding data and encryption keys even from software with higher privilege levels. These protected memory regions are named enclaves.

Besides the basic link encryption between the different components, e.g., smart meters to authenticator and authenticator to MDC, smart meter data is additionally protected by applying end-to-end encryption and authentication. An RSA key pair and the public key of the MDC are embedded into each smart meter, enabling mutual authentication. To ensure end-to-end security, all ABNT request and response messages are encrypted. Since the MDC application and the database are located within the boundaries of a secured cloud platform, the data is decrypted inside the enclaves in the authentication application and stored using the platform database service. Data is also signed using cryptographic keys provided by INMETRO, the Brazilian National Institute for Metrology, Quality and Technology. These signatures are later used for auditing the bills. Finally, the data is stored in a secure key-value store (KVS) and consumed by applications.
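To give a flavour of this sign-then-encrypt flow, the sketch below shows how a single reading could be protected before transmission. It is a minimal illustration, not the actual meter firmware or the ABNT message format: the key objects, field names and the use of plain RSA-OAEP (rather than a hybrid scheme) are assumptions made for brevity.

```python
# Minimal sketch (not the project's meter firmware): a reading is signed with
# the meter's private RSA key and encrypted with the MDC's public RSA key, so
# only the MDC can read it and its origin can be verified inside the enclave.
import json

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Hypothetical keys; in a real deployment the meter key pair and the MDC
# public key are embedded in the meter. A production system would also use
# hybrid (RSA + symmetric) encryption for larger payloads.
meter_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
mdc_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

def protect_reading(reading: dict) -> dict:
    payload = json.dumps(reading).encode()
    signature = meter_key.sign(
        payload,
        padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                    salt_length=padding.PSS.MAX_LENGTH),
        hashes.SHA256(),
    )
    ciphertext = mdc_key.public_key().encrypt(
        payload,
        padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None),
    )
    return {"ciphertext": ciphertext, "signature": signature}

protected = protect_reading({"meter_id": "A1", "kwh": 3.2, "ts": "2019-01-01T00:15"})
```

The corresponding decryption and signature verification happen only inside the enclaves of the authentication application.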



    Fig. 1 Smart metering management implementation structure

Two applications that consume the data are considered here. The first application is for bill auditing and validation. It is composed of two main parts: one is under the control of the distribution company and does the billing, producing a bill and a hash value; the other is controlled by INMETRO and, based on the hash, can check whether the bill considered all the measurements properly (e.g., no values were duplicated or missing).

The second application is for fraud detection. It processes all the measurements from consumers, considering also other demographic information (e.g., information specific to the individual consumer or to the type of industry), and generates a fraud risk evaluation. In particular, fraud attempts by large customers (i.e., with voltage supply greater than 1 kV) are typically much more sophisticated than frauds by residential customers. These customers have very high energy costs and, therefore, successful fraud attempts yield large economic gains, fostering more sophisticated frauds.

The main idea of the fraud detection application is to use the stored measurements to detect variations in the customer's consumption behavior. Measurements, named phasor reports, are acquired every 15 minutes and contain data such as line voltage, current, angle, three-phase power, power factor, harmonic distortion, frequency, demand, and energy consumption. The basic concept behind the analysis is the processing of the metering data by a classifier, based on self-organizing maps, in order to classify the behavior of consumption data. After the classification, the correlation coefficient matrix of the customer with the regular behavior is calculated, using a fixed time window of one day. The results are used to create a ranking of potentially fraudulent consumer units.
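The following sketch illustrates the classification and ranking steps just described. It is not the project's actual fraud-detection code: the SOM library (the third-party MiniSom package), the grid size, the synthetic daily profiles and the ranking heuristic are all assumptions made for illustration.

```python
# Illustrative sketch of SOM-based classification of daily consumption
# profiles followed by correlation-based ranking (assumed data and parameters).
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(42)
profiles = rng.random((500, 96))          # hypothetical normalized daily profiles (96 x 15-min reports)
customer_ids = np.arange(500) % 50        # hypothetical mapping: profile row -> customer

som = MiniSom(6, 6, input_len=96, sigma=1.0, learning_rate=0.5, random_seed=1)
som.train_random(profiles, num_iteration=5000)

# "Regular behavior" of each SOM cell: the average of the profiles mapped to it.
cells = [som.winner(p) for p in profiles]
regular = {c: profiles[[i for i, w in enumerate(cells) if w == c]].mean(axis=0)
           for c in set(cells)}

# Score each day by its correlation with the regular behavior of its cell;
# customers whose days are consistently uncorrelated rise in the suspicion ranking.
scores = np.array([np.corrcoef(p, regular[c])[0, 1] for p, c in zip(profiles, cells)])
ranking = sorted(set(customer_ids), key=lambda cid: scores[customer_ids == cid].mean())
```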

Both applications discussed above work as batch processes with the following tasks:

• Periodic, independent tasks to analyze individual consumers;

• A task for billing validation, which requires only the measurements from a single customer, and whose result is a validated billing report;

• A task for fraud detection, which may require additional information from other records or tables and may issue additional queries to the database; the result is a fraud risk analysis report.

The SecureCloud approach
The SecureCloud approach is depicted in Fig. 2. Services are divided into infrastructure services, which control the cloud resources, and platform services, which implement higher-level services used in the development of applications. Applications can be developed in a multitude of programming languages. Simple applications may use no platform services, while others may need support for secure indirect communication or scalable storage. Services and applications execute over a runtime, which facilitates the usage of hardware security features. These layers are described in the next sections.

SecureCloud runtime
Securing data by guaranteeing confidentiality and integrity is extremely desirable to protect sensitive smart grid data from malicious attacks. There are currently many cloud provider-specific approaches that provide some level of confidentiality and integrity. Nevertheless, they lack the capability to prevent application data from being compromised by software with higher privilege levels, such as hypervisors [5, 6].

Intel's Software Guard eXtensions (SGX) [1, 7] is becoming increasingly popular. It is a hardware-based technology that guarantees data integrity and confidentiality, creating a Trusted Execution Environment (TEE) that protects the code even in cases where the operating system and the hypervisor are not to be trusted. SGX operates with a modified memory engine, generating protected areas named enclaves [8]. To provide integrity capabilities, SGX also offers local and remote attestation features [7], with which a third party can ensure that only the expected code is being executed inside an enclave, and on an authentic SGX-capable server.

Microservices in the SecureCloud platform can be developed either by directly using the Intel SDK [1] or by using the SCONE runtime [4]. In our case, the recommendation is to use SCONE for SecureCloud application development, i.e., for SecureCloud microservices that are closely related to a given SecureCloud application. Indeed, SCONE provides better usability and relieves the developers from much of the complexity of SGX programming. The Intel SDK has been used for developing certain core microservices that are part of the SecureCloud platform itself, such as those related to storage and communication. In this case, the flexibility of lower-level programming enables optimizing performance and resource consumption at the cost of much greater development complexity.

SCONE provides the necessary runtime environment – based on the musl library4 – to execute a microservice within an SGX enclave. SCONE is a set of tools that provide the necessary support for compiling and packaging Docker containers from the microservice's source code. The SecureCloud runtime supports several programming languages, notably C, C++, Go, Fortran, Lua, Python, R, and Rust.

Finally, because of its availability and security guarantees, we currently focus on Intel SGX. Nevertheless, the SCONE approach abstracts most security concerns and, thus, other trusted execution environments could eventually be supported. To illustrate the performance cost of using SCONE with Python, currently the most popular programming language, we have selected a variety of benchmarks from the Python community5 and created an interactive portal with the PySpeed tool to make it easier to compare the performance of different execution alternatives6. A subset of the benchmarks is shown in Fig. 3. Four bars are shown for each benchmark, illustrating the slowdown or speedup when using two versions of Python (the default CPython interpreter and PyPy, an alternative, more efficient implementation7), with or without SCONE. The first bar in each benchmark depicts the execution with PyPy, but without the SCONE runtime; the second shows the execution with PyPy and SCONE; the third, which is the reference value, corresponds to the execution with the default Python and without SCONE; finally, the fourth bar depicts the execution with the default Python interpreter and SCONE. In most cases, the overhead of using SCONE is small and can be mitigated by using a more efficient implementation of Python, such as PyPy.

Our two applications have been developed with the SCONE toolset: the Data Validation application is written in C and compiled with SCONE, while the Fraud Detection application has been developed in Python and runs on top of a Python interpreter compiled with SCONE. Both use infrastructure and platform services developed in different languages, as detailed next.

Infrastructure services
The lower-level services in the SecureCloud ecosystem are named SecureCloud infrastructure services and comprise the services necessary to deploy and execute SecureCloud microservices. Examples of infrastructure services are scheduling and orchestration, attestation, auditing and monitoring.

Two platforms, OpenStack and Kubernetes, are the cornerstones of the SecureCloud infrastructure services and have been selected because of their popularity and the features they provide.

4 https://www.musl-libc.org/
5 https://speed.pypy.org/about/
6 http://pyspeed.lsd.ufcg.edu.br
7 https://pypy.org/



    Fig. 2 Overview of the SecureCloud platform and its components

OpenStack is an open-source IaaS platform that can be used to deploy private, public, and hybrid clouds. Recent user surveys among OpenStack [9] and Kubernetes users [10] report that Kubernetes is the most popular tool for using containers in the cloud, used by 50% of deployments. The surveys also highlight that Kubernetes leads the usage for orchestration in production environments with 32%, followed by Docker Swarm (10%) and Apache Mesos (6%).

In Fig. 2, three services are relevant for the applications described in this paper. The Monitoring Service is based on the upstream OpenStack Monasca system.8 In our case, we simply add a few metric collection agents that expose metrics related to the usage of secure resources, for example, SGX Enclave Page Cache (EPC) usage. This metric is useful because over-commitment of EPC memory causes considerable performance degradation. In addition, other relevant metrics for the monitoring of secure applications have been added and open sourced9.

Once applications are running inside an enclave, the next issue is to provide them with sensitive configuration, such as secrets. This is a known hassle of Intel SGX, as it requires the development and operation of a separate piece of software to remotely attest the original applications before passing them the secrets. Both our demonstration applications require secrets, especially certificates for authenticating themselves and the other microservices they use. The Configuration and Attestation Service (CAS) is responsible for the attestation and trust management, abstracting the problem of verifying whether the applications that are actually running have the code hashes registered during development and validation.

8 http://monasca.io/
9 https://github.com/christoffetzer/linux-sgx-driver

These code hashes are computed upon application instantiation and serve as a signature for that application or microservice. If the signature of the running application matches a previously registered signature, the application is granted access to secrets, such as database credentials, TLS certificates and configuration parameters.
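The attestation-gated release of secrets can be pictured as the small decision function below. This is a conceptual model only, not the CAS API: the real service performs SGX remote attestation of the enclave, whereas here the measurement is just a hex string and the registry is an in-memory dictionary with hypothetical names.

```python
# Conceptual model of attestation-gated secret release (not the actual CAS API).
import hmac

# Hypothetical registry populated at development/validation time:
# expected enclave measurement -> secrets the service may receive.
REGISTERED = {
    "9f2c0e...hypothetical-data-validation-hash": {
        "db_password": "s3cret",
        "tls_cert": "-----BEGIN CERTIFICATE-----...",
    },
}

def release_secrets(reported_measurement: str) -> dict:
    """Return secrets only if the attested measurement matches a registered one."""
    for expected, secrets in REGISTERED.items():
        # constant-time comparison to avoid leaking near-matches
        if hmac.compare_digest(expected, reported_measurement):
            return secrets
    raise PermissionError("unknown or modified application; no secrets released")
```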

Lastly, the Scheduling and Orchestration Service provides functionalities spread over a set of low-level services. For example, we modify OpenStack Nova so that metadata on the flavor is passed to a special hypervisor that is able to instantiate VMs with SGX access. This special hypervisor is a fork of the popular KVM hypervisor and is maintained by Intel10. For cloud environments without KVM, an alternative is to use bare-metal instances (e.g., using OpenStack Ironic11) or infrastructure containers (e.g., LXD12), which can directly access the non-virtualized SGX device. In addition, the OpenStack Magnum component instantiates Kubernetes clusters on demand, clusters which now need to be aware of SGX.

Platform services
The higher-level services are named SecureCloud platform services. These services offer functionality that can be used by applications. More specifically, the SecureCloud platform services offer different types of storage services, such as a secure key-value and object store, and an SQL adapter on top of it. For communication, the SecureCloud platform services offer point-to-point communication as well as many-to-many communication. The latter, named Secure Content-Based Routing (SCBR) [11], is based on the publish/subscribe model.

Additionally, data processing services are also offered. One example is the usage of the MapReduce paradigm for secure data processing [12]. Another example is a set of modifications made to Apache Spark that enable encrypted data to be processed such that decryption is done only inside enclaves.

10 https://github.com/intel/kvm-sgx
11 https://wiki.openstack.org/wiki/Ironic
12 https://linuxcontainers.org/lxd/getting-started-openstack/



    Fig. 3 Performance overhead of common Python benchmarks running with SCONE

Three platform services are especially important for the data validation and fraud detection applications presented in this paper: the secure key-value store (Secure KVS), the SQL-to-KVS converter (CascaDB), and the batch job processor (Asperathos). These services are detailed next.

Secure KVS
We have implemented a distributed Secure Key-Value Store (KVS) that exhibits a trade-off between the features required by big data applications and the complexity, performance, and security impact of individual features. The rationale for our feature set is described in [13]. As a result, we managed to implement a reliable, scalable, and secure system that satisfies our use case requirements. For example, both the Billing Validation and the Fraud Detection applications have a data aggregation phase in which large blobs of measurement sets for individual consumers are retrieved and analysed, while being kept confidential in all steps from storage to processing, even on top of standard storage volumes provided by the cloud infrastructure.

Fig. 4 Secure KVS: architecture overview

The KVS provides scalability by being itself a set of microservices that run on top of Kubernetes. As seen in Fig. 4, client applications communicate with the KVS via HTTPS connections, using their unique API keys to authenticate themselves. In order to protect the confidentiality and authenticity of objects stored in the KVS, a microservice was added in front of our REST API. We refer to it as the TLS interceptor; it is based on TaLoS [14], our library for building applications that support TLS connection termination inside enclaves. This interception service intermediates the connection between applications and storage components, handling the HTTPS requests inside the enclaves and encrypting the data that is written to the KVS. Likewise, it intercepts and decrypts HTTPS responses containing object data before they are returned to the client. The termination of TLS connections and the encryption/decryption operations are performed inside an SGX enclave, thus cleartext object data is never exposed outside of an enclave. The other microservices that make up the KVS handle only encrypted object data and have no access to the decryption keys. After the TLS interceptor component, the request is forwarded to a well-defined REST API. The request is handled in the KVS microservices according to the target storage policy, i.e., redundancy is added before it gets stored on a variable number of distributed persistent storage nodes. Positioning the TLS interceptor as the gateway of the KVS removes the need to provide trusted computation capabilities in the other components of our system. Thus, the other KVS microservices do not need to run on SGX hardware.
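From a client's perspective, the interaction reduces to authenticated HTTPS calls, as in the sketch below. The endpoint, header name and URL scheme are illustrative assumptions rather than the project's actual interface; the point is that the client only speaks HTTPS with an API key, while all cryptography happens behind the gateway, inside enclaves.

```python
# Hypothetical client-side view of the Secure KVS REST API (names assumed).
import requests

KVS_URL = "https://kvs.example.org"        # assumed gateway (TLS interceptor)
HEADERS = {"X-Api-Key": "customer-42-key"} # assumed per-client API key header

def put_object(bucket: str, key: str, data: bytes) -> None:
    r = requests.put(f"{KVS_URL}/{bucket}/{key}", data=data,
                     headers=HEADERS, timeout=10)
    r.raise_for_status()

def get_object(bucket: str, key: str) -> bytes:
    r = requests.get(f"{KVS_URL}/{bucket}/{key}", headers=HEADERS, timeout=10)
    r.raise_for_status()
    return r.content

# Example: store one protected measurement blob and read it back.
put_object("measurements", "meter-A1/2019-01-01", b"<encrypted blob>")
blob = get_object("measurements", "meter-A1/2019-01-01")
```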

This secure KVS provides reliability and redundancy by distributing data onto multiple storage nodes using replication, erasure coding, or a combination of the two. Any combination of replication and erasure coding can be specified to achieve the optimal trade-off between storage cost, throughput, and reliability. A frequently used object (hot data) can be stored with multiple replicas. An infrequently used object (warm data) could have one exact replica and eight erasure-coded fragments, of which only six are necessary for recovering a copy of the object. An archival object (cold data) can be stored simply by using erasure coding. Additionally, the policy can be changed as object access patterns change.

As seen in Fig. 5, replication has much higher storage requirements than erasure coding. However, it has a higher throughput for storing or accessing data, as it does not require encoding or decoding steps, respectively. Hybrid policies are in between the two approaches in storage and throughput; thus, they provide a natural trade-off: data can be accessed from the replica as often as possible, while storage costs are reduced.
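A back-of-the-envelope comparison makes the trade-off concrete. The parameters below are assumptions chosen to match the warm-data example above (eight coded fragments, six of which suffice) plus an assumed three replicas for hot data.

```python
# Raw storage overhead (stored bytes / object bytes) of the three policy types.
def replication_overhead(replicas: int) -> float:
    return float(replicas)                      # n full copies

def erasure_overhead(n_fragments: int, k_needed: int) -> float:
    return n_fragments / k_needed               # e.g. 8/6 ~= 1.33x

def hybrid_overhead(replicas: int, n_fragments: int, k_needed: int) -> float:
    return replicas + n_fragments / k_needed    # full copies plus coded fragments

print("hot (3 replicas):          %.2fx" % replication_overhead(3))   # 3.00x
print("cold (8,6 erasure coding): %.2fx" % erasure_overhead(8, 6))    # 1.33x
print("warm (1 replica + 8,6):    %.2fx" % hybrid_overhead(1, 8, 6))  # 2.33x
```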

Lastly, we deployed the KVS to an OpenStack cluster, using the infrastructure services discussed earlier, and measured the read and write operations per second for various data sizes. The results can be seen in Fig. 6. We used Kubernetes to orchestrate the microservices. The benchmarks used 32 distinct Python client processes to execute the read and write operations, ensuring that enough load is placed on the system. There were 8 instances of the TLS interceptor component running on two different nodes, each with its own 8-core SGX-enabled virtual CPU (vCPU) and 16 GB of RAM. Incoming requests were load-balanced between the two. These were the only nodes in the deployment to utilize SGX hardware. The microservices in charge of accessing the underlying storage subsystem ran on 6 nodes with a single vCPU core and 2 GB of RAM each. These storage nodes used Kubernetes' persistent volumes feature to ensure they always accessed the correct storage volume. The volumes were mapped to the underlying physical storage devices using OpenStack Cinder with a Ceph backend13; both required no modification, as the data was already encrypted. All other KVS microservices were scheduled to run on a single node with an 8-core vCPU and 16 GB of RAM. Additional details on the experiments can be found in [13].

13 https://ceph.io

Fig. 5 Secure KVS: comparing the raw storage requirements of different data protection policies, namely replication, erasure coding and hybrid policies

SQL-to-KVS converter – CascaDB
Many applications need to handle high volumes of data, be it a simple information system that needs to persist customer data or a complex application that needs to persist low-level data for posterior aggregation. Using a database factors out much of the complexity of managing the data. The two applications discussed in this paper are no exception.

When choosing a database, one initial decision a developer needs to make regards the trade-off between scalability and feature set. On the one hand, key-value storage systems are easier to scale both in performance and capacity. On the other hand, relational systems offer a much richer set of features and count on the experience of developers with SQL semantics.

Not surprisingly, power grids, like other critical systems, have many legacy applications. Rewriting applications is a major obstacle that needs to be overcome when migrating the applications to the SecureCloud platform. In the SecureCloud project, we take a hybrid, layered approach. First, the core of existing applications does not need to be modified to run in enclaves. As detailed in previous sections, the SCONE toolset enables applications written in languages such as C/C++ and Fortran to be compiled to run inside enclaves and be remotely attested.

Next, we understand that managing big data requires systems that scale massively. In addition, and maybe even more importantly, making these systems fault-tolerant and secure requires careful thinking about which features will be implemented. We used the distributed secure KVS described in the "Secure KVS" section and built an SQL mapping engine that runs inside enclaves, supporting the basic queries needed by our applications.

Thus, on top of the KVS, the Customizable Adapter for Secure Cloud Applications (CascaDB) converts SQL statements into key-value accesses transparently. It offers TLS connections for communication and runs inside SGX enclaves to enable protection of sensitive data (including the SQL statements themselves). Figure 7 depicts the approach adopted for CascaDB's implementation. On the top left-hand side, the querying service (e.g., the MDC) accesses CascaDB. On the top right-hand side, the KVS implements the actual storage. Using such an approach, legacy systems used to insert measurements in the databases require no modifications, while new analysis applications, such as new implementations of the Data Validation and Fraud Detection applications, can be written to access data directly in the Secure KVS to perform big data analysis more efficiently.

CascaDB receives SQL statements and returns JSON results after transforming these data requests into key-value ones. It was built with a modular architecture: independent modules are easier to fit into the enclave's limited memory and can be replicated for scalability.

    Fig. 6 Secure KVS: read and write operations for different object sizes


    Fig. 7 CascaDB approach

The SQL translation engine handles SQL queries and interacts with the underlying KVS module to answer requests. Figure 8 depicts the internal parts of the SQL translation module and its interactions with the KVS. Data definition (DDL) queries are handled by the parser and have their settings stored in the data dictionary. The schema and keys define how each tuple will later be mapped to the key-value model.

When a data manipulation (DML) query is received, a dictionary lookup is performed to determine the mappings between the models. For insertion queries, a key is composed by concatenating the table name and the primary key(s) (a predefined separator is used for readability). Tuple attributes are also concatenated to be stored as the value. Figure 9a depicts an insert query for a table named reading that has the first two attributes as primary keys. Figure 9b depicts the resulting key-value pair. The key-value pair is then submitted to the KVS. For selection queries, the key is assembled from the table name and the keys presented in the query, potentially with a wildcard character (*) to retrieve all keys matching the prefix. Currently only queries with conditions over the primary keys are supported. More flexible selects and joins are upcoming developments.
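A minimal sketch of this insert-to-key-value mapping is shown below; it is not CascaDB's actual code. The separator character, the column names of the hypothetical reading table and the select-prefix helper are assumptions used only to make the mapping rule concrete.

```python
# Sketch: a row of a hypothetical `reading` table, whose first two attributes
# (clientID, timestamp) form the primary key, becomes one key-value pair:
# key = table name + primary keys, value = the remaining columns.
SEP = "|"  # assumed separator

def insert_to_kv(table: str, pk_cols: list, row: dict) -> tuple:
    key = SEP.join([table] + [str(row[c]) for c in pk_cols])
    value = SEP.join(str(v) for c, v in row.items() if c not in pk_cols)
    return key, value

def select_prefix(table: str, pk_values: list) -> str:
    # Prefix used to fetch all matching keys from the KVS (wildcard select).
    return SEP.join([table] + [str(v) for v in pk_values]) + SEP

row = {"clientID": 4711, "timestamp": "2019-01-01T00:15", "voltage": 127.1, "kwh": 0.42}
print(insert_to_kv("reading", ["clientID", "timestamp"], row))
# ('reading|4711|2019-01-01T00:15', '127.1|0.42')
print(select_prefix("reading", [4711]))
# 'reading|4711|'
```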

For example, a wider coverage of the SQL language can be implemented with extra indexes for non-key attributes, stored in the key-value store. Joins on the primary keys can be implemented efficiently, since the KVS returns keys in alphabetical order, enabling the use of a merge-join algorithm. Other joins can be implemented using the aforementioned extra indexes, or full table scans when indexes are not available.

Confidential batch job processor – Asperathos
The Asperathos framework14 provides tools to facilitate the deployment and control of applications running in cloud environments. For example, Asperathos can provide quality of service (QoS) by controlling the resources allocated during runtime. In contrast to other orchestration tools, such as Kubernetes itself or OpenStack Heat, it can be configured to consider application-specific metrics and to actuate in a customized fashion. For batch jobs using enclaves for data protection, as is the case for the two applications considered here, this is even more relevant, as the operator may want to add the usage of enclave memory into the control logic [15].

The architecture of Asperathos is depicted in Fig. 10. It is composed of three main modules: (i) the Manager is the entry point for the user and is responsible for receiving an application submission, triggering all other steps; (ii) the Monitor is responsible for gathering, transforming and publishing metrics collected from applications (e.g., the application progress) or environment resources (e.g., CPU usage); (iii) the Controller is the component that adjusts the amount of allocated resources dedicated to an application.

Each of the components above can be customized with plugins. For example, in a deployment with the Secure KVS serving several applications and the Data Validation application executing analysis of the user billing data, the user may want custom plugins in the Monitor and in the Controller so that whenever the Secure KVS scales up its TLS interceptors, the Data Validation batch workers are scaled down to free up SGX resources.
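The sketch below conveys the flavour of such a control loop. It is not the Asperathos plugin API; it only illustrates how a Monitor-style metric (job progress versus expected progress) can drive a Controller-style actuation, with the metric source, tolerance and scaling callback all being assumptions.

```python
# Illustrative progress-based control loop (not the actual Asperathos plugins).
import time

def expected_progress(start: float, deadline: float, now: float) -> float:
    return min(1.0, (now - start) / (deadline - start))

def control_loop(get_progress, scale_workers, start, deadline,
                 tolerance=0.05, check_period=30):
    """Request one more worker whenever the job lags behind schedule by more
    than `tolerance`; `get_progress` and `scale_workers` are injected callables."""
    while True:
        now = time.time()
        lag = expected_progress(start, deadline, now) - get_progress()
        if lag > tolerance:
            scale_workers(+1)   # Controller actuation: one more replica
        if get_progress() >= 1.0 or now >= deadline:
            break
        time.sleep(check_period)
```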

While both applications require some batch processing in addition to the storage and communication services, diversifying the number of infrastructure services can increase the operation burden. Therefore, we leveraged Kubernetes also for the orchestration of legacy batch jobs.

14 https://github.com/ufcg-lsd/asperathos



    Fig. 8 Service architecture

As discussed in the "Smart metering applications" section, both the Data Validation and Fraud Detection applications discussed in this paper work as a set of independent tasks. This pattern is easily mapped onto Kubernetes with the help of the Kubernetes Job object15. The job abstraction provides the ability to run finite workloads by creating a set of containers (i.e., Pods, in Kubernetes terminology) and guaranteeing that a specified number of them successfully terminate. It is also possible to specify parallelism, time-out deadlines, and basic retrial policies, turning it into an option for running batch workloads while benefiting from other Kubernetes features and tools. We have implemented three plugins to enable the execution of batch processing tasks using Kubernetes Jobs, as well as the logic to monitor and actuate (e.g., scaling the cluster) through the Kubernetes APIs.
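As a concrete illustration of this mapping, the snippet below creates such a Job with the official Python Kubernetes client. The image name, namespace and sizing values are hypothetical placeholders; in the real deployment the Job is created by the Asperathos plugins rather than by hand.

```python
# Sketch: submitting a batch of fraud-detection tasks as a Kubernetes Job
# (image, namespace and sizes are hypothetical placeholders).
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running in-cluster

job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="fraud-detection-batch"),
    spec=client.V1JobSpec(
        completions=50,                # analyze 50 consumers
        parallelism=5,                 # at most 5 pods (enclaves) at a time
        backoff_limit=3,               # basic retrial policy
        active_deadline_seconds=3600,  # time-out for the whole batch
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[client.V1Container(
                    name="worker",
                    image="registry.example.org/fraud-detection:latest",
                )],
            )
        ),
    ),
)
client.BatchV1Api().create_namespaced_job(namespace="securecloud", body=job)
```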

To properly orchestrate secure containers on standard cloud clusters, Kubernetes needs to deal with the infrastructure's SGX capabilities. As discussed above, containers requiring SGX will contend on the availability of enclave memory. The monitoring infrastructure that feeds Kubernetes' scheduler with resource metrics must keep track of enclave memory requests and allocate the containers accordingly.

We then developed a vertical implementation of an SGX-aware architecture for orchestrating containers inside Kubernetes clusters. Using a slightly customized SGX driver, we have Linux report enclave memory usage per container and enforce limits on its allocation, providing Kubernetes with a new device plug-in.

15 https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/

Fig. 9 Example of an insert query (a) mapping into a key-value pair (b). clientID and timestamp compose the primary key of the table and become part of the mapped key. The table has 30 attributes that were omitted for clarity



    Fig. 10 Asperathos architecture

We also implemented a new Kubernetes scheduler that takes into account the amount of enclave memory required by the containers and the amount of enclave memory available in the cluster nodes, preventing over-allocations, which may abruptly decrease performance by orders of magnitude [4]. Our implementation has been used to map container jobs with security requirements preferentially onto SGX machines [16] and to demonstrate that appropriate management of enclave memory can reduce their overall turnaround time.
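The placement decision itself reduces to a filter over candidate nodes, sketched below. The numbers and data structures are illustrative; in the real implementation the per-node EPC capacity and per-container EPC requests come from the modified driver and the device plug-in.

```python
# Illustrative EPC-aware placement filter (not the actual scheduler code).
# A pod that requests enclave memory may only land on a node whose remaining
# EPC can hold it; otherwise enclave paging degrades performance drastically.
def fits(node_epc_free_mib: int, pod_epc_request_mib: int) -> bool:
    return pod_epc_request_mib <= node_epc_free_mib

def feasible_nodes(nodes: dict, pod_epc_request_mib: int) -> list:
    """`nodes` maps node name -> free EPC in MiB (0 for non-SGX nodes)."""
    return [n for n, free in nodes.items() if fits(free, pod_epc_request_mib)]

cluster = {"sgx-node-1": 64, "sgx-node-2": 24, "plain-node-1": 0}
print(feasible_nodes(cluster, 32))   # ['sgx-node-1']
```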

To illustrate the Asperathos framework in execution, Fig. 11 presents a view of a Grafana dashboard.16 Three different executions are shown, each with a different tolerance value regarding how late an execution can be before actuation is triggered. As expected, less tolerance to deviations implies more actuation events on the infrastructure.

Related work
The use of Trusted Execution Environments (TEEs) in cloud computing scenarios has been growing recently, as they can help address confidentiality and integrity challenges when delegating critical processing and sensitive data to the cloud. As an example of other initiatives, SERECA [17] has investigated the usage of SGX enclaves to harden microservices following a reactive paradigm based on the Vert.x toolkit17. Fetzer et al. illustrate the usage of this framework with two use cases: a water management system, which has an integrity requirement, and a cloud-based service for performance analysis of applications, which has a confidentiality requirement.

16 https://grafana.com/
17 http://vertx.io/

Another research initiative using TEEs in cloud computing is the ATMOSPHERE project [18]. In this case, the consortium addresses the more general problem of trustworthiness in the health domain. In their case, in addition to confidentiality and integrity, the TEEs are used to support other trustworthiness features such as privacy (by executing anonymization algorithms in enclaves) and transparency (by using enclaves to enforce that accesses to sensitive data are logged and can be tracked).

Finally, on the industry side, the recent Confidential Computing Consortium, announced by the Linux Foundation18, is a cross-industry initiative aiming to speed up the adoption of confidential computing mechanisms. The initiative includes producers of alternatives to the SCONE toolkit used in this paper to port applications to run on SGX enclaves: Microsoft's Open Enclave SDK19 and Google's Asylo20.

Conclusion
This paper presents the SecureCloud approach for implementing secure data processing for big data applications. The SecureCloud approach is divided into runtime, infrastructure, and platform services. The infrastructure services provide the mechanisms for orchestrating basic resources such as VMs, containers, clusters, and secrets (including both the attestation of applications and the provisioning of the secrets). On top of the infrastructure services, platform services have been provided to support the application developer. Examples of platform services are the secure content-based publish/subscribe service (SCBR) and the secure key-value store (Secure KVS). For the applications, end users with limited security knowledge are advised to build applications on top of the SCONE runtime, having the option of using compilers (e.g., for C or Fortran) or interpreters (e.g., Python). Such an approach relieves the developer from handling non-trivial tasks, such as having to split code between secure and insecure portions or having to build ad hoc helper tools for attestation and secret provisioning. Leveraging a richer runtime also enables the user to inherit benefits from approaches that enhance security, such as side-channel protection mechanisms [19].

For the development of the infrastructure and platform services themselves, we used a mixed approach. Some services were built on top of the Intel SGX SDK to enable more flexibility in splitting the code into secure and insecure parts, leading to less resource usage at the cost of much higher complexity. Other services reuse existing code and therefore are simply adapted and recompiled using the SCONE toolset. Details on individual components, including performance evaluations, can be found on the project's website21.

18 https://confidentialcomputing.io
19 https://openenclave.io
20 https://asylo.dev/



    Fig. 11 Selected executions in the Grafana dashboard with tolerances set to 0%, 5%, and 10%

Finally, two applications in the context of smart grids were used to validate the proposed tools: a data validation and a fraud detection application. These applications have been used to illustrate the need for some of the infrastructure and platform services, for example: bootstrapping applications requires trust management and secret distribution, achieved through the Configuration and Attestation Service (CAS); big data processing jobs need support for both legacy SQL queries and blob accesses, achieved through CascaDB and the Secure KVS; and applications need to be orchestrated with popular tools while still using the currently very limited enclave memory efficiently, achieved using OpenStack, Kubernetes and Asperathos.

Acknowledgements
The SecureCloud Project was funded by the Brazilian Ministry of Science, Technology and Communications, the European Commission and the Swiss State Secretariat for Education, Research and Innovation through the Horizon 2020 Program, in the 3rd Brazil-Europe coordinated call. We also thank the UCC Cloud Challenge 2018 organizers and the anonymous reviewers for their feedback. Open Access funding by the Publication Fund of the TU Dresden.

Authors' contributions
As the paper describes an end-to-end approach for a complex problem, the authors' contributions encompass several different areas, as mentioned below.

21 https://www.securecloudproject.eu

AB is the lead architect of the OpenStack integration and of the implementation of the Asperathos framework. CF is the lead architect for the SCONE toolset. SK was responsible for the overall architecture definition and integration. PP is the lead architect of the TaLoS toolset. MP is the lead architect of the integration of the SGX scheduler into Kubernetes. PF is the lead architect for the SCBR message bus. KF was responsible for the integration of the applications. MR was mainly responsible for the evaluation of the applications. LG-Jr is the lead architect of CascaDB. RR represented the power grid client companies and contributed with the load generator and application requirements. CP was responsible for the development of the data validation application. LFR was the lead architect for the data validation application. DEL was the lead architect for the Secure KVS service. MS was responsible for the development of the Secure KVS service. LN was responsible for the operation and evaluation of the Secure KVS service. MF was responsible for the initial implementation of the Secure KVS service. All authors read and approved the final manuscript.

Availability of data and materials
The data used to feed the experiments in the project contain personal information and cannot be disclosed. Nevertheless, the actual data used in the processing does not affect the results of the experiments presented. More information about the tools used in the paper can be found on the project webpage: www.securecloudproject.eu.

Competing interests
The authors declare that they have no competing interests.

Author details
1Universidade Federal de Campina Grande, Campina Grande, Brazil. 2Technische Universität Dresden, Dresden, Germany. 3Imperial College London, London, United Kingdom. 4University of Neuchâtel, Neuchâtel, Switzerland. 5Universidade Tecnológica Federal do Paraná, Curitiba, Brazil. 6LACTEC, Curitiba, Brazil. 7INMETRO, Rio de Janeiro, Brazil. 8Chocolate Cloud ApS, Aalborg, Denmark.

    Received: 6 May 2019 Accepted: 15 October 2019



References
1. Intel (2015) Intel Software Guard Extensions. Cryptology ePrint Archive, Report 2016/086. https://software.intel.com/sites/default/
2. ZigBee Standards Organization (2012) ZigBee Specification 053474r20. Rev. 20
3. Associação Brasileira de Normas Técnicas (ABNT) (2008) Intercâmbio de informações para sistemas de medição de energia elétrica. Rev. 1
4. Arnautov S, Trach B, Gregor F, Knauth T, Martin A, Priebe C, Lind J, Muthukumaran D, O'Keeffe D, Stillwell ML, Goltzsche D, Eyers D, Kapitza R, Pietzuch P, Fetzer C (2016) SCONE: Secure Linux containers with Intel SGX. In: Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, OSDI '16. USENIX Association, Berkeley. pp 689–703. http://dl.acm.org/citation.cfm?id=3026877.3026930
5. Szefer J, Keller E, Lee RB, Rexford J (2011) Eliminating the hypervisor attack surface for a more secure cloud. In: Proceedings of the 18th ACM Conference on Computer and Communications Security, CCS '11. pp 401–412. https://doi.org/10.1145/2046707.2046754
6. Subashini S, Kavitha V (2011) Review: A survey on security issues in service delivery models of cloud computing. J Netw Comput Appl 34(1):1–11
7. Costan V, Devadas S (2016) Intel SGX Explained. Cryptology ePrint Archive, Report 2016/086. http://eprint.iacr.org/2016/086
8. McKeen F, Alexandrovich I, Berenzon A, Rozas CV, Shafi H, Shanbhogue V, Savagaonkar UR (2013) Innovative instructions and software model for isolated execution. In: Proceedings of the 2nd International Workshop on Hardware and Architectural Support for Security and Privacy, HASP '13. ACM, New York. pp 10:1–10:1. https://doi.org/10.1145/2487726.2488368
9. OpenStack Foundation. OpenStack User Survey November 2017. https://www.openstack.org/assets/survey/OpenStack-User-Survey-Nov17.pdf. Accessed 1 Apr 2019
10. Cloud Native Computing Foundation (2017) Cloud Native Technologies Are Scaling Production Applications. https://www.cncf.io/blog/2017/12/06/cloud-native-technologies-scaling-production-applications/. Accessed 1 Apr 2019
11. Pires R, Pasin M, Felber P, Fetzer C (2016) Secure content-based routing using Intel Software Guard Extensions. In: Proceedings of the 17th International Middleware Conference, Middleware '16. ACM, New York. pp 10:1–10:10. https://doi.org/10.1145/2988336.2988346
12. Pires R, Gavril D, Felber P, Onica E, Pasin M (2017) A lightweight MapReduce framework for secure processing with SGX. In: Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid '17. IEEE Press, Piscataway. pp 1100–1107. https://doi.org/10.1109/CCGRID.2017.129
13. Lucani DE, Feher M, Fonseca K, Rosa M, Despotov B (2018) Secure and scalable key value storage for managing big data in smart cities using Intel SGX. In: 2018 IEEE International Conference on Smart Cloud (SmartCloud). pp 70–76. https://doi.org/10.1109/SmartCloud.2018.00020
14. Aublin P-L, Kelbert F, O'Keeffe D, Muthukumaran D, Priebe C, Lind J, Krahn R, Fetzer C, Eyers D, Pietzuch P (2018) LibSEAL: Revealing service integrity violations using trusted execution. In: Proceedings of the Thirteenth EuroSys Conference. ACM, New York. pp 24:1–24:15. https://doi.org/10.1145/3190508.3190547
15. Ataide I, Vinha G, Souza C, Brito A (2018) Implementing quality of service and confidentiality for batch processing applications. In: 2018 IEEE/ACM International Conference on Utility and Cloud Computing Companion (UCC Companion). pp 258–265. https://doi.org/10.1109/UCC-Companion.2018.00065
16. Vaucher S, Pires R, Felber P, Pasin M, Schiavoni V, Fetzer C (2018) SGX-aware container orchestration for heterogeneous clusters. In: 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS). pp 730–741. https://doi.org/10.1109/ICDCS.2018.00076
17. Fetzer C, Mazzeo G, Oliver J, Romano L, Verburg M (2017) Integrating reactive cloud applications in SERECA. In: Proceedings of the 12th International Conference on Availability, Reliability and Security, ARES '17. ACM, New York. pp 39:1–39:8. https://doi.org/10.1145/3098954.3105820
18. Brasileiro F, Brito A, Blanquer I (2018) ATMOSPHERE: Adaptive, trustworthy, manageable, orchestrated, secure, privacy-assuring, hybrid ecosystem for resilient cloud computing. In: 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). pp 51–52. https://doi.org/10.1109/DSN-W.2018.00025
19. Oleksenko O, Trach B, Krahn R, Silberstein M, Fetzer C (2018) Varys: Protecting SGX enclaves from practical side-channel attacks. In: 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston. pp 227–240. https://www.usenix.org/conference/atc18/presentation/oleksenko

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

