
CHARIOT: Goal-driven Orchestration Middleware for Resilient IoT Systems

Subhav Pradhan†, Abhishek Dubey†, Shweta Khare†, Saideep Nannapaneni⋆, Aniruddha Gokhale†, Sankaran Mahadevan⋆, Douglas C. Schmidt†, Martin Lehofer‡

†Dept. of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN 37235, USA
⋆Dept. of Civil and Environmental Engineering, Vanderbilt University, Nashville, TN 37235, USA

    ‡Siemens Corporate Technology, Princeton, NJ, USA

ABSTRACT

An emerging trend in Internet of Things (IoT) applications is to move the computation (cyber) closer to the source of the data (physical). This paradigm is often referred to as edge computing. If edge resources are pooled together, they can be used as decentralized shared resources for IoT applications, providing increased capacity to scale up computations and minimize end-to-end latency. Managing applications on these edge resources is hard, however, due to their remote, distributed, and (possibly) dynamic nature, which necessitates autonomous management mechanisms that facilitate application deployment, failure avoidance, failure management, and incremental updates. To address these needs, we present CHARIOT, which is orchestration middleware capable of autonomously managing IoT systems consisting of edge resources and applications.

CHARIOT implements a three-layer architecture. The topmost layer comprises a system description language, the middle layer comprises a persistent data storage layer and the corresponding schema to store system information, and the bottom layer comprises a management engine that uses information stored persistently to formulate constraints that encode system properties and requirements, thereby enabling the use of Satisfiability Modulo Theories (SMT) solvers to compute optimal system (re)configurations dynamically at runtime. This paper describes the structure and functionality of CHARIOT and evaluates its efficacy as the basis for a smart parking system case study that uses sensors to manage parking spaces.

1. INTRODUCTION

Emerging trends and challenges. Popular IoT ecosystem platforms, such as Beaglebone Blacks, Raspberry Pi, Intel Edison, and other related technologies like SCALE [5] and Paradrop [33], provide new capabilities for data collection, analysis, and processing at the edge [32] (also referred to as Fog Computing [6]). When pooled together, edge resources can be used as decentralized shared resources that host the data collection, analysis, and actuation loops of IoT applications.

Examples of IoT applications include air quality monitoring, parking space detection, and smart emergency response. In this paper, we refer to the combination of remote edge resources and applications deployed on them as IoT systems. IoT systems provide the capacity to scale up computations, as well as minimize end-to-end latency, thereby making them well-suited to support novel use cases for smart and connected communities.

While the promise of the IoT paradigm is significant, several challenges must be resolved before IoT systems become ubiquitous and effective. Conventional enterprise architectures use centralized servers or clouds with static network layouts and a fixed number of devices, without sensors and actuators to interact with their physical environment. In contrast, edge deployment use cases must address key challenges not encountered in cloud computing, including (1) handling the high degree of dynamism arising from computation and communication resource uncertainty and (2) managing resource constraints imposed by the cyber-physical nature of applications and system hardware/software components.

Computation resource uncertainty in IoT systems stems from several factors, including an increased likelihood of failures, which are in turn caused by increased exposure to natural and human-caused effects, as well as dynamic environments where devices can join and leave a system at any time. Communication resource uncertainty is caused by network equipment failure, interference, or the mobile nature of some systems (e.g., swarms of drones or fractionated satellites). Unlike traditional enterprise architectures (whose resource constraints narrowly focus on only CPU, memory, storage, and network), IoT systems must be able to express and satisfy more stringent resource constraints due to their cyber-physical nature, such as their deployment on resource-limited sensors and actuators.

Even under the uncertainties and constraints outlined above, IoT systems must be capable of managing their applications to ensure maximum availability, especially since these applications are often mission-critical. Each application deployed for a mission has specific goal(s) that must be satisfied at all times. IoT systems should therefore be equipped with mechanisms that ensure all critical goals are satisfied for as long as possible, i.e., they must be resilient by facilitating failure avoidance, failure management, and operations management to support incremental hardware and software changes over time. Moreover, since IoT systems comprising edge resources are often deployed remotely, resilience mechanisms should be autonomous to ensure availability and cost-effective management.

Solution approach → Autonomous resilience management mechanisms. To address the challenges described above, IoT systems need autonomous mechanisms that enable the analysis and management of (1) the overall system goals describing the required applications, (2) the composition and requirements of applications, and (3) the constraints governing the deployment and (re)configuration of applications. This paper describes a holistic solution called Cyber-pHysical Application aRchItecture with Objective-based reconfiguraTion (CHARIOT), which is orchestration middleware that supports the autonomous management of remotely deployed IoT systems.

CHARIOT uses the analysis and management capabilities outlined above to provide services for initial application deployment, failure avoidance, failure management, and operations management. CHARIOT's three-layered architecture stack consists of a design layer, a data layer, and a management layer, as shown in Figure 1 and described in the summary of its four primary research contributions below.

    Figure 1: The Layered Architecture of CHARIOT.

Contribution 1: A generic system description language. At the top of CHARIOT's stack is a design layer implemented via a generic system description language. This layer captures system specifications in terms of the different types of available hardware resources, software applications, and the resource provided/required relationships between them. CHARIOT implements this layer using a domain-specific modeling language (DSML) called CHARIOT-ML, whose goal-based system description approach yields a generic means of describing complex IoT systems. This approach extends our prior work [26, 27] by (1) using the concept of component types (instead of specific implementations) to enhance flexibility and (2) supporting a suite of redundancy patterns. It is further described in Section 3.1.

Contribution 2: A schema for persistent storage of system information. In the middle of CHARIOT's stack is a data layer implemented using a persistent data store and the corresponding schema to characterize system information, which includes a design-time system description and a runtime representation of the system. This layer canonicalizes the format in which information about an IoT system is represented. We describe this contribution further in Section 3.2.

Contribution 3: A management engine to facilitate autonomous resilience. The bottom of CHARIOT's stack is a management layer that supports monitoring and deployment mechanisms, as well as a novel management engine that facilitates application (re)configuration as a means of supporting autonomous resilience. This management engine uses IoT system information stored in CHARIOT's data layer to formulate Satisfiability Modulo Theories (SMT) constraints that encode system properties and requirements, enabling the use of SMT solvers (such as Z3 [8]) to dynamically compute optimal system (re)configurations at runtime. We describe this contribution further in Section 3.3.

Contribution 4: Distributed implementation and evaluation of CHARIOT. CHARIOT uses MongoDB [23] as a persistent storage service, ZooKeeper [1] as a coordination service to facilitate group membership and failure detection, and ZeroMQ [13] as a high-performance communication middleware. We describe this contribution further in Section 4, which also describes a smart-parking application that serves as a case study for an experimental evaluation showing that CHARIOT's orchestration middleware capabilities are suitable for managing edge computing for IoT systems.

Paper organization. The remainder of this paper is organized as follows: Section 2 describes the research problem addressed by our work on CHARIOT; Section 3 explains our first three contributions by describing the CHARIOT solution in detail; Section 4 explains our fourth contribution by describing an implementation of CHARIOT and evaluating this implementation; Section 5 compares our work on CHARIOT with related work; and Section 6 presents concluding remarks and future work.

2. PROBLEM DESCRIPTION

This section describes the research problem addressed by our work on CHARIOT. We focus on IoT systems comprising clusters of heterogeneous nodes that provide computation and communication resources, as well as a variety of sensors and actuators. Cluster membership can change over time due to failures or the addition and removal of resources.

This distributed platform supports the needs of IoT applications, which may span multiple nodes due to the availability of resources, e.g., some nodes may have sensors, some may have actuators, some may have computing or storage resources, and some applications need more than the processing power available on one node. These IoT applications are composed of loosely connected, interacting components [11] running in different processes, as shown in Figure 2.

A component provides a certain functionality and may require one or more functionalities¹ via its input and output ports. The same functionality can be provided by different components. These provided and required relations between components and functionalities establish dependencies between components. IoT applications can thus be assembled from components that provide specific services. Likewise, components may be used (or reused) by many active applications. Moreover, the cluster of computing nodes can host multiple applications concurrently.

¹In this context, functionalities are synonymous with services or capabilities associated with a component.

Figure 2: A Component-based IoT Application Model.

An IoT system running on a CHARIOT-based distributed platform must manage its resources and applications to ensure that the functionalities provided by application components are always available. This capability is important since IoT applications are often mission-critical, so functionalities required to satisfy mission goals must be available for as long as possible. This notion of functionality requirement can also be hierarchical, i.e., a high-level functionality may be further divided into sub-functionalities.

The possibility of having hierarchical functionalities results in a functionality tree, which distinguishes between functionalities that can be divided into sub-functionalities and functionalities that cannot be decomposed further. The latter represent the leaves of the tree and should always map to one or more application components. Although each component provides a single functionality, the same functionality can be provided by multiple components.

The requirement relationship between each parent and its children at every level of this functionality tree can be expressed using a boolean expression [24, 15] that yields an and-or tree. Additional resource and implicit dependency constraints between components may arise due to system constraints. Examples of these system constraints include (1) the availability of memory and storage capacity for components to use, (2) the availability of devices and software artifacts (libraries) for components to use, and (3) the network links between nodes of a system, which restrict the deployment of component instances with inter-dependencies.
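To make the and-or requirement relationship concrete, the following minimal sketch shows one way such a functionality tree could be represented and evaluated. The class and functionality names are illustrative assumptions for this paper's running example, not CHARIOT's actual data structures.

from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of a functionality-tree node; not CHARIOT's internal model.
@dataclass
class FunctionalityNode:
    name: str
    op: str = "AND"                      # how children combine: "AND" or "OR"
    children: List["FunctionalityNode"] = field(default_factory=list)
    satisfied: bool = False              # leaves: is the functionality currently provided?

    def is_satisfied(self) -> bool:
        if not self.children:            # leaf: must map to a running component
            return self.satisfied
        results = [c.is_satisfied() for c in self.children]
        return all(results) if self.op == "AND" else any(results)

# Example: vacancy detection requires image capture AND
# (occupancy detection by detector A OR by detector B).
tree = FunctionalityNode("vacancy_detection", "AND", [
    FunctionalityNode("image_capture", satisfied=True),
    FunctionalityNode("occupancy_detection", "OR", [
        FunctionalityNode("detector_a", satisfied=False),
        FunctionalityNode("detector_b", satisfied=True),
    ]),
])
print(tree.is_satisfied())  # True: the OR branch is still covered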

2.1 A Representative IoT System Case Study

Consider an indoor parking management system installed in a garage. This case study focuses on the vacancy detection and notification functionality. This system is designed to simplify clients' use of parking facilities by tracking the availability of spaces in a parking lot and servicing client parking requests by determining available parking spaces and assigning a specific parking space to a client. We use this system as a running example throughout the rest of this paper to explain various aspects of CHARIOT.

Figure 3 visually depicts this IoT system, which consists of a number of pairs of camera nodes (wireless cameras) and processing nodes (Intel Edison modules mounted on Arduino boards²) placed on the ceiling to provide coverage for multiple parking spaces. Each pair comprising a camera and a processing node is connected via a wired connection. In addition, the parking lot has an entry terminal node that drivers interact with as they enter the parking lot.

Figure 3: The Parking Management System Case Study.

In addition to the hardware devices that comprise the system, Figure 3 also shows a distributed application consisting of five different types of components deployed on the hardware outlined above and described below:

• ImageCapture component, which runs on a camera node and periodically captures an image and sends it to an OccupancyDetector component that runs on a processing node.

• OccupancyDetector component, which detects vehicles in an image and determines the occupancy status of the parking spaces captured in the image.

• LoadBalancer component, which keeps track of the different OccupancyDetector components available. Before an ImageCapture component can send images to an OccupancyDetector component, it must first find the OccupancyDetector by using the LoadBalancer component, which also runs on a processing node.

• ParkingManager component, which keeps track of the occupancy status of the entire parking lot. After an OccupancyDetector component analyzes an image for the occupancy status of different parking spaces, it sends the result to the ParkingManager component, which also runs on a processing node.

• Client component, which runs on the entry terminal and interacts with users to allow them to use the smart parking application.

2.2 Problem Statement

As mentioned before, IoT systems are dynamic; the degree of dynamism can vary from one system to another. For example, the smart parking system presented in Section 2.1 is an example of a less dynamic system, since the physical resources are spatially static and any dynamism is related to system updates associated with the addition or removal of resources. A cluster of drones or fractionated satellites, however, is an example of a highly dynamic system. Regardless of the degree of dynamism, support for autonomous resilience is of high importance to every IoT system. For example, it is essential to ensure that the ParkingManager component is not a single point of failure, i.e., the smart parking system should not fail if the ParkingManager component fails.

²https://www.arduino.cc/en/ArduinoCertified/IntelEdison

Addressing the problems described above requires orchestration middleware that holistically addresses both (1) the design-time challenges of capturing the system description and (2) the runtime challenges of designing and implementing a solution that facilitates failure avoidance, failure management, and operations management.

1. Failure avoidance is necessary for scenarios where failures must be tolerated as long as possible without having to manage them. This capability is important for systems that cannot withstand reconfiguration downtime. Although failures cannot be avoided altogether, we require mechanisms that delay the need for failure management.

2. Failure management is needed to minimize downtime due to failures that cannot be avoided, including failures caused by unanticipated changes. The desired solution should ensure all application goals are satisfied for as long as possible, even after failures.

3. Operations management is needed to minimize the challenges faced when intentionally changing or evolving an existing IoT system, i.e., handling anticipated changes. A solution for this should consider changes in hardware components, as well as software applications and middleware components.

3. CHARIOT: ORCHESTRATION MIDDLEWARE FOR IOT SYSTEMS

This section presents a detailed description of CHARIOT, which is orchestration middleware we developed to address the challenges and requirements identified in Section 1. As shown in Figure 4, the design-time aspect of CHARIOT includes a modeling language and associated interpreters. The runtime aspect includes the entities that comprise a self-reconfiguration loop, which implements a sense-plan-act closed loop to (1) detect and diagnose failures, (2) compute reconfiguration plan(s), and (3) reconfigure the system. With respect to the different layers of CHARIOT (see Section 1), the design layer is part of the design-time aspect, the management layer is part of the runtime aspect, and the data layer cuts across both aspects.

CHARIOT handles failure avoidance via functionality redundancy and the optimal distribution of redundant functionalities. It tolerates failures by strategically deploying redundant copies of components that provide critical functionalities, so more failures are avoided/tolerated without having to reconfigure the system. CHARIOT's failure avoidance mechanisms are described further in Section 3.1.2.

CHARIOT handles failure management via the sense-plan-act loop outlined above. Its Monitoring Infrastructure is responsible for detecting failures, which is the sensing phase. We use capabilities supported by ZooKeeper [14] to implement a monitoring infrastructure (see Section 4.2). After failure detection and diagnosis, the Management Engine determines the actions needed to reconfigure the system so that failures are mitigated, which is the planning phase and is based on the Z3 [8] open-source Satisfiability Modulo Theories (SMT) solver (see Section 3.3). Once reconfiguration actions are computed, the Deployment Infrastructure uses them to reconfigure the system, which is the acting phase.
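As a rough illustration of the sensing phase, the sketch below registers a node and watches group membership via ZooKeeper ephemeral znodes using the kazoo Python client; the /chariot/members path and node names are assumptions for illustration, not CHARIOT's actual ZooKeeper layout.

from kazoo.client import KazooClient

# Minimal membership/failure-detection sketch (illustrative paths and names).
zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()
zk.ensure_path("/chariot/members")

# Each node registers itself with an ephemeral znode; if the node dies,
# ZooKeeper removes the znode and watchers are notified.
zk.create("/chariot/members/edison-07", b"processing_node", ephemeral=True)

@zk.ChildrenWatch("/chariot/members")
def on_membership_change(children):
    # A management engine would diff this list against the stored system
    # state and trigger reconfiguration (the planning phase) on a failure.
    print("alive nodes:", children)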

Figure 4: Overview of the CHARIOT Orchestration Middleware.

CHARIOT handles anticipated changes (i.e., planned updates or evolution) via operations management. These changes include both hardware changes (e.g., the addition of new nodes and removal of existing nodes) and software changes (e.g., the addition of new applications, and the modification or removal of existing applications) performed at runtime.

Figure 5: Reconfiguration Triggers Associated with Failure Management and Operations Management.

Figure 5 depicts the detection and reconfiguration trigger mechanisms associated with failure management and operations management. As shown in this figure, reconfiguration for failure management and hardware updates (operations management) is triggered by the monitoring infrastructure. Conversely, reconfiguration for software updates (operations management) is triggered manually after the system model is updated.

3.1 Design Layer

This section describes the CHARIOT design layer, which addresses the requirements of a design-time entity to capture system descriptions. CHARIOT's design layer allows implicit and flexible system description prior to runtime. An IoT system can be described in terms of required components, or it can be described in terms of the functionalities provided by components. The former approach is inflexible since it tightly couples specific components with the system. CHARIOT therefore supports the latter approach, which is more generic and flexible since it describes the system in terms of required functionalities, where different components can be used to satisfy system requirements, depending on their availability.

A key challenge faced when creating CHARIOT was to devise a design-time environment whose system description mechanism can capture system information (e.g., properties, provisions, requirements, and constraints) without explicit management directives (e.g., if node A fails, move all components to node B). This mechanism enables CHARIOT to manage failures by efficiently searching for alternative solutions at runtime. Another challenge faced when creating CHARIOT was how to devise abstractions that ensure both correctness and flexibility so it can easily support operations management.

To meet the challenges described above, CHARIOT's design layer allows application developers to model IoT systems using a generic system description mechanism. This mechanism is implemented using a goal-based system description approach. The key entities modeled as part of a system's description are (1) resource categories and templates, (2) different types of components that provide various functionalities, and (3) goal descriptions corresponding to the different applications that must be hosted on available resources. CHARIOT defines a goal as a collection of objectives, where each objective is a collection of functionalities that can have inter-dependencies.

CHARIOT's design layer concretizes the functionality tree described in Section 2. It enforces a two-layer functionality hierarchy, where objectives are high-level functionalities that satisfy goals and functionalities are leaf nodes associated with component types. When these component types are instantiated, each component instance provides the associated functionality. To maximize composability and reusability, a component type can only be associated with a single functionality, though multiple component types can provide the same functionality.

To further explain CHARIOT's design layer, the remainder of this section presents the system description of the smart parking system summarized in Section 2.1. Figure 6 shows the corresponding functionality tree, which is used below to describe the different entities comprising the IoT system's description using snippets of models built with CHARIOT-ML, our design-time modeling environment. A detailed description of the modeling language appears in [27].

3.1.1 Node Categories and Templates

Since physical nodes are part of an IoT system, CHARIOT-ML models them using categories and templates. The nodes are not explicitly modeled since the group of nodes comprising a system can change dynamically at runtime. CHARIOT thus only models node categories and node templates. A node category is a logical concept used to establish groups of nodes; every node that is part of an IoT system belongs to a certain node category.

Since CHARIOT does not explicitly model nodes at design time, it uses the concept of a node template to represent the types of nodes that can belong to a category. A node category³ is thus a collection of node templates, where a node template is a collection of generic information, such as specifications of memory, storage, devices, available software artifacts, the supported operating system, and available communication middleware. A node template can be associated with any node that is an instance of that node template. When a node joins a cluster at runtime, the only information it needs to provide (beyond node-specific network information) is which node template it is an instance of.

Figure 6: Parking System Description for the Example Shown in Figure 3.

Figure 7: Snippet of Node Categories and Node Templates Declarations.

Figure 7 presents the node categories and templates for the smart parking system. There are three categories of nodes shown in this figure: CameraNode (lines 3-10), ProcessingNode (lines 12-18), and TerminalNode (lines 20-26). Each category contains one template. The CameraNode category contains a wifi_cam template that represents a Wi-Fi enabled wireless IP camera. The ProcessingNode category contains an Edison template that represents an Edison board. The TerminalNode category contains an entry_terminal template that represents a parking control station placed at the entrance of a parking space. This scenario is consistent with the smart parking system described in Section 2.1. For simplicity, we only model memory and storage specifications for each node template.

³The concept of node categories becomes important when assigning a per-node replication constraint (discussed in Section 3.1.2), which requires that a functionality be deployed on each node of the given category.
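The snippet below sketches, in plain Python rather than CHARIOT-ML syntax, the kind of information these categories and templates carry; the memory and storage figures are illustrative assumptions, not values from the actual model.

# Illustrative only (not CHARIOT-ML syntax): node categories grouping node
# templates, each template carrying generic resource specifications.
node_categories = {
    "CameraNode": {
        "wifi_cam": {"memory_mb": 64, "storage_mb": 128},
    },
    "ProcessingNode": {
        "edison": {"memory_mb": 1024, "storage_mb": 4096},
    },
    "TerminalNode": {
        "entry_terminal": {"memory_mb": 2048, "storage_mb": 8192},
    },
}

# At runtime a joining node only reports its template (plus network info);
# everything else is looked up from the template.
def template_spec(category: str, template: str) -> dict:
    return node_categories[category][template]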

3.1.2 Functionalities, Compositions and Goals

Functionalities in CHARIOT-ML are modeled as entities with one or more input and output ports, whereas compositions are modeled as collections of functionalities and their inter-dependencies. Figure 8 presents four different functionalities (parking_manager, image_capture, load_balancer, and occupancy_detector) and the corresponding composition (occupancy_checking) that is associated with the OccupancyChecking objective (see line 6 in Figure 9). This figure also shows that a composition is a collection of functionalities and their inter-dependencies, which are captured as connections between the input and output ports of different functionalities.

Figure 8: Snippet of Functionalities and Corresponding Composition Declaration.

The goal description for the smart parking application is shown in Figure 9. The goal itself is declared as SmartParking (line 3). Following the goal declaration is a list of the objectives required to satisfy the goal (lines 5-6). Two objectives are defined in this example: the ClientInteraction objective and the OccupancyChecking objective. The ClientInteraction objective is related to the task of handling client parking requests, whereas the OccupancyChecking objective is related to the task of determining the occupancy status of different parking spaces.

Figure 9: Snippet of Smart Parking Goal Description Comprising Objectives and Replication Constraints.

In CHARIOT-ML, objectives are instantiations of compositions. The ClientInteraction objective is an instantiation of the client_interaction composition (line 5) and the OccupancyChecking objective is an instantiation of the occupancy_checking composition (line 6).

Support for Redundant Deployment Patterns: CHARIOT-ML also supports redundant deployment patterns, as a result of which functionalities can be associated with replication constraints. For example, Figure 9 shows the association of the image_capture functionality with a per-node replication constraint (lines 9-10), which means this functionality should be present on each node that is an instantiation of any node template belonging to the CameraNode category. Similarly, the parking_client functionality is also associated with a per-node replication constraint (lines 11-12) for the TerminalNode category. Finally, the occupancy_detector functionality is associated with a cluster replication constraint (lines 13-14), which means this functionality should be deployed as a cluster of at least 2 and at most 4 instances.

CHARIOT-ML supports functionality replication using four different redundancy patterns: the (1) voter pattern, (2) consensus pattern, (3) cluster pattern, and (4) per-node pattern, as shown in Figure 10. The per-node pattern (as described above for the image_capture functionality) requires that the associated functionality be replicated on a per-node basis. Replication of functionalities associated with the other three redundancy patterns is based on their redundancy factor, which can be expressed by either (1) explicitly stating the number of redundant functionalities required or (2) providing a range. The latter (as previously described for the occupancy_detector functionality) requires the associated functionality to have a minimum and a maximum number of replicas, i.e., if the number of functionality instances present at any given time is within the range, the system is still valid and no reconfiguration is required.

Figure 10 presents a graphical representation of the voter, consensus, and cluster redundancy patterns (in the case of the consensus pattern, CS represents consensus services). Different redundancy factors are used for each. As shown in the figure, the voter pattern involves a voter in addition to the functionality replicas; the consensus pattern involves a consensus service for each of the functionality replicas, and these consensus services implement a consensus ring; and the cluster pattern only involves the functionality replicas. Implementing the consensus service is beyond the scope of this paper. In practice, CHARIOT uses existing consensus protocols, such as Raft [25], for this purpose.

(a) Voter pattern with factor = 3. (b) Consensus pattern with factor = 4. (c) Cluster pattern with factor = 2.

Figure 10: Example Redundancy Patterns for Functionality F1. The CS_nm Entities Represent Consensus Service Providers.
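As an illustration, the replication constraints from Figure 9 could be captured with a simple record type like the following; the field names loosely mirror the data-layer schema in Section 3.2 but are otherwise assumptions, not CHARIOT's actual representation.

from dataclasses import dataclass
from typing import Optional

# Hypothetical encoding of the replication constraints described above.
@dataclass
class ReplicationConstraint:
    kind: str                            # "PER_NODE", "CLUSTER", "VOTER", "CONSENSUS"
    functionality: str
    num_instances: int = 0               # exact replication factor, if given
    min_instances: int = 0               # range-based replication: lower bound
    max_instances: int = 0               # range-based replication: upper bound
    node_category: Optional[str] = None  # only used for PER_NODE

# Constraints corresponding to the smart parking goal (Figure 9):
constraints = [
    ReplicationConstraint("PER_NODE", "image_capture", node_category="CameraNode"),
    ReplicationConstraint("PER_NODE", "parking_client", node_category="TerminalNode"),
    ReplicationConstraint("CLUSTER", "occupancy_detector", min_instances=2, max_instances=4),
]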

3.1.3 Component Types

CHARIOT-ML does not model component instances, but instead models component types. Each component type is associated with a functionality. When a component type is instantiated, the component instance provides the functionality associated with its type. A component instance therefore only provides a single functionality, whereas a functionality can be provided by component instances of different types. Two advantages of modeling component types instead of component instances include the flexibility it provides with respect to (1) the number of possible runtime instances of a component type and (2) the number of possible component types that can provide the same functionality.

    Figure 11: Snippet of Component Type Declaration.

Figure 11 shows how the ParkingManager component type is modeled in CHARIOT-ML. As part of the component type declaration, we first model the functionality that is provided by the component (line 4). After the functionality of a component type is modeled, we model various resource requirements (Figure 11 only shows memory requirements in line 6) and the launch script (line 8), which can be used to instantiate an instance of the component by spawning an application process.

CHARIOT supports two different kinds of component types: hardware components and software components. The component type presented in Figure 11 is an example of a software component. Hardware components are modeled in a similar fashion, though we model only the functionality provided by a hardware component and nothing else, since a hardware component is a specific type of component whose lifecycle is tightly coupled to the node with which it is associated. A hardware component is therefore never actively managed (reconfigured) by the CHARIOT orchestration middleware. The only thing that affects the state of a hardware component is the state of its hosting node, i.e., if the node is on and functioning well, the component is active; if it is not, the component is inactive.

In the context of the smart parking system case study, the ImageCapture component is a hardware component that is associated with camera nodes. As a result, an instance of the ImageCapture component runs on each active camera node. We model this requirement using the per-node redundancy pattern (see lines 32-33 in Figure 9). Likewise, the failure of a camera node implies failure of the hosted ImageCapture component instance, so this failure cannot be mitigated.
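A sketch of how software and hardware component types might be represented as records is shown below; the resource values and scripts are illustrative assumptions, not the actual CHARIOT-ML declarations from Figure 11.

from dataclasses import dataclass

# Illustrative component-type records (not CHARIOT-ML syntax). A software
# component carries resource requirements and launch/stop scripts; a hardware
# component only names the functionality it provides.
@dataclass
class ComponentType:
    name: str
    provided_functionality: str
    is_hardware: bool = False
    memory_mb: int = 0                  # only meaningful for software components
    start_script: str = ""              # spawns the application process
    stop_script: str = ""

parking_manager = ComponentType(
    name="ParkingManager",
    provided_functionality="parking_manager",
    memory_mb=128,                      # assumed value for illustration
    start_script="python3 parking_manager.py",
    stop_script="pkill -f parking_manager.py",
)

image_capture = ComponentType(
    name="ImageCapture",
    provided_functionality="image_capture",
    is_hardware=True,                   # lifecycle tied to its camera node
)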

3.1.4 Summary of the Design Layer

CHARIOT-ML, the textual Domain Specific Modeling Language (DSML) that comprises CHARIOT's design layer, is built using the Xtext framework [2]. Currently, CHARIOT-ML allows the modeling of resources such as software artifacts, devices, memory, storage, operating systems, and communication middleware. Although this list of resource types is extensive enough for most IoT systems, it might not be sufficient for all possible IoT systems. It might therefore require modifications and extensions depending on the domain in which it is used. For example, in order to model self-degrading systems that rely on monitoring of QoS parameters, CHARIOT-ML must facilitate the modeling of QoS thresholds at different levels of abstraction.

Furthermore, the language currently only facilitates replication constraints, which are a type of deployment constraint that specifies the number of instances of a certain functionality that must be deployed. There are other scenarios, however, where replication constraints are not sufficient and more specific deployment constraints are required, such as deploy-on-same-node, deploy-on-different-node, and deploy-only-on-a-specific-resource-category.

Due to the modular nature of the Xtext framework, introducing these changes will not be difficult. However, we must ensure that the new concepts do not violate any existing rules already implemented. Furthermore, the data schema defined in Section 3.2 ensures that extensions introduced at the design layer can be supported by the underlying management layer, assuming that functionalities are only being added and existing concepts are not modified.

3.2 Data Description Layer

This section presents the CHARIOT data layer, which defines a schema that forms the basis for persistently storing system information, such as the design-time system description and runtime system information. This layer codifies the format in which system information should be represented. A key advantage of this codification is its decoupling of CHARIOT's design layer (top layer) from its management layer (bottom layer), which yields a flexible architecture that can accommodate varying implementations of the design layer, as long as those implementations adhere to the data layer schema described in this section.

(a) Schema to Store Design-time System Descriptions.

(b) Schema to Store Runtime System Representations.

Figure 12: UML Class Diagrams for Schemas Used to Store System Information.

Figure 12 presents UML class diagrams as schemas used to store the design-time system description and runtime system information. These schemas are designed for document-oriented databases. An instance of a class that is not a child in a composition relationship therefore represents a root document. Below we describe CHARIOT's design-time and runtime schemas in detail.

3.2.1 Design-time System Description Schema

The schema for the design-time system description comprises entities to store node categories, component types, and goal descriptions, as shown in Figure 12a. These concepts have been previously described in Section 3.1. Neither node categories nor component types are application-specific, since multiple applications can be simultaneously hosted on the nodes of an IoT system and a component type can be used by multiple applications. In addition to other attributes, the ComponentType class also captures scripts that can be used to start and stop an instance of a component type; this information is used at runtime to instantiate components.

As shown in Figure 12a, a goal description comprises objectives, which are composed of functionalities, and replication constraints. The ReplicationConstraint class represents replication constraints and consists of maxInstances, minInstance, and numInstances attributes that are related to the degree of replication. The latter attribute is used if a specific number of replicas is required, whereas the former two attributes are used to describe range-based replication. The nodeCategories attribute is used for per-node replication constraints. The serviceComponentType attribute is related to specific component types that provide special replication services, such as a component type that provides a voter service or a consensus service.
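Since the data layer targets document-oriented storage (CHARIOT's implementation uses MongoDB, per Contribution 4), a goal-description root document for the smart parking system might look roughly like the following sketch; the field names are shaped after Figure 12a, but the exact layout and the abbreviated functionality lists are assumptions.

# Hypothetical goal-description document, shaped after Figure 12a; the exact
# field layout used by CHARIOT may differ, and the objective contents are abbreviated.
goal_description = {
    "name": "SmartParking",
    "objectives": [
        {"name": "ClientInteraction", "functionalities": ["parking_client"]},
        {"name": "OccupancyChecking", "functionalities": ["image_capture", "load_balancer",
                                                          "occupancy_detector", "parking_manager"]},
    ],
    "replicationConstraints": [
        {"kind": "PER_NODE", "functionality": "image_capture",
         "nodeCategories": ["CameraNode"], "numInstances": 0,
         "minInstances": 0, "maxInstances": 0},
        {"kind": "CLUSTER", "functionality": "occupancy_detector",
         "nodeCategories": [], "numInstances": 0,
         "minInstances": 2, "maxInstances": 4},
    ],
}

# With pymongo the document could be stored as a root document, e.g.:
#   from pymongo import MongoClient
#   MongoClient()["chariot"]["goal_descriptions"].insert_one(goal_description)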

3.2.2 Runtime Information Schema

The schema for runtime system information comprises entities to store functionality instances, nodes, deployment actions, reconfiguration events, and look-ahead information, as shown in Figure 12b. Since functionalities can be replicated, the FunctionalityInstance class is used to store information about functionality instances. The componentType attribute is only relevant for voter and consensus service providing functionality instances, as they are not associated with functionalities that are part of a goal description. Furthermore, the alwaysDeployOnNode attribute ties a functionality instance to a specific node and is only relevant for functionality instances related to per-node replication groups. Finally, the mustDeploy boolean attribute indicates whether a functionality instance should always be deployed.

The Node class represents compute nodes, the Process class represents processes running on nodes, and the ComponentInstance class represents component instances hosted in processes. As shown in Figure 12b, these three classes have a containment relationship. The functionalityInstanceName attribute in the ComponentInstance class represents the name of the corresponding functionality instance, as a component instance is always associated with a functionality instance (see Section 3.3.3).

The DeploymentAction class represents runtime deployment actions that are computed by the CHARIOT management engine to (re)configure a system. The DeploymentAction class consists of an action, a completed boolean flag to indicate if the action has been taken, the process affected by the action, the node on which the action should be performed, and scripts to perform the action. CHARIOT supports two kinds of actions: start actions and stop actions. The LookAhead class represents precomputed solutions (see Section 3.3.6). It consists of attributes that represent a failed entity and a set of recovery actions (deployment actions) that must be performed to recover from the failure.

The ReconfigurationEvent class represents runtime reconfiguration events. It is used to keep track of failure and update events that trigger system reconfiguration. It consists of detectionTime, solutionFoundTime, and reconfiguredTime attributes to keep track of when a failure or update was detected, when a solution was computed, and when the computed solution was deployed. It also consists of a completed attribute to indicate whether a reconfiguration event is complete and an actionCount attribute to keep track of the number of actions required to complete a reconfiguration event.
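For illustration, a DeploymentAction document and a ReconfigurationEvent document shaped after Figure 12b might look like the sketch below; the exact field layout used by CHARIOT may differ, and the values are hypothetical.

import time

# Illustrative runtime documents; not the literal schema used by CHARIOT.
deployment_action = {
    "action": "START",                   # CHARIOT supports start and stop actions
    "completed": False,
    "process": "occupancy_detector_2",
    "node": "edison-07",
    "startScript": "python3 occupancy_detector.py",
    "stopScript": "pkill -f occupancy_detector.py",
}

reconfiguration_event = {
    "detectionTime": time.time(),        # when the failure/update was detected
    "solutionFoundTime": None,           # set once the SMT solver returns a plan
    "reconfiguredTime": None,            # set once all actions have been applied
    "completed": False,
    "actionCount": 1,                    # number of deployment actions in the plan
}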

3.2.3 Summary of the Data Description Layer

Although not a novel contribution by itself, the data layer described in this section is critical to the overall CHARIOT ecosystem. This layer addresses the challenge of providing a generic and uniform system state that can be queried by the rest of the system at runtime. Having a well-defined model for system information not only helps the CHARIOT ecosystem remain flexible by decoupling the design and management layers, it also aids future extensions when required.

For example, as previously discussed in Section 3.1.4, let us assume that a set of deployment constraints needs to be added. This requires us to extend the current design-time system description (shown in Figure 12a) in such a way that the deployment constraints modeled at design time can be easily stored in the data layer and retrieved by the management layer without affecting any existing code. This could be achieved by adding a DeploymentConstraint class similar to the ReplicationConstraint class and making it part of the GoalDescription class.

3.3 Runtime Management Layer

The CHARIOT runtime management layer comprises a monitoring and deployment infrastructure, as well as a management engine, as previously outlined in Figure 4. The monitoring and deployment of distributed applications is covered in prior work [26]; CHARIOT implements these capabilities using existing technologies described in Section 4. This section focuses on CHARIOT's management engine, which facilitates the self-reconfiguration of IoT systems by (1) adding the capability to compute exact component instances from available component types, (2) encoding redundancy patterns using SMT constraints, and (3) using a finite-horizon look-ahead strategy that pre-computes solutions to significantly improve the performance of CHARIOT's management engine.

3.3.1 Configuration Space and Points

The general idea behind CHARIOT's self-reconfiguration approach relies on the concepts of configuration space and configuration points. If a system's state is represented by a configuration point in a configuration space, then reconfiguration of that system entails moving from one configuration point to another in the same configuration space. A configuration space includes (1) the goal descriptions of different applications, (2) replication constraints corresponding to the redundancy patterns associated with different applications, (3) component types that can be used to instantiate different component instances and therefore applications, and (4) available resources, which include the different nodes and their corresponding resources, such as memory, storage, and computing elements.

Figure 13: A Configuration Space with Different Configuration Points. This figure depicts two faults that disable parts of the system, resulting in two reconfigurations.

At any given time, the configuration space of an IoT system can represent multiple applications associated with the system. A configuration space can therefore contain multiple configuration points, as shown in Figure 13. These configuration points represent valid configurations of all applications that are part of the IoT system represented by the configuration space.

A valid configuration point represents component-instance-to-node mappings (i.e., a deployment) for all component instances needed to realize the different functionalities essential to the objectives required to satisfy the goals of one or more applications. The initial configuration point represents the initial (baseline) deployment, whereas the current configuration point represents the current deployment.

3.3.2 Computing the Configuration Point

Given the above definitions of configuration space and configuration points, a valid reconfiguration mechanism entails moving from one configuration point to another in the same configuration space (see Figure 13). When a failure occurs, the current configuration point is rendered faulty. Moreover, parts of the configuration space may also be rendered faulty, depending on the failure. For example, consider a scenario where multiple configuration points map one or more components to a node. If this node fails, then all of the aforementioned configuration points are rendered faulty. In addition to failures, hardware and software updates can also result in reconfiguration, as discussed earlier.

Specifically, reconfiguration in CHARIOT happens by identifying a new valid configuration point and determining the set of actions required to transition from the current (faulty) configuration point to the new (desired) configuration point. Configuration points and their transitions thus form the core of CHARIOT's reconfiguration mechanism. For any reconfiguration, several valid configuration points might be available. From the available configuration points, an optimal configuration point that satisfies the system requirements can be obtained based on several criteria, such as transition cost, reliability, operation cost, and/or utility. The Configuration Point Computation (CPC) algorithm serves this purpose and thus defines the core of CHARIOT's self-reconfiguration mechanism. The CPC algorithm can be decomposed into three phases: the (1) instance computation phase, (2) constraint encoding phase, and (3) solution computation phase, as described next.

3.3.3 Phase 1: Instance Computation

The first phase of the CPC algorithm computes the required instances of different functionalities, and subsequently components, based on the system description provided at design time. Each functionality can have multiple instances if it is associated with a replication constraint. Each functionality instance should have a corresponding component instance that provides the functionality associated with the functionality instance. Depending upon the number of component types that provide a given functionality, a functionality instance can have multiple candidate component instances. Only one of these component instances will be deployed at runtime, however, so there is always a one-to-one mapping between a functionality instance and a deployed component instance.

The CPC algorithm first computes the different functionality instances using Algorithm 1, which is invoked for each objective. Every functionality is initially checked for replication constraints (line 3). If a functionality does not have a replication constraint, a single functionality instance is created (line 32). For every functionality that has one or more replication constraints associated with it, each constraint is handled depending on its type. A per-node replication constraint is handled by generating a functionality instance and an assign constraint for each applicable node (lines 6-11). An applicable node is a node that is alive and belongs to the node category associated with the per-node replication constraint.

Unlike a per-node replication constraint, the voter, consensus, and cluster replication constraints depend on an exact replication value or a replication range to determine the number of replicas (lines 13-19). In the case of range-based replication, CHARIOT tries to maximize the number of replicas by using the maximum of the range, which ensures that the maximum number of failures is tolerated without having to reconfigure the system. After the number of replicas is determined, CHARIOT computes the replica functionality instances (line 21), as well as special functionality instances that support the different types of replication constraints.

For example, for each replica functionality instance in a consensus replication constraint, CHARIOT generates a consensus service functionality instance (line 23) (a consensus service functionality is provided by a component that implements consensus logic using existing algorithms, such as Paxos [16] and Raft [25]). For a voter replication constraint, in contrast, CHARIOT generates a single voter functionality instance for the entire replication group (line 27). In the case of a cluster replication constraint, no special functionality instance is generated, as a cluster replication comprises independent functionality instances that do not require any synchronization (see Section 3.1.2).

In order to ensure the proper management of instances related to functionalities with voter, consensus, or cluster replication constraints, CHARIOT uses four different constraints: (1) implies, (2) collocate, (3) atleast, and (4) distribute. The implies constraint ensures all replica functionality instances associated with a consensus pattern require their corresponding consensus service functionality instances (line 24). Similarly, the collocate constraint ensures each replica functionality instance and its corresponding consensus service functionality instance are always collocated on the same node (line 25). The atleast constraint ensures the minimum number of replicas is always present in scenarios where a replication range is provided (lines 28-29). Finally, the distribute constraint ensures that the replica functionalities are distributed across different nodes (line 30). CHARIOT's ability to support multiple instances of functionalities and distribute them across different nodes is the basis of its failure avoidance mechanism.

After the functionality instances are created, CHARIOT next creates the component instances corresponding to each functionality instance. In general, it identifies a component type that provides the functionality associated with each functionality instance and instantiates that component type. As explained in Section 3.1.3, component types are modeled as part of the system description. Different component types can provide the same functionality, in which case multiple component types are instantiated, but a constraint is added to ensure only one of those instances is deployed and running at any given time. In addition, all constraints previously created in terms of functionality instances are ultimately applied in terms of the corresponding component instances. We describe these constraints next.

3.3.4 Phase 2: Constraint Encoding

The second phase of the CPC algorithm is responsible for constraint encoding and optimization. These constraints are summarized below:

1. Since reconfiguration involves transitioning from one configuration point to another, constraints that represent a configuration point are of utmost importance.

2. Constraints to ensure component instances that must be deployed are always deployed.

3. Constraints to ensure component instances that communicate with each other are either deployed on the same node or on nodes that have network links between them.

4. Constraints to ensure the resources' provided-required relationships are valid.

5. Constraints encoded in the first phase of the CPC algorithm for the proper management of component instances associated with replication constraints.

6. Constraints to represent failures, such as node failures or device failures.

Algorithm 1 Functionality Instances Computation.

Input: objective (obj), nodes (nodes_list), computed functionalities (computed_functionalities)
Output: functionality instances for obj (ret_list)

1: for func in obj.functionalities do
2:   if func not in computed_functionalities then          ▷ Make sure a functionality is processed only once.
3:     if func has associated replication constraints then
4:       constraints = all replication constraints associated with func
5:       for c in constraints do
6:         if c.kind == PER_NODE then                       ▷ Handle per-node replication.
7:           for node_category in c.nodeCategories do
8:             nodes = nodes in nodes_list that are alive and belong to category node_category
9:             for n in nodes do
10:              create functionality instance and add it to ret_list
11:              add assign(functionality instance, n) constraint
12:        else
13:          replica_num = 0                                ▷ Initial number of replicas; set to the max value if a range is given.
14:          range_based = False                            ▷ Flag to indicate if a replication constraint is range based.
15:          if c.numInstances ≠ 0 then
16:            replica_num = c.numInstances
17:          else
18:            range_based = True
19:            replica_num = c.maxInstances
20:          for i = 0 to replica_num do                    ▷ Create replica functionality instances.
21:            create replica functionality instance and add it to ret_list
22:            if c.kind == CONSENSUS then                  ▷ Handle consensus replication.
23:              create consensus service functionality instance and add it to ret_list
24:              add implies(replica functionality instance, consensus service functionality instance) constraint
25:              add collocate(replica functionality instance, consensus service functionality instance) constraint
26:          if c.kind == VOTER then                        ▷ Handle voter replication.
27:            create voter functionality instance and add it to ret_list
28:          if range_based == True then                    ▷ If a replication range is given, add an atleast constraint.
29:            add atleast(c.rangeMinValue, replica functionality instances) constraint
30:          add distribute(replica functionality instances) constraint
31:    else
32:      create functionality instance and add it to ret_list
33:    add func to computed_functionalities

The remainder of this section describes how CHARIOT implements the constraints listed above as SMT constraints.

Representing the configuration points: A configuration point in CHARIOT is represented using a component-instance-to-node (C2N) matrix, as shown below. A C2N matrix comprises rows that represent component instances and columns that represent nodes; the size of this matrix is α × β, where α is the number of component instances and β is the number of available nodes (Equation 1). Each element of the matrix is encoded as a Z3 integer variable whose value can either be 0 or 1 (Equation 2). A value of 0 for an element means that the corresponding component instance (row) is not deployed on the corresponding node (column). Conversely, a value of 1 for an element indicates deployment of the corresponding component instance on the corresponding node. For a valid C2N matrix, a component instance must not be deployed more than once, as shown in Equation 3.

\begin{equation}
C2N =
\begin{bmatrix}
c2n_{00} & c2n_{01} & c2n_{02} & \dots & c2n_{0\beta} \\
c2n_{10} & c2n_{11} & c2n_{12} & \dots & c2n_{1\beta} \\
c2n_{20} & c2n_{21} & c2n_{22} & \dots & c2n_{2\beta} \\
\vdots   & \vdots   & \vdots   & \ddots & \vdots \\
c2n_{\alpha 0} & c2n_{\alpha 1} & c2n_{\alpha 2} & \dots & c2n_{\alpha\beta}
\end{bmatrix},
\quad c2n_{cn} : c \in \{0 \dots \alpha\},\; n \in \{0 \dots \beta\},\; (\alpha, \beta) \in \mathbb{Z}^{+}
\tag{1}
\end{equation}

\begin{equation}
\forall\, c2n_{cn} \in C2N : c2n_{cn} \in \{0, 1\}
\tag{2}
\end{equation}

\begin{equation}
\forall c : \sum_{n=0}^{\beta} c2n_{cn} \le 1
\tag{3}
\end{equation}
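A minimal sketch of this encoding with the Z3 Python API is shown below, assuming small illustrative values for α and β; it declares the C2N integer variables (Equation 1), restricts them to 0/1 (Equation 2), and adds the deploy-at-most-once constraint (Equation 3).

from z3 import Int, Solver, Sum, And, sat

alpha, beta = 3, 2   # illustrative: 3 component instances, 2 nodes

# C2N matrix of Z3 integer variables (Equation 1).
c2n = [[Int(f"c2n_{c}_{n}") for n in range(beta)] for c in range(alpha)]

s = Solver()

# Equation 2: each element is 0 or 1.
for c in range(alpha):
    for n in range(beta):
        s.add(And(c2n[c][n] >= 0, c2n[c][n] <= 1))

# Equation 3: a component instance is deployed at most once.
for c in range(alpha):
    s.add(Sum([c2n[c][n] for n in range(beta)]) <= 1)

if s.check() == sat:
    print(s.model())   # one valid (possibly empty) deployment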


Having defined the constraints that represent a configuration point (i.e., a valid component-instance-to-node mapping), we next need a constraint to ensure that component instances that should be deployed are always deployed. At this point it is important to recall the range-based replication described in Section 3.1.2. This approach results in a set of instances where a certain number (at least the minimum) should always be deployed, but the remaining instances (the difference between the maximum and the minimum) are not always required, even though all of them are deployed initially. At any given time, therefore, a configuration point can comprise some component instances that must be deployed and others that are not always required to be deployed. In CHARIOT we encode the "must deploy assignment" constraint as follows:

Capturing the Must Deploy Constraint: The "must deploy assignment" constraint is used to ensure all component instances that should be deployed are in fact deployed. This constraint therefore uses the C2N matrix (Equation 1) and a set of component instances that must be deployed, as shown in Equation 4. Let M be the set of all component instances that must be deployed.

\begin{equation}
\forall m \in M : \sum_{n=0}^{\beta} c2n_{mn} == 1
\tag{4}
\end{equation}
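Continuing the same style of encoding, the sketch below adds the must-deploy constraint of Equation 4 for an illustrative set M of component-instance indices.

from z3 import Int, Solver, Sum

alpha, beta = 3, 2   # illustrative sizes, as in the previous sketch
c2n = [[Int(f"c2n_{c}_{n}") for n in range(beta)] for c in range(alpha)]
s = Solver()

must_deploy = [0, 2]   # illustrative indices for the set M
for m in must_deploy:
    # Equation 4: each instance in M is deployed on exactly one node.
    s.add(Sum([c2n[m][n] for n in range(beta)]) == 1)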

The third set of constraints ensures that component instances with inter-dependencies (i.e., that communicate with each other) are either deployed on the same node or on nodes that have network links between them. CHARIOT encodes this constraint as follows:

Capturing the dependencies between components: This constraint ensures that interacting component instances are always deployed on resources with appropriate network links to support communication. This constraint is encoded in terms of a node-to-node (N2N) matrix, which is a square matrix that represents the existence of network links between nodes. The N2N matrix thus comprises rows and columns that represent different nodes (Equation 5). Each element of the N2N matrix is either 0 or 1, where 0 means there does not exist a link between the two corresponding nodes and 1 means there exists a link between the two corresponding nodes. The constraint is presented in Equation 6.

\begin{equation}
N2N =
\begin{bmatrix}
n2n_{00} & n2n_{01} & n2n_{02} & \dots & n2n_{0\beta} \\
n2n_{10} & n2n_{11} & n2n_{12} & \dots & n2n_{1\beta} \\
\vdots   & \vdots   & \vdots   & \ddots & \vdots \\
n2n_{\beta 0} & n2n_{\beta 1} & n2n_{\beta 2} & \dots & n2n_{\beta\beta}
\end{bmatrix},
\quad n2n_{n_1 n_2} : (n_1, n_2) \in \{0 \dots \beta\},\; \beta \in \mathbb{Z}^{+}
\tag{5}
\end{equation}

Let $c_s$ and $c_d$ be two component instances that are dependent on each other.

\begin{equation}
\forall n_1, \forall n_2 : \big( (c2n_{c_s n_1} \times c2n_{c_d n_2} \ne 0) \wedge (n_1 \ne n_2) \big) \implies (n2n_{n_1 n_2} == 1)
\tag{6}
\end{equation}
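The dependency constraint of Equation 6 can be sketched as follows, assuming a small illustrative N2N adjacency matrix and two dependent instances c_s and c_d; the implication mirrors the equation directly.

from z3 import Int, Solver, Implies, And, BoolVal

beta = 3        # illustrative number of nodes
cs, cd = 0, 1   # indices of two dependent component instances

# Deployment variables for the two instances (their rows of the C2N matrix).
c2n = {c: [Int(f"c2n_{c}_{n}") for n in range(beta)] for c in (cs, cd)}

# Illustrative N2N matrix: nodes 0 and 1 are linked, node 2 is isolated.
n2n = [[1, 1, 0],
       [1, 1, 0],
       [0, 0, 1]]

s = Solver()
# Equation 6: if cs and cd land on different nodes, those nodes must be linked.
for n1 in range(beta):
    for n2 in range(beta):
        if n1 != n2:
            s.add(Implies(And(c2n[cs][n1] == 1, c2n[cd][n2] == 1),
                          BoolVal(n2n[n1][n2] == 1)))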

Capturing the Resource Constraints: The fourth set of constraints ensures the validity of the resources' provided-required relationships, such that the essential component instances of one or more applications can be provisioned. In CHARIOT these constraints are encoded in terms of resources provided by nodes and required by component instances. Moreover, resources are classified into two categories: (1) cumulative resources and (2) comparative resources. Cumulative resources have a numerical value that increases or decreases depending on whether the resource is used or freed. Examples of cumulative resources include primary memory and secondary storage. Comparative resources have a boolean value, i.e., they are either available or not available, and their value does not change depending on whether the resource is used or freed. Examples of comparative resources include devices and software artifacts. These two constraints can be encoded as follows:

The “cumulative resource” constraint is encoded using a provided resource-to-node (CuR2N) matrix and a required resource-to-component-instance (CuR2C) matrix. The CuR2N matrix comprises rows that represent different cumulative resources and columns that represent nodes; the size of this matrix is γ×β, where γ is the number of cumulative resources and β is the number of available nodes (Equation 7). The CuR2C matrix comprises rows that represent different cumulative resources and columns that represent component instances; the size of this matrix is γ×α, where γ is the number of cumulative resources and α is the number of component instances (Equation 8). Each element of these matrices is an integer. The constraint itself ensures that, for each available cumulative resource and node, the sum of the amount of the resource required by the component instances deployed on the node is less than or equal to the amount of the resource available on the node, as shown in Equation 9.

CuR2N = [ r2n_00   r2n_01   r2n_02   ...   r2n_0β
          r2n_10   r2n_11   r2n_12   ...   r2n_1β
          ...
          r2n_γ0   r2n_γ1   r2n_γ2   ...   r2n_γβ ]

r2n_rn : r ∈ {0 ... γ}, n ∈ {0 ... β}, (γ, β) ∈ Z+    (7)

CuR2C = [ r2c_00   r2c_01   r2c_02   ...   r2c_0α
          r2c_10   r2c_11   r2c_12   ...   r2c_1α
          ...
          r2c_γ0   r2c_γ1   r2c_γ2   ...   r2c_γα ]

r2c_rc : r ∈ {0 ... γ}, c ∈ {0 ... α}, (γ, α) ∈ Z+    (8)

∀r, ∀n : ( ∑_{c=0}^{α} c2n_cn × r2c_rc ) ≤ r2n_rn    (9)

The “comparative resource” constraint is encoded using a provided resource-to-node (CoR2N) matrix and a required resource-to-component-instance (CoR2C) matrix. The CoR2N matrix comprises rows that represent different comparative resources and columns that represent nodes; the size of this matrix is φ×β, where φ is the number of comparative resources and β is the number of available nodes (Equation 10). Similarly, the CoR2C matrix comprises rows that represent different comparative resources and columns that represent component instances; the size of this matrix is φ×α, where φ is the number of comparative resources and α is the number of component instances (Equation 11). Each element of these matrices is either 0 or 1, where 0 means the corresponding resource is not provided by the corresponding node (for the CoR2N matrix) or not required by the corresponding component instance (for the CoR2C matrix), and 1 means the opposite. The constraint itself (Equation 12) ensures that for each available comparative resource, node, and component instance, if the component instance is deployed on the node and requires the resource, then the resource must also be provided by the node.

CoR2N = [ r2n_00   r2n_01   r2n_02   ...   r2n_0β
          r2n_10   r2n_11   r2n_12   ...   r2n_1β
          ...
          r2n_φ0   r2n_φ1   r2n_φ2   ...   r2n_φβ ]

r2n_rn : r ∈ {0 ... φ}, n ∈ {0 ... β}, (φ, β) ∈ Z+    (10)

CoR2C = [ r2c_00   r2c_01   r2c_02   ...   r2c_0α
          r2c_10   r2c_11   r2c_12   ...   r2c_1α
          ...
          r2c_φ0   r2c_φ1   r2c_φ2   ...   r2c_φα ]

r2c_rc : r ∈ {0 ... φ}, c ∈ {0 ... α}, (φ, α) ∈ Z+    (11)

∀r, ∀n, ∀c : Assigned(c, n) ⟹ (r2n_rn == r2c_rc)    (12)

The Assigned(c, n) function returns true if component instance c is deployed on node n, i.e., if c2n_cn == 1.
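A sketch of Equations 9 and 12, again extending the earlier z3py snippet; the resource matrices are invented example values, and the comparative constraint follows the textual reading of Equation 12 (required implies provided):

    from z3 import Sum

    # Cumulative resources (assumed example: one resource, memory in MB)
    cur2n = [[512, 1024, 512]]        # provided per node (Equation 7)
    cur2c = [[128, 128, 256, 64]]     # required per component instance (Equation 8)
    for r in range(len(cur2n)):
        for n in range(beta):
            solver.add(Sum([c2n[c][n] * cur2c[r][c] for c in range(alpha)])
                       <= cur2n[r][n])                    # Equation 9

    # Comparative resources (assumed example: one device required only by instance 0)
    cor2n = [[1, 0, 1]]               # provided flags per node (Equation 10)
    cor2c = [[1, 0, 0, 0]]            # required flags per instance (Equation 11)
    for r in range(len(cor2n)):
        for c in range(alpha):
            if cor2c[r][c] == 1:                          # instance c requires resource r ...
                for n in range(beta):
                    if cor2n[r][n] == 0:                  # ... which node n does not provide
                        solver.add(c2n[c][n] == 0)        # so c may not be assigned to n (Equation 12)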

Handling the replication constraints: The fifth set of constraints ensures the management of component instances associated with replication constraints. As mentioned in Section 3.3.3, assign, implies, collocate, atleast, and distribute are the five different kinds of constraints that must be encoded. Each of these constraints is encoded as follows:

The “assign constraint” is used for component instances corresponding to functionalities associated with a per-node replication constraint. It ensures that a component instance is only ever deployed on a given node. In CHARIOT, an assign constraint is encoded as shown in Equation 13.

Let c be a component instance that should be assigned to a node n.

Enabled(c) ⟹ (c2n_cn == 1)    (13)

The Enabled(c) function returns true if component instance c is assigned to any node, i.e., it checks whether ∑_{n=0}^{β} c2n_cn == 1.

The “implies” constraint is used to ensure that if a component depends upon other components, then its dependencies are satisfied. It is encoded using the implies construct provided by an SMT solver like Z3.

A “collocate” constraint is used to ensure that two collocated component instances are always deployed on the same node. In CHARIOT this constraint is encoded by ensuring that the assignment of the two component instances is the same for all nodes, as shown in Equation 14.

Let c1 and c2 be two component instances that need to be collocated.

(Enabled(c1) ∧ Enabled(c2)) ⟹ (∀n : c2n_{c1 n} == c2n_{c2 n})    (14)
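Reusing c2n, solver, and beta from the earlier sketch, the Enabled helper and Equations 13 and 14 might be encoded as follows (the instance and node indices are assumptions):

    from z3 import Sum, Implies, And

    def enabled(c):
        # Enabled(c): component instance c is assigned to some node
        return Sum(c2n[c]) == 1

    # Assign constraint (Equation 13): pin instance 2 to node 1 (assumed indices)
    solver.add(Implies(enabled(2), c2n[2][1] == 1))

    # Collocate constraint (Equation 14): instances 0 and 1 must share a node (assumed indices)
    solver.add(Implies(And(enabled(0), enabled(1)),
                       And([c2n[0][n] == c2n[1][n] for n in range(beta)])))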

An “atleast” constraint encodes M-out-of-N semantics to ensure that, given a set of N components, at least M of them are always deployed. CHARIOT uses this constraint for range-based replication constraints only, and its implementation is twofold. First, during the initial deployment CHARIOT tries to maximize M and deploy as many component instances as possible. The current implementation of CHARIOT uses the maximum value associated with a range and initially deploys all N component instances, as shown in Equation 15; this of course assumes the availability of enough resources. A better solution would be to use the maximize optimization, as shown in Equation 16. However, in the Z3 solver, which is the SMT solver used by CHARIOT, this optimization is experimental and does not scale well. Second, for subsequent (non-initial) deployments CHARIOT relies on the fact that the maximum possible deployment was achieved during initial deployment, so it only ensures that the minimum number required is always met, as shown in Equation 17.

Let S = {c1, c2, ..., c_α′} be the set of replica component instances associated with an atleast constraint; N is the size of this set. Also, let min_value be the minimum number of component instances required; this is synonymous with M.

∑_{c∈S} ∑_{n=0}^{β} c2n_cn == max_value    (15)

maximize( ∑_{c∈S} ∑_{n=0}^{β} c2n_cn )    (16)

∑_{c∈S} ∑_{n=0}^{β} c2n_cn ≥ min_value    (17)

A “distribute” constraint is used to ensure that a set of components is deployed on different nodes. In CHARIOT this constraint is encoded by ensuring that at most one component instance out of the set is deployed on a single node, as shown in Equation 18.

Let S = {c1, c2, ..., c_α′} be a set of components that needs to be distributed.

∀n : ∑_{c∈S} c2n_cn ≤ 1    (18)
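Continuing the same sketch, one possible encoding of the atleast and distribute constraints (with an assumed replica set and bounds) is shown below:

    from z3 import Sum

    replicas = [2, 3]                 # assumed replica set S for an atleast constraint
    min_value, max_value = 1, 2       # assumed range-based replication bounds

    total = Sum([Sum(c2n[c]) for c in replicas])
    # Initial deployment would assert total == max_value (Equation 15) or, ideally,
    # maximize(total) via z3.Optimize (Equation 16); here we keep the reconfiguration
    # form of Equation 17 so the sketch stays satisfiable once failures are added below.
    solver.add(total >= min_value)                        # Equation 17

    # Distribute constraint (Equation 18): at most one replica per node
    for n in range(beta):
        solver.add(Sum([c2n[c][n] for c in replicas]) <= 1)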

Capturing failures as constraints: The final step (step 8) of the second phase of the CPC algorithm encodes and adds failure constraints. Depending on the type of failure, there can be different types of failure constraints. This sixth set of constraints handles failure representation, which is encoded in CHARIOT as shown below:

A “node failure” constraint is used to ensure that no components are deployed on a failed node. CHARIOT encodes this constraint as shown in Equation 19.

Let n_f be a failed node.

∑_{c=0}^{α} c2n_{c n_f} == 0    (19)

Since components can fail for various reasons, there are different ways to resolve a component failure. One approach is to ensure that a component is redeployed on any node other than the node on which it failed (Equation 20). If a component keeps failing on multiple different nodes, then CHARIOT may need to consider another constraint to ensure the component is not redeployed on any node (Equation 21).

Let us assume component c1 failed on node n1.

c2n_{c1 n1} == 0    (20)

∑_{n=0}^{β} c2n_{c1 n} == 0    (21)
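These failure constraints extend the same sketch straightforwardly; the failed node and instance indices are assumptions:

    from z3 import Sum

    failed_node = 2                   # assumed index of a failed node
    solver.add(Sum([c2n[c][failed_node] for c in range(alpha)]) == 0)   # Equation 19

    failed_instance = 3               # assume instance 3 previously failed on node 0
    solver.add(c2n[failed_instance][0] == 0)          # Equation 20: avoid the node it failed on
    # solver.add(Sum(c2n[failed_instance]) == 0)      # Equation 21: never redeploy a flaky instance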

3.3.5 Solution Computation Phase

The third and final phase of the CPC algorithm involves computing a “least distance” configuration point, i.e., a configuration point that is the least distance away from the current configuration point. This computation ensures that a system always undergoes the least possible number of changes during reconfiguration. The distance is computed as the number of changes required to transition to the new configuration point. Since a configuration point is a component-instance-to-node mapping represented as a C2N matrix (see Equation 1), the distance between two configuration points is the distance between their corresponding C2N matrices. In CHARIOT, the least distance constraint is encoded as shown below:

Least Distance Constraint: The “least distance” constraint is used to ensure that we find a valid configuration point that is closest to the current configuration point. The distance between two valid configuration points A and B is computed as shown in Equation 22: it is the sum of the absolute differences between corresponding elements of the two C2N matrices.

To obtain a least distance configuration point, an ideal solution would be to use the minimize optimization (Equation 23), which is supported by SMT solvers like Z3. Like the Z3 maximize optimization, however, the Z3 minimize optimization is experimental and does not scale well. In CHARIOT we therefore implement this constraint using iterative logic: upon every successful solution computation, the distance constraint (Equation 22) is added before invoking the solver again to find a solution that is at a lesser distance than the previous solution. This iteration stops when no solution can be found, in which case the previous solution is used as the optimal (least distance) solution.

config_distance = ∑_{c=0}^{α} ∑_{n=0}^{β} |c2n^A_cn − c2n^B_cn|    (22)

minimize(config_distance)    (23)
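The iterative search described above can be sketched as follows, continuing with the same solver and c2n variables; the current configuration point is assumed example data:

    from z3 import Sum, If, sat

    # Assumed current configuration point (same alpha x beta shape as c2n)
    current = [[1, 0, 0],
               [1, 0, 0],
               [0, 1, 0],
               [0, 0, 1]]

    # Symbolic form of Equation 22 (|a - b| written with If, since a, b are 0/1)
    dist_expr = Sum([If(c2n[c][n] >= current[c][n],
                        c2n[c][n] - current[c][n],
                        current[c][n] - c2n[c][n])
                     for c in range(alpha) for n in range(beta)])

    best = None
    while solver.check() == sat:
        best = solver.model()
        d = sum(abs(best.eval(c2n[c][n], model_completion=True).as_long() - current[c][n])
                for c in range(alpha) for n in range(beta))
        solver.add(dist_expr < d)   # demand a strictly closer configuration point next time
    print(best)                     # the least-distance configuration point that was found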

At this point in the CPC algorithm, CHARIOT invokes the Z3 solver to check for a solution. If all constraints are satisfied and a solution is found, the CPC algorithm computes a set of deployment actions. CHARIOT computes deployment actions by comparing each element of the C2N matrix that represents the current configuration point with the corresponding element of the C2N matrix associated with the computed solution, i.e., the target configuration point. If the value of an element is 0 in the former and 1 in the latter, CHARIOT adds a START action for the corresponding component instance on the corresponding node. Conversely, if the value of an element is 1 in the former and 0 in the latter, CHARIOT adds a STOP action. Applying this operation to each element of the matrix results in the complete set of deployment actions required for a successful system transition.
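A minimal, standalone sketch of this action derivation (the matrices are invented examples):

    def deployment_actions(current, target):
        """Compare two C2N matrices and emit START/STOP actions."""
        actions = []
        for c, (cur_row, tgt_row) in enumerate(zip(current, target)):
            for n, (cur, tgt) in enumerate(zip(cur_row, tgt_row)):
                if cur == 0 and tgt == 1:
                    actions.append(("START", c, n))   # instance c starts on node n
                elif cur == 1 and tgt == 0:
                    actions.append(("STOP", c, n))    # instance c stops on node n
        return actions

    # Example: instance 1 moves from node 0 to node 1
    print(deployment_actions([[1, 0], [1, 0]],
                             [[1, 0], [0, 1]]))
    # [('STOP', 1, 0), ('START', 1, 1)]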

3.3.6 The Look-ahead Reconfiguration

The CPC algorithm presented above yields a reactive self-reconfiguration approach, since the algorithm executes after a failure is detected. As such, runtime reconfiguration incurs the time taken to compute a new configuration point and determine the deployment actions required to transition to the new configuration. This approach may be acceptable for IoT systems consisting of non-real-time applications that can incur considerable downtime. For IoT systems involving real-time mission-critical applications, however, predictable and timely reconfiguration is essential. Since all dynamic reconfiguration mechanisms rely on runtime computation to calculate a reconfiguration solution, the time to compute a solution increases with the scale of the IoT system. The CPC algorithm is no different, as shown by the experimental results in our prior work [26].

To address this issue, we extend the CPC algorithm with a configurable capability to use a finite-horizon look-ahead strategy that pre-computes solutions and thus significantly improves the performance of the management engine. We call this capability Look-ahead Re-Configuration (LaRC). The general goal of the LaRC approach is to pre-compute and store solutions so that, when required, CHARIOT only needs to find the appropriate solution and apply it. When the CPC algorithm is configured to execute in the “look-ahead” mode, solutions are pre-computed every time the system state (i.e., the current configuration point) changes.

The first pre-computation happens once the system is initially deployed using the default CPC algorithm. After a system is initially deployed, CHARIOT pre-computes solutions to handle failure events. These pre-computed solutions cannot be used for update events, since such events change the system in a way that renders previously pre-computed solutions invalid. Once CHARIOT has a set of pre-computed solutions, therefore, failures are handled by finding the appropriate pre-computed solution, applying it, and pre-computing solutions to handle future failure events. For update events, in contrast, the default CPC algorithm is invoked again (as during initial deployment) to compute a solution. After a solution for an update event is computed, CHARIOT again pre-computes solutions to handle failure events.

To pre-compute solutions, CHARIOT currently uses Algorithm 2. Since this paper focuses on node failures, Algorithm 2 only pre-computes solutions for node failures. Assuming that a system is in a stable state, this algorithm first removes any existing look-ahead solutions (line 1), since they are either invalid (update event) or already used (failure event). The algorithm then iterates through the available nodes (lines 2-3) and, for each node, creates a temporary copy of the configuration space (line 4), which includes the current (stable) configuration point. All subsequent actions are taken with respect to this temporary copy, so the original configuration space is not corrupted during the pre-computation process.

Algorithm 2 Solution Pre-computation
Input: nodes (node list)
 1: remove existing look-ahead information from the configuration space
 2: for node in node list do
 3:   if node is alive then
 4:     tmp config space = get configuration space
 5:     mark node as failed in tmp config space
 6:     actions = CPC algorithm on tmp config space
 7:     if actions != null then
 8:       l ahead = new LookAhead instance
 9:       l ahead.failedEntity = node.name
10:       l ahead.failureKind = NODE
11:       l ahead.deploymentActions = actions
12:       store l ahead in the configuration space

After a copy of the configuration space is made, the particular node is marked as failed (line 5) and the CPC algorithm is invoked (line 6). This pre-computation algorithm thus essentially injects a failure and asks the CPC algorithm for a solution. If a solution is found, the injected failure information and the solution are stored as an instance of the LookAhead class presented in Section 3.2.2 (lines 7-12).

3.3.7 Summary of the Management Layer

The description of the LaRC approach in Section 3.3.6 yields interesting observations regarding the pre-computation algorithm. First, the current version of the pre-computation algorithm only considers node failures. We will address this limitation in future work by adding system-wide capabilities to monitor, detect, and handle failures involving application processes, components, and network elements.

Second, the pre-computation algorithm only pre-computes solutions for the next step, i.e., the algorithm only looks one step ahead. We believe the number of steps to look ahead should be a configurable parameter, as different classes of systems might benefit from different settings of this parameter. For example, consider highly dynamic IoT systems that are subject to frequent failures resulting in bursts of failure events. For such systems it is important to look ahead more than one step at a time, otherwise multiple failures that happen in a short timespan cannot be handled. For IoT systems that are comparatively more static, however, such as the smart parking system presented in Section 2.1, a higher Mean Time To Failure (MTTF) is expected, so pre-computed solutions need not look ahead more than one step at a time.

There is clearly a trade-off between time, space, and the number of failures tolerated when considering the number of pre-computation steps. Multi-step pre-computation takes more time and requires more space to store the large number of solutions corresponding to the various permutations and combinations of possible failures, but it can handle bursts of failures. Conversely, single-step pre-computation is much faster and occupies less space, but it cannot easily handle bursts of failures.

An ideal solution would involve a dynamic solution pre-computation algorithm, where the dynamism is with respect to the configuration of the pre-computation steps parameter. For any given system, we assume there is an initial value that can change at runtime depending on the system behavior. Further investigating and implementing such a solution is part of our future work.

4. IMPLEMENTATION AND EVALUATION OF CHARIOT

This section describes and empirically evaluates the CHARIOT runtime implementation using the Smart Parking System use-case scenario presented in Section 2.1. Figure 14 depicts CHARIOT's runtime implementation architecture, which consists of compute nodes comprising the layered stack shown in Figure 2.

    Figure 14: The CHARIOT Runtime Implementation Archi-tecture.

Each CHARIOT-enabled compute node hosts two platform services: a Node Monitor and a Deployment Manager. The Node Monitor assesses the liveliness of its node, whereas the Deployment Manager manages the lifecycle of applications deployed on that node. In addition to compute nodes, CHARIOT's runtime also comprises one or more instances of three different types of server nodes: (1) Database Servers that store system information, (2) Management Engines that facilitate failure avoidance, failure management, and operations management, and (3) Monitoring Servers that monitor for failures.4

CHARIOT's Node Monitor is implemented as a ZooKeeper [14] client that registers itself with a Monitoring Server. In turn, the Monitoring Server is implemented as a ZooKeeper server and uses ZooKeeper's group membership functionality to detect member (node) additions and removals (i.e., failure detection). This design supports dynamic resources, i.e., nodes that can join or leave a cluster at any time. A group of Node Monitors (each residing on a node of a cluster) and one or more instances of Monitoring Servers define the monitoring infrastructure described in Section 3.

The Deployment Manager is implemented as a ZeroMQ [13] subscriber that receives management commands from a Management Engine, which is in turn implemented as a ZeroMQ publisher. The Management Engine computes the initial configuration point for application deployment, as well as subsequent configuration points for the system to recover from failures. After a Deployment Manager receives management commands from the Management Engine, it executes those commands locally to control the lifecycle of application components.

4 Since failure detection and diagnosis are not the primary focus of this paper, our current implementation focuses on resolving node failures, though CHARIOT can easily be extended to support mechanisms that detect component, process, and network failures.


Application components managed by CHARIOT can be in one of two states: active or inactive. A group of Deployment Managers, each residing on a node of a cluster, represents the deployment infrastructure described in Section 3.
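The command channel between the Management Engine and a Deployment Manager can be sketched with pyzmq as below; the port, single-process layout, and message format are assumptions for illustration, not CHARIOT's actual wire protocol:

    import time
    import zmq

    ctx = zmq.Context()

    # Management Engine side: publishes deployment commands (port is an assumption)
    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://127.0.0.1:5556")

    # Deployment Manager side: subscribes to all commands from the engine
    sub = ctx.socket(zmq.SUB)
    sub.connect("tcp://127.0.0.1:5556")
    sub.setsockopt_string(zmq.SUBSCRIBE, "")

    time.sleep(0.5)  # let the PUB/SUB pair connect (avoids the slow-joiner race)
    pub.send_json({"action": "START", "component": "parking_manager", "node": "edison-3"})
    print(sub.recv_json())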

A Database Server is an instance of a MongoDB server. For the experiments presented in Section 4.4 we only consider compute node failures, so deploying single instances of the Monitoring Server, Database Server, and Management Engine fulfills our needs. To avoid single points of failure, however, CHARIOT can deploy each of these servers in a replicated configuration. For Monitoring Servers and Database Servers, replication is supported by existing ZooKeeper and MongoDB mechanisms. Likewise, replication is trivial for Management Engines since they are stateless. A Management Engine executes the CPC algorithm (see Section 3.3.2), with or without the LaRC configuration (see Section 3.3.6), using relevant information from a Database Server. CHARIOT can therefore run multiple replicas of the Management Engine, but only one performs reconfiguration; this constraint is enforced by a rank-based leader election among the Management Engines. Since a Management Engine implements a ZeroMQ publisher and ZeroMQ does not provide a service discovery capability by default, CHARIOT needs a mechanism to handle publisher discovery when a Management Engine fails. This capability is achieved by using ZooKeeper as a coordination service for ZeroMQ publishers and subscribers.

4.1 Application Deployment Mechanism

For initial application deployment, CHARIOT ML (see Section 3.1) is used to model the corresponding system, which comprises the application as well as the resources on which the application will be deployed. This design-time model is then interpreted to generate a configuration space (see Section 3.3.1) and store it in the Database Server, after which a Management Engine is invoked to initiate the deployment. When the Management Engine is requested to perform initial deployment, it retrieves the configuration space from the Database Server and computes a set of deployment commands. These commands are then stored in the Database Server and sent to the relevant Deployment Managers, which take local actions to achieve a distributed application deployment. After a Deployment Manager executes an action, it updates the configuration space accordingly.

4.2 Failure and Update Detection Mechanism

CHARIOT leverages capabilities provided by ZooKeeper to implement a node failure detection mechanism, which performs the following steps: (1) each compute node runs a Node Monitor after it boots up to ensure that the node registers itself with a Monitoring Server, (2) when a node registers with a Monitoring Server, the latter creates a corresponding ephemeral node,5 and (3) since node membership information is stored as ephemeral nodes in the Monitoring Server, it can detect failures of these nodes.
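As a rough illustration of this mechanism (not CHARIOT's code), the kazoo Python client can register a node as an ephemeral znode and watch the membership group; the znode paths, node name, and ZooKeeper address are assumptions:

    from kazoo.client import KazooClient

    zk = KazooClient(hosts="127.0.0.1:2181")   # Monitoring Server address is an assumption
    zk.start()

    # Node Monitor side: register this node as an ephemeral znode; it disappears
    # automatically if the node's ZooKeeper session dies.
    zk.create("/chariot/members/node-42", b"alive", ephemeral=True, makepath=True)

    # Monitoring Server side: watch the membership group; the callback fires on
    # every join/leave, so a missing child signals a node failure.
    @zk.ChildrenWatch("/chariot/members")
    def on_membership_change(children):
        print("current members:", children)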

4.3 Reconfiguration Mechanism

After a failure is detected, a Monitoring Server notifies the Management Engine, as shown in Figure 14. The figure also shows that the Management Engine then queries the Database Server to obtain the configuration space and reconfigures the system using relevant information from the configuration space and the detected failure.

5 ZooKeeper stores information in a tree-like structure comprising simple nodes, sequential nodes, or ephemeral nodes.

4.4 Experimental Evaluation

Although we have previously used CHARIOT to deploy and manage applications on an embedded system comprising Intel Edison nodes (see chariot.isis.vanderbilt.edu/tutorial.html), this paper uses a cloud-based setup to evaluate CHARIOT at a larger scale. Below we first describe our experimental testbed and then describe the application and set of events used for our evaluation. We next present an evaluation of the default CPC algorithm and then evaluate the CPC algorithm with LaRC. Finally, we present CHARIOT resource consumption metrics.

4.4.1 Hardware and Software Testbed

Our testbed comprises 45 virtual machines (VMs), each with 1 GB RAM, 1 VCPU, and a 10 GB disk, in our private OpenStack cloud. We treat these 45 VMs as embedded compute nodes. In addition, 3 VMs with 2 VCPUs, 4 GB RAM, and a 40 GB disk are used as server nodes to host the Monitoring Server, Database Server, and Management Engine (see Figure 14). All of these VMs ran Ubuntu 14.04 and were placed in the same virtual LAN.

4.4.2 Application and Event Sequence

To evaluate CHARIOT, we use the smart parking system described in Section 2.1. We divide the 45 compute nodes into 21 processing nodes (corresponding to the edison node template in Figure 7), 21 camera nodes (corresponding to the wifi cam node template in Figure 7), and 3 terminal nodes (corresponding to the entry terminal node template in Figure 7). The goal description we used is the same as that shown in Figure 9, except that we increase the replication range of the occupancy detector functionality to a minimum of 7 and a maximum of 10.

To evaluate the default CPC algorithm we use the 34 events presented in Table 1. As shown in the table, the first event is the initial deployment of the smart parking system over 21 nodes (10 processing nodes, 10 camera nodes, and 1 terminal node). This initial deployment results in a total of 23 component instances. After the initial deployment, we introduce 6 node failure events, one at a time. We then update the system by adding 2 terminal nodes, 11 processing nodes, and 11 camera nodes. These nodes are added one at a time, resulting in a total of 45 nodes (including the 6 failed nodes). These updates are examples of intended updates and demonstrate CHARIOT's operations management capabilities. After updating the system, we introduce three more node failures.

4.4.3 Evaluation of the Default CPC Algorithm

Figure 15 presents an evaluation of the default CPC algorithm using the application and event sequence described above. To evaluate the default CPC algorithm we use the total solution computation time, measured in seconds. The total solution computation time can be decomposed into two parts: (1) problem setup time and (2) Z3 solver time. The problem setup time corresponds to the first two phases of the CPC algorithm (see Section 3.3.3 and Section 3.3.4), whereas the Z3 solver time corresponds to the third phase of the CPC algorithm (see Section 3.3.5).



Table 1: Sequence of Events Used for Evaluation of the CPC Algorithm.

Event 1: Initial deployment over 21 nodes (10 processing nodes, 10 camera nodes, and 1 terminal node), resulting in 23 component instances: 10 component instances related to the occupancy detector functionality due to its cluster replication constraint, 10 component instances related to the image capture functionality due to its per-node replication constraint associated with camera nodes (we have 10 camera nodes), one component instance related to the client functionality due to its per-node replication constraint associated with terminal nodes (we have 1 terminal node), and one component instance each related to the load balancer and parking manager functionalities.

Event 2: Failure of a camera node. No reconfiguration is required for this failure, as a camera node hosts only a node-specific component that provides the image capture functionality.

Event 3: Failure of the processing node that hosts one component instance each related to the load balancer and parking manager functionalities. This results in reconfiguration of those two component instances. Furthermore, since the processing node also hosts an instance of the occupancy detector functionality, the number of component instances related to this functionality decreases from 10 to 9. Since 9 is still within the provided redundancy range (min = 7, max = 10), however, this component instance is not reconfigured.

Event 4: Failure of the processing node to which the component instance related to the parking manager functionality was reconfigured as a result of the previous event. This event causes the parking manager component instance to again be reconfigured to a different node. Moreover, the number of component instances related to the occupancy detector functionality decreases to 8, which is still within the provided redundancy range, so no reconfiguration of that component instance is required.

Event 5: Failure of the processing node to which the component instance related to the load balancer functionality was reconfigured as a result of event 3. This event causes the component instance to be reconfigured again to a different node. Also, the number of component instances related to the occupancy detector functionality decreases to 7, which is still within the provided redundancy range, so no reconfiguration is required.

Event 6: Failure of another processing node. This node only hosts a component instance related to the occupancy detector functionality. As a result of this failure, the provided redundancy range associated with the occupancy detector functionality is violated, since the number of corresponding component instances decreases to 6. The component instance is therefore reconfigured to a different node to maintain at least 7 instances of the occupancy detector functionality.

Event 7: Failure of the single available terminal node on which the component instance related to the client functionality was deployed as part of the initial deployment (event 1). This event results in an invalid system state, since there are no other terminal nodes and thus no instances of the client functionality are available.

Events 8-31: Hardware updates associated with the addition of 2 terminal nodes, 11 processing nodes, and 11 camera nodes. Due to the associated per-node replication constraint, addition of a terminal node results in deployment of a component instance associated with the client functionality. Similarly, adding a camera node results in deployment of a component instance associated with the image capture functionality. Adding a processing node does not result in any new deployment, however, since it is not associated with a per-node replication constraint.

Event 32: Failure of a processing node that hosts a component instance related to the occupancy detector functionality. This results in reconfiguration of the component instance to a different node.

Event 33: Failure of another processing node, which hosts no applications. Therefore, no reconfiguration is required.

Event 34: Failure of a camera node. Again, no reconfiguration is required (see event 2 above).

    Figure 15: Default CPC Algorithm Performance. (Please refer to Table 1 for details about each event shown in this graph.)


Figure 15 shows that for the initial deployment and the first 5 failure events, the total solution computation time is similar (average = 48 seconds), because the size of the C2N matrix and the associated constraints created during problem setup are roughly the same. The 6th failure (the 7th event in Figure 15) is associated with the one and only terminal node in the system. The Z3 solver quickly determines that there is no solution, so the Z3 solver time for the 7th event is a minimal 1.74 seconds.

Events 8 through 31 are associated with a system update via the addition of a single node per event. These events show that, in most cases, the total solution computation time increases with each node addition. The problem setup time increases consistently with the number of nodes because the size of the C2N matrix, as well as the number of constraints, grows with the number of nodes. The Z3 solver time also increases with the number of nodes in the system; however, it does not increase as consistently as the problem setup time due to the least distance configuration computation presented in Section 3.3.5. The number of iterations, and therefore the total time, taken by the Z3 solver to find a least distance solution is non-deterministic. If a good solution (with respect to distance) is found in the first iteration, fewer iterations are needed to find the optimal solution. We demonstrate this non-deterministic behavior using experimental results in Section 4.4.
