Following is a description of the SAL model of a representative agent society of the UltraLog system. The continuous planing and execution (CPE) agent society is an example of an agent architecture that shares numerous of similarities with UltraLog , including survivability requirements. The CPE society is, however, much simpler and contains very few agents but shares the same attributes with the UltraLog architecture. Just as in UltraLog, the CPE agents are organized in a hierarchical manner, and distributed among a set of enclaves containing application agents, robustness and security managers, and certificate authorities.
The following figure illustrates a typical layout of agents in the CPE or, for that matter, UltraLog societies. An enclave containing application agents "i"CP is monitored by a security manager and a robustness manager that are agents loaded with the appropriate security and robustness management plug-ins. Every single agents relies on a certificate authority in order to establish its credential as a legitimate agent. A set of certificate authorities is distributed in a hierarchical manner on different enclaves in order to ensure redundancy and availability.
The description of the CPE in SAL is no different from the description of UltraLog. Individual agents, managers, and CAs are described similarly. The only difference resides in the initial configuration of the CPE society. That is, the network topology, the number of nodes, and the distribution of agents, managers, and CAs on the different physical nodes. The following is the description of the initial configuration as it is described in SAL:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % % SRI INTERNATIONAL % System Design Lab % % Robustness Modeling for CPE % 2004. % % % SAL Specification of CPE % http://sal.csl.sri.com % Author: Hassen Saidi % saidi@sdl.sri.com % 333 Ravenswood Ave % Menlo Park, CA 94025 % % % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % % % % (5) % (1) ----------- % -------------------------------------------- | #20 | % | N14 N15 #1 1BDE Rob N1 | | Rob |N16 % | 1Supply 2Supply | #16 | | CA1 #21 |N17 % | #14 #15 | | | CA2 #22 |N18 % -------------------------------------------- | CA3 #23 |N19 % | ----------- % --------------------------------------------------- % (2) | (3) | (4) | % |--------------------| |--------------------| |--------------------| % |N2 #2 | #17 | |N3 #3 | #18 | |N4 #4 | #19 | % | 1BN Rob | | 2BN Rob | | 3BN Rob | % | ------------ | | ------------ | | ------------ | % | | | | | | | | | | | | | | | % | | | | | | | | | | | | | | | % |N5| N6| N7| | | N8| N9| N10| | |N11 | N12| N13| | % | 1CP 2CP 3CP | | 1CP 2CP 3CP | | 1CP 2CP 3CP | % | #5 #6 #7 | | #8 #9 #10 | | #11 #12 #13 | % |____________________| |____________________| |____________________| % % % % % The model is an abstraction of the current CPE % implementation in which timing and network information are % abstracted away. % The Model is parameterized by the number of agents, robustness % managers, security managers, services, and nodes. % cpe : CONTEXT = BEGIN %number of enclaves Enclaves : NATURAL = 5 ; % number of nodes Nodes : NATURAL = 19 ; % number of agents AppAgents: NATURAL = 15 ; % number of Robustness managers Robustness_Managers: NATURAL = 5 ; % number of agents that need certificates Agents: NATURAL = AppAgents + Robustness_Managers %=20 ; % number of Certificate Authorities Certificate_Authority: NATURAL = 3 ; % Total number of agents including CAs Items: NATURAL = Agents + Certificate_Authority %=23 ;
After defining the number of enclaves, nodes, agents, managers, and CAs, it is necessary to define the associate nodes to enclaves, agents to managers, and CAs to each agent.
% % definition of the robustness community % % % Each node belongs to an enclave % enclave: ARRAY [1..Nodes] OF [1..Enclaves] = [[i: [1..Nodes]] IF i<=4 THEN i ELSE IF i>=5 AND i<= 7 THEN 2 ELSE IF i>=8 AND i<= 10 THEN 3 ELSE IF i>=11 AND i<= 13 THEN 4 ELSE IF i>=14 AND i<= 15 THEN 1 ELSE 5 ENDIF ENDIF ENDIF ENDIF ENDIF ] ; % % Each agent is associated to a manager % manager: ARRAY [1..AppAgents] OF [1..Robustness_Managers] = [[i: [1..AppAgents]] IF i<=4 THEN i ELSE IF i<=5 AND i<= 7 THEN 2 ELSE IF i>=8 AND i<= 10 THEN 3 ELSE IF i>=11 AND i<= 13 THEN 4 ELSE 1 ENDIF ENDIF ENDIF ENDIF ] ; % % Each agent is associated to a CA % certificate : ARRAY [1..Items] OF [1..Certificate_Authority] = [[i: [1..Items]] IF i=1 THEN 1 ELSE IF i>=2 AND i<= 4 THEN 2 ELSE IF i>=5 AND i<= 13 THEN 3 ELSE IF i>=14 AND i<= 16 THEN 1 ELSE IF i>=17 AND i<= 19 THEN 2 ELSE 1 ENDIF ENDIF ENDIF ENDIF ENDIF ] ; init_location :LocationType = [[i: [1..Items]] IF i<=15 THEN i ELSE IF i>=16 AND i<= 19 THEN i-15 ELSE IF i>=21 AND i<= 24 THEN i-20 ELSE i-4 ENDIF ENDIF ENDIF ] ;
[ % % After initialization the agent is alive and sending heartbeats % to its manager % state_agent[idAgent] = a_init AND state_node[Location[idAgent]] = up --> blackboard'[idAgent] = persisted_blackboard[idAgent]; state_agent'[idAgent] = a_alive % % Manipulating Blackboard % % [] ([] (i: [1..Max_BlackBoard]) : state_agent[idAgent] = a_alive AND NOT P(blackboard[idAgent][i]) AND state_node[Location[idAgent]] = up --> state_agent'[idAgent] = a_busy; LocalTask' = i ) % % Tasks that can be done locally are finished with success % as long as the agent is alive % [] state_agent[idAgent] = a_busy AND A(blackboard[idAgent][LocalTask])=idAgent AND state_node[Location[idAgent]] = up --> state_agent'[idAgent] = a_alive; blackboard'[idAgent][LocalTask] = Alloc(T(blackboard[idAgent][LocalTask]), idAgent, O(blackboard[idAgent][LocalTask]), TRUE, TRUE ) % % when a task is allocated to another agent, it is copied % to the agent's blackboard % [] ([] (i: [1..Agents]) : ([] (j: [1..Max_BlackBoard]) : state_agent[idAgent] = a_busy AND A(blackboard[idAgent][LocalTask])/=idAgent AND O(blackboard[idAgent][LocalTask])=idAgent AND S(blackboard[idAgent][LocalTask]) = FALSE AND blackboard[A(blackboard[idAgent][LocalTask])][j] = no_elem AND state_agent[A(blackboard[idAgent][LocalTask])] = a_alive AND state_node[Location[idAgent]] = up --> state_agent'[idAgent] = a_alive; blackboard' = (blackboard WITH [A(blackboard[idAgent][LocalTask])][j] := Alloc(T(blackboard[idAgent][LocalTask]), A(blackboard[idAgent][LocalTask]), idAgent, FALSE, FALSE )) WITH [idAgent][LocalTask] := Alloc(T(blackboard[idAgent][LocalTask]), A(blackboard[idAgent][LocalTask]), idAgent, FALSE, TRUE ) ) ) % % when a task is finished its status is TRUE, it % is removed from the blackboard % [] ([] (i: [1..Max_BlackBoard]) : state_agent[idAgent] = a_alive AND S(blackboard[idAgent][i]) = TRUE AND state_node[Location[idAgent]] = up --> LocalTask' = i; state_agent'[idAgent] = IF O(blackboard[idAgent][i]) /= idAgent THEN a_update ELSE a_alive ENDIF; LocalUpdate' = blackboard[idAgent][i]; blackboard'[idAgent][i] = no_elem ) % % when a task is finished its status is TRUE and it % originated from another agent, the other agent is informed % [] ([] (j: [1..Max_BlackBoard]) : state_agent[idAgent] = a_update AND O(blackboard[O(LocalUpdate)][j])=O(LocalUpdate) AND state_node[Location[idAgent]] = up --> state_agent'[idAgent] = a_alive; blackboard'[O(LocalUpdate)][j]=LocalUpdate ) % % Persist data when blackboard content changes % [] state_agent[idAgent] = a_alive AND blackboard[idAgent] /= persisted_blackboard[idAgent] AND state_node[Location[idAgent]] = up --> persisted_blackboard'[idAgent] = blackboard[idAgent] % % Success means that all blackboards are finished. % [] state_agent[idAgent] = a_alive AND FORALL (i: [1..Max_BlackBoard]) : blackboard[idAgent][i] = no_elem AND idAgent = MasterAgent AND state_node[Location[idAgent]] = up --> Success' = TRUE ] END;
[ % After initialization, the manager requests a certificate state_manager[idManager] = m_init AND state_ca[certificate[rob2agent(idManager)]] = ca_alive --> state_manager'[idManager] = m_alive % % when an agent is not sending heartbeats or is late, the % manager decides to relocate the agent % ONLY AppAgents % [] state_manager[idManager] = m_alive --> state_manager'[idManager] = m_pre_relocate ; relocated_agent' IN {j:T_AppAgents | manager[j]=idManager AND state_agent[j] = a_dead} [] state_manager[idManager] = m_pre_relocate --> state_manager'[idManager] = m_relocate ; old_node' = Location[relocated_agent] ; new_node' IN { j: T_Nodes | state_node[j]=up AND IF Location[relocated_agent] = Location[rob2agent(idManager)] THEN j=Location[relocated_agent] ELSE enclave[j]=Location[relocated_agent] AND (FORALL (y:T_Nodes): ( enclave[y]/= enclave[Location[relocated_agent]] OR Nodes_Load[j] <= Nodes_Load[y]) ) ENDIF } % % all agents in the same node are moved % to the new node with fewer agents % and dont care about agents that are not in the same node % [] state_manager[idManager] = m_relocate AND Nodes_Load[old_node] /= 0 --> state_manager'[idManager] = m_select_relocate; relocated_agent' IN {i:T_AppAgents | Location[i] = old_node} [] state_manager[idManager] = m_select_relocate --> state_manager'[idManager] = m_relocate; Location'[relocated_agent] = new_node; state_agent'[relocated_agent] = a_init; Nodes_Load' = ( Nodes_Load WITH [new_node] := Nodes_Load[new_node] + 1 ) WITH [old_node] := Nodes_Load[old_node] -1 ; % % when all agents in the same node have been relocated % go back to alive. % [] state_manager[idManager] = m_relocate AND Nodes_Load[old_node] = 0 --> state_manager'[idManager] = m_alive ]
[ state_ca[idCA] = ca_init AND state_ca[certificate[ca2item(idCA)]] = ca_alive --> state_ca'[idCA] = ca_alive ]
[ % kill agent ([] (i:T_AppAgents) : state_agent[i] /= a_dead AND counter_agent_nodes < MaxThreat --> state_agent'[i] = a_dead; counter_agent_nodes' = counter_agent_nodes + 1 ) % kill node [] ([] (i:T_Nodes) : ([] (j:T_Agents) : state_node[i] = up AND counter_agent_nodes < MaxThreat --> state_node'[i] = down; counter_agent_nodes' = counter_agent_nodes + 1; state_agent' IN {Z : AgentStateType | FORALL (k:[1..AppAgents]) : IF Location[k]= i THEN Z[k] = a_dead ELSE Z[k] = state_agent[k] ENDIF } ; state_manager' IN {Z : R_ManagerStateType | FORALL (k:[1..Robustness_Managers]) : IF Location[rob2agent(k)]= i THEN Z[k] = m_dead ELSE Z[k] = state_manager[k] ENDIF } ; state_ca' IN {Z : CA_StateType | FORALL (k:[1..Certificate_Authority]) : IF Location[ca2item(k)]= i THEN Z[k] = ca_dead ELSE Z[k] = state_ca[k] ENDIF } ; )) % kill robustness manager [] ([] (i: [1..Robustness_Managers]) : state_manager'[i] /= m_dead AND counter_managers_certificate < MaxThreat --> counter_managers_certificate' = counter_managers_certificate + 1; state_manager'[i] = m_dead ) % % kill certificate authority % In this case, all agents that depend on this CA % have their certificates revoked % [] ([] (i: [1..Certificate_Authority]) : state_ca[i] /= ca_dead AND counter_managers_certificate < MaxThreat --> counter_managers_certificate' = counter_managers_certificate + 1; state_ca' IN { Z: ARRAY [1..Certificate_Authority] OF CA_State | FORALL (k:[1..Certificate_Authority]) : IF k=i OR certificate[ca2item(k)]=i THEN Z[k]=ca_dead ELSE Z[k]=state_ca[k] ENDIF } ; state_agent' IN { Z: ARRAY [1..AppAgents] OF AgentState | FORALL (k:[1..AppAgents]) : IF certificate[k]=i THEN Z[k]=a_dead ELSE Z[k]=state_agent[k] ENDIF } ; state_manager' IN { Z: ARRAY [1..Robustness_Managers] OF R_ManagerState | FORALL (k:[1..Robustness_Managers]) : IF certificate[rob2agent(k)]=i THEN Z[k]=m_dead ELSE Z[k]=state_manager[k] ENDIF } ) ]
society : MODULE = ([] (i:T_AppAgents) : M_Agent[i] ) [] ([] (j:T_RManagers) : M_RManager[j] ) [] ([] (l:T_CA) : M_CAuthority[l] ) [] M_Threat ;
Dependency graphs are built for individual atomic predicates. Following are the dependency graphs of some of the predicates appearing in various properties about agents, managers, nodes, and certificate authorities. For each graph, the square shaped node contains the predicate for which the dependency graph is built. Dependencies can be computed in a modular way for each predicate in the property. Notice that not all the dependencies have to be computed. For instance, when computing the dependencies of the predicate "ca_alive(i)" that indicates that the certificate authority "i" is alive, we reach the predicate "ca_alive(ca(i))" which indicates that the certificate authority of the certificate authority "i" is also alive. Therefore, we conclude that two certificate authorities can be permuted if they have the same certificate authority, or certificate authorities that them self have the same certificate authority and so on. Therefore, it is not necessary to compute the dependencies of the predicate "ca_alive(ca(i))".
This property indicates that a failing node will always recover and be back up again. Since permuting the predicates "node(i)=up" and "node(j)=up" for any "i" and "j" leaves the property unchanged, we track down the dependencies of those predicates. We find that the predicate "node(i)=up" depends on the predicate "node(i)=down". Therefore we can permute "node(i)=up" and "node(j)=up" if we can permute the predicates "node(i)=down" and "node(j)=down". We compute then the dependencies of the predicate "node(i)=down" and we determined that it depends on the predicate "node(i)=up". Therefor, for the property above we can compose the two permutations ("node(i)=down" , "node(j)=down") and ("node(i)=up" , "node(j)=up" ).
• Consider the property: for all certificate authority "i", always eventually "ca_alive(i)"This property indicates that a failing certificate authority will always recover and be back up again. Since permuting the predicates "ca_alive(i)" and "ca_alive(i)" for any "i" and "j" leaves the property unchanged, we track down the dependencies of those predicates. We find that the predicate "ca_alive(i)" depends on the predicates "ca_alive(ca(i))" and "ca_init(i)". Therefore we can permute "ca_alive(i)" and "ca_alive(j)" if we can permute the predicates "ca_init(i)" and "ca_init(j)" and permute the predicates "ca_alive(ca(i))" and "ca_alive(ca(j))". We compute then the dependencies of the predicate "ca_init(i)". We have three certificate authorities CA1, CA2, and CA3. CA1 is the certificate authority of CA2 and CA3. Since the dependencies of the predicates "ca_init(i)" and "ca_init(j)" can be permuted and that "ca_alive(ca(2))" and "ca_alive(ca(3))" are equal, then we can compose the following permutations ("ca_alive(2)","ca_alive(3)"), ("ca_init(2)","ca_init(3)"), and ("ca_dead(2)","ca_dead(3)")
The total number of state for the above CPE description is a little more than 66 Million states.