Discovering Symmetries

This page meant to provide some of the information related to the experiments with discovering symmetries and the application of these techniques to the UltraLog system. The technical report attached to it can be downloaded here. I will add more details as time permits.

Symmetry Reduction : Application to the UltraLog System

Following is a description of the SAL model of a representative agent society of the UltraLog system. The continuous planing and execution (CPE) agent society is an example of an agent architecture that shares numerous of similarities with UltraLog , including survivability requirements. The CPE society is, however, much simpler and contains very few agents but shares the same attributes with the UltraLog architecture. Just as in UltraLog, the CPE agents are organized in a hierarchical manner, and distributed among a set of enclaves containing application agents, robustness and security managers, and certificate authorities.

For more information on SAL, please visit the SAL web page.

• CPE Agent Society
• CPE Model in SAL
• Agent Module
• Manager Module
• Certificate Authority Module
• Threat Module
• Society Module
• Dependency Graphs
• Discovering Symmetries
• State Space Reduction

CPE Agent Society:

The following figure illustrates a typical layout of agents in the CPE or, for that matter, UltraLog societies. An enclave containing application agents "i"CP is monitored by a security manager and a robustness manager that are agents loaded with the appropriate security and robustness management plug-ins. Every single agents relies on a certificate authority in order to establish its credential as a legitimate agent. A set of certificate authorities is distributed in a hierarchical manner on different enclaves in order to ensure redundancy and availability.

CPE Model in SAL:

The description of the CPE in SAL is no different from the description of UltraLog. Individual agents, managers, and CAs are described similarly. The only difference resides in the initial configuration of the CPE society. That is, the network topology, the number of nodes, and the distribution of agents, managers, and CAs on the different physical nodes. The following is the description of the initial configuration as it is described in SAL:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   
%  SRI INTERNATIONAL
%  System Design Lab
%    
%  Robustness Modeling for CPE
%  2004.
%  
%  
%  SAL Specification of CPE
%  http://sal.csl.sri.com
%  Author: Hassen Saidi
%          saidi@sdl.sri.com
%          333 Ravenswood Ave
%          Menlo Park, CA 94025
%
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
%                                                                  
%                                                                  (5)
%                             (1)                               -----------
%              --------------------------------------------     | #20     |
%              |   N14      N15 #1  1BDE   Rob  N1        |     | Rob     |N16
%              | 1Supply   2Supply   |     #16            |     | CA1 #21 |N17
%              |   #14      #15      |                    |     | CA2 #22 |N18
%              --------------------------------------------     | CA3 #23 |N19
%                                    |                          ----------- 
%           ---------------------------------------------------
%     (2)   |                    (3) |                  (4)   |
%  |--------------------|  |--------------------|   |--------------------|
%  |N2  #2  | #17       |  |N3  #3   | #18      |   |N4  #4   | #19      |
%  |    1BN  Rob        |  |    2BN  Rob        |   |    3BN  Rob        |
%  |  ------------      |  |    ------------    |   |    ------------    |
%  |  |     |    |      |  |    |     |    |    |   |    |     |    |    |
%  |  |     |    |      |  |    |     |    |    |   |    |     |    |    |
%  |N5|   N6|  N7|      |  |  N8|   N9| N10|    |   |N11 |  N12| N13|    |
%  | 1CP   2CP  3CP     |  |   1CP   2CP  3CP   |   |   1CP   2CP  3CP   |
%  | #5    #6    #7     |  |    #8   #9    #10  |   |   #11    #12   #13 |
%  |____________________|  |____________________|   |____________________|
%
%
%
%
% The model is an abstraction of the current CPE
% implementation in which timing and network information are 
% abstracted away. 
% The Model is parameterized by the number of agents, robustness 
% managers, security managers, services, and nodes.
%

cpe : CONTEXT = 
BEGIN

%number of enclaves
Enclaves : NATURAL = 5
;

% number of nodes
Nodes : NATURAL = 19           
;

% number of agents 
AppAgents: NATURAL = 15
;

% number of Robustness managers
Robustness_Managers: NATURAL = 5
;


% number of agents that need certificates
Agents: NATURAL = AppAgents + Robustness_Managers  %=20
;

% number of Certificate Authorities
Certificate_Authority: NATURAL = 3
;

% Total number of agents including CAs
Items: NATURAL = Agents + Certificate_Authority %=23
;

After defining the number of enclaves, nodes, agents, managers, and CAs, it is necessary to define the associate nodes to enclaves, agents to managers, and CAs to each agent.

%
% definition of the robustness community
%

%
% Each node belongs to an enclave
%
enclave: ARRAY [1..Nodes] OF [1..Enclaves] =
     [[i: [1..Nodes]]
             IF i<=4 THEN i
        ELSE IF i>=5  AND i<= 7  THEN 2
        ELSE IF i>=8  AND i<= 10 THEN 3
        ELSE IF i>=11 AND i<= 13 THEN 4
        ELSE IF i>=14 AND i<= 15 THEN 1
        ELSE 5 
         ENDIF ENDIF ENDIF ENDIF ENDIF    
]
;
%
% Each agent is associated to a manager
%

manager:  ARRAY [1..AppAgents] OF [1..Robustness_Managers]  =
   [[i: [1..AppAgents]]
             IF i<=4 THEN i
        ELSE IF i<=5  AND i<= 7  THEN 2
        ELSE IF i>=8  AND i<= 10 THEN 3
        ELSE IF i>=11 AND i<= 13 THEN 4
        ELSE 1
         ENDIF ENDIF ENDIF ENDIF 

]
;


%
% Each agent is associated to a CA
%

certificate :  ARRAY [1..Items] OF  [1..Certificate_Authority]  =
 [[i: [1..Items]] 
             IF i=1 THEN 1
        ELSE IF i>=2  AND i<= 4  THEN 2
        ELSE IF i>=5  AND i<= 13 THEN 3
        ELSE IF i>=14 AND i<= 16 THEN 1
        ELSE IF i>=17 AND i<= 19 THEN 2
        ELSE 1 
         ENDIF ENDIF ENDIF ENDIF ENDIF 
        
]
;


init_location :LocationType =
    [[i: [1..Items]] 
             IF i<=15 THEN i        
        ELSE IF i>=16 AND i<= 19 THEN i-15
        ELSE IF i>=21 AND i<= 24 THEN i-20
        ELSE i-4
        ENDIF ENDIF ENDIF      
]
;

Agent Module:

[
 
%
% After initialization the agent is alive and sending heartbeats
% to its manager
%
        state_agent[idAgent] = a_init 
              AND state_node[Location[idAgent]] = up
             --> 
                blackboard'[idAgent] = persisted_blackboard[idAgent];
                state_agent'[idAgent] = a_alive


%
% Manipulating Blackboard
% 
%

  [] 
     ([]   (i: [1..Max_BlackBoard]) :
      state_agent[idAgent] = a_alive
       AND NOT P(blackboard[idAgent][i]) 
              AND state_node[Location[idAgent]] = up
           --> 
           state_agent'[idAgent] = a_busy;
           LocalTask' = i             
     )

%
% Tasks that can be done locally are finished with success
% as long as the agent is alive
%

  []
     state_agent[idAgent] = a_busy
      AND A(blackboard[idAgent][LocalTask])=idAgent 
              AND state_node[Location[idAgent]] = up
         -->
          state_agent'[idAgent] = a_alive;
          blackboard'[idAgent][LocalTask] =
              Alloc(T(blackboard[idAgent][LocalTask]),
                    idAgent,
                    O(blackboard[idAgent][LocalTask]),
                    TRUE,
                    TRUE
                    )
  
%
% when a task is allocated to another agent, it is copied
% to the agent's blackboard
%

  []
     ([]   (i: [1..Agents]) :
        ([]   (j: [1..Max_BlackBoard]) :
     state_agent[idAgent] = a_busy
      AND A(blackboard[idAgent][LocalTask])/=idAgent
        AND O(blackboard[idAgent][LocalTask])=idAgent
         AND S(blackboard[idAgent][LocalTask]) = FALSE
          AND 
   blackboard[A(blackboard[idAgent][LocalTask])][j] = no_elem
            AND state_agent[A(blackboard[idAgent][LocalTask])] = a_alive 
              AND state_node[Location[idAgent]] = up
           --> 
           state_agent'[idAgent] = a_alive;
           blackboard' = (blackboard WITH
     [A(blackboard[idAgent][LocalTask])][j] :=
              Alloc(T(blackboard[idAgent][LocalTask]),
                    A(blackboard[idAgent][LocalTask]),
                    idAgent,
                    FALSE,
                    FALSE
                    )) 
                 WITH [idAgent][LocalTask] :=
              Alloc(T(blackboard[idAgent][LocalTask]),
                    A(blackboard[idAgent][LocalTask]),
                    idAgent,
                    FALSE,
                    TRUE
                    )
       )
      )

%
% when a task is finished its status is TRUE, it 
% is removed from the blackboard
%
  []

 ([]   (i: [1..Max_BlackBoard]) :
      state_agent[idAgent] = a_alive
        AND S(blackboard[idAgent][i]) = TRUE 
              AND state_node[Location[idAgent]] = up
           --> 
           LocalTask' = i;
           state_agent'[idAgent] = IF O(blackboard[idAgent][i]) /= idAgent
                                      THEN a_update
                                      ELSE a_alive
                                   ENDIF;
           LocalUpdate' = blackboard[idAgent][i];
           blackboard'[idAgent][i] = no_elem
     )

%
% when a task is finished its status is TRUE and it 
% originated from another agent, the other agent is informed
%

  []
   ([] (j: [1..Max_BlackBoard]) :
      state_agent[idAgent] = a_update
       AND O(blackboard[O(LocalUpdate)][j])=O(LocalUpdate) 
              AND state_node[Location[idAgent]] = up
           --> 
           state_agent'[idAgent] = a_alive;
           blackboard'[O(LocalUpdate)][j]=LocalUpdate
     )


%
% Persist data when blackboard content changes
%

 
   []  state_agent[idAgent] = a_alive
        AND blackboard[idAgent] /= persisted_blackboard[idAgent] 
              AND state_node[Location[idAgent]] = up
           -->
           persisted_blackboard'[idAgent] = blackboard[idAgent]
        

%
% Success means that all blackboards are finished.
% 


   []
       state_agent[idAgent] = a_alive
        AND 
           FORALL (i: [1..Max_BlackBoard]) 
            : blackboard[idAgent][i] = no_elem
          AND idAgent = MasterAgent 
              AND state_node[Location[idAgent]] = up
            --> 
             Success' = TRUE
    
]
END;

Manager Module:

[
 
% After initialization, the manager requests a certificate
 state_manager[idManager] = m_init 
   AND state_ca[certificate[rob2agent(idManager)]] = ca_alive
        -->
   state_manager'[idManager] = m_alive
   

% 
% when an agent is not sending heartbeats or is late, the
% manager decides to relocate the agent
%  ONLY AppAgents
%

  [] state_manager[idManager] = m_alive
     -->
      state_manager'[idManager] = m_pre_relocate ;
      relocated_agent' IN {j:T_AppAgents | manager[j]=idManager
                                            AND state_agent[j] = a_dead}

 []
     state_manager[idManager] = m_pre_relocate  
       -->
         state_manager'[idManager] = m_relocate ;     
         old_node' = Location[relocated_agent] ;
         new_node' IN 
               { j: T_Nodes | state_node[j]=up
                                          AND
				IF Location[relocated_agent]
      					= Location[rob2agent(idManager)] 
                                  THEN j=Location[relocated_agent] 
                                  ELSE
                             enclave[j]=Location[relocated_agent]
                                AND  (FORALL (y:T_Nodes): 
                                      ( enclave[y]/=
                                            enclave[Location[relocated_agent]]
                                               OR
                                           Nodes_Load[j] <= Nodes_Load[y])
                                                )
                                ENDIF
                         }
%
% all agents in the same node are moved
% to the new node with fewer agents
% and dont care about agents that are not in the same node
%
    []
      state_manager[idManager] = m_relocate
       AND Nodes_Load[old_node] /= 0
          -->
            state_manager'[idManager] = m_select_relocate;
            relocated_agent' IN  {i:T_AppAgents 
                                   |  Location[i] = old_node}

    []
        state_manager[idManager] = m_select_relocate
         -->
           state_manager'[idManager] = m_relocate;
           Location'[relocated_agent] = new_node;
           state_agent'[relocated_agent] = a_init;
           Nodes_Load' =
                ( Nodes_Load 
                  WITH 
                  [new_node] := Nodes_Load[new_node] + 1
                    )
                  WITH
                  [old_node] := Nodes_Load[old_node] -1 ;
               
  

%
% when all agents in the same node have been relocated
% go back to alive.
%

 [] 
    state_manager[idManager] = m_relocate
     AND Nodes_Load[old_node] = 0
       --> state_manager'[idManager] = m_alive

]

Certificate Authority Module:

[

state_ca[idCA] = ca_init
  AND state_ca[certificate[ca2item(idCA)]] = ca_alive
 --> state_ca'[idCA] = ca_alive
  

]

Threat Module:

[

% kill agent
 
    ([] (i:T_AppAgents) :
          state_agent[i] /= a_dead AND counter_agent_nodes < MaxThreat
            -->
              state_agent'[i] = a_dead;
              counter_agent_nodes' = counter_agent_nodes + 1
     )

% kill node
  []
    ([] (i:T_Nodes) : 
            ([] (j:T_Agents) : 
          state_node[i] = up AND counter_agent_nodes < MaxThreat
            -->
              state_node'[i] = down;
              counter_agent_nodes' = counter_agent_nodes + 1;
              state_agent' IN {Z : AgentStateType | 
                      FORALL (k:[1..AppAgents]) :
                      IF Location[k]= i
                        THEN Z[k] = a_dead
                        ELSE Z[k] = state_agent[k]
                      ENDIF
                 } ;
   state_manager' IN {Z : R_ManagerStateType | 
                      FORALL (k:[1..Robustness_Managers]) :
                      IF  Location[rob2agent(k)]= i
                        THEN Z[k] = m_dead
                        ELSE Z[k] = state_manager[k]
                      ENDIF
                 } ;
   state_ca' IN {Z : CA_StateType | 
                      FORALL (k:[1..Certificate_Authority]) :
                      IF  Location[ca2item(k)]= i
                        THEN Z[k] = ca_dead
                        ELSE Z[k] = state_ca[k]
                      ENDIF
                 } ;
    ))


% kill robustness manager
 []
   ([] (i: [1..Robustness_Managers]) : 
             state_manager'[i] /= m_dead
                  AND counter_managers_certificate < MaxThreat
           --> 
             counter_managers_certificate' = counter_managers_certificate + 1;
             state_manager'[i] = m_dead
          
    )



%
% kill certificate authority
% In this case, all agents that depend on this CA 
% have their certificates revoked
%

 []
   ([] (i: [1..Certificate_Authority]) : 
             state_ca[i] /= ca_dead
                  AND counter_managers_certificate < MaxThreat  
           --> 
             counter_managers_certificate' = counter_managers_certificate + 1;
             state_ca' IN 
     { Z:  ARRAY [1..Certificate_Authority] OF CA_State |
                 FORALL (k:[1..Certificate_Authority]) :
                   IF k=i OR certificate[ca2item(k)]=i
                          THEN Z[k]=ca_dead
                          ELSE Z[k]=state_ca[k]
                    ENDIF
                             }  ;

             state_agent' IN 
     { Z:  ARRAY [1..AppAgents] OF AgentState |
                 FORALL (k:[1..AppAgents]) :
                   IF certificate[k]=i 
                          THEN Z[k]=a_dead
                          ELSE Z[k]=state_agent[k]
                      ENDIF
                             }  ;

             state_manager' IN 
     { Z:  ARRAY [1..Robustness_Managers] OF R_ManagerState |
                 FORALL (k:[1..Robustness_Managers]) :
                   IF certificate[rob2agent(k)]=i 
                          THEN Z[k]=m_dead
                          ELSE Z[k]=state_manager[k]
                      ENDIF
                             } 
             
          
     )
]

Society Module:

We consider for this example a simple version of the CPE society composed of all agents, managers, certificate authorities and the threat model.

society : MODULE =        ([] (i:T_AppAgents) : M_Agent[i]      ) 
                       [] ([] (j:T_RManagers) : M_RManager[j]   )
                       [] ([] (l:T_CA)        : M_CAuthority[l] ) 
                       [] M_Threat 
;

Dependency Graphs:

Dependency graphs are built for individual atomic predicates. Following are the dependency graphs of some of the predicates appearing in various properties about agents, managers, nodes, and certificate authorities. For each graph, the square shaped node contains the predicate for which the dependency graph is built. Dependencies can be computed in a modular way for each predicate in the property. Notice that not all the dependencies have to be computed. For instance, when computing the dependencies of the predicate "ca_alive(i)" that indicates that the certificate authority "i" is alive, we reach the predicate "ca_alive(ca(i))" which indicates that the certificate authority of the certificate authority "i" is also alive. Therefore, we conclude that two certificate authorities can be permuted if they have the same certificate authority, or certificate authorities that them self have the same certificate authority and so on. Therefore, it is not necessary to compute the dependencies of the predicate "ca_alive(ca(i))".

Discovering Symmetries:

The following are example of computed permutations that induce a generalized frame automorphism for this model of the CPE. There is a slight change of names from the SAL model to the graph since I needed to draw the graphs by hand from the result of the computation of dependency graphs. u is up, d is down, node(i) is equivalent to state_node(i). ca_alive(i) is equivalent to state_ca(i)=alive. m_alive(i) is equivalent to state_manager(i)= alive, and so on.

• Consider the property: for all node "i", always eventually "node(i)=up"

This property indicates that a failing node will always recover and be back up again. Since permuting the predicates "node(i)=up" and "node(j)=up" for any "i" and "j" leaves the property unchanged, we track down the dependencies of those predicates. We find that the predicate "node(i)=up" depends on the predicate "node(i)=down". Therefore we can permute "node(i)=up" and "node(j)=up" if we can permute the predicates "node(i)=down" and "node(j)=down". We compute then the dependencies of the predicate "node(i)=down" and we determined that it depends on the predicate "node(i)=up". Therefor, for the property above we can compose the two permutations ("node(i)=down" , "node(j)=down") and ("node(i)=up" , "node(j)=up" ).

• Consider the property: for all certificate authority "i", always eventually "ca_alive(i)"

This property indicates that a failing certificate authority will always recover and be back up again. Since permuting the predicates "ca_alive(i)" and "ca_alive(i)" for any "i" and "j" leaves the property unchanged, we track down the dependencies of those predicates. We find that the predicate "ca_alive(i)" depends on the predicates "ca_alive(ca(i))" and "ca_init(i)". Therefore we can permute "ca_alive(i)" and "ca_alive(j)" if we can permute the predicates "ca_init(i)" and "ca_init(j)" and permute the predicates "ca_alive(ca(i))" and "ca_alive(ca(j))". We compute then the dependencies of the predicate "ca_init(i)". We have three certificate authorities CA1, CA2, and CA3. CA1 is the certificate authority of CA2 and CA3. Since the dependencies of the predicates "ca_init(i)" and "ca_init(j)" can be permuted and that "ca_alive(ca(2))" and "ca_alive(ca(3))" are equal, then we can compose the following permutations ("ca_alive(2)","ca_alive(3)"), ("ca_init(2)","ca_init(3)"), and ("ca_dead(2)","ca_dead(3)")

State Space Reduction:

The total number of state for the above CPE description is a little more than 66 Million states.

• The verification of the first property with the discovered symmetries requires the exploration of only a little more than 6300 states.
• The verification of the second property with the discovered symmetries requires the exploration of only a little more than 10800 states.