CSE 635 -- Asynchronous Systems

Spring 1997

Adding BDDs to Factory's local model checker

Abstract
Introduction and Motivation
On the Concurrency Factory's Local Model Checker
On BDD's and ADD's
Problem Description
More Details
The CUDD Package
Problems Encountered
Implementation
Results
Summary and Future Work
References

Abstract

This project shows how one can use binary-decision diagram to implement certain

dictionary like data-structures which support add and search operations. The main

motivation behind such implementations is that these are very space-efficient. The

idea was implemented using the CU-decision-diagram package on the local model

checking procedure of the Concurrency Factory. Certain preliminary results that

compare the memory usage with and without the modification are presented.

Introduction and Motivation

Ordered Binary Decision Diagrams (OBDDs [2]) represent Boolean functions as directed acyclic graphs. They form a canonical representation, making testing of functional properties such as satisfiability and equivalence straight forward. A number of operations on Boolean functions can be implemented as graph algorithms on OBDD data structures. OBDDs are substantially more compact than traditional representations such as conjunctive normal form and disjunctive normal form. Hence, OBDDs have found application in many computer aided design tasks, including symbolic verification of computer logic. This motivates the use of OBDDs in this project.

The Concurrency Factory [6] is a CASE tool for specification and verification of distributed systems. The concurrency factory uses model checking on the alternation free segment of the modal mu-calculus as one of the main verification facilities. Model checkers, in general, face the problem of state explosion while trying to verify certain large practical systems. Memory becomes a bottleneck when executing the Concurrency Factory's local model checker too in certain applications.

On the other hand, symbolic model checking [3] has been successful in avoiding the state explosion problem through the use of OBDDs to represent the state space. This suggests that using OBDDs to represent certain tables (that cause memory bottleneck in the factory's model checker) might improve the performance of the Concurrency Factory. While symbolic model checking is good for hardware verification, it performs badly on other problems, like communication protocol verification [3].

On the Concurrency Factory's Local Model Checker

In this section, certain details of the implementation of the local model checking algorithm as in the Concurrency Factory are given. The product graph of the system state and the formula is constructed in a demand driven manner. A node in the product graph is uniquely identified by its (state, variable_id) pair. The information in a node is stored in two different structures - class CPGraphNode and class PGraphNode. Every node that has been seen during the execution of the algorithm has a unique object of the class CPGraphNode associated with it. If the node has been explored (i.e. its children have been computed and assigned objects of class CPGraphNode) then it also has a unique object of class PGraphNode associated with it. The child node pointers and pointers to parent nodes are stored in the PGraphNode object. The objects of classes CPGraphNode and PGraphNode have pointers to each other.

In order to ensure that each (state, variable_id) pair has a unique CPGraphNode object associated with it, we keep all the existing CPGraphNode object pointers in a Table. There are two different tables in the current implementation.

Table : Pointers to those CPGraphNode objects that have been explored are kept in this table. Every CPGraphNode object in this table has a non-null PGraphNode pointer.
UTable : Pointers to those CPGraphNode objects that have not been explored yet are kept in this table. All such nodes have a null PGraphNode pointer. Their children have not been computed yet.

The other main objects used in the model checking procedure are the following four stacks:

L stack : This is the DFS stack. It contains pointers to the CPGraphNode objects. It contains only explored nodes. A CPGraphNode pointer is removed from the L stack only when it is either done (i.e., its value is known), or all of its chilren have been explored.
P stack : When a node has all of its children explored but still has no value associated with it (i.e. it is not done), then it is put in the P stack. Nodes in the P stack have empty list of children (the list contains only unexplored children).
D stack : This contains nodes that are done (i.e., have their final value) but that haven't communicated their value to their parents. Nodes in this stack are in some sense in their final stage. Once they communicate their values to their parents, we do not need to keep them.
C stack : This is a stack which contains nodes that form a counter-example for the formula being verified.

So, in short, the typical life of a node is as follows : it gets a CPGraphNode object created for itself and enters the UTable. At a later time, it is explored and moved from UTable to Table. Now it has a non-null PGraphNode object for it and it enters the L stack. Depending on how it gets done, the node moves from the L-stack to the D-stack directly or through the P-stack. Once out of the D-stack, it no longer has the PGraphNode object for it. But, we still keep the CGraphNode object and the nodes entry in Table. This is because of the possibility of the DFS procedure hitting this node again.

Rather than keeping the CPGraphNode object for the node which has completed its life cycle (described in the last paragraph), all such nodes can be compactly kept in a more space-efficient data-structure. We then would no longer need to keep the pointer entry in Table for that node. This motivates the work done in this project.

On BDD's and ADD's

In this section we briefly discuss the Algebraic Decision Diagrams (ADD's) and why they are ideal choices for this project. ADD's extend BDD's by allowing values from an arbitrary finite domain to be associated with the terminal nodes [1]. A short discussion on BDD's precedes the discussion on ADD's.

A binary-decision diagram represents a Boolean function as a rooted, directed acyclic graph. As an example, Figure illustrates a representation of the function f(x_1, x_2, x_3) defined by f=x_2.x_3 + x_1.x_3. (Boolean AND is represented by . and boolean OR by +). This representation is actually a tree, a decision tree in the usual sense. Now we can define three transformation rules over such graphs that do not alter the function represented:

Remove Duplicate Terminals.Eliminate all but one terminal vertex with a given label and redirect all arcs into the eliminated vertices to the remaining one.
Remove Duplicate Nonterminals. If two nonterminal vertices have the same label (variable name) and point to the same nodes, then eliminate one of the two vertices and redirect all incoming arcs to the other vertex.
Remove Redundant Tests. If a nonterminal vertex has both of its pointers pointing to the same vertex, then eliminate that nonterminal vertex and redirect all incoming arcs to its child.

Clearly, applying these transformations leads to a compressed acyclic graph representing the same function. For a given ordering on the variables, the final representation one gets by exhautively applying these transformations, is canonical. This is called a BDD (or an OBDD, for ordered BDD).

An ADD can be seen as a BDD with a set of constant values different from the set {0,1}. For example, the ADD in figure 3 represents the function f defined by f(00-)=0, f(010)=1, f(1-0)=1, f(011)=2, f(1-1)=2. Thus, ADD's represent functions from boolean values to arbitrary finite sets. As in the case of BDD's , ADD's too achieve high degree of space efficiency due to sharing. ADD's have possible applications in logic synthesis, formal verification, and testing of digital systems.

Problem Description

As we have seen earlier, the Concurrency Factory's local model checker maintains a table of nodes of the product graph (product of the labelled transition system (LTS) and the modal mu-calculus formula graph. See [6] for details). This table grows in size as the execution of the model checker proceeds, and soon it consumes enormous space. More specifically, the following two declarations in the local model checking algorithm implementation are being referred to :

PGSearch Table; UPGSearch UTable;

The current implementation for the classes PGSearch and UPGSearch uses AVL trees, the nodes of which are objects of class CPGraphNode (compressed product graph node) and UCPGraphNode respectively. The class CPGraphNode holds minimal information about a product graph node - its (state, VarId), status (if it is done or not, etc) and value (true or false).These tables are referred to whenever we want to create a product graph node with a given (state, variable_id) pair. If there is already a CPGraphNode for the (state, variable _id) pair, then these tables help to identify it, thus avoiding having more than one CPGraph Node for a given (state, variable_id) pair.

The two tables mentioned above can be together thought of as defining a function from (state, VarId) to (status, value). ADD's could be used to store this function using potentially lesser space as outlined below.

In this project we use an ADD with three terminal nodes, represented by constants 0, 1 and 2. The boolean variables of the ADD represent the bits in the (state, variable_id) pair of a product graph node. This way we can construct an ADD that represents any arbitrary function f : (state, variable_id) -> {0, 1, 2}. We choose 0 to mean nodes that are not done, 1 to mean nodes that are done and whose value is 0 and 2 to mean nodes that are done and whose value is 1.This way a given ADD in our case, is equivalent to a table which stores the (state, variable_id) pair corresponding to every node of the product graph along with the information of whether that node is done and if it is, what its value is. Such an ADD based structure can be used to replace certain AVL tree based structures that are being used in the current implementation of the model checker. But the story is not so simple and we were forced to rewrite a substantial part of the local model checker in order to incorpor- ate this idea. We discuss these problems and the changes they forced next.

Problems Encountered after Proposal and Changes made thereafter

BDDs and ADDs are suitable for storing certain kinds of information only. The original idea was to replace AVL trees by BDD based representation. But it turned out that the information stored with each node in the AVL tree is so different from that stored in any other node in the tree that using BDDs would not have been any more space-efficient. More specifically, the AVL tree nodes (which are of class CPGraphNode) store, apart from DONE and VALUE attributes for (systemstate, variable) pairs, a pointer to the corresponding PGraphNode too. This pointer is used to construct the product graph. This means that Table is actually the following function:

Table : (state, variable_id) -> (DONE, VALUE, PGNode pointer)

The information contained in the pointer cannot be succintly coded into a BDD based structure.

So, rather than encoding the Table and UTable structures, it was suggested that we keep just a subset of nodes present in Table in a BDD. The DONE nodes appeared ideal candidates for this purpose because there would be no need of the PGraphNode pointer in their case (after they had been processed through ProcessD). In other words, for the done nodes Table behaves as if it was the following function:

Done_Table : (state, variable_id) -> (DONE, VALUE)

But we would still have to keep the Table and UTable structures, AVL trees for the other undone nodes. Only certain DONE nodes would be moved from Table to a done_bdd. This could reduce the memory usage of the local model checking procedure too.

But then we ran into a second more difficult problem. This problem was more specific to the current implementation of the local model checker. The CPGraphNode corresponding to done nodes in Table are pointed to from the L-stack and the P-stack described earlier. These pointers are directly accessed whenever a new node is chosen to be explored from the L-stack or when nodes in P-stack re processed. Also certain done CPGraphNode's might be pointed to from the successor and predecessor lists in the PGraphNode structure. This means that even the done nodes can not be removed from the Table (AVL tree containing explored nodes) without in some way changing the implementation of the local model checker. The only solution to this problem was that enough information be stored that would allow us to know if a done CPGraphNode is being pointed to from any other place. All these changes were made to the local model checker.

Implementation Details

The following Class definition was added :

class BddInterface

{

protected :

DdManager *manager; // Required by all functions manipulating BDDs.

DdNode *done_bdd; // This is the root of the

done BDD.

DdNode *bdd_done0; // This is the terminal node

that represents done=0

DdNode *bdd_done1; // This is the terminal node

that represents done=1

int numVars; // Number of variables in the ADD.

public:

short BDDSearchLookup( SystemState*, VarId);

// Search in done_bdd.

void BDDCuddInit( int ); // Initialize CUDD.

void BDDAddNode( CPGraphNode* );

// Add node to BDD.

};

The functions are defined and the code is included in the appendix. We have used the CU- Decision Diagram package called CUDD for implementing these functions.

The BDDInterface was added to the class Table as follows :

class PGSearch

{

protected:

AVLSet<CPGraphNode> _NodeSet;

#ifdef TIWARI

BDDInterface DoneBdd;

/* Instead of storing CPGraphNodes, we JUST STORE the

mapping from (systemstate, value) to VALUE for some DONE

nodes and remove these nodes from the above AVLSet. */

CPGraphNode* CNodeTrue;

// This is a CONSTANT CPGraphNode representing all DONE,

// TRUE nodes

CPGraphNode* CNodeFalse;

// This is a CONSTANT CPGraphNode representing all DONE,

// FALSE nodes

#endif TIWARI

public:

#ifdef TIWARI

void Initialize( );

~PGSearch()

#endif TIWARI

.........

};

Compiling the code with -DTIWARI flag set enables the changes made. WIthout this flag, the program behaves as what it was - i.e. without the BDD related changes. Certain functions of the class PGSearch were modified by adding calls to the functions defined in the class BDDInterface. Essentially whenever a given (state, variable_id) pair was to be looked up in Table, it is now first looked up in the done_bdd. If the node is found to have an entry and it is found to be done, then if it s value is TRUE then we return (CNodeTrue), and if its value was FALSE then we return (CNodeFalse). CNodeTrue and CNodeFalse are two dummy CPGraphNodes, with a arbitrary (state, variable_id) and appropriate (DONE, VALUE) bits. If the node is not found to be in the set of done nodes stored in the BDD, then we continue as before - looking up Table, followed by a lookup in UTable if the former is unsuccessful.

Nodes have to be added to the BDD too! This is done during the model-checking process. Whenever a node is identified with the following properties :

It is done and has a value assigned to it.
The done value has been propogated to all of its existing parents. (i.e. the node is no longer in the D-stack).
The node is not in the L-stack and the P-stack.
The node is not pointed to from any other successor or predecessor lists contained in PGraphNode structures.

The first three conditions are made sure explicitly, while the last condition is made sure implicitly. Abstractly then the done_bdd can be seen as implementing a data-structure that implements the search and add operations.

How are nodes added to the ADD ? Initially the ADD represents the function f(*) = 0, i.e. all nodes are not done. Suppose at any time we have that the ADD represents the function f. Whenever a node with state s, variable_id v, and value b is to be added , we construct a new ADD representing the function g(s,v) = b+1 and g(rest) = 0. Using a library function to add two ADDs, we get the new ADD by adding the two ADDs representing functions f and g. The new ADD would represent the function f+g, which is what was desired. Other details can be referred to in the appendix where we have the code for this.

The CUDD Package

The CU Decision Diagram package [7] provides functions to manipulate Binary Decision Diagrams, Algebraic Decision Diagrams and Zero suppressed decision diagrams [4]. CUDD can be used either as a black box or as a clear box. In this project, CUDD is used as a black box. Since ADDs are used to represent functions from {0,1}^n to an arbitrary set, which was exactly the requirement for this project, we chose ADDs to implement the required tables. Further, CUDD also provides functions to onvert ADDs to BDDs and vice-versa. CUDD is available in the directory ~concurr/CUDD/cudd-2.1.1 in the department machines. For a list of functions that are provided to CUDD, see the file cudd. doc in the cudd/doc sub-directory of the main directory.

The newly defined functions have been implemented in C++ though CUDD has been written in C. The function BDDInterface::BDDCuddInit(int) creates a DdManager and an ADD which has all system state, variable identifier pairs pointing to 0, which means that none of the nodes is initially DONE. The function BDDCuddInit is called from the function PGSearch::StartLookup.

The function BDDInterface::BDDSearchLookup searches the done_bdd for the (system state, variable id) pair passed as an argument to it, and depending on the value obtained from the ADD, returns an appropriate value. (-1,0 or 1 corresponding to 0,1 or 2 in ADD). The function PGSearch::SearchLookup calls this function first to check if the given system state and variable id pair are in the done_bdd. If they are not present, only then does SearchLookup continue to search in the other tables.

The function BDDInterface::BDDMoveNode removes a compressed product graph node entry from _NodeSet (which is the AVL tree containing all explored nodes) and adds that (system state, var id) and DONE information into the done_bdd. This function is called whenever a node is to be added to the BDD.

Results

We have a working implementation of the Concurrency factory's local model checker that uses ADDs to store certain state-space search information. The procedure has been tested for correctness on a number of simple and elaborate examples and seems to work fine, always producing the correct results.

In the table below, we summarize the results obtained. The first column is the name of the specification file (.vpl) which was tested for the formula mentioned in the second column. The file "abp.vpl" refers to the alternating-bit protocol., "leader.x.vpl" is the leader election protocol for x processes and "sieve.x.vpl" is the vpl code for Erathosthenes' Sieve, x denotes the number of processes. The formula "deadlockfree.nu" checks for deadlock freedom, "liveness.nu" checks if the protocol has any livelocks and "safety" is the no-duplication property in the ABP protocol.

The third and the fourth columns contain the values for the amount of space used for model checking, i.e. to store the tables and other product graph information. The third column gives this value for the case when we do not have BDD's in the implementation, while the fourth column gives this value for the case when we use BDD's in order to store certain information.

Results
Test-File	Formula	Mem Usage, No BDD	Mem Usage, BDD
abp.lmc	liveness.nu	0.029 M	0.475 M
abp.vpl	safety.nu	0.180 M	1.02 M
leader.2.vpl	deadlockfree.nu	0.10 M	1.38 M
leader.3.vpl	deadlockfree.nu	0.17 M	3.11 M
leader.4.vpl	deadlockfree.nu	0.37 M	6.37 M
sieve.2.vpl	deadlockfree.nu	0.31 M	1.66 M
sievc.3.vpl	deadlockfree.nu	0.41 M	2.03 M
sieve.4.vpl	deadlockfree.nu	0.54 M	3.24 M
sieve.4.vpl	sieve.gfp.nu	8.84 M	48.07 M

Test-File leader.2.vpl	Formula leader.gfp.nu	Mem Usage, No BDD 0.07 M	Mem Usage, BDD 1.38 M
leader.2.vpl	leader.lfp.nu	0.07 M	1.38 M
leader.3.vpl	leader.gfp.nu	0.35 M	3.10 M
leader.3.vpl	leader.lfp.nu	0.35 M	3.10 M
leader.4.vpl	leader.gfp.nu	2.26 M	6.37 M
leader.4.vpl	leader.lfp.nu	2.26 M	6.38 M
sieve.2.vpl	sieve.lfp.nu	0.72 M	1.61 M
sieve.2.vpl	sieve.gfp.nu	0.45 M	3.18 M
sieve.3.vpl	sieve.lfp.nu	3.47 M	1.94 M
sieve.3.vpl	sieve.gfp.nu	1.96 M	11.21 M

The results above are not very encouraging. Quite contrary to the expectations, the runs that carry the BDD optimization seem to always take up more memory! This could have to do with a lot of reasons. Some of the optimizations that can be done will be discussed in the next section. It needs to be pointed out here that the CUDD package used in the implementation also uses a lot of memory. In particular, in order to avoid having to do repeated computations, the CUDD package maintains a cache. Although certain parameters cab be set to ensure that the cache size is restricted, these runs above were carried out with the default values for cache sizes which was 0.25 M. In certain runs, the size was automatically incresed to 2 M. Due to the lack of time, results with the cache size restricted to a bare minimum, were not taken though they should have been.

Summary and Future Work

This project shows how one can use binary-decision diagram based data-strcutures to implement certain tables which support add and search operations. The idea was implemented using the CU-decision-diagram package on the Concurrency Factory's local model-checker.

In the current implementation, the variable ordering for BDD construction is fixed. Essentially the ordering is that of the ordering in the CSV (compressed state vector), which in turn is decided arbitrarily. CUDD package provides functions that can do dymanic variable reorderings. These functions could be used to see if there is a substantial saving by reordering the variables. Alternatively, we could perform variable reordering based on the structure of the network.

Secondly, we can also have good state-assignment algorithms. Currently the state-bits are assigned essentially in a dumb manner. For instance, we may like to have two state vectors that are adjacent in the LTS to have the minimum possible hamming distance. This would in turn ensure that we can have a lot of sharing in the BDD.

References

[1] Bahar, R.I., Frohm, E.A., Gaona, C.M., Hachtel, G.D., Macii, E., Pardo, A., Somenzi, F., Algebraic Decision Diagrams and their Applications. In IEEE Intl Conf on Computer-Aided Verification, pages 188-191, Santa Clara, CA, Nov 1993.

[2] Bryant, R.E. , Symbolic Boolean Manipulation with Ordered Binary-Decision Diagrams. In Computing Surveys, Vol. 24, No 3, 293-318, Sep 1992.

[3] Burch, J.R. , Clarke, E.M. , McMillan, K.L. , Dill, D.L. , Hwang, L.J. , Symbolic Model Checking: 10^20 States and Beyond. In Information and Computation 98, 142-170 , 1992.

[4] Minato, Shin-ichi., Zero-Suppressed BDDs for Set Manipulation in combinatorial Problems. In Design Automation Conference, 272-277, 1993.

[5] Ramakrishna, Y.S., Private communication.

[6] Sokolsky, Oleg., Efficient Graph-Based Algorithms for Model Checking in the Modal Mu-Calculus. PhD dissertation, SUNY-Stony Brook, May 1996.

[7] Somenzi, Fabio., CUDD: CU Decision Diagram Package Release 1.2.1. Anonymous ftp from vlsi.Colorado.EDU, Sept 1996.

Appendix

The following is the code for the relevant parts :

lmc.cc : Contains the main model checking procedure. This file also contains certain other function definitions for the classes PGSearch and UPGSearch.
pgsearch_routines.cc : This file contains the routines that provide the BDD interface.
search.h : This file defines classes PGSearch, UPGSearch and BDDInterface.