8 Architectures for Survivability 1

ENPM 808s
Information Systems Survivability:
8. Architectures for Survivability 1
- - - - - - - - - - - - - - - - - - -
System- and network-oriented approaches; architectures: architectural components, architectural structure, and structural architectures; servers; mobile-code paradigms; composition

System Structure
- - - - - - - - - - - - - - - - - - -
Highly structured system architectures can have enormous long-term benefits in system specification, development, implementation, maintenance, evolution, networking, flexibility, and interoperability. Structure is useful in protocol suites, scalable multicast protocols, cryptographic key distribution and key management, divide-and-conquer algorithms, etc. Several historical examples of system designs are worth noting.

System Structure: The T.H.E. System
- - - - - - - - - - - - - - - - - - -
The T.H.E. (Eindhoven) system (Dijkstra 1968) used a hierarchical locking strategy to prevent interlayer deadly embraces (deadlocks) via a linearized synchronization strategy. Deadly embraces still occurred occasionally within a level, but never between levels. [Parnas strict-sense "uses" hierarchy: each layer depends on the correctness of the lower layer.]
Layer 5: The operator
Layer 4: Independent users
Layer 3: Each process has its own virtual keyboard, multiplexed onto a single keyboard, buffering of I/O
Layer 2: Message interpreter controlling operator keyboard
Layer 1: Segment controller for drum interrupt synch
Layer 0: Allocation of processes to processors, respectful of explicit mutual synch

SIFT, A Highly Available System
- - - - - - - - - - - - - - - - - - -
Software-Implemented Fault Tolerance for commercial flight control (SRI-Bendix-NASA)
Extraordinarily high availability.
7 computers, highly redundant, broadcasting of task results
Fault masking by 2/3 voting on critical tasks (3/5 not needed)
Fault detection and elimination by robust self-diagnosis, with faulty computers deconfigured and their tasks reallocated
Design & HOL code of paper system formally proven in HDM, redone and extended in EHDM.
System ran for years at NASA Langley.
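
Fault masking by majority voting can be sketched in a few lines. This is an illustration only, not SIFT's code: the function names vote and suspects are hypothetical, and the sketch ignores clock synchronization and the broadcast protocol that a real system needs.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def vote(results: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if 2 * count > len(results) else None


def suspects(results: Sequence[Hashable], winner: Hashable) -> List[int]:
    """Indices of replicas that disagreed with the voted value.

    These are the candidates for diagnosis and possible deconfiguration.
    """
    return [i for i, r in enumerate(results) if r != winner]
```

With three replicas, one faulty result is outvoted by the other two, and the disagreeing replica is identified for further diagnosis. If all three disagree, no majority exists and the sketch reports None rather than guessing.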

Proof Refinements for SIFT Paper System
   (Melliar-Smith and Schwartz 82)
-----------------------------------------
Markov Model Failure Probability 10**(-10)/hr
 /             using HW error-rate analysis  
(  I/O Model System SAFE =>
 \    /        "all tasks correct"
Replication  Task replicated;
  Model        values voted upon on 
    |          task completion
 Activity    Task activities: startup,
  Model        broadcast of values, vote
    |          execute, synchronization
 Operating   SPECIAL specs for OS:
  System       scheduler, voter, dispatcher
    |          buffer manager, etc.
  Pascal     Pascal code for each routine
 Programs
    |
BDX-930 Code [Pascal to BDX not proved.]

Multics Hardware (1965!)
- - - - - - - - - - - - - - - - - - -
Memory mapping for virtual memory
Descriptor for each virtual segment
Access control bits in descriptor
Descriptor cache for performance
Segmentation and paging
Process notion supported in hardware
Process context defined by address space and point of execution, user separation
Rings generalize supervisor/user concept to linearly-ordered domains
Support for argument validation

Original Multics Software (1965!)
- - - - - - - - - - - - - - - - - - -
Modular: written in PL/I, reentrant, supports process-per-user concept. File system, I/O integrated.
Central access control policy (discretionary)
Unified design philosophy, e.g., dependence on symbolic addressing, dynamic linking of symbolic I/O and file names, segmentation, paging, command standards, conventions, canonicalization, ...
"Hard-core" O/S vs. other user services
O/S reliability/security/survivability based on ring mechanism. Robustness hierarchy.
Multilevel secure retrofit put MLS kernel in Ring 1. 8 levels, 18 categories.

SRI's Provably Secure Operating System
- - - - - - - - - - - - - - - - - - -
PSOS (a paper system, begun in 1973, with reports in 1975, 1977, and slightly revised in 1980) was apparently the first object-oriented operating system design, in hardware and software. It addressed then-advanced security requirements. Tagged capabilities in hardware, non-kernel multilevel security, hierarchical abstraction. Lower layers (0-12) formally specified. (See Feiertag-Neumann paper on PGN's Web site.)

            PSOS Abstraction Hierarchy          
|---------------------------------------------|
|Level|      PSOS Abstraction or Function     |
|---------------------------------------------|
| 16 | user request interpreter *             |
| 15 | user environments and name spaces *    |
| 14 | user input-output *                    |
| 13 | procedure records *                    |
| 12 | user processes*, visible input-output* |
| 11 | creation and deletion of user objects* |
| 10 | directories (*)[c11]                   |
|  9 | extended types (*)[c11]                |  
|  8 | segmentation and windows (*)[c11]      |
|  7 | paging [8]                             |
|  6 | system processes and input-output [12] |
|  5 | primitive input/output [6]             |
|  4 | arithmetic, other basic operations *   |
|  3 | clocks [6]                             |
|  2 | interrupts [6]                         |
|  1 | registers (*), addressable memory [7]  |
|  0 | capabilities *                         |
|---------------------------------------------|
|  *  = functions visible at user interface.  |
| (*) = partially visible at user interface.  |
| [i] = module hidden by level i.             |
| [c11] = creation/deletion hidden by level 11|
|---------------------------------------------|

    
     Illustrative PSOS Properties
(Not all are primary security properties)
-----------------------------------------
17. Soundness of user type managers
15. Search path flaw avoidance
12. Process isolation, i-o soundness
11. No lost objects (w/o capabilities)
 9. Generic type safety
 8. Correct segment, no residues
 6. Interrupts properly masked
 4. Correctness!
 0. Nonforgeable, nonbypassable, 
    nonalterable (MLS if desired)

         PSOS Generic Hierarchy                  
|--------------------------------------|
|Level|    PSOS Abstraction    | Level | 
|--------------------------------------| 
|  F | user abstractions       | 14-16 |
|  E | community abstractions  | 10-13 | 
|  D | abstract object manager |   9   |
|  C | virtual resources       |  6-8  | 
|  B | physical resources      |  1-5  | 
|  A | capabilities            |   0   |
|--------------------------------------|

    
        A generic capability:
+-+------+------+----------+-------+-+
|T|Unique|Access|Capability|Address|C|
|A|  ID  |Rights|Data      |Base & |O|
|G|      |      |Type      |Bounds |P|
| |      |      |          |       |Y|
+-+------+------+----------+-------+-+
A PSOS capability contains a tag and a UID. The tag bit identifies it as a capability. The UID is unique for the system lifetime.
A PSOS capability strongly types the referenced abstract object. Primitive access rights are known to hardware; others are interpreted by the type managers of the datatypes concerned.
Address base and bounds define the virtual address and extent of the data referenced.
Copy bits determine rights of passage of capabilities.
Other capability systems vary...
Descriptor-based systems have no tag (capabilities are stored separately). Descriptors are reusable sooner or later.
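
The fields of the generic capability can be modeled as an immutable record. The sketch below is a software analogy under stated assumptions: the names Capability, new_capability, restrict, and check_access are hypothetical, the tag bit is represented only by the Python type itself, and a frozen dataclass is not truly unforgeable in the way a hardware tag is.

```python
from dataclasses import dataclass, replace
from itertools import count
from typing import FrozenSet, Iterable

_next_uid = count(1)  # UIDs are handed out once and never reused in this process


@dataclass(frozen=True)
class Capability:
    uid: int                 # unique ID of the referenced object
    rights: FrozenSet[str]   # access rights
    type_name: str           # type of the referenced abstract object
    base: int                # address base
    bound: int               # extent (length) of the data referenced
    copyable: bool = True    # copy bit


def new_capability(rights: Iterable[str], type_name: str, base: int,
                   bound: int, copyable: bool = True) -> Capability:
    """Create a capability for a new object with a fresh UID."""
    return Capability(next(_next_uid), frozenset(rights), type_name,
                      base, bound, copyable)


def restrict(cap: Capability, keep: Iterable[str]) -> Capability:
    """Copy a capability, keeping at most the listed rights (never adding any)."""
    if not cap.copyable:
        raise PermissionError("capability may not be copied")
    return replace(cap, rights=cap.rights & frozenset(keep))


def check_access(cap: Capability, right: str, offset: int) -> bool:
    """Allow an access only if the right is held and the offset is in bounds."""
    return right in cap.rights and 0 <= offset < cap.bound
```

A copy made with restrict refers to the same object (same UID) but can only narrow the rights, which is the usual way a holder passes limited access to another party.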

Multilevel Security
- - - - - - - - - - - - - - - - - - -
The mandatory nature of multilevel-security policies is very attractive in principle, but full of difficulties in practice.
Kernels, reference monitors, trusted computing bases, untrusted applications
Where multilevel security is vital as an overall system/network property, eschew the Orange Book kernel/TCB paradigm locally in favor of the Proctor-Neumann approach in which end-user systems are single level, and MLS trustworthiness is only in certain servers:
http://www.csl.sri.com/neumann/ncs92.html
This largely avoids the desire for MLS PC-like end-user systems, relying instead on MLS/MLI servers and highly trustworthy components just where they are essential.
Greater attention is needed for MLS as a global system/network property, while avoiding trying to enforce MLS globally where trustworthiness cannot be assured.
Continue exploration of what is required to achieve a Red Book (Trusted Network Interpretation) for composition.

NO ADVERSE FLOW --> (e.g., MLS downward)

  ----------               ----------
  |        |    -------    |        |
  | "High" |--->|Guard|--->| "Low"  | 
  | (sec)  |<---|     |<---| (sec)  |
  | (int)  |    |-----|    | (int)  |
  |--------|               |--------|

NO ADVERSE FLOW <-- (e.g., MLI upward)
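
The two no-adverse-flow rules in the diagram can be expressed as a single check applied by the guard. This is a minimal sketch assuming integer levels with no categories; the names Label and guard_permits are hypothetical, and a real guard would also handle sanitization and authorized downgrading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    secrecy: int    # higher means more sensitive (MLS)
    integrity: int  # higher means more trustworthy (MLI)


def guard_permits(src: Label, dst: Label) -> bool:
    """Allow a flow from src to dst only if it is adverse in neither sense."""
    no_leak = dst.secrecy >= src.secrecy               # MLS: nothing flows down
    no_contamination = dst.integrity <= src.integrity  # MLI: nothing flows up
    return no_leak and no_contamination
```

For the two boxes in the diagram, with "High" as Label(2, 2) and "Low" as Label(1, 1), the check rejects both directions: High to Low would leak secrets, and Low to High would contaminate high-integrity data. Any traffic between them therefore has to be mediated explicitly by the guard.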

Multiplexed Single-Level MLS Architectures
- - - - - - - - - - - - - - - - - - -
* Rushby and Randell 1983 Newcastle Connection Distributed Secure System (lots of covert channels)
* Proctor and Neumann 1992 SRI multiple single-level (no user covert channels)
http://www.csl.sri.com/neumann/ncs92.html
Can use off-the-shelf single-level client systems with trustworthy servers, providing MLS, MLI, and a sense of MLX.
Note: Naval Research Lab (Kang et al. 97, etc.) approach addresses the ability to read down to lower-level systems (without the corresponding MLI to guard against Trojan horses, etc.) via one-way flow architecture (Davidson 96), NRL Pump (Kang 96), SINTRA, COTS switched client workstations.

Architectures for Trustworthy DBMSs
-----------------------------------------------
| OS \ DB   | Untrusted DB    | Trust in DB   |
-----------------------------------------------
| Untrusted | System high     | Dedicated DB: |
| OS        |                 | OS encaps'd?  |
-----------------------------------------------
|           | Restrictive:    | Extended TCB: |
| Trusted   | IPSharp Grohn,  | SeaView *     |
| OS TCB    |   Bonyun        | ASD-Views**   |
|           | Hinke-Schaefer  | Honeyw SAT DB |
|           | SeaView MLS only| AOG-Gemini DB |
-----------------------------------------------
 * SeaView Extended TCB untrusted for mandatory
   access ctl, trusted for DAC, DB consistency
 ** ASD-Views extended TCB encapsulates views,
   which become trusted

SeaView Design Hierarchy (Lunt et al. 88)

Ring, Database Functionality
-------------------------------------
TOTALLY UNTRUSTED:
7. User applications 
-------------------------------------
6. User interface, presentation view 
   mgr, conventional DBMS functions.
   <MSQL Preprocessor if untrusted>
-------------------------------------
TRUSTED FOR DB DAC, INTEGRITY:
5. DBMS Nucleus Resource Mgr --
   Extended DB DAC, DB consistency,
   MSQL Preprocessor <integ. trusted>
-------------------------------------
TRUSTED SINGLE-LEVEL RELATIONS:
3-4. Discretionary TCB (GEMSOS).
     DAC on relations hidden by DBMS
     in MSQL trusted option
0-2. Mandatory TCB (GEMSOS) used for
     single-level relations
-------------------------------------
MSQL provides virtual multilevel
relations and views.

SeaView Simplified Model Hierarchy
(Denning et al. 88)
- - - - - - - - - - - - - - - - - - -
2. TCB MODEL: Database TCB for views, multilevel (virtual) relations, discretionary access, labeling, data consistency, sanitization, reclassification, constraints on transactions, polyinstantiation.
1. MLS MODEL: MLS security for single-level base relations.

SeaView TCB Security Model
- - - - - - - - - - - - - - - - - - -
A state is secure if it satisfies the STATE PROPERTIES.
A command is secure if, beginning in a secure state, it ends in a secure state AND satisfies all TRANSITION PROPERTIES.
A transaction is secure if it satisfies the SEQUENCE PROPERTIES.
SeaView SECURITY THEOREM: IF the initial state is secure,
AND all commands are secure,
AND all transactions are secure,
AND all axioms (state-independent properties) are satisfied
THEN the system is secure.
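
The structure of the theorem can be mirrored by a runtime monitor that checks each hypothesis as a transaction executes. This is a toy sketch, not the SeaView model: the state representation (record name mapped to a classification level) and the three example properties are hypothetical stand-ins for the real state, transition, and sequence properties.

```python
from typing import Callable, Dict, List, Sequence

State = Dict[str, int]  # toy model: record name -> classification level
Command = Callable[[State], State]


class InsecureError(Exception):
    """A state, transition, or sequence property was violated."""


def run_transaction(state: State, commands: Sequence[Command],
                    state_ok: Callable[[State], bool],
                    transition_ok: Callable[[State, State], bool],
                    sequence_ok: Callable[[List[State]], bool]) -> State:
    """Run commands in order, checking the three property families."""
    if not state_ok(state):
        raise InsecureError("initial state violates a state property")
    trace = [state]
    for command in commands:
        new_state = command(dict(state))  # each command works on a copy
        if not state_ok(new_state):
            raise InsecureError("command ended in an insecure state")
        if not transition_ok(state, new_state):
            raise InsecureError("command violates a transition property")
        trace.append(new_state)
        state = new_state
    if not sequence_ok(trace):
        raise InsecureError("transaction violates a sequence property")
    return state


# Hypothetical example properties, for illustration only.
def levels_in_range(s: State) -> bool:
    return all(0 <= level <= 3 for level in s.values())


def no_downgrade(old: State, new: State) -> bool:
    return all(new.get(name, level) >= level for name, level in old.items())


def no_lost_records(trace: List[State]) -> bool:
    return set(trace[0]) <= set(trace[-1])


def change_level(name: str, delta: int) -> Command:
    def command(s: State) -> State:
        s[name] += delta
        return s
    return command
```

If the monitor never raises, every state reached satisfied the state properties, which is the inductive argument behind the theorem. The theorem itself establishes this once, by proof, rather than by checking at run time.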

SeaView Model Properties
- - - - - - - - - - - - - - - - - - -
Classification constraints
Integrity constraints
Rule-based classifications
Discretionary properties
Constraints on multilevel relations
Multilevel integrity rules
Multilevel entity integrity
Multilevel referential integrity
Application-dependent constraints

Multilevel Survivability
- - - - - - - - - - - - - - - - - - -
Explore applying the principles of the MLS/MLI/MLA/MLX concepts to distributed systems and networks that must be survivable, encompassing security and reliability, within the framework of mechanisms providing generalized dependence.
Strive to avoid all of the pitfalls that have retarded the appearance of true MLS systems and networks, with realistic common sense, hindsight, and foresight.

Encryption
- - - - - - - - - - - - - - - - - - -
Good encryption must be used, securely embedded into survivable implementations. Compromises from within and below, bypasses, etc., must be addressed.
We need pervasive use of cryptographically based authentication (one-time passwords).
Robust key management is essential. Strong public-key crypto is recommended.
Key escrow/recovery seems appealing to law enforcement and intelligence, but is potentially disastrous to everyone else, including national security.
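
One well-known form of cryptographically based one-time password is a hash chain in the style of Lamport's scheme. The slides do not say which scheme is intended, so the following is only an illustrative sketch: the names chain and Verifier are hypothetical, and SHA-256 is an assumed choice of hash function.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def chain(seed: bytes, n: int) -> bytes:
    """Apply the hash n times to the seed."""
    value = seed
    for _ in range(n):
        value = h(value)
    return value


class Verifier:
    """Server side: stores only the last accepted value, never the seed."""

    def __init__(self, anchor: bytes) -> None:
        self._current = anchor

    def login(self, otp: bytes) -> bool:
        # A valid password hashes to the stored value; it then replaces that
        # value, so the same password can never be accepted twice.
        if hmac.compare_digest(h(otp), self._current):
            self._current = otp
            return True
        return False
```

The user registers chain(seed, n) with the server and then logs in with chain(seed, n - 1), chain(seed, n - 2), and so on. An eavesdropper who captures one password cannot derive the next one without inverting the hash, and a replayed password is rejected.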

"Trusted Paths" and Resource Integrity
- - - - - - - - - - - - - - - - - - -
Without trustworthy paths to systems from users, to users from systems, and between systems, all bets are off.
Trusted paths are difficult in inherently untrustworthy systems.
The University of Pennsylvania Secure Bootstrap represents an attempt to overcome PC untrustworthiness.
Independent of MLI concepts, integrity checks are desirable for critical entities.

Once-Writable Virtual Store
- - - - - - - - - - - - - - - - - - -
Greatly simplifies historical database maintenance, integrity, tamper resistance, secure audit recording, recovery, backup. (Cf. the Plan 9 file system.)
Lower-level overwritable store hidden (encapsulated).
Used in Postgres (cf. Ingres)
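
The idea of a once-writable virtual store layered over a hidden overwritable store can be sketched as follows. The class names WriteOnceStore and VersionedFile are hypothetical, and an in-memory dictionary stands in for the encapsulated lower-level store.

```python
from typing import Dict, Hashable, Optional


class WriteOnceStore:
    """Each address can be written exactly once; nothing is ever overwritten."""

    def __init__(self) -> None:
        self._blocks: Dict[Hashable, bytes] = {}  # hidden lower-level store

    def write(self, address: Hashable, data: bytes) -> None:
        if address in self._blocks:
            raise PermissionError(f"address {address!r} already written")
        self._blocks[address] = bytes(data)

    def read(self, address: Hashable) -> bytes:
        return self._blocks[address]


class VersionedFile:
    """An updatable file whose history is a series of write-once blocks."""

    def __init__(self, store: WriteOnceStore, name: str) -> None:
        self._store = store
        self._name = name
        self._versions = 0

    def update(self, data: bytes) -> int:
        """Record a new version and return its version number."""
        version = self._versions
        self._store.write((self._name, version), data)
        self._versions += 1
        return version

    def read(self, version: Optional[int] = None) -> bytes:
        """Read a given version, or the latest one if none is specified."""
        if version is None:
            version = self._versions - 1
        return self._store.read((self._name, version))
```

Every update adds a new block instead of replacing an old one, so earlier versions stay available for audit and recovery without any separate backup mechanism.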

Trustworthy Servers
- - - - - - - - - - - - - - - - - - -
File servers, network servers, name servers, authentication servers, real-time analysis are all vital.
Servers must be survivable, and especially secure and reliable.
Sound operating systems, networking, and encryption are absolutely essential for servers, even more than on end-user systems.

The Mobile-Code Paradigm
- - - - - - - - - - - - - - - - - - -
Write-once, (verify-once,) run-anywhere: potential for greater survivability. WOVORA: verify, add integrity seals.
Survivable mobile code requires secure operating systems and networking, sensible application design, good software engineering, and pervasive attention to implementation detail. Encryption, digital signatures, confined execution, proof-carrying code, formal methods helpful.
Need to reduce the necessary dependence on untrustworthy components, invoking generalized-dependence mechanisms. The MLX concept is useful in structuring.

Mobile-Code Architectures
- - - - - - - - - - - - - - - - - - -
A significant new paradigm for controlled execution involves the use of mobile code, that is, code that can be executed independently of where it is stored. The most common case involves portable reusable code that is acquired from some particular sources (remote or local) and executed locally. From a different perspective, it could involve local code that is executed elsewhere, or remote code that is executed at another remote site. Ideally, mobile code should be platform independent, and capable of running anywhere irrespective of how and from where it was obtained. Used in connection with the principles of separation of domains and allocation of least privilege, dynamic linking and dynamic loading with persistent access controls, this paradigm provides an opportunity for the secure execution of mobile code, and represents a very promising approach for achieving ultrasurvivable systems.
Of course, executing arbitrary code on one of your systems carries major risks to integrity, confidentiality, availability (including denial of service), and survivability in general. The existence of mobile code whose origin and execution characteristics are typically not well known necessitates the enforcement of strict security controls to prevent Trojan horses and other unanticipated effects. It may be necessary to provide repeated reauthentication and validation, and revocation or cache deletion as needed. When combined with digital signatures and proof-carrying code to ensure authenticity and provenance, dynamically linked mobile code provides a compelling organizing principle for highly survivable systems.
In principle, properly implemented environments for executing mobile code can contribute to survivability in various ways:

* Enable the execution of machine-independent trustworthy programs that have been carefully analyzed. Thus the paradigm becomes not just write-once run-anywhere (WORA), but rather write-once, verify-once, and run-anywhere (WOVORA). This can greatly enhance survivability by reducing the otherwise enormous task of verifying many different versions of the same code.
* Enable a distinction between the execution of programs on intrinsically unreliable and unsecure sites from execution on reliable and secure sites. In the sense of multilevel survivability, critical operations should not depend on code or execution on less trustworthy sites. In order for an MLX concept to be enforced, some measure of subsystem survivability must exist from which the aggregate survivability can be inferred or derived.
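
The verify-once step can be tied to an integrity seal so that receiving sites only need to check the seal. The sketch below makes several assumptions: verify_once is a placeholder for the real (expensive) analysis, the names seal and accept are hypothetical, and an HMAC with a shared key stands in for a digital signature, which is what a real deployment across sites would more likely use.

```python
import hashlib
import hmac


def verify_once(code: bytes) -> bool:
    """Placeholder for the one-time analysis (hypothetical toy policy)."""
    return b"import os" not in code


def seal(code: bytes, key: bytes) -> bytes:
    """Verify the code once and, if it passes, return its integrity seal."""
    if not verify_once(code):
        raise ValueError("code failed verification; no seal issued")
    return hmac.new(key, code, hashlib.sha256).digest()


def accept(code: bytes, tag: bytes, key: bytes) -> bool:
    """At a receiving site: accept only code whose seal is intact."""
    expected = hmac.new(key, code, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

The costly verification runs once at the source; each receiving site performs only the cheap seal check, and any modification of the code in transit or in a cache causes the check to fail.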

A highly survivable overall mobile-code architecture can be aided by a combination of trustworthy servers, encrypted network traffic, digital signatures, proof-carrying code, and other components and concepts. (See Sections 8.3 and 8.4 of the arl-one report.)

Three contemporary doctoral theses provide important contributions to the establishment of such an architecture: the formal-methods language-centric considerations of Drew Dean, the system-protection-centric work of Dan Wallach, and the proof-carrying code approach of George Necula. Background on understanding code mobility, largely independent of survivability and security issues, is given in a useful article by Fuggetta et al. (in a special issue of the IEEE Transactions on Software Engineering on mobility and network-aware computing). Formal methods are also particularly relevant to mobile code, because of the critical dependence on type safety (for example, the formalization of dynamic and static type checking for mobile code given by Riely and Hennessy).

An extraordinary compilation of articles on various aspects of the mobile-code paradigm has been assembled by Giovanni Vigna and published by Springer-Verlag. This book reflects most of the potential problems with mobile code, and suggests numerous approaches to reducing the risks. Considering the enormous potential impact, this book is mandatory reading for anyone trying to use the mobile-code paradigm in supposedly survivable systems. The book also contains copious references, summarized in the arl-one report.

Emerging Trends in Architecture
- - - - - - - - - - - - - - - - - - -
Patchworked systems and networks, networked PCs, borrowed software, full of vulnerabilities.
Hardware becoming cheap, fast, powerful, small; development of prototype chips as well as production becoming automated
Software development still costly, methodological aids and tools useful, formal methods becoming realistic
Lots of useful research in distributed systems and control, system architecture, authentication, knowledge-based interfaces, ...
Network requires a pervasive system view!

Reading for the Next Class Period
- - - - - - - - - - - - - - - - - - -
This paper on anomaly and misuse detection is particularly relevant because of its emphasis on the importance of architecture and good software engineering:
P.G. Neumann and P.A. Porras, Experience with EMERALD to Date, Proceedings of the First USENIX Workshop on Intrusion Detection and Network Monitoring, Santa Clara, California, April 1999, pages 73-80:
http://www.csl.sri.com/neumann/det.html
A paper referred to therein is also on-line:
http://www2.csl.sri.com/emerald/emerald-niss97.html
I will use some slides that are not included here, most of which can be found at
http://www.csl.sri.com/intrusion/
(click on EMERALD).