9 Architectures for Survivability 2

ENPM 808s
Information Systems Survivability:
9. Architectures for Survivability 2

- - - - - - - - - - - - - - - - - - -
The importance of human interfaces; various open-source paradigms; real-time monitoring of survivability, including anomaly and misuse detection covering penetrators and insiders

Use of Open-Source Software
- - - - - - - - - - - - - - - - - - -
Because some of the serious systemic deficiencies are not likely to be overcome in proprietary systems, it would be advantageous to make more systematic use of nonproprietary open-source software, especially if trustworthy distribution paths can be used consistently. Dependence on proprietary code can be seriously complicated by any of a variety of factors.

It is a sad commentary on commercial and proprietary software development that some of the most useful software components are open-source products that are the results of labors of love and are widely available free of charge over the Internet. Examples include the Emacs editor and the LaTeX document system (both of which have been used in the preparation of this report), Perl (Practical Extraction and Report Language), BIND (allowing symbolic naming of IP addresses), the Netscape browser source code, the Apache Web server, Berkeley BSD Unix, and Linux, to name just a few. It is unfortunate that the same is not more widely true of security systems, although the Diffie-Hellman crypto algorithm is now in the public domain and a few simple schemes such as S/Key one-time passwords are freely available. Similarly, PGP (Pretty Good Privacy) is becoming more widespread as it becomes seamlessly embedded in e-mail environments. Although there are risks that Trojan horses might be implanted in variant versions of open-source software, trustworthy software distribution using public-key authentication schemes can overcome some of the uncertainty.
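The one-time password idea mentioned above can be illustrated with a hash chain. The following is a minimal sketch in the spirit of S/Key, not the actual S/Key protocol: it assumes SHA-256 as the one-way function, and the names `hash_chain` and `OneTimePasswordServer` are hypothetical.

```python
import hashlib
import hmac


def hash_chain(seed: bytes, n: int) -> bytes:
    """Apply the one-way hash n times to the seed (SHA-256 is an assumption)."""
    value = seed
    for _ in range(n):
        value = hashlib.sha256(value).digest()
    return value


class OneTimePasswordServer:
    """Hypothetical verifier that stores only the last accepted chain value."""

    def __init__(self, initial: bytes):
        self.stored = initial

    def verify(self, candidate: bytes) -> bool:
        # A valid password is the preimage of the stored value.
        hashed = hashlib.sha256(candidate).digest()
        if hmac.compare_digest(hashed, self.stored):
            self.stored = candidate  # step down the chain; old value is now useless
            return True
        return False


# Illustrative use: the user keeps the seed, the server sees only chain values.
seed = b"hypothetical secret passphrase"
server = OneTimePasswordServer(hash_chain(seed, 5))
first_password = hash_chain(seed, 4)
accepted = server.verify(first_password)       # fresh password
replayed = server.verify(first_password)       # same password captured and reused
```

Because each accepted password replaces the stored value, a password captured on the wire cannot be replayed, and the server never holds the seed itself.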

Real-time Analysis of Behavior and Response
- - - - - - - - - - - - - - - - - - -
There is a great need for the ability to provide real-time detection and analysis of system and network behavior, with appropriate real-time responses, from the coordinated perspective of survivability and its subtended requirements. There has been considerable work on this topic for almost two decades.

SRI has pioneered work on rule-based expert system analysis and statistical analysis, through IDES (Intrusion Detection Expert System) and NIDES (Next-Generation IDES). EMERALD (Event Monitoring Enabling Responses to Anomalous Live Disturbances) is the current extension of IDES and NIDES to monitor network activity. Overall, we know of no efforts besides EMERALD that are oriented toward detecting problems arising in connection with generalized survivability. (See

There are many other institutions that have been developing systems addressing various aspects of the intrusion-detection problem, typically using either rule-based techniques or statistical analyses, but in most cases not both, and usually dealing with users of individual systems or local networks. See arl-one for copious references.

Anomaly and Misuse Detection
- - - - - - - - - - - - - - - - - - -
"Intrusion Detection" is too narrow a term, because we need to encompass insider misuse, and threats to reliability and survivability. "Misuse and anomaly detection" is somewhat closer to what is needed, and should encompass insiders and outsiders, intentional and accidental misuse, and ultimately reliability and survivability issues.

Tremendous reliance on these technologies is being recommended by various Government organizations. However, much greater emphasis on robustness and prevention of misuse is needed.

Commercial systems are relatively inflexible, heavily slanted toward exploitations of known vulnerabilities by intruders, using string matching or pattern matching, largely ignoring insider attacks, reliability vulnerabilities, and hitherto unrecognized threats. For a recent workshop on insider misuse, see

Detection can be aided by better prevention.
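The limitation of pure string matching noted above can be seen in a small sketch. The signature table, event strings, and the function `match_signatures` are all hypothetical illustrations, not taken from any commercial product.

```python
from typing import Dict, List

# Hypothetical signature table: name -> substring known to indicate an attack.
SIGNATURES: Dict[str, str] = {
    "passwd-file-read": "/etc/passwd",
    "directory-traversal": "../..",
}


def match_signatures(event: str, signatures: Dict[str, str] = SIGNATURES) -> List[str]:
    """Return the names of all known signatures found in the event text."""
    return [name for name, pattern in signatures.items() if pattern in event]


# An outsider exploiting a known vulnerability is caught ...
known_attack = "GET /cgi-bin/view?file=../../etc/passwd"
# ... but an insider misusing legitimate access matches no stored pattern.
insider_misuse = "user=alice action=export table=payroll rows=250000"

known_hits = match_signatures(known_attack)
insider_hits = match_signatures(insider_misuse)
```

Here `known_hits` names both signatures while `insider_hits` is empty, which is the gap that anomaly detection and better prevention are meant to narrow.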

Real-Time Analysis for Survivability:
Event Monitoring Enabling Responses
to Anomalous Live Disturbances (EMERALD) and
Numerous slides can be found at the former URL.
- - - - - - - - - - - - - - - - - - -
Distributed real-time analysis, with various analytic engines: expert system seeks known attack signatures, statistics seek deviations from expected normal behavior, capable of enterprise-wide hierarchical correlation

Well-engineered software, designed for generality, flexibility, scalability, interoperability with other systems, rapid lightweight deployment, and polymorphic use via a library of application bases for different targets

Applicable to heterogeneous events (systems, servers, network services) and requirements (survivability, security, reliability, etc.)
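The statistical engine and hierarchical correlation described above might be sketched as follows. This is an illustrative toy under stated assumptions, not EMERALD's design: the classes `Alert` and `StatisticalProfile`, the function `correlate`, the threshold of 3 standard deviations, and the monitor names are all hypothetical.

```python
import statistics
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Alert:
    monitor: str   # which local monitor raised the alert
    subject: str   # user, host, or service the alert concerns
    score: float   # deviation from the expected profile


class StatisticalProfile:
    """Expected normal behavior summarized by mean and sample standard deviation."""

    def __init__(self, history: List[float]):
        self.mean = statistics.mean(history)
        self.stdev = statistics.stdev(history)

    def deviation(self, observed: float) -> float:
        """Distance from the mean, measured in standard deviations."""
        if self.stdev == 0:
            return 0.0 if observed == self.mean else float("inf")
        return abs(observed - self.mean) / self.stdev


def correlate(alerts: List[Alert], min_monitors: int = 2) -> List[str]:
    """Escalate subjects reported independently by at least min_monitors monitors."""
    seen: Dict[str, Set[str]] = {}
    for alert in alerts:
        seen.setdefault(alert.subject, set()).add(alert.monitor)
    return sorted(s for s, monitors in seen.items() if len(monitors) >= min_monitors)


# Illustrative use: rows exported per session by one user (made-up numbers).
profile = StatisticalProfile([100, 110, 90, 105, 95])
THRESHOLD = 3.0  # assumed cutoff, in standard deviations
alerts = []
for monitor in ("host-monitor", "network-monitor"):
    score = profile.deviation(250000)
    if score > THRESHOLD:
        alerts.append(Alert(monitor, "alice", score))
escalated = correlate(alerts)
```

Requiring agreement between independent monitors before escalating is one simple way to trade sensitivity for fewer false alarms. A real deployment would need far richer profiles and correlation rules.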

"Motherhood" is Short Shrifted!
- - - - - - - - - - - - - - - - - - -
Functional requirements for survivability and its subtended abilities must be correct, consistent, complete, and precisely specified from the outset. Requirements must also anticipate operational needs such as interoperability, reusability, and evolvability.

Architectural fundamentals must be sound. Designs must be thoroughly specified, and evaluated for consistency with their requirements prior to implementation.

Implementation strategy must be realistic, adhered to consistently, and demonstrably capable of satisfying the requirements.

Plans for testing must be anticipatory and thorough.

A Little Wisdom
- - - - - - - - - - - - - - - - - - -
Understand your long-term goals, as well as the threats and risks.

Understand your operational needs up front.

Understand the practical limitations of your vendors, developers, administrators, and operational personnel.

Good software engineering practice and foresight are incredibly valuable, and worth the extra effort in complex systems and networks.

Short-term optimization is usually counterproductive.

A Baseline Survivable Architecture
- - - - - - - - - - - - - - - - - - -
Future work will include the definition and detailed documentation of a baseline family of architectures capable of satisfying critical survivability requirements: an evolutionary open-system approach, minimizing the scope of trustworthiness, including trusted paths where needed, systemic authentication, robust encryption, demonstrably trustworthy mobile code, generalized-dependence mechanisms as needed, accommodating MLS/MLI/MLX as needed without unduly complicating single-level operation, extensive real-time anomaly/misuse detection, ... [From arl-one]

Putting It All Together
- - - - - - - - - - - - - - - - - - -
Establish the mission requirements and map them into specific system/network requirements.

Establish a specific instance of the baseline architecture.

Analyze it with respect to the requirements.

Refine the architecture and document it.

Implement it, test it, red-team it, and operate it. Inject faults and attacks, and determine the effects. Let the users in and see if it prevents penetrations and other misuse. Iterate whenever necessary. Be wary rather than keeping your head in the sand.
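The step of injecting faults and determining the effects can be sketched as a small campaign against a replicated service. Everything here is a hypothetical illustration: a majority-voting read stands in for the architecture under test, and `healthy`, `crashed`, and `corrupted` are assumed fault models.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

Replica = Callable[[], Optional[str]]


def majority_read(replicas: List[Replica]) -> Optional[str]:
    """Return the value reported by a strict majority of replicas, else None."""
    answers = [replica() for replica in replicas]
    value, count = Counter(answers).most_common(1)[0]
    if value is not None and count > len(replicas) // 2:
        return value
    return None


def healthy() -> Optional[str]:
    return "ok"


def crashed() -> Optional[str]:
    return None          # models a replica that gives no answer


def corrupted() -> Optional[str]:
    return "tampered"    # models a replica subverted by an attacker


def campaign(n_replicas: int, fault: Replica) -> Dict[int, bool]:
    """Inject 0..n faulty replicas and record whether the correct value survives."""
    results = {}
    for k in range(n_replicas + 1):
        replicas = [fault] * k + [healthy] * (n_replicas - k)
        results[k] = majority_read(replicas) == "ok"
    return results


crash_results = campaign(5, crashed)
corruption_results = campaign(5, corrupted)
```

In this toy model the service survives up to two faulty replicas out of five under either fault model and fails beyond that. Comparing such measured limits against the stated requirements is what drives the iteration.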

Residual Vulnerabilities and Risks
- - - - - - - - - - - - - - - - - - -
Despite your very best efforts, there will still be vulnerabilities, serious threats that are unchecked, and risks:

Hardware failures
Software flaws
Malicious penetrations
Trojan horses
Insider misuse
Inadvertent human actions
Environmental disasters
Other operational hazards
Other unforeseen circumstances!

Anticipate these realities, and plan accordingly. Misuse and anomaly detection remains desirable, even in the very best of circumstances.

Some Suggested Future R&D Directions
- - - - - - - - - - - - - - - - - - -
Establish a more detailed set of generic system/network requirements for survivability and characterize their interdependencies.

Establish baseline families of survivable architectures and document them thoroughly.

Define more-robust networking protocols.

Establish a survivable foundation for mobile code, specifying requirements on operating systems, browsers and servers, boundary protection, network protocols, programming languages, system composition, and other factors, and defining adequate system/network architectures.

Develop prototype testbeds that realize the best architectures, and conduct experiments.

Develop a rigorous framework for generalized dependence.

Formalize predictable subsystem composition.

Formalize the process of transforming "mission requirements" into specific requirements (a subset of the generic requirements).

Provide a rigorous basis for trustworthiness-preserving dynamic system reconfiguration and dynamic adaptability.

PCCIP Organizational Recommendations
- - - - - - - - - - - - - - - - - - -
Office of National Infrastructure Assurance
National Infrastructure Assurance Council
Infrastructure Assurance Support Office
A Federal Lead Agency for each sector
Sector Infrastructure Assurance Coordinator
Information Sharing and Analysis Center
Warning Center

Presidential Decision Directive 63 transformed the PCCIP into the Critical Infrastructure Assurance Office (CIAO), to support the National Coordinator (overseeing critical infrastructure protection and foreign terrorism).

Information survivability and its subtended requirements must be addressed fully.

PCCIP R&D Recommendations
- - - - - - - - - - - - - - - - - - -
1. Information assurance
2. Intrusion monitoring and detection
3. Vulnerability assessment and systems analysis
4. Risk-management decision support (risky?)
5. Protection and mitigation
6. Incident response and recovery

The recommendations here extend security to survivability, and address items 1, 2, 3, 5, and the recovery part of 6. The incident response part is useful, but has serious effectiveness problems. Item 4 is risky (see Computer-Related Risks, pp. 255-257)!

Education and Training
- - - - - - - - - - - - - - - - - - -
Concurrently with R&D efforts, we recommend preparing detailed course materials inspired by this project, teaching those materials in institutional settings (government, university, corporate), and influencing the integration of survivability concepts into regular curricula.

A strong understanding of the requirements for survivable systems and networks, the vulnerabilities, threats and risks, the deficiencies in available (sub)systems, necessities for robust architectures, and guidelines for system development (and in the case of DoD and corporations, for procurement) is essential for everyone involved in systems intended to be survivable.
