Introduces a variety of hardware and software techniques for designing and modeling fault-tolerant computers. Topics include coding techniques (Hamming, SECSED, SECDED, etc.); majority voting schemes (TMR); software redundancy (N-version programming); software recovery schemes; and network reliability design and estimation. Introduces probabilistic methods for reliability modeling. Examples are drawn from space fault-tolerant systems, networks, and commercial non-stop systems (TANDEM and STRATUS). RAID memory systems. Fault-tolerant modeling tools such as HARP, SURE, and SHARPE.