University of Twente (THE NETHERLANDS)
Prediction of system availability in presence of local recovery.
xADL, Java, ArchStudio
CADP (Construction and Analysis of Distributed Processes)
Local recovery is an important technique to achieve fault tolerance.
Whereas a global recovery strategy restarts the whole system upon
detection of an error, thus making the entire system unavailable until
its normal operational mode is reached again, local recovery strategies
work at a lower level of granularity. They partition the system into
several recoverable units (RUs) so that each RU consists of a number of
software modules, and each RU can be recovered independently. Thus, a
better availability is guaranteed: recovering a part of the system is
usually faster than recovering the whole system and, moreover, the
non-affected system parts remain operational.
The availability of the system (percentage of time the system is up) heavily depends on the chosen software decomposition, i.e., the grouping of software modules are grouped into RUs. Since the implementation of a local recovery strategy is a time-consuming and a nontrivial task, it is important to have a quick, easy and accurate method that predicts the system availability of a given decomposition alternative at design time. In this way, various decomposition alternatives can be compared and only the best one can be implemented. The method proposed for solving this problem proceeds as follows. Taking as input a software decomposition, together with failure and repair rates for each module, a continuous-time Markov chain (CTMC) is generated automatically. Then, standard CTMC analysis methods are used to compute the system availability. Other performance and dependability measures, such as the average number of operational units, number of failures during the first hour of operation, can be obtained as well.
All architectural elements are translated into a set of Input/Output Interactive Markov Chains (I/O-IMCs), which augment traditional CTMCs with discrete actions, thus enabling synchronization between them. More precisely, the architectural elements are specified using MIOA (Markovian Input/Output Automata) and are automatically translated into I/O-IMCs accepted as input by CADP. To reduce the size of the final CTMC, the intermediate I/O-IMCs modeling the various elements (failure and recovery behavior of each module, recovery manager, interfaces between RUs and the recovery manager) are composed one by one and reduced incrementally using the BCG_MIN tool of CADP after each composition. The whole process is automated using the SVL scripting language of CADP.
The proposed methodology was implemented using JAVA and CADP, and was integrated it into the FLORA framework, which facilitates the decomposition and implementation of software architectures for local recovery. FLORA is part of the ArchStudio environment and enhances it with a push button technology to predict the availability of systems resulting from certain software decompositions. The modeling and analysis methodology was experimented on a real-life software system, namely the MPlayer open-source media player. Four different decomposition alternatives were investigated and the availability predicted by the analytical models was compared to the availability measurements obtained from the actual implementations. It turned out that the analytical results closely match the measured availabilities.
An easy-to-use methodology was proposed to provide quantitative means
for comparing different software decomposition alternatives in terms
of their availabilities. The whole analysis procedure was automated
with a Java/CADP-based tool and was integrated, with the FLORA
framework, into the ArchStudio environment, which is based on xADL.
This methodology can also be integrated into other modeling/analysis
frameworks, such as the UML or the AADL modeling formalisms, to support
certain dependability/performance analyses.
"Architecting Fault-Tolerant Software Systems".
PhD thesis, University of Twente, The Netherlands, January 2009.
Available on-line at: http://doc.utwente.nl/60513/1/thesis_H_Sozer.pdf
or from our FTP site in PDF or PostScript
[Boudali-Sozer-Stoelinga-09] Hichem Boudali, Hasan Sözer, and Marielle Stoelinga. "Architectural Availability Analysis of Software Decomposition for Local Recovery". Proceedings of the 3rd IEEE International Conference on Secure Software Integration and Reliability Improvement SSIRI'2009, pp. 14-22. IEEE Computer Society Press, July 2009.
Available on-line at: http://doc.utwente.nl/68511/1/archAvailSWRec.pdf
or from our FTP site in PDF or PostScript
Dept. of Computer Science
Formal Methods & Tools
P.O. Box 217
7500 AE Enschede
Tel: +31 53 489 3773
Fax: +31 53 489 3247
|Further remarks:||This tool, amongst others, is described on the CADP Web site: http://cadp.inria.fr/software|