Technical University of Eindhoven (THE NETHERLANDS)
CADP (Construction and Analysis of Distributed Processes)
Heartbeat protocols are used as the underlying synchronization
mechanism for many other distributed protocols. The basic idea
behind a heartbeat protocol is that once a participating process
or a communication channel crashes, other processes become aware
of this fact and become inactive within a certain interval.
To this end, processes periodically exchange simple messages,
called heartbeats, to inform each other about their liveness.
If an expected heartbeat is not received after a specific time,
it is assumed that either the respective process has failed or the
communication medium is down. After a number of periods without any
response, the expecting processes eventually become inactive, thus
guaranteeing timely inactivation of all participants after a process
or channel crash.
This work deals with the formal specification and analysis of several existing variants of accelerated heartbeat protocols (binary, static, expanded, and dynamic), which aim at reducing the overhead (the rate of heartbeat transmissions), minimizing the detection delay (the interval between the crash and the deactivation of all processes), and maximizing reliability (minimizing the probability of inactivation due to lost heartbeats). The heartbeat protocols were described using two different specification formalisms: the process algebra mCRL2 and the timed-automata language of UPPAAL. Next, a set of three basic requirements were specified, characterizing the safety and liveness of the protocols, namely that upon a crash, all processes will eventually be deactivated within a certain period of time and, if no process crashes and no message is lost or delayed, then no process will decide to deactivate.
For the process algebraic specification, these properties were specified using a combination of monitor processes and modal mu-calculus formulas, and were verified using the EVALUATOR model checker of CADP. For the automata-theoretic specification, the properties were specified using a combination of monitor timed automata and reachability properties, and were verified using UPPAAL. Quite surprisingly, for each of the protocols the verification found out situations where one or both of the above properties are not satisfied. Using the counterexamples generated by EVALUATOR, these violations of correctness properties are traced back to two main causes: inappropriate handling of simultaneous events and incorrect time-bounds for the inactivation of processes. Fixes of the various heartbeat protocol variants were proposed, and the model checking of the fixed versions indicates that the errors were removed.
Formal specification and verification were successfully applied to
accelerated heartbeat protocols, and allowed to detect flaws in their
functioning. Several corrections of these protocols were proposed,
and shown by means of model checking to behave correctly.
Muhammad Atif and MohammadReza Mousavi.
"Formal Specification and Analysis of Accelerated Heartbeat Protocols".
CS-Report 09-04, Technical University of Eindhoven, 24 pages,
Available on-line at: http://www.win.tue.nl/~atif/reports/report0904.pdf
or from our FTP site in PDF or PostScript
Department of Mathematics and Computer Science
Eindhoven University of Technology
P.O. Box 513
Phone: +31 402 474 425
|Further remarks:||This case study, amongst others, is described on the CADP Web site: http://cadp.inria.fr/case-studies|