Radboud University Nijmegen (THE NETHERLANDS)
Aalborg University (DENMARK)
CADP (Construction and Analysis of Distributed Processes)
The problem to build a state machine model of a system by providing
inputs to it and observing the resulting outputs, often referred to as
black box system identification, is both fundamental and of clear
practical interest. A major challenge is to let computers perform this
task in a rigorous manner for systems with large numbers of states.
The most efficient techniques for constructing models from observation
of component behavior are based on active learning, where a model of a
system is learned by actively performing experiments on that system.
Tools that are able to infer state machine models automatically by
systematically "pushing buttons" and recording outputs have numerous
applications in various domains, such as understanding and analyzing
legacy software, regression testing of software components, protocol
conformance testing based on reference implementations, reverse
engineering of proprietary/classified protocols, fuzz testing of
protocol implementations, and inference of botnet protocols.
This work presents an algorithm that is able to compute appropriate abstractions for a restricted class of system models. The algorithm applies to a class of extended finite state machines, called scalarset Mealy machines, in which one can test for equality of data parameters, but no operations on data are allowed. The approach constructs a so-called mapper, which is placed in between the teacher (or system-under-test, SUT) and the learner, and transforms the interface of the teacher by an abstraction that maps (in a history dependent manner) the large set of actions of the teacher into a small set of abstract actions.
The learning algorithm has been implemented in a prototype tool named Tomte, after the creature that shrank Nils Holgersson into a gnome and (after numerous adventures) changed him back to his normal size again. The basic learning tool used is LearnLib and the mapper is constructed in terms of a counterexample guided abstraction refinement (CEGAR) procedure. Tomte was used for learning fully automatically models of several realistic software components, including the biometric passport and the SIP protocol. For the benchmarks requiring history dependent abstractions, Tomte is the first tool that has been able to learn the models fully automatically. All models inferred were checked to be observation equivalent to the corresponding SUT, by combining the learned model with the abstraction and applying the CADP toolbox for equivalence checking.
During the last few years important developments have taken place on
the borderline of verification, model-based testing and automata
learning. By combining ideas from these three areas, it will certainly
become possible to learn models of realistic software components with
state spaces that are many orders of magnitude larger than what tools
can currently handle. Even though the class of systems to which Tomte
applies is limited, the fact that the approach makes it possible to
learn models of systems with data fully automatically is a major step
towards a practically useful technology for automatic learning of
models of software components.
Fides Aarts, Faranak Heidarian, Harco Kuppens, Petur Olsen, and
Frits W. Vaandrager.
"Automata Learning through Counterexample Guided Abstraction
Proceedings of the 18th International Symposium on Formal Methods
FM'2012 (Paris, France), Lecture Notes in Computer Science vol. 7436,
pages 10-27. Springer Verlag, August 2012.
Available on-line at: http://www.mbsd.cs.ru.nl/publications/papers/fvaan/CEGAR12/report.pdf
or from the CADP Web site in PDF or PostScript
[Vaandrager-12] Frits W. Vaandrager. "Active Learning of Extended Finite State Machines". Proceedings of the 24th IFIP WG 6.1 International Conference on Testing Software and Systels ICTSS'2012 (Aalborg, Denmark), Lecture Notes in Computer Science vol. 7641, pages 5-7. Springer Verlag, November 2012.
Available on-line at: http://link.springer.com/content/pdf/10.1007%2F978-3-642-34691-0_2.pdf
[Aarts-14] Fides Aarts. "Tomte: Bridging the Gap Between Active Learning and Real-World Systems". PhD thesis, Radboud University Nijmegen (The Netherlands), October 2014.
Available on-line at: http://repository.ubn.ru.nl/handle/2066/130428
or from the CADP Web site in PDF
Prof. Dr. Frits W. Vaandrager
Institute for Computing and Information Sciences
Mailbox 47, Faculty of Science
Radboud University Nijmegen
Heijendaalseweg 135, Huygens Building
6525 AJ Nijmegen
Tel: +31 24 365 2216
Fax: +31 24 365 2525
|Further remarks:||This tool, amongst others, is described on the CADP Web site: http://cadp.inria.fr/software|