Database of Research Tools Developed Using CADP

Automata Learning through Counterexample-guided Abstraction Refinement

Organisation: Radboud University Nijmegen (THE NETHERLANDS)
Aalborg University (DENMARK)

Functionality: Learning automata.

Tools used: LearnLib
CADP (Construction and Analysis of Distributed Processes)

Period: 2012

Description: The problem to build a state machine model of a system by providing inputs to it and observing the resulting outputs, often referred to as black box system identification, is both fundamental and of clear practical interest. A major challenge is to let computers perform this task in a rigorous manner for systems with large numbers of states. The most efficient techniques for constructing models from observation of component behavior are based on active learning, where a model of a system is learned by actively performing experiments on that system. Tools that are able to infer state machine models automatically by systematically "pushing buttons" and recording outputs have numerous applications in various domains, such as understanding and analyzing legacy software, regression testing of software components, protocol conformance testing based on reference implementations, reverse engineering of proprietary/classified protocols, fuzz testing of protocol implementations, and inference of botnet protocols.

This work presents an algorithm that is able to compute appropriate abstractions for a restricted class of system models. The algorithm applies to a class of extended finite state machines, called scalarset Mealy machines, in which one can test for equality of data parameters, but no operations on data are allowed. The approach constructs a so-called mapper, which is placed in between the teacher (or system-under-test, SUT) and the learner, and transforms the interface of the teacher by an abstraction that maps (in a history dependent manner) the large set of actions of the teacher into a small set of abstract actions.

The learning algorithm has been implemented in a prototype tool named Tomte, after the creature that shrank Nils Holgersson into a gnome and (after numerous adventures) changed him back to his normal size again. The basic learning tool used is LearnLib and the mapper is constructed in terms of a counterexample guided abstraction refinement (CEGAR) procedure. Tomte was used for learning fully automatically models of several realistic software components, including the biometric passport and the SIP protocol. For the benchmarks requiring history dependent abstractions, Tomte is the first tool that has been able to learn the models fully automatically. All models inferred were checked to be observation equivalent to the corresponding SUT, by combining the learned model with the abstraction and applying the CADP toolbox for equivalence checking.

Conclusions: During the last few years important developments have taken place on the borderline of verification, model-based testing and automata learning. By combining ideas from these three areas, it will certainly become possible to learn models of realistic software components with state spaces that are many orders of magnitude larger than what tools can currently handle. Even though the class of systems to which Tomte applies is limited, the fact that the approach makes it possible to learn models of systems with data fully automatically is a major step towards a practically useful technology for automatic learning of models of software components.

Publications: [Aarts-Heidarian-Kuppens-Olsen-Vaandrager-12] Fides Aarts, Faranak Heidarian, Harco Kuppens, Petur Olsen, and Frits W. Vaandrager. "Automata Learning through Counterexample Guided Abstraction Refinement". Proceedings of the 18th International Symposium on Formal Methods (FM'2012), Paris, France, Lecture Notes in Computer Science vol. 7436, pages 10-27. Springer Verlag, August 2012.
Available on-line at:
or from the CADP Web site in PDF or PostScript

[Vaandrager-12] Frits W. Vaandrager. "Active Learning of Extended Finite State Machines". Proceedings of the 24th IFIP WG 6.1 International Conference on Testing Software and Systems (ICTSS'2012), Aalborg, Denmark, Lecture Notes in Computer Science vol. 7641, pages 5-7. Springer Verlag, November 2012.
Available on-line at:
or from the CADP Web site in PDF or PostScript

[Aarts-14] Fides Aarts. "Tomte: Bridging the Gap Between Active Learning and Real-World Systems". PhD thesis, Radboud University Nijmegen (The Netherlands), October 2014.
Available on-line at:
or from the CADP Web site in PDF or PostScript

[Roose-22] Gideon Roose. "An Experience Report on Model Learning". BSc thesis, Vrije Universiteit Amsterdam (The Netherlands), July 2022.
Available on-line at:
or from the CADP Web site in PDF or PostScript

Prof. Dr. Frits W. Vaandrager
Institute for Computing and Information Sciences
Mailbox 47, Faculty of Science
Radboud University Nijmegen
Heijendaalseweg 135, Huygens Building
6525 AJ Nijmegen
The Netherlands
Tel: +31 24 365 2216
Fax: +31 24 365 2525

Further remarks: This tool, amongst others, is described on the CADP Web site:

Last modified: Mon Feb 12 10:01:27 2024.

Back to the CADP research tools page