Setting Up Learning

To learn any system, simply implement the reset (pre and post) and step methods of the SUL interface. For automata supported by AALpy, SUL implementations already exist. For a more detailed explanation and examples of how to implement the SUL interface, take a look at the SUL Interface, or How to Learn Your Systems section of the Wiki.
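
As a rough illustration of the interface described above, the following sketch uses a standalone stand-in class so that it runs without AALpy installed. The class name EvenAsSUL and the helper query are hypothetical; in an actual setup the class would subclass the SUL base class from aalpy.base instead of being standalone.

```python
class EvenAsSUL:
    """Hypothetical system under learning: outputs True while the number of 'a's seen is even."""

    def __init__(self):
        self.a_count = 0

    def pre(self):
        # Reset/initialize the system before each query.
        self.a_count = 0

    def post(self):
        # Clean up after each query; nothing to do for this toy system.
        pass

    def step(self, letter):
        # Execute a single input and return the corresponding output.
        if letter == 'a':
            self.a_count += 1
        return self.a_count % 2 == 0


def query(sul, word):
    """Run one membership query: reset, execute every input, clean up."""
    sul.pre()
    outputs = [sul.step(letter) for letter in word]
    sul.post()
    return outputs


outputs = query(EvenAsSUL(), ('a', 'b', 'a'))
```

The learning algorithm only ever interacts with the system through these three methods, so any system that can be reset and stepped can be wrapped this way.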

Once you have implemented the SUL, you need to select an equivalence oracle. For a more detailed discussion of conformance checking and equivalence oracles, please refer to the Equivalence Oracles and Conformance Checking section of the Wiki.

Once you have done so, you should have an input alphabet, an implemented SUL, and an equivalence oracle. Now let us look at some of the most used and valuable parameters that you can customize while learning. For an in-depth look at all parameters, take a look at the official documentation.

The following parameters are valid for learning deterministic, non-deterministic, and stochastic systems.

Common Learning Options

Printing Learning Progress

Oftentimes we want to know the current status of learning. Therefore, we provide four printing options. Option 3 includes the printouts from options 1 and 2, and option 2 includes the printout from option 1. They are selected by setting the print_level parameter of the learning algorithm to one of the following:

0 -> No printing during learning
1 -> Only display learning statistics when the learning is done
2 -> In each learning round, print the number of states of the current hypothesis
3 -> In each learning round, print the complete observation table

Example learning statistics printout.

-----------------------------------
Learning Finished.
Learning Rounds:  2
Number of states: 4
Time (in seconds)
  Total                : 0.0
  Learning algorithm   : 0.0
  Conformance checking : 0.0
Learning Algorithm
 # Membership Queries  : 16
 # MQ Saved by Caching : 10
 # Steps               : 45
Equivalence Query
 # Membership Queries  : 47
 # Steps               : 530
-----------------------------------

Example observation table. ======================================== denotes the beginning of the extended S set.

----------------------------------------
Prefixes / E set |()    |('b',) |('a',) 
----------------------------------------
()               |True  |False  |False  
----------------------------------------
('a',)           |False |False  |True   
----------------------------------------
('b',)           |False |True   |False  
----------------------------------------
('b', 'a')       |False |False  |False  
========================================
----------------------------------------
('a',)           |False |False  |True   
----------------------------------------
('b',)           |False |True   |False  
----------------------------------------
('a', 'a')       |True  |False  |False  
----------------------------------------
('a', 'b')       |False |False  |False  
----------------------------------------
('b', 'a')       |False |False  |False  
----------------------------------------
('b', 'b')       |True  |False  |False  
----------------------------------------
('b', 'a', 'a')  |False |True   |False  
----------------------------------------
('b', 'a', 'b')  |False |False  |True   
----------------------------------------

Returning Learning Statistics

If return_data is set to True, a dictionary containing the following values will be returned alongside the hypothesis once learning is done.

info = {
    'learning_rounds': learning_rounds,
    'automaton_size': len(hypothesis.states),
    'queries_learning': sul.num_queries,
    'steps_learning': sul.num_steps,
    'queries_eq_oracle': eq_oracle.num_queries,
    'steps_eq_oracle': eq_oracle.num_steps,
    'learning_time': learning_time,
    'eq_oracle_time': eq_query_time,
    'total_time': total_time
}
# additional field for deterministic systems
if cache_and_non_det_check:
    info['cache_saved'] = sul.num_cached_queries
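
A minimal sketch of how the returned dictionary might be post-processed. The values below are copied from the example statistics printout shown earlier; the helper summarize is hypothetical and not part of AALpy.

```python
# Values taken from the example learning statistics printout above.
info = {
    'learning_rounds': 2,
    'automaton_size': 4,
    'queries_learning': 16,
    'steps_learning': 45,
    'queries_eq_oracle': 47,
    'steps_eq_oracle': 530,
    'learning_time': 0.0,
    'eq_oracle_time': 0.0,
    'total_time': 0.0,
    'cache_saved': 10,  # only present for deterministic systems with the cache enabled
}


def summarize(info):
    """Hypothetical helper: aggregate the interaction counts of a learning run."""
    total_queries = info['queries_learning'] + info['queries_eq_oracle']
    total_steps = info['steps_learning'] + info['steps_eq_oracle']
    return {
        'total_queries': total_queries,
        'total_steps': total_steps,
        # Fraction of all steps that were spent on conformance checking.
        'eq_oracle_step_share': info['steps_eq_oracle'] / total_steps,
    }


summary = summarize(info)
```

In this example, most steps are spent in the equivalence oracle rather than in the learning algorithm itself, which is typical when conformance checking relies on random testing.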

Maximum Number of Learning Rounds

Sometimes we want to stop learning early, for example because of model size or time constraints. If max_learning_rounds is set, learning will terminate after that many learning rounds.
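
The common options above can be collected in one place and passed to the learning algorithm as keyword arguments. The helper below is a hypothetical sketch, not part of AALpy; its default values are choices of this sketch, and the validation only reflects the value ranges described in this section.

```python
def common_learning_options(print_level=1, return_data=False, max_learning_rounds=None):
    """Hypothetical helper: validate and bundle the common learning options."""
    if print_level not in (0, 1, 2, 3):
        raise ValueError('print_level must be 0, 1, 2 or 3')
    if max_learning_rounds is not None and max_learning_rounds < 1:
        raise ValueError('max_learning_rounds must be a positive integer or None')
    return {
        'print_level': print_level,
        'return_data': return_data,
        'max_learning_rounds': max_learning_rounds,
    }


options = common_learning_options(print_level=1, return_data=True, max_learning_rounds=10)

# Assumed usage with AALpy (not executed here, since it needs a SUL and an oracle):
# model, info = run_Lstar(alphabet, sul, eq_oracle, automaton_type='dfa', **options)
```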

Learning Deterministic Systems

The following are the most important aspects of deterministic automata learning that the user can change. Most users can simply use the default values of the run_Lstar method with an appropriate automaton_type value. automaton_type determines whether a DFA, a Mealy machine, or a Moore machine will be learned.

For all possible variations of the basic L* algorithm, refer to: https://emuskardin.github.io/AALpy/aalpy/learning_algs/deterministic/LStar.html

Cache

Cache is a multiset of all traces observed so far. It is encoded as a tree. We use this tree/cache to check for violations of determinism and to improve runtime by not performing unnecessary membership queries. For all deterministic systems, cache is enabled by default. We strongly suggest always leaving it enabled.

Using the cache is especially important when learning non-simulated systems or systems where each step/reset is expensive in computation or time. The small overhead introduced by the cache is negligible compared to the speedup obtained by saving membership queries.

When is it acceptable to disable the cache? Only if the system you are learning has fast steps/membership queries and you are certain that it will behave deterministically.
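
To make the idea of a tree-shaped cache concrete, here is a simplified sketch. It is not AALpy's implementation; the class TraceCache and its methods are hypothetical names chosen for illustration. Each node maps an input to the observed output and a child node, so a repeated query can be answered from the tree, and a conflicting output for the same trace reveals non-determinism.

```python
class TraceCache:
    """Hypothetical prefix-tree cache of observed input/output traces."""

    def __init__(self):
        # Each node is a dict: input letter -> (observed output, child node).
        self.root = {}

    def add(self, inputs, outputs):
        """Store a trace; raise if it contradicts an earlier observation."""
        node = self.root
        for letter, output in zip(inputs, outputs):
            if letter in node:
                known_output, child = node[letter]
                if known_output != output:
                    raise ValueError('Non-deterministic output observed for input ' + repr(letter))
            else:
                child = {}
                node[letter] = (output, child)
            node = child

    def lookup(self, inputs):
        """Return cached outputs for the input sequence, or None if not fully cached."""
        node = self.root
        outputs = []
        for letter in inputs:
            if letter not in node:
                return None
            output, node = node[letter]
            outputs.append(output)
        return outputs


cache = TraceCache()
cache.add(('a', 'b'), [False, False])

hit = cache.lookup(('a',))   # prefix of a stored trace, answered without querying the system
miss = cache.lookup(('b',))  # never observed, the system would have to be queried
```

Because every prefix of a stored trace is itself stored, a single long query can answer many later, shorter ones.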

Counterexample Processing

AALpy offers three counterexample processing options: None, 'rs' (Rivest-Schapire), and 'longest_prefix' (Shahbaz-Groz). We suggest using Rivest-Schapire counterexample processing. If no counterexample processing is selected (None), the consistency check will be used.
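
As an illustration of the longest-prefix idea, the sketch below drops the longest prefix of the counterexample that already appears among the rows of the observation table and returns all suffixes of what remains as candidates for the E set. This is a simplified sketch under that reading of the strategy, not AALpy's implementation; the function name and the counterexample are hypothetical, while the row prefixes are taken from the example observation table above.

```python
def longest_prefix_suffixes(table_prefixes, cex):
    """Hypothetical helper: suffixes of the counterexample left after trimming
    the longest prefix that is already a row of the observation table."""
    cex = tuple(cex)
    # Longest table row that is a prefix of the counterexample (the empty word always matches).
    longest = max((p for p in table_prefixes if cex[:len(p)] == p), key=len, default=())
    # If the whole counterexample is already a row, fall back to the full counterexample.
    remainder = cex[len(longest):] or cex
    # All non-empty suffixes of the remainder, shortest first.
    return [remainder[i:] for i in range(len(remainder) - 1, -1, -1)]


# Rows (S set and extended S set) of the example observation table.
rows = [
    (), ('a',), ('b',), ('b', 'a'),
    ('a', 'a'), ('a', 'b'), ('b', 'b'),
    ('b', 'a', 'a'), ('b', 'a', 'b'),
]

# Made-up counterexample, used only to exercise the helper.
new_suffixes = longest_prefix_suffixes(rows, ('b', 'a', 'b', 'a', 'a'))
```

Here the longest matching row is ('b', 'a', 'b'), so only the suffixes of the remaining ('a', 'a') are considered, which keeps the E set smaller than adding every suffix of the full counterexample.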

For more details take a look at: https://emuskardin.github.io/AALpy/aalpy/learning_algs/deterministic/CounterExampleProcessing.html