
4. Agents in ETOILE

4.1. Role distribution

We can summarize the previous section by saying that a learning environment is structured along two dimensions: a sequence of microworlds (with the described features) and a set of agents who interact with the learner within each microworld. By specifying several agents with various roles, we have been able to separate pedagogical knowledge from domain-specific knowledge. The separation idea itself had two reasons:

We should point out that the terms `domain-independent' and `pedagogical' are not synonymous. There exists some domain-specific pedagogical knowledge, for instance knowing that problem-23 is more difficult than problem-24. Therefore, we have to cope with the fact that the boundary between what is domain specific and what is pedagogical is not a straight line.

Figure 4: The learner-tutor-expert triangle in ETOILE: the tutor monitors the interaction between the learner and a domain-specific agent.

The originality of ETOILE with respect to other AI-based authoring shells such as DOCENT (Winne and Kramer, 1988), IDE (Russell, Moran and Jordan, 1988), ISD (Merrill and Li, 1989) or CML (Jones and Wipond, 1991) lies in the approach chosen for separating pedagogical knowledge from domain-specific expertise. The common solution to this problem is to create a terminology for describing any subject domain (using abstractions such as units, chunks, concepts, rules, principles, ...). Variables in common pedagogical rules refer to this kind of terminology. The author's task is to describe the domain using those building blocks. However, building a terminology that fits such diverse domains as astronomy, music and engineering is a challenge that educators and even philosophers have been facing for 2000 years. Therefore, we advocate another approach: the pedagogical rules never refer to the domain. Instead, they refer to the interaction between the learner and a domain-specific agent. These interactions are described in terms of agreement, disagreement, repair, and so forth. The domain-specific agent is (in the current implementation) an expert system able to solve any problem submitted to the learner. The domain-specific terminology used by this expert is unknown to the tutor; it is confined to the expert (although shared with the learner) and therefore escapes the generality constraints. We will describe how the tutor influences the expert-learner interactions, but in order to do that we must first explain how the learner and the expert collaborate.
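To give the flavour of this approach, here is a minimal sketch in Python (all names are our own invention; the actual ETOILE rules are expressed in the formalism of its inference engine) of a pedagogical rule whose conditions mention only interaction events, never domain terms:

    from dataclasses import dataclass

    @dataclass
    class InteractionEvent:
        kind: str    # 'agreement' | 'disagreement' | 'repair' | ...
        actor: str   # 'learner' or 'expert'

    def tutor_rule_on_repair(history):
        """If the expert has repaired the learner's work twice, suggest
        opening the theory hypertext: a purely pedagogical decision that
        never refers to any domain-specific vocabulary."""
        repairs = [e for e in history if e.kind == 'repair']
        if len(repairs) >= 2:
            return 'open-hypertext'
        return None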

4.2. How to implement the interaction between the learner and a computational agent?

Imagine two expert systems that collaborate closely. Each expert has its own rule base, but they share the same facts representing the problem state. Any action performed by one expert changes the set of facts and is thus noticed by the other expert. Thereby, they permanently have a common representation of the current problem. However, our challenge was to get the expert system to collaborate with a human user, not with another expert system. Nevertheless, we applied the same principle of `sharing' facts: instead of an internal representation based on predicates, the expert uses an external representation based on interface objects. If the learner changes the problem state, the expert's representation is updated. Conversely, if the expert changes the problem state, the learner can see the change.

Figure 5: Step-by-step collaboration between the learner and the expert

The main implication for the expert's rule base is that most rule conditions refer to the problem state displayed on the screen and most rule conclusions change this problem-state display. The relationship between the expert rules and screen objects has been made possible by the object-oriented nature of our inference engine (see Appendix III: "The Inference Engine" (page 28)).
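The `shared facts' principle can be sketched as follows (an illustrative Python rendering under our own naming assumptions, not the actual object-oriented inference engine): both partners refer to the same interface objects, so an expert rule's condition inspects the displayed state and its conclusion updates it.

    class ScreenObject:
        """An interface object: the external, shared representation."""
        def __init__(self, name, **attrs):
            self.name = name
            self.attrs = attrs

    class SharedWorkspace:
        """One problem state, read and written by both partners."""
        def __init__(self):
            self.objects = {}

        def place(self, obj):
            self.objects[obj.name] = obj   # the other partner sees this

    def expert_rule(ws):
        """Condition reads the displayed state; conclusion changes it."""
        wl = ws.objects.get('word-list')
        if wl is not None and not wl.attrs.get('checked'):
            wl.attrs['checked'] = True     # executed as a 'main command'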

The need for a shared workspace appeared during our experiments with a previous collaborative system, PEOPLE POWER (Dillenbourg & Self, 1992b). In this system, human-machine collaboration was based on a dialogue about the task (instead of on shared task-level actions). Within a shared workspace, if one partner wants to help the other, he can see the other's partial solution and provide concrete help by performing the next operation (Terveen et al., 1991).

The collaboration between the expert and the learner is performed step-by-step. A machine step is a sequence of rule firings that ends when an activated rule executes a `main command'. A learner step is a sequence of interface actions that ends with a `main command'. Consequently, a main command is a command that can be performed by both the learner and the expert. It serves as a basis for comparing the learner's actions with the expert's potential actions (see section 4.3. "Diagnosis or Agreement" on page 11). At each step, the expert may either `propose' an action or `observe' the learner's action. It is generally the tutor who decides whether the next step should be performed by the expert or by the learner. In our terminology, the tutor decides the `local interaction mode' (LIM; `local' meaning `a single step in the solution path').
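The step protocol can be summarized by a loop of roughly the following shape (a hypothetical sketch; all function names are ours):

    def solve_jointly(problem, tutor, expert, learner):
        """One iteration = one step, i.e. a sequence of rule firings
        (expert) or interface actions (learner) ending in a main
        command."""
        while not problem.solved():
            lim = tutor.decide_lim(problem)        # local interaction mode
            if lim == 'expert':
                step = expert.propose_step(problem)
            else:
                step = learner.perform_step(problem)
                expert.observe(step)               # basis for diagnosis
            problem.apply(step)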

The global balance of learner and expert steps along a problem-solving path determines the degree of collaboration between the learner and the expert. We refer to it as the global interaction mode (GIM). Its state is represented by an integer between 0 and 5. A global interaction mode equal to 0 corresponds to a demonstration done by the expert (the expert performing all the steps). A global interaction mode equal to 5 corresponds to a simple exercise that the learner performs alone. As implemented, the interaction mode is based on a quantitative criterion, i.e. the number of steps performed by the learner and the expert. This is of course a limitation of our system, because during problem solving some steps may be more critical than others (Dillenbourg, to appear); it represents an interesting issue for future research.
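Under this quantitative criterion, the local mode could be derived from the global one roughly as follows (our illustration, not the actual implementation):

    def decide_lim(gim, learner_steps, total_steps):
        """Pick who performs the next step so that the learner's share
        of steps approaches gim/5 (0 = pure demonstration by the
        expert, 5 = solo exercise)."""
        target = gim / 5
        return 'learner' if learner_steps < target * (total_steps + 1) else 'expert'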

Let us make clear that collaboration in the system remains flexible. The learner may always interrupt the expert in order to continue the task by himself. Conversely, when the learner is asked by the tutor to perform several successive steps, he is allowed to ask the expert to take over the task. The tutor may disagree with these learner initiatives and compel the learner to proceed with the same interaction mode. However, in case of repeated disagreement, the tutor will resign and ask the coach to select another tutor, better adapted to the learner's requests (see the notion of pedagogical drift in section 4.6. "Selecting teaching styles" on page 14).

The expert has no pedagogical intention. Of course, the design of the expert is biased by pedagogical criteria; this bias can be found, for instance, in the granularity of rules or in the explanatory power of the comments associated with each rule. Nevertheless, during interaction, the expert's only goal is to solve the problem. He tries to maintain the minimal level of mutual understanding necessary for the joint accomplishment of this task. The expert does not try to teach, or to decide what is good for the learner. He asks questions to get answers, not to check whether the learner knows the answers. Pedagogical decisions are the tutor's business.

4.3. Diagnosis or Agreement

The role of an expert in a learning environment is not only to guide or interact with the learner, but also to detect the learner's mistakes and possibly to understand the cause of these mistakes in terms of missing or erroneous knowledge. This diagnosis process is often referred to as `learner modelling'. Diagnosis in ETOILE is inspired by the model tracing paradigm (Anderson, 1984). However, given the more collaborative style of ETOILE, we should rather use the word `agreement' instead of `learner modelling'. We consider three types of diagnosis:

The expert has a small set of `repair rules'. The concept of a repair rule is close to the concept of a malrule, which has been used intensively in student modelling (Dillenbourg & Self, 1992a). The expert's normal rule base implements a preferred, optimal solution path for a task. However, interaction with the learner may move the expert away from this optimal path towards other possible solutions. Outside the solution path, we encounter erroneous problem states that are not covered by the expert's normal rules. Hence, we added a few `repair' rules enabling the expert to recover from these situations. Determining these situations a priori implies anticipating classical learner mistakes. For instance, in MEMOLAB, the expert can repair a list of words which is unevenly distributed (all the long words at the beginning and the short ones at the end). This diagnosis technique raised a set of technical problems that led us to modify some aspects of the inference engine (see appendix section III.3. "Adapting OOPS to ETOILE" (page 31)).
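The MEMOLAB repair just mentioned could be sketched as follows (hypothetical Python; the real rule is written for the OOPS inference engine and our detection threshold is arbitrary):

    def unevenly_distributed(words):
        """Crude detector: are the long words concentrated at the
        beginning of the list?"""
        if len(words) < 4:
            return False
        half = len(words) // 2
        avg = lambda ws: sum(len(w) for w in ws) / len(ws)
        return avg(words[:half]) > avg(words[half:]) + 2

    def repair_word_list(words):
        """Repair rule: redistribute the words by alternating short
        and long ones, then inform the learner and the tutor."""
        if not unevenly_distributed(words):
            return words
        ordered = sorted(words, key=len)
        repaired = []
        while ordered:
            repaired.append(ordered.pop(0))      # shortest remaining
            if ordered:
                repaired.append(ordered.pop())   # longest remaining
        return repaired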

4.4. The learner-expert-tutor triangle

The tutor monitors the expert-learner interactions: he receives messages from the learner or from the expert about what is going on. On the basis of this information, the tutor makes a pedagogical decision. For instance, the expert may send a comment that the learner is wrong, and the tutor will decide whether, in case of error, the learner should be interrupted or not. This decision depends on the tutor's teaching style: one tutor might decide to intervene while another would not care. We illustrate this triangular relationship by two short scenarios. Messages from and to the learner are materialized as popup windows; messages between the expert and the tutor are not visible on the screen.

Figure 6: The tutor solves an Nth conflict between the learner and the expert.

In the first scenario (Figure 6), the learner performs some action not understood by the expert (`unknown' diagnosis). The expert then asks the learner whether he may `undo' what the learner did. He also sends a message to the tutor to inform him that he did not understand the learner (step 1). The learner has the possibility to refuse to let the expert undo his action. He answers "no" to the expert, this refusal being perceived by the tutor (step 2). The tutor then has to solve a conflict between the learner and the expert. In Figure 6, the tutor enforces the expert's initiative: he tells the learner "listen to the expert" and asks the expert to continue (step 3). In order to solve this expert-learner conflict, the tutor does not try to determine who is right. He could not take such a decision since he has no domain knowledge; by definition, the expert is right. However, the tutor may decide for pedagogical reasons that, despite the learner's mistake, it may be better to let him explore his own ideas. The tutor illustrated in Figure 6 (called Vygotsky) actually hands control to the learner at the first conflict, but gives preference to the expert at the second one (for the same problem). The dialogue illustrated in Figure 6 corresponds to the second time the learner refuses to listen to the expert.
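Vygotsky's arbitration policy in this scenario amounts to a rule of the following form (our paraphrase in code, not the actual rule syntax):

    def arbitrate_conflict(conflicts_on_this_problem):
        """Vygotsky's arbitration: hand control to the learner at the
        first conflict, back the expert from the second one on."""
        if conflicts_on_this_problem <= 1:
            return 'yield-to-learner'     # let him explore his own ideas
        return 'enforce-expert'           # "listen to the expert"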

Figure 7 illustrates another mini-scenario. The expert fires a repair rule and informs the learner about the reasons why he changed something on the screen (step 1). The expert sends another message to the tutor to say that he had to repair something that was wrong. After the learner has read the expert's message (step 2) and has clicked on `ok', the tutor processes the expert's message. In the dialogue illustrated by Figure 7, the tutor's decision is to provide the learner with some theory related to the learner's mistake. The expert's repair rules, which correspond to classical mistakes, are connected to specific nodes of the hypertext. Another tutor could decide not to open this hypertext and to let the learner try to discover these facts by himself.

Figure 7: The tutor opens the hypertext because the expert told him that he had to `repair' what the learner did.

This last scenario illustrates the debate about the separation of pedagogical and domain knowledge. The expert is an egocentric agent: he `repairs' the experiment because he wants to build a good experiment, not because he wants to teach the learner. The pedagogical decision is taken by the tutor. However, the design of the expert is not completely neutral from a pedagogical viewpoint, since the anticipation of the learner's typical major mistakes is encoded in the rules.

4.5. Teaching styles

The tutor has to make several pedagogical decisions, some of which have already been described:

Each of these decisions implicitly defines an axis along which tutoring behaviour can vary. Originally, in our system, tutor behaviour was summarized by a set of parameters corresponding more or less to these axes. The value of each parameter was chosen by a set of rules. When we wrote these rules, we learned that the parameters were actually not independent of each other. It is, for instance, difficult to marry an inductive approach with a high level of interruptiveness. The concept of teaching style stems precisely from this consistency among a set of pedagogical decisions.

With all those teaching parameters it is theoretically possible to define a very large number of teaching styles. However, we reduced this large space to a few styles that differ significantly from each other. There are two reasons for working with a few contrasted teaching styles. The first is that if the learner fails to reach a goal with a tutor applying some style, there is a low probability that he would succeed with an almost identical style; the system does better to choose a radically different approach. The second reason for having clear differences between teaching styles is that the learner should perceive these differences (Elsom-Cook, 1991), and hence be able to choose his `favourite' tutor later on.

The five teaching styles we have defined are labelled Skinner, Bloom, Vygotsky, Piaget and Papert, respectively. These names do not imply that we succeeded in translating the theories of these five key figures in the history of education and psychology. We rather use these names as flags: we defined a tutor that roughly teaches `à la Vygotsky', another one `à la Papert', and so forth. Each teaching style corresponds to a tutor and is implemented as an independent rule base. Each rule base is itself divided into three rule sets: (1) selection of activities and problems, (2) monitoring of joint problem solving and (3) ending a session. There is of course some overlap between the knowledge of the tutors, but it is easier to manage them as separate rule bases.
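Schematically, the tutors could thus be organized as follows (an illustrative skeleton only; the rule contents are elided):

    # One independent rule base per teaching style, each divided
    # into the same three rule sets.
    TUTORS = {
        'Skinner':  {'selection': [...], 'monitoring': [...], 'ending': [...]},
        'Bloom':    {'selection': [...], 'monitoring': [...], 'ending': [...]},
        'Vygotsky': {'selection': [...], 'monitoring': [...], 'ending': [...]},
        'Piaget':   {'selection': [...], 'monitoring': [...], 'ending': [...]},
        'Papert':   {'selection': [...], 'monitoring': [...], 'ending': [...]},
    }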

The only tutor that has so far gone beyond an early implementation and through some experimentation is Vygotsky. Before investing in the full implementation of the other tutors, we need to solve the problems related to the expert-learner interaction (see section 7. "Experiments" on page 18). From the outset, Vygotsky selects rather difficult problems. The expert initially performs most of the problem solving steps (GIM = 2). Then Vygotsky progressively increases the global interaction mode, i.e. the expert fades out and the learner plays an increasingly important role in the solution process, until he solves the problem alone (GIM = 5). In other words, the Vygotsky tutor is characterized by a particular GIM curve (GIM × time), while Skinner would be depicted by a `problem difficulty' curve.
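Vygotsky's fading strategy can be caricatured in a few lines (a sketch under our assumption that the GIM is updated after each problem):

    def next_gim(gim, problem_succeeded):
        """Vygotsky's fading: start around GIM = 2 and raise it after
        each success until the learner solves problems alone (GIM = 5)."""
        return min(gim + 1, 5) if problem_succeeded else gim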

We must acknowledge that the design of teaching styles is bounded by the general architecture of ETOILE. The various components of ETOILE already carry some pedagogical bias: the coach's goal selection process is inspired by mastery learning, the expert-learner interaction is inspired by apprenticeship, and the environment facilities are inspired by the microworld philosophy. If we had wanted to push the logic of each teaching style to its conclusion, ETOILE would have become much more complex. For instance, the notion of diagnosis is not the same (a) in behaviourist feedback based on the learner's performance, (b) in a constructivist approach based on the learner's cognitive structure, and (c) in the apprenticeship mode, where the expert attempts to integrate the learner's actions into his own framework.

4.6. Selecting teaching styles

Since ETOILE includes several tutors, it needs a hierarchically superior agent which selects the tutors. His name is `the coach'. The coach also selects the goals in the pedagogical curriculum. The goal selection process and tutor selection process have been implemented in the simplest and most transparent way. The coach selects the first goal whose prerequisite goals have all been mastered (remember that the curriculum is a network of goals with prerequisite links). When the learner succeeds at that goal, its status is set to `mastered' and the goal selection process is repeated.
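The goal selection process is simple enough to be rendered in a few lines (our sketch; the goal names in the usage example are hypothetical):

    def select_goal(curriculum, mastered):
        """Pick the first goal whose prerequisites are all mastered."""
        for goal, prerequisites in curriculum:
            if goal not in mastered and all(p in mastered for p in prerequisites):
                return goal
        return None   # every goal mastered: curriculum completed

    # e.g. select_goal([('g1', []), ('g2', ['g1'])], {'g1'}) -> 'g2'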

After a goal has been selected, the coach selects a tutor. The mission of a tutor is always relative to a single goal. Tutor selection is mainly based on the learner's preferences and his performance history. If he has already failed the current goal with tutor T1, the coach will select a tutor T2 that provides more guidance than T1. Conversely, if the learner succeeded at the previous goal with tutor T1, the coach will select a tutor that provides less guidance. Our five tutors are sorted by decreasing level of directiveness: Skinner works step by step, Bloom makes larger steps but with close control of mastery, Vygotsky is based on participation, Piaget intervenes only to point out some problems and Papert does not interrupt the learner.

This general principle is, however, overridden by more specific rules, for instance one recommending that the coach avoid activating a tutor that has repeatedly been ineffective with a particular learner. The learner may also select a tutor himself. These preferences are recorded and taken into account in later selections. The learner (or a human teacher) can also remove a tutor from ETOILE's set of tutors for the rest of the session.
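Putting the last two paragraphs together, tutor selection could be sketched as follows (illustrative code; the real rules also weigh the learner's recorded preferences):

    TUTOR_ORDER = ['Skinner', 'Bloom', 'Vygotsky', 'Piaget', 'Papert']
    # sorted by decreasing directiveness

    def select_tutor(current, succeeded, avoid=frozenset()):
        """Move towards less guidance after a success, towards more
        after a failure, skipping tutors known to be ineffective
        with this learner."""
        step = 1 if succeeded else -1
        j = max(0, min(TUTOR_ORDER.index(current) + step, len(TUTOR_ORDER) - 1))
        while TUTOR_ORDER[j] in avoid and 0 <= j + step < len(TUTOR_ORDER):
            j += step
        return TUTOR_ORDER[j]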

The tutor has rules to determine whether or not the learner has mastered a goal, and he can then close the session concerning that goal. In order to give the system more flexibility, tutors can also resign before the end of their `contract'. If a tutor like Piaget observes that the learner is frequently asking for help or for theory, this means that his teaching style does not fit the learner's needs. Conversely, if a learner taught by the tutor Skinner always asks to continue the problem on his own, he would certainly be more efficient with another tutor. We refer to this process as `pedagogical drift', because the actual interaction with the learner may lead the tutor to `drift away' from the behaviour space for which he has been designed and within which he is supposed to be efficient. In case of drift, the tutor returns control to the coach and indicates the type of drift that he observed. The coach will then select another tutor.

This monitoring of pedagogical drift is implemented in a very simple way: each learner command that has an effect on the locus of control of the learner's activities creates, as a side-effect, an instance of the class `initiative'. This class possesses two subclasses, `want-more-guidance-initiative' and `want-less-guidance-initiative'. Some commands, like seeking expert help or asking to read the hypertext, generate instances of `want-more-guidance-initiative'. Other learner actions, like `I want to continue', `I want to select the problem' or `I refuse to undo', create instances of `want-less-guidance-initiative'. All significant commands in ETOILE are characterized with respect to these two subclasses; therefore, all domain-specific commands relevant at this level must create an instance of one of these two subclasses. The tutor compares the number of instances in each subclass and resigns if there is a strong imbalance between the two sets of used commands.
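A sketch of this mechanism (our Python rendering of the class-based implementation described above; the resignation threshold is an arbitrary assumption):

    class Initiative: pass
    class WantMoreGuidance(Initiative): pass   # e.g. asking expert help
    class WantLessGuidance(Initiative): pass   # e.g. 'I want to continue'

    def check_drift(initiatives, threshold=3):
        """Resign when the two kinds of initiatives are strongly
        unbalanced; the coach then selects another tutor."""
        more = sum(isinstance(i, WantMoreGuidance) for i in initiatives)
        less = sum(isinstance(i, WantLessGuidance) for i in initiatives)
        if more - less >= threshold:
            return 'drift-towards-more-guidance'
        if less - more >= threshold:
            return 'drift-towards-less-guidance'
        return None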
