DIAGNOSIS

Summary of Chapter 17 of ETIENNE WENGER'S book

ARTIFICIAL INTELLIGENCE AND TUTORING SYSTEMS Computational and Cognitive Approaches to the Communication of Knowledge.

INTRODUCTION
Organization of the Chapter
17.1 BEHAVIORAL DIAGNOSIS
17.1.1. Noninferential evaluation
17.1.2. Inference of unobservable behaviour
17.2. EPISTEMIC DIAGNOSIS
17.2.1. Direct assignment of credit and blame
17.2.2. Structural consistency
17.2.3. Longitudinal consistency
17.3. NOISE: SOURCES AND SOLUTIONS
17.3.1. Noise in the data: variations in behavior over time
17.3.2. NOISE IN THE DIAGNOSTIC PROCESS: AMBIGUITIES
17.3.3. NOISE IN THE MODEL OF COMMUNICABLE KNOWLEDGE
17.4. Sources of evidence: diagnostic data
17.4.1. Diagnostic observation
17.4.2. OVERT DIAGNOSTIC ACTIONS
17.5 ASPECTS OF DIAGNOSTIC EXPERTISE
CONCLUSION
REFERENCES

Introduction

When the book was written, in 1987, the conceptual coherence and analytic structure that Etienne Wenger brought to ITS research were quite new, and they provided a rather complete historical framework for analyzing and comparing intelligent tutoring systems. Wenger explains the primary function of a tutoring system as a "vehicle of communication". He claims that knowledge can only be communicated, and therefore learned, through the mediation of "warrants" which connect with the understanding of an individual. Different kinds of warrants (causal, functional, teleological) are manifest in various ways, ranging from experiences to verbal explanations. Wenger discusses mental models and claims that they emerged as a response to system failures to handle certain kinds of communication tasks.

Diagnosis, as a pedagogical activity aimed at collecting and inferring information about the student or his actions, constitutes one of the six main functions of intelligent tutors. The others are: correction of errors; help for consecutive action in case the error is due to incomplete procedures; implementation of a global strategy for new plans of action; prediction of future action; and evaluation of student work.

The following summary presents Wenger's approach to diagnostic systems and the computer programs that illustrate it.

Organization of the Chapter

Some fundamental distinctions and some terminology

Etymologically the word DIAGNOSIS means "to know thoroughly". In this text, however, this definition shall be understood as "to know sufficiently" in order to allow for efficient feedback.

Diagnostic tasks can be viewed from various perspectives:

1. INFERENCES: The inferential view highlights the fundamental assumption that internal states produce behaviour in a deterministic fashion. Therefore, it makes sense to establish reasoning chains, knowledge structures and various internal states to account for observed behaviour.

2. INTERPRETATION: Observations and inferences are placed in context, and viewpoints and goals are examined. Explanations are sought through plausible rationalizations and problem-solving stories.

3. CLASSIFICATION: Relevant distinctions are established in order to characterize or evaluate observations and inferences according to expectations.

Diagnostic activities distinguish between three levels at which information can be relevant for educational purposes:

1. THE BEHAVIORAL LEVEL: Behavioral diagnosis deals with behavior and the product of behavior, without trying to perceive the knowledge state involved in its generation. It is useful to distinguish between
a. observable behaviour, which consists of external actions (e.g. typing an answer), and
b. unobservable behavior, which is the use of knowledge in a reasoning chain and is viewed as distinct from knowledge itself.

2. THE EPISTEMIC LEVEL: Epistemic diagnosis deals with the student's knowledge state and includes aspects of the student's model of the domain (general model) and his strategic knowledge (inference procedure).

3. THE INDIVIDUAL LEVEL: Individual diagnosis is concerned with the individual whose behavior and knowledge are of interest to the first two levels. The student is not only considered a recipient of communicable knowledge, but also an agent with an identity of his own, engaged in an active learning process. Wenger recommends this approach as a topic for further research.

17.1 BEHAVIORAL DIAGNOSIS:

17.1.1. Noninferential evaluation:
At the behavioral level, a number of systems attempt to give the student direct informed feedback, without inferences of a modeling nature, though they perform other types of inference of a diagnostic nature. Classification and evaluation capabilities are derived from a representation of knowledge about correctness in their domain. SOPHIE-I, for instance, has access to a circuit simulator and some knowledge about optimal measurements to evaluate the student's input. EXCHECK applies a theorem prover, and ACE refers to a set of correct reasoning steps.
Some systems perform program analysis in which errors are characterized. Their method relies on a set of program descriptions: annotated plans in MYCROFT, parse trees in MENO-II, and definitions of algorithms in TALUS.
SOPHIE and GUIDON, for example, always evaluate hypotheses with respect to the data known by the student and require a rather complex interpretation process by which student responses are matched against the standards. WEST infers the underlying strategic viewpoint in order to perform the proper evaluation of moves.
TALUS, in order to show that two versions of an algorithm are equivalent, translates programs into a description language and uses a theorem prover and a heuristic counterexample generator.
SOPHIE-I uses sophisticated heuristics for simulation experiments on its model of the circuit. Interpretation includes conversion into a testable form.

The above systems are distinguished by the complexity and explicitness of the knowledge about correctness they apply in their analyses, and by the much greater flexibility afforded by the dynamic application of this knowledge. In all these examples, evaluation is performed by analysis, independently of the knowledge and processes by which people generate correct and incorrect solutions.

17.1.2. Inference of unobservable behaviour:

The inference of unobservable behavior differs from the above analyses in the attention it pays to problem-solving processes, thereby constituting a different use of AI which requires modeling capabilities and a theory of errors. It calls for complex behavioral interpretations and for more data points for the subsequent task of inferring a knowledge state. The emphasis can be on the product or on the process.

a. Reconstructive interpretation of dynamically constructed solutions
The dynamic construction of a goal structure can determine the correctness and the shape of the final output in problem-solving domains. Behavioral interpretation and evaluation require the reconstruction of problem-solving rationales, an idealized form of "solution path". In this case, the diagnostic approach applies a general strategy by Brown, Collins and Harris (1978), who claim that the key mechanism for this type of interpretative understanding is reconstructive problem-solving in terms of goals and plans. Their observation of experts and students reveals that the analysis of electronic circuits involves reconstructing both the designer's intentions and the realization of these intentions as hierarchies of plans.

For the diagnostic process, these systems adopt a bidirectional search in order to connect student input to some goals with a coherent planning structure. PROUST and FLOW, for example, include incorrect methods for interpreting faulty solutions. Individual systems show variations of the alternation between model-driven confirmation and data-driven recognition of plans and goals. Specific requirements that interfere with the simple construction of a hierarchical structure are handled in the form of global and external constraints in SPADE, multiple interacting goals in PROUST, and collapsed chains of inferences in ODYSSEUS. Suboptimal solutions are accounted for in MACSYMA ADVISOR (completion of parse) and IMAGE (ranked expectations). Bidirectional search affords diagnostic leverage, generating expectations to be confirmed in the data. Some systems are purely reconstructive; others build the goal structure as they follow the student's problem solving. The interpretative setup provides additional constraints: IMAGE uses a first level of interpretations provided by NEOMYCIN's functional descriptors and reconstructs the student's plan by means of successive observations. ACTP's tutors have the reconstruction of the student's goal structures built into the step-by-step interpretation of behavior.

b. Inferred behavior as data for epistemic diagnosis: solution process
While in procedural domains the form of the final solution is less dependent on the process, evaluation and behavioral reconstruction are prerequisites to inferring the student's knowledge state.
If the system is interested in procedure but not understanding (of the procedure), it needs to reconstruct the sequence of primitive operations that leads to the student's answer. This process relies on different sources of constraints:
In DEBUGGY, for example, constraints are provided by the correct procedure, assumptions of minimal errors, and heuristics about bug compounds. In the data-driven reconstruction performed by ACM's path finder, the notion of minimal errors requires the use of heuristics about the plausibility of solution paths.
In nonprocedural domains, diagnostic interpretation must use a mixture of knowledge about the problem-solving process, whose elements could potentially be constituents of a knowledge state, and of knowledge about correctness in the domain. PROUST shows this diagnostic nature in its heuristics about likely goal interactions, which are not likely to be part of a student's knowledge state.
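The reconstruction idea can be illustrated with a small sketch. It is not DEBUGGY's actual algorithm: the two buggy subtraction procedures, the names `CANDIDATES` and `diagnose`, and the scoring by number of unexplained answers are all assumptions chosen for illustration. Each candidate procedure is run on the problems the student solved, and the candidates are ranked by how few answers they fail to explain (a crude stand-in for "minimal errors").

```python
from typing import Callable, Dict, List, Tuple

Procedure = Callable[[int, int], int]


def _columns(top: int, bottom: int) -> List[Tuple[int, int]]:
    """Pair the digits of both numbers column by column (assumes top >= bottom >= 0)."""
    width = len(str(top))
    return [(int(t), int(b)) for t, b in zip(str(top), str(bottom).zfill(width))]


def correct(top: int, bottom: int) -> int:
    """The correct procedure: ordinary subtraction."""
    return top - bottom


def smaller_from_larger(top: int, bottom: int) -> int:
    """Hypothetical bug: in each column, subtract the smaller digit from the larger."""
    return int("".join(str(abs(t - b)) for t, b in _columns(top, bottom)))


def borrow_no_decrement(top: int, bottom: int) -> int:
    """Hypothetical bug: borrow ten when needed, but never decrement the next column."""
    digits = [str(t - b if t >= b else t + 10 - b) for t, b in _columns(top, bottom)]
    return int("".join(digits))


CANDIDATES: Dict[str, Procedure] = {
    "correct": correct,
    "smaller_from_larger": smaller_from_larger,
    "borrow_no_decrement": borrow_no_decrement,
}


def diagnose(observations: List[Tuple[int, int, int]]) -> List[Tuple[str, int]]:
    """Rank candidate procedures by the number of student answers they fail to reproduce."""
    scores = []
    for name, procedure in CANDIDATES.items():
        misses = sum(1 for top, bottom, answer in observations
                     if procedure(top, bottom) != answer)
        scores.append((name, misses))
    return sorted(scores, key=lambda pair: pair[1])


# Each observation is (top, bottom, student's answer).
observations = [(52, 17, 45), (83, 25, 62), (64, 31, 33)]
ranking = diagnose(observations)
```

With these three observations the smaller-from-larger hypothesis explains every answer, while the other two candidates leave at least one answer unexplained. Note that the first problem alone does not separate the two buggy procedures, which is the kind of ambiguity the heuristics mentioned above have to resolve.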

17.2. EPISTEMIC DIAGNOSIS
Three phases in the inference of a knowledge state are distinguished:

17.2.1. The first task, called direct assignment of credit and blame, is to determine which knowledge elements have been directly involved in the available account of behaviour. This means discovering which knowledge (correct or incorrect) has been used to produce behavior and which relevant knowledge has been overlooked. WEST and GUIDON, for example, provide differential modeling, comparing the student's knowledge to the knowledge the expert would have used.

Other systems require some form of modeling language.
The idea behind model tracing is to create a close correspondence between units of the internal model and single steps of observable behavior. Knowledge compilation is really the core principle of the model-tracing paradigm. In fact, compiled knowledge is the only form of expertise represented in ACTP's tutors as implemented. Frame-based CAI, on the other hand, uses lists of expected responses, whereas model tracing matches rules and mal-rules against the student's steps. Being very close to observable behavior, compiled knowledge allows direct comparisons of the student's input with single steps of the system's internal model. In this sense, model tracing is at the border between behavioral and epistemic diagnosis. The difference is that expected responses, which trigger branching decisions and remediation, are only behavioral diagnostic devices, whereas the rules used in model tracing stand for beliefs supposedly held by the student and are viewed as constituents of his knowledge state. The cognitive aspect is brought forward by the interpretation of local behavior and the formation of a global student model by means of an extended overlay. The extent to which this global student model is exploited for pedagogical decisions and for further diagnosis is of practical importance in the operation of a system.
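A minimal sketch of the matching step may make this concrete. It is an illustration only, not the mechanism of ACTP's tutors: the algebra domain, the state encoding, and the names `RULES`, `MAL_RULES` and `trace_step` are assumptions. A state `(a, b, c)` stands for the equation a*x + b = c, and each student step is compared with what the correct rule and the mal-rule would each produce.

```python
from typing import Callable, Dict, Optional, Tuple

State = Tuple[int, int, int]  # (a, b, c) encodes the equation a*x + b = c
Rule = Callable[[State], State]

# Correct rules: each maps a state to the state an expert step would produce.
RULES: Dict[str, Rule] = {
    "subtract_constant": lambda s: (s[0], 0, s[2] - s[1]),
}

# Mal-rules: plausible but incorrect steps (hypothetical example).
MAL_RULES: Dict[str, Rule] = {
    "forget_sign_change": lambda s: (s[0], 0, s[2] + s[1]),
}


def trace_step(state: State, student_next: State) -> Tuple[str, Optional[str]]:
    """Classify one student step by the rule or mal-rule that reproduces it."""
    for name, rule in RULES.items():
        if rule(state) == student_next:
            return ("correct", name)
    for name, rule in MAL_RULES.items():
        if rule(state) == student_next:
            return ("buggy", name)
    return ("unrecognized", None)


# Student starts from 2x + 3 = 11 and writes 2x = 14.
verdict = trace_step((2, 3, 11), (2, 0, 14))
```

Because every step is matched against a single rule, the result can be recorded directly on an overlay of rules and mal-rules; an "unrecognized" step is where such a scheme runs out of explanations.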

Reconstruction of behavior can form the basis for epistemic inferences.
In DEBUGGY and LMS, behavioral operations are units of the procedure of searching the ramifications of an interpretation, for the formation of a partial knowledge state. Recent versions of PROUST use planning methods that can be monitored and recorded by extended overlay on plans from the library.
Diagnostic complexity is usually dealt with during the initial reconstruction process, unless behavioral and epistemic modeling languages are substantially different, as in the MACSYMA ADVISOR or IMAGE/ODYSSEUS. In the case of ACM, machine-learning techniques are used to transform diagnostic reconstructions of behavior into a diagnostic description of knowledge.
Reconstruction can capture the internal use of knowledge by mental processes that may not be easily accessible even with the best interfaces. High levels of articulation are observed: overlays on PROUST's plans, which capture the internal use of knowledge, are potentially quite different from the overlays on rules used in ACTP's tutors, although recording schemes for the mastery of individual elements may be fairly similar.

An issue is a curriculum element whose participation in decisions can be recognized and discussed. Issues are classification categories, viewed as general diagnostic devices. The concept of issues is used in WEST, BIP and MHO. Issues are not directly tied to behavior: any number of them can be independently recognized as having participated or not in any decision. They make it possible to diagnose articulate aspects of knowledge without the need to model the exact process of their participation in generating behavior. This is useful for capturing emergent properties of knowledge states such as the existence of a viewpoint.
The design of issues is much less constrained than model tracing and epistemic diagnosis. Issues do not produce a runnable student model. Their diagnostic complexity is hidden in the design of classification schemes, called issue recognizers. The design of general classification schemes seems useful for nontrivial pedagogical issues, as well as for similar systems for medical diagnosis.
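As a rough illustration of issue recognition combined with differential modeling, the sketch below compares the student's move with the moves an expert would have rated higher. The move table, the scores, the issue names and the function `recognize_issues` are hypothetical and only loosely inspired by the description of WEST given here.

```python
from typing import Dict, Set, Tuple

# Each legal move has an expert score and the set of issues it exercises.
Move = Tuple[int, Set[str]]


def recognize_issues(student_move: str, moves: Dict[str, Move]) -> Dict[str, Set[str]]:
    """Return issues the student used, and issues appearing only in better moves."""
    student_score, used = moves[student_move]
    missed: Set[str] = set()
    for score, issues in moves.values():
        if score > student_score:
            # Issues present in a better move but absent from the student's move.
            missed |= issues - used
    return {"used": set(used), "missed": missed}


moves = {
    "m1": (10, {"parentheses", "multiplication"}),
    "m2": (6, {"addition"}),
    "m3": (8, {"subtraction", "multiplication"}),
}
report = recognize_issues("m2", moves)
```

Nothing here models how the student arrived at the move; the recognizers only classify, which is why issues do not yield a runnable student model.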

Hybrid approaches can advantageously combine model tracing, reconstruction and issues.
IMAGE, for example, adopts a composite scheme, tracing the student's actions with HERACLES/NEOMYCIN while reconstructing intermediate steps to infer his plan. MYCIN's rules are part of an operational model of expertise which, for diagnostic purposes, is assumed to reflect that of the student and is used to trace his behavior. GUIDON resorts to a complex classification scheme that uses multiple sources of evidence to estimate the likelihood that each relevant rule was actually applied by the student to arrive at a hypothesis.

Two other tasks have received less attention:

17.2.2. Structural consistency is the broadcast of the repercussions of direct recognition to the rest of the knowledge state. Asserting (arbitrary) correlations between the likelihoods of various pieces of knowledge is handled by "consistency rules", as in UMF. In WUSOR III, they follow the genetic organization of the set of rules. These extensionally defined networks can be designed in the absence of epistemological structure and of a model of conceptual interactions for compiled knowledge.

Generative reduction can be viewed as a form of dynamic decompilation of knowledge. Given the computational complexity of pure simulation-based diagnosis, as in REPAIR/STEP, on-line generative reduction may require the coexistence of multiple models of expertise, as in WEST, with different levels of articulation. Diagnosis in this case involves a combination of model tracing, reconstruction and issue recognition.
MENO-II performs a simple version of generative reduction with an extensionally defined network that associates diagnosed surface bugs with possible underlying misconceptions. A compiled rule in MYCIN contains some elements of control, some commonsense knowledge and some allusions to domain-specific processes, for the system to infer its mastery in an overlay. In MACSYMA ADVISOR, the reconstruction of the user's derivation tree in terms of planning methods provides a structure for the inference of understanding. Dataflow constraints reveal the user's beliefs about the preconditions or postconditions of primitive operations. In IMAGE and ODYSSEUS, the reconstruction of the student's strategy provides a context for attributing his decisions to specific constituents of expertise.

Clancey claims that information about the inference procedure can serve as a pivot for getting to the general model and vice versa.

17.2.3. Longitudinal consistency.
The update of the existing student model takes place in the context of a continuous series of observations from which new diagnostic information has to be integrated into the existing student model. The diagnostic process must fulfil the contradictory requirements of being at once sensitive enough to adapt the tutor's attitude without delay, and stable enough not to be easily disturbed by local variations in performance. Scalar attributes, associated with individual elements of the knowledge state in the context of an overlay, are used. As knowledge manifests itself in behavior, these weights are updated by statistical or pseudostatistical computations. As Kimball (1982) notes, these rely on an assumption of independence between various pieces of knowledge.
Such diagnostic updating methods of "reasoning about learning" have been considered by different authors (Doyle 1979, De Kleer 1986, Halpern 1986). In fact, the simple updating methods currently adopted tend to treat the possible effects of epistemic interdependence as noise.
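The tension between sensitivity and stability can be sketched with a single scalar weight per knowledge element. The exponential update below is an assumption chosen for illustration, not the formula of any system named above; `update_mastery`, `replay` and the `rate` parameter are hypothetical names. Each element is updated on its own, which is exactly the independence assumption mentioned in the text.

```python
from typing import Iterable


def update_mastery(weight: float, success: bool, rate: float = 0.2) -> float:
    """Move the weight a fraction `rate` of the way toward 1 (success) or 0 (failure)."""
    target = 1.0 if success else 0.0
    return weight + rate * (target - weight)


def replay(weight: float, outcomes: Iterable[bool], rate: float = 0.2) -> float:
    """Apply a series of observations to one overlay element and return the final weight."""
    for success in outcomes:
        weight = update_mastery(weight, success, rate)
    return weight


# Two successes followed by a single slip, seen by a stable and a sensitive updater.
stable = replay(0.5, [True, True, False], rate=0.2)
sensitive = replay(0.5, [True, True, False], rate=0.8)
```

A small `rate` keeps the estimate from collapsing after one slip, at the price of reacting slowly to genuine learning or forgetting; a large `rate` does the opposite.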

17.3. NOISE: SOURCES AND SOLUTIONS:

The issue of noise must be addressed by any realistic system performing pedagogical diagnosis. The following sources of noise have been identified:
- student's behavior which is not consistent over time
- inherent ambiguities in the diagnostic process
- the assumptions of the model of communicable knowledge may not hold in certain situations, in which case noise itself can become a source of information.

17.3.1. Noise in the data: variations in behavior over time
There is no necessary correspondence between the level of the source of noise and the level of the solution. At the behavioral level, noise is generated by local inconsistencies. It originates from a mismatch between behavior and knowledge. Sources of behavioral noise correspond to individual parameters. At the epistemic level, noise originates from modifications of the student's knowledge.

Solutions of a "behavioral" character: scalar weights.
These are statistical or pseudostatistical methods that attempt to integrate new information as cleverly as possible, without making use of a model of the student's knowledge. WUSOR-II maintains a history of usage of individual pieces of knowledge, with evaluators gauging the reliability of the information by the stability of data over time. The MHO tutor uses heuristics for issues on a scale of mastery. ACTP's tutors accumulate reinforcement in the form of an absolute numerical weight.

Solutions of an "epistemic" character: genetic organization.
The reliability of incoming data is assessed with the help of information derived from the contents of the knowledge state. The assumption that the acquisition of new pieces of knowledge follows natural links connecting them to existing knowledge allows for a prediction of the locus of learning by the frontier of the student's knowledge, as defined on the network of "genetic" links. Incoming data that indicate variations of the current frontier are suspect and must be considered with caution. In the student model, confidence in the mastery of each piece of knowledge can be assessed in terms of the number and nature of genetic connections with other parts of the curriculum that have been made explicit by instruction.
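One possible reading of the frontier idea is sketched below. The graph encoding, the rule that an element is on the frontier when all of its genetic predecessors are mastered, and the names `frontier` and `is_suspect` are assumptions made for illustration, not a description of WUSOR.

```python
from typing import Dict, List, Set

# Hypothetical genetic links: each element lists the elements it is assumed to grow out of.
GENETIC_LINKS: Dict[str, List[str]] = {
    "R1": [],
    "R2": ["R1"],
    "R3": ["R2"],
    "R4": ["R2"],
    "R5": ["R3", "R4"],
}


def frontier(links: Dict[str, List[str]], mastered: Set[str]) -> Set[str]:
    """Unmastered elements whose genetic predecessors are all mastered."""
    return {
        element
        for element, predecessors in links.items()
        if element not in mastered and all(p in mastered for p in predecessors)
    }


def is_suspect(element: str, links: Dict[str, List[str]], mastered: Set[str]) -> bool:
    """Evidence for an element beyond the frontier is treated with caution."""
    return element not in mastered and element not in frontier(links, mastered)


mastered = {"R1", "R2"}
current_frontier = frontier(GENETIC_LINKS, mastered)
```

Under these assumptions, apparent use of R5 by this student would be flagged as suspect, while evidence for R3 or R4 would be accepted as plausible learning at the frontier.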

Solutions of an "individual" character: modeling inconsistencies
Behavior variations are called noise when they are caused by phenomena that are not being modeled. To what extent accurate individual diagnosis could turn the treatment of noise into actual modeling is a thought for the future. The transition from DEBUGGY to REPAIR theory is an interesting example. In DEBUGGY, with its model-driven treatment of noise, bug migration is a form of noise that can be dealt with only by statistical methods. In REPAIR theory, bug migration is explained by the fact that repairs are applied to the problem-solving process and not to knowledge itself. By combining all possible repairs with a given impasse, this theory can even predict classes of bugs within which migration is likely to be observed. These heuristics are derived from an implicit model of perceptual and performance processes.
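The combination of repairs with an impasse can be sketched as a simple enumeration. The impasse and repair names and the functions `predicted_bugs` and `same_migration_class` are hypothetical labels, not the vocabulary of REPAIR theory itself.

```python
from typing import Dict, List


def predicted_bugs(impasses: List[str], repairs: List[str]) -> Dict[str, List[str]]:
    """For each impasse, list the bugs obtained by applying every available repair to it."""
    return {
        impasse: [f"{impasse}+{repair}" for repair in repairs]
        for impasse in impasses
    }


def same_migration_class(bug_a: str, bug_b: str, classes: Dict[str, List[str]]) -> bool:
    """Two bugs may migrate into each other if they arise from the same impasse."""
    return any(bug_a in bugs and bug_b in bugs for bugs in classes.values())


classes = predicted_bugs(
    ["borrow_from_zero", "top_digit_smaller"],
    ["skip_step", "refocus_left", "quit"],
)
```

In this reading, a student who alternates between `borrow_from_zero+skip_step` and `borrow_from_zero+quit` is not noisy: the underlying impasse is stable and only the repair varies.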

17.3.2. NOISE IN THE DIAGNOSTIC PROCESS: AMBIGUITIES

The diagnostic process is inherently ambiguous because of the restricted channel of any communicative situation. Choices must be made between competing explanations in the absence of discriminating data. Heuristic selection of a hypothesis allows immediate action, and the reinforcement of accumulative weights tends to focus on perceived remediation needs quickly, in keeping with the directive attitude typical of model-tracing tutors such as ACTP's. In WEST, credit or blame can instead be spread among competitors. In this alternative, the assessment of remediation requirements is slower but more certain, in keeping with the cautious attitude of an unobtrusive coach. Credit and blame need not be spread evenly. GUIDON applies a complex heuristic strategy to modify its diagnostic beliefs individually for each relevant rule.
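The two attitudes can be contrasted in a few lines. Both functions below are illustrative assumptions (the names `select_best` and `spread`, the plausibility scores, and the proportional spreading rule are not taken from any of the systems discussed): one commits all the blame to the most plausible explanation, the other divides it among the competitors.

```python
from typing import Dict


def select_best(weights: Dict[str, float], plausibility: Dict[str, float],
                amount: float = 1.0) -> Dict[str, float]:
    """Directive attitude: assign all blame to the single most plausible explanation."""
    updated = dict(weights)
    best = max(plausibility, key=plausibility.get)
    updated[best] = updated.get(best, 0.0) + amount
    return updated


def spread(weights: Dict[str, float], plausibility: Dict[str, float],
           amount: float = 1.0) -> Dict[str, float]:
    """Cautious attitude: divide blame among competitors in proportion to plausibility."""
    total = sum(plausibility.values())
    updated = dict(weights)
    for name, score in plausibility.items():
        updated[name] = updated.get(name, 0.0) + amount * score / total
    return updated


# Two competing explanations of one error, with hypothetical plausibility scores.
competitors = {"A": 3.0, "B": 1.0}
directive = select_best({}, competitors)
cautious = spread({}, competitors)
```

With selection, explanation A crosses any remediation threshold sooner; with spreading, B keeps some weight, so a wrong first guess is easier to recover from.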

17.3.3. NOISE IN THE MODEL OF COMMUNICABLE KNOWLEDGE:

The model of communicable knowledge is itself a source of noise. The student's viewpoint, a qualitative aspect of learning, can shift. For this reason noise can itself become data, indicating a mismatch of viewpoints between the system and the student, called "the degree of cognitive tear" (Burton and Brown). WEST monitors the use of each issue at each move. When noise is persistent and apparently not due to the spreading of credit and blame, it tries to detect different strategies for determining the best moves, which may provide better explanations of observed behavior.

17.4. Sources of evidence: diagnostic data
The design of a good diagnostic interface is critical to the success of a diagnostic module. The extensions of the channel through which diagnostic mechanisms collect evidence are vital for collecting data. Only indirect information can be made available to computers. In FLOW, for instance, each keystroke is timed and the tutor applies heuristic knowledge about the natural length of pauses in human programming to detect needs for intervention.
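Pause detection of this kind can be sketched as follows. The heuristic used here (flag any gap several times longer than the student's median gap) and the names `find_long_pauses` and `factor` are assumptions for illustration; the source does not describe FLOW's actual heuristics.

```python
from statistics import median
from typing import List


def find_long_pauses(timestamps: List[float], factor: float = 5.0) -> List[int]:
    """Return indices of keystrokes preceded by a pause longer than factor * median gap."""
    if len(timestamps) < 2:
        return []
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    threshold = factor * median(gaps)
    # gaps[i] is the pause before keystroke i + 1.
    return [i + 1 for i, gap in enumerate(gaps) if gap > threshold]


# Keystroke times in seconds: steady typing, then a long hesitation.
keystrokes = [0.0, 0.3, 0.6, 0.9, 5.0, 5.3]
pauses = find_long_pauses(keystrokes)
```

Using the student's own median rather than a fixed number of seconds is one way to approximate a "natural" pause length without knowing the typist in advance.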

17.4.1. Diagnostic observation
An appropriately designed interface can ensure that the system receives a maximum of information about what a student is doing, which helps make its diagnosis both computationally tractable and more accurate.

Granularity of information:
In some systems the model-tracing scheme depends on the granularity of information in structured environments, to avoid the inference of unobservable behavior. In ACTP's tutors, artifacts like the structural editor for LISP or the proof tree for geometry allow the student to communicate his decisions to the tutor, not only in a natural, unobtrusive fashion, but also at the level of detail imposed by the interface. There is a duality of purpose in instructional interface design between the requirements of communication in each direction, often resulting in synergism rather than competition: ALGEBRALAND forces the student to communicate his decisions within a specified framework and provides him with a set of tools with which to understand the domain.

The price of additional diagnostic data
Although desirable, an increase in diagnostic data requires additional mechanisms for collection and interpretation. Observing intermediate steps is only useful if they can be understood by the system. In procedural domains, interpretation may not require unreasonable extensions, although some of the behavior hypothesized by REPAIR theory may be quite complex and hard to follow. In programming, a nonprocedural domain, a nontrivial program may require much additional machinery. Both BRIDGE and the LISP tutor try to address multiple problem spaces, dealing with the semantics of intermediate steps, which may not be the same as the semantics of the final solution. While the LISP tutor follows and guides the student's programming step by step, PROUST analyzes a finished program and leaves aside intermediate steps. All this requires sophisticated analytical machinery to make up for the lack of data. ACTP stringently constrains the domain of intermediate steps.

17.4.2. OVERT DIAGNOSTIC ACTIONS

One can supplement passive observation with overt diagnostic actions to increase the power of a diagnostic module:

1. Active Diagnosis:
The system can test its hypothesis. MHO's steering-testing scheme is based on constraint-posting. This is a dynamic process since the student's knowledge state evolves with each test. The diagnostic action is inserted into the instructional sequence in an unobtrusive way without disturbing other pedagogical parameters such as interest or continuity. IDEBUGGY and PIXIE are purely diagnostic systems, which include algorithms for generating diagnostic problems.
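Generating a diagnostic problem can be sketched as a search for a problem on which two competing hypotheses about the student predict different answers. The two buggy subtraction procedures and the function `discriminating_problem` are hypothetical; this is not the algorithm of IDEBUGGY or PIXIE.

```python
from typing import Callable, List, Optional, Tuple

Procedure = Callable[[int, int], int]


def _columns(top: int, bottom: int) -> List[Tuple[int, int]]:
    """Pair the digits of both numbers column by column (assumes top >= bottom >= 0)."""
    width = len(str(top))
    return [(int(t), int(b)) for t, b in zip(str(top), str(bottom).zfill(width))]


def smaller_from_larger(top: int, bottom: int) -> int:
    """Hypothetical bug: in each column, subtract the smaller digit from the larger."""
    return int("".join(str(abs(t - b)) for t, b in _columns(top, bottom)))


def borrow_no_decrement(top: int, bottom: int) -> int:
    """Hypothetical bug: borrow ten when needed, but never decrement the next column."""
    digits = [str(t - b if t >= b else t + 10 - b) for t, b in _columns(top, bottom)]
    return int("".join(digits))


def discriminating_problem(hypothesis_a: Procedure, hypothesis_b: Procedure,
                           max_top: int = 99) -> Optional[Tuple[int, int]]:
    """Find a two-digit problem on which the two hypotheses give different answers."""
    for top in range(10, max_top + 1):
        for bottom in range(10, top):
            if hypothesis_a(top, bottom) != hypothesis_b(top, bottom):
                return (top, bottom)
    return None


problem = discriminating_problem(smaller_from_larger, borrow_no_decrement)
```

Posing the returned problem and observing the answer eliminates at least one of the two hypotheses, provided the student's knowledge state has not shifted in the meantime.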

2. Interactive Diagnosis:
Few systems invite the student to participate in the diagnostic process by volunteering information. WUSOR requests simple forms of self-evaluation before a session, and BIP after each problem. The LISP tutor sometimes engages in menu-driven, dialectic dialogues with the student. ACE attempts to understand a student's explanations of his reasoning in terms of its own internal trace. In their restricted forms of Socratic dialogue, WHY, GUIDON and MENO-TUTOR probe the student's knowledge, asking him to justify predictions or hypotheses in the context of specific cases. MACSYMA ADVISOR takes advantage of the student's communication abilities by performing an interactive diagnosis. The ability to consider noncompiled forms of knowledge moves the dialogue to a level of abstraction at which the student is less likely to be puzzled by the viewpoint underlying the system's questions.

17.5 ASPECTS OF DIAGNOSTIC EXPERTISE

In essence diagnosis can be viewed as a process of reasoning about reasoning and about learning. As a modeling activity it reconstructs knowledge states responsible for observable behavior by a reversal of the individual model that has processed knowledge into actions. Useful diagnostic heuristics have been compiled from underlying individual models, as in ACM's path finder and DEBUGGY's coercions. If diagnosis is defined as "to know sufficiently", it is worth asking how much is actually sufficient. Ohlsson claims that the role of diagnosis should be viewed as monitoring the success of unfolding pedagogical plans. He suggests that the usefulness of diagnostic distinctions can be measured by their abilities to distinguish between possible courses of action.

CONCLUSION:

The role of diagnosis in knowledge communication systems is an important issue in ITS research, but, according to Wenger, the final answer for specific cases will always be dependent on their specific contexts. The issue of the potential role of diagnosis in computer-based tutors has to be situated in the general context of a computational understanding of knowledge communication. As didactic responsibilities are deeply rooted in diagnostic perception, explicit interest in and careful study of diagnostic strategies, typical of ITS, are relevant even beyond the communication process.

REFERENCES:

E. Wenger: ARTIFICIAL INTELLIGENCE AND TUTORING SYSTEMS, Computational and Cognitive Approaches to the Communication of Knowledge. Morgan Kaufmann Publishers, Inc., 95 First Street, Los Altos, California 94022, January 12, 1987

P. Mendelsohn et P. Dillenbourg: Le Developpement de l'Enseignement intelligemment assisté par ordinateur, Symposium Intelligence Naturelle et Intelligence Artificielle, Rome 23-25 September 1991

Email: zotter00@uni2a.unige.ch

Helgard Zotter, April 1968