This document contains the research plan from the original project proposal. More recent ideas and progress reports will be made available periodically from the BOOTNAP home page.
For a quick overview of the proposal, glance at the BOOTNAP Proposal Summary.
This project, like our research team, is multidisciplinary. It is linked to the research programme Learning in Humans and Machines, funded by the European Science Foundation (see section 1.5). This international and multidisciplinary programme includes five task forces. The main contractor of this project, P. Dillenbourg, is responsible for the task force on collaborative learning. The first co-contractor, Prof. Mendelsohn, is also a member of the programme steering committee. The research plan explains how the experiments that we propose will serve as a basis for this international programme.
Social grounding refers to the process by which two discussants try to elaborate the mutual belief that the partner has understood what the speaker meant, to a criterion sufficient for current purposes. Clark & Brennan (1991) describe various mechanisms of social grounding: repeating what has been said in another way, pointing to objects ('you mean this one'), left dislocation (e.g. 'Your square, it is too large.'), using words that invite the partner to confirm his understanding, and so forth. These grounding mechanisms change according to the medium of communication. For instance, eye contact is generally not available in computer-supported distance collaboration. This project focuses on how people use external references (a diagram, a picture, ...) during social grounding. In this project, we use the term "mouse gesture" to refer to the way users draw something, point to an object, circle an area, and so on.
We try to avoid the issue of natural language processing in order to keep this project within reasonable boundaries. Of course, some verbal dialogue has to take place; mouse gestures can only be a complement. The link between this project and computational linguistics does not concern the techniques for natural language processing, but the underlying rhetorical structures (arguments, refutations, refinements, illustrations, ...) (Baker, 1993; Cawsey, 1993) and belief structures (knowledge gaps, conflicts, awareness, ...) (Cohen & Perrault, 1979). We focus on dialogue models specific to collaborative problem solving. Mouse gestures will be characterized in a more abstract terminology, as classes of communicative acts (Searle, 1969).
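As a first illustration of what such a characterization could look like, here is a minimal sketch (in Python, for concreteness). The gesture kinds, the act classes and the mapping between them are assumptions made for this illustration, not a taxonomy taken from Searle or from our data.

    from dataclasses import dataclass
    from enum import Enum

    class CommunicativeAct(Enum):
        """Illustrative act classes; establishing the real taxonomy is part of the project."""
        REFER = "refer"                       # pointing: 'you mean this one'
        ASSERT = "assert"                     # drawing a proposed solution element
        REQUEST_CONFIRM = "request_confirm"   # circling an area to invite confirmation

    @dataclass
    class MouseGesture:
        kind: str     # hypothetical gesture types: 'point', 'draw', 'circle'
        target: str   # identifier of the screen object or area concerned

    def classify(gesture: MouseGesture) -> CommunicativeAct:
        """Naive mapping from gesture type to communicative act (an assumption,
        not a result: the actual mapping is what the experiments should reveal)."""
        mapping = {"point": CommunicativeAct.REFER,
                   "draw": CommunicativeAct.ASSERT,
                   "circle": CommunicativeAct.REQUEST_CONFIRM}
        return mapping[gesture.kind]

    print(classify(MouseGesture(kind="point", target="square-3")))  # CommunicativeAct.REFER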
"Instead of thinking about whether a task should be performed by a person or by a machine, we should instead realize that functions are performed by people and machine together. Activities must be shared between people and machines and not just allocated to one or the other.".
In the field of expert systems, this "joint construction" can be understood in two non-exclusive ways. The first interpretation leads to participatory design: the future user must participate in building the system (Clancey, 1993). We propose another interpretation: increasing the depth of user-system interaction, i.e. the extent to which the interaction with the user influences the system's reasoning. We distinguish three levels of interaction:
The reason why we refer to the ESF programme is that we designed the research plan according to the research methodology adopted by task force 5, "collaborative learning". In order to make ESF workshops more fruitful, protocols of human-human collaborative problem solving will be shared by participants. They will serve as a concrete basis to compare theoretical positions and computational models. The second stage of our project aims to collect such protocols. The third stage will not only be driven by our own analysis of the protocols. It will also benefit from the insights of scientists with various theoretical standpoints and from various disciplines. Moreover, these scientists will enrich the analysis by providing protocols collected in other settings.
In PEOPLE POWER, the human learner and the machine learner played with an electoral simulation. They tried to gain seats by moving wards from one constituency to another. The computerized co-learner had a set of naive rules for reasoning about elections. The experiment conducted showed that human learners were able to interact with this rule-based agent, namely to point to a particular rule and to instantiate it with problem data. However, it appeared that this type of discussion 'at the knowledge level' was secondary. The primary task of subjects was to work on the graphical problem representation (a table of votes per party and per ward). Human-computer collaboration would have been more fruitful if it had concerned this problem representation instead of more abstract knowledge. In the design of the next system, MEMOLAB, we paid attention to this issue.
In MEMOLAB, the human learner and the machine expert jointly constructed an experiment on human memory. Collaboration is based on what can be most easily shared between a person and a machine: the interface. Let us imagine two rule-based systems that use the same set of facts. They share the same representation of the problem. Any fact produced by one of them is added to this shared set of facts. Hence, at the next cycle of the inference engine, this new fact may trigger a rule in either of the two rule bases. Now, let us replace one of the computerized experts with a human learner. The principle may still apply, provided we use an external problem representation instead of the internal one. The shared set of facts is the problem representation as displayed on the interface (see figure 1). All the conditions of the machine's rules refer only to objects displayed on the screen, and the actions performed by the rules modify the problem representation.
Figure to appear
Figure 1: Opportunism in human-machine collaboration in MEMOLAB
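To make the mechanism concrete, here is a minimal sketch (in Python, for concreteness) of two single-rule agents operating on a shared set of facts. The fact format and the rules themselves are invented for this illustration; replacing one of the functions by a human acting through the interface is exactly the move described above.

    # Shared set of facts: whatever one agent asserts, the other can match at the next cycle.
    shared_facts = {("task", "t1", "duration", 40)}

    def expert_rule(facts):
        """When a task has a duration but no stated activity, raise a question."""
        new = set()
        for (kind, ident, attr, val) in facts:
            if kind == "task" and attr == "duration":
                has_activity = any(f[0] == "task" and f[1] == ident and f[2] == "activity"
                                   for f in facts)
                if not has_activity:
                    new.add(("question", ident, "activity", "?"))
        return new

    def learner_rule(facts):
        """Answer open questions about activities (here, with a fixed reply)."""
        return {("task", ident, "activity", "rest")
                for (kind, ident, attr, _) in facts
                if kind == "question" and attr == "activity"}

    # Inference cycle: each agent in turn may extend the shared set of facts.
    for agent in (expert_rule, learner_rule, expert_rule):
        shared_facts |= agent(shared_facts)

    print(sorted(shared_facts))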
In short, the shared representation is visible to both partners and can be modified by both partners. We do not claim that they share the same 'internal' representation: sharing an external representation does not imply at all that both partners build the same internal representation. The shared concrete representation simply facilitates discussion of the differences between the internal representations and hence supports grounding mechanisms. Some recent experiments with MEMOLAB (Dillenbourg et al., 1993) revealed that such mechanisms may occur between a human and a machine: the learner perceives how the machine understands him (i.e. he makes a diagnosis of the machine's diagnosis) and reacts in order to correct any misdiagnosis:
"He supposes that I wanted the subjects to do something during 40 seconds. I wanted the subjects to do nothing."
"He'll ask me what I wanted to do. And then, since... he'll base himself on wrong things since this is not what I want to do."
These mechanisms can be formalised as nested beliefs (see section 1.2): "the user believes that the system believes that he believes X" is written "belief (user, belief (system, belief (user, X)))". If the learner notices a misunderstanding, he might start a dialogue with the system to repair the system's misdiagnosis. This may sound too heavy for natural dialogue, but we do it in everyday conversations, as illustrated by the fictitious example below:
A "Numbers ending with 4 or 6 are even"
B "778 is also even"
C "I did not say that those ending with an 8 are not even."
When speaker A repairs speaker B's misunderstanding, he does not simply repeat what he said previously. He reinterprets his first utterance from B's point of view in order to repair what B has understood. Interpreting what one said from one's partner's viewpoint corresponds to a learning mechanism referred to as 'appropriation' in socio-cultural theories of learning (Newman, 1989; Rogoff, 1990). These misunderstandings and the subsequent repair mechanisms are necessary to build a shared representation of the problem. Hence, our goal is not to design collaboration techniques that avoid any misunderstanding (even if that were possible), but to build techniques that provide the flexibility required for negotiating meanings. In the current implementation of MEMOLAB, rule variables unambiguously refer to screen objects. To support social grounding mechanisms, the instantiation of variables should not be an internal process, but the result of some interaction with the learner.
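Such nested beliefs lend themselves to a simple recursive representation, sketched below in Python. The structure follows the belief(...) notation used above; the repair test is an assumed simplification of the learner's 'diagnosis of the diagnosis'.

    from dataclasses import dataclass
    from typing import Union

    @dataclass
    class Belief:
        agent: str                      # 'user' or 'system'
        content: Union["Belief", str]   # a nested belief or a base proposition

        def __str__(self):
            return f"belief({self.agent}, {self.content})"

    # "The user believes that the system believes that he believes X":
    nested = Belief("user", Belief("system", Belief("user", "X")))
    print(nested)   # belief(user, belief(system, belief(user, X)))

    def needs_repair(actual: str, perceived: Belief) -> bool:
        """True when what the user actually believes differs from what he thinks
        the system thinks he believes: the learner's diagnosis of a misdiagnosis."""
        inner = perceived
        while isinstance(inner.content, Belief):
            inner = inner.content
        return inner.content != actual

    print(needs_repair("X", nested))   # False: no repair dialogue needed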
The experiments that we want to conduct (see stage 2) will concern two people collaborating on remote terminals. It would be easier to observe conversations between two humans sitting side-by-side and using a piece of paper. The reasons for choosing computer-supported collaboration are:
The core part of this system, the problem solving environment, will later be reused to develop the human-computer collaborative system. This environment may be very simple. It includes an interface allowing the user to solve the problem and the code necessary to respond to the user's actions. The problem to be solved will be selected according to the following criteria:
Our experiments will address two issues:
In setting 1, the communication facilities include the sound channel, the sketchpad and the notepad. This is the main setting, in which we aim:
In setting 3, the sound channel is permanently OFF. The users hence communicate via the notepad (written communication). We thereby isolate the role of grounding gestures with respect to other grounding mechanisms. Clark and Brennan (1991) have shown that grounding techniques change with the medium, because media vary with respect to delaying speech, turn taking, making and repairing errors, and so forth. By comparing setting 1 (sound ON) and setting 3 (sound OFF), we will observe how grounding mechanisms adapt to the communication medium. In oral communication, disambiguating sub-dialogues are cheap and fast. Since this is not the case for written messages, the diagrams may take on more importance. On the other hand, written communication has several advantages. It leaves a trace to which users can return to repair misunderstandings. Moreover, it will be the channel for human-computer verbal messages.
The outcome of this second phase will be a high-level description of human-human grounding techniques, as structures of communicative acts within the joint problem-solving process. This description will be independent of the problem and of the agents (human or machine).
The goal of this third stage is to develop new interaction techniques that enhance the collaboration between a human user and a knowledge-based system. The user may understand some rules differently from the system because he does not have the same frame of reference as the system designer (Clancey, 1991). He or she may not see how a rule fits the problem data. This is why the collaboration between the user and the system requires grounding mechanisms.
A human-computer collaboration system includes three components:
The design of the agent constitutes the main task. The reasoning of this agent must be compatible with (but not necessarily identical to) the way humans solve the same problem. Hence, the design of the rule base will be inspired by the think-aloud protocols collected in phase 2, setting 2. The challenge is to develop an interactive inference engine that satisfies the requirements presented in section 1.4 (level 2 interaction). Selection and instantiation of rules are jointly performed by the user and the system. Given our experience (see section 2), such an engine should be object-oriented, i.e. the problem representation is a set of objects. In MEMOLAB, these objects are displayed on the screen. Thereby, the user and the system share a concrete representation of the problem. The joint instantiation of rule variables with problem data can then be done by pointing at an object on the screen. This technique raises, however, a fundamental issue: screen objects are particular instances, while rule variables concern classes. The generality ratio between a variable and the object by which this variable is instantiated must be negotiated by the participants ("Do you mean any piece like that can be moved?"). The precise role of those dialogues will be clarified by the observations conducted in phase 2.
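The following sketch illustrates what such joint instantiation could look like: instead of binding a rule variable internally, the engine calls back to the user, who points at a screen object, and a class mismatch opens a negotiation rather than failing silently. The object model, the variable format and the callback are all assumptions made for this illustration.

    from typing import Callable, Optional

    # Hypothetical shared problem representation: typed objects on the screen.
    screen_objects = [{"id": "elem-1", "class": "task"},
                      {"id": "elem-2", "class": "list"}]

    def interactive_bind(var_class: str,
                         ask_user: Callable[[str], str]) -> Optional[dict]:
        """Bind a rule variable by interaction instead of internal matching:
        the user points at an object, and the engine checks whether the
        object's class fits the class the variable ranges over."""
        chosen_id = ask_user(f"Point at the {var_class} you mean.")
        for obj in screen_objects:
            if obj["id"] == chosen_id:
                if obj["class"] != var_class:
                    # Generality mismatch: this is where a negotiation dialogue
                    # ("Do you mean any object of that class can play this role?")
                    # would start, rather than a silent failure.
                    return None
                return obj
        return None

    # Simulated interaction: the 'user' clicks on elem-1.
    binding = interactive_bind("task", ask_user=lambda prompt: "elem-1")
    print(binding)   # {'id': 'elem-1', 'class': 'task'}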
The difficulty is not to build an interface but to integrate interface acts into an inference engine (see previous point). The original component of this interface will be the sketchpad. This sketchpad will be simpler than the one used in human-human collaboration. Since the system has to understand the drawings, we cannot allow the user to draw any kind of diagram; otherwise we would face issues of image interpretation that are beyond the scope of this project. Instead, the human and the machine users will use a limited set of graphical objects (lines, boxes, circles, arrows, ...). The set of objects and symbols to be provided will be determined after observing the drawings used in human-human collaboration (phase 2, settings 1 and 3). For these issues, we are in contact with Prof. Pun, a specialist in computer vision (Université de Genève).
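A possible data model for such a constrained sketchpad is sketched below; the primitive types and their fields are assumptions. Because each stroke reaches the system as a typed object rather than as pixels, 'understanding' a drawing reduces to inspecting structured data.

    from dataclasses import dataclass

    @dataclass
    class Box:
        x: int
        y: int
        width: int
        height: int
        label: str = ""

    @dataclass
    class Arrow:
        src: str   # id of the source object
        dst: str   # id of the target object

    # The drawing the system receives is structured, not a bitmap:
    drawing = {"b1": Box(10, 10, 80, 40, label="rehearsal task"),
               "b2": Box(10, 80, 80, 40, label="recall task"),
               "a1": Arrow(src="b1", dst="b2")}

    # 'Interpreting' the diagram is then a matter of traversal, not vision:
    links = [(a.src, a.dst) for a in drawing.values() if isinstance(a, Arrow)]
    print(links)   # [('b1', 'b2')]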
Because this project focuses on the role of images and diagrams in joint problem solving, it is also relevant to the development of multimedia technologies. Currently, most multimedia systems are weakly interactive: interaction concerns the selection and display of fixed or animated images, but the system and the user do not interact about the images. Images are considered an add-on. There have been few efforts to study how images may more deeply affect the user's work. This project investigates the role that images could play in collaborative problem solving.
Baker, M. (1992) The collaborative construction of explanations. Paper presented at the "2èmes journées Explication" du PRC-GDR-IA du CNRS, Sophia-Antipolis.
Baker, M. (1993) Negotiation in Collaborative Problem-Solving Dialogues. Rapport CR-2/93. CNRS, Laboratoire IRPEACS, Equipe Coast, Ecole Normale Supérieure de Lyon.
Bargh, J.A. & Schul, Y. (1980) On the cognitive benefits of teaching. Journal of Educational Psychology, 72 (5), 593-604.
Behrend, S.D. & Roschelle, J. (to appear) The construction of shared knowledge in collaborative problem solving. In C.E. O'Malley (Ed). Computer Supported Collaborative Learning. New York: Springer-Verlag.
Bird, S.D. (1993) Toward a taxonomy of multi-agent systems. International Journal of Man-Machine Studies, 39, 689-704.
Blaye, A., Light, P., Joiner, R. & Sheldon, S. (1991) Collaboration as a facilitator of planning and problem solving on a computer-based task. British Journal of Developmental Psychology, 9, 471-483.
Butterworth, G. (1982) A brief account of the conflict between the individual & the social in models of cognitive growth. In G. Butterworth & P. Light (Eds) Social Cognition (3-16). Brighton, Sussex: Harvester Press.
Cawsey, A. (1993) Planning Interactive Explanations. International Journal of Man-Machine Studies, 38, 169-199.
Chi, M.T., Bassok, M., Lewis, M.W., Reimann, P. & Glaser, R. (1989) Self-Explanations: How Students Study and Use Examples in Learning to Solve Problems. Cognitive Science, 13, 145-182.
Clancey, W.J. (1991) The frame of reference problem in the design of intelligent machines. In K. Van Lehn (Ed.) Architectures for Intelligence: The twenty-second Carnegie symposium on cognition (357-424). Hillsdale: Lawrence Erlbaum.
Clancey, W.J. (1992) Guidon-Manage Revisited: A Socio-Technical Systems Approach. Journal of Artificial Intelligence in Education, Vol. 4, 1, 5-34.
Clark, H.H. & Brennan, S.E. (1991) Grounding in Communication. In L. Resnick, J. Levine and S. Teasley (Eds) Perspectives on Socially Shared Cognition (127-149). Hyattsville, MD: American Psychological Association.
Cohen, P.R. & Perrault, C.R. (1979) Elements of a Plan-Based Theory of Speech Acts. Cognitive Science, 3, 177-212.
Dewan, P. (1993) Tools for implementing multi-user interfaces. In Bass and Dewan (Eds) User Interface Software, John Wiley.
Dillenbourg, P. (1991) Human-Computer Collaborative Learning. Doctoral dissertation. Department of Computing. University of Lancaster, Lancaster LA14YR, UK.
Dillenbourg, P. (to appear) Distributing cognition over brains and machines. In S. Vosniadou, E. De Corte, B. Glaser & H. Mandl (Eds), International Perspectives on the Psychological Foundations of Technology-Based Learning Environments. Hamburg: Springer-Verlag.
Dillenbourg, P., Hilario, M., Mendelsohn, P., Schneider D. and Borcic, B. (1993) The Memolab Project. Research Report. TECFA Document. TECFA, University of Geneva.
Doise, W. & Mugny, G. (1984) The social development of the intellect. Oxford: Pergamon Press.
Durfee, E.H., Lesser, V.R. & Corkill, D.D. (1989) Cooperative Distributed Problem Solving. In A. Barr, P.R. Cohen & E.A. Feigenbaum (Eds) The Handbook of Artificial Intelligence, (Vol. IV, 83-127). Reading, Massachusetts: Addison-Wesley.
Hancock, P.A. (1992) On the future of hybrid human-machine systems. In J.A. Wise, V.D. Hopkin & P. Stager (Eds) Verification and Validation of Complex Systems: Human Factors. NATO ASI Series F: Computer and Systems Sciences, Vol. 10, 61-85.
Hill, R.D., Brinck, T., Patterson, J.F., Rohall, S.L. & Wilner, W.T. (1993) The Rendezvous language and architecture: Tools for constructing multi-user interactive systems. Communications of the ACM, 36 (1), 62-67.
Jennings, N. (1992) Joint intentions as a model of multi-agent cooperation. Technical Report 92/18. Department of Electronic Engineering. University of London.
Kantowitz, B.H. & Sorkin, R.D. (1987) Allocation of functions. In G. Salvendy (Ed) Handbook of Human Factors. New York: Wiley.
Kozulin, A. (1990) Vygotsky's psychology: A biography of ideas. Hertfordshire: Harvester.
Krauss, R.M. & Fussell, S.R. (1991) Constructing shared communicative environments. In L. Resnick, J. Levine and S. Teasley (Eds). Perspectives on Socially Shared Cognition (172-202). Hyattsville, MD: American Psychological Association.
Lave, J. (1988) Cognition in Practice. Cambridge: Cambridge University Press.
Miyake, N. (1986) Constructive Interaction and the Iterative Process of Understanding. Cognitive Science, 10, 151-177.
Newman, D. (1989) Is a student model necessary? Apprenticeship as a model for ITS. Proceedings of the 4th AI & Education Conference (pp.177-184), May 24-26. Amsterdam, The Netherlands: IOS.
O'Malley, C. (1987) Understanding explanation. Paper presented at the third CeRCLe Workshop 'Teaching Knowledge and Intelligent Tutoring' (April), Ullswater, UK.
Perret-Clermont, A.-N., Perret, J.-F. & Bell, N. (1991) The Social Construction of Meaning and Cognitive Activity in Elementary School Children. In L. Resnick, J. Levine and S. Teasley (Eds) Perspectives on Socially Shared Cognition (41-62). Hyattsville, MD: American Psychological Association.
Resnick, L.B. (1991) Shared cognition: thinking as social practice. In L. Resnick, J. Levine and S. Teasley (Eds). Perspectives on Socially Shared Cognition (127-149). Hyattsville, MD: American Psychological Association.
Rogoff, B. (1990) Apprenticeship in thinking. New York: Oxford University Press.
Searle, J. (1969) Speech acts: An essay in the philosophy of language. Cambridge: Cambridge University Press.
Sheridan, T.B. (1991) Task allocation and supervisory control. In M.Helander (Ed) Handbook of Human-Computer Interaction, 159-173. Amsterdam: North Holland.
Suchman, L.A. (1987) Plans and Situated Actions. The problem of human-machine communication. Cambridge: Cambridge University Press.
Wertsch, J. V. (1979) The regulation of human action and the given-new organization of private speech. In G. Zivin (Ed) The development of self-regulation through private speech, 79-98. New York: John Wiley & Sons.
Wertsch, J.V. (1991) A socio-cultural approach to socially shared cognition. In L. Resnick, J. Levine and S. Teasley (Eds) Perspectives on Socially Shared Cognition (1-20). Hyattsville, MD: American Psychological Association.
Woods, D.D. & Roth, E.M. (1991) Cognitive System Engineering. In M.Helander (Ed) Handbook of Human-Computer Interaction, 3-35. Amsterdam: North Holland.