Nijmegen, The Netherlands, August 26-31, 1995.
Symposium schedule: Sunday August 27, from 16.00 to 18.00
The development of computerized information systems has stressed the need for reliable evaluation methods. In addition to traditional techniques (e.g., post-tests or self-reports), the analysis of learner-courseware interaction protocols (e.g., videotaping, reading time records, selection logbooks, etc.) can be a valuable source of information to the researcher or the system designer. The analysis of interaction protocols represents a potentially important tool for instructional research both at theoretical and practical levels. At the theoretical level, interaction protocols can be used as dependent measures to understand the nature of the learning process. At the applied level, the use of on-line data is a valuable means to evaluate a proposed piece of courseware.
This symposium will examine the methodological issues involved in recording and analysing interaction protocols. Participants will present case studies or experiments involving the analysis of interaction traces. They will explain the techniques involved, the difficulties they may have faced, how they solved them and what they learned from these data (about the system and about the learner). The symposium may be of interest to educators or researchers involved in the development, evaluation or usage of courseware (CAI systems, tutors, hypermedia applications). More generally, the symposium concerns all researchers who use computer-based settings to test any psychological or pedagogical hypothesis.
The development of computerized information systems has stressed the need for reliable evaluation methods. Typical methods include measures of students' performance (e.g., learning and/or memory tests) as well as subjective evaluation (students' self-reports of opinion, satisfaction, difficulties etc.). In addition to these techniques, the analysis of learner-courseware interaction protocols (e.g., videotaping, reading time records, selection logbooks, etc.) can be a valuable source of information to the researcher or system designer. However, the collection and analysis of this type of data raises numerous questions: How can relevant indicators be extracted from raw data? How should interindividual differences in interaction protocols be interpreted? How can the findings be translated into system improvements? From the current literature on courseware design and evaluation, it is clear that these questions are serious ones that have not yet received a definitive answer.
The analysis of interaction protocols poses two major types of problems.
First is the problem of mixing a qualitative analysis with quantitative indicators. Qualitative analysis consists in describing thoroughly the navigation patterns of a few subjects. It may provide rich and meaningful observations of subjects' navigation strategies. However, due to its cost it is hardly generalizable beyond a few cases. Quantitative indicators (e.g., percentage of "loops", or multiple visits of the same node) allow a faster processing of many cases but they may distort the information contained in the interaction protocols. Combining these two approaches is a major problem in most empirical studies of interaction protocols.
Second is the problem of interpreting the navigation patterns. Let us consider a well-known example. When analyzing learners' navigation in hypertext systems, the number of loops in the exploration path is often used as an indicator of navigation difficulties. However, this interpretation has been questioned. Some loops may be induced by the system because the screen includes a "return" button. In other cases, users may explore the hyperspace around an important or central node. In this case, loops indicate structured exploration rather than navigation difficulties. To provide a correct interpretation, the researcher will often have to combine different sources of information: interaction protocols may be related to reading time, memory for content, subject's evaluation etc. Interpreting interaction protocols also requires researchers to take into account cognitive models of information-processing (e.g. text comprehension and working memory models).
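As a purely illustrative sketch of this point (the navigation log, node names and "return" action below are invented, not taken from any particular system), a raw loop count can be qualified by recording how each revisit was reached, so that system-induced returns are separated from other revisits:

# Hypothetical navigation log: (node reached, action used to reach it).
log = [
    ("intro", "link"),
    ("map", "link"),
    ("intro", "return"),   # revisit induced by a "return" button
    ("costs", "link"),
    ("map", "link"),       # revisit of a central node via an ordinary link
]

visited = set()
loops_via_return = 0
loops_via_link = 0
for node, action in log:
    if node in visited:
        if action == "return":
            loops_via_return += 1
        else:
            loops_via_link += 1
    visited.add(node)

print(loops_via_return + loops_via_link, loops_via_return, loops_via_link)  # 2 1 1

Even this simple distinction changes the reading of the indicator: in the invented example above, half of the loops are attributable to the interface rather than to the user's strategy.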
These problems are not specific to learner-computer interaction protocols, but they are emphasized by the apparent ease of collecting and processing data with computers.
Despite these difficulties, the analysis of interaction protocols represents a potentially important tool for instructional research both at theoretical and practical levels.
- At the theoretical level, interaction protocols can be used as dependent measures to understand the nature of the learning process. In this perspective the use of hypermedia systems is a means to build research designs that extend beyond the classical pre-test/post-test paradigm. Through the use of interaction protocols the psychologist or instructional researcher has more direct access to the learner's cognitive processes and strategies.
- At the applied level, the use of on-line data is a valuable means to evaluate a proposed piece of courseware. Interaction protocols can indicate whether learners are able to make use of the system in meaningful, efficient ways. Changes in interaction patterns can reveal how training affects the learner's ability to interact successfully with the system. These observations can lead to significant improvements of the system's features and/or interface.
This symposium will examine the methodological issues involved in recording and analysing interaction protocols. The symposium will include a general introduction to the methods and measures available from raw interaction protocols (e.g., system recording or videotaping) as well as presentations of empirical studies making use of this type of data. Participants will present case studies or experiments involving the analysis of interaction traces. They will explain the techniques involved, the difficulties they may have faced, how they solved them and what they learned from these data (about the system and about the learner). The presentations will be followed by a discussion that will aim to identify features, difficulties and possible solutions that recur across studies. Ample time will be allocated to audience interventions and questions.
2. Relevance to EARLI domain and objectives
This symposium is concerned with the improvement of evaluation methods for any experimental setting in which the computer automatically collects data. Therefore, the proposed topic falls into the scope of EARLI'95 theme IV "Methodology and assessment".
The symposium proposal is endorsed by and organized on behalf of EARLI SIG 5 "Learning and Instruction with Computers" (Nira Hativa and Peter Goodyear, co-ordinators).
The symposium may be of interest to educators or researchers involved in the development, evaluation or usage of courseware (CAI systems, tutors, hypermedia applications). Nonetheless, the methodological aspects to be addressed are not restricted to those who are specialised in educational technology. It concerns indeed all researchers who use computer-based settings to test any psychological or pedagogical hypothesis.
Finally, the symposium will address explicitly issues that may
concern several other conference sessions. The organisers have
been made aware of several symposium proposals that will present
research and development on computer-based information systems.
The present symposium will provide a methodological meeting point
for participants involved in other courseware-related sessions.
The illustration of analysis techniques in different study contexts
may help attendees develop their own evaluation and assessment
methods.
Analyzing Learner-Hypermedia Interaction: A Methodological Review
Jean-Michel Passerault (1) and Jean-François Rouet (2)
(1) Laboratoire de Psychologie, URA CNRS 1607, 95 avenue du Recteur
Pineau 86022 Poitiers Cedex. Tel (+33) 49.45.32.45; Fax (+33)
49.45.33.01; email psylan@zeus.univ-poitiers.fr
(2) INRIA Rhône-Alpes, 46 avenue Félix Viallet, 38031
Grenoble Cedex.
Paper submitted to Symposium
"Analyzing Learner-Computer Interactions: Lessons from empirical
studies"
Sixth European Conference on Learning and Instruction
Nijmegen, The Netherlands, August 26-31, 1995.
Abstract
The purpose of this presentation is to examine the methods available
to study learner-hypermedia interaction. Online methods have been
extensively used in cognitive research, and especially in the
area of discourse processing. We suggest that these methods can
serve as building blocks when studying learner-hypermedia interaction.
In the first part of the presentation we examine the use of three
main indicators in discourse comprehension research: Reading time
(either with self-paced presentation or eye movements), secondary
task and verbal protocols. We discuss the benefits and limits
of each technique. In the second part we focus on the study of
hypermedia usage. We introduce the Evaluation-Selection-Processing
cycle as a basic framework to represent learner-hypermedia interaction.
Then we present different approaches to studying learner-hypermedia
interaction, based on several dimensions: "Observation grain",
or the degree of precision of the events recorded, and the depth and breadth
of analysis. We specify the research context for which each approach
seems most appropriate. We conclude that whatever the approach,
hypermedia research should comply with the general standards of
empirical research, e.g., explicit hypotheses and controlled study
conditions.
Introduction
Due to the increased availability of computer technology and data
analysis methods, cognitive and instructional research relies more
and more heavily on the online study of learners' activities,
especially in the areas of knowledge acquisition, problem solving
and discourse processing. Generally speaking, online methods consist
in analyzing the activity itself (e.g., time taken to read a text,
concurrent verbal protocols), rather than its outcomes (e.g.,
ability to answer post-questions).
Online methods may be useful for basic cognitive research as well
as for the evaluation and assessment of instructional technology
(e.g., hypermedia systems). However, these methods pose a number
of theoretical and technical problems. The purpose of this paper
is to review the main methods used so far and to discuss their
advantages and limitations.
In the first part, we review the main types of online indicators
used in discourse processing research: Measures of reading time,
eye movements, and the secondary task paradigm. In the second part we
examine the use of online methods in the domain of hypermedia
research. First we introduce the Evaluation-Selection-Processing cycle
as a general framework for hypermedia usage. Then we discuss the
notions of "observation grain" and "analysis depth
and breadth" when analyzing learner-hypermedia interaction.
1. Online methods in discourse comprehension
The use of online methods to study basic processes in discourse
comprehension has become quite popular in recent years. In this
section we present the main types of indicators used in this research
area. We point out the advantages and limitations of each of them.
1.1. Reading time
Reading time is probably the most popular parameter in research
on comprehension processes. A general underlying principle is
that the more complex or demanding the mental processes, the longer
the time spent on a certain piece of text.
Two main techniques are used. The first technique consists in
recording all the eye movements of the subject. Then the durations
of the eye pauses on the words are measured. This variable is
especially sensitive to many lexical and syntactic parameters
of a text. However, the recording of eye movements is technically
difficult. It requires a complex apparatus and training of the
subjects, and simply does not work with some of them. The second
technique is the self-paced reading procedure. In its basic form, the reader
presses a key to display a text segment (word, phrase, sentence)
on the computer screen. Each new segment either replaces the previous
one or is added after the previous one. Exposure time (i.e., time
between two key presses) is considered a good indicator of the
reading processes.
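A minimal sketch of the computation involved (timestamps and segment labels are invented): exposure times are simply the differences between successive key presses recorded by the presentation program.

# Hypothetical key-press timestamps (in seconds); each press displays
# the next text segment.
key_presses = [0.0, 1.8, 3.1, 6.0, 7.2]
segments = ["S1", "S2", "S3", "S4"]

# Exposure time of a segment = time between the press that displayed it
# and the next key press.
exposure = {seg: round(key_presses[i + 1] - key_presses[i], 2)
            for i, seg in enumerate(segments)}
print(exposure)  # {'S1': 1.8, 'S2': 1.3, 'S3': 2.9, 'S4': 1.2}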
The use of reading time has been criticized on two grounds: First,
comprehension involves both immediate and delayed processes. Thus
there may not be a strict correspondence between what the reader
looks at and what he or she actually processes in working memory.
Second, the relation between time needed and complexity may not
be strictly linear. Hence the following technique.
1.2. Secondary task paradigm
The secondary task paradigm (STP) relies on the postulate that
only a limited amount of cognitive processing resources is available
at any given time. Each cognitive process has a certain cost in
terms of resources. The STP consists in asking the subject to
perform in parallel a main task (e.g., reading a text passage)
and a secondary task (e.g., responding to auditory signals). When
at certain points of the activity the reaction time increases,
it can be inferred that the main task currently performed is cognitively
more demanding.
Thus, compared to reading time, the STP provides a means to evaluate
the intensity of the cognitive activity (rather than only its
duration). This method is especially interesting since cognitive
resources may be allotted strategically by the subject as a function
of his or her needs or objectives.
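As a minimal illustration (the reaction times below are invented), the basic analysis amounts to comparing mean probe reaction times across conditions or text segments, longer reaction times being taken to indicate a higher processing load in the main task.

# Hypothetical probe reaction times (in ms) collected while subjects
# read two passages of different difficulty.
probe_rts = {
    "simple_passage":  [310, 295, 330, 305],
    "complex_passage": [420, 465, 440, 410],
}

for passage, rts in probe_rts.items():
    mean_rt = sum(rts) / len(rts)
    print(f"{passage}: mean secondary-task RT = {mean_rt:.0f} ms")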
1.3. Verbal reports
Reading time and secondary task provide information on the cost,
not the nature of the observed processes. The collection of verbal
protocols during the subject's activity is a means of gaining a more
qualitative view of the cognitive processes. This method
consists in training the subject to "think aloud" while
performing the task, and/or in asking the subject questions during task performance.
The verbal protocols are then analyzed so as to identify indicators
of the various processes involved in the activity. These indicators
are especially interesting when studying the strategic aspects
of high-level cognitive processes. However, protocols are often
difficult to analyze, and thinking aloud is sometimes considered
a costly secondary task that may interfere with the main activity.
2. Online study of learner-courseware interactions
The above methods are useful when studying lengthy processes that
do not involve any physical activity, e.g., reading a passage
of text. However, the use of interactive information systems (e.g.,
hypertext or other types of courseware) provides new means to
analyze high-level cognitive activities. Namely, it is possible
to record interaction protocols, i.e. a logbook of the events
that take place during system usage.
In this section we examine the use of interaction protocols as
data to study the processes involved in using hypertext systems.
We have limited our scope to the analysis of computer-controlled
recordings, despite the existence of other online sources of information
(e.g., verbal protocols; interaction between users, video recordings
etc.).
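A minimal sketch of what such a logbook might contain (the recording function and event names are hypothetical, not those of an actual system): each entry records when an event occurred, which node was displayed, and what the user did.

import time

logbook = []

def record_event(node_id, action):
    """Append a time-stamped interaction event to the logbook."""
    logbook.append({"time": time.time(), "node": node_id, "action": action})

# Events that a hypermedia system might record during a short session.
record_event("index", "open")
record_event("node_12", "select_link")
record_event("node_12", "scroll")
record_event("index", "return")

The analyses discussed below all start from a record of this kind.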
2.1. The Evaluation-Selection-Processing cycle
The use of hypermedia systems can be schematized as follows: The
learner has a certain task or goal to achieve. He or she has access
to a hypermedia system which includes a large set of data and
an interface, and which may be more or less familiar to the learner.
A communication process takes place, which may be characterized
as a three-step cycle:
(a) The learner evaluates his or her current information needs
with respect to the objectives or task requirements. Based on
this evaluation the learner may decide to make use of the system
or to exit.
(b) The learner selects a target node in the system (from a menu
or a list of options);
(c) The learner processes node information (text and/or graphics),
integrates it with previous information, and recycles to (a).
The evaluation-selection-processing cycle or ESP involves a complex
hierarchy of cognitive processes. A general research issue is
to identify the cognitive or situational factors that may influence
the effectiveness of the ESP cycle (see Rouet and Tricot, 1995).
This issue involves both studying the quality of the interaction
and its outcomes in terms of learning, user satisfaction, etc.
Most hypermedia systems allow the recording of interaction protocols.
Basically any action of the user (keyboard strokes, mouse moves
and clicks...) can be recorded along with the hypermedia node
where they took place. Consequently, the main issue when studying
interaction protocols is to avoid being overwhelmed by the amount
of data collected. To that end, the researcher has to define an
appropriate "observation grain", and to choose an appropriate
analysis method.
2.2. Setting the observation grain
The "observation grain" may be defined as the precision
of the events recorded for further analysis. Generally speaking
there may be three levels of grain:
- Fine grain: At this level, all the observable actions are taken
into account. These include mouse moves, clicks, keyboard strokes,
etc.
- Average grain: At this level only significant events are recorded.
For instance the researcher may decide that the smallest significant
event is the move from one hypertext node to another, regardless
of what happens between two such moves.
- Large grain: At this level sequences of actions are grouped
to form meaningful behavioral chunks. For instance, when studying
large hypertext systems, hypertext "areas" rather than
individual nodes may be taken as units. In that case the smallest
significant events will be entrance into and exit from an area, regardless
of what happens in the meantime.
Selection of an appropriate observation grain is a matter of study
objectives. When investigating the cognitive consequences of a
particular interface feature (e.g., position of a window or button),
the researcher may need to collect fine-grain data. However,
if the aim is to identify navigation patterns among a group of
users, an average or large observation grain may be more appropriate.
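The three grain levels can be illustrated with a small sketch (the event list, node names and node-to-area mapping are all invented): fine-grain events are reduced to node-to-node moves (average grain), which are in turn reduced to area entrances (large grain).

# Fine-grain events (hypothetical): every recorded action with its node.
fine_grain = [
    ("index", "open"), ("index", "mouse_move"), ("node_3", "select_link"),
    ("node_3", "scroll"), ("node_7", "select_link"), ("summary", "select_link"),
]

# Average grain: keep only the moves from one node to another.
average_grain = []
for node, _action in fine_grain:
    if not average_grain or average_grain[-1] != node:
        average_grain.append(node)

# Large grain: group nodes into areas (mapping invented for illustration)
# and keep only entrances into a new area.
area_of = {"index": "orientation", "node_3": "chapter_1",
           "node_7": "chapter_1", "summary": "orientation"}
large_grain = []
for node in average_grain:
    area = area_of[node]
    if not large_grain or large_grain[-1] != area:
        large_grain.append(area)

print(average_grain)  # ['index', 'node_3', 'node_7', 'summary']
print(large_grain)    # ['orientation', 'chapter_1', 'orientation']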
2.3. Depth vs. breadth of analysis
Once an appropriate grain has been defined the researcher has
to define an appropriate analysis method. There may be three different
approaches, based on the type of data collected as well as on
the study objectives.
- Case studies of raw sequences: This approach consists
in sampling a few protocols and making a thorough qualitative
analysis of the complete sequence of events. In this case priority
is given to the co-occurrence of several events, in order to achieve
a global understanding of the learner's activity. This approach
preserves the integrity of the data and may be used in preliminary
phases of a research study. However, the qualitative analysis
of a whole sequence requires much effort and cannot be generalized
beyond a few cases. Thus, this approach can be characterized
as a deep but narrow type of analysis.
- Definition of numerical parameters. This approach consists
in summarizing a series of events through a numerical parameter,
e.g., average study time per node or looping ratio (a minimal
computational sketch of such parameters is given below). The reduction
of a sequence to a set of parameters can be used when the researcher
wants to check specific expectations. Furthermore, the computation
of numerical parameters can be partly automated, which makes it
possible to study a large number of cases. The drawback is
a loss of potentially important aspects of the interaction. Consequently
this approach can be seen as broader but shallower, compared
to the first one.
- Frequency analysis of key events. In some cases the researcher
will be interested only in a few key events, e.g., number of times
the learner has accessed a certain node. In those cases analyzing
the interaction protocol will amount to selecting the events of
interest and then making appropriate computations. This rather
restrictive approach is valid only if the researcher was able
to formulate specific hypotheses about the phenomena at work during
the interaction process. This is usually possible at advanced
stages of empirical research.
It should be emphasized that the selection and definition of relevant
parameters involved in the last two approaches should be conducted
carefully, since there is a risk of introducing bias
by neglecting potentially important aspects of the data.
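The following is a minimal sketch of the two numerical parameters mentioned above, computed from an invented node-level protocol (node names and times are hypothetical).

# Hypothetical node-level protocol: (node, time spent in seconds).
protocol = [("A", 12.0), ("B", 4.5), ("A", 3.0), ("C", 20.0), ("B", 6.5)]

n_selections = len(protocol)
n_distinct = len({node for node, _t in protocol})

# Average study time per node selection.
avg_time = sum(t for _node, t in protocol) / n_selections

# Looping ratio: proportion of selections that revisit an already-seen node.
looping_ratio = (n_selections - n_distinct) / n_selections

print(f"average study time per node: {avg_time:.1f} s")  # 9.2 s
print(f"looping ratio: {looping_ratio:.2f}")              # 0.40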
Finally, it must be noted that the analysis of interaction sequences
can be performed based on, or in addition to, other on-line methods
such as those presented in the first section. For example, exploratory
analysis of raw sequences may be fruitfully complemented by verbal
protocols; the "numerical parameter" approach will often
rely on reading time measurement.
Conclusions
The use of online data is more and more widespread in the area
of discourse comprehension research. It can also be used fruitfully
to analyze the cognitive processes at work in hypermedia usage.
However, the use of interaction protocols presents some constraints
on the organization of research studies:
First, research objectives and hypotheses should be carefully
formulated before collecting interaction protocols, in order to
avoid being overwhelmed by the amount and complexity of the data.
Second, data should be collected in tightly controlled conditions.
The researcher must keep in mind that many situational and individual
factors (e.g., age, familiarity with the system) can influence
the learner-hypermedia interaction.
Finally, the observation grain and type of analysis must be defined
in accordance with the objectives of the study.
In exploratory research phases, best results are usually obtained
when several approaches are used in parallel. For instance, reducing
data to numerical indicators can be fruitfully paired with a more
case-based, qualitative analysis. Moreover, the analysis of interaction
protocols can be enriched by the use of off-line observations,
such as post-tests or interviews.
NAVIGATING THROUGH ECOLAND
Donatella Cesareni
Dipartimento Psicologia dei Processi di Sviluppo e Socializzazione
via dei Marsi 78, 00185 Roma
E-mail: D.Cesareni@agora.stm.it
FAX: +39 6 49917652
Tel.+39 6 49917669
!!! This paper will not be presented at Nijmegen (the author could not come) !!!
Summary
Researchers underscore the educational possibilities offered by
Hypermedia, defined as 'flexible personalized information tools'.
The aim of this paper is to investigate how students can use Hypermedia,
and how it is possible to relate success in learning to students'
navigation in a Hypermedia environment.
An experimental study was conducted with 114 twelve to fifteen
year-old students comparing a studying activity using the hypermedia
Ecoland with the same activity using printed material. Moreover,
we compared two different uses of the hypermedia:
free exploration and a guided study session. We collected
data about knowledge acquisition and data concerning the task
(reports and notes written by students and records of their navigation
through the hypermedia).
The use of the hypermedia gave better results than the use of
printed material. The unguided use of the hypermedia gave better
results in knowledge acquisition than the guided study session,
but students solved their task better when using the guided version
of the hypermedia. We analyzed the navigation records of students
using Ecoland to identify which navigation strategies give better
results in learning through hypertext. We found that students
with no score increment between pre- and post-test usually consulted
few of the pages containing more written information and tended to move
in and out of hypertext pages without a clear strategy.
Introduction
We are living at a time when new learning systems based on the
use of the computer are being created (micro-worlds, simulations,
hypermedia); many writers underscore the possibilities offered
by these tools in supporting self-learning and the involvement
of the students in meaningful cognitive information processing.
Hypertexts in particular and, more generally, hypermedia appear
to open up great educational possibilities in complex, multi-disciplinary
areas. Researchers however indicate some problems in the use of
hypertext in education: Hypertext may be ineffective if learners
navigate through the knowledge base in an unmotivated and haphazard
fashion.
The aim of this paper is to investigate how students can use Hypermedia,
and how it is possible to relate success in learning to students'
navigation in the Hypermedia environment.
The hypermedia Ecoland has been designed and developed according
to five design principles, derived from critical reflection on the
literature and previous research. In summary, the hypertext must
assign a meaningful task to the students, in order to motivate
and encourage involvement; it must support cooperation; the structure
of the application must reflect the structure of the knowledge
base to be presented; the navigation system must be very simple;
the structure of the hypertext must help students to understand
the relationships that exist between different parts of the knowledge
domain.
According to these design principles, Ecoland presents a hypermedia
learning environment based on information retrieval and the discovery
of relationships. The students are required to explore an environment
in order to collect as much information as they can about the
consequences of three solutions for waste treatment. Information
is organized using a spatial metaphor: the region of Ecoland contains
three small towns; in each town, students can enter four different
places in order to gather information at different levels of depth
and complexity. They can visit the town hall, the library, the
town's archives and the main square of the town.
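Purely for illustration, the spatial organization just described can be represented as a simple mapping (the town names below are invented, since the towns are not named here; the four places are those listed above).

# Illustrative representation of the Ecoland spatial metaphor:
# one region, three towns, and the same four places in each town.
places = ["town hall", "library", "town archives", "main square"]
ecoland = {town: list(places) for town in ["Town A", "Town B", "Town C"]}

# A navigation step can then be coded as a (town, place) pair, e.g. the
# path of a dyad gathering information town by town.
example_path = [("Town A", "town hall"), ("Town A", "library"),
                ("Town B", "town hall")]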
From a studying activity conducted with Ecoland we expect knowledge
acquisition in science and environmental protection as well as
an increased awareness of the logical relations existing between concepts
related to environmental education.
Method
In order to test the educational potential of the hypermedia
Ecoland, we conducted an empirical study involving 114 twelve- to
fifteen-year-old students. The students engaged in collaborative
work during 3 learning sessions with the hypermedia Ecoland. The
experimental design was aimed at investigating the effects of two
factors:
a) the first concerned the comparison between two presentation formats:
one involving the use of the hypermedia, the other using
printed material presenting exactly the same content;
b) the second factor concerned two different uses of the hypermedia:
free exploration and a guided study session.
The students were divided into 3 groups, each assigned to a different
activity: a) a studying activity with the hypermedia Ecoland;
b) the same activity using a guided version of the hypermedia;
c) a studying activity with printed materials giving the same
information as the hypermedia.
In each group students worked cooperatively in dyads. The dependent
variables were knowledge acquisition and the ability to connect
concepts about the environment. In order to measure these
variables we developed and validated a set of tests. Moreover,
we collected data concerning the task (reports and notes written
by students and records of their navigation through the hypermedia).
General research results
Scores of subjects who worked with the hypertext increased more
between pre- and post-test than scores of subjects using printed
material. Looking at the differences between guided and unguided
use of the hypertext, we found that knowledge acquisition scores increased
more in students using the hypertext in the unguided version than
in the other group; but the guided use of the hypertext allowed students
to better solve the comparison task we set them. Indeed, the
analysis of students' notes and reports showed more accuracy, articulation
and completeness in the work of students using
the guided version.
Navigation analysis
We collected data about the exploration strategies used by students,
i.e. records of navigation, pages visited and time spent on each
page. In a previous experiment on the use of hypermedia,
conducted with 18 students, we had noted that many navigation
records contained a large number of pages on which students stopped
for only a few seconds. We calculated the average time
spent per page, obtained by dividing the total time spent
in the hypertext by the number of pages visited. We hypothesized that
the lower the average time, the less attention students would pay
to the information presented in Ecoland. We therefore called this measure
"attention level". Relating this "attention
level" measure to students' scores on the post-test, we noted that
the three students with the lowest average time did not increase
their scores between pre- and post-test; four of the five students
with the best results had an "attention level"
measure higher than the sample average.
We tried to use this "attention level" measure again
in the next experimental study, but we noted some problems:
the measure also took into account pages with very long times,
on which students stopped to reorganize
their notes or to write their reports. These very long times
spent on a few pages inflate the average. So we decided to use
another measure, which distinguishes between pages that were
really read and pages that were only "passage ways".
By convention, we considered as "passage ways" those pages
on which students stopped for less than 5 seconds. This is obviously
only a convention, because it is impossible to establish whether
students really read the other pages, but it is evident that one
cannot really read a page of written information in
less than 5 seconds. We calculated the percentage of
"passage ways" over the total number of pages visited by
students and used this measure in analyzing the data.
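Both measures can be computed directly from the per-page navigation records. The sketch below uses an invented record for a single dyad (page labels and times are hypothetical).

# Hypothetical per-page record for one dyad: (page, seconds spent).
record = [("p1", 2.0), ("p2", 40.0), ("p3", 3.5), ("p4", 65.0),
          ("p5", 1.0), ("p2", 12.0)]

total_time = sum(t for _page, t in record)
n_visits = len(record)

# "Attention level": average time per page visited (the first measure).
attention_level = total_time / n_visits

# "Passage way" percentage: visits shorter than 5 seconds (the second measure).
passage_ways = sum(1 for _page, t in record if t < 5)
passage_way_pct = 100 * passage_ways / n_visits

print(f"attention level: {attention_level:.1f} s per page")  # 20.6
print(f"passage ways: {passage_way_pct:.0f}% of visits")     # 50%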
We also analyzed navigation records in order to find different
exploration strategies, i.e., which pages students decided to
visit in order to solve their task. We found different strategies,
such as gathering all the information of the same kind (for
example entering the town halls of the three towns, then all the
archives, then all the libraries, and so on) or trying to gather all the
information in one town before going to another. Some students
solved the task mainly by looking at laws and people's opinions;
others gave more prominence to the scientific information provided by
books in the library.
In order to identify which navigation strategies give better results
in learning through hypertext, we analyzed the navigation records
of the dyads containing the 12 students who showed no
pre-post test change and those of the 12 students who showed
a great improvement. We found differences in exploratory
strategies. The most important difference between the two groups
concerns the number of pages on which students stopped for less than
5 seconds, the pages we called "passage ways". A very high
percentage of pages visited for less than 5 seconds indicates that
subjects do not read the information, but merely go here and there
without a real information retrieval strategy. Subjects with a low
score increment between pre- and post-test have a higher percentage
of pages visited for less than 5 seconds than subjects with a high
increment. Moreover, they usually consult fewer of the pages with more
written information and tend to move in and out of hypertext pages:
they often re-enter the same page many times for a few seconds.
It is interesting to observe that this navigation strategy is
not related to reading comprehension ability.
Discussion and Conclusion
This study indicated that students working with the hypermedia had
better results, in terms of knowledge acquisition and the ability to
connect concepts, than students using printed materials. This
result encourages the use of hypermedia in education, especially
for complex and interrelated topics, which can benefit greatly
from the associative structure of hypertext.
One of the most important problems in the use of hypermedia is
the well-known problem of 'getting lost' in a knowledge base.
In my opinion, being lost does not only mean not knowing where one is
or where to go next, but also not knowing what kind of information
is needed to solve the task and how to get it. In an unfamiliar
domain, learners can become confused and may navigate without any
real awareness of the choices they are making. Many researchers have
begun to consider the need to complement simple hypertext systems
with tools that support learners' choices and encourage involvement.
In a preliminary analysis we found that students with a very low
pre-test score did not improve on the post-test. We observed that
they seemed unable to choose what information to look for
and where to find it. One of the most important problems in education
is to stimulate learning in less able or less motivated students as well.
We therefore developed a different version of the hypermedia Ecoland, which
provides methodological support, guiding students in their exploration
activities and information retrieval.
We obtained an unexpected and interesting result: the guided version
helps students solve their comparison task better, but knowledge
acquisition scores increase more when the hypermedia is used
in a free way.
In order to understand what affects students' results, we
began to analyze the navigation records. We have only preliminary
results, showing that "zapping" strategies (going here
and there, stopping on each page for a short time, as in television
zapping) do not lead students to good results. It is important
to note that navigation strategies are not the main cause of success
or failure; in fact, the two members of a dyad, who of course use
the same strategy, do not obtain the same results. We need
richer information about the students' use of the hypermedia. In a
future study we intend to analyze verbal interaction protocols
of dyads using Ecoland, also taking into account differences in
exploratory strategies between students using the guided or unguided
version of Ecoland.
EVALUATING COMPLEX LEARNER-HYPERMEDIA INTERACTION:
WHAT CRITERIA FOR WHAT TASKS?
André Tricot (1) and Jean-Paul Coste (2)
(1) CREPCO-CNRS, University of Provence, 29 avenue R. Schuman,
13621 Aix en Provence Cedex, France
(2) Equipe Hermès, University of Provence, 3 place Victor
Hugo, 13331 Marseille Cedex, France.
Summary
In this paper we address the issue of criteria definition when
analyzing learner-hypermedia interaction (LHI). So far studies
on hypermedia usability have used simple information search tasks
where the subject's goal corresponds to a small subset of nodes
and for which simple dependent measures can be used. However,
hypermedia-based learning tasks often involve more complex interactions,
for which the usual criteria are no longer relevant. We suggest
that, as for any human behavior, the description and evaluation of LHI
should refer to a psychologically relevant model, i.e. a model
of the task performed by the learner. To illustrate that point,
we describe an experiment in which university students were asked
to use a very large hypermedia database in order to perform a
complex learning task. We propose several methods to characterize
learner-hypermedia interactions in this situation, and we present
some outcomes of our analyses. We conclude that in order to understand
the potential of hypermedia for learning, comprehensive activity
models related to different learning tasks or objectives are needed.
However, such models are not yet available.
Introduction
In order to assess the potential of hypermedia applications for
education, it is important to build up appropriate observation
methods. In this paper we address the issue of criteria definition
when analyzing learner-hypermedia interaction (LHI).
Hypermedia systems may be used for a wide range of tasks, where
a task may be defined as a goal to be achieved through a series of
actions within a certain environment (in this case a computer
system). So far, most empirical studies on hypermedia usage have
used simple information search tasks in which:
- The subject has to answer a small number of questions, with
little or no relation between them.
- The number of relevant nodes is rather small: 1 or 2 relevant
nodes per question, sometimes 5 or 6 for questions labelled "judgement"
or "synthesis".
- The systems are themselves very simple: A few dozen nodes
at most.
In these studies, the criteria used to evaluate the subjects'
performance are often simple quantitative measures:
- Recall: Did the subject open the relevant node(s)?
- Precision: Did the subject disregard irrelevant node(s)?
- Economy: Did the subject use the shortest path to reach
a target node? Did the subject return several times to a given
node (looping)?
(It should be noted that these criteria are drawn directly from
research on automatic information retrieval rather than from models
of human performance.)
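The following is a minimal computational sketch of these criteria; the sequence of opened nodes and the set of relevant nodes are invented for illustration.

# Hypothetical data for one search question: the nodes a subject opened
# (in order) and the set of nodes considered relevant for that question.
opened = ["index", "n4", "n9", "n4", "n17", "n9"]
relevant = {"n4", "n17"}

opened_set = set(opened)
recall = len(opened_set & relevant) / len(relevant)
precision = len(opened_set & relevant) / len(opened_set)

# One possible economy indicator: the number of repeated visits (loops).
loops = len(opened) - len(opened_set)

print(f"recall = {recall:.2f}, precision = {precision:.2f}, loops = {loops}")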
However, real-life learning tasks involve more complex interactions
between the learner and the system. The learner's goal may be
more ambitious than simple information retrieval (e.g., learn
a complex set of related concepts in a given content area), and
the goal-node correspondence may be less well defined. In these
situations the simple quantitative criteria listed above are no
longer relevant. Consequently, an attempt must be made to define
relevant methods to characterize the learner's activity.
In order to explore this issue we designed an experiment in which
university students were asked to use a large hypermedia CAL system
in order to perform a complex learning task. In this paper we
will describe this study, focusing on the analysis of students'
interaction protocols.
Method
The content area studied in this experiment was physics and the
specific topic was wave propagation. A major instructional problem
in this area is to help university students to build up different
forms of cognitive representations for a given phenomenon (e.g.,
shift from a physical to a vectorial representation). A hypermedia-CAL
database was designed in order to provide students with multiple
representations (including dynamic simulations) of wave propagation
phenomena. The database was implemented as a set of Hypercard
stacks and included more than 1300 nodes.
The subjects were senior students in engineering. The experiment
took place as a 5-hour lab session during which the subjects were
grouped by pairs and asked to study wave propagation problems.
At the beginning of the session the subjects were given a booklet
containing background information (mainly equations), a user manual
for the hypermedia system and a series of questions to be answered.
Answers could be either explicit in the system or inferable from
system information. Questions and relations between questions
were so complex that subjects could not apply a sequential problem-solving
strategy. Instead they had to hierarchically organize the problem
space into a global representation of the dynamic states of a
physical space and local representations associated with each
question.
For the purpose of the experiment a subset of 25 questions was
used. For these questions thirty-two nodes spread across the system
were considered directly relevant.
Analyzing subject-hypermedia interactions
We decided not to consider any search strategy as more or less
relevant in absolute terms. Instead we studied the relationships
between search patterns and learning outcomes (i.e., students'
ability to answer the questions). Overall, the students managed
to provide acceptable answers in about two thirds of the cases,
which is a first indication that they managed to get information
out of the system. Following are the main observations concerning
interaction protocols.
Orientation and navigation in the system.
First, we observed that a great number of different routes were
used across questions and students. Overall, the routes were not
the most "economic" ones: only 36% of the selected nodes
were directly relevant. The relevant nodes were selected 5 times
on average. Moreover orientation nodes (i.e., menus, indexes,
tables of contents) represented 35.8% of the opened nodes.
Some subjects used a form of "surface navigation", i.e.
they repeatedly went back to orientation nodes after selecting
just one or two content nodes. In other words, they did not go
"deeply" into a series of content nodes. This type of
navigation pattern might have an orientation function, a function
that may not be fulfilled by the so-called "orientation stacks"
included in the system (i.e., series of cards which contain information
about the system's organization).
Node selection and performance.
There was a negative relation between the total number of relevant
nodes opened and subjects' performance: Subjects who provided
correct answers opened fewer relevant nodes than the other subjects.
In fact we found that the production of a correct answer required
the subject to open relevant nodes several times; however opening
relevant nodes was no guarantee of a correct answer.
In order to explain this apparent paradox, we computed the average
number of relevant selections (RS) for the 32 task-relevant nodes.
RS was defined as the average number of selections of a relevant
node for the subset of subjects who provided correct answers,
minus the same measure for the subset of subjects who did not
answer correctly. The average RS was 3.17 and tended to be higher
for subjects who opened many orientation nodes.
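As a minimal sketch of how RS can be computed for a single relevant node (the selection counts below are invented), the measure is the difference between the mean number of selections in the two groups of subjects; this value is then averaged over the 32 task-relevant nodes.

# Hypothetical selection counts for one relevant node: how many times
# each subject opened it, split by whether the subject answered correctly.
selections_correct   = [6, 4, 5]   # subjects who gave a correct answer
selections_incorrect = [1, 2, 0]   # subjects who did not

def mean(values):
    return sum(values) / len(values)

# RS for this node: mean selections by "correct" subjects minus mean
# selections by "incorrect" subjects.
rs = mean(selections_correct) - mean(selections_incorrect)
print(f"RS = {rs:.2f}")  # 4.00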
Discussion
The objectives of this study were to evaluate the hypermedia-CAL
application and to analyze the navigation of students confronted
with a complex learning task. We believe that this type of situation
(students learning through interaction with a complex information
system) will become more and more frequent as the use of self-instruction
systems (e.g., hypermedia libraries) develops.
We observed that most subjects went through the task successfully.
However, there was no strict correspondence between opening relevant
nodes and correct answers. In fact, looking at relevant information
seems to be a necessary but insufficient condition. We also found
that opening relevant nodes several times was associated with
correct answers. Thus, it seems that redundancy in the learner's
routes (i.e., "looping") can have a positive effect,
although it is sometimes interpreted as a negative symptom.
Our experiment also showed that interaction protocols can be interpreted
only in light of a model of the task. There have been many references
to task models in the hypermedia literature, but frequently researchers
use "ideal" task models, i.e. models of the task as
it would be completed by a perfectly efficient system. Ideal task
models are relevant only in cognitively simple situations where
the subject can reach a minimal level of efficiency (speed, accuracy,
performance, etc.).
More generally it is important to consider both a formal model
of a task, which predicts the most efficient way to perform the
task, and a model of the activity, which takes into account the
subjective complexity of the task and the constraints of the human
information processing system. However, in the case of learner-hypermedia
interactions, this type of cognitive model remains to be developed.