The feasibility of capturing learner interactions based on logs informed by eye-tracking and remote observation studies

Two small studies, one an eye-tracking study and the other a remote observation study, were conducted to investigate ways of identifying two kinds of online learner interaction: users flicking through web pages in a “browsing” action, and users engaging with the content of a page in a “learning” action. The video data from the four participants in the two studies, who used the OpenLearn open educational resource materials, offer some evidence for differentiating between ‘browsing’ and ‘learning’. Further analysis of the data considered possible ways of identifying similar browsing and learning actions from automatic user logs. This research provides a specification for researching the pedagogical value of capturing logs of user interactions and transforming them into external forms of representation. The paper examines the feasibility and challenges of capturing learner interactions, giving examples of external representations such as sequence flow charts, timelines, and tables of logs. The objective user information that these represent offers potential for understanding user interactions, both to aid design and to improve feedback, which means that they should be given greater consideration alongside other, more subjective ways of researching user experience.


Introduction
OpenLearn provides an open educational resource that users can use in any way they wish. Such flexibility in accessing learning materials raises questions as to how different users (differing in age, in location, in qualifications, etc.) with different purposes (information-seeking, studying a course, etc.) interact with online resources. It is a challenge to differentiate user actions from their interactions and, in particular, to determine whether users are 'learning' from their interactions with the site.
Learner-computer interaction may be indicated by a variety of sources: logs, interviews, observations, recorded screen activities, etc. (e.g. San Diego & Aczel, 2007; Sheard, Ceddia, & Hurst, 2003). For our purposes we need to select appropriate techniques that can be used in researching online computer interactions and apply them in laboratory-based observation studies, remote observation, and indirect observation based on user logs. The structure we are following is an experimental approach that brings out methodological issues and provides further specifications based on experience.
An initial desk study of tools for capture of interaction data with web sites and user trials to assess interaction models focused on using eye-tracking and other observation as appropriate. This research considered digital technologies such as a 'non-intrusive' eye-tracking device, remote desktop sharing tool, screen capture software, digital video cameras, webcams, etc., to find out whether integrated analysis of the data from these technologies can provide information about 'learning' interactions in the context of OpenLearn.
As a further stage, we look at how to move from the information gathered in our studies to using the data we have about all of our users, rather than just the observed users. The observations from the eye-tracking study and from the remote observation study are therefore examined to produce specifications for the indirect study of learner interactions, in which logged actions in the OpenLearn systems are examined to classify user actions. Different kinds of visualisations using recorded logs (e.g. site visited, time of visits, and so on) are illustrated; these visualisations may help to interpret and differentiate models of learning interaction within OpenLearn.

Researching online learner interactions
Researching human-computer interaction sometimes takes place in a laboratory setting because the studies require a controlled procedure and set-up, which can be difficult to prepare and carry out in natural settings (Hall, 2000). Researchers may also find capturing and analysing computer actions challenging when using conventional 'note-taking' observation techniques (Foster, 1996; Pirie, 1996). An alternative technique is the use of video, which allows the capture of simultaneous actions using multiple videos, repeated replay, playing at different speeds, post-analysis, multiple observers, etc. (Powell, Francisco, & Maher, 2003; Roschelle, 2000). This technique, coupled with the think-aloud technique (Ericsson and Simon, 1984), makes a powerful research method.
Researchers also have an opportunity to explore what 'new' technologies (e.g. screen capture, digital video cameras, eye-tracking) offer for capturing computer-learning interactions. The technique amalgamating eye-tracking and the think-aloud protocol has provided methodological opportunities for describing learning interactions (e.g. San Diego & Aczel, 2007). For example, it is possible to describe a user's interaction while reading through a body of text, represented by 'eye-marks' embedded on that text and generated by the eye-tracking device (San Diego & Aczel, 2007). While it is possible to bring users into a lab for this kind of study, this can be problematic when researching users located at a distance.
Researching users in their natural setting is less intrusive than studying them in a laboratory (Hammersley & Atkinson, 1995). There are other 'new' technologies that can help in capturing users' learning-interactions at a distance. Thompson (2003), in a study conducted to test the usability of a library website with the aim to improve its design, has attempted to use technologies such as screen capture software, online conferencing software, and remote computer desktop-sharing application software to capture and record remote computer-interactions. Hosein, Aczel, Clow and Richardson (2007) conducted a similar study but in the context of looking at learning with mathematical software. These previous studies have suggested that remotely capturing users' interaction can reduce methodological problems associated with the setting and researcher's presence in the same location.
Another approach to capturing learner interactions is indirect observation. For example, in a Learning Activity Management System (Dalziel, 2003), a user's interaction with an online activity is captured and presented in a 'learner monitoring environment'. This monitoring environment pictorially represents different activities as 'blocks', and the colours of the blocks represent the users' logged completion of an activity. This kind of feedback has been found to be useful for teachers as it shows them the users' current state in the activity sequence (Cameron, 2007). However, it fails to represent the time users have spent within each activity. Teachers can also benefit from modelling interaction in terms of time (Laurillard, 2006), as users face time constraints in completing learning activities. Sheard, Ceddia and Hurst (2003) have offered an alternative way of researching online learning interactions. They attempted to provide educators with some information about online learning interactions in order to inform the development and improvement of online course materials. They captured the users' frequency of use of online resources designed to support learning, and the time users spent with these resources. However, the study did not attempt to transform these logs into different representational forms to help teachers and researchers consider different kinds of learning interactions based on these logs. The technique of capturing and recording user interaction based on logs potentially offers a less obtrusive technique than laboratory-based and remote observation techniques.
A novel consideration is to look at the possibility of determining different kinds of users' actions by matching observations based on eye-tracking with observations that can be made remotely, and at the extent to which the pattern of actions can also be represented by user logs (Ivory and Hearst, 2001). Previous research has informed us of some of the methodological advantages and pedagogical value of an approach to capturing user interaction based on videos and logs. However, it is a challenge to explore the extent to which it is possible, based on these logs, to distinguish between users flicking through the pages in a "browsing" action and users more engaged in a "learning" action. Being able to meet this challenge can help i) reach the majority of online users who may not be available for user studies, ii) provide a relatively 'non-obtrusive' data capture, and iii) inform course design and linked resources.
The study that is carried out involves users of the OpenLearn site. The OpenLearn site offers free access to Open University courses online. The next section presents some of the examples and describes the courses and the resources available in OpenLearn. The particular tasks given to users during the study are also outlined.

The OpenLearn 'Unit' and the tasks for the feasibility study
OpenLearn gives free access to some of the Open University course materials. The course materials are referred to as 'learning units'. Each unit can be accessed through a web browser, and contains typical components such as the unit outline, the time expected to finish the unit, and the 'level' of the unit (i.e. introductory, intermediate or advanced) (Figure 1). Users can view the content of a unit in a sequence of 'pages' presented in the web browser window (Figure 2). Within the web browser, some of the content of a unit may include resources available either embedded or as a text hyperlink, which can be viewed i) as a PDF or Word document (e.g. Figure 3), ii) as a video or audio resource, or iii) as another link to a different resource.
In this research, a predefined concept of 'browsing' interaction takes place when users go through a page very quickly without thoroughly reading its content, whereas 'learning' interaction takes place when users read the content of the page. Three tasks were designed to investigate whether it is possible to characterise the difference between these two types of interaction. The first task is to find out what users do when studying a course unit. The learning unit used to investigate this was selected to be 'generic', of general interest, and not to require any background knowledge in order for users to go through the content. The unit chosen for this task is the introduction to 'Global Warming', as this unit fits these criteria for the users participating in the study. The feasibility study, which involved asking users to complete all three tasks, was designed as an hour-and-a-half session, as agreed with the users. The chosen unit is designed to take 5 hours, but only the first few pages of the unit are used in the study. The first task is: 'Study' task - Users were asked to go through the Global Warming unit and to complete the first set of exercises.
To compare what users do when they are 'browsing' the second task is: 'Browse' task -Users were asked to browse through some of the contents of OpenLearn.
And, as a combination of the two kinds of task, the third task is: 'Study-choice' task -Users were asked to study a certain unit of their choice.
The tasks were designed to investigate if it is possible to differentiate between 'browsing' and 'learning' interactions. The tasks were refined and tested with two users in a pilot study.

Eye-tracking and remote observation studies
The aim of this research is to investigate ways to determine different kinds of users' actions by matching observations based on eye-tracking and remote observation. It aims to identify differences between actions that can be described as browsing and as learning. Consequently, this research aims to provide some specifications for conducting indirect study of users' interactions based on logs, and to be able to identify browsing and learning actions based on these logs. Logged actions can be the frequency of visits to, and the time spent on, every page or resource.
Two studies, an eye-tracking study and a remote observation study, were designed to investigate the extent to which logged actions can correspond to the two types of action. The investigation focuses on whether observed actions from the eye-tracking study and observed actions from the remote observation study can be said to correspond to some patterns.
The technologies, the observation techniques, and the kinds of data corresponding to each of the studies are given in Table 1.
Eye-tracking techniques have been used by many to look at learner interactions (e.g. Gluck, 1999; San Diego & Aczel, 2007; Hansen, Hauland, & Andersen, 2001; Yoon and Narayanan, 2004). In this research, the eye-tracking study combined different techniques to capture what users say, do and see, and relates these to 'logged data' of users' interaction. This set of techniques was established and tested as part of a larger piece of research by the first author (see San Diego, 2008). Figure 4 shows an example of two screens recorded in the eye-tracking study. The researcher observes the actions in real time in a laboratory. User actions and utterances (user video on the left of Figure 4) and screen activity with eye-tracking data (eye-tracking video on the right of Figure 4) were recorded and can be played in synchronisation. In the figure, the eye-tracking video shows 'saccades' as lines (the path the eye took across the screen), 'fixations' as blobs (the places where the eye dwelled on a part of the screen), and mouse clicks as 'x' marks (the locations of clicks). Examples of logged data are given in Table 5 later in this paper.

Figure 4: A screenshot of the videos recorded in the eye-tracking study
Remote observation is still in its infancy, as the technology for capturing real-time video of a remote desktop via the internet, and sending this video signal back to the remote observer for recording 'just-in-time', remains limited (Ivory & Hearst, 2001). Thus, the procedure and technologies for the remote observation study were tested and trialled several times with two colleagues. The kinds of data captured in this study have some similarities to those of the eye-tracking study. In the remote observation, users' gazes cannot be recorded because the home computers lack the cameras that detect users' eye movements; and, due to a limitation of the system on which OpenLearn was developed (i.e. Moodle™, an open-source course management system), logged mouse actions cannot be recorded. Using a remote desktop-sharing technology and screen capture software, it was possible to remotely capture what users say and do. Figure 5 shows a remote user video (left) recorded onto the researcher's computer. The remote video and audio signals are transmitted via the internet and recorded on the researcher's computer using Camtasia™. The researcher (right) watched the user's interaction in 'near-synchronous' time during the observation session. The user is not in the same room as the researcher.
Four users, who were aware of the OpenLearn site but did not have much experience with any of the units, participated in the study. Two users were assigned to the eye-tracking study and the other two to the remote observation study. Their activities were recorded as described above and their logs were captured. In the two studies, the instructions given for completing the tasks varied slightly due to the differences in the technologies used and the procedures required for capturing data.

Differing actions when 'browsing' and 'learning'
The videos from the two studies were watched several times in order to identify which kinds of actions correspond to 'browsing' and 'learning'. After repeated viewing, a pattern seemed to emerge from relating the kinds of logged actions to users' utterances and actions in both studies. The analysis of the videos suggests a difference between the interactions when users are asked to complete the browsing task and when users are asked to complete the study-choice task. From the comparison of these two tasks, the variations in users' actions given in Table 2 below were derived.
Table 2: Variations in users' actions in the browsing and study-choice tasks

Browsing task - Users were asked to browse through some of the contents of OpenLearn.
In both the eye-tracking and remote studies, the users: move the mouse pointer fast for almost the entire time of the task; scroll the screen display fast down and up repeatedly, slowing down on several occasions throughout the task; choose random pages within a unit; open resources at random.
Evidence from eye-tracking shows: 'ballistic' eye movements (i.e. saccades moving left to right, up to down), as the text being read out loud is found in different sentences or different paragraphs within a page of a certain unit. Eye movements do not follow a specific pattern of fixation on contents within a webpage.

Study-choice task - Users were asked to study a certain unit of their choice.
In both the eye-tracking and remote studies, the users: move the mouse pointer fast for the first few seconds; (sometimes) drag the mouse pointer from left to right along the text that is being read out loud; scroll the screen display down and up slowly, with few occasions of scrolling up; choose a page following a certain sequence or visit related pages; open resources as they are read out.
Evidence from eye-tracking shows: an indication of 'reading' may be represented by a 'worm-like' saccadic movement from left to right (e.g. Figure 4).

As shown in Table 2, the 'browsing' action happens when the user, for example, is choosing content that is of interest to him or her. This is denoted by fast scrolling, fast mouse movement, random clicking, random word reading, random page flicking, and random opening of resources, coupled with 'ballistic' eye movement. The 'learning' action happens when the user, for example, has chosen a certain part to read through thoroughly. This action is denoted by slow scrolling, slow mouse movement (possibly from left to right along a line of text), clicking of pages in sequence, opening of resources in sequence, and 'worm-like' movement of the saccades. These actions are further described in Table  below. The 'study-choice' task illustrated both: an indication of the 'browsing' action described above when the users were choosing content, followed by a shift to a 'learning' action when the user decided to go through the chosen content. These forms of interaction may represent some of the typical actions that a user engages in when they are 'browsing' and 'learning' a unit in OpenLearn.
An example extract is given below of a user's talk while 'browsing'.
The user looked at the range of topics:
(00:00:46:15) "Oh I'm torn between going for something that interest me and going for something that I think might be of use to me... I'll combine the two and go for 'Science and Nature'"
The user then scrolled up and down the list of sub-topics, pausing at some of them:
(00:01:01:18) "Nutrition. Vitamins and Minerals', 'Evolution through Natural Selection'... that looks interesting..."
The user looked at another sub-topic, then decided to look at the first choice:
(00:01:30:00) "Ok... I'm tempted by both of them but I know more about earthquakes than I do about evolution... So I'm going to look at evolution first."
After deriving some descriptors for actions associated with browsing and learning, the logs were scrutinised and analysed. Visualisations of models that can be interpreted as browsing and learning are given in the next section.

Specifying kinds and models of browsing and learning actions based on logs
The OpenLearn Units are created in a platform that can automatically generate logged actions such as the time when a user visited a resource, the kind of action (i.e. viewing the content of a certain Unit, etc.), and the label given to a resource. Table 3 is an example which shows a log of a user's interaction in the browsing task for a certain Unit, 'Teaching for good behaviour'. The description in the last (right) column has been added to indicate the Page number and the resources available within each Page.
From Table 3, it is easy to extract the frequency of visits to every page. However, the platform does not include in the log the resources that users open. Also, even with careful inspection of the logged time, it is difficult to determine accurately the amount of time spent on each page, as the time is not recorded in sufficient detail.
By chronologically examining the sequence of pages visited, the log shows that the user did not visit the pages in the designed sequence. In the case shown in the table, it is possible to identify a browsing interaction from the logged actions. For example, the user interactions at 12:57 can be denoted as browsing because seven different pages were visited within one minute. Each of the pages consisted of text and resources that cannot be read within one minute. On the other hand, the log of time and pages visited between 12:58 and 13:02 is indicative of a learning action on Page 1, because three minutes is a reasonable time to read the content of that page. However, it may be argued that the user could have had their attention elsewhere. The eye-tracking study records other kinds of logs that can reduce this limitation. For example, mouse movement logs and slow scrolling from top to bottom can indicate that the user is present in front of the computer, and that the logged action may indicate that the user is going through the content of a page.
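The reasoning above can be sketched in code. The following is a minimal illustration rather than the procedure used in the study: it assumes a log reduced to (time, page) pairs with minute-level timestamps, and the function names, the sample entries, and the two thresholds (three or more distinct pages in one logged minute as 'browsing'; three or more minutes before the next logged visit as 'learning') are all hypothetical choices made for this example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log entries (time logged to the minute, page visited).
# They mirror the pattern described in the text, not the study's data.
LOG = [
    ("12:57", "Page 1"), ("12:57", "Page 2"), ("12:57", "Page 3"),
    ("12:57", "Page 4"), ("12:57", "Page 5"), ("12:57", "Page 6"),
    ("12:57", "Page 7"), ("12:58", "Page 1"), ("13:02", "Page 2"),
]


def minutes_between(start, end):
    """Minutes from start to end, both 'HH:MM' on the same day (assumption)."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


def classify(log, browse_pages=3, learn_minutes=3):
    """Label each logged visit as 'browsing', 'learning' or 'unclear'.

    browse_pages and learn_minutes are illustrative thresholds only.
    """
    pages_in_minute = defaultdict(set)
    for stamp, page in log:
        pages_in_minute[stamp].add(page)

    labels = []
    for index, (stamp, page) in enumerate(log):
        has_next = index + 1 < len(log)
        if len(pages_in_minute[stamp]) >= browse_pages:
            label = "browsing"  # many distinct pages within one logged minute
        elif has_next and minutes_between(stamp, log[index + 1][0]) >= learn_minutes:
            label = "learning"  # long enough on the page to plausibly read it
        else:
            label = "unclear"   # too short, or no later entry to measure against
        labels.append((stamp, page, label))
    return labels


labels = classify(LOG)
for row in labels:
    print(*row)
```

A label of 'learning' here only means that the logged time is long enough for reading; as noted above, it cannot show that the user's attention was on the page.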
The rest of this section shows some possible visualisations of how logs can be transformed into a representation that can help characterise whether 'browsing' and 'learning' action are happening. It also illustrates the kinds of browsing and learning actions taking place when users are studying by analysing the four users' logged actions in completing the 'study task' (discussed later in this section). First, the sequence of page visits based on logs is illustrated as to how this can also be used to examine different sets of actions.
To illustrate the value of a sequence visualisation based on logs, an example of logged actions from one of the users completing the study task is used below. The study task is to study the Global Warming Unit and complete the 'first exercise'. Like other learning units in OpenLearn, the Global Warming Unit is likely to have been designed with the expectation that users would go through the unit following a certain sequence. First, users are expected to read Page 1, the introduction and the learning outcomes of the unit; then to go through Page 2; and so on. Table 4 shows the sequence that users are expected to follow. The first (left) column of Table 4 shows the threshold time, in minutes, which is the amount of time that users are expected to spend going through the content of each resource, from Page 1 up to the 'first exercise' of the Unit. The middle column gives the resource and the third column shows the corresponding URL. It is expected that users may take from 15 to 25 minutes to go through this part of the unit.

Table 4 (columns: Threshold (in minutes); Resource; URL)

The sequence above can be represented in a flow diagram as in Figure 6. This kind of visualisation helps to make clear the kinds of resources and the amount of text, pictures, diagrams, video, documents, etc. that a unit consists of. The number of visits and the amount of time spent can also be embedded on each page (see for example Page 2 and Page 3 in Figure 6, which carry annotations such as 'n = 5; t = 2.3 min' and 'n = 2; t = 1 min', where n represents the frequency of visits and t represents the amount of time spent). The accumulated number of visits to every page across a set of users may indicate the least popular and the most popular pages visited by users. An alternative visualisation may help and is presented in the next figure. It may also be valuable to visualise the sequence followed by a user to help examine the degree of match, or mismatch, between a perceived pedagogic sequence and the actual sequence followed by a user.
A possibly better visualisation than Figure 6 is a timeline. An example of a timeline is given in Figure 8. Figure 8 shows 'bars' in different colours, where each colour represents a page and the length of the bar represents the proportion of time spent. This representation is better than Figure 6 because it can show both the sequence and the amount of time spent in a more visual way; however, it does not pictorially represent the kinds of resources. Figure 8 is the timeline generated by transforming Table 4 above, which represents the pedagogic sequence designed by a course developer. An extract of an actual user log based on the eye-tracking system is given in Table 5 below. By aggregating the sequences followed by a number of users, it may be possible to inform the course designer about how the sequence should be redesigned. Figure 9 shows the actual sequences followed by the four users in completing the study task. In Figure 9, User 2 can be discounted as this user withdrew from doing the study task but agreed to browse through the content.
User 2 withdrew when asked to do an exercise. As the figure shows, User 2 spent most of the time on the 'Unit Page' but also browsed through and looked at the exercise in Page 3.
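The annotations and timelines described above (n and t per page, and bars whose length is proportional to time spent) can be derived from a time-stamped log. The following is a minimal sketch under stated assumptions: the log entries, the session end time, and the function names (timeline, summarise, render) are hypothetical, timestamps are assumed to be recorded to the second, and each visit is assumed to last until the next logged visit.

```python
from datetime import datetime

# Hypothetical log with second-level timestamps (illustrative values only).
LOG = [
    ("13:00:00", "Unit Page"),
    ("13:00:20", "Page 1"),
    ("13:00:40", "Page 2"),
    ("13:01:00", "Page 3"),
    ("13:04:00", "Page 2"),
    ("13:05:00", "Page 3"),
]
SESSION_END = "13:08:00"  # assumed end of the observed session


def to_seconds(stamp):
    """Convert 'HH:MM:SS' into seconds since midnight."""
    t = datetime.strptime(stamp, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second


def timeline(log, session_end):
    """Return (page, duration in seconds) segments in visit order."""
    stamps = [to_seconds(s) for s, _ in log] + [to_seconds(session_end)]
    return [(page, stamps[i + 1] - stamps[i]) for i, (_, page) in enumerate(log)]


def summarise(segments):
    """Return {page: (n, t)}: number of visits and total minutes per page."""
    summary = {}
    for page, duration in segments:
        n, t = summary.get(page, (0, 0.0))
        summary[page] = (n + 1, t + duration / 60)
    return summary


def render(segments, seconds_per_char=20):
    """Draw one text bar per visit; bar length is proportional to time spent."""
    lines = []
    for page, duration in segments:
        bar = "#" * max(1, round(duration / seconds_per_char))
        lines.append(f"{page:<10}{bar}")
    return "\n".join(lines)


segments = timeline(LOG, SESSION_END)
summary = summarise(segments)
print(render(segments))
for page, (n, t) in summary.items():
    print(f"{page}: n = {n}; t = {t:.1f} min")
```

Summing the per-page values across several users would give the accumulated visit counts mentioned above, provided that the platform records time in sufficient detail.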
Inspecting the timelines in Figure 9, Page 3 appears several times, as the expected sequence suggests. However, careful inspection of the proportions of the bars shows that users did not spend much time on other pages before their first visit to Page 3. This may suggest that Users 1, 3 and 4 browsed through the first few pages, because the length of each bar is less than a minute (in the legend of Figure 9, the width of the bar represents a minute). This length of time is not enough to go through the content of each of the pages. Triangulating logs with other forms of research data can help interpret and validate the kinds of actions found. For example, in Figure 9 above, the three users were found to be browsing the Pages because they were looking for the exercise. The talk data, and the scrolling to the area of the screen where the exercise can be found, validate this.
User 4 visited the Unit Page and quickly clicked the 'Introduction' link, which brought him to Page 1. He read out random words on Page 1, then clicked on Page 2, then Page 3, and then he said, (00:00:46:15) "I would like to see the exercise…Err… Ok… I know what I am going to look for." Users 1 and 3 performed similar actions. The three users looked for the page where the exercise can be found. After they had found the exercise, the bars became longer. This indicates more time spent on each of the pages. The length of time is commensurate with the amount of time users may spend looking at the pages. However, it can be argued that this visualisation alone may not precisely show that 'learning' is taking place. A way to reduce this limitation is to triangulate it with other forms of data such as logs of mouse clicks, mouse movement, and scrolling of the display. Additionally, other research techniques, e.g. survey questionnaires and online interview questions, may further help interpret the data based on logs.
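The degree of match, or mismatch, between a designed sequence and the sequence a user actually followed could also be expressed numerically. The sketch below is one possible way of doing so and is not a measure used in this study: it applies the similarity ratio of Python's standard difflib.SequenceMatcher to two lists of page labels. The two sequences and the function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical sequences of page labels (illustrative only).
EXPECTED = ["Unit Page", "Page 1", "Page 2", "Page 3"]  # designed sequence
ACTUAL = ["Unit Page", "Page 3", "Page 1", "Page 3"]    # one user's log


def sequence_match(expected, actual):
    """Similarity between 0 (nothing in common) and 1 (identical order)."""
    return SequenceMatcher(None, expected, actual, autojunk=False).ratio()


def skipped_pages(expected, actual):
    """Pages in the designed sequence that the user never visited."""
    visited = set(actual)
    return [page for page in expected if page not in visited]


match = sequence_match(EXPECTED, ACTUAL)
skipped = skipped_pages(EXPECTED, ACTUAL)
print(f"match = {match:.2f}; skipped = {skipped}")
```

A low ratio shows only that the order differed from the design; the reason for the difference (for example, looking for the exercise) still has to come from the talk and video data.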

Concluding thoughts and recommendations for future work
The purpose of this research is to investigate the possibility of classifying different online interactions from an eye-tracking study and a remote observation study. This research offers some insights for describing the difference between browsing and learning actions by examining visualisations of recorded user activity.
Although digital technologies may offer possibilities for researching learner interactions, there are many challenges that need to be addressed in future research. For example, ethical considerations for remote observation studies can be more complicated than those concerning traditional observation. Trust and rapport with participants may not be easy to establish due to the 'physical' absence of the researcher. There are also methodological challenges, such as the interpretation of logs without user video capture; in such cases, recorded website interaction times may not represent actual time on task, as users may not be in front of their screen between the times of different actions.
While others have distinguished a 'learning' action by the frequency and time of visits to online resources (e.g. Sheard et al., 2003), this research has extended this by specifying the possibility of determining differences between different kinds of online interactions. The data from four participants in the two small studies (i.e. an eye-tracking study and a 'remote observation' study) seem to offer evidence for identifying the difference between users flicking through web pages in a "browsing" action and users more engaged in a "learning" action. This research has provided some examples of visualisations that can represent the pedagogic sequence of learner interaction with course material as perceived by course designers, alongside a visual representation of actual user interaction based on logs. The analysis of user interactions, based on visualisations of logs triangulated with users' utterances, suggests that although an OpenLearn unit may have been designed to follow a certain pedagogic sequence, users may not follow the same sequence. For example, the two users in this research 'jumped' to the webpage where the assessment questions are. It seemed that the users performed some sort of 'answer searching' strategy. When users are not constrained by specific instructions in performing interactions, a mismatch between these visualisations might provide a way to adapt the design of the sequence of online learning materials.
This research has attempted to illustrate the feasibility of examining, identifying and observing 'learning' and 'browsing' actions based on user logs. While it is possible to classify different kinds of user actions, this research has also considered ways to transform logs from different systems into a visualisation that may be more easily interpretable than the 'raw' text form of user logs. One of the limitations of determining user interactions from logs is that logs alone cannot show whether users are actually looking at the website. This limitation can be reduced if there is a technology that captures sufficient detail in the logs (frequency of webpage visits, start and end time of webpage visits in milliseconds, duration of visits, mouse pointer movement, mouse and keyboard actions, and if possible screen capture of the remote user's computer) to help establish when a user action coincides with the user being present in front of the computer screen. While this does not rule out the possibility that users are not reading the content, the logs can be coupled with estimates of the time users would take to go through a webpage; whenever the logged time goes far beyond the estimated time without any mouse activity (e.g. slow scrolling from top to bottom, clicking, movement from left to right), this is an indicator that the user is not active and that the data may not be valid. Even so, the kinds of interactions based on logs have to be carefully interpreted, and complemented and triangulated with other forms of research data, such as that from surveys, interviews, questionnaires, and remote screen capture.
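The inactivity indicator described above can be sketched as a simple rule. This is an illustration only: the Visit record, the estimated reading times, the sample visits, and the factor of three used to stand for 'far beyond the estimated time' are all assumptions made for the example, not values from the study or fields from the OpenLearn logs.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    page: str
    duration_min: float    # logged time on the page, in minutes
    mouse_events: int = 0  # scrolls, clicks and pointer movements logged


# Hypothetical estimated reading times per page, in minutes.
ESTIMATED_MIN = {"Page 1": 3.0, "Page 2": 5.0}


def possibly_inactive(visit, estimates, factor=3.0):
    """Flag a visit that runs far beyond its estimate with no mouse activity.

    factor is an illustrative threshold, not a value from the study.
    """
    estimate = estimates.get(visit.page)
    if estimate is None:
        return False  # no estimate available, so the visit cannot be judged
    return visit.duration_min > factor * estimate and visit.mouse_events == 0


visits = [
    Visit("Page 1", 2.5, mouse_events=14),   # within the estimate
    Visit("Page 2", 40.0, mouse_events=0),   # very long, no activity
    Visit("Page 2", 40.0, mouse_events=25),  # very long, but active
]
flags = [possibly_inactive(v, ESTIMATED_MIN) for v in visits]
print(flags)
```

A flagged visit is a candidate for exclusion or for checking against other data, not proof that the user was absent.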
Capture of logs would also be improved if webpages could be divided into several sections (i.e. slicing a page of a unit into several parts depending on the kind of resource present), with logs recorded separately for each section. Such detailed log records would also help identify interactions with different kinds of resources and with the elements within a webpage. Interpreting user actions could not only improve our understanding as researchers, but also lead to benefits for users. In the future, systems may be able to change the sequence of pages depending on the pattern of interactions of a majority of users, and feed this information back to designers to help them improve online learning resources.