Assuming that scenes would be visually scanned by chunking information, we partitioned fixation sequences of web page viewers into chunks using isolate gaze point(s) as the delimiter. Fixations were coded in terms of the segments in a
Eyes seldom stay completely still. They continually move even when one tries to fixate one’s gaze on an object because of the tremors, drifts, and microsaccades that occur on a small scale [
During fixation, people closely scan a limited part of the scene they are interested in. They then quickly move their eyes to the next fixation area by a saccade, which momentarily disrupts vision. However, the disruption normally goes unnoticed because our visual system produces continuous transsaccadic perception [
In viewing natural scenes or displays, a chunk continues to grow until interrupted by one or more isolate gaze points resulting from drifting attention or by accident. These do not participate in any fixation. Whatever causes the interruption, we believe that such isolate points serve as chunk delimiters, like the pauses in speech. As a pause can be either short or long, interruptions by isolate points can vary in length. Figure
Two fixations in one chunk (a) and in separate chunks (b).
Granting our conjecture, one may still wonder what particular merits will accrue from the analysis of chunks in lieu of ordinary plain fixation sequences. The expected merits are twofold: separation of between- and within-chunk patterns and extraction of common patterns across records. Neither of these is attainable when dealing with multiple records by heat maps of fixations accumulated with no regard to sequential connections [
Equation (
Although it was not explicitly stated, McCarthy et al. [
By focusing on the frequency of glances as an indication of importance, they disregarded the length of the chunks, that is, the number of fixations within glances. Also disregarded was the shift of glances, that is, between-chunk sequences. To us, both within- and between-chunk patterns seem to contain rich information worthy of investigation. The information can be extracted from partitioned sequences but not from plain ones. In addition, partitioned sequences will be of great value when some AOIs are nested into broader AOIs (see [
For the sake of simplicity, we will focus on the eye movements of web page viewers, and we will assume that the pages are divided into grid-like AOIs, that the fixations are coded in terms of the areas in which they fall, and that chunks are delimited by isolate gaze points.
The distance between two successive fixations indicates how far the viewer's interest shifted; a distance of zero, that is, a looped transition within the same area, represents sustained interest in that area. In our view, a chunk of fixations reflects continuous interest, and a new one begins after a momentary drift of the gaze. It therefore seems natural to expect the distance distribution of the within-chunk shifts to differ, to some extent, from that of the between-chunk shifts.
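The comparison of within- and between-chunk shifts can be sketched as follows. This is only an illustrative sketch, not the procedure actually used in the study: the representation of a chunk as a list of (row, column) grid positions, the function name `shift_distances`, and the convention that a between-chunk shift runs from the last fixation of one chunk to the first fixation of the next are all assumptions made for the example.

```python
from collections import Counter
from math import hypot


def shift_distances(chunks):
    """Tally shift distances within chunks and between consecutive chunks.

    `chunks` is a list of non-empty chunks; each chunk is a list of
    (row, col) grid positions of successive fixations (hypothetical format).
    """
    within, between = Counter(), Counter()
    for i, chunk in enumerate(chunks):
        # Shifts between successive fixations inside the same chunk.
        for (r1, c1), (r2, c2) in zip(chunk, chunk[1:]):
            within[round(hypot(r2 - r1, c2 - c1), 2)] += 1
        # Assumed convention: a between-chunk shift goes from the last
        # fixation of this chunk to the first fixation of the next one.
        if i + 1 < len(chunks):
            (r1, c1), (r2, c2) = chunk[-1], chunks[i + 1][0]
            between[round(hypot(r2 - r1, c2 - c1), 2)] += 1
    return within, between


# Three chunks: a loop, a single fixation, and a three-fixation chunk.
chunks = [[(0, 0), (0, 0)], [(0, 1)], [(1, 1), (1, 2), (3, 2)]]
within, between = shift_distances(chunks)
```

A distance of 0 in `within` corresponds to a loop, and a distance of 1 to a one-block shift; the two tallies can then be compared as distributions.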
The distance analysis explained above exploits information from the cumulative records across all viewers. Hence, it is possible that the results are influenced by some dominant patterns in particular records. If one is interested in sequential regularities often shared among records, frequent sequential pattern mining is useful, as explained below.
Among others, we will employ PrefixSpan, developed by Pei et al. [
Pattern extraction by prefix “
Record | Initial sequences | Patterns prefixed by |
---|---|---|
1 |
|
|
2 |
|
|
3 |
|
|
4 |
|
|
frequent code |
|
|
Codes
For every frequent code, one scans the reduced records, devoid of infrequent codes, for patterns prefixed by the given code. Those found for prefix “
Table
All of the frequent patterns extracted from Table
|
|
|
Ordinarily, one finds too few patterns at a high ms level and too many at a low level to make an interesting analysis. However, once one recognizes the inclusive relations, making use of multiple levels becomes a plausible solution for identifying strongly frequent patterns as opposed to mildly and weakly frequent ones. (See the Appendix for the relation networks among the patterns identified at ms2, ms3, and ms4.)
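The inclusive relation among ms levels can be illustrated with a small prefix-projection miner. The sketch below is a simplification and not the implementation used in this study: it assumes each record is a plain sequence of single codes (the chunk structure is ignored), and the function name `prefixspan` and the four toy records are hypothetical.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern (tuple of codes): support} for single-code sequences."""
    results = {}

    def mine(prefix, projected):
        # Count each code at most once per projected sequence.
        support = defaultdict(int)
        for seq in projected:
            for item in set(seq):
                support[item] += 1
        for item, count in sorted(support.items()):
            if count < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            # Project: keep the suffix after the first occurrence of the code.
            suffixes = [seq[seq.index(item) + 1:] for seq in projected if item in seq]
            mine(pattern, [s for s in suffixes if s])

    mine((), [list(s) for s in sequences])
    return results


# Hypothetical records, one string of single-letter codes per viewer.
records = ["abcd", "acbd", "abd", "bcd"]
by_level = {ms: prefixspan(records, ms) for ms in (2, 3, 4)}
```

With these toy records, every pattern found at the higher level also appears at the lower levels, so `set(by_level[4]) <= set(by_level[3]) <= set(by_level[2])` holds, which is the inclusive relation exploited above.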
The present approach is expected to advance eye-tracking research along with conventional heat maps, scan paths, and network analysis recently developed by Matsuda and Takeuchi [
Twenty residents (seven males and 13 females) living near the AIST Research Institute in Japan were recruited as subjects (Ss) for the experiments. They had normal or corrected vision, and their ages ranged from 19 to 48 years (average 30). Ten of the Ss were university students, five were housewives, and the rest were part-time workers. Eleven Ss were heavy Internet users, while the rest were light users, as judged from their reports about the number of hours they spent browsing online in a week.
The front (or top) pages of ten commercial web sites were selected from various business areas: airline companies, commerce and shopping, and banking. These were classifiable into three groups according to the layout types [
The stimuli were presented at a resolution of 1024 × 768 pixels on the 17′′ TFT display of a Tobii 1750 eye-tracking system, which recorded gaze at a rate of 50 Hz. The web pages were displayed to the Ss one at a time in random order, each for 20 s. The Ss were asked to browse each page at their own pace. The translated instructions were: “Various web pages will be shown on the computer display in turn. Please look at each page as you usually do until the screen darkens, and then click the mouse button when you are ready to proceed.” The Ss were informed that the experiment would last for approximately five minutes.
A 5 × 5 mesh was superposed on the effective part of each page, after the page was stripped of white margins that had no text or graphics. A uniform mesh was employed for ease of comparison among pages that varied in design beyond the basic layout. The distance of a shift between two segments was measured by the Euclidean distance, computed as the square root of
The rows (and columns) of the mesh were alphabetically (and numerically) labeled in descending order: A through E (and 1 through 5). The segments were coded by combining these labels as seen in Figure
Segment coding.
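The segment coding and the distance between segments can be sketched as follows. The sketch assumes the usual grid form of the Euclidean distance, that is, the square root of the sum of the squared row and column differences measured in blocks; the function names `segment_position` and `segment_distance` are hypothetical.

```python
from math import sqrt

ROWS = "ABCDE"  # rows labeled A..E from top to bottom; columns 1..5 from the left


def segment_position(code):
    """Map a segment code such as 'B3' to zero-based (row, column) indices."""
    row, col = code[0], int(code[1:])
    if row not in ROWS or not 1 <= col <= 5:
        raise ValueError(f"unknown segment code: {code!r}")
    return ROWS.index(row), col - 1


def segment_distance(a, b):
    """Euclidean distance between two segments, in block units (assumed form)."""
    (r1, c1), (r2, c2) = segment_position(a), segment_position(b)
    return sqrt((r1 - r2) ** 2 + (c1 - c2) ** 2)
```

Under this assumption, a loop such as A1 to A1 has distance 0, a shift between horizontally or vertically adjacent segments such as B2 to B3 has distance 1, and a diagonal shift such as A2 to B3 has distance √2.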
The raw tracking data for each subject consisted of time-stamped gaze points measured in
Each fixation was then translated into code sequences according to the segments in which the fixation fell. Finally, each fixation sequence was partitioned into chunks using the isolate gaze points as delimiters.
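The partitioning step can be sketched as follows. The input format is an assumption made for illustration: each event is either the segment code of a fixation or `None`, which stands for an isolate gaze point that does not participate in any fixation. The function name `partition_into_chunks` is hypothetical.

```python
def partition_into_chunks(events):
    """Split a coded fixation sequence into chunks at isolate gaze points.

    `events` holds segment codes (fixations) and None (isolate gaze points).
    One or more consecutive isolate points act as a single delimiter.
    """
    chunks, current = [], []
    for event in events:
        if event is None:
            # An isolate gaze point closes the chunk that is being built.
            if current:
                chunks.append(current)
                current = []
        else:
            current.append(event)
    if current:
        chunks.append(current)
    return chunks


events = ["A1", "A1", None, "B2", None, None, "B3", "C3", "C4"]
chunks = partition_into_chunks(events)
# chunks == [["A1", "A1"], ["B2"], ["B3", "C3", "C4"]]
```

Because a run of isolate points is treated as one delimiter, the length of an interruption does not affect the resulting chunks in this sketch.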
In accord with the algorithm, 25 segments were first recoded using letters
Frequent patterns were extracted at three levels of minimum support (denoted as ms12, ms14, and ms16) corresponding to 60, 70, and 80% of the subjects.
The four pages used as stimuli will be referred to as P1, P2, P3, and P4.
The total number of chunks did not greatly differ among pages, ranging from 539 (P2) to 592 (P1). The pages agreed well on the lengths and proportions of primary, secondary, and tertiary chunks, which contained one, two, and three fixations, respectively. Primary chunks accounted for 53.3 (P4) to 60.4% (P1) of the total chunks, and secondary chunks accounted for 21.9 (P1) to 25.1% (P4). Putting the primary and secondary chunks together, the vast majority of the chunks (≥78.4%) were very short. The proportions of the tertiary chunks were much smaller, ranging from 6.9 (P3) to 12.2% (P4). The longer chunks accounted for 7.9 (P1) to 11.6% (P3).
The primary shifts of transitions within double-fixation chunks were loops (distance = 0) across pages. These accounted for 48.5 (P1) to 62.2% (P2). The pages agreed also on the secondary (
Loops and one-block shifts were also dominant among the chunks of length three or more. Loops accounted for 49.2 (P4) to 60.8% (P3) of the shifts, and one-block shifts accounted for 34.4 (P3) to 42.7% (P1) of them. Putting these together, the overwhelming majority (
Similarly, extremely short shifts (≤
The low prominence of the first two modal shifts was compensated by the relatively large proportions of the longer ones. Each of the two-block shifts (
The frequent patterns extracted at three different ms levels (ms12, ms14, and ms16) are inclusive within each page in the sense that (a) subpatterns of a frequent pattern are also frequent at a given level and (b) the patterns extracted at a higher level are included in those at a lower level. For the sake of simplicity, the term “frequent” will be omitted below when obvious. Prior to mining, special coding was applied to the within-chunk loops as explained in Section
As seen in Table
Number of patterns (
Page | len | ms12 patterns | ms12 loops | ms14 patterns | ms14 loops | ms16 patterns | ms16 loops |
---|---|---|---|---|---|---|---|
P1 | 1 | 18 | B1/1 | 11 | | 5 | |
P1 | 2 | 19 | | 4 | | 1 | |
P2 | 1 | 15 | A1/1, B1/1, D1/1 | 9 | B1/1 | 5 | B1/1 |
P2 | 2 | 14 | B1/2 | 4 | B1/1 | 1 | |
P2 | 3 | 1 | | | | | |
P3 | 1 | 14 | A1/1 | 12 | A1/1 | 7 | A1/1 |
P3 | 2 | 34 | A1/8 | 11 | A1/3 | 2 | A1/1 |
P3 | 3 | 5 | A1/2 | | | | |
P4 | 1 | 18 | A1/1 | 15 | A1/1 | 6 | |
P4 | 2 | 20 | A1/5 | 3 | | 1 | |
In the six patterns of length three found on P2 and P3 at ms12, the constituent codes were partially or totally homogenous. Five of them contained two repeated codes, either A2 or B3, including those prefixed by (A1..) as reported above. The remaining one, found solely on P3, contained A2. In the following examination of the double-chunk patterns, loops will be treated as single codes to reduce complexity.
The double-chunk patterns are listed in Table
Double-chunk patterns by direction.
Page | Direction | Pattern |
---|---|---|
P1 |
|
B2A2 B2A3R B2A4R |
== |
|
|
|
| |
A2A4R A3A4R B2B1L B2B5R B3B2L |
||
|
A1D3R B2C3R B2C4R B2D2 B2D3R B3D3 | |
|
||
P2 |
|
B3A1L B3A3 |
== |
| |
|
|
|
|
| |
|
||
P3 |
|
|
== |
| |
|
|
|
|
| |
A1B2R A1B4R A1B5R A1C3R A1D2R |
||
|
||
P4 |
|
(none) |
== |
|
|
|
|
|
|
| |
A2C4R A3B4R B2C3R B3C3 B3C4R C3D3 |
At ms16, the patterns were homogenous (B2B2 on P1; B3B3 on P2) or horizontal (A1A2 on P3; C3C4 on P4) sequences, with the exception of the down-rightward pattern A2B3 on P3. There was no leftward heterogeneous pattern.
The new patterns found at ms14 included an upward sequence (B2A3 on P3) and five downward sequences (B3C2 on P2; A1B3, A2B4, and A2C3 on P3; and A2B4 on P4) in addition to four homogenous sequences (B3B3 on P1; A2A2, B3B3 on P3; C3C3 on P4) and six horizontal sequences (A1A4 and B2B3 on P1; B3B1 on P2; and A1A3, A2A3, and B3B4 on P3). Among the 12 heterogeneous patterns, only two (B3B1 and B3C2 on P2) were leftward.
The patterns extracted at ms14 and above had no segments in rows D and E and no segments in the fifth column. None of the seven upward and downward sequences were strictly vertical, involving adjacent or nonadjacent columns in the ratio of 4 to 3. These vertical patterns mostly involved adjacent rows (6 out of 7).
Some of the constituent segments of the sequences at ms14 and above appeared solely as prefixes (A1 on P1 and P3; A2 on P4) or as postfixes (B3 on P1; B1 and C2 on P2; A3, B3, B4, and C3 on P3; B4 and C4 on P4).
The new double-chunk patterns found at ms12 had (a) segments in row D and in column 5, (b) notable positions of the new segments, (c) increased heterogeneous patterns, (d) increased sequences between nonadjacent rows, (e) strictly vertical sequences, and (f) bilateral sequence pairs. The segments in row D appeared only as postfixes in the downward sequences (D2 and D3 on P1 and P2; D2, D3, and D5 on P3; and D3 on P4). Similarly, the new segments found in row C were postfixes (C3 and C4 on P1; C3 on P2; C1 on P3; and C2 on P4) with a single exception (C2 on P3). The new segments in row B were mostly postfixes: B1, B4, and B5 on P1, B5 on P3, and B4 and B5 on P4. B2 and B3 on P4 were prefixes. B2 on P2 was a special case, being a prefix to itself (B2B2). Dual roles were more notable than unary ones among the new segments in row A (A2 and A3 on P1, A1 on P2, and A3 on P4).
A total of seven new upward sequences were found, three on P1 and two each on P2 and P3, but still none on P4. These were prefixed by B2 (on P1 and P3), B3 (P2), or C3 (P3) and postfixed by the segments in row A: A1, A2, A3, or A4. Only C3A2 involved nonadjacent rows. A strictly vertical sequence was present on each of P1, P2, and P3: B2A2, B3A3, and B2A2, respectively. The rest were rightward (B2A3 and B2A4 on P1) or leftward (B3A1 on P2; C3A2 on P3).
A total of five new homogenous sequences were found on P2 and P3, one in row A (A1A1 on P2), three in row B (B1B1 on P2 and B2B2 on P2 and P3), and one in row C (C3C3 on P3). Like those at ms14 and above, none of the constituents were in columns 4 or 5.
A total of 17 new horizontal sequences were found on P1 (two in row A and four in row B), P2 (two in B), P3 (one in A, three in B, and one in C), and P4 (one in A, two in B, and one in C). A2 and A3 appeared as a prefix or as a postfix, while A4 appeared only as a postfix. The same held for B1, B2, and B3, while B4 and B5 appeared only as postfixes. C2 assumed dual positions in C2C1 on P3 and C3C2 on P4, both of which were leftward. The ratio of leftward to rightward sequences was 2 : 4, 1 : 1, 3 : 2, and 1 : 3 in the order of P1, P2, P3, and P4.
A total of 29 new downward sequences were found, six on P1, three on P2, 14 on P3, and six on P4. The prefixes concentrated in rows A and B with two exceptions (C3D2 on P3 and C3D3 on P4). In contrast, the postfixes concentrated in rows C and D with exceptions of five patterns on P3 and one on P4. Half or more of the downward patterns on P1, P2, and P3 involved nonadjacent rows (A-D/1 and B-D/3 on P1; B-D/2 on P2; and A-C/1, A-D/5, and B-D/2 on P3, where
Among all of the patterns in Table
The individual constituents of the multichunk patterns were frequent by themselves as primitive patterns at a given ms level, but not vice versa. Table
Isolate primitives by ms level.
ms12 | ms14 | ms16 | |
---|---|---|---|
P1 | A5 C2 C5 E3 |
|
A1 |
P2 | A2 B4 C1 D1 |
|
|
P3 | (none) | B5 C1 C2 D2 | A3 B2 C3 |
P4 | A5 |
B1 B2 |
A2 |
Generally, an isolate primitive at a given ms level would become a member of sequence(s) at a lower level and would not be present at a higher level. Exceptionally, C5, located in the rightmost column, persisted on P4 as an isolate at all ms levels. Partial persistence was observed between ms14 and ms16 on P1 (A2), P2 (A1, A3), and P4 (B3) as well as between ms12 and ms14 on P4 (B5, C1). No persistence was observed on P3. The persistent ones on P1 and P4 were limited to the first three columns of the top row,
Finally, E3 on P1 at ms12 was the sole frequent segment in the bottom row E where segments were generally infrequent across pages at all ms levels.
Eye-tracking researchers have inferred a fixation from gaze points closely clustered in space and time, treating it as a meaningful unit of information processing, that is, a
Most of the identified chunks were short, consisting of one or two fixations. Also, the transitions within multifixation chunks and between chunks were mostly short in distance, either loops or one-block shifts to adjacent segments. These seem to be attributable to the minimum criterion of the delimiter we employed—at least one isolate gaze point. Hence, even an accidental dislocation of one’s gaze resulted in chunking. It would be ideal if we could separate cognitively meaningful chunking from accidental chunking. Until an effective method is established, the best we can do is to be cautious in interpreting the results.
Actually, setting an appropriate criterion is a difficult task due to the possible individual and situational variations. Perhaps individuated criteria will be appropriate instead of a uniform criterion. Further investigation of the distributions of gaze points participating in fixations and those that are isolated is necessary.
As reported earlier, within- and between-chunk transitions were similar in that the first two modal distances were zero (i.e., loops) and one block. However, these differed in order and in magnitude. Loops were primary among within-chunk transitions but secondary among between-chunk transitions. The opposite was true for the one-block shifts. Next, the proportions of the primary and secondary distances of the within-chunk transitions exceeded the respective proportions pertaining to the between-chunk transitions. Similarly, there were more long-distance shifts between chunks than within them.
These results seem to suggest that the attention of our subjects was most likely shifted, after a pause, to an adjacent segment one block away or within the same segment. The medium or long-distance shifts were also separated by pauses, though their proportions were smaller than the short ones. Shifts without a pause, that is, within-chunk shifts, were short, chiefly occurring in the same segment or between adjacent segments one block away.
Now we turn to a discussion of the frequent patterns (i.e., subsequences) extracted by PrefixSpan. The patterns were simple in structure, mostly consisting of single or double chunks. Furthermore, the chunks themselves contained single fixations or single loops as expected from the chunk properties discussed above. More complex structures might have resulted if we had employed less stringent criteria for the delimiter. Even so, beneath the structural simplicity, interesting properties emerged as to the segment differentiation and the directional unevenness in attentional shifts.
First, the within-chunk loops were limited to (A1..), (B1..), and (D1..), all of which were in the leftmost column. While the presence of (D1..) was quite limited, the leading roles of (A1..) and (B1..) as prefixes in the multichunk sequences are noteworthy. These roles might be attributable to menu items placed in the segments. Second, the multichunk sequences chiefly consisted of the segments in rows A, B, and C. In particular, the leading role of A1 on P1 and P3 was noteworthy, like the loop (A1..), though its dual role as pre- and postfix was observed on P2. In contrast, A4, B4, and C4 were consistently positioned as postfixes. The same held for the segments in row D, which appeared only at the lowest ms level. The segments in row E were totally absent in multichunk sequences.
Third, the sequences at ms14 and ms16 were most often horizontal (including homogenous sequences), less often downward, and rarely upward. Upward sequences remained the least frequent among the additional patterns found at ms12, where the order between horizontal and downward sequences varied across pages.
By chunking eye-tracking records into smaller units, we discovered interesting properties of the eye movement of web page viewers. However, further studies seem necessary to enhance the present approach, for example, by setting up nested AOIs to reflect the hierarchical structure of the web objects [
We briefly explain frequent sequential pattern mining by PrefixSpan (prefix-projected sequential pattern mining) developed by Pei et al. [
Let us use Table
The goal of PrefixSpan is to find subsequences frequently shared among the records in DB. A subsequence is defined as the list of nonempty subsets of the elements of a given sequence, where the sequential order of elements is preserved. For example,
Subsequences of special importance are a prefix and the associated suffix. For instance, a frequent item
The network of the frequent patterns extracted at
Network of the frequent patterns extracted at ms2 (small letters in dark blue), ms3 (large letters in dark red), and ms4 (underscored).
More formally, a sequence
The suffix of
Scanning with respect to the prefix stops when the suffix becomes nil
It must be noted that some of the extracted patterns may be hard to identify in the original sequences, due to the intermittent removal of infrequent items from the projected database during the process, for example, the extracted pattern
The authors declare that there is no conflict of interests regarding the publication of this paper.