An interesting side note is that the latest, 2nd generation iPad Pros (mid 2017) only support a maximum frame rate of 30 fps with the front-facing camera (see table 4–21), whereas the 1st generation iPad Pros support up to 60 fps in most modes (cf. table 4–26).
It would be interesting to know the rationale for this step backward, but I haven’t been able to find anything online.
Neonates (children less than four weeks old) can sometimes track objects
if they are large and move slowly. At this age they seem to rely on saccades
(rapid jumps between different fixation points in the same direction, usually
occurring several times per second).
From about six weeks, a different type of eye movement, smooth pursuit, improves and infants begin to track things that move in a smooth pattern. At about five months, smooth pursuit reaches adult performance for sinusoidally and horizontally moving targets.
To explore the uncharted territory of UIs for infants, Pursuits seems to be a
particularly good match, especially if limited to smooth horizontal movement
patterns.
For me, there are a few particularly interesting aspects here. First: if we want to know whether it is possible for infants to learn to use applications equipped with gaze interaction, what kind of interfaces do we construct in order to find that out?
Second: what if we could get these gaze-driven applications into the hands of (millions of) parents through tablets, mobiles and the web? It is here that I think Pursuits could be extremely useful. The built-in video capabilities of most devices today should be good enough to identify gaze-following (at least with exaggerated motions), and since there is no calibration step to go through (which is difficult when it comes to infants), the big technical obstacles may very well be out of the way.
The thing is that it would really only take a few children to provide a
starting point for iterative improvements. Once we know it is possible,
we have a sort of baseline from which we can start exploring the design
of these user interfaces.
Third: if successful, what do we do with that knowledge? What interesting
research questions could we then hope to answer? What interesting applications
will be developed?
As infants’ motor skills develop considerably later than their vision, there should be a window of at least six months for gaze-only interaction design.
Gaze tracking tries to establish where one is looking, whether fixated on a point in space or following a moving object, by taking into account relative pupil position, head position, and so on. Usually this is done with specialized head- or screen-mounted hardware, but if gaze tracking is to become more common, it is likely that sensors will have to be integrated into the devices users would like to use hands-free. Lacking special eye-tracking hardware, many devices today include video calling capabilities, where the video camera can be used as a sensor (albeit not with the same precision).
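To make the “camera as a sensor” idea a bit more concrete, here is a rough sketch using OpenCV’s bundled Haar cascades. It is only a hypothetical starting point, not a working gaze tracker: it merely finds eye regions in each camera frame, and a real estimator would also have to locate the pupil within each region and compensate for head movement.

```python
import cv2

# Haar cascade for eye detection, bundled with the opencv-python package.
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

cap = cv2.VideoCapture(0)  # the device's front-facing / web camera
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Each detection is a rough bounding box around an eye. The box centres
        # over time only say where the eyes are in the frame; estimating gaze
        # would additionally require finding the pupil inside each box.
        eyes = eye_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        centres = [(x + w // 2, y + h // 2) for (x, y, w, h) in eyes]
        print(centres)
except KeyboardInterrupt:
    pass
finally:
    cap.release()
```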
As the eyes are a relatively small feature of the face, device-mounted eye trackers need very good accuracy if they are to consistently register minute changes in gaze. At a distance of half a meter from a display, shifting our eyes by as little as one degree represents about one centimeter of gaze shift on the display. The Tobii Pro X3-120, a capable screen-mounted eye tracker, can generally pick up eye movements as small as half a degree under good conditions.
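As a quick sanity check on those figures, the on-screen displacement is simply the viewing distance times the tangent of the visual angle:

```python
import math

def gaze_shift_on_screen(distance_m, angle_deg):
    """On-screen displacement (in meters) when the gaze direction rotates
    by angle_deg while viewing a display distance_m away."""
    return distance_m * math.tan(math.radians(angle_deg))

print(gaze_shift_on_screen(0.5, 1.0))  # ~0.0087 m, i.e. just under a centimeter
print(gaze_shift_on_screen(0.5, 0.5))  # ~0.0044 m, roughly half a centimeter
```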
However, in order to reach this kind of precision, eye tracking sensors need to
be calibrated to each user’s eyes. Recalibration is often also needed when
lighting conditions change, especially if moving from indoors to outdoors. This
step could make it cumbersome to just pick up a device and use it. For new
users, it could even be difficult to know when and how to calibrate.
Enter Pursuits
Pursuits (Vidal 2014, Vidal 2013a, Vidal 2013b) enables calibration-free
interaction with graphical devices using only gaze. It does this by introducing
a new type of graphical user interface element, based on movement (see figure
below). A user selects an element by following its specific
movements.
Pursuits utilizes the smooth pursuit movements of the eye, a type of movement that only occurs when we are following something with our eyes. Most people cannot reproduce this movement on their own, which means that triggering false positives while “just looking” can largely be avoided.
As this technique does not need to identify the position on the screen a user is gazing at, only that the gaze is moving in a specific pattern, it is less dependent on exact readings and, better yet, calibration is not necessary since only relative eye movements are relied upon.
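To illustrate the idea, here is a minimal sketch of the kind of trajectory matching a Pursuits-style selector builds on: over a short window, the gaze samples are correlated with the trajectory of each moving on-screen element, and the element whose movement follows the gaze most closely (above some threshold) is selected. The window length, the 0.8 threshold and the per-axis handling are illustrative assumptions on my part, not values taken from the papers.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation between two equally long sample sequences."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    if a.std() == 0 or b.std() == 0:
        return 0.0
    return float(np.corrcoef(a, b)[0, 1])

def match_gaze_to_targets(gaze_xy, targets_xy, threshold=0.8):
    """Return the id of the moving target whose trajectory best follows the
    gaze over a short window, or None if nothing correlates strongly enough.

    gaze_xy:    list of (x, y) gaze samples (any units, no calibration needed)
    targets_xy: dict of target id -> list of (x, y) target positions sampled
                at the same instants
    """
    gx = np.array([p[0] for p in gaze_xy], dtype=float)
    gy = np.array([p[1] for p in gaze_xy], dtype=float)
    best_id, best_corr = None, threshold
    for target_id, traj in targets_xy.items():
        tx = np.array([p[0] for p in traj], dtype=float)
        ty = np.array([p[1] for p in traj], dtype=float)
        # Correlate each axis along which the target actually moves; a target
        # moving only horizontally is judged on x alone. Because correlation
        # ignores offsets and scale, absolute gaze coordinates are irrelevant,
        # which is what makes this kind of matching calibration-free.
        corrs = []
        if tx.std() > 0:
            corrs.append(pearson(gx, tx))
        if ty.std() > 0:
            corrs.append(pearson(gy, ty))
        corr = min(corrs) if corrs else 0.0
        if corr > best_corr:
            best_id, best_corr = target_id, corr
    return best_id
```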
For reference, here’s my child using the iPad at 6.5 months of age:
At this age, his motor skills were not good enough for touch UIs in general, but he immediately picked up how to use this specific application.
One thing I was not aware of when I wrote the project specification is how much infants want to use touch (fingers, hands) to explore things. Even if they were able to use a gaze-based UI, they might not accept hands-free interaction.