Eye-Hand-Mouth Natural User Interface

I filed a patent application with the JPO (Japan Patent Office). The background motivations are as follows.

  • Problem of cognitive load of current GUI:
  • Confusion on what natural user interface is:



Finger gesture on a surface

I applied for a patent at the Japan Patent Office a few weeks ago. It is about sensing finger gestures on the surface of an object (a palm, desk, wall, bag, or anything else). I may file a PCT application too.

The background motivation:

I have doubts about gestures in the air. It looks socially strange to make such gestures in front of other people, because humankind has not used them in its long history (except pointing with the index finger). They do not match natural human behavior. Moreover, they do not preserve privacy. And, from a design point of view, decoding such a gesture means searching in 3D space, which adds complexity.

On the other hand, humankind has a long history of making finger gestures on the surfaces of objects such as paper, and does so naturally. Moreover, such gestures can be hidden from others. And decoding a gesture in 2D space is simpler.



Tobii 4C

I recently bought a Tobii 4C and have been using it with my desktop PC.

Here are some observations.

  • After a simple calibration, the eye tracking is quite accurate and robust. I had previously used ITU Gaze Tracker (Eye Tribe). Its calibration was painful, and a head movement easily broke the tracking. Tobii, the leader in gaze tracking, is doing a good job.
  • Tobii is also doing a good job of integrating gaze tracking with the current mainstream UI. The Tobii Core SDK bundles a utility for using gaze tracking with the Windows GUI. For instance, the Mouse warp feature jumps the mouse pointer to where you are looking, so it reaches a distant target faster than Fitts’s law predicts for an ordinary mouse movement. It’s nice to use.

Moreover, they own a lot of patents in hardware design and in applications with GUI/touch UI.

So I wonder why Tobii has not yet played a main role in mainstream human-machine interfaces.

Some reasons may be:

  • Tobii-style gaze tracking has its limitations. With the non-portable type, the user must sit close to and in front of the PC monitor. With the portable type, the user must wear bothersome glasses. Those limitations may be overcome by other players.
  • Business strategy
    • Tobii devices were too expensive until Eye Tribe started to offer a cheap device and Tobii was forced to follow. The high price kept Tobii from rapidly creating a market for gaze tracking.
    • Tobii could provide platform-layer technology, or could partner with platform (OS) players. But they are currently focusing on gaming and user research consulting…
    • Tobii’s patent portfolio blocks other players from realizing the platform evolution, while Tobii itself does not have the ability to act as a platform provider…

AR is not natural

I recently examined how NUI (natural user interface) is described.

One description counts AR as a NUI. I don’t think so.

The real world as we see it is already rich and dangerous enough. AR, for instance with a head-mounted display, adds extra information to look at, which puts further stress on human sight and the brain. AR is practical for those who are used to being exposed to mental tension and managing it, such as soldiers. However, AR will be an extra stress for most people until human beings get used to it as part of life, and that may take more than a few generations.

Smart phone UI occupies eyes/hand more than PC

The smart phone user interface is simpler than that of the PC. But it makes one aspect of the PC user interface worse.

The PC user interface is complex. For example, the mouse has a right-click button which many users (like my wife) don’t care about. The keyboard has many function keys which many users don’t care about. The PC screen shows many objects in toolbars and menus whose meanings many users don’t know.

On the other hand, the smart phone (or the Web) offers a “see and select” user interface. Users only have to look at several objects (icons or texts), select one of them, and repeat. Moreover, it is direct manipulation by finger. This simplicity of the user interface, besides mobility, must be the major factor in its broader adoption.

By the way, the PC user interface relies heavily on hands/fingers to give information to computers, and on eyes to get information from computers.

A smart phone has a smaller screen, so users must focus harder on it. A smart phone is mobile, but users must hold it in one hand and operate it with the other. That is, a smart phone occupies the user’s eyes and both hands more than a PC does. In that sense it is worse than the PC user interface.

Voice interaction is going to free the eyes and hands, but its use is limited to closed environments.

The IT industry should offer better user interaction.


Car driving as a human machine interaction model

While you are driving a car, you do the following in parallel.

  • watch the road ahead and get feedback on your actions
  • hear outside sounds
  • operate the steering wheel by hand and the brake/accelerator by foot
  • talk with the person in the passenger seat

Your visual system, auditory system, and motor system are working in parallel, and even two parts of the motor system are working in parallel, and you can achieve your goal of getting somewhere.

How wonderful the power of the human being is!

In contrast, current human-computer interaction is very limited and naive. It doesn’t take advantage of the human abilities described above: you give information only by hand, and get information only by 2D sight, serially. Voice interaction is going to add another channel between human and machine, but that channel is isolated from the others.


User Interactions in the age of AI+Voice?

According to news about this year’s CES, its major trend is AI+voice interacting with the backbone systems of daily life, such as cars and home electronic devices. (Smartphone, mobility, and wearable are already old-fashioned vocabulary.) What does this human-machine interaction look like?

Voice can support texting and discrete commands. But it cannot suffice for humans to interact with machines in the coming age.

  • Voice has a privacy problem. Its use is limited to private/closed spaces.
  • Voice lacks the capability for pointing and analog commands (which the mouse had).

I believe:

  • Voice interaction should be supplemented by some other means of pointing and of commanding analog quantities.
  • Voice interaction should someday be taken over by some other means of texting that supports privacy.

IT should innovate further in those interaction areas too.