The Future of Human Machine Interfaces

Designing successful consumer electronics products requires a holistic approach, including consideration of new technologies' capabilities, users' requirements, key industry and market trends, and business strategy. Developing new technologies is of particular interest because they can improve current capabilities or provide new features that increase user value when incorporated into a product. New technology can also be given strong legal protection in the form of a patent, something hugely valuable if the technology in question truly provides a unique advantage. This is exactly how Dyson started as a company, with a broad patent on a cyclonic vacuum cleaner protecting their competitive advantage for many years.

In my role as a Senior Research Engineer in the consumer electronics industry and technology startups, developing the new technologies on which to build future generations of products was, and still is, one of my primary responsibilities; be it running in-house research projects, proposing and funding PhDs, working with universities to apply the results of their research, partnering with other companies, or simply buying existing companies or individual products with exclusive agreements.

In all these cases, a new technology must be understood and assessed against a number of criteria. A commercial decision essentially boils down to weighing risk against reward. In this case, the user value provided by a technology is the central element of the total reward. A deep understanding of users is required to establish a causal link between quantitative technology metrics and user value, but doing so provides clear, quantified targets for future technology development. Business value, such as reduced costs, can also be created, and the potential for patentability, and the strength of that protection, provide a competitive advantage if what is protected is fundamentally linked to the value created. In terms of risk, the headline metrics are development cost and time to market, combined with likelihood of success.

The Human Machine Interface is a concept core to consumer electronics, as it governs every interaction we have with technology in our lives. The first consumer electronics products used the analogue interactions of the products that had come before: dials and levers. Physical buttons then gave way to digital buttons on screens, and this paradigm now governs most of our interactions with electronics today. However, new technology is poised to radically change this paradigm in the next decade.

Novel MEMS sensors continue to come to market, ever increasing the environmental variables machines can quantify. Utilised in wearable or smart home devices, these will allow our interaction with machines to become predictive and sub-conscious. Actuators such as new microLED displays are poised to come to market in the next few years, a flood of companies are combining them with waveguides to create AR headsets, and custom silicon is helping to drive the wide roll-out of new machine learning techniques into applications such as ubiquitous voice interaction.

The real advantage many of these technologies offer is the promise of a new paradigm of interaction with machines that better accounts for our innate physicality. Previously predicted revolutions of interaction, such as the paperless office or living in virtual worlds, have not come to pass precisely because they did not account for the fact that humans are primitive organisms, enthralled by our senses. Michio Kaku calls this “The Caveman Principle” in his book ‘Physics of the Future’. Rather, these technologies will bring about a revolution in the way we interact with machines precisely because they enable the machine to interact with us on our physical terms: through our senses, our language and by manipulating our environment. In doing so they will meet Mark Weiser’s criteria for ‘Profound Technology’, described in his 1991 paper ‘The Computer for the 21st Century’.

The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it.

Mark Weiser

These developments provide significant opportunity for different and better consumer electronics products. From my experience in the market, I have picked out a few interesting examples of technologies in each category that I believe will shape the Human Machine Interface of the future. I have grouped these devices by how they relate to us: through sensing our physical environment, computing a response, or actuating a response in the world.

Sensors

Healbe Gobe 2 – Non-invasive caloric intake sensing

According to Mintel (2018), 55% of smart band owners believe they have improved their physical health. Tracking calorie intake is the counterpart to fitness monitors, which can calculate calories burned, for those trying to lose weight or meet fitness goals. Today, calorie intake tracking is most often done through apps requiring onerous manual logging of meals. Healbe Gobe offers to do this automatically, as “The only smart band in the world that tracks digested calories and body water automatically”. Competing approaches to monitoring food intake include acoustic (analyzing chewing sounds), visual (analyzing photographs of food), inertial (analyzing wrist movements), EMG/EGG/piezoelectric (analyzing muscle movements during swallowing), and respiratory (analyzing breathing). None of these techniques can quantify energy consumption unless paired with other sensors or manual approaches.

Healbe Gobe 2 can monitor energy consumption with best-in-class accuracy for a wearable. An independent validation by The Foods for Health Institute at the University of California Davis, comprising a 14-day study of 27 adult volunteers aged 18-40, compared data from manually recorded intake of precisely prepared meals with data from the GoBe2. Data was transformed into three-day rolling averages, and the GoBe2's accuracy was 87%; typical manual recording is well below 50%. More information on the study can be found at http://bit.ly/ucdavis_validation.

The device works by using a bioimpedance sensor to measure water content around your cells. Water is released by cells when they absorb glucose, the energy digested from food, from the blood. Healbe Corp. currently holds 22 patents, of which the most fundamental is “Method for determining glucose concentration in human blood”, detailing the measurement technique and granted worldwide. Other patents concern the creation of a device to realize this measurement technique.

ContinUse Biometrics – IR laser & AI to sense everything

ContinUse Biometrics are an Israeli company, established in 2015, developing a unique, low cost sensor and data analytics platform that captures a wide array of medical-grade physiological information from people remotely, continuously and without any physical contact. Many different metrics can be monitored continuously at the same time; some examples are:

  • Cardiographs for heart rate, variability & health conditions

  • Blood pressure

  • Breathing rate and respiratory conditions

  • Muscle myography for interactions

  • Secure user authentication using multiple biomarkers

  • Long range, isolated microphone

  • Vibrations useful for analysis of components in industry

  • Stress levels

  • Thermal Comfort Levels

  • Fatigue

  • Presence of Blood Alcohol

The sensor consists of any standard camera sensor and an eye-safe IR laser diode. The camera picks up reflected IR laser light, and interferometry calculations are used to detect vibration. AI processing in a cloud platform then allows all the sensed parameters to be inferred from this vibration signal and returned as metrics. The optics are designed to work over a range fixed at manufacture; 20-70cm and 35-65m are typical ranges.
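
As an illustration of the kind of processing involved, the sketch below recovers a heart rate estimate from a raw vibration trace using generic band-pass filtering and peak detection. This is standard signal processing, not ContinUse's proprietary pipeline, and the sampling rate and band edges are assumptions.

```python
# Minimal illustration: estimate heart rate from a vibration trace.
# Generic DSP, not ContinUse's pipeline; sample rate and band edges are assumed.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

FS = 500  # Hz, assumed sensor sampling rate

def heart_rate_bpm(vibration: np.ndarray) -> float:
    # Band-pass around typical heart-beat frequencies (0.8-3 Hz ~ 48-180 BPM).
    b, a = butter(4, [0.8 / (FS / 2), 3.0 / (FS / 2)], btype="band")
    cardiac = filtfilt(b, a, vibration)
    # Each detected peak corresponds to one beat; enforce a refractory gap.
    peaks, _ = find_peaks(cardiac, distance=int(0.33 * FS))
    if len(peaks) < 2:
        return float("nan")
    beat_intervals = np.diff(peaks) / FS            # seconds between beats
    return 60.0 / float(np.mean(beat_intervals))    # beats per minute

# Example: 30 s of synthetic data with a 1.2 Hz (72 BPM) component plus noise.
t = np.arange(0, 30, 1 / FS)
signal = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.random.randn(t.size)
print(round(heart_rate_bpm(signal)))  # ~72
```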

A number of these metrics could be of huge interest for HMI applications. For instance, the ability to pick up a single person speaking across a crowded and noisy room without interference could be of great benefit to speaker phones, and the security of smartphones could be greatly increased by using multiple biomarkers. ContinUse are currently focused on contactless, real-time medical diagnostics in the home.

ESPROS Photonics, Chromation, Unispectral – Spectrometer on chip

Spectral information carries a wealth of information about our physical world. Skin conditions can be diagnosed, materials can be identified, the composition of food and beverages can be analyzed, and a person's exposure to light can be recorded in order to provide insights or interventions to support a healthy circadian rhythm.

Until recently, spectrometers have required multiple optical components arranged with a high degree of precision to ensure light can be split and measured accurately. This led to high costs (>$1000) and large sizes. The small, low cost alternative has been to use individual photodiodes, each calibrated to report the intensity of a single wavelength range. While feasible for measuring the intensity of all visible light, this approach does not scale to allow monitoring of a wide number of wavelengths, meaning applications that require higher fidelity, such as material identification or making recommendations about a person's circadian rhythm, have not been possible. Recently, a number of companies have proposed different solid state techniques which dramatically reduce both the cost and size of spectral sensors without significantly decreasing fidelity.

The Espros SPM64 Multispectral Sensor is one of a new generation of spectrometers on chip, capable of sensing 400-900nm across 64 channels spaced over this spectrum. It uses a micro-patterned bandpass filter array on top of a hybrid CCD-CMOS image sensor in a package less than 2.7mm² in size. This device was developed for the smartphone industry.

Chromation are a direct competitor and have created the world's first commercial photonic crystal spectrometer with a folded optical path. The technology provides best-in-class spectral range with competitive spectral resolution in a highly compact form factor. The optical component is manufactured using a silicon-compatible process for a low component cost: a fused silica substrate with CMOS pixels sensing intensity is paired with planar optical structures, fabricated via optical lithography and dry etch (no curved surfaces, ruling, or imprint), that split the light by wavelength, and the parts then have a simple stacked alignment. As non-imaging optics are used, the design scales down well, no external optics or fine optical tolerances are required, and the cost is low. The device available today is the first demonstration of the technology; the roadmap covers a number of initiatives to further reduce size and expand capabilities, including a move to surface mount chip packaging and extension further into the NIR. The current device has a range of 350-950nm, a resolution of 15nm FWHM, can measure incident light levels of <172nW up to 300uW and requires 5-30mA @ 5V while sampling (typical sampling time is 20-100ms). Range and resolution can be adjusted (operation at 10.6um and resolution down to 2nm have been demonstrated), and the folded optical path allows for the integration of additional optical elements to expand capabilities.

Unispectral from Israel are developing a MEMS-based tunable optical bandpass filter that can be placed in front of an existing smartphone camera to turn it into a low cost spectrometer. The device is a Fabry-Perot interferometer with a spectral range from 400nm to 1,000nm (visible to near IR). The filter can tune its band-pass wavelength, down to a resolution of 20nm, in real time by actuating the distance between its FP lens elements. This enables the filter to capture only the necessary light at a high image resolution. Unispectral's MEMS optical filter is 0.82mm thick with an optical transmittance of 90% and a diameter comparable to most mobile phone cameras.

Unispectral claim differentiation from competitors through their optical filter being compatible with existing CMOS sensors on mobile devices, its small size and its higher spectral resolution. Their closest competitor is Microsoft Research's 'HyperCam', which uses a number of LEDs to illuminate a scene photographed with a CMOS camera; however, it has not been commercialized and is not capable of the same spectral resolution. Using Unispectral's platform, users can capture high resolution images that include spectral data for every object in the picture. The combination of computer vision and spectral analysis tools allows cameras to automatically identify objects and determine their chemical components from a distance.
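
To picture how a handful of spectral channels can identify a material, the sketch below matches a measured spectrum against a small reference library using cosine similarity. The 64-channel layout echoes the Espros device described above, but the reference spectra and the matching method are illustrative assumptions rather than any vendor's algorithm.

```python
# Illustrative material identification by matching a measured spectrum
# against a small reference library (cosine similarity). The 64-channel
# layout mirrors the SPM64 description; the reference data is invented.
import numpy as np

N_CHANNELS = 64  # e.g. 400-900 nm split into 64 bands

def normalise(spectrum: np.ndarray) -> np.ndarray:
    spectrum = spectrum.astype(float)
    return spectrum / (np.linalg.norm(spectrum) + 1e-12)

def identify(measured: np.ndarray, library: dict[str, np.ndarray]) -> str:
    m = normalise(measured)
    scores = {name: float(np.dot(m, normalise(ref))) for name, ref in library.items()}
    return max(scores, key=scores.get)

# Hypothetical reference spectra (would come from calibration in practice).
rng = np.random.default_rng(0)
library = {"cotton": rng.random(N_CHANNELS), "polyester": rng.random(N_CHANNELS)}
sample = library["cotton"] + 0.05 * rng.random(N_CHANNELS)  # noisy measurement
print(identify(sample, library))  # -> "cotton"
```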

Hamamatsu also have a portfolio of miniaturized devices, but these use a more traditional optical setup and are limited in the cost and size improvements that can be made over existing technology.

Epicore Biosystems – Microfluidic sweat sensor

Epicore Biosystems develop microfluidic sensors that measure sweat and sweat biomarkers in order to monitor sweat rate and electrolyte loss. Skin conductance sensors have been available for some time and are used to measure galvanic skin response or changes in sweat gland activity. These are reflective of the intensity of our emotional state, otherwise known as emotional arousal. A much greater resolution of information is available through monitoring biomarkers.

The core technology is a thin, miniaturized, wearable microfluidic sensor designed for precise capture of microliter volumes of sweat released from eccrine glands, paired with a reusable, battery-free electronics module for measuring, recording and transmitting sweat conductivity and rate in real time, using wireless power from and data communication with mobile devices via NFC. The platform exploits ultra-thin electrodes integrated within a collection of microchannels as interfaces to circuits that leverage NFC protocols. The company claims differentiation in lower cost and higher accuracy with its disposable patch sensors. Initially introduced in 2016, earlier microfluidic designs measured chloride loss, glucose, lactate and pH levels in sweat. Newer designs also quantify concentrations of heavy metals such as lead and arsenic, as well as urea and creatinine levels, the latter of which is related to kidney health. The devices can measure these chemistries continuously, in real time, allowing wearers to monitor how their sweat chemistry changes during an exercise regimen and throughout the day. The devices are able to work underwater during aquatic sports, as new adhesive materials and microfluidic designs maintain water-tight seals to the skin.

Paratech, Next Input and Hypersurface – QTC and MEMS Force Sensors

Force sensors allow deeper and potentially more natural interactions with a smartphone display by responding to different levels of pressure in different areas. Touch sensing is often added to screens through capacitive or resistive sensors, and less commonly through optical means. Capacitive and resistive solutions do not work in the wet, and all three of these solutions only work on a limited number of substrate materials. Apple use a system based on capacitive sensors in the screen, but this is very expensive. Inductive sensors from TI and Microchip only work with metal chassis. Outside of screens and metal-on-plastic or wood substrates, two main categories of device are available: those based on MEMS and those based on composite materials, either piezo-resistive or quantum tunneling composite (QTC).

Paratech's QTC ink is available in opaque and transparent versions and is typically layered up with silver conductive tracks, adhesive and PET plastic sheets for protection and electrical isolation. The finished stack is thinner (75um, with 10um displacement required for force sensing) and lower cost than MEMS-based solutions, and more thermally stable than piezo-resistive materials. During shipping, temperatures above 60 degrees can cause the QTC material to soften, destroying the absolute force calibration done on the production line to make the response curve linear and account for the different amounts of pre-load applied during assembly. The same effect can happen to piezo-resistive materials at lower temperatures. Electrically, QTC is a variable resistor and is simple to implement. It is possible to matrix it to provide sensitivity to position, though capacitive solutions have far higher accuracy, and careful mechanical design is required to ensure the material being squeezed transfers pressure to the expected QTC sensors reliably. QTC works with gloves and in the rain, unlike capacitive solutions. It also has a wide dynamic range (100g-1kg) and high fidelity (10g).

Next Input's MEMS system is low cost at volume with world class quality and reliability. Next Input are on their 3rd generation sensor, and 2nd generation Analog Front End (AFE), since 2015, improving key performance metrics such as sensitivity, power, and size. With no movement tolerances required, force sensing on screen surfaces and device sides is minimally invasive and simple to incorporate. The FT-4000 ForceTouch™ sensor family represents the highest performance, smallest size solution optimized for applications that require indirect force sensing, such as LCD displays and other sensing solutions where the sensor must be placed at the perimeter of the touch area. The footprint is 1.33mm x 1.33mm x 0.57mm and it requires preload and a stiff backing. The FT-7000 ForceGauge™ sensor family is optimized for applications where the sensor can be directly applied to the surface of the touch interface, such as a glass panel, an OLED display, or leather, plastic, or metal material. This sensor is smaller in height, at 0.22mm, and no pre-load is required as it adheres directly to the touch surface. Both sensors require 5μA when interfacing with the AF-3068 analogue front end.

Hypersurfaces' patented AI uses data gathered via microphones and inertial sensors to locate input such as touches and slides, and to recognize specific actions, with minimal need to embed technology within a surface. The real advantage of this approach is that it uses standard chips and common vibration sensors (typically piezoresistive sensors or accelerometers) and requires minimal hardware. The sensors can be applied to any surface (wood, car dash, metal, glass) and the software can recommend the minimum number and placement of sensors that will allow full fidelity of input. The software then leverages combinations of supervised and unsupervised neural networks to interpret the vibrational patterns detected on physical objects (such as those from human gestures or interactions) in real time and convert them into digital commands. The software is calibrated for a surface and sensor configuration and is able to account for noise such as the vibration of a car in motion. This enables a number of applications, as it can make large surfaces such as tables, walls and car interiors interactive, able to sense input and recognize user activity, e.g. taps or the placing of a specific object on the surface. Being based on machine learning, it is easily embeddable in a number of products with minimal changes to hardware.
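
To make the data flow concrete, the sketch below shows the general shape of vibration-based gesture recognition: a short sensor window is reduced to spectral features and matched against known gestures. Hypersurfaces use trained supervised and unsupervised neural networks; this nearest-centroid version, with invented labels and parameters, is only an illustration of the idea.

```python
# Sketch of the general idea behind vibration-based gesture recognition:
# turn a short accelerometer window into spectral features and classify it.
# Nearest-centroid matching is used here purely for illustration; it is not
# Hypersurfaces' neural-network pipeline, and all parameters are assumptions.
import numpy as np

FS = 2000  # Hz, assumed vibration-sensor sampling rate

def features(window: np.ndarray) -> np.ndarray:
    # Log-magnitude spectrum summarised into coarse frequency bands.
    spectrum = np.abs(np.fft.rfft(window * np.hanning(window.size)))
    bands = np.array_split(np.log1p(spectrum), 16)
    return np.array([band.mean() for band in bands])

def train(examples: dict[str, list[np.ndarray]]) -> dict[str, np.ndarray]:
    # One centroid of the feature vectors per gesture label ("tap", "slide", ...).
    return {label: np.mean([features(w) for w in windows], axis=0)
            for label, windows in examples.items()}

def classify(window: np.ndarray, centroids: dict[str, np.ndarray]) -> str:
    f = features(window)
    return min(centroids, key=lambda label: np.linalg.norm(f - centroids[label]))

# Usage sketch: windows of roughly int(0.25 * FS) samples per gesture event.
# centroids = train({"tap": tap_windows, "slide": slide_windows})
# label = classify(live_window, centroids)
```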

mBrainTrain – On Ear EEG

Augmenting reality visually faces a number of physics limits that make high resolution, high brightness, and a wide eyebox and FOV very challenging in small spaces. Augmenting reality with audio is much more achievable, and a number of major companies are beginning to work on hearables, for example the Bose AR program, Google Buds, Apple AirPods, Microsoft Surface Earbuds and Samsung Galaxy Buds. Many of the challenges in this sector surround packing the necessary technology into the small space envelope of headphones, and the software; the sensors and other hardware already exist to enable many significant use cases. However, there are still novel sensors that can enable new user value. Electroencephalogram (EEG) sensors have been restricted to lab use and have never previously been incorporated into consumer electronics. EEG belongs to a class of sensors, including ECoG, MEG, fNIRS and fMRI, that are able to monitor brain activity. Recent advances have allowed EEG sensors to shrink small enough to fit around and in the ear, yet retain the fidelity and accuracy to determine the brain region in which activity is occurring.

mBrainTrain, a 7 year old company based in Serbia with EU funding, is selling a behind-ear EEG sensor array on flexible, transparent foil called cEEGrid, which enables unobtrusive multi-channel EEG acquisition from around the ear, though a drop of conducting gel is still required. MBT reference a number of scientific studies which they claim prove that the scope of features corresponds to that of full cap recording systems, with acceptable accuracy. Professor Danilo Mandic, Professor of Signal Processing at Imperial College London, is building earbud EEG devices. His latest is a memory foam earplug with two electrodes of silver coated fabric, the key advantages being a low cost, generic device that avoids custom ear pieces and needs only saline solution to establish a low-impedance contact. With this the group obtained high-quality EEG signals, and the device is also capable of picking up cardiac and respiratory signals (Hearables: Multimodal physiological in-ear sensing). The CIBIM laboratory at Oxford University, led by Prof. Maarten De Vos, focuses its research efforts on miniaturizing the mobile EEG device. A first prototype of a near-invisible high-quality brain monitoring device was developed in collaboration with Neuropsychology at the University of Oldenburg, CRITIAS (ETS, Montreal) and Sonomax.

More broadly, EEG sensors have already been shown to be highly accurate sleep trackers and alertness/drowsiness detectors, to act as mindfulness trainers, and to enable brain computer interfaces, for instance allowing the movement of a mouse with thought alone. Incorporating EEG could allow manufacturers to build hearable devices that use thought as an input to smartphones, which could help avoid the stigma of using voice assistants in public: an assistant could speak to the user, and the user's response could be elicited by detecting a P300 wave, which indicates a decision in response to a stimulus. Additional information could also be recorded and shared with loved ones, for instance adding a visualization or emoji representing current mindset to a message. Manufacturers could provide best in class wellbeing improvements by monitoring stress, sleep and alertness levels far more accurately than previously possible and providing interventions and training to improve mindset. Ultimately, health applications could be enabled, including the diagnosis of medical disorders. While researchers have proven the feasibility and utility of these sensors, engineering challenges remain: the electronics need miniaturizing, making robust, and designing into products that provide genuine user value.
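
Detecting the P300 response mentioned above is classically done by averaging short EEG epochs time-locked to each stimulus, so the stimulus-locked wave emerges from background activity. The sketch below illustrates only that standard technique; the sampling rate, epoch window and amplitude threshold are assumptions, not figures from any of the devices described.

```python
# Classic event-related-potential averaging used to reveal a P300 response:
# epochs time-locked to each stimulus are baseline-corrected and averaged so
# the stimulus-locked wave stands out from background EEG. Illustrative only;
# sampling rate, epoch window and detection threshold are assumptions.
import numpy as np

FS = 250  # Hz, assumed EEG sampling rate

def p300_present(eeg_uv: np.ndarray, stimulus_samples: list[int],
                 threshold_uv: float = 5.0) -> bool:
    epochs = []
    for s in stimulus_samples:
        epoch = eeg_uv[s - int(0.1 * FS): s + int(0.6 * FS)]  # -100 ms to +600 ms
        epoch = epoch - epoch[: int(0.1 * FS)].mean()         # baseline correction
        epochs.append(epoch)
    erp = np.mean(epochs, axis=0)
    # P300: positive deflection roughly 250-450 ms after the stimulus
    # (indices are offset by the 100 ms pre-stimulus baseline).
    window = erp[int(0.35 * FS): int(0.55 * FS)]
    return float(window.max()) > threshold_uv
```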

Nanoscent – Electronic nose

Volatile Organic Compounds (VOCs) are organic chemicals with a very low boiling point, causing them to enter the surrounding air as a gas. VOCs are numerous, varied, and ubiquitous. Most scents we can smell are VOCs, and gases we cannot smell can still be detected by sensors to identify compounds such as C4 or TNT. VOCs enable communication between plants, and from plants to animals; detecting them allows plant stress, such as that caused by disease, to be identified early. Some VOCs are dangerous to human health or cause harm to the environment. The detection of VOCs is typically carried out by large, complex, expensive lab equipment, for instance gas chromatography and mass spectrometry.

Israel's Nanoscent offer a new, smaller chemical sensor and software package that forms an electronic nose scent recognition system to detect VOCs. NanoScent's chemical sensor consists of a nanoparticle array, which only reacts to VOCs and is unaffected by the presence of lighter molecules like CO2 and CO. The sensor is 5x5mm in size and outputs a change in electrical voltage upon sensing chemicals in concentrations as low as 20 parts per billion. This change in voltage is used as an input by NanoScent's software to classify the presence of different VOCs. The VOCs need to be exposed to the sensor for anywhere between 5 and 60 seconds for the system to complete its sampling. NanoScent is capable of sensing 30 VOCs simultaneously and has a library of 250 scents that can be created from different combinations of these 30 VOCs. The company claims its sensors are disposable and reusable, though their lifetime is not known.

The company's core differentiation is in its machine learning models, which it claims allow the scent recognition system to be scaled to multiple applications faster than competing e-nose offerings, which rely on tailored chemical libraries to detect VOCs within an industry sector like food spoilage detection. NanoScent develops machine learning models for applications in human health, such as measuring changes in the odor of human fluids (e.g. sweat, saliva) to indicate health conditions. Customers can use the SDK to develop their own applications. The software models can be tuned to detect anything from single compounds to multiple compounds, depending on the complexity required for the use case. As the number of molecules to detect increases, so does the computational power needed by the software. The software can be hosted on a high compute server or a local Raspberry Pi.

Aernos – Nanostructure MEMS gas sensor

Smaller gas sensors are typically based on metal oxide semiconductors (MOS), surface acoustic waves, quartz crystal micro-balances, spectrometry or conducting polymers. Aernos have developed a gas sensor for monitoring air quality based on carbon black nanostructures on silicon, creating a MEMS system on chip (SoC) with multiple sensing channels. Each channel has a unique nanostructure onto which particular gas molecules adsorb for detection. AerNos fabricates the nanostructures onto the SoC using in-house proprietary fabrication techniques, while the SoC and the nanomaterials themselves are procured from its partners. The nanomaterials are coated with molecules that activate them to attract specific types of gas molecules. The attracted gas molecules adsorb onto the nanomaterials and create a measurable change in electrical characteristics like resistance and capacitance. The SoC has a chip that processes the information and determines the concentration of gas molecules. The correlation of the change in resistance and capacitance to concentration is programmed for each gas compound by AerNos before deployment. The company claims that AerNos' sensors can specify the concentrations of individual gas compounds like nitrogen oxides, ozone, carbon monoxide, sulfur dioxide, and hydrogen sulfide with a sensitivity of up to one part per billion. The sensor SoC consumes less than 20 mA and has a footprint of 2,025mm² (45x45mm), but the core sensing element with multiple channels of nanomaterials is about 9mm² (3x3mm).

AerNos' sensors do not need to be heated or cooled for activation, as in the case of metal oxide based sensors. As a result, these sensors require less power than competing solutions to measure the same number of gas compounds on a sensor of the same size. Additionally, AerNos' sensors measure the absolute concentration of gases within the environment as opposed to the relative concentration. This is because the SoCs are preprogrammed with the base resistance and capacitance metrics across the nanotubes in standard ambient conditions and can self-calibrate for fluctuating ambient conditions. AerNos sensors are tested in-house in conditions ranging between -40 and 85 degrees Celsius, at a relative humidity of up to 99%. AerNos is in talks with a major system integrator for a pilot, where its sensors will be included on devices installed on parking meters to monitor pollutants. In addition to the standalone sensors, the company is currently also marketing its wearable device, the AerBand. This device targets health conscious consumers who are interested in monitoring environmental air quality. Additionally, AerNos hopes that hospitals will use the AerBand to monitor health risks due to pollution for pregnant women.
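
The mapping from a baseline-corrected resistance change to an absolute concentration described above is, in essence, a stored calibration curve. The sketch below shows that idea for a single channel; the calibration points are invented for illustration, since AerNos' actual per-compound correlations are not public.

```python
# Minimal sketch of how a pre-programmed calibration could map a baseline-
# corrected resistance change to a gas concentration. The calibration points
# are invented; a real sensor would store per-channel curves set at the factory.
import numpy as np

# Hypothetical calibration: relative resistance change -> concentration (ppb).
CAL_DELTA_R = np.array([0.00, 0.02, 0.05, 0.10, 0.20])   # (R - R0) / R0
CAL_PPB     = np.array([0.0,  5.0,  15.0, 40.0, 100.0])

def concentration_ppb(resistance: float, baseline_resistance: float) -> float:
    """Convert one channel's resistance reading into an absolute concentration."""
    delta = (resistance - baseline_resistance) / baseline_resistance
    return float(np.interp(delta, CAL_DELTA_R, CAL_PPB))

# Example: a 10.0 kOhm baseline drifts to 10.7 kOhm as gas adsorbs onto the channel.
print(round(concentration_ppb(10.7, 10.0), 1))  # ~25.0 ppb
```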

Actuators

Revel HMI – Small haptic feedback device

Haptic feedback activates our sense of touch to create sensations from digital input. The actuators for these systems are typically motors with eccentric weights or resonating linear drivers (piezoelectric or electro-active polymer based). These actuators create vibration by moving a mass at a certain frequency. This typically results in either a small actuator that is only capable of high speed operation (which humans cannot perceive very well) or a very large actuator that operates at lower frequencies. The ALPS system is industry leading in terms of performance for VR headsets and gaming controllers, but it is very large and so unsuitable for smartphone applications. The Taptic Engine used by Apple is also large and very expensive ($7 per unit).

Based in the USA, RevelHMI has developed multiple technologies to improve the performance of resonating actuators. They have developed solutions to improve power and efficiency, broaden frequency response, and improve consistency and reliability, while also simplifying the design to allow low cost mass production and small space requirements. In general, the technology allows smaller motors that operate at lower frequencies with high levels of vibration power and improved efficiency. They also achieve their small package size by moving a smaller mass through a greater distance along an arc, rather than in a purely linear fashion. The unit is expected to cost less than 30 cents at volume and has a 2-4x size improvement over ALPS. The smaller mass also provides much improved start times, stop times, power modulation, and frequency control to enable crisp, highly consistent haptic feedback.

Revel HMI have also developed a new resonating motor driver technology that is optimized to deliver maximum performance from the actuators by increasing drive precision. The driver uses a new sensor system and drive algorithm that improves resonance tracking while removing phase shift issues to improve power and efficiency. This driver innovation allows the use of highly efficient high-Q actuators to provide broad frequency response and to rapidly and accurately adapt to dynamic shifts in damping. Finally, existing haptic actuators and drivers have very limited performance, so existing haptic APIs are very simple, usually a single call with a parameter to turn a motor on for a specific time. To enable software developers to harness the full power of Revel HMI actuators, they have developed a full set of haptic APIs. These support a broad range of vibration patterns and haptic signals, and enable multiple layers of abstraction, allowing a developer to build solutions that are supported by a wide variety of hardware implementations.
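
The gap between a basic "buzz for N milliseconds" API and the richer, pattern-based abstraction described above can be pictured with a small hypothetical interface. This is not Revel HMI's actual API; the names and parameters below are invented purely for illustration.

```python
# Hypothetical haptic API sketch contrasting the basic "buzz for N ms" model
# with a pattern-based abstraction like the one described above. Not Revel
# HMI's actual API; names and parameters are invented for illustration.
from dataclasses import dataclass

@dataclass
class HapticPulse:
    frequency_hz: float   # drive frequency (ideally near actuator resonance)
    amplitude: float      # 0.0-1.0 of maximum drive
    duration_ms: int

# Basic legacy-style API: one call, one parameter.
def vibrate(duration_ms: int) -> None:
    drive_actuator(HapticPulse(frequency_hz=175.0, amplitude=1.0,
                               duration_ms=duration_ms))

# Richer pattern-based API: sequences of pulses form named effects.
EFFECTS = {
    "tap":       [HapticPulse(175.0, 1.0, 10)],
    "heartbeat": [HapticPulse(90.0, 0.8, 40), HapticPulse(90.0, 0.4, 60)],
}

def play_effect(name: str) -> None:
    for pulse in EFFECTS[name]:
        drive_actuator(pulse)

def drive_actuator(pulse: HapticPulse) -> None:
    # Placeholder for the hardware driver; a real implementation would track
    # resonance and modulate drive precisely, as the text describes.
    print(f"drive {pulse.frequency_hz} Hz @ {pulse.amplitude} for {pulse.duration_ms} ms")
```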

TruLife Optics & Ceres – Volume Diffractive Waveguides

To create transparent displays, you can use diffraction or refraction to combine two light fields in a transparent substrate, interpolate by having transparent gaps between pixels, use cameras to make a display 'see through' digitally, or position a traditional display in only a portion of the user's field of view. The latter approaches all have significant disadvantages, and the vast majority of augmented reality devices use refractive or diffractive elements. Diffractive elements are smaller, more transparent and allow exit pupil expansion, unlike refractive elements such as LentinAR, Lumus or Google 'glass' technology, making them better suited to consumer electronics. Diffractive optical elements can be created in a huge number of different ways and the physics is well understood (see Practical Holography by Graham Saxby). Volume holograms recorded in analogue media are well known to provide:

  • The best possible first order efficiency (>99.98%)

  • The ability to diffract RGB & IR in a single layer

  • The ability to be mounted on curved surfaces (for instance prescription lenses)

  • The best possible transparency (>98%)

The elements can be arranged as free space reflectors, or as waveguides if sandwiched between two sheets of high refractive index material. They have, however, traditionally been difficult to produce in high volume due to requiring expensive silver halide film, vibration-isolated optical benches and very high laser powers. TruLife Optics, Ceres Holographic, Apple's Akonia and Digilens are manufacturers of volume holograms, whereas companies such as WaveOptics, Dispelix and WayRay are all pursuing diffractive gratings with other configurations, such as surface relief etching or printing. These are typically simpler diffractive gratings rather than true holograms and are hard to apply to curved surfaces or to make diffract multiple wavelengths without cross talk. Between them, these companies are partnered with almost all of the large companies building AR glasses.

TruLife Optics in London have capitalized on the emergence of Bayer's low cost, easy to process Bayfol HX4000 photopolymer (a liquid crystal and UV curing monomer material) and low cost, high power lasers by developing a machine capable of mass producing volume holograms roll to roll. This enables the advantages of volume holograms to be realized in production for the first time. As the physics is well understood, their IP relates only to the specifics of the production machine and is not fundamental. North from Canada have an unknown partner producing a volume hologram for their 'Focals' AR glasses, Akonia were purchased by Apple, and Ceres Holographic in Scotland have also produced a machine using the same Bayfol material to produce holograms roll to roll. TruLife are also developing an on-axis retinal tracking system that uses IR light diffracted by the same hologram; its direct view of the eye increases accuracy.

Digilens – Active Volume Diffractive Waveguide

Digilens are one of the oldest optical waveguide specialists creating transparent optical elements for AR glasses and HUDs, having worked with the US military for many years. Digilens's core technology is a direct competitor to Apple's Akonia, TruLife Optics and Ceres Holographic, being based on a photopolymer and liquid crystal mixture in which the interference fringes of a reference and object beam are recorded in the refractive index pattern of the LCs, and the monomer is then cured with UV light. This allows the creation of precise diffractive input, output and expansion gratings, sandwiched between high refractive index glass or plastic to form a waveguide. The use of volume Bragg gratings allows curved lenses and multiple colours in a single layer. The reduction in layers leads to reduced cost by decreasing optical complexity. Unlike TruLife Optics and Ceres Holographic, Digilens's material is proprietary and has been further refined to provide high transparency with low haze. Volume holograms offer the potential of best in class efficiency, transparency and thickness over the surface relief hologram approaches taken by Dispelix, WayRay and WaveOptics, the refractive approach taken by Lumus, and the myriad of other transparent optical combining approaches.

Digilens's truly unique offering is active holographic waveguides. These twist the liquid crystals in response to a voltage, changing the refractive index of the elements and effectively turning them on and off. By putting each colour of light in a different layer and projecting each colour in turn, switching off the input gratings of the non-applicable layers during projection of that colour, Digilens can further reduce cross-talk noise. Current embodiments of the technology use two layers for different screen colours, as well as layering different functions such as on-axis eye tracking. Digilens claim a 40 degree FOV today and a roadmap to 150 degrees, with the limitation claimed to be in the projection engine.

Dispelix – Surface Relief Diffractive Waveguide

Dispelix are a Finnish company developing optical waveguides for AR applications. These are diffractive waveguides based on surface relief treatments rather than volume Bragg gratings. This approach is less transparent, less efficient, hard to put on a curved surface and only able to support narrow wavelength bands in each layer. However, the manufacturing challenges are less complex, as proven by Microsoft, who have used Nokia's IP to create surface relief diffractive waveguides for their HoloLens headset in great numbers, as well as licensing the design to Vuzix, who manufacture their waveguides in a factory funded by Intel and sell products in high volume under a co-branding partnership with Lenovo in the Chinese enterprise market. This technique is also used by Magic Leap. Dispelix currently offer a 0.8mm thick waveguide with a 30 deg FOV and a 16x12mm eyebox from a 5mm input pupil, thanks to the use of exit pupil expansion. Dispelix has shown that the micro-ridges can also be created with an additive process, by printing the micro-ridges onto the lens, as well as by injection molding, which is potentially 10 times cheaper than the etching processes used by competitors.

Lumus – Refractive Waveguide

The most basic waveguide structure to combine a digital display with a view of the real world is a refractive approach. Google used a very crude prism in their 'glass' product. Lumus Optics (originating from Israeli military industrial research) were the first to commercialize TIR waveguides that used refractive in and out coupling. In this structure, light from the optical engine is coupled into the waveguide through a reflective mirror. After several TIR bounces inside the glass substrate, the light encounters an array of transflective surfaces that release the image: transflective surfaces embedded at angles are designed to reflect part of the light out and transmit the rest to the next transflective surface. This process allows 1D exit pupil expansion. This does not affect the image because the exit pupil is only the Fourier plane of the virtual image, and the human eye converts the angular information from this plane to spatial information through its lens. The image on the retina merges all light with the same angle onto the same 'pixel', thus creating only one image. The ideal exit pupil for AR is >20mm, accounting for the 4mm eye pupil size, its movement in the socket, differences between people's interocular spacing, and the tolerances of the glasses and their position on the head (Why Making Good AR Displays is So Hard, Daniel Wagner).
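
A rough budget shows where a >20mm figure of this kind comes from; the individual allowances below are illustrative assumptions rather than numbers from Lumus or the cited article.

```python
# Back-of-the-envelope eyebox budget illustrating why an exit pupil of >20 mm
# is commonly quoted. The individual allowances are illustrative assumptions.
eye_pupil_mm     = 4.0   # nominal pupil diameter
eye_rotation_mm  = 8.0   # pupil translation as the eye rotates across the FOV
ipd_tolerance_mm = 6.0   # spread of interpupillary distances around the mean
frame_shift_mm   = 4.0   # glasses sliding / manufacturing placement tolerance

required_eyebox = eye_pupil_mm + eye_rotation_mm + ipd_tolerance_mm + frame_shift_mm
print(required_eyebox)  # 22.0 mm, consistent with the >20 mm guideline
```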

The key advantage of the refractive approach over simple diffractive optical elements is that it has the same properties independent of the light wavelength, avoiding any colour uniformity issues or rainbowing of environmental light incident from the back, and requiring only a single layer. Lumus's refractive waveguide utilizes conventional optical design processes, simulation tools, and manufacturing processes. Because the geometric optical structures pose no bias on colour, the resulting image can be free of chromatic aberration. Lumus Optics are currently selling a waveguide with a 55 deg FOV at 2mm thickness, though these are unlikely to be made much smaller in future. However, there are still challenges in the manufacturing process: each transflective mirror needs a different reflection to transmission ratio to guarantee a uniform light output within the eyebox, meaning many coated optical components need to be stacked and glued together, then cut at a precise angle. It is likely that other processes will be cheaper in the future. The size of the structures makes them visible, and this approach is not applicable to curved displays.

Lumus has most recently been the OEM supplier to both Daqri and Atheer. Google and Sony both have intellectual property around waveguide designs similar in nature. A series of waveguide joint-research projects have also been carried out by HiMax, Essilor and Optinvent. Essilor of France is the world's largest lens manufacturer, and recently merged with Luxottica of Italy, the world's largest eye-frames company. Many players have already taken this technology to its limits, and technology with more interesting future potential is available.

WaveOptics – Surface Relief Diffractive Waveguide

WaveOptics are a recent entrant to the market from the UK, offering surface relief diffractive waveguides for AR applications much like Dispelix and WayRay. Diffractive waveguides based on surface relief treatments rather than volume Bragg gratings are theoretically less transparent, less efficient, hard to put on a curved surface and only able to support narrow wavelength bands in each layer (often single colours). However, the manufacturing challenges are less complex. WaveOptics' standard offerings also include exit pupil expansion gratings, to be able to resize the image in two dimensions, and offer a large eyebox of 19x15mm and a FOV of 40 deg. WaveOptics have a roadmap showing improvements in both metrics, as well as curved screens and single layer full colour.

The real advantage of WaveOptics over others offering similar technology is their focus on manufacturing at scale. In November 2018, WaveOptics signed an exclusive waveguide production partnership with Goertek, a high-tech consumer electronics design and manufacturing company, to enable high volume manufacturing of WaveOptics' diffractive waveguides for AR headsets. WaveOptics have also recently begun designing the light projection engines, to further increase their ability to offer a turnkey solution.

WayRay – Wavelength Specific Mirror

Holographic optical elements can be used to create diffractive gratings suitable for in/out coupling, folding (changing direction) and exit pupil expansion in a waveguide if sandwiched between two layers of high refractive index, allowing TIR to occur. Alternatively, they can be used as wavelength selective mirrors. This is the configuration in which WayRay, a Swiss based designer and manufacturer of surface relief holograms, are using them. They are focused on the automotive market and the application of their surface relief holographic optical elements to windshields. Their first product is WayRay Navigation, a retro-fittable AR navigation system for cars that projects GPS and driver notifications while utilizing gesture and voice control inputs. This currently only uses a flat screen and has an 8 deg FOV. Navdy was the first "HUD on the dash with a spherical combiner"; the list of near direct copies and derivative products includes Aker ROAV, Carrobot & Hudway, iScout, WayRay, Lumens, Exploride and Kivic (Karl Guttag).

Using a thin holographic film on the windshield instead of a mirror system made it possible to reduce the dimensions of the device, and WayRay claim best in class eyebox, FOV, and device size. WayRay are focused on offering a turnkey solution for the automotive market which includes not only the optics, but also the projection engine, compute, a software framework including SLAM algorithms, an SDK and end user application software.

LentinAR – Refractive Waveguide

LentinAR in South Korea have developed an improvement on the refractive waveguide solution sold by Lumus that attempts to address some of Google Glass's size issues. Their system is based on an array of 15 pin mirrors embedded in a transparent substrate. These are tilted at 45 deg to send the image into the viewer's eye. Their diameter is small enough that they act as a pinhole camera and provide a large depth of field (25cm to infinity), which results in the image being in focus regardless of where your eye is focused. The mirrors can also have free-form curvature, allowing a larger diameter and therefore greater efficiency while still providing a large depth of field. The mirrors are so small that they are not particularly visible to the user; however, the image does appear to have gaps in it, and the transparency is not comparable to other optical combiners in the current embodiment.

The key advantage of the refractive approach over simple diffractive optical elements is that it has the same properties independent of the light wavelength, avoiding any colour uniformity issues or rainbowing of environmental light incident from the back, and requiring only a single layer. It is this colour uniformity issue that requires some simple diffractive elements to use multiple layers or active in-coupling optical elements (as Digilens does). In LentinAR's arrangement, each individual pin mirror shows the full display to the viewer and has a FOV of about 15 deg. The real advantage of LentinAR's arrangement over Lumus and other refractive optical elements is that they increase the number of mirrors used, each showing the same image. This means the images from different pin mirrors become visible as the eye moves, which can extend the total eyebox and increase the FOV to 80 deg in the future. This solution would result in the image not translating smoothly from one mirror to the next; the company's answer is to use aspherical optical design with tight tolerances, necessitating the use of high accuracy injection molding. An OLED microdisplay is used as the projection engine, and this can simply be mounted on the top or side of the lens. This reduces the size and complexity of the optical path, but also drives the minimum thickness of the lens itself; LentinAR lenses are currently 4.5mm thick and flat.

The University of North Carolina and Nvidia released a paper describing a similar approach in 2014 (http://pinlights.info) and the startup Kura is developing the same technology, though with additions including a hybrid uLED & MEMS scanning display.

DeepOptics – Dynamic Liquid Crystal Lenses

In AR optical systems, all digital data is displayed as an overlay applied to reality. Most AR systems display this overlay at a fixed focal plane regardless of the content's virtual depth. This discrepancy, known as Vergence-Accommodation Conflict (VAC), can frequently result in blurry vision, fatigue, nausea, eye strain and other symptoms of physical discomfort. Israeli firm Deep Optics's dynamic lens technology can be applied to bring the user's vergence and accommodation reflexes into agreement, alleviating the negative side effects and providing a superior, comfortable and immersive experience.

The lens is based on liquid crystals, a birefringent material that changes its refractive index as the molecules change their orientation under an electric voltage. By applying different voltage profiles to a liquid crystal layer, it is possible to control the refractive index. In order to achieve high precision control over the liquid crystal layer, Deep Optics has developed a unique liquid crystal panel, similar to an LCD, with a very high resolution, transparent electrode grid. This dense electrode grid enables the application of high precision voltage profiles over the liquid crystal layer, and therefore fine control of its optical properties. The result is a single panel that can implement many lens prescriptions dynamically: positive and negative prescriptions, over a large diopter range and with effectively unlimited precision (not limited to 0.25D steps). The centre of the lens can also be controlled in real time; it can be positioned anywhere on the panel and moved across the panel dynamically. As LCs are birefringent, two layers are needed, one for each polarization of light. The total light transmission is likely to be <95%.
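
The optical target such a panel has to synthesise is the standard paraxial thin-lens phase profile, phi(r) = -pi r^2 P / lambda, wrapped to 2 pi like a Fresnel lens. The sketch below computes such a wrapped phase map for a given prescription and a decentred optical centre; it is textbook optics, not Deep Optics' actual drive scheme, and the pixel pitch is an assumed value.

```python
# Standard paraxial phase profile for a thin lens of power P dioptres:
#   phi(r) = -pi * r^2 * P / wavelength, wrapped to 2*pi (Fresnel-lens style).
# Shows how a pixelated phase panel can synthesise an arbitrary, decentred
# prescription; textbook optics, not Deep Optics' control scheme, and the
# electrode pitch below is an assumption.
import numpy as np

def wrapped_lens_phase(shape=(600, 600), pitch_m=50e-6, power_dioptre=-2.25,
                       wavelength_m=550e-9, centre_px=(300, 300)) -> np.ndarray:
    """Phase (radians, wrapped to [0, 2*pi)) for each electrode/pixel."""
    ys, xs = np.indices(shape)
    x = (xs - centre_px[1]) * pitch_m
    y = (ys - centre_px[0]) * pitch_m
    phase = -np.pi * (x**2 + y**2) * power_dioptre / wavelength_m
    return np.mod(phase, 2 * np.pi)

# Example: a -2.25 D correction with its optical centre shifted to follow the gaze.
phase_map = wrapped_lens_phase(power_dioptre=-2.25, centre_px=(200, 340))
```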

When incorporated into an AR optical stack, Deep Optics technology allows the focal plane of the digital image to be adjusted, making objects appear to be in the foreground or the background relative to a given object in the real world. The integration of the Deep Optics LC layer is likely to be complex, to reduce brightness and efficiency, and to increase thickness, and success would be highly dependent on the rest of the AR stack.

Plessey Semiconductor and Jade Bird Display – Monolithic Micro LED

Micro LED based displays are very desirable for mobile phones as, compared to traditional LCD and OLED displays, they offer:

  • Higher intensity (>10x) from direct emitters

  • Higher colour accuracy from direct emitters

  • Lower power as ‘off’ pixels do not draw power

  • Higher contrast due to a combination of the above

  • Faster response times

For AR applications they offer much greater intensity than LCD and OLED, and are far more efficient and offer a much greater total brightness than DMD or amplitude LCoS based solutions. Phase LCoS and retinal laser scanning are alternative technologies at equally early stages to uLED; none have high volume applications in the market today. Display intensity is critical for optical AR applications, as the 'black' level is in fact the light coming from the real world through the glass; any augmentations must therefore be significantly brighter than this. Furthermore, large eyeboxes and FOVs require greater intensity to keep the brightness high at any given point. To maximise brightness, a frame rate of 240Hz would allow a duty cycle of 100% without causing motion blur (Why Making Good AR Displays is So Hard, Daniel Wagner). Plessey Semiconductor in the UK are developing micro LEDs to provide some of these advantages.

The key benefit of Plessey's approach is its GaN-on-Si backplane. Traditional GaN-on-Sapphire LEDs require pick-and-place arrangement into displays. Newer, smaller LEDs, spacings and tolerances for displays require new and expensive equipment (such as VueReal's). As sizes shrink and volumes grow, yield issues will make this approach infeasible both commercially and technically. Furthermore, sapphire substrates are currently 6” and scaling up will be expensive. Si substrates, on the other hand, are already large (>200mm) and result in surface- rather than volume-emitting LEDs, reducing crosstalk. Most importantly, they enable building GaN microLEDs on top of, and interconnected with, a CMOS silicon thin-film-transistor backplane, with no requirement for pick-and-place and all the driving electronics integrated. Plessey currently build displays with blue LEDs with different phosphors on top, but are also working on incorporating direct RGB emitters. In April 2019, Plessey achieved the world's first wafer-to-wafer bond between a GaN-on-Si LED wafer and a high-density CMOS backplane, requiring two million individual electrical bonds. They also sell symbolically addressed units that do not have a fully matrixed backplane but address LEDs in predefined groups (like a segment display). With a total input power of 250 mW, Plessey's native green segments can emit 2 million nits of brightness. Segment definition is full colour, with pixel capabilities down to 2 microns.

Jade Bird Display of Hong Kong have developed a monolithic micro LED manufacturing process that bonds a blank sapphire wafer coated with GaN (gallium nitride) to the CMOS silicon driver wafer, and then removes the sapphire substrate, leaving the GaN material on top of a Si substrate. This reduces thermal issues and the need for costly, time consuming micro pick-and-place, as the LEDs are produced directly on the driver circuitry matrix. JBD can currently achieve pixels as small as 2.5 microns, achieving 1080p resolution with the smallest display area of 4.8mm x 2.7mm and 1 million nits at a peak wavelength of 625nm. This brightness is several orders of magnitude higher than reported OLED micro-display performance. The 2.5-um pitch series is scheduled to be released in late 2020. However, these displays are only monochrome. Incorporating colour is a challenge, as it requires phosphor materials or transfer of individual LEDs, and different LED chemistries and structures cannot be grown together in one MOCVD chamber; JBD is working on full colour. JBD's LEDs offer -50C to 100C operating temperatures, an advantage over LC based technologies for outdoor or automotive applications.

Glo – GaN Nanowire Micro-LED Display

Micro LED displays are brighter, more efficient and faster, and have better colour accuracy and better contrast; uLEDs are <100um in size. A number of companies are developing uLED displays. In 2014 Apple acquired LuxVue Technology and is rumored to be in talks with Taiwanese display makers AU Optronics, Epistar and Epiled Technologies, though no further details are available. Plessey Semiconductor are developing monolithic GaN-on-Si uLEDs in the UK, targeting 2020 for production, while Osram in Germany and Nichia in Japan are developing micro GaN-on-Sapphire LEDs to be arranged with advanced pick-and-place machines; Nichia are targeting 2022. In China, Jade Bird Display, San'an Optoelectronics (working with Samsung), HC Semitech and Changelight also claim to be developing uLEDs, though no details are available save for JBD, who are taking a monolithic approach. Micro Nitride in Japan is developing micro UV LEDs with RGB phosphors but not offering displays. Many more suppliers are working on miniLED (>100um). None of these solutions are yet on the market, as there are numerous challenges: small, bright sources require very high power density without affecting lifetime and uniformity, and assembly, or manufacturing yield for monolithic solutions, gets very challenging at this scale.

Established in 2010 in the USA, based on fundamental research performed at Lund University in Sweden and now patented, Glo is developing GaN nanowire based micro LEDs on both CMOS and LTPS backplanes. By growing InGaN in thin columns, they avoid crystal defects, boosting quantum efficiency and increasing manufacturing yield. These are bonded to silicon CMOS, glass or flexible active backplanes using patented wafer transfer technology. Glo's approach is quite different to many of the other players and so potentially offers strong fundamental IP, though detailed performance metrics and roadmaps are not currently known.

Eyeway Vision – Direct Retinal Projection

Laser Beam Scanning (LBS) refers to the use of a collimated laser beam deflected by a two-axis MEMS scanning mirror to trace fast horizontal lines that are then stepped vertically, building up a rectangle. The individual R, G & B lasers that are beam-combined to make up the laser beam can then have their power modulated at any given point within the rectangle to create different coloured pixels. These projection engines have recently come to the market in picoprojectors, as the powers of blue and green lasers are finally able to match the powers of the red lasers available. They are small, do not require focusing optics and have unlimited depth of field, but have a number of disadvantages:

  • Poor laser diode efficiencies in blue and green

  • Low resolution and low frame rate limited by MEMS speed

  • Extremely small eye-box and FOV

  • Beam combining alignment complexity

  • Shadows on the retina due to “floaters” in the eye

  • Laser speckle

Speckle can be removed with a secondary diffusing screen, though this will require traditional imaging optics to image the screen onto an external surface. The MEMS mirror roadmaps from the key players (Opus, Microvision, ST, Hamamatsu, Microchip, Mirrorcle) do not show revolutionary improvements in LBS meaning resolution and frame rate issues will remain. The laser diode roadmaps from key players making blue and green diodes (Osram, Nichia, SLD, Sony, Panasonic) do show improvements in efficiency on >5 year time frames.

Direct retinal projection uses this projection engine to scan directly onto the retina. Doing so means only very low laser output power is required (in fact regulatory requirements fix it at <5mW) and optical efficiency can be high due to the minimal number of components, though electrical inefficiencies remain. However, due to the very small pupil and the collimation of the laser, the FOV and eyebox are very small. Intel's monochromatic Vaunt prototype used this approach, as does North (formerly Thalmic Labs) of Canada, the company that purchased the Intel IP. Both suffered from impractical FOVs (15 deg) and eyeboxes (4mm), with North's solution requiring individual customization to account for differences in interocular distance. Both Intel and North have patents on ways to increase both, including duplicating the beam and the use of secondary lenses, but none have been demonstrated.

To make eye scanning practical, Eyeway Vision are building a solution that tracks the eye using an IR laser on the optical axis of the eye capable of 1/60 degree accuracy. The scanned area is then deflected with a second MEMS mirror to follow the pupil, keeping the eye within the eyebox. The scan pattern itself is also adjusted to provide high resolution on the fovea and low resolution in the peripheral vision to help improve resolution and frame rate.

Kura – Scanning line of microLEDs for VR

There are many competing display technologies applicable to AR, for example LCoS, OLED, DMD, microLED and LBS. uLEDs stand out as offering high efficiency, high brightness, high frame rates, better colour accuracy and better contrast without introducing the speckle, resolution or FOV and eyebox issues associated with Laser Beam Scanning. However, large area uLED displays are not yet available, limiting the resolution and eyebox of companies such as LetinAR that utilize them today.

Kura use a small custom uLED screen from an undisclosed supplier in conjunction with a 1D scanning system to create a high resolution, large area display at the cost of brightness and flicker. Kura currently claim a resolution of 8K per eye, 150 deg FOV and unlimited depth of field. Their glasses use a LetinAR-like pin-reflector waveguide solution to increase the eyebox and provide a high FOV and depth of field, as well as to avoid diffraction rainbowing of stray light. However, the pin structures themselves are visible, the lenses end up thick and the approach is difficult to apply to curved surfaces.

Daqri & Envisics – LCoS based Digital Holography

True holography is defined as the recreation of a light field exactly as would be produced by light reflecting off a real object, most notably providing true representations of depth. Some types of analogue hologram, such as full colour volume holograms recorded in photographic emulsion, can create extremely lifelike recreations and have been used many times by museums to show very valuable objects. Creating an image digitally in this way is clearly the ultimate embodiment of augmented reality. Many groups have been attempting this for some time, notably the MIT Spatial Imaging Group under Dr Stephen Benton, which focused on recording volume holograms with lasers and a display in rewritable film (media.mit.edu/spi/). This work has now been taken on by the Object-Based Media Group under Dr Michael Bove (obm.media.mit.edu). These systems were large and complex, but offered very high resolution and large eyeboxes, and it would also be feasible to create true full colour holograms with this approach.

The most practical solution that still enables true digital holography is the use of Liquid Crystal on Silicon (LCoS) devices. LCoS devices are used as amplitude modulators in many of Sony’s projectors; however, with only minor alterations (to the driving electronics and liquid crystal type) they can be used as phase modulators and therefore for holography. The holograms they create are often monochromatic, as the liquid crystals, lasers and backplane mirrors are tuned for specific wavelengths, and there are efficiency losses between pixels and in the many layers of the technology stack. LCoS devices for holography are being explored by Professor Daping Chu, Director of the Centre for Photonic Devices and Sensors and Director and Chairman of CAPE at Cambridge University (cpds.eng.cam.ac.uk). Several years ago, fundamental IP around the use of these devices for holography was spun off into two startup companies, Light Blue Optics and Two Trees Photonics. Two Trees Photonics had the superior IP for AR applications, being based on analogue backplane devices using nematic liquid crystals, which allow analogue phase modulation and therefore avoid quantisation noise, increasing the diffraction efficiency into the first order (Finisar are one of the few remaining manufacturers of such devices). After releasing a holographic HUD in the Range Rover Evoque, Two Trees Photonics were purchased by Daqri. Dr Jamieson Christmas, who did his PhD at Cambridge under Professor Chu and founded Two Trees Photonics, went on to found Envisics.
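
To make the phase-modulation idea concrete, the sketch below computes a phase-only hologram with the classic Gerchberg-Saxton iteration for a single wavelength and a single replay plane. This is a textbook method, not necessarily the proprietary algorithms used by Light Blue Optics, Two Trees Photonics or Envisics.

    import numpy as np

    def gerchberg_saxton(target_intensity, iterations=50):
        """Compute a phase-only hologram whose far-field replay approximates
        the target intensity (classic Gerchberg-Saxton iteration)."""
        target_amp = np.sqrt(target_intensity)
        # Start from a random phase in the replay field
        field = target_amp * np.exp(1j * 2 * np.pi * np.random.rand(*target_amp.shape))
        for _ in range(iterations):
            # Back-propagate to the hologram (SLM) plane
            slm_field = np.fft.ifft2(np.fft.ifftshift(field))
            # Constraint at the SLM: phase-only modulation, uniform amplitude
            slm_phase = np.angle(slm_field)
            # Forward-propagate to the replay plane
            field = np.fft.fftshift(np.fft.fft2(np.exp(1j * slm_phase)))
            # Constraint at the replay plane: keep the phase, impose the target amplitude
            field = target_amp * np.exp(1j * np.angle(field))
        return slm_phase  # phase pattern to display on the LCoS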

Daqri’s automotive HUD is claimed to be 75% smaller, use 75% less energy and offer a wider FOV than its competitors. Daqri also claim integration in more than 150,000 vehicles and 284 patents pending and granted on the technology. Holography also allows multiple focal planes, and ultimately true 3D; Envisics currently claim two focal planes, at 2.5m and 20-100m. Note that Magic Leap achieve a similar effect by doubling up on static optical elements, but that approach is not inherently scalable. Phase LCoS has also been theorized to improve the efficiency of a traditional amplitude-modulating projector by >90% (‘The Path to 100 lm/W in Embedded Projection’ by Adrian Cable et al. of Light Blue Optics).

Compute

Neura – AI for Contextual Awareness

The term artificial intelligence is used to describe any machine that mimics the cognitive processing of humans. Machine learning is the subset of these techniques in which performance is learned from data rather than explicitly programmed. It excels at processing high volumes of unstructured data and is able to improve over time without human intervention.

Neura provide cloud-based AI processing as a service which can be implemented in a company’s smartphone app. The platform is designed to provide contextual information about each user. By analyzing data from existing smartphone sensors (GPS, accelerometer, etc.), Neura are able to provide a simple status and confidence level through push or pull methods.

Predictive contextual alerts provided by Neura with a confidence level include:

  • userWokeUp

  • userLeftHome

  • userStartedTransitByWalking

  • userStartedWorkout

  • userArrivedToWork

It is clear that these alerts could be used to build more intelligent and personally useful interactions with consumer electronics in a huge number of ways. Neura are also able to leverage the data that builds up over time to provide highly detailed, constantly updated personas for each user. Behavior traits assigned include:

  • Late sleeper, long sleeper, early riser, night owl

  • Fitness enthusiast, dabbler or active person 

  • Busy person

  • Easy going

  • Works on weekends

  • Home type

  • Hard worker

  • Morning commuter

  • Shopper

  • Socialite

This information would be invaluable as part of the design process for future products and services. Beyond this, it also allows software and hardware to customize its functionality to provide more user value. Wellbeing data for individual days can also be returned, including:

  • Minutes walked

  • Steps

  • Calories

  • Time in bed, sleep duration, deep sleep, light sleep, sleep efficiency, wakeup time
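
To make the integration concrete, the sketch below shows how an app might react to one of the pushed alerts listed above. The endpoint, payload shape and field names are assumptions made for illustration; they are not Neura's documented API.

    # Hypothetical handler for a pushed Neura-style event; the payload fields
    # (eventName, confidence) are assumed for illustration only.
    from flask import Flask, request

    app = Flask(__name__)

    ACTIONS = {
        "userWokeUp": lambda: print("Start the coffee machine and raise the blinds"),
        "userLeftHome": lambda: print("Arm security and drop the thermostat"),
    }

    @app.route("/neura-webhook", methods=["POST"])
    def handle_event():
        event = request.get_json()
        name = event.get("eventName")
        confidence = event.get("confidence", 0.0)
        # Only act on high-confidence predictions to avoid annoying the user
        if confidence >= 0.8 and name in ACTIONS:
            ACTIONS[name]()
        return "", 204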

Lightform – Projector Mapping for AR

Augmenting reality visually runs into a number of physical limits that make high resolution, high brightness, wide eyebox and wide FOV very challenging in small spaces. The design of AR glasses therefore involves a huge number of complex tradeoffs within the system. Projectors that light external surfaces in the real world do not have such constrained size and weight requirements and so are far easier to produce. Projectors today are high resolution, high brightness, high contrast and have an excellent colour gamut. Flicker, energy efficiency and colour quality (a narrow spectrum allows a wide colour gamut but renders environmental colour very poorly) remain problems for applications beyond imaging onto screens, so use cases need to be selected carefully. Lightform offer a sensor and software tool that allow projection mapping onto complex 3D surfaces that is responsive in real time.

LF1 is a wireless 3D scanner and media player that mounts to your projector. Its built-in camera scans your scene with structured light to build a depth map. Lightform Creator software uses the scan data in real time to make the visuals interactive, and the LF1’s on-board computer plays back your video along with real-time generative effects. The advantage is in enabling any projector to be used as a room-scale AR device quickly and easily. Reactive shop window displays are the most common application today, but instructions on the kitchen worktop or a virtual desktop overlaid on the real office desk are equally applicable.
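
Structured-light scanning of this kind generally works by projecting a sequence of coded patterns and decoding which projector column each camera pixel sees. The sketch below decodes Gray-coded binary patterns as one common scheme; it is not necessarily the method the LF1 uses.

    import numpy as np

    def decode_gray_code(captures, threshold=128):
        """Recover the projector column index seen by each camera pixel.

        captures: list of camera images (H x W, uint8), one per projected
        Gray-code bit plane, most significant bit first.
        """
        bits = [(img > threshold).astype(np.uint32) for img in captures]
        # Convert Gray code to binary: b0 = g0, b_i = b_{i-1} XOR g_i
        binary = [bits[0]]
        for g in bits[1:]:
            binary.append(binary[-1] ^ g)
        # Assemble the bit planes into an integer column index per pixel
        index = np.zeros_like(binary[0])
        for b in binary:
            index = (index << 1) | b
        return index  # H x W map of projector columns, usable for depth via triangulation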

Wrnch – Depth tracking from a 2D camera

3D and depth information today is supplied by specialized sensors including time-of-flight, LiDAR, structured light and stereo vision. The use of these sensors adds to the BOM as well as to the electronics and compute overhead required.

Canada’s Wrnch have developed an AI platform that extracts human behavior from any video source. The engine analyzes regular 2D RGB video and returns 3D motion data of the people in the scene in real time. A series of convolutional neural networks (CNNs) analyzes the 2D video and tracks up to 63 body parts, including each finger and toe. The first engine processes video from a camera and separates the people from the background. The second creates a 2D skeleton on top of the people in the video. Finally, the 2D skeleton serves as a set of landmarks from which the 3D pose is extracted and filled out. This process works across most camera types, lighting conditions, clothing types, body shapes and viewpoints. The engine is capable of assigning individuals unique tracking IDs and tracking them through 3D space over time, as well as recognizing human motion and gestures. Under development are further improvements in gesture recognition (pointing, thumbs up, fist, waving), activity recognition (fall detection, picking up and putting down items), as well as multi-camera analytics including triangulation of object positions, persistent tracking and 3D sensors.

Wrnch Engine is delivered as compiled libraries. Wrnch provide a simple SDK that is easy to integrate into C++ and Python applications running on Windows and Linux. No special hardware is required (consumer grade cameras and GPUs are acceptable).
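
The three-stage flow described above can be pictured with the structural sketch below. The function and type names are stand-ins chosen for illustration and do not correspond to the actual Wrnch SDK.

    # Structural sketch of the described pipeline; stub stages stand in for
    # the CNNs, and none of these names come from the real Wrnch SDK.
    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class Pose:
        person_id: int
        joints_2d: np.ndarray   # (63, 2) pixel coordinates
        joints_3d: np.ndarray   # (63, 3) estimated coordinates

    def segment_people(frame: np.ndarray) -> list[np.ndarray]:
        """Stage 1: separate people from the background (stub)."""
        return [frame]  # pretend a single person fills the frame

    def estimate_2d_skeleton(person_crop: np.ndarray) -> np.ndarray:
        """Stage 2: place a 2D skeleton over the person (stub)."""
        return np.zeros((63, 2))

    def lift_to_3d(joints_2d: np.ndarray) -> np.ndarray:
        """Stage 3: use the 2D skeleton as landmarks to infer the 3D pose (stub)."""
        return np.concatenate([joints_2d, np.zeros((63, 1))], axis=1)

    def process_frame(frame: np.ndarray) -> list[Pose]:
        poses = []
        for person_id, crop in enumerate(segment_people(frame)):
            j2d = estimate_2d_skeleton(crop)
            poses.append(Pose(person_id, j2d, lift_to_3d(j2d)))
        return poses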

Chipintelli and Syntiant – Local speech recognition chips

As the use of voice control grows in smart home, smartphone and wearable devices, systems that convert audio into commands a computer can understand become more important. However, audio is a very unstructured data source, and its capture, storage and processing come with significant privacy concerns for users. All the big consumer electronics companies are now developing smart assistants based on this technology; most of these solutions currently rely on keyword detection performed locally on a chip. Once the keyword is detected, a segment of audio is captured and sent to the cloud for processing.

China’s Chipintelli develops AI chips and related products for local automatic speech recognition (ASR) based on deep learning. It offers three major product lines: CI1006, a bare-bones chip for ASR; Dual MIC, a dual-microphone intelligent speech recognition system; and Single MIC, a single-microphone speech recognition system. All three options include Chipintelli’s Brain Networking Processing Unit, which can support parallel computing of general-structure deep neural networks with very high efficiency and low power consumption. Some of the company’s key claims include low power consumption of 0.1W to 0.4W, speech recognition response times of 200ms to 800ms, support for a large vocabulary and complex sentences, and recognition accuracy of 95%-97%. The advantages of local processing more generally are overcoming the slow response times, weak security and some of the privacy concerns associated with the cloud, as well as allowing offline usage.

Syntiant in the USA are developing a 20 tera-operations-per-watt chip using 4-8 bit precision to speed up voice recognition operations. It uses an array of hundreds of thousands of NOR cells to compute TensorFlow neural network jobs in an analogue way, meaning it performs the matrix-vector operations central to all major neural networks with ultra-low power, fully parallel computation by taking advantage of a custom architecture at the transistor level and keeping the weights for matrix operations in memory. In doing so, it runs deep learning processes 50x more efficiently than traditional stored-program architectures. Its microwatt-level power requirements are low enough to allow OEMs to build inference into earbuds, smart sensors and mobile devices. The chip is able to handle a 500,000-parameter neural network, and existing network designs can be modified slightly so that one chip with a single DNN design can handle most tasks.

Mythic AI and ThinCI – AI chips for edge processing

Artificial intelligence today often refers to the use of machine learning algorithms to perform tasks that have traditionally been hard for computers, such as processing video input to extract a semantic understanding of a scene. This processing is often done on servers in the cloud as the calculations can be complex; however, this approach requires the regular transfer of large amounts of data, which demands expensive, high power, high bandwidth technologies, comes with associated privacy and security issues, and adds latency. By optimizing chips for AI algorithms, it is possible to run AI processes locally. Apple, Google and Facebook have all already taken steps to add custom silicon for AI processing into smartphones and data centers.

USA’s Mythic IPUs (intelligence processing units) are a new architecture for running AI at the edge, locally. They offer large advantages in power, performance and cost. Key features include lowest latency (single-frame delay by running a batch size of 1), highest performance per watt (>4 TOPS/W), hyper-scalability (from low-power single chips to high-performance rack systems), ease of use (support for major platforms including TensorFlow) and topology-agnostic performance.

The Mythic IPU features an array of tiles. Each tile has a large analog compute array that stores the bulky neural network weights, local SRAM for data being passed between the neural network nodes, a single-instruction multiple-data (SIMD) unit for processing operations not handled by the analog compute array, and a nano-processor for controlling the sequencing and operation of the tile. The tiles are interconnected with an efficient on-chip router network, which carries the dataflow from one tile to the next. On the edge of the IPU, off-chip connections link to either other Mythic chips or to the host system via PCI-E.

In essence, this model eliminates the large, costly general-purpose processor with its three different types of memory (L1 cache, DRAM, SSD) and moves the deep learning computation into the memory structures. Storing the neural network weights in memory arrays and adding local compute to each array allows each to act as an individual ‘graph node’ in an AI system and process data directly next to its memory. This achieves an enormous memory with the same performance and efficiency as L1 cache (or even register files) and allows fully parallel processing. It is possible because the execution flow of a neural network is deterministic, so the location of data in memory can be controlled strategically instead of relying on a cache. The use of memory also allows for analogue computing, meaning compute occurs directly inside the memory array itself. By using the memory elements as tunable resistors, supplying the inputs as voltages and collecting the outputs as currents, it is possible to perform the core neural network matrix operation of multiplying an input vector by a weight matrix. This eliminates memory movement for the neural network weights, since they are used in place as resistors, and allows hundreds of thousands of multiply-accumulate operations to occur in parallel, creating a high-performance yet highly efficient system.
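
The core operation these analogue arrays accelerate is a plain matrix-vector multiply. The sketch below models it with ideal Ohm's-law conductances and Kirchhoff current summation, ignoring noise, drift and ADC quantisation, simply to show why in-memory analogue compute maps so naturally onto neural network layers.

    import numpy as np

    # Weights stored in place as conductances of the memory cells (siemens),
    # inputs applied as voltages on the rows, outputs read as summed currents
    # on the columns (Kirchhoff's current law). Idealized model: no noise,
    # drift or ADC quantisation.
    rng = np.random.default_rng(0)
    conductances = rng.uniform(0.0, 1e-6, size=(256, 64))   # one layer's weight matrix
    input_voltages = rng.uniform(0.0, 1.0, size=256)         # activations from the previous layer

    # Each column current is sum_i V_i * G_ij: a full matrix-vector multiply
    # performed in a single analogue step, with no weight movement.
    column_currents = input_voltages @ conductances

    # The digital equivalent the array replaces:
    assert np.allclose(column_currents, conductances.T @ input_voltages)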

USA’s ThinCI have developed a proprietary Graph Streaming Processor (GSP) computing architecture to handle large data workloads at the edge. The main differentiators between this chip and its competitors are its focus on power efficiency, scalability and ease of programming. The chip is designed to process graph data structures created in real time; graph data structures connect data through nodes and edges and are used in many big data applications. The first chip ThinCI plans to release will have 16 processor cores capable of 6.5 tera-operations per second at 8-bit precision at 1.1W. When compared with the Nvidia Tesla P4 (a GPU designed for deep learning inference), the ThinCI chip processed 48x more images per watt in the same timeframe. The chip is instruction based like a CPU; however, similar to a field-programmable gate array, the GSP is re-programmable and capable of allocating chip space to different functions, such as processing multiple sensors. ThinCI’s software architecture is designed to be easy for developers to adopt. To that end, the company has built a custom compiler and SDK to provide an interface between the hardware and popular machine learning frameworks such as TensorFlow, Caffe and OpenVX. The compiler automatically generates the graphs and is auto-parallelizing, relieving the developer of the need to encode graph structures themselves.
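
For a minimal picture of the kind of structure such a compiler produces, the sketch below represents a tiny network as a dataflow graph of nodes and edges and executes it in dependency order. It is a generic illustration, not ThinCI's internal graph format.

    # Toy dataflow graph executed in dependency (topological) order.
    # Generic illustration only; not ThinCI's internal representation.
    import numpy as np

    graph = {
        "input":  {"op": lambda: np.ones((1, 8)),            "deps": []},
        "dense":  {"op": lambda x: x @ np.full((8, 4), 0.5), "deps": ["input"]},
        "relu":   {"op": lambda x: np.maximum(x, 0.0),       "deps": ["dense"]},
        "output": {"op": lambda x: x.sum(),                  "deps": ["relu"]},
    }

    def run(graph, node, cache=None):
        """Evaluate a node after streaming results from its dependencies."""
        cache = {} if cache is None else cache
        if node not in cache:
            inputs = [run(graph, dep, cache) for dep in graph[node]["deps"]]
            cache[node] = graph[node]["op"](*inputs)
        return cache[node]

    print(run(graph, "output"))   # 16.0: 8 inputs * weight 0.5 * 4 columns, summed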

State-of-the-art neural networks often employ many thousands of neurons and millions of weights, making the memory reduction and processing efficiency achieved here very significant. AI-on-chip could enable desktop-GPU capabilities in a module the size of a shirt button.

Brodmann17 – AI software for edge processing

Brodmann17 develop machine learning software for object detection and localization targeted at embedded devices.

The company’s detection accuracy is on par with industry standards, yet they utilize only 5% of the resources (compute, memory and training data) typically required, while speeding up local processing of images and video by 20x. Using proprietary techniques, Brodmann17’s deep learning architecture generates smaller neural networks that are faster and more accurate than any other network available today. The technology is able to perform real-time face detection at 25 frames per second on a single Samsung Artik A15 processor core. On the CEVA-XM platform they can achieve state-of-the-art accuracy at 100 frames per second, 170% better performance than the same software running on the NVIDIA Jetson TX2 AI supercomputer.

Steve Humpston

Researcher, designer, engineer

https://www.pushbutton.design