Sometimes the best place to showcase the potential of a bold, world-changing technology is a flower garden. Take the case of Ofer Dekel, for example. He manages the Machine Learning and Optimization group at Microsoft’s research lab in Redmond, Washington. Squirrels often devoured flower bulbs in his garden and seeds from his bird feeder, depriving him and his family of blooms and birdsong.
To solve the problem, he trained a computer-vision model to detect squirrels and deployed the code onto a Raspberry Pi 3, an inexpensive, resource-constrained single-board computer. The device now keeps watch over his backyard and triggers his sprinkler system whenever the vermin pounce. “Every hobbyist who owns a Raspberry Pi should be able to do that,” said Dekel. “Today, very few of them can.”
Dekel, an expert in machine learning, is aiming to solve that problem. He leads a multidisciplinary team of about 30 computer scientists, software engineers and research interns at Microsoft’s research labs in Redmond and Bangalore, India, that is developing a new class of machine-learning software and tools to embed artificial intelligence onto breadcrumb-sized computer processors. Early previews of the software are available for download on GitHub.
The project is part of a paradigm shift within the technology industry that Microsoft CEO Satya Nadella recently described during his keynote address at the company’s Build 2017 conference in Seattle. “We’re moving from what is today’s mobile-first, cloud-first world to a new world that is going to be made up of an intelligent cloud and intelligent edge,” he said.
Intelligent edge
Creating the intelligent edge is a step toward realizing the promise of a world populated with tiny intelligent devices at every turn – embedded in our clothes, scattered around our homes and offices and deployed to perform tasks such as anomaly detection and predictive maintenance everywhere from car engines and elevators to operating rooms and oil rigs.
Today, these types of devices mostly work as sensors that collect and send data to machine-learning models running in the cloud. “All the processing requires a lot of compute, it requires a lot of storage,” said Shabnam Erfani, director of business and technical operations for Microsoft’s research lab in Redmond. “You can’t fit all that hardware into a low-cost embedded device.”
Dekel and his colleagues are aiming “to do the impossible,” she added. “To shrink and make machine learning so much more efficient that you can actually run it on the devices.”
These intelligent devices are part of the so-called Internet of Things, or IoT, except that these things are intended to be smart, or intelligent, even without an Internet connection.
Pushing machine learning to edge devices reduces bandwidth constraints and eliminates concerns about network latency, which is the time it takes for data to travel to the cloud for processing and back to the device. On-device machine learning also limits battery drain from constant communication with the cloud and protects privacy by keeping personal and sensitive information local, Varma noted.
The researchers imagine all sorts of intelligent devices that could be created with this method, from smart soil-moisture sensors deployed for precision irrigation on remote farms to brain implants that warn users of impending seizures so that they can get to a safe place and call a caregiver.
“If you’re driving on a highway and there isn’t connectivity there, you don’t want the implant to stop working,” said Varma. “In fact, that’s where you really need it the most.”
Top down
The team is taking top-down and bottom-up approaches to the challenge of deploying machine-learning models onto resource-constrained devices.
The top-down approach involves developing algorithms that compress machine-learning models trained for the cloud to run efficiently on devices such as the Raspberry Pi 3 and Raspberry Pi Zero.
Many of today’s machine-learning models are deep neural networks, a class of predictors inspired by the biology of human brains. Dekel and his colleagues use a variety of techniques to compress deep neural networks to fit on small devices. A technique called weight quantization, for example, represents each neural network parameter with only a few bits, sometimes a single bit, instead of the standard 32 bits.
“We can cram more parameters into a smaller space and the computer can churn through it much, much faster,” said Dekel.
To illustrate the difference, he played a video comparing two versions of a state-of-the-art neural network, one with quantization and one without, each trained for computer vision and deployed on a Raspberry Pi 3: The models are equally accurate, but the compressed version runs about 20 times faster.
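For readers who want to see the idea of weight quantization in miniature, here is a rough sketch in plain Python. It is not the team's algorithm. It assumes a simple uniform scheme (evenly spaced levels between the smallest and largest weight) and, for the single-bit case, a sign-plus-shared-scale scheme. The function names `quantize_uniform`, `dequantize`, `binarize` and `packed_size_bytes` are made up for this illustration.

```python
def quantize_uniform(weights, bits):
    """Map each float weight to an integer code in [0, 2**bits - 1].

    Assumption: levels are evenly spaced between min(weights) and max(weights).
    Returns the codes plus the offset and step needed to reconstruct values.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    # Avoid a zero step when every weight is identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Reconstruct approximate float weights from integer codes."""
    return [lo + c * scale for c in codes]


def binarize(weights):
    """Single-bit variant: keep only the sign of each weight.

    Assumption: one shared scale (the mean absolute weight) is stored
    alongside the signs so magnitudes are roughly preserved.
    """
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, alpha


def packed_size_bytes(n_weights, bits):
    """Bytes needed to store n_weights codes packed at `bits` bits each."""
    return (n_weights * bits + 7) // 8


if __name__ == "__main__":
    weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
    codes, lo, scale = quantize_uniform(weights, bits=2)
    print("2-bit codes:", codes)
    print("reconstructed:", dequantize(codes, lo, scale))
    print("1-bit signs and scale:", binarize(weights))
    for bits in (32, 2, 1):
        print(bits, "bits ->", packed_size_bytes(1000, bits), "bytes per 1,000 weights")
```

The trade-off is visible in the reconstructed values: fewer bits mean a coarser approximation of each weight, which is why real systems pair quantization with training methods designed to preserve accuracy.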
Early previews of these compression and training algorithms are available for download on GitHub. The team is also working on tools that will enable hobbyists, makers and other non-machine-learning experts to navigate the end-to-end process of collecting and cleaning data, training the models and deploying them onto their devices.
“Giving these powerful machine-learning tools to everyday people is the democratization of AI,” said Saleema Amershi, a human-computer-interaction researcher in the Redmond lab. “If we have the technology to put the smarts on the tiny devices, but the only people who can use it are the machine-learning experts, then where have we gotten?”
Another compression technique being investigated by the team is pruning, or sparsification, of neural networks to remove redundancies, which should result in faster evaluation times as well as the ability to deploy onto smaller computers, such as the ARM Cortex M7.
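Pruning can be sketched in a few lines as well. The example below assumes magnitude-based pruning, one common criterion in which the weights closest to zero are treated as redundant and removed. The article does not say which criterion the team uses, and the names `prune_by_magnitude`, `to_sparse` and `sparse_dot` are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Assumption: small-magnitude weights contribute least and can be dropped.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def to_sparse(weights):
    """Store only the surviving weights as (index, value) pairs."""
    return [(i, w) for i, w in enumerate(weights) if w != 0.0]


def sparse_dot(sparse_weights, inputs):
    """Dot product that skips pruned weights, so it does less work."""
    return sum(w * inputs[i] for i, w in sparse_weights)


if __name__ == "__main__":
    weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]
    pruned = prune_by_magnitude(weights, sparsity=0.5)
    sparse = to_sparse(pruned)
    print("pruned:", pruned)
    print("stored entries:", len(sparse), "of", len(weights))
    print("output:", sparse_dot(sparse, [1.0] * len(weights)))
```

Both benefits the paragraph mentions show up here: the sparse form stores fewer numbers, and the dot product touches only the weights that remain.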
Bottom up
All this compression work will only make existing machine-learning models 10 to 100 times smaller. To deploy machine learning onto Cortex M0s, the smallest of the ARM processors – they are physically about the size of a red-pepper flake and Dekel calls them “computer dust” – the models need to be made 1,000 to 10,000 times smaller.
“There is just no way to take a deep neural network, have it stay as accurate as it is today, and consume 10,000 [times] less resources. You can’t do it,” said Dekel. “So, for that, we have a longer-term approach, which is to start from scratch. To start from math on the white board and invent a new set of machine-learning technologies and tools that are tailored for these resource-constrained platforms.”
The bottom-up approach starts from the tiny end of the spectrum, where team members are focused on building a library full of training algorithms, each tuned to perform optimally for a niche set of scenarios: one for applications such as the brain implant, for example, and another for detecting anomalies in equipment such as a jet engine to predict when maintenance is required.
The smallest device that the group has focused on is the Arduino Uno, a severely resource-constrained single-board computer with only 2 kilobytes of RAM. The algorithms train machine-learning models for tasks such as answering yes-or-no and multiple-choice questions, predicting likely target values and ranking a list of items.
These models are inspired by cloud-based systems, but they are being re-engineered in ways that shrink the amount of data learned, reduce computational complexity and limit memory requirements yet maintain accuracy and speed.
“Ultimately, you get predictions that are almost as accurate as (cloud-based neural networks) but now your model size is very little so you can deploy it for a few kilobytes of RAM,” explained Varma.
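A little arithmetic puts a budget of a few kilobytes in perspective. The sketch below counts how many parameters fit in the Arduino Uno's 2 kilobytes of RAM at different bit widths, and how large a model would remain after the shrink factors mentioned earlier. The 10-million-parameter starting point is a made-up figure chosen only for illustration, and the calculation ignores the RAM a real program also needs for its inputs, stack and intermediate results. The function names are hypothetical.

```python
UNO_RAM_BYTES = 2 * 1024  # 2 kilobytes of RAM, as stated for the Arduino Uno


def max_parameters(ram_bytes, bits_per_param):
    """Upper bound on parameters that fit if RAM held nothing but the model."""
    return (ram_bytes * 8) // bits_per_param


def model_bytes(n_params, bits_per_param):
    """Bytes needed to store n_params at the given width, rounded up."""
    return (n_params * bits_per_param + 7) // 8


if __name__ == "__main__":
    for bits in (32, 8, 1):
        print(f"{bits:>2}-bit parameters that fit in 2 KB:",
              max_parameters(UNO_RAM_BYTES, bits))

    # Hypothetical cloud-scale model: 10 million 32-bit parameters.
    full_size = model_bytes(10_000_000, 32)
    for factor in (10, 100, 1_000, 10_000):
        shrunk = full_size // factor
        fits = "fits" if shrunk <= UNO_RAM_BYTES else "does not fit"
        print(f"{factor:>6}x smaller: {shrunk:>9} bytes ({fits} in 2 KB)")
```

Under these assumed numbers, even the largest shrink factor leaves the model bigger than the Uno's RAM, which gives a sense of why models for this class of device are designed from scratch rather than compressed.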
A prototype device in development to showcase the potential of this research is an intelligent walking stick for Varma, who is visually impaired, that can detect falls and issue a call for assistance. Another potential application is an intelligent glove that can interpret American Sign Language and voice the signed words through a speaker.
“I like helping people with their impairments be more productive and better integrated into society,” said Varma.
Imagining the future
The drive to embed AI on tiny devices is part of a broader initiative within Microsoft’s research organization to envision technologies that could be pervasive a decade from now. For Dekel and colleagues, this is a world filled with intelligent, secure devices built with tools that are accessible to anyone with an idea and desire to make it.
For now, the research project is serving the maker community – people who have problems such as Dekel’s with squirrels and a vision to solve them with homemade technology. Other makers are domain experts such as a swimmer who wants to train a fitness band to count laps and distinguish between freestyle, breaststroke and butterfly.
Varma envisions a role for these makers throughout industry as well, developing intelligent, secure devices optimized for anomaly detection and predictive maintenance tasks. “Fixing something when it is broken,” he noted, “is much more costly than identifying the problem before the break.”
Only a few of these devices will ever exist if the work of making them is left to the relatively small population of computer scientists with PhDs in artificial intelligence, noted Amershi. She is working on interfaces and other tools to reduce the complexity and monotony of training and deploying machine-learning models onto edge devices so that makers of all kinds can be productive.
“Machine learning is not a one-shot thing,” she said. “It’s an art. It takes some effort, some iterations, to steer these machine-learning models to do the thing you want them to do.”