On Google, Tensor, and the age of ambient computing

The groundwork was laid in 2016, but the real work is just getting started

On September 24, 2016, Hiroshi Lockheimer sent out a tweet predicting that, eight years on, we would still be talking about October 4, 2016.

On the surface, Lockheimer, SVP of Platforms and Ecosystems at Google, was referring to a new family of hardware made by Google itself. Ten days later, on October 4, the company announced the first of its Made by Google products, which would eventually come to encompass brands like the Google Pixel line of smartphones, Google Nest-branded smart home devices (read: thermostats, displays, routers), and other devices.

At the time, hardware was new territory for the company, which had built its reputation on software expertise. Sure, it had worked with hardware before. Google sold Nexus devices - reference smartphones and tablets used to showcase the newest features in Android. But leading up to 2016, Google only made Android, not the devices it ran on. During the life of the Nexus program, the company worked with various third-party ODMs (original design manufacturers) to actually build its devices. Samsung, LG, Huawei, Motorola and Asus all partnered with Google to realize the hardware for Nexus phones and tablets.

So when Google announced its new Pixel smartphone on October 4, 2016, people were cautiously optimistic. This was the first time Google had started to vertically integrate a smartphone - controlling both the hardware and the software that runs on it. And as Apple had shown with the iPhone for years prior to the launch of the Pixel, vertical integration almost always leads to a more cohesive overall experience.

But even after the launch of the Pixel, Apple still had something Google didn’t - it designed its own processors. As much of a leap as it was for Google to start branding and selling its own devices, total vertical integration can’t happen unless you control the hardware, the software, and the silicon that binds them. And with the launch of the Pixel, Google was still constrained by the priorities of Qualcomm, the company that designed the chips actually powering its devices. Even though its devices now had Google branding on them, the Pixel wouldn’t truly be a phone Made by Google until Google designed its processors in-house.

Now, in September 2021, it’s been nearly five years since Lockheimer sent that tweet. Google is on the cusp of releasing its sixth-generation smartphones, the Pixel 6 and the Pixel 6 Pro, alongside what came as a surprise to many: an in-house processor that finally completes Google’s vertical integration efforts. A processor called Tensor.

But I don’t think Lockheimer’s tweet was about the Pixel, or Tensor, or vertical integration at all. I believe the real reason he knew we’d be talking about October 4, 2016 all these years later is something much more fundamental: a total shift in how we interact with technology. And it starts with a virtual assistant embedded in nearly every device Google unveiled that day.

Ambient computing - the next frontier

When we interact with technology, it can take a variety of inputs to extract the information we want. Operating a smartphone can require sight, touch, speech and hearing. And the friction between us and our technology is determined by the number of inputs and the time-to-output required to extract information or value from the devices we use.

It follows, then, that we should work to reduce the friction between ourselves and our technology as much as possible. A common rule of thumb in interaction design is that every additional required input makes a user dramatically less likely to finish using the system. Sticking with the smartphone example, looking up a song playing in your coffee shop requires unlocking your phone, opening your browser, searching the lyrics, and reading the result off your screen. That’s a lot of inputs, and often it’s enough friction that a user never bothers to do the search at all.

So how can we reduce the inputs required to execute that search? In 2017, Google unveiled a solution called Now Playing. Now Playing used the always-on display of the Pixel 2 to ambiently show the title and artist of the music playing around you. What once required multiple inputs to compute and extract the information you wanted was reduced to a single glance. The friction of the system became almost nothing, and the computing was done for you, ambiently.

Now Playing is a feature of the Google Assistant, the virtual assistant I alluded to earlier. The goal of a virtual assistant is to reduce the friction of interacting with technology as much as possible, by cutting down on the inputs required to execute tasks and extract information from your devices. Aside from telling you what song is playing at your coffee shop, the Google Assistant can screen the phone calls you receive, remind you when to leave to get to your flight on time, and even make reservations for you at a restaurant.

This is what we would call ambient computing: technology working around you in the most frictionless way possible. The purest form of ambient computing is the Now Playing feature I mentioned earlier, with no inputs required to surface information you might want. Of course, most computing needs at least one input to produce an output. But instead of using your phone to type a search query into your browser, the Google Assistant allows you to simply ask the question with your voice.

But for ambient computing to work effectively, we need computers capable of executing these tasks to exist all around us. If we shift from primarily computing with our screens to mostly performing computational tasks with our voices, it follows that we would need devices capable of receiving that vocal input to be everywhere. That’s why on October 4, 2016, Google started the process of sneaking the Google Assistant into as many devices as possible.

Why ambient computing?

Google is an ads business. As much as it may seem like Google primarily dabbles in search, email, or even video streaming, the core business that keeps the infinite money machine brrrr-ing along is the ads Google serves to its users. So when Google creates a new product, whether it’s physical hardware or virtual software, there’s almost always a hook that leads back to serving you an ad.

But Google didn’t build its empire by simply letting its search engine funnel in money. The company has always experimented with new ways of serving you information, and as a result, new ways of serving you ads. Google needs to be where the people are, and the best way to make sure it stays there is to figure out how people will interact with technology in the future, before they get there. If Google can dictate how we compute five years down the line, it can build a monetization model around that interaction.

In 2012, Google announced Google Glass, a pair of smart glasses that could display search results, guide you with Google Maps, and even record video; it began shipping to early adopters the following year. At the time, this was seen as a logical next step in how we use technology. Instead of pulling out a separate device to perform tasks, you could simply glance at the display within the glasses to extract the information you needed.

Unfortunately, Google Glass eventually failed, primarily due to privacy concerns. There were reports of users getting booted from movie theaters because no one could tell whether they were recording, and drivers were even pulled over because highway patrol officers thought they were watching content behind the wheel.

And even though Google Glass didn’t work out, the sentiment of what it represented for the future of ambient computing persisted. Other wearables like smartwatches and fitness trackers eventually took over, existing as a more functional replacement for things we already used in our daily lives. Instead of simply telling us the time, these wearable computers now gave us useful information about our health, while also displaying things like notifications in an ambient fashion.

But as useful as wearables would become, they still didn’t represent the most frictionless way to interact with our devices. They still had screens we had to touch or view to get information from. Ambient virtual assistants have always been seen as the best possible way to interact with our technology, especially when they react in a natural and human way. If we could simply ask a question and get an answer, we could continue living our lives as we normally would.

So when Google unveiled the Google Assistant alongside its now-defunct messaging service Allo in May 2016 (may it rest in peace) and later integrated it directly into the Google Pixel smartphone and Google Home smart speaker in October of that same year, no one was surprised. Voice assistants weren’t a purely new concept, with Siri landing on the iPhone in 2011 and Amazon debuting its own Alexa voice assistant three years later in 2014. But Google was uniquely positioned to take advantage of voice assistants in a way no other company could. Google’s services have given the company more raw data than nearly any other company on the planet. And if there’s one thing virtual assistants LOVE, it’s data.

Google may not have been the first company to recognize the benefits of voice assistants, but it may be the most well-suited to take advantage of them. After all, voice assistants are built on machine learning - the process of training a model on huge data sets so that it can map a new input to the most likely output, with a measurable level of confidence that the output is correct. Google has collected massive data sets from nearly every product it makes. Google Photos and Google Images give it huge image sets to reference, and products like Google Translate help Google better understand natural language. Merge that with the fact that Google itself makes TensorFlow, one of the most popular machine learning frameworks around, and you’ve got the perfect combination to create one of the most sophisticated virtual assistants on the block.
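
To make that train-then-predict loop concrete, here’s a minimal, hypothetical sketch of the kind of workflow TensorFlow enables: fit a small model on labeled data, then ask it to classify a new input and report its confidence. The synthetic dataset, layer sizes and training settings below are purely illustrative assumptions, nothing like the models Google actually ships.

```python
import numpy as np
import tensorflow as tf

# Illustrative only: a tiny synthetic dataset of 2D points in two classes.
# Real assistant models train on vastly larger speech and text corpora.
rng = np.random.default_rng(seed=0)
x_train = rng.normal(size=(200, 2)).astype("float32")
y_train = (x_train[:, 0] + x_train[:, 1] > 0).astype("int32")

# A small feed-forward classifier built with the Keras API.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(2,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=20, verbose=0)

# Inference: the trained model maps a new input to class probabilities,
# i.e. an answer plus a level of confidence in that answer.
probs = model.predict(np.array([[0.5, 0.8]], dtype="float32"))
print("predicted class:", int(probs.argmax()), "confidence:", float(probs.max()))
```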

But to transition to this ambient computing future, Google needed its voice assistant to be everywhere. Google couldn’t rely on you running to another room to find a Google Assistant beacon. To truly minimize the friction of computing, users needed to be able to speak out into the world and get a result processed and returned. But for that to happen, the Assistant needed to blend in. So in 2016, Google started with the obvious. It put the Google Assistant in your phone - the one device you carry at all times - and in an air freshener-shaped smart speaker it named the Google Home.

This transition kicked off the design language for the Made by Google program I alluded to earlier. The first Google Pixel smartphone was as basic as it gets, with a simple two-tone design and colors aptly named Quite Black and Very Silver - testaments to the simplicity of the slate that would hold the Google Assistant. The other piece of hardware, the Google Home smart speaker, blended seamlessly into its environment, using a combination of plain white plastic and a fabric mesh - a material that would soon become a staple of much of Google’s smart home portfolio. The idea was for the Google Assistant to always be around you, whether you were at home or on the go. And you wouldn’t even notice it was there.

Slowly, the Google Assistant expanded its reach. What was once an exclusive feature of Google’s own Pixel smartphone became a staple of Android, earning its own spot in Google’s operating system now running on billions of devices. It expanded to cars with Android Auto, and to headphones with partners like Bose and Sony. Over the years, Google itself has broadened its library of hardware, now offering things like thermostats, laptops, earbuds, routers and even TVs with the Google Assistant baked in. In the nearly five years since October 4, 2016, Google’s Assistant ecosystem has exploded, and it’s likely you can’t go a block in New York City without the “Hey Google” hotword triggering something in your immediate vicinity.

Of course, the move towards Assistant everywhere has been anything but universally popular. Many are disturbed by the idea of devices that are always listening, especially as they populate the world around us. In 2019, Google faced scrutiny after disclosing that a small set of Assistant queries were reviewed by real people, news that broke when a contract language reviewer leaked some of that Dutch-language audio to the press. And just this July, the company faced another lawsuit alleging that it had violated California privacy laws by using data collected from accidental Assistant activations for advertising purposes.

But Google has persisted, repeating the claim that user privacy is a top priority for the company. In May 2019, it gave users the ability to auto-delete or refrain from ever storing various types of data, such as location, activity, and speech. And though lawsuits are filed nearly every month alleging that Google is infringing on one privacy law or another, it remains committed to getting the Google Assistant into as many devices as possible, in pursuit of its ambient computing vision. And fortunately for Google, it also recently found a compromise that is a win both for the company and for users’ privacy.

Just as important as the number of devices the Assistant covers is the speed and accuracy with which it can perform a query. If you have 1,000 instances of the Assistant around you at all times but it either doesn’t understand you or takes too long to respond, you’re not going to use it. So in 2019, Google made a huge leap when it announced it had managed to condense the roughly 100GB model the Google Assistant required down to just 0.5GB, allowing it to run on-device for dramatically improved speeds. Now, instead of relaying queries to one of Google’s data centers over an internet connection, computing them, and sending you back the result, the same work could be done much faster, right on the device. That means less potential for data to leak, and faster queries for end users. It was a win-win.
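
Google hasn’t detailed exactly how it shrank that model, but the general shape of compressing a network for on-device use can be sketched with TensorFlow Lite’s standard conversion path. The placeholder Keras model below stands in for a real speech model, and the default quantization settings are generic, not Google’s actual Assistant pipeline.

```python
import tensorflow as tf

# Placeholder model for illustration; the real Assistant speech models
# are far larger and far more specialized than this toy network.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(64,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Convert to TensorFlow Lite with default optimizations (weight quantization),
# trading a little accuracy for a much smaller, faster on-device model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("assistant_demo.tflite", "wb") as f:
    f.write(tflite_model)

# On-device inference then runs through the lightweight TFLite interpreter
# instead of making a round trip to a data center.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
```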

But moving the Google Assistant on-device wasn’t enough for Google. In order to make a virtual assistant truly frictionless, you need to minimize the time it takes for the assistant to understand you, and then for the computation to actually be executed. Moving the Assistant onto the device was a huge jump in this regard, but its speed and accuracy were still constrained by the chipset actually performing the computation. Since queries were now executed on-device, they no longer had the advantage of Google’s powerful data centers. Sure, you gained the benefit of the Google Assistant working for many things without a data connection - you could turn up the volume on your device or open an app, just with your voice. But machine learning workloads are very different from the kind of computing we’re generally used to, and new types of processor cores have been built specifically to handle them.

Then, on August 2, 2021, Google announced Tensor, a new System on a Chip that would power its upcoming Pixel 6 and Pixel 6 Pro smartphones. And while on the surface this may just seem like a way for Google to cut out the middle-man and take more ownership of its devices, it’s likely the move is more focused on helping the company achieve its ambient computing dreams.

The Tensor Transition

Google hasn’t shared the full details of its Tensor chip design, but it has disclosed that the chip will be built around its custom-designed TPU, or Tensor Processing Unit, from which the chip derives its name. A TPU is a special cluster of cores on the SoC (System on a Chip) dedicated to machine learning and deep learning tasks. And though this custom TPU is a big deal for Google, it isn’t necessarily new. Qualcomm, the chipmaker whose silicon powers most Android phones on the planet, including Google’s current lineup of Pixel devices, uses an NPU, or Neural Processing Unit, on its chips, which effectively handles the same tasks as Google’s TPU. Apple’s A14 Bionic chip, which powers its iPhone 12 series of smartphones, also features a cluster of machine learning-focused cores, which it calls the Neural Engine.

But unlike Qualcomm and Apple, which primarily design their chips around things like raw CPU and GPU performance, Google has designed the rest of the chip around the TPU itself.  While Qualcomm’s latest Snapdragon 888 SoC blasts past last year’s Snapdragon 865 chip in almost every department, it is not necessarily prioritizing AI performance - just improving everything across the board.

With Tensor, Google is signaling that AI and machine learning capabilities are now a top priority for the company. After all, its ultimate goal of Google Assistant everywhere, or ambient computing, relies on this very type of processing to function. While tasks like speech recognition, computational photography and image recognition can be done with traditional CPU and GPU cores, the TPU cores Google has used in its data centers, and will now use in its smartphones, speed up these processes dramatically.
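
To give a rough sense of how an app hands that work to dedicated ML silicon, TensorFlow Lite lets you attach a hardware delegate to its interpreter, as sketched below. The model path is hypothetical, and the delegate shown is the Coral Edge TPU runtime - a separate Google accelerator used here purely as a stand-in for the general offloading pattern, not the Pixel’s Tensor chip.

```python
import numpy as np
import tensorflow as tf

# Hypothetical model file; a real app ships a model compiled for its accelerator.
MODEL_PATH = "speech_model_edgetpu.tflite"

# Load a delegate that routes supported operations to dedicated ML hardware.
# "libedgetpu.so.1" is the Coral Edge TPU runtime, standing in here for the
# general pattern of offloading inference from the CPU to an accelerator.
delegate = tf.lite.experimental.load_delegate("libedgetpu.so.1")

interpreter = tf.lite.Interpreter(
    model_path=MODEL_PATH,
    experimental_delegates=[delegate],
)
interpreter.allocate_tensors()

# Feed a dummy input and run inference; with the delegate attached, the
# ML-heavy parts of the graph execute on the accelerator, which is what makes
# features like live transcription feel instantaneous.
input_details = interpreter.get_input_details()
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()
output = interpreter.get_tensor(interpreter.get_output_details()[0]["index"])
```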

While Google is debuting Tensor in the Pixel 6 and Pixel 6 Pro, there’s nothing stopping the company from bringing the chip, or a chip based on the same architecture, to more of its devices. We’ve seen this exact move before, most notably from Apple. The Apple Watch Series 6 is powered by the S6 processor, which is based on the A13 Bionic chip Apple designed for the iPhone 11, and the Apple M1 processor that powers its new MacBooks and Mac mini is a scaled-up version of the A14 used in the iPhone 12 series.

Maintaining architecture parity has multiple advantages, from design to the capabilities a device can offer. If Google has already gone through the trouble of designing Tensor to vertically integrate its smartphones, it only makes sense that it would do the same for its other devices. And a welcome side effect of that parity is that every device sharing the architecture inherits the optimizations Google has already made to turn Tensor into an AI powerhouse.

Google’s AI capabilities are made or broken by the speed at which the devices running that AI can execute the tasks you ask for. So if Google wants you to buy completely into its new ambient computing future, it only makes sense for the company to bring Tensor to its laptops, smart displays, smartwatches and headphones. The more devices it can vertically optimize for running the Google Assistant and performing AI-based tasks, the more seamless its ambient computing future becomes. And as it happens, Google’s Tensor processor is built specifically around these AI capabilities. If Google can transform every product it makes into a highly optimized Google Assistant beacon, it comes one step closer to realizing the future of computing.

But Google’s TPU cores aren’t only useful for the Google Assistant. Google’s Tensor chip will help speed up things like computational photography and speech recognition for translation. Traditional CPU-based tasks have reached a point where we no longer really notice a difference year over year. But tasks that can be optimized by machine learning will see a huge improvement in performance.

And of course, there are a multitude of other reasons for Google to develop its own processor. Qualcomm will often support devices for only up to four years, with the average Android device receiving just two to three years of software updates and a couple more years of security patches. By making its own processor, Google can support a device for as long as it pleases, and rival Apple’s exceptional track record of pushing its latest version of iOS to devices over six years old. While Qualcomm is mostly concerned with selling more chips, Google is just trying to keep you on its devices. Because remember, Google is an ads business. The longer it can keep you on its devices and services, the more it makes from you in the long run.

So while it has been just five, not eight, years since the Google Assistant quietly launched across Google’s software and hardware, I’m happy to admit that Lockheimer was right. We are (or at least I am) talking about October 4, 2016 - the day Google started its biggest transition yet: changing the way we interact with our technology.