(Image source: Anki)
The little robot on my desk knows my name and recognizes my face. He tells me when he needs maintenance. He has three cubes that he loves to stack, knock over, and play with. He'll also tell me when he's bored and wants to play a game. When he wins, he'll dance around. And if I beat him too many times, he'll get sad or throw a temper tantrum.
His name is Cozmo. And even though he's a 2-inch-tall toy robot, the people who created him don't want you to think of him as a toy or even as a machine. They want you to think of him as a character, like Wall-E, brought to life. Anki, the company behind Cozmo, says its mission is to “create robots that move you,” combining robotics and artificial intelligence to create technologies with which people can build emotional bonds.
Cozmo isn't an industrial robot by any stretch. But the AI behind him points the way forward for robotics at both the consumer and commercial level in terms of creating machines that can better understand and relate to the people around them. Companies are already developing AI to help autonomous cars recognize and respond to our emotions. Why shouldn't our toys and even our collaborative robots do the same?
Cozmo's setup is fairly straightforward. The robot moves around on treads. He comes with three electric “power cubes” and uses a lifter to manipulate them. Using a smartphone app, users can manually control him, instruct him to do simple tasks like stacking or arranging the cubes, play games with or against him, and even “feed” and perform maintenance on him. The robot doesn't require any real down and dirty maintenance, but his mood and performance will suffer if you let him get “hungry” or go too long without maintaining him. Maintenance consists of a mini game that requires you to mimic a sequence of button presses in order to tune Cozmo up. It's the sort of thing Cozmo's younger owners probably eat up (he's recommended for ages 8 and up).
In terms of functionality, he appears to be the latest step in a long tradition of hobbyist robots like the kind you used to find in kits at Radio Shack. They'd do basic things like follow lines on the ground or pick up and move objects...but in terms of personality, they were severely lacking. Toys like Hatchimals and Furby have already added this emotional element. But what they do is more smoke and mirrors—using call and response to elicit a series of pre-programmed gestures.
That's not to say Cozmo doesn't do some of the same thing. But where it sets itself apart is in how it uses AI, combined with its physical movements, to not only recognize and respond to its environment and the people it is interacting with, but to also act in a way that feels nuanced and lifelike.
A short film featuring Cozmo.
Cozmo uses AI for much of its functionality, particularly object recognition and machine vision for finding its cubes and recognizing people. But the emphasis is on using AI to give Cozmo the ability to convey emotions in a complex interplay of facial expression (he has two digital eyes), voice tone, and even body language. Cozmo won't just make a noise to let you know he's happy; his eyes will light up and he'll dance around in a circle. If you tell him you don't want to play a game right now, he'll dip his head in frustration and sulk away like a disappointed child. According to Anki, Cozmo can recognize and respond to the six basic human emotions: anger, disgust, fear, happiness, sadness, and surprise.
Cozmo's app has a basic, kid-friendly coding tool called CodeLab, which lets you program him to act out some simple routines and actions. Some may be surprised to learn that the robot also comes with a full, open-source software development kit (SDK). Anyone with a working knowledge of Python can dig deeper into Cozmo's functionality and program him to perform even more sophisticated actions. The SDK also bypasses the family-friendly filters put on the app, so there are no restrictions on what you can have Cozmo say. The level of control that the SDK gives you over Cozmo is deep enough that a small community has cropped up online of folks making short films featuring the robot.
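To give a sense of what that looks like in practice, here is a minimal sketch of a Cozmo SDK program that makes the robot speak a line of text. It relies on two calls from the SDK's `cozmo` Python package, `cozmo.run_program` and `Robot.say_text`; the `GREETING` constant and the function name `cozmo_program` are our own illustrative choices. Running it for real assumes the package is installed, the Cozmo app is in SDK mode, and a robot is connected.

```python
# Minimal sketch of a Cozmo SDK program (not Anki's own sample code).
# GREETING is a hypothetical phrase chosen for illustration.
GREETING = "Hello from the SDK"


def cozmo_program(robot):
    """Have Cozmo speak GREETING and block until he has finished."""
    # say_text returns an action object; wait_for_completed blocks
    # until the robot has finished speaking.
    robot.say_text(GREETING).wait_for_completed()


if __name__ == "__main__":
    try:
        import cozmo  # Anki's SDK package; only needed to drive a real robot
    except ImportError:
        print("The cozmo package is not installed; nothing to run.")
    else:
        # run_program connects to the robot and passes it to cozmo_program.
        cozmo.run_program(cozmo_program)
```

Because the function only touches the `robot` object it is handed, it can be exercised with a stand-in object before being pointed at real hardware.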
Four Pillars of Character Creation
And if people are animating Cozmo, that's exactly what Anki wants. “The goal with Cozmo was always to try to bring a character to life,” Mark Palatucci, Anki's head of cloud AI and data science, told Design News. “We tried to think of characters in films and movies like Pixar movies and really think about what it would take to bring that into the real world.”
Palatucci and his co-founders, CEO Boris Sofman and company president Hanns Tappeiner, started Anki in 2010 as an offshoot of Carnegie Mellon University's robotics program. Palatucci said that after graduating, the team set out to bring robotics, AI, and machine learning technology to mass-market consumers. But they quickly found that at that time, all the work in those areas was being done in government research and military and industrial automation applications. “We saw an opportunity with cheaper hardware and mobile devices to bring this magic of robotics and AI into physical products at a price point that makes sense,” Palatucci said.
The perfect industry for that jumping off point ended up being toys and entertainment—a sector Palatucci said “was in many cases very stagnant and hadn't been touched by mobile [technology].”
The company's first product, Overdrive, was released in 2015. Overdrive is a racing car game reminiscent of the electric stock car games that were popular in the '80s and '90s. You build a custom race track and the other cars learn the track and battle against you.
Anki followed up Overdrive a year later with Cozmo. Recently, it completed a successful crowdfunding campaign for its next-generation robot, Vector.
The company's design philosophy revolves around four major pillars: vision and sensing for tasks like facial and object recognition; animatronics; AI; and interactive content, which includes all of the games and activities for the robot.
Designing Cozmo was a collaboration between engineers and animators. The robot was storyboarded and tested like an animated character for film or TV. (Image source: Anki)
To bring its robots to life, the company employs a cross-disciplinary team of mechanical and electrical engineers, game designers, AI experts, and even animators. “Because we really want [our robots] to be expressive and convey emotions, we've literally hired animators from Pixar, Dreamworks, and other big animation houses and built a pipeline based on feature film animation,” Palatucci explained. He added that the company's workflow in designing Cozmo is not unlike the workflow for creating an animated character. “We use tools like Maya down to physical robots,” he said. “We have an animation team that sees the robot doing the motion instead of seeing a simulation. That was really critical in the pipeline.”
In the process of developing Cozmo, Anki's animators storyboarded the robot and performed the same animation tests that they'd perform on a character for film or TV. “It was the same rigor as an animated project. We'd ask ourselves, what is this character's motivations? His strengths? Weaknesses? What is he trying to achieve in the world? You can do hundreds of variations of something as small as the eyes, but it ends up being such an important part of the character.”
Even the voice was the result of careful design and testing. “Cozmo’s voice was recorded by our audio engineer, Ben Gabaldon,” Palatucci said. “He recorded his own voice and then put the sound through a computer synthesizer and a series of effects to find the right 'voice' for Cozmo. Human voice recordings provide an organic source, while a combination of synthesis and post production audio processing creates the personality and performance of Cozmo's final voice.”
Palatucci said the company went through over 50 mechanical engineering prototypes before settling on the final version of Cozmo. But all the work is paying off, particularly with Cozmo's younger audience. A 2017 study conducted by researchers at the MIT Media Lab looked at how children ages 3 to 10 perceived interactive agents including Cozmo along with Amazon Alexa, Google Home, and a conversational chatbot named Julie. After interacting with Cozmo, the children were asked to answer questions about trust, intelligence, social entity, personality, and engagement. Results showed that 40% of the younger children (ages 3-4) perceived Cozmo as being smarter than them while 20% of the older children (ages 6-10) reported the same.
According to the study, Cozmo's expressions and ability to move were key factors in children enjoying it: “Through its eyes and movements, Cozmo was able to effectively communicate emotion, and so the children believed that Cozmo had feelings and intelligence.” In the study, children reported, “[Cozmo] has feelings, he can do this with his little shaft and he can move his eyes like a person, confused eyes, angry eyes, happy eyes...”
Cozmo's mechanical design went through over 50 iterations before the final version. (Image source: Anki)
Smarter than a Smartphone
While Cozmo contains an array of sensors, for computer vision in particular, it doesn't rely entirely on an onboard processor. Instead, the robot piggybacks on your smartphone. All of its AI-related processing is handled via the app. If you think of Cozmo as having a brain, your smartphone is essentially its frontal cortex, handling the higher-end AI applications. The lower, cerebellum-type functions, such as the motion and motor control, are handled by an on-board NXP Kinetis microcontroller.
“What became clear was: If we were going to do everything on the robot itself, it would have made it cost $400-$500. We wanted to make it accessible to as many people as possible and enable millions of people to purchase it,” Palatucci said.
He continued, “The first engineer we brought on was to proof out a lot of computer vision schemes—making those algorithms run as best as possible. There were challenges, like dealing with a large variety of illumination, for example. You have homes with different lighting conditions and different types of natural and artificial indoor and outdoor lighting, and you need to tune that to the [robot's] camera. It took a huge amount of investment, and the question also becomes: How much of that [machine vision] computerization do you do in the robot itself versus the app in the phone?”
During development, Palatucci said, it became clear to the Anki team that by taking advantage of smartphones, they could offload the appropriate parts of the AI engine and computer vision system. As a result, they could better separate high-frequency tasks, like a control system that needs to run hundreds of times per second, from lower-frequency tasks such as the computer vision system, which can afford to have some latency.
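The division of labor Palatucci describes can be illustrated with a toy simulation. The sketch below is not Anki's code: the 200 Hz control rate, 10 Hz vision rate, controller gain, and cube distance are all assumed values picked for illustration. It shows a fast loop that runs on every tick and steers using whatever the slower vision task last reported, even when that reading is slightly stale.

```python
# Toy simulation of a fast control loop fed by a slower vision task.
# All constants are hypothetical; they are not Anki's actual figures.
CONTROL_HZ = 200  # assumed rate of the on-robot motor control loop
VISION_HZ = 10    # assumed rate of the phone-side computer vision task
GAIN = 0.05       # assumed proportional gain for the toy controller


def simulate(duration_s=1.0, cube_distance_mm=100.0):
    """Run both tasks on one simulated clock and count how often each fires."""
    ticks = int(duration_s * CONTROL_HZ)
    vision_every = CONTROL_HZ // VISION_HZ  # control ticks per vision frame
    counts = {"control": 0, "vision": 0}
    last_seen_mm = 0.0   # most recent result delivered by the vision task
    travelled_mm = 0.0   # how far the simulated robot has driven

    for tick in range(ticks):
        if tick % vision_every == 0:
            # Slow path: a new vision result arrives only occasionally.
            last_seen_mm = cube_distance_mm
            counts["vision"] += 1
        # Fast path: every tick, close part of the gap to the last
        # known cube position without waiting for fresh vision data.
        travelled_mm += GAIN * (last_seen_mm - travelled_mm)
        counts["control"] += 1

    return counts, travelled_mm


if __name__ == "__main__":
    counts, travelled = simulate()
    print(counts, round(travelled, 2))
```

The point of the design is that the control loop never blocks on vision. It keeps the motors responsive at its own rate, while the expensive image processing can run elsewhere and deliver results a few times per second.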
This fall, Anki is planning to follow up Cozmo with Vector, a next-generation robot that the company is calling Cozmo's “bigger and smarter brother.” The most significant upgrade is the inclusion of an onboard processor, which eliminates the need for a smartphone. According to the company, Vector is fully autonomous, cloud-connected, and always on. (Unlike Cozmo, he will seek out his own charger and charge himself, like a Roomba.)
Vector, the successor to Cozmo, features an onboard Qualcomm processor, removing the need for a smartphone for AI and machine learning processing. (Image source: Anki)
Anki opted for the Qualcomm APQ8009 processor to handle Vector's new sensing capabilities and deep learning functionality. Qualcomm specs reveal the APQ8009 to be a 32-bit, quad-core CPU with four ARM Cortex-A7 cores. It's capable of capturing HD video up to 720p. The processor features an integrated image signal processor and computer vision capabilities as well as low-power Wi-Fi and Bluetooth connectivity, and can handle GPS and other satellite localization systems.
Much of the same character design work done with Cozmo has been extended to Vector. The robot has over a thousand carefully designed animations, according to the company, and its characterization is now augmented with more powerful technology. Vector features an HD camera with a 120-degree, ultra-wide field of view and a new, four-microphone array for recognizing both voice commands and individual voices (a big feature missing from Cozmo). He even offers edge-detection to keep from running off the edge of tables (a problem with Cozmo if you don't keep an eye on him).
Speaking with Design News via email, Hanns Tappeiner, co-founder and president of Anki, said a reduction in the price of powerful components is a huge factor in how the company was able to make such significant upgrades from Cozmo to Vector. “One of the things that we’ve perfected over the years is how to mass-produce compelling robotics and AI-powered robots while keeping the cost down significantly. For example, Cozmo has about 340 components and costs $179.99, but Vector is made up of close to 800 parts and will retail for $249.99 at launch while doing a whole lot more,” Tappeiner said. “Keeping the Vector robot accessibly priced is not something that we’d been able to achieve had we not figured out how to manufacture powerfully intelligent home robots at mass-scale.”
The SDK's the Limit
And new features promise even more for developers (and potential filmmakers) looking to experiment.
“The Vector SDK programs run on a computer, which gives users the freedom to integrate with any compatible machine learning / AI technologies, such as Google’s TensorFlow or anything else,” Tappeiner said. “Developers can connect their computer, laptop, or even a Raspberry Pi directly to Vector via their home Wi-Fi network.” He added that Vector's SDK will maintain the same positive aspects of Cozmo—namely that it is Python-based and open source, meaning there are thousands of existing libraries that developers can leverage.
We decided to give Cozmo's SDK a try and ended up having him do Drake's "In My Feelings" challenge and dance to the song.
What this means, in short, is a lot of potential for developers and hobbyists to create their own custom machine learning algorithms that teach Vector new tasks and abilities. Students and researchers could train Vector to perform custom object recognition tasks, for example. Vector's deep learning neural network extends Cozmo's facial and object recognition to images as well (meaning Vector can recognize not only a person, but a picture of that person), opening up a wide range of training scenarios and DIY projects.
“What became obvious in early development is that so many of the features in Cozmo would be of interest to enthusiasts. So in the early days, we made [the] SDK a minimum viable product feature,” Palatucci said of Cozmo's SDK. “As soon as we launched, we saw a lot of universities jumping on it, teaching fundamentals courses using this platform. And CodeLab really filled in a gap with kids who are much younger being able to write their own programs...”
He continued, “We always thought an SDK for education would be a good market of people. We received a lot of letters from families of autistic kids, thanking us for [Cozmo] and telling us what an impact it has had. We got one that said, 'My kid doesn't play and never invites people over, but we got a Cozmo and now he asks kids from school to come over and play and hang out with him.' That's not something we had expected.”
More Characters Coming?
Tappeiner said that Anki is already looking to roll out future products as soon as 2020. “We’ve never been shy about our ambition to build purposeful robots for the home with highly defined emotional intelligence (EQ). We’re already leveraging our learnings from developing Vector—in addition to the work we’ve done around Overdrive and Cozmo—and applying them toward our future product roadmap.”
Whether Anki will continue down the road of toys and entertainment or take the first steps toward larger ambitions in robotics and AI remains to be seen. For now, the company is promising that Vector, like Cozmo, will only continue to get smarter via free over-the-air (OTA) software and firmware updates. “One of the greatest things in these types of products is being able to update everything from the firmware to the app to the back end cloud services,” Palatucci said. “And with all the developers creating content and the community creating content faster than we can, that's adding a lot of value. The robot you buy on day one is potentially very different down the road.”
Chris Wiltz is a Senior Editor at Design News covering emerging technologies including AI, VR/AR, and robotics.