Hans Moravec has spent his entire career researching robotic automation. He served as a director of the Mobile Robot Lab at the Robotics Institute of Carnegie Mellon University, where he is still an adjunct faculty member. Moravec also invented “evidence grid technology,” which takes information from cameras and stores it in a 3-D grid that can represent the objects in a room. He was also an early transhumanist, pre-Nick Bostrom. Oh, and he also taught Chris Urmson, the head of Google’s self-driving car efforts.
In 2003, Moravec founded Seegrid, a robotics company that builds “vision guided vehicles” (VGVs) to navigate warehouses. The name combines “see” and “grid.” The company unveiled its first VGV, a truck that could move through its environment autonomously, in 2008. TechRepublic caught up with Moravec and Jim Rock, Seegrid’s CEO, to learn how the technology works, and why the self-driving car industry should take notes.
Tesla and other self-driving car companies are using Mobileye technology. How is your technology different?
ROCK: At first glance, there isn’t that much difference. We’ve got stereo cameras. What they have that we don’t use much are 3D scanning lasers. We don’t use those because they’re expensive and not practical in the long run. They’re mechanical, and they’re also shooting out lasers. What we said instead was, “Well, we’re going to ride Moore’s law and wait until the computing power is adequate to do the stereo. After that point, we’re going to have the advantage because the computers will keep getting cheaper at a faster rate than the lasers will.” It’s completely practical as we do it now. Our camera system is the cheapest part of the robot.
What exactly are vision-guided vehicles?
MORAVEC: Since the ’70s, we’ve been working on vision-guided vehicles, meaning robots that navigate with a stereoscopic camera. That means two cameras looking at a scene, and you can determine the distance to something in the scene by how it shifts from one eye to the other. We do that on a grand scale: we look at the whole scene and pick out hundreds of thousands of points with distances, and then we distill that data into probabilities of what’s in space and let that settle into the grid.
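The "shift from one eye to the other" is usually called disparity, and for two parallel cameras the textbook relation is depth = focal length × baseline / disparity. The sketch below illustrates that relation only; it is not Seegrid's code, and the function name, focal length, and camera spacing are hypothetical values chosen for the example.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Distance in metres to a point seen by two parallel cameras.

    disparity_px    -- horizontal shift of the point between the two images
    focal_length_px -- camera focal length, expressed in pixels
    baseline_m      -- distance between the two camera centres, in metres
    """
    if disparity_px <= 0:
        # Zero shift means the point is effectively infinitely far away.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera rig, for illustration only.
FOCAL_LENGTH_PX = 700.0
BASELINE_M = 0.12

for disparity in (70.0, 35.0, 7.0):
    depth = depth_from_disparity(disparity, FOCAL_LENGTH_PX, BASELINE_M)
    print(f"shift of {disparity:5.1f} px -> about {depth:.2f} m away")
```

Nearby objects shift a lot between the two views and distant ones barely move, which is why small errors in measuring the shift matter most for far-away points.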
Back in the early days, I had one camera that slid side to side to get the stereoscopic view. The robot managed to make it across a clutter-filled room while mapping the room, but it took five hours, and it worked maybe half the time. Obviously, the technology wasn’t ready yet. The computers then could do about a million calculations per second, and one of my burning interests was: how long will it take before they can do it well? The estimate for the computer speed needed was at least 100,000 times what I had back in the ’70s. The projection was that it was really going to become practical sometime in the 2000s.
I kept working on the problem, basically pacing the software with computers as they became available, and we made a number of discoveries during those decades. One of them was a way of mapping things around the robot that was less failure-prone than the early methods, which involved listing points in space. This was the grid method, which lets you accumulate statistical evidence for what’s there at every position in the space around the robot.
First we had to do it in two dimensions because computers were not powerful enough to do it in three, but by the mid-1990s we were starting to be able to do it in three dimensions. By the early 2000s it was going fast enough that I thought it might be time to take it into the commercial realm.
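The 100,000-fold estimate can be turned into a rough timeline with a little arithmetic. The sketch below assumes steady exponential growth and borrows the "every two years or so" doubling period Moravec mentions later in the interview; the function name and the alternative 18-month figure are assumptions for illustration, not numbers he gives for this projection.

```python
import math


def years_to_speedup(speedup_factor, doubling_period_years):
    """Years of steady doubling needed to reach a given speedup."""
    doublings = math.log2(speedup_factor)
    return doublings * doubling_period_years


TARGET_SPEEDUP = 100_000  # "at least 100,000 times" from the interview

for period in (2.0, 1.5):  # doubling periods in years; both are assumptions
    years = years_to_speedup(TARGET_SPEEDUP, period)
    print(f"doubling every {period} years: about {years:.0f} years")
```

A 100,000-fold gain is a little under 17 doublings, so under either assumption the wait is measured in decades, which fits a projection made in the ’70s landing in the 2000s.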
How exactly does it work?
MORAVEC: There are two main components. One is this grid, which is a map of the robot’s surroundings, and it’s all probabilities. Basically it’s able to handle very uncertain information and accumulate it over time to extract certainty from it. The other part is the sensors that feed information into this grid. We used sonar-ring sensors in the ’80s and ’90s. In the ’70s, I had already used cameras and stereoscopic vision, but it was very slow and only worked some of the time, failing catastrophically other times.
In the ’90s, we figured out that it’s possible to train a recognition system. It’s not exactly neural. It’s sparser than that. A neural approach would have a thousand or more parameters. Our system has dozens of parameters, and it is designed so that we build in what we already know about the problem, leaving parameters only for the parts we were uncertain about.
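One common way to "accumulate uncertain information over time" in a grid of probabilities is a log-odds update, where each sensor reading adds a little evidence to a cell. The sketch below is a minimal 2-D illustration of that general idea, not Seegrid's implementation. The class name, the grid size, and the fixed 0.7 and 0.3 reading confidences are hypothetical, and readings are assumed to be independent.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(log_odds):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


class EvidenceGrid:
    """2-D grid where each cell stores the log-odds of being occupied."""

    def __init__(self, width, height):
        # Log-odds of 0.0 means 50 percent: nothing is known yet.
        self.log_odds = [[0.0] * width for _ in range(height)]

    def update(self, x, y, p_occupied):
        """Fold in one reading that says the cell is occupied with p_occupied."""
        self.log_odds[y][x] += logit(p_occupied)

    def probability(self, x, y):
        return sigmoid(self.log_odds[y][x])


grid = EvidenceGrid(width=4, height=4)

# Five weak readings (70 percent each) that agree about cell (1, 2).
for _ in range(5):
    grid.update(1, 2, 0.7)

# Two readings about cell (3, 0) that contradict each other.
grid.update(3, 0, 0.7)
grid.update(3, 0, 0.3)

print(f"cell (1, 2): {grid.probability(1, 2):.3f}")
print(f"cell (3, 0): {grid.probability(3, 0):.3f}")
print(f"cell (0, 0): {grid.probability(0, 0):.3f}")
```

Agreeing readings push a cell toward certainty even though each one is individually weak, while contradictory readings cancel out and leave the cell near 50 percent.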
The learning really makes things work well. It takes a lot of time to find the parameter settings that make it work, but when you do, it’s magic.
And then we have the stereoscopic code, which includes a learning system and feeds the entire thing, the mapping. Then there’s one final key step. Once we’ve built the maps, the robot needs to know where it is relative to them; that’s localization. There are quite a few parameters involved in localization as well, but everything’s optimized together.
What could manufacturers of autonomous cars learn from your approach to this?
MORAVEC: At the moment, it’s still a little bit slow for use in fast-moving traffic on the roads. We have a nice contained environment where our robots don’t move more than a couple meters per second.
They shouldn’t be moving any faster than that right now, but we’re riding on the back of Moore’s Law, so every two years or so computer speed doubles, and with it the speed at which we’ll be able to handle this data. We’ve already done calculations that show that our stereo approach will be able to produce the same results as the Velodyne lasers that are in a lot of the autonomous cars, including all of the Google Cars.
We can replace that with cameras, and sufficiently powerful computers, and do everything as accurately and as fast as the laser does it. It’s cheaper, and we’re not shooting out laser beams.
Our safety record is perfect. A thousand miles or more with no accidents. Google Cars don’t have that track record, but here, we do.
The 3 big takeaways for TechRepublic readers
1. Low cost: Using stereoscopic cameras is cheaper than lasers and poses fewer issues in a warehouse.
2. Advances in computing: Computing power is increasing so quickly that these stereo camera-guided vehicles will be able to start moving at higher speeds.
3. Safety: In factories, one of the most dangerous environments for machines and humans to work together, Seegrid has created vehicles with a perfect safety record.