02.06.13

Robot series 3: Stuck in flatland

Posted in Society at 14:56 by RjZ

Google has taken street view off-road. Teams of (lucky) engineers are walking around national parks and wild places like the Grand Canyon carrying a backpack full of cameras and GPS devices. They’re working towards Google’s mission of organizing the world’s information and making it universally accessible and useful. Amazing stuff; you can see a demo here.

What’s missing from all this work is the third dimension. Of course, street view info is mapped onto rough 3D contours, but generally, the information being collected is 2D. For most things, that’s just fine. Humans experience much of the world in only two dimensions because everything beyond a dozen meters or so is essentially two-dimensional to our eyes. Mapping the world with street view is well served by 2D information, but it’s obviously not the whole story.
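For the curious, here is a rough back-of-the-envelope sketch of why distant scenes look flat to a pair of eyes (or a pair of cameras). It assumes a simple two-viewpoint geometry and an eye spacing of about 6.5 cm; the function names and numbers are my own illustration, not anything from Google's system. The idea is that the angle between the two lines of sight shrinks quickly with distance, so a small step in depth far away barely changes what either eye sees.

```python
import math

EYE_SPACING_M = 0.065  # assumption: typical distance between human eyes


def vergence_angle_deg(distance_m, baseline_m=EYE_SPACING_M):
    """Angle between the two lines of sight meeting at a point straight ahead."""
    return math.degrees(2.0 * math.atan(baseline_m / (2.0 * distance_m)))


def depth_step_disparity_deg(distance_m, step_m, baseline_m=EYE_SPACING_M):
    """Change in vergence angle between a point and one step_m farther away."""
    near = vergence_angle_deg(distance_m, baseline_m)
    far = vergence_angle_deg(distance_m + step_m, baseline_m)
    return near - far


if __name__ == "__main__":
    # Compare how much a 10 cm depth step "shows up" at various distances.
    for d in (0.5, 2.0, 12.0, 50.0):
        print(f"{d:5.1f} m: vergence {vergence_angle_deg(d):.3f} deg, "
              f"10 cm step changes it by {depth_step_disparity_deg(d, 0.1):.5f} deg")
```

Running it shows the depth cue from a 10 cm step at a dozen meters is hundreds of times smaller than the same step at arm's length, which is the sense in which the far world is "essentially two-dimensional."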

Try this. Imagine wearing a camera that records everything you see throughout the day and uploads this information to a cloud-based server. Sometime later, you realize you’ve misplaced your sunglasses. What if you could search through all that stored information, find a copy of your sunglasses from a time when you knew you had them and then have the server search for the last place they were seen? Repeat for keys, where you parked your car…even for people you’ve met…what was his name again?

Everything necessary to do this already exists today, just waiting for someone to bring it all together (and to make a business out of it to pay for it…). Critically, though, 2D data alone makes this problem much more difficult than it should be. For example, a 2D system can’t tell the difference between your mom and a picture of your mom. Storing 3D information can actually end up easing the bandwidth problems, and certainly the search problems, that need to be solved before this idea becomes a reality. A single 3D model of mom’s face helps the system identify her even in profile, not just in a head-on portrait. The sunglasses can be spotted lying on the counter face up or face down.
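To make the mom-versus-photo point concrete, here is a minimal sketch of one way depth data could separate the two: fit a plane to the measured 3D points and see how much is left over. A printed photo is flat, so almost nothing is left; a real face is not. Everything here is an assumption for illustration: the synthetic "face" is just a smooth bump, the 3 mm threshold is made up, and real systems would be far more careful.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def plane_fit_rms(points):
    """Least-squares fit z = a*x + b*y + c; return the RMS residual."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    # Normal equations A @ [a, b, c] = rhs, solved with Cramer's rule.
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = _det3(A)
    coeffs = []
    for col in range(3):
        m = [row[:] for row in A]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(_det3(m) / d)
    a, b, c = coeffs
    sq = sum((z - (a * x + b * y + c)) ** 2 for x, y, z in points)
    return math.sqrt(sq / n)


def looks_flat(points, threshold_m=0.003):
    """Hypothetical rule: under 3 mm of RMS relief counts as a flat picture."""
    return plane_fit_rms(points) < threshold_m


def _grid():
    # 20 cm x 20 cm patch sampled every 2 cm (coordinates in meters).
    return [(-0.1 + 0.02 * i, -0.1 + 0.02 * j) for i in range(11) for j in range(11)]


def make_photo():
    """Synthetic flat photo held at a slight tilt, about 0.6 m from the camera."""
    return [(x, y, 0.6 + 0.3 * x) for x, y in _grid()]


def make_face():
    """Synthetic 'face': a smooth 4 cm bump toward the camera (a crude stand-in)."""
    return [(x, y, 0.6 - 0.04 * math.exp(-(x * x + y * y) / (2 * 0.05 ** 2)))
            for x, y in _grid()]


if __name__ == "__main__":
    print("photo looks flat:", looks_flat(make_photo()))
    print("face looks flat: ", looks_flat(make_face()))
```

The tilted photo still fits a plane almost perfectly, while the bump does not, and no amount of clever 2D image matching gives you that test for free.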

In Japan, an aging populace has been wondering who will take care of them as they enter their later years. The Japanese government has been heavily promoting robot assistants as a potential solution. The recent film Robot and Frank took a charming look at what these future relationships might be like, but for Robot to be able to walk around, do the dishes, and cook (not to mention, learn to pick locks…), classic 2D machine vision won’t be enough. Not convinced? Check out these anamorphic illusions and you’ll see some of the limitations of 2D vision!

3D adds much more than just image acquisition. It allows security cameras to match a person’s captured face with a mug shot even if the angle of the shot is completely different. 3D motion capture can enable computers to read sign language or lips, or serve as an interface that requires no buttons or touching whatsoever. Machine vision algorithms, amazing as they are, are pretty simple today. More information makes them more powerful, and emerging, inexpensive 3D capture technologies provide that valuable detail. From industrial bin picking and 3D copiers (a natural extension of 3D printers like Makerbot that already exist today) to more personal applications, compact, inexpensive, and fast 3D data capture is just one more piece of the modern robotics puzzle.

[Disclaimer: the company I work for, Chiaro Technologies, is developing just this sort of inexpensive, fast, accurate 3D capture technology.]
