Of all the exciting things Google announced at the I/O developer conference this week, one of the sleeper hits was Drive for iOS. “Sleeper” might be too strong a word — attach “for iOS” to anything and it’s going to get attention — but it’s hard for any announcement to compete for headlines against Project Glass, the Nexus 7 with Jelly Bean, and skydivers. Part of what caused Drive to turn so many heads during the Day 2 keynote (aside from the iOS association) was its image recognition capabilities.
Drive uses optical character recognition (OCR) to perform certain tricks. This is nothing new for Google — back in 2009 Google Docs could take an image file or a PDF and convert it into a document. It could also change the orientation of one of those files using its ability to recognize characters on the page. These features were brought over to Drive, but so was basic image recognition. It might seem totally sci-fi, but if you upload a picture of a pyramid to Drive, you can search for it and the system will identify it based on what Google knows about pyramids. This feature might seem new, but Google has used image recognition in both Google Goggles and (in a tweaked form) Search by Image.
It’s not too shocking that Google would bring this feature over to the iOS and Android versions of Drive, but it is extra handy on a platform where you don’t have advanced search tools to sift through your 5GB of data. The bulk of the computing takes place on the server side, so your iPad doesn’t need to be able to recognize that pyramid. You just need an internet connection so that Google can lend some of its search magic to the files you’ve placed on Drive.
I conducted an informal test of Drive’s image recognition capabilities using some resources from around the web, including a couple of documents and some randomly chosen, generic images. I made sure they were free of any data that might help the search tools identify them (EXIF, file names, etc.) and then uploaded them to Drive. According to Google’s demo, I should be able to find each one by searching for a basic description of it. Here’s how it went…
- 2 cat images – Correctly identified 2 out of 2 with the term “cat”
- dog image – 1 out of 1 with the term “dog”
- 4 pyramid images – 4 out of 4 with the term “pyramid”; 1 out of 4 with “Giza”
- Samsung Galaxy Tab image – Did not identify as a “tablet”; did identify as “Samsung”
- Dell laptop image – Did not identify as “laptop”; did identify as “Dell”
- hard drive image – Did not identify as “hard drive”; did identify as “drive” and “disk”
- hamburger image – Identified as “hamburger” but not as more vague terms like “food” or “dinner”
- JPG of an invoice – Identified as an “invoice” and by the company names used on it
- 1099 tax form, PDF – Identified by “taxes”, “tax form”, “1099” and “1099 misc”
- Failed to identify – Images I took of a refrigerator, graphics card, smartphone, and light bulb
With my limited data set, the effectiveness of Google’s image recognition was mixed. It worked very well with text in images, well enough with images that are in the Google Images database, and then it totally stumbled with my personal images. I gave the system sufficient time to process the data (the better part of an hour) but it just couldn’t figure some out.
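To put rough numbers on that impression, the results listed above can be tallied in a few lines of Python. This is only a sketch under stated assumptions: the three group labels (“web image”, “document”, “personal photo”) are one reading of the test description rather than categories Drive reports, and an image counts as matched if at least one of the search terms tried found it.

```python
from collections import defaultdict

# Each row: (description, assumed group, images tested, images matched
# by at least one search term). Counts come from the results list above;
# the group labels are an assumption made for this tally.
RESULTS = [
    ("cat", "web image", 2, 2),
    ("dog", "web image", 1, 1),
    ("pyramid", "web image", 4, 4),
    ("Samsung Galaxy Tab", "web image", 1, 1),   # found by "Samsung", not "tablet"
    ("Dell laptop", "web image", 1, 1),          # found by "Dell", not "laptop"
    ("hard drive", "web image", 1, 1),           # found by "drive" and "disk"
    ("hamburger", "web image", 1, 1),            # not found by "food" or "dinner"
    ("invoice (JPG)", "document", 1, 1),
    ("1099 tax form (PDF)", "document", 1, 1),
    ("refrigerator", "personal photo", 1, 0),
    ("graphics card", "personal photo", 1, 0),
    ("smartphone", "personal photo", 1, 0),
    ("light bulb", "personal photo", 1, 0),
]


def tally(results):
    """Return {group: (matched, tested)} summed over all rows."""
    totals = defaultdict(lambda: [0, 0])
    for _description, group, tested, matched in results:
        totals[group][0] += matched
        totals[group][1] += tested
    return {group: tuple(counts) for group, counts in totals.items()}


if __name__ == "__main__":
    for group, (matched, tested) in tally(RESULTS).items():
        print(f"{group}: {matched}/{tested}")
```

Counted this loosely, every document and web image was found by some term and none of the four personal photos were. The per-term misses (no “tablet”, no “laptop”, no “food”) are noted only in the comments, so the tally flatters the web images somewhat.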
Maybe the image recognition isn’t particularly new for Google, but the impressive bit here is that the company was able to cram it into a mobile app, apply it to my personal data set in the cloud, and then have the app communicate with the cloud quickly. Sure, most of the work is done immediately after the file is uploaded, but the recognition abilities and cataloging are nothing short of what we’d expect from Google.
Aside from the OCR abilities and the convenience of using my Google account, Drive for iOS is basically an adequate Dropbox competitor. It’s a smooth experience with snappy performance on my iPad 3, plus I have the ability to choose exactly which files I want to store offline and which I don’t. The app makes it easy to see who I’ve shared a file with, which is great for people who work in groups. Advanced (and nearly hidden) tools include the ability to share with people in your contacts list and to rename files.
In addition to being an impressive trick, the OCR gives Drive a notable advantage over Dropbox. That service is able to search by filename and it has some cool features, like the ability to email a link, but the search tool can’t even dive into the text of a PDF. And, as we know, Google is only getting smarter while other services struggle to compete against its user base, computing power, and massive Knowledge Graph.
Read on for more of ExtremeTech’s Google I/O 2012 coverage