The Internet went into a tizzy on Wednesday when someone discovered that Google image search results for the phrase Top 10 Criminals feature pictures of Prime Minister Narendra Modi. Many took this as a sign that 'Google thinks' Narendra Modi is one of the top 10 criminals, betraying a lack of understanding of how online search works. #top10criminals has been trending on Twitter since Wednesday with everyone seemingly having an opinion on the subject - what else is social media for, after all.
First of all, it should be obvious to anyone that Google search results are determined by algorithms and there's no individual or group sitting at Mountain View deciding what results to show for each query. The exact search results that are shown for a query - for example Top 10 Criminals - are determined by a variety of factors, some of which have been well-documented and others that Google prefers to keep secret so as to make it tougher for people to game the system.
These factors include what text people are using to link to a particular website, or in this case what information was used to describe an image. So, for example, if a lot of people link to NDTV Gadgets from their website with the text 'The Best Mobile Reviews,' when someone searches for The Best Mobile Reviews, Google is likely to show NDTV Gadgets as one of the results. Of course, this is an over-simplification of the entire process and there are a wide variety of inputs that Google takes into account.
One of these additional criteria is the trust or authority of a website that's making the association. You are more likely to trust the review of a high-profile critic than that of an unknown person, and Google search operates on a similar principle. Important websites - whose importance is, in turn, determined by the number of websites linking to them, amongst other factors - have more of a say in determining how search results for a particular query will turn out.
So, for example, if The New York Times links to a restaurant with the text 'Great Sushi in New York' and my mom links to another restaurant with the same text from her blog, guess which restaurant is more likely to show up in Google's search results when you search for Sushi in New York? Something similar seems to have happened with Narendra Modi showing up in search results for the phrase Top 10 Criminals.
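The sushi example can be sketched as a toy scoring function. This is only an illustration of the principle, not how Google actually ranks results: the site names, the authority weights and the `rank` function are all made up for the example.

```python
from collections import defaultdict

# Hypothetical link records: (linking site, authority weight, anchor text, target).
# The site names and weights are invented for illustration only.
LINKS = [
    ("nytimes.example", 0.9, "Great Sushi in New York", "restaurant-a.example"),
    ("moms-blog.example", 0.1, "Great Sushi in New York", "restaurant-b.example"),
]


def rank(query, links):
    """Order targets by the summed authority of links whose anchor text
    contains every word of the query."""
    query_words = set(query.lower().split())
    scores = defaultdict(float)
    for _source, authority, anchor, target in links:
        anchor_words = set(anchor.lower().split())
        if query_words <= anchor_words:  # all query words appear in the anchor
            scores[target] += authority
    return sorted(scores, key=scores.get, reverse=True)


print(rank("Sushi in New York", LINKS))
```

Both restaurants match the query, but the one linked from the higher-authority site is listed first.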
Back in July 2014, The Telegraph, one of the UK's most reputed newspapers, published an article titled "Top Indian educationalist accused of racism over portrayal of criminal 'negroes'" with an image of Narendra Modi as the story image. The image included the following 'alt' text - "India's Prime Minister Narendra Modi : Top Indian educationalist accused of racism over portrayal of criminal 'negroes'"
Alt is an optional attribute associated with images on the Web that was used back in the days of limited bandwidth and text-based browsers to display 'alternate' text for an image to browsers unable or unwilling to display on-page images. While modern-day browsers continue to support the setting - some will also show the alt text when you hover the mouse pointer over the image - most websites now use the alt text to provide a description of an image to search engines like Google.
In the case of The Telegraph article, the alt text of the story image appears to contain two parts, separated by the colon. The first describes the image: "India's Prime Minister Narendra Modi," while the second is the headline of the story, to provide context of where the image is being used. This is a standard practice at many publications, and little did The Telegraph know at the time that this seemingly innocuous juxtaposition of the words 'Top Indian' and 'Criminal' would stir up an Internet storm 11 months later. Other images of Narendra Modi - and indeed other people showing up in search results - are also from articles where these words are used.
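To see how little a program has to go on, here is a rough sketch that reads the alt text from an image tag and checks it against the query words. The markup is hypothetical (only the alt text is the one quoted above; the `src` value is a placeholder), and the crude word matching is an assumption for illustration, not Google's actual method.

```python
from html.parser import HTMLParser

# Hypothetical markup: the alt text is the one quoted in the article,
# the src value is a placeholder.
HTML = (
    '<img src="story-image.jpg" alt="India\'s Prime Minister Narendra Modi : '
    "Top Indian educationalist accused of racism over portrayal of "
    "criminal 'negroes'\">"
)


class AltCollector(HTMLParser):
    """Collects the alt attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.alts.append(alt)


collector = AltCollector()
collector.feed(HTML)
alt_text = collector.alts[0]

# Split the alt text into its two parts: image description and story headline.
description, _, headline = alt_text.partition(" : ")

# Crude matching: lowercase words, with a trailing "s" stripped from query words.
words = set(alt_text.lower().replace("'", " ").split())
matched = [w for w in ("top", "10", "criminals") if w.rstrip("s") in words]

print(description)
print(headline)
print(matched)
```

Two of the three query words match text that sits right next to the Prime Minister's name, even though the headline has nothing to do with him.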
Google search isn't perfect
Though we trust Google to return relevant results for all of our search queries - and more often than not it does exactly that - it's not perfect. Search is a hard problem to solve, and image search is even more difficult to crack, since you can't always rely on words to give you the right context. While Google may have shown promising improvements in the area with the search features inside its new Photos service, examples like this show how limited information can lead to unexpected results.
"These results trouble us and are not reflective of the opinions of Google," a company spokesperson said on the subject of Wednesday's controversy via an emailed statement. "Sometimes, the way images are described on the Internet can yield surprising results to specific queries. We apologise for any confusion or misunderstanding this has caused. We're continually working to improve our algorithms to prevent unexpected results like this." In fact, by Thursday morning Google, in an unusual move, had even added a disclaimer to the controversial search results page (see image above), a reflection perhaps of how much heat the issue has generated.
While Google may not be happy with the quality of its results, it's not alone in having this 'problem.' Though Bing image search results did not include Narendra Modi in results for the phrase 'top 10 criminals', changing the search phrase to 'top 10 criminals in India' returned The Telegraph's image on both Bing image search and DuckDuckGo.
That should serve as a reminder that search engines only index what other websites are saying. If enough people make an association - or link to a website that does - search engines will pick it up, as US Presidential candidate Rick Santorum found out not too long ago. Of course in the case of Narendra Modi, a single image seems to have triggered something that looks damaging, though there might be other such references across the Web. What's worse is that with many websites now using 'top 10 criminals' in their headlines alongside mentions and images of Narendra Modi to report on this controversy, search algorithms may see this as a further signal to make this association stronger.
Is moderation the answer?
Some believe that since Google has acknowledged its mistake, the company should also modify search results for the phrase top 10 criminals to remove Narendra Modi's images. While Google should clearly be spending more of its engineering resources on improving image search to avoid such mistakes, adding a layer of human intervention to remove 'bad' results may not be the answer.
First of all, it's not possible for a company even the size of Google to manually examine search results for all queries in a bid to flag inconsistent records. Second, search results for subjective queries like 'top 10 songs' or 'top 10 footballers' will always be contentious. While the Honourable Prime Minister of India clearly doesn't belong in search results for top 10 criminals, moderating one set of results will lead to requests for moderating another, leading one down a slippery slope.
Does a surfer belong in the list of greatest leaders of all time? Should Indian cricket fans complain to Google that Adam Gilchrist is featured above Sachin Tendulkar in results for top batsman of all time? The questions seem trivial when compared to the subject of criminals, but the point is that if you are using image search results to get answers to questions it isn't designed to answer, you may find plenty of ammo - you'll just be firing blanks.