If you’re like me, you’re tired of seeing twice as many pages indexed by Google as you have actual pages and products on your ecommerce website. One way to combat this is to use the Robots.txt file to Disallow Google, Bing, and other web bots from crawling the content of your pages.
But that doesn’t actually keep the pages out of the index! It only keeps them from being crawled and having their content indexed and searchable. The page URLs themselves can (and most likely will) still appear in search results, with a message on Google saying that the page’s content is blocked by the site’s Robots.txt file.
So using the Disallow directive in Robots.txt won’t reduce the number of pages indexed by the search engines. The only way to do that (besides tedious manual removal requests) is to include a NOINDEX robots meta tag on pages that you do not want indexed. Keep in mind that a crawler has to be able to fetch a page to see that tag, so a page you want Noindexed should not also be Disallowed in Robots.txt. We already took a look at Noindexing filtered navigation pages of product listings.
In this article, we’ll take a quick look at Noindexing the search results. Why is this important? Well, read it from the man himself, Matt Cutts, who wrote a “Search Results in Search Results” post on his blog all the way back in 2007. So yes, keeping Magento search results pages out of the Google index is important. Let’s look at how to do it!
Updating the local.xml Layout File
Navigate to the magentoroot/app/design/frontend/base/default/layout/local.xml file and open it. If it doesn’t exist, create it.
The code that we have to add to the file is this:
<?xml version="1.0" encoding="UTF-8"?>
<!-- other code -->
<layout>
    <!-- other layout code -->
    <catalogsearch_result_index translate="label">
        <reference name="head">
            <action method="setRobots"><value>NOINDEX,FOLLOW</value></action>
        </reference>
    </catalogsearch_result_index>
</layout>
Wrapping Up Noindexing Internal Search Results
That’s it? We’re done? Well, yeah, that’s basically all there is to it. In some cases, you may also want to Noindex the catalogsearch_advanced_index and catalogsearch_advanced_result pages (the advanced search form and its results). You can use the exact same code snippet as above, just replacing catalogsearch_result_index with each of the other two handles.
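For reference, here is a minimal sketch of what those two extra blocks might look like. It assumes the stock Magento handle names catalogsearch_advanced_index and catalogsearch_advanced_result and reuses the same setRobots action as the snippet above. The blocks belong inside the single <layout> element of the same local.xml file, not in a second <layout>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<layout>
    <!-- other layout code, including the catalogsearch_result_index block -->

    <!-- Advanced search form page (assumed stock handle name) -->
    <catalogsearch_advanced_index>
        <reference name="head">
            <action method="setRobots"><value>NOINDEX,FOLLOW</value></action>
        </reference>
    </catalogsearch_advanced_index>

    <!-- Advanced search results page (assumed stock handle name) -->
    <catalogsearch_advanced_result>
        <reference name="head">
            <action method="setRobots"><value>NOINDEX,FOLLOW</value></action>
        </reference>
    </catalogsearch_advanced_result>
</layout>
```

If your theme or an extension renames these handles, the names here will need adjusting to match. After saving, refresh the layout cache so the change is picked up.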
This is a pretty simple fix, and it can help keep a lot of your internal search pages out of Google’s index, especially if you occasionally link to a search result as the most relevant and user-friendly way to find a product or a list of related products.
If you followed along with the tutorial in the base default theme, you can run a search for anything, get the results, and view the page source. In the head of the page, you’ll see something like the following code, indicating that this simple fix has worked.
<title>Search results for: 'nokia'</title>
<meta name="description" content="Default Description" />
<meta name="keywords" content="Magento, Varien, E-commerce" />
<meta name="robots" content="NOINDEX,FOLLOW" />