Sample SEO Magento robots.txt file
Since I get a lot of requests for a robots.txt file designed for Magento SEO, here is a sample to get you started.
A very common question when it comes to eCommerce, and for that matter Magento SEO, is how a robots.txt file should look and what should be in it. For the purpose of this article, I decided to take all of our knowledge and experience, some sample robots.txt files from our clients' sites, and a few examples from other industry-leading Magento studios to try to figure out an ultimate Magento robots.txt file.
This Magento robots.txt makes the following assumptions:
We don’t differentiate between search engines, hence User-agent: *
We allow assets to be crawled
i.e. images, CSS and JavaScript files
We only allow SEF URLs set in Magento
e.g. no direct access to the front controller index.php, no viewing of categories and products by ID, etc.
We don’t allow filter URLs
Please note: The list provided is not complete. If you have custom extensions that use filtering, make sure to include their filter URLs and parameters in the filter URLs section.
We don’t allow session related URL segments
e.g. product comparison, customer, etc.
We don’t allow specific files to be crawled
e.g. READMEs, cron related files, etc.
Magento robots.txt
# Crawlers Setup
User-agent: *

# Directories
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /includes/
Disallow: /lib/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /shell/
Disallow: /var/

# Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
#Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Disallow: /catalog/product/gallery/

# Misc. files you don’t want search engines to crawl
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /composer.json
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt
Disallow: /mage
#Disallow: /modman
#Disallow: /n98-magerun.phar
Disallow: /scheduler_cron.sh
Disallow: /*.php$

# Disallow filter urls
Disallow: /*?min*
Disallow: /*?max*
Disallow: /*?q*
Disallow: /*?cat*
Disallow: /*?manufacturer_list*
Disallow: /*?tx_indexedsearch
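Before deploying a file like this, it can help to check a few of your own URLs against the rules. The following is a minimal Python sketch, not an official tool: it assumes the wildcard reading used by the major search engines, where `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else is a prefix match. The helper names (`rule_to_regex`, `is_blocked`) and the sample paths are made up for illustration, and the sketch ignores `Allow` rules and URL encoding, which is only safe here because the file above contains nothing but `Disallow` lines.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes "any run of characters", a trailing '$' anchors the end
    of the URL, and every other character is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' stood.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any Disallow pattern matches from the start of `path`."""
    # re.match anchors at the beginning, which gives prefix semantics.
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# A subset of the Disallow patterns from the file above.
RULES = [
    "/app/",
    "/index.php/",
    "/catalog/product/view/",
    "/customer/",
    "/cron.php",
    "/*.php$",
    "/*?min*",
    "/*?q*",
]

# Hypothetical store URLs (path plus query string) to try against the rules.
SAMPLE_PATHS = [
    "/shoes/running-shoes.html",
    "/media/catalog/product/a/b/shoe.jpg",
    "/catalog/product/view/id/42",
    "/index.php",
    "/shoes.html?min=10",
    "/shoes.html?qty=2",
    "/shoes.html?p=2&q=red",
]

for path in SAMPLE_PATHS:
    status = "blocked" if is_blocked(path, RULES) else "allowed"
    print(f"{status:8} {path}")
```

Two things stand out when running a check like this. A pattern such as `/*?q*` also catches any other parameter that starts with `q` (for example `?qty=2`), and because the filter patterns look for a literal `?` directly before the parameter name, they only match when that parameter comes first in the query string. If your filter parameters can appear after an `&`, you may want to add matching patterns for that case.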