# Sample robots.txt file (make sure the filename is ALL LOWERCASE on Linux/Unix systems)
# This file should go in your web site's ROOT directory
# The root directory is where your site's main /index.html file would be found
# It is usually found in /yourhomedir/public_html/ or /yourhomedir/httpdocs
# Where "yourhomedir" is your user account's name
#
# We invite you to also check out our popular contribution: Simple Template System (STS)
# It lets you lay out or change your OSC look-and-feel by modifying a single HTML file
# http://www.oscommerce.com/community/contributions,1524 or SimpleTemplateSystem.com
# Enjoy! - Brian Gallagher @ DiamondSea.com

# This says to apply these settings to ALL search engine spiders/crawlers
User-agent: *
# These settings will keep well-behaved spiders from crawling pages you do not want indexed
# The paths below assume that your OSC install is in your web site's /catalog/ directory
# i.e.: http://www.yoursite.com/catalog/index.php <- Use if this brings up your OSC main page
Disallow: /catalog/catalogues/
Disallow: /catalog/pub/
Disallow: /catalog/font/
Disallow: /catalog/download/
Disallow: /catalog/tmp/
Disallow: /catalog/export/
Disallow: /catalog/account
Disallow: /catalog/advanced_search.php
Disallow: /catalog/checkout
Disallow: /catalog/create
Disallow: /catalog/address
Disallow: /catalog/login.php
Disallow: /catalog/logoff.php
Disallow: /catalog/password_forgotten.php
Disallow: /catalog/popup_image.php
Disallow: /catalog/popup
Disallow: /catalog/shopping_cart.php
Disallow: /catalog/download.php

# IF YOU DO NOT WISH TO HAVE THE GOOGLE IMAGE BOT SCAN YOUR DOMAIN FOR IMAGES
# THEN YOU CAN INCLUDE THE FOLLOWING IN YOUR ROBOTS FILE.
# I FOUND THAT MY BANDWIDTH USAGE DROPPED BY A MASSIVE AMOUNT AFTER I GOT RID
# OF THE GOOGLE IMAGE BOT. ALL I HAD WAS IMAGE HUNTERS STEALING PRODUCT SHOTS
# AND NOT EVEN BROWSING THE SITE.
User-agent: Googlebot-Image
Disallow: /
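#
# A possible variant, offered as a sketch rather than a tested configuration:
# if your OSC install is in your web site's ROOT directory instead of /catalog/
# (ie: http://www.yoursite.com/index.php brings up your OSC main page), the same
# rules would simply drop the /catalog prefix. The lines are left commented out
# here so they have no effect; to use them, uncomment them and put them in the
# "User-agent: *" group in place of the /catalog/ lines.
# Disallow: /catalogues/
# Disallow: /pub/
# Disallow: /font/
# Disallow: /download/
# Disallow: /tmp/
# Disallow: /export/
# Disallow: /account
# Disallow: /advanced_search.php
# Disallow: /checkout
# Disallow: /create
# Disallow: /address
# Disallow: /login.php
# Disallow: /logoff.php
# Disallow: /password_forgotten.php
# Disallow: /popup_image.php
# Disallow: /popup
# Disallow: /shopping_cart.php
# Disallow: /download.php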