# Please follow the rules in this robots.txt
# All illegal activity is logged; this includes all requests made and the use of different user agents.
# Failure to follow these rules may result in a permanent ban.
#
# Request made with useragent: CCBot/1.0 (+http://www.commoncrawl.org/bot.html)
# This useragent is currently not known as a crawler.
# Known crawlers receive an optimized robots.txt that doesn't exclude the entire site.
#
# Crawlers receive different pages than normal visitors; *only* the following changes are made:
# - meta-tags are only sent to crawlers
# - URLs are adjusted: all links are absolute instead of relative, and some parameters are not sent to crawlers, including:
#    + the session-id
#    + parameter for filtering records
#    + parameter for ordering records
#    + parameter for pagination (only when viewing details)
#   This is done to prevent multiple URLs with the same content from ending up in search indexes.
#   Crawlers that do request a URI containing such a parameter are redirected using a `301 Moved Permanently` to the same page, but without those parameters.
#   This effectively sends the crawler to the page it is looking for.
# - Google AdSense ads may be removed from the page (if there are any on the page)
# - URIs that are excluded in the robots.txt may result in a `403 Forbidden` instead of the actual page.
#   These URIs shouldn't even be requested by the crawler in the first place.
# - redirect links that count the number of visitors going out to each URL are changed to the actual destination.
#
# Questions about this robots.txt or about crawlers can be sent to: crawlers.cms1@doppy.nl
# If any `black hat` technique is encountered, please notify us so we can take action to prevent that in the future.
#
User-agent: *
Disallow: /
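#
# The `301 Moved Permanently` rule for stripped parameters described above could be sketched roughly as follows.
# This is only an illustration, not the site's actual code: the parameter names (`sid`, `filter`, `order`, `page`)
# and the function name `canonical_redirect` are hypothetical, since this robots.txt does not name the real ones.
# For simplicity the sketch always strips the pagination parameter, whereas the rule above only does so when viewing details.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; the real ones are not listed in this robots.txt.
CRAWLER_STRIPPED_PARAMS = {"sid", "filter", "order", "page"}


def canonical_redirect(url, stripped=CRAWLER_STRIPPED_PARAMS):
    """Return (301, cleaned_url) if url carries a stripped parameter, else None."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Keep only the parameters that crawlers are allowed to see.
    kept = [(key, value) for key, value in query if key not in stripped]
    if len(kept) == len(query):
        return None  # nothing to strip, so serve the page as usual
    cleaned = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
    return 301, cleaned
```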