I have the following files on my server:
I want to exclude them all from being indexed. Is the following robots.txt enough?
Or even just:

    User-agent: *
    Disallow: /file
I understand that Google's bots would accept Disallow: /file as covering all the files mentioned (see https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt), but I want to address not only Google but all well-behaved bots, so my question is about the original robots.txt standard, not later extensions to it.
Answer:
In short, yes.
If you have:

    User-agent: *
    Disallow: /abc

it will block anything whose URL path starts with /abc, including:

    /abc
    /abc.html
    /abc/def.html
    /abcdef
This is part of the original robots.txt standard, and it applies to all robots that obey robots.txt.
The thing to remember about robots.txt is that it's deliberately not very sophisticated. It was designed to be simple and easy for crawlers to implement. Unless you use an extension (like wildcards), matching is a plain string comparison: a Disallow directive matches any URL whose path begins with the sequence of characters you give.
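The prefix comparison described above can be sketched in Python; this is a minimal illustration of the matching rule, not a full robots.txt parser, and the helper name is just for the example:

```python
def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any Disallow value is a prefix of the URL path.

    Under the original standard, matching is a simple prefix test.
    An empty Disallow value means "allow everything", so it is skipped.
    """
    return any(rule and path.startswith(rule) for rule in disallow_rules)

rules = ["/abc"]
print(is_disallowed("/abc.html", rules))   # True
print(is_disallowed("/abc/def", rules))    # True
print(is_disallowed("/abcdef", rules))     # True
print(is_disallowed("/xyz", rules))        # False
```

Note that this is exactly why Disallow: /abc also matches /abcdef: the comparison is on raw characters, not on path segments.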