I am using the EasyDNN News module for the blog, news articles, etc. on our DNN website. The core DNN sitemap does not include the articles generated by this module, but the module creates its own sitemap.
For example: domain.com/blog/mid/1005/ctl/sitemap
When I try to submit this sitemap to Google, it reports that my robots.txt file is blocking it.
Looking at the robots.txt file that ships with DNN, I noticed the following lines under the Slurp and Googlebot user-agents:
Disallow: /*/ctl/ # Slurp permits *
Disallow: /*/ctl/ # Googlebot permits *
I'd like to submit the module's sitemap, but first I'd like to understand why /ctl is disallowed for these user-agents, and what the impact would be if I simply removed these lines from the file, specifically as it pertains to Google crawling the site.
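For what it's worth, one alternative I've been considering, rather than deleting the lines outright, is adding a more specific Allow rule alongside each Disallow so that only the sitemap URL is unblocked. Googlebot honors Allow directives and, when rules conflict, follows the more specific (less restrictive) match; I'm assuming Slurp behaves similarly, and the wildcard path below is my guess at a pattern matching the module's sitemap URL shown above:

User-agent: Slurp
Allow: /*/ctl/sitemap    # unblock only the module sitemap (path pattern is my assumption)
Disallow: /*/ctl/        # Slurp permits *

User-agent: Googlebot
Allow: /*/ctl/sitemap    # unblock only the module sitemap (path pattern is my assumption)
Disallow: /*/ctl/        # Googlebot permits *

Would that be a safer change than removing the Disallow lines entirely?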
As an added reference, I have read the article below about avoiding a duplicate content penalty by disallowing specific URLs that contain /ctl, such as login, register, terms, etc. I'm wondering if this is why DNN disallowed any URL containing /ctl.
http://www.codeproject.com/Articles/18151/DotNetNuke-Search-Engine-Optimization-Part-Remov