Denybot
From Contao Community Documentation
Latest revision as of 14:03, 9 June 2011
denybot - Hide page content for search engine bots.
| Extension-Overview | |
|---|---|
| Name of the developer | weke |
| Version of the extension | 1.0.0 stable |
| Compatibility with Contao Version | 2.9.5 - 2.9.5 |
| Link to Extension Repository | http://www.contao.org/erweiterungsliste/view/denybot.html |
| Depends on the extension | Bot Detection |
Summary
This extension denies access to all or individual web pages for detected web crawlers.
Functionality
The extension Bot Detection is used to identify the calling agent. If the calling agent is a search engine bot, the real content of the web page is hidden and a 404 error page is returned instead.
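The detection step can be sketched as a simple user agent check. The following is a minimal Python illustration only; the signature list and function name are assumptions for this sketch, not the actual code or catalogue of the Bot Detection extension:

```python
# Hypothetical signature list -- the real Bot Detection extension maintains
# a far larger catalogue of user agent strings and IP addresses.
BOT_SIGNATURES = ("googlebot", "bingbot", "yandexbot", "slurp")

def is_search_engine_bot(user_agent: str) -> bool:
    """Return True if the user agent string matches a known bot signature."""
    ua = user_agent.lower()
    return any(signature in ua for signature in BOT_SIGNATURES)

print(is_search_engine_bot("Mozilla/5.0 (compatible; Googlebot/2.1)"))   # True
print(is_search_engine_bot("Mozilla/5.0 (Windows NT 10.0; Win64; x64)")) # False
```

A substring match like this is deliberately simple; as the Restriction section notes, no user-agent-based detection is fully reliable, since any client can send an arbitrary user agent string.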
Configuration
In the system settings, the section Global configuration provides an option to exclude search engines globally.
If this option is set, every front end request from an identified search engine bot returns a 404 error page.
If this option is not set, only pages whose robots tag is set to "noindex,nofollow" or "noindex,follow" in their page configuration generate the 404 error page for detected search engine bots.
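The two configuration cases above can be summarized as a small decision function. This is a Python sketch of the described behavior; the function and parameter names are hypothetical and are not part of the Contao or denybot API:

```python
def should_serve_404(is_bot: bool, global_exclude: bool, robots_tag: str) -> bool:
    """Decide whether a request should get the 404 error page instead of content."""
    if not is_bot:
        return False  # regular visitors always see the real content
    if global_exclude:
        return True   # global option set: every page is hidden from bots
    # global option not set: only pages marked "noindex" are hidden
    return robots_tag in ("noindex,nofollow", "noindex,follow")

print(should_serve_404(True, True, "index,follow"))       # True
print(should_serve_404(True, False, "noindex,nofollow"))  # True
print(should_serve_404(True, False, "index,follow"))      # False
```

Note that with the global option unset, a bot visiting a page tagged "index,follow" still sees the normal content; only the "noindex" variants trigger the 404 page.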
Why should I use this extension
A normal web site is designed to be found on the internet.
This extension serves the opposite purpose and tries to hide the content: search engines should only find non-existing pages.
This extension is therefore designed especially for private use. If you want to personally announce a web page you created, e.g. to stay in contact with your family and friends across the world, without inviting everyone on the internet, this extension can shield your pages from search engine bots. It should only be used as an additional measure alongside other protection, such as a login; it cannot replace any other type of data protection.
Restriction
The detection of a search engine bot depends on the quality of the Bot Detection engine. Its author, BugBuster, explicitly points out: "There is no reliable detection guaranteed." Search engines are detected by their user agent string and their IP address. BugBuster continues to collect a large list of different and popular search engines and keeps improving the quality of detection.