Copyright (C) 2000 Bastian Kleineidam
You can choose between two licenses when using this package:
1) GNU GPLv2
2) PSF license for Python 2.2
The robots.txt Exclusion Protocol is implemented as specified in
http://www.robotstxt.org/norobots-rfc.txt
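
An illustrative robots.txt in that format (a made-up example, not taken
from any real site):

    # Hypothetical robots.txt, for illustration only
    User-agent: *
    Disallow: /private/
    Disallow: /tmp/

    User-agent: ExampleBot
    Disallow: /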
class RobotFileParser

This class provides a set of methods to read, parse and answer
questions about a single robots.txt file.

Methods defined here:
- __init__(self, url='')
- can_fetch(self, useragent, url)
  Using the parsed robots.txt, decide whether useragent may fetch url
  (see the usage sketch after this list).
- modified(self)
  Sets the time the robots.txt file was last fetched to the
  current time.
- mtime(self)
  Returns the time the robots.txt file was last fetched.
  This is useful for long-running web spiders that need to
  check for new robots.txt files periodically (see the refresh
  sketch at the end of this section).
- parse(self, lines)
  Parse the input lines from a robots.txt file.
  We allow a user-agent: line that is not preceded by
  one or more blank lines.
- read(self)
  Reads the robots.txt URL and feeds it to the parser.
- set_url(self, url)
  Sets the URL referring to a robots.txt file.
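
A minimal usage sketch, assuming the class is imported from
urllib.robotparser as in modern Python (this module shipped standalone
as robotparser in Python 2); the URLs, file content, and user agent
below are placeholders:

    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    rp.set_url("http://www.example.com/robots.txt")  # placeholder URL
    rp.read()  # fetch the file and feed it to the parser

    # Decide whether our crawler may download a given page.
    print(rp.can_fetch("MyCrawler/1.0",
                       "http://www.example.com/private/page.html"))

    # If you fetch the file by other means, feed its lines to parse():
    data = "User-agent: *\nDisallow: /private/\n"  # hypothetical content
    rp2 = RobotFileParser()
    rp2.set_url("http://www.example.com/robots.txt")
    rp2.parse(data.splitlines())
    print(rp2.can_fetch("MyCrawler/1.0",
                        "http://www.example.com/private/page.html"))  # False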
__all__ = ['RobotFileParser']
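
For long-running spiders, mtime() and modified() support the periodic
re-check mentioned above. A sketch under the same import assumption; the
refresh interval and URL are illustrative:

    import time
    from urllib.robotparser import RobotFileParser

    REFRESH_SECONDS = 3600  # illustrative: re-check robots.txt hourly

    rp = RobotFileParser("http://www.example.com/robots.txt")  # placeholder
    rp.read()
    rp.modified()  # record the fetch time so mtime() reflects it

    def allowed(url, useragent="MyCrawler/1.0"):
        # mtime() returns the time of the last fetch; re-read when stale.
        if time.time() - rp.mtime() > REFRESH_SECONDS:
            rp.read()
            rp.modified()
        return rp.can_fetch(useragent, url)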