LWP::RobotUA - A class for Web Robots
require LWP::RobotUA;
$ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com');
$ua->delay(10); # be very nice, go slowly
...
# then use it just like a normal LWP::UserAgent
$res = $ua->request($req);
This class implements a user agent that is suitable for robot applications.
Robots should be nice to the servers they visit: they should consult the
robots.txt file to make sure they are welcome, and they should not send
requests too frequently.
But before you consider writing a robot, take a look at
<URL:http://info.webcrawler.com/mak/projects/robots/robots.html>.
When you use an LWP::RobotUA as your user agent, you do not have to think
about these things yourself. Just send requests as you would with a normal
LWP::UserAgent, and this special agent will make sure you are nice.
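To make the robots.txt check concrete, here is a minimal sketch that uses WWW::RobotRules directly, without any network access. The robots.txt content and the example.com URLs are made up for illustration; a real robot would fetch the file from the server it is about to visit.

```perl
use strict;
use warnings;
use WWW::RobotRules;

# Hypothetical robots.txt content (an assumption for this example).
my $robots_txt = <<'EOT';
User-agent: *
Disallow: /private/
EOT

# The rules object is keyed on the robot's name.
my $rules = WWW::RobotRules->new('my-robot/0.1');

# Tell the rules object which URL the robots.txt text came from.
$rules->parse('http://www.example.com/robots.txt', $robots_txt);

for my $url ('http://www.example.com/index.html',
             'http://www.example.com/private/data.html') {
    my $verdict = $rules->allowed($url) ? 'allowed' : 'disallowed';
    print "$url: $verdict\n";
}
```

With these made-up rules, the first URL is reported as allowed and the second as disallowed.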
LWP::RobotUA is a subclass of LWP::UserAgent and implements the same
methods. The use_alarm() method also decides what happens when a request
is attempted too early: if true, the agent waits; if false, it returns an
error response.
In addition, these methods are provided:
The constructor requires your robot's name and the mail address of the
human responsible for the robot (i.e. you).
Optionally, you can also specify the WWW::RobotRules object to use.
Set the minimum delay between requests to the same server, in minutes. The
default is 1 minute.
Set/get which WWW::RobotRules object to use.
Returns the number of documents fetched from this server host.
Returns the number of seconds you must wait before you can make a new
request to this host.
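The relationship between the minimum delay (in minutes) and the remaining wait (in seconds) can be sketched as simple arithmetic. The helper seconds_to_wait below is hypothetical and is not part of LWP::RobotUA; it only illustrates the calculation under the assumption that the agent remembers the time of its last visit to each host.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LWP::RobotUA.
# $delay_minutes: minimum delay between requests, in minutes
# $last_visit:    epoch time of the last request to the host (undef if none)
# $now:           current epoch time
sub seconds_to_wait {
    my ($delay_minutes, $last_visit, $now) = @_;
    return 0 unless defined $last_visit;    # never visited: no wait needed
    my $wait = int($delay_minutes * 60 - ($now - $last_visit));
    return $wait > 0 ? $wait : 0;           # never report a negative wait
}

my $now = time;
print seconds_to_wait(1, undef,     $now), "\n";   # prints 0
print seconds_to_wait(1, $now - 20, $now), "\n";   # prints 40
print seconds_to_wait(1, $now - 90, $now), "\n";   # prints 0
```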
Returns a text string that describes the state of the UA. Mainly useful
for debugging.
LWP::UserAgent, WWW::RobotRules
Gisle Aas <aas@sn.no>