By default, the parser processes the description file and any links
in it, but this can be changed using the -M or --maxdepth option. To
make sure that we don't bite off more than we can chew, we can also
add the --stayonhost option, which keeps the parser from following
any external links:
% Spider.py -H http://plucker.gnu-designs.com/ -M3 --stayonhost -f PluckerDB
This would download the index page for Plucker's web site, any links
within this page and also any links within those pages, but it would
disregard any links outside the gnu-designs.com host.
These options are most useful when you name a single host with -H. If
you use a description file, it is better to set the MAXDEPTH and
STAYONHOST flags on each link instead.
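
The way a depth limit and a stay-on-host rule interact can be sketched in a few lines of Python. This is only an illustration, not the code Spider.py actually uses: the crawl() function, the SITE table and the example.org URLs are all made up for the example, pages are looked up in a dictionary instead of being downloaded, and the start page is assumed to count as depth 1 (which matches the -M3 example above, where the index page, its links, and their links are fetched).

```python
from collections import deque
from urllib.parse import urlparse


def crawl(start_url, links, max_depth, stay_on_host=False):
    """Return the URLs a depth-limited crawl would fetch, breadth first.

    `links` maps a URL to the URLs it links to; it stands in for real
    downloading and parsing. The start page counts as depth 1 (assumption).
    """
    start_host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 1)])
    fetched = []
    while queue:
        url, depth = queue.popleft()
        fetched.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for target in links.get(url, []):
            if target in seen:
                continue
            if stay_on_host and urlparse(target).netloc != start_host:
                continue  # external link: skipped when staying on host
            seen.add(target)
            queue.append((target, depth + 1))
    return fetched


# Hypothetical link structure, used only for this illustration.
SITE = {
    "http://example.org/": ["http://example.org/a.html",
                            "http://other.example.net/"],
    "http://example.org/a.html": ["http://example.org/b.html"],
    "http://example.org/b.html": ["http://example.org/c.html"],
    "http://other.example.net/": ["http://other.example.net/x.html"],
}

on_host = crawl("http://example.org/", SITE, 3, stay_on_host=True)
everything = crawl("http://example.org/", SITE, 3)
```

With stay_on_host set, on_host holds only the three example.org pages reachable within depth 3; c.html is left out because it sits at depth 4. Without it, the crawl also picks up the other.example.net pages, which is the kind of growth --stayonhost is meant to prevent.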