The home page is the first page you see when you start Plucker on your
PDA, and is also the page you see when you tap on the home icon
(the little house) in the viewer on the Palm OS hand-held device. The
home page is by default created using the description file at
$HOME/.plucker/home.html.
When you installed Plucker, a default description file was put in
$HOME/.plucker. You can change this file in any text editor,
and you do not need deep knowledge of HTML to edit it. We will
explain how to do this step by step in a minute.
Except for MAXDEPTH and similar
attributes, the description file is like any other HTML file.
This also means that you can view home.html in your normal
web browser (e.g. Netscape). In fact, you can even use a normal
web page on some web server as your home document. See chapter
4 for details on how to do this.
Prior to performing a HotSync, you have to tell Plucker where to
grab the pages that you want to view. Plucker starts by scanning the
description file (also referred to as the home document) that
you have defined. As stated above, if you did not define otherwise, this
will be $HOME/.plucker/home.html.
The parser finds any links in that file and follows them. The target of
each link (e.g. something like <A HREF="...">) is read from the Internet,
stored and parsed on your hard disk, and included in the database that
you will later sync to your Palm. Let us explore the description
file in more detail.
A simple, typical home document will look like this (without the
line numbers; they are only added for easier reference later on):
[01] <H1>Plucker Home</H1>
[02]
[03] <H2>Plucker Information</H2>
[04] <P><A HREF="http://plucker.gnu-designs.com"> Plucker home page</A><P>
[05]
[06] <H2>Linux links</H2>
[07] <A HREF="http://slashdot.org/index.pl?light=1&noboxes=1" NOIMAGES MAXDEPTH=1>Slashdot.org</A><P>
[08] <H2>News</H2>
[09] <A HREF="http://channel.nytimes.com/partners/palm-pilot/summ.html" MAXDEPTH=2>New York Times</A><P>
[10] <A HREF="http://www.news.com/Newsfeed/Avantgo/index.html" MAXDEPTH=2>C-Net NEWS.COM</A><P>
Here you see several typical examples. First of all, you may note that
this document follows the general outline of normal HTML, as stated
above. If you do not know HTML already, do not worry. What you need
here is really very easy. Let us look through it line by line. (If you
already know HTML, this will be somewhat boring to you.)
First you note that commands in HTML (called tags) are enclosed in angle
brackets. Most commands have a beginning and an end that use the same tag
name, with the end marked by an additional slash in front of the name. The
first and third lines, for example, create headlines. The numbers specify
different heading levels, which are shown in different font sizes.
More interesting is the 4th line. First it starts a new paragraph
(<P>) and then you see the most important tag in HTML: a link. Links
have the following form:
<A HREF="http://plucker.gnu-designs.com">Plucker home page</A>
Enclosed in quotes you find the page the link refers to (the URL). Then,
after a closing angle bracket, you see the title of the link that
is displayed to the user. You will see exactly this text within
Plucker on your home document.
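To make the two parts of a link concrete, here is a minimal sketch in Python that pulls the URL and the title out of a link. It only illustrates the idea and is not how Plucker's own parser is written; the names LinkCollector and extract_links are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (url, title) pairs from every <A HREF="..."> element."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (url, title) pairs
        self._href = None   # URL of the link currently open, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names in lowercase.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the title.
            title = " ".join("".join(self._text).split())
            self.links.append((self._href, title))
            self._href = None


def extract_links(html):
    """Hypothetical helper: return all (url, title) pairs found in html."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


sample = '<P><A HREF="http://plucker.gnu-designs.com"> Plucker home page</A><P>'
print(extract_links(sample))
```

Run on line 4 of the sample home document, this prints one pair: the URL that would be fetched and the title that would be shown.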
Now you know the basic procedure for telling Plucker to get a specific
web page for you. What does Plucker do if it finds a tag like this in
home.html? It will simply follow the URL you specified and get
this page. Note that as home.html is plain HTML, you can also view it
in your normal web browser and use the links the same way
Plucker does. That way you can easily check whether all links are
working as you expected without having to run the parser and sync
each time.
Well, it is nice to grab a web page, but what about a newsticker
that lists only the headlines? You want to retrieve the articles as
well. Let's have a look at line 9. You will note that this
time the link is enhanced by an additional attribute: MAXDEPTH=2. This
is the way to instruct Plucker to retrieve deeper levels from a
web server. You can give MAXDEPTH any number as its value.
MAXDEPTH=2 means to load the target (linked-to) page and any
targets linked from that page. This will do the job: Plucker will first
grab the headlines and then follow all linked pages to get the articles
as well.
MAXDEPTH=3 will load the target page, its linked pages, and any
pages linked from those, and so on. You really do not want to set
MAXDEPTH too high, because the number of pages grows very quickly with
each level. Even a MAXDEPTH well under 50 would probably try to load
the entire Internet, so you may run into some storage problems...
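The effect of MAXDEPTH can be sketched with a small made-up link graph. The following Python example is an illustration only: the page names and the function pages_within_depth are hypothetical, and the real parser may handle details differently. The linked-to page itself counts as depth 1, as described above.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
SITE = {
    "headlines": ["article-1", "article-2"],
    "article-1": ["related-story"],
    "article-2": [],
    "related-story": ["old-archive"],
    "old-archive": [],
}


def pages_within_depth(start, maxdepth, links=SITE):
    """Return the pages loaded for start, which itself counts as depth 1."""
    seen = {start}
    order = [start]
    queue = deque([(start, 1)])
    while queue:
        page, depth = queue.popleft()
        if depth >= maxdepth:
            continue  # do not follow links from pages at the depth limit
        for target in links.get(page, []):
            if target not in seen:  # load every page only once
                seen.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order


for depth in (1, 2, 3):
    print(depth, pages_within_depth("headlines", depth))
```

With a depth of 1 only the headlines page is loaded, a depth of 2 adds the two articles, and a depth of 3 also pulls in the related story.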
MAXDEPTH is one of the most important attributes used to customize
the information to download. Another important attribute is
NOIMAGES. An example of how to use it can
be found in line 7. If you do not want to download any images, you
should specify this attribute. Simply add it after the URL to
pluck. As you see in line 7, you can combine the various attributes:
line 7 instructs Plucker to download only the title page, without
images. You also see that you can explicitly give MAXDEPTH a value
of 1. This does not seem to make much sense at first
glance, as our first example showed that leaving out
MAXDEPTH gives the same result. The
reason to give an explicit MAXDEPTH is that you can define the
default depth for Plucker in Plucker's configuration file, which we
will talk about later on. So if you set the default depth to 2, for example,
but this page should definitely be plucked only to a depth of 1, you can
specify that explicitly here instead of defining a depth of 2 for all
other pages. (It is not a bad idea to define the depth explicitly for
all pages, as you can then see at a glance how deep Plucker will
go.)
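The interplay between an explicit MAXDEPTH and the configured default can be sketched as follows. This is an illustration under stated assumptions, not Plucker's actual code: the names LinkOptions and read_options are invented, and the default depth is simply passed in as a parameter instead of being read from the configuration file.

```python
from html.parser import HTMLParser


class LinkOptions(HTMLParser):
    """Record the URL, depth, and image setting of each link."""

    def __init__(self, default_depth):
        super().__init__()
        self.default_depth = default_depth
        self.entries = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # Names arrive in lowercase; NOIMAGES has no value, so it maps to None.
        options = dict(attrs)
        if "href" not in options:
            return
        depth = options.get("maxdepth")
        self.entries.append({
            "url": options["href"],
            # An explicit MAXDEPTH wins; otherwise fall back to the default.
            "maxdepth": int(depth) if depth else self.default_depth,
            "noimages": "noimages" in options,
        })


def read_options(html, default_depth):
    """Hypothetical helper: return one settings dictionary per link."""
    parser = LinkOptions(default_depth)
    parser.feed(html)
    parser.close()
    return parser.entries


home = """
<A HREF="http://plucker.gnu-designs.com">Plucker home page</A><P>
<A HREF="http://slashdot.org/index.pl?light=1&noboxes=1" NOIMAGES MAXDEPTH=1>Slashdot.org</A><P>
"""

for entry in read_options(home, default_depth=2):
    print(entry["maxdepth"], entry["noimages"], entry["url"])
```

With an assumed default depth of 2, the first link inherits a depth of 2, while the Slashdot link keeps its explicit depth of 1 and is marked as "no images".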
Line 7 also shows another capability of Plucker. You might have
wondered about the odd characters that appear in the URL. With
that URL you instruct Plucker to request the result of a so-called
CGI and to pass some parameters as well. A CGI is basically a
script run on a web server to gather specific information, e.g. from
a database. You do not need to worry about the details. The easiest
way to get the correct URL is always to point your web browser to the
page where the information is located and copy the URL from the
browser's URL field into the home document of Plucker. Here we just
wanted to show that Plucker can handle even such unusual URLs.
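If you are curious what the characters in such a URL mean, the following short Python sketch takes the Slashdot URL from line 7 apart using the standard urllib.parse module. It is only meant to show the structure of the URL; you do not need anything like it to use Plucker.

```python
from urllib.parse import parse_qs, urlsplit

url = "http://slashdot.org/index.pl?light=1&noboxes=1"

parts = urlsplit(url)
params = parse_qs(parts.query)  # the part after "?" holds the parameters

print(parts.netloc)  # the web server
print(parts.path)    # the script that is run on the server
print(params)        # the parameters passed to the script
```

The "?" separates the script from its parameters, and each "&" separates one name=value pair from the next.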
Hint: If you specify a CGI for a newsticker and you always get the
same news it is most likely that the date of the issue is passed to
the script via the URL.
NOTE: MAXDEPTH alone will most likely be sufficient for web pages that
were written for PDAs. If you are plucking normal web pages, be
careful with MAXDEPTH, as many pages contain a menu to
navigate the site, and Plucker will follow these links as well. Plucker
does not distinguish between a menu link and a normal link.
That way, a MAXDEPTH=2 that is meant to download an overview page
and the articles (e.g. of your newspaper) could easily result in Plucker
trying to retrieve the news archive of your favourite newspaper as well
(since it is linked from the menu). So it is wise to watch
the first runs of Plucker when you add new pages to your description file,
especially if you do not use PDA-optimized pages. There are very
effective ways to prevent Plucker from running into this kind of problem.
Besides the MAXDEPTH and NOIMAGES attributes, there are various
other attributes that influence the gathering process. For example, you
can have Plucker exclude specific links. See chapter
4 below for details. We will not go into too much detail
here.
Hint: Web pages created for handhelds have some advantages over normal
web pages. Plucker can retrieve normal web pages without trouble, but
specially designed pages are normally smaller, contain fewer graphics, etc.
Hint: Plucker cannot handle frames. Usually, pages designed
for handhelds do not contain frames.
Hint: A good collection of links to handheld-friendly sites is
included with Plucker.
Our German speaking users can find sites with mostly German content at http://www.palmtop-portal.de.