Archie (Archive Server Listing Service)
|
"If the hill will not come to Mahomet,
Mahomet will come to the hill."
-- Francis Bacon, Of Boldness
|
|
Wouldn't it be great if there were some sort of "search" program
that would look through hundreds of different anonymous ftp
sites and tell us where all of the files that we want are
located? Well, such a search program exists. It is called
Archie (Archive Server Listing Service).
Archie is actually a collection of servers. Each of these
servers is responsible for keeping track of file locations
in several different anonymous ftp sites. All of the Archie
servers talk to each other, and they pool their information
into a huge, global database.
Administrators all over the world register anonymous FTP
servers with the archie service; once a month the archie service
runs a program which scans the directories and filenames
contained in each of the registered FTP servers, and generates a
grand merged list of all the files and directories contained
in all the registered servers. More than 2300 anonymous FTP
sites are now represented in this list, which is referred
to as the archie database. The archie database currently contains
approximately 20,000,000 unique filenames, which together
represent 4 terabytes (that is, 4,000,000,000,000 bytes)
of information.
Files made available at anonymous FTP sites contain software packages
for various systems (Windows, DOS, Macintosh, Unix, etc.), utilities,
information or documentation, mailing lists or Usenet group discussion
archives. At most FTP sites, the resources are organized hierarchically
in directories and subdirectories. The archie database contains both the
directory path and the file names.
You can search this database for file locations simply by giving
an Archie client or server a keyword to search for. (The archie database
is available to all users of the Internet, and can
also be accessed via electronic mail.)
A few minutes ago I did an Archie search using the keyword
"linear". Archie sent me back a whole bunch of information
in the following format:
Host triples.math.mcgill.ca (132.206.150.30)
Last updated 22:41 23 Dec 1998
Location: /pub/rags
DIRECTORY drwxr-xr-x 512 01:55 17 Jun 1997 linear
|
What does all of this tell me? It tells me that the address of
the anonymous ftp site is
triples.math.mcgill.ca (132.206.150.30)
and that the matching entry, a directory named "linear", is located at
/pub/rags/linear
Archie doesn't retrieve the file for me, but it does tell
me exactly where the file that I am looking for is located.
Once I know the file's location (and its filename), retrieving
the file using ftp is easy.
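If you save a listing like the one above to a file, a few lines of Python can pull out the host and the full path for you. This is only a minimal sketch: it assumes the listing looks exactly like the sample shown earlier (Host, Last updated, Location, then FILE or DIRECTORY lines), and the names Entry, parse_listing and ftp_url are my own inventions, not part of Archie.

```python
import re
from dataclasses import dataclass

# Sample listing copied from the "linear" search shown earlier.
SAMPLE = """\
Host triples.math.mcgill.ca (132.206.150.30)
Last updated 22:41 23 Dec 1998
Location: /pub/rags
DIRECTORY drwxr-xr-x 512 01:55 17 Jun 1997 linear
"""

HOST_RE = re.compile(r"^Host\s+(\S+)\s+\(([\d.]+)\)")
LOCATION_RE = re.compile(r"^Location:\s+(\S+)")


@dataclass
class Entry:
    host: str
    address: str
    location: str
    kind: str  # "FILE" or "DIRECTORY"
    name: str

    @property
    def path(self) -> str:
        # Join the Location line and the entry name into one path.
        return self.location.rstrip("/") + "/" + self.name

    @property
    def ftp_url(self) -> str:
        return f"ftp://{self.host}{self.path}"


def parse_listing(text: str) -> list:
    """Turn an Archie-style listing into a list of Entry records."""
    entries = []
    host = address = location = None
    for raw in text.splitlines():
        line = raw.strip()
        fields = line.split()
        host_match = HOST_RE.match(line)
        location_match = LOCATION_RE.match(line)
        if host_match:
            host, address = host_match.group(1), host_match.group(2)
            location = None  # a new host starts with no Location yet
        elif location_match:
            location = location_match.group(1)
        elif fields and fields[0] in ("FILE", "DIRECTORY") and host and location:
            # Assumption: the name is the last whitespace-separated field.
            entries.append(Entry(host, address, location, fields[0], fields[-1]))
    return entries


for entry in parse_listing(SAMPLE):
    print(entry.kind, entry.ftp_url)
```

The Entry record keeps the Location and the name apart because Archie reports them on separate lines; the path property simply joins the two, which is the same step I did by hand above.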
|
|
There are three ways that you can access Archie:
- through an Archie client running on your local system,
- through a telnet connection directly to an Archie server, or
- by sending an e-mail letter directly to an Archie server.
The load on all of the Archie servers is incredible. If your site
has its own Archie client, you should use that client instead of
telnetting or e-mailing to a distant Archie server.
|
Accessing Archie using a Local Archie Client
|
|
To find out if your site is running its own Archie client, type
the word "archie" at your system prompt:
$ archie
and see what happens. If you don't get an error message, you can
safely assume that your site has its own Archie client :)
To actually conduct an Archie search using your site's Archie
client, type
$ archie <search term>
replacing <search term> with what you want to search
for. For example:
What you want Archie to search for | You type
Files and directories that have the word "post" in their titles | $ archie post
Files that have the extension .dll | $ archie .dll
For example, to retrieve a list of ftp servers
with file(s) or directories containing the string "archie", type
$ archie -s archie
Archie will then send you the following results (partial listing):
Host ftp.csua.berkeley.edu (128.32.43.51)
Last updated 02:14 1 Aug 1998
Location: /pub
DIRECTORY drwxr-xr-x 512 20:00 26 Oct 1996 archie
Host ftp.nau.edu (134.114.96.15)
Last updated 02:06 13 Jun 1998
Location: /gopher/general/departs/cts/support/netwrkng/tcp/netsoft
FILE -rwxrwxr-x 7532 05:00 19 Aug 1994 archie
Host rcs1.urz.tu-dresden.de (141.30.61.11)
Last updated 02:36 31 Mar 1998
Location: /pub/soft/unix/bsd/FreeBSD/FreeBSD-CVS/ports/net
DIRECTORY drwxr-xr-x 8192 08:36 22 May 1997 archie
Location: /pub/soft/unix/comp.sources.misc/volume22
DIRECTORY drwxr-xr-x 8192 06:00 24 Oct 1996 archie
Location: /pub/soft/unix/comp.sources.misc/volume33
DIRECTORY drwxr-xr-x 8192 06:00 24 Oct 1996 archie
Host ftp.iij.ad.jp (202.232.2.51)
Last updated 20:25 12 Jun 1998
Location: /network
DIRECTORY drwxr-xr-x 1024 05:00 25 Apr 1996 archie
Location: /NetNews/comp.sources.misc/volume22
DIRECTORY drwxr-xr-x 512 05:00 23 Apr 1996 archie
Location: /FreeBSD/ports-2.1.6/news
DIRECTORY drwxr-xr-x 512 00:25 12 Jun 1997 archie
............
|
There are lots of options available; read the manual
with the 'help' command (no quotes).
For additional archie options, RTFM or use "archie -h".
FYI, various public domain clients for Windows, MS-DOS, OS/2, VMS, Unix
(including Linux), Macintosh and X-Windows are available from most
anonymous ftp sites, and are found in the directories /pub/archie/clients or /archie/clients.
Some of you may be wondering why the Anonymous FTP Sitelist exists
if archie can find files. There are two reasons. First, archie does not
(yet) work with non-Unix sites, the number of which will increase
substantially year after year. Second, different archie servers can give
you different answers, depending on the ftp sites they currently have in
their memory. Using a European server you might not be able to find a file
in the US, but a US server might find the file(s) you need, and vice versa.
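If you do end up asking more than one server, you will want to merge the answers and throw away the duplicates. Here is a rough sketch of that idea. The host names and paths come from the partial listing shown earlier, but which server returns which entry is purely hypothetical, and merge_results is a made-up helper name.

```python
def merge_results(*result_sets):
    """Union several servers' answers, keeping the first copy of each (host, path)."""
    seen = set()
    merged = []
    for results in result_sets:
        for host, path in results:
            key = (host.lower(), path)  # host names are not case sensitive
            if key not in seen:
                seen.add(key)
                merged.append((host, path))
    return merged


# Hypothetical answers from two different archie servers.
european_server = [
    ("ftp.iij.ad.jp", "/network/archie"),
    ("rcs1.urz.tu-dresden.de", "/pub/soft/unix/comp.sources.misc/volume22/archie"),
]
us_server = [
    ("ftp.csua.berkeley.edu", "/pub/archie"),
    ("ftp.iij.ad.jp", "/network/archie"),
]

combined = merge_results(european_server, us_server)
for host, path in combined:
    print(host, path)
```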
|
Accessing Archie by Telnet
|
|
The following are a few of the Archie servers that you can access using
telnet. At the login: prompt enter 'archie' (no quotes). The login procedure leaves the
user at the prompt archie> indicating that
the server is ready for user requests.
$ telnet <archie host address>
login: archie
archie>
There are several archie servers you can telnet to (see below for the list).
I normally use the archie server at Rutgers University (archie.rutgers.edu),
which always seems faster than the others. Anyhow, if possible, use the server
that is closest to you.
Once connected, to find a file or directory called 'filename' you would
type 'prog filename' or 'find filename' (no quotes), depending on
which archie server you're accessing, at the archie> prompt.
archie> prog linear
# Search type: exact.
working...
.
.
.
It's great to see the list of all the ftp sites that contain the file(s) you're
looking for. However, the list scrolls by too quickly, and there seems to be
no way to redirect the search result. Don't worry. After Archie has finished its
search and printed its results on your screen, you can have archie e-mail
the results to you by typing
archie> mail <your e-mail address>
Finally, to quit your telnet archie session, type
archie> quit
# Bye.
Connection closed by foreign host.
$
Some suggestions on using Telnet Archie servers:
- Avoid connecting during working hours; most of the archie servers
are not dedicated machines - they have local functions as well.
- Make your queries as specific as possible; the response will be
quicker and shorter.
- An Archie client installed on your computer helps to reduce the load
on the server sites, so please use the client instead of telnet.
- Use the archie server closest to you and, in particular, don't
overload the transatlantic lines.
To get an updated archie server list, type:
telnet archie.ans.net (or any other archie server)
and login as 'archie' (no quotes) and type
'servers' (again, no quotes).
Of course you can also try a server somewhat closer but this list
is from archie.ans.net.
|
Frequently Used Telnet Archie Commands
|
|
The following archie commands are available for
telnet archie search:
Telnet Archie command |
It does |
exit, quit, bye |
exits archie. |
help <command-name> |
invokes the on-line help. If a command-name is given, the
help request is restricted to that command. Pressing the RETURN key
exits from the on-line help. |
list <pattern> |
provides a list of the FTP servers in the database and the time at
which they were last updated. The result is a list of site names, with
the site IP address and date of the last update in the database. The
optional parameter limits the list to sites matching pattern; the
command list with no pattern will list all sites in the database
(more than 1000 sites!). E.g. list \.kr$ will list all Korean
anonymous ftp sites:
archie> list \.kr$
# Your queue position: 1
# Estimated time for completion: 13 seconds.
working... =
cbubbs.chungbuk.ac.kr 203.255.72.254 14:47 23 Feb 2001
ftp.kigam.re.kr 134.75.144.10 15:11 16 Jul 2001
ftp.kornet.nm.kr 168.126.63.7 02:27 4 Mar 2001
uniboy.dwt.co.kr 165.133.1.2 15:11 16 Jul 2001
...............
archie> | |
|
site(*) site-name |
lists the directories and subdirectories held in the database from
a particular site-name. The result may be very long. |
whatis string |
searches the database of software package descriptions for
string. The search is case-insensitive.
If you send the command whatis sparse (= sparse matrix solver)
in a Telnet session, then you will get the following results:
archie> whatis sparse
harwell MA28 sparse linear system (argonne)
laso Scott's Lanczos program for eigenvalues of sparse
matrices (argonne)
sparse Kundert + Sangiovanni-Vincentelli, C sparse linear
algebra (argonne)
sparspak George + Liu, sparse linear algebra core (argonne)
y12m Sparse linear system (Aarhus) (argonne)
archie>
| |
|
prog string | pattern
find(+) string | pattern |
searches the database for string or pattern.
Searches may be performed in a number of different ways specified in the
variable search, which also determines whether the parameter
is treated as a string or as a pattern.
The search produces a list of
FTP site addresses which contain filenames matching the pattern or
containing the string, the size of the file, its last modification date
and its directory path. The number of matches is limited by the
maxhits variable.
The list can be sorted in different ways,
depending on the value of the sortby variable. By default,
the variables search, maxhits and sortby
are set to, respectively, exact match search on string, 1000
hits and unsorted resulting list.
A search can be aborted by typing the
keyboard interrupt character (Control-C); the list produced at that point will be
displayed. |
mail <email> <,email2...> |
places the result of the last command in a mail message and
sends it to the specified e-mail address(es). If no mail address is specified
as a parameter, the result is sent to the address specified in the
variable mailto. |
show <variable> |
displays the value of the given variable. If issued with no argument,
it displays all variables. The archie variables are shown below with the
details of the set command. |
set variable value |
changes the value of the specified archie variable.
The variables specify how other archie commands should operate. |
|
The variables and values for the telnet archie set command are:
Archie command Variable/Value |
It means |
compress(+) compress-method |
specifies the compression method (none or
compress) to be used
before mailing a result with the mail command. The default is none. |
encode(+) encode-method |
specifies the encoding method (none or
uuencode) to be used before mailing a result with the mail
command. This variable is ignored if compress is not set. The default is
none. |
mailto email <,email2 ...> |
specifies the e-mail address(es) to be used when the mail
command is issued with no arguments. |
maxhits number |
specifies the maximum number of matches prog will
generate (within the range 0 to 1000). The default value is 1000. |
search search-value |
determines the kind of search performed on the database by the
command prog string | pattern. search-values are:
- sub: a partial and case-insensitive search is performed
with string on the database, e.g.
"is" will match "islington" and "this" and "poison"
- subcase: as above, but the search is case sensitive, e.g.
"TeX" will match "LaTeX" but not "Latex"
- exact: the parameter of prog (string) must EXACTLY match
the string in the database (including case). The fastest search method
of all, and the default.
- regex: pattern is used as a Unix regular expression to
match filenames during the database search. |
sortby sort-value |
describes how to sort the result of prog. sort-values are:
- hostname: on the FTP site address in lexical order.
- time: by the modification date, most recent first.
- size: by the size of the files or directories in the list,
largest first.
- filename: on file or directory name in lexical order.
- none: unsorted (default).
Reverse sorts can be carried out by prepending r to the sortby value
given (e.g. rhostname instead of hostname). |
set term terminal-type <number-of-rows <number-of-columns>> |
tells the archie server what type of terminal you are using,
and optionally its size in rows and columns, e.g.
set term xterm 24 100 |
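To get a feel for how the search, sortby and maxhits variables interact, here is a small Python imitation of prog. It is a sketch under several assumptions of mine, not how an archie server is actually written: Python's re module stands in for Unix regular expressions, the hit list is cut to maxhits before sorting, and dates are stored as ISO strings. The three sample hits reuse hosts, sizes and dates from the partial listing shown earlier; Hit, matches and prog are hypothetical names.

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    host: str
    name: str
    size: int
    mtime: str  # "YYYY-MM-DD", so text order equals date order


def matches(name, term, search="exact"):
    """Apply one of the four search-values to a single file name."""
    if search == "sub":
        return term.lower() in name.lower()
    if search == "subcase":
        return term in name
    if search == "exact":
        return name == term
    if search == "regex":
        return re.search(term, name) is not None
    raise ValueError(f"unknown search value: {search!r}")


# sort-value -> (key function, True if the normal order is descending)
SORT_KEYS = {
    "hostname": (lambda h: h.host, False),
    "filename": (lambda h: h.name, False),
    "time": (lambda h: h.mtime, True),  # most recent first
    "size": (lambda h: h.size, True),   # largest first
}


def prog(hits, term, search="exact", sortby="none", maxhits=1000):
    found = [h for h in hits if matches(h.name, term, search)][:maxhits]
    if sortby == "none":
        return found
    # A leading "r" (e.g. rhostname) asks for the reverse of the normal order.
    reverse_requested = sortby not in SORT_KEYS and sortby.startswith("r")
    key_name = sortby[1:] if reverse_requested else sortby
    if key_name not in SORT_KEYS:
        raise ValueError(f"unknown sortby value: {sortby!r}")
    key, descending = SORT_KEYS[key_name]
    return sorted(found, key=key, reverse=(descending != reverse_requested))


HITS = [
    Hit("ftp.csua.berkeley.edu", "archie", 512, "1996-10-26"),
    Hit("ftp.nau.edu", "archie", 7532, "1994-08-19"),
    Hit("ftp.iij.ad.jp", "archie", 1024, "1996-04-25"),
]

for hit in prog(HITS, "archie", sortby="size"):
    print(hit.size, hit.host)
```

Keeping the normal direction next to each sort key makes the "r" prefix a simple flip, which is how I read the description of reverse sorts above.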
|
|
|
Accessing Archie by e-mail
|
|
Users whose Internet access is limited to e-mail
can still use the archie servers via archie e-mail.
To conduct an Archie search via e-mail, send an e-mail letter
to the Archie server closest to you.
(The domain addresses of the servers are listed below.)
A typical e-mail archie search looks like this:
find ****
set mailto your_e-mail_address
quit
replacing "****" with what you want the
server to search for. Search results will be automatically sent back to
you via e-mail.
The e-mail interface to an archie server recognizes a subset of
the commands described below. An empty message, or
a message containing no valid requests, is treated as a
help request.
Archie commands are sent in the body part of the mail message, but the
Subject: line is also processed as if it were part of the main
body, so be careful! Command lines begin in the first column; all lines that do not
match a valid command are ignored.
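If you send such requests often, you can let a short script write the message for you. The sketch below only builds the message; it does not send anything. It follows the rules just described: commands go in the body starting in the first column, and the Subject: line is left out entirely so that the server has nothing extra to interpret. The addresses are placeholders, and build_request is a hypothetical helper name. The quoting of terms that contain spaces follows the rule given for prog and find in the e-mail command table.

```python
from email.message import EmailMessage


def build_request(server_address, sender, search_terms):
    """Build (but do not send) an e-mail archie request."""
    lines = []
    for term in search_terms:
        # A pattern containing spaces must be quoted.
        if " " in term:
            term = f'"{term}"'
        lines.append(f"find {term}")
    lines.append(f"set mailto {sender}")
    lines.append("quit")  # nothing after this is interpreted (e.g. a signature)

    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = server_address
    # No Subject header on purpose: it would be processed like a body line.
    msg.set_content("\n".join(lines) + "\n")
    return msg


# Placeholder addresses, for illustration only.
request = build_request("archie@archie.example.org", "me@mysite.example.org", ["linear"])
print(request.get_content())
```

The command order (find, then set mailto, then quit) simply copies the typical request shown above.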
|
Frequently Used e-mail Archie Commands
|
|
The following archie commands are available for e-mail archie search:
E-mail Archie command |
It does |
help |
sends you the help file. The help command is exclusive,
so other commands in the same message are ignored. |
path return-address
set mailto(+) return-address |
specifies a return e-mail address different from that which is
extracted from the message header. If you do not receive a
reply from the archie server within several hours, you might
need to add a path command to your message request. |
list pattern <pattern2 ...> |
requests a list of the sites in the database that match
pattern, with the time at which they were last updated. The
result is a list with site names, site IP addresses and date of each site's last
update in the database. |
site(*) site-name |
lists the directories and subdirectories of site-name in
the database. |
whatis string <string2 ...> |
searches the descriptions of software packages for each
string. The search is case insensitive. |
prog pattern <pattern2 ...>
find(+) pattern <pattern2> |
uses pattern as a Unix regular expression to be matched
when searching the database. If multiple patterns are placed
on one line, the results will be mailed back in one message. If several
lines are sent, each containing a prog command, then multiple messages
will be returned, one for each prog line. Results are sorted
by FTP site address in lexical order. If pattern contains
spaces, it must be quoted with single (') or double (") quotes. The
search is case insensitive. |
compress(*) |
causes the result of the current request to be compressed
and uuencoded. When you receive the reply, you should run it
through uudecode, to produce a .Z file. You can then run
uncompress on the .Z file and get the result of your request. |
set compress(+) compress-method |
specifies the compression method (none or
compress) to be used before mailing the result of the current
request. The default is none. |
set encode(+) encode-method |
specifies the encoding method (none or
uuencode) to be used before mailing the result of the current
request. This variable is ignored if compress is not set. The default is
none.
Note: set compress compress and set encode uuencode
would produce the same result as the former compress command.
|
quit |
nothing past this point is interpreted. Useful if a
signature is automatically appended to the end of your mail messages. |
|
|
Maximizing Archie search using 'Patterns'
|
|
A pattern is a specification of a character string, and may
include characters which take a special meaning. Putting "\" before
such a character removes its special meaning, so that the character
is matched as is. The special characters are:
Pattern Character |
It means |
. (Period) |
this is the wildcard character that replaces any
single character, e.g. "...." will match any 4-character string. |
^ (caret) |
if "^" appears at the beginning of the pattern, then
only strings which start with the substring following the "^" will
match the pattern. If the substring occurs anywhere else in the string
it does not match the pattern, e.g.
"^efghi" will match "efghi" or "efghijkl" but not "abcefghi"
|
$ (dollar) |
if "$" appears at the end of the pattern, then the
searched string must end with the substring preceding the "$". If the
substring occurs anywhere else in the searched string, it is not
considered to match, e.g.
"efghi$" will match "efghi" or "abcdefghi" but not "efghijkl"
|
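You can try these three special characters without an archie server at hand. The sketch below uses Python's re module as a stand-in; for ".", "^", "$" and the "\" escape it behaves the way the table describes, although I make no promise that the rest of Python's pattern syntax matches what an archie server accepts. The helper name pattern_matches is my own.

```python
import re


def pattern_matches(pattern, name):
    """True if the pattern matches somewhere in name (unanchored unless ^ or $ is used)."""
    return re.search(pattern, name) is not None


# "." stands for any single character.
print(pattern_matches("....", "abcd"))

# "^" anchors the match to the start of the name.
print(pattern_matches("^efghi", "efghijkl"), pattern_matches("^efghi", "abcefghi"))

# "$" anchors the match to the end of the name.
print(pattern_matches("efghi$", "abcdefghi"), pattern_matches("efghi$", "efghijkl"))

# "\." is a literal period, as in the "list \.kr$" example for Korean sites.
print(pattern_matches(r"\.kr$", "ftp.kornet.nm.kr"), pattern_matches(r"\.kr$", "ftp-kr"))
```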