1 \input texinfo
2 @setfilename ../../info/url
3 @settitle URL Programmer's Manual
4
5 @iftex
6 @c @finalout
7 @end iftex
8 @c @setchapternewpage odd
9 @c @smallbook
10
11 @tex
12 \overfullrule=0pt
13 %\global\baselineskip 30pt % for printing in double space
14 @end tex
15 @dircategory Emacs lisp libraries
16 @direntry
17 * URL: (url). URL loading package.
18 @end direntry
19
20 @copying
21 This file documents the Emacs Lisp URL loading package.
22
23 Copyright @copyright{} 1993-1999, 2002, 2004-2012 Free Software Foundation, Inc.
24
25 @quotation
26 Permission is granted to copy, distribute and/or modify this document
27 under the terms of the GNU Free Documentation License, Version 1.3 or
28 any later version published by the Free Software Foundation; with no
29 Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
30 and with the Back-Cover Texts as in (a) below. A copy of the license
31 is included in the section entitled ``GNU Free Documentation License''.
32
33 (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
34 modify this GNU manual. Buying copies from the FSF supports it in
35 developing GNU and promoting software freedom.''
36 @end quotation
37 @end copying
38
39 @c
40 @titlepage
41 @title URL Programmer's Manual
42 @subtitle First Edition, URL Version 2.0
43 @author William M. Perry @email{wmperry@@gnu.org}
44 @author David Love @email{fx@@gnu.org}
45 @page
46 @vskip 0pt plus 1filll
47 @insertcopying
48 @end titlepage
49
50 @contents
51
52 @node Top
53 @top URL
54
55 @ifnottex
56 @insertcopying
57 @end ifnottex
58
59 @menu
60 * Getting Started:: Preparing your program to use URLs.
61 * Retrieving URLs:: How to use this package to retrieve a URL.
62 * Supported URL Types:: Descriptions of URL types currently supported.
63 * Defining New URLs:: How to define a URL loader for a new protocol.
64 * General Facilities:: URLs can be cached, accessed via a gateway
65 and tracked in a history list.
66 * Customization:: Variables you can alter.
67 * GNU Free Documentation License:: The license for this documentation.
68 * Function Index::
69 * Variable Index::
70 * Concept Index::
71 @end menu
72
73 @node Getting Started
74 @chapter Getting Started
75 @cindex URLs, definition
76 @cindex URIs
77
78 @dfn{Uniform Resource Locators} (URLs) are a specific form of
79 @dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396 which
80 updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource
81 agents.
82
83 URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
84 @var{scheme}s supported by this library are described below.
85 @xref{Supported URL Types}.
86
87 FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
88 IRC and gopher URLs all have the form
89
90 @example
91 @var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
92 @end example
93 @noindent
94 where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
95 @var{userinfo} sometimes takes the form @var{username}:@var{password}
96 but you should beware of the security risks of sending cleartext
97 passwords. @var{hostname} may be a domain name or a dotted decimal
98 address. If the @samp{:@var{port}} is omitted then the library will
99 use the ``well known'' port for that service when accessing URLs. With
100 the possible exception of @code{telnet}, it is rare for ports to be
101 specified, and using a non-standard port may have
102 undesired consequences if a different service is listening on that
103 port (e.g., an HTTP URL specifying the SMTP port can cause mail to be
104 sent). @c , but @xref{Other Variables, url-bad-port-list}.
105 The meaning of the @var{path} component depends on the service.
106
107 @menu
108 * Configuration::
109 * Parsed URLs:: URLs are parsed into vector structures.
110 @end menu
111
112 @node Configuration
113 @section Configuration
114
115 @defvar url-configuration-directory
116 @cindex @file{~/.url}
117 @cindex configuration files
118 The directory in which URL configuration files, the cache etc.,
119 reside. The old default was @file{~/.url}, and this directory
120 is still used if it exists. The new default is a @file{url/}
121 directory in @code{user-emacs-directory}, which is normally
122 @file{~/.emacs.d}.
123 @end defvar
124
125 @node Parsed URLs
126 @section Parsed URLs
127 @cindex parsed URLs
128 The library functions typically operate on @dfn{parsed} versions of
129 URLs. These are actually CL structures (vectors) of the form:
130
131 @example
132 [cl-struct-url @var{type} @var{user} @var{password} @var{host} @var{port} @var{filename} @var{target} @var{attributes} @var{fullness} @var{use-cookies}]
133 @end example
134
135 @noindent where
136 @table @var
137 @item type
138 is the type of the URL scheme, e.g., @code{http};
139 @item user
140 is the username associated with it, or @code{nil};
141 @item password
142 is the user password associated with it, or @code{nil};
143 @item host
144 is the host name associated with it, or @code{nil};
145 @item port
146 is the port number associated with it, or @code{nil};
147 @item filename
148 is the ``file'' part of it, or @code{nil}. This doesn't necessarily
149 actually refer to a file;
150 @item target
151 is the target part, or @code{nil};
152 @item attributes
153 is the attributes associated with it, or @code{nil};
154 @item fullness
155 is @code{t} for a fully-specified URL, with a host part indicated by
156 @samp{//} after the scheme part.
157 @item use-cookies
158 is @code{nil} to neither send cookies to the server nor store them, @code{t}
159 otherwise.
160 @end table
161
162 @findex url-type
163 @findex url-user
164 @findex url-password
165 @findex url-host
166 @findex url-port
167 @findex url-filename
168 @findex url-target
169 @findex url-attributes
170 @findex url-fullness
171 These attributes have accessors named @code{url-@var{part}}, where
172 @var{part} is the name of one of the elements above, e.g.,
173 @code{url-host}. These attributes can be set with the same accessors
174 using @code{setf}:
175
176 @example
177 (setf (url-port url) 80)
178 @end example
179
180 If @var{port} is @code{nil}, @code{url-port} returns the default port
181 of the protocol.
182
183 There are functions for parsing and unparsing between the string and
184 vector forms.
185
186 @defun url-generic-parse-url url
187 Return a parsed version of the string @var{url}.
188 @end defun
189
190 @defun url-recreate-url url
191 @cindex unparsing URLs
192 Recreates a URL string from the parsed @var{url}.
193 @end defun
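
The accessors and the parsing functions can be combined as in the
following sketch. The URL string is a placeholder chosen for
illustration. The printed form of the parsed structure may vary
between Emacs versions, so the sketch relies only on the accessors.

@example
(require 'url-parse)

(let ((url (url-generic-parse-url "http://www.example.com/foo/bar#sec")))
  ;; Read the individual parts through the accessors.
  (message "type=%s host=%s port=%s file=%s target=%s"
           (url-type url) (url-host url) (url-port url)
           (url-filename url) (url-target url))
  ;; Change the port, then turn the structure back into a string.
  (setf (url-port url) 8080)
  (url-recreate-url url))
@end example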
194
195 @node Retrieving URLs
196 @chapter Retrieving URLs
197
198 @defun url-retrieve-synchronously url
199 Retrieve @var{url} synchronously and return a buffer containing the
200 data. @var{url} is either a string or a parsed URL structure. Return
201 @code{nil} if there are no data associated with it (the case for dired,
202 info, or mailto URLs that need no further processing).
203 @end defun
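
As a minimal sketch, the returned buffer can be reduced to the body
text as follows. The helper name @code{my-url-fetch-body} is
hypothetical, and the sketch assumes that any headers are separated
from the body by the first blank line, as is the case for HTTP.

@example
(require 'url)

(defun my-url-fetch-body (url)
  "Return the body of URL as a string, or nil if no buffer is returned.
`my-url-fetch-body' is a hypothetical helper, not part of the library."
  (let ((buffer (url-retrieve-synchronously url)))
    (when buffer
      (unwind-protect
          (with-current-buffer buffer
            (goto-char (point-min))
            ;; Assumption: the headers end at the first blank line.
            (if (re-search-forward "^\r?\n" nil t)
                (buffer-substring-no-properties (point) (point-max))
              (buffer-string)))
        ;; The caller owns the buffer, so discard it when done.
        (kill-buffer buffer)))))
@end example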
204
205 @defun url-retrieve url callback &optional cbargs silent no-cookies
206 Retrieve @var{url} asynchronously and call @var{callback} with args
207 @var{cbargs} when finished. The callback is called when the object
208 has been completely retrieved, with the current buffer containing the
209 object and any MIME headers associated with it. @var{url} is either a
210 string or a parsed URL structure. Returns the buffer @var{url} will
211 load into, or @code{nil} if the process has already completed.
212 If the optional argument @var{silent} is non-@code{nil}, suppress
213 progress messages. If the optional argument @var{no-cookies} is
214 non-@code{nil}, do not store or send cookies.
215 @end defun
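
The following sketch shows one way to write a callback. It assumes
the calling convention described in the docstring of
@code{url-retrieve}, where the callback receives a status list as
its first argument, followed by the elements of @var{cbargs}. The
names @code{my-url-report} and @code{my-url-start} and the URL are
hypothetical.

@example
(require 'url)

(defun my-url-report (status label)
  "Callback for `url-retrieve'; STATUS comes first, then LABEL.
`my-url-report' is a hypothetical name used for illustration."
  (let ((err (plist-get status :error)))
    (if err
        (message "%s: failed: %S" label err)
      (message "%s: retrieved %d characters" label (buffer-size))))
  ;; The response buffer is current here; discard it when done.
  (kill-buffer (current-buffer)))

(defun my-url-start ()
  "Start an asynchronous request and return immediately."
  (url-retrieve "http://www.example.com/" #'my-url-report '("example")))
@end example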
216
217 @vindex url-queue-parallel-processes
218 @vindex url-queue-timeout
219 @defun url-queue-retrieve url callback &optional cbargs silent no-cookies
220 This acts like the @code{url-retrieve} function, but with limits on
221 the degree of parallelism. The option @code{url-queue-parallel-processes}
222 controls the number of concurrent processes, and the option
223 @code{url-queue-timeout} sets a timeout in seconds.
224 @end defun
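
A sketch of queueing several retrievals is shown below. The option
values and the helper name @code{my-url-queue-all} are assumptions
made for illustration. The options are set globally rather than
bound with @code{let}, because the queue is processed after the
calling function has returned.

@example
(require 'url-queue)

;; Assumed values for illustration: at most two concurrent
;; connections, each abandoned after ten seconds.
(setq url-queue-parallel-processes 2
      url-queue-timeout 10)

(defun my-url-queue-all (urls)
  "Queue every URL in URLS; `my-url-queue-all' is a hypothetical helper."
  (dolist (url urls)
    (url-queue-retrieve
     url
     (lambda (status fetched)
       (message "%s: %s" fetched
                (if (plist-get status :error) "failed" "done"))
       (kill-buffer (current-buffer)))
     (list url))))
@end example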
225
226 @node Supported URL Types
227 @chapter Supported URL Types
228
229 @menu
230 * http/https:: Hypertext Transfer Protocol.
231 * file/ftp:: Local files and FTP archives.
232 * info:: Emacs "Info" pages.
233 * mailto:: Sending email.
234 * news/nntp/snews:: Usenet news.
235 * rlogin/telnet/tn3270:: Remote host connectivity.
236 * irc:: Internet Relay Chat.
237 * data:: Embedded data URLs.
238 * nfs:: Networked File System
239 @c * finger::
240 @c * gopher::
241 @c * netrek::
242 @c * prospero::
243 * cid:: Content-ID.
244 * about::
245 * ldap:: Lightweight Directory Access Protocol
246 * imap:: IMAP mailboxes.
247 * man:: Unix man pages.
248 @end menu
249
250 @node http/https
251 @section @code{http} and @code{https}
252
253 The scheme @code{http} is Hypertext Transfer Protocol. The library
254 supports version 1.1, specified in RFC 2616. (This supersedes 1.0,
255 defined in RFC 1945.) HTTP URLs have the following form, where most of
256 the parts are optional:
257 @example
258 http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
259 @end example
260 @c The @code{:@var{port}} part is optional, and @var{port} defaults to
261 @c 80. The @code{/@var{path}} part, if present, is a slash-separated
262 @c series elements. The @code{?@var{searchpart}}, if present, is the
263 @c query for a search or the content of a form submission. The
264 @c @code{#fragment} part, if present, is a location in the document.
265
266 The scheme @code{https} is a secure version of @code{http}, with
267 transmission via SSL. It is defined in RFC 2818. Its default port is
268 443. This scheme depends on SSL support in Emacs via the
269 @file{ssl.el} library and is actually implemented by forcing the
270 @code{ssl} gateway method to be used. @xref{Gateways in general}.
271
272 @defopt url-honor-refresh-requests
273 This controls honoring of HTTP @samp{Refresh} headers by which
274 servers can direct clients to reload documents from the same URL or a
275 different one. @code{nil} means they will not be honored,
276 @code{t} (the default) means they will always be honored, and
277 otherwise the user will be asked on each request.
278 @end defopt
279
280
281 @menu
282 * Cookies::
283 * HTTP language/coding::
284 * HTTP URL Options::
285 * Dealing with HTTP documents::
286 @end menu
287
288 @node Cookies
289 @subsection Cookies
290
291 @defopt url-cookie-file
292 The file in which cookies are stored, defaulting to @file{cookies} in
293 the directory specified by @code{url-configuration-directory}.
294 @end defopt
295
296 @defopt url-cookie-confirmation
297 Specifies whether confirmation is required to accept cookies.
298 @end defopt
299
300 @defopt url-cookie-multiple-line
301 Specifies whether to put all cookies for the server on one line in the
302 HTTP request to satisfy broken servers like
303 @url{http://www.hotmail.com}.
304 @end defopt
305
306 @defopt url-cookie-trusted-urls
307 A list of regular expressions matching URLs from which to accept
308 cookies always.
309 @end defopt
310
311 @defopt url-cookie-untrusted-urls
312 A list of regular expressions matching URLs from which to reject
313 cookies always.
314 @end defopt
315
316 @defopt url-cookie-save-interval
317 The number of seconds between automatic saves of cookies to disk.
318 Default is one hour.
319 @end defopt
320
321
322 @node HTTP language/coding
323 @subsection Language and Encoding Preferences
324
325 HTTP allows clients to express preferences for the language and
326 encoding of documents which servers may honor. For each of these
327 variables, the value is a string; it can specify a single choice, or
328 it can be a comma-separated list.
329
330 Normally, this list is ordered by descending preference. However, each
331 element can be followed by @samp{;q=@var{priority}} to specify its
332 preference level, a decimal number from 0 to 1; e.g., for
333 @code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
334 en;q=0.7"}}. An element that has no @samp{;q} specification has
335 preference level 1.
336
337 @defopt url-mime-charset-string
338 @cindex character sets
339 @cindex coding systems
340 This variable specifies a preference for character sets when documents
341 can be served in more than one encoding.
342
343 HTTP allows specifying a series of MIME charsets which indicate your
344 preferred character set encodings, e.g., Latin-9 or Big5, and these
345 can be weighted. The default series is generated automatically from
346 the associated MIME types of all defined coding systems, sorted by the
347 coding system priority specified in Emacs. @xref{Recognize Coding, ,
348 Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
349 @end defopt
350
351 @defopt url-mime-language-string
352 @cindex language preferences
353 A string specifying the preferred language when servers can serve
354 files in several languages. Use RFC 1766 abbreviations, e.g.,
355 @samp{en} for English, @samp{de} for German.
356
357 The string can be @code{"*"} to get the first available language (as
358 opposed to the default).
359 @end defopt
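
For instance, both preferences might be set as follows. The
language string is the one used as an example above; the charset
string is an assumed value in the same format.

@example
(require 'url-vars)

;; Prefer German, then British English, then any English.
(setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7")

;; Assumed example: prefer UTF-8, accept Latin-1 as a fallback.
(setq url-mime-charset-string "utf-8, iso-8859-1;q=0.5")
@end example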
360
361 @node HTTP URL Options
362 @subsection HTTP URL Options
363
364 HTTP supports an @samp{OPTIONS} method describing things supported by
365 the URL@.
366
367 @defun url-http-options url
368 Returns a property list describing options available for URL. The
369 property list members are:
370
371 @table @code
372 @item methods
373 A list of symbols specifying what HTTP methods the resource
374 supports.
375
376 @item dav
377 @cindex DAV
378 A list of numbers specifying what DAV protocol/schema versions are
379 supported.
380
381 @item dasl
382 @cindex DASL
383 A list of supported DASL search types (in string form).
384
385 @item ranges
386 A list of the units available for use in partial document fetches.
387
388 @item p3p
389 @cindex P3P
390 The @dfn{Platform for Privacy Preferences} description for the resource.
391 Currently this is just the raw header contents.
392 @end table
393
394 @end defun
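
The returned property list can be examined with @code{plist-get},
as in this sketch. The helper name @code{my-url-allows-method-p} is
hypothetical. The sketch assumes that the method symbols are
spelled as the server reports them, which would normally be in
upper case, for example @code{PUT}.

@example
(require 'url-http)

(defun my-url-allows-method-p (url method)
  "Return non-nil if the options for URL list the symbol METHOD.
`my-url-allows-method-p' is a hypothetical helper."
  (let ((options (url-http-options url)))
    (memq method (plist-get options 'methods))))
@end example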
395
396 @node Dealing with HTTP documents
397 @subsection Dealing with HTTP documents
398
399 HTTP URLs are retrieved into a buffer containing the HTTP headers
400 followed by the body. Since the headers are quasi-MIME, they may be
401 processed using the MIME library. @xref{Top,, Emacs MIME,
402 emacs-mime, The Emacs MIME Manual}.
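
As a lighter-weight sketch, a single header field can be read with
@code{mail-fetch-field} from @file{mail-utils.el} after narrowing to
the header part. The helper name @code{my-url-response-header} is
hypothetical, and the sketch assumes that the headers end at the
first blank line of the buffer.

@example
(require 'mail-utils)

(defun my-url-response-header (buffer name)
  "Return the value of header field NAME in response BUFFER, or nil.
`my-url-response-header' is a hypothetical helper."
  (with-current-buffer buffer
    (save-excursion
      (save-restriction
        (goto-char (point-min))
        ;; Assumption: the first blank line ends the headers.
        (when (re-search-forward "^\r?$" nil t)
          (narrow-to-region (point-min) (point))
          (mail-fetch-field name))))))
@end example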
403
404 @node file/ftp
405 @section file and ftp
406 @cindex files
407 @cindex FTP
408 @cindex File Transfer Protocol
409 @cindex compressed files
410 @cindex dired
411
412 @example
413 ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
414 file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
415 @end example
416
417 These schemes are defined in RFC 1738.
418 @samp{ftp:} and @samp{file:} are synonymous in this library. They
419 allow reading arbitrary files from hosts. Either @samp{ange-ftp}
420 (Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
421 hosts. Local files are accessed directly.
422
423 Compressed files are handled, but support is hard-coded so that
424 @code{jka-compr-compression-info-list} and so on have no effect.
425 Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
426 @samp{.bz2}.
427
428 @defopt url-directory-index-file
429 The filename to look for when indexing a directory, default
430 @samp{"index.html"}. If this file exists, and is readable, then it
431 will be viewed instead of using @code{dired} to view the directory.
432 @end defopt
433
434 @node info
435 @section info
436 @cindex Info
437 @cindex Texinfo
438 @findex Info-goto-node
439
440 @example
441 info:@var{file}#@var{node}
442 @end example
443
444 Info URLs are not officially defined. They invoke
445 @code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
446 @samp{#@var{node}} is optional, defaulting to @samp{Top}.
447
448 @node mailto
449 @section mailto
450
451 @cindex mailto
452 @cindex email
453 A mailto URL will send an email message to the address in the
454 URL; for example, @samp{mailto:foo@@bar.com} would compose a
455 message to @samp{foo@@bar.com}.
456
457 @defopt url-mail-command
458 @vindex mail-user-agent
459 The function called whenever url needs to send mail. This should
460 normally be left to default from @code{mail-user-agent}. @xref{Mail
461 Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
462 @end defopt
463
464 An @samp{X-Url-From} header field containing the URL of the document
465 that contained the mailto URL is added if that URL is known.
466
467 RFC 2368 extends the definition of mailto URLs in RFC 1738.
468 The form of a mailto URL is
469 @example
470 @samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
471 @end example
472 @noindent where an arbitrary number of @var{header}s can be added. If the
473 @var{header} is @samp{body}, then @var{contents} is put in the body;
474 otherwise a @var{header} header field is created with @var{contents}
475 as its contents. Note that the URL library does not consider any
476 headers ``dangerous'' so you should check them before sending the
477 message.
478
479 @c Fixme: update
480 Email messages are defined in @sc{rfc}822.
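
The form above can be assembled in Lisp as in the following sketch,
which percent-encodes each @var{contents} value with
@code{url-hexify-string}. The helper name @code{my-mailto-url} is
hypothetical.

@example
(require 'url-util)

(defun my-mailto-url (mailbox headers)
  "Build a mailto URL for MAILBOX from HEADERS, an alist of strings.
`my-mailto-url' is a hypothetical helper, not part of the library."
  (concat "mailto:" mailbox
          (when headers
            (concat "?"
                    (mapconcat
                     (lambda (pair)
                       ;; Encode the contents so spaces and the
                       ;; like are safe inside the URL.
                       (concat (car pair) "="
                               (url-hexify-string (cdr pair))))
                     headers "&")))))
@end example

@noindent
For example, calling it with the mailbox @samp{foo@@bar.com} and the
alist @code{(("subject" . "Hello world"))} gives
@samp{mailto:foo@@bar.com?subject=Hello%20world}.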
481
482 @node news/nntp/snews
483 @section @code{news}, @code{nntp} and @code{snews}
484 @cindex news
485 @cindex network news
486 @cindex usenet
487 @cindex NNTP
488 @cindex snews
489
490 @c draft-gilman-news-url-01
491 The network news URL schemes take the following forms, following RFC
492 1738, except that for compatibility with other clients, host and port
493 fields may be included in news URLs though they are properly only
494 allowed for nntp and snews.
495
496 @table @samp
497 @item news:@var{newsgroup}
498 Retrieves a list of messages in @var{newsgroup};
499 @item news:@var{message-id}
500 Retrieves the message with the given @var{message-id};
501 @item news:*
502 Retrieves a list of all available newsgroups;
503 @item nntp://@var{host}:@var{port}/@var{newsgroup}
504 @itemx nntp://@var{host}:@var{port}/@var{message-id}
505 @itemx nntp://@var{host}:@var{port}/*
506 Similar to the @samp{news} versions.
507 @end table
508
509 @samp{:@var{port}} is optional and defaults to :119.
510
511 @samp{snews} is the same as @samp{nntp} except that the default port
512 is :563.
513 @cindex SSL
514 (It is tunneled through SSL.)
515
516 An @samp{nntp} URL is the same as a news URL, except that the URL may
517 specify an article by its number.
518
519 @defopt url-news-server
520 This variable can be used to override the default news server.
521 Usually this will be set by the Gnus package, which is used to fetch
522 news.
523 @cindex environment variable
524 @vindex NNTPSERVER
525 It may be set from the conventional environment variable
526 @code{NNTPSERVER}.
527 @end defopt
528
529 @node rlogin/telnet/tn3270
530 @section rlogin, telnet and tn3270
531 @cindex rlogin
532 @cindex telnet
533 @cindex tn3270
534 @cindex terminal emulation
535 @findex terminal-emulator
536
537 These URL schemes from RFC 1738 for logon via a terminal emulator have
538 the form
539 @example
540 telnet://@var{user}:@var{password}@@@var{host}:@var{port}
541 @end example
542 but the @code{:@var{password}} component is ignored.
543
544 To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
545 @code{telnet} or @code{tn3270} (the program names and arguments are
546 hardcoded) session is run in a @code{terminal-emulator} buffer.
547 Well-known ports are used if the URL does not specify a port.
548
549 @node irc
550 @section irc
551 @cindex IRC
552 @cindex Internet Relay Chat
553 @cindex ZEN IRC
554 @cindex ERC
555 @cindex rcirc
556 @c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
557 @dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
558 session to a function named in @code{url-irc-function}.
559
560 @defopt url-irc-function
561 A function to actually open an IRC connection.
562 This function
563 must take five arguments, @var{host}, @var{port}, @var{channel},
564 @var{user} and @var{password}. The @var{channel} argument specifies the
565 channel to join immediately; this can be @code{nil}. By default this is
566 @code{url-irc-rcirc}.
567 @end defopt
568 @defun url-irc-rcirc host port channel user password
569 Processes the arguments and lets @code{rcirc} handle the session.
570 @end defun
571 @defun url-irc-erc host port channel user password
572 Processes the arguments and lets @code{ERC} handle the session.
573 @end defun
574 @defun url-irc-zenirc host port channel user password
575 Processes the arguments and lets @code{zenirc} handle the session.
576 @end defun
577
578 @node data
579 @section data
580 @cindex data URLs
581
582 @example
583 data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
584 @end example
585
586 Data URLs contain MIME data in the URL itself. They are defined in
587 RFC 2397.
588
589 @var{media-type} is a MIME @samp{Content-Type} string, possibly
590 including parameters. It defaults to
591 @samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
592 omitted but the charset parameter supplied. If @samp{;base64} is
593 present, the @var{data} are base64-encoded.
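
A data URL of the base64 form can be built as in this sketch. The
helper name @code{my-data-url} is hypothetical, and @var{data} is
assumed to be a unibyte (or plain ASCII) string.

@example
(defun my-data-url (media-type data)
  "Return a base64 data URL carrying the unibyte string DATA.
`my-data-url' is a hypothetical helper, not part of the library."
  (concat "data:" media-type ";base64,"
          ;; The second argument suppresses line breaks in the output.
          (base64-encode-string data t)))
@end example

@noindent
For example, @code{(my-data-url "text/plain" "Hello")} returns
@samp{data:text/plain;base64,SGVsbG8=}.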
594
595 @node nfs
596 @section nfs
597 @cindex NFS
598 @cindex Network File System
599 @cindex automounter
600
601 @example
602 nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
603 @end example
604
605 The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
606 @samp{ftp:} except that it points to a file on a remote host that is
607 handled by the automounter on the local host.
608
609 @defvar url-nfs-automounter-directory-spec
610 A string saying how to invoke the NFS automounter. Certain @samp{%}
611 sequences are recognized:
612
613 @table @samp
614 @item %h
615 The hostname of the NFS server;
616 @item %n
617 The port number of the NFS server;
618 @item %u
619 The username to use to authenticate;
620 @item %p
621 The password to use to authenticate;
622 @item %f
623 The filename on the remote server;
624 @item %%
625 A literal @samp{%}.
626 @end table
627
628 Each can be used any number of times.
629 @end defvar
630
631 @node cid
632 @section cid
633 @cindex Content-ID
634
635 The @samp{cid:} scheme, for Content-ID URLs, is defined in RFC 2111.
636
637 @node about
638 @section about
639
640 @node ldap
641 @section ldap
642 @cindex LDAP
643 @cindex Lightweight Directory Access Protocol
644
645 The LDAP scheme is defined in RFC 2255.
646
647 @node imap
648 @section imap
649 @cindex IMAP
650
651 The @samp{imap:} scheme is defined in RFC 2192.
652
653 @node man
654 @section man
655 @cindex @command{man}
656 @cindex Unix man pages
657 @findex man
658
659 @example
660 @samp{man:@var{page-spec}}
661 @end example
662
663 This is a non-standard scheme. @var{page-spec} is passed directly to
664 the Lisp @code{man} function.
665
666 @node Defining New URLs
667 @chapter Defining New URLs
668
669 @menu
670 * Naming conventions::
671 * Required functions::
672 * Optional functions::
673 * Asynchronous fetching::
674 * Supporting file-name-handlers::
675 @end menu
676
677 @node Naming conventions
678 @section Naming conventions
679
680 @node Required functions
681 @section Required functions
682
683 @node Optional functions
684 @section Optional functions
685
686 @node Asynchronous fetching
687 @section Asynchronous fetching
688
689 @node Supporting file-name-handlers
690 @section Supporting file-name-handlers
691
692 @node General Facilities
693 @chapter General Facilities
694
695 @menu
696 * Disk Caching::
697 * Proxies::
698 * Gateways in general::
699 * History::
700 @end menu
701
702 @node Disk Caching
703 @section Disk Caching
704 @cindex Caching
705 @cindex Persistent Cache
706 @cindex Disk Cache
707
708 The disk cache stores retrieved documents locally, whence they can be
709 retrieved more quickly. When requesting a URL that is in the cache,
710 the library checks to see if the page has changed since it was last
711 retrieved from the remote machine. If not, the local copy is used,
712 saving the transmission over the network.
713 @cindex Cleaning the cache
714 @cindex Clearing the cache
715 @cindex Cache cleaning
716 Currently the cache isn't cleared automatically.
717 @c Running the @code{clean-cache} shell script
718 @c fist is recommended, to allow for future cleaning of the cache. This
719 @c shell script will remove all files that have not been accessed since it
720 @c was last run. To keep the cache pared down, it is recommended that this
721 @c script be run from @i{at} or @i{cron} (see the manual pages for
722 @c crontab(5) or at(1) for more information)
723
724 @defopt url-automatic-caching
725 Setting this variable non-@code{nil} causes documents to be cached
726 automatically.
727 @end defopt
728
729 @defopt url-cache-directory
730 This variable specifies the
731 directory in which to store the cache files. It defaults to the subdirectory
732 @file{cache} of @code{url-configuration-directory}.
733 @end defopt
734
735 @defopt url-cache-creation-function
736 The cache relies on a scheme for mapping URLs to files in the cache.
737 This variable names a function which sets the type of cache to use.
738 It takes a URL as argument and returns the absolute file name of the
739 corresponding cache file. The two supplied possibilities are
740 @code{url-cache-create-filename-using-md5} and
741 @code{url-cache-create-filename-human-readable}.
742 @end defopt
743
744 @defun url-cache-create-filename-using-md5 url
745 Creates a cache file name from @var{url} using MD5 hashing.
746 This creates entries with very few cache collisions and is fast.
747 @cindex MD5
748 @smallexample
749 (url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
750 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
751 @end smallexample
752 @end defun
753
754 @defun url-cache-create-filename-human-readable url
755 Creates a cache file name from @var{url} more obviously connected to
756 @var{url} than for @code{url-cache-create-filename-using-md5}, but
757 more likely to conflict with other files.
758 @smallexample
759 (url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
760 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
761 @end smallexample
762 @end defun
763
764 @defun url-cache-expired
765 This function returns non-@code{nil} if a cache entry has expired (or is absent).
766 The arguments are a URL and optional expiration delay in seconds
767 (default @code{url-cache-expire-time}).
768 @end defun
769
770 @defopt url-cache-expire-time
771 This variable is the default number of seconds to use for the
772 expire-time argument of the function @code{url-cache-expired}.
773 @end defopt
774
775 @defun url-fetch-from-cache
776 This function takes a URL as its argument and returns a buffer
777 containing the data cached for that URL.
778 @end defun
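
The two functions above can be combined as in this sketch, which
uses the cached copy only while it is still fresh. The helper name
@code{my-url-cached-buffer} is hypothetical, and the sketch assumes
that @code{url-cache-expired} accepts a URL string with the
expiration delay omitted, as described above.

@example
(require 'url-cache)

(defun my-url-cached-buffer (url)
  "Return a buffer holding the cached copy of URL, or nil.
Nil means the entry is absent or older than `url-cache-expire-time'.
`my-url-cached-buffer' is a hypothetical helper."
  (unless (url-cache-expired url)
    (url-fetch-from-cache url)))
@end example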
779
780 @c Fixme: never actually used currently?
781 @c @defopt url-standalone-mode
782 @c @cindex Relying on cache
783 @c @cindex Cache only mode
784 @c @cindex Standalone mode
785 @c If this variable is non-@code{nil}, the library relies solely on the
786 @c cache for fetching documents and avoids checking if they have changed
787 @c on remote servers.
788 @c @end defopt
789
790 @c With a large cache of documents on the local disk, it can be very handy
791 @c when traveling, or any other time the network connection is not active
792 @c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
793 @c solely on its cache, and avoid checking to see if the page has changed
794 @c on the remote server. In the case of a dial-on-demand PPP connection,
795 @c this will keep the phone line free as long as possible, only bringing up
796 @c the PPP connection when asking for a page that is not located in the
797 @c cache. This is very useful for demonstrations as well.
798
799 @node Proxies
800 @section Proxies and Gatewaying
801
802 @c fixme: check/document url-ns stuff
803 @cindex proxy servers
804 @cindex proxies
805 @cindex environment variables
806 @vindex HTTP_PROXY
807 Proxy servers are commonly used to provide gateways through firewalls
808 or as caches serving some more-or-less local network. Each protocol
809 (HTTP, FTP, etc.)@: can have a different gateway server. Proxying is
810 conventionally configured in the same way across different programs, through
811 environment variables of the form @code{@var{protocol}_proxy}, where
812 @var{protocol} is one of the supported network protocols (@code{http},
813 @code{ftp} etc.). The library recognizes such variables in either
814 upper or lower case. Their values are of one of the forms:
815 @itemize @bullet
816 @item @code{@var{host}:@var{port}}
817 @item A full URL;
818 @item Simply a host name.
819 @end itemize
820
821 @vindex NO_PROXY
822 The @code{NO_PROXY} environment variable specifies URLs that should be
823 excluded from proxying (on servers that should be contacted directly).
824 This should be a comma-separated list of hostnames, domain names, or a
825 mixture of both. Asterisks can be used as wildcards, but other
826 clients may not support that. Domain names may be indicated by a
827 leading dot. For example:
828 @example
829 NO_PROXY="*.aventail.com,home.com,.seanet.com"
830 @end example
831 @noindent says to contact all machines in the @samp{aventail.com} and
832 @samp{seanet.com} domains directly, as well as the machine named
833 @samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
834 and @code{no_proxy} are also tried, in that order.
835
836 Proxies may also be specified directly in Lisp.
837
838 @defopt url-proxy-services
839 This variable is an alist of URL schemes and proxy servers that
840 gateway them. The items are of the form @w{@code{(@var{scheme}
841 . @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is
842 gatewayed through @var{portnumber} on the specified @var{host}. An
843 exception is the pseudo scheme @code{"no_proxy"}, which is paired with
844 a regexp matching host names not to be proxied. This variable is
845 initialized from the environment as above.
846
847 @example
848 (setq url-proxy-services
849 '(("http" . "proxy.aventail.com:80")
850 ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
851 @end example
852 @end defopt
853
854 @node Gateways in general
855 @section Gateways in General
856 @cindex gateways
857 @cindex firewalls
858
859 The library provides a general gateway layer through which all
860 networking passes. It can both control access to the network and
861 provide access through gateways in firewalls. This may make direct
862 connections in some cases and pass through some sort of gateway in
863 others.@footnote{Proxies (which only operate over HTTP) are
864 implemented using this.} The library's basic function responsible for
865 making connections is @code{url-open-stream}.
866
867 @defun url-open-stream name buffer host service
868 @cindex opening a stream
869 @cindex stream, opening
870 Open a stream to @var{host}, possibly via a gateway. The other
871 arguments are as for @code{open-network-stream}. This will not make a
872 connection if @code{url-gateway-unplugged} is non-@code{nil}.
873 @end defun
874
875 @defvar url-gateway-local-host-regexp
876 This is a regular expression that matches local hosts that do not
877 require the use of a gateway. If @code{nil}, all connections are made
878 through the gateway.
879 @end defvar
880
881 @defvar url-gateway-method
882 This variable controls which gateway method is used. It may be useful
883 to bind it temporarily in some applications. It has values taken from
884 a list of symbols. Possible values are:
885
886 @table @code
887 @item telnet
888 @cindex @command{telnet}
889 Use this method if you must first telnet and log into a gateway host,
890 and then run telnet from that host to connect to outside machines.
891
892 @item rlogin
893 @cindex @command{rlogin}
894 This method is identical to @code{telnet}, but uses @command{rlogin}
895 to log into the remote machine without having to send the username and
896 password over the wire every time.
897
898 @item socks
899 @cindex @sc{socks}
900 Use if the firewall has a @sc{socks} gateway running on it. The
901 @sc{socks} v5 protocol is defined in RFC 1928.
902
903 @c @item ssl
904 @c This probably shouldn't be documented
905 @c Fixme: why not? -- fx
906
907 @item native
908 This method uses Emacs's built-in networking directly. This is the
909 default. It can be used only if there is no firewall blocking access.
910 @end table
911 @end defvar
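
For example, an application might bind the method around a single
retrieval. This is only a sketch: the URL is a placeholder, and it
assumes that the connection is opened while the binding is in effect.

@example
(require 'url)

;; Force a direct connection for this one request.
(let ((url-gateway-method 'native))
  ;; Returns a buffer holding the response, or nil on failure.
  (url-retrieve-synchronously "http://www.example.org/"))
@end example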
912
913 The following variables control the gateway methods.
914
915 @defopt url-gateway-telnet-host
916 The gateway host to telnet to. Once logged in there, you then telnet
917 out to the hosts you want to connect to.
918 @end defopt
919 @defopt url-gateway-telnet-parameters
920 This should be a list of parameters to pass to the @command{telnet} program.
921 @end defopt
922 @defopt url-gateway-telnet-password-prompt
923 This is a regular expression that matches the password prompt when
924 logging in.
925 @end defopt
926 @defopt url-gateway-telnet-login-prompt
927 This is a regular expression that matches the username prompt when
928 logging in.
929 @end defopt
930 @defopt url-gateway-telnet-user-name
931 The username to log in with.
932 @end defopt
933 @defopt url-gateway-telnet-password
934 The password to send when logging in.
935 @end defopt
936 @defopt url-gateway-prompt-pattern
937 This is a regular expression that matches the shell prompt.
938 @end defopt
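
As a sketch, a telnet gateway might be configured along the following
lines. All of the values are placeholders, and the prompt regexps in
particular depend on what the gateway host actually prints.

@example
(setq url-gateway-method 'telnet
      url-gateway-telnet-host "gateway.example.com"
      url-gateway-telnet-user-name "jdoe"
      ;; Regexps matching the gateway's login and password prompts.
      url-gateway-telnet-login-prompt "^login: *"
      url-gateway-telnet-password-prompt "^Password: *"
      ;; Regexp matching the shell prompt after logging in.
      url-gateway-prompt-pattern "[$%>] *\\'")
@end example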
939
940 @defopt url-gateway-rlogin-host
941 Host to @samp{rlogin} to before telnetting out.
942 @end defopt
943 @defopt url-gateway-rlogin-parameters
944 Parameters to pass to @samp{rsh}.
945 @end defopt
946 @defopt url-gateway-rlogin-user-name
947 User name to use when logging in to the gateway.
948 @end defopt
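
As a sketch, the @code{rlogin} method might be configured as follows;
the host and user name are placeholders.

@example
(setq url-gateway-method 'rlogin
      url-gateway-rlogin-host "gateway.example.com"
      url-gateway-rlogin-user-name "jdoe")
@end example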
952
953 @defopt socks-server
954 This specifies the default server; it takes the form
955 @w{@code{("Default server" @var{server} @var{port} @var{version})}}
956 where @var{version} can be either 4 or 5.
957 @end defopt
958 @defvar socks-password
959 If this is @code{nil} then you will be asked for the password,
960 otherwise it will be used as the password for authenticating you to
961 the @sc{socks} server.
962 @end defvar
963 @defvar socks-username
964 This is the username to use when authenticating yourself to the
965 @sc{socks} server. By default this is your login name.
966 @end defvar
967 @defvar socks-timeout
968 This controls how long, in seconds, to wait for responses from the
969 @sc{socks} server; it is 5 by default.
970 @end defvar
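
As a sketch, the following routes connections through a @sc{socks}
version 5 server. The server name and port are placeholders.

@example
(require 'socks)

(setq url-gateway-method 'socks
      ;; ("Default server" SERVER PORT VERSION)
      socks-server '("Default server" "socks.example.com" 1080 5)
      ;; Wait up to 10 seconds for replies from the server.
      socks-timeout 10)
@end example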
971 @c fixme: these have been effectively commented-out in the code
972 @c @defopt socks-server-aliases
973 @c This a list of server aliases. It is a list of aliases of the form
974 @c @var{(alias hostname port version)}.
975 @c @end defopt
976 @c @defopt socks-network-aliases
977 @c This a list of network aliases. Each entry in the list takes the form
978 @c @var{(alias (network))} where @var{alias} is a string that names the
979 @c @var{network}. The networks can contain a pair (not a dotted pair) of
980 @c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
981 @c address and a netmask, a domain name or a unique hostname or @sc{ip}
982 @c address.
983 @c @end defopt
984 @c @defopt socks-redirection-rules
985 @c This a list of redirection rules. Each rule take the form
986 @c @var{(Destination network Connection type)} where @var{Destination
987 @c network} is a network alias from @code{socks-network-aliases} and
988 @c @var{Connection type} can be @code{nil} in which case a direct
989 @c connection is used, or it can be an alias from
990 @c @code{socks-server-aliases} in which case that server is used as a
991 @c proxy.
992 @c @end defopt
993 @defopt socks-nslookup-program
994 @cindex @command{nslookup}
995 This is the name of the @command{nslookup} program. It is @code{"nslookup"} by default.
996 @end defopt
997
998 @menu
999 * Suppressing network connections::
1000 @end menu
1001 @c * Broken hostname resolution::
1002
1003 @node Suppressing network connections
1004 @subsection Suppressing Network Connections
1005
1006 @cindex network connections, suppressing
1007 @cindex suppressing network connections
1008 @cindex bugs, HTML
1009 @cindex HTML `bugs'
1010 In some circumstances it is desirable to suppress making network
1011 connections. A typical case is when rendering HTML in a mail user
1012 agent, when external URLs should not be activated, particularly to
1013 avoid ``bugs'' which ``call home'' by fetching single-pixel images and the
1014 like. To arrange this, bind the following variable for the duration
1015 of such processing.
1016
1017 @defvar url-gateway-unplugged
1018 If this variable is non-@code{nil} new network connections are never
1019 opened by the URL library.
1020 @end defvar
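
A minimal sketch of such a binding follows. The wrapper
@code{my-call-unplugged} is a hypothetical helper, not part of the
library; the binding applies only to connections attempted during the
call.

@example
(require 'url-gw)

(defun my-call-unplugged (function &rest args)
  "Call FUNCTION with ARGS while network connections are suppressed."
  (let ((url-gateway-unplugged t))
    (apply function args)))
@end example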
1021
1022 @c @node Broken hostname resolution
1023 @c @subsection Broken Hostname Resolution
1024
1025 @c @cindex hostname resolver
1026 @c @cindex resolver, hostname
1027 @c Some C libraries do not include the hostname resolver routines in
1028 @c their static libraries. If Emacs was linked statically, and was not
1029 @c linked with the resolver libraries, it will not be able to get to any
1030 @c machines off the local network. This is characterized by being able
1031 @c to reach someplace with a raw ip number, but not its hostname
1032 @c (@url{http://129.79.254.191/} works, but
1033 @c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
1034 @c SunOS4 and Ultrix, but is now probably now rare. If Emacs can't be
1035 @c rebuilt linked against the resolver library, it can use the external
1036 @c @command{nslookup} program instead.
1037
1038 @c @defopt url-gateway-broken-resolution
1039 @c @cindex @code{nslookup} program
1040 @c @cindex program, @code{nslookup}
1041 @c If non-@code{nil}, this variable says to use the program specified by
1042 @c @code{url-gateway-nslookup-program} program to do hostname resolution.
1043 @c @end defopt
1044
1045 @c @defopt url-gateway-nslookup-program
1046 @c The name of the program to do hostname lookup if Emacs can't do it
1047 @c directly. This program should expect a single argument on the command
1048 @c line---the hostname to resolve---and should produce output similar to
1049 @c the standard Unix @command{nslookup} program:
1050 @c @example
1051 @c Name: www.cs.indiana.edu
1052 @c Address: 129.79.254.191
1053 @c @end example
1054 @c @end defopt
1055
1056 @node History
1057 @section History
1058
1059 @findex url-do-setup
1060 The library can maintain a global history list tracking URLs accessed.
1061 URL completion can be done from it. The history mechanism is set up
1062 automatically via @code{url-do-setup} when it is configured to be on.
1063 Note that the size of the history list is currently not limited.
1064
1065 @vindex url-history-hash-table
1066 The history ``list'' is actually a hash table,
1067 @code{url-history-hash-table}. It contains access times keyed by URL
1068 strings. The times are in the format returned by @code{current-time}.
1069
1070 @defun url-history-update-url url time
1071 This function updates the history table with an entry for @var{url}
1072 accessed at the given @var{time}.
1073 @end defun
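
For instance, the following sketch records an access and then looks it
up in the table. The URL is a placeholder.

@example
(require 'url-history)

;; Record that the URL was accessed just now.
(url-history-update-url "http://www.example.org/" (current-time))

;; Look up the access time recorded for the URL.
(gethash "http://www.example.org/" url-history-hash-table)
@end example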
1074
1075 @defopt url-history-track
1076 If non-@code{nil}, the library will keep track of all the URLs
1077 accessed. If it is @code{t}, the list is saved to disk at the end of
1078 each Emacs session. The default is @code{nil}.
1079 @end defopt
1080
1081 @defopt url-history-file
1082 The file storing the history list between sessions. It defaults to
1083 @file{history} in @code{url-configuration-directory}.
1084 @end defopt
1085
1086 @defopt url-history-save-interval
1087 @findex url-history-setup-save-timer
1088 The number of seconds between automatic saves of the history list.
1089 Default is one hour. Note that if you change this variable directly,
1090 rather than using Custom, after @code{url-do-setup} has been run, you
1091 need to run the function @code{url-history-setup-save-timer}.
1092 @end defopt
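
As a sketch, history tracking with a shorter save interval might be
enabled from Lisp as follows. The interval of 1800 seconds is an
arbitrary choice for illustration.

@example
(require 'url-history)

(setq url-history-track t
      url-history-save-interval 1800)  ; save every 30 minutes

;; Needed only if `url-do-setup' has already run.
(url-history-setup-save-timer)
@end example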
1093
1094 @defun url-history-parse-history &optional fname
1095 Parses the history file @var{fname} (default @code{url-history-file})
1096 and sets up the history list.
1097 @end defun
1098
1099 @defun url-history-save-history &optional fname
1100 Saves the current history to file @var{fname} (default
1101 @code{url-history-file}).
1102 @end defun
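
The two functions above can be combined to copy the history to another
file and read it back, as in this sketch. The file name is an
arbitrary placeholder.

@example
(require 'url-history)

;; "~/url-history.bak" is an arbitrary file name for illustration.
(url-history-save-history "~/url-history.bak")
(url-history-parse-history "~/url-history.bak")
@end example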
1103
1104 @defun url-completion-function string predicate function
1105 You can use this function to do completion of URLs from the history.
1106 @end defun
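
The argument list suggests that this function follows the usual
convention for programmed completion, so it could presumably be passed
as the collection argument of @code{completing-read}. The following
sketch relies on that assumption.

@example
(require 'url-history)

;; Prompt for a URL, completing over the URLs in the history.
(completing-read "URL: " #'url-completion-function)
@end example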
1107
1108 @node Customization
1109 @chapter Customization
1110
1111 @section Environment Variables
1112
1113 @cindex environment variables
1114 The following environment variables affect the library's operation at
1115 startup.
1116
1117 @table @code
1118 @item TMPDIR
1119 @vindex TMPDIR
1120 @vindex url-temporary-directory
1121 If this is defined, @code{url-temporary-directory} is initialized from
1122 it.
1123 @end table
1124
1125 @section General User Options
1126
1127 The following user options, settable with Customize, affect the
1128 general operation of the package.
1129
1130 @defopt url-debug
1131 @cindex debugging
1132 Specifies the types of debug messages which are logged to
1133 the @code{*URL-DEBUG*} buffer.
1134 @code{t} means log all messages.
1135 A number means log all messages and show them with @code{message}.
1136 It may also be a list of the types of messages to be logged.
1137 @end defopt
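
As a sketch, either of the following settings could be used. The
message type @code{http} in the second form is assumed to be one of
the types the library uses.

@example
;; Log every type of debug message.
(setq url-debug t)

;; Or log only selected types (`http' is an assumed type name).
(setq url-debug '(http))
@end example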
1138 @defopt url-personal-mail-address
1139 @end defopt
1140 @defopt url-privacy-level
1141 @end defopt
1142 @defopt url-uncompressor-alist
1143 @end defopt
1144 @defopt url-passwd-entry-func
1145 @end defopt
1146 @defopt url-standalone-mode
1147 @end defopt
1148 @defopt url-bad-port-list
1149 @end defopt
1150 @defopt url-max-password-attempts
1151 @end defopt
1152 @defopt url-temporary-directory
1153 @end defopt
1154 @defopt url-show-status
1155 @end defopt
1156 @defopt url-confirmation-func
1157 The function to use for asking yes or no questions. This is normally
1158 either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
1159 function taking a single argument (the prompt) and returning @code{t}
1160 only if an affirmative answer is given.
1161 @end defopt
1162 @defopt url-gateway-method
1163 @c fixme: describe gatewaying
1164 A symbol specifying the type of gateway support to use for connections
1165 from the local machine. The supported methods are:
1166
1167 @table @code
1168 @item telnet
1169 Run telnet in a subprocess to connect;
1170 @item rlogin
1171 Rlogin to another machine to connect;
1172 @item socks
1173 Connect through a socks server;
1174 @item ssl
1175 Connect with SSL;
1176 @item native
1177 Connect directly.
1178 @end table
1179 @end defopt
1180
1181 @node GNU Free Documentation License
1182 @appendix GNU Free Documentation License
1183 @include doclicense.texi
1184
1185 @node Function Index
1186 @unnumbered Command and Function Index
1187 @printindex fn
1188
1189 @node Variable Index
1190 @unnumbered Variable Index
1191 @printindex vr
1192
1193 @node Concept Index
1194 @unnumbered Concept Index
1195 @printindex cp
1196
1197 @bye