1 \input texinfo
2 @setfilename ../../info/url
3 @settitle URL Programmer's Manual
4
5 @iftex
6 @c @finalout
7 @end iftex
8 @c @setchapternewpage odd
9 @c @smallbook
10
11 @tex
12 \overfullrule=0pt
13 %\global\baselineskip 30pt % for printing in double space
14 @end tex
15 @dircategory Emacs lisp libraries
16 @direntry
17 * URL: (url). URL loading package.
18 @end direntry
19
20 @copying
21 This file documents the Emacs Lisp URL loading package.
22
23 Copyright @copyright{} 1993-1999, 2002, 2004-2012 Free Software Foundation, Inc.
24
25 @quotation
26 Permission is granted to copy, distribute and/or modify this document
27 under the terms of the GNU Free Documentation License, Version 1.3 or
28 any later version published by the Free Software Foundation; with no
29 Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
30 and with the Back-Cover Texts as in (a) below. A copy of the license
31 is included in the section entitled ``GNU Free Documentation License''.
32
33 (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
34 modify this GNU manual. Buying copies from the FSF supports it in
35 developing GNU and promoting software freedom.''
36 @end quotation
37 @end copying
38
39 @c
40 @titlepage
41 @title URL Programmer's Manual
42 @subtitle First Edition, URL Version 2.0
43 @author William M. Perry @email{wmperry@@gnu.org}
44 @author David Love @email{fx@@gnu.org}
45 @page
46 @vskip 0pt plus 1filll
47 @insertcopying
48 @end titlepage
49
50 @contents
51
52 @node Top
53 @top URL
54
55 @ifnottex
56 @insertcopying
57 @end ifnottex
58
59 @menu
60 * Getting Started:: Preparing your program to use URLs.
61 * Retrieving URLs:: How to use this package to retrieve a URL.
62 * Supported URL Types:: Descriptions of URL types currently supported.
63 * Defining New URLs:: How to define a URL loader for a new protocol.
64 * General Facilities:: URLs can be cached, accessed via a gateway
65 and tracked in a history list.
66 * Customization:: Variables you can alter.
67 * GNU Free Documentation License:: The license for this documentation.
68 * Function Index::
69 * Variable Index::
70 * Concept Index::
71 @end menu
72
73 @node Getting Started
74 @chapter Getting Started
75 @cindex URLs, definition
76 @cindex URIs
77
78 @dfn{Uniform Resource Locators} (URLs) are a specific form of
79 @dfn{Uniform Resource Identifiers} (URIs), described in RFC 2396, which
80 updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource
81 agents.
82
83 URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
84 @var{scheme}s supported by this library are described below.
85 @xref{Supported URL Types}.
86
87 FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
88 IRC and gopher URLs all have the form
89
90 @example
91 @var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
92 @end example
93 @noindent
94 where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
95 @var{userinfo} sometimes takes the form @var{username}:@var{password}
96 but you should beware of the security risks of sending cleartext
97 passwords. @var{hostname} may be a domain name or a dotted decimal
98 address. If the @samp{:@var{port}} is omitted then the library will
99 use the `well known' port for that service when accessing URLs. With
100 the possible exception of @code{telnet}, it is rare for ports to be
101 specified, and using a non-standard port may have
102 undesired consequences if a different service is listening on that
103 port (e.g., an HTTP URL specifying the SMTP port can cause mail to be
104 sent). @c , but @xref{Other Variables, url-bad-port-list}.
105 The meaning of the @var{path} component depends on the service.
106
107 @menu
108 * Configuration::
109 * Parsed URLs:: URLs are parsed into vector structures.
110 @end menu
111
112 @node Configuration
113 @section Configuration
114
115 @defvar url-configuration-directory
116 @cindex @file{~/.url}
117 @cindex configuration files
118 The directory in which URL configuration files, the cache etc.,
119 reside. The old default was @file{~/.url}, and this directory
120 is still used if it exists. The new default is a @file{url/}
121 directory in @code{user-emacs-directory}, which is normally
122 @file{~/.emacs.d}.
123 @end defvar
124
125 @node Parsed URLs
126 @section Parsed URLs
127 @cindex parsed URLs
128 The library functions typically operate on @dfn{parsed} versions of
129 URLs. These are actually vectors of the form:
130
131 @example
132 [@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
133 @end example
134
135 @noindent where
136 @table @var
137 @item type
138 is the type of the URL scheme, e.g., @code{http};
139 @item user
140 is the username associated with it, or @code{nil};
141 @item password
142 is the user password associated with it, or @code{nil};
143 @item host
144 is the host name associated with it, or @code{nil};
145 @item port
146 is the port number associated with it, or @code{nil};
147 @item file
148 is the `file' part of it, or @code{nil}. This doesn't necessarily
149 actually refer to a file;
150 @item target
151 is the target part, or @code{nil};
152 @item attributes
153 is the attributes associated with it, or @code{nil};
154 @item full
155 is @code{t} for a fully-specified URL, with a host part indicated by
156 @samp{//} after the scheme part.
157 @end table
158
159 @findex url-type
160 @findex url-user
161 @findex url-password
162 @findex url-host
163 @findex url-port
164 @findex url-file
165 @findex url-target
166 @findex url-attributes
167 @findex url-full
168 @findex url-set-type
169 @findex url-set-user
170 @findex url-set-password
171 @findex url-set-host
172 @findex url-set-port
173 @findex url-set-file
174 @findex url-set-target
175 @findex url-set-attributes
176 @findex url-set-full
177 These attributes have accessors named @code{url-@var{part}}, where
178 @var{part} is the name of one of the elements above, e.g.,
179 @code{url-host}. Similarly, there are setters of the form
180 @code{url-set-@var{part}}.
181
182 There are functions for parsing and unparsing between the string and
183 vector forms.
184
185 @defun url-generic-parse-url url
186 Return a parsed version of the string @var{url}.
187 @end defun
188
189 @defun url-recreate-url url
190 @cindex unparsing URLs
191 Recreates a URL string from the parsed @var{url}.
192 @end defun
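
As a minimal sketch of the round trip between the string and parsed
forms (the example URL is arbitrary, and only accessors named above
are used):

@example
(require 'url-parse)

;; Parse a string, then read some components back with the accessors.
(let ((parsed (url-generic-parse-url "http://www.example.com:8080/foo/bar")))
  (list (url-type parsed)    ; the scheme, as a string
        (url-host parsed)    ; the host name
        (url-port parsed)))  ; the port, as a number
@result{} ("http" "www.example.com" 8080)

;; Turn the parsed form back into a string.
(url-recreate-url
 (url-generic-parse-url "http://www.example.com:8080/foo/bar"))
@end example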
193
194 @node Retrieving URLs
195 @chapter Retrieving URLs
196
197 @defun url-retrieve-synchronously url
198 Retrieve @var{url} synchronously and return a buffer containing the
199 data. @var{url} is either a string or a parsed URL structure. Return
200 @code{nil} if there are no data associated with it (the case for dired,
201 info, or mailto URLs that need no further processing).
202 @end defun
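
A minimal sketch of synchronous retrieval follows. The URL is a
placeholder, and killing the buffer afterwards is a design choice of
this example rather than a requirement of the library.

@example
(require 'url)

;; Fetch a URL and return the buffer contents as a string,
;; or nil if the URL has no associated data.
(let ((buffer (url-retrieve-synchronously "http://www.example.com/")))
  (when buffer
    (unwind-protect
        (with-current-buffer buffer
          (buffer-string))
      ;; Do not leave the retrieval buffer lying around.
      (kill-buffer buffer))))
@end example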
203
204 @defun url-retrieve url callback &optional cbargs
205 Retrieve @var{url} asynchronously and call @var{callback} with args
206 @var{cbargs} when finished. The callback is called when the object
207 has been completely retrieved, with the current buffer containing the
208 object and any MIME headers associated with it. @var{url} is either a
209 string or a parsed URL structure. Returns the buffer @var{url} will
210 load into, or @code{nil} if the process has already completed.
211 @end defun
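
The following sketch shows one way to use the asynchronous interface.
The callback name @code{my-url-show-size} and the URL are hypothetical.
It also assumes, as the @code{url-retrieve} docstring in the Emacs
versions we know of describes, that the callback receives a status
property list as its first argument, ahead of the elements of
@var{cbargs}; check the docstring of your version before relying on
this.

@example
(require 'url)

(defun my-url-show-size (status label)
  "Hypothetical callback: report the outcome of retrieving LABEL.
STATUS is the status list assumed to be passed by `url-retrieve'."
  (if (plist-get status :error)
      (message "%s failed: %S" label (plist-get status :error))
    ;; The current buffer holds the headers and the object.
    (message "%s: %d characters retrieved" label (buffer-size)))
  (kill-buffer (current-buffer)))

;; The third argument is the list of extra callback arguments.
(url-retrieve "http://www.example.com/" 'my-url-show-size '("example"))
@end example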
212
213 @node Supported URL Types
214 @chapter Supported URL Types
215
216 @menu
217 * http/https:: Hypertext Transfer Protocol.
218 * file/ftp:: Local files and FTP archives.
219 * info:: Emacs `Info' pages.
220 * mailto:: Sending email.
221 * news/nntp/snews:: Usenet news.
222 * rlogin/telnet/tn3270:: Remote host connectivity.
223 * irc:: Internet Relay Chat.
224 * data:: Embedded data URLs.
225 * nfs:: Networked File System
226 @c * finger::
227 @c * gopher::
228 @c * netrek::
229 @c * prospero::
230 * cid:: Content-ID.
231 * about::
232 * ldap:: Lightweight Directory Access Protocol
233 * imap:: IMAP mailboxes.
234 * man:: Unix man pages.
235 @end menu
236
237 @node http/https
238 @section @code{http} and @code{https}
239
240 The scheme @code{http} is Hypertext Transfer Protocol. The library
241 supports version 1.1, specified in RFC 2616. (This supersedes 1.0,
242 defined in RFC 1945.) HTTP URLs have the following form, where most of
243 the parts are optional:
244 @example
245 http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
246 @end example
247 @c The @code{:@var{port}} part is optional, and @var{port} defaults to
248 @c 80. The @code{/@var{path}} part, if present, is a slash-separated
249 @c series elements. The @code{?@var{searchpart}}, if present, is the
250 @c query for a search or the content of a form submission. The
251 @c @code{#fragment} part, if present, is a location in the document.
252
253 The scheme @code{https} is a secure version of @code{http}, with
254 transmission via SSL. It is defined in RFC 2818. Its default port is
255 443. This scheme depends on SSL support in Emacs via the
256 @file{ssl.el} library and is actually implemented by forcing the
257 @code{ssl} gateway method to be used. @xref{Gateways in general}.
258
259 @defopt url-honor-refresh-requests
260 This controls honoring of HTTP @samp{Refresh} headers by which
261 servers can direct clients to reload documents from the same URL or a
262 different one. @code{nil} means they will not be honored,
263 @code{t} (the default) means they will always be honored, and
264 otherwise the user will be asked on each request.
265 @end defopt
266
267
268 @menu
269 * Cookies::
270 * HTTP language/coding::
271 * HTTP URL Options::
272 * Dealing with HTTP documents::
273 @end menu
274
275 @node Cookies
276 @subsection Cookies
277
278 @defopt url-cookie-file
279 The file in which cookies are stored, defaulting to @file{cookies} in
280 the directory specified by @code{url-configuration-directory}.
281 @end defopt
282
283 @defopt url-cookie-confirmation
284 Specifies whether confirmation is required to accept cookies.
285 @end defopt
286
287 @defopt url-cookie-multiple-line
288 Specifies whether to put all cookies for the server on one line in the
289 HTTP request to satisfy broken servers like
290 @url{http://www.hotmail.com}.
291 @end defopt
292
293 @defopt url-cookie-trusted-urls
294 A list of regular expressions matching URLs from which to accept
295 cookies always.
296 @end defopt
297
298 @defopt url-cookie-untrusted-urls
299 A list of regular expressions matching URLs from which to reject
300 cookies always.
301 @end defopt
302
303 @defopt url-cookie-save-interval
304 The number of seconds between automatic saves of cookies to disk.
305 Default is one hour.
306 @end defopt
307
308
309 @node HTTP language/coding
310 @subsection Language and Encoding Preferences
311
312 HTTP allows clients to express preferences for the language and
313 encoding of documents which servers may honor. For each of these
314 variables, the value is a string; it can specify a single choice, or
315 it can be a comma-separated list.
316
317 Normally, this list is ordered by descending preference. However, each
318 element can be followed by @samp{;q=@var{priority}} to specify its
319 preference level, a decimal number from 0 to 1; e.g., for
320 @code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
321 en;q=0.7"}}. An element that has no @samp{;q} specification has
322 preference level 1.
323
324 @defopt url-mime-charset-string
325 @cindex character sets
326 @cindex coding systems
327 This variable specifies a preference for character sets when documents
328 can be served in more than one encoding.
329
330 HTTP allows specifying a series of MIME charsets which indicate your
331 preferred character set encodings, e.g., Latin-9 or Big5, and these
332 can be weighted. The default series is generated automatically from
333 the associated MIME types of all defined coding systems, sorted by the
334 coding system priority specified in Emacs. @xref{Recognize Coding, ,
335 Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
336 @end defopt
337
338 @defopt url-mime-language-string
339 @cindex language preferences
340 A string specifying the preferred language when servers can serve
341 files in several languages. Use RFC 1766 abbreviations, e.g.,
342 @samp{en} for English, @samp{de} for German.
343
344 The string can be @code{"*"} to get the first available language (as
345 opposed to the default).
346 @end defopt
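
For instance, a user who prefers German, then British English, then
any other English could put something like this in an init file (the
weights are the illustrative ones used above):

@example
;; Elements without a ;q= weight have preference level 1.
(setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7")
@end example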
347
348 @node HTTP URL Options
349 @subsection HTTP URL Options
350
351 HTTP supports an @samp{OPTIONS} method describing things supported by
352 the URL@.
353
354 @defun url-http-options url
355 Returns a property list describing options available for URL. The
356 property list members are:
357
358 @table @code
359 @item methods
360 A list of symbols specifying what HTTP methods the resource
361 supports.
362
363 @item dav
364 @cindex DAV
365 A list of numbers specifying what DAV protocol/schema versions are
366 supported.
367
368 @item dasl
369 @cindex DASL
370 A list of supported DASL search types (string form).
371
372 @item ranges
373 A list of the units available for use in partial document fetches.
374
375 @item p3p
376 @cindex P3P
377 The @dfn{Platform for Privacy Preferences} description for the resource.
378 Currently this is just the raw header contents.
379 @end table
380
381 @end defun
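
A small sketch of querying a server and picking one member out of the
returned property list. The URL is a placeholder, and the result
depends entirely on what the server reports, so none is shown.

@example
(require 'url-http)

;; Ask the server which HTTP methods the resource supports.
(let ((options (url-http-options "http://www.example.com/")))
  (plist-get options 'methods))
@end example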
382
383 @node Dealing with HTTP documents
384 @subsection Dealing with HTTP documents
385
386 HTTP URLs are retrieved into a buffer containing the HTTP headers
387 followed by the body. Since the headers are quasi-MIME, they may be
388 processed using the MIME library. @xref{Top,, Emacs MIME,
389 emacs-mime, The Emacs MIME Manual}.
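
When only the body is wanted, one simple approach is to skip to the
first blank line, which separates the headers from the body. This is
a sketch under that assumption, with a placeholder URL; it does not
decode the body or interpret any header.

@example
(require 'url)

(let ((buffer (url-retrieve-synchronously "http://www.example.com/")))
  (when buffer
    (unwind-protect
        (with-current-buffer buffer
          (goto-char (point-min))
          ;; The headers end at the first empty line.
          (when (re-search-forward "\r?\n\r?\n" nil t)
            (buffer-substring (point) (point-max))))
      (kill-buffer buffer))))
@end example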
390
391 @node file/ftp
392 @section file and ftp
393 @cindex files
394 @cindex FTP
395 @cindex File Transfer Protocol
396 @cindex compressed files
397 @cindex dired
398
399 @example
400 ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
401 file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
402 @end example
403
404 These schemes are defined in RFC 1738.
405 @samp{ftp:} and @samp{file:} are synonymous in this library. They
406 allow reading arbitrary files from hosts. Either @samp{ange-ftp}
407 (Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
408 hosts. Local files are accessed directly.
409
410 Compressed files are handled, but support is hard-coded so that
411 @code{jka-compr-compression-info-list} and so on have no effect.
412 Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
413 @samp{.bz2}.
414
415 @defopt url-directory-index-file
416 The filename to look for when indexing a directory, default
417 @samp{"index.html"}. If this file exists, and is readable, then it
418 will be viewed instead of using @code{dired} to view the directory.
419 @end defopt
420
421 @node info
422 @section info
423 @cindex Info
424 @cindex Texinfo
425 @findex Info-goto-node
426
427 @example
428 info:@var{file}#@var{node}
429 @end example
430
431 Info URLs are not officially defined. They invoke
432 @code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
433 @samp{#@var{node}} is optional, defaulting to @samp{Top}.
434
435 @node mailto
436 @section mailto
437
438 @cindex mailto
439 @cindex email
440 A mailto URL will send an email message to the address in the
441 URL, for example @samp{mailto:foo@@bar.com} would compose a
442 message to @samp{foo@@bar.com}.
443
444 @defopt url-mail-command
445 @vindex mail-user-agent
446 The function called whenever url needs to send mail. This should
447 normally be left to default from @code{mail-user-agent}. @xref{Mail
448 Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
449 @end defopt
450
451 An @samp{X-Url-From} header field containing the URL of the document
452 that contained the mailto URL is added if that URL is known.
453
454 RFC 2368 extends the definition of mailto URLs in RFC 1738.
455 The form of a mailto URL is
456 @example
457 @samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
458 @end example
459 @noindent where an arbitrary number of @var{header}s can be added. If the
460 @var{header} is @samp{body}, then @var{contents} is put in the body;
461 otherwise a @var{header} header field is created with @var{contents}
462 as its contents. Note that the URL library does not consider any
463 headers `dangerous' so you should check them before sending the
464 message.
465
466 @c Fixme: update
467 Email messages are defined in RFC 822.
468
469 @node news/nntp/snews
470 @section @code{news}, @code{nntp} and @code{snews}
471 @cindex news
472 @cindex network news
473 @cindex usenet
474 @cindex NNTP
475 @cindex snews
476
477 @c draft-gilman-news-url-01
478 The network news URL schemes take the following forms, following RFC
479 1738, except that for compatibility with other clients, host and port
480 fields may be included in news URLs though they are properly only
481 allowed for nntp and snews.
482
483 @table @samp
484 @item news:@var{newsgroup}
485 Retrieves a list of messages in @var{newsgroup};
486 @item news:@var{message-id}
487 Retrieves the message with the given @var{message-id};
488 @item news:*
489 Retrieves a list of all available newsgroups;
490 @item nntp://@var{host}:@var{port}/@var{newsgroup}
491 @itemx nntp://@var{host}:@var{port}/@var{message-id}
492 @itemx nntp://@var{host}:@var{port}/*
493 Similar to the @samp{news} versions.
494 @end table
495
496 @samp{:@var{port}} is optional and defaults to :119.
497
498 @samp{snews} is the same as @samp{nntp} except that the default port
499 is :563.
500 @cindex SSL
501 (It is tunneled through SSL.)
502
503 An @samp{nntp} URL is the same as a news URL, except that the URL may
504 specify an article by its number.
505
506 @defopt url-news-server
507 This variable can be used to override the default news server.
508 Usually this will be set by the Gnus package, which is used to fetch
509 news.
510 @cindex environment variable
511 @vindex NNTPSERVER
512 It may be set from the conventional environment variable
513 @code{NNTPSERVER}.
514 @end defopt
515
516 @node rlogin/telnet/tn3270
517 @section rlogin, telnet and tn3270
518 @cindex rlogin
519 @cindex telnet
520 @cindex tn3270
521 @cindex terminal emulation
522 @findex terminal-emulator
523
524 These URL schemes from RFC 1738 for logon via a terminal emulator have
525 the form
526 @example
527 telnet://@var{user}:@var{password}@@@var{host}:@var{port}
528 @end example
529 but the @code{:@var{password}} component is ignored.
530
531 To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
532 @code{telnet} or @code{tn3270} (the program names and arguments are
533 hardcoded) session is run in a @code{terminal-emulator} buffer.
534 Well-known ports are used if the URL does not specify a port.
535
536 @node irc
537 @section irc
538 @cindex IRC
539 @cindex Internet Relay Chat
540 @cindex ZEN IRC
541 @cindex ERC
542 @cindex rcirc
543 @c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
544 @dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
545 session to a function named in @code{url-irc-function}.
546
547 @defopt url-irc-function
548 A function to actually open an IRC connection.
549 This function
550 must take five arguments, @var{host}, @var{port}, @var{channel},
551 @var{user} and @var{password}. The @var{channel} argument specifies the
552 channel to join immediately; this can be @code{nil}. By default this is
553 @code{url-irc-rcirc}.
554 @end defopt
555 @defun url-irc-rcirc host port channel user password
556 Processes the arguments and lets @code{rcirc} handle the session.
557 @end defun
558 @defun url-irc-erc host port channel user password
559 Processes the arguments and lets @code{ERC} handle the session.
560 @end defun
561 @defun url-irc-zenirc host port channel user password
562 Processes the arguments and lets @code{zenirc} handle the session.
563 @end defun
564
565 @node data
566 @section data
567 @cindex data URLs
568
569 @example
570 data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
571 @end example
572
573 Data URLs contain MIME data in the URL itself. They are defined in
574 RFC 2397.
575
576 @var{media-type} is a MIME @samp{Content-Type} string, possibly
577 including parameters. It defaults to
578 @samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
579 omitted but the charset parameter supplied. If @samp{;base64} is
580 present, the @var{data} are base64-encoded.
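
As an illustration, the URL
@samp{data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==} carries a short
piece of base64-encoded text. Decoding the part after the comma by
hand, with the standard @code{base64-decode-string} function rather
than anything specific to this library, shows the payload:

@example
;; The data part of the URL above, decoded directly.
(base64-decode-string "SGVsbG8sIFdvcmxkIQ==")
@result{} "Hello, World!"
@end example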
581
582 @node nfs
583 @section nfs
584 @cindex NFS
585 @cindex Network File System
586 @cindex automounter
587
588 @example
589 nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
590 @end example
591
592 The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
593 @samp{ftp:} except that it points to a file on a remote host that is
594 handled by the automounter on the local host.
595
596 @defvar url-nfs-automounter-directory-spec
597 A string saying how to invoke the NFS automounter. Certain @samp{%}
598 sequences are recognized:
599
600 @table @samp
601 @item %h
602 The hostname of the NFS server;
603 @item %n
604 The port number of the NFS server;
605 @item %u
606 The username to use to authenticate;
607 @item %p
608 The password to use to authenticate;
609 @item %f
610 The filename on the remote server;
611 @item %%
612 A literal @samp{%}.
613 @end table
614
615 Each can be used any number of times.
616 @end defvar
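
As a sketch, on a system where the automounter exposes remote hosts
under a @file{/net} directory (an assumption about the local setup,
not something this library guarantees), the variable might be set like
this:

@example
;; %h expands to the NFS server's host name, %f to the remote file name.
(setq url-nfs-automounter-directory-spec "file:/net/%h%f")
@end example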
617
618 @node cid
619 @section cid
620 @cindex Content-ID
621
622 The @samp{cid:} scheme is defined in RFC 2111.
623
624 @node about
625 @section about
626
627 @node ldap
628 @section ldap
629 @cindex LDAP
630 @cindex Lightweight Directory Access Protocol
631
632 The LDAP scheme is defined in RFC 2255.
633
634 @node imap
635 @section imap
636 @cindex IMAP
637
638 The @samp{imap:} scheme is defined in RFC 2192.
639
640 @node man
641 @section man
642 @cindex @command{man}
643 @cindex Unix man pages
644 @findex man
645
646 @example
647 @samp{man:@var{page-spec}}
648 @end example
649
650 This is a non-standard scheme. @var{page-spec} is passed directly to
651 the Lisp @code{man} function.
652
653 @node Defining New URLs
654 @chapter Defining New URLs
655
656 @menu
657 * Naming conventions::
658 * Required functions::
659 * Optional functions::
660 * Asynchronous fetching::
661 * Supporting file-name-handlers::
662 @end menu
663
664 @node Naming conventions
665 @section Naming conventions
666
667 @node Required functions
668 @section Required functions
669
670 @node Optional functions
671 @section Optional functions
672
673 @node Asynchronous fetching
674 @section Asynchronous fetching
675
676 @node Supporting file-name-handlers
677 @section Supporting file-name-handlers
678
679 @node General Facilities
680 @chapter General Facilities
681
682 @menu
683 * Disk Caching::
684 * Proxies::
685 * Gateways in general::
686 * History::
687 @end menu
688
689 @node Disk Caching
690 @section Disk Caching
691 @cindex Caching
692 @cindex Persistent Cache
693 @cindex Disk Cache
694
695 The disk cache stores retrieved documents locally, whence they can be
696 retrieved more quickly. When requesting a URL that is in the cache,
697 the library checks to see if the page has changed since it was last
698 retrieved from the remote machine. If not, the local copy is used,
699 saving the transmission over the network.
700 @cindex Cleaning the cache
701 @cindex Clearing the cache
702 @cindex Cache cleaning
703 Currently the cache isn't cleared automatically.
704 @c Running the @code{clean-cache} shell script
705 @c fist is recommended, to allow for future cleaning of the cache. This
706 @c shell script will remove all files that have not been accessed since it
707 @c was last run. To keep the cache pared down, it is recommended that this
708 @c script be run from @i{at} or @i{cron} (see the manual pages for
709 @c crontab(5) or at(1) for more information)
710
711 @defopt url-automatic-caching
712 Setting this variable non-@code{nil} causes documents to be cached
713 automatically.
714 @end defopt
715
716 @defopt url-cache-directory
717 This variable specifies the
718 directory in which to store the cache files. It defaults to the sub-directory
719 @file{cache} of @code{url-configuration-directory}.
720 @end defopt
721
722 @defopt url-cache-creation-function
723 The cache relies on a scheme for mapping URLs to files in the cache.
724 This variable names a function which sets the type of cache to use.
725 It takes a URL as argument and returns the absolute file name of the
726 corresponding cache file. The two supplied possibilities are
727 @code{url-cache-create-filename-using-md5} and
728 @code{url-cache-create-filename-human-readable}.
729 @end defopt
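
For example, to turn on automatic caching and select the more readable
of the two supplied naming functions, an init file might contain the
following sketch:

@example
;; Cache retrieved documents automatically.
(setq url-automatic-caching t)

;; Name cache files after the URL rather than after an MD5 hash.
(setq url-cache-creation-function
      'url-cache-create-filename-human-readable)
@end example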
730
731 @defun url-cache-create-filename-using-md5 url
732 Creates a cache file name from @var{url} using MD5 hashing.
733 This creates entries with very few cache collisions and is fast.
734 @cindex MD5
735 @smallexample
736 (url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
737 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
738 @end smallexample
739 @end defun
740
741 @defun url-cache-create-filename-human-readable url
742 Creates a cache file name from @var{url} more obviously connected to
743 @var{url} than for @code{url-cache-create-filename-using-md5}, but
744 more likely to conflict with other files.
745 @smallexample
746 (url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
747 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
748 @end smallexample
749 @end defun
750
751 @defun url-cache-expired
752 This function returns non-@code{nil} if a cache entry has expired (or is absent).
753 The arguments are a URL and optional expiration delay in seconds
754 (default @code{url-cache-expire-time}).
755 @end defun
756
757 @defopt url-cache-expire-time
758 This variable is the default number of seconds to use for the
759 expire-time argument of the function @code{url-cache-expired}.
760 @end defopt
761
762 @defun url-fetch-from-cache
763 This function takes a URL as its argument and returns a buffer
764 containing the data cached for that URL.
765 @end defun
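
The two functions can be combined as in the following sketch, which
uses the cached copy unless it is missing or more than an hour old.
The URL and the one-hour delay are arbitrary choices for illustration.

@example
(require 'url)
(require 'url-cache)

(let ((url "http://www.example.com/"))
  (if (url-cache-expired url 3600)
      ;; No usable cache entry: go to the network.
      (url-retrieve-synchronously url)
    ;; Otherwise return a buffer holding the cached data.
    (url-fetch-from-cache url)))
@end example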
766
767 @c Fixme: never actually used currently?
768 @c @defopt url-standalone-mode
769 @c @cindex Relying on cache
770 @c @cindex Cache only mode
771 @c @cindex Standalone mode
772 @c If this variable is non-@code{nil}, the library relies solely on the
773 @c cache for fetching documents and avoids checking if they have changed
774 @c on remote servers.
775 @c @end defopt
776
777 @c With a large cache of documents on the local disk, it can be very handy
778 @c when traveling, or any other time the network connection is not active
779 @c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
780 @c solely on its cache, and avoid checking to see if the page has changed
781 @c on the remote server. In the case of a dial-on-demand PPP connection,
782 @c this will keep the phone line free as long as possible, only bringing up
783 @c the PPP connection when asking for a page that is not located in the
784 @c cache. This is very useful for demonstrations as well.
785
786 @node Proxies
787 @section Proxies and Gatewaying
788
789 @c fixme: check/document url-ns stuff
790 @cindex proxy servers
791 @cindex proxies
792 @cindex environment variables
793 @vindex HTTP_PROXY
794 Proxy servers are commonly used to provide gateways through firewalls
795 or as caches serving some more-or-less local network. Each protocol
796 (HTTP, FTP, etc.)@: can have a different gateway server. Proxying is
797 conventionally configured, in the same way for many different programs, through
798 environment variables of the form @code{@var{protocol}_proxy}, where
799 @var{protocol} is one of the supported network protocols (@code{http},
800 @code{ftp} etc.). The library recognizes such variables in either
801 upper or lower case. Their values are of one of the forms:
802 @itemize @bullet
803 @item @code{@var{host}:@var{port}}
804 @item A full URL;
805 @item Simply a host name.
806 @end itemize
807
808 @vindex NO_PROXY
809 The @code{NO_PROXY} environment variable specifies URLs that should be
810 excluded from proxying (on servers that should be contacted directly).
811 This should be a comma-separated list of hostnames, domain names, or a
812 mixture of both. Asterisks can be used as wildcards, but other
813 clients may not support that. Domain names may be indicated by a
814 leading dot. For example:
815 @example
816 NO_PROXY="*.aventail.com,home.com,.seanet.com"
817 @end example
818 @noindent says to contact all machines in the @samp{aventail.com} and
819 @samp{seanet.com} domains directly, as well as the machine named
820 @samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
821 and @code{no_proxy} are also tried, in that order.
822
823 Proxies may also be specified directly in Lisp.
824
825 @defopt url-proxy-services
826 This variable is an alist of URL schemes and proxy servers that
827 gateway them. The items are of the form @w{@code{(@var{scheme}
828 . @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is
829 gatewayed through @var{portnumber} on the specified @var{host}. An
830 exception is the pseudo scheme @code{"no_proxy"}, which is paired with
831 a regexp matching host names not to be proxied. This variable is
832 initialized from the environment as above.
833
834 @example
835 (setq url-proxy-services
836 '(("http" . "proxy.aventail.com:80")
837 ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
838 @end example
839 @end defopt
840
841 @node Gateways in general
842 @section Gateways in General
843 @cindex gateways
844 @cindex firewalls
845
846 The library provides a general gateway layer through which all
847 networking passes. It can both control access to the network and
848 provide access through gateways in firewalls. This may make direct
849 connections in some cases and pass through some sort of gateway in
850 others.@footnote{Proxies (which only operate over HTTP) are
851 implemented using this.} The library's basic function responsible for
852 making connections is @code{url-open-stream}.
853
854 @defun url-open-stream name buffer host service
855 @cindex opening a stream
856 @cindex stream, opening
857 Open a stream to @var{host}, possibly via a gateway. The other
858 arguments are as for @code{open-network-stream}. This will not make a
859 connection if @code{url-gateway-unplugged} is non-@code{nil}.
860 @end defun
861
862 @defvar url-gateway-local-host-regexp
863 This is a regular expression that matches local hosts that do not
864 require the use of a gateway. If @code{nil}, all connections are made
865 through the gateway.
866 @end defvar
867
868 @defvar url-gateway-method
869 This variable controls which gateway method is used. It may be useful
870 to bind it temporarily in some applications. It has values taken from
871 a list of symbols. Possible values are:
872
873 @table @code
874 @item telnet
875 @cindex @command{telnet}
876 Use this method if you must first telnet and log into a gateway host,
877 and then run telnet from that host to connect to outside machines.
878
879 @item rlogin
880 @cindex @command{rlogin}
881 This method is identical to @code{telnet}, but uses @command{rlogin}
882 to log into the remote machine without having to send the username and
883 password over the wire every time.
884
885 @item socks
886 @cindex @sc{socks}
887 Use if the firewall has a @sc{socks} gateway running on it. The
888 @sc{socks} v5 protocol is defined in RFC 1928.
889
890 @c @item ssl
891 @c This probably shouldn't be documented
892 @c Fixme: why not? -- fx
893
894 @item native
895 This method uses Emacs's built-in networking directly. This is the
896 default. It can be used only if there is no firewall blocking access.
897 @end table
898 @end defvar
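
A sketch of binding the variable temporarily around a single
connection follows. The connection name, host and port are
placeholders, and the connection is closed again at once since the
example only demonstrates the binding.

@example
(require 'url-gw)

;; Force a direct connection for this one call, whatever the
;; global value of `url-gateway-method' is.
(let* ((url-gateway-method 'native)
       (connection (url-open-stream "example-connection" nil
                                    "www.example.com" 80)))
  (when connection
    (delete-process connection)))
@end example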
899
900 The following variables control the gateway methods.
901
902 @defopt url-gateway-telnet-host
903 The gateway host to telnet to. Once logged in there, you then telnet
904 out to the hosts you want to connect to.
905 @end defopt
906 @defopt url-gateway-telnet-parameters
907 This should be a list of parameters to pass to the @command{telnet} program.
908 @end defopt
909 @defopt url-gateway-telnet-password-prompt
910 This is a regular expression that matches the password prompt when
911 logging in.
912 @end defopt
913 @defopt url-gateway-telnet-login-prompt
914 This is a regular expression that matches the username prompt when
915 logging in.
916 @end defopt
917 @defopt url-gateway-telnet-user-name
918 The username to log in with.
919 @end defopt
920 @defopt url-gateway-telnet-password
921 The password to send when logging in.
922 @end defopt
923 @defopt url-gateway-prompt-pattern
924 This is a regular expression that matches the shell prompt.
925 @end defopt
926
927 @defopt url-gateway-rlogin-host
928 Host to @samp{rlogin} to before telnetting out.
929 @end defopt
930 @defopt url-gateway-rlogin-parameters
931 Parameters to pass to @samp{rsh}.
932 @end defopt
933 @defopt url-gateway-rlogin-user-name
934 User name to use when logging in to the gateway.
935 @end defopt
939
940 @defopt socks-server
941 This specifies the default server; it takes the form
942 @w{@code{("Default server" @var{server} @var{port} @var{version})}}
943 where @var{version} can be either 4 or 5.
944 @end defopt
945 @defvar socks-password
946 If this is @code{nil} then you will be asked for the password,
947 otherwise it will be used as the password for authenticating you to
948 the @sc{socks} server.
949 @end defvar
950 @defvar socks-username
951 This is the username to use when authenticating yourself to the
952 @sc{socks} server. By default this is your login name.
953 @end defvar
954 @defvar socks-timeout
955 This controls how long, in seconds, to wait for responses from the
956 @sc{socks} server; it is 5 by default.
957 @end defvar
958 @c fixme: these have been effectively commented-out in the code
959 @c @defopt socks-server-aliases
960 @c This a list of server aliases. It is a list of aliases of the form
961 @c @var{(alias hostname port version)}.
962 @c @end defopt
963 @c @defopt socks-network-aliases
964 @c This a list of network aliases. Each entry in the list takes the form
965 @c @var{(alias (network))} where @var{alias} is a string that names the
966 @c @var{network}. The networks can contain a pair (not a dotted pair) of
967 @c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
968 @c address and a netmask, a domain name or a unique hostname or @sc{ip}
969 @c address.
970 @c @end defopt
971 @c @defopt socks-redirection-rules
972 @c This a list of redirection rules. Each rule take the form
973 @c @var{(Destination network Connection type)} where @var{Destination
974 @c network} is a network alias from @code{socks-network-aliases} and
975 @c @var{Connection type} can be @code{nil} in which case a direct
976 @c connection is used, or it can be an alias from
977 @c @code{socks-server-aliases} in which case that server is used as a
978 @c proxy.
979 @c @end defopt
980 @defopt socks-nslookup-program
981 @cindex @command{nslookup}
982 This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
983 @end defopt
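
Taken together, the @sc{socks} variables might be set as in the
following sketch. The host name @samp{socks.example.com}, the port
number, and the timeout are placeholders chosen for illustration;
substitute the values for your own site:

@example
(setq url-gateway-method 'socks
      ;; The list has the form
      ;; ("Default server" SERVER PORT VERSION).
      socks-server '("Default server" "socks.example.com" 1080 5)
      ;; Wait up to 10 seconds for the server to respond.
      socks-timeout 10)
@end example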
984
985 @menu
986 * Suppressing network connections::
987 @end menu
988 @c * Broken hostname resolution::
989
990 @node Suppressing network connections
991 @subsection Suppressing Network Connections
992
993 @cindex network connections, suppressing
994 @cindex suppressing network connections
995 @cindex bugs, HTML
996 @cindex HTML `bugs'
997 In some circumstances it is desirable to suppress making network
998 connections. A typical case is when rendering HTML in a mail user
999 agent, when external URLs should not be activated, particularly to
1000 avoid `bugs' which `call home' by fetching single-pixel images and the
1001 like. To arrange this, bind the following variable for the duration
1002 of such processing.
1003
1004 @defvar url-gateway-unplugged
1005 If this variable is non-@code{nil}, new network connections are never
1006 opened by the URL library.
1007 @end defvar
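
One way to arrange such a binding is a small wrapper, sketched below.
The name @code{my-call-unplugged} is hypothetical and not part of the
library:

@example
(require 'url)

(defun my-call-unplugged (function)
  "Call FUNCTION with URL network connections suppressed."
  ;; The binding is dynamic, so it covers everything FUNCTION
  ;; calls and is undone automatically when FUNCTION returns.
  (let ((url-gateway-unplugged t))
    (funcall function)))
@end example

A mail user agent could then pass its HTML rendering code to this
wrapper as a function of no arguments.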
1008
1009 @c @node Broken hostname resolution
1010 @c @subsection Broken Hostname Resolution
1011
1012 @c @cindex hostname resolver
1013 @c @cindex resolver, hostname
1014 @c Some C libraries do not include the hostname resolver routines in
1015 @c their static libraries. If Emacs was linked statically, and was not
1016 @c linked with the resolver libraries, it will not be able to get to any
1017 @c machines off the local network. This is characterized by being able
1018 @c to reach someplace with a raw ip number, but not its hostname
1019 @c (@url{http://129.79.254.191/} works, but
1020 @c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
1021 @c SunOS4 and Ultrix, but is now probably now rare. If Emacs can't be
1022 @c rebuilt linked against the resolver library, it can use the external
1023 @c @command{nslookup} program instead.
1024
1025 @c @defopt url-gateway-broken-resolution
1026 @c @cindex @code{nslookup} program
1027 @c @cindex program, @code{nslookup}
1028 @c If non-@code{nil}, this variable says to use the program specified by
1029 @c @code{url-gateway-nslookup-program} program to do hostname resolution.
1030 @c @end defopt
1031
1032 @c @defopt url-gateway-nslookup-program
1033 @c The name of the program to do hostname lookup if Emacs can't do it
1034 @c directly. This program should expect a single argument on the command
1035 @c line---the hostname to resolve---and should produce output similar to
1036 @c the standard Unix @command{nslookup} program:
1037 @c @example
1038 @c Name: www.cs.indiana.edu
1039 @c Address: 129.79.254.191
1040 @c @end example
1041 @c @end defopt
1042
1043 @node History
1044 @section History
1045
1046 @findex url-do-setup
1047 The library can maintain a global history list tracking URLs accessed.
1048 URL completion can be done from it. The history mechanism is set up
1049 automatically via @code{url-do-setup} when it is configured to be on.
1050 Note that the size of the history list is currently not limited.
1051
1052 @vindex url-history-hash-table
1053 The history `list' is actually a hash table,
1054 @code{url-history-hash-table}. It contains access times keyed by URL
1055 strings. The times are in the format returned by @code{current-time}.
1056
1057 @defun url-history-update-url url time
1058 This function updates the history table with an entry for @var{url}
1059 accessed at the given @var{time}.
1060 @end defun
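
For instance, the following sketch records a visit and then looks up
the stored access time directly in the hash table. The URL is just a
placeholder:

@example
(require 'url-history)

;; Record a visit at the current time.
(url-history-update-url "http://www.gnu.org/" (current-time))

;; Look up the access time stored for that URL.
(gethash "http://www.gnu.org/" url-history-hash-table)
@end example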
1061
1062 @defopt url-history-track
1063 If non-@code{nil}, the library will keep track of all the URLs
1064 accessed. If it is @code{t}, the list is saved to disk at the end of
1065 each Emacs session. The default is @code{nil}.
1066 @end defopt
1067
1068 @defopt url-history-file
1069 The file storing the history list between sessions. It defaults to
1070 @file{history} in @code{url-configuration-directory}.
1071 @end defopt
1072
1073 @defopt url-history-save-interval
1074 @findex url-history-setup-save-timer
1075 The number of seconds between automatic saves of the history list.
1076 The default is one hour. Note that if you change this variable directly,
1077 rather than using Custom, after @code{url-do-setup} has been run, you
1078 need to run the function @code{url-history-setup-save-timer}.
1079 @end defopt
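
Changing the interval from Lisp might therefore look like this sketch,
where the value of 1800 seconds is an arbitrary example:

@example
(require 'url-history)

;; Save the history every 30 minutes, then restart the timer
;; so that the new interval takes effect.
(setq url-history-save-interval 1800)
(url-history-setup-save-timer)
@end example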
1080
1081 @defun url-history-parse-history &optional fname
1082 Parses the history file @var{fname} (default @code{url-history-file})
1083 and sets up the history list.
1084 @end defun
1085
1086 @defun url-history-save-history &optional fname
1087 Saves the current history to file @var{fname} (default
1088 @code{url-history-file}).
1089 @end defun
1090
1091 @defun url-completion-function string predicate function
1092 You can use this function to do completion of URLs from the history.
1093 @end defun
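
Since the function takes the three arguments that Emacs passes to a
programmed completion function, it can presumably be handed to
@code{completing-read} as the collection, as in this sketch (the
prompt string is arbitrary):

@example
(require 'url)
(require 'url-history)

;; Prompt for a URL, completing against the history.
(completing-read "URL: " 'url-completion-function)
@end example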
1094
1095 @node Customization
1096 @chapter Customization
1097
1098 @section Environment Variables
1099
1100 @cindex environment variables
1101 The following environment variables affect the library's operation at
1102 startup.
1103
1104 @table @code
1105 @item TMPDIR
1106 @vindex TMPDIR
1107 @vindex url-temporary-directory
1108 If this is defined, @code{url-temporary-directory} is initialized from
1109 it.
1110 @end table
1111
1112 @section General User Options
1113
1114 The following user options, settable with Customize, affect the
1115 general operation of the package.
1116
1117 @defopt url-debug
1118 @cindex debugging
1119 Specifies the types of debug messages which are logged to
1120 the @code{*URL-DEBUG*} buffer.
1121 @code{t} means log all messages.
1122 A number means log all messages and show them with @code{message}.
1123 It may also be a list of the types of messages to be logged.
1124 @end defopt
1125 @defopt url-personal-mail-address
1126 @end defopt
1127 @defopt url-privacy-level
1128 @end defopt
1129 @defopt url-uncompressor-alist
1130 @end defopt
1131 @defopt url-passwd-entry-func
1132 @end defopt
1133 @defopt url-standalone-mode
1134 @end defopt
1135 @defopt url-bad-port-list
1136 @end defopt
1137 @defopt url-max-password-attempts
1138 @end defopt
1139 @defopt url-temporary-directory
1140 @end defopt
1141 @defopt url-show-status
1142 @end defopt
1143 @defopt url-confirmation-func
1144 The function to use for asking yes or no questions. This is normally
1145 either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
1146 function taking a single argument (the prompt) and returning @code{t}
1147 only if an affirmative answer is given.
1148 @end defopt
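
A custom confirmation function could be as simple as the following
sketch. The name @code{my-url-confirm} is hypothetical; the only
assumption is the calling convention described above, a single prompt
argument and a non-@code{nil} return value for ``yes'':

@example
(defun my-url-confirm (prompt)
  "Ask PROMPT with `y-or-n-p', echoing the question first."
  (message "URL library asks: %s" prompt)
  (y-or-n-p prompt))

(setq url-confirmation-func 'my-url-confirm)
@end example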
1149 @defopt url-gateway-method
1150 @c fixme: describe gatewaying
1151 A symbol specifying the type of gateway support to use for connections
1152 from the local machine. The supported methods are:
1153
1154 @table @code
1155 @item telnet
1156 Run telnet in a subprocess to connect;
1157 @item rlogin
1158 Rlogin to another machine to connect;
1159 @item socks
1160 Connect through a socks server;
1161 @item ssl
1162 Connect with SSL;
1163 @item native
1164 Connect directly.
1165 @end table
1166 @end defopt
1167
1168 @node GNU Free Documentation License
1169 @appendix GNU Free Documentation License
1170 @include doclicense.texi
1171
1172 @node Function Index
1173 @unnumbered Command and Function Index
1174 @printindex fn
1175
1176 @node Variable Index
1177 @unnumbered Variable Index
1178 @printindex vr
1179
1180 @node Concept Index
1181 @unnumbered Concept Index
1182 @printindex cp
1183
1184 @bye