@c -*-texinfo-*-
@c This is part of the GNU Emacs Lisp Reference Manual.
-@c Copyright (C) 1990-1995, 1998-2014 Free Software Foundation, Inc.
+@c Copyright (C) 1990-1995, 1998-2015 Free Software Foundation, Inc.
@c See the file elisp.texi for copying conditions.
@node Text
@chapter Text
* Registers:: How registers are implemented. Accessing the text or
position stored in a register.
* Transposition:: Swapping two portions of a buffer.
+* Decompression:: Dealing with compressed data.
* Base 64:: Conversion to or from base 64 encoding.
* Checksum/Hash:: Computing cryptographic hashes.
* Parsing HTML/XML:: Parsing HTML and XML.
@node Buffer Contents
@section Examining Buffer Contents
+@cindex buffer portion as string
This section describes functions that allow a Lisp program to
convert any portion of the text in the buffer into a string.
the current buffer, as a string.
@end defun
+ If you need to make sure the resulting string, when copied to a
+different location, will not change its visual appearance due to
+reordering of bidirectional text, use the
+@code{buffer-substring-with-bidi-context} function
+(@pxref{Bidirectional Display, buffer-substring-with-bidi-context}).
+
@defun filter-buffer-substring start end &optional delete
-This function passes the buffer text between @var{start} and @var{end}
-through the filter functions specified by the wrapper hook
-@code{filter-buffer-substring-functions}, and returns the result. The
-obsolete variable @code{buffer-substring-filters} is also consulted.
-If both of these variables are @code{nil}, the value is the unaltered
-text from the buffer, i.e., what @code{buffer-substring} would
-return.
-
-If @var{delete} is non-@code{nil}, this function deletes the text
+This function filters the buffer text between @var{start} and @var{end}
+using a function specified by the variable
+@code{filter-buffer-substring-function}, and returns the result.
+
+The default filter function consults the obsolete wrapper hook
+@code{filter-buffer-substring-functions}, and the obsolete variable
+@code{buffer-substring-filters}. If both of these are @code{nil}, it
+returns the unaltered text from the buffer, i.e., what
+@code{buffer-substring} would return.
+
+If @var{delete} is non-@code{nil}, the function deletes the text
between @var{start} and @var{end} after copying it, like
@code{delete-and-extract-region}.
@code{buffer-substring-no-properties},
or @code{delete-and-extract-region} when copying into user-accessible
data structures such as the kill-ring, X clipboard, and registers.
-Major and minor modes can add functions to
-@code{filter-buffer-substring-functions} to alter such text as it is
-copied out of the buffer.
+Major and minor modes can modify @code{filter-buffer-substring-function}
+to alter such text as it is copied out of the buffer.
@end defun
-@c FIXME: `filter-buffer-substring-function' should be documented.
+@defvar filter-buffer-substring-function
+The value of this variable is a function that @code{filter-buffer-substring}
+will call to do the actual work. The function receives three
+arguments, the same as those of @code{filter-buffer-substring},
+which it should treat as per the documentation of that function. It
+should return the filtered text (and optionally delete the source text).
+@end defvar
+
+@noindent The following two variables are obsoleted by
+@code{filter-buffer-substring-function}, but are still supported for
+backward compatibility.
+
@defvar filter-buffer-substring-functions
-This variable is a wrapper hook (@pxref{Running Hooks}), whose members
-should be functions that accept four arguments: @var{fun},
-@var{start}, @var{end}, and @var{delete}. @var{fun} is a function
-that takes three arguments (@var{start}, @var{end}, and @var{delete}),
-and returns a string. In both cases, the @var{start}, @var{end}, and
-@var{delete} arguments are the same as those of
-@code{filter-buffer-substring}.
+This obsolete variable is a wrapper hook, whose members should be functions
+that accept four arguments: @var{fun}, @var{start}, @var{end}, and
+@var{delete}. @var{fun} is a function that takes three arguments
+(@var{start}, @var{end}, and @var{delete}), and returns a string. In
+both cases, the @var{start}, @var{end}, and @var{delete} arguments are
+the same as those of @code{filter-buffer-substring}.
The first hook function is passed a @var{fun} that is equivalent to
the default operation of @code{filter-buffer-substring}, i.e., it
@end defvar
@defvar buffer-substring-filters
-This variable is obsoleted by
-@code{filter-buffer-substring-functions}, but is still supported for
-backward compatibility. Its value should should be a list of
-functions which accept a single string argument and return another
-string. @code{filter-buffer-substring} passes the buffer substring to
-the first function in this list, and the return value of each function
-is passed to the next function. The return value of the last function
-is passed to @code{filter-buffer-substring-functions}.
+The value of this obsolete variable should be a list of functions
+that accept a single string argument and return another string.
+The default @code{filter-buffer-substring} function passes the buffer
+substring to the first function in this list, and the return value of
+each function is passed to the next function. The return value of the
+last function is passed to @code{filter-buffer-substring-functions}.
@end defvar
@defun current-word &optional strict really-word
if @code{case-fold-search} is non-@code{nil}. It always ignores
text properties.
-Suppose the current buffer contains the text @samp{foobarbar
-haha!rara!}; then in this example the two substrings are @samp{rbar }
-and @samp{rara!}. The value is 2 because the first substring is greater
-at the second character.
+Suppose you have the text @w{@samp{foobarbar haha!rara!}} in the
+current buffer; then in this example the two substrings are @samp{rbar
+} and @samp{rara!}. The value is 2 because the first substring is
+greater at the second character.
@example
(compare-buffer-substrings nil 6 11 nil 16 21)
If this command acts on the entire buffer (i.e. if called
interactively with the mark inactive, or called from Lisp with
-@var{end} nil), it also deletes all trailing lines at the end of the
+@var{end} @code{nil}), it also deletes all trailing lines at the end of the
buffer if the variable @code{delete-trailing-lines} is non-@code{nil}.
@end deffn
The deleted text itself is the string @var{text}. The place to
reinsert it is @code{(abs @var{position})}. If @var{position} is
positive, point was at the beginning of the deleted text, otherwise it
-was at the end.
+was at the end. Zero or more (@var{marker} . @var{adjustment})
+elements follow immediately after this element.
@item (t . @var{time-flag})
This kind of element indicates that an unmodified buffer became
@item (@var{marker} . @var{adjustment})
This kind of element records the fact that the marker @var{marker} was
relocated due to deletion of surrounding text, and that it moved
-@var{adjustment} character positions. Undoing this element moves
-@var{marker} @minus{} @var{adjustment} characters.
+@var{adjustment} character positions. If the marker's location is
+consistent with the (@var{text} . @var{position}) element preceding it
+in the undo list, then undoing this element moves @var{marker}
+@minus{} @var{adjustment} characters.
@item (apply @var{funname} . @var{args})
This is an extensible undo item, which is undone by calling
This section describes the primitive functions used to count and
insert indentation. The functions in the following sections use these
-primitives. @xref{Width}, for related functions.
+primitives. @xref{Size of Displayed Text}, for related functions.
@defun current-indentation
@comment !!Type Primitive Function
@node Examining Properties
@subsection Examining Text Properties
+@cindex examining text properties
+@cindex text properties, examining
The simplest way to examine text properties is to ask for the value of
a particular property of a particular character. For that, use
@node Changing Properties
@subsection Changing Text Properties
+@cindex changing text properties
+@cindex text properties, changing
The primitives for changing properties apply to a specified range of
text in a buffer or string. The function @code{set-text-properties}
(@pxref{Special Properties}), such as a face name or an anonymous face
(@pxref{Faces}).
-If any text in the region already has a non-nil @code{face} property,
+If any text in the region already has a non-@code{nil} @code{face} property,
those face(s) are retained. This function sets the @code{face}
property to a list of faces, with @var{face} as the first element (by
default) and the pre-existing faces as the remaining elements. If the
@node Property Search
@subsection Text Property Search Functions
+@cindex searching text properties
+@cindex text properties, searching
In typical use of text properties, most of the time several or many
consecutive characters have the same value for a property. Rather than
special trick: bind @code{inhibit-read-only} to a non-@code{nil} value
and then remove the property. @xref{Read Only Buffers}.
+@item inhibit-read-only
+@kindex inhibit-read-only @r{(text property)}
+If a character has the property @code{inhibit-read-only}, and the
+buffer is read-only, editing the character in question is allowed.
+
@item invisible
@kindex invisible @r{(text property)}
A non-@code{nil} @code{invisible} property can make a character invisible
position. You can place the cursor on any desired character of these
strings by giving that character a non-@code{nil} @code{cursor} text
property. In addition, if the value of the @code{cursor} property is
-an integer number, it specifies the number of buffer's character
+an integer, it specifies the number of buffer's character
positions, starting with the position where the overlay or the
@code{display} property begins, for which the cursor should be
displayed on that character. Specifically, if the value of the
In other words, the string character with the @code{cursor} property
of any non-@code{nil} value is the character where to display the
cursor. The value of the property says for which buffer positions to
-display the cursor there. If the value is an integer number @var{n},
+display the cursor there. If the value is an integer @var{n},
the cursor is displayed there when point is anywhere between the
beginning of the overlay or @code{display} property and @var{n}
positions after that. If the value is anything else and
intervals with the same properties, and we kill the text of one interval
and yank it back. The same interval-coalescence feature that rescues
the other case causes trouble in this one: after yanking, we have just
-one interval. One again, editing does not preserve the distinction
+one interval. Once again, editing does not preserve the distinction
between one interval and two.
Insertion of text at the border between intervals also raises
@node Substitution
@section Substituting for a Character Code
+@cindex replace characters in region
+@cindex substitute characters
The following functions replace characters within a specified region
based on their character codes.
Normally, this command puts point before the inserted text, and the
mark after it. However, if the optional second argument @var{beforep}
is non-@code{nil}, it puts the mark before and point after.
-You can pass a non-@code{nil} second argument @var{beforep} to this
-function interactively by supplying any prefix argument.
+
+When called interactively, the command defaults to putting point after
+text, and a prefix argument inverts this behavior.
If the register contains a rectangle, then the rectangle is inserted
with its upper left corner at point. This means that text is inserted
changed in the future.
@end deffn
+@defun register-read-with-preview prompt
+@cindex register preview
+This function reads and returns a register name, prompting with
+@var{prompt} and possibly showing a preview of the existing registers
+and their contents. The preview is shown in a temporary window, after
+the delay specified by the user option @code{register-preview-delay},
+if its value and @code{register-alist} are both non-@code{nil}. The
+preview is also shown if the user requests help (e.g., by typing the
+help character). We recommend that all interactive commands which
+read register names use this function.
+@end defun
+
@node Transposition
@section Transposition of Text
all markers unrelocated.
@end defun
+@node Decompression
+@section Dealing With Compressed Data
+
+When @code{auto-compression-mode} is enabled, Emacs automatically
+uncompresses compressed files when you visit them, and automatically
+recompresses them if you alter and save them. @xref{Compressed
+Files,,, emacs, The GNU Emacs Manual}.
+
+The above feature works by calling an external executable (e.g.,
+@command{gzip}). Emacs can also be compiled with support for built-in
+decompression using the zlib library, which is faster than calling an
+external program.
+
+@defun zlib-available-p
+This function returns non-@code{nil} if built-in zlib decompression is
+available.
+@end defun
+
+@defun zlib-decompress-region start end
+This function decompresses the region between @var{start} and
+@var{end}, using built-in zlib decompression. The region should
+contain data that were compressed with gzip or zlib. On success, the
+function replaces the contents of the region with the decompressed
+data. On failure, the function leaves the region unchanged and
+returns @code{nil}. This function can be called only in unibyte
+buffers.
+@end defun
+
+
@node Base 64
@section Base 64 Encoding
@cindex base 64 encoding
When Emacs is compiled with libxml2 support, the following functions
are available to parse HTML or XML text into Lisp object trees.
-@defun libxml-parse-html-region start end &optional base-url
+@defun libxml-parse-html-region start end &optional base-url discard-comments
This function parses the text between @var{start} and @var{end} as
HTML, and returns a list representing the HTML @dfn{parse tree}. It
attempts to handle ``real world'' HTML by robustly coping with syntax
The optional argument @var{base-url}, if non-@code{nil}, should be a
string specifying the base URL for relative URLs occurring in links.
+If the optional argument @var{discard-comments} is non-@code{nil},
+then the parse tree is created without any comments.
+
In the parse tree, each HTML node is represented by a list in which
the first element is a symbol representing the node name, the second
element is an alist of node attributes, and the remaining elements are
@end example
@noindent
-A call to @code{libxml-parse-html-region} returns this:
+A call to @code{libxml-parse-html-region} returns this @acronym{DOM}
+(document object model):
@example
-(html ()
- (head ())
- (body ((width . "101"))
- (div ((class . "thing"))
- "Foo"
- (div ()
- "Yes"))))
+(html nil
+ (head nil)
+ (body ((width . "101"))
+ (div ((class . "thing"))
+ "Foo"
+ (div nil
+ "Yes"))))
@end example
@end defun
@end defun
@cindex parsing xml
-@defun libxml-parse-xml-region start end &optional base-url
+@defun libxml-parse-xml-region start end &optional base-url discard-comments
This function is the same as @code{libxml-parse-html-region}, except
that it parses the text as XML rather than HTML (so it is stricter
about syntax).
@end defun
+@menu
+* Document Object Model:: Access, manipulate and search the @acronym{DOM}.
+@end menu
+
+@node Document Object Model
+@subsection Document Object Model
+@cindex HTML DOM
+@cindex XML DOM
+@cindex DOM
+@cindex Document Object Model
+
+The @acronym{DOM} returned by @code{libxml-parse-html-region} (and the
+other @acronym{XML} parsing functions) is a tree structure where each
+node has a node name (called a @dfn{tag}), and optional key/value
+@dfn{attribute} list, and then a list of @dfn{child nodes}. The child
+nodes are either strings or @acronym{DOM} objects.
+
+@example
+(body ((width . "101"))
+ (div ((class . "thing"))
+ "Foo"
+ (div nil
+ "Yes")))
+@end example
+
+@defun dom-node tag &optional attributes &rest children
+This function creates a @acronym{DOM} node of type @var{tag}. If
+given, @var{attributes} should be a key/value pair list.
+If given, @var{children} should be @acronym{DOM} nodes.
+@end defun
+
+The following functions can be used to work with this structure. Each
+function takes a @acronym{DOM} node, or a list of nodes. In the
+latter case, only the first node in the list is used.
+
+Simple accessors:
+
+@table @code
+@item dom-tag @var{node}
+Return the @dfn{tag} (also called ``node name'') of the node.
+
+@item dom-attr @var{node} @var{attribute}
+Return the value of @var{attribute} in the node. A common usage
+would be:
+
+@lisp
+(dom-attr img 'href)
+=> "http://fsf.org/logo.png"
+@end lisp
+
+@item dom-children @var{node}
+Return all the children of the node.
+
+@item dom-non-text-children @var{node}
+Return all the non-string children of the node.
+
+@item dom-attributes @var{node}
+Return the key/value pair list of attributes of the node.
+
+@item dom-text @var{node}
+Return all the textual elements of the node as a concatenated string.
+
+@item dom-texts @var{node}
+Return all the textual elements of the node, as well as the textual
+elements of all the children of the node, recursively, as a
+concatenated string. This function also takes an optional separator
+to be inserted between the textual elements.
+
+@item dom-parent @var{dom} @var{node}
+Return the parent of @var{node} in @var{dom}.
+@end table
+
+The following are functions for altering the @acronym{DOM}.
+
+@table @code
+@item dom-set-attribute @var{node} @var{attribute} @var{value}
+Set the @var{attribute} of the node to @var{value}.
+
+@item dom-append-child @var{node} @var{child}
+Append @var{child} as the last child of @var{node}.
+
+@item dom-add-child-before @var{node} @var{child} @var{before}
+Add @var{child} to @var{node}'s child list before the @var{before}
+node. If @var{before} is @code{nil}, make @var{child} the first child.
+
+@item dom-set-attributes @var{node} @var{attributes}
+Replace all the attributes of the node with a new key/value list.
+@end table
+
+The following are functions for searching for elements in the
+@acronym{DOM}. They all return lists of matching nodes.
+
+@table @code
+@item dom-by-tag @var{dom} @var{tag}
+Return all nodes in @var{dom} that are of type @var{tag}. A typical
+use would be:
+
+@lisp
+(dom-by-tag dom 'td)
+=> '((td ...) (td ...) (td ...))
+@end lisp
+
+@item dom-by-class @var{dom} @var{match}
+Return all nodes in @var{dom} that have class names that match
+@var{match}, which is a regular expression.
+
+@item dom-by-style @var{dom} @var{style}
+Return all nodes in @var{dom} that have styles that match @var{match},
+which is a regular expression.
+
+@item dom-by-id @var{dom} @var{style}
+Return all nodes in @var{dom} that have IDs that match @var{match},
+which is a regular expression.
+
+@end table
+
+Utility functions:
+
+@table @code
+@item dom-pp @var{dom} &optional @var{remove-empty}
+Pretty-print @var{dom} at point. If @var{remove-empty}, don't print
+textual nodes that just contain white-space.
+@end table
+
+
@node Atomic Changes
@section Atomic Change Groups
@cindex atomic changes