@c -*-texinfo-*-
@c This is part of the GNU Emacs Lisp Reference Manual.
-@c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998, 1999, 2001,
-@c 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+@c Copyright (C) 1990-1995, 1998-1999, 2001-2011
@c Free Software Foundation, Inc.
@c See the file elisp.texi for copying conditions.
@setfilename ../../info/searching
To include @samp{^} in a character alternative, put it anywhere but at
the beginning.
-The beginning and end of a range of multibyte characters must be in
-the same character set (@pxref{Character Sets}). Thus,
-@code{"[\x8e0-\x97c]"} is invalid because character 0x8e0 (@samp{a}
-with grave accent) is in the Emacs character set for Latin-1 but the
-character 0x97c (@samp{u} with diaeresis) is in the Emacs character
-set for Latin-2. (We use Lisp string syntax to write that example,
-and a few others in the next few paragraphs, in order to include hex
-escape sequences in them.)
-
If a range starts with a unibyte character @var{c} and ends with a
multibyte character @var{c2}, the range is divided into two parts: one
is @samp{@var{c}..?\377}, the other is @samp{@var{c1}..@var{c2}}, where
@cindex @samp{\S} in regexp
matches any character whose syntax is not @var{code}.
+@cindex category, regexp search for
@item \c@var{c}
matches any character whose category is @var{c}. Here @var{c} is a
character that represents a category: thus, @samp{c} for Chinese
characters or @samp{g} for Greek characters in the standard category
-table.
+table. You can see the list of all the currently defined categories
+with @kbd{M-x describe-categories @key{RET}}. You can also define
+your own categories in addition to the standard ones using the
+@code{define-category} function (@pxref{Categories}).
@item \C@var{c}
matches any character whose category is not @var{c}.
If the optional argument @var{paren} is non-@code{nil}, then the
returned regular expression is always enclosed by at least one
parentheses-grouping construct. If @var{paren} is @code{words}, then
-that construct is additionally surrounded by @samp{\<} and @samp{\>}.
+that construct is additionally surrounded by @samp{\<} and @samp{\>};
+alternatively, if @var{paren} is @code{symbols}, then that construct
+is additionally surrounded by @samp{\_<} and @samp{\_>}
+(@code{symbols} is often appropriate when matching
+programming-language keywords and the like).
This simplified definition of @code{regexp-opt} produces a
regular expression which is equivalent to the actual value
can't avoid another intervening search, you must save and restore the
match data around it, to prevent it from being overwritten.
+ Notice that all functions are allowed to overwrite the match data
+unless they're explicitly documented not to do so. A consequence is
+that functions that are run implictly in the background
+(@pxref{Timers}, and @ref{Idle Timers}) should likely save and restore
+the match data explicitly.
+
@menu
* Replacing Match:: Replacing a substring that was matched.
* Simple Match Data:: Accessing single items of match data,
@code{sentence-end-without-period} and
@code{sentence-end-without-space}.
@end defun
-
-@ignore
- arch-tag: c2573ca2-18aa-4839-93b8-924043ef831f
-@end ignore