* String Search:: Search for an exact match.
* Regular Expressions:: Describing classes of strings.
* Regexp Search:: Searching for a match for a regexp.
+* POSIX Regexps:: Searching POSIX-style for the longest match.
* Search and Replace:: Internals of @code{query-replace}.
* Match Data:: Finding out which part of the text matched
various parts of a regexp, after regexp search.
to return the new position of point in that case, but some programs
may depend on a value of @code{nil}.)
- If @var{repeat} is non-@code{nil}, then the search is repeated that
-many times. Point is positioned at the end of the last match.
+If @var{repeat} is supplied (it must be a positive number), then the
+search is repeated that many times (each time starting at the end of the
+previous match). If these successive searches all succeed, the
+function succeeds, moving point and returning its new value. Otherwise
+the search fails.
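+
+As a minimal sketch of the @var{repeat} argument (the buffer contents
+are made up for illustration, and @code{with-temp-buffer} is assumed to
+be available in your Emacs version):
+
+@example
+@group
+(with-temp-buffer
+  (insert "foo foo foo")
+  (goto-char (point-min))
+  ;; @r{Find the second occurrence; point ends up just after it.}
+  (search-forward "foo" nil t 2))
+     @result{} 8
+@end group
+@end example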
@end deffn
@deffn Command search-backward string &optional limit noerror repeat
@node Syntax of Regexps
@subsection Syntax of Regular Expressions
- Regular expressions have a syntax in which a few characters are special
-constructs and the rest are @dfn{ordinary}. An ordinary character is a
-simple regular expression which matches that character and nothing else.
-The special characters are @samp{$}, @samp{^}, @samp{.}, @samp{*},
-@samp{+}, @samp{?}, @samp{[}, @samp{]} and @samp{\}; no new special
-characters will be defined in the future. Any other character appearing
-in a regular expression is ordinary, unless a @samp{\} precedes it.
+ Regular expressions have a syntax in which a few characters are
+special constructs and the rest are @dfn{ordinary}. An ordinary
+character is a simple regular expression that matches that character and
+nothing else. The special characters are @samp{.}, @samp{*}, @samp{+},
+@samp{?}, @samp{[}, @samp{]}, @samp{^}, @samp{$}, and @samp{\}; no new
+special characters will be defined in the future. Any other character
+appearing in a regular expression is ordinary, unless a @samp{\}
+precedes it.
For example, @samp{f} is not a special character, so it is ordinary, and
therefore @samp{f} is a regular expression that matches the string
only @samp{o}.@refill
Any two regular expressions @var{a} and @var{b} can be concatenated. The
-result is a regular expression which matches a string if @var{a} matches
+result is a regular expression that matches a string if @var{a} matches
some amount of the beginning of that string and @var{b} matches the rest of
the string.@refill
@item *
@cindex @samp{*} in regexp
-is not a construct by itself; it is a suffix operator that means to
-repeat the preceding regular expression as many times as possible. In
-@samp{fo*}, the @samp{*} applies to the @samp{o}, so @samp{fo*} matches
-one @samp{f} followed by any number of @samp{o}s. The case of zero
-@samp{o}s is allowed: @samp{fo*} does match @samp{f}.@refill
+is not a construct by itself; it is a postfix operator that means to
+match the preceding regular expression repetitively as many times as
+possible. Thus, @samp{o*} matches any number of @samp{o}s (including no
+@samp{o}s).
@samp{*} always applies to the @emph{smallest} possible preceding
-expression. Thus, @samp{fo*} has a repeating @samp{o}, not a
-repeating @samp{fo}.@refill
+expression. Thus, @samp{fo*} has a repeating @samp{o}, not a repeating
+@samp{fo}. It matches @samp{f}, @samp{fo}, @samp{foo}, and so on.
The matcher processes a @samp{*} construct by matching, immediately,
as many repetitions as can be found. Then it continues with the rest
The next alternative is for @samp{a*} to match only two @samp{a}s.
With this choice, the rest of the regexp matches successfully.@refill
+Nested repetition operators can be extremely slow if they specify
+backtracking loops. For example, it could take hours for the regular
+expression @samp{\(x+y*\)*a} to try to match the sequence
+@samp{xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxz}, only to fail in the end.
+The slowness is because Emacs must try every possible way of grouping
+the 35 @samp{x}s before concluding that none of them can work. To make
+sure your regular expressions run fast, check nested repetitions
+carefully.
+
@item +
@cindex @samp{+} in regexp
-is a suffix operator similar to @samp{*} except that the preceding
-expression must match at least once. So, for example, @samp{ca+r}
+is a postfix operator, similar to @samp{*} except that it must match
+the preceding expression at least once. So, for example, @samp{ca+r}
matches the strings @samp{car} and @samp{caaaar} but not the string
@samp{cr}, whereas @samp{ca*r} matches all three strings.
@item ?
@cindex @samp{?} in regexp
-is a suffix operator similar to @samp{*} except that the preceding
-expression can match either once or not at all. For example,
-@samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anyhing
-else.
+is a postfix operator, similar to @samp{*} except that it can match the
+preceding expression either once or not at all. For example,
+@samp{ca?r} matches @samp{car} or @samp{cr}, and nothing else.
@item [ @dots{} ]
@cindex character set (in regexp)
@cindex @samp{[} in regexp
@cindex @samp{]} in regexp
-@samp{[} begins a @dfn{character set}, which is terminated by a
-@samp{]}. In the simplest case, the characters between the two brackets
-form the set. Thus, @samp{[ad]} matches either one @samp{a} or one
-@samp{d}, and @samp{[ad]*} matches any string composed of just @samp{a}s
-and @samp{d}s (including the empty string), from which it follows that
-@samp{c[ad]*r} matches @samp{cr}, @samp{car}, @samp{cdr},
-@samp{caddaar}, etc.@refill
-
-The usual regular expression special characters are not special inside a
+is a @dfn{character set}, which begins with @samp{[} and is terminated
+by @samp{]}. In the simplest case, the characters between the two
+brackets are what this set can match.
+
+Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
+@samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
+(including the empty string), from which it follows that @samp{c[ad]*r}
+matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.
+
+You can also include character ranges in a character set, by writing the
+starting and ending characters with a @samp{-} between them. Thus,
+@samp{[a-z]} matches any lower case ASCII letter. Ranges may be
+intermixed freely with individual characters, as in @samp{[a-z$%.]},
+which matches any lower case ASCII letter or @samp{$}, @samp{%} or
+period.
+
+Note that the usual regexp special characters are not special inside a
character set. A completely different set of special characters exists
-inside character sets: @samp{]}, @samp{-} and @samp{^}.@refill
-
-@samp{-} is used for ranges of characters. To write a range, write two
-characters with a @samp{-} between them. Thus, @samp{[a-z]} matches any
-lower case letter. Ranges may be intermixed freely with individual
-characters, as in @samp{[a-z$%.]}, which matches any lower case letter
-or @samp{$}, @samp{%} or a period.@refill
-
-To include a @samp{]} in a character set, make it the first character.
-For example, @samp{[]a]} matches @samp{]} or @samp{a}. To include a
-@samp{-}, write @samp{-} as the first character in the set, or put
-immediately after a range. (You can replace one individual character
-@var{c} with the range @samp{@var{c}-@var{c}} to make a place to put the
-@samp{-}). There is no way to write a set containing just @samp{-} and
-@samp{]}.
+inside character sets: @samp{]}, @samp{-} and @samp{^}.
+
+To include a @samp{]} in a character set, you must make it the first
+character. For example, @samp{[]a]} matches @samp{]} or @samp{a}. To
+include a @samp{-}, write @samp{-} as the first or last character of the
+set, or put it after a range. Thus, @samp{[]-]} matches both @samp{]}
+and @samp{-}.
To include @samp{^} in a set, put it anywhere but at the beginning of
the set.
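+
+As a small sketch of these rules (the sample strings are made up for
+illustration), each call below returns the index of the first character
+that the set matches:
+
+@example
+@group
+(string-match "[]-]" "a-b")    ; @r{@samp{-} is in the set.}
+     @result{} 1
+(string-match "[]-]" "ab]")    ; @r{So is @samp{]}.}
+     @result{} 2
+(string-match "[a^]" "x^")     ; @r{@samp{^} is ordinary when not first.}
+     @result{} 1
+@end group
+@end example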
@item [^ @dots{} ]
@cindex @samp{^} in regexp
-@samp{[^} begins a @dfn{complement character set}, which matches any
-character except the ones specified. Thus, @samp{[^a-z0-9A-Z]}
-matches all characters @emph{except} letters and digits.@refill
+@samp{[^} begins a @dfn{complemented character set}, which matches any
+character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} matches
+all characters @emph{except} letters and digits.
@samp{^} is not special in a character set unless it is the first
character. The character following the @samp{^} is treated as if it
-were first (thus, @samp{-} and @samp{]} are not special there).
+were first (in other words, @samp{-} and @samp{]} are not special there).
-Note that a complement character set can match a newline, unless
-newline is mentioned as one of the characters not to match.
+A complemented character set can match a newline, unless newline is
+mentioned as one of the characters not to match. This is in contrast to
+the handling of regexps in programs such as @code{grep}.
@item ^
@cindex @samp{^} in regexp
@cindex beginning of line in regexp
-is a special character that matches the empty string, but only at
-the beginning of a line in the text being matched. Otherwise it fails
-to match anything. Thus, @samp{^foo} matches a @samp{foo} which occurs
-at the beginning of a line.
+is a special character that matches the empty string, but only at the
+beginning of a line in the text being matched. Otherwise it fails to
+match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at
+the beginning of a line.
-When matching a string, @samp{^} matches at the beginning of the string
-or after a newline character @samp{\n}.
+When matching a string instead of a buffer, @samp{^} matches at the
+beginning of the string or after a newline character @samp{\n}.
@item $
@cindex @samp{$} in regexp
is similar to @samp{^} but matches only at the end of a line. Thus,
@samp{x+$} matches a string of one @samp{x} or more at the end of a line.
-When matching a string, @samp{$} matches at the end of the string
-or before a newline character @samp{\n}.
+When matching a string instead of a buffer, @samp{$} matches at the end
+of the string or before a newline character @samp{\n}.
@item \
@cindex @samp{\} in regexp
@samp{\}), and it introduces additional special constructs.
Because @samp{\} quotes special characters, @samp{\$} is a regular
-expression which matches only @samp{$}, and @samp{\[} is a regular
-expression which matches only @samp{[}, and so on.
+expression that matches only @samp{$}, and @samp{\[} is a regular
+expression that matches only @samp{[}, and so on.
Note that @samp{\} also has special meaning in the read syntax of Lisp
strings (@pxref{String Type}), and must be quoted with @samp{\}. For
@samp{\} is @code{"\\\\"}.@refill
@end table
-@strong{Please note:} for historical compatibility, special characters
+@strong{Please note:} For historical compatibility, special characters
are treated as ordinary ones if they are in contexts where their special
meanings make no sense. For example, @samp{*foo} treats @samp{*} as
ordinary since there is no preceding expression on which the @samp{*}
-can act. It is poor practice to depend on this behavior; better to
-quote the special character anyway, regardless of where it
-appears.@refill
+can act. It is poor practice to depend on this behavior; quote the
+special character anyway, regardless of where it appears.@refill
For the most part, @samp{\} followed by any character matches only
-that character. However, there are several exceptions: characters
-which, when preceded by @samp{\}, are special constructs. Such
-characters are always ordinary when encountered on their own. Here
-is a table of @samp{\} constructs:
+that character. However, there are several exceptions: two-character
+sequences starting with @samp{\} that have special meanings. The
+second character in such a sequence is always an ordinary character
+when used on its own. Here is a table of @samp{\} constructs:
@table @kbd
@item \|
@enumerate
@item
-To enclose a set of @samp{\|} alternatives for other operations.
-Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}.
+To enclose a set of @samp{\|} alternatives for other operations. Thus,
+the regular expression @samp{\(foo\|bar\)x} matches either @samp{foox}
+or @samp{barx}.
@item
-To enclose an expression for a suffix operator such as @samp{*} to act
-on. Thus, @samp{ba\(na\)*} matches @samp{bananana}, etc., with any
-(zero or more) number of @samp{na} strings.@refill
+To enclose a complicated expression for the postfix operators @samp{*},
+@samp{+} and @samp{?} to operate on. Thus, @samp{ba\(na\)*} matches
+@samp{bananana}, etc., with any (zero or more) number of @samp{na}
+strings.@refill
@item
To record a matched substring for future reference.
@end enumerate
This last application is not a consequence of the idea of a
-parenthetical grouping; it is a separate feature which happens to be
+parenthetical grouping; it is a separate feature that happens to be
assigned as a second meaning to the same @samp{\( @dots{} \)} construct
because there is no conflict in practice between the two meanings.
Here is an explanation of this feature:
@item \@var{digit}
-matches the same text which matched the @var{digit}th occurrence of a
+matches the same text that matched the @var{digit}th occurrence of a
@samp{\( @dots{} \)} construct.
-In other words, after the end of a @samp{\( @dots{} \)} construct. the
+In other words, after the end of a @samp{\( @dots{} \)} construct, the
matcher remembers the beginning and end of the text matched by that
construct. Then, later on in the regular expression, you can use
@samp{\} followed by @var{digit} to match that same text, whatever it
@item \W
@cindex @samp{\W} in regexp
-matches any character that is not a word-constituent.
+matches any character that is not a word constituent.
@item \s@var{code}
@cindex @samp{\s} in regexp
matches any character whose syntax is @var{code}. Here @var{code} is a
-character which represents a syntax code: thus, @samp{w} for word
+character that represents a syntax code: thus, @samp{w} for word
constituent, @samp{-} for whitespace, @samp{(} for open parenthesis,
-etc. @xref{Syntax Tables}, for a list of syntax codes and the
-characters that stand for them.
+etc. To represent whitespace syntax (which can include newline), use
+either @samp{-} or a space character. @xref{Syntax Tables}, for a list
+of syntax codes and the characters that stand for them.
@item \S@var{code}
@cindex @samp{\S} in regexp
matches any character whose syntax is not @var{code}.
@end table
- These regular expression constructs match the empty string---that is,
+ The following regular expression constructs match the empty string---that is,
they don't use up any characters---but whether they match depends on the
context.
@samp{foo} as a separate word. @samp{\bballs?\b} matches
@samp{ball} or @samp{balls} as a separate word.@refill
+@samp{\b} matches at the beginning or end of the buffer
+regardless of what text appears next to it.
+
@item \B
@cindex @samp{\B} in regexp
matches the empty string, but @emph{not} at the beginning or
@item \<
@cindex @samp{\<} in regexp
matches the empty string, but only at the beginning of a word.
+@samp{\<} matches at the beginning of the buffer only if a
+word-constituent character follows.
@item \>
@cindex @samp{\>} in regexp
-matches the empty string, but only at the end of a word.
+matches the empty string, but only at the end of a word. @samp{\>}
+matches at the end of the buffer only if the contents end with a
+word-constituent character.
@end table
@kindex invalid-regexp
Not every string is a valid regular expression. For example, a string
with unbalanced square brackets is invalid (with a few exceptions, such
-as @samp{[]]}, and so is a string that ends with a single @samp{\}. If
+as @samp{[]]}), and so is a string that ends with a single @samp{\}. If
an invalid regular expression is passed to any of the search functions,
an @code{invalid-regexp} error is signaled.
One use of @code{regexp-quote} is to combine an exact string match with
context described as a regular expression. For example, this searches
-for the string which is the value of @code{string}, surrounded by
+for the string that is the value of @code{string}, surrounded by
whitespace:
@example
@group
(re-search-forward
- (concat "\\s " (regexp-quote string) "\\s "))
+ (concat "\\s-" (regexp-quote string) "\\s-"))
@end group
@end example
@end defun
@table @code
@item [.?!]
-The first part of the pattern consists of three characters, a period, a
-question mark and an exclamation mark, within square brackets. The
+The first part of the pattern is a character set that matches any one of
+three characters: period, question mark, and exclamation mark. The
match must begin with one of these three characters.
@item []\"')@}]*
preceding regular expression (a character set, in this case) may be
repeated zero or more times.
-@item \\($\\|@ \\|\t\\|@ @ \\)
+@item \\($\\|@ $\\|\t\\|@ @ \\)
The third part of the pattern matches the whitespace that follows the
end of a sentence: the end of a line, or a tab, or two spaces. The
double backslashes mark the parentheses and vertical bars as regular
-expression syntax; the parentheses mark the group and the vertical bars
+expression syntax; the parentheses delimit a group and the vertical bars
separate alternatives. The dollar sign is used to match the end of a
line.
text that is matched by the regular expression @var{regexp}, leaving
point at the beginning of the first text found.
-This function is analogous to @code{re-search-forward}, but they are
-not simple mirror images. @code{re-search-forward} finds the match
-whose beginning is as close as possible. If @code{re-search-backward}
-were a perfect mirror image, it would find the match whose end is as
-close as possible. However, in fact it finds the match whose beginning
-is as close as possible. The reason is that matching a regular
-expression at a given spot always works from beginning to end, and is
-done at a specified beginning position.
+This function is analogous to @code{re-search-forward}, but they are not
+simple mirror images. @code{re-search-forward} finds the match whose
+beginning is as close as possible to the starting point. If
+@code{re-search-backward} were a perfect mirror image, it would find the
+match whose end is as close as possible. However, in fact it finds the
+match whose beginning is as close as possible. The reason is that
+matching a regular expression at a given spot always works from
+beginning to end, and starts at a specified beginning position.
A true mirror-image of @code{re-search-forward} would require a special
feature for matching regexps from end to beginning. It's not worth the
@end example
@end defun
+@node POSIX Regexps
+@section POSIX Regular Expression Searching
+
+ The usual regular expression functions do backtracking when necessary
+to handle the @samp{\|} and repetition constructs, but they continue
+this only until they find @emph{some} match. Then they succeed and
+report the first match found.
+
+ This section describes alternative search functions that perform the
+full backtracking specified by the POSIX standard for regular expression
+matching. They continue backtracking until they have tried all
+possibilities and found all matches, so they can report the longest
+match, as required by POSIX. This is much slower, so use these
+functions only when you really need the longest match.
+
+ In Emacs versions prior to 19.29, these functions did not exist, and
+the functions described above implemented full POSIX backtracking.
+
+@defun posix-search-forward regexp &optional limit noerror repeat
+This is like @code{re-search-forward} except that it performs the full
+backtracking specified by the POSIX standard for regular expression
+matching.
+@end defun
+
+@defun posix-search-backward regexp &optional limit noerror repeat
+This is like @code{re-search-backward} except that it performs the full
+backtracking specified by the POSIX standard for regular expression
+matching.
+@end defun
+
+@defun posix-looking-at regexp
+This is like @code{looking-at} except that it performs the full
+backtracking specified by the POSIX standard for regular expression
+matching.
+@end defun
+
+@defun posix-string-match regexp string &optional start
+This is like @code{string-match} except that it performs the full
+backtracking specified by the POSIX standard for regular expression
+matching.
+@end defun
+
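+ As a rough illustration of the difference (the pattern and string are
+made up for this sketch), compare how much text each function matches
+when an earlier alternative is a prefix of a later one:
+
+@example
+@group
+(string-match "a\\|ab" "ab")
+     @result{} 0
+(match-end 0)                  ; @r{The first alternative that works wins.}
+     @result{} 1
+@end group
+
+@group
+(posix-string-match "a\\|ab" "ab")
+     @result{} 0
+(match-end 0)                  ; @r{The longest match wins.}
+     @result{} 2
+@end group
+@end example
+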
@ignore
@deffn Command delete-matching-lines regexp
This function is identical to @code{delete-non-matching-lines}, save
with. If it is a string, that string is used. It can also be a list of
strings, to be used in cyclic order.
-If @var{repeat-count} is non-@code{nil}, it should be an integer, the
-number of occurrences to consider. In this case, @code{perform-replace}
-returns after considering that many occurrences.
+If @var{repeat-count} is non-@code{nil}, it should be an integer. Then
+it specifies how many times to use each of the strings in the
+@var{replacements} list before advancing cyclically to the next one.
Normally, the keymap @code{query-replace-map} defines the possible user
-responses. The argument @var{map}, if non-@code{nil}, is a keymap to
-use instead of @code{query-replace-map}.
+responses for queries. The argument @var{map}, if non-@code{nil}, is a
+keymap to use instead of @code{query-replace-map}.
@end defun
@defvar query-replace-map
Do not take action for this question---in other words, ``no.''
@item exit
-Answer this question ``no,'' and don't ask any more.
+Answer this question ``no,'' and give up on the entire series of
+questions, assuming that the answers will be ``no.''
@item act-and-exit
-Answer this question ``yes,'' and don't ask any more.
+Answer this question ``yes,'' and give up on the entire series of
+questions, assuming that subsequent answers will be ``no.''
@item act-and-show
Answer this question ``yes,'' but show the results---don't advance yet
@node Simple Match Data
@subsection Simple Match Data Access
- This section explains how to use the match data to find the starting
-point or ending point of the text that was matched by a particular
-search, or by a particular parenthetical subexpression of a regular
-expression.
+ This section explains how to use the match data to find out what was
+matched by the last search or match operation.
+
+ You can ask about the entire matching text, or about a particular
+parenthetical subexpression of a regular expression. The @var{count}
+argument in the functions below specifies which. If @var{count} is
+zero, you are asking about the entire match. If @var{count} is
+positive, it specifies which subexpression you want.
+
+ Recall that the subexpressions of a regular expression are those
+expressions grouped with escaped parentheses, @samp{\(@dots{}\)}. The
+@var{count}th subexpression is found by counting occurrences of
+@samp{\(} from the beginning of the whole regular expression. The first
+subexpression is numbered 1, the second 2, and so on. Only regular
+expressions can have subexpressions---after a simple string search, the
+only information available is about the entire match.
+
+@defun match-string count &optional in-string
+This function returns, as a string, the text matched in the last search
+or match operation. It returns the entire text if @var{count} is zero,
+or just the portion corresponding to the @var{count}th parenthetical
+subexpression, if @var{count} is positive. If @var{count} is out of
+range, or if that subexpression didn't match anything, the value is
+@code{nil}.
+
+If the last such operation was done against a string with
+@code{string-match}, then you should pass the same string as the
+argument @var{in-string}. Otherwise, after a buffer search or match,
+you should omit @var{in-string} or pass @code{nil} for it; but you
+should make sure that the current buffer when you call
+@code{match-string} is the one in which you did the searching or
+matching.
+@end defun
@defun match-beginning count
This function returns the position of the start of text matched by the
last regular expression searched for, or a subexpression of it.
-The argument @var{count}, a number, specifies a subexpression whose
-start position is the value. If @var{count} is zero, then the value is
-the position of the text matched by the whole regexp. If @var{count} is
-greater than zero, then the value is the position of the beginning of
-the text matched by the @var{count}th subexpression.
+If @var{count} is zero, then the value is the position of the start of
+the entire match. Otherwise, @var{count} specifies a subexpression in
+the regular expression, and the value of the function is the starting
+position of the match for that subexpression.
-Subexpressions of a regular expression are those expressions grouped
-inside of parentheses, @samp{\(@dots{}\)}. The @var{count}th
-subexpression is found by counting occurrences of @samp{\(} from the
-beginning of the whole regular expression. The first subexpression is
-numbered 1, the second 2, and so on.
-
-The value is @code{nil} for a parenthetical grouping inside of a
-@samp{\|} alternative that wasn't used in the match.
+The value is @code{nil} for a subexpression inside a @samp{\|}
+alternative that wasn't used in the match.
@end defun
@defun match-end count
-This function returns the position of the end of the text that matched
-the last regular expression searched for, or a subexpression of it.
-This function is otherwise similar to @code{match-beginning}.
+This function is like @code{match-beginning} except that it returns the
+position of the end of the match, rather than the position of the
+beginning.
@end defun
Here is an example of using the match data, with a comment showing the
@result{} 4
@end group
+@group
+(match-string 0 "The quick fox jumped quickly.")
+ @result{} "quick"
+(match-string 1 "The quick fox jumped quickly.")
+ @result{} "qu"
+(match-string 2 "The quick fox jumped quickly.")
+ @result{} "ick"
+@end group
+
@group
(match-beginning 1) ; @r{The beginning of the match}
@result{} 4 ; @r{with @samp{qu} is at index 4.}
(re-search-forward "The \\(cat \\)")
(match-beginning 0)
(match-beginning 1))
- @result{} (t 9 13)
+ @result{} (9 9 13)
@end group
@group
@var{replacement}.
@cindex case in replacements
-@defun replace-match replacement &optional fixedcase literal
-This function replaces the buffer text matched by the last search, with
-@var{replacement}. It applies only to buffers; you can't use
-@code{replace-match} to replace a substring found with
-@code{string-match}.
+@defun replace-match replacement &optional fixedcase literal string subexp
+This function replaces the text in the buffer (or in @var{string}) that
+was matched by the last search. It replaces that text with
+@var{replacement}.
+
+If you did the last search in a buffer, you should specify @code{nil}
+for @var{string}. Then @code{replace-match} does the replacement by
+editing the buffer; it leaves point at the end of the replacement text,
+and returns @code{t}.
+
+If you did the search in a string, pass the same string as @var{string}.
+Then @code{replace-match} does the replacement by constructing and
+returning a new string.
If @var{fixedcase} is non-@code{nil}, then the case of the replacement
text is not changed; otherwise, the replacement text is converted to a
letter, @code{replace-match} considers this a capitalized first word
rather than all upper case.
+If @code{case-replace} is @code{nil}, then case conversion is not done,
+regardless of the value of @var{fixedcase}. @xref{Searching and Case}.
+
If @var{literal} is non-@code{nil}, then @var{replacement} is inserted
exactly as it is, the only alterations being case changes as needed.
If it is @code{nil} (the default), then the character @samp{\} is treated
@item @samp{\@var{n}}
@cindex @samp{\@var{n}} in replacement
-@samp{\@var{n}} stands for the text that matched the @var{n}th
-subexpression in the original regexp. Subexpressions are those
-expressions grouped inside of @samp{\(@dots{}\)}. @var{n} is a digit.
+@samp{\@var{n}}, where @var{n} is a digit, stands for the text that
+matched the @var{n}th subexpression in the original regexp.
+Subexpressions are those expressions grouped inside @samp{\(@dots{}\)}.
@item @samp{\\}
@cindex @samp{\} in replacement
@samp{\\} stands for a single @samp{\} in the replacement text.
@end table
-@code{replace-match} leaves point at the end of the replacement text,
-and returns @code{t}.
+If @var{subexp} is non-@code{nil}, that says to replace just
+subexpression number @var{subexp} of the regexp that was matched, not
+the entire match. For example, after matching @samp{foo \(ba*r\)},
+calling @code{replace-match} with 1 as @var{subexp} means to replace
+just the text that matched @samp{\(ba*r\)}.
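+
+Here is a brief sketch of replacing within a string, first the whole
+match and then just one subexpression (the sample text is made up for
+illustration; @var{fixedcase} and @var{literal} are both @code{t} so
+that the replacement is used verbatim):
+
+@example
+@group
+(let ((str "say foo baaar now"))
+  (string-match "foo \\(ba*r\\)" str)
+  (replace-match "X" t t str))     ; @r{Replace the entire match.}
+     @result{} "say X now"
+@end group
+
+@group
+(let ((str "say foo baaar now"))
+  (string-match "foo \\(ba*r\\)" str)
+  (replace-match "X" t t str 1))   ; @r{Replace subexpression 1 only.}
+     @result{} "say foo X now"
+@end group
+@end example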
@end defun
@node Entire Match Data
@node Saving Match Data
@subsection Saving and Restoring the Match Data
- All asynchronous process functions (filters and sentinels) and
-functions that use @code{recursive-edit} should save and restore the
-match data if they do a search or if they let the user type arbitrary
-commands. Saving the match data is useful in other cases as
-well---whenever you want to access the match data resulting from an
-earlier search, notwithstanding another intervening search.
-
- This example shows the problem that can arise if you fail to
-attend to this requirement:
+ When you call a function that may do a search, you may need to save
+and restore the match data around that call, if you want to preserve the
+match data from an earlier search for later use. Here is an example
+that shows the problem that arises if you fail to save the match data:
@example
@group
@end group
@end example
- In Emacs versions 19 and later, you can save and restore the match
-data with @code{save-match-data}:
+ You can save and restore the match data with @code{save-match-data}:
-@defspec save-match-data body@dots{}
+@defmac save-match-data body@dots{}
This special form executes @var{body}, saving and restoring the match
-data around it. This is useful if you wish to do a search without
-altering the match data that resulted from an earlier search.
-@end defspec
+data around it.
+@end defmac
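+
+ As a minimal sketch (the strings are made up for illustration), the
+inner search below would otherwise overwrite the match data left by the
+outer one:
+
+@example
+@group
+(let ((str "hello world"))
+  (string-match "wor" str)
+  (save-match-data
+    (string-match "h" "other text"))   ; @r{Outer match data survives.}
+  (match-beginning 0))
+     @result{} 6
+@end group
+@end example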
You can use @code{set-match-data} together with @code{match-data} to
imitate the effect of the special form @code{save-match-data}. This is
@end group
@end example
+ Emacs automatically saves and restores the match data when it runs
+process filter functions (@pxref{Filter Functions}) and process
+sentinels (@pxref{Sentinels}).
+
@ignore
Here is a function which restores the match data provided the buffer
associated with it still exists.
If you do not want this feature, set the variable
@code{case-fold-search} to @code{nil}. Then all letters must match
-exactly, including case. This is a per-buffer-local variable; altering
-the variable affects only the current buffer. (@xref{Intro to
+exactly, including case. This is a buffer-local variable; altering the
+variable affects only the current buffer. (@xref{Intro to
Buffer-Local}.) Alternatively, you may change the value of
@code{default-case-fold-search}, which is the default value of
@code{case-fold-search} for buffers that do not override it.
searching functions Lisp functions use.
@defopt case-replace
-This variable determines whether @code{query-replace} should preserve
-case in replacements. If the variable is @code{nil}, then
-@code{replace-match} should not try to convert case.
+This variable determines whether the replacement functions should
+preserve case. If the variable is @code{nil}, that means to use the
+replacement text verbatim. A non-@code{nil} value means to convert the
+case of the replacement text according to the text being replaced.
+
+The function @code{replace-match} is where this variable actually has
+its effect. @xref{Replacing Match}.
@end defopt
@defopt case-fold-search
@defvar page-delimiter
This is the regexp describing line-beginnings that separate pages. The
-default value is @code{"^\014"} (i.e., @code{"^^L"} or @code{"^\C-l"}).
+default value is @code{"^\014"} (i.e., @code{"^^L"} or @code{"^\C-l"});
+this matches a line that starts with a formfeed character.
@end defvar
+ The following two regular expressions should @emph{not} assume the
+match always starts at the beginning of a line; they should not use
+@samp{^} to anchor the match. Most often, the paragraph commands do
+check for a match only at the beginning of a line, which means that
+@samp{^} would be superfluous. When there is a nonzero left margin,
+they accept matches that start after the left margin. In that case, a
+@samp{^} would be incorrect. However, a @samp{^} is harmless in modes
+where a left margin is never used.
+
@defvar paragraph-separate
This is the regular expression for recognizing the beginning of a line
that separates paragraphs. (If you change this, you may have to
-change @code{paragraph-start} also.) The default value is @code{"^[
-\t\f]*$"}, which is a line that consists entirely of spaces, tabs, and
-form feeds.
+change @code{paragraph-start} also.) The default value is
+@w{@code{"[@ \t\f]*$"}}, which matches a line that consists entirely of
+spaces, tabs, and form feeds (after its left margin).
@end defvar
@defvar paragraph-start
This is the regular expression for recognizing the beginning of a line
that starts @emph{or} separates paragraphs. The default value is
-@code{"^[ \t\n\f]"}, which matches a line starting with a space, tab,
-newline, or form feed.
+@w{@code{"[@ \t\n\f]"}}, which matches a line starting with a space, tab,
+newline, or form feed (after its left margin).
@end defvar
@defvar sentence-end
is:
@example
-"[.?!][]\"')@}]*\\($\\|\t\\| \\)[ \t\n]*"
+"[.?!][]\"')@}]*\\($\\| $\\|\t\\|  \\)[ \t\n]*"
@end example
-This means a period, question mark or exclamation mark, followed by a
-closing brace, followed by tabs, spaces or new lines.
+This means a period, question mark or exclamation mark, followed
+optionally by a closing parenthetical character, followed by tabs,
+spaces or new lines.
For a detailed explanation of this regular expression, see @ref{Regexp
Example}.