1 @c This is part of the Emacs manual.
2 @c Copyright (C) 1985, 86, 87, 93, 94, 95, 97, 2000, 2001
3 @c Free Software Foundation, Inc.
4 @c See file emacs.texi for copying conditions.
5 @node Search, Fixit, Display, Top
6 @chapter Searching and Replacement
8 @cindex finding strings within text
10 Like other editors, Emacs has commands for searching for occurrences of
11 a string. The principal search command is unusual in that it is
12 @dfn{incremental}; it begins to search before you have finished typing the
13 search string. There are also nonincremental search commands more like
14 those of other editors.
16 Besides the usual @code{replace-string} command that finds all
17 occurrences of one string and replaces them with another, Emacs has a fancy
18 replacement command called @code{query-replace} which asks interactively
19 which occurrences to replace.
@menu
* Incremental Search::     Search happens as you type the string.
* Nonincremental Search::  Specify entire string and then search.
* Word Search::            Search for sequence of words.
* Regexp Search::          Search for match for a regexp.
* Regexps::                Syntax of regular expressions.
* Search Case::            To ignore case while searching, or not.
* Replace::                Search, and replace some or all matches.
* Other Repeating Search:: Operating on all matches for some regexp.
@end menu
32 @node Incremental Search, Nonincremental Search, Search, Search
33 @section Incremental Search
35 @cindex incremental search
36 An incremental search begins searching as soon as you type the first
37 character of the search string. As you type in the search string, Emacs
38 shows you where the string (as you have typed it so far) would be
39 found. When you have typed enough characters to identify the place you
40 want, you can stop. Depending on what you plan to do next, you may or
41 may not need to terminate the search explicitly with @key{RET}.
@table @kbd
@item C-s
Incremental search forward (@code{isearch-forward}).
@item C-r
Incremental search backward (@code{isearch-backward}).
@end table
52 @findex isearch-forward
53 @kbd{C-s} starts an incremental search. @kbd{C-s} reads characters from
54 the keyboard and positions the cursor at the first occurrence of the
55 characters that you have typed. If you type @kbd{C-s} and then @kbd{F},
56 the cursor moves right after the first @samp{F}. Type an @kbd{O}, and see
57 the cursor move to after the first @samp{FO}. After another @kbd{O}, the
58 cursor is after the first @samp{FOO} after the place where you started the
59 search. At each step, the buffer text that matches the search string is
60 highlighted, if the terminal can do that; at each step, the current search
61 string is updated in the echo area.
63 If you make a mistake in typing the search string, you can cancel
characters with @key{DEL}. Each @key{DEL} cancels the last character of
the search string. This does not happen until Emacs is ready to read another
66 input character; first it must either find, or fail to find, the character
67 you want to erase. If you do not want to wait for this to happen, use
68 @kbd{C-g} as described below.
70 When you are satisfied with the place you have reached, you can type
71 @key{RET}, which stops searching, leaving the cursor where the search
72 brought it. Also, any command not specially meaningful in searches
73 stops the searching and is then executed. Thus, typing @kbd{C-a}
74 would exit the search and then move to the beginning of the line.
75 @key{RET} is necessary only if the next command you want to type is a
76 printing character, @key{DEL}, @key{RET}, or another character that is
77 special within searches (@kbd{C-q}, @kbd{C-w}, @kbd{C-r}, @kbd{C-s},
@kbd{C-y}, @kbd{M-y}, @kbd{M-r}, @kbd{M-s}, and some other
meta-characters).
81 Sometimes you search for @samp{FOO} and find it, but not the one you
82 expected to find. There was a second @samp{FOO} that you forgot
83 about, before the one you were aiming for. In this event, type
84 another @kbd{C-s} to move to the next occurrence of the search string.
85 You can repeat this any number of times. If you overshoot, you can
86 cancel some @kbd{C-s} characters with @key{DEL}.
88 After you exit a search, you can search for the same string again by
89 typing just @kbd{C-s C-s}: the first @kbd{C-s} is the key that invokes
90 incremental search, and the second @kbd{C-s} means ``search again.''
92 To reuse earlier search strings, use the @dfn{search ring}. The
93 commands @kbd{M-p} and @kbd{M-n} move through the ring to pick a search
94 string to reuse. These commands leave the selected search ring element
95 in the minibuffer, where you can edit it. Type @kbd{C-s} or @kbd{C-r}
96 to terminate editing the string and search for it.
98 If your string is not found at all, the echo area says @samp{Failing
99 I-Search}. The cursor is after the place where Emacs found as much of your
100 string as it could. Thus, if you search for @samp{FOOT}, and there is no
101 @samp{FOOT}, you might see the cursor after the @samp{FOO} in @samp{FOOL}.
102 At this point there are several things you can do. If your string was
103 mistyped, you can rub some of it out and correct it. If you like the place
104 you have found, you can type @key{RET} or some other Emacs command to
105 ``accept what the search offered.'' Or you can type @kbd{C-g}, which
106 removes from the search string the characters that could not be found (the
107 @samp{T} in @samp{FOOT}), leaving those that were found (the @samp{FOO} in
108 @samp{FOOT}). A second @kbd{C-g} at that point cancels the search
109 entirely, returning point to where it was when the search started.
111 An upper-case letter in the search string makes the search
112 case-sensitive. If you delete the upper-case character from the search
113 string, it ceases to have this effect. @xref{Search Case}.
115 To search for a newline, type @kbd{C-j}. To search for another
116 control character, such as control-S or carriage return, you must quote
117 it by typing @kbd{C-q} first. This function of @kbd{C-q} is analogous
118 to its use for insertion (@pxref{Inserting Text}): it causes the
119 following character to be treated the way any ``ordinary'' character is
120 treated in the same context. You can also specify a character by its
121 octal code: enter @kbd{C-q} followed by a sequence of octal digits.
123 @cindex searching for non-ASCII characters
124 @cindex input method, during incremental search
125 To search for non-ASCII characters, you must use an input method
126 (@pxref{Input Methods}). If an input method is turned on in the
127 current buffer when you start the search, you can use it while you
128 type the search string also. Emacs indicates that by including the
input method mnemonic in its prompt, like this:

@example
I-search [@var{im}]:
@end example

@noindent
136 @findex isearch-toggle-input-method
137 @findex isearch-toggle-specified-input-method
138 where @var{im} is the mnemonic of the active input method. You can
139 toggle (enable or disable) the input method while you type the search
140 string with @kbd{C-\} (@code{isearch-toggle-input-method}). You can
141 turn on a certain (non-default) input method with @kbd{C-^}
142 (@code{isearch-toggle-specified-input-method}), which prompts for the
143 name of the input method. Note that the input method you turn on
144 during incremental search is turned on in the current buffer as well.
146 If a search is failing and you ask to repeat it by typing another
147 @kbd{C-s}, it starts again from the beginning of the buffer.
148 Repeating a failing reverse search with @kbd{C-r} starts again from
149 the end. This is called @dfn{wrapping around}, and @samp{Wrapped}
150 appears in the search prompt once this has happened. If you keep on
151 going past the original starting point of the search, it changes to
152 @samp{Overwrapped}, which means that you are revisiting matches that
153 you have already seen.
155 @cindex quitting (in search)
156 The @kbd{C-g} ``quit'' character does special things during searches;
157 just what it does depends on the status of the search. If the search has
158 found what you specified and is waiting for input, @kbd{C-g} cancels the
159 entire search. The cursor moves back to where you started the search. If
160 @kbd{C-g} is typed when there are characters in the search string that have
161 not been found---because Emacs is still searching for them, or because it
162 has failed to find them---then the search string characters which have not
163 been found are discarded from the search string. With them gone, the
164 search is now successful and waiting for more input, so a second @kbd{C-g}
165 will cancel the entire search.
167 You can change to searching backwards with @kbd{C-r}. If a search fails
168 because the place you started was too late in the file, you should do this.
169 Repeated @kbd{C-r} keeps looking for more occurrences backwards. A
@kbd{C-s} starts going forwards again. @kbd{C-r} in a search can be canceled
with @key{DEL}.
174 @findex isearch-backward
175 If you know initially that you want to search backwards, you can use
176 @kbd{C-r} instead of @kbd{C-s} to start the search, because @kbd{C-r} as
177 a key runs a command (@code{isearch-backward}) to search backward. A
178 backward search finds matches that are entirely before the starting
179 point, just as a forward search finds matches that begin after it.
181 The characters @kbd{C-y} and @kbd{C-w} can be used in incremental
182 search to grab text from the buffer into the search string. This makes
183 it convenient to search for another occurrence of text at point.
184 @kbd{C-w} copies the word after point as part of the search string,
185 advancing point over that word. Another @kbd{C-s} to repeat the search
186 will then search for a string including that word. @kbd{C-y} is similar
187 to @kbd{C-w} but copies all the rest of the current line into the search
188 string. Both @kbd{C-y} and @kbd{C-w} convert the text they copy to
189 lower case if the search is currently not case-sensitive; this is so the
190 search remains case-insensitive.
192 The character @kbd{M-y} copies text from the kill ring into the search
193 string. It uses the same text that @kbd{C-y} as a command would yank.
194 @kbd{Mouse-2} in the echo area does the same.
197 When you exit the incremental search, it sets the mark to where point
198 @emph{was}, before the search. That is convenient for moving back
199 there. In Transient Mark mode, incremental search sets the mark without
200 activating it, and does so only if the mark is not already active.
202 @cindex lazy search highlighting
203 @vindex isearch-lazy-highlight
204 When you pause for a little while during incremental search, it
205 highlights all other possible matches for the search string. This
206 makes it easier to anticipate where you can get to by typing @kbd{C-s}
207 or @kbd{C-r} to repeat the search. The short delay before highlighting
208 other matches helps indicate which match is the current one.
209 If you don't like this feature, you can turn it off by setting
210 @code{isearch-lazy-highlight} to @code{nil}.
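
As a minimal sketch, assuming you keep such settings in your init
file, a line like the following would turn the feature off:

@example
;; Turn off lazy highlighting of the other matches during
;; incremental search.
(setq isearch-lazy-highlight nil)
@end example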
212 @vindex isearch-lazy-highlight-face
213 @cindex faces for highlighting search matches
214 You can control how this highlighting looks by customizing the faces
215 @code{isearch} (used for the current match) and
216 @code{isearch-lazy-highlight-face} (for all the other matches).
218 @vindex isearch-mode-map
219 To customize the special characters that incremental search understands,
220 alter their bindings in the keymap @code{isearch-mode-map}. For a list
221 of bindings, look at the documentation of @code{isearch-mode} with
222 @kbd{C-h f isearch-mode @key{RET}}.
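
For illustration only, here is one way such a binding might be
changed from Lisp. The choice of the @key{F5} key is an arbitrary
assumption, and the example relies on @code{isearch-yank-line} being
the command that copies the rest of the current line into the search
string.

@example
;; Hypothetical binding: let F5 pull the rest of the current line
;; into the search string while an incremental search is active.
(define-key isearch-mode-map [f5] 'isearch-yank-line)
@end example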
224 @subsection Slow Terminal Incremental Search
226 Incremental search on a slow terminal uses a modified style of display
227 that is designed to take less time. Instead of redisplaying the buffer at
228 each place the search gets to, it creates a new single-line window and uses
229 that to display the line that the search has found. The single-line window
comes into play as soon as point moves outside of the text that is already
on the screen.
233 When you terminate the search, the single-line window is removed.
234 Emacs then redisplays the window in which the search was done, to show
235 its new position of point.
237 @vindex search-slow-speed
238 The slow terminal style of display is used when the terminal baud rate is
239 less than or equal to the value of the variable @code{search-slow-speed},
240 initially 1200. See @code{baud-rate} in @ref{Display Custom}.
242 @vindex search-slow-window-lines
243 The number of lines to use in slow terminal search display is controlled
244 by the variable @code{search-slow-window-lines}. Its normal value is 1.
246 @node Nonincremental Search, Word Search, Incremental Search, Search
247 @section Nonincremental Search
248 @cindex nonincremental search
250 Emacs also has conventional nonincremental search commands, which require
251 you to type the entire search string before searching begins.
@table @kbd
@item C-s @key{RET} @var{string} @key{RET}
Search for @var{string}.
@item C-r @key{RET} @var{string} @key{RET}
Search backward for @var{string}.
@end table
260 To do a nonincremental search, first type @kbd{C-s @key{RET}}. This
261 enters the minibuffer to read the search string; terminate the string
262 with @key{RET}, and then the search takes place. If the string is not
263 found, the search command gets an error.
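
The same behavior can be observed from Lisp by calling the function
@code{search-forward} directly. The following is only a sketch; the
helper name @code{my-find-string} is made up for this illustration.

@example
(defun my-find-string (text string)
  "Return the position just after STRING in TEXT, or nil if absent.
TEXT is inserted into a temporary buffer and searched from its start."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; A non-nil third argument makes `search-forward' return nil
    ;; instead of signaling an error when nothing is found.
    (search-forward string nil t)))

(my-find-string "one two three" "two")   ; => 8
(my-find-string "one two three" "four")  ; => nil
@end example

Without the third argument, a failing search signals the error
@code{search-failed}, which is what you see when the interactive
command fails.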
265 The way @kbd{C-s @key{RET}} works is that the @kbd{C-s} invokes
266 incremental search, which is specially programmed to invoke nonincremental
267 search if the argument you give it is empty. (Such an empty argument would
268 otherwise be useless.) @kbd{C-r @key{RET}} also works this way.
270 However, nonincremental searches performed using @kbd{C-s @key{RET}} do
271 not call @code{search-forward} right away. The first thing done is to see
272 if the next character is @kbd{C-w}, which requests a word search.
277 @findex search-forward
278 @findex search-backward
279 Forward and backward nonincremental searches are implemented by the
280 commands @code{search-forward} and @code{search-backward}. These
281 commands may be bound to keys in the usual manner. The feature that you
282 can get to them via the incremental search commands exists for
historical reasons, and to avoid the need to find suitable key sequences
for them.
@node Word Search, Regexp Search, Nonincremental Search, Search
@section Word Search
290 Word search searches for a sequence of words without regard to how the
291 words are separated. More precisely, you type a string of many words,
292 using single spaces to separate them, and the string can be found even
if there are multiple spaces, newlines, or other punctuation characters
between the words.
296 Word search is useful for editing a printed document made with a text
297 formatter. If you edit while looking at the printed, formatted version,
298 you can't tell where the line breaks are in the source file. With word
299 search, you can search without having to know them.
@table @kbd
@item C-s @key{RET} C-w @var{words} @key{RET}
Search for @var{words}, ignoring details of punctuation.
@item C-r @key{RET} C-w @var{words} @key{RET}
Search backward for @var{words}, ignoring details of punctuation.
@end table
308 Word search is a special case of nonincremental search and is invoked
309 with @kbd{C-s @key{RET} C-w}. This is followed by the search string,
310 which must always be terminated with @key{RET}. Being nonincremental,
311 this search does not start until the argument is terminated. It works
by constructing a regular expression and searching for that; see
@ref{Regexp Search}.
315 Use @kbd{C-r @key{RET} C-w} to do backward word search.
317 @findex word-search-forward
318 @findex word-search-backward
319 Forward and backward word searches are implemented by the commands
320 @code{word-search-forward} and @code{word-search-backward}. These
321 commands may be bound to keys in the usual manner. The feature that you
322 can get to them via the incremental search commands exists for historical
323 reasons, and to avoid the need to find suitable key sequences for them.
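
As a rough sketch of what word search does, the following calls
@code{word-search-forward} from Lisp. The helper name
@code{my-word-search-p} is hypothetical.

@example
(defun my-word-search-p (text words)
  "Return non-nil if the word sequence WORDS occurs in TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (word-search-forward words nil t)))

;; The comma, the line break and the indentation between the two
;; words do not prevent a match.
(my-word-search-p "the quick,\n   brown fox" "quick brown")  ; non-nil
(my-word-search-p "the quick fox" "quick brown")             ; => nil
@end example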
325 @node Regexp Search, Regexps, Word Search, Search
326 @section Regular Expression Search
327 @cindex regular expression
330 A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that
331 denotes a class of alternative strings to match, possibly infinitely
332 many. In GNU Emacs, you can search for the next match for a regexp
333 either incrementally or not.
336 @findex isearch-forward-regexp
338 @findex isearch-backward-regexp
339 Incremental search for a regexp is done by typing @kbd{C-M-s}
340 (@code{isearch-forward-regexp}). This command reads a search string
341 incrementally just like @kbd{C-s}, but it treats the search string as a
342 regexp rather than looking for an exact match against the text in the
343 buffer. Each time you add text to the search string, you make the
344 regexp longer, and the new regexp is searched for. Invoking @kbd{C-s}
345 with a prefix argument (its value does not matter) is another way to do
346 a forward incremental regexp search. To search backward for a regexp,
use @kbd{C-M-r} (@code{isearch-backward-regexp}), or @kbd{C-r} with a
prefix argument.
350 All of the control characters that do special things within an
351 ordinary incremental search have the same function in incremental regexp
352 search. Typing @kbd{C-s} or @kbd{C-r} immediately after starting the
353 search retrieves the last incremental search regexp used; that is to
354 say, incremental regexp and non-regexp searches have independent
355 defaults. They also have separate search rings that you can access with
356 @kbd{M-p} and @kbd{M-n}.
358 If you type @key{SPC} in incremental regexp search, it matches any
359 sequence of whitespace characters, including newlines. If you want
360 to match just a space, type @kbd{C-q @key{SPC}}.
362 Note that adding characters to the regexp in an incremental regexp
363 search can make the cursor move back and start again. For example, if
364 you have searched for @samp{foo} and you add @samp{\|bar}, the cursor
365 backs up in case the first @samp{bar} precedes the first @samp{foo}.
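
The effect can be reproduced from Lisp with @code{re-search-forward}.
This is only a sketch; @code{my-first-match-end} is a made-up helper
name.

@example
(defun my-first-match-end (text regexp)
  "Return the position just after the first match for REGEXP in TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (re-search-forward regexp nil t)))

;; In "bar foo", the first "bar" precedes the first "foo", so the
;; longer regexp finds its first match earlier in the text.
(my-first-match-end "bar foo" "foo")        ; => 8
(my-first-match-end "bar foo" "foo\\|bar")  ; => 4
@end example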
367 @findex re-search-forward
368 @findex re-search-backward
369 Nonincremental search for a regexp is done by the functions
370 @code{re-search-forward} and @code{re-search-backward}. You can invoke
371 these with @kbd{M-x}, or bind them to keys, or invoke them by way of
incremental regexp search with @kbd{C-M-s @key{RET}} and @kbd{C-M-r
@key{RET}}.
375 If you use the incremental regexp search commands with a prefix
376 argument, they perform ordinary string search, like
@code{isearch-forward} and @code{isearch-backward}. @xref{Incremental
Search}.
380 @node Regexps, Search Case, Regexp Search, Search
381 @section Syntax of Regular Expressions
382 @cindex syntax of regexps
384 Regular expressions have a syntax in which a few characters are
385 special constructs and the rest are @dfn{ordinary}. An ordinary
386 character is a simple regular expression which matches that same
387 character and nothing else. The special characters are @samp{$},
388 @samp{^}, @samp{.}, @samp{*}, @samp{+}, @samp{?}, @samp{[}, @samp{]} and
389 @samp{\}. Any other character appearing in a regular expression is
390 ordinary, unless a @samp{\} precedes it. (When you use regular
expressions in a Lisp program, each @samp{\} must be doubled; see the
392 example near the end of this section.)
394 For example, @samp{f} is not a special character, so it is ordinary, and
395 therefore @samp{f} is a regular expression that matches the string
396 @samp{f} and no other string. (It does @emph{not} match the string
397 @samp{ff}.) Likewise, @samp{o} is a regular expression that matches
398 only @samp{o}. (When case distinctions are being ignored, these regexps
399 also match @samp{F} and @samp{O}, but we consider this a generalization
400 of ``the same string,'' rather than an exception.)
402 Any two regular expressions @var{a} and @var{b} can be concatenated. The
403 result is a regular expression which matches a string if @var{a} matches
some amount of the beginning of that string and @var{b} matches the rest of
the string.
407 As a simple example, we can concatenate the regular expressions @samp{f}
408 and @samp{o} to get the regular expression @samp{fo}, which matches only
409 the string @samp{fo}. Still trivial. To do something nontrivial, you
410 need to use one of the special characters. Here is a list of them.
413 @item .@: @r{(Period)}
414 is a special character that matches any single character except a newline.
415 Using concatenation, we can make regular expressions like @samp{a.b}, which
matches any three-character string that begins with @samp{a} and ends with
@samp{b}.

@item *
is not a construct by itself; it is a postfix operator that means to
match the preceding regular expression repetitively as many times as
possible. Thus, @samp{o*} matches any number of @samp{o}s (including no
@samp{o}s).
425 @samp{*} always applies to the @emph{smallest} possible preceding
426 expression. Thus, @samp{fo*} has a repeating @samp{o}, not a repeating
427 @samp{fo}. It matches @samp{f}, @samp{fo}, @samp{foo}, and so on.
429 The matcher processes a @samp{*} construct by matching, immediately,
430 as many repetitions as can be found. Then it continues with the rest
431 of the pattern. If that fails, backtracking occurs, discarding some
432 of the matches of the @samp{*}-modified construct in case that makes
433 it possible to match the rest of the pattern. For example, in matching
434 @samp{ca*ar} against the string @samp{caaar}, the @samp{a*} first
435 tries to match all three @samp{a}s; but the rest of the pattern is
436 @samp{ar} and there is only @samp{r} left to match, so this try fails.
437 The next alternative is for @samp{a*} to match only two @samp{a}s.
438 With this choice, the rest of the regexp matches successfully.@refill
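
The backtracking case just described can be checked from Lisp with
@code{string-match}, which returns the index where the match starts.
Note that backslashes, when present, must be doubled inside a Lisp
string.

@example
;; `a*' first takes all three a's, then gives one back so that the
;; trailing "ar" can match; the match covers the whole string.
(string-match "ca*ar" "caaar")  ; => 0
(match-end 0)                   ; => 5
@end example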
@item +
is a postfix operator, similar to @samp{*} except that it must match
442 the preceding expression at least once. So, for example, @samp{ca+r}
443 matches the strings @samp{car} and @samp{caaaar} but not the string
444 @samp{cr}, whereas @samp{ca*r} matches all three strings.
@item ?
is a postfix operator, similar to @samp{*} except that it can match the
448 preceding expression either once or not at all. For example,
449 @samp{ca?r} matches @samp{car} or @samp{cr}; nothing else.
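
A short sketch of the difference between @samp{+} and @samp{?},
using @code{string-match} from Lisp:

@example
;; `+' requires at least one repetition.
(string-match "ca+r" "cr")      ; => nil
(string-match "ca+r" "caaaar")  ; => 0

;; `?' allows zero or one repetition, but no more.
(string-match "ca?r" "cr")      ; => 0
(string-match "ca?r" "caar")    ; => nil
@end example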
@item *?, +?, ??
@cindex non-greedy regexp matching
are non-greedy variants of the operators above. The normal operators
@samp{*}, @samp{+}, @samp{?} are @dfn{greedy} in that they match as
much as they can, as long as the overall regexp can still match. With
a following @samp{?}, they are non-greedy: they will match as little
as possible.

Thus, both @samp{ab*} and @samp{ab*?} can match the string @samp{a}
and the string @samp{abbbb}; but if you try to match them both against
the text @samp{abbb}, @samp{ab*} will match it all (the longest valid
match), while @samp{ab*?} will match just @samp{a} (the shortest
valid match).
@item \@{@var{n}\@}
is a postfix operator that specifies repetition @var{n} times---that
is, the preceding regular expression must match exactly @var{n} times
in a row. For example, @samp{x\@{4\@}} matches the string @samp{xxxx}
and nothing else.
471 @item \@{@var{n},@var{m}\@}
472 is a postfix operator that specifies repetition between @var{n} and
473 @var{m} times---that is, the preceding regular expression must match
474 at least @var{n} times, but no more than @var{m} times. If @var{m} is
475 omitted, then there is no upper limit, but the preceding regular
476 expression must match at least @var{n} times.@* @samp{\@{0,1\@}} is
477 equivalent to @samp{?}. @* @samp{\@{0,\@}} is equivalent to
478 @samp{*}. @* @samp{\@{1,\@}} is equivalent to @samp{+}.
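
The counted and non-greedy repetition operators can be tried out from
Lisp as follows. This is a sketch only; remember that each backslash
is doubled in a Lisp string.

@example
;; Counted repetition.
(string-match "x\\@{4\\@}" "xxxx")        ; => 0
(string-match "ax\\@{2,3\\@}b" "axxxb")   ; => 0
(string-match "ax\\@{2,3\\@}b" "axxxxb")  ; => nil

;; Greedy versus non-greedy matching against "abbb".
(string-match "ab*" "abbb")
(match-end 0)                           ; => 4
(string-match "ab*?" "abbb")
(match-end 0)                           ; => 1
@end example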
@item [ @dots{} ]
is a @dfn{character set}, which begins with @samp{[} and is terminated
482 by @samp{]}. In the simplest case, the characters between the two
483 brackets are what this set can match.
485 Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
486 @samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
487 (including the empty string), from which it follows that @samp{c[ad]*r}
488 matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.
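
A brief check of these claims from Lisp, given as a sketch:

@example
(string-match "c[ad]*r" "cr")       ; => 0
(string-match "c[ad]*r" "caddaar")  ; => 0
;; `b' is not in the set, so nothing matches here.
(string-match "c[ad]*r" "cbr")      ; => nil
@end example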
490 You can also include character ranges in a character set, by writing the
491 starting and ending characters with a @samp{-} between them. Thus,
492 @samp{[a-z]} matches any lower-case ASCII letter. Ranges may be
493 intermixed freely with individual characters, as in @samp{[a-z$%.]},
which matches any lower-case ASCII letter or @samp{$}, @samp{%} or
period.
497 Note that the usual regexp special characters are not special inside a
498 character set. A completely different set of special characters exists
499 inside character sets: @samp{]}, @samp{-} and @samp{^}.
501 To include a @samp{]} in a character set, you must make it the first
502 character. For example, @samp{[]a]} matches @samp{]} or @samp{a}. To
503 include a @samp{-}, write @samp{-} as the first or last character of the
set, or put it after a range. Thus, @samp{[]-]} matches both @samp{]}
and @samp{-}.
507 To include @samp{^} in a set, put it anywhere but at the beginning of
508 the set. (At the beginning, it complements the set---see below.)
510 When you use a range in case-insensitive search, you should write both
511 ends of the range in upper case, or both in lower case, or both should
512 be non-letters. The behavior of a mixed-case range such as @samp{A-z}
513 is somewhat ill-defined, and it may change in future Emacs versions.
@item [^ @dots{} ]
@samp{[^} begins a @dfn{complemented character set}, which matches any
517 character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} matches
518 all characters @emph{except} ASCII letters and digits.
520 @samp{^} is not special in a character set unless it is the first
521 character. The character following the @samp{^} is treated as if it
522 were first (in other words, @samp{-} and @samp{]} are not special there).
524 A complemented character set can match a newline, unless newline is
525 mentioned as one of the characters not to match. This is in contrast to
526 the handling of regexps in programs such as @code{grep}.
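
The newline behavior can be seen in a small Lisp sketch, where
@samp{\n} in the string constant stands for a newline:

@example
;; The complemented set matches the newline at index 2.
(string-match "[^a-z]" "ab\ncd")    ; => 2
;; Mentioning the newline in the set excludes it as well.
(string-match "[^a-z\n]" "ab\ncd")  ; => nil
@end example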
@item ^
is a special character that matches the empty string, but only at the
530 beginning of a line in the text being matched. Otherwise it fails to
531 match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at
532 the beginning of a line.
@item $
is similar to @samp{^} but matches only at the end of a line. Thus,
536 @samp{x+$} matches a string of one @samp{x} or more at the end of a line.
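
As a sketch of how the two line anchors behave when matching against
a string that contains a newline:

@example
(string-match "^foo" "bar\nfoo")  ; => 4
(string-match "^foo" "barfoo")    ; => nil
(string-match "x+$" "axx\nb")     ; => 1
(string-match "x+$" "xxa")        ; => nil
@end example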
@item \
has two functions: it quotes the special characters (including
540 @samp{\}), and it introduces additional special constructs.
542 Because @samp{\} quotes special characters, @samp{\$} is a regular
543 expression that matches only @samp{$}, and @samp{\[} is a regular
544 expression that matches only @samp{[}, and so on.
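
The following sketch shows quoting from Lisp, where each backslash of
the regexp is written twice. The function @code{regexp-quote} is not
discussed in this chapter; it is a standard Lisp function that adds
the quoting backslashes for you.

@example
(string-match "\\$" "cost: $5")  ; => 6
(string-match "\\[" "a[0]")      ; => 1

;; Build a regexp that matches the given text literally.
(regexp-quote "$1.50")           ; => "\\$1\\.50"
@end example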
547 Note: for historical compatibility, special characters are treated as
548 ordinary ones if they are in contexts where their special meanings make no
549 sense. For example, @samp{*foo} treats @samp{*} as ordinary since there is
550 no preceding expression on which the @samp{*} can act. It is poor practice
551 to depend on this behavior; it is better to quote the special character anyway,
552 regardless of where it appears.@refill
554 For the most part, @samp{\} followed by any character matches only that
555 character. However, there are several exceptions: two-character
556 sequences starting with @samp{\} that have special meanings. The second
557 character in the sequence is always an ordinary character when used on
558 its own. Here is a table of @samp{\} constructs.
@item \|
specifies an alternative. Two regular expressions @var{a} and @var{b}
563 with @samp{\|} in between form an expression that matches some text if
564 either @var{a} matches it or @var{b} matches it. It works by trying to
565 match @var{a}, and if that fails, by trying to match @var{b}.
567 Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar}
568 but no other string.@refill
570 @samp{\|} applies to the largest possible surrounding expressions. Only a
surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of
@samp{\|}.
574 Full backtracking capability exists to handle multiple uses of @samp{\|}.
@item \( @dots{} \)
is a grouping construct that serves three purposes:

@enumerate
@item
To enclose a set of @samp{\|} alternatives for other operations.
Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}.

@item
To enclose a complicated expression for the postfix operators @samp{*},
@samp{+} and @samp{?} to operate on. Thus, @samp{ba\(na\)*} matches
@samp{bananana}, etc., with any (zero or more) number of @samp{na}
strings.

@item
To record a matched substring for future reference.
@end enumerate
594 This last application is not a consequence of the idea of a
595 parenthetical grouping; it is a separate feature that is assigned as a
596 second meaning to the same @samp{\( @dots{} \)} construct. In practice
597 there is usually no conflict between the two meanings; when there is
598 a conflict, you can use a ``shy'' group.
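
The three uses of grouping can be sketched in Lisp as follows. After
a successful @code{string-match}, the function @code{match-string}
returns the text recorded for a given group.

@example
;; Grouping alternatives.
(string-match "\\(foo\\|bar\\)x" "barx")  ; => 0
(match-string 1 "barx")                   ; => "bar"

;; Grouping for a postfix operator.
(string-match "ba\\(na\\)*" "bananana")   ; => 0
(match-end 0)                             ; => 8
;; Only the last repetition of the group is recorded.
(match-string 1 "bananana")               ; => "na"
@end example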
600 @item \(?: @dots{} \)
601 @cindex shy group, in regexp
602 specifies a ``shy'' group that does not record the matched substring;
603 you can't refer back to it with @samp{\@var{d}}. This is useful
604 in mechanically combining regular expressions, so that you
605 can add groups for syntactic purposes without interfering with
606 the numbering of the groups that were written by the user.
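
A small sketch of the effect on group numbering: because the first
group below is shy, the group that follows it is number 1.

@example
(string-match "\\(?:foo\\|bar\\)\\(x+\\)" "barxx")  ; => 0
(match-string 1 "barxx")                            ; => "xx"
@end example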
@item \@var{d}
matches the same text that matched the @var{d}th occurrence of a
610 @samp{\( @dots{} \)} construct.
612 After the end of a @samp{\( @dots{} \)} construct, the matcher remembers
613 the beginning and end of the text matched by that construct. Then,
614 later on in the regular expression, you can use @samp{\} followed by the
615 digit @var{d} to mean ``match the same text matched the @var{d}th time
616 by the @samp{\( @dots{} \)} construct.''
618 The strings matching the first nine @samp{\( @dots{} \)} constructs
619 appearing in a regular expression are assigned numbers 1 through 9 in
620 the order that the open-parentheses appear in the regular expression.
621 So you can use @samp{\1} through @samp{\9} to refer to the text matched
622 by the corresponding @samp{\( @dots{} \)} constructs.
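
As an illustrative sketch, a back-reference can be used from Lisp to
find text that repeats itself:

@example
;; The group first tries to take the whole string, then backs off
;; until `\1' can match the same text again.
(string-match "\\(.*\\)\\1" "abcabc")  ; => 0
(match-string 1 "abcabc")              ; => "abc"
(match-end 0)                          ; => 6
@end example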
624 For example, @samp{\(.*\)\1} matches any newline-free string that is
625 composed of two identical halves. The @samp{\(.*\)} matches the first
half, which may be anything, but the @samp{\1} that follows must match
the same exact text.
629 If a particular @samp{\( @dots{} \)} construct matches more than once
(which can easily happen if it is followed by @samp{*}), only the last
match is recorded.
@item \`
matches the empty string, but only at the beginning
of the buffer or string being matched against.

@item \'
matches the empty string, but only at the end of
the buffer or string being matched against.

@item \=
matches the empty string, but only at point.

@item \b
matches the empty string, but only at the beginning or
646 end of a word. Thus, @samp{\bfoo\b} matches any occurrence of
647 @samp{foo} as a separate word. @samp{\bballs?\b} matches
648 @samp{ball} or @samp{balls} as a separate word.@refill
650 @samp{\b} matches at the beginning or end of the buffer
651 regardless of what text appears next to it.
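
A quick sketch of the word-boundary construct, assuming the usual
syntax table in which letters are word constituents:

@example
;; The "ball" inside "football" is not at a word boundary.
(string-match "\\bballs?\\b" "football balls")  ; => 9
(string-match "\\bfoo\\b" "food")               ; => nil
;; `\b' also matches at the very beginning and end of the text.
(string-match "\\bfoo\\b" "foo")                ; => 0
@end example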
@item \B
matches the empty string, but @emph{not} at the beginning or
end of a word.

@item \<
matches the empty string, but only at the beginning of a word.
659 @samp{\<} matches at the beginning of the buffer only if a
660 word-constituent character follows.
@item \>
matches the empty string, but only at the end of a word. @samp{\>}
664 matches at the end of the buffer only if the contents end with a
665 word-constituent character.
@item \w
matches any word-constituent character. The syntax table
determines which characters these are. @xref{Syntax}.

@item \W
matches any character that is not a word-constituent.

@item \s@var{c}
matches any character whose syntax is @var{c}. Here @var{c} is a
676 character that designates a particular syntax class: thus, @samp{w}
677 for word constituent, @samp{-} or @samp{ } for whitespace, @samp{.}
678 for ordinary punctuation, etc. @xref{Syntax}.
681 matches any character whose syntax is not @var{c}.
683 @cindex categories of characters
684 @cindex characters which belong to a specific language
685 @findex describe-categories
687 matches any character that belongs to the category @var{c}. For
688 example, @samp{\cc} matches Chinese characters, @samp{\cg} matches
689 Greek characters, etc. For the description of the known categories,
690 type @kbd{M-x describe-categories @key{RET}}.
693 matches any character that does @emph{not} belong to category
697 The constructs that pertain to words and syntax are controlled by the
698 setting of the syntax table (@pxref{Syntax}).
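
As a small illustration, the back-reference and word-boundary
constructs can be tried from Lisp with @code{string-match}. This is
only a sketch: the sample strings are invented for the example, and
every backslash is doubled because the regexps are written as Lisp
string constants.

@example
;; \(.*\)\1, anchored with \` and \', matches a string that
;; consists of two identical halves.
(string-match "\\`\\(.*\\)\\1\\'" "abcabc")   ; @result{} 0
(match-string 1 "abcabc")                     ; @result{} "abc"
(string-match "\\`\\(.*\\)\\1\\'" "abcabd")   ; @result{} nil

;; When a group matches more than once, only the last match
;; is recorded.
(string-match "\\(a\\|b\\)*" "ab")            ; @result{} 0
(match-string 1 "ab")                         ; @result{} "b"

;; \b matches only at the beginning or end of a word.
(string-match "\\bfoo\\b" "a foo b")          ; @result{} 2
(string-match "\\bfoo\\b" "foobar")           ; @result{} nil
@end example

The word-boundary results assume a syntax table in which letters are
word constituents, as is normally the case.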
700 Here is a complicated regexp, stored in @code{sentence-end} and used
701 by Emacs to recognize the end of a sentence together with any
702 whitespace that follows. We show its Lisp syntax to distinguish the
703 spaces from the tab characters. In Lisp syntax, the string constant
704 begins and ends with a double-quote. @samp{\"} stands for a
705 double-quote as part of the regexp, @samp{\\} for a backslash as part
of the regexp, @samp{\t} for a tab, and @samp{\n} for a newline.

@example
"[.?!][]\"')]*\\($\\| $\\|\t\\|  \\)[ \t\n]*"
@end example

@noindent
This contains four parts in succession: a character set matching
period, @samp{?}, or @samp{!}; a character set matching
close-brackets, quotes, or parentheses, repeated zero or more times; a
set of alternatives within backslash-parentheses that matches either
end-of-line, a space at the end of a line, a tab, or two spaces; and a
character set matching whitespace characters, repeated any number of
times.
721 To enter the same regexp interactively, you would type @key{TAB} to
722 enter a tab, and @kbd{C-j} to enter a newline. You would also type
723 single backslashes as themselves, instead of doubling them for Lisp syntax.
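
To see how the four parts work together, one can try the regexp on a
few strings from Lisp. The following is a rough sketch with made-up
sample strings; the regexp is written out again here, with two spaces
in its last alternative, so that the example stands on its own.

@example
(let ((re "[.?!][]\"')]*\\($\\| $\\|\t\\|  \\)[ \t\n]*"))
  (list
   ;; Period followed by two spaces: a sentence end.
   (string-match re "Hello.  World")
   ;; Period followed by one space in mid-line: no match.
   (string-match re "Hello. World")
   ;; Exclamation mark at the end of a line: a sentence end.
   (string-match re "Done!\nNext")))
;; @result{} (5 nil 4)
@end example

Each number is the position in the string where the match begins,
that is, the position of the sentence-ending punctuation.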
726 @c I commented this out because it is missing vital information
727 @c and therefore useless. For instance, what do you do to *use* the
728 @c regular expression when it is finished? What jobs is this good for?
732 @cindex authoring regular expressions
For interactive development of regular expressions, you
can use the @kbd{M-x re-builder} command. It provides a convenient
interface for creating regular expressions, by giving immediate visual
feedback. The buffer from which @code{re-builder} was invoked becomes
the target for the regexp editor, which pops up in a separate window. At
738 all times, all the matches in the target buffer for the current
739 regular expression are highlighted. Each parenthesized sub-expression
740 of the regexp is shown in a distinct face, which makes it easier to
741 verify even very complex regexps. (On displays that don't support
742 colors, Emacs blinks the cursor around the matched text, as it does
743 for matching parens.)
746 @node Search Case, Replace, Regexps, Search
747 @section Searching and Case
749 Incremental searches in Emacs normally ignore the case of the text
750 they are searching through, if you specify the text in lower case.
751 Thus, if you specify searching for @samp{foo}, then @samp{Foo} and
752 @samp{foo} are also considered a match. Regexps, and in particular
753 character sets, are included: @samp{[ab]} would match @samp{a} or
754 @samp{A} or @samp{b} or @samp{B}.@refill
756 An upper-case letter anywhere in the incremental search string makes
757 the search case-sensitive. Thus, searching for @samp{Foo} does not find
758 @samp{foo} or @samp{FOO}. This applies to regular expression search as
759 well as to string search. The effect ceases if you delete the
760 upper-case letter from the search string.
762 Typing @kbd{M-c} within an incremental search toggles the case
763 sensitivity of that search. The effect does not extend beyond the
764 current incremental search to the next one, but it does override the
765 effect of including an upper-case letter in the current search.
767 @vindex case-fold-search
768 If you set the variable @code{case-fold-search} to @code{nil}, then
769 all letters must match exactly, including case. This is a per-buffer
770 variable; altering the variable affects only the current buffer, but
771 there is a default value which you can change as well. @xref{Locals}.
772 This variable applies to nonincremental searches also, including those
773 performed by the replace commands (@pxref{Replace}) and the minibuffer
774 history matching commands (@pxref{Minibuffer History}).
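
The effect of @code{case-fold-search} can be observed from Lisp as
well. The following is a minimal sketch with invented sample strings;
the last form shows one way the default value might be changed in an
init file.

@example
;; With case-fold-search non-nil, lower-case text and character
;; sets match either case.
(let ((case-fold-search t))
  (list (string-match "foo" "Foo")
        (string-match "[ab]" "B")))
;; @result{} (0 0)

;; With case-fold-search nil, all letters must match exactly.
(let ((case-fold-search nil))
  (string-match "foo" "Foo"))
;; @result{} nil

;; In an init file: make searches case-sensitive by default in
;; every buffer that has no local value of its own.
(setq-default case-fold-search nil)
@end example

Note that the rule about upper-case letters in the search string
belongs to the search commands; @code{string-match} itself consults
only @code{case-fold-search}.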
776 @node Replace, Other Repeating Search, Search Case, Search
777 @section Replacement Commands
779 @cindex search-and-replace commands
780 @cindex string substitution
781 @cindex global substitution
783 Global search-and-replace operations are not needed as often in Emacs
784 as they are in other editors@footnote{In some editors,
785 search-and-replace operations are the only convenient way to make a
786 single change in the text.}, but they are available. In addition to the
787 simple @kbd{M-x replace-string} command which is like that found in most
788 editors, there is a @kbd{M-x query-replace} command which asks you, for
789 each occurrence of the pattern, whether to replace it.
791 The replace commands normally operate on the text from point to the
792 end of the buffer; however, in Transient Mark mode, when the mark is
793 active, they operate on the region. The replace commands all replace
794 one string (or regexp) with one replacement string. It is possible to
795 perform several replacements in parallel using the command
796 @code{expand-region-abbrevs} (@pxref{Expanding Abbrevs}).
@menu
* Unconditional Replace::  Replacing all matches for a string.
* Regexp Replace::         Replacing all matches for a regexp.
* Replacement and Case::   How replacements preserve case of letters.
* Query Replace::          How to use querying.
@end menu
805 @node Unconditional Replace, Regexp Replace, Replace, Replace
806 @subsection Unconditional Replacement
807 @findex replace-string
808 @findex replace-regexp
@table @kbd
@item M-x replace-string @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
Replace every occurrence of @var{string} with @var{newstring}.
@item M-x replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
Replace every match for @var{regexp} with @var{newstring}.
@end table
817 To replace every instance of @samp{foo} after point with @samp{bar},
818 use the command @kbd{M-x replace-string} with the two arguments
819 @samp{foo} and @samp{bar}. Replacement happens only in the text after
820 point, so if you want to cover the whole buffer you must go to the
821 beginning first. All occurrences up to the end of the buffer are
822 replaced; to limit replacement to part of the buffer, narrow to that
823 part of the buffer before doing the replacement (@pxref{Narrowing}).
824 In Transient Mark mode, when the region is active, replacement is
825 limited to the region (@pxref{Transient Mark}).
827 When @code{replace-string} exits, it leaves point at the last
828 occurrence replaced. It sets the mark to the prior position of point
829 (where the @code{replace-string} command was issued); use @kbd{C-u
830 C-@key{SPC}} to move back there.
832 A numeric argument restricts replacement to matches that are surrounded
833 by word boundaries. The argument's value doesn't matter.
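
The fact that replacement starts at point can be illustrated with a
short Lisp loop. This is only a sketch and not how
@code{replace-string} itself is implemented: the function name
@code{my-replace-all} is hypothetical, and unlike the command it does
no case conversion and does not set the mark.

@example
(defun my-replace-all (from to)
  "Replace each occurrence of FROM after point with TO.
Return the number of replacements.  Hypothetical helper."
  (let ((count 0))
    (while (search-forward from nil t)
      ;; Fixed case, literal replacement text.
      (replace-match to t t)
      (setq count (1+ count)))
    count))

(with-temp-buffer
  (insert "foo foo foo")
  (goto-char 4)                 ; point is just after the first foo
  (my-replace-all "foo" "bar")
  (buffer-string))
;; @result{} "foo bar bar"
@end example

The first @samp{foo} is left alone because it lies before point;
moving to the beginning of the buffer first would replace all three.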
835 @node Regexp Replace, Replacement and Case, Unconditional Replace, Replace
836 @subsection Regexp Replacement
838 The @kbd{M-x replace-string} command replaces exact matches for a
839 single string. The similar command @kbd{M-x replace-regexp} replaces
840 any match for a specified pattern.
842 In @code{replace-regexp}, the @var{newstring} need not be constant: it
843 can refer to all or part of what is matched by the @var{regexp}.
844 @samp{\&} in @var{newstring} stands for the entire match being replaced.
845 @samp{\@var{d}} in @var{newstring}, where @var{d} is a digit, stands for
846 whatever matched the @var{d}th parenthesized grouping in @var{regexp}.
847 To include a @samp{\} in the text to replace with, you must enter
848 @samp{\\}. For example,
@example
M-x replace-regexp @key{RET} c[ad]+r @key{RET} \&-safe @key{RET}
@end example

@noindent
replaces (for example) @samp{cadr} with @samp{cadr-safe} and @samp{cddr}
with @samp{cddr-safe}.

@example
M-x replace-regexp @key{RET} \(c[ad]+r\)-safe @key{RET} \1 @key{RET}
@end example

@noindent
performs the inverse transformation.
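
The same two transformations can be sketched in Lisp with
@code{replace-regexp-in-string}, which works on a string rather than
on the buffer. The sample text is invented for this example, and the
backslashes are doubled for Lisp string syntax.

@example
;; \& stands for the entire match.
(replace-regexp-in-string "c[ad]+r" "\\&-safe"
                          "(cadr x) (cddr y)")
;; @result{} "(cadr-safe x) (cddr-safe y)"

;; \1 stands for the first parenthesized grouping.
(replace-regexp-in-string "\\(c[ad]+r\\)-safe" "\\1"
                          "(cadr-safe x) (cddr-safe y)")
;; @result{} "(cadr x) (cddr y)"
@end example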
865 @node Replacement and Case, Query Replace, Regexp Replace, Replace
866 @subsection Replace Commands and Case
868 If the first argument of a replace command is all lower case, the
869 command ignores case while searching for occurrences to
870 replace---provided @code{case-fold-search} is non-@code{nil}. If
@code{case-fold-search} is set to @code{nil}, case is always significant
in all searches.
875 In addition, when the @var{newstring} argument is all or partly lower
876 case, replacement commands try to preserve the case pattern of each
877 occurrence. Thus, the command
@example
M-x replace-string @key{RET} foo @key{RET} bar @key{RET}
@end example

@noindent
replaces a lower case @samp{foo} with a lower case @samp{bar}, an
all-caps @samp{FOO} with @samp{BAR}, and a capitalized @samp{Foo} with
@samp{Bar}. (Lower case, all caps, and capitalized are the only
three alternatives that @code{replace-string} can distinguish.)
890 If upper-case letters are used in the replacement string, they remain
891 upper case every time that text is inserted. If upper-case letters are
892 used in the first argument, the second argument is always substituted
893 exactly as given, with no case conversion. Likewise, if either
894 @code{case-replace} or @code{case-fold-search} is set to @code{nil},
895 replacement is done without case conversion.
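
The case conversion described here can be observed from Lisp through
the @var{fixedcase} argument of @code{replace-match}. The following
is a minimal sketch; the helper name @code{my-case-demo} and the
sample text are hypothetical.

@example
(defun my-case-demo (fixedcase)
  "Replace foo with bar in a sample buffer; return the result.
FIXEDCASE is passed to `replace-match'.  Hypothetical helper."
  (with-temp-buffer
    (insert "foo FOO Foo")
    (goto-char (point-min))
    (let ((case-fold-search t))
      (while (search-forward "foo" nil t)
        (replace-match "bar" fixedcase t)))
    (buffer-string)))

(my-case-demo nil)   ; @result{} "bar BAR Bar"
(my-case-demo t)     ; @result{} "bar bar bar"
@end example

With @var{fixedcase} @code{nil}, the replacement follows the case
pattern of each occurrence; with @code{t}, it is inserted exactly as
given.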
897 @node Query Replace,, Replacement and Case, Replace
898 @subsection Query Replace
899 @cindex query replace
@table @kbd
@item M-% @var{string} @key{RET} @var{newstring} @key{RET}
@itemx M-x query-replace @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
Replace some occurrences of @var{string} with @var{newstring}.
@item C-M-% @var{regexp} @key{RET} @var{newstring} @key{RET}
@itemx M-x query-replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
Replace some matches for @var{regexp} with @var{newstring}.
@end table
911 @findex query-replace
912 If you want to change only some of the occurrences of @samp{foo} to
913 @samp{bar}, not all of them, then you cannot use an ordinary
914 @code{replace-string}. Instead, use @kbd{M-%} (@code{query-replace}).
915 This command finds occurrences of @samp{foo} one by one, displays each
916 occurrence and asks you whether to replace it. Aside from querying,
917 @code{query-replace} works just like @code{replace-string}. It
918 preserves case, like @code{replace-string}, provided
919 @code{case-replace} is non-@code{nil}, as it normally is. A numeric
920 argument means consider only occurrences that are bounded by
921 word-delimiter characters.
924 @findex query-replace-regexp
925 @kbd{C-M-%} performs regexp search and replace (@code{query-replace-regexp}).
The characters you can type when you are shown a match for the string
or regexp are:

@ignore @c Not worth it.
@kindex SPC @r{(query-replace)}
@kindex DEL @r{(query-replace)}
@kindex , @r{(query-replace)}
@kindex RET @r{(query-replace)}
@kindex . @r{(query-replace)}
@kindex ! @r{(query-replace)}
@kindex ^ @r{(query-replace)}
@kindex C-r @r{(query-replace)}
@kindex C-w @r{(query-replace)}
@kindex C-l @r{(query-replace)}
@end ignore

@table @kbd
@item @key{SPC}
to replace the occurrence with @var{newstring}.

@item @key{DEL}
to skip to the next occurrence without replacing this one.

@item , @r{(Comma)}
to replace this occurrence and display the result. You are then asked
for another input character to say what to do next. Since the
replacement has already been made, @key{DEL} and @key{SPC} are
equivalent in this situation; both move to the next occurrence.

You can type @kbd{C-r} at this point (see below) to alter the replaced
text. You can also type @kbd{C-x u} to undo the replacement; this exits
the @code{query-replace}, so if you want to do further replacement you
must use @kbd{C-x @key{ESC} @key{ESC} @key{RET}} to restart
(@pxref{Repetition}).

@item @key{RET}
to exit without doing any more replacements.

@item .@: @r{(Period)}
to replace this occurrence and then exit without searching for more
occurrences.

@item !
to replace all remaining occurrences without asking again.

@item ^
to go back to the position of the previous occurrence (or what used to
be an occurrence), in case you changed it by mistake. This works by
popping the mark ring. Only one @kbd{^} in a row is meaningful, because
only one previous replacement position is kept during @code{query-replace}.

@item C-r
to enter a recursive editing level, in case the occurrence needs to be
edited rather than just replaced with @var{newstring}. When you are
done, exit the recursive editing level with @kbd{C-M-c} to proceed to
the next occurrence. @xref{Recursive Edit}.

@item C-w
to delete the occurrence, and then enter a recursive editing level as in
@kbd{C-r}. Use the recursive edit to insert text to replace the deleted
occurrence of @var{string}. When done, exit the recursive editing level
with @kbd{C-M-c} to proceed to the next occurrence.

@item e
to edit the replacement string in the minibuffer. When you exit the
minibuffer by typing @key{RET}, the minibuffer contents replace the
current occurrence of the pattern. They also become the new
replacement string for any further occurrences.

@item C-l
to redisplay the screen. Then you must type another character to
specify what to do with this occurrence.

@item C-h
to display a message summarizing these options. Then you must type
another character to specify what to do with this occurrence.
@end table
1006 Some other characters are aliases for the ones listed above: @kbd{y},
@kbd{n} and @kbd{q} are equivalent to @key{SPC}, @key{DEL} and
@key{RET}.
1010 Aside from this, any other character exits the @code{query-replace},
1011 and is then reread as part of a key sequence. Thus, if you type
@kbd{C-k}, it exits the @code{query-replace} and then kills to end of
line.
1015 To restart a @code{query-replace} once it is exited, use @kbd{C-x
1016 @key{ESC} @key{ESC}}, which repeats the @code{query-replace} because it
used the minibuffer to read its arguments. @xref{Repetition, C-x ESC
ESC}.
1020 See also @ref{Transforming File Names}, for Dired commands to rename,
1021 copy, or link files by replacing regexp matches in file names.
1023 @node Other Repeating Search,, Replace, Search
1024 @section Other Search-and-Loop Commands
1026 Here are some other commands that find matches for a regular
1027 expression. They all ignore case in matching, if the pattern contains
1028 no upper-case letters and @code{case-fold-search} is non-@code{nil}.
1029 Aside from @code{occur}, all operate on the text from point to the end
1030 of the buffer, or on the active region in Transient Mark mode.
1032 @findex list-matching-lines
1035 @findex delete-non-matching-lines
1036 @findex delete-matching-lines
@table @kbd
@item M-x occur @key{RET} @var{regexp} @key{RET}
Display a list showing each line in the buffer that contains a match
for @var{regexp}. To limit the search to part of the buffer, narrow
to that part (@pxref{Narrowing}). A numeric argument @var{n}
specifies that @var{n} lines of context are to be displayed before and
after each matching line.

@kindex RET @r{(Occur mode)}
The buffer @samp{*Occur*} containing the output serves as a menu for
finding the occurrences in their original context. Click @kbd{Mouse-2}
on an occurrence listed in @samp{*Occur*}, or position point there and
type @key{RET}; this switches to the buffer that was searched and
moves point to the original of the chosen occurrence.

@item M-x list-matching-lines
Synonym for @kbd{M-x occur}.

@item M-x how-many @key{RET} @var{regexp} @key{RET}
Print the number of matches for @var{regexp} that exist in the buffer
after point. In Transient Mark mode, if the region is active, the
command operates on the region instead.

@item M-x flush-lines @key{RET} @var{regexp} @key{RET}
Delete each line that contains a match for @var{regexp}, operating on
the text after point. In Transient Mark mode, if the region is
active, the command operates on the region instead.

@item M-x keep-lines @key{RET} @var{regexp} @key{RET}
Delete each line that @emph{does not} contain a match for
@var{regexp}, operating on the text after point. In Transient Mark
mode, if the region is active, the command operates on the region
instead.
@end table
1075 You can also search multiple files under control of a tags table
(@pxref{Tags Search}) or through the Dired @kbd{A} command
1077 (@pxref{Operating on Files}), or ask the @code{grep} program to do it
1078 (@pxref{Grep Searching}).
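
The line-deleting commands described in this section, @code{flush-lines}
and @code{keep-lines}, can also be called from Lisp, where they
likewise operate from point to the end of the buffer. The following
is a small sketch; the helper name @code{my-filter-lines} and the
sample text are hypothetical.

@example
(defun my-filter-lines (command regexp)
  "Apply COMMAND to REGEXP in a sample buffer; return the result.
COMMAND is `flush-lines' or `keep-lines'.  Hypothetical helper."
  (with-temp-buffer
    (insert "# comment\ncode 1\n# note\ncode 2\n")
    ;; These commands work from point onward, so start at the top.
    (goto-char (point-min))
    (funcall command regexp)
    (buffer-string)))

(my-filter-lines 'flush-lines "^#")
;; @result{} "code 1\ncode 2\n"
(my-filter-lines 'keep-lines "^#")
;; @result{} "# comment\n# note\n"
@end example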