X-Git-Url: https://code.delx.au/gnu-emacs/blobdiff_plain/1911e6e52c846c4a5bf744d850ec7061ff90c412..d0997f103e59252edd7d245a90ec8b622c154a92:/lispref/syntax.texi diff --git a/lispref/syntax.texi b/lispref/syntax.texi index 35cde861d1..bc3ac9c36b 100644 --- a/lispref/syntax.texi +++ b/lispref/syntax.texi @@ -1,6 +1,7 @@ @c -*-texinfo-*- @c This is part of the GNU Emacs Lisp Reference Manual. -@c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998 Free Software Foundation, Inc. +@c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998, 1999 +@c Free Software Foundation, Inc. @c See the file elisp.texi for copying conditions. @setfilename ../info/syntax @node Syntax Tables, Abbrevs, Searching and Matching, Top @@ -33,7 +34,7 @@ functions in this chapter. @node Syntax Basics @section Syntax Table Concepts -@ifinfo +@ifnottex A @dfn{syntax table} provides Emacs with the information that determines the syntactic use of each character in a buffer. This information is used by the parsing commands, the complex movement @@ -42,7 +43,7 @@ syntactic constructs begin and end. The current syntax table controls the meaning of the word motion functions (@pxref{Word Motion}) and the list motion functions (@pxref{List Motion}) as well as the functions in this chapter. -@end ifinfo +@end ifnottex A syntax table is a char-table (@pxref{Char-Tables}). The element at index @var{c} describes the character with code @var{c}. The element's @@ -71,7 +72,7 @@ A syntax table can inherit the data for some characters from the standard syntax table, while specifying other characters itself. The ``inherit'' syntax class means ``inherit this character's syntax from the standard syntax table.'' Just changing the standard syntax for a -characters affects all syntax tables which inherit from it. +character affects all syntax tables that inherit from it. @defun syntax-table-p object This function returns @code{t} if @var{object} is a syntax table. 
@@ -92,9 +93,11 @@ syntax table and its class in any other table. Each class is designated by a mnemonic character, which serves as the name of the class when you need to specify a class. Usually the -designator character is one that is frequently in that class; however, +designator character is one that is often assigned that class; however, its meaning as a designator is unvarying and independent of what syntax -that character currently has. +that character currently has. Thus, @samp{\} as a designator character +always gives ``escape character'' syntax, regardless of what syntax +@samp{\} currently has. @cindex syntax descriptor A syntax descriptor is a Lisp string that specifies a syntax class, a @@ -106,7 +109,7 @@ character or flags are needed, one character is sufficient. For example, the syntax descriptor for the character @samp{*} in C mode is @samp{@w{. 23}} (i.e., punctuation, matching character slot -unused, second character of a comment-starter, first character of an +unused, second character of a comment-starter, first character of a comment-ender), and the entry for @samp{/} is @samp{@w{. 14}} (i.e., punctuation, matching character slot unused, first character of a comment-starter, second character of a comment-ender). @@ -256,11 +259,11 @@ designator for this syntax code is @samp{@@}. @end deffn @deffn {Syntax class} @w{generic comment delimiter} -A @dfn{generic comment delimiter} character starts or ends a special -kind of comment. @emph{Any} generic comment delimiter matches -@emph{any} generic comment delimiter, but they cannot match a comment -starter or comment ender; generic comment delimiters can only match each -other. +A @dfn{generic comment delimiter} (designated by @samp{!}) starts +or ends a special kind of comment. @emph{Any} generic comment delimiter +matches @emph{any} generic comment delimiter, but they cannot match +a comment starter or comment ender; generic comment delimiters can only +match each other. 
This syntax class is primarily meant for use with the @code{syntax-table} text property (@pxref{Syntax Properties}). You can @@ -270,10 +273,10 @@ identifying them as generic comment delimiters. @end deffn @deffn {Syntax class} @w{generic string delimiter} -A @dfn{generic string delimiter} character starts or ends a string. -This class differs from the string quote class in that @emph{any} -generic string delimiter can match any other generic string delimiter; -but they do not match ordinary string quote characters. +A @dfn{generic string delimiter} (designated by @samp{|}) starts or ends +a string. This class differs from the string quote class in that @emph{any} +generic string delimiter can match any other generic string delimiter; but +they do not match ordinary string quote characters. This syntax class is primarily meant for use with the @code{syntax-table} text property (@pxref{Syntax Properties}). You can @@ -287,18 +290,19 @@ identifying them as generic string delimiters. @cindex syntax flags In addition to the classes, entries for characters in a syntax table -can specify flags. There are six possible flags, represented by the -characters @samp{1}, @samp{2}, @samp{3}, @samp{4}, @samp{b} and -@samp{p}. - - All the flags except @samp{p} are used to describe multi-character -comment delimiters. The digit flags indicate that a character can -@emph{also} be part of a comment sequence, in addition to the syntactic -properties associated with its character class. The flags are -independent of the class and each other for the sake of characters such -as @samp{*} in C mode, which is a punctuation character, @emph{and} the -second character of a start-of-comment sequence (@samp{/*}), @emph{and} -the first character of an end-of-comment sequence (@samp{*/}). +can specify flags. There are seven possible flags, represented by the +characters @samp{1}, @samp{2}, @samp{3}, @samp{4}, @samp{b}, @samp{n}, +and @samp{p}. 
+ + All the flags except @samp{n} and @samp{p} are used to describe +multi-character comment delimiters. The digit flags indicate that a +character can @emph{also} be part of a comment sequence, in addition to +the syntactic properties associated with its character class. The flags +are independent of the class and each other for the sake of characters +such as @samp{*} in C mode, which is a punctuation character, @emph{and} +the second character of a start-of-comment sequence (@samp{/*}), +@emph{and} the first character of an end-of-comment sequence +(@samp{*/}). Here is a table of the possible flags for a character @var{c}, and what they mean: @@ -369,6 +373,12 @@ This is a comment-end sequence for ``b'' style, because the newline character has the @samp{b} flag. @end table +@item +@samp{n} on a comment delimiter character specifies +that this kind of comment can be nested. For a two-character +comment delimiter, @samp{n} on either character makes it +nestable. + @item @c Emacs 19 feature @samp{p} identifies an additional ``prefix character'' for Lisp syntax. @@ -498,6 +508,18 @@ This function returns the current syntax table, which is the table for the current buffer. @end defun +@defmac with-syntax-table @var{table} @var{body}... +@tindex with-syntax-table +This macro executes @var{body} using @var{table} as the current syntax +table. It returns the value of the last form in @var{body}, after +restoring the old current syntax table. + +Since each buffer has its own current syntax table, we should make that +more precise: @code{with-syntax-table} temporarily alters the current +syntax table of whichever buffer is current at the time the macro +execution starts. Other buffers are not affected. +@end defmac + @node Syntax Properties @section Syntax Properties @kindex syntax-table @r{(text property)} @@ -517,7 +539,7 @@ occurrence of the character. @item @code{(@var{syntax-code} . 
@var{matching-char})} A cons cell of this format specifies the syntax for this -occurrence of the character. +occurrence of the character. (@pxref{Syntax Table Internals}) @item @code{nil} If the property is @code{nil}, the character's syntax is determined from @@ -525,7 +547,6 @@ the current syntax table in the usual way. @end table @defvar parse-sexp-lookup-properties -@tindex parse-sexp-lookup-properties If this is non-@code{nil}, the syntax scanning functions pay attention to syntax text properties. Otherwise they use only the current syntax table. @@ -542,6 +563,10 @@ This function moves point forward across characters having syntax classes mentioned in @var{syntaxes}. It stops when it encounters the end of the buffer, or position @var{limit} (if specified), or a character it is not supposed to skip. + +If @var{syntaxes} starts with @samp{^}, then the function skips +characters whose syntax is @emph{not} in @var{syntaxes}. + The return value is the distance traveled, which is a nonnegative integer. @end defun @@ -549,8 +574,11 @@ integer. @defun skip-syntax-backward syntaxes &optional limit This function moves point backward across characters whose syntax classes are mentioned in @var{syntaxes}. It stops when it encounters -the beginning of the buffer, or position @var{limit} (if specified), or a -character it is not supposed to skip. +the beginning of the buffer, or position @var{limit} (if specified), or +a character it is not supposed to skip. + +If @var{syntaxes} starts with @samp{^}, then the function skips +characters whose syntax is @emph{not} in @var{syntaxes}. The return value indicates the distance traveled. It is an integer that is zero or less. @@ -572,6 +600,30 @@ these functions can be used for Lisp expressions when in Lisp mode and for C expressions when in C mode. @xref{List Motion}, for convenient higher-level functions for moving over balanced expressions. 
+A syntax table only describes how each character changes the state of +the parser, rather than describing the state itself. For example, a string +delimiter character toggles the parser state between ``in-string'' and +``in-code'' but the characters inside the string do not have any particular +syntax to identify them as such. + +For example (note: 15 is the syntax-code of generic string delimiters): + +@example +(put-text-property 1 9 'syntax-table '(15 . nil)) +@end example + +does not tell Emacs that the first eight chars of the current buffer +are a string, but rather that they are all string delimiters and thus +Emacs should treat them as four adjacent empty strings. + +The state of the parser is transient (i.e. not stored in the buffer for +example). Instead, every time the parser is used, it is given not just +a starting position but a starting state. If the starting state is not +specified explicitly, Emacs assumes we are at the top level of parenthesis +structure, such as the beginning of a function definition (this is the case +for @code{forward-sexp} which blindly assumes that the starting point is in +such a state.) + @defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment This function parses a sexp in the current buffer starting at @var{start}, not scanning past @var{limit}. It stops at position @@ -629,7 +681,9 @@ string delimiter character should terminate it. @item @cindex inside comment -@code{t} if inside a comment (of either style). +@code{t} if inside a comment (of either style), +or the comment nesting level if inside a kind of comment +that can be nested. @item @cindex quote character @@ -691,6 +745,14 @@ signaled. If it reaches the beginning or end between groupings but before count is used up, @code{nil} is returned. 
@end defun +@defvar multibyte-syntax-as-symbol +@tindex multibyte-syntax-as-symbol +If this variable is non-@code{nil}, @code{scan-sexps} treats all +non-@sc{ascii} characters as symbol constituents regardless +of what the syntax table says about them. (However, text properties +can still override the syntax.) +@end defvar + @defvar parse-sexp-ignore-comments @cindex skipping comments If the value is non-@code{nil}, then comments are treated as @@ -707,10 +769,19 @@ You can use @code{forward-comment} to move forward or backward over one comment or several comments. @defun forward-comment count -This function moves point forward across @var{count} comments (backward, -if @var{count} is negative). If it finds anything other than a comment -or whitespace, it stops, leaving point at the place where it stopped. -It also stops after satisfying @var{count}. +This function moves point forward across @var{count} complete comments +(that is, including the starting delimiter and the terminating +delimiter if any), plus any whitespace encountered on the way. It +moves backward if @var{count} is negative. If it encounters anything +other than a comment or whitespace, it stops, leaving point at the +place where it stopped. This includes (for instance) finding the end +of a comment when moving forward and expecting the beginning of one. +The function also stops immediately after moving over the specified +number of complete comments. + +This function cannot tell whether the ``comments'' it traverses are +embedded within a string. If they look like comments, it treats them +as comments. @end defun To move forward over all comments and whitespace following point, use @@ -750,7 +821,8 @@ function.) Lisp programs don't usually work with the elements directly; the Lisp-level syntax table functions usually work with syntax descriptors (@pxref{Syntax Descriptors}). Nonetheless, here we document the -internal format. +internal format. 
This format is used mostly when manipulating +syntax properties. Each element of a syntax table is a cons cell of the form @code{(@var{syntax-code} . @var{matching-char})}. The @sc{car}, @@ -803,10 +875,10 @@ to each syntactic type. @tab 9 @ @ escape @tab -14 @ @ comment-fence +14 @ @ generic comment @item @tab -15 @ string-fence +15 @ generic string @end multitable For example, the usual syntax value for @samp{(} is @code{(4 . 41)}. @@ -828,18 +900,26 @@ corresponds to each syntax flag. @tab @samp{1} @ @ @code{(lsh 1 16)} @tab -@samp{3} @ @ @code{(lsh 1 18)} +@samp{4} @ @ @code{(lsh 1 19)} @tab -@samp{p} @ @ @code{(lsh 1 20)} +@samp{b} @ @ @code{(lsh 1 21)} @item @tab @samp{2} @ @ @code{(lsh 1 17)} @tab -@samp{4} @ @ @code{(lsh 1 19)} +@samp{p} @ @ @code{(lsh 1 20)} @tab -@samp{b} @ @ @code{(lsh 1 21)} +@samp{n} @ @ @code{(lsh 1 22)} +@item +@tab +@samp{3} @ @ @code{(lsh 1 18)} @end multitable +@defun string-to-syntax @var{desc} +This function returns the internal form @code{(@var{syntax-code} . +@var{matching-char})} corresponding to the syntax descriptor @var{desc}. +@end defun + @node Categories @section Categories @cindex categories of characters @@ -856,7 +936,7 @@ category table defines its own categories, but normally these are initialized by copying from the standard categories table, so that the standard categories are available in all modes. - Each category has a name, which is an @sc{ASCII} printing character in + Each category has a name, which is an @sc{ascii} printing character in the range @w{@samp{ }} to @samp{~}. You specify the name of a category when you define it with @code{define-category}. @@ -918,6 +998,13 @@ This function makes @var{table} the category table for the current buffer. It returns @var{table}. @end defun +@defun make-category-table +@tindex make-category-table +This creates and returns an empty category table. In an empty category +table, no categories have been allocated, and no characters belong to +any categories. 
+@end defun + @defun make-category-set categories This function returns a new category set---a bool-vector---whose initial contents are the categories listed in the string @var{categories}. The @@ -946,7 +1033,8 @@ the category table. @defun category-set-mnemonics category-set This function converts the category set @var{category-set} into a string -containing the names of all the categories that are members of the set. +containing the characters that designate the categories that are members +of the set. @example (category-set-mnemonics (char-category-set ?a)) @@ -963,3 +1051,9 @@ Normally, it modifies the category set by adding @var{category} to it. But if @var{reset} is non-@code{nil}, then it deletes @var{category} instead. @end defun + +@deffn Command describe-categories +This function describes the category specifications in the current +category table. The descriptions are inserted in a buffer, which is +then displayed. +@end deffn
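The functions this patch documents can be tried together. The following is a minimal sketch in Emacs Lisp, assuming an Emacs that already contains these additions (`string-to-syntax`, `with-syntax-table`). The helper name `my-leading-word-length` is hypothetical and is not part of Emacs or of the patch.

```elisp
;; Sketch only: assumes an Emacs that provides the functions documented
;; in the patch above (`string-to-syntax', `with-syntax-table').
;; `my-leading-word-length' is a hypothetical helper, not part of Emacs.

;; Internal form of a syntax descriptor.  The manual text in the patch
;; gives (4 . 41) as the usual syntax value for `('.
(string-to-syntax "()")

;; Punctuation class with flags `2' and `3', the descriptor the manual
;; quotes for `*' in C mode.  Per the flag table in the patch, the car
;; combines the class code with (lsh 1 17) and (lsh 1 18).
(string-to-syntax ". 23")

(defun my-leading-word-length (string table)
  "Return how many leading characters of STRING have word syntax in TABLE."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    ;; TABLE is the current syntax table only inside this form; the
    ;; temporary buffer's previous syntax table is restored afterwards.
    (with-syntax-table table
      ;; `skip-syntax-forward' returns the distance traveled,
      ;; a nonnegative integer.
      (skip-syntax-forward "w"))))

;; Example call, using the standard syntax table.
(my-leading-word-length "foo bar" (standard-syntax-table))
```

Passing "^w" instead of "w" to `skip-syntax-forward` would skip the characters whose syntax is not word-constituent, as the patch describes for a leading `^`.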