X-Git-Url: https://code.delx.au/gnu-emacs/blobdiff_plain/6cbdd38befb162339ca946e07b2484f6433af3d3..03da5d089a8ed035cec443a27259e7d21487a22e:/lispref/syntax.texi

diff --git a/lispref/syntax.texi b/lispref/syntax.texi
index 0d7c1cd036..5cde2badab 100644
--- a/lispref/syntax.texi
+++ b/lispref/syntax.texi
@@ -1,7 +1,7 @@
 @c -*-texinfo-*-
 @c This is part of the GNU Emacs Lisp Reference Manual.
-@c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998, 1999
-@c Free Software Foundation, Inc.
+@c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998, 1999, 2002, 2003,
+@c 2004, 2005 Free Software Foundation, Inc.
 @c See the file elisp.texi for copying conditions.
 @setfilename ../info/syntax
 @node Syntax Tables, Abbrevs, Searching and Matching, Top
@@ -135,10 +135,10 @@ modes.
 @end deffn

 @deffn {Syntax class} @w{word constituent}
-@dfn{Word constituents} (designated by @samp{w}) are parts of normal
-English words and are typically used in variable and command names in
-programs. All upper- and lower-case letters, and the digits, are typically
-word constituents.
+@dfn{Word constituents} (designated by @samp{w}) are parts of words in
+human languages, and are typically used in variable and command names
+in programs. All upper- and lower-case letters, and the digits, are
+typically word constituents.
 @end deffn

 @deffn {Syntax class} @w{symbol constituent}
@@ -155,9 +155,10 @@ character that is valid in symbols is underscore (@samp{_}).
 @dfn{Punctuation characters} (designated by @samp{.}) are those
 characters that are used as punctuation in English, or are used in some
 way in a programming language to separate symbols from one another.
-Most programming language modes, including Emacs Lisp mode, have no
+Some programming language modes, such as Emacs Lisp mode, have no
 characters in this class since the few characters that are not symbol or
-word constituents all have other uses.
+word constituents all have other uses. Other programming language modes,
+such as C mode, use punctuation syntax for operators.
 @end deffn

 @deffn {Syntax class} @w{open parenthesis character}
@@ -255,7 +256,7 @@ English text has no comment characters. In Lisp, the semicolon
 @deffn {Syntax class} @w{inherit}
 This syntax class does not specify a particular syntax. It says to look
 in the standard syntax table to find the syntax of this character. The
-designator for this syntax code is @samp{@@}.
+designator for this syntax class is @samp{@@}.
 @end deffn

 @deffn {Syntax class} @w{generic comment delimiter}
@@ -384,7 +385,7 @@ nestable.
 @samp{p} identifies an additional ``prefix character'' for Lisp syntax.
 These characters are treated as whitespace when they appear between
 expressions. When they appear within an expression, they are handled
-according to their usual syntax codes.
+according to their usual syntax classes.

 The function @code{backward-prefix-chars} moves back over these
 characters, as well as over characters whose primary syntax class is
@@ -397,10 +398,13 @@ prefix (@samp{'}). @xref{Motion and Syntax}.
 In this section we describe functions for creating, accessing and
 altering syntax tables.

-@defun make-syntax-table
-This function creates a new syntax table. It inherits the syntax for
-letters and control characters from the standard syntax table. For
-other characters, the syntax is copied from the standard syntax table.
+@defun make-syntax-table &optional table
+This function creates a new syntax table, with all values initialized
+to @code{nil}. If @var{table} is non-@code{nil}, it becomes the
+parent of the new syntax table, otherwise the standard syntax table is
+the parent. Like all char-tables, a syntax table inherits from its
+parent. Thus the original syntax of all characters in the returned
+syntax table is determined by the parent. @xref{Char-Tables}.

 Most major mode syntax tables are created in this way.
 @end defun
@@ -408,7 +412,7 @@ Most major mode syntax tables are created in this way.
 @defun copy-syntax-table &optional table
 This function constructs a copy of @var{table} and returns it. If
 @var{table} is not supplied (or is @code{nil}), it returns a copy of the
-current syntax table. Otherwise, an error is signaled if @var{table} is
+standard syntax table. Otherwise, an error is signaled if @var{table} is
 not a syntax table.
 @end defun

@@ -425,7 +429,7 @@ This function always returns @code{nil}. The old syntax information in
 the table for this character is discarded.

 An error is signaled if the first character of the syntax descriptor is not
-one of the twelve syntax class designator characters. An error is also
+one of the seventeen syntax class designator characters. An error is also
 signaled if @var{char} is not a character.

 @example
@@ -433,7 +437,7 @@ signaled if @var{char} is not a character.
 @exdent @r{Examples:}

 ;; @r{Put the space character in class whitespace.}
-(modify-syntax-entry ?\ " ")
+(modify-syntax-entry ?\s " ")
 @result{} nil
 @end group

@@ -479,7 +483,7 @@ character, @samp{)}.

 @example
 @group
-(string (char-syntax ?\ ))
+(string (char-syntax ?\s))
 @result{} " "
 @end group

@@ -524,10 +528,12 @@ execution starts. Other buffers are not affected.
 @section Syntax Properties
 @kindex syntax-table @r{(text property)}

-When the syntax table is not flexible enough to specify the syntax of a
-language, you can use @code{syntax-table} text properties to override
-the syntax table for specific character occurrences in the buffer.
-@xref{Text Properties}.
+When the syntax table is not flexible enough to specify the syntax of
+a language, you can use @code{syntax-table} text properties to
+override the syntax table for specific character occurrences in the
+buffer. @xref{Text Properties}. You can use Font Lock mode to set
+@code{syntax-table} text properties. @xref{Setting Syntax
+Properties}.

 The valid values of @code{syntax-table} text property are:

@@ -559,10 +565,11 @@ table.
 have certain syntax classes.

 @defun skip-syntax-forward syntaxes &optional limit
-This function moves point forward across characters having syntax classes
-mentioned in @var{syntaxes}. It stops when it encounters the end of
-the buffer, or position @var{limit} (if specified), or a character it is
-not supposed to skip.
+This function moves point forward across characters having syntax
+classes mentioned in @var{syntaxes} (a string of syntax class
+characters). It stops when it encounters the end of the buffer, or
+position @var{limit} (if specified), or a character it is not supposed
+to skip.

 If @var{syntaxes} starts with @samp{^}, then the function skips
 characters whose syntax is @emph{not} in @var{syntaxes}.
@@ -654,67 +661,106 @@ start of a comment. If @var{stop-comment} is the symbol
 string, or the end of a comment or a string, whichever comes first.

 @cindex parse state
-The fifth argument @var{state} is a nine-element list of the same form
+The fifth argument @var{state} is a ten-element list of the same form
 as the value of this function, described below. (It is OK to omit the
-last element of the nine.) The return value of one call may be used to
-initialize the state of the parse on another call to
+last two elements of this list.) The return value of one call may be
+used to initialize the state of the parse on another call to
 @code{parse-partial-sexp}.

-The result is a list of nine elements describing the final state of
+The result is a list of ten elements describing the final state of
 the parse:

 @enumerate 0
-@item
+@item
 The depth in parentheses, counting from 0.

-@item
+@item
 @cindex innermost containing parentheses
 The character position of the start of the innermost parenthetical
 grouping containing the stopping point; @code{nil} if none.

-@item
+@item
 @cindex previous complete subexpression
 The character position of the start of the last complete subexpression
 terminated; @code{nil} if none.

-@item
+@item
 @cindex inside string
 Non-@code{nil} if inside a string. More precisely, this is the
 character that will terminate the string, or @code{t} if a generic
 string delimiter character should terminate it.

-@item
+@item
 @cindex inside comment
 @code{t} if inside a comment (of either style),
 or the comment nesting level if inside a kind of comment
 that can be nested.

-@item
+@item
 @cindex quote character
 @code{t} if point is just after a quote character.

-@item
+@item
 The minimum parenthesis depth encountered during this scan.

 @item
-What kind of comment is active: @code{nil} for a comment of style ``a'',
-@code{t} for a comment of style ``b'', and @code{syntax-table} for
-a comment that should be ended by a generic comment delimiter character.
+What kind of comment is active: @code{nil} for a comment of style
+``a'' or when not inside a comment, @code{t} for a comment of style
+``b'', and @code{syntax-table} for a comment that should be ended by a
+generic comment delimiter character.

 @item
 The string or comment start position. While inside a comment, this is
 the position where the comment began; while inside a string, this is the
 position where the string began. When outside of strings and comments,
 this element is @code{nil}.
+
+@item
+Internal data for continuing the parsing. The meaning of this
+data is subject to change; it is used if you pass this list
+as the @var{state} argument to another call.
+
 @end enumerate

-Elements 0, 3, 4, 5 and 7 are significant in the argument @var{state}.
+Elements 0, 3, 4, 5, 7 and 9 are significant in the argument
+@var{state}.

 @cindex indenting with parentheses
 This function is most often used to compute indentation for languages
 that have nested parentheses.
 @end defun

+@defun syntax-ppss &optional pos
+This function returns the state that the parser would have at position
+@var{pos}, if it were started with a default start state at the
+beginning of the buffer. Thus, it is equivalent to
+@code{(parse-partial-sexp (point-min) @var{pos})}, except that
+@code{syntax-ppss} uses a cache to speed up the computation. Also,
+the 2nd value (previous complete subexpression) and 6th value (minimum
+parenthesis depth) of the returned state are not meaningful.
+@end defun
+
+@defun syntax-ppss-flush-cache beg
+This function flushes the cache used by @code{syntax-ppss}, starting at
+position @var{beg}.
+
+When @code{syntax-ppss} is called, it automatically hooks itself
+to @code{before-change-functions} to keep its cache consistent.
+But this can fail if @code{syntax-ppss} is called while
+@code{before-change-functions} is temporarily let-bound, or if the
+buffer is modified without obeying the hook, such as when using
+@code{inhibit-modification-hooks}. For this reason, it is sometimes
+necessary to flush the cache manually.
+@end defun
+
+@defvar syntax-begin-function
+If this is non-@code{nil}, it should be a function that moves to an
+earlier buffer position where the parser state is equivalent to
+@code{nil}---in other words, a position outside of any comment,
+string, or parenthesis. @code{syntax-ppss} uses it to supplement its
+cache.
+@end defvar
+
 @defun scan-lists from count depth
 This function scans forward @var{count} balanced parenthetical groupings
 from position @var{from}. It returns the position where the scan stops.
@@ -752,22 +798,20 @@ before count is used up, @code{nil} is returned.
 @defvar multibyte-syntax-as-symbol
 @tindex multibyte-syntax-as-symbol
 If this variable is non-@code{nil}, @code{scan-sexps} treats all
-non-@sc{ascii} characters as symbol constituents regardless
+non-@acronym{ASCII} characters as symbol constituents regardless
 of what the syntax table says about them. (However, text properties can
 still override the syntax.)
 @end defvar

-@defvar parse-sexp-ignore-comments
+@defopt parse-sexp-ignore-comments
 @cindex skipping comments
 If the value is non-@code{nil}, then comments are treated as
 whitespace by the functions in this section and by @code{forward-sexp}.
+@end defopt

-In older Emacs versions, this feature worked only when the comment
-terminator is something like @samp{*/}, and appears only to end a
-comment. In languages where newlines terminate comments, it was
-necessary make this variable @code{nil}, since not every newline is the
-end of a comment. This limitation no longer exists.
-@end defvar
+@vindex parse-sexp-lookup-properties
+The behavior of @code{parse-partial-sexp} is also affected by
+@code{parse-sexp-lookup-properties} (@pxref{Syntax Properties}).

 You can use @code{forward-comment} to move forward or backward over
 one comment or several comments.
@@ -781,7 +825,9 @@ other than a comment or whitespace, it stops, leaving point at the place
 where it stopped. This includes (for instance) finding the end of a
 comment when moving forward and expecting the beginning of one. The
 function also stops immediately after moving over the specified
-number of complete comments.
+number of complete comments. If @var{count} comments are found as
+expected, with nothing except whitespace between them, it returns
+@code{t}; otherwise it returns @code{nil}.

 This function cannot tell whether the ``comments'' it traverses are
 embedded within a string. If they look like comments, it treats them
@@ -837,7 +883,7 @@ a character to match was specified.
 This table gives the value of @var{syntax-code} which corresponds
 to each syntactic type.

-@multitable @columnfractions .05 .3 .3 .3
+@multitable @columnfractions .05 .3 .3 .31
 @item
 @tab
 @i{Integer} @i{Class}
@@ -924,6 +970,30 @@ This function returns the internal form @code{(@var{syntax-code} .
 @var{matching-char})} corresponding to the syntax descriptor @var{desc}.
 @end defun

+@defun syntax-after pos
+This function returns the syntax code of the character in the buffer
+after position @var{pos}, taking account of syntax properties as well
+as the syntax table. If @var{pos} is outside the buffer's accessible
+portion (@pxref{Narrowing, accessible portion}), this function returns
+@code{nil}.
+@end defun
+
+@defun syntax-class syntax
+This function returns the syntax class of the syntax code
+@var{syntax}. (It masks off the high 16 bits that hold the flags
+encoded in the syntax descriptor.) If @var{syntax} is @code{nil}, it
+returns @code{nil}; this is so evaluating the expression
+
+@example
+(syntax-class (syntax-after pos))
+@end example
+
+@noindent
+where @code{pos} is outside the buffer's accessible portion, will
+yield @code{nil} without throwing errors or producing wrong syntax
+class codes.
+@end defun
+
 @node Categories
 @section Categories
 @cindex categories of characters
@@ -940,7 +1010,7 @@ category table defines its own categories, but normally these are
 initialized by copying from the standard categories table, so that the
 standard categories are available in all modes.

- Each category has a name, which is an @sc{ascii} printing character in
+ Each category has a name, which is an @acronym{ASCII} printing character in
 the range @w{@samp{ }} to @samp{~}. You specify the name of a category
 when you define it with @code{define-category}.

@@ -951,12 +1021,12 @@ belongs to. In this category set, if the element at index @var{cat} is
 @code{t}, that means category @var{cat} is a member of the set, and that
 character @var{c} belongs to category @var{cat}.

+For the next three functions, the optional argument @var{table}
+defaults to the current buffer's category table.
+
 @defun define-category char docstring &optional table
 This function defines a new category, with name @var{char} and
-documentation @var{docstring}.
-
-The new category is defined for category table @var{table}, which
-defaults to the current buffer's category table.
+documentation @var{docstring}, for the category table @var{table}.
 @end defun

 @defun category-docstring category &optional table
@@ -971,7 +1041,7 @@ in category table @var{table}.
 @end example
 @end defun

-@defun get-unused-category table
+@defun get-unused-category &optional table
 This function returns a category name (a character) which is not
 currently defined in @var{table}. If all possible categories are in use
 in @var{table}, it returns @code{nil}.
@@ -993,7 +1063,7 @@ This function returns the standard category table.
 @defun copy-category-table &optional table
 This function constructs a copy of @var{table} and returns it. If
 @var{table} is not supplied (or is @code{nil}), it returns a copy of the
-current category table. Otherwise, an error is signaled if @var{table}
+standard category table. Otherwise, an error is signaled if @var{table}
 is not a category table.
 @end defun

@@ -1023,11 +1093,11 @@ other categories.
 @end defun

 @defun char-category-set char
-This function returns the category set for character @var{char}. This
-is the bool-vector which records which categories the character
-@var{char} belongs to. The function @code{char-category-set} does not
-allocate storage, because it returns the same bool-vector that exists in
-the category table.
+This function returns the category set for character @var{char} in the
+current buffer's category table. This is the bool-vector which
+records which categories the character @var{char} belongs to. The
+function @code{char-category-set} does not allocate storage, because
+it returns the same bool-vector that exists in the category table.

 @example
 (char-category-set ?a)
@@ -1056,8 +1126,13 @@ But if @var{reset} is non-@code{nil}, then it deletes @var{category}
 instead.
 @end defun

-@deffn Command describe-categories
+@deffn Command describe-categories &optional buffer-or-name
 This function describes the category specifications in the current
-category table. The descriptions are inserted in a buffer, which is
-then displayed.
+category table. It inserts the descriptions in a buffer, and then
+displays that buffer. If @var{buffer-or-name} is non-@code{nil}, it
+describes the category table of that buffer instead.
 @end deffn
+
+@ignore
+ arch-tag: 4d914e96-0283-445c-9233-75d33662908c
+@end ignore
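
Illustrative sketch, not part of the patch: the Emacs Lisp below exercises several of the functions whose documentation this diff touches (`make-syntax-table` with its default parent, `modify-syntax-entry`, `syntax-ppss`, `syntax-after` and `syntax-class`). The function name `my-demo-syntax-changes` and the sample buffer text are hypothetical, chosen only for the demonstration, and the sketch assumes a reasonably recent Emacs where all of these functions are available.

```elisp
;; Sketch only: `my-demo-syntax-changes' is a hypothetical name,
;; not something defined by Emacs or by this patch.
(defun my-demo-syntax-changes ()
  "Return a list of values illustrating the documented syntax functions."
  (with-temp-buffer
    ;; With no argument, the new table's parent is the standard syntax
    ;; table, so every character starts out with its standard syntax.
    (let ((table (make-syntax-table)))
      ;; Override one entry: make underscore a symbol constituent.
      (modify-syntax-entry ?_ "_" table)
      (set-syntax-table table)
      ;; Sample text (an assumption for the demo): an unclosed list
      ;; containing an unclosed string, so point ends up inside both.
      (insert "(foo \"bar")
      (let ((state (syntax-ppss (point))))
        (list
         ;; Element 0 of the parse state: depth in parentheses.
         (nth 0 state)
         ;; Element 3: non-nil inside a string (the terminating character).
         (nth 3 state)
         ;; Syntax class of the first character, the open parenthesis.
         (syntax-class (syntax-after (point-min)))
         ;; Outside the accessible portion `syntax-after' returns nil,
         ;; and `syntax-class' passes that nil through.
         (syntax-class (syntax-after (1+ (point-max)))))))))
```

Evaluating `(my-demo-syntax-changes)` should give a depth of 1, the character code of the double quote (34) for the string element, the open-parenthesis class code (4), and `nil` for the out-of-range position. Calling `syntax-ppss` rather than `(parse-partial-sexp (point-min) (point))` is a design choice for repeated queries, since the patch text notes that `syntax-ppss` caches its results.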