@c -*-texinfo-*-
@c This is part of the GNU Emacs Lisp Reference Manual.
-@c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998 Free Software Foundation, Inc.
+@c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998, 1999
+@c Free Software Foundation, Inc.
@c See the file elisp.texi for copying conditions.
@setfilename ../info/syntax
@node Syntax Tables, Abbrevs, Searching and Matching, Top
@node Syntax Basics
@section Syntax Table Concepts
-@ifinfo
+@ifnottex
A @dfn{syntax table} provides Emacs with the information that
determines the syntactic use of each character in a buffer. This
information is used by the parsing commands, the complex movement
the meaning of the word motion functions (@pxref{Word Motion}) and the
list motion functions (@pxref{List Motion}) as well as the functions in
this chapter.
-@end ifinfo
+@end ifnottex
A syntax table is a char-table (@pxref{Char-Tables}). The element at
index @var{c} describes the character with code @var{c}. The element's
@end deffn
@deffn {Syntax class} @w{generic comment delimiter}
-A @dfn{generic comment delimiter} character starts or ends a special
-kind of comment. @emph{Any} generic comment delimiter matches
-@emph{any} generic comment delimiter, but they cannot match a comment
-starter or comment ender; generic comment delimiters can only match each
-other.
+A @dfn{generic comment delimiter} (designated by @samp{!}) starts
+or ends a special kind of comment. @emph{Any} generic comment delimiter
+matches @emph{any} generic comment delimiter, but they cannot match
+a comment starter or comment ender; generic comment delimiters can only
+match each other.
This syntax class is primarily meant for use with the
@code{syntax-table} text property (@pxref{Syntax Properties}). You can
@end deffn
@deffn {Syntax class} @w{generic string delimiter}
-A @dfn{generic string delimiter} character starts or ends a string.
-This class differs from the string quote class in that @emph{any}
-generic string delimiter can match any other generic string delimiter;
-but they do not match ordinary string quote characters.
+A @dfn{generic string delimiter} (designated by @samp{|}) starts or ends
+a string. This class differs from the string quote class in that @emph{any}
+generic string delimiter can match any other generic string delimiter; but
+they do not match ordinary string quote characters.
This syntax class is primarily meant for use with the
@code{syntax-table} text property (@pxref{Syntax Properties}). You can
@cindex syntax flags
In addition to the classes, entries for characters in a syntax table
-can specify flags. There are six possible flags, represented by the
-characters @samp{1}, @samp{2}, @samp{3}, @samp{4}, @samp{b} and
-@samp{p}.
-
- All the flags except @samp{p} are used to describe multi-character
-comment delimiters. The digit flags indicate that a character can
-@emph{also} be part of a comment sequence, in addition to the syntactic
-properties associated with its character class. The flags are
-independent of the class and each other for the sake of characters such
-as @samp{*} in C mode, which is a punctuation character, @emph{and} the
-second character of a start-of-comment sequence (@samp{/*}), @emph{and}
-the first character of an end-of-comment sequence (@samp{*/}).
+can specify flags. There are seven possible flags, represented by the
+characters @samp{1}, @samp{2}, @samp{3}, @samp{4}, @samp{b}, @samp{n},
+and @samp{p}.
+
+ All the flags except @samp{n} and @samp{p} are used to describe
+multi-character comment delimiters. The digit flags indicate that a
+character can @emph{also} be part of a comment sequence, in addition to
+the syntactic properties associated with its character class. The flags
+are independent of the class and each other for the sake of characters
+such as @samp{*} in C mode, which is a punctuation character, @emph{and}
+the second character of a start-of-comment sequence (@samp{/*}),
+@emph{and} the first character of an end-of-comment sequence
+(@samp{*/}).
Here is a table of the possible flags for a character @var{c},
and what they mean:
character has the @samp{b} flag.
@end table
+@item
+@samp{n} on a comment delimiter character specifies
+that this kind of comment can be nested. For a two-character
+comment delimiter, @samp{n} on either character makes it
+nestable.
+
@item
@c Emacs 19 feature
@samp{p} identifies an additional ``prefix character'' for Lisp syntax.
the current buffer.
@end defun
+@defmac with-syntax-table @var{table} @var{body}...
+@tindex with-syntax-table
+This macro executes @var{body} using @var{table} as the current syntax
+table. It returns the value of the last form in @var{body}, after
+restoring the old current syntax table.
+
+Since each buffer has its own current syntax table, we should make that
+more precise: @code{with-syntax-table} temporarily alters the current
+syntax table of whichever buffer is current at the time the macro
+execution starts. Other buffers are not affected.
+@end defmac
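+
+As a minimal sketch of how this might be used (the buffer text and the
+choice of @samp{_} are illustrative assumptions, not taken from any
+particular mode), the following form treats @samp{_} as a word
+constituent only for the duration of one word motion:
+
+@example
+(with-temp-buffer
+  (insert "foo_bar baz")
+  (goto-char (point-min))
+  ;; Hypothetical table: inherits from the standard syntax table,
+  ;; except that `_' is given word syntax.
+  (let ((table (make-syntax-table)))
+    (modify-syntax-entry ?_ "w" table)
+    (with-syntax-table table
+      (forward-word 1)))
+  (point))
+     @result{} 8
+@end example
+
+With the standard syntax table, where @samp{_} is a symbol constituent,
+the same motion would stop after @samp{foo}.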
+
@node Syntax Properties
@section Syntax Properties
@kindex syntax-table @r{(text property)}
@item @code{(@var{syntax-code} . @var{matching-char})}
A cons cell of this format specifies the syntax for this
-occurrence of the character.
+occurrence of the character. @xref{Syntax Table Internals}.
@item @code{nil}
If the property is @code{nil}, the character's syntax is determined from
@end table
@defvar parse-sexp-lookup-properties
-@tindex parse-sexp-lookup-properties
If this is non-@code{nil}, the syntax scanning functions pay attention
to syntax text properties. Otherwise they use only the current syntax
table.
for C expressions when in C mode. @xref{List Motion}, for convenient
higher-level functions for moving over balanced expressions.
+A syntax table describes only how each character changes the state of
+the parser; it does not describe the state itself. For example, a string
+delimiter character toggles the parser state between ``in-string'' and
+``in-code'', but the characters inside the string do not have any
+particular syntax to identify them as such.
+
+For example (note: 15 is the syntax-code of generic string delimiters):
+
+@example
+(put-text-property 1 9 'syntax-table '(15 . nil))
+@end example
+
+does not tell Emacs that the first eight characters of the current
+buffer are a string, but rather that they are all string delimiters, and
+thus Emacs should treat them as four adjacent empty strings.
+
+The state of the parser is transient (that is, it is not stored in the
+buffer). Instead, every time the parser is used, it is given not just
+a starting position but also a starting state. If the starting state is
+not specified explicitly, Emacs assumes we are at the top level of
+parenthesis structure, such as the beginning of a function definition.
+(This is the case for @code{forward-sexp}, which blindly assumes that
+the starting point is in such a state.)
+
@defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment
This function parses a sexp in the current buffer starting at
@var{start}, not scanning past @var{limit}. It stops at position
@item
@cindex inside comment
-@code{t} if inside a comment (of either style).
+@code{t} if inside a comment (of either style),
+or the comment nesting level if inside a kind of comment
+that can be nested.
@item
@cindex quote character
before count is used up, @code{nil} is returned.
@end defun
+@defvar multibyte-syntax-as-symbol
+@tindex multibyte-syntax-as-symbol
+If this variable is non-@code{nil}, @code{scan-sexps} treats all
+non-@sc{ascii} characters as symbol constituents regardless
+of what the syntax table says about them. (However, text properties
+can still override the syntax.)
+@end defvar
+
@defvar parse-sexp-ignore-comments
@cindex skipping comments
If the value is non-@code{nil}, then comments are treated as
one comment or several comments.
@defun forward-comment count
-This function moves point forward across @var{count} comments (backward,
-if @var{count} is negative). If it finds anything other than a comment
-or whitespace, it stops, leaving point at the place where it stopped.
-It also stops after satisfying @var{count}.
+This function moves point forward across @var{count} complete comments
+(that is, including the starting delimiter and the terminating
+delimiter if any), plus any whitespace encountered on the way. It
+moves backward if @var{count} is negative. If it encounters anything
+other than a comment or whitespace, it stops, leaving point at the
+place where it stopped. This includes (for instance) finding the end
+of a comment when moving forward and expecting the beginning of one.
+The function also stops immediately after moving over the specified
+number of complete comments.
+
+This function cannot tell whether the ``comments'' it traverses are
+embedded within a string. If they look like comments, it treats them
+as comments.
@end defun
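+
+The following sketch illustrates the behavior described above. It
+assumes the Emacs Lisp syntax table (@code{emacs-lisp-mode-syntax-table}),
+in which @samp{;} starts a comment and a newline ends it; the buffer
+text is made up for the example.
+
+@example
+(with-temp-buffer
+  (set-syntax-table emacs-lisp-mode-syntax-table)
+  (insert ";; one\n;; two\n(foo)")
+  (goto-char (point-min))
+  ;; Move over two complete comments, then report where point ended up.
+  (list (forward-comment 2) (point)))
+     @result{} (t 15)
+@end example
+
+Point is left just before @samp{(foo)}, immediately after the newline
+that terminates the second comment.
+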
To move forward over all comments and whitespace following point, use
Lisp programs don't usually work with the elements directly; the
Lisp-level syntax table functions usually work with syntax descriptors
(@pxref{Syntax Descriptors}). Nonetheless, here we document the
-internal format.
+internal format. This format is used mostly when manipulating
+syntax properties.
Each element of a syntax table is a cons cell of the form
@code{(@var{syntax-code} . @var{matching-char})}. The @sc{car},
@tab
9 @ @ escape
@tab
-14 @ @ comment-fence
+14 @ @ generic comment
@item
@tab
-15 @ string-fence
+15 @ generic string
@end multitable
For example, the usual syntax value for @samp{(} is @code{(4 . 41)}.
@tab
@samp{1} @ @ @code{(lsh 1 16)}
@tab
-@samp{3} @ @ @code{(lsh 1 18)}
+@samp{4} @ @ @code{(lsh 1 19)}
@tab
-@samp{p} @ @ @code{(lsh 1 20)}
+@samp{b} @ @ @code{(lsh 1 21)}
@item
@tab
@samp{2} @ @ @code{(lsh 1 17)}
@tab
-@samp{4} @ @ @code{(lsh 1 19)}
+@samp{p} @ @ @code{(lsh 1 20)}
@tab
-@samp{b} @ @ @code{(lsh 1 21)}
+@samp{n} @ @ @code{(lsh 1 22)}
+@item
+@tab
+@samp{3} @ @ @code{(lsh 1 18)}
@end multitable
+@defun string-to-syntax @var{desc}
+This function returns the internal form @code{(@var{syntax-code} .
+@var{matching-char})} corresponding to the syntax descriptor @var{desc}.
+@end defun
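+
+For illustration, and assuming the syntax codes and flag bits tabulated
+above (punctuation is code 1), a few conversions might look like this:
+
+@example
+(string-to-syntax "()")
+     @result{} (4 . 41)
+(string-to-syntax "|")
+     @result{} (15)
+;; Punctuation carrying the `2' and `3' flags, as for `*' in C mode.
+(equal (string-to-syntax ". 23")
+       (list (logior 1 (lsh 1 17) (lsh 1 18))))
+     @result{} t
+@end example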
+
@node Categories
@section Categories
@cindex categories of characters
buffer. It returns @var{table}.
@end defun
+@defun make-category-table
+@tindex make-category-table
+This creates and returns an empty category table. In an empty category
+table, no categories have been allocated, and no characters belong to
+any categories.
+@end defun
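+
+As a rough sketch, an empty table can be populated and then installed
+in a buffer as follows. The category letter @samp{X}, its documentation
+string, and the choice of the character @samp{a} are hypothetical,
+chosen only for this example.
+
+@example
+(let ((table (make-category-table)))
+  ;; Allocate a category in the new table, then put `a' in it.
+  (define-category ?X "Hypothetical example category." table)
+  (modify-category-entry ?a ?X table)
+  (with-temp-buffer
+    (set-category-table table)
+    (category-set-mnemonics (char-category-set ?a))))
+     @result{} "X"
+@end example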
+
@defun make-category-set categories
This function returns a new category set---a bool-vector---whose initial
contents are the categories listed in the string @var{categories}. The
@defun category-set-mnemonics category-set
This function converts the category set @var{category-set} into a string
-containing the names of all the categories that are members of the set.
+containing the characters that designate the categories that are members
+of the set.
@example
(category-set-mnemonics (char-category-set ?a))
But if @var{reset} is non-@code{nil}, then it deletes @var{category}
instead.
@end defun
+
+@deffn Command describe-categories
+This command describes the category specifications in the current
+category table. The descriptions are inserted in a buffer, which is
+then displayed.
+@end deffn