@c -*-texinfo-*-
@c This is part of the GNU Emacs Lisp Reference Manual.
@c Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1998, 1999, 2002, 2003,
-@c 2004, 2005 Free Software Foundation, Inc.
+@c 2004, 2005, 2006 Free Software Foundation, Inc.
@c See the file elisp.texi for copying conditions.
@setfilename ../info/objects
@node Lisp Data Types, Numbers, Introduction, Top
In most cases, an object's printed representation is also a read
syntax for the object. However, some types have no read syntax, since
it does not make sense to enter objects of these types as constants in
-a Lisp program. These objects are printed in @dfn{hash notation}: the
-characters @samp{#<} followed by a descriptive string (typically the
-type name followed by the name of the object), and closed with a
-matching @samp{>}. For example:
+a Lisp program. These objects are printed in @dfn{hash notation},
+which consists of the characters @samp{#<}, a descriptive string
+(typically the type name followed by the name of the object), and a
+closing @samp{>}. For example:
@example
(current-buffer)
vertical tab, formfeed, space, return, del, and escape as @samp{?\a},
@samp{?\b}, @samp{?\t}, @samp{?\n}, @samp{?\v}, @samp{?\f},
@samp{?\s}, @samp{?\r}, @samp{?\d}, and @samp{?\e}, respectively.
-Thus,
+(@samp{?\s} followed by a dash has a different meaning---it applies
+the ``super'' modifier to the following character.) Thus,
@example
?\a @result{} 7 ; @r{control-g, @kbd{C-g}}
These sequences which start with backslash are also known as
@dfn{escape sequences}, because backslash plays the role of an
``escape character''; this terminology has nothing to do with the
-character @key{ESC}. @samp{\s} is meant for use only in character
+character @key{ESC}. @samp{\s} is meant for use in character
constants; in string constants, just write the space.
@cindex control characters
bit values are 2**22 for alt, 2**23 for super and 2**24 for hyper.
@end ifnottex
+@cindex unicode character escape
+ Emacs provides a syntax for specifying characters by their Unicode
+code points. @code{?\u@var{nnnn}} represents a character that maps to
+the Unicode code point @samp{U+@var{nnnn}}. There is a slightly
+different syntax for specifying characters with code points above
+@code{#xFFFF}; @code{\U00@var{nnnnnn}} represents the character whose
+Unicode code point is @samp{U+@var{nnnnnn}}, if such a character
+is supported by Emacs.
+
+ This peculiar and inconvenient syntax was adopted for compatibility
+with other programming languages. Unlike some other languages, Emacs
+Lisp supports this syntax in only character literals and strings.
+
@cindex @samp{\} in character constant
@cindex backslash in character constant
@cindex octal character code
@dfn{atoms}.
@cindex parenthesis
+@cindex @samp{(@dots{})} in lists
The read syntax and printed representation for lists are identical, and
consist of a left parenthesis, an arbitrary number of elements, and a
right parenthesis. Here are examples of lists:
@end group
@end smallexample
-@cindex @samp{(@dots{})} in lists
@cindex @code{nil} in lists
@cindex empty list
A list with no elements in it is the @dfn{empty list}; it is identical
@end group
@end example
- The same list represented in the first box notation looks like this:
+ The same list represented in the second box notation looks like this:
@example
@group
@dfn{Dotted pair notation} is a general syntax for cons cells that
represents the @sc{car} and @sc{cdr} explicitly. In this syntax,
@code{(@var{a} .@: @var{b})} stands for a cons cell whose @sc{car} is
-the object @var{a}, and whose @sc{cdr} is the object @var{b}. Dotted
+the object @var{a} and whose @sc{cdr} is the object @var{b}. Dotted
pair notation is more general than list syntax because the @sc{cdr}
does not have to be a list. However, it is more cumbersome in cases
where list syntax would work. In dotted pair notation, the list
type of array has its own read syntax; see the following sections for
details.
- The array type is contained in the sequence type and
-contains the string type, the vector type, the bool-vector type, and the
-char-table type.
+ The array type is a subset of the sequence type, and contains the
+string type, the vector type, the bool-vector type, and the char-table
+type.
@node String Type
@subsection String Type
string (even for an @acronym{ASCII} character) forces the string to be
multibyte.
+ You can also specify characters in a string by their numeric values
+in Unicode, using @samp{\u} and @samp{\U} (@pxref{Character Type}).
+
@xref{Text Representations}, for more information about the two
text representations.
A hash table is a very fast kind of lookup table, somewhat like an
alist in that it maps keys to corresponding values, but much faster.
-Hash tables have no read syntax, and
-print using hash notation. @xref{Hash Tables}.
+Hash tables have no read syntax, and print using hash notation.
+@xref{Hash Tables}, for functions that operate on hash tables.
@example
(make-hash-table)
@node Type Predicates
@section Type Predicates
-@cindex predicates
@cindex type checking
@kindex wrong-type-argument
@item windowp
@xref{Basic Windows, windowp}.
+
+@item booleanp
+@xref{nil and t, booleanp}.
+
+@item string-or-null-p
+@xref{Predicates for Strings, string-or-null-p}.
@end table
The most general way to check the type of an object is to call the
@end group
@end example
+@cindex equality of strings
Comparison of strings is case-sensitive, but does not take account of
text properties---it compares only the characters in the strings. For
technical reasons, a unibyte string and a multibyte string are