* basic.texi (Inserting Text): Document ucs-insert.

author Chong Yidong <cyd@stupidchicken.com>

Wed, 6 May 2009 03:55:12 +0000 (03:55 +0000)

committer Chong Yidong <cyd@stupidchicken.com>

Wed, 6 May 2009 03:55:12 +0000 (03:55 +0000)
diff --git a/doc/emacs/ChangeLog b/doc/emacs/ChangeLog

index fc2c277972c40aacd582f4e6ba387aeaa59d5b53..29be8c714d3c121a5a6ca683c945153a65524be4 100644 (file)
--- a/doc/emacs/ChangeLog
+++ b/doc/emacs/ChangeLog
@@ -1,3 +1,19 @@
+2009-05-06  Chong Yidong  <cyd@stupidchicken.com>
+
+       * basic.texi (Inserting Text): Document ucs-insert.
+
+       * mule.texi (International Chars): Define "multibyte".  Note that
+       internal representation is Unicode-based.  Simplify definition of raw
+       bytes.  Mention ucs-insert.
+       (Enabling Multibyte): Remove obsolete discussion.  Copyedits.
+       (Language Environments): Add language environments new to Emacs 23.
+       (Multibyte Conversion): Node deleted.
+       (Coding Systems): Remove obsolete unify-8859-on-decoding-mode.  Don't
+       mention obsolete emacs-mule coding system.
+       (Output Coding): Copyedits.
+
+       * emacs.texi (Top): Update node listing.
+
  2009-05-05  Per Starbäck  <per@starback.se>  (tiny change)
  
         * trouble.texi (Lossage): Use new binding of view-emacs-problems.
diff --git a/doc/emacs/basic.texi b/doc/emacs/basic.texi

index 710a093f495f137fe3b2d4e1fa28a055d34630d9..72ab17c33ac008dd79d7aae173849770df4df07f 100644 (file)
--- a/doc/emacs/basic.texi
+++ b/doc/emacs/basic.texi
@@ -64,9 +64,11 @@ key; other keys act as editing commands and do not insert themselves.
  For instance, @kbd{DEL} runs the command @code{delete-backward-char}
  by default (some modes bind it to a different command); it does not
  insert a literal @samp{DEL} character (@acronym{ASCII} character code
-127).  To insert a non-graphic character, first @dfn{quote} it by
-typing @kbd{C-q} (@code{quoted-insert}).  There are two ways to use
-@kbd{C-q}:
+127).
+
+  To insert a non-graphic character, or a character that your keyboard
+does not support, first @dfn{quote} it by typing @kbd{C-q}
+(@code{quoted-insert}).  There are two ways to use @kbd{C-q}:
  
  @itemize @bullet
  @item
@@ -87,32 +89,24 @@ Overwrite mode, to give you a convenient way to insert a digit instead
  of overwriting with it.
  @end itemize
  
-@cindex 8-bit character codes
-@noindent
-If you specify a code in the octal range 0200 through 0377, @kbd{C-q}
-assumes that you intend to use some ISO 8859-@var{n} character set,
-and converts the specified code to the corresponding Emacs character
-code.  Your choice of language environment determines which of the ISO
-8859 character sets to use (@pxref{Language Environments}).  This
-feature is disabled if multibyte characters are disabled
-(@pxref{Enabling Multibyte}).
-
  @vindex read-quoted-char-radix
+@noindent
  To use decimal or hexadecimal instead of octal, set the variable
-@code{read-quoted-char-radix} to 10 or 16.  If the radix is greater than
-10, some letters starting with @kbd{a} serve as part of a character
-code, just like digits.
+@code{read-quoted-char-radix} to 10 or 16.  If the radix is greater
+than 10, some letters starting with @kbd{a} serve as part of a
+character code, just like digits.
  
-A numeric argument tells @kbd{C-q} how many copies of the quoted
+  A numeric argument tells @kbd{C-q} how many copies of the quoted
  character to insert (@pxref{Arguments}).
  
-@findex newline
-@findex self-insert
-  Customization information: @key{DEL} in most modes runs the command
-@code{delete-backward-char}; @key{RET} runs the command
-@code{newline}, and self-inserting printing characters run the command
-@code{self-insert}, which inserts whatever character you typed.  Some
-major modes rebind @key{DEL} to other commands.
+@findex ucs-insert
+@cindex Unicode
+  Instead of @kbd{C-q}, you can use @kbd{C-x 8 @key{RET}}
+(@code{ucs-insert}) to insert a character based on its Unicode name or
+code-point.  This command prompts for a character to insert, using
+the minibuffer; you can specify the character using either (i) the
+character's name in the Unicode standard, or (ii) the character's
+code-point in the Unicode standard.
  
  @node Moving Point
  @section Changing the Location of Point
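
Note (not part of the patch): a minimal Emacs Lisp sketch of the two ways of inserting characters that the basic.texi hunk above documents. It assumes an Emacs in which ucs-insert accepts a character code as its first argument, as in the Emacs 23 series this change targets; the code point U+20AC (EURO SIGN) is chosen only for illustration.

```elisp
;; Make C-q (quoted-insert) read character codes in hexadecimal
;; instead of the default octal.  10 would select decimal.
(setq read-quoted-char-radix 16)

;; Insert the character with Unicode code point U+20AC at point.
;; Interactively, C-x 8 RET prompts in the minibuffer for the
;; character's Unicode name or code point instead.
(ucs-insert #x20AC)
```

The first form belongs in an init file; the second is the programmatic counterpart of typing C-x 8 RET and answering the prompt.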
diff --git a/doc/emacs/emacs.texi b/doc/emacs/emacs.texi

index 4fb083ad22b8a13e594bc87de0e583b8f3ffcea9..717e2b78c3efca5f9788bac251fd6b1e29a04fbf 100644 (file)
--- a/doc/emacs/emacs.texi
+++ b/doc/emacs/emacs.texi
@@ -507,7 +507,6 @@ International Character Set Support
  * Language Environments::   Setting things up for the language you use.
  * Input Methods::           Entering text characters not on your keyboard.
  * Select Input Method::     Specifying your choice of input methods.
-* Multibyte Conversion::    How single-byte characters convert to multibyte.
  * Coding Systems::          Character set conversion when you read and
                                write files, and so on.
  * Recognize Coding::        How Emacs figures out which conversion to use.
diff --git a/doc/emacs/mule.texi b/doc/emacs/mule.texi

index a622722f1c6360d09c6751f43908f6413fe534b6..aa25ed371dee157906a1b24889c4399fde6ba305 100644 (file)
--- a/doc/emacs/mule.texi
+++ b/doc/emacs/mule.texi
@@ -89,7 +89,6 @@ to make sure Emacs interprets keyboard input correctly; see
  * Language Environments::   Setting things up for the language you use.
  * Input Methods::           Entering text characters not on your keyboard.
  * Select Input Method::     Specifying your choice of input methods.
-* Multibyte Conversion::    How single-byte characters convert to multibyte.
  * Coding Systems::          Character set conversion when you read and
                                write files, and so on.
  * Recognize Coding::        How Emacs figures out which conversion to use.
@@ -115,14 +114,17 @@ to make sure Emacs interprets keyboard input correctly; see
  
    The users of international character sets and scripts have
  established many more-or-less standard coding systems for storing
-files.  Emacs internally uses a single multibyte character encoding,
-so that it can intermix characters from all these scripts in a single
-buffer or string.  This encoding represents each non-@acronym{ASCII}
-character as a sequence of bytes in the range 0200 through 0377.
-Emacs translates between the multibyte character encoding and various
-other coding systems when reading and writing files, when exchanging
-data with subprocesses, and (in some cases) in the @kbd{C-q} command
-(@pxref{Multibyte Conversion}).
+files.  These coding systems are typically @dfn{multibyte}, meaning
+that sequences of two or more bytes are used to represent individual
+non-@acronym{ASCII} characters.
+
+@cindex Unicode
+  Internally, Emacs uses its own multibyte character encoding, which
+is a superset of the @dfn{Unicode} standard.  This internal encoding
+allows characters from almost every known script to be intermixed in a
+single buffer or string.  Emacs translates between the multibyte
+character encoding and various other coding systems when reading and
+writing files, and when exchanging data with subprocesses.
  
  @kindex C-h h
  @findex view-hello-file
@@ -134,10 +136,14 @@ This illustrates various scripts.  If some characters can't be
  displayed on your terminal, they appear as @samp{?} or as hollow boxes
  (@pxref{Undisplayable Characters}).
  
-  Keyboards, even in the countries where these character sets are used,
-generally don't have keys for all the characters in them.  So Emacs
-supports various @dfn{input methods}, typically one for each script or
-language, to make it convenient to type them.
+  Keyboards, even in the countries where these character sets are
+used, generally don't have keys for all the characters in them.  You
+can insert characters that your keyboard does not support, using
+@kbd{C-q} (@code{quoted-insert}) or @kbd{C-x 8 @key{RET}}
+(@code{ucs-insert}).  @xref{Inserting Text}.  Emacs also supports
+various @dfn{input methods}, typically one for each script or
+language, which make it easier to type characters in the script.
+@xref{Input Methods}.
  
  @kindex C-x RET
    The prefix key @kbd{C-x @key{RET}} is used for commands that pertain
@@ -165,12 +171,12 @@ system encodes the character safely and with a single byte
  (@pxref{Coding Systems}).  If the character's encoding is longer than
  one byte, Emacs shows @samp{file ...}.
  
-  However, if the character displayed is in the range 0200 through
-0377 octal, it may actually stand for an invalid UTF-8 byte read from
-a file.  In Emacs, that byte is represented as a sequence of 8-bit
-characters, but all of them together display as the original invalid
-byte, in octal code.  In this case, @kbd{C-x =} shows @samp{part of
-display ...} instead of @samp{file}.
+  As a special case, if the character lies in the range 128 (0200
+octal) through 159 (0237 octal), it stands for a ``raw'' byte that
+does not correspond to any specific displayable character.  Such a
+``character'' lies within the @code{eight-bit-control} character set,
+and is displayed as an escaped octal character code.  In this case,
+@kbd{C-x =} shows @samp{part of display ...} instead of @samp{file}.
  
  @cindex character set of character at point
  @cindex font of character at point
@@ -235,74 +241,62 @@ There are text properties here:
  @node Enabling Multibyte
  @section Enabling Multibyte Characters
  
-  By default, Emacs starts in multibyte mode, because that allows you to
-use all the supported languages and scripts without limitations.
+  By default, Emacs starts in multibyte mode: it stores the contents
+of buffers and strings using an internal encoding that represents
+non-@acronym{ASCII} characters using multi-byte sequences.  Multibyte
+mode allows you to use all the supported languages and scripts without
+limitations.
  
  @cindex turn multibyte support on or off
-  You can enable or disable multibyte character support, either for
-Emacs as a whole, or for a single buffer.  When multibyte characters
-are disabled in a buffer, we call that @dfn{unibyte mode}.  Then each
-byte in that buffer represents a character, even codes 0200 through
-0377.
-
-  The old features for supporting the European character sets, ISO
-Latin-1 and ISO Latin-2, work in unibyte mode as they did in Emacs 19
-and also work for the other ISO 8859 character sets.  However, there
-is no need to turn off multibyte character support to use ISO Latin;
-the Emacs multibyte character set includes all the characters in these
-character sets, and Emacs can translate automatically to and from the
-ISO codes.
+  Under very special circumstances, you may want to disable multibyte
+character support, either for Emacs as a whole, or for a single
+buffer.  When multibyte characters are disabled in a buffer, we call
+that @dfn{unibyte mode}.  In unibyte mode, each character in the
+buffer has a character code ranging from 0 through 255 (0377 octal); 0
+through 127 (0177 octal) represent @acronym{ASCII} characters, and 128
+(0200 octal) through 255 (0377 octal) represent non-@acronym{ASCII}
+characters.
  
    To edit a particular file in unibyte representation, visit it using
-@code{find-file-literally}.  @xref{Visiting}.  To convert a buffer in
-multibyte representation into a single-byte representation of the same
-characters, the easiest way is to save the contents in a file, kill the
-buffer, and find the file again with @code{find-file-literally}.  You
-can also use @kbd{C-x @key{RET} c}
-(@code{universal-coding-system-argument}) and specify @samp{raw-text} as
-the coding system with which to find or save a file.  @xref{Text
-Coding}.  Finding a file as @samp{raw-text} doesn't disable format
-conversion, uncompression and auto mode selection as
-@code{find-file-literally} does.
+@code{find-file-literally}.  @xref{Visiting}.  You can convert a
+multibyte buffer to unibyte by saving it to a file, killing the
+buffer, and visiting the file again with @code{find-file-literally}.
+Alternatively, you can use @kbd{C-x @key{RET} c}
+(@code{universal-coding-system-argument}) and specify @samp{raw-text}
+as the coding system with which to visit or save a file.  @xref{Text
+Coding}.  Unlike @code{find-file-literally}, finding a file as
+@samp{raw-text} doesn't disable format conversion, uncompression, or
+auto mode selection.
  
  @vindex enable-multibyte-characters
  @vindex default-enable-multibyte-characters
+@cindex environment variables, and non-@acronym{ASCII} characters
    To turn off multibyte character support by default, start Emacs with
  the @samp{--unibyte} option (@pxref{Initial Options}), or set the
  environment variable @env{EMACS_UNIBYTE}.  You can also customize
  @code{enable-multibyte-characters} or, equivalently, directly set the
  variable @code{default-enable-multibyte-characters} to @code{nil} in
  your init file to have basically the same effect as @samp{--unibyte}.
-
-@findex toggle-enable-multibyte-characters
-  To convert a unibyte session to a multibyte session, set
-@code{default-enable-multibyte-characters} to @code{t}.  Buffers which
-were created in the unibyte session before you turn on multibyte support
-will stay unibyte.  You can turn on multibyte support in a specific
-buffer by invoking the command @code{toggle-enable-multibyte-characters}
-in that buffer.
+With @samp{--unibyte}, multibyte strings are not created during
+initialization from the values of environment variables,
+@file{/etc/passwd} entries etc., even if those contain
+non-@acronym{ASCII} characters.
  
  @cindex Lisp files, and multibyte operation
  @cindex multibyte operation, and Lisp files
  @cindex unibyte operation, and Lisp files
  @cindex init file, and non-@acronym{ASCII} characters
-@cindex environment variables, and non-@acronym{ASCII} characters
-  With @samp{--unibyte}, multibyte strings are not created during
-initialization from the values of environment variables,
-@file{/etc/passwd} entries etc.@: that contain non-@acronym{ASCII} 8-bit
-characters.
-
    Emacs normally loads Lisp files as multibyte, regardless of whether
-you used @samp{--unibyte}.  This includes the Emacs initialization file,
-@file{.emacs}, and the initialization files of Emacs packages such as
-Gnus.  However, you can specify unibyte loading for a particular Lisp
-file, by putting @w{@samp{-*-unibyte: t;-*-}} in a comment on the first
-line (@pxref{File Variables}).  Then that file is always loaded as
-unibyte text, even if you did not start Emacs with @samp{--unibyte}.
-The motivation for these conventions is that it is more reliable to
-always load any particular Lisp file in the same way.  However, you can
-load a Lisp file as unibyte, on any one occasion, by typing @kbd{C-x
-@key{RET} c raw-text @key{RET}} immediately before loading it.
+you used @samp{--unibyte}.  This includes the Emacs initialization
+file, @file{.emacs}, and the initialization files of Emacs packages
+such as Gnus.  However, you can specify unibyte loading for a
+particular Lisp file, by putting @w{@samp{-*-unibyte: t;-*-}} in a
+comment on the first line (@pxref{File Variables}).  Then that file is
+always loaded as unibyte text.  The motivation for these conventions
+is that it is more reliable to always load any particular Lisp file in
+the same way.  However, you can load a Lisp file as unibyte, on any
+one occasion, by typing @kbd{C-x @key{RET} c raw-text @key{RET}}
+immediately before loading it.
  
    The mode line indicates whether multibyte character support is
  enabled in the current buffer.  If it is, there are two or more
@@ -312,6 +306,14 @@ convention (colon, backslash, etc.).  When multibyte characters
  are not enabled, nothing precedes the colon except a single dash.
  @xref{Mode Line}, for more details about this.
  
+@findex toggle-enable-multibyte-characters
+  To convert a unibyte session to a multibyte session, set
+@code{default-enable-multibyte-characters} to @code{t}.  Buffers which
+were created in the unibyte session before you turn on multibyte
+support will stay unibyte.  You can turn on multibyte support in a
+specific buffer by invoking the command
+@code{toggle-enable-multibyte-characters} in that buffer.
+
  @node Language Environments
  @section Language Environments
  @cindex language environments
@@ -319,43 +321,41 @@ are not enabled, nothing precedes the colon except a single dash.
    All supported character sets are supported in Emacs buffers whenever
  multibyte characters are enabled; there is no need to select a
  particular language in order to display its characters in an Emacs
-buffer.  However, it is important to select a @dfn{language environment}
-in order to set various defaults.  The language environment really
-represents a choice of preferred script (more or less) rather than a
-choice of language.
+buffer.  However, it is important to select a @dfn{language
+environment} in order to set various defaults.  Roughly speaking, the
+language environment represents a choice of preferred script rather
+than a choice of language.
  
    The language environment controls which coding systems to recognize
  when reading text (@pxref{Recognize Coding}).  This applies to files,
-incoming mail, netnews, and any other text you read into Emacs.  It may
-also specify the default coding system to use when you create a file.
-Each language environment also specifies a default input method.
+incoming mail, and any other text you read into Emacs.  It may also
+specify the default coding system to use when you create a file.  Each
+language environment also specifies a default input method.
  
  @findex set-language-environment
  @vindex current-language-environment
-  To select a language environment, you can customize the variable
+  To select a language environment, customize the variable
  @code{current-language-environment} or use the command @kbd{M-x
  set-language-environment}.  It makes no difference which buffer is
-current when you use this command, because the effects apply globally to
-the Emacs session.  The supported language environments include:
+current when you use this command, because the effects apply globally
+to the Emacs session.  The supported language environments include:
  
  @cindex Euro sign
  @cindex UTF-8
  @quotation
-ASCII, Belarusian, Brazilian Portuguese, Bulgarian, Chinese-BIG5,
-Chinese-CNS, Chinese-EUC-TW, Chinese-GB, Croatian, Cyrillic-ALT,
-Cyrillic-ISO, Cyrillic-KOI8, Czech, Devanagari, Dutch, English,
-Esperanto, Ethiopic, French, Georgian, German, Greek, Hebrew, IPA,
-Italian, Japanese, Kannada, Korean, Lao, Latin-1, Latin-2, Latin-3,
-Latin-4, Latin-5, Latin-6, Latin-7, Latin-8 (Celtic), Latin-9 (updated
-Latin-1 with the Euro sign), Latvian, Lithuanian, Malayalam, Polish,
-Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Tajik, Tamil,
-Thai, Tibetan, Turkish, UTF-8 (for a setup which prefers Unicode
-characters and files encoded in UTF-8), Ukrainian, Vietnamese, Welsh,
-and Windows-1255 (for a setup which prefers Cyrillic characters and
-files encoded in Windows-1255).
-@tex
-\hbadness=10000\par  % just avoid underfull hbox warning
-@end tex
+ASCII, Belarusian, Bengali, Brazilian Portuguese, Bulgarian,
+Chinese-BIG5, Chinese-CNS, Chinese-EUC-TW, Chinese-GB, Chinese-GBK,
+Chinese-GB18030, Croatian, Cyrillic-ALT, Cyrillic-ISO, Cyrillic-KOI8,
+Czech, Devanagari, Dutch, English, Esperanto, Ethiopic, French,
+Georgian, German, Greek, Gujarati, Hebrew, IPA, Italian, Japanese,
+Kannada, Khmer, Korean, Lao, Latin-1, Latin-2, Latin-3, Latin-4,
+Latin-5, Latin-6, Latin-7, Latin-8 (Celtic), Latin-9 (updated Latin-1
+with the Euro sign), Latvian, Lithuanian, Malayalam, Oriya, Polish,
+Punjabi, Romanian, Russian, Sinhala, Slovak, Slovenian, Spanish,
+Swedish, TaiViet, Tajik, Tamil, Telugu, Thai, Tibetan, Turkish, UTF-8
+(for a setup which prefers Unicode characters and files encoded in
+UTF-8), Ukrainian, Vietnamese, Welsh, and Windows-1255 (for a setup
+which prefers Cyrillic characters and files encoded in Windows-1255).
  @end quotation
  
  @cindex fonts for various scripts
@@ -657,34 +657,6 @@ character.
  list-input-methods}.  The list gives information about each input
  method, including the string that stands for it in the mode line.
  
-@node Multibyte Conversion
-@section Unibyte and Multibyte Non-@acronym{ASCII} characters
-
-  When multibyte characters are enabled, character codes 0240 (octal)
-through 0377 (octal) are not really legitimate in the buffer.  The valid
-non-@acronym{ASCII} printing characters have codes that start from 0400.
-
-  If you type a self-inserting character in the range 0240 through
-0377, or if you use @kbd{C-q} to insert one, Emacs assumes you
-intended to use one of the ISO Latin-@var{n} character sets, and
-converts it to the Emacs code representing that Latin-@var{n}
-character.  You select @emph{which} ISO Latin character set to use
-through your choice of language environment
-@iftex
-(see above).
-@end iftex
-@ifnottex
-(@pxref{Language Environments}).
-@end ifnottex
-If you do not specify a choice, the default is Latin-1.
-
-  If you insert a character in the range 0200 through 0237, which
-forms the @code{eight-bit-control} character set, it is inserted
-literally.  You should normally avoid doing this since buffers
-containing such characters have to be written out in either the
-@code{emacs-mule} or @code{raw-text} coding system, which is usually
-not what you want.
-
  @node Coding Systems
  @section Coding Systems
  @cindex coding systems
@@ -698,11 +670,11 @@ possible in reading or writing files, in sending or receiving from the
  terminal, and in exchanging data with subprocesses.
  
    Emacs assigns a name to each coding system.  Most coding systems are
-used for one language, and the name of the coding system starts with the
-language name.  Some coding systems are used for several languages;
-their names usually start with @samp{iso}.  There are also special
-coding systems @code{no-conversion}, @code{raw-text} and
-@code{emacs-mule} which do not convert printing characters at all.
+used for one language, and the name of the coding system starts with
+the language name.  Some coding systems are used for several
+languages; their names usually start with @samp{iso}.  There are also
+special coding systems, such as @code{no-conversion}, @code{raw-text},
+and @code{emacs-internal}.
  
  @cindex international files from DOS/Windows systems
    A special class of coding systems, collectively known as
@@ -814,37 +786,21 @@ the @kbd{M-x find-file-literally} command.  This uses
  @code{no-conversion}, and also suppresses other Emacs features that
  might convert the file contents before you see them.  @xref{Visiting}.
  
-  The coding system @code{emacs-mule} means that the file contains
-non-@acronym{ASCII} characters stored with the internal Emacs encoding.  It
-handles end-of-line conversion based on the data encountered, and has
-the usual three variants to specify the kind of end-of-line conversion.
-
-@findex unify-8859-on-decoding-mode
-@anchor{Character Translation} 
-  The @dfn{character translation} feature can modify the effect of
-various coding systems, by changing the internal Emacs codes that
-decoding produces.  For instance, the command
-@code{unify-8859-on-decoding-mode} enables a mode that ``unifies'' the
-Latin alphabets when decoding text.  This works by converting all
-non-@acronym{ASCII} Latin-@var{n} characters to either Latin-1 or
-Unicode characters.  This way it is easier to use various
-Latin-@var{n} alphabets together.  (In a future Emacs version we hope
-to move towards full Unicode support and complete unification of
-character sets.)
-
-@vindex enable-character-translation
-  If you set the variable @code{enable-character-translation} to
-@code{nil}, that disables all character translation (including
-@code{unify-8859-on-decoding-mode}).
+  The coding system @code{emacs-internal} (or @code{utf-8-emacs},
+which is equivalent) means that the file contains non-@acronym{ASCII}
+characters stored with the internal Emacs encoding.  This coding
+system handles end-of-line conversion based on the data encountered,
+and has the usual three variants to specify the kind of end-of-line
+conversion.
  
  @node Recognize Coding
  @section Recognizing Coding Systems
  
-  Emacs tries to recognize which coding system to use for a given text
-as an integral part of reading that text.  (This applies to files
-being read, output from subprocesses, text from X selections, etc.)
-Emacs can select the right coding system automatically most of the
-time---once you have specified your preferences.
+  Whenever Emacs reads a given piece of text, it tries to recognize
+which coding system to use.  This applies to files being read, output
+from subprocesses, text from X selections, etc.  Emacs can select the
+right coding system automatically most of the time---once you have
+specified your preferences.
  
    Some coding systems can be recognized or distinguished by which byte
  sequences appear in the data.  However, there are coding systems that
@@ -948,19 +904,17 @@ pattern, are decoded correctly.  One of the builtin
  @code{auto-coding-functions} detects the encoding for XML files.
  
  @vindex rmail-decode-mime-charset
+@vindex rmail-file-coding-system
    When you get new mail in Rmail, each message is translated
  automatically from the coding system it is written in, as if it were a
  separate file.  This uses the priority list of coding systems that you
  have specified.  If a MIME message specifies a character set, Rmail
  obeys that specification, unless @code{rmail-decode-mime-charset} is
-@code{nil}.
-
-@vindex rmail-file-coding-system
-  For reading and saving Rmail files themselves, Emacs uses the coding
-system specified by the variable @code{rmail-file-coding-system}.  The
-default value is @code{nil}, which means that Rmail files are not
-translated (they are read and written in the Emacs internal character
-code).
+@code{nil}.  For reading and saving Rmail files themselves, Emacs uses
+the coding system specified by the variable
+@code{rmail-file-coding-system}.  The default value is @code{nil},
+which means that Rmail files are not translated (they are read and
+written in the Emacs internal character code).
  
  @node Specify Coding
  @section Specifying a File's Coding System
@@ -984,13 +938,6 @@ use of the Latin-1 coding system, as well as C mode.  When you specify
  the coding explicitly in the file, that overrides
  @code{file-coding-system-alist}.
  
-  If you add the character @samp{!} at the end of the coding system
-name in @code{coding}, it disables any character translation
-(@pxref{Character Translation}) while decoding the file.  This is
-useful when you need to make sure that the character codes in the
-Emacs buffer will not vary due to changes in user settings; for
-instance, for the sake of strings in Emacs Lisp source files.
-
  @node Output Coding
  @section Choosing Coding Systems for Output
  
@@ -1004,22 +951,21 @@ different coding system for further file output from the buffer using
  
    You can insert any character Emacs supports into any Emacs buffer,
  but most coding systems can only handle a subset of these characters.
-Therefore, you can insert characters that cannot be encoded with the
-coding system that will be used to save the buffer.  For example, you
-could start with an @acronym{ASCII} file and insert a few Latin-1
-characters into it, or you could edit a text file in Polish encoded in
-@code{iso-8859-2} and add some Russian words to it.  When you save
+Therefore, it's possible that the characters you insert cannot be
+encoded with the coding system that will be used to save the buffer.
+For example, you could visit a text file in Polish, encoded in
+@code{iso-8859-2}, and add some Russian words to it.  When you save
  that buffer, Emacs cannot use the current value of
  @code{buffer-file-coding-system}, because the characters you added
  cannot be encoded by that coding system.
  
    When that happens, Emacs tries the most-preferred coding system (set
  by @kbd{M-x prefer-coding-system} or @kbd{M-x
-set-language-environment}), and if that coding system can safely
-encode all of the characters in the buffer, Emacs uses it, and stores
-its value in @code{buffer-file-coding-system}.  Otherwise, Emacs
-displays a list of coding systems suitable for encoding the buffer's
-contents, and asks you to choose one of those coding systems.
+set-language-environment}).  If that coding system can safely encode
+all of the characters in the buffer, Emacs uses it, and stores its
+value in @code{buffer-file-coding-system}.  Otherwise, Emacs displays
+a list of coding systems suitable for encoding the buffer's contents,
+and asks you to choose one of those coding systems.
  
    If you insert the unsuitable characters in a mail message, Emacs
  behaves a bit differently.  It additionally checks whether the
@@ -1248,9 +1194,9 @@ interactively.
  
    If @code{file-name-coding-system} is @code{nil}, Emacs uses a
  default coding system determined by the selected language environment.
-In the default language environment, any non-@acronym{ASCII}
-characters in file names are not encoded specially; they appear in the
-file system using the internal Emacs representation.
+In the default language environment, non-@acronym{ASCII} characters in
+file names are not encoded specially; they appear in the file system
+using the internal Emacs representation.
  
    @strong{Warning:} if you change @code{file-name-coding-system} (or the
  language environment) in the middle of an Emacs session, problems can
@@ -1317,7 +1263,7 @@ You can do this by putting
  @end lisp
  
  @noindent
-in your @file{~/.emacs} file.
+in your init file.
  
    There is a similarity between using a coding system translation for
  keyboard input, and using an input method: both define sequences of
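
Note (not part of the patch): a short init-file sketch of the language environment and coding system preferences that the mule.texi text refers to. It assumes UTF-8 is the desired default; the choice of "UTF-8" is illustrative, and any name from the language environment list in the hunk above can be substituted.

```elisp
;; Select a language environment.  The effect is global to the
;; Emacs session, so the current buffer does not matter.
(set-language-environment "UTF-8")

;; Put utf-8 at the front of the priority list that Emacs consults
;; when it must recognize or choose a coding system.
(prefer-coding-system 'utf-8)
```

The language environment is set first because selecting one resets the coding system priorities, which would otherwise undo the explicit preference.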