X-Git-Url: https://code.delx.au/gnu-emacs/blobdiff_plain/e8757f091a502b858912a4c267210e009227d6e6..a7fecaa0c5f8247c3b3747506201ec2a2ecbe292:/doc/emacs/mule.texi

diff --git a/doc/emacs/mule.texi b/doc/emacs/mule.texi
index 1dfae79c78..1600f19499 100644
--- a/doc/emacs/mule.texi
+++ b/doc/emacs/mule.texi
@@ -1,11 +1,10 @@
 @c This is part of the Emacs manual.
-@c Copyright (C) 1997, 1999-2012 Free Software Foundation, Inc.
+@c Copyright (C) 1997, 1999-2014 Free Software Foundation, Inc.
 @c See file emacs.texi for copying conditions.
 @node International
 @chapter International Character Set Support
 @c This node is referenced in the tutorial.  When renaming or deleting
 @c it, the tutorial needs to be adjusted.  (TUTORIAL.de)
-@cindex MULE
 @cindex international scripts
 @cindex multibyte characters
 @cindex encoding of characters
@@ -90,7 +89,6 @@ value to make sure Emacs interprets keyboard input correctly; see
 
 @menu
 * International Chars::     Basic concepts of multibyte characters.
-* Disabling Multibyte::     Controlling whether to use multibyte characters.
 * Language Environments::   Setting things up for the language you use.
 * Input Methods::           Entering text characters not on your keyboard.
 * Select Input Method::     Specifying your choice of input methods.
@@ -244,79 +242,6 @@ Character code properties: customize what to show
   decomposition: (65 768) ('A' '`')
 @end smallexample
 
-@c FIXME?  Does this section even belong in the user manual?
-@c Seems more appropriate to the lispref?
-@node Disabling Multibyte
-@section Disabling Multibyte Characters
-
-  By default, Emacs starts in multibyte mode: it stores the contents
-of buffers and strings using an internal encoding that represents
-non-@acronym{ASCII} characters using multi-byte sequences.  Multibyte
-mode allows you to use all the supported languages and scripts without
-limitations.
-
-@cindex turn multibyte support on or off
-  Under very special circumstances, you may want to disable multibyte
-character support, for a specific buffer.
-When multibyte characters are disabled in a buffer, we call
-that @dfn{unibyte mode}.  In unibyte mode, each character in the
-buffer has a character code ranging from 0 through 255 (0377 octal); 0
-through 127 (0177 octal) represent @acronym{ASCII} characters, and 128
-(0200 octal) through 255 (0377 octal) represent non-@acronym{ASCII}
-characters.
-
-  To edit a particular file in unibyte representation, visit it using
-@code{find-file-literally}.  @xref{Visiting}.  You can convert a
-multibyte buffer to unibyte by saving it to a file, killing the
-buffer, and visiting the file again with @code{find-file-literally}.
-Alternatively, you can use @kbd{C-x @key{RET} c}
-(@code{universal-coding-system-argument}) and specify @samp{raw-text}
-as the coding system with which to visit or save a file.  @xref{Text
-Coding}.  Unlike @code{find-file-literally}, finding a file as
-@samp{raw-text} doesn't disable format conversion, uncompression, or
-auto mode selection.
-
-@c Not a single file in Emacs uses this feature.  Is it really worth
-@c mentioning in the _user_ manual?  Also, this duplicates somewhat
-@c "Loading Non-ASCII" from the lispref.
-@cindex Lisp files, and multibyte operation
-@cindex multibyte operation, and Lisp files
-@cindex unibyte operation, and Lisp files
-@cindex init file, and non-@acronym{ASCII} characters
-  Emacs normally loads Lisp files as multibyte.
-This includes the Emacs initialization
-file, @file{.emacs}, and the initialization files of packages
-such as Gnus.  However, you can specify unibyte loading for a
-particular Lisp file, by adding an entry @samp{coding: raw-text} in a file
-local variables section.  @xref{Specify Coding}.
-Then that file is always loaded as unibyte text.
-@ignore
-@c I don't see the point of this statement:
-The motivation for these conventions is that it is more reliable to
-always load any particular Lisp file in the same way.
-@end ignore
-You can also load a Lisp file as unibyte, on any one occasion, by
-typing @kbd{C-x @key{RET} c raw-text @key{RET}} immediately before
-loading it.
-
-@c See http://debbugs.gnu.org/11226 for lack of unibyte tooltip.
-@vindex enable-multibyte-characters
-The buffer-local variable @code{enable-multibyte-characters} is
-non-@code{nil} in multibyte buffers, and @code{nil} in unibyte ones.
-The mode line also indicates whether a buffer is multibyte or not.
-@xref{Mode Line}.  With a graphical display, in a multibyte buffer,
-the portion of the mode line that indicates the character set has a
-tooltip that (amongst other things) says that the buffer is multibyte.
-In a unibyte buffer, the character set indicator is absent.  Thus, in
-a unibyte buffer (when using a graphical display) there is normally
-nothing before the indication of the visited file's end-of-line
-convention (colon, backslash, etc.), unless you are using an input
-method.
-
-@findex toggle-enable-multibyte-characters
-You can turn off multibyte support in a specific buffer by invoking the
-command @code{toggle-enable-multibyte-characters} in that buffer.
-
 @node Language Environments
 @section Language Environments
 @cindex language environments
@@ -919,19 +844,6 @@ pattern, are decoded correctly.
 Unlike the previous two, this variable does not override any
 @samp{-*-coding:-*-} tag.
 
-@c FIXME?  This seems somewhat out of place.  Move to the Rmail section?
-@vindex rmail-decode-mime-charset
-@vindex rmail-file-coding-system
-  When you get new mail in Rmail, each message is translated
-automatically from the coding system it is written in, as if it were a
-separate file.  This uses the priority list of coding systems that you
-have specified.  If a MIME message specifies a character set, Rmail
-obeys that specification.  For reading and saving Rmail files
-themselves, Emacs uses the coding system specified by the variable
-@code{rmail-file-coding-system}.  The default value is @code{nil},
-which means that Rmail files are not translated (they are read and
-written in the Emacs internal character code).
-
 @node Specify Coding
 @section Specifying a File's Coding System
 
@@ -995,7 +907,7 @@ decoding.  (You can still use an unsuitable coding system if you enter
 its name at the prompt.)
 
 @c It seems that select-message-coding-system does this.
-@c Both sendmail.el and smptmail.el call it; i.e. smtpmail.el still
+@c Both sendmail.el and smtpmail.el call it; i.e., smtpmail.el still
 @c obeys sendmail-coding-system.
 @vindex sendmail-coding-system
   When you send a mail message (@pxref{Sending Mail}),
@@ -1040,12 +952,16 @@ decoding it using coding system @var{right} instead.
 @findex set-buffer-file-coding-system
   The command @kbd{C-x @key{RET} f}
 (@code{set-buffer-file-coding-system}) sets the file coding system for
-the current buffer---in other words, it says which coding system to
-use when saving or reverting the visited file.  You specify which
-coding system using the minibuffer.  If you specify a coding system
-that cannot handle all of the characters in the buffer, Emacs warns
-you about the troublesome characters when you actually save the
-buffer.
+the current buffer (i.e., the coding system to use when saving or
+reverting the file).  You specify which coding system using the
+minibuffer.  You can also invoke this command by clicking with
+@kbd{Mouse-3} on the coding system indicator in the mode line
+(@pxref{Mode Line}).
+
+  If you specify a coding system that cannot handle all the characters
+in the buffer, Emacs will warn you about the troublesome characters,
+and ask you to choose another coding system, when you try to save the
+buffer (@pxref{Output Coding}).
 
 @cindex specify end-of-line conversion
   You can also use this command to specify the end-of-line conversion
@@ -1213,6 +1129,21 @@ In the default language environment, non-@acronym{ASCII} characters in
 file names are not encoded specially; they appear in the file system
 using the internal Emacs representation.
 
+@cindex file-name encoding, MS-Windows
+@vindex w32-unicode-filenames
+  When Emacs runs on MS-Windows versions that are descendants of the
+NT family (Windows 2000, XP, Vista, Windows 7, and Windows 8), the
+value of @code{file-name-coding-system} is largely ignored, as Emacs
+by default uses APIs that accept Unicode file names directly.
+By contrast, on Windows 9X, file names are encoded using
+@code{file-name-coding-system}, which should be set to the codepage
+(@pxref{Coding Systems, codepage}) pertinent to the current system
+locale.  The value of the variable @code{w32-unicode-filenames}
+controls whether Emacs uses the Unicode APIs when it calls OS
+functions that accept file names.  This variable is set by the startup
+code to @code{nil} on Windows 9X, and to @code{t} on newer versions of
+MS-Windows.
+
   @strong{Warning:} if you change @code{file-name-coding-system} (or the
 language environment) in the middle of an Emacs session, problems can
 result if you have already visited files whose names were encoded using
@@ -1320,7 +1251,7 @@ scripts.@footnote{If you run Emacs on X, you may need to inform the X
 server about the location of the newly installed fonts with commands
 such as:
 @c FIXME?  I feel like this may be out of date.
-@c Eg the intlfonts tarfile is ~ 10 years old.
+@c E.g., the intlfonts tarfile is ~ 10 years old.
 
 @example
  xset fp+ /usr/local/share/emacs/fonts
@@ -1566,7 +1497,7 @@ no font appear as a hollow box.
 
   If you use Latin-1 characters but your terminal can't display
 Latin-1, you can arrange to display mnemonic @acronym{ASCII} sequences
-instead, e.g.@: @samp{"o} for o-umlaut.  Load the library
+instead, e.g., @samp{"o} for o-umlaut.  Load the library
 @file{iso-ascii} to do this.
 
 @vindex latin1-display
@@ -1588,15 +1519,13 @@ the range 0240 to 0377 octal (160 to 255 decimal) to handle the
 accented letters and punctuation needed by various European languages
 (and some non-European ones).  Note that Emacs considers bytes with
 codes in this range as raw bytes, not as characters, even in a unibyte
-buffer, i.e.@: if you disable multibyte characters.  However, Emacs
-can still handle these character codes as if they belonged to
-@emph{one} of the single-byte character sets at a time.  To specify
-@emph{which} of these codes to use, invoke @kbd{M-x
-set-language-environment} and specify a suitable language environment
-such as @samp{Latin-@var{n}}.
-
-  For more information about unibyte operation, see
-@ref{Disabling Multibyte}.
+buffer, i.e., if you disable multibyte characters.  However, Emacs can
+still handle these character codes as if they belonged to @emph{one}
+of the single-byte character sets at a time.  To specify @emph{which}
+of these codes to use, invoke @kbd{M-x set-language-environment} and
+specify a suitable language environment such as @samp{Latin-@var{n}}.
+@xref{Disabling Multibyte, , Disabling Multibyte Characters, elisp,
+GNU Emacs Lisp Reference Manual}.
 
 @vindex unibyte-display-via-language-environment
   Emacs can also display bytes in the range 160 to 255 as readable
@@ -1764,7 +1693,7 @@ directionality when they are displayed.  The default value is
   Each paragraph of bidirectional text can have its own @dfn{base
 direction}, either right-to-left or left-to-right.  (Paragraph
 @c paragraph-separate etc have no influence on this?
-boundaries are empty lines, i.e.@: lines consisting entirely of
+boundaries are empty lines, i.e., lines consisting entirely of
 whitespace characters.)  Text in left-to-right paragraphs begins on
 the screen at the left margin of the window and is truncated or
 continued when it reaches the right margin.  By contrast, text in
@@ -1801,4 +1730,6 @@ jump when point traverses reordered bidirectional text.  Similarly, a
 highlighted region covering a contiguous range of character positions
 may look discontinuous if the region spans reordered text.  This is
 normal and similar to the behavior of other programs that support
-bidirectional text.
+bidirectional text.  If you set @code{visual-order-cursor-movement} to
+a non-@code{nil} value, cursor motion by the arrow keys follows the
+visual order on screen (@pxref{Moving Point, visual-order movement}).