(msdos-color-values): Add defvar.

diff --git a/man/mule.texi b/man/mule.texi

index 54952fa08fa8c0aaabb0e1a6ba13ae0d15a498b1..fa5e1246f251555749e4a07576da7b0f68c8e250 100644
--- a/man/mule.texi
+++ b/man/mule.texi
@@ -1,5 +1,6 @@
  @c This is part of the Emacs manual.
-@c Copyright (C) 1997, 1999, 2000, 2001 Free Software Foundation, Inc.
+@c Copyright (C) 1997, 1999, 2000, 2001, 2002, 2003, 2004,
+@c   2005 Free Software Foundation, Inc.
  @c See file emacs.texi for copying conditions.
  @node International, Major Modes, Frames, Top
  @chapter International Character Set Support
@@ -35,11 +36,12 @@
  @cindex Dutch
  @cindex Spanish
    Emacs supports a wide variety of international character sets,
-including European variants of the Latin alphabet, as well as Chinese,
-Cyrillic, Devanagari (Hindi and Marathi), Ethiopic, Greek, Hebrew, IPA,
-Japanese, Korean, Lao, Thai, Tibetan, and Vietnamese scripts.  These features
-have been merged from the modified version of Emacs known as MULE (for
-``MULti-lingual Enhancement to GNU Emacs'')
+including European and Vietnamese variants of the Latin alphabet, as
+well as Cyrillic, Devanagari (for Hindi and Marathi), Ethiopic, Greek,
+Han (for Chinese and Japanese), Hangul (for Korean), Hebrew, IPA,
+Kannada, Lao, Malayalam, Tamil, Thai, Tibetan, and Vietnamese scripts.
+These features have been merged from the modified version of Emacs
+known as MULE (for ``MULti-lingual Enhancement to GNU Emacs'').
  
    Emacs also supports various encodings of these characters used by
  other internationalized software, such as word processors and mailers.
@@ -69,8 +71,7 @@ describes possible problems and explains how to solve them.
  You can insert non-@acronym{ASCII} characters or search for them.  To do that,
  you can specify an input method (@pxref{Select Input Method}) suitable
  for your language, or use the default input method set up when you set
-your language environment.  (Emacs input methods are part of the Leim
-package, which must be installed for you to be able to use them.)  If
+your language environment.  If
  your keyboard can produce non-@acronym{ASCII} characters, you can select an
  appropriate keyboard coding system (@pxref{Specify Coding}), and Emacs
  will accept those characters.  Latin-1 characters can also be input by
@@ -97,9 +98,8 @@ correctly; see @ref{Language Environments, locales}.
                                that cover the whole spectrum of characters.
  * Defining Fontsets::       Defining a new fontset.
  * Undisplayable Characters:: When characters don't display.
-* Single-Byte Character Support::
-                            You can pick one European character set
-                            to use without multibyte characters.
+* Single-Byte Character Support:: You can pick one European character set
+                              to use without multibyte characters.
  * Charsets::                How Emacs groups its internal character codes.
  @end menu
  
@@ -241,13 +241,19 @@ the Emacs session.  The supported language environments include:
  @cindex Euro sign
  @cindex UTF-8
  @quotation
-Chinese-BIG5, Chinese-CNS, Chinese-GB, Cyrillic-ALT, Cyrillic-ISO,
-Cyrillic-KOI8, Czech, Devanagari, Dutch, English, Ethiopic, German,
-Greek, Hebrew, IPA, Japanese, Korean, Lao, Latin-1, Latin-2, Latin-3,
-Latin-4, Latin-5, Latin-8 (Celtic), Latin-9 (updated Latin-1, with the
-Euro sign), Polish, Romanian, Slovak, Slovenian, Spanish, Thai, Tibetan,
-Turkish, UTF-8 (for a setup which prefers Unicode characters and files
-encoded in UTF-8), and Vietnamese.
+Belarusian, Brazilian Portuguese, Bulgarian, Chinese-BIG5,
+Chinese-CNS, Chinese-EUC-TW, Chinese-GB, Croatian, Cyrillic-ALT,
+Cyrillic-ISO, Cyrillic-KOI8, Czech, Devanagari, Dutch, English,
+Ethiopic, French, Georgian, German, Greek, Hebrew, IPA, Italian,
+Japanese, Kannada, Korean, Lao, Latin-1, Latin-2, Latin-3,
+Latin-4, Latin-5, Latin-6, Latin-7, Latin-8 (Celtic),
+Latin-9 (updated Latin-1 with the Euro sign), Latvian,
+Lithuanian, Malayalam, Polish, Romanian, Russian, Slovak,
+Slovenian, Spanish, Swedish, Tajik, Tamil, Thai, Tibetan,
+Turkish, UTF-8 (for a setup which prefers Unicode characters and
+files encoded in UTF-8), Ukrainian, Vietnamese, Welsh, and
+Windows-1251 (for a setup which prefers Cyrillic characters and
+files encoded in Windows-1251).
  @end quotation
  
  @cindex fonts for various scripts
@@ -255,7 +261,7 @@ encoded in UTF-8), and Vietnamese.
    To display the script(s) used by your language environment on a
  graphical display, you need to have a suitable font.  If some of the
  characters appear as empty boxes, you should install the GNU Intlfonts
-package, which includes fonts for all supported scripts.@footnote{If
+package, which includes fonts for most supported scripts.@footnote{If
  you run Emacs on X, you need to inform the X server about the location
  of the newly installed fonts with the following commands:
  
@@ -428,6 +434,9 @@ is the command @kbd{C-\} (@code{toggle-input-method}) used twice.
  because it stops waiting for more characters to combine, and starts
  searching for what you have already entered.
  
+  To find out how to input the character after point using the current
+input method, type @kbd{C-u C-x =}.  @xref{Position Info}.
+
  @vindex input-method-verbose-flag
  @vindex input-method-highlight-flag
    The variables @code{input-method-highlight-flag} and
@@ -523,9 +532,11 @@ actual keyboard layout.  To specify which layout your keyboard has, use
  the command @kbd{M-x quail-set-keyboard-layout}.
  
  @findex quail-show-key
-  You can use the command @kbd{M-x quail-show-key} to show what key
-(or key sequence) to type in order to input the character following
-point, using the selected keyboard layout.
+  You can use the command @kbd{M-x quail-show-key} to show what key (or
+key sequence) to type in order to input the character following point,
+using the selected keyboard layout.  The command @kbd{C-u C-x =} also
+shows that information in addition to the other information about the
+character.
  
  @findex list-input-methods
    To display a list of all the supported input methods, type @kbd{M-x
@@ -582,12 +593,15 @@ coding systems @code{no-conversion}, @code{raw-text} and
  @cindex international files from DOS/Windows systems
    A special class of coding systems, collectively known as
  @dfn{codepages}, is designed to support text encoded by MS-Windows and
-MS-DOS software.  To use any of these systems, you need to create it
-with @kbd{M-x codepage-setup}.  @xref{MS-DOS and MULE}.  After
-creating the coding system for the codepage, you can use it as any
-other coding system.  For example, to visit a file encoded in codepage
-850, type @kbd{C-x @key{RET} c cp850 @key{RET} C-x C-f @var{filename}
-@key{RET}}.
+MS-DOS software.  The names of these coding systems are
+@code{cp@var{nnnn}}, where @var{nnnn} is a 3- or 4-digit number of the
+codepage.  You can use these encodings just like any other coding
+system; for example, to visit a file encoded in codepage 850, type
+@kbd{C-x @key{RET} c cp850 @key{RET} C-x C-f @var{filename}
+@key{RET}}@footnote{
+In the MS-DOS port of Emacs, you need to create a @code{cp@var{nnn}}
+coding system with @kbd{M-x codepage-setup}, before you can use it.
+@xref{MS-DOS and MULE}.}.
  
    In addition to converting various representations of non-@acronym{ASCII}
  characters, a coding system can perform end-of-line conversion.  Emacs
@@ -734,7 +748,7 @@ example, to read and write all @samp{.txt} files using the coding system
  @code{china-iso-8bit}, you can execute this Lisp expression:
  
  @smallexample
-(modify-coding-system-alist 'file "\\.txt\\'" 'china-iso-8bit)
+(modify-coding-system-alist 'file "\\.txt\\'" 'chinese-iso-8bit)
  @end smallexample
  
  @noindent
@@ -808,7 +822,7 @@ pattern, are decoded correctly.  One of the builtin
  
    If Emacs recognizes the encoding of a file incorrectly, you can
  reread the file using the correct coding system by typing @kbd{C-x
-@key{RET} c @var{coding-system} @key{RET} M-x revert-buffer
+@key{RET} r @var{coding-system}
  @key{RET}}.  To see what coding system Emacs actually used to decode
  the file, look at the coding system mnemonic letter near the left edge
  of the mode line (@pxref{Mode Line}), or type @kbd{C-h C @key{RET}}.
@@ -928,6 +942,9 @@ files.
  @item C-x @key{RET} X @var{coding} @key{RET}
  Use coding system @var{coding} for transferring @emph{one}
  selection---the next one---to or from the window system.
+
+@item M-x recode-region
+Convert the region from a previous coding system to a new one.
  @end table
  
  @kindex C-x RET f
@@ -1056,6 +1073,12 @@ corresponding buffer.
    The default for translation of process input and output depends on the
  current language environment.
  
+@findex recode-region
+  If a piece of text has already been inserted into a buffer using the
+wrong coding system, you can decode it again using @kbd{M-x
+recode-region}.  This prompts you for the old coding system and the
+desired coding system, and acts on the text in the region.
+
  @vindex file-name-coding-system
  @cindex file names with non-@acronym{ASCII} characters
  @findex set-file-name-coding-system
@@ -1084,6 +1107,12 @@ these buffers under the visited file name, saving may use the wrong file
  name, or it may get an error.  If such a problem happens, use @kbd{C-x
  C-w} to specify a new file name for that buffer.
  
+@findex recode-file-name
+  If a mistake occurs when encoding a file name, use the command
+@kbd{M-x recode-file-name} to change the file name's coding
+system.  This prompts for an existing file name, its old coding
+system, and the coding system to which you wish to convert.
+
  @vindex locale-coding-system
  @cindex decoding non-@acronym{ASCII} keyboard input on X
    The variable @code{locale-coding-system} specifies a coding system
@@ -1358,6 +1387,27 @@ however, on a console terminal or in @code{xterm}, you can arrange for
  Meta to be converted to @kbd{ESC} and still be able to type 8-bit
  characters present directly on the keyboard or using @kbd{Compose} or
  @kbd{AltGr} keys.  @xref{User Input}.
+
+@kindex C-x 8
+@cindex @code{iso-transl} library
+@cindex compose character
+@cindex dead character
+@item
+For Latin-1 only, you can use the key @kbd{C-x 8} as a ``compose
+character'' prefix for entry of non-@acronym{ASCII} Latin-1 printing
+characters.  @kbd{C-x 8} is good for insertion (in the minibuffer as
+well as other buffers), for searching, and in any other context where
+a key sequence is allowed.
+
+@kbd{C-x 8} works by loading the @code{iso-transl} library.  Once that
+library is loaded, the @key{ALT} modifier key, if the keyboard has
+one, serves the same purpose as @kbd{C-x 8}: use @key{ALT} together
+with an accent character to modify the following letter.  In addition,
+if the keyboard has keys for the Latin-1 ``dead accent characters,''
+they too are defined to compose with the following character, once
+@code{iso-transl} is loaded.
+
+Use @kbd{C-x 8 C-h} to list all the available @kbd{C-x 8} translations.
  @end itemize
  
  @node Charsets