X-Git-Url: https://code.delx.au/gnu-emacs/blobdiff_plain/7ad8fe5e2876518a8f33b80050f98dab4ff78398..0bafabe7b28b6ee05cf052579e398102fd73e0eb:/admin/notes/unicode

diff --git a/admin/notes/unicode b/admin/notes/unicode
index 4f0becc459..0654036d36 100644
--- a/admin/notes/unicode
+++ b/admin/notes/unicode
@@ -1,7 +1,6 @@
-                                -*-mode: text; coding: latin-1;-*-
+                                -*-mode: text; coding: utf-8;-*-
 
-Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008
-  Free Software Foundation, Inc.
+Copyright (C) 2002-2013 Free Software Foundation, Inc.
 See the end of the file for license conditions.
 
 Problems, fixmes and other unicode-related issues
@@ -13,9 +12,9 @@ regard to completeness.
 
  * SINGLE_BYTE_CHAR_P returns true for Latin-1 characters, which has
    undesirable effects.  E.g.:
-   (multibyte-string-p (let ((s "x")) (aset s 0 ?£) s)) => nil
-   (multibyte-string-p (concat [?£])) => nil
-   (text-char-description ?£) => "M-#"
+   (multibyte-string-p (let ((s "x")) (aset s 0 ?£) s)) => nil
+   (multibyte-string-p (concat [?£])) => nil
+   (text-char-description ?£) => "M-#"
 
         These examples are all fixed by the change of 2002-10-14, but
         there still exist questionable SINGLE_BYTE_CHAR_P in the
@@ -63,14 +62,6 @@ regard to completeness.
         dumped emacs.  But, those maps (char tables) generated while
         temacs is running can't be removed from the dumped emacs.
 
- * Translation tables for {en,de}code currently aren't supported.
-
-        This should be fixed by the changes of 2002-10-14.
-
- * Defining CCL coding systems currently doesn't work.
-
-        This should be fixed by the changes of 2003-01-30.
-
  * iso-2022 charsets get unified on i/o.
 
         With the change on 2003-01-06, decoding routines put `charset'
@@ -86,11 +77,9 @@ regard to completeness.
    spelling and calendar, but that's not a Unicode issue.)
 
  * Handle Unicode combining characters usefully, e.g. diacritics, and
-   handle more scripts specifically (à la Devanagari).  There are
+   handle more scripts specifically (à la Devanagari).  There are
    issues with canonicalization.
 
- * Bidi is a separate issue with no support currently.
-
  * We need tabular input methods, e.g. for maths symbols.  (Not
    specific to Unicode.)
 
@@ -103,29 +92,67 @@ regard to completeness.
    worry about what happens when double-width charsets covering
    non-CJK characters are unified.
 
- * Emacs 20/21 .elc files are currently not loadable.  It may or may
-   not be possible to do this properly.
+ * There are type errors lurking, e.g. in
+   Fcheck_coding_systems_region.  Define ENABLE_CHECKING to find them.
 
-        With the change on 2002-07-24, elc files generated by Emacs
-        20.3 and later are correctly loaded (including those
-        containing multibyte characters and compressed).  But, elc
-        files generated by 20.2 and the primer are still not loadable.
-        Is it really worth working on it?
+ * Old auto-save files, and similar files, such as Gnus drafts,
+   containing non-ASCII characters probably won't be re-read correctly.
 
- * Rmail won't work with non-ASCII text.  Encoding issues for Babyl
-   files need sorting out, but rms says Babyl will go before this is
-   released.
 
- * Gnus still needs some attention, and we need to get changes
-   accepted by Gnus maintainers...
+Source file encoding
+--------------------
 
- * There are type errors lurking, e.g. in
-   Fcheck_coding_systems_region.  Define ENABLE_CHECKING to find them.
+Most Emacs source files are encoded in UTF-8 (or in ASCII, which is a
+subset), but there are a few exceptions, listed below.  Perhaps
+someday these files will be converted to UTF-8, for convenience when
+using tools like 'grep -r', but this might need nontrivial changes to
+the build process.
 
- * You can grep the code for lots of fixmes.
+ * chinese-big5
 
- * Old auto-save files, and similar files, such as Gnus drafts,
-   containing non-ASCII characters probably won't be re-read correctly.
+    leim/CXTERM-DIC/4Corner.tit
+    leim/CXTERM-DIC/ARRAY30.tit
+    leim/CXTERM-DIC/ECDICT.tit
+    leim/CXTERM-DIC/ETZY.tit
+    leim/CXTERM-DIC/PY-b5.tit
+    leim/CXTERM-DIC/Punct-b5.tit
+    leim/CXTERM-DIC/QJ-b5.tit
+    leim/CXTERM-DIC/ZOZY.tit
+    leim/MISC-DIC/CTLau-b5.html
+    leim/MISC-DIC/cangjie-table.b5
+
+ * chinese-iso-8bit
+
+    leim/CXTERM-DIC/CCDOSPY.tit
+    leim/CXTERM-DIC/Punct.tit
+    leim/CXTERM-DIC/QJ.tit
+    leim/CXTERM-DIC/SW.tit
+    leim/CXTERM-DIC/TONEPY.tit
+    leim/MISC-DIC/pinyin.map
+    leim/MISC-DIC/CTLau.html
+    leim/MISC-DIC/ziranma.cin
+
+ * iso-latin-2
+
+    etc/refcards/cs-refcard.tex
+    etc/refcards/sk-survival.tex
+    etc/refcards/cs-survival.tex
+    etc/refcards/cs-dired-ref.tex
+    etc/refcards/sk-dired-ref.tex
+    etc/refcards/sk-refcard.tex
+
+ * japanese-iso-8bit
+
+    leim/SKK-DIC/SKK-JISYO.L
+    leim/ja-dic/ja-dic.el
+
+ * japanese-shift-jis
+
+    admin/charsets/mapfiles/cns2ucsdkw.txt
+
+ * no-conversion
+
+    lib-src/testfile
 
 
 This file is part of GNU Emacs.