X-Git-Url: https://code.delx.au/gnu-emacs/blobdiff_plain/1dd4f26ab6c1f14628d9fcf03b0cca7e54d52302..eed3b46ca184b5bca1dc341e3204f1539b831104:/admin/notes/unicode

diff --git a/admin/notes/unicode b/admin/notes/unicode
index 51314b199f..65df2166f2 100644
--- a/admin/notes/unicode
+++ b/admin/notes/unicode
@@ -10,8 +10,11 @@ Emacs uses the following files from the Unicode Character Database
 (a.k.a. "UCD):
 
   . UnicodeData.txt
+  . Blocks.txt
   . BidiMirroring.txt
+  . BidiBrackets.txt
   . IVD_Sequences.txt
+  . NormalizationTest.txt
 
 First, these files need to be copied into admin/unidata/, and then
 Emacs should be rebuilt for them to take effect.  Rebuilding Emacs
@@ -34,8 +37,25 @@ updates in charscript.el, but it is a good idea to look at the results
 and see if any changes in admin/unidata/blocks.awk are required.
 
 Any new scripts added by UnicodeData.txt will also need updates to
-script-representative-chars defined in fontset.el. Other databases in
-fontset.el might also need to be updated as needed.
+script-representative-chars defined in fontset.el, and also the list
+of OTF script tags in otf-script-alist, whose source is on this page:
+
+  https://www.microsoft.com/typography/otspec/scripttags.htm
+
+Other databases in fontset.el might also need to be updated as needed.
+
+The function 'ucs-names', defined in lisp/international/mule-cmds.el,
+might need to be updated because it knows about used and unused ranges
+of Unicode codepoints, which a new release of the Unicode Standard
+could change.
+
+Finally, test normalization functions against NormalizationTests.txt,
+in the test/ directory run:
+
+  make lisp/international/ucs-normalize-tests
+
+See commentary in test/lisp/international/ucs-normalize-tests.el
+regarding failing lines.
 
 Problems, fixmes and other unicode-related issues
 -------------------------------------------------------------
@@ -120,8 +140,6 @@ regard to completeness.
  * Need multibyte text in menus, e.g. for the above.  (Not specific to
    Unicode -- see Emacs etc/TODO, but now mostly works with gtk.)
 
- * There's currently no support for Unicode normalization.
-
  * Populate char-width-table correctly for Unicode characters and
    worry about what happens when double-width charsets covering
    non-CJK characters are unified.
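
The normalization check that the patched notes describe (running the ucs-normalize tests against the UCD normalization test file) can be illustrated outside Emacs. The following is a minimal sketch in Python, not the Emacs test itself. It assumes the published NormalizationTest.txt layout of five semicolon-separated columns of space-separated hex code points, with `#` comments and `@Part` header lines. The helper names `decode_field` and `check_line` are hypothetical, and only the invariants on the first column are checked.

```python
import unicodedata


def decode_field(field):
    """Turn a column such as '0044 0307' into the string it denotes."""
    return "".join(chr(int(cp, 16)) for cp in field.split())


def check_line(line):
    """Return the normalization forms that fail for one data line.

    Blank lines, comment lines, and '@Part' headers yield an empty list.
    Assumed column meaning: c1 is the source, c2..c5 are its expected
    NFC, NFD, NFKC, and NFKD forms.
    """
    data = line.split("#", 1)[0].strip()  # drop the trailing comment
    if not data or data.startswith("@"):
        return []
    c1, c2, c3, c4, c5 = (decode_field(f) for f in data.split(";")[:5])
    expected = {"NFC": c2, "NFD": c3, "NFKC": c4, "NFKD": c5}
    return [form for form, want in expected.items()
            if unicodedata.normalize(form, c1) != want]


if __name__ == "__main__":
    sample = "1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE"
    print(check_line(sample))
```

Feeding every line of the test file through `check_line` and collecting the non-empty results gives a list of failing lines, similar in spirit to what the commentary in ucs-normalize-tests.el discusses. Python's `unicodedata` module may follow a different Unicode version than the file under admin/unidata/, so mismatches on recently added characters do not necessarily indicate a problem in Emacs.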