X-Git-Url: https://code.delx.au/gnu-emacs/blobdiff_plain/1dd4f26ab6c1f14628d9fcf03b0cca7e54d52302..eed3b46ca184b5bca1dc341e3204f1539b831104:/admin/notes/unicode

diff --git a/admin/notes/unicode b/admin/notes/unicode
index 51314b199f..65df2166f2 100644
--- a/admin/notes/unicode
+++ b/admin/notes/unicode
@@ -10,8 +10,11 @@ Emacs uses the following files from the Unicode Character Database
 (a.k.a. "UCD):
 
   . UnicodeData.txt
+  . Blocks.txt
   . BidiMirroring.txt
+  . BidiBrackets.txt
   . IVD_Sequences.txt
+  . NormalizationTest.txt
 
 First, these files need to be copied into admin/unidata/, and then
 Emacs should be rebuilt for them to take effect.  Rebuilding Emacs
@@ -34,8 +37,25 @@ updates in charscript.el, but it is a good idea to look at the results
 and see if any changes in admin/unidata/blocks.awk are required.
 
 Any new scripts added by UnicodeData.txt will also need updates to
-script-representative-chars defined in fontset.el.  Other databases in
-fontset.el might also need to be updated as needed.
+script-representative-chars defined in fontset.el, and to the list
+of OTF script tags in otf-script-alist, whose source is on this page:
+
+  https://www.microsoft.com/typography/otspec/scripttags.htm
+
+Other databases in fontset.el might also need to be updated.
+
+The function 'ucs-names', defined in lisp/international/mule-cmds.el,
+might need to be updated because it knows about used and unused ranges
+of Unicode codepoints, which a new release of the Unicode Standard
+could change.
+
+Finally, test normalization functions against NormalizationTest.txt;
+in the test/ directory, run:
+
+  make lisp/international/ucs-normalize-tests
+
+See commentary in test/lisp/international/ucs-normalize-tests.el
+regarding failing lines.
 
 Problems, fixmes and other unicode-related issues
 -------------------------------------------------------------
@@ -120,8 +140,6 @@ regard to completeness.
  * Need multibyte text in menus, e.g. for the above.  (Not specific to
    Unicode -- see Emacs etc/TODO, but now mostly works with gtk.)
 
- * There's currently no support for Unicode normalization.
-
  * Populate char-width-table correctly for Unicode characters and
    worry about what happens when double-width charsets covering
    non-CJK characters are unified.