Add tests for ucs-normalize.el

diff --git a/admin/notes/unicode b/admin/notes/unicode

index 0b2ce90067aff6f2f7cc222aa2bd968d4b8254b3..65df2166f28b36fa5e6e1ded9fcb1751a6e20388 100644
--- a/admin/notes/unicode
+++ b/admin/notes/unicode
@@ -14,6 +14,7 @@ Emacs uses the following files from the Unicode Character Database
    . BidiMirroring.txt
    . BidiBrackets.txt
    . IVD_Sequences.txt
+  . NormalizationTest.txt
  
  First, these files need to be copied into admin/unidata/, and then
  Emacs should be rebuilt for them to take effect.  Rebuilding Emacs
@@ -36,14 +37,26 @@ updates in charscript.el, but it is a good idea to look at the results
  and see if any changes in admin/unidata/blocks.awk are required.
  
  Any new scripts added by UnicodeData.txt will also need updates to
-script-representative-chars defined in fontset.el.  Other databases in
-fontset.el might also need to be updated as needed.
+script-representative-chars defined in fontset.el, and also the list
+of OTF script tags in otf-script-alist, whose source is on this page:
+
+  https://www.microsoft.com/typography/otspec/scripttags.htm
+
+Other databases in fontset.el might also need to be updated as needed.
  
  The function 'ucs-names', defined in lisp/international/mule-cmds.el,
  might need to be updated because it knows about used and unused ranges
  of Unicode codepoints, which a new release of the Unicode Standard
  could change.
  
+Finally, test the normalization functions against NormalizationTest.txt.
+In the test/ directory, run:
+
+  make lisp/international/ucs-normalize-tests
+
+See commentary in test/lisp/international/ucs-normalize-tests.el
+regarding failing lines.
+
  Problems, fixmes and other unicode-related issues
  -------------------------------------------------------------
  
@@ -127,8 +140,6 @@ regard to completeness.
   * Need multibyte text in menus, e.g. for the above.  (Not specific to
     Unicode -- see Emacs etc/TODO, but now mostly works with gtk.)
  
- * There's currently no support for Unicode normalization.
-
   * Populate char-width-table correctly for Unicode characters and
     worry about what happens when double-width charsets covering
     non-CJK characters are unified.
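
Note on the NormalizationTest.txt check added above: each data line in
that file holds five semicolon-separated columns (c1 to c5) of
space-separated hex codepoints, and the file header lists the
invariants that a conforming implementation must satisfy for NFC, NFD,
NFKC and NFKD.  The sketch below illustrates those invariants for a
single line.  It is written in Python with the standard unicodedata
module purely as an illustration; it is not how
ucs-normalize-tests.el is implemented, and the helper names
parse_fields and check_line are hypothetical.

```python
import unicodedata


def parse_fields(line):
    """Return the five columns c1..c5 of one data line as strings."""
    # Drop the trailing "# comment" part, then split on semicolons.
    data = line.split("#", 1)[0]
    columns = data.split(";")[:5]
    # Each column is a run of space-separated hex codepoints.
    return ["".join(chr(int(cp, 16)) for cp in col.split()) for col in columns]


def check_line(line):
    """Return True if the line satisfies the conformance invariants."""
    c1, c2, c3, c4, c5 = parse_fields(line)
    nf = unicodedata.normalize
    return (
        # NFC: c2 == NFC(c1) == NFC(c2) == NFC(c3); c4 == NFC(c4) == NFC(c5)
        all(nf("NFC", c) == c2 for c in (c1, c2, c3))
        and all(nf("NFC", c) == c4 for c in (c4, c5))
        # NFD: c3 == NFD(c1) == NFD(c2) == NFD(c3); c5 == NFD(c4) == NFD(c5)
        and all(nf("NFD", c) == c3 for c in (c1, c2, c3))
        and all(nf("NFD", c) == c5 for c in (c4, c5))
        # NFKC: c4 is the NFKC form of every column
        and all(nf("NFKC", c) == c4 for c in (c1, c2, c3, c4, c5))
        # NFKD: c5 is the NFKD form of every column
        and all(nf("NFKD", c) == c5 for c in (c1, c2, c3, c4, c5))
    )


# Sample line in the file's format (LATIN CAPITAL LETTER D WITH DOT ABOVE).
SAMPLE = "1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE"
```

Calling check_line(SAMPLE) exercises all four normalization forms on
one entry; a full run would apply the same check to every data line of
the file and report the lines for which it fails.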