   1 \input texinfo @c -*- texinfo -*-
   2 @c %**start of header
   3 @setfilename ../../info/nxml-mode
   4 @settitle nXML Mode
   5 @c %**end of header
   6
   7 @copying
   8 This manual documents nXML mode, an Emacs major mode for editing
   9 XML with RELAX NG support.
  10
  11 Copyright @copyright{} 2007--2014 Free Software Foundation, Inc.
  12
  13 @quotation
  14 Permission is granted to copy, distribute and/or modify this document
  15 under the terms of the GNU Free Documentation License, Version 1.3 or
  16 any later version published by the Free Software Foundation; with no
  17 Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
  18 and with the Back-Cover Texts as in (a) below.  A copy of the license
  19 is included in the section entitled ``GNU Free Documentation License''.
  20
  21 (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
  22 modify this GNU manual.''
  23 @end quotation
  24 @end copying
  25
  26 @dircategory Emacs editing modes
  27 @direntry
  28 * nXML Mode: (nxml-mode).       XML editing mode with RELAX NG support.
  29 @end direntry
  30
  31
  32 @titlepage
  33 @title nXML mode
  34 @page
  35 @vskip 0pt plus 1filll
  36 @insertcopying
  37 @end titlepage
  38
  39 @contents
  40
  41
  42 @node Top
  43 @top nXML Mode
  44
  45 @insertcopying
  46
  47 This manual is not yet complete.
  48
  49 @menu
  50 * Introduction::
  51 * Completion::
  52 * Inserting end-tags::
  53 * Paragraphs::
  54 * Outlining::
  55 * Locating a schema::
  56 * DTDs::
  57 * Limitations::
  58 * GNU Free Documentation License::  The license for this documentation.
  59 @end menu
  60
  61 @node Introduction
  62 @chapter Introduction
  63
  64 nXML mode is an Emacs major-mode for editing XML documents.  It supports
  65 editing well-formed XML documents, and provides schema-sensitive editing
  66 using RELAX NG Compact Syntax.  To get started, visit a file containing an
  67 XML document, and, if necessary, use @kbd{M-x nxml-mode} to switch to nXML
  68 mode.  By default, @code{auto-mode-alist} and @code{magic-fallback-alist}
  69 put buffers in nXML mode if they have recognizable XML content or file
  70 extensions.  You may wish to customize the settings, for example to
  71 recognize different file extensions.
  72
  73 Once in nXML mode, you can type @kbd{C-h m} for basic information on the
  74 mode.
  75
  76 The @file{etc/nxml} directory in the Emacs distribution contains some data
  77 files used by nXML mode, and includes two files (@file{test-valid.xml} and
  78 @file{test-invalid.xml}) that provide examples of valid and invalid XML
  79 documents.
  80
  81 To get validation and schema-sensitive editing, you need a RELAX NG Compact
  82 Syntax (RNC) schema for your document (@pxref{Locating a schema}).  The
  83 @file{etc/schema} directory includes some schemas for popular document
  84 types.  See @url{http://relaxng.org/} for more information on RELAX NG@.
  85 You can use the @samp{Trang} program from
  86 @url{http://www.thaiopensource.com/relaxng/trang.html} to
  87 automatically create RNC schemas.  This program can:
  88
  89 @itemize @bullet
  90 @item
  91 infer an RNC schema from an instance document;
  92 @item
  93 convert a DTD to an RNC schema;
  94 @item
  95 convert a RELAX NG XML syntax schema to an RNC schema.
  96 @end itemize
  97
  98 @noindent To convert a RELAX NG XML syntax (@samp{.rng}) schema to an RNC
  99 one, you can also use the XSLT stylesheet from
 100 @url{https://github.com/oleg-pavliv/emacs/tree/master/xsl}.
 101 @ignore
 102 @c Original location, now defunct.
 103 @url{http://www.pantor.com/download.html}.
 104 @end ignore
 105
 106 To convert a W3C XML Schema to an RNC schema, you need first to convert it
 107 to RELAX NG XML syntax using the RELAX NG converter tool @code{rngconv}
 108 (built on top of MSV).  See @url{https://github.com/kohsuke/msv}
 109 and @url{https://msv.dev.java.net/}.
 110
 111 For historical discussions only, see the mailing list archives at
 112 @url{http://groups.yahoo.com/group/emacs-nxml-mode/}.  Please make all new
 113 discussions on the @samp{help-gnu-emacs} and @samp{emacs-devel} mailing
 114 lists.  Report any bugs with @kbd{M-x report-emacs-bug}.
 115
 116
 117 @node Completion
 118 @chapter Completion
 119
 120 Apart from real-time validation, the most important feature that nXML
 121 mode provides for assisting in document creation is @dfn{completion}.
 122 Completion assists the user in inserting characters at point, based on
 123 knowledge of the schema and on the contents of the buffer before
 124 point.
 125
 126 nXML mode adapts the standard GNU Emacs command for completion in a
 127 buffer: @code{completion-at-point}, which is bound to @kbd{C-M-i} and
 128 @kbd{M-@key{TAB}}.  Note that many window systems and window managers
 129 use @kbd{M-@key{TAB}} themselves (typically for switching between
 130 windows) and do not pass it to applications.  In that case, you should
 131 type @kbd{C-M-i} or @kbd{@key{ESC} @key{TAB}} for completion, or bind
 132 @code{completion-at-point} to a key that is convenient for you.  In
 133 the following, I will assume that you type @kbd{C-M-i}.
 134
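For instance, a binding along the following lines could be placed in
your init file.  This is only a sketch: the key @kbd{C-c c} is an
arbitrary choice made for illustration, and the snippet assumes that
the keymap @code{nxml-mode-map} is available once the
@code{nxml-mode} library has been loaded.

@example
;; Assumption: C-c c is free in your configuration.
(with-eval-after-load 'nxml-mode
  (define-key nxml-mode-map (kbd "C-c c") #'completion-at-point))
@end example
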
 135 nXML mode completion works by examining the symbol preceding point.
 136 This is the symbol to be completed.  The symbol to be completed may be
 137 empty.  Completion considers what symbols starting with the symbol
 138 to be completed would be valid replacements for the symbol to be
 139 completed, given the schema and the contents of the buffer before
 140 point.  These symbols are the possible completions.  An example may
 141 make this clearer.  Suppose the buffer looks like this (where @point{}
 142 indicates point):
 143
 144 @example
 145 <html xmlns="http://www.w3.org/1999/xhtml">
 146 <h@point{}
 147 @end example
 148
 149 @noindent
 150 and the schema is XHTML@.  In this context, the symbol to be completed
 151 is @samp{h}.  The possible completions consist of just
 152 @samp{head}.  Another example is:
 153
 154 @example
 155 <html xmlns="http://www.w3.org/1999/xhtml">
 156 <head>
 157 <@point{}
 158 @end example
 159
 160 @noindent
 161 In this case, the symbol to be completed is empty, and the possible
 162 completions are @samp{base}, @samp{isindex},
 163 @samp{link}, @samp{meta}, @samp{script},
 164 @samp{style}, @samp{title}.  Another example is:
 165
 166 @example
 167 <html xmlns="@point{}
 168 @end example
 169
 170 @noindent
 171 In this case, the symbol to be completed is empty, and the possible
 172 completions are just @samp{http://www.w3.org/1999/xhtml}.
 173
 174 When you type @kbd{C-M-i}, what happens depends
 175 on what the set of possible completions is.
 176
 177 @itemize @bullet
 178 @item
 179 If the set of completions is empty, nothing
 180 happens.
 181 @item
 182 If there is one possible completion, then that completion is
 183 inserted, together with any following characters that are
 184 required. For example, in this case:
 185
 186 @example
 187 <html xmlns="http://www.w3.org/1999/xhtml">
 188 <@point{}
 189 @end example
 190
 191 @noindent
 192 @kbd{C-M-i} will yield
 193
 194 @example
 195 <html xmlns="http://www.w3.org/1999/xhtml">
 196 <head@point{}
 197 @end example
 198 @item
 199 If there is more than one possible completion, but all
 200 possible completions share a common non-empty prefix, then that prefix
 201 is inserted. For example, suppose the buffer is:
 202
 203 @example
 204 <html x@point{}
 205 @end example
 206
 207 @noindent
 208 The symbol to be completed is @samp{x}. The possible completions are
 209 @samp{xmlns} and @samp{xml:lang}.  These share a common prefix of
 210 @samp{xml}.  Thus, @kbd{C-M-i} will yield:
 211
 212 @example
 213 <html xml@point{}
 214 @end example
 215
 216 @noindent
 217 Typically, you would do @kbd{C-M-i} again, which would have the result
 218 described in the next item.
 219 @item
 220 If there is more than one possible completion, but the
 221 possible completions do not share a non-empty prefix, then Emacs will
 222 prompt you to input the symbol in the minibuffer, initializing the
 223 minibuffer with the symbol to be completed, and popping up a buffer
 224 showing the possible completions.  You can now input the symbol to be
 225 inserted.  The symbol you input will be inserted in the buffer instead
 226 of the symbol to be completed.  Emacs will then insert any required
 227 characters after the symbol.  For example, if the buffer contains:
 228
 229 @example
 230 <html xml@point{}
 231 @end example
 232
 233 @noindent
 234 Emacs will prompt you in the minibuffer with
 235
 236 @example
 237 Attribute: xml@point{}
 238 @end example
 239
 240 @noindent
 241 and the buffer showing possible completions will contain
 242
 243 @example
 244 Possible completions are:
 245 xml:lang                           xmlns
 246 @end example
 247
 248 @noindent
 249 If you input @kbd{xmlns}, the result will be:
 250
 251 @example
 252 <html xmlns="@point{}
 253 @end example
 254
 255 @noindent
 256 (If you do @kbd{C-M-i} again, the namespace URI will be
 257 inserted. Should that happen automatically?)
 258 @end itemize
 259
 260 @node Inserting end-tags
 261 @chapter Inserting end-tags
 262
 263 The main redundancy in XML syntax is end-tags.  nXML mode provides
 264 several ways to make it easier to enter end-tags.  You can use all of
 265 these without a schema.
 266
 267 You can use @kbd{C-M-i} after @samp{</} to complete the rest of the
 268 end-tag.
 269
 270 @kbd{C-c C-f} inserts an end-tag for the element containing
 271 point. This command is useful when you want to input the start-tag,
 272 then input the content and finally input the end-tag. The @samp{f}
 273 is mnemonic for finish.
 274
 275 If you want to keep tags balanced and input the end-tag at the
 276 same time as the start-tag, before inputting the content, then you can
 277 use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
 278 the end-tag and leaves point before the end-tag.  @kbd{C-c C-b}
 279 is similar but more convenient for block-level elements: it puts the
 280 start-tag, point and the end-tag on successive lines, appropriately
 281 indented. The @samp{i} is mnemonic for inline and the
 282 @samp{b} is mnemonic for block.
 283
 284 Finally, you can customize nXML mode so that @kbd{/} automatically
 285 inserts the rest of the end-tag when it occurs after @samp{<}, by
 286 doing
 287
 288 @display
 289 @kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
 290 @end display
 291
 292 @noindent
 293 and then following the instructions in the displayed buffer.
 294
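Alternatively, the variable can be set directly in your init file.  A
minimal sketch, assuming you want the behavior enabled in every nXML
buffer:

@example
;; Make "/" typed after "<" insert the rest of the end-tag.
(setq nxml-slash-auto-complete-flag t)
@end example
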
 295 @node Paragraphs
 296 @chapter Paragraphs
 297
 298 Emacs has several commands that operate on paragraphs, most
 299 notably @kbd{M-q}. nXML mode redefines these to work in a way
 300 that is useful for XML@.  The exact rules that are used to find the
 301 beginning and end of a paragraph are complicated; they are designed
 302 mainly to ensure that @kbd{M-q} does the right thing.
 303
 304 A paragraph consists of one or more complete, consecutive lines.
 305 A group of lines is not considered a paragraph unless it contains some
 306 non-whitespace characters between tags or inside comments.  A blank
 307 line separates paragraphs.  A single tag on a line by itself also
 308 separates paragraphs.  More precisely, if one tag together with any
 309 leading and trailing whitespace completely occupy one or more lines,
 310 then those lines will not be included in any paragraph.
 311
 312 A start-tag at the beginning of the line (possibly indented) may
 313 be treated as starting a paragraph.  Similarly, an end-tag at the end
 314 of the line may be treated as ending a paragraph. The following rules
 315 are used to determine whether such a tag is in fact treated as a
 316 paragraph boundary:
 317
 318 @itemize @bullet
 319 @item
 320 If the schema does not allow text at that point, then it
 321 is a paragraph boundary.
 322 @item
 323 If the end-tag corresponding to the start-tag is not at
 324 the end of its line, or the start-tag corresponding to the end-tag is
 325 not at the beginning of its line, then it is not a paragraph
 326 boundary. For example, in
 327
 328 @example
 329 <p>This is a paragraph with an
 330 <emph>emphasized</emph> phrase.
 331 @end example
 332
 333 @noindent
 334 the @samp{<emph>} start-tag would not be considered as
 335 starting a paragraph, because its corresponding end-tag is not at the
 336 end of the line.
 337 @item
 338 If there is text that is a sibling in the element tree, then
 339 it is not a paragraph boundary.  For example, in
 340
 341 @example
 342 <p>This is a paragraph with an
 343 <emph>emphasized phrase that takes one source line</emph>
 344 @end example
 345
 346 @noindent
 347 the @samp{<emph>} start-tag would not be considered as
 348 starting a paragraph, even though its end-tag is at the end of its
 349 line, because the text @samp{This is a paragraph with an}
 350 is a sibling of the @samp{emph} element.
 351 @item
 352 Otherwise, it is a paragraph boundary.
 353 @end itemize
 354
 355 @node Outlining
 356 @chapter Outlining
 357
 358 nXML mode allows you to display all or part of a buffer as an
 359 outline, in a similar way to Emacs's outline mode.  An outline in nXML
 360 mode is based on recognizing two kinds of element: sections and
 361 headings.  There is one heading for every section and one section for
 362 every heading.  A section contains its heading as or within its first
 363 child element.  A section also contains its subordinate sections (its
 364 subsections).  The text content of a section consists of anything in a
 365 section that is neither a subsection nor a heading.
 366
 367 Note that this is a different model from that used by XHTML@.
 368 nXML mode's outline support will not be useful for XHTML unless you
 369 adopt a convention of adding a @code{div} to enclose each
 370 section, rather than having sections implicitly delimited by different
 371 @code{h@var{n}} elements.  This limitation may be removed
 372 in a future version.
 373
 374 The variable @code{nxml-section-element-name-regexp} gives
 375 a regexp for the local names (i.e., the part of the name following any
 376 prefix) of section elements. The variable
 377 @code{nxml-heading-element-name-regexp} gives a regexp for the
 378 local names of heading elements. For an element to be recognized
 379 as a section:
 380
 381 @itemize @bullet
 382 @item
 383 its start-tag must occur at the beginning of a line
 384 (possibly indented);
 385 @item
 386 its local name must match
 387 @code{nxml-section-element-name-regexp};
 388 @item
 389 either its first child element or a descendant of that
 390 first child element must have a local name that matches
 391 @code{nxml-heading-element-name-regexp}; the first such element
 392 is treated as the section's heading.
 393 @end itemize
 394
 395 @noindent
 396 You can customize these variables using @kbd{M-x
 397 customize-variable}.
 398
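As a sketch, the same variables can also be set from Lisp.  The
element names @samp{topic} and @samp{caption} below are hypothetical
and stand for whatever your vocabulary uses for sections and headings.

@example
;; Assumption: sections are <topic> or <section> elements and
;; headings are <caption> or <title> elements.
(setq nxml-section-element-name-regexp "topic\\|section")
(setq nxml-heading-element-name-regexp "caption\\|title")
@end example
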
 399 There are three possible outline states for a section:
 400
 401 @itemize @bullet
 402 @item
 403 normal, showing everything, including its heading, text
 404 content and subsections; each subsection is displayed according to the
 405 state of that subsection;
 406 @item
 407 showing just its heading, with both its text content and
 408 its subsections hidden; all subsections are hidden regardless of their
 409 state;
 410 @item
 411 showing its heading and its subsections, with its text
 412 content hidden; each subsection is displayed according to the state of
 413 that subsection.
 414 @end itemize
 415
 416 In the last two states, where the text content is hidden, the
 417 heading is displayed specially, in an abbreviated form. An element
 418 like this:
 419
 420 @example
 421 <section>
 422 <title>Food</title>
 423 <para>There are many kinds of food.</para>
 424 </section>
 425 @end example
 426
 427 @noindent
 428 would be displayed on a single line like this:
 429
 430 @example
 431 <-section>Food...</>
 432 @end example
 433
 434 @noindent
 435 If there are hidden subsections, then a @code{+} will be used
 436 instead of a @code{-} like this:
 437
 438 @example
 439 <+section>Food...</>
 440 @end example
 441
 442 @noindent
 443 If there are non-hidden subsections, then the section will instead be
 444 displayed like this:
 445
 446 @example
 447 <-section>Food...
 448   <-section>Delicious Food...</>
 449   <-section>Distasteful Food...</>
 450 </-section>
 451 @end example
 452
 453 @noindent
 454 The heading is always displayed with an indent that corresponds to its
 455 depth in the outline, even if it is not actually indented in the buffer.
 456 The variable @code{nxml-outline-child-indent} controls how much
 457 a subheading is indented with respect to its parent heading when the
 458 heading is being displayed specially.
 459
 460 Commands to change the outline state of sections are bound to
 461 key sequences that start with @kbd{C-c C-o} (@kbd{o} is
 462 mnemonic for outline).  The third and final key has been chosen to be
 463 consistent with outline mode.  In the following descriptions
 464 current section means the section containing point, or, more precisely,
 465 the innermost section containing the character immediately following
 466 point.
 467
 468 @itemize @bullet
 469 @item
 470 @kbd{C-c C-o C-a} shows all sections in the buffer
 471 normally.
 472 @item
 473 @kbd{C-c C-o C-t} hides the text content
 474 of all sections in the buffer.
 475 @item
 476 @kbd{C-c C-o C-c} hides the text content
 477 of the current section.
 478 @item
 479 @kbd{C-c C-o C-e} shows the text content
 480 of the current section.
 481 @item
 482 @kbd{C-c C-o C-d} hides the text content
 483 and subsections of the current section.
 484 @item
 485 @kbd{C-c C-o C-s} shows the current section
 486 and all its direct and indirect subsections normally.
 487 @item
 488 @kbd{C-c C-o C-k} shows the headings of the
 489 direct and indirect subsections of the current section.
 490 @item
 491 @kbd{C-c C-o C-l} hides the text content of the
 492 current section and of its direct and indirect
 493 subsections.
 494 @item
 495 @kbd{C-c C-o C-i} shows the headings of the
 496 direct subsections of the current section.
 497 @item
 498 @kbd{C-c C-o C-o} hides as much as possible without
 499 hiding the current section's text content; the headings of ancestor
 500 sections of the current section and their child sections will
 501 not be hidden.
 502 @end itemize
 503
 504 When a heading is displayed specially, you can use
 505 @key{RET} in that heading to show the text content of the section
 506 in the same way as @kbd{C-c C-o C-e}.
 507
 508 You can also use the mouse to change the outline state:
 509 @kbd{S-mouse-2} hides the text content of a section in the same
 510 way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
 511 displayed heading shows the text content of the section in the same
 512 way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
 513 displayed start-tag toggles the display of subheadings on and
 514 off.
 515
 516 The outline state for each section is stored with the first
 517 character of the section (as a text property). Every command that
 518 changes the outline state of any section updates the display of the
 519 buffer so that each section is displayed correctly according to its
 520 outline state.  If the section structure is subsequently changed, then
 521 it is possible for the display to no longer correctly reflect the
 522 stored outline state. @kbd{C-c C-o C-r} can be used to refresh
 523 the display so it is correct again.
 524
 525 @node Locating a schema
 526 @chapter Locating a schema
 527
 528 nXML mode has a configurable set of rules to locate a schema for
 529 the file being edited.  The rules are contained in one or more schema
 530 locating files, which are XML documents.
 531
 532 The variable @samp{rng-schema-locating-files} specifies
 533 the list of the file-names of schema locating files that nXML mode
 534 should use.  The order of the list is significant: when file
 535 @var{x} occurs in the list before file @var{y} then rules
 536 from file @var{x} have precedence over rules from file
 537 @var{y}.  A filename specified in
 538 @samp{rng-schema-locating-files} may be relative. If so, it will
 539 be resolved relative to the document for which a schema is being
 540 located. It is not an error if relative file-names in
 541 @samp{rng-schema-locating-files} do not exist. You can use
 542 @kbd{M-x customize-variable @key{RET} rng-schema-locating-files
 543 @key{RET}} to customize the list of schema locating
 544 files.
 545
 546 By default, the @samp{rng-schema-locating-files} list has two
 547 members: @samp{schemas.xml}, and
 548 @samp{@var{dist-dir}/schema/schemas.xml} where
 549 @samp{@var{dist-dir}} is the directory containing the nXML
 550 distribution. The first member will cause nXML mode to use a file
 551 @samp{schemas.xml} in the same directory as the document being
 552 edited if such a file exists.  The second member contains rules for the
 553 schemas that are included with the nXML distribution.
 554
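The list can also be extended from Lisp.  The following sketch puts a
personal schema locating file at the front of the list, so that its
rules take precedence over the defaults.  The file name is a
hypothetical choice, and the snippet assumes that @code{rng-loc} is
the library that defines the variable.

@example
;; Assumption: a file named schemas.xml in `user-emacs-directory'
;; is your own schema locating file.
(with-eval-after-load 'rng-loc
  (add-to-list 'rng-schema-locating-files
               (expand-file-name "schemas.xml" user-emacs-directory)))
@end example
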
 555 @menu
 556 * Commands for locating a schema::
 557 * Schema locating files::
 558 @end menu
 559
 560 @node Commands for locating a schema
 561 @section Commands for locating a schema
 562
 563 The command @kbd{C-c C-s C-w} will tell you what schema
 564 is currently being used.
 565
 566 The rules for locating a schema are applied automatically when
 567 you visit a file in nXML mode. However, if you have just created a new
 568 file and the schema cannot be inferred from the file-name, then this
 569 will not locate the right schema.  In this case, you should insert the
 570 start-tag of the root element and then use the command @kbd{C-c C-s
 571 C-a}, which reapplies the rules based on the current content of
 572 the document.  It is usually not necessary to insert the complete
 573 start-tag; often just @samp{<@var{name}} is
 574 enough.
 575
 576 If you want to use a schema that has not yet been added to the
 577 schema locating files, you can use the command @kbd{C-c C-s C-f}
 578 to manually select the file containing the schema for the document in
 579 the current buffer.  Emacs will read the file-name of the schema from the
 580 minibuffer. After reading the file-name, Emacs will ask whether you
 581 wish to add a rule to a schema locating file that persistently
 582 associates the document with the selected schema.  The rule will be
 583 added to the first file in the list specified by
 584 @samp{rng-schema-locating-files}; it will create the file if
 585 necessary, but will not create a directory. If the variable
 586 @samp{rng-schema-locating-files} has not been customized, this
 587 means that the rule will be added to the file @samp{schemas.xml}
 588 in the same directory as the document being edited.
 589
 590 The command @kbd{C-c C-s C-t} allows you to select a schema by
 591 specifying an identifier for the type of the document.  The schema
 592 locating files determine the available type identifiers and what
 593 schema is used for each type identifier. This is useful when it is
 594 impossible to infer the right schema from either the file-name or the
 595 content of the document, even though the schema is already in the
 596 schema locating file.  A situation in which this can occur is when
 597 there are multiple variants of a schema where all valid documents have
 598 the same document element.  For example, XHTML has Strict and
 599 Transitional variants.  In a situation like this, a schema locating file
 600 can define a type identifier for each variant. As with @kbd{C-c
 601 C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
 602 locating file that persistently associates the document with the
 603 specified type identifier.
 604
 605 The command @kbd{C-c C-s C-l} adds a rule to a schema
 606 locating file that persistently associates the document with
 607 the schema that is currently being used.
 608
 609 @node Schema locating files
 610 @section Schema locating files
 611
 612 Each schema locating file specifies a list of rules.  The rules
 613 from each file are appended in order. To locate a schema each rule is
 614 applied in turn until a rule matches.  The first matching rule is then
 615 used to determine the schema.
 616
 617 Schema locating files are designed to be useful for other
 618 applications that need to locate a schema for a document. In fact,
 619 there is nothing specific to locating schemas in the design; it could
 620 equally well be used for locating a stylesheet.
 621
 622 @menu
 623 * Schema locating file syntax basics::
 624 * Using the document's URI to locate a schema::
 625 * Using the document element to locate a schema::
 626 * Using type identifiers in schema locating files::
 627 * Using multiple schema locating files::
 628 @end menu
 629
 630 @node Schema locating file syntax basics
 631 @subsection Schema locating file syntax basics
 632
 633 There is a schema for schema locating files in the file
 634 @samp{locate.rnc} in the schema directory.  Schema locating
 635 files must be valid with respect to this schema.
 636
 637 The document element of a schema locating file must be
 638 @samp{locatingRules} and the namespace URI must be
 639 @samp{http://thaiopensource.com/ns/locating-rules/1.0}.  The
 640 children of the document element specify rules. The order of the
 641 children is the same as the order of the rules.  Here's a complete
 642 example of a schema locating file:
 643
 644 @example
 645 <?xml version="1.0"?>
 646 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 647   <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
 648   <documentElement localName="book" uri="docbook.rnc"/>
 649 </locatingRules>
 650 @end example
 651
 652 @noindent
 653 This says to use the schema @samp{xhtml.rnc} for a document whose document
 654 element has the namespace URI @samp{http://www.w3.org/1999/xhtml}, and to
 655 use the schema @samp{docbook.rnc} for a document whose document element has
 656 the local name @samp{book}.  If the document element had both a namespace URI
 657 of @samp{http://www.w3.org/1999/xhtml} and a local name of
 658 @samp{book}, then the matching rule that comes first will be
 659 used and so the schema @samp{xhtml.rnc} would be used.  There is
 660 no precedence between different types of rule; the first matching rule
 661 of any type is used.
 662
 663 As usual with XML-related technologies, resources are identified
 664 by URIs.  The @samp{uri} attribute identifies the schema by
 665 specifying the URI@.  The URI may be relative.  If so, it is resolved
 666 relative to the URI of the schema locating file that contains the
 667 attribute.  This means that if the value of the @samp{uri} attribute
 668 does not contain a @samp{/}, then it will refer to a filename in
 669 the same directory as the schema locating file.
 670
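For local files, this resolution behaves much like expanding a file
name against the directory of the schema locating file.  The sketch
below illustrates the idea with ordinary Emacs file-name functions; it
is not the code nXML mode uses, and the paths are made up.

@example
;; Assumption: the schema locating file is
;; /home/jjc/schemas/schemas.xml and one of its rules has
;; uri="xhtml.rnc".
(expand-file-name
 "xhtml.rnc"
 (file-name-directory "/home/jjc/schemas/schemas.xml"))
;; On a GNU/Linux system this names /home/jjc/schemas/xhtml.rnc.
@end example
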
 671 @node Using the document's URI to locate a schema
 672 @subsection Using the document's URI to locate a schema
 673
 674 A @samp{uri} rule locates a schema based on the URI of the
 675 document.  The @samp{uri} attribute specifies the URI of the
 676 schema.  The @samp{resource} attribute can be used to specify
 677 the schema for a particular document.  For example,
 678
 679 @example
 680 <uri resource="spec.xml" uri="docbook.rnc"/>
 681 @end example
 682
 683 @noindent
 684 specifies that the schema for @samp{spec.xml} is
 685 @samp{docbook.rnc}.
 686
 687 The @samp{pattern} attribute can be used instead of the
 688 @samp{resource} attribute to specify the schema for any document
 689 whose URI matches a pattern.  The pattern has the same syntax as an
 690 absolute or relative URI except that the path component of the URI can
 691 use a @samp{*} character to stand for zero or more characters
 692 within a path segment (i.e., any character other than @samp{/}).
 693 Typically, the URI pattern looks like a relative URI, but, whereas a
 694 relative URI in the @samp{resource} attribute is resolved into a
 695 particular absolute URI using the base URI of the schema locating
 696 file, a relative URI pattern matches if it matches some number of
 697 complete path segments of the document's URI ending with the last path
 698 segment of the document's URI@. For example,
 699
 700 @example
 701 <uri pattern="*.xsl" uri="xslt.rnc"/>
 702 @end example
 703
 704 @noindent
 705 specifies that the schema for documents with a URI whose path ends
 706 with @samp{.xsl} is @samp{xslt.rnc}.
 707
 708 A @samp{transformURI} rule locates a schema by
 709 transforming the URI of the document. The @samp{fromPattern}
 710 attribute specifies a URI pattern with the same meaning as the
 711 @samp{pattern} attribute of the @samp{uri} element.  The
 712 @samp{toPattern} attribute is a URI pattern that is used to
 713 generate the URI of the schema.  Each @samp{*} in the
 714 @samp{toPattern} is replaced by the string that matched the
 715 corresponding @samp{*} in the @samp{fromPattern}.  The
 716 resulting string is appended to the initial part of the document's URI
 717 that was not explicitly matched by the @samp{fromPattern}.  The
 718 rule matches only if the transformed URI identifies an existing
 719 resource.  For example, the rule
 720
 721 @example
 722 <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
 723 @end example
 724
 725 @noindent
 726 would transform the URI @samp{file:///home/jjc/docs/spec.xml}
 727 into the URI @samp{file:///home/jjc/docs/spec.rnc}.  Thus, this
 728 rule specifies that to locate a schema for a document
 729 @samp{@var{foo}.xml}, Emacs should test whether a file
 730 @samp{@var{foo}.rnc} exists in the same directory as
 731 @samp{@var{foo}.xml}, and, if so, should use it as the
 732 schema.
 733
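The matching and substitution described above can be sketched in
Emacs Lisp as follows.  This illustrates the documented behavior for
the simple case of a relative pattern; it is not nXML mode's actual
implementation.  The function names are hypothetical, and the check
that the resulting URI identifies an existing resource is omitted.

@example
(defun my-uri-pattern-regexp (pattern)
  "Return a regexp matching URIs whose last segments match PATTERN.
Each `*' in PATTERN matches any characters other than `/'."
  (concat "\\(?:\\`\\|/\\)\\("
          (mapconcat #'regexp-quote
                     (split-string pattern "\\*")
                     "\\([^/]*\\)")
          "\\)\\'"))

(defun my-transform-uri (uri from to)
  "Rewrite URI according to the patterns FROM and TO, or return nil."
  (when (string-match (my-uri-pattern-regexp from) uri)
    (let ((prefix (substring uri 0 (match-beginning 1)))
          (stars nil)
          (i 2))
      ;; Collect the text matched by each `*' in FROM.
      (while (match-beginning i)
        (push (match-string i uri) stars)
        (setq i (1+ i)))
      (setq stars (nreverse stars))
      ;; Substitute those strings for the `*'s in TO, in order.
      (concat prefix
              (replace-regexp-in-string
               "\\*" (lambda (_) (or (pop stars) "")) to t t)))))

(my-transform-uri "file:///home/jjc/docs/spec.xml" "*.xml" "*.rnc")
;; evaluates to "file:///home/jjc/docs/spec.rnc"
@end example
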
 734 @node Using the document element to locate a schema
 735 @subsection Using the document element to locate a schema
 736
 737 A @samp{documentElement} rule locates a schema based on
 738 the local name and prefix of the document element. For example, a rule
 739
 740 @example
 741 <documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
 742 @end example
 743
 744 @noindent
 745 specifies that when the name of the document element is
 746 @samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
 747 as the schema. Either the @samp{prefix} or
 748 @samp{localName} attribute may be omitted to allow any prefix or
 749 local name.
 750
 751 A @samp{namespace} rule locates a schema based on the
 752 namespace URI of the document element. For example, a rule
 753
 754 @example
 755 <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
 756 @end example
 757
 758 @noindent
 759 specifies that when the namespace URI of the document element is
 760 @samp{http://www.w3.org/1999/XSL/Transform}, then
 761 @samp{xslt.rnc} should be used as the schema.
 762
 763 @node Using type identifiers in schema locating files
 764 @subsection Using type identifiers in schema locating files
 765
 766 Type identifiers allow a level of indirection in locating the
 767 schema for a document.  Instead of associating the document directly
 768 with a schema URI, the document is associated with a type identifier,
 769 which is in turn associated with a schema URI@. nXML mode does not
 770 constrain the format of type identifiers.  They can be simply strings
 771 without any formal structure or they can be public identifiers or
 772 URIs.  Note that these type identifiers have nothing to do with the
 773 DOCTYPE declaration.  When comparing type identifiers, whitespace is
 774 normalized in the same way as with the @samp{xsd:token}
 775 datatype: leading and trailing whitespace is stripped; other sequences
 776 of whitespace are normalized to a single space character.
 777
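The normalization can be sketched in Emacs Lisp as follows.  The
function name is hypothetical and the code is an illustration of the
rule just described, not the routine nXML mode itself uses.

@example
(defun my-normalize-type-id (id)
  "Strip and collapse whitespace in the type identifier ID."
  (mapconcat #'identity (split-string id "[ \t\n\r]+" t) " "))

(my-normalize-type-id "  XHTML   Strict\n")
;; evaluates to "XHTML Strict"
@end example
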
 778 Each of the rules described in previous sections that uses a
 779 @samp{uri} attribute to specify a schema, can instead use a
 780 @samp{typeId} attribute to specify a type identifier.  The type
 781 identifier can be associated with a URI using a @samp{typeId}
 782 element. For example,
 783
 784 @example
 785 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 786   <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
 787   <typeId id="XHTML" typeId="XHTML Strict"/>
 788   <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
 789   <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
 790 </locatingRules>
 791 @end example

@noindent
declares three type identifiers: @samp{XHTML} (representing the
default variant of XHTML to be used), @samp{XHTML Strict} and
@samp{XHTML Transitional}.  Such a schema locating file would
use @samp{xhtml-strict.rnc} for a document whose namespace is
@samp{http://www.w3.org/1999/xhtml}.  But it is considerably
more flexible than a schema locating file that simply specified

@example
<namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
@end example

@noindent
A user can easily use @kbd{C-c C-s C-t} to select between XHTML
Strict and XHTML Transitional. Also, a user can easily add a schema
locating file

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <typeId id="XHTML" typeId="XHTML Transitional"/>
</locatingRules>
@end example

@noindent
that makes the default variant of XHTML be XHTML Transitional.

@node Using multiple schema locating files
@subsection Using multiple schema locating files

The @samp{include} element includes rules from another
schema locating file.  The behavior is exactly as if the rules from
that file were included in place of the @samp{include} element.
Relative URIs are resolved into absolute URIs before the inclusion is
performed. For example,

@example
<include rules="../rules.xml"/>
@end example

@noindent
includes the rules from @samp{../rules.xml}.

The process of locating a schema takes as input a list of schema
locating files.  The rules in all these files and in the files they
include are resolved into a single list of rules, which are applied
strictly in order.  Sometimes this order is not what is needed.
For example, suppose you have two schema locating files, a private
file

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
</locatingRules>
@end example

@noindent
followed by a public file

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example

@noindent
The effect of these two files is that the XHTML @samp{namespace}
rule takes precedence over the @samp{transformURI} rule, which
is almost certainly not what is needed.  This can be solved by adding
an @samp{applyFollowingRules} element to the private file.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <applyFollowingRules ruleType="transformURI"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
</locatingRules>
@end example
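
As a rough sketch of the intended effect (the exact processing is an
assumption inferred from the element's name and the example above), the
two files would then behave approximately like the following single
list of rules, in which the @samp{transformURI} rule is tried first:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- applied early because of applyFollowingRules -->
  <transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example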

@node DTDs
@chapter DTDs

nXML mode is designed to support the creation of standalone XML
documents that do not depend on a DTD@.  Although it is common practice
to insert a DOCTYPE declaration referencing an external DTD, this has
undesirable side-effects.  It means that the document is no longer
self-contained. It also means that different XML parsers may interpret
the document in different ways, since the XML Recommendation does not
require XML parsers to read the DTD@.  With DTDs, it was impractical to
get validation without using an external DTD or reference to a
parameter entity.  With RELAX NG and other schema languages, you can
simultaneously get the benefits of validation and standalone XML
documents.  Therefore, I recommend that you do not reference an
external DTD in your XML documents.

One problem is entities for characters. Typically, as well as
providing validation, DTDs also provide a set of character entities
for documents to use. Schemas cannot provide this functionality,
because schema validation happens after XML parsing.  The recommended
solution is to either use the Unicode characters directly, or, if this
is impractical, use character references.  nXML mode supports this by
providing commands for entering characters and character references
using the Unicode names, and can display the glyph corresponding to a
character reference.
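
For example, in place of an entity such as @samp{&eacute;}, which a DTD
would have to declare, a document can use a numeric character
reference, which needs no declaration.  The element name @samp{p} below
is only illustrative:

@example
<p>caf&#xE9;</p>
@end example

@noindent
Here @samp{&#xE9;} refers to U+00E9, LATIN SMALL LETTER E WITH ACUTE.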

@node Limitations
@chapter Limitations

nXML mode has some limitations:

@itemize @bullet
@item
DTD support is limited.  Internal parsed general entities declared
in the internal subset are supported provided they do not contain
elements. Other usage of DTDs is ignored.
@item
The restrictions on RELAX NG schemas in section 7 of the RELAX NG
specification are not enforced.
@end itemize
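
As an illustration of the first item, the following sketch declares an
internal parsed general entity in the internal subset whose replacement
text contains no elements, which is the supported case.  The names
@samp{doc} and @samp{product} are made up for the example:

@example
<!DOCTYPE doc [
  <!ENTITY product "nXML mode">
]>
<doc>This manual describes &product;.</doc>
@end example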

@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include doclicense.texi

@bye