1 ======================================================================
2 TABLE OF CONTENTS
3 ======================================================================
4 ======================================================================
5 0: Preface
6 1: Introduction
7 2: Chess data representation
8 2.1: Data interchange incompatibility
9 2.2: Specification goals
10 2.3: A sample PGN game
11 3: Formats: import and export
12 3.1: Import format allows for manually prepared data
13 3.2: Export format used for program generated output
14 3.2.1: Byte equivalence
15 3.2.2: Archival storage and the newline character
16 3.2.3: Speed of processing
17 3.2.4: Reduced export format
18 4: Lexicographical issues
19 4.1: Character codes
20 4.2: Tab characters
21 4.3: Line lengths
22 5: Commentary
23 6: Escape mechanism
24 7: Tokens
25 8: Parsing games
26 8.1: Tag pair section
27 8.1.1: Seven Tag Roster
28 8.1.1.1: The Event tag
29 8.1.1.2: The Site tag
30 8.1.1.3: The Date tag
31 8.1.1.4: The Round tag
32 8.1.1.5: The White tag
33 8.1.1.6: The Black tag
34 8.1.1.7: The Result tag
35 8.2: Movetext section
36 8.2.1: Movetext line justification
37 8.2.2: Movetext move number indications
38 8.2.2.1: Import format move number indications
39 8.2.2.2: Export format move number indications
40 8.2.3: Movetext SAN (Standard Algebraic Notation)
41 8.2.3.1: Square identification
42 8.2.3.2: Piece identification
43 8.2.3.3: Basic SAN move construction
44 8.2.3.4: Disambiguation
45 8.2.3.5: Check and checkmate indication characters
46 8.2.3.6: SAN move length
47 8.2.3.7: Import and export SAN
48 8.2.3.8: SAN move suffix annotations
49 8.2.4: Movetext NAG (Numeric Annotation Glyph)
50 8.2.5: Movetext RAV (Recursive Annotation Variation)
51 8.2.6: Game Termination Markers
52 9: Supplemental tag names
53 9.1: Player related information
54 9.1.1: Tags: WhiteTitle, BlackTitle
55 9.1.2: Tags: WhiteElo, BlackElo
56 9.1.3: Tags: WhiteUSCF, BlackUSCF
57 9.1.4: Tags: WhiteNA, BlackNA
58 9.1.5: Tags: WhiteType, BlackType
59 9.2: Event related information
60 9.2.1: Tag: EventDate
61 9.2.2: Tag: EventSponsor
62 9.2.3: Tag: Section
63 9.2.4: Tag: Stage
64 9.2.5: Tag: Board
65 9.3: Opening information (locale specific)
66 9.3.1: Tag: Opening
67 9.3.2: Tag: Variation
68 9.3.3: Tag: SubVariation
69 9.4: Opening information (third party vendors)
70 9.4.1: Tag: ECO
71 9.4.2: Tag: NIC
72 9.5: Time and date related information
73 9.5.1: Tag: Time
74 9.5.2: Tag: UTCTime
75 9.5.3: Tag: UTCDate
76 9.6: Time control
77 9.6.1: Tag: TimeControl
78 9.7: Alternative starting positions
79 9.7.1: Tag: SetUp
80 9.7.2: Tag: FEN
81 9.8: Game conclusion
82 9.8.1: Tag: Termination
83 9.9: Miscellaneous
84 9.9.1: Tag: Annotator
85 9.9.2: Tag: Mode
86 9.9.3: Tag: PlyCount
87 10: Numeric Annotation Glyphs
88 11: File names and directories
89 11.1: File name suffix for PGN data
90 11.2: File name formation for PGN data for a specific player
91 11.3: File name formation for PGN data for a specific event
92 11.4: File name formation for PGN data for chronologically ordered games
93 11.5: Suggested directory tree organization
94 12: PGN collating sequence
95 13: PGN software
96 13.1: The SAN Kit
97 13.2: pgnRead
98 13.3: mail2pgn/GIICS
99 13.4: XBoard
100 13.5: cupgn
101 13.6: Zarkov
102 13.7: Chess Assistant
103 13.8: BOOKUP
104 13.9: HIARCS
105 13.10: Deja Vu
106 13.11: MV2PGN
107 13.12: The Hansen utilities (cb2pgn, nic2pgn, pgn2cb, pgn2nic)
108 13.13: Slappy the Database
109 13.14: CBASCII
110 13.15: ZZZZZZ
111 13.16: icsconv
112 13.17: CHESSOP (CHESSOPN/CHESSOPG)
113 13.18: CAT2PGN
114 13.19: pgn2opg
115 14: PGN data archives
116 15: International Olympic Committee country codes
117 16: Additional chess data standards
118 16.1: FEN
119 16.1.1: History
120 16.1.2: Uses for a position notation
121 16.1.3: Data fields
122 16.1.3.1: Piece placement data
123 16.1.3.2: Active color
124 16.1.3.3: Castling availability
125 16.1.3.4: En passant target square
126 16.1.3.5: Halfmove clock
127 16.1.3.6: Fullmove number
128 16.1.4: Examples
129 16.2: EPD
130 16.2.1: History
131 16.2.2: Uses for an extended position notation
132 16.2.3: Data fields
133 16.2.3.1: Piece placement data
134 16.2.3.2: Active color
135 16.2.3.3: Castling availability
136 16.2.3.4: En passant target square
137 16.2.4: Operations
138 16.2.4.1: General format
139 16.2.4.2: Opcode mnemonics
140 16.2.5: Opcode list
141 16.2.5.1: Opcode "acn": analysis count: nodes
142 16.2.5.2: Opcode "acs": analysis count: seconds
143 16.2.5.3: Opcode "am": avoid move(s)
144 16.2.5.4: Opcode "bm": best move(s)
145 16.2.5.5: Opcode "c0": comment (primary, also "c1" through "c9")
146 16.2.5.6: Opcode "ce": centipawn evaluation
147 16.2.5.7: Opcode "dm": direct mate fullmove count
148 16.2.5.8: Opcode "draw_accept": accept a draw offer
149 16.2.5.9: Opcode "draw_claim": claim a draw
150 16.2.5.10: Opcode "draw_offer": offer a draw
151 16.2.5.11: Opcode "draw_reject": reject a draw offer
152 16.2.5.12: Opcode "eco": _Encyclopedia of Chess Openings_ opening code
153 16.2.5.13: Opcode "fmvn": fullmove number
154 16.2.5.14: Opcode "hmvc": halfmove clock
155 16.2.5.15: Opcode "id": position identification
156 16.2.5.16: Opcode "nic": _New In Chess_ opening code
157 16.2.5.17: Opcode "noop": no operation
158 16.2.5.18: Opcode "pm": predicted move
159 16.2.5.19: Opcode "pv": predicted variation
160 16.2.5.20: Opcode "rc": repetition count
161 16.2.5.21: Opcode "resign": game resignation
162 16.2.5.22: Opcode "sm": supplied move
163 16.2.5.23: Opcode "tcgs": telecommunication: game selector
164 16.2.5.24: Opcode "tcri": telecommunication: receiver identification
165 16.2.5.25: Opcode "tcsi": telecommunication: sender identification
166 16.2.5.26: Opcode "v0": variation name (primary, also "v1" through "v9")
167 17: Alternative chesspiece identifier letters
168 18: Formal syntax
169 19: Canonical chess position hash coding
170 20: Binary representation (PGC)
171 20.1: Bytes, words, and doublewords
172 20.2: Move ordinals
173 20.3: String data
174 20.4: Marker codes
175 20.4.1: Marker 0x01: reduced export format single game
176 20.4.2: Marker 0x02: tag pair
177 20.4.3: Marker 0x03: short move sequence
178 20.4.4: Marker 0x04: long move sequence
179 20.4.5: Marker 0x05: general game data begin
180 20.4.6: Marker 0x06: general game data end
181 20.4.7: Marker 0x07: simple-nag
182 20.4.8: Marker 0x08: rav-begin
183 20.4.9: Marker 0x09: rav-end
184 20.4.10: Marker 0x0a: escape-string
185 21: E-mail correspondence usage
186
187 ======================================================================
188 Standard: Portable Game Notation Specification and Implementation Guide
189
190 Revised: 1994.03.12
191
192 Authors: Interested readers of the Internet newsgroup rec.games.chess
193
194 Coordinator: Steven J. Edwards (send comments to sje@world.std.com)
195
196 0: Preface
197
198 From the Tower of Babel story:
199
200 "If now, while they are one people, all speaking the same language, they have
201 started to do this, nothing will later stop them from doing whatever they
202 propose to do."
203
204 Genesis XI, v.6, _New American Bible_
205
206 1: Introduction
207
208 PGN is "Portable Game Notation", a standard designed for the representation of
209 chess game data using ASCII text files. PGN is structured for easy reading and
210 writing by human users and for easy parsing and generation by computer
211 programs. The intent of the definition and propagation of PGN is to facilitate
212 the sharing of public domain chess game data among chessplayers (both organic
213 and otherwise), publishers, and computer chess researchers throughout the
214 world.
215
216 PGN is not intended to be a general purpose standard that is suitable for every
217 possible use; no such standard could fill all conceivable requirements.
218 Instead, PGN is proposed as a universal portable representation for data
219 interchange. The idea is to allow the construction of a family of chess
220 applications that can quickly and easily process chess game data using PGN for
221 import and export among themselves.
222
223 2: Chess data representation
224
225 Computer usage among chessplayers has become quite common in recent years and a
226 variety of different programs, both commercial and public domain, are used to
227 generate, access, and propagate chess game data. Some of these programs are
228 rather impressive; most are now well behaved in that they correctly follow the
229 Laws of Chess and handle users' data with reasonable care. Unfortunately, many
230 programs have had serious problems with several aspects of the external
231 representation of chess game data. Sometimes these problems become more
232 visible when a user attempts to move significant quantities of data from one
233 program to another; if there has been no real effort to ensure portability of
234 data, then the chances for a successful transfer are small at best.
235
236 2.1: Data interchange incompatibility
237
238 The reasons for format incompatibility are easy to understand. In fact, most
239 of them are correlated with the same problems that have already been seen with
240 commercial software offerings for other domains such as word processing,
241 spreadsheets, fonts, and graphics. Sometimes a manufacturer deliberately
242 designs a data format using encryption or some other secret, proprietary
243 technique to "lock in" a customer. Sometimes a designer may produce a format
244 that can be deciphered without too much difficulty, but at the same time
245 publicly discourage third party software by claiming trade secret protection.
246 Another software producer may develop a non-proprietary system, but it may work
247 well only within the scope of a single program or application because it is not
248 easily expandable. Finally, some other software may work very well for many
249 purposes, but it uses symbols and language not easily understood by people or
250 computers available to those outside the country of its development.
251
252 2.2: Specification goals
253
254 A specification for a portable game notation must observe the lessons of
255 history and be able to handle probable needs of the future. The design
256 criteria for PGN were selected to meet these needs. These criteria include:
257
258 1) The details of the system must be publicly available and free of unnecessary
259 complexity. Ideally, if the documentation is not available for some reason,
260 typical chess software developers and users should be able to understand most
261 of the data without the need for third party assistance.
262
263 2) The details of the system must be non-proprietary so that users and software
264 developers are unrestricted by concerns about infringing on intellectual
265 property rights. The idea is to let chess programmers compete in a free market
266 where customers may choose software based on their real needs and not based on
267 artificial requirements created by a secret data format.
268
269 3) The system must work for a variety of programs. The format should be such
270 that it can be used by chess database programs, chess publishing programs,
271 chess server programs, and chessplaying programs without being unnecessarily
272 specific to any particular application class.
273
274 4) The system must be easily expandable and scalable. The expansion ability
275 must include handling data items that may not exist currently but could be
276 expected to emerge in the future. (Examples: new opening classifications and
277 new country names.) The system should be scalable in that it must not have any
278 arbitrary restrictions concerning the quantity of stored data. Also, planned
279 modes of expansion should either preserve earlier databases or at least allow
280 for their automatic conversion.
281
282 5) The system must be international. Chess software users are found in many
283 countries and the system should be free of difficulties caused by conventions
284 local to a given region.
285
286 6) Finally, the system should handle the same kinds and amounts of data that
287 are already handled by existing chess software and by print media.
288
289 2.3: A sample PGN game
290
291 Although its description may seem rather lengthy, PGN is actually fairly
292 simple. A sample PGN game follows; it has most of the important features
293 described in later sections of this document.
294
295 [Event "F/S Return Match"]
296 [Site "Belgrade, Serbia JUG"]
297 [Date "1992.11.04"]
298 [Round "29"]
299 [White "Fischer, Robert J."]
300 [Black "Spassky, Boris V."]
301 [Result "1/2-1/2"]
302
303 1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O Be7 6. Re1 b5 7. Bb3 d6 8. c3
304 O-O 9. h3 Nb8 10. d4 Nbd7 11. c4 c6 12. cxb5 axb5 13. Nc3 Bb7 14. Bg5 b4 15.
305 Nb1 h6 16. Bh4 c5 17. dxe5 Nxe4 18. Bxe7 Qxe7 19. exd6 Qf6 20. Nbd2 Nxd6 21.
306 Nc4 Nxc4 22. Bxc4 Nb6 23. Ne5 Rae8 24. Bxf7+ Rxf7 25. Nxf7 Rxe1+ 26. Qxe1 Kxf7
307 27. Qe3 Qg5 28. Qxg5 hxg5 29. b3 Ke6 30. a3 Kd6 31. axb4 cxb4 32. Ra5 Nd5 33.
308 f3 Bc8 34. Kf2 Bf5 35. Ra7 g6 36. Ra6+ Kc5 37. Ke1 Nf4 38. g3 Nxh3 39. Kd2 Kb5
309 40. Rd6 Kc5 41. Ra6 Nf2 42. g4 Bd3 43. Re6 1/2-1/2
310
311 3: Formats: import and export
312
313 There are two formats in the PGN specification. These are the "import" format
314 and the "export" format. These are the two different ways of formatting the
315 same PGN data according to its source. The details of the two formats are
316 described throughout the following sections of this document.
317
318 Other than formats, there is the additional topic of PGN presentation. While
319 both PGN import and export formats are designed to be readable by humans, there
320 is no recommendation that either of these be an ultimate mode of chess data
321 presentation. Rather, software developers are urged to consider all of the
322 various techniques at their disposal to enhance the display of chess data at
323 the presentation level (i.e., highest level) of their programs. This means
324 that the use of different fonts, character sizes, color, and other tools of
325 computer aided interaction and publishing should be explored to provide a high
326 quality presentation appropriate to the function of the particular program.
327
328 3.1: Import format allows for manually prepared data
329
330 The import format is rather flexible and is used to describe data that may have
331 been prepared by hand, much like a source file for a high level programming
332 language. A program that can read PGN data should be able to handle the
333 somewhat lax import format.
334
335 3.2: Export format used for program generated output
336
337 The export format is rather strict and is used to describe data that is usually
338 prepared under program control, something like a pretty printed source program
339 reformatted by a compiler.
340
341 3.2.1: Byte equivalence
342
343 For a given PGN data file, export format representations generated by different
344 PGN programs on the same computing system should be exactly equivalent, byte
345 for byte.
346
347 3.2.2: Archival storage and the newline character
348
349 Export format should also be used for archival storage. Here, "archival"
350 storage is defined as storage that may be accessed by a variety of computing
351 systems. The only extra requirement for archival storage is that the newline
352 character have a specific representation that is independent of its value for a
353 particular computing system's text file usage. The archival representation of
354 a newline is the ASCII control character LF (line feed, decimal value 10,
355 hexadecimal value 0x0a).
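
As a minimal sketch of the archival newline rule above, the following Python
function rewrites CR LF pairs and bare CR characters as a single LF. The name
to_archival_newlines is an assumption made for illustration and is not part of
this specification; the sketch covers only multicharacter newline sequences,
not record-oriented file systems.

```python
def to_archival_newlines(data: bytes) -> bytes:
    """Return data with CR LF and bare CR line endings rewritten as LF (0x0a)."""
    # Replace the two-byte sequence first so that a CR LF pair does not
    # turn into two line feeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Working on bytes rather than decoded text keeps the result independent of the
host system's own text file conventions.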
356
357 Sadly, there are some accidents of history that survive to this day that have
358 baroque representations for a newline: multicharacter sequences, end-of-line
359 record markers, start-of-line byte counts, fixed length records, and so forth.
360 It is well beyond the scope of the PGN project to reconcile all of these to the
361 unified world of ANSI C and those enjoying the bliss of a single '\n'
362 convention. Some systems may just not be able to handle an archival PGN text
363 file with native text editors. In these cases, an indulgence of sorts is
364 granted to use the local newline convention in non-archival PGN files for those
365 text editors.
366
367 3.2.3: Speed of processing
368
369 Several parts of the export format deal with exact descriptions of line and
370 field justification that are absent from the import format details. The main
371 reason for these restrictions on the export format is to allow the
372 construction of simple data translation programs that can easily scan PGN data
373 without having to have a full chess engine or other complex parsing routines.
374 The idea is to encourage chess software authors to always allow for at least a
375 limited PGN reading capability. Even when a full chess engine parsing
376 capability is available, it is likely to be at least two orders of magnitude
377 slower than a simple text scanner.
378
379 3.2.4: Reduced export format
380
381 A PGN game represented using export format is said to be in "reduced export
382 format" if all of the following hold: 1) it has no commentary, 2) it has only
383 the standard seven tag roster identification information ("STR", see below), 3)
384 it has no recursive annotation variations ("RAV", see below), and 4) it has no
385 numeric annotation glyphs ("NAG", see below). Reduced export format is used
386 for bulk storage of unannotated games. It represents a minimum level of
387 standard conformance for a PGN exporting application.
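
The four conditions above can be approximated with a simple text test. The
sketch below is in Python and assumes the game is already in export format and
has been split into an ordered list of tag names and a movetext string; the
names STR_NAMES and is_reduced are hypothetical. It relies on the observation
that export format movetext contains no string tokens, so the characters that
introduce comments, variations, and glyphs cannot appear for other reasons.

```python
# The Seven Tag Roster names, in their required order.
STR_NAMES = ["Event", "Site", "Date", "Round", "White", "Black", "Result"]


def is_reduced(tag_names, movetext):
    """Check the four reduced export format conditions on an export format game."""
    # Condition 2: only the seven tag roster, in roster order.
    if list(tag_names) != STR_NAMES:
        return False
    # Conditions 1, 3, and 4: no commentary ("{" or ";"), no RAV ("("),
    # and no NAG ("$") anywhere in the movetext.
    return not any(ch in movetext for ch in "{;($")
```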
388
389 4: Lexicographical issues
390
391 PGN data is composed of characters; non-overlapping contiguous sequences of
392 characters form lexical tokens.
393
394 4.1: Character codes
395
396 PGN data is represented using a subset of the eight bit ISO 8859/1 (Latin 1)
397 character set. ("ISO" is the short name of the International Organization for
398 Standardization.) This set is also known as ECMA-94 and is similar to other ISO
399 Latin character sets. ISO 8859/1 includes the standard seven bit ASCII
400 character set for the 32 control character code values from zero to 31. The 95
401 printing character code values from 32 to 126 are also equivalent to seven bit
402 ASCII usage. (Code value 127, the ASCII DEL control character, is a graphic
403 character in ISO 8859/1; it is not used for PGN data representation.)
404
405 The 32 ISO 8859/1 code values from 128 to 159 are non-printing control
406 characters. They are not used for PGN data representation. The 32 code values
407 from 160 to 191 are mostly non-alphabetic printing characters and their use for
408 PGN data is discouraged as their graphic representation varies considerably
409 among other ISO Latin sets. Finally, the 64 code values from 192 to 255 are
410 mostly alphabetic printing characters with various diacritical marks; their use
411 is encouraged for those languages that require such characters. The graphic
412 representations of this last set of 64 characters are fairly constant for the
413 ISO Latin family.
414
415 Printing character codes outside of the seven bit ASCII range may only appear
416 in string data and in commentary. They are not permitted for use in symbol
417 construction.
418
419 Because some PGN users' environments may not support presentation of non-ASCII
420 characters, PGN game authors should refrain from using such characters in
421 critical commentary or string values in game data that may be referenced in
422 such environments. PGN software authors should have their programs handle such
423 environments by displaying a question mark ("?") for non-ASCII character codes.
424 This is an important point because there are many computing systems that can
425 display eight bit character data, but the display graphics may differ among
426 machines and operating systems from different manufacturers.
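
The question mark substitution suggested above might look like the following
Python sketch. The function name ascii_fallback is an assumption for
illustration; a real program would apply it only at the presentation level and
keep the original data unchanged.

```python
def ascii_fallback(text: str) -> str:
    """Replace every character outside seven bit ASCII with a question mark."""
    return "".join(ch if ord(ch) < 128 else "?" for ch in text)
```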
427
428 Only four of the ASCII control characters are permitted in PGN import format;
429 these are the horizontal and vertical tabs along with the linefeed and carriage
430 return codes.
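
The code value ranges described in this section can be collected into one
lookup. The Python sketch below is illustrative only: the function name
classify_code and the category labels are assumptions, chosen to mirror the
wording of the preceding paragraphs.

```python
def classify_code(value: int) -> str:
    """Classify an ISO 8859/1 code value according to its PGN usage."""
    if not 0 <= value <= 255:
        raise ValueError("not an eight bit code value")
    if value in (9, 10, 11, 13):
        # Horizontal tab, linefeed, vertical tab, carriage return.
        return "permitted control"
    if value < 32:
        return "excluded control"
    if value <= 126:
        return "ascii printing"
    if value <= 159:
        # Code value 127 and the control characters from 128 to 159.
        return "not used"
    if value <= 191:
        return "discouraged"
    return "encouraged"
```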
431
432 The external representation of the newline character may differ among
433 platforms; this is an acceptable variation as long as the details of the
434 implementation are hidden from software implementors and users. When a choice
435 is practical, the Unix "newline is linefeed" convention is preferred.
436
437 4.2: Tab characters
438
439 Tab characters, both horizontal and vertical, are not permitted in the export
440 format. This is because the treatment of tab characters is highly dependent
441 upon the particular software in use on the host computing system. Also, tab
442 characters may not appear inside of string data.
443
444 4.3: Line lengths
445
446 PGN data are organized as simple text lines without any special bytes or
447 markers for secondary record structure imposed by specific operating systems.
448 Import format PGN text lines are limited to having a maximum of 255 characters
449 per line including the newline character. Lines with 80 or more printing
450 characters are strongly discouraged because of the difficulties experienced by
451 common text editors with long lines.
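
A line length check along these lines could be sketched in Python as follows.
The function name check_import_lines is hypothetical. As a simplifying
assumption, every character on a line is counted as a printing character and
every line is treated as if it ended with a newline.

```python
def check_import_lines(text: str):
    """Return (too_long, discouraged) lists of one-based line numbers."""
    too_long, discouraged = [], []
    for number, line in enumerate(text.split("\n"), start=1):
        if len(line) + 1 > 255:
            # The limit of 255 characters includes the newline itself.
            too_long.append(number)
        elif len(line) >= 80:
            discouraged.append(number)
    return too_long, discouraged
```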
452
453 In some cases, very long tag values will require 80 or more columns, but these
454 are relatively rare. An example of this is the "FEN" tag pair; it may have a
455 long tag value, but this particular tag pair is only used to represent a game
456 that doesn't start from the usual initial position.
457
458 5: Commentary
459
460 Comment text may appear in PGN data. There are two kinds of comments. The
461 first kind is the "rest of line" comment; this comment type starts with a
462 semicolon character and continues to the end of the line. The second kind
463 starts with a left brace character and continues to the next right brace
464 character. Comments cannot appear inside any token.
465
466 Brace comments do not nest; a left brace character appearing in a brace comment
467 loses its special meaning and is ignored. A semicolon appearing inside of a
468 brace comment loses its special meaning and is ignored. Braces appearing
469 inside of a semicolon comment lose their special meaning and are ignored.
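
The comment rules above can be sketched as a small scanner. The Python
function below, with the hypothetical name strip_comments, removes both kinds
of comment while copying string tokens unchanged, since a semicolon or brace
inside a string is not a comment. It is a sketch under those assumptions and
does not handle the escape mechanism described in the next section.

```python
def strip_comments(text: str) -> str:
    """Remove brace and rest-of-line comments from PGN text."""
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == '"':
            # Copy a string token verbatim, stepping over backslash escapes.
            j = i + 1
            while j < n and text[j] != '"':
                j += 2 if text[j] == "\\" else 1
            out.append(text[i:j + 1])
            i = j + 1
        elif ch == "{":
            # Brace comments do not nest: stop at the first right brace.
            end = text.find("}", i + 1)
            i = n if end == -1 else end + 1
        elif ch == ";":
            # Skip to the end of the line but keep the newline itself.
            end = text.find("\n", i + 1)
            i = n if end == -1 else end
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```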
470
471 *** Export format representation of comments needs definition work.
472
473 6: Escape mechanism
474
475 There is a special escape mechanism for PGN data. This mechanism is triggered
476 by a percent sign character ("%") appearing in the first column of a line; the
477 data on the rest of the line is ignored by publicly available PGN scanning
478 software. This escape convention is intended for the private use of software
479 developers and researchers to embed non-PGN commands and data in PGN streams.
480
481 A percent sign appearing in any place other than the first position in a
482 line does not trigger the escape mechanism.
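
A scanner that ignores escaped lines might begin with a filter like the one
below. This Python sketch uses the hypothetical name drop_escaped_lines and
assumes that lines are separated by the linefeed character; it does not
consider how an escaped line interacts with a brace comment that spans several
lines.

```python
def drop_escaped_lines(text: str) -> str:
    """Remove every line whose first column holds a percent sign."""
    kept = [line for line in text.split("\n") if not line.startswith("%")]
    return "\n".join(kept)
```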
483
484 7: Tokens
485
486 PGN character data is organized as tokens. A token is a contiguous sequence of
487 characters that represents a basic semantic unit. Tokens may be separated from
488 adjacent tokens by white space characters. (White space characters include
489 space, newline, and tab characters.) Some tokens are self delimiting and do
490 not require white space characters.
491
492 A string token is a sequence of zero or more printing characters delimited by a
493 pair of quote characters (ASCII decimal value 34, hexadecimal value 0x22). An
494 empty string is represented by two adjacent quotes. (Note: an apostrophe is
495 not a quote.) A quote inside a string is represented by the backslash
496 immediately followed by a quote. A backslash inside a string is represented by
497 two adjacent backslashes. Strings are commonly used as tag pair values (see
498 below). Non-printing characters like newline and tab are not permitted inside
499 of strings. A string token is terminated by its closing quote. Currently, a
500 string is limited to a maximum of 255 characters of data.
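
The string token rules can be illustrated with a matching reader and writer.
In the Python sketch below, the names read_string and write_string are
assumptions made for the example. The reader returns the decoded value and the
index just past the closing quote.

```python
def read_string(text: str, start: int = 0):
    """Decode the string token at text[start]; return (value, next_index)."""
    if text[start:start + 1] != '"':
        raise ValueError("a string token must begin with a quote")
    chars = []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == '"':
            if len(chars) > 255:
                raise ValueError("string exceeds 255 characters of data")
            return "".join(chars), i + 1
        if ch == "\\" and text[i + 1:i + 2] in ('"', "\\"):
            # Backslash-quote and backslash-backslash escapes.
            chars.append(text[i + 1])
            i += 2
            continue
        if ch in "\n\t":
            raise ValueError("newline and tab are not permitted in strings")
        chars.append(ch)
        i += 1
    raise ValueError("unterminated string token")


def write_string(value: str) -> str:
    """Encode value as a string token, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
```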
501
502 An integer token is a sequence of one or more decimal digit characters. It is
503 a special case of the more general "symbol" token class described below.
504 Integer tokens are used to help represent move number indications (see below).
505 An integer token is terminated just prior to the first non-symbol character
506 following the integer digit sequence.
507
508 A period character (".") is a token by itself. It is used for move number
509 indications (see below). It is self terminating.
510
511 An asterisk character ("*") is a token by itself. It is used as one of the
512 possible game termination markers (see below); it indicates an incomplete game
513 or a game with an unknown or otherwise unavailable result. It is self
514 terminating.
515
516 The left and right bracket characters ("[" and "]") are tokens. They are used
517 to delimit tag pairs (see below). Both are self terminating.
518
519 The left and right parenthesis characters ("(" and ")") are tokens. They are
520 used to delimit Recursive Annotation Variations (see below). Both are self
521 terminating.
522
523 The left and right angle bracket characters ("<" and ">") are tokens. They are
524 reserved for future expansion. Both are self terminating.
525
526 A Numeric Annotation Glyph ("NAG", see below) is a token; it is composed of a
527 dollar sign character ("$") immediately followed by one or more digit
528 characters. It is terminated just prior to the first non-digit character
529 following the digit sequence.
530
531 A symbol token starts with a letter or digit character and is immediately
532 followed by a sequence of zero or more symbol continuation characters. These
533 continuation characters are letter characters ("A-Za-z"), digit characters
534 ("0-9"), the underscore ("_"), the plus sign ("+"), the octothorpe sign ("#"),
535 the equal sign ("="), the colon (":"), and the hyphen ("-"). Symbols are used
536 for a variety of purposes. All characters in a symbol are significant. A
537 symbol token is terminated just prior to the first non-symbol character
538 following the symbol character sequence. Currently, a symbol is limited to a
539 maximum of 255 characters in length.
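
The token classes of this section can be recognized with a single regular
expression. The Python sketch below is one possible arrangement, not a
reference implementation; the names tokenize and the token kind labels are
assumptions. It expects comments and escaped lines to have been removed
already, and it does not enforce the 255 character limits. Note that the slash
in "1/2-1/2" is not among the symbol continuation characters listed above, so a
fuller scanner would need to treat that termination marker as a special case.

```python
import re

_PATTERNS = [
    ("string", r'"(?:\\.|[^"\\])*"'),
    ("nag", r"\$[0-9]+"),
    ("symbol", r"[A-Za-z0-9][A-Za-z0-9_+#=:-]*"),
    ("punct", r"[.*\[\]()<>]"),
    ("space", r"\s+"),
]
_TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in _PATTERNS))


def tokenize(text: str):
    """Return a list of (kind, text) pairs for the tokens in text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = _TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        kind = match.lastgroup
        if kind == "symbol" and match.group().isdigit():
            # An integer is a special case of the symbol token class.
            kind = "integer"
        if kind != "space":
            tokens.append((kind, match.group()))
        pos = match.end()
    return tokens
```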
540
541 8: Parsing games
542
543 A PGN database file is a sequential collection of zero or more PGN games. An
544 empty file is a valid, although somewhat uninformative, PGN database.
545
546 A PGN game is composed of two sections. The first is the tag pair section and
547 the second is the movetext section. The tag pair section provides information
548 that identifies the game by defining the values associated with a set of
549 standard parameters. The movetext section gives the usually enumerated and
550 possibly annotated moves of the game along with the concluding game termination
551 marker. The chess moves themselves are represented using SAN (Standard
552 Algebraic Notation), also described later in this document.
553
554 8.1: Tag pair section
555
556 The tag pair section is composed of a series of zero or more tag pairs.
557
558 A tag pair is composed of four consecutive tokens: a left bracket token, a
559 symbol token, a string token, and a right bracket token. The symbol token is
560 the tag name and the string token is the tag value associated with the tag
561 name. (There is a standard set of tag names and semantics described below.)
562 The same tag name should not appear more than once in a tag pair section.
563
564 A further restriction on tag names is that they are composed exclusively of
565 letters, digits, and the underscore character. This is done to facilitate
566 mapping of tag names into key and attribute names for use with general purpose
567 database programs.
568
569 For PGN import format, there may be zero or more white space characters between
570 any adjacent pair of tokens in a tag pair.
571
572 For PGN export format, there are no white space characters between the left
573 bracket and the tag name, there are no white space characters between the tag
574 value and the right bracket, and there is a single space character between the
575 tag name and the tag value.
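
The export format spacing rule, together with the tag name restriction given
earlier in this section, could be sketched in Python as follows. The function
name export_tag_pair is hypothetical. The escaping of quotes and backslashes
follows the string token rules of section 7.

```python
import re

# A tag name is a symbol made only of letters, digits, and the underscore.
_TAG_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_]*")


def export_tag_pair(name: str, value: str) -> str:
    """Format one tag pair with export format spacing."""
    if not _TAG_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid tag name: {name!r}")
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    # No space after "[" or before "]"; one space between name and value.
    return f'[{name} "{escaped}"]'
```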
576
577 Tag names, like all symbols, are case sensitive. All tag names used for
578 archival storage begin with an upper case letter.
579
580 PGN import format may have multiple tag pairs on the same line and may even
581 have a tag pair spanning more than a single line. Export format requires each
582 tag pair to appear left justified on a line by itself; a single empty line
583 follows the last tag pair.
584
585 Some tag values may be composed of a sequence of items. For example, a
586 consultation game may have more than one player for a given side. When this
587 occurs, the single character ":" (colon) appears between adjacent items.
588 Because of this use as an internal separator in strings, the colon should not
589 otherwise appear in a string.
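
Handling of multiple items in a tag value might be sketched as below. The
Python function names split_items and join_items are assumptions for the
example; the writer rejects items that contain a colon, in line with the
restriction just described.

```python
def split_items(tag_value: str):
    """Split a tag value into its colon separated items."""
    return tag_value.split(":")


def join_items(items):
    """Join items into one tag value, rejecting items that contain a colon."""
    for item in items:
        if ":" in item:
            raise ValueError(f"item may not contain a colon: {item!r}")
    return ":".join(items)
```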
590
591 The tag pair format is designed for expansion; initially only strings are
592 allowed as tag pair values. Tag value formats associated with the STR (Seven
593 Tag Roster, see below) will not change; they will always be string values.
594 However, there are long term plans to allow general list structures as tag
595 values for non-STR tag pairs. Use of these expanded tag values will likely be
596 restricted to special research programs. In all events, the top level
597 structure of a tag pair remains the same: left bracket, tag name, tag value,
598 and right bracket.
599
600 8.1.1: Seven Tag Roster
601
602 There is a set of tags defined for mandatory use for archival storage of PGN
603 data. This is the STR (Seven Tag Roster). The interpretation of these tags is
604 fixed as is the order in which they appear. Although the definition and use of
605 additional tag names and semantics is permitted and encouraged when needed, the
606 STR is the common ground that all programs should follow for public data
607 interchange.
608
609 For import format, the order of tag pairs is not important. For export format,
610 the STR tag pairs appear before any other tag pairs. (The STR tag pairs must
611 also appear in order; this order is described below). Also for export format,
612 any additional tag pairs appear in ASCII order by tag name.
613
614 The seven tag names of the STR are (in order):
615
616 1) Event (the name of the tournament or match event)
617
618 2) Site (the location of the event)
619
620 3) Date (the starting date of the game)
621
622 4) Round (the playing round ordinal of the game)
623
624 5) White (the player of the white pieces)
625
626 6) Black (the player of the black pieces)
627
628 7) Result (the result of the game)
629
630 A set of supplemental tag names is given later in this document.
631
632 For PGN export format, a single blank line appears after the last of the tag
633 pairs to conclude the tag pair section. This helps simple scanning programs to
634 quickly determine the end of the tag pair section and the beginning of the
635 movetext section.
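
The export format ordering of tag pairs described in this section can be
sketched in Python as follows. The names STR_NAMES and export_order are
hypothetical, and the tag pairs are assumed to be held in a dictionary keyed by
tag name. Python compares strings by code point, which matches ASCII order for
tag names made of letters, digits, and the underscore.

```python
# The Seven Tag Roster names, in their required order.
STR_NAMES = ("Event", "Site", "Date", "Round", "White", "Black", "Result")


def export_order(tags: dict):
    """Return (name, value) pairs: the STR first, then the rest in ASCII order."""
    missing = [name for name in STR_NAMES if name not in tags]
    if missing:
        raise ValueError(f"missing Seven Tag Roster tags: {missing}")
    ordered = [(name, tags[name]) for name in STR_NAMES]
    extras = sorted(name for name in tags if name not in STR_NAMES)
    ordered.extend((name, tags[name]) for name in extras)
    return ordered
```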
636
637 8.1.1.1: The Event tag
638
639 The Event tag value should be reasonably descriptive. Abbreviations are to be
640 avoided unless absolutely necessary. Consistent event naming should be used
641 to help facilitate database scanning. If the name of the event is unknown, a
642 single question mark should appear as the tag value.
643
644 Examples:
645
646 [Event "FIDE World Championship"]
647
648 [Event "Moscow City Championship"]
649
650 [Event "ACM North American Computer Championship"]
651
652 [Event "Casual Game"]
653
654 8.1.1.2: The Site tag
655
656 The Site tag value should include city and region names along with a standard
657 name for the country. The use of the IOC (International Olympic Committee)
658 three letter names is suggested for those countries where such codes are
659 available. If the site of the event is unknown, a single question mark should
660 appear as the tag value. A comma may be used to separate a city from a region.
661 No comma is needed to separate a city or region from the IOC country code. A
662 later section of this document gives a list of three letter nation codes along
663 with a few additions for "locations" not covered by the IOC.
664
665 Examples:
666
667 [Site "New York City, NY USA"]
668
669 [Site "St. Petersburg RUS"]
670
671 [Site "Riga LAT"]
672
673 8.1.1.3: The Date tag
674
675 The Date tag value gives the starting date for the game. (Note: this is not
676 necessarily the same as the starting date for the event.) The date is given
677 with respect to the local time of the site given in the Site tag. The Date
678 tag value field always uses a standard ten character format: "YYYY.MM.DD". The
679 first four characters are digits that give the year, the next character is a
680 period, the next two characters are digits that give the month, the next
681 character is a period, and the final two characters are digits that give the
682 day of the month. If any of the digit fields are not known, then question
683 marks are used in place of the digits.
684
685 Examples:
686
687 [Date "1992.08.31"]
688
689 [Date "1993.??.??"]
690
691 [Date "2001.01.01"]
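
A reader might check the ten character format along these lines. This is a
minimal sketch: the name parse_date_tag is hypothetical, and treating each
field as either all digits or all question marks is an assumption of the
sketch rather than a stated rule.

```python
import re

# Year, month, and day fields separated by periods. Each field is either all
# digits or all question marks (an assumption of this sketch).
DATE_RE = re.compile(r"([0-9]{4}|\?{4})\.([0-9]{2}|\?{2})\.([0-9]{2}|\?{2})")


def parse_date_tag(value):
    """Return (year, month, day), with None for each unknown field."""
    match = DATE_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a YYYY.MM.DD date value: {value!r}")
    return tuple(None if "?" in field else int(field)
                 for field in match.groups())
```

For example, parse_date_tag("1993.??.??") gives (1993, None, None). The sketch
checks only the format; it does not check that the month and day form a real
calendar date.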
692
693 8.1.1.4: The Round tag
694
695 The Round tag value gives the playing round for the game. In a match
696 competition, this value is the number of the game played. If the use of a
697 round number is inappropriate, then the field should be a single hyphen
698 character. If the round is unknown, a single question mark should appear as
699 the tag value.
700
701 Some organizers employ unusual round designations and have multipart playing
702 rounds and sometimes even have conditional rounds. In these cases, a multipart
703 round identifier can be made from a sequence of integer round numbers separated
704 by periods. The leftmost integer represents the most significant round and
705 succeeding integers represent round numbers in descending hierarchical order.
706
707 Examples:
708
709 [Round "1"]
710
711 [Round "3.1"]
712
713 [Round "4.1.2"]
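
Because the leftmost integer is the most significant, a multipart round value
maps naturally onto a tuple of integers. The following sketch assumes a
hypothetical helper named parse_round_tag and, as a simplification, returns
None for both the "?" and the "-" values.

```python
def parse_round_tag(value):
    """Return the round as a tuple of integers, or None for "?" and "-"."""
    if value in ("?", "-"):
        return None
    parts = value.split(".")
    # Every part must be a plain ASCII integer.
    if not all(part.isascii() and part.isdigit() for part in parts):
        raise ValueError(f"not a round value: {value!r}")
    return tuple(int(part) for part in parts)
```

Tuples compare element by element, so sorting the parsed values orders rounds
hierarchically: (3, 1) sorts before (4, 1, 2).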
714
715 8.1.1.5: The White tag
716
717 The White tag value is the name of the player or players of the white pieces.
718 The names are given as they would appear in a telephone directory. The family
719 or last name appears first. If a first name or first initial is available, it
720 is separated from the family name by a comma and a space. Finally, one or more
721 middle initials may appear. (Wherever a comma appears, the very next character
722 should be a space. Wherever an initial appears, the very next character should
723 be a period.) If the name is unknown, a single question mark should appear as
724 the tag value.
725
726 The intent is to allow meaningful ASCII sorting of the tag value that is
727 independent of regional name formation customs. If more than one person is
728 playing the white pieces, the names are listed in alphabetical order and are
729 separated by the colon character between adjacent entries. A player who is
730 also a computer program should have appropriate version information listed
731 after the name of the program.
732
733 The format used in the FIDE Rating Lists is appropriate for use for player name
734 tags.
735
736 Examples:
737
738 [White "Tal, Mikhail N."]
739
740 [White "van der Wiel, Johan"]
741
742 [White "Acme Pawngrabber v.3.2"]
743
744 [White "Fine, R."]
745
746 8.1.1.6: The Black tag
747
748 The Black tag value is the name of the player or players of the black pieces.
749 The names are given here as they are for the White tag value.
750
751 Examples:
752
753 [Black "Lasker, Emmanuel"]
754
755 [Black "Smyslov, Vasily V."]
756
757 [Black "Smith, John Q.:Woodpusher 2000"]
758
759 [Black "Morphy"]
760
761 8.1.1.7: The Result tag
762
763 The Result field value is the result of the game. It is always exactly the
764 same as the game termination marker that concludes the associated movetext. It
765 is always one of four possible values: "1-0" (White wins), "0-1" (Black wins),
766 "1/2-1/2" (drawn game), and "*" (game still in progress, game abandoned, or
767 result otherwise unknown). Note that the digit zero is used in both of the
768 first two cases; not the letter "O".
769
770 All possible examples:
771
772 [Result "0-1"]
773
774 [Result "1-0"]
775
776 [Result "1/2-1/2"]
777
778 [Result "*"]
779
780 8.2: Movetext section
781
782 The movetext section is composed of chess moves, move number indications,
783 optional annotations, and a single concluding game termination marker.
784
785 Because illegal moves are not real chess moves, they are not permitted in PGN
786 movetext. They may appear in commentary, however. One would hope that illegal
787 moves are relatively rare in games worthy of recording.
788
789 8.2.1: Movetext line justification
790
791 In PGN import format, tokens in the movetext do not require any specific line
792 justification.
793
794 In PGN export format, tokens in the movetext are placed left justified on
795 successive text lines each of which has fewer than 80 printing characters. As
796 many tokens as possible are placed on a line with the remainder appearing on
797 successive lines. A single space character appears between any two adjacent
798 symbol tokens on the same line in the movetext. As with the tag pair section,
799 a single empty line follows the last line of data to conclude the movetext
800 section.
801
802 Neither the first nor the last character on an export format PGN line is a
803 space. (This may change in the case of commentary; this area is currently
804 under development.)
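
The export justification rule amounts to a greedy line fill. The sketch below
assumes the movetext has already been split into tokens; the name
wrap_movetext is hypothetical, and the sketch conservatively counts the
separating spaces toward the limit of 79 characters per line.

```python
def wrap_movetext(tokens, limit=79):
    """Greedily pack tokens into lines of at most `limit` characters."""
    lines = []
    current = ""
    for token in tokens:
        if not current:
            # First token on a line: no leading space.
            current = token
        elif len(current) + 1 + len(token) <= limit:
            # The token fits after a single separating space.
            current += " " + token
        else:
            # The line is full; start a new one with this token.
            lines.append(current)
            current = token
    if current:
        lines.append(current)
    return lines
```

Because a space is only ever added between two tokens, no line starts or ends
with a space.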
805
806 8.2.2: Movetext move number indications
807
808 A move number indication is composed of one or more adjacent digits (an integer
809 token) followed by zero or more periods. The integer portion of the indication
810 gives the move number of the immediately following white move (if present) and
811 also the immediately following black move (if present).
812
813 8.2.2.1: Import format move number indications
814
815 PGN import format does not require move number indications. It does not
816 prohibit superfluous move number indications anywhere in the movetext as long
817 as the move numbers are correct.
818
819 PGN import format move number indications may have zero or more period
820 characters following the digit sequence that gives the move number; one or more
821 white space characters may appear between the digit sequence and the period(s).
822
823 8.2.2.2: Export format move number indications
824
825 There are two export format move number indication formats, one for use
826 appearing immediately before a white move element and one for use appearing
827 immediately before a black move element. A white move number indication is
828 formed from the integer giving the fullmove number with a single period
829 character appended. A black move number indication is formed from the integer
830 giving the fullmove number with three period characters appended.
831
832 All white move elements have a preceding move number indication. A black move
833 element has a preceding move number indication only in two cases: first, if
834 there is intervening annotation or commentary between the black move and the
835 previous white move; and second, if there is no previous white move in the
836 special case where a game starts from a position where Black is the active
837 player.
838
839 There are no other cases where move number indications appear in PGN export
840 format.
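
The export numbering rules can be illustrated with a short Python sketch. The
function name and parameters are hypothetical, and the sketch covers only
plain move sequences: it does not handle the case of annotation or commentary
between a white move and the following black move.

```python
def export_move_tokens(san_moves, first_fullmove=1, black_starts=False):
    """Interleave export format move number indications with SAN moves."""
    tokens = []
    number = first_fullmove
    white_to_move = not black_starts
    for index, san in enumerate(san_moves):
        if white_to_move:
            # Every white move gets "N." before it.
            tokens.append(f"{number}.")
        elif index == 0:
            # Black moves first from a set-up position: "N..." before it.
            tokens.append(f"{number}...")
        tokens.append(san)
        if not white_to_move:
            # The fullmove number advances after each black move.
            number += 1
        white_to_move = not white_to_move
    return tokens
```

For example, export_move_tokens(["e4", "e5", "Nf3"]) gives
["1.", "e4", "e5", "2.", "Nf3"].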
841
842 8.2.3: Movetext SAN (Standard Algebraic Notation)
843
844 SAN (Standard Algebraic Notation) is a representation standard for chess moves
845 using the ASCII Latin alphabet.
846
847 Examples of SAN recorded games are found throughout most modern chess
848 publications. SAN as presented in this document uses English language single
849 character abbreviations for chess pieces, although this is easily changed in
850 the source. English is chosen over other languages because it appears to be
851 the most widely recognized.
852
853 An alternative to SAN is FAN (Figurine Algebraic Notation). FAN uses miniature
854 piece icons instead of single letter piece abbreviations. The two notations
855 are otherwise identical.
856
857 8.2.3.1: Square identification
858
859 SAN identifies each of the sixty four squares on the chessboard with a unique
860 two character name. The first character of a square identifier is the file of
861 the square; a file is a column of eight squares designated by a single lower
862 case letter from "a" (leftmost or queenside) up to and including "h" (rightmost
863 or kingside). The second character of a square identifier is the rank of the
864 square; a rank is a row of eight squares designated by a single digit from "1"
865 (bottom side [White's first rank]) up to and including "8" (top side [Black's
866 first rank]). The initial squares of some pieces are: white queen rook at a1,
867 white king at e1, black queen knight pawn at b7, and black king rook at h8.
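
Programs commonly convert square names to numeric coordinates. The sketch
below uses zero-based (file, rank) indices; this indexing and the function
names are choices of the sketch, not part of the notation.

```python
FILES = "abcdefgh"
RANKS = "12345678"


def square_to_indices(name):
    """Map a square name such as "e1" to zero-based (file, rank) indices."""
    if len(name) != 2 or name[0] not in FILES or name[1] not in RANKS:
        raise ValueError(f"not a square name: {name!r}")
    return FILES.index(name[0]), RANKS.index(name[1])


def indices_to_square(file_index, rank_index):
    """Map zero-based (file, rank) indices back to a square name."""
    return FILES[file_index] + RANKS[rank_index]
```

With this indexing, the white queen rook square a1 maps to (0, 0) and the
black king rook square h8 maps to (7, 7).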
868
869 8.2.3.2: Piece identification
870
871 SAN identifies each piece by a single upper case letter. The standard English
872 values: pawn = "P", knight = "N", bishop = "B", rook = "R", queen = "Q", and
873 king = "K".
874
875 The letter code for a pawn is not used for SAN moves in PGN export format
876 movetext. However, some PGN import software disambiguation code may allow for
877 the appearance of pawn letter codes. Also, pawn and other piece letter codes
878 are needed for use in some tag pair and annotation constructs.
879
880 It is admittedly a bit chauvinistic to select English piece letters over those
881 from other languages. There is a slight justification in that English is a de
882 facto universal second language among most chessplayers and program users. It
883 is probably the best that can be done for now. A later section of this
884 document gives alternative piece letters, but these should be used only for
885 local presentation software and not for archival storage or for dynamic
886 interchange among programs.
887
888 8.2.3.3: Basic SAN move construction
889
890 A basic SAN move is given by listing the moving piece letter (omitted for
891 pawns) followed by the destination square. Capture moves are denoted by the
892 lower case letter "x" immediately prior to the destination square; pawn
893 captures include the file letter of the originating square of the capturing
894 pawn immediately prior to the "x" character.
895
896 SAN kingside castling is indicated by the sequence "O-O"; queenside castling is
897 indicated by the sequence "O-O-O". Note that the upper case letter "O" is
898 used, not the digit zero. The use of a zero character is not only incompatible
899 with traditional text practices, but it can also confuse parsing algorithms
900 which also have to understand about move numbers and game termination markers.
901 Also note that the use of the letter "O" is consistent with the practice of
902 having all chess move symbols start with a letter; also, it follows the
903 convention that all non-pawn move symbols start with an upper case letter.
904
905 En passant captures do not have any special notation; they are formed as if the
906 captured pawn were on the capturing pawn's destination square. Pawn promotions
907 are denoted by the equal sign "=" immediately following the destination square
908 with a promoted piece letter (indicating one of knight, bishop, rook, or queen)
909 immediately following the equal sign. As above, the piece letter is in upper
910 case.
911
912 8.2.3.4: Disambiguation
913
914 In the case of ambiguities (multiple pieces of the same type moving to the same
915 square), the first appropriate disambiguating step of the three following steps
916 is taken:
917
918 First, if the moving pieces can be distinguished by their originating files,
919 the originating file letter of the moving piece is inserted immediately after
920 the moving piece letter.
921
922 Second (when the first step fails), if the moving pieces can be distinguished
923 by their originating ranks, the originating rank digit of the moving piece is
924 inserted immediately after the moving piece letter.
925
926 Third (when both the first and the second steps fail), the two character square
927 coordinate of the originating square of the moving piece is inserted
928 immediately after the moving piece letter.
929
930 Note that the above disambiguation is needed only to distinguish among moves of
931 the same piece type to the same square; it is not used to distinguish among
932 attacks of the same piece type to the same square. An example of this would be
933 a position with two white knights, one on square c3 and one on square g1 and a
934 vacant square e2 with White to move. Both knights attack square e2, and if
935 both could legally move there, then a file disambiguation is needed; the
936 (nonchecking) knight moves would be "Nce2" and "Nge2". However, if the white
937 king were at square e1 and a black bishop were at square b4 with a vacant
938 square d2 (thus an absolute pin of the white knight at square c3), then only
939 one white knight (the one at square g1) could move to square e2: "Ne2".
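
The three step rule can be sketched as a small Python function. The sketch
assumes the caller has already generated legal moves, so that other_origins
lists only the squares of other pieces of the same type that can legally move
to the same destination (a pinned knight, as in the example above, is
excluded). The function name is hypothetical.

```python
def disambiguation(origin, other_origins):
    """Return the text inserted immediately after the moving piece letter.

    origin: square of the moving piece, e.g. "c3".
    other_origins: squares of the other same-type pieces that can legally
    move to the same destination square.
    """
    if not other_origins:
        # No ambiguity: nothing is inserted.
        return ""
    if all(square[0] != origin[0] for square in other_origins):
        # First step: the originating file distinguishes the piece.
        return origin[0]
    if all(square[1] != origin[1] for square in other_origins):
        # Second step: the originating rank distinguishes the piece.
        return origin[1]
    # Third step: the full originating square is needed.
    return origin
```

For the two knights example, disambiguation("c3", ["g1"]) gives "c", which
yields "Nce2". With the c3 knight pinned, the g1 knight has no rivals and
disambiguation("g1", []) gives the empty string, which yields "Ne2".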
940
941 8.2.3.5: Check and checkmate indication characters
942
943 If the move is a checking move, the plus sign "+" is appended as a suffix to
944 the basic SAN move notation; if the move is a checkmating move, the octothorpe
945 sign "#" is appended instead.
946
947 Neither the appearance nor the absence of either a check or checkmating
948 indicator is used for disambiguation purposes. This means that if two (or
949 more) pieces of the same type can move to the same square, the difference in
950 checking status of the moves does not alleviate the need for the standard rank
951 and file disambiguation described above. (Note that a difference in checking
952 status for the above may occur only in the case of a discovered check.)
953
954 Neither the checking nor the checkmating indicator is considered annotation as
955 they do not communicate subjective information. Therefore, they are
956 qualitatively different from move suffix annotations like "!" and "?".
957 Subjective move annotations are handled using Numeric Annotation Glyphs as
958 described in a later section of this document.
959
960 There are no special markings used for double checks or discovered checks.
961
962 There are no special markings used for drawing moves.
963
964 8.2.3.6: SAN move length
965
966 SAN moves can be as short as two characters (e.g., "d4"), or as long as seven
967 characters (e.g., "Qa6xb7#", "fxg1=Q+"). The average SAN move length seen in
968 realistic games is probably just fractionally longer than three characters. If
969 the SAN rules seem complicated, be assured that the earlier notation systems of
970 LEN (Long English Notation) and EDN (English Descriptive Notation) are much
971 more complex, and that LAN (Long Algebraic Notation, the predecessor of SAN) is
972 unnecessarily bulky.
973
974 8.2.3.7: Import and export SAN
975
976 PGN export format always uses the above canonical SAN to represent moves in the
977 movetext section of a PGN game. Import format is somewhat more relaxed and it
978 makes allowances for moves that do not conform exactly to the canonical format.
979 However, these allowances may differ among different PGN reader programs. Only
980 data appearing in export format is in all cases guaranteed to be importable
981 into all PGN readers.
982
983 There are a number of suggested guidelines for use with implementing PGN reader
984 software for permitting non-canonical SAN move representation. The idea is to
985 have a PGN reader apply various transformations to attempt to discover the move
986 that is represented by non-canonical input. Some suggested transformations
987 include: letter case remapping, capture indicator insertion, check indicator
988 insertion, and checkmate indicator insertion.
989
990 8.2.3.8: SAN move suffix annotations
991
992 Import format PGN allows for the use of traditional suffix annotations for
993 moves. There are exactly six such annotations available: "!", "?", "!!", "!?",
994 "?!", and "??". At most one such suffix annotation may appear per move, and if
995 present, it is always the last part of the move symbol.
996
997 When exported, a move suffix annotation is translated into the corresponding
998 Numeric Annotation Glyph as described in a later section of this document. For
999 example, if the single move symbol "Qxa8?" appears in an import format PGN
1000 movetext, it would be replaced with the two adjacent symbols "Qxa8 $2".
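
The translation can be sketched in Python. The mapping follows the NAG values
1 through 6 listed in the Numeric Annotation Glyphs section of this document;
the function name is hypothetical.

```python
# Suffix annotations and their NAG values (NAGs 1 through 6).
SUFFIX_TO_NAG = {"!": 1, "?": 2, "!!": 3, "??": 4, "!?": 5, "?!": 6}


def export_move_symbol(symbol):
    """Split an import format move symbol into export format symbols."""
    # Two-character suffixes are tried first so that "!!" is not read as "!".
    for suffix in ("!!", "??", "!?", "?!", "!", "?"):
        if symbol.endswith(suffix):
            move = symbol[:-len(suffix)]
            return [move, f"${SUFFIX_TO_NAG[suffix]}"]
    # No suffix annotation: the symbol is exported unchanged.
    return [symbol]
```

For example, export_move_symbol("Qxa8?") gives ["Qxa8", "$2"].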
1001
1002 8.2.4: Movetext NAG (Numeric Annotation Glyph)
1003
1004 An NAG (Numeric Annotation Glyph) is a movetext element that is used to
1005 indicate a simple annotation in a language independent manner. An NAG is
1006 formed from a dollar sign ("$") with a non-negative decimal integer suffix.
1007 The non-negative integer must be from zero to 255 in value.
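
A reader might validate a NAG token as follows. This is a minimal sketch with
a hypothetical function name.

```python
import re

# A dollar sign followed by one or more ASCII decimal digits.
NAG_RE = re.compile(r"\$([0-9]+)")


def parse_nag(token):
    """Return the integer value of a NAG token such as "$2"."""
    match = NAG_RE.fullmatch(token)
    if match is None:
        raise ValueError(f"not a NAG token: {token!r}")
    value = int(match.group(1))
    if value > 255:
        raise ValueError(f"NAG value out of range: {value}")
    return value
```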
1008
1009 8.2.5: Movetext RAV (Recursive Annotation Variation)
1010
1011 An RAV (Recursive Annotation Variation) is a sequence of movetext containing
1012 one or more moves enclosed in parentheses. An RAV is used to represent an
1013 alternative variation. The alternate move sequence given by an RAV is one that
1014 may be legally played by first unplaying the move that appears immediately
1015 prior to the RAV. Because the RAV is a recursive construct, it may be nested.
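
The recursive structure can be illustrated by grouping tokens into nested
lists. The sketch assumes the movetext has already been split into tokens with
each parenthesis as its own token; the function name and the nested list
representation are choices of the sketch.

```python
def group_variations(tokens):
    """Turn a flat token list into nested lists, one list per RAV."""
    root = []
    stack = [root]
    for token in tokens:
        if token == "(":
            # Open a variation inside the current sequence.
            variation = []
            stack[-1].append(variation)
            stack.append(variation)
        elif token == ")":
            if len(stack) == 1:
                raise ValueError("unmatched right parenthesis")
            stack.pop()
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unmatched left parenthesis")
    return root
```

For example, the tokens of "1. e4 e5 ( 1... c5 ) 2. Nf3" are grouped as
["1.", "e4", "e5", ["1...", "c5"], "2.", "Nf3"].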
1016
1017 *** The specification for import/export representation of RAV elements needs
1018 further development.
1019
1020 8.2.6: Game Termination Markers
1021
1022 Each movetext section has exactly one game termination marker; the marker
1023 always occurs as the last element in the movetext. The game termination marker
1024 is a symbol that is one of the following four values: "1-0" (White wins), "0-1"
1025 (Black wins), "1/2-1/2" (drawn game), and "*" (game in progress, result
1026 unknown, or game abandoned). Note that the digit zero is used in the above;
1027 not the upper case letter "O". The game termination marker appearing in the
1028 movetext of a game must match the value of the game's Result tag pair. (While
1029 the marker appears as a string in the Result tag, it appears as a symbol
1030 without quotes in the movetext.)
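
The consistency requirement between the marker and the Result tag is easy to
check mechanically. The sketch below assumes the movetext is a string whose
final whitespace separated token is the termination marker; the names are
hypothetical.

```python
TERMINATION_MARKERS = ("1-0", "0-1", "1/2-1/2", "*")


def termination_matches_result(result_tag_value, movetext):
    """Check that the movetext ends with a marker equal to the Result value."""
    tokens = movetext.split()
    if not tokens:
        return False
    marker = tokens[-1]
    return marker in TERMINATION_MARKERS and marker == result_tag_value
```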
1031
1032 9: Supplemental tag names
1033
1034 The following tag names and their associated semantics are recommended for use
1035 for information not contained in the Seven Tag Roster.
1036
1037 9.1: Player related information
1038
1039 Note that if there is more than one player field in an instance of a player
1040 (White or Black) tag, then there will be corresponding multiple fields in any
1041 of the following tags. For example, if the White tag has the three field value
1042 "Jones:Smith:Zacharias" (a consultation game), then the WhiteTitle tag could
1043 have a value of "IM:-:GM" if Jones was an International Master, Smith was
1044 untitled, and Zacharias was a Grandmaster.
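
The field correspondence can be sketched by splitting both values on the colon
character. The function name is hypothetical, and treating a differing field
count as an error is an assumption of the sketch.

```python
def pair_player_fields(player_value, related_value):
    """Pair each player name with the corresponding field of a related tag."""
    players = player_value.split(":")
    fields = related_value.split(":")
    if len(players) != len(fields):
        raise ValueError("tag values have different numbers of fields")
    return list(zip(players, fields))
```

For example, pair_player_fields("Jones:Smith:Zacharias", "IM:-:GM") gives
[("Jones", "IM"), ("Smith", "-"), ("Zacharias", "GM")].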
1045
1046 9.1.1: Tags: WhiteTitle, BlackTitle
1047
1048 These use string values such as "FM", "IM", and "GM"; these tags are used only
1049 for the standard abbreviations for FIDE titles. A value of "-" is used for an
1050 untitled player.
1051
1052 9.1.2: Tags: WhiteElo, BlackElo
1053
1054 These tags use integer values; these are used for FIDE Elo ratings. A value of
1055 "-" is used for an unrated player.
1056
1057 9.1.3: Tags: WhiteUSCF, BlackUSCF
1058
1059 These tags use integer values; these are used for USCF (United States Chess
1060 Federation) ratings. Similar tag names can be constructed for other rating
1061 agencies.
1062
1063 9.1.4: Tags: WhiteNA, BlackNA
1064
1065 These tags use string values; these are the e-mail or network addresses of the
1066 players. A value of "-" is used for a player without an electronic address.
1067
1068 9.1.5: Tags: WhiteType, BlackType
1069
1070 These tags use string values; these describe the player types. The value
1071 "human" should be used for a person while the value "program" should be used
1072 for algorithmic (computer) players.
1073
1074 9.2: Event related information
1075
1076 The following tags are used for providing additional information about the
1077 event.
1078
1079 9.2.1: Tag: EventDate
1080
1081 This uses a date value, similar to the Date tag field, that gives the starting
1082 date of the Event.
1083
1084 9.2.2: Tag: EventSponsor
1085
1086 This uses a string value giving the name of the sponsor of the event.
1087
1088 9.2.3: Tag: Section
1089
1090 This uses a string; this is used for the playing section of a tournament (e.g.,
1091 "Open" or "Reserve").
1092
1093 9.2.4: Tag: Stage
1094
1095 This uses a string; this is used for the stage of a multistage event (e.g.,
1096 "Preliminary" or "Semifinal").
1097
1098 9.2.5: Tag: Board
1099
1100 This uses an integer; this identifies the board number in a team event and also
1101 in a simultaneous exhibition.
1102
1103 9.3: Opening information (locale specific)
1104
1105 The following tag pairs are used for traditional opening names. The associated
1106 tag values will vary according to the local language in use.
1107
1108 9.3.1: Tag: Opening
1109
1110 This uses a string; this is used for the traditional opening name. This will
1111 vary by locale. This tag pair is associated with the use of the EPD opcode
1112 "v0" described in a later section of this document.
1113
1114 9.3.2: Tag: Variation
1115
1116 This uses a string; this is used to further refine the Opening tag. This will
1117 vary by locale. This tag pair is associated with the use of the EPD opcode
1118 "v1" described in a later section of this document.
1119
1120 9.3.3: Tag: SubVariation
1121
1122 This uses a string; this is used to further refine the Variation tag. This
1123 will vary by locale. This tag pair is associated with the use of the EPD
1124 opcode "v2" described in a later section of this document.
1125
1126 9.4: Opening information (third party vendors)
1127
1128 The following tag pairs are used for representing opening identification
1129 according to various third party vendors and organizations. References to
1130 these organizations do not imply any endorsement of them or any endorsement
1131 by them.
1132
1133 9.4.1: Tag: ECO
1134
1135 This uses a string of either the form "XDD" or the form "XDD/DD" where the "X"
1136 is a letter from "A" to "E" and the "D" positions are digits; this is used for
1137 an opening designation from the five volume _Encyclopedia of Chess Openings_.
1138 This tag pair is associated with the use of the EPD opcode "eco" described in a
1139 later section of this document.
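
The two ECO forms can be checked with a single regular expression. This is a
minimal sketch with a hypothetical function name.

```python
import re

# A letter from "A" to "E", two digits, and an optional "/DD" part.
ECO_RE = re.compile(r"[A-E][0-9]{2}(/[0-9]{2})?")


def is_valid_eco(value):
    """Return True if the value has the form "XDD" or "XDD/DD"."""
    return ECO_RE.fullmatch(value) is not None
```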
1140
1141 9.4.2: Tag: NIC
1142
1143 This uses a string; this is used for an opening designation from the _New in
1144 Chess_ database. This tag pair is associated with the use of the EPD opcode
1145 "nic" described in a later section of this document.
1146
1147 9.5: Time and date related information
1148
1149 The following tags assist with further refinement of the time and date
1150 information associated with a game.
1151
1152 9.5.1: Tag: Time
1153
1154 This uses a time-of-day value in the form "HH:MM:SS"; similar to the Date tag
1155 except that it denotes the local clock time (hours, minutes, and seconds) of
1156 the start of the game. Note that colons, not periods, are used for field
1157 separators for the Time tag value. The value is taken from the local time
1158 corresponding to the location given in the Site tag pair.
1159
1160 9.5.2: Tag: UTCTime
1161
1162 This tag is similar to the Time tag except that the time is given according to
1163 the Coordinated Universal Time (UTC) standard.
1164
1165 9.5.3: Tag: UTCDate
1166
1167 This tag is similar to the Date tag except that the date is given according to
1168 the Coordinated Universal Time (UTC) standard.
1169
1170 9.6: Time control
1171
1172 The following tag is used to help describe the time control used with the game.
1173
1174 9.6.1: Tag: TimeControl
1175
1176 This uses a list of one or more time control fields. Each field contains a
1177 descriptor for each time control period; if more than one descriptor is present
1178 then they are separated by the colon character (":"). The descriptors appear
1179 in the order in which they are used in the game. The last field appearing is
1180 considered to be implicitly repeated for further control periods as needed.
1181
1182 There are six kinds of TimeControl fields.
1183
1184 The first kind is a single question mark ("?") which means that the time
1185 control mode is unknown. When used, it is usually the only descriptor present.
1186
1187 The second kind is a single hyphen ("-") which means that there was no time
1188 control mode in use. When used, it is usually the only descriptor present.
1189
1190 The third TimeControl field kind is formed as two positive integers separated
1191 by a solidus ("/") character. The first integer is the number of moves in the
1192 period and the second is the number of seconds in the period. Thus, a time
1193 control period of 40 moves in 2 1/2 hours would be represented as "40/9000".
1194
1195 The fourth TimeControl field kind is used for a "sudden death" control period.
1196 It should only be used for the last descriptor in a TimeControl tag value. It
1197 is sometimes the only descriptor present. The format consists of a single
1198 integer that gives the number of seconds in the period. Thus, a five minute
1199 blitz game would be represented with a TimeControl tag value of "300".
1200
1201 The fifth TimeControl field kind is used for an "incremental" control period.
1202 It should only be used for the last descriptor in a TimeControl tag value and
1203 is usually the only descriptor in the value. The format consists of two
1204 positive integers separated by a plus sign ("+") character. The first integer
1205 gives the minimum number of seconds allocated for the period and the second
1206 integer gives the number of extra seconds added after each move is made. So,
1207 an incremental time control of 90 minutes plus one extra minute per move would
1208 be given by "5400+60" in the TimeControl tag value.
1209
1210 The sixth TimeControl field kind is used for a "sandclock" or "hourglass"
1211 control period. It should only be used for the last descriptor in a
1212 TimeControl tag value and is usually the only descriptor in the value. The
1213 format consists of an asterisk ("*") immediately followed by a positive
1214 integer. The integer gives the total number of seconds in the sandclock
1215 period. The time control is implemented as if a sandclock were set at the
1216 start of the period with an equal amount of sand in each of the two chambers
1217 and the players invert the sandclock after each move with a time forfeit
1218 indicated by an empty upper chamber. Electronic implementation of a physical
1219 sandclock may be used. An example sandclock specification for a common three
1220 minute egg timer sandclock would have a tag value of "*180".
1221
1222 Additional TimeControl field kinds will be defined as necessary.
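
The six field kinds can be told apart by their leading character or separator.
The following Python sketch shows one way to do this; the function names and
the tuple representation are assumptions of the sketch, and it checks only
that the numeric parts are ASCII digit strings.

```python
def _integer(text):
    # Accept only non-empty ASCII digit strings.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"expected an integer, got {text!r}")
    return int(text)


def parse_time_control_field(field):
    """Classify one TimeControl field and extract its numbers."""
    if field == "?":
        return ("unknown",)
    if field == "-":
        return ("none",)
    if field.startswith("*"):
        return ("sandclock", _integer(field[1:]))
    if "/" in field:
        moves, seconds = field.split("/", 1)
        return ("moves in period", _integer(moves), _integer(seconds))
    if "+" in field:
        base, increment = field.split("+", 1)
        return ("incremental", _integer(base), _integer(increment))
    return ("sudden death", _integer(field))


def parse_time_control(value):
    """Parse a TimeControl tag value into a list of field descriptions."""
    return [parse_time_control_field(field) for field in value.split(":")]
```

For example, parse_time_control("40/9000:300") gives
[("moves in period", 40, 9000), ("sudden death", 300)].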
1223
1224 9.7: Alternative starting positions
1225
1226 There are two tags defined for assistance with describing games that did not
1227 start from the usual initial array.
1228
1229 9.7.1: Tag: SetUp
1230
1231 This tag takes an integer that denotes the "set-up" status of the game. A
1232 value of "0" indicates that the game has started from the usual initial array.
1233 A value of "1" indicates that the game started from a set-up position; this
1234 position is given in the "FEN" tag pair. This tag must appear for a game
1235 starting with a set-up position. If it appears with a tag value of "1", a FEN
1236 tag pair must also appear.
1237
1238 9.7.2: Tag: FEN
1239
1240 This tag uses a string that gives the Forsyth-Edwards Notation for the starting
1241 position used in the game. FEN is described in a later section of this
1242 document. If a SetUp tag appears with a tag value of "1", the FEN tag pair is
1243 also required.
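
The dependency between the SetUp and FEN tag pairs can be checked as sketched
below. The function name is hypothetical. Restricting SetUp to the values "0"
and "1", and reporting a FEN tag pair that appears without SetUp "1", are
interpretations made by the sketch.

```python
def setup_tag_problems(tags):
    """Return a list of problems with the SetUp and FEN tag pairs."""
    problems = []
    setup = tags.get("SetUp")
    if setup is not None and setup not in ("0", "1"):
        problems.append('SetUp value should be "0" or "1"')
    if setup == "1" and "FEN" not in tags:
        problems.append('SetUp "1" requires a FEN tag pair')
    if "FEN" in tags and setup != "1":
        problems.append('a FEN tag pair requires SetUp "1"')
    return problems
```

An empty list means that no problem was found. The sketch does not validate
the FEN string itself.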
1244
1245 9.8: Game conclusion
1246
1247 There is a single tag that discusses the conclusion of the game.
1248
1249 9.8.1: Tag: Termination
1250
1251 This takes a string that describes the reason for the conclusion of the game.
1252 While the Result tag gives the result of the game, it does not provide any
1253 extra information and so the Termination tag is defined for this purpose.
1254
1255 Strings that may appear as Termination tag values:
1256
1257 * "abandoned": abandoned game.
1258
1259 * "adjudication": result due to third party adjudication process.
1260
1261 * "death": losing player called to greater things, one hopes.
1262
1263 * "emergency": game concluded due to unforeseen circumstances.
1264
1265 * "normal": game terminated in a normal fashion.
1266
1267 * "rules infraction": administrative forfeit due to losing player's failure to
1268 observe either the Laws of Chess or the event regulations.
1269
1270 * "time forfeit": loss due to losing player's failure to meet time control
1271 requirements.
1272
1273 * "unterminated": game not terminated.
1274
1275 9.9: Miscellaneous
1276
1277 These are tags that can be briefly described and that don't fit well in other
1278 sections.
1279
1280 9.9.1: Tag: Annotator
1281
1282 This tag uses a name or names in the format of the player name tags; this
1283 identifies the annotator or annotators of the game.
1284
1285 9.9.2: Tag: Mode
1286
1287 This uses a string that gives the playing mode of the game. Examples: "OTB"
1288 (over the board), "PM" (paper mail), "EM" (electronic mail), "ICS" (Internet
1289 Chess Server), and "TC" (general telecommunication).
1290
1291 9.9.3: Tag: PlyCount
1292
1293 This tag takes a single integer that gives the number of ply (half-moves) in the
1294 game.
1295
1296 10: Numeric Annotation Glyphs
1297
1298 NAG zero is used for a null annotation; it is provided for the convenience of
1299 software designers as a placeholder value and should probably not be used in
1300 external PGN data.
1301
1302 NAGs with values from 1 to 9 annotate the move just played.
1303
1304 NAGs with values from 10 to 135 modify the current position.
1305
1306 NAGs with values from 136 to 139 describe time pressure.
1307
1308 Other NAG values are reserved for future definition.
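
The value ranges above can be expressed as a small classification function.
The category labels and the function name are hypothetical; the upper bound of
255 comes from the NAG definition earlier in this document.

```python
def nag_category(value):
    """Classify a NAG value by the ranges given in this section."""
    if not 0 <= value <= 255:
        raise ValueError(f"NAG value out of range: {value}")
    if value == 0:
        return "null annotation"
    if value <= 9:
        return "move annotation"
    if value <= 135:
        return "position annotation"
    if value <= 139:
        return "time pressure"
    return "reserved"
```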
1309
1310 Note: the number assignments listed below should be considered preliminary in
1311 nature; they are likely to be changed as a result of reviewer feedback.
1312
1313 NAG Interpretation
1314 --- --------------
1315 0 null annotation
1316 1 good move (traditional "!")
1317 2 poor move (traditional "?")
1318 3 very good move (traditional "!!")
1319 4 very poor move (traditional "??")
1320 5 speculative move (traditional "!?")
1321 6 questionable move (traditional "?!")
1322 7 forced move (all others lose quickly)
1323 8 singular move (no reasonable alternatives)
1324 9 worst move
1325 10 drawish position
1326 11 equal chances, quiet position
1327 12 equal chances, active position
1328 13 unclear position
1329 14 White has a slight advantage
1330 15 Black has a slight advantage
1331 16 White has a moderate advantage
1332 17 Black has a moderate advantage
1333 18 White has a decisive advantage
1334 19 Black has a decisive advantage
1335 20 White has a crushing advantage (Black should resign)
1336 21 Black has a crushing advantage (White should resign)
1337 22 White is in zugzwang
1338 23 Black is in zugzwang
1339 24 White has a slight space advantage
1340 25 Black has a slight space advantage
1341 26 White has a moderate space advantage
1342 27 Black has a moderate space advantage
1343 28 White has a decisive space advantage
1344 29 Black has a decisive space advantage
1345 30 White has a slight time (development) advantage
1346 31 Black has a slight time (development) advantage
1347 32 White has a moderate time (development) advantage
1348 33 Black has a moderate time (development) advantage
1349 34 White has a decisive time (development) advantage
1350 35 Black has a decisive time (development) advantage
1351 36 White has the initiative
1352 37 Black has the initiative
1353 38 White has a lasting initiative
1354 39 Black has a lasting initiative
1355 40 White has the attack
1356 41 Black has the attack
1357 42 White has insufficient compensation for material deficit
1358 43 Black has insufficient compensation for material deficit
1359 44 White has sufficient compensation for material deficit
1360 45 Black has sufficient compensation for material deficit
1361 46 White has more than adequate compensation for material deficit
1362 47 Black has more than adequate compensation for material deficit
1363 48 White has a slight center control advantage
1364 49 Black has a slight center control advantage
1365 50 White has a moderate center control advantage
1366 51 Black has a moderate center control advantage
1367 52 White has a decisive center control advantage
1368 53 Black has a decisive center control advantage
1369 54 White has a slight kingside control advantage
1370 55 Black has a slight kingside control advantage
1371 56 White has a moderate kingside control advantage
1372 57 Black has a moderate kingside control advantage
1373 58 White has a decisive kingside control advantage
1374 59 Black has a decisive kingside control advantage
1375 60 White has a slight queenside control advantage
1376 61 Black has a slight queenside control advantage
1377 62 White has a moderate queenside control advantage
1378 63 Black has a moderate queenside control advantage
1379 64 White has a decisive queenside control advantage
1380 65 Black has a decisive queenside control advantage
1381 66 White has a vulnerable first rank
1382 67 Black has a vulnerable first rank
1383 68 White has a well protected first rank
1384 69 Black has a well protected first rank
1385 70 White has a poorly protected king
1386 71 Black has a poorly protected king
1387 72 White has a well protected king
1388 73 Black has a well protected king
1389 74 White has a poorly placed king
1390 75 Black has a poorly placed king
1391 76 White has a well placed king
1392 77 Black has a well placed king
1393 78 White has a very weak pawn structure
1394 79 Black has a very weak pawn structure
1395 80 White has a moderately weak pawn structure
1396 81 Black has a moderately weak pawn structure
1397 82 White has a moderately strong pawn structure
1398 83 Black has a moderately strong pawn structure
1399 84 White has a very strong pawn structure
1400 85 Black has a very strong pawn structure
1401 86 White has poor knight placement
1402 87 Black has poor knight placement
1403 88 White has good knight placement
1404 89 Black has good knight placement
1405 90 White has poor bishop placement
1406 91 Black has poor bishop placement
1407 92 White has good bishop placement
1408 93 Black has good bishop placement
1409 94 White has poor rook placement
1410 95 Black has poor rook placement
1411 96 White has good rook placement
1412 97 Black has good rook placement
1413 98 White has poor queen placement
1414 99 Black has poor queen placement
1415 100 White has good queen placement
1416 101 Black has good queen placement
1417 102 White has poor piece coordination
1418 103 Black has poor piece coordination
1419 104 White has good piece coordination
1420 105 Black has good piece coordination
1421 106 White has played the opening very poorly
1422 107 Black has played the opening very poorly
1423 108 White has played the opening poorly
1424 109 Black has played the opening poorly
1425 110 White has played the opening well
1426 111 Black has played the opening well
1427 112 White has played the opening very well
1428 113 Black has played the opening very well
1429 114 White has played the middlegame very poorly
1430 115 Black has played the middlegame very poorly
1431 116 White has played the middlegame poorly
1432 117 Black has played the middlegame poorly
1433 118 White has played the middlegame well
1434 119 Black has played the middlegame well
1435 120 White has played the middlegame very well
1436 121 Black has played the middlegame very well
1437 122 White has played the ending very poorly
1438 123 Black has played the ending very poorly
1439 124 White has played the ending poorly
1440 125 Black has played the ending poorly
1441 126 White has played the ending well
1442 127 Black has played the ending well
1443 128 White has played the ending very well
1444 129 Black has played the ending very well
1445 130 White has slight counterplay
1446 131 Black has slight counterplay
1447 132 White has moderate counterplay
1448 133 Black has moderate counterplay
1449 134 White has decisive counterplay
1450 135 Black has decisive counterplay
1451 136 White has moderate time control pressure
1452 137 Black has moderate time control pressure
1453 138 White has severe time control pressure
1454 139 Black has severe time control pressure
1455
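The first rows of the table and the value ranges given above can be sketched in
Python as follows. This is only an illustration: the names SUFFIX_TO_NAG,
suffix_to_nag, and nag_category are hypothetical and are not part of the
standard. The "$" prefix follows the NAG token form defined in the movetext NAG
section, and the range from 1 to 9 is treated as move annotations, as the table
rows suggest.

```python
# Traditional suffix annotations and their NAG values (rows 1 to 6 of the table).
SUFFIX_TO_NAG = {"!": 1, "?": 2, "!!": 3, "??": 4, "!?": 5, "?!": 6}


def suffix_to_nag(suffix: str) -> str:
    """Return the NAG token (for example "$3") for a traditional suffix."""
    return "$" + str(SUFFIX_TO_NAG[suffix])


def nag_category(value: int) -> str:
    """Classify a NAG value by the ranges given in this section."""
    if value == 0:
        return "null annotation"
    if 1 <= value <= 9:
        return "move annotation"
    if 10 <= value <= 135:
        return "position annotation"
    if 136 <= value <= 139:
        return "time pressure"
    # Every other value is reserved for future definition.
    return "reserved"
```

For example, suffix_to_nag("!!") gives "$3", and nag_category(140) gives
"reserved".
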
1456 11: File names and directories
1457
1458 File names chosen for PGN data should be both informative and portable. The
1459 directory names and arrangements should be chosen for the same reasons and
1460 also for ease of navigation.
1461
1462 Some of the suggested file and directory names may be difficult or impossible to
1463 represent on certain computing systems. Use of appropriate conversion customs
1464 is encouraged.
1465
1466 11.1: File name suffix for PGN data
1467
1468 The use of the file suffix ".pgn" is encouraged for ASCII text files containing
1469 PGN data.
1470
1471 11.2: File name formation for PGN data for a specific player
1472
1473 PGN games for a specific player should have a file name consisting of the
1474 player's last name followed by the ".pgn" suffix.
1475
1476 11.3: File name formation for PGN data for a specific event
1477
1478 PGN games for a specific event should have a file name consisting of the
1479 event's name followed by the ".pgn" suffix.
1480
1481 11.4: File name formation for PGN data for chronologically ordered games
1482
1483 PGN data files used for chronologically ordered (oldest first) archives use
1484 date information as file name root strings. A file containing all the PGN
1485 games for a given year would have an eight character name in the format
1486 "YYYY.pgn". A file containing PGN data for a given month would have a ten
1487 character name in the format "YYYYMM.pgn". Finally, a file for PGN games for a
1488 single day would have a twelve character name in the format "YYYYMMDD.pgn".
1489 Large files are split into smaller files as needed.
1490
1491 As game files are commonly arranged by chronological order, games with missing
1492 or incomplete Date tag pair data are to be avoided. Any question mark
1493 characters in a Date tag value will be treated as zero digits for collation
1494 within a file and also for file naming.
1495
1496 Large quantities of PGN data arranged by chronological order should be
1497 organized into hierarchical directories. A directory containing all PGN data
1498 for a given year would have a four character name in the format "YYYY";
1499 directories containing PGN files for a given month would have a six character
1500 name in the format "YYYYMM".
1501
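The naming rules of this section can be sketched in Python as follows. This is
a minimal illustration and not a required implementation; the function names
are hypothetical, and the Date tag value is assumed to use the "YYYY.MM.DD"
form described in the Date tag section.

```python
def date_digits(date_tag: str) -> str:
    """Turn a Date tag value such as "1994.05.??" into "19940500".

    Question mark characters are treated as zero digits, as stated above.
    """
    year, month, day = date_tag.split(".")
    return (year + month + day).replace("?", "0")


def archive_file_name(date_tag: str, granularity: str) -> str:
    """Build "YYYY.pgn", "YYYYMM.pgn", or "YYYYMMDD.pgn" from a Date tag."""
    length = {"year": 4, "month": 6, "day": 8}[granularity]
    return date_digits(date_tag)[:length] + ".pgn"


def month_directory(date_tag: str) -> str:
    """Build the six character "YYYYMM" directory name."""
    return date_digits(date_tag)[:6]
```

With these definitions, archive_file_name("1994.05.??", "day") gives
"19940500.pgn", and month_directory("1994.05.12") gives "199405".
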
1502 11.5: Suggested directory tree organization
1503
1504 A suggested directory arrangement for ftp sites and CD-ROM distributions:
1505
1506 * PGN: master directory of the PGN subtree (pub/chess/Game-Databases/PGN)
1507
1508 * PGN/Events: directory of PGN files, each for a specific event
1509
1510 * PGN/Events/News: news and status of the event collection
1511
1512 * PGN/Events/ReadMe: brief description of the local directory contents
1513
1514 * PGN/MGR: directory of the Master Games Repository subtree
1515
1516 * PGN/MGR/News: news and status of the entire PGN/MGR subtree
1517
1518 * PGN/MGR/ReadMe: brief description of the local directory contents
1519
1520 * PGN/MGR/YYYY: directory of games or subtrees for the year YYYY
1521
1522 * PGN/MGR/YYYY/ReadMe: description of local directory for year YYYY
1523
1524 * PGN/MGR/YYYY/News: news and status for year YYYY data
1525
1526 * PGN/News: news and status of the entire PGN subtree
1527
1528 * PGN/Players: directory of PGN files, each for a specific player
1529
1530 * PGN/Players/News: news and status of the player collection
1531
1532 * PGN/Players/ReadMe: brief description of the local directory contents
1533
1534 * PGN/ReadMe: brief description of the local directory contents
1535
1536 * PGN/Standard: the PGN standard (this document)
1537
1538 * PGN/Tools: software utilities that access PGN data
1539
1540 12: PGN collating sequence
1541
1542 There is a standard sorting order for PGN games within a file. This collation
1543 is based on eight keys; these are the seven tag values of the STR and also the
1544 movetext itself.
1545
1546 The first (most important, primary key) is the Date tag. Earlier dated games
1547 appear prior to games played at a later date. This field is sorted by
1548 ascending numeric value first with the year, then the month, and finally the
1549 day of the month. Query characters used for unknown date digit values will be
1550 treated as zero digit characters for ordering comparison.
1551
1552 The second key is the Event tag. This is sorted in ascending ASCII order.
1553
1554 The third key is the Site tag. This is sorted in ascending ASCII order.
1555
1556 The fourth key is the Round tag. This is sorted in ascending numeric order
1557 based on the value of the integer used to denote the playing round. A query or
1558 hyphen used for the round is ordered before any integer value. A query
1559 character is ordered before a hyphen character.
1560
1561 The fifth key is the White tag. This is sorted in ascending ASCII order.
1562
1563 The sixth key is the Black tag. This is sorted in ascending ASCII order.
1564
1565 The seventh key is the Result tag. This is sorted in ascending ASCII order.
1566
1567 The eighth key is the movetext itself. This is sorted in ascending ASCII order
1568 with the entire text including spaces and newline characters.
1569
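The eight keys can be combined into a single sort key. The Python sketch below
is one possible reading of the rules above; the function names and the use of a
dictionary for the tag values are assumptions made for illustration. Python
compares strings by code point, which matches ASCII order for ASCII text. The
text above does not say how a round value that is not a plain integer is
ordered, so the sketch places such values after all integers; this is an
assumption only.

```python
def date_key(date_tag: str) -> tuple:
    """Year, month, and day as integers; "?" digits count as zero."""
    year, month, day = date_tag.replace("?", "0").split(".")
    return (int(year), int(month), int(day))


def round_key(round_tag: str) -> tuple:
    """Order "?" before "-", and both before any integer round."""
    if round_tag == "?":
        return (0, 0, "")
    if round_tag == "-":
        return (1, 0, "")
    if round_tag.isascii() and round_tag.isdigit():
        return (2, int(round_tag), "")
    # Assumption: other round values sort last, in ASCII order.
    return (3, 0, round_tag)


def collation_key(tags: dict, movetext: str) -> tuple:
    """Sort key built from the seven STR tag values and the movetext."""
    return (
        date_key(tags["Date"]),
        tags["Event"],
        tags["Site"],
        round_key(tags["Round"]),
        tags["White"],
        tags["Black"],
        tags["Result"],
        movetext,
    )
```

A list of (tags, movetext) pairs could then be ordered with
games.sort(key=lambda game: collation_key(*game)).
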
1570 13: PGN software
1571
1572 This section describes some PGN software that is either currently available or
1573 expected to be available in the near future. The entries are presented in
1574 rough chronological order of their being made known to the PGN standard
1575 coordinator. Authors of PGN capable software are encouraged to contact the
1576 coordinator (e-mail address listed near the start of this document) so that the
1577 information may be included here in this section.
1578
1579 In addition to the PGN standard, there are two more chess standards of interest
1580 to the chess software community. These are the FEN standard (Forsyth-Edwards
1581 Notation) for position notation and the EPD standard (Extended Position
1582 Description) for comprehensive position description for automated interprogram
1583 processing. These are described in a later section of this document.
1584
1585 Some PGN software is freeware and can be gotten from ftp sites and other
1586 sources. Other PGN software is payware and appears as part of commercial
1587 chessplaying programs and chess database managers. Those who are interested in
1588 the propagation of the PGN standard are encouraged to support manufacturers of
1589 chess software that use the standard. If a particular vendor does not offer
1590 PGN compatibility, it is likely that a few letters to them along with a copy of
1591 this specification may help them decide to include PGN support in their next
1592 release.
1593
1594 The staff at the University of Oklahoma at Norman (USA) have graciously
1595 provided an ftp site (chess.uoknor.edu) for the storage of chess related data
1596 and programs. Because file names change over time, those accessing the site
1597 are encouraged to first retrieve the file "pub/chess/ls-lR.gz" for a current
1598 listing. A scan of this listing will also help locate versions of PGN programs
1599 for machine types and operating systems other than those listed below. Further
1600 information about this archive can be gotten from its administrator, Chris
1601 Petroff (chris@uoknor.edu).
1602
1603 For European users, the kind staff at the University of Hamburg (Germany) have
1604 provided the ftp site ftp.math.uni-hamburg.de; this carries a daily mirror of
1605 the pub/chess directory at the chess.uoknor.edu site.
1606
1607 13.1: The SAN Kit
1608
1609 The "SAN Kit" is an ANSI C source chess programming toolkit available for free
1610 from the ftp site chess.uoknor.edu in the directory pub/chess/Unix as the file
1611 "SAN.tar.gz" (a gzip tar archive). This kit contains code for PGN import and
1612 export and can be used to "regularize" PGN data into reduced export format by
1613 use of its "tfgg" command. The SAN Kit also supports FEN I/O. Code from this
1614 kit is freely redistributable for anyone as long as future distribution is
1615 unhindered for everyone. The SAN Kit is undergoing continuous development,
1616 although dates of future deliveries are quite difficult to predict and releases
1617 sometimes appear months apart. Suggestions and comments should be directed to
1618 its author, Steven J. Edwards (sje@world.std.com).
1619
1620 13.2: pgnRead
1621
1622 The program "pgnRead" runs under MS Windows 3.1 and provides an interactive
1623 graphical user interface for scanning PGN data files. This program includes a
1624 colorful figurine chessboard display and scrolling controls for game and game
1625 text selection. It is available from the chess.uoknor.edu ftp site in the
1626 pub/chess/DOS directory; several versions are available with names of the form
1627 "pgnrd**.exe"; the latest at this writing is "PGNRD130.EXE". Suggestions and
1628 comments should be directed to its author, Keith Fuller (keithfx@aol.com).
1629
1630 13.3: mail2pgn/GIICS
1631
1632 The program "mail2pgn" produces a PGN version of chess game data generated by
1633 the ICS (Internet Chess Server). It can be found at the chess.uoknor.edu ftp
1634 site in the pub/chess/DOS directory as the file "mail2pgn.zip". A C language
1635 version is in the directory pub/chess/Unix as the file "mail2pgn.c".
1636 Suggestions and comments should be directed to its author, John Aronson
1637 (aronson@helios.ece.arizona.edu). This code has reportedly been incorporated
1638 into the GIICS (Graphical Interface for the ICS); suggestions and comments
1639 should be directed to its author, Tony Acero (ace3@midway.uchicago.edu).
1640
1641 There is a report that mail2pgn has been superseded by the newer program
1642 "MV2PGN" described below.
1643
1644 13.4: XBoard
1645
1646 "XBoard" is a comprehensive chess utility running under the X Window System
1647 that provides a graphical user interface in a portable manner. A new version
1648 now handles PGN data. It is available from the chess.uoknor.edu ftp site in
1649 the pub/chess/X directory as the file "xboard-3.0.pl9.tar.gz". Suggestions and
1650 comments should be directed to its author, Tim Mann (mann@src.dec.com).
1651
1652 13.5: cupgn
1653
1654 The program "cupgn" converts game data stored in the ChessBase format into PGN.
1655 It is available from the chess.uoknor.edu ftp site in the
1656 pub/chess/Game-Databases/CBUFF directory as the file "cupgn.tar.gz". Another
1657 version is in the directory pub/chess/DOS as the file "cupgn120.exe".
1658 Suggestions and comments should be directed to its author, Anjo Anjewierden
1659 (anjo@swi.psy.uva.nl).
1660
1661 13.6: Zarkov
1662
1663 The current version (3.0) of the commercial chessplaying program "Zarkov" can
1664 read and write games using PGN. This program can also use the EPD standard for
1665 communication with other EPD capable programs. Historically, Zarkov is the
1666 very first program to use EPD. Suggestions and comments should be directed to
1667 its author, John Stanback (jhs@icbdfcs1.fc.hp.com).
1668
1669 A vendor for North America is:
1670
1671 International Chess Enterprises
1672 P.O. Box 19457
1673 Seattle, WA 98109
1674 USA
1675 (800) 262-4277
1676
1677 A vendor for Europe is:
1678
1679 Gambit-Soft
1680 Feckenhauser Strasse 27
1681 D-78628 Rottweil
1682 GERMANY
1683 49-741-21573
1684
1685 13.7: Chess Assistant
1686
1687 The upcoming version of the multifunction commercial database program "Chess
1688 Assistant" will be able to use the PGN standard as an import and export option.
1689 There is a report of a freeware program, "PGN2CA", that will convert PGN
1690 databases into Chess Assistant format. For more information, the contact is
1691 Victor Zakharov, one of the members of the Chess Assistant development team
1692 (VICTOR@ldis.cs.msu.su).
1693
1694 A vendor for North America is:
1695
1696 International Chess Enterprises
1697 P.O. Box 19457
1698 Seattle, WA 98109
1699 USA
1700 (800) 262-4277
1701
1702 13.8: BOOKUP
1703
1704 The MS-DOS edition of the multifunction commercial program BOOKUP, version 8.1,
1705 is able to use the EPD standard for communication with other EPD capable
1706 programs. It may be PGN capable as well.
1707
1708 The BOOKUP 8.1.1 Addenda notes dated 1993.12.17 provide comprehensive
1709 information on how to use EPD in conjunction with "analyst" programs such as
1710 Zarkov and HIARCS. Specifically, the search and evaluation abilities of an
1711 analyst program are combined with the information organization abilities of the
1712 BOOKUP database program to provide position scoring. This is done by first
1713 having BOOKUP export a database in EPD format, then having an analyst program
1714 annotate each EPD record with a numeric score, and then having BOOKUP import
1715 the changed EPD file. BOOKUP can then apply minimaxing to the imported
1716 database; this results in scores from terminal positions being propagated back
1717 to earlier positions and even back to moves from the starting array.
1718
1719 For some reason, BOOKUP calls this process "backsolving", but it's really just
1720 standard minimaxing. In any case, it's a good example of how different
1721 programs from different authors performing different types of tasks can be
1722 integrated by use of a common, non-proprietary standard. This allows for a new
1723 set of powerful features that are beyond the capabilities of any one of the
1724 individual component programs.
1725
1726 BOOKUP allows for some customizing of EPD actions. One such customization is
1727 to require the positional evaluations to follow the EPD standard; this means
1728 that the score is always given from the viewpoint of the active player. This
1729 is explained more fully in the section on the "ce" (centipawn evaluation)
1730 opcode in the EPD description in a later section of this document. To ensure
1731 that BOOKUP handles the centipawn evaluations in the "right" way, the EPD
1732 setting "Positive for White" must be set to "N". This makes BOOKUP work
1733 correctly with Zarkov and with all other programs that use the "right"
1734 centipawn evaluation convention. There is an apparent problem with HIARCS that
1735 requires this option to be set to "Y"; but this really means that, if true,
1736 HIARCS needs to be adjusted to use the "right" centipawn evaluation convention.
1737
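The difference between the two conventions amounts to a sign change when Black
is the active player. The Python sketch below illustrates this; it is not code
from BOOKUP, Zarkov, or HIARCS, and the function name is hypothetical.

```python
def to_active_player_view(score_for_white: int, active_color: str) -> int:
    """Convert a centipawn score given from White's viewpoint into a score
    given from the viewpoint of the active player ("w" or "b")."""
    if active_color == "w":
        return score_for_white
    if active_color == "b":
        # The same position is worth the opposite amount to Black.
        return -score_for_white
    raise ValueError("active color must be 'w' or 'b'")
```

For example, a score of 150 in White's favor becomes -150 when Black is the
active player.
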
1738 A vendor in North America is:
1739
1740 BOOKUP
1741 2763 Kensington Place West
1742 Columbus, OH 43202
1743 USA
1744 (800) 949-5445
1745 (614) 263-7219
1746
1747 13.9: HIARCS
1748
1749 The current version (2.1) of the commercial chessplaying program "HIARCS" is
1750 able to use the EPD standard for communication with other EPD capable programs.
1751 It may be PGN capable as well. More details will appear here as they
1752 become available.
1753
1754 A vendor in North America is:
1755
1756 HIARCS
1757 c/o BOOKUP
1758 2763 Kensington Place West
1759 Columbus, OH 43202
1760 USA
1761 (800) 949-5445
1762 (614) 263-7219
1763
1764 13.10: Deja Vu
1765
1766 The chess database "Deja Vu" from ChessWorks is a PGN compatible collection of
1767 over 300,000 games. It is available only on CD-ROM and is scheduled for
1768 release in 1994.05 with periodic revisions thereafter. The introductory price
1769 is US$329. For further information, the authors are John Crayton and Eric
1770 Schiller and they can be contacted via e-mail (chesswks@netcom.com).
1771
1772 13.11: MV2PGN
1773
1774 The program "MV2PGN" can be used to convert game data generated by both current
1775 and older versions of the GIICS (Graphical Interface - Internet Chess Server).
1776 The program is included in the self extracting archive available from
1777 chess.uoknor.edu in the directory pub/chess/DOS as the file "ics2pgn.exe".
1778 Source code is also included. This program is reported to supersede the older
1779 "mail2pgn" and was needed due to a change in ICS recording format in late 1993.
1780 For further information about MV2PGN, the contact person is Gary Bastin
1781 (gbastin@x102a.ess.harris.com).
1782
1783 13.12: The Hansen utilities (cb2pgn, nic2pgn, pgn2cb, pgn2nic)
1784
1785 The Hansen utilities are used to convert among various chess data
1786 representation formats. The PGN related programs include: "cb2pgn.exe"
1787 (convert ChessBase to PGN), "nic2pgn.exe" (convert NIC to PGN), "pgn2cb.exe"
1788 (convert PGN to ChessBase), and "pgn2nic.exe" (convert PGN to NIC).
1789
1790 The ChessBase related utilities (cb2pgn/pgn2cb) are found at chess.uoknor.edu
1791 in the pub/chess/Game-Databases/ChessBase directory.
1792
1793 The NIC related utilities (nic2pgn/pgn2nic) are found at chess.uoknor.edu in
1794 the pub/chess/Game-Databases/NIC directory.
1795
1796 For further information about the Hansen utilities, the contact person is the
1797 author, Carsten Hansen (ch0506@hdc.hha.dk).
1798
1799 13.13: Slappy the Database
1800
1801 "Slappy the Database" is a commercial chess database and translation program
1802 scheduled for release no sooner than late 1994. It is a low cost utility with
1803 a simple character interface intended for those who want a supported product
1804 but who do not need (or cannot afford) a comprehensive, feature-laden program
1805 with a graphical user interface. Slappy's two most important features are its
1806 batch processing ability and its full implementation of each and every standard
1807 described in this document. Versions of Slappy the Database will be provided
1808 for various platforms including: Intel 386/486 Unix, Apple Macintosh, and
1809 MS-DOS.
1810
1811 Slappy may also be useful to those who have a full feature program and who also
1812 need to run time consuming chess database tasks on a spare computer.
1813
1814 Suggestions and comments should be directed to its author, Steven J. Edwards
1815 (sje@world.std.com). More details will appear here as they become available.
1816
1817 13.14: CBASCII
1818
1819 "CBASCII" is a general utility for converting chess data between ChessBase
1820 format and ASCII representations. It has PGN capability, and it is available
1821 from the chess.uoknor.edu ftp site in the pub/chess/DOS directory as the file
1822 "cba1_2.zip". The contact person is the program's author, Andy Duplain
1823 (duplain@btcs.bt.co.uk).
1824
1825 13.15: ZZZZZZ
1826
1827 "ZZZZZZ" is a chessplaying program, complete with source, that also includes
1828 some database functions. A recent version is reported to have both PGN and EPD
1829 capabilities. It is available from the chess.uoknor.edu ftp site in the
1830 pub/chess/Unix directory as the file "zzzzzz-3.2b1.tar.gz". The contact person
1831 is its author, Gijsbert Wiesenecker (wiesenecker@sara.nl).
1832
1833 13.16: icsconv
1834
1835 The program "icsconv" can be used to convert Internet Chess Server games, both
1836 old and new format, to PGN. It is available from the chess.uoknor.edu site in
1837 the pub/chess/Game-Databases/PGN/Tools directory as the file "icsconv.exe".
1838 The contact person is the author, Kevin Nomura (chow@netcom.com).
1839
1840 13.17: CHESSOP (CHESSOPN/CHESSOPG)
1841
1842 CHESSOP is an openings database and viewing tool with support for reading PGN
1843 games. It runs under MS-DOS and displays positions rather than games. For
1844 each position, both good and bad moves are listed with appropriate annotation.
1845 Transpositions are handled as well. The distributed database contains over
1846 100,000 positions covering all the common openings. Users can feed in their
1847 own PGN data as well. CHESSOP takes 3 Mbyte of hard disk, costs US$39 and can
1848 be obtained from:
1849
1850 CHESSX Software
1851 12 Bluebell Close
1852 Glenmore Park
1853 AUSTRALIA 2745.
1854
1855 The ideas behind CHESSOP can be seen in CHESSOPN (alias CHESSOPG), a free
1856 version on the ICS server which has a reduced openings database (25,000
1857 positions) and no PGN or transposition support but is otherwise the same as
1858 CHESSOP. (These are the files "chessopg.zip" in the directory pub/chess/DOS at
1859 the chess.uoknor.edu ftp site.)
1860
1861 13.18: CAT2PGN
1862
1863 The program "CAT2PGN" is a utility that translates data from the format used by
1864 Chess Assistant into PGN. It is available from the chess.uoknor.edu ftp site.
1865 The contact person for CAT2PGN is its author, David Myers
1866 (myers@frodo.biochem.duke.edu).
1867
1868 13.19: pgn2opg
1869
1870 The utility "pgn2opg" can be used to convert PGN files into a text format used
1871 by the "CHESSOPG" program mentioned above. Although it does not perform any
1872 semantic analysis on PGN input, it has been demonstrated to handle known
1873 correct PGN input properly. The file can be found in the pub/chess/PGN/Tools
1874 directory at the chess.uoknor.edu ftp site. For more information, the author
1875 is David Barnes (djb@ukc.ac.uk).
1876
1877 14: PGN data archives
1878
1879 The primary PGN data archive repository is located at the ftp site
1880 chess.uoknor.edu as the directory "pub/chess/Game-Databases/PGN". It is
1881 organized according to the description given in section 11.5 of this document.
1882 The European site ftp.math.uni-hamburg.de is also reported to carry a regularly
1883 updated copy of the repository.
1884
1885 15: International Olympic Committee country codes
1886
1887 International Olympic Committee country codes are employed for Site nation
1888 information because of their traditional use with the reporting of
1889 international sporting events. Due to changes in geography and linguistic
1890 custom, some of the following may be incorrect or outdated. Corrections and
1891 extensions should be sent via e-mail to the PGN coordinator, whose address is
1892 listed near the start of this document.
1893
1894 AFG: Afghanistan
1895 AIR: Aboard aircraft
1896 ALB: Albania
1897 ALG: Algeria
1898 AND: Andorra
1899 ANG: Angola
1900 ANT: Antigua
1901 ARG: Argentina
1902 ARM: Armenia
1903 ATA: Antarctica
1904 AUS: Australia
1905 AZB: Azerbaijan
1906 BAN: Bangladesh
1907 BAR: Bahrain
1908 BHM: Bahamas
1909 BEL: Belgium
1910 BER: Bermuda
1911 BIH: Bosnia and Herzegovina
1912 BLA: Belarus
1913 BLG: Bulgaria
1914 BLZ: Belize
1915 BOL: Bolivia
1916 BRB: Barbados
1917 BRS: Brazil
1918 BRU: Brunei
1919 BSW: Botswana
1920 CAN: Canada
1921 CHI: Chile
1922 COL: Colombia
1923 CRA: Costa Rica
1924 CRO: Croatia
1925 CSR: Czechoslovakia
1926 CUB: Cuba
1927 CYP: Cyprus
1928 DEN: Denmark
1929 DOM: Dominican Republic
1930 ECU: Ecuador
1931 EGY: Egypt
1932 ENG: England
1933 ESP: Spain
1934 EST: Estonia
1935 FAI: Faroe Islands
1936 FIJ: Fiji
1937 FIN: Finland
1938 FRA: France
1939 GAM: Gambia
1940 GCI: Guernsey-Jersey
1941 GEO: Georgia
1942 GER: Germany
1943 GHA: Ghana
1944 GRC: Greece
1945 GUA: Guatemala
1946 GUY: Guyana
1947 HAI: Haiti
1948 HKG: Hong Kong
1949 HON: Honduras
1950 HUN: Hungary
1951 IND: India
1952 IRL: Ireland
1953 IRN: Iran
1954 IRQ: Iraq
1955 ISD: Iceland
1956 ISR: Israel
1957 ITA: Italy
1958 IVO: Ivory Coast
1959 JAM: Jamaica
1960 JAP: Japan
1961 JRD: Jordan
1962 JUG: Yugoslavia
1963 KAZ: Kazakhstan
1964 KEN: Kenya
1965 KIR: Kyrgyzstan
1966 KUW: Kuwait
1967 LAT: Latvia
1968 LEB: Lebanon
1969 LIB: Libya
1970 LIC: Liechtenstein
1971 LTU: Lithuania
1972 LUX: Luxembourg
1973 MAL: Malaysia
1974 MAU: Mauritania
1975 MEX: Mexico
1976 MLI: Mali
1977 MLT: Malta
1978 MNC: Monaco
1979 MOL: Moldova
1980 MON: Mongolia
1981 MOZ: Mozambique
1982 MRC: Morocco
1983 MRT: Mauritius
1984 MYN: Myanmar
1985 NCG: Nicaragua
1986 NET: The Internet
1987 NIG: Nigeria
1988 NLA: Netherlands Antilles
1989 NLD: Netherlands
1990 NOR: Norway
1991 NZD: New Zealand
1992 OST: Austria
1993 PAK: Pakistan
1994 PAL: Palestine
1995 PAN: Panama
1996 PAR: Paraguay
1997 PER: Peru
1998 PHI: Philippines
1999 PNG: Papua New Guinea
2000 POL: Poland
2001 POR: Portugal
2002 PRC: People's Republic of China
2003 PRO: Puerto Rico
2004 QTR: Qatar
2005 RIN: Indonesia
2006 ROM: Romania
2007 RUS: Russia
2008 SAF: South Africa
2009 SAL: El Salvador
2010 SCO: Scotland
2011 SEA: At Sea
2012 SEN: Senegal
2013 SEY: Seychelles
2014 SIP: Singapore
2015 SLV: Slovenia
2016 SMA: San Marino
2017 SPC: Aboard spacecraft
2018 SRI: Sri Lanka
2019 SUD: Sudan
2020 SUR: Surinam
2021 SVE: Sweden
2022 SWZ: Switzerland
2023 SYR: Syria
2024 TAI: Thailand
2025 TMT: Turkmenistan
2026 TRK: Turkey
2027 TTO: Trinidad and Tobago
2028 TUN: Tunisia
2029 UAE: United Arab Emirates
2030 UGA: Uganda
2031 UKR: Ukraine
2032 UNK: Unknown
2033 URU: Uruguay
2034 USA: United States of America
2035 UZB: Uzbekistan
2036 VEN: Venezuela
2037 VGB: British Virgin Islands
2038 VIE: Vietnam
2039 VUS: U.S. Virgin Islands
2040 WLS: Wales
2041 YEM: Yemen
2042 YUG: Yugoslavia
2043 ZAM: Zambia
2044 ZIM: Zimbabwe
2045 ZRE: Zaire
2046
2047 16: Additional chess data standards
2048
2049 While PGN is used for game storage, there are other data representation
2050 standards for other chess related purposes. Two important standards are FEN
2051 and EPD, both described in this section.
2052
2053 16.1: FEN
2054
2055 FEN is "Forsyth-Edwards Notation"; it is a standard for describing chess
2056 positions using the ASCII character set.
2057
2058 A single FEN record uses one text line of variable length composed of six data
2059 fields. The first four fields of the FEN specification are the same as the
2060 first four fields of the EPD specification.
2061
2062 A text file composed exclusively of FEN data records should have a file name
2063 with the suffix ".fen".
2064
2065 16.1.1: History
2066
2067 FEN is based on a 19th century standard for position recording designed by the
2068 Scotsman David Forsyth, a newspaper journalist. The original Forsyth standard
2069 has been slightly extended for use with chess software by Steven Edwards with
2070 assistance from commentators on the Internet. This new standard, FEN, was
2071 first implemented in Edwards' SAN Kit.
2072
2073 16.1.2: Uses for a position notation
2074
2075 Having a standard position notation is particularly important for chess
2076 programmers as it allows them to share position databases. For example, there
2077 exist standard position notation databases with many of the classical benchmark
2078 tests for chessplaying programs, and by using a common position notation format
2079 many hours of tedious data entry can be saved. Additionally, a position
2080 notation can be useful for page layout programs and for confirming position
2081 status for e-mail competition.
2082
2083 Many interesting chess problem sets represented using FEN can be found at the
2084 chess.uoknor.edu ftp site in the directory pub/chess/SAN_testsuites.
2085
2086 16.1.3: Data fields
2087
2088 FEN specifies the piece placement, the active color, the castling availability,
2089 the en passant target square, the halfmove clock, and the fullmove number.
2090 These can all fit on a single text line in an easily read format. The length
2091 of a FEN position description varies somewhat according to the position. In
2092 some cases, the description could be eighty or more characters in length and so
2093 may not fit conveniently on some displays. However, these positions aren't too
2094 common.
2095
2096 A FEN description has six fields. Each field is composed only of non-blank
2097 printing ASCII characters. Adjacent fields are separated by a single ASCII
2098 space character.
2099
2100 16.1.3.1: Piece placement data
2101
2102 The first field represents the placement of the pieces on the board. The board
2103 contents are specified starting with the eighth rank and ending with the first
2104 rank. For each rank, the squares are specified from file a to file h. White
2105 pieces are identified by uppercase SAN piece letters ("PNBRQK") and black
2106 pieces are identified by lowercase SAN piece letters ("pnbrqk"). Empty squares
2107 are represented by the digits one through eight; the digit used represents the
2108 count of contiguous empty squares along a rank. A solidus character "/" is
2109 used to separate data of adjacent ranks.
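
As an illustration of the piece placement rules, the following Python sketch expands the first field into an eight by eight list. The function name "expand_placement" and the use of None for an empty square are assumptions of this example, not part of the standard.

```python
def expand_placement(placement):
    """Expand FEN piece placement data into 8 rows of 8 squares.

    Row 0 is the eighth rank and column 0 is file a; an empty square
    is represented by None.
    """
    ranks = placement.split("/")
    if len(ranks) != 8:
        raise ValueError("expected eight ranks separated by '/'")
    board = []
    for rank in ranks:
        row = []
        for ch in rank:
            if ch in "12345678":
                row.extend([None] * int(ch))   # run of empty squares
            elif ch in "PNBRQKpnbrqk":
                row.append(ch)                 # white upper, black lower
            else:
                raise ValueError("unexpected character %r" % ch)
        if len(row) != 8:
            raise ValueError("rank %r does not describe eight squares" % rank)
        board.append(row)
    return board


board = expand_placement("rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR")
print(board[4][4])   # square e4
```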
2110
2111 16.1.3.2: Active color
2112
2113 The second field represents the active color. A lower case "w" is used if
2114 White is to move; a lower case "b" is used if Black is the active player.
2115
2116 16.1.3.3: Castling availability
2117
2118 The third field represents castling availability. This indicates potential
2119 future castling that may or may not be possible at the moment due to blocking
2120 pieces or enemy attacks. If there is no castling availability for either side,
2121 the single character symbol "-" is used. Otherwise, a combination of from one
2122 to four characters is present. If White has kingside castling availability,
2123 the uppercase letter "K" appears. If White has queenside castling
2124 availability, the uppercase letter "Q" appears. If Black has kingside castling
2125 availability, the lowercase letter "k" appears. If Black has queenside
2126 castling availability, then the lowercase letter "q" appears. Those letters
2127 which appear will be ordered first uppercase before lowercase and second
2128 kingside before queenside. There is no white space between the letters.
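
The ordering rule for the castling field can be expressed as a single pattern. This Python sketch is one possible check, with the function name "valid_castling" chosen for illustration.

```python
import re

# Either "-" alone, or a non-empty subsequence of "KQkq" in that order.
_CASTLING = re.compile(r"-|(?=.)K?Q?k?q?")


def valid_castling(field):
    """Return True if the field follows the castling availability rules."""
    return _CASTLING.fullmatch(field) is not None


print(valid_castling("KQkq"), valid_castling("Kq"), valid_castling("QK"))
```

The lookahead "(?=.)" rejects the empty string, which would otherwise satisfy four optional letters.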
2129
2130 16.1.3.4: En passant target square
2131
2132 The fourth field is the en passant target square. If there is no en passant
2133 target square then the single character symbol "-" appears. If there is an en
2134 passant target square then it is represented by a lowercase file character
2135 immediately followed by a rank digit. Obviously, the rank digit will be "3"
2136 following a white pawn double advance (Black is the active color) or else be
2137 the digit "6" after a black pawn double advance (White being the active color).
2138
2139 An en passant target square is given if and only if the last move was a pawn
2140 advance of two squares. Therefore, an en passant target square field may have
2141 a square name even if there is no pawn of the opposing side that may
2142 immediately execute the en passant capture.
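
The relation between the en passant target square and the active color can be sketched as follows in Python. The function name "valid_en_passant" is an assumption of this example; the check covers only the syntax described above and does not inspect the board.

```python
def valid_en_passant(field, active):
    """Check an en passant field against the active color ("w" or "b")."""
    if active not in ("w", "b"):
        raise ValueError("active color must be 'w' or 'b'")
    if field == "-":
        return True
    if len(field) != 2 or field[0] not in "abcdefgh":
        return False
    # Rank 3 follows a white double advance (Black to move);
    # rank 6 follows a black double advance (White to move).
    expected_rank = "3" if active == "b" else "6"
    return field[1] == expected_rank


print(valid_en_passant("e3", "b"), valid_en_passant("e3", "w"))
```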
2143
2144 16.1.3.5: Halfmove clock
2145
2146 The fifth field is a nonnegative integer representing the halfmove clock. This
2147 number is the count of halfmoves (or ply) since the last pawn advance or
2148 capturing move. This value is used for the fifty move draw rule.
2149
2150 16.1.3.6: Fullmove number
2151
2152 The sixth and last field is a positive integer that gives the fullmove number.
2153 This will have the value "1" for the first move of a game for both White and
2154 Black. It is incremented by one immediately after each move by Black.
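
The update rules for the fifth and sixth fields can be sketched in Python as shown here. The function and parameter names are assumptions of this example; the caller is assumed to know whether the move was a pawn advance or a capture.

```python
def advance_clocks(halfmove, fullmove, active, resets_clock):
    """Return (halfmove, fullmove) after the active player has moved.

    active is "w" or "b" for the player making the move; resets_clock
    is True when the move is a pawn advance or a capturing move.
    """
    halfmove = 0 if resets_clock else halfmove + 1
    if active == "b":
        fullmove += 1          # incremented after each move by Black
    return halfmove, fullmove


clocks = advance_clocks(0, 1, "w", True)          # 1. e4
clocks = advance_clocks(*clocks, "b", True)       # 1. ... c5
clocks = advance_clocks(*clocks, "w", False)      # 2. Nf3
print(clocks)
```

The three calls reproduce the halfmove clock and fullmove number values of the first moves shown in section 16.1.4.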
2155
2156 16.1.4: Examples
2157
2158 Here's the FEN for the starting position:
2159
2160 rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
2161
2162 And after the move 1. e4:
2163
2164 rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1
2165
2166 And then after 1. ... c5:
2167
2168 rnbqkbnr/pp1ppppp/8/2p5/4P3/8/PPPP1PPP/RNBQKBNR w KQkq c6 0 2
2169
2170 And then after 2. Nf3:
2171
2172 rnbqkbnr/pp1ppppp/8/2p5/4P3/5N2/PPPP1PPP/RNBQKB1R b KQkq - 1 2
2173
2174 For two kings on their home squares and a white pawn on e2 (White to move) with
2175 thirty eight full moves played with five halfmoves since the last pawn move or
2176 capture:
2177
2178 4k3/8/8/8/8/8/4P3/4K3 w - - 5 39
2179
2180 16.2: EPD
2181
2182 EPD is "Extended Position Description"; it is a standard for describing chess
2183 positions along with an extended set of structured attribute values using the
2184 ASCII character set. It is intended for data and command interchange among
2185 chessplaying programs. It is also intended for the representation of portable
2186 opening library repositories.
2187
2188 A single EPD uses one text line of variable length composed of four data fields
2189 followed by zero or more operations. The four fields of the EPD specification
2190 are the same as the first four fields of the FEN specification.
2191
2192 A text file composed exclusively of EPD data records should have a file name
2193 with the suffix ".epd".
2194
2195 16.2.1: History
2196
2197 EPD is based in part on the earlier FEN standard; it has added extensions for
2198 use with opening library preparation and also for general data and command
2199 interchange among advanced chess programs. EPD was developed by John Stanback
2200 and Steven Edwards; its first implementation is in Stanback's master strength
2201 chessplaying program Zarkov.
2202
2203 16.2.2: Uses for an extended position notation
2204
2205 Like FEN, EPD can also be used for general position description. However,
2206 unlike FEN, EPD is designed to be expandable by the addition of new operations
2207 that provide new functionality as needs arise.
2208
2209 Many interesting chess problem sets represented using EPD can be found at the
2210 chess.uoknor.edu ftp site in the directory pub/chess/SAN_testsuites.
2211
2212 16.2.3: Data fields
2213
2214 EPD specifies the piece placement, the active color, the castling availability,
2215 and the en passant target square of a position. These can all fit on a single
2216 text line in an easily read format. The length of an EPD position description
2217 varies somewhat according to the position and any associated operations. In
2218 some cases, the description could be eighty or more characters in length and so
2219 may not fit conveniently on some displays. However, most EPD descriptions pass
2220 among programs only and these are not usually seen by program users.
2221
2222 (Note: due to the likelihood of future expansion of EPD, implementors are
2223 encouraged to have their programs handle EPD text lines of up to 1024
2224 characters long.)
2225
2226 Each EPD data field is composed only of non-blank printing ASCII characters.
2227 Adjacent data fields are separated by a single ASCII space character.
2228
2229 16.2.3.1: Piece placement data
2230
2231 The first field represents the placement of the pieces on the board. The board
2232 contents are specified starting with the eighth rank and ending with the first
2233 rank. For each rank, the squares are specified from file a to file h. White
2234 pieces are identified by uppercase SAN piece letters ("PNBRQK") and black
2235 pieces are identified by lowercase SAN piece letters ("pnbrqk"). Empty squares
2236 are represented by the digits one through eight; the digit used represents the
2237 count of contiguous empty squares along a rank. A solidus character "/" is
2238 used to separate data of adjacent ranks.
2239
2240 16.2.3.2: Active color
2241
2242 The second field represents the active color. A lower case "w" is used if
2243 White is to move; a lower case "b" is used if Black is the active player.
2244
2245 16.2.3.3: Castling availability
2246
2247 The third field represents castling availability. This indicates potential
2248 future castling that may or may not be possible at the moment due to blocking
2249 pieces or enemy attacks. If there is no castling availability for either side,
2250 the single character symbol "-" is used. Otherwise, a combination of from one
2251 to four characters is present. If White has kingside castling availability,
2252 the uppercase letter "K" appears. If White has queenside castling
2253 availability, the uppercase letter "Q" appears. If Black has kingside castling
2254 availability, the lowercase letter "k" appears. If Black has queenside
2255 castling availability, then the lowercase letter "q" appears. Those letters
2256 which appear will be ordered first uppercase before lowercase and second
2257 kingside before queenside. There is no white space between the letters.
2258
2259 16.2.3.4: En passant target square
2260
2261 The fourth field is the en passant target square. If there is no en passant
2262 target square then the single character symbol "-" appears. If there is an en
2263 passant target square then it is represented by a lowercase file character
2264 immediately followed by a rank digit. Obviously, the rank digit will be "3"
2265 following a white pawn double advance (Black is the active color) or else be
2266 the digit "6" after a black pawn double advance (White being the active color).
2267
2268 An en passant target square is given if and only if the last move was a pawn
2269 advance of two squares. Therefore, an en passant target square field may have
2270 a square name even if there is no pawn of the opposing side that may
2271 immediately execute the en passant capture.
2272
2273 16.2.4: Operations
2274
2275 An EPD operation is composed of an opcode followed by zero or more operands and
2276 is concluded by a semicolon.
2277
2278 Multiple operations are separated by a single space character. If there is at
2279 least one operation present in an EPD line, it is separated from the last
2280 (fourth) data field by a single space character.
2281
2282 16.2.4.1: General format
2283
2284 An opcode is an identifier that starts with a letter character and may be
2285 followed by up to fourteen more characters. Each additional character may be a
2286 letter or a digit or the underscore character.
2287
2288 An operand is either a set of contiguous non-white space printing characters or
2289 a string. A string is a set of contiguous printing characters delimited by a
2290 quote character at each end. A string value must have less than 256 bytes of
2291 data.
2292
2293 If at least one operand is present in an operation, there is a single space
2294 between the opcode and the first operand. If more than one operand is present
2295 in an operation, there is a single blank character between every two adjacent
2296 operands. If there are no operands, a semicolon character is appended to the
2297 opcode to mark the end of the operation. If any operands appear, the last
2298 operand has an appended semicolon that marks the end of the operation.
2299
2300 Any given opcode appears at most once per EPD record. Multiple operations in a
2301 single EPD record should appear in ASCII order of their opcode names
2302 (mnemonics). However, a program reading EPD records may allow for operations
2303 not in ASCII order by opcode mnemonics; the semantics are the same in either
2304 case.
2305
2306 Some opcodes that allow for more than one operand may have special ordering
2307 requirements for the operands. For example, the "pv" (predicted variation)
2308 opcode requires its operands (moves) to appear in the order in which they would
2309 be played. All other opcodes that allow for more than one operand should have
2310 operands appearing in ASCII order. An example of the latter set is the "bm"
2311 (best move[s]) opcode; its operands are moves that are all immediately playable
2312 from the current position.
2313
2314 Some opcodes require one or more operands that are chess moves. These moves
2315 should be represented using SAN. If a different representation is used, there
2316 is no guarantee that the EPD will be read correctly during subsequent
2317 processing.
2318
2319 Some opcodes require one or more operands that are integers. Some opcodes may
2320 require that an integer operand must be within a given range; the details are
2321 described in the opcode list given below. A negative integer is formed with a
2322 hyphen (minus sign) preceding the integer digit sequence. An optional plus
2323 sign may be used for indicating a non-negative value, but such use is not
2324 required and is indeed discouraged.
2325
2326 Some opcodes require one or more operands that are floating point numbers.
2327 Some opcodes may require that a floating point operand must be within a given
2328 range; the details are described in the opcode list given below. A floating
2329 point operand is constructed from an optional sign character ("+" or "-"), a
2330 digit sequence (with at least one digit), a radix point (always "."), and a
2331 final digit sequence (with at least one digit).
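
The general format of an EPD record can be illustrated with a small reader. The Python sketch below is not a reference implementation: the function name "parse_epd" is hypothetical, string operands are assumed not to contain quote characters, and the operations in the sample record are made up for the example.

```python
import re

# One operation: an opcode, zero or more operands, and a closing semicolon.
_OPERATION = re.compile(
    r'\s*([A-Za-z][A-Za-z0-9_]{0,14})((?:\s+(?:"[^"]*"|[^\s;"]+))*)\s*;'
)
# One operand: a quoted string or a run of non-blank characters.
_OPERAND = re.compile(r'"([^"]*)"|([^\s;"]+)')


def parse_epd(line):
    """Return (fields, operations) for one EPD record.

    fields is the list of four data fields; operations maps each opcode
    to its list of operands (string operands lose their quotes).
    """
    parts = line.strip().split(" ", 4)
    if len(parts) < 4:
        raise ValueError("an EPD record starts with four data fields")
    fields = parts[:4]
    rest = parts[4] if len(parts) == 5 else ""
    operations = {}
    pos = 0
    while pos < len(rest):
        match = _OPERATION.match(rest, pos)
        if match is None:
            raise ValueError("malformed operation at offset %d" % pos)
        opcode = match.group(1)
        if opcode in operations:
            raise ValueError("opcode %r appears more than once" % opcode)
        operations[opcode] = [
            quoted or bare
            for quoted, bare in _OPERAND.findall(match.group(2))
        ]
        pos = match.end()
    return fields, operations


record = ('rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - '
          'c0 "Initial position"; id "example.1"; noop;')
fields, operations = parse_epd(record)
print(operations)
```

The reader accepts operations in any order, as permitted above, and leaves the interpretation of integer, floating point, and move operands to the caller.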
2332
2333 16.2.4.2: Opcode mnemonics
2334
2335 An opcode mnemonic used for archival storage and for interprogram communication
2336 starts with a lower case letter and is composed of only lower case letters,
2337 digits, and the underscore character (i.e., no upper case letters). These
2338 mnemonics will also all be at least two characters in length.
2339
2340 Opcode mnemonics used only by a single program or an experimental suite of
2341 programs should start with an upper case letter. This is so they may be easily
2342 distinguished should they inadvertently be encountered by other programs.
2343 When such a "private" opcode is demonstrated to be widely useful, it should
2344 be brought into the official list (appearing below) in a lower case form.
2345
2346 If a given program does not recognize a particular opcode, that operation is
2347 simply ignored; it is not signaled as an error.
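
The two mnemonic forms can be told apart by pattern. This Python sketch uses the hypothetical function name "classify_opcode" and the labels "archival", "private", and "other", all of which are assumptions of this example; it checks only the form of a mnemonic, not whether it appears in the opcode list.

```python
import re

# Archival form: lower case start, two to fifteen characters,
# no upper case letters.
_ARCHIVAL = re.compile(r"[a-z][a-z0-9_]{1,14}")
# Private form: upper case start, one to fifteen characters.
_PRIVATE = re.compile(r"[A-Z][A-Za-z0-9_]{0,14}")


def classify_opcode(opcode):
    """Return "archival", "private", or "other" for a mnemonic."""
    if _ARCHIVAL.fullmatch(opcode):
        return "archival"
    if _PRIVATE.fullmatch(opcode):
        return "private"
    return "other"


print(classify_opcode("draw_accept"), classify_opcode("Xeval"))
```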
2348
2349 16.2.5: Opcode list
2350
2351 The opcodes are listed here in ASCII order of their mnemonics. Suggestions for
2352 new opcodes should be sent to the PGN standard coordinator listed near the
2353 start of this document.
2354
2355 16.2.5.1: Opcode "acn": analysis count: nodes
2356
2357 The opcode "acn" takes a single non-negative integer operand. It is used to
2358 represent the number of nodes examined in an analysis. Note that the value may
2359 be quite large for some extended searches and so use of (at least) a long (four
2360 byte) representation is suggested.
2361
2362 16.2.5.2: Opcode "acs": analysis count: seconds
2363
2364 The opcode "acs" takes a single non-negative integer operand. It is used to
2365 represent the number of seconds used for an analysis. Note that the value may
2366 be quite large for some extended searches and so use of (at least) a long (four
2367 byte) representation is suggested.
2368
2369 16.2.5.3: Opcode "am": avoid move(s)
2370
2371 The opcode "am" indicates a set of zero or more moves, all immediately playable
2372 from the current position, that are to be avoided in the opinion of the EPD
2373 writer. Each operand is a SAN move; they appear in ASCII order.
2374
2375 16.2.5.4: Opcode "bm": best move(s)
2376
2377 The opcode "bm" indicates a set of zero or more moves, all immediately playable
2378 from the current position, that are judged to be the best available by the EPD
2379 writer. Each operand is a SAN move; they appear in ASCII order.
2380
2381 16.2.5.5: Opcode "c0": comment (primary, also "c1" through "c9")
2382
2383 The opcode "c0" (lower case letter "c", digit character zero) indicates a top
2384 level comment that applies to the given position. It is the first of ten
2385 ranked comments, each of which has a mnemonic formed from the lower case letter
2386 "c" followed by a single decimal digit. Each of these opcodes takes either a
2387 single string operand or no operand at all.
2388
2389 This ten member comment family of opcodes is intended for use as descriptive
2390 commentary for a complete game or game fragment. The usual processing of these
2391 opcodes is as follows:
2392
2393 1) At the beginning of a game (or game fragment), a move sequence scanning
2394 program initializes each element of its set of ten comment string registers to
2395 be null.
2396
2397 2) As the EPD record for each position in the game is processed, the comment
2398 operations are interpreted from left to right. (Actually, all operations in an
2399 EPD record are interpreted from left to right.) Because operations appear in
2400 ASCII order according to their opcode mnemonics, opcode "c0" (if present) will
2401 be handled prior to all other comment opcodes, then opcode "c1" (if present), and so
2402 forth until opcode "c9" (if present).
2403
2404 3) The processing of opcode "cN" (0 <= N <= 9) involves two steps. First, all
2405 comment string registers with an index equal to or greater than N are set to
2406 null. (This is the set "cN" through "c9".) Second, and only if a string
2407 operand is present, the value of the corresponding comment string register is
2408 set equal to the string operand.
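
The three processing steps above can be sketched in Python as follows. The function name "apply_ranked_ops", the "prefix" parameter, and the use of None for a null register are assumptions of this example; the operations argument is assumed to map each opcode to its list of operands.

```python
def apply_ranked_ops(registers, operations, prefix="c"):
    """Apply the ranked opcodes of one EPD record to ten string registers.

    registers is a list of ten entries, with None standing for null.
    prefix selects the opcode family ("c" for comments).
    """
    for n in range(10):                      # "c0" first, then "c1", ...
        operands = operations.get("%s%d" % (prefix, n))
        if operands is None:
            continue                         # opcode not present
        for index in range(n, 10):           # first step: clear N through 9
            registers[index] = None
        if operands:                         # second step: store the string
            registers[n] = operands[0]
    return registers


registers = [None] * 10                      # beginning of a game
apply_ranked_ops(registers, {"c0": ["Game one"], "c1": ["Opening"]})
apply_ranked_ops(registers, {"c1": ["Middlegame"]})
print(registers[:2])
```

With prefix "v", the same procedure serves the variation name family described in section 16.2.5.26.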
2409
2410 16.2.5.6: Opcode "ce": centipawn evaluation
2411
2412 The opcode "ce" indicates the evaluation of the indicated position in centipawn
2413 units. It takes a single operand, an optionally signed integer that gives an
2414 evaluation of the position from the viewpoint of the active player; i.e., the
2415 player with the move. Positive values indicate a position favorable to the
2416 moving player while negative values indicate a position favorable to the
2417 passive player; i.e., the player without the move. A centipawn evaluation
2418 value close to zero indicates a neutral positional evaluation.
2419
2420 Values are restricted to integers that are equal to or greater than -32767 and
2421 are less than or equal to 32766.
2422
2423 A value greater than 32000 indicates the availability of a forced mate to the
2424 active player. The number of plies until mate is given by subtracting the
2425 evaluation from the value 32767. Thus, a winning mate in N fullmoves is a mate
2426 in ((2 * N) - 1) halfmoves (or ply) and has a corresponding centipawn
2427 evaluation of (32767 - ((2 * N) - 1)). For example, a mate on the move (mate
2428 in one) has a centipawn evaluation of 32766 while a mate in five has a
2429 centipawn evaluation of 32758.
2430
2431 A value less than -32000 indicates the availability of a forced mate to the
2432 passive player. The number of plies until mate is given by subtracting the
2433 evaluation from the value -32767 and then negating the result. Thus, a losing
2434 mate in N fullmoves is a mate in (2 * N) halfmoves (or ply) and has a
2435 corresponding centipawn evaluation of (-32767 + (2 * N)). For example, a mate
2436 after the move (losing mate in one) has a centipawn evaluation of -32765 while
2437 a losing mate in five has a centipawn evaluation of -32757.
2438
2439 A value of -32767 indicates an illegal position. A stalemate position has a
2440 centipawn evaluation of zero as does a position drawn due to insufficient
2441 mating material. Any other position known to be a certain forced draw also has
2442 a centipawn evaluation of zero.
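
The mate encodings above can be checked with a short calculation. In this Python sketch the function names "mate_score" and "plies_to_mate" are assumptions of the example.

```python
def mate_score(fullmoves, winning=True):
    """Centipawn evaluation for a forced mate in the given fullmoves.

    winning=True means the active player mates; winning=False means
    the active player is mated.
    """
    if fullmoves < 1:
        raise ValueError("fullmove count must be positive")
    if winning:
        return 32767 - ((2 * fullmoves) - 1)
    return -32767 + (2 * fullmoves)


def plies_to_mate(evaluation):
    """Return the ply count encoded in a mate evaluation, else None."""
    if evaluation > 32000:
        return 32767 - evaluation
    if -32767 < evaluation < -32000:
        return -(-32767 - evaluation)
    return None          # ordinary evaluation, or -32767 (illegal position)


print(mate_score(1), mate_score(5))                  # 32766 32758
print(mate_score(1, False), mate_score(5, False))    # -32765 -32757
```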
2443
2444 16.2.5.7: Opcode "dm": direct mate fullmove count
2445
2446 The "dm" opcode is used to indicate the number of fullmoves until checkmate is
2447 to be delivered by the active color for the indicated position. It always
2448 takes a single operand which is a positive integer giving the fullmove count.
2449 For example, a position known to be a "mate in three" would have an operation
2450 of "dm 3;" to indicate this.
2451
2452 This opcode is intended for use with problem sets composed of positions
2453 requiring direct mate answers as solutions.
2454
2455 16.2.5.8: Opcode "draw_accept": accept a draw offer
2456
2457 The opcode "draw_accept" is used to indicate that a draw offer made after the
2458 move that led to the indicated position is accepted by the active player.
2459 This opcode takes no operands.
2460
2461 16.2.5.9: Opcode "draw_claim": claim a draw
2462
2463 The opcode "draw_claim" is used to indicate a claim by the active player that a
2464 draw exists. The draw is claimed because of a third time repetition or because
2465 of the fifty move rule or because of insufficient mating material. A supplied
2466 move (see the opcode "sm") is also required to appear as part of the same EPD
2467 record. The draw_claim opcode takes no operands.
2468
2469 16.2.5.10: Opcode "draw_offer": offer a draw
2470
2471 The opcode "draw_offer" is used to indicate that a draw is offered by the
2472 active player. A supplied move (see the opcode "sm") is also required to
2473 appear as part of the same EPD record; this move is considered played from the
2474 indicated position. The draw_offer opcode takes no operands.
2475
2476 16.2.5.11: Opcode "draw_reject": reject a draw offer
2477
2478 The opcode "draw_reject" is used to indicate that a draw offer made after the
2479 move that led to the indicated position is rejected by the active player.
2480 This opcode takes no operands.
2481
2482 16.2.5.12: Opcode "eco": _Encyclopedia of Chess Openings_ opening code
2483
2484 The opcode "eco" is used to associate an opening designation from the
2485 _Encyclopedia of Chess Openings_ taxonomy with the indicated position. The
2486 opcode takes either a single string operand (the ECO opening name) or no
2487 operand at all. If an operand is present, its value is associated with an
2488 "ECO" string register of the scanning program. If there is no operand, the ECO
2489 string register of the scanning program is set to null.
2490
2491 The usage is similar to that of the "ECO" tag pair of the PGN standard.
2492
2493 16.2.5.13: Opcode "fmvn": fullmove number
2494
2495 The opcode "fmvn" represents the fullmove number associated with the position.
2496 It always takes a single operand that is the positive integer value of the move
2497 number.
2498
2499 This opcode is used to explicitly represent the fullmove number in EPD that is
2500 present by default in FEN as the sixth field. Fullmove number information is
2501 usually omitted from EPD because it does not affect move generation (commonly
2502 needed for EPD-using tasks) but it does affect game notation (commonly needed
2503 for FEN-using tasks). To save space in large EPD files, the fullmove number
2504 field of EPD's parent, FEN, was dropped from EPD. The halfmove clock field
2505 was similarly dropped.
2506
2507 16.2.5.14: Opcode "hmvc": halfmove clock
2508
2509 The opcode "hmvc" represents the halfmove clock associated with the position.
2510 The halfmove clock of a position is equal to the number of plies since the last
2511 pawn move or capture. This information is used to implement the fifty move
2512 draw rule. It always takes a single operand that is the non-negative integer
2513 value of the halfmove clock.
2514
2515 This opcode is used to explicitly represent the halfmove clock in EPD that is
2516 present by default in FEN as the fifth field. Halfmove clock information is
2517 usually omitted from EPD because it does not affect move generation (commonly
2518 needed for EPD-using tasks) but it does affect game termination issues
2519 (commonly needed for FEN-using tasks). To save space in large EPD files, the
2520 halfmove clock field of EPD's parent, FEN, was dropped from EPD. The fullmove
2521 number field was similarly dropped.
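
Since "hmvc" and "fmvn" carry the fifth and sixth FEN fields, an EPD record can be turned back into a FEN record. The Python sketch below assumes the four data fields and the operations have already been separated; the function name "epd_to_fen" and the fallback values used when an opcode is absent are assumptions of this example, as this section does not specify defaults.

```python
def epd_to_fen(fields, operations, default_hmvc=0, default_fmvn=1):
    """Build a FEN record from four EPD fields and an operations mapping.

    operations maps opcodes to operand lists, e.g. {"hmvc": ["5"]}.
    The default values are this example's assumption.
    """
    if len(fields) != 4:
        raise ValueError("expected four EPD data fields")
    hmvc = int(operations.get("hmvc", [default_hmvc])[0])
    fmvn = int(operations.get("fmvn", [default_fmvn])[0])
    if hmvc < 0 or fmvn < 1:
        raise ValueError("hmvc must be non-negative and fmvn positive")
    return " ".join(list(fields) + [str(hmvc), str(fmvn)])


fields = ["4k3/8/8/8/8/8/4P3/4K3", "w", "-", "-"]
print(epd_to_fen(fields, {"fmvn": ["39"], "hmvc": ["5"]}))
```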
2522
2523 16.2.5.15: Opcode "id": position identification
2524
2525 The opcode "id" is used to provide a simple identifying label for the indicated
2526 position. It takes a single string operand.
2527
2528 This opcode is intended for use with test suites used for measuring
2529 chessplaying program strength. An example "id" operand for the seven hundred
2530 fifty seventh position of the one thousand one problems in Reinfeld's _1001
2531 Winning Chess Sacrifices and Combinations_ would be "WCSAC.0757" while the
2532 fifteenth position in the twenty four problem Bratko-Kopec test suite would
2533 have an "id" operand of "BK.15".
2534
2535 16.2.5.16: Opcode "nic": _New In Chess_ opening code
2536
2537 The opcode "nic" is used to associate an opening designation from the _New In
2538 Chess_ taxonomy with the indicated position. The opcode takes either a single
2539 string operand (the NIC opening name) or no operand at all. If an operand is
2540 present, its value is associated with an "NIC" string register of the scanning
2541 program. If there is no operand, the NIC string register of the scanning
2542 program is set to null.
2543
2544 The usage is similar to that of the "NIC" tag pair of the PGN standard.
2545
2546 16.2.5.17: Opcode "noop": no operation
2547
2548 The "noop" opcode is used to indicate no operation. It takes zero or more
2549 operands, each of which may be of any type. The operation involves no
2550 processing. It is intended for use by developers for program testing purposes.
2551
2552 16.2.5.18: Opcode "pm": predicted move
2553
2554 The "pm" opcode is used to provide a single predicted move for the indicated
2555 position. It has exactly one operand, a move playable from the position. This
2556 move is judged by the EPD writer to represent the best move available to the
2557 active player.
2558
2559 If a non-empty "pv" (predicted variation) line of play is also present in the
2560 same EPD record, the first move of the predicted variation is the same as the
2561 predicted move.
2562
2563 The "pm" opcode is intended for use as a general "display hint" mechanism.
2564
2565 16.2.5.19: Opcode "pv": predicted variation
2566
2567 The "pv" opcode is used to provide a predicted variation for the indicated
2568 position. It has zero or more operands which represent a sequence of moves
2569 playable from the position. This sequence is judged by the EPD writer to
2570 represent the best play available.
2571
2572 If a "pm" (predicted move) operation is also present in the same EPD record,
2573 the predicted move is the same as the first move of the predicted variation.
2574
2575 16.2.5.20: Opcode "rc": repetition count
2576
2577 The "rc" opcode is used to indicate the number of occurrences of the indicated
2578 position. It takes a single, positive integer operand. Any position,
2579 including the initial starting position, is considered to have an "rc" value of
2580 at least one. A value of three indicates a candidate for a draw claim by the
2581 position repetition rule.
2582
2583 16.2.5.21: Opcode "resign": game resignation
2584
2585 The opcode "resign" is used to indicate that the active player has resigned the
2586 game. This opcode takes no operands.
2587
2588 16.2.5.22: Opcode "sm": supplied move
2589
2590 The "sm" opcode is used to provide a single supplied move for the indicated
2591 position. It has exactly one operand, a move playable from the position. This
2592 move is the move to be played from the position.
2593
2594 The "sm" opcode is intended to communicate the most recently played move
2595 in an active game. It is used to communicate moves between programs in
2596 automatic play via a network. This includes correspondence play using e-mail
2597 and also programs acting as network front ends to human players.
2598
2599 16.2.5.23: Opcode "tcgs": telecommunication: game selector
2600
2601 The "tcgs" opcode is one of the telecommunication family of opcodes used for
2602 games conducted via e-mail and similar means. This opcode takes a single
2603 operand that is a positive integer. It is used to select among various games
2604 in progress between the same sender and receiver.
2605
2606 16.2.5.24: Opcode "tcri": telecommunication: receiver identification
2607
2608 The "tcri" opcode is one of the telecommunication family of opcodes used for
2609 games conducted via e-mail and similar means. This opcode takes two order
2610 dependent string operands. The first operand is the e-mail address of the
2611 receiver of the EPD record. The second operand is the name of the player
2612 (program or human) at the address who is the actual receiver of the EPD record.
2613
2614 16.2.5.25: Opcode "tcsi": telecommunication: sender identification
2615
2616 The "tcsi" opcode is one of the telecommunication family of opcodes used for
2617 games conducted via e-mail and similar means. This opcode takes two order
2618 dependent string operands. The first operand is the e-mail address of the
2619 sender of the EPD record. The second operand is the name of the player
2620 (program or human) at the address who is the actual sender of the EPD record.
2621
2622 16.2.5.26: Opcode "v0": variation name (primary, also "v1" through "v9")
2623
2624 The opcode "v0" (lower case letter "v", digit character zero) indicates a top
2625 level variation name that applies to the given position. It is the first of
2626 ten ranked variation names, each of which has a mnemonic formed from the lower
2627 case letter "v" followed by a single decimal digit. Each of these opcodes
2628 takes either a single string operand or no operand at all.
2629
2630 This ten member variation name family of opcodes is intended for use as
2631 traditional variation names for a complete game or game fragment. The usual
2632 processing of these opcodes is as follows:
2633
2634 1) At the beginning of a game (or game fragment), a move sequence scanning
2635 program initializes each element of its set of ten variation name string
2636 registers to be null.
2637
2638 2) As the EPD record for each position in the game is processed, the variation
2639 name operations are interpreted from left to right. (Actually, all operations
2640 in an EPD record are interpreted from left to right.) Because operations appear
2641 in ASCII order according to their opcode mnemonics, opcode "v0" (if present)
2642 will be handled prior to all other variation name opcodes, then opcode "v1" (if present), and
2643 so forth until opcode "v9" (if present).
2644
2645 3) The processing of opcode "vN" (0 <= N <= 9) involves two steps. First, all
2646 variation name string registers with an index equal to or greater than N are
2647 set to null. (This is the set "vN" through "v9".) Second, and only if a string
2648 operand is present, the value of the corresponding variation name string
2649 register is set equal to the string operand.
2650
2651 17: Alternative chesspiece identifier letters
2652
2653 English language piece names are used to define the letter set for identifying
2654 chesspieces in PGN movetext. However, authors of programs which are used only
2655 for local presentation or scanning of chess move data may find it convenient to
2656 use piece letter codes common in their locales. This is not a problem as long
2657 as PGN data that resides in archival storage or that is exchanged among
2658 programs still uses the SAN (English) piece letter codes: "PNBRQK".
2659
2660 For the above authors only, a list of alternative piece letter codes is
2661 provided:
2662
2663 Language    Piece letters (pawn knight bishop rook queen king)
2664 ----------  --------------------------------------------------
2665 Czech       P J S V D K
2666 Danish      B S L T D K
2667 Dutch       O P L T D K
2668 English     P N B R Q K
2669 Estonian    P R O V L K
2670 Finnish     P R L T D K
2671 French      P C F T D R
2672 German      B S L T D K
2673 Hungarian   G H F B V K
2674 Icelandic   P R B H D K
2675 Italian     P C A T D R
2676 Norwegian   B S L T D K
2677 Polish      P S G W H K
2678 Portuguese  P C B T D R
2679 Romanian    P C N T D R
2680 Spanish     P C A T D R
2681 Swedish     B S L T D K
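
A program that accepts localized piece letters can convert them to the SAN (English) letters before archival storage. The Python sketch below is one possible approach; the function name "to_english_san" is an assumption of this example, and castling moves are passed through unchanged because the letter "O" doubles as a piece letter in some of the sets above.

```python
ENGLISH = "PNBRQK"


def to_english_san(move, local_letters):
    """Translate the piece letters of one SAN move to English.

    local_letters gives the six letters in the table's order
    (pawn knight bishop rook queen king), e.g. "BSLTDK" for German.
    """
    if len(local_letters) != 6:
        raise ValueError("expected six piece letters")
    if move.startswith("O-O"):
        return move                      # castling is not a piece move
    # File letters are lower case, so only piece letters are upper case.
    return move.translate(str.maketrans(local_letters, ENGLISH))


print(to_english_san("Sf3", "BSLTDK"))     # German knight move
print(to_english_san("e8=D+", "BSLTDK"))   # promotion to a queen
```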
2682
2683 18: Formal syntax
2684
2685 <PGN-database> ::= <PGN-game> <PGN-database>
2686                    <empty>
2687
2688 <PGN-game> ::= <tag-section> <movetext-section>
2689
2690 <tag-section> ::= <tag-pair> <tag-section>
2691                   <empty>
2692
2693 <tag-pair> ::= [ <tag-name> <tag-value> ]
2694
2695 <tag-name> ::= <identifier>
2696
2697 <tag-value> ::= <string>
2698
2699 <movetext-section> ::= <element-sequence> <game-termination>
2700
2701 <element-sequence> ::= <element> <element-sequence>
2702                        <recursive-variation> <element-sequence>
2703                        <empty>
2704
2705 <element> ::= <move-number-indication>
2706               <SAN-move>
2707               <numeric-annotation-glyph>
2708
2709 <recursive-variation> ::= ( <element-sequence> )
2710
2711 <game-termination> ::= 1-0
2712                        0-1
2713                        1/2-1/2
2714                        *
2715 <empty> ::=
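
The movetext productions lend themselves to a recursive descent reader. The Python sketch below is illustrative only: it assumes the input has already been split into tokens, it does not distinguish move number indications, SAN moves, and NAGs from one another, and the function names are hypothetical.

```python
TERMINATIONS = {"1-0", "0-1", "1/2-1/2", "*"}


def parse_movetext(tokens):
    """Parse <movetext-section>; return (element_sequence, termination).

    Recursive variations become nested lists inside the sequence.
    """
    if not tokens or tokens[-1] not in TERMINATIONS:
        raise ValueError("missing game termination marker")
    end = len(tokens) - 1
    sequence, _ = _parse_sequence(tokens, 0, end, 0)
    return sequence, tokens[-1]


def _parse_sequence(tokens, pos, end, depth):
    """Parse <element-sequence> starting at pos; return (items, new_pos)."""
    items = []
    while pos < end:
        token = tokens[pos]
        if token == "(":
            inner, pos = _parse_sequence(tokens, pos + 1, end, depth + 1)
            if pos >= end or tokens[pos] != ")":
                raise ValueError("unclosed '('")
            items.append(inner)
            pos += 1
        elif token == ")":
            if depth == 0:
                raise ValueError("unbalanced ')'")
            return items, pos            # caller consumes the ")"
        else:
            items.append(token)          # any <element>
            pos += 1
    return items, pos


print(parse_movetext(["1.", "e4", "(", "1.", "d4", ")", "e5", "*"]))
```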
2716
2717 19: Canonical chess position hash coding
2718
2719 *** This section is under development.
2720
2721 20: Binary representation (PGC)
2722
2723 *** This section is under development.
2724
2725 The binary coded version of PGN is PGC (PGN Game Coding). PGC is a binary
2726 representation standard of PGN data designed for the dual goals of storage
2727 efficiency and program I/O. A file containing PGC data should have a name with
2728 a suffix of ".pgc".
2729
2730 Unlike PGN text files that may have locale dependent representations for
2731 newlines, PGC files have data that does not vary due to local processing
2732 environment. This means that PGC files may be transferred among systems using
2733 general binary file methods.
2734
2735 PGC files should be used only when the use of PGN is impractical due to time
2736 and space resource constraints. As the general level of processing
2737 capabilities increases, the need for PGC over PGN will decrease. Therefore,
2738 implementors are encouraged not to use PGC as the default representation
2739 because it is much more difficult (than PGN) to understand without proper
2740 software.
2741
2742 PGC data is composed of a sequence of PGC records. Each record is composed of
2743 a sequence of one or more bytes. The first byte is the PGC record marker and
2744 it specifies the interpretation of the remaining portion of the record. This
2745 remaining portion is composed of zero or more PGC record items. Item types
2746 include move sequences, move sets, and character strings.
2747
2748 20.1: Bytes, words, and doublewords
2749
2750 At the lowest level, PGC binary data is organized as bytes, words (two
2751 contiguous bytes), and doublewords (four contiguous bytes). All eight bits of
2752 a byte are used. Longwords (eight contiguous bytes) are not used. Integer
2753 values are stored using two's complement representation. Integers may be
2754 signed or unsigned depending on context. Multibyte integers are stored in
2755 little-endian format with the least significant byte appearing first.
2756
2757 A one byte integer item is called "int-1". A two byte integer item is called
2758 "int-2". A four byte integer item is called "int-4".
2759
2760 Characters are stored as bytes using the ISO 8859/1 Latin-1 (ECMA-94) code set.
2761 There is no provision for other character sets or representations.
2762
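The byte order and character rules of this subsection can be sketched with
Python's standard struct module. The helper names below are hypothetical; the
"<" prefix selects the byte order in which the least significant byte appears
first.

```python
import struct


def pack_int2(value, signed=False):
    """Encode an int-2 item: two bytes, least significant byte first."""
    return struct.pack("<h" if signed else "<H", value)


def pack_int4(value, signed=False):
    """Encode an int-4 item: four bytes, least significant byte first."""
    return struct.pack("<i" if signed else "<I", value)


def pack_char(ch):
    """Encode one character as a single ISO 8859/1 (Latin-1) byte."""
    return ch.encode("latin-1")
```

For example, the unsigned int-2 value 0x1234 is stored as the bytes 0x34 0x12,
and the signed int-2 value -2 is stored as 0xfe 0xff.
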
2763 20.2: Move ordinals
2764
2765 A chess move is represented using a move ordinal. This is a single unsigned
2766 byte quantity with values from zero to 255. A move ordinal is interpreted as
2767 an index into the list of legal moves from the current position. This list is
2768 constructed by generating the legal moves from the current position, assigning
2769 SAN ASCII strings to each move, and then sorting these strings in ascending
2770 order. Note that a seven bit ordinal, as used by some inferior representation
2771 systems, is insufficient as there are some positions that have more than 128
2772 moves available.
2773
2774 Examples: From the initial position, there are twenty moves. Move ordinal 0
2775 corresponds to the SAN move string "Na3"; move ordinal 1 corresponds to "Nc3",
2776 move ordinal 4 corresponds to "a3", and move ordinal 19 corresponds to "h4".
2777
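The examples above can be reproduced with a short Python sketch. The list of
twenty SAN strings is written out by hand here rather than produced by a move
generator, and the function name is hypothetical. Python compares these ASCII
strings by character code, which matches the ascending order described above.

```python
# The twenty legal moves from the initial position, in no particular order.
INITIAL_MOVES = [f + r for f in "abcdefgh" for r in "34"]
INITIAL_MOVES += ["Nf3", "Nc3", "Nh3", "Na3"]


def move_ordinal(san, legal_moves):
    """Return the index of a SAN string in the sorted list of legal moves."""
    ordered = sorted(legal_moves)  # uppercase "N" sorts before lowercase files
    return ordered.index(san)
```

Here move_ordinal("Na3", INITIAL_MOVES) gives 0 and
move_ordinal("h4", INITIAL_MOVES) gives 19.
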
2778 Moves can be organized into sequences and sets. A move sequence is an ordered
2779 list of moves that are played, one after another from first to last. A move
2780 set is a list of moves that are all playable from the current position.
2781
2782 Move sequence data is represented using a length header followed by move
2783 ordinal data. The length header is an unsigned integer that may be a byte or a
2784 word. The integer gives the number, possibly zero, of following move ordinal
2785 bytes. Most move sequences can be represented using just a byte header; these
2786 are called "mvseq-1" items. Move sequence data using a word header are called
2787 "mvseq-2" items.
2788
2789 Move set data is represented using a length header followed by move ordinal
2790 data. The length header is an unsigned integer that is a byte. The integer
2791 gives the number, possibly zero, of following move ordinal bytes. All move
2792 sets are represented using just a byte header; these are called "mvset-1"
2793 items. (Note the implied restriction that a move set can only have a maximum
2794 of 255 of the possible 256 ordinals present at one time.)
2795
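A minimal Python sketch of the move sequence and move set items follows. The
function names are hypothetical, and the two byte header is assumed to follow
the multibyte integer rule of section 20.1 (least significant byte first).

```python
import struct


def encode_mvseq(ordinals, header_bytes=1):
    """Encode a mvseq-1 (header_bytes=1) or mvseq-2 (header_bytes=2) item."""
    body = bytes(ordinals)  # raises ValueError unless every ordinal is 0..255
    fmt = {1: "<B", 2: "<H"}[header_bytes]
    if len(body) >= 1 << (8 * header_bytes):
        raise ValueError("too many moves for this header size")
    return struct.pack(fmt, len(body)) + body


def encode_mvset1(ordinals):
    """Encode a mvset-1 item; its layout matches that of mvseq-1."""
    return encode_mvseq(ordinals, header_bytes=1)
```

The ordinals 4 and 19 encode as 0x02 0x04 0x13 in a mvseq-1 item and as
0x02 0x00 0x04 0x13 in a mvseq-2 item. A set holding all 256 ordinals is
rejected, which reflects the restriction noted above.
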
2796 20.3: String data
2797
2798 PGC string data is represented using a length header followed by bytes of
2799 character data. The length header is an unsigned integer that may be a byte, a
2800 word, or a doubleword. The integer gives the number, possibly zero, of
2801 following character bytes. Most strings can be represented using just a byte
2802 header; these are called "string-1" items. String data using a word header are
2803 called "string-2" items and string data using a doubleword header are called
2804 "string-4" items. No special ASCII NUL termination byte is required for PGC
2805 storage of a string as the length is explicitly given in the item header.
2806
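The three string item sizes can be sketched in Python as shown below. The
function names are hypothetical; the character encoding and byte order are
taken from section 20.1.

```python
import struct

# Header formats for string-1, string-2, and string-4 items.
HEADER_FORMATS = {1: "<B", 2: "<H", 4: "<I"}


def encode_string(text, header_bytes=1):
    """Encode a PGC string item with a 1, 2, or 4 byte length header."""
    data = text.encode("latin-1")
    if len(data) >= 1 << (8 * header_bytes):
        raise ValueError("string too long for this header size")
    return struct.pack(HEADER_FORMATS[header_bytes], len(data)) + data


def decode_string(buf, offset=0, header_bytes=1):
    """Decode a string item; return (text, offset of the next item)."""
    (count,) = struct.unpack_from(HEADER_FORMATS[header_bytes], buf, offset)
    start = offset + header_bytes
    return buf[start:start + count].decode("latin-1"), start + count
```

No terminating NUL byte is written; the decoder relies only on the length
header to find the end of the string.
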
2807 20.4: Marker codes
2808
2809 PGC marker codes are given in hexadecimal format. PGC marker code zero (marker
2810 0x00) is the "noop" marker and carries no meaning. Each additional marker code
2811 defined appears in its own subsection below.
2812
2813 20.4.1: Marker 0x01: reduced export format single game
2814
2815 Marker 0x01 is used to indicate a single complete game in reduced export
2816 format. This refers to a game that has only the Seven Tag Roster data, played
2817 moves, and no annotations or comments. This record type is used as an
2818 alternative to the general game data begin/end record pairs described below.
2819 The general marker pair (0x05/0x06) is used to help represent game data that
2820 can't be adequately represented in reduced export format. There are eight
2821 items that follow marker 0x01 to form the "reduced export format single game"
2822 record. In order, these are:
2823
2824 1) string-1 (Event tag value)
2825
2826 2) string-1 (Site tag value)
2827
2828 3) string-1 (Date tag value)
2829
2830 4) string-1 (Round tag value)
2831
2832 5) string-1 (White tag value)
2833
2834 6) string-1 (Black tag value)
2835
2836 7) string-1 (Result tag value)
2837
2838 8) mvseq-2 (played moves)
2839
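The eight items above can be assembled as in the following Python sketch. The
function names and the dictionary argument are hypothetical, and the move
ordinals in the usage note are illustrative values only.

```python
import struct

SEVEN_TAG_ROSTER = ("Event", "Site", "Date", "Round", "White", "Black", "Result")


def string1(text):
    """Encode a string-1 item: one length byte, then Latin-1 characters."""
    data = text.encode("latin-1")
    return struct.pack("<B", len(data)) + data  # fails above 255 bytes


def reduced_game_record(tags, ordinals):
    """Build a marker 0x01 record from tag values and played move ordinals."""
    record = b"\x01"
    for name in SEVEN_TAG_ROSTER:
        record += string1(tags[name])
    body = bytes(ordinals)
    record += struct.pack("<H", len(body)) + body  # mvseq-2 item
    return record
```

A game with the Date value "????.??.??", the Result value "*", every other
roster value "?", and two played moves occupies 28 bytes in this form.
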
2840 20.4.2: Marker 0x02: tag pair
2841
2842 Marker 0x02 is used to indicate a single tag pair. There are two items that
2843 follow marker 0x02 to form the "tag pair" record; in order these are:
2844
2845 1) string-1 (tag pair name)
2846
2847 2) string-1 (tag pair value)
2848
2849 20.4.3: Marker 0x03: short move sequence
2850
2851 Marker 0x03 is used to indicate a short move sequence. There is one item that
2852 follows marker 0x03 to form the "short move sequence" record; this is:
2853
2854 1) mvseq-1 (played moves)
2855
2856 20.4.4: Marker 0x04: long move sequence
2857
2858 Marker 0x04 is used to indicate a long move sequence. There is one item that
2859 follows marker 0x04 to form the "long move sequence" record; this is:
2860
2861 1) mvseq-2 (played moves)
2862
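A writer has to choose between the short and long move sequence records. The
Python sketch below picks the short form whenever the sequence fits in a byte
header; that policy is an assumption of this example and is not mandated by
the text above. The function name is hypothetical.

```python
import struct


def move_sequence_record(ordinals):
    """Build a marker 0x03 or 0x04 record for a list of move ordinals."""
    body = bytes(ordinals)  # raises ValueError unless every ordinal is 0..255
    if len(body) <= 0xFF:
        return b"\x03" + struct.pack("<B", len(body)) + body  # mvseq-1
    if len(body) <= 0xFFFF:
        return b"\x04" + struct.pack("<H", len(body)) + body  # mvseq-2
    raise ValueError("move sequence too long for a single record")
```

A sequence of 300 moves therefore begins with the bytes 0x04 0x2c 0x01, since
300 is 0x012c and the low byte is stored first.
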
2863 20.4.5: Marker 0x05: general game data begin
2864
2865 Marker 0x05 is used to indicate the beginning of data for a game. It has no
2866 associated items; it is a complete record by itself. Instead, it marks the
2867 beginning of PGC records used to describe a game. All records up to the
2868 corresponding "general game data end" record are considered to be part of the
2869 same game. (PGC record type 0x01, "reduced export format single game", is not
2870 permitted to appear within a general game begin/end record pair. The general
2871 game construct is to be used as an alternative to record type 0x01 in those
2872 cases where the latter is too restrictive to contain the data for a game.)
2873
2874 20.4.6: Marker 0x06: general game data end
2875
2876 Marker 0x06 is used to indicate the end of data for a game. It has no
2877 associated items; it is a complete record by itself. Instead, it marks the end
2878 of PGC records used to describe a game. All records after the corresponding
2879 (and earlier appearing) "general game data begin" record are considered to be
2880 part of the same game.
2881
2882 20.4.7: Marker 0x07: simple-nag
2883
2884 Marker 0x07 is used to indicate the presence of a simple NAG (Numeric
2885 Annotation Glyph). This is an annotation marker that has only a short type
2886 identification and no operands. There is one item that follows marker 0x07 to
2887 form the "simple-nag" record; this is:
2888
2889 1) int-1 (unsigned NAG value, from 0 to 255)
2890
2891 20.4.8: Marker 0x08: rav-begin
2892
2893 Marker 0x08 is used to indicate the beginning of an RAV (Recursive Annotation
2894 Variation). It has no associated items; it is a complete record by itself.
2895 Instead, it marks the beginning of PGC records used to describe a recursive
2896 annotation. It is considered an opening bracket for a later rav-end record;
2897 the recursive annotation is completely described between the bracket pair. The
2898 rav-begin/data/rav-end structures can be nested.
2899
2900 20.4.9: Marker 0x09: rav-end
2901
2902 Marker 0x09 is used to indicate the end of an RAV (Recursive Annotation
2903 Variation). It has no associated items; it is a complete record by itself.
2904 Instead, it marks the end of PGC records used to describe a recursive
2905 annotation. It is considered a closing bracket for an earlier rav-begin
2906 record; the recursive annotation is completely described between the bracket
2907 pair. The rav-begin/data/rav-end structures can be nested.
2908
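Because rav-begin and rav-end records act as a bracket pair, a reader can
verify the nesting with a simple counter. The Python sketch below assumes the
marker codes have already been extracted from the records in order; the
function name is hypothetical.

```python
RAV_BEGIN = 0x08
RAV_END = 0x09


def max_rav_depth(markers):
    """Check that RAV brackets balance; return the deepest nesting level."""
    depth = 0
    deepest = 0
    for marker in markers:
        if marker == RAV_BEGIN:
            depth += 1
            deepest = max(deepest, depth)
        elif marker == RAV_END:
            if depth == 0:
                raise ValueError("rav-end without a matching rav-begin")
            depth -= 1
    if depth != 0:
        raise ValueError("rav-begin without a matching rav-end")
    return deepest
```

Markers other than 0x08 and 0x09 are ignored, so the same check works on a
complete game's marker stream.
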
2909 20.4.10: Marker 0x0a: escape-string
2910
2911 Marker 0x0a is used to indicate the presence of an escape string. This is a
2912 string represented by the use of the percent sign ("%") escape mechanism in
2913 PGN. The data that is escaped is the sequence of characters immediately
2914 following the percent sign up to but not including the terminating newline. As
2915 is the case with the PGN percent sign escape, the use of a PGC escape-string
2916 record is limited to use for non-archival data. There is one item that follows
2917 marker 0x0a to form the "escape-string" record; this is the string data being
2918 escaped:
2919
2920 1) string-2 (escaped string data)
2921
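The marker codes of section 20.4 can be read back with a table driven loop.
The Python sketch below is an illustration under stated assumptions and is not
a reference decoder: the table and function names are hypothetical, unknown
markers are rejected, and no check is made that game or RAV brackets balance.

```python
import struct

# Items following each marker: "s1"/"s2" are string-1/string-2 items,
# "m1"/"m2" are mvseq-1/mvseq-2 items, and "i1" is an int-1 item.
LAYOUT = {
    0x00: [], 0x01: ["s1"] * 7 + ["m2"], 0x02: ["s1", "s1"],
    0x03: ["m1"], 0x04: ["m2"], 0x05: [], 0x06: [],
    0x07: ["i1"], 0x08: [], 0x09: [], 0x0A: ["s2"],
}


def read_item(buf, pos, kind):
    """Decode one item at pos; return (value, position after the item)."""
    if kind == "i1":
        return buf[pos], pos + 1
    width = int(kind[1])  # size of the length header in bytes
    (count,) = struct.unpack_from("<B" if width == 1 else "<H", buf, pos)
    start = pos + width
    body = buf[start:start + count]
    if len(body) != count:
        raise ValueError("truncated item")
    if kind[0] == "s":
        return body.decode("latin-1"), start + count
    return list(body), start + count  # move ordinals


def read_records(buf):
    """Split PGC data into a list of (marker, items) pairs."""
    records = []
    pos = 0
    while pos < len(buf):
        marker = buf[pos]
        pos += 1
        if marker not in LAYOUT:
            raise ValueError("unknown marker 0x%02x" % marker)
        items = []
        for kind in LAYOUT[marker]:
            item, pos = read_item(buf, pos, kind)
            items.append(item)
        records.append((marker, items))
    return records
```

Applied to a general game that holds one tag pair, a short move sequence, a
simple NAG, and an escape string, read_records returns six records, the first
and last being the 0x05 and 0x06 brackets.
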
2922 21: E-mail correspondence usage
2923
2924 *** This section is under development.
2925
2926 Standard: EOF