Revision: emacs@sv.gnu.org/emacs--devo--0--patch-189

diff --git a/etc/DEBUG b/etc/DEBUG

index 342699a62ff51b4712ed2fcbd22b7a2c1ee63731..b8edb12e4742967089f489d8906ca5bde3e522d0 100644
--- a/etc/DEBUG
+++ b/etc/DEBUG
@@ -1,5 +1,6 @@
  Debugging GNU Emacs
-Copyright (c) 1985, 2000, 2001 Free Software Foundation, Inc.
+Copyright (C) 1985, 2000, 2001, 2002, 2003, 2004,
+   2005, 2006 Free Software Foundation, Inc.
  
     Permission is granted to anyone to make or distribute verbatim copies
     of this document as received, in any medium, provided that the
@@ -16,13 +17,24 @@ Copyright (c) 1985, 2000, 2001 Free Software Foundation, Inc.
  should read the Windows-specific section near the end of this
  document.]
  
-It is a good idea to run Emacs under GDB (or some other suitable
+** When you debug Emacs with GDB, you should start it in the directory
+where the executable was made.  That directory has a .gdbinit file
+that defines various "user-defined" commands for debugging Emacs.
+
+** When you are trying to analyze failed assertions, it will be
+essential to compile Emacs either completely without optimizations or
+at least (when using GCC) with the -fno-crossjumping option.  Failure
+to do so may make the compiler recycle the same abort call for all
+assertions in a given function, rendering the stack backtrace useless
+for identifying the specific failed assertion.
+
+** It is a good idea to run Emacs under GDB (or some other suitable
  debugger) *all the time*.  Then, when Emacs crashes, you will be able
  to debug the live process, not just a core dump.  (This is especially
  important on systems which don't support core files, and instead print
  just the registers and some stack addresses.)
  
-If Emacs hangs, or seems to be stuck in some infinite loop, typing
+** If Emacs hangs, or seems to be stuck in some infinite loop, typing
  "kill -TSTP PID", where PID is the Emacs process ID, will cause GDB to
  kick in, provided that you run under GDB.
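
Annotation on the failed-assertions item above: a minimal C sketch of the
situation it describes.  The macro `MY_ASSERT' and the function `checked_div'
are hypothetical names made up for this illustration; they are not Emacs code.
Each check expands to its own `abort' call.  As the item says, an optimizing
compiler may recycle one `abort' call for both checks, so the backtrace would
no longer show which check failed; building without optimization, or with
-fno-crossjumping under GCC, is meant to keep the call sites distinct.

```c
#include <stdlib.h>

/* Hypothetical abort-based assertion macro, for illustration only.  */
#define MY_ASSERT(cond) do { if (!(cond)) abort (); } while (0)

/* Two separate checks, hence two separate abort call sites in the
   source.  Whether they stay separate in the object code depends on
   the optimization options used to compile this function.  */
int
checked_div (int num, int den)
{
  MY_ASSERT (den != 0);   /* first abort call site */
  MY_ASSERT (num >= 0);   /* second abort call site */
  return num / den;
}
```

With valid arguments neither check fires, and `checked_div (10, 2)' simply
returns the quotient.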
  
@@ -32,7 +44,7 @@ kick in, provided that you run under GDB.
  All Lisp errors go through there.
  
  It is useful, when debugging, to have a guaranteed way to return to
-the debugger at any time.  When using X, this is easy: type C-c at the
+the debugger at any time.  When using X, this is easy: type C-z at the
  window where Emacs is running under GDB, and it will stop Emacs just
  as it would stop any ordinary program.  When Emacs is running in a
  terminal, things are not so easy.
@@ -41,7 +53,7 @@ The src/.gdbinit file in the Emacs distribution arranges for SIGINT
  (C-g in Emacs) to be passed to Emacs and not give control back to GDB.
  On modern POSIX systems, you can override that with this command:
  
-   handle int stop nopass
+   handle SIGINT stop nopass
  
  After this `handle' command, SIGINT will return control to GDB.  If
  you want the C-g to cause a QUIT within Emacs as well, omit the
@@ -58,6 +70,11 @@ use the set command until the inferior process has been started.
  Put a breakpoint early in `main', or suspend the Emacs,
  to get an opportunity to do the set command.
  
+When Emacs is running in a terminal, it is useful to use a separate terminal
+for the debug session.  This can be done by starting Emacs as usual, then
+attaching to it from GDB with the `attach' command, which is explained in the
+node "Attach" of the GDB manual.
+
  ** Examining Lisp object values.
  
  When you have a live process to debug, and it has not encountered a
@@ -65,9 +82,11 @@ fatal error, you can use the GDB command `pr'.  First print the value
  in the ordinary way, with the `p' command.  Then type `pr' with no
  arguments.  This calls a subroutine which uses the Lisp printer.
  
-Note: It is not a good idea to try `pr' if you know that Emacs is in
-deep trouble: its stack smashed (e.g., if it encountered SIGSEGV due
-to stack overflow), or crucial data structures, such as `obarray',
+You can also use `pp value' to print the Emacs value directly.
+
+Note: It is not a good idea to try `pr' or `pp' if you know that Emacs
+is in deep trouble: its stack smashed (e.g., if it encountered SIGSEGV
+due to stack overflow), or crucial data structures, such as `obarray',
  corrupted, etc.  In such cases, the Emacs subroutine called by `pr'
  might make more damage, like overwrite some data that is important for
  debugging the original problem.
@@ -101,36 +120,32 @@ objects which you can examine in turn with the x... commands.
  Even with a live process, these x...  commands are useful for
  examining the fields in a buffer, window, process, frame or marker.
  Here's an example using concepts explained in the node "Value History"
-of the GDB manual to print the variable frame from this line in
-xmenu.c:
-
-                 buf.frame_or_window = frame;
-
-First, use these commands:
+of the GDB manual to print values associated with the variable
+called frame.  First, use these commands:
  
      cd src
      gdb emacs
-    b xmenu.c:1296
-    r -q 
+    b set_frame_buffer_list
+    r -q
  
-Then type C-x 5 2 to create a new frame, and it hits the breakpoint:
+Then Emacs hits the breakpoint:
  
      (gdb) p frame
-    $1 = 1077872640
+    $1 = 139854428
      (gdb) xtype
      Lisp_Vectorlike
      PVEC_FRAME
      (gdb) xframe
-    $2 = (struct frame *) 0x3f0800
+    $2 = (struct frame *) 0x8560258
      (gdb) p *$
      $3 = {
-      size = 536871989, 
-      next = 0x366240, 
-      name = 809661752, 
+      size = 1073742931,
+      next = 0x85dfe58,
+      name = 140615219,
        [...]
      }
      (gdb) p $3->name
-    $4 = 809661752
+    $4 = 140615219
  
  Now we can use `pr' to print the name of the frame:
  
@@ -143,11 +158,13 @@ we want the address of the l-value expression near the bottom of
  
    XVECTOR (this_command_keys)->contents[this_command_key_count++] = key;
  
-XVECTOR is a macro, and therefore GDB does not know about it.
-GDB cannot evaluate "p XVECTOR (this_command_keys)".
+XVECTOR is a macro, so GDB only knows about it if Emacs has been compiled with
+preprocessor macro information.  GCC provides this if you specify the options
+`-gdwarf-2' and `-g3'.  In this case, GDB can evaluate expressions like
+"p XVECTOR (this_command_keys)".
  
-However, you can use the xvector command in GDB to get the same
-result.  Here is how:
+When this information isn't available, you can use the xvector command in GDB
+to get the same result.  Here is how:
  
      (gdb) p this_command_keys
      $1 = 1078005760
@@ -173,7 +190,7 @@ this vector.  `recent_keys' is updated in keyboard.c by the command
    XVECTOR (recent_keys)->contents[recent_keys_index] = c;
  
  So we define a GDB command `xvector-elts', so the last 10 keystrokes
-are printed by 
+are printed by
  
      xvector-elts recent_keys recent_keys_index 10
  
@@ -185,7 +202,7 @@ where you can define xvector-elts as follows:
      xvector
      set $foo = $
      while $i < $arg2
-    p $foo->contents[$arg1-($i++)] 
+    p $foo->contents[$arg1-($i++)]
      pr
      end
      document xvector-elts
@@ -213,7 +230,7 @@ of function calling.
  
  By printing the remaining elements of args, you can see the argument
  values.  Here's how to print the first argument:
-  
+
     p args[1]
     pr
  
@@ -226,7 +243,26 @@ conveniently.  For example:
  
  and, assuming that "xtype" says that args[0] is a symbol:
  
-   xsymbol 
+   xsymbol
+
+** Using GDB in Emacs
+
+Debugging with GDB in Emacs offers some advantages over the command line (see
+the GDB Graphical Interface node of the Emacs manual).  There are also some
+features available just for debugging Emacs:
+
+1) The command gud-pp is available on the tool bar (the `pp' icon) and allows
+   the user to print the s-expression of the variable at point, in the GUD
+   buffer.
+
+2) Pressing `p' on a component of a watch expression that is a Lisp object
+   in the speedbar prints its s-expression in the GUD buffer.
+
+3) The STOP button on the tool bar is adjusted so that it sends SIGTSTP
+   instead of the usual SIGINT.
+
+4) The command gud-pv has the global binding `C-x C-a C-v' and prints the
+   value of the Lisp variable at point.
  
  ** Debugging what happens while preloading and dumping Emacs
  
@@ -258,6 +294,62 @@ Setting a breakpoint in the function `x_error_quitter' and looking at
  the backtrace when Emacs stops inside that function will show what
  code causes the X protocol errors.
  
+Some bugs related to the X protocol disappear when Emacs runs in a
+synchronous mode.  To track down those bugs, we suggest the following
+procedure:
+
+  - Run Emacs under a debugger and put a breakpoint inside the
+    primitive function which, when called from Lisp, triggers the X
+    protocol errors.  For example, if the errors happen when you
+    delete a frame, put a breakpoint inside `Fdelete_frame'.
+
+  - When the breakpoint breaks, step through the code, looking for
+    calls to X functions (the ones whose names begin with "X" or
+    "Xt" or "Xm").
+
+  - Insert calls to `XSync' before and after each call to the X
+    functions, like this:
+
+       XSync (f->output_data.x->display_info->display, 0);
+
+    where `f' is the pointer to the `struct frame' of the selected
+    frame, normally available via XFRAME (selected_frame).  (Most
+    functions which call X already have some variable that holds the
+    pointer to the frame, perhaps called `f' or `sf', so you shouldn't
+    need to compute it.)
+
+    If your debugger can call functions in the program being debugged,
+    you should be able to issue the calls to `XSync' without recompiling
+    Emacs.  For example, with GDB, just type:
+
+       call XSync (f->output_data.x->display_info->display, 0)
+
+    before and immediately after the suspect X calls.  If your
+    debugger does not support this, you will need to add these pairs
+    of calls in the source and rebuild Emacs.
+
+    Either way, systematically step through the code and issue these
+    calls until you find the first X function called by Emacs after
+    which a call to `XSync' winds up in the function
+    `x_error_quitter'.  The first X function call for which this
+    happens is the one that generated the X protocol error.
+
+  - You should now look around this offending X call and try to figure
+    out what is wrong with it.
+
+** If Emacs causes errors or memory leaks in your X server
+
+You can trace the traffic between Emacs and your X server with a tool
+like xmon, available at ftp://ftp.x.org/contrib/devel_tools/.
+
+Xmon can be used to see exactly what Emacs sends when X protocol errors
+happen.  If Emacs causes the X server memory usage to increase, you can
+use xmon to see what items Emacs creates in the server (windows,
+graphics contexts, pixmaps) and what items Emacs deletes.  If there
+are consistently more creations than deletions, the type of item
+and the activity you do when the items get created can give a hint where
+to start debugging.
+
  ** If the symptom of the bug is that Emacs fails to respond
  
  Don't assume Emacs is `hung'--it may instead be in an infinite loop.
@@ -382,6 +474,9 @@ Several more functions for debugging display code are available in
  Emacs compiled with GLYPH_DEBUG defined; type "C-h f dump- TAB" and
  "C-h f trace- TAB" to see the full list.
  
+When you debug display problems running Emacs under X, you can use
+the `ff' command to flush all pending display updates to the screen.
+
  
  ** Debugging LessTif
  
@@ -390,7 +485,7 @@ and keyboard events, or LessTif menus behave weirdly, it might be
  helpful to set the `DEBUGSOURCES' and `DEBUG_FILE' environment
  variables, so that one can see what LessTif was doing at this point.
  For instance
-  
+
    export DEBUGSOURCES="RowColumn.c:MenuShell.c:MenuUtil.c"
    export DEBUG_FILE=/usr/tmp/LESSTIF_TRACE
    emacs &
@@ -411,22 +506,44 @@ the machine where you started GDB and use the debugger from there.
  The array `last_marked' (defined on alloc.c) can be used to display up
  to 500 last objects marked by the garbage collection process.
  Whenever the garbage collector marks a Lisp object, it records the
-pointer to that object in the `last_marked' array.  The variable
-`last_marked_index' holds the index into the `last_marked' array one
-place beyond where the pointer to the very last marked object is
-stored.
+pointer to that object in the `last_marked' array, which is maintained
+as a circular buffer.  The variable `last_marked_index' holds the
+index into the `last_marked' array one place beyond where the pointer
+to the very last marked object is stored.
  
  The single most important goal in debugging GC problems is to find the
  Lisp data structure that got corrupted.  This is not easy since GC
  changes the tag bits and relocates strings which make it hard to look
  at Lisp objects with commands such as `pr'.  It is sometimes necessary
  to convert Lisp_Object variables into pointers to C struct's manually.
-Use the `last_marked' array and the source to reconstruct the sequence
-that objects were marked.
  
-Once you discover the corrupted Lisp object or data structure, it is
-useful to look at it in a fresh Emacs session and compare its contents
-with a session that you are debugging.
+Use the `last_marked' array and the source to reconstruct the sequence
+in which objects were marked.  In general, you need to correlate the
+values recorded in the `last_marked' array with the corresponding
+stack frames in the backtrace, beginning with the innermost frame.
+Some subroutines of `mark_object' are invoked recursively, others loop
+over portions of the data structure and mark them as they go.  By
+looking at the code of those routines and comparing the frames in the
+backtrace with the values in `last_marked', you will be able to find
+connections between the values in `last_marked'.  E.g., when GC finds
+a cons cell, it recursively marks its car and its cdr.  Similar things
+happen with properties of symbols, elements of vectors, etc.  Use
+these connections to reconstruct the data structure that was being
+marked, paying special attention to the strings and names of symbols
+that you encounter: these strings and symbol names can be used to grep
+the sources to find out what high-level symbols and global variables
+are involved in the crash.
+
+Once you discover the corrupted Lisp object or data structure, grep
+the sources for its uses and try to figure out what could cause the
+corruption.  If looking at the sources doesn't help, you could try
+setting a watchpoint on the corrupted data, and see what code modifies
+it in some invalid way.  (Obviously, this technique is only useful for
+data that is modified only very rarely.)
+
+It is also useful to look at the corrupted object or data structure in
+a fresh Emacs session and compare its contents with a session that you
+are debugging.
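
Annotation on the `last_marked' description above: a minimal C model of a
circular buffer whose index points one place beyond the newest entry, as the
text describes for `last_marked' and `last_marked_index'.  The names `ring',
`ring_index', `ring_record' and `ring_nth_newest' are hypothetical stand-ins
and are not taken from alloc.c; the size 500 follows the "up to 500" figure
quoted above.

```c
#include <stddef.h>

#define RING_SIZE 500            /* models the "up to 500" entries */

static void *ring[RING_SIZE];    /* stand-in for last_marked */
static size_t ring_index;        /* stand-in for last_marked_index */

/* Record one pointer, wrapping around at the end of the array.  */
void
ring_record (void *obj)
{
  ring[ring_index++] = obj;
  if (ring_index == RING_SIZE)
    ring_index = 0;
}

/* Return the entry recorded AGE steps ago (0 = most recent).
   The index points one place beyond the newest entry, so step
   backward from it, wrapping modulo RING_SIZE.  AGE must be
   less than RING_SIZE.  */
void *
ring_nth_newest (size_t age)
{
  return ring[(ring_index + RING_SIZE - 1 - age) % RING_SIZE];
}
```

Under this model, the most recently marked object sits at the index just
before the current one, and older entries are found by stepping backward
and wrapping from slot 0 to the last slot.  The same arithmetic should
apply when examining the real array by hand in the debugger.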
  
  ** Debugging problems with non-ASCII characters
  
@@ -437,6 +554,79 @@ some extra checks, such as look for broken relations between byte and
  character positions in buffers and strings; the resulting diagnostics
  might pinpoint the cause of the problem.
  
+** Debugging the TTY (non-windowed) version
+
+The most convenient method of debugging the character-terminal display
+is to do that on a window system such as X.  Begin by starting an
+xterm window, then type these commands inside that window:
+
+  $ tty
+  $ echo $TERM
+
+Let's say these commands print "/dev/ttyp4" and "xterm", respectively.
+
+Now start Emacs (the normal, windowed-display session, i.e. without
+the `-nw' option), and invoke "M-x gdb RET emacs RET" from there.  Now
+type these commands at GDB's prompt:
+
+  (gdb) set args -nw -t /dev/ttyp4
+  (gdb) set environment TERM xterm
+  (gdb) run
+
+The debugged Emacs should now start in no-window mode with its display
+directed to the xterm window you opened above.
+
+A similar arrangement is possible on a character terminal by using the
+`screen' package.
+
+** Running Emacs built with malloc debugging packages
+
+If Emacs exhibits bugs that seem to be related to use of memory
+allocated off the heap, it might be useful to link Emacs with a
+special debugging library, such as Electric Fence (a.k.a. efence) or
+GNU Checker, which helps find such problems.
+
+Emacs compiled with such packages might not run without some hacking,
+because Emacs replaces the system's memory allocation functions with
+its own versions, and because the dumping process might be
+incompatible with the way these packages track allocated
+memory.  Here are some of the changes you might find necessary
+(SYSTEM-NAME and MACHINE-NAME are the names of your OS- and
+CPU-specific headers in the subdirectories of `src'):
+
+  - In src/s/SYSTEM-NAME.h add "#define SYSTEM_MALLOC".
+
+  - In src/m/MACHINE-NAME.h add "#define CANNOT_DUMP" and
+    "#define CANNOT_UNEXEC".
+
+  - Configure with a different --prefix= option.  If you use GCC,
+    version 2.7.2 is preferred, as some malloc debugging packages
+    work a lot better with it than with 2.95 or later versions.
+
+  - Type "make" then "make -k install".
+
+  - If required, invoke the package-specific command to prepare
+    src/temacs for execution.
+
+  - cd ..; src/temacs
+
+(Note that this runs `temacs' instead of the usual `emacs' executable.
+This avoids problems with dumping Emacs mentioned above.)
+
+Some malloc debugging libraries might print lots of false alarms for
+bitfields used by Emacs in some data structures.  If you want to get
+rid of the false alarms, you will have to hack the definitions of
+these data structures in the respective headers to remove the `:N'
+bitfield definitions (which will cause each such field to use a full
+int).
+
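Annotation on the bitfield paragraph above: a minimal C sketch of what
removing the `:N' widths looks like.  Both structures are hypothetical and
exist only for this illustration; they are not Emacs data structures.

```c
/* Hypothetical structure with bitfields: the fields share storage
   inside one or more unsigned ints.  */
struct flags_packed
{
  unsigned visible : 1;
  unsigned dirty : 1;
  unsigned kind : 4;
};

/* The same fields with the `:N' widths removed: each field is now
   a full unsigned int with its own storage.  */
struct flags_plain
{
  unsigned visible;
  unsigned dirty;
  unsigned kind;
};
```

The plain variant is larger, since each field now uses a full int, but no
field shares storage with its neighbors any more.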
+** How to recover buffer contents from an Emacs core dump file
+
+The file etc/emacs-buffer.gdb defines a set of GDB commands for
+recovering the contents of Emacs buffers from a core dump file.  You
+might also find those commands useful for displaying the list of
+buffers in human-readable format from within the debugger.
+
  ** Some suggestions for debugging on MS Windows:
  
     (written by Marc Fleischeuers, Geoff Voelker and Andrew Innes)
@@ -521,3 +711,5 @@ temporarily, you will see an old value for it.  Again, you need to
  look at the disassembly to determine which registers are being used,
  and look at those registers directly, to see the actual current values
  of these variables.
+
+;;; arch-tag: fbf32980-e35d-481f-8e4c-a2eca2586e6b