c gdb fedora coredump

Why does gdb complain that my core files are too small and then fail to produce a meaningful stack trace?


I have a core file generated from a segfault. When I try to load it into gdb, it doesn't appear to matter how I load it or if I use the correct executable or not - I always get this warning from gdb about the core file being truncated:

$ gdb -q /u1/dbg/bin/exdoc_usermaint_pdf_compact /tmp/barry/core.exdoc_usermaint.11
Reading symbols from /u1/dbg/bin/exdoc_usermaint_pdf_compact...done.
BFD: Warning: /tmp/barry/core.exdoc_usermaint.11 is truncated: expected core file size >= 43548672, found: 31399936.

warning: core file may not match specified executable file.
Cannot access memory at address 0x7f0ebc833668
(gdb) q

I am concerned with this error: "BFD: Warning: /tmp/barry/core.exdoc_usermaint.11 is truncated: expected core file size >= 43548672, found: 31399936."

Why does gdb think the core file is truncated? Is gdb right? Where does gdb obtain an expected size for the core file, and can I double-check it?

Background:

I am attempting to improve our diagnosis of segfaults on our production systems. My plan is to take core files from stripped executables in production and use them with debug versions of the executables on our development system, to quickly diagnose segfault bugs. In an earlier version of this question I gave many details related to the similar-but-different systems, but I have since been granted an account on our production system and determined that most of the details were unimportant to the problem.

gdb version:

$ gdb
GNU gdb (GDB) Fedora (7.0.1-50.fc12)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.

Linux version:

$ uname -a
Linux somehost 2.6.32.23-170.fc12.x86_64 #1 SMP Mon Sep 27 17:23:59 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

I read this question (and many others) before posting. The asker has goals somewhat similar to mine, but is not getting any error from gdb about a truncated core file, so the information in that question does not help me with my problem.


Solution

  • The Core Dump File Format

    On a modern Linux system, core dump files are formatted using the ELF object file format, with a specific configuration. ELF is a structured binary file format, with file offsets used as references between data chunks in the file.

    For core dump files, the e_type field in the ELF file header will have the value ET_CORE.

    Unlike most ELF files, core dump files written by the Linux kernel describe all their data through program headers, and normally contain no section headers. You may therefore choose to ignore section headers when calculating the size of the file, if you only need to deal with core files.
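
The e_type check described above can be sketched in a few lines of Python. This is only an illustration: the function names elf_type and is_core_file are made up for this example, and it reads nothing beyond the first 18 bytes of the file.

```python
import struct

ET_CORE = 4  # e_type value that marks a core file in the ELF specification


def elf_type(header: bytes) -> int:
    """Return the e_type field from the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian
    if header[5] == 1:
        endian = "<"
    elif header[5] == 2:
        endian = ">"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_type is a 16-bit field placed directly after the 16-byte e_ident
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type


def is_core_file(path: str) -> bool:
    """Hypothetical helper: True if the file at path has e_type == ET_CORE."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_CORE
```

For the file in the question this would be called as is_core_file("/tmp/barry/core.exdoc_usermaint.11"). Because the ELF header sits at the very start of the file, this check still works on a core file that has been truncated further along.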

    Calculating Core Dump File Size

    To calculate the ELF file size:

    1. Consider all the chunks in the file, each written below as description (offset + size):
      • the ELF file header (0 + e_ehsize) (52 for ELF32, 64 for ELF64)
      • program header table (e_phoff + e_phentsize * e_phnum)
      • program data chunks (aka "segments") (p_offset + p_filesz)
      • the section header table (e_shoff + e_shentsize * e_shnum) - not required for core files
      • the section data chunks - (sh_offset + sh_size) - not required for core files
    2. Eliminate any sections with an sh_type of SHT_NOBITS, as these occupy no space in the file: the header only records where the data would sit (for example .bss, or section contents that have been stripped out of the file). This step is not required for core files.
    3. Eliminate any chunks of size 0, as they contain no addressable bytes and therefore their file offset is irrelevant.
    4. The end of the file will be the end of the last chunk, which is the maximum of the offset + size for all remaining chunks listed above.

    If you find the offsets to the program header or section header tables are past the end of the file, then you will not be able to calculate an expected file size, but you will know the file has been truncated.

    Although an ELF file could potentially contain unaddressed regions and be longer than the calculated size, in my limited experience the files have been exactly the size calculated by the above method.
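
As a rough sketch of the method above, the following Python reads the ELF header and program headers and returns the largest offset + size it finds. It makes some assumptions: the function name expected_elf_size is invented for this example, section headers are ignored (as suggested above for core files), and the extended numbering used when a file has more than 65534 program headers is not handled.

```python
import os
import struct


def expected_elf_size(path):
    """Return the minimum file size implied by the ELF and program headers."""
    with open(path, "rb") as f:
        ident = f.read(16)
        if len(ident) < 16 or ident[:4] != b"\x7fELF":
            raise ValueError("not an ELF file")
        is64 = ident[4] == 2                     # EI_CLASS: 1 = ELF32, 2 = ELF64
        endian = "<" if ident[5] == 1 else ">"   # EI_DATA: 1 = little-endian
        # Header fields after e_ident: e_type, e_machine, e_version, e_entry,
        # e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, ...
        ehdr_fmt = endian + ("HHIQQQIHHHHHH" if is64 else "HHIIIIIHHHHHH")
        raw = f.read(struct.calcsize(ehdr_fmt))
        if len(raw) < struct.calcsize(ehdr_fmt):
            raise ValueError("ELF header is truncated")
        fields = struct.unpack(ehdr_fmt, raw)
        e_phoff, e_ehsize = fields[4], fields[7]
        e_phentsize, e_phnum = fields[8], fields[9]

        end = e_ehsize
        if e_phnum:
            end = max(end, e_phoff + e_phentsize * e_phnum)

        # ELF64 and ELF32 order the program header fields differently.
        phdr_fmt = endian + ("IIQQQQQQ" if is64 else "IIIIIIII")
        for i in range(e_phnum):
            f.seek(e_phoff + i * e_phentsize)
            raw = f.read(struct.calcsize(phdr_fmt))
            if len(raw) < struct.calcsize(phdr_fmt):
                raise ValueError("program header table is truncated")
            p = struct.unpack(phdr_fmt, raw)
            p_offset, p_filesz = (p[2], p[5]) if is64 else (p[1], p[4])
            if p_filesz > 0:                     # step 3: skip empty chunks
                end = max(end, p_offset + p_filesz)
        return end
```

Comparing expected_elf_size(path) with os.path.getsize(path) gives an independent check of gdb's warning: if the second number is smaller, the file is shorter than its own headers say it should be.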

    Truncated Core Files

    The warning is prefixed with "BFD", the library gdb uses to read object files, which likely performs a calculation similar to the above to arrive at the expected core file size.

    In short, if gdb says your core file is truncated, it is very likely truncated.

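    One of the most likely causes of truncated core dump files is the core file size limit (ulimit). This can be configured persistently in /etc/security/limits.conf, or set for the current shell and the processes it starts using the ulimit shell command [footnote: I don't know anything about systems other than my own].

    Try the command "ulimit -c" to check your effective core file size limit. Keep in mind that the limit that matters is the one in effect for the process that crashed, which may differ from the one in your interactive shell: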

    $ ulimit -c
    unlimited
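
The same limit can also be read from inside a process, which may be useful when the crashing program is not started from your interactive shell. A minimal sketch using Python's standard resource module (Unix only); the function name describe_core_limit is made up for this example:

```python
import resource


def describe_core_limit():
    """Return the soft and hard RLIMIT_CORE values as readable text."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

    def fmt(value):
        # RLIM_INFINITY is the value that the shell displays as "unlimited"
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value} bytes"

    return f"soft={fmt(soft)}, hard={fmt(hard)}"


print(describe_core_limit())
```

getrlimit reports the limit in bytes, whereas the shell may display it in blocks, so the two numbers will not necessarily match digit for digit.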
    

    Also, it's worth noting that gdb doesn't actually refuse to operate because of the truncated core file. gdb still attempts to produce a stack backtrace and in your case only fails when it tries to access data on the stack and finds that the specific memory locations addressed are off the end of the truncated core file.
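
To see whether a particular address falls in the part of the file that was cut off, the program headers can be used to map file offsets back to virtual addresses. The sketch below is an illustration under assumptions, not what gdb itself does: the function names are invented, and only PT_LOAD segments are considered.

```python
import os
import struct

PT_LOAD = 1  # program header type for a loadable memory segment


def missing_address_ranges(path):
    """Return (start, end) virtual address ranges whose bytes lie past EOF."""
    actual = os.path.getsize(path)
    with open(path, "rb") as f:
        ident = f.read(16)
        if len(ident) < 16 or ident[:4] != b"\x7fELF":
            raise ValueError("not an ELF file")
        is64 = ident[4] == 2                     # EI_CLASS: 2 = ELF64
        endian = "<" if ident[5] == 1 else ">"   # EI_DATA: 1 = little-endian
        ehdr_fmt = endian + ("HHIQQQIHHHHHH" if is64 else "HHIIIIIHHHHHH")
        raw = f.read(struct.calcsize(ehdr_fmt))
        if len(raw) < struct.calcsize(ehdr_fmt):
            raise ValueError("ELF header is truncated")
        fields = struct.unpack(ehdr_fmt, raw)
        e_phoff, e_phentsize, e_phnum = fields[4], fields[8], fields[9]

        phdr_fmt = endian + ("IIQQQQQQ" if is64 else "IIIIIIII")
        ranges = []
        for i in range(e_phnum):
            f.seek(e_phoff + i * e_phentsize)
            raw = f.read(struct.calcsize(phdr_fmt))
            if len(raw) < struct.calcsize(phdr_fmt):
                raise ValueError("program header table is truncated")
            p = struct.unpack(phdr_fmt, raw)
            if is64:
                p_type, p_offset, p_vaddr, p_filesz = p[0], p[2], p[3], p[5]
            else:
                p_type, p_offset, p_vaddr, p_filesz = p[0], p[1], p[2], p[4]
            if p_type != PT_LOAD or p_filesz == 0:
                continue
            # First file offset of this segment that is not actually present
            first_missing = max(p_offset, actual)
            if first_missing < p_offset + p_filesz:
                start = p_vaddr + (first_missing - p_offset)
                ranges.append((start, p_vaddr + p_filesz))
        return ranges


def address_is_missing(path, addr):
    """True if addr belongs to a segment whose data was cut off."""
    return any(start <= addr < end for start, end in missing_address_ranges(path))
```

Calling address_is_missing with the core file and the address from the "Cannot access memory" message (0x7f0ebc833668 in the question) would show whether that failure is explained by the truncation. A False result does not prove the address is readable, since it may simply not be covered by any segment in the dump.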