Issue 130313.3: Range table base offset (Split DWARF, part 3/5)

Author: Cary Coutant
Champion: Cary Coutant
Date submitted: 2013-03-13
Date revised:
Date closed:
Type: Enhancement
Status: Accepted
DWARF Version: 5
Section , pg

Introduction
------------

With the current DWARF format, the debug information is designed
with the expectation that it will be processed by the linker to
produce an output binary with complete debug information, and
with fully-resolved references to locations within the
application. For very large applications, however, this approach
can result in excessively long link times and excessively large
output files. Several vendors have independently developed
proprietary approaches that allow the debug information to remain
in the relocatable object files, so that the linker does not have
to process the debug information or copy it to the output file.
These approaches have all required that additional information be
made available to the debug information consumer, and that the
consumer perform some minimal amount of relocation in order to
interpret the debug info correctly. The additional information
required, in the form of load maps or symbol tables, and the
details of the relocation are not covered by the DWARF
specification, and vary with each vendor's implementation.

For more background information, see the GCC wiki page at:

    http://gcc.gnu.org/wiki/DebugFission

This is the third in a series of proposals to extend the DWARF
specification to support the use of unlinked and unrelocated
debug information.

  * The first proposal introduces a mechanism for referring to
    strings indirectly, collecting the string offsets into a new
    section, and introducing a new form code for this purpose.

  * The second proposal introduces a similar indirect mechanism
    for referring to relocatable addresses in the loadable
    segments of the program, collecting all such addresses in a
    single section.

  * This third proposal introduces a mechanism for referring to
    range table entries in the .debug_ranges section with a
    single relocation per compilation unit rather than one for
    each reference.

  * The fourth proposal introduces a mechanism for splitting the
    DWARF debugging information into two sets of sections. One
    set remains in the relocatable object (.o) files, and is
    linked into the final executable; the other set is written to
    a non-relocatable DWARF object (.dwo) file, and remains in
    the build tree as an independent file where the debugger can
    reference it as needed.

  * The fifth proposal introduces a package file format for
    collecting the DWARF object (.dwo) files so that the
    debugging information can be easily saved and transported for
    debugging applications outside of the build tree.


Overview
--------

In order to reduce the number of relocations that reference
range lists in the .debug_ranges section, we propose a new
DW_AT_ranges_base attribute that provides the base offset for the
range list entries for the compilation unit. When this attribute
is provided in a DW_TAG_compile_unit DIE, all references to a
range list are taken as offsets relative to the given base
offset. This allows the compiler to emit a single relocation per
compilation unit for the .debug_ranges section, and it can use
non-relocated offsets for every DW_AT_ranges attribute (while
still using DW_FORM_sec_offset).
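
As a rough illustration of the producer side, the sketch below
assigns unit-relative offsets to the range lists of one compilation
unit. The function name assign_range_offsets and the byte sizes are
hypothetical and not part of this proposal; the point is only that
the DW_AT_ranges values become plain constants, leaving the
DW_AT_ranges_base value as the one quantity that needs a relocation.

```python
def assign_range_offsets(list_sizes):
    """Return the unit-relative offset of each range list.

    list_sizes: sizes in bytes of the range lists that one compilation
    unit contributes to .debug_ranges, in emission order (hypothetical
    input for this sketch).
    """
    offsets = []
    position = 0
    for size in list_sizes:
        # Value stored in DW_AT_ranges; a constant, so no relocation.
        offsets.append(position)
        position += size
    return offsets


# Three range lists of 48, 32 and 64 bytes (made-up sizes).
relative_offsets = assign_range_offsets([48, 32, 64])

# Without DW_AT_ranges_base: one relocation per DW_AT_ranges attribute.
relocations_without_base = len(relative_offsets)
# With DW_AT_ranges_base: one relocation, on the base attribute itself.
relocations_with_base = 1
```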


Changes to the DWARF Specification
----------------------------------

Section 2.2 ("Attribute Types")

In Figure 2, add the following row:

    Attribute name            Identifies or Specifies
    --------------            -----------------------
    DW_AT_ranges_base         Base offset for range lists

Section 2.17.3 ("Non-Contiguous Address Ranges")

The second paragraph currently reads as follows:

    Range lists are contained in a separate object file section
    called .debug_ranges. A range list is indicated by a
    DW_AT_ranges attribute whose value is represented as an
    offset from the beginning of the .debug_ranges section to the
    beginning of the range list.

At the end of this paragraph, add the following sentences:

    If the current compilation unit contains a DW_AT_ranges_base
    attribute, the value of that attribute establishes a base
    offset within the .debug_ranges section for the compilation
    unit. The offset given by the DW_AT_ranges attribute is
    relative to that base.
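
On the consumer side, the rule above might be applied as in the
following minimal sketch. The function name and the use of None to
mean "attribute absent" are assumptions made for illustration only.

```python
def resolve_range_list_offset(ranges_value, ranges_base=None):
    """Return the offset of a range list from the start of .debug_ranges.

    ranges_value: value of a DW_AT_ranges attribute.
    ranges_base:  value of the unit's DW_AT_ranges_base attribute, or
                  None if the unit has no such attribute (an assumed
                  convention for this sketch).
    """
    if ranges_base is None:
        # No base: the value is already an offset from the section start.
        return ranges_value
    # Base present: the value is relative to the unit's base offset.
    return ranges_base + ranges_value
```

For example, with a base of 0x1000 and a DW_AT_ranges value of 0x30,
the range list would be found at offset 0x1030 in .debug_ranges.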

Section 3.1.1 ("Normal and Partial Compilation Unit Entries")

In the numbered list introduced by "Compilation unit entries may
have the following attributes:", add the following item:

    14. A DW_AT_ranges_base attribute, whose value is of class
        rangelistptr. This attribute points to the beginning of
        the compilation unit's contribution to the .debug_ranges
        section. References to range lists (using
        DW_FORM_sec_offset) within the compilation unit must be
        interpreted as offsets relative to this base.

Section 7.5.4 ("Attribute Encodings")

In Figure 20, add the following row to the table:

    Attribute name            Value    Classes
    --------------            -----    -------
    DW_AT_ranges_base         [TBA]    rangelistptr

Appendix A ("Attributes by Tag Value")

In Figure 42, in the table entries for DW_TAG_compile_unit and
DW_TAG_partial_unit, add DW_AT_ranges_base.


In Chapter 7, add a new section after Section 7.24:

7.xx Range List Table

    Each set of entries in the range list table contained in
    the .debug_ranges section begins with a header containing:

        1.  unit_length (initial length)
            A 4-byte or 12-byte length containing the length of
            the set of entries for this compilation unit, not
            including the length field itself. In the 32-bit
            DWARF format, this is a 4-byte unsigned integer
            (which must be less than 0xfffffff0); in the 64-bit
            DWARF format, this consists of the 4-byte value
            0xffffffff followed by an 8-byte unsigned integer
            that gives the actual length (see Section 7.4).

        2.  version (uhalf)
            A 2-byte version identifier containing the value 5
            (see Appendix F).

        3.  address_size (ubyte)
            A 1-byte unsigned integer containing the size in
            bytes of an address (or the offset portion of an
            address for segmented addressing) on the target
            system.

        4.  segment_size (ubyte)
            A 1-byte unsigned integer containing the size in
            bytes of a segment selector on the target system.

    This header is followed by a series of range list entries as
    described in Section 2.17.3. The segment size is given by the
    segment_size field of the header, and the address size is
    given by the address_size field of the header. If the
    segment_size field in the header is zero, the segment
    selector is omitted from the range list entries.

    The DW_AT_ranges_base attribute must point to the first entry
    following the header. The entries are referenced by a byte
    offset relative to this base offset.
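
A consumer could read the header described above roughly as follows.
This is a sketch under stated assumptions: the function name, the
returned dictionary, and the little_endian parameter are all
hypothetical, and the byte order in a real file would come from the
containing object file. The "base" value it returns is the offset of
the first entry following the header, which is where
DW_AT_ranges_base is required to point.

```python
import struct


def parse_range_table_header(data, start=0, little_endian=True):
    """Parse one range list table header found at data[start:].

    Returns the header fields plus 'base' (offset of the first entry
    after the header) and 'end' (offset just past this contribution).
    """
    endian = "<" if little_endian else ">"
    pos = start

    # unit_length: 4 bytes, or 0xffffffff followed by 8 bytes.
    (unit_length,) = struct.unpack_from(endian + "I", data, pos)
    pos += 4
    is_64bit = unit_length == 0xFFFFFFFF
    if is_64bit:
        (unit_length,) = struct.unpack_from(endian + "Q", data, pos)
        pos += 8
    elif unit_length >= 0xFFFFFFF0:
        raise ValueError("reserved initial length value")

    # unit_length does not include the length field itself.
    end = pos + unit_length

    # version (uhalf), address_size (ubyte), segment_size (ubyte).
    version, address_size, segment_size = struct.unpack_from(
        endian + "HBB", data, pos)
    pos += 4

    return {
        "is_64bit": is_64bit,
        "unit_length": unit_length,
        "version": version,
        "address_size": address_size,
        "segment_size": segment_size,
        "base": pos,
        "end": end,
    }
```

For a contribution that starts at offset 0 of the section, the base
would be 8 in the 32-bit DWARF format (4 + 2 + 1 + 1 bytes of
header) and 16 in the 64-bit DWARF format (12 + 2 + 1 + 1).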

Appendix F ("DWARF Section Version Numbers")

In Figure 97, replace the row for .debug_ranges with the
following:

    Section Name      V2    V3    V4    V5
    ------------      --    --    --    --
    .debug_ranges      x     x     x     5


---
Accepted 3/19/2013
Revised 5/26/2013:  Added section header.