[SOLVED] How to read llvm-cov json format?

I can export code coverage data with llvm-cov in JSON format, but the content is mysterious to me. What does each number in the segments section mean?

{
   "filename":"file.m",
   "segments":[
      [
         11,
         22,
         23,
         1,
         1
      ],
      [
         12,
         11,
         23,
         1,
         1
      ],
      ...
   ],
   "expansions":[

   ],
   "summary":{
      ...
   }
}

Solution

Going by https://clang.llvm.org/docs/SourceBasedCodeCoverage.html, the JSON format is explained in the source code, which I found at https://github.com/llvm/llvm-project/tree/main/llvm/tools/llvm-cov.

The source code contains the following description:

The json code coverage export follows the following format
Root: dict => Root Element containing metadata
-- Data: array => Homogeneous array of one or more export objects
  -- Export: dict => Json representation of one CoverageMapping
    -- Files: array => List of objects describing coverage for files
      -- File: dict => Coverage for a single file
        -- Branches: array => List of Branches in the file
          -- Branch: dict => Describes a branch of the file with counters
        -- Segments: array => List of Segments contained in the file
          -- Segment: dict => Describes a segment of the file with a counter
        -- Expansions: array => List of expansion records
          -- Expansion: dict => Object that describes a single expansion
            -- CountedRegion: dict => The region to be expanded
            -- TargetRegions: array => List of Regions in the expansion
              -- CountedRegion: dict => Single Region in the expansion
            -- Branches: array => List of Branches in the expansion
              -- Branch: dict => Describes a branch in expansion and counters
        -- Summary: dict => Object summarizing the coverage for this file
          -- LineCoverage: dict => Object summarizing line coverage
          -- FunctionCoverage: dict => Object summarizing function coverage
          -- RegionCoverage: dict => Object summarizing region coverage
          -- BranchCoverage: dict => Object summarizing branch coverage
    -- Functions: array => List of objects describing coverage for functions
      -- Function: dict => Coverage info for a single function
        -- Filenames: array => List of filenames that the function relates to
  -- Summary: dict => Object summarizing the coverage for the entire binary
    -- LineCoverage: dict => Object summarizing line coverage
    -- FunctionCoverage: dict => Object summarizing function coverage
    -- InstantiationCoverage: dict => Object summarizing inst. coverage
    -- RegionCoverage: dict => Object summarizing region coverage
    -- BranchCoverage: dict => Object summarizing branch coverage
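
To make the nesting concrete, here is a minimal Python sketch that walks from the root down to each file's segments. It assumes the top-level keys are "data" and "files" (lowercase forms of the Data and Files entries in the outline); "filename" and "segments" are the keys shown in the question. The sample document is a hypothetical miniature, not real llvm-cov output.

```python
import json

# Hypothetical miniature export. The keys "data" and "files" are assumed to be
# the lowercase forms of the outline's Data and Files entries; "filename" and
# "segments" are the keys shown in the question.
SAMPLE = """
{
  "data": [
    {
      "files": [
        {
          "filename": "file.m",
          "segments": [[11, 22, 23, 1, 1], [12, 11, 23, 1, 1]],
          "expansions": [],
          "summary": {}
        }
      ]
    }
  ]
}
"""


def segments_per_file(report):
    """Map each filename to its number of segments, across all export objects."""
    counts = {}
    for export in report.get("data", []):
        for file_entry in export.get("files", []):
            counts[file_entry["filename"]] = len(file_entry.get("segments", []))
    return counts


report = json.loads(SAMPLE)
print(segments_per_file(report))
```

The outline names the containers, but not what each entry inside them holds.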

Sadly, this still does not explain what a segment is or how it is structured.

Looking at the code in more detail, we find the following two snippets:

json::Array renderSegment(const coverage::CoverageSegment &Segment) {
  return json::Array({Segment.Line, Segment.Col, int64_t(Segment.Count),
                      Segment.HasCount, Segment.IsRegionEntry});
}

json::Array renderRegion(const coverage::CountedRegion &Region) {
  return json::Array({Region.LineStart, Region.ColumnStart, Region.LineEnd,
                      Region.ColumnEnd, int64_t(Region.ExecutionCount),
                      Region.FileID, Region.ExpandedFileID,
                      int64_t(Region.Kind)});
}

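These give a better idea of what the entries mean. Each segment is an array of [line, column, execution count, has-count flag, region-entry flag], and each region is an array of [start line, start column, end line, end column, execution count, file ID, expanded file ID, region kind]. In the question's example, [11, 22, 23, 1, 1] is a segment that starts at line 11, column 22, has an execution count of 23, has a count recorded, and marks the entry of a region.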
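
As a rough sketch of how a segment array could be read in a script, the following Python gives names to the five positions in the order renderSegment emits them. The Segment type and decode_segment helper are illustrative names, not part of llvm-cov; any trailing fields that another llvm-cov version might emit are simply ignored here.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """Field order follows renderSegment: Line, Col, Count, HasCount, IsRegionEntry."""
    line: int
    col: int
    count: int
    has_count: bool
    is_region_entry: bool


def decode_segment(raw):
    """Name the first five fields of a raw segment array; ignore any extras."""
    line, col, count, has_count, is_region_entry = raw[:5]
    return Segment(line, col, count, bool(has_count), bool(is_region_entry))


# First segment from the question.
seg = decode_segment([11, 22, 23, 1, 1])
print(seg.line, seg.col, seg.count, seg.has_count, seg.is_region_entry)
```

The flags arrive as 0/1 in the question's output, so the helper converts them to booleans to make later checks such as `if seg.has_count` read naturally.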

The file ID seems to index into the filenames given in the expansions.
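
If that guess about the file ID is right, a region array could be decoded and resolved against a filename list roughly as follows. This is a sketch under that assumption: the Region type, the helper names, the sample region, and the sample filenames are all made up for illustration, and the region kind is left as the raw integer that renderRegion emits.

```python
from typing import NamedTuple, Optional, Sequence


class Region(NamedTuple):
    """Field order follows renderRegion."""
    line_start: int
    column_start: int
    line_end: int
    column_end: int
    execution_count: int
    file_id: int
    expanded_file_id: int
    kind: int  # raw integer value of Region.Kind, not interpreted here


def decode_region(raw):
    """Name the eight fields of a raw region array."""
    return Region(*raw[:8])


def region_filename(region: Region, filenames: Sequence[str]) -> Optional[str]:
    """Assumption: file_id is an index into filenames. Returns None if out of range."""
    if 0 <= region.file_id < len(filenames):
        return filenames[region.file_id]
    return None


# Hypothetical data, not taken from real llvm-cov output.
filenames = ["main.c", "macros.h"]
region = decode_region([3, 5, 3, 20, 7, 0, 1, 1])
print(region_filename(region, filenames), region.execution_count)
```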