I'm having a difficult time with what, in my mind, should be a fairly simple task: using the Python bindings to libclang, I want to get the dimensions of a multi-dimensional array field of a POD C++ structure. When I traverse the AST, I can drill down to the cursor containing the array field declaration, and I can even see that it has two child nodes that, I assume, hold the sizes of its dimensions, but I've had no luck accessing those sizes as values. Below is a minimal example of my attempt:
#pragma once
struct my_struct {
    int mdarr[10][20];
};
import clang.cindex as cl

def process(c):
    if c.kind in [cl.CursorKind.STRUCT_DECL, cl.CursorKind.CLASS_DECL]:
        print("Found struct: ", c.spelling)
        for field in c.type.get_fields():
            print("Found field: ", field.spelling)
            # Returns size of first dimension but not the second dimension
            print("Array size: ", field.type.get_array_size())
            for child in field.get_children():
                # Prints an empty string
                print("Found child: ", child.spelling)
                # How do I extract the value from the `INTEGER_LITERAL`?
                print("Child cursor kind: ", child.kind)
        return
    for child in c.get_children():
        process(child)

idx = cl.Index.create()
tu = idx.parse("test.hpp")
process(tu.cursor)
Found struct: my_struct
Found field: mdarr
Array size: 10
Found child:
Child cursor kind: CursorKind.INTEGER_LITERAL
Found child:
Child cursor kind: CursorKind.INTEGER_LITERAL
Is there an easy way to extract each dimension's size using libclang?
C and C++ do not have multidimensional arrays per se. Instead, an array type can have another array type as its element type. In the declaration:
int mdarr[10][20];
this is parsed "inside out" (like all C/C++ declarators) as:
    mdarr
    ^^^^^          mdarr is ...
    mdarr[10]
         ^^^^      ... an array of 10 elements, each element being ...
    mdarr[10][20]
             ^^^^  ... an array of 20 elements, each element being ...
int mdarr[10][20]
^^^                ... an integer.
Clang's representation follows this structure, representing the type as an array(10) of an array(20) of int.
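The same nesting can be observed from Python without libclang. As an illustrative sketch only (ctypes is used here as an analogy and is not part of the libclang API), a ctypes array type whose element type is itself an array type mirrors the shape of int mdarr[10][20]:

```python
import ctypes

# Analogue of `int mdarr[10][20]`: an array of 10 elements,
# each of which is an array of 20 ints.
Inner = ctypes.c_int * 20
Outer = Inner * 10

outer_len = Outer._length_        # number of elements in the outer array type
inner_type = Outer._type_         # element type of the outer array (an array type)
inner_len = inner_type._length_   # number of elements in the inner array type
leaf_type = inner_type._type_     # element type of the inner array

print(outer_len, inner_len, leaf_type is ctypes.c_int)
```

There is no single object here that holds both 10 and 20; the second dimension is only reachable by stepping to the element type, which is the same walk needed on a libclang type.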
The type of a Python libclang `Cursor` is obtained with `Cursor.type`. From a `Type`, one may use `Type.kind` to determine what kind of type (primitive, array, etc.) this is, `Type.get_array_element_type` to get the element type for an array, and `Type.get_array_size` to get the array size, among other methods.
The following is a modified version of the code in the question (additions are marked between `BEGIN ADDED` and `END ADDED`) that prints all of the array dimensions:
import clang.cindex as cl

def process(c):
    if c.kind in [cl.CursorKind.STRUCT_DECL, cl.CursorKind.CLASS_DECL]:
        print("Found struct: ", c.spelling)
        for field in c.type.get_fields():
            print("Found field: ", field.spelling)
            # Returns size of first dimension but not the second dimension
            print("Array size: ", field.type.get_array_size())
            # BEGIN ADDED
            t = field.type
            while t.kind == cl.TypeKind.CONSTANTARRAY:
                print(" ADDED: Array size:", t.get_array_size())
                t = t.get_array_element_type()
                print(" ADDED: Element type:", t.spelling)
            # END ADDED
            for child in field.get_children():
                # Prints an empty string
                print("Found child: ", child.spelling)
                # How do I extract the value from the `INTEGER_LITERAL`?
                print("Child cursor kind: ", child.kind)
        return
    for child in c.get_children():
        process(child)

idx = cl.Index.create()
tu = idx.parse("test.hpp")
process(tu.cursor)
On the original input, this script prints:
Found struct: my_struct
Found field: mdarr
Array size: 10
ADDED: Array size: 10
ADDED: Element type: int[20]
ADDED: Array size: 20
ADDED: Element type: int
Found child:
Child cursor kind: CursorKind.INTEGER_LITERAL
Found child:
Child cursor kind: CursorKind.INTEGER_LITERAL
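If the dimensions are needed as values rather than printed, the loop between BEGIN ADDED and END ADDED could be packaged as a small helper. This is a minimal sketch: the name `array_dimensions` is hypothetical, and the helper compares the type kind by name only so that it can be exercised without libclang installed; it assumes the argument behaves like a `clang.cindex.Type`.

```python
def array_dimensions(t):
    """Collect the sizes of nested constant-size array types.

    `t` is assumed to behave like a clang.cindex.Type: it needs `kind`,
    `get_array_size()` and `get_array_element_type()`.
    Returns (sizes, element_type), with the outermost dimension first.
    """
    sizes = []
    # Comparing by name avoids importing clang.cindex in this helper;
    # `t.kind == cl.TypeKind.CONSTANTARRAY` is the equivalent direct check.
    while t.kind.name == "CONSTANTARRAY":
        sizes.append(t.get_array_size())
        t = t.get_array_element_type()
    return sizes, t
```

Called as `sizes, elem = array_dimensions(field.type)` inside the field loop, this should yield `[10, 20]` for `mdarr`, with `elem.spelling` being `int`, consistent with the output shown above. Non-array fields give an empty list.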
INTEGER_LITERAL
Relatedly, you ask (in a comment in the question code) how to get the value of an `INTEGER_LITERAL` node. To do that, retrieve the tokens that make up the literal, get the first one, and get its "spelling". See the Q+A "How to retrieve function call argument values using libclang" for more details about that.
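The token approach can be sketched as follows. The helper names `first_token_spelling` and `parse_c_integer` are hypothetical; `Cursor.get_tokens()` and `Token.spelling` are the binding calls being relied on. The spelling is the literal's source text, so converting it to a number is a separate step, and the parser below is only an assumption about which literal forms matter (digit separators, `u`/`l`/`z` suffixes, hex, binary, octal).

```python
def first_token_spelling(cursor):
    """Return the spelling of the first token under `cursor`, or None if there are no tokens."""
    for token in cursor.get_tokens():
        return token.spelling
    return None


def parse_c_integer(text):
    """Convert the source text of a C/C++ integer literal to a Python int."""
    # Drop C++14 digit separators and any trailing u/l/z suffix characters.
    digits = text.replace("'", "").rstrip("uUlLzZ")
    lowered = digits.lower()
    if lowered.startswith(("0x", "0b")):
        return int(digits, 0)   # hexadecimal or binary prefix
    if len(digits) > 1 and digits.startswith("0"):
        return int(digits, 8)   # a leading zero means octal in C/C++
    return int(digits, 10)
```

In the question's script, this would be used on each `INTEGER_LITERAL` child of the field, for example `parse_c_integer(first_token_spelling(child))`, after checking that the spelling is not `None`.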