What kind of data structures are usually utilized for compiled XPath queries? Is it just AST or something else? If so, how is the search in DOM usually implemented? Is it just depth-first search traversing in DOM using parts of the query AST?
It's going to vary considerably from one processor to another. Some may be pure interpreters, some compile down to Java bytecode or similar. There are plenty of open source processors, I suggest you study them.
A processor will typically start by building a syntax tree, then do various semantic checks e.g. that variables are declared, then do some kind of decorating of the tree (for example, adding information derived from static type inferencing), then do various tree-rewriting optimisations such as constant folding (pre-evaluation of constant subexpressions) or loop-lifting, then use an interpreter pattern to evaluate the result.
The faster processors don't search DOM directly, they build an indexed representation of the tree that is more efficient for searching and navigation.