Contributor Documentation
The syntax_tree type is located in the scalpel::cpp namespace. It is a typedef of scalpel::cpp::syntax_nodes::translation_unit, the root node of every syntax tree. All the syntax nodes lie in the scalpel::cpp::syntax_nodes namespace.
There are about two hundred types of syntax nodes. They correspond approximately to the grammar rules defined in the C++ standard document (Annex A: Grammar summary).
All these types can be categorized into six concepts: SequenceNode, ListNode, AlternativeNode, OptionalNode, PredefinedTextNode and LeafNode.
![]() | C++ Concepts |
|---|---|
Unfortunately, concepts are not part of the C++ language, so these six concepts are not formally expressed in the source code. They do exist in their own way, however, through SFINAE and type traits. |
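As a rough illustration of how a concept can be approximated with type traits and SFINAE, here is a minimal sketch. The trait `is_optional_node`, the function `may_be_empty` and the empty class templates are hypothetical names made up for this example; they are not taken from the Scalpel sources.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for two of the node class templates.
template<class T> class optional_node {};
template<class... NodesT> class sequence_node {};

// Type trait: by default, a type does not model OptionalNode...
template<class T>
struct is_optional_node : std::false_type {};

// ...unless it is an instantiation of optional_node.
template<class T>
struct is_optional_node<optional_node<T>> : std::true_type {};

// SFINAE: this overload is only viable for types that model OptionalNode.
template<class NodeT>
typename std::enable_if<is_optional_node<NodeT>::value, bool>::type
may_be_empty(const NodeT&) { return true; }

// Fallback overload for every other node type.
template<class NodeT>
typename std::enable_if<!is_optional_node<NodeT>::value, bool>::type
may_be_empty(const NodeT&) { return false; }

// Usage: the overload is selected at compile time from the node type.
inline void check_concept_dispatch() {
    optional_node<int> maybe;
    sequence_node<int, char> pair;
    assert(may_be_empty(maybe));
    assert(!may_be_empty(pair));
}
```

The concept is never written down as such: it only exists as the set of types for which the trait is true.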
A sequence node is a finite series of N nodes of different types (a kind of tuple). For example, a namespace_definition is a sequence of five nodes: a terminal node whose text is 'namespace', an optional identifier node, and an optional declaration_seq node placed between two other terminal nodes. The namespace_definition node type can be expressed with the following grammar rule (in EBNF format):
namespace_definition = "namespace", [identifier], "{", [declaration_seq], "}";
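To make the tuple analogy concrete, here is a minimal sketch of a sequence node backed by `std::tuple`. It is an assumption about how such a node could be modelled, not Scalpel's implementation: `keyword`, `identifier`, `empty_namespace_definition` and `make_namespace` are hypothetical, and the rule is reduced to four children (the declaration_seq is left out).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical simplified nodes; the real node types are richer.
struct keyword { std::string text; };      // stands in for a predefined text node
struct identifier { std::string value; };  // stands in for the identifier leaf node

// Zero or one node of type T.
template<class T>
class optional_node {
public:
    optional_node() : has_value_(false), value_() {}
    optional_node(const T& v) : has_value_(true), value_(v) {}
    bool has_value() const { return has_value_; }
    const T& value() const { return value_; }
private:
    bool has_value_;
    T value_;
};

// A fixed-length series of nodes of different types, stored in a tuple.
template<class... NodesT>
class sequence_node {
public:
    explicit sequence_node(const NodesT&... nodes) : nodes_(nodes...) {}

    static constexpr std::size_t size() { return sizeof...(NodesT); }

    // Access the I-th child with its exact static type.
    template<std::size_t I>
    const typename std::tuple_element<I, std::tuple<NodesT...>>::type& get() const {
        return std::get<I>(nodes_);
    }
private:
    std::tuple<NodesT...> nodes_;
};

// Reduced rule: "namespace", [identifier], "{", "}"
typedef sequence_node<keyword, optional_node<identifier>, keyword, keyword>
    empty_namespace_definition;

inline empty_namespace_definition make_namespace(const std::string& name) {
    return empty_namespace_definition(
        keyword{"namespace"},
        optional_node<identifier>(identifier{name}),
        keyword{"{"},
        keyword{"}"}
    );
}

inline void sequence_node_example() {
    empty_namespace_definition ns = make_namespace("foo");
    assert(ns.get<1>().has_value());
    assert(ns.get<1>().value().value == "foo");
}
```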
A list node is a series of any number of nodes of the same type, separated by white space or a comma. By convention, node types whose name ends with _list are comma-separated lists, while those whose name ends with _seq are white-space-separated lists. For example, declaration_seq is a list of white-space-separated declaration nodes.
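The _list/_seq convention can be sketched as a single class template whose separator is a template argument. This is only an illustration under assumptions: the `leaf_node` class below is a simplified stand-in, and the `separators` namespace, `name_list` and `name_seq` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified leaf node: a terminal that just stores its text.
class leaf_node {
public:
    explicit leaf_node(const std::string& t) : text_(t) {}
    const std::string& text() const { return text_; }
private:
    std::string text_;
};

namespace separators {
    // Reference template arguments must name objects with linkage.
    extern const leaf_node comma;
    extern const leaf_node space;
    const leaf_node comma(",");
    const leaf_node space(" ");
}

// Any number of nodes of type T, separated by SeparatorNode.
template<class T, const leaf_node& SeparatorNode>
class list_node {
public:
    void push_back(const T& node) { nodes_.push_back(node); }
    std::size_t size() const { return nodes_.size(); }

    // Joins the text of the items, inserting the separator between them.
    std::string text() const {
        std::string result;
        for (std::size_t i = 0; i < nodes_.size(); ++i) {
            if (i != 0) result += SeparatorNode.text();
            result += nodes_[i].text();
        }
        return result;
    }
private:
    std::vector<T> nodes_;
};

typedef list_node<leaf_node, separators::comma> name_list;  // comma-separated
typedef list_node<leaf_node, separators::space> name_seq;   // white-space-separated

inline void list_node_example() {
    name_list names;
    names.push_back(leaf_node("a"));
    names.push_back(leaf_node("b"));
    assert(names.text() == "a,b");
}
```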
An alternative node contains a single node. Each alternative node type defines a list of node types, and the child node must be of one of these types. The declaration node type is an example of an alternative node type:
declaration = block_declaration | function_definition | template_declaration | explicit_instantiation | explicit_specialization | linkage_specification | namespace_definition ;
An optional node contains zero or one node of a given type.
A predefined text node is a terminal node whose value is known at compilation time. All the keywords, operators and other symbols have their corresponding predefined text node.
A leaf node is a terminal node. Variable names, class names, namespace names, labels… are stored in such nodes. The identifier node type we saw in the namespace_definition grammar is a leaf node type.
Every node concept has its corresponding class template (except LeafNode, which is defined by a class): sequence_node, list_node, alternative_node, optional_node and predefined_text_node. They are located in scalpel::cpp::syntax_nodes. Here are their respective declarations:
template<class... NodesT>
class sequence_node;

template<class T, const leaf_node& SeparatorNode>
class list_node;

template<class... NodesT>
class alternative_node;

template<class T>
class optional_node;

template<const std::string& Text>
class predefined_text_node;
![]() | 'class...'? |
|---|---|
Yes, |
Most of the two hundred node types are just typedefs of instances of the above class templates.
Defining a node type from its corresponding grammar is a piece of cake:
//access_specifier = "public" | "protected" | "private";
typedef
    alternative_node
    <
        predefined_text_node<str::public_>,
        predefined_text_node<str::protected_>,
        predefined_text_node<str::private_>
    >
    access_specifier
;
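For readers who want to compile something similar, here is a self-contained sketch of the same typedef. The bodies of `predefined_text_node` and `alternative_node` are assumptions written for this example (the real class templates are not shown in this page), `is_one_of` is a hypothetical helper, and the alternative node is simplified to store only the text of its child.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace str {
    // A reference template argument must name an object with linkage,
    // hence namespace-scope objects declared extern.
    extern const std::string public_;
    extern const std::string protected_;
    extern const std::string private_;
    const std::string public_("public");
    const std::string protected_("protected");
    const std::string private_("private");
}

// Terminal node whose text is fixed by its template argument.
template<const std::string& Text>
class predefined_text_node {
public:
    static const std::string& text() { return Text; }
};

// True if T is one of the types in Ts.
template<class T, class... Ts>
struct is_one_of : std::false_type {};

template<class T, class First, class... Rest>
struct is_one_of<T, First, Rest...>
    : std::integral_constant<bool,
          std::is_same<T, First>::value || is_one_of<T, Rest...>::value> {};

// Simplification: only the child's text is stored, not the child itself.
template<class... NodesT>
class alternative_node {
public:
    template<class NodeT>
    explicit alternative_node(const NodeT& child) : text_(child.text()) {
        static_assert(is_one_of<NodeT, NodesT...>::value,
                      "child type is not one of the alternatives");
    }
    const std::string& text() const { return text_; }
private:
    std::string text_;
};

// access_specifier = "public" | "protected" | "private";
typedef alternative_node<
    predefined_text_node<str::public_>,
    predefined_text_node<str::protected_>,
    predefined_text_node<str::private_>
> access_specifier;

inline void access_specifier_example() {
    predefined_text_node<str::protected_> keyword;
    access_specifier node(keyword);
    assert(node.text() == "protected");
}
```

In this sketch, constructing an access_specifier from a node type that is not listed among the alternatives fails at compile time through the static_assert.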
As I said, most of the node types are typedefs. What about the other types? Well, if you ever draw the dependency diagram of the syntax node types (good luck), you'll notice it's a circular dependency nightmare. Since there's no such thing as a forward typedef in C++, the only way to resolve these circular dependencies is to use the pimpl idiom.
Fortunately, preprocessor macros have been written to keep the code short. One macro generates the pimpl declaration (in the header file) and a second one generates its definition (in the implementation file). There is a pair of macros for sequence_node and another one for alternative_node:
#define SCALPEL_ALTERNATIVE_NODE_PIMPL_DECLARATION(alternative_node_type, type_seq) /*...*/
#define SCALPEL_ALTERNATIVE_NODE_PIMPL_DEFINITION(alternative_node_type, type_seq) /*...*/
#define SCALPEL_SEQUENCE_NODE_PIMPL_DECLARATION(sequence_node_type, type_seq) /*...*/
#define SCALPEL_SEQUENCE_NODE_PIMPL_DEFINITION(sequence_node_type, type_seq) /*...*/
For example, remember the namespace_definition rule:
namespace_definition = "namespace", [identifier], "{", [declaration_seq], "}";
Here is its declaration (in a header file):
SCALPEL_SEQUENCE_NODE_PIMPL_DECLARATION
(
    namespace_definition,
    (predefined_text_node<str::namespace_>)
    (optional_node<identifier>)
    (predefined_text_node<str::opening_brace>)
    (optional_node<declaration_seq>)
    (predefined_text_node<str::closing_brace>)
)
And this is its definition (implementation file):
SCALPEL_SEQUENCE_NODE_PIMPL_DEFINITION
(
    namespace_definition,
    (predefined_text_node<str::namespace_>)
    (optional_node<identifier>)
    (predefined_text_node<str::opening_brace>)
    (optional_node<declaration_seq>)
    (predefined_text_node<str::closing_brace>)
)
![]() | Boost.Preprocessor |
|---|---|
This syntax may look weird to you. If so, you should take a look at the Boost.Preprocessor library documentation. |
The generated code is exactly equivalent to a typedef, except that a forward declaration can be used to resolve circular dependencies.
Actually, you shouldn't really care about this implementation detail. Just remember that if you see such a macro invocation, you can consider it as a typedef.
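If you are curious about why a class succeeds where a typedef fails, the following hand-written sketch shows the idea on a tiny cycle. It is not the code generated by the macros, which is not shown in this page: the rule is simplified so that a declaration_seq holds nested namespaces only, `std::vector` stands in for list_node, and the member names are assumptions.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// --- header part ---

// A class can be forward declared; a typedef cannot.
class namespace_definition;

// The cycle: a declaration_seq contains namespace definitions,
// and a namespace definition contains a declaration_seq.
typedef std::vector<namespace_definition> declaration_seq;

class namespace_definition {
public:
    namespace_definition(const std::string& name, const declaration_seq& body);
    namespace_definition(const namespace_definition& other);
    ~namespace_definition();

    const std::string& name() const;
    const declaration_seq& body() const;
private:
    struct impl;                  // only declared here
    std::unique_ptr<impl> impl_;  // pointer to the hidden implementation
};

// --- implementation part ---

// namespace_definition is complete here, so it can be stored by value.
struct namespace_definition::impl {
    std::string name;
    declaration_seq body;
};

namespace_definition::namespace_definition(const std::string& name,
                                           const declaration_seq& body)
    : impl_(new impl{name, body}) {}

namespace_definition::namespace_definition(const namespace_definition& other)
    : impl_(new impl(*other.impl_)) {}

namespace_definition::~namespace_definition() {}

const std::string& namespace_definition::name() const { return impl_->name; }
const declaration_seq& namespace_definition::body() const { return impl_->body; }

inline void pimpl_example() {
    namespace_definition inner("inner", declaration_seq());
    declaration_seq body;
    body.push_back(inner);
    namespace_definition outer("outer", body);
    assert(outer.body()[0].name() == "inner");
}
```

The pointer breaks the cycle: the header only needs the name of the class, while the full definition of its content is postponed to the implementation part.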
The semantic graph is an interface-based data structure. Unlike in the syntax tree structure, there are no concepts here.
In C++, a type is either a built-in type (int, bool, double, etc.) or a custom type (structures/classes, unions, etc.). These IS-A relationships are naturally translated into inheritance:

It is also possible to create composite types, using pointers, references, arrays and qualifiers such as const and volatile. The corresponding model uses the well-known decorator design pattern (the common implementation of all these decorators is factored out into the type_decorator_impl class):
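The two ideas above (inheritance for the IS-A relationships, decorators for composite types) can be sketched as follows. All class names here are hypothetical and chosen for the example; in particular `decorator_base` only plays the role that the text assigns to type_decorator_impl, whose actual interface is not shown in this page.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Common interface of every type in this sketch.
class type {
public:
    virtual ~type() {}
    virtual std::string name() const = 0;
};

// IS-A relationships: built-in types and class types are both types.
class built_in_type : public type {
public:
    explicit built_in_type(const std::string& n) : name_(n) {}
    std::string name() const override { return name_; }
private:
    std::string name_;
};

class class_type : public type {
public:
    explicit class_type(const std::string& n) : name_(n) {}
    std::string name() const override { return name_; }
private:
    std::string name_;
};

// Part shared by all decorators: each one wraps another type.
class decorator_base : public type {
public:
    explicit decorator_base(std::shared_ptr<const type> inner)
        : decorated_(std::move(inner)) {}
protected:
    const type& decorated() const { return *decorated_; }
private:
    std::shared_ptr<const type> decorated_;
};

class const_type : public decorator_base {
public:
    using decorator_base::decorator_base;
    std::string name() const override { return decorated().name() + " const"; }
};

class pointer_type : public decorator_base {
public:
    using decorator_base::decorator_base;
    std::string name() const override { return decorated().name() + "*"; }
};

class reference_type : public decorator_base {
public:
    using decorator_base::decorator_base;
    std::string name() const override { return decorated().name() + "&"; }
};

// Usage: "pointer to const int" is built by stacking decorators.
inline void decorator_example() {
    std::shared_ptr<const type> int_type = std::make_shared<built_in_type>("int");
    std::shared_ptr<const type> const_int = std::make_shared<const_type>(int_type);
    std::shared_ptr<const type> ptr = std::make_shared<pointer_type>(const_int);
    assert(ptr->name() == "int const*");
}
```

Because a decorator is itself a type, decorators can be stacked in any order and to any depth.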

