Definition: Abstract Syntax Tree

Let $G=(V,T,R,S)$ be a grammar, let $L(G)$ be a language generated from the grammar, and let $\omega\in L$ be a string of the language. An abstract syntax tree of the string $\omega$ with respect to the grammar $G$ (denoted by $\operatorname {AST}_G(\omega)$) is a tree with * the starting symbol $S$ as the tree root, * all non-leave vertices consisting of non-terminal symbols (elements of $V$ in $G$), and * all leave vertices consisting of the characters of the string $\omega.$

Construction of $\operatorname {AST}_G(\omega)$

  1. Start with the starting symbol $S$ as a tree root.
  2. Apply a rule $r\in R$ and expand the tree root1.
  3. In case of the rule $S\to\epsilon$ ($\epsilon$ being the empty string), remove the expanded leaf.
  4. Check if all leaves of $\operatorname {AST}_G(\omega)$ are already non-terminal symbols.
  5. If yes, each leaf corresponds to the terminal symbols (characters of the string $\omega$).
  6. If not, repeat the procedure from the step $2$ for each non-terminal symbol as a new root of another subtree of $\operatorname {AST}_G(\omega).$

Definitions: 1
Examples: 2


Thank you to the contributors under CC BY-SA 4.0!

Github:
bookofproofs


References

Bibliography

  1. Erk, Katrin; Priese, Lutz: "Theoretische Informatik", Springer Verlag, 2000, 2nd Edition
  2. Hoffmann, Dirk: "Theoretische Informatik, 3. Auflage", Hanser, 2015

Footnotes


  1. Note that we still do not know, which rule has to be applied - this will be clarified later when we will be talking about parsers