| % Parse Tree Analysis |
| |
| \textit{NOTE. This section assumes that you are familiar with the Visitor |
| pattern\footnote{See Gamma, Helm, Johnson, and Vlissides, \textit{Design |
| Patterns}} and are familiar with symbol tables\footnote{See Aho, Sethi, and |
| Ullman, \textit{Compilers: Principles, Techniques, and Tools}}.} |
| |
| \section{Visitors} |
| |
| When you want to retrieve structural information about a program, typically |
| it will be done using a Visitor on the parse tree. |
| |
| There are two types of Visitors available in Photran. |
| \begin{itemize} |
| |
| \item A \texttt{ParseTreeVisitor} has a ``callback'' method for each nonterminal |
| in the grammar. The tree is traversed in preorder, and as each node is visited, |
| a call is made to the Visitor method corresponding to the nonterminal for that node. |
| This is usually the preferred Visitor to use. |
| |
| \item A \texttt{GenericParseTreeVisitor} only distinguishes between |
| \texttt{ParseTreeNode}s and \texttt{Token}s. This Visitor should only be used |
| when all internal nodes are treated similarly. If you need to distinguish between |
| the types of internal nodes, use a \texttt{ParseTreeVisitor} instead. |
| |
| \end{itemize} |
| |
| (\texttt{ParseTreeVisitor} is generated by the parser generator and should not |
| be modified.) |
| |
| In addition to the visit methods, both types of Visitors have a |
| \texttt{preparingToVisitChildrenOf} method, which is called after a node has |
| been visited but before its children have, and a \texttt{doneVisitingChildrenOf} |
| method, which is called immediately after its children have been visited but |
| before any sibling nodes have. |
| |
| \subsection{Creating and Using a Visitor} |
| |
| Create a visitor by subclassing \texttt{ParseTreeVisitor} or |
| \texttt{GenericParseTreeVisitor}. By default, all of the visit methods do |
| nothing, so you only have to override the methods that are useful to you. |
| |
| Given the root of the parse tree (i.e., the \texttt{ParseTreeNode} returned |
| by \texttt{FortranProcessor\#parse}), use the \texttt{visitUsing} method |
| to traverse the parse tree; pass your visitor as the only argument. |
| |
| \subsection{Sample Traversals} |
| |
| As an example, consider again the ``do nothing'' program from Figure \ref{parsetree}. |
| |
| When it is visited using a \texttt{GenericParseTreeVisitor}, the Visitor |
| methods are called in this order (methods called between \texttt{preparingToVisitChildrenOf} |
| and \texttt{doneVisitingChildrenOf} are indented to illustrate that child nodes |
| are being visited): |
| |
| \begin{verbatim} |
| visitParseTreeNode(<xExecutableProgram>) |
| preparingToVisitChildrenOf(<xExecutableProgram>) |
| visitParseTreeNode(<xProgramUnit>) |
| preparingToVisitChildrenOf(<xProgramUnit>) |
| visitParseTreeNode(<xMainProgram>) |
| preparingToVisitChildrenOf(<xMainProgram>) |
| visitParseTreeNode(<xProgramStmt>) |
| preparingToVisitChildrenOf(<xProgramStmt>) |
| visitToken(T_PROGRAM: "program") |
| visitParseTreeNode(<xProgramName>) |
| preparingToVisitChildrenOf(<xProgramName>) |
| visitToken(T_IDENT: "Hello") |
| doneVisitingChildrenOf(<xProgramName>) |
| visitToken(T_EOS: (end of line)) |
| doneVisitingChildrenOf(<xProgramStmt>) |
| visitParseTreeNode(<xMainRange>) |
| preparingToVisitChildrenOf(<xMainRange>) |
| visitParseTreeNode(<xEndProgramStmt>) |
| preparingToVisitChildrenOf(<xEndProgramStmt>) |
| visitToken(T_END: "end") |
| visitToken(T_PROGRAM: "program") |
| visitToken(T_EOS: (end of line)) |
| doneVisitingChildrenOf(<xEndProgramStmt>) |
| doneVisitingChildrenOf(<xMainRange>) |
| doneVisitingChildrenOf(<xMainProgram>) |
| doneVisitingChildrenOf(<xProgramUnit>) |
| doneVisitingChildrenOf(<xExecutableProgram>) |
| \end{verbatim} |
| |
| When it is visited using a \texttt{ParseTreeVisitor}, the Visitor |
| methods are called in this order: |
| |
| \begin{verbatim} |
| visitXexecutableprogram(<xExecutableProgram>) |
| preparingToVisitChildrenOf(<xExecutableProgram>) |
| visitXprogramunit(<xProgramUnit>) |
| preparingToVisitChildrenOf(<xProgramUnit>) |
| visitXmainprogram(<xMainProgram>) |
| preparingToVisitChildrenOf(<xMainProgram>) |
| visitXprogramstmt(<xProgramStmt>) |
| preparingToVisitChildrenOf(<xProgramStmt>) |
| visitXprogramname(<xProgramName>) |
| preparingToVisitChildrenOf(<xProgramName>) |
| doneVisitingChildrenOf(<xProgramName>) |
| doneVisitingChildrenOf(<xProgramStmt>) |
| visitXmainrange(<xMainRange>) |
| preparingToVisitChildrenOf(<xMainRange>) |
| visitXendprogramstmt(<xEndProgramStmt>) |
| preparingToVisitChildrenOf(<xEndProgramStmt>) |
| doneVisitingChildrenOf(<xEndProgramStmt>) |
| doneVisitingChildrenOf(<xMainRange>) |
| doneVisitingChildrenOf(<xMainProgram>) |
| doneVisitingChildrenOf(<xProgramUnit>) |
| doneVisitingChildrenOf(<xExecutableProgram>) |
| \end{verbatim} |
| |
| (The Visitors used to generate the output above are stored in |
| a JUnit test class called \texttt{ExampleVisitor}.) |
| |
| \section{The \texttt{ParseTreeSearcher}} |
| |
| Inside Visitor methods, typically you will want to look for particular tokens |
| or syntactic structures. |
| |
| \subsection{An Example} |
| |
| Consider the following (randomly selected) method from \texttt{DeclarationCollector}, |
| one of the classes used for building symbol tables. |
| This class scans a program's parse tree, looking for declarations of programs, functions, |
| block data, namelists, etc., and inserts declarations for them into a symbol table. |
| |
| \begin{verbatim} |
| public void visitXmainprogram(ParseTreeNode node) |
| { |
| // # R1101 desn"t ensure ordering as the standard requires; |
| // <xMainProgram> ::= |
| // <xMainRange> | |
| // <xProgramStmt> <xMainRange> ; |
| // # R1102 |
| // <xProgramStmt> ::= |
| // <xLblDef> T_PROGRAM <xProgramName> T_EOS ; |
| ParseTreeNode programStmt = ParseTreeSearcher.findFirstNodeIn(node, Nonterminal.XPROGRAMSTMT); |
| Token name; |
| if (programStmt != null) |
| name = ParseTreeSearcher.findFirstIdentifierIn(node); |
| else |
| { |
| name = new Token(); |
| name.setText("(Anonymous Main Program)"); |
| } |
| ... |
| \end{verbatim} |
| |
| (The comment lines are copied from the parser generation grammar, \texttt{Fortran95.bnf}.) |
| |
| In Fortran, a main program can either start with a program statement, or it |
| can not. So both of these are valid programs: |
| \begin{verbatim} |
| program SayHi |
| print *, 'Hello!' print *, 'Hello!' |
| end program SayHi end program SayHi |
| \end{verbatim} |
| |
| In this Visitor method, \texttt{node} is guaranteed to be a $<$xMainProgram$>$, |
| due to the name of the method. As shown, by looking at the grammar, we can |
| determine that it will either have one child (a \texttt{ParseTreeNode} corresponding |
| to an $<$xMainRange$>$) or two children (both \texttt{ParseTreeNode}s, the |
| first corresponding to an $<$xProgramStmt$>$ and the second corresponding to |
| an $<$xMainRange$>$. |
| |
| If it \textit{does} have an $<$xProgramStmt$>$, then, a quick look at the rules for |
| $<$xLblDef$>$ (an optional integer label that can appear at the beginning of any |
| Fortran statement) and $<$xProgramName$>$ will make it evident that the first |
| identifier token (T\_IDENT) under the $<$xProgramStmt$>$ is the name of the program. |
| |
| So now we can understand what this method does. |
| \begin{itemize} |
| |
| \item It checks to see if the main program has a $<$xProgramStmt$>$. This is done |
| by calling \texttt{ParseTreeSearcher\#findFirstNodeIn}, and telling it we want to |
| find the first $<$xProgramStmt$>$ that is a child of \texttt{node}. It will either |
| return a \texttt{ParseTreeNode} corresponding to an $<$xProgramStmt$>$, or if it |
| can't find one, it will return \texttt{null}. |
| |
| \item If there is an $<$xProgramStmt$>$, it grabs the first T\_IDENT token, which |
| tells the name the user gave to the program. |
| |
| \item If there was no $<$xProgramStmt$>$, the program does not have a name, so we |
| ``fake'' a token and name the program ``(Anonymous Main Program).'' |
| \end{itemize} |
| |
| \subsection{For More Information} |
| |
| So, essentially, anytime you have a parse tree (or a subtree of the parse tree) |
| and you want to find a particular node or token, use one of the methods in |
| \texttt{ParseTreeSearcher}. If there isn't one, you may have to write one yourself. |
| Be sure to look at the JavaDoc for the methods in that class; unless you are |
| doing something bizarre, one of the existing methods should work. |