org.eclipse.photran-dev-docs/misc-notes/old-dev-guide-chapters/inc-07-analysis.tex - gerrit/ptp/org.eclipse.photran - Git at Google

 % Parse Tree Analysis

 \textit{NOTE.  This section assumes that you are familiar with the Visitor
 pattern\footnote{See Gamma, Helm, Johnson, and Vlissides, \textit{Design
 Patterns}} and are familiar with symbol tables\footnote{See Aho, Sethi, and
 Ullman, \textit{Compilers: Principles, Techniques, and Tools}}.}

 \section{Visitors}

 When you want to retrieve structural information about a program, typically
 it will be done using a Visitor on the parse tree.

 There are two types of Visitors available in Photran.
 \begin{itemize}

 \item A \texttt{ParseTreeVisitor} has a ``callback'' method for each nonterminal
 in the grammar.  The tree is traversed in preorder, and as each node is visited,
 a call is made to the Visitor method corresponding to the nonterminal for that node.
 This is usually the preferred Visitor to use.

 \item A \texttt{GenericParseTreeVisitor} only distinguishes between
 \texttt{ParseTreeNode}s and \texttt{Token}s.  This Visitor should only be used
 when all internal nodes are treated similarly.  If you need to distinguish between
 the types of internal nodes, use a \texttt{ParseTreeVisitor} instead.

 \end{itemize}

 (\texttt{ParseTreeVisitor} is generated by the parser generator and should not
 be modified.)

 In addition to the visit methods, both types of Visitors have a
 \texttt{preparingToVisitChildrenOf} method, which is called after a node has
 been visited but before its children have, and a \texttt{doneVisitingChildrenOf}
 method, which is called immediately after its children have been visited but
 before any sibling nodes have.

 \subsection{Creating and Using a Visitor}

 Create a visitor by subclassing \texttt{ParseTreeVisitor} or
 \texttt{GenericParseTreeVisitor}.  By default, all of the visit methods do
 nothing, so you only have to override the methods that are useful to you.

 Given the root of the parse tree (i.e., the \texttt{ParseTreeNode} returned
 by \texttt{FortranProcessor\#parse}), use the \texttt{visitUsing} method
 to traverse the parse tree; pass your visitor as the only argument.

 \subsection{Sample Traversals}

 As an example, consider again the ``do nothing'' program from Figure \ref{parsetree}.

 When it is visited using a \texttt{GenericParseTreeVisitor}, the Visitor
 methods are called in this order (methods called between \texttt{preparingToVisitChildrenOf}
 and \texttt{doneVisitingChildrenOf} are indented to illustrate that child nodes
 are being visited):

 \begin{verbatim}
 visitParseTreeNode(<xExecutableProgram>)
 preparingToVisitChildrenOf(<xExecutableProgram>)
   visitParseTreeNode(<xProgramUnit>)
   preparingToVisitChildrenOf(<xProgramUnit>)
     visitParseTreeNode(<xMainProgram>)
     preparingToVisitChildrenOf(<xMainProgram>)
       visitParseTreeNode(<xProgramStmt>)
       preparingToVisitChildrenOf(<xProgramStmt>)
         visitToken(T_PROGRAM: "program")
         visitParseTreeNode(<xProgramName>)
         preparingToVisitChildrenOf(<xProgramName>)
           visitToken(T_IDENT: "Hello")
         doneVisitingChildrenOf(<xProgramName>)
         visitToken(T_EOS: (end of line))
       doneVisitingChildrenOf(<xProgramStmt>)
       visitParseTreeNode(<xMainRange>)
       preparingToVisitChildrenOf(<xMainRange>)
         visitParseTreeNode(<xEndProgramStmt>)
         preparingToVisitChildrenOf(<xEndProgramStmt>)
           visitToken(T_END: "end")
           visitToken(T_PROGRAM: "program")
           visitToken(T_EOS: (end of line))
         doneVisitingChildrenOf(<xEndProgramStmt>)
       doneVisitingChildrenOf(<xMainRange>)
     doneVisitingChildrenOf(<xMainProgram>)
   doneVisitingChildrenOf(<xProgramUnit>)
 doneVisitingChildrenOf(<xExecutableProgram>)
 \end{verbatim}

 When it is visited using a \texttt{ParseTreeVisitor}, the Visitor
 methods are called in this order:

 \begin{verbatim}
 visitXexecutableprogram(<xExecutableProgram>)
 preparingToVisitChildrenOf(<xExecutableProgram>)
   visitXprogramunit(<xProgramUnit>)
   preparingToVisitChildrenOf(<xProgramUnit>)
     visitXmainprogram(<xMainProgram>)
     preparingToVisitChildrenOf(<xMainProgram>)
       visitXprogramstmt(<xProgramStmt>)
       preparingToVisitChildrenOf(<xProgramStmt>)
         visitXprogramname(<xProgramName>)
         preparingToVisitChildrenOf(<xProgramName>)
         doneVisitingChildrenOf(<xProgramName>)
       doneVisitingChildrenOf(<xProgramStmt>)
       visitXmainrange(<xMainRange>)
       preparingToVisitChildrenOf(<xMainRange>)
         visitXendprogramstmt(<xEndProgramStmt>)
         preparingToVisitChildrenOf(<xEndProgramStmt>)
         doneVisitingChildrenOf(<xEndProgramStmt>)
       doneVisitingChildrenOf(<xMainRange>)
     doneVisitingChildrenOf(<xMainProgram>)
   doneVisitingChildrenOf(<xProgramUnit>)
 doneVisitingChildrenOf(<xExecutableProgram>)
 \end{verbatim}

 (The Visitors used to generate the output above are stored in
 a JUnit test class called \texttt{ExampleVisitor}.)

 \section{The \texttt{ParseTreeSearcher}}

 Inside Visitor methods, typically you will want to look for particular tokens
 or syntactic structures.

 \subsection{An Example}

 Consider the following (randomly selected) method from \texttt{DeclarationCollector},
 one of the classes used for building symbol tables.
 This class scans a program's parse tree, looking for declarations of programs, functions,
 block data, namelists, etc., and inserts declarations for them into a symbol table.

 \begin{verbatim}
     public void visitXmainprogram(ParseTreeNode node)
     {
         // # R1101 desn"t ensure ordering as the standard requires;
         // <xMainProgram> ::=
         //   <xMainRange> |
         //   <xProgramStmt> <xMainRange> ;
         // # R1102
         // <xProgramStmt> ::=
         //   <xLblDef> T_PROGRAM <xProgramName> T_EOS ;
         ParseTreeNode programStmt = ParseTreeSearcher.findFirstNodeIn(node, Nonterminal.XPROGRAMSTMT);
         Token name;
         if (programStmt != null)
 	        name = ParseTreeSearcher.findFirstIdentifierIn(node);
         else
         {
 	        name = new Token();
 	        name.setText("(Anonymous Main Program)");
         }
         ...
 \end{verbatim}

 (The comment lines are copied from the parser generation grammar, \texttt{Fortran95.bnf}.)

 In Fortran, a main program can either start with a program statement, or it
 can not.  So both of these are valid programs:
 \begin{verbatim}
 program SayHi
   print *, 'Hello!'                             print *, 'Hello!'
 end program SayHi                             end program SayHi
 \end{verbatim}

 In this Visitor method, \texttt{node} is guaranteed to be a $<$xMainProgram$>$,
 due to the name of the method.  As shown, by looking at the grammar, we can
 determine that it will either have one child (a \texttt{ParseTreeNode} corresponding
 to an $<$xMainRange$>$) or two children (both \texttt{ParseTreeNode}s, the
 first corresponding to an $<$xProgramStmt$>$ and the second corresponding to
 an $<$xMainRange$>$.

 If it \textit{does} have an $<$xProgramStmt$>$, then, a quick look at the rules for
 $<$xLblDef$>$ (an optional integer label that can appear at the beginning of any
 Fortran statement) and $<$xProgramName$>$ will make it evident that the first
 identifier token (T\_IDENT) under the $<$xProgramStmt$>$ is the name of the program.

 So now we can understand what this method does.
 \begin{itemize}

 \item It checks to see if the main program has a $<$xProgramStmt$>$.  This is done
 by calling \texttt{ParseTreeSearcher\#findFirstNodeIn}, and telling it we want to
 find the first $<$xProgramStmt$>$ that is a child of \texttt{node}.  It will either
 return a \texttt{ParseTreeNode} corresponding to an $<$xProgramStmt$>$, or if it
 can't find one, it will return \texttt{null}.

 \item If there is an $<$xProgramStmt$>$, it grabs the first T\_IDENT token, which
 tells the name the user gave to the program.

 \item If there was no $<$xProgramStmt$>$, the program does not have a name, so we
 ``fake'' a token and name the program ``(Anonymous Main Program).''
 \end{itemize}

 \subsection{For More Information}

 So, essentially, anytime you have a parse tree (or a subtree of the parse tree)
 and you want to find a particular node or token, use one of the methods in
 \texttt{ParseTreeSearcher}.  If there isn't one, you may have to write one yourself.
 Be sure to look at the JavaDoc for the methods in that class; unless you are
 doing something bizarre, one of the existing methods should work.
	% Parse Tree Analysis

	\textit{NOTE. This section assumes that you are familiar with the Visitor
	pattern\footnote{See Gamma, Helm, Johnson, and Vlissides, \textit{Design
	Patterns}} and are familiar with symbol tables\footnote{See Aho, Sethi, and
	Ullman, \textit{Compilers: Principles, Techniques, and Tools}}.}

	\section{Visitors}

	When you want to retrieve structural information about a program, typically
	it will be done using a Visitor on the parse tree.

	There are two types of Visitors available in Photran.
	\begin{itemize}

	\item A \texttt{ParseTreeVisitor} has a ``callback'' method for each nonterminal
	in the grammar. The tree is traversed in preorder, and as each node is visited,
	a call is made to the Visitor method corresponding to the nonterminal for that node.
	This is usually the preferred Visitor to use.

	\item A \texttt{GenericParseTreeVisitor} only distinguishes between
	\texttt{ParseTreeNode}s and \texttt{Token}s. This Visitor should only be used
	when all internal nodes are treated similarly. If you need to distinguish between
	the types of internal nodes, use a \texttt{ParseTreeVisitor} instead.

	\end{itemize}

	(\texttt{ParseTreeVisitor} is generated by the parser generator and should not
	be modified.)

	In addition to the visit methods, both types of Visitors have a
	\texttt{preparingToVisitChildrenOf} method, which is called after a node has
	been visited but before its children have, and a \texttt{doneVisitingChildrenOf}
	method, which is called immediately after its children have been visited but
	before any sibling nodes have.

	\subsection{Creating and Using a Visitor}

	Create a visitor by subclassing \texttt{ParseTreeVisitor} or
	\texttt{GenericParseTreeVisitor}. By default, all of the visit methods do
	nothing, so you only have to override the methods that are useful to you.

	Given the root of the parse tree (i.e., the \texttt{ParseTreeNode} returned
	by \texttt{FortranProcessor\#parse}), use the \texttt{visitUsing} method
	to traverse the parse tree; pass your visitor as the only argument.

	\subsection{Sample Traversals}

	As an example, consider again the ``do nothing'' program from Figure \ref{parsetree}.

	When it is visited using a \texttt{GenericParseTreeVisitor}, the Visitor
	methods are called in this order (methods called between \texttt{preparingToVisitChildrenOf}
	and \texttt{doneVisitingChildrenOf} are indented to illustrate that child nodes
	are being visited):

	\begin{verbatim}
	visitParseTreeNode(<xExecutableProgram>)
	preparingToVisitChildrenOf(<xExecutableProgram>)
	visitParseTreeNode(<xProgramUnit>)
	preparingToVisitChildrenOf(<xProgramUnit>)
	visitParseTreeNode(<xMainProgram>)
	preparingToVisitChildrenOf(<xMainProgram>)
	visitParseTreeNode(<xProgramStmt>)
	preparingToVisitChildrenOf(<xProgramStmt>)
	visitToken(T_PROGRAM: "program")
	visitParseTreeNode(<xProgramName>)
	preparingToVisitChildrenOf(<xProgramName>)
	visitToken(T_IDENT: "Hello")
	doneVisitingChildrenOf(<xProgramName>)
	visitToken(T_EOS: (end of line))
	doneVisitingChildrenOf(<xProgramStmt>)
	visitParseTreeNode(<xMainRange>)
	preparingToVisitChildrenOf(<xMainRange>)
	visitParseTreeNode(<xEndProgramStmt>)
	preparingToVisitChildrenOf(<xEndProgramStmt>)
	visitToken(T_END: "end")
	visitToken(T_PROGRAM: "program")
	visitToken(T_EOS: (end of line))
	doneVisitingChildrenOf(<xEndProgramStmt>)
	doneVisitingChildrenOf(<xMainRange>)
	doneVisitingChildrenOf(<xMainProgram>)
	doneVisitingChildrenOf(<xProgramUnit>)
	doneVisitingChildrenOf(<xExecutableProgram>)
	\end{verbatim}

	When it is visited using a \texttt{ParseTreeVisitor}, the Visitor
	methods are called in this order:

	\begin{verbatim}
	visitXexecutableprogram(<xExecutableProgram>)
	preparingToVisitChildrenOf(<xExecutableProgram>)
	visitXprogramunit(<xProgramUnit>)
	preparingToVisitChildrenOf(<xProgramUnit>)
	visitXmainprogram(<xMainProgram>)
	preparingToVisitChildrenOf(<xMainProgram>)
	visitXprogramstmt(<xProgramStmt>)
	preparingToVisitChildrenOf(<xProgramStmt>)
	visitXprogramname(<xProgramName>)
	preparingToVisitChildrenOf(<xProgramName>)
	doneVisitingChildrenOf(<xProgramName>)
	doneVisitingChildrenOf(<xProgramStmt>)
	visitXmainrange(<xMainRange>)
	preparingToVisitChildrenOf(<xMainRange>)
	visitXendprogramstmt(<xEndProgramStmt>)
	preparingToVisitChildrenOf(<xEndProgramStmt>)
	doneVisitingChildrenOf(<xEndProgramStmt>)
	doneVisitingChildrenOf(<xMainRange>)
	doneVisitingChildrenOf(<xMainProgram>)
	doneVisitingChildrenOf(<xProgramUnit>)
	doneVisitingChildrenOf(<xExecutableProgram>)
	\end{verbatim}

	(The Visitors used to generate the output above are stored in
	a JUnit test class called \texttt{ExampleVisitor}.)

	\section{The \texttt{ParseTreeSearcher}}

	Inside Visitor methods, typically you will want to look for particular tokens
	or syntactic structures.

	\subsection{An Example}

	Consider the following (randomly selected) method from \texttt{DeclarationCollector},
	one of the classes used for building symbol tables.
	This class scans a program's parse tree, looking for declarations of programs, functions,
	block data, namelists, etc., and inserts declarations for them into a symbol table.

	\begin{verbatim}
	public void visitXmainprogram(ParseTreeNode node)
	{
	// # R1101 desn"t ensure ordering as the standard requires;
	// <xMainProgram> ::=
	// <xMainRange> \|
	// <xProgramStmt> <xMainRange> ;
	// # R1102
	// <xProgramStmt> ::=
	// <xLblDef> T_PROGRAM <xProgramName> T_EOS ;
	ParseTreeNode programStmt = ParseTreeSearcher.findFirstNodeIn(node, Nonterminal.XPROGRAMSTMT);
	Token name;
	if (programStmt != null)
	name = ParseTreeSearcher.findFirstIdentifierIn(node);
	else
	{
	name = new Token();
	name.setText("(Anonymous Main Program)");
	}
	...
	\end{verbatim}

	(The comment lines are copied from the parser generation grammar, \texttt{Fortran95.bnf}.)

	In Fortran, a main program can either start with a program statement, or it
	can not. So both of these are valid programs:
	\begin{verbatim}
	program SayHi
	print , 'Hello!' print , 'Hello!'
	end program SayHi end program SayHi
	\end{verbatim}

	In this Visitor method, \texttt{node} is guaranteed to be a $<$xMainProgram$>$,
	due to the name of the method. As shown, by looking at the grammar, we can
	determine that it will either have one child (a \texttt{ParseTreeNode} corresponding
	to an $<$xMainRange$>$) or two children (both \texttt{ParseTreeNode}s, the
	first corresponding to an $<$xProgramStmt$>$ and the second corresponding to
	an $<$xMainRange$>$.

	If it \textit{does} have an $<$xProgramStmt$>$, then, a quick look at the rules for
	$<$xLblDef$>$ (an optional integer label that can appear at the beginning of any
	Fortran statement) and $<$xProgramName$>$ will make it evident that the first
	identifier token (T\_IDENT) under the $<$xProgramStmt$>$ is the name of the program.

	So now we can understand what this method does.
	\begin{itemize}

	\item It checks to see if the main program has a $<$xProgramStmt$>$. This is done
	by calling \texttt{ParseTreeSearcher\#findFirstNodeIn}, and telling it we want to
	find the first $<$xProgramStmt$>$ that is a child of \texttt{node}. It will either
	return a \texttt{ParseTreeNode} corresponding to an $<$xProgramStmt$>$, or if it
	can't find one, it will return \texttt{null}.

	\item If there is an $<$xProgramStmt$>$, it grabs the first T\_IDENT token, which
	tells the name the user gave to the program.

	\item If there was no $<$xProgramStmt$>$, the program does not have a name, so we
	``fake'' a token and name the program ``(Anonymous Main Program).''
	\end{itemize}

	\subsection{For More Information}

	So, essentially, anytime you have a parse tree (or a subtree of the parse tree)
	and you want to find a particular node or token, use one of the methods in
	\texttt{ParseTreeSearcher}. If there isn't one, you may have to write one yourself.
	Be sure to look at the JavaDoc for the methods in that class; unless you are
	doing something bizarre, one of the existing methods should work.