blob: 6d555e8ddd2cb422366a51194b0482456e3c8035 [file] [log] [blame]
% Refactoring
\vspace{-0.5in}
{\scriptsize Revision: \footnotesize \$Id: cha-refactoring.ltx-inc,v 1.3 2010/06/02 21:25:28 joverbey Exp $ $ - based on 2008/08/08 nchen}
\section{Introduction}
A refactoring is a program transformation to improve the quality of the source
code by making it easier to understand and modify. A refactoring is a special
kind of transformation because it preserves the \emph{observable behavior} of
your program -- it neither removes nor adds any functionality.\footnote{For more
information see
\textit{\href{http://www.amazon.com/Refactoring-Improving-Existing-Addison-Wesley-Technology/dp/0201485672}{Refactoring:
Improving the Design of Existing Code}}}.
The main purpose of writing
Photran was to create a refactoring tool for Fortran. The Eclipse Platform provides some
language-neutral components that allow Photran's refactorings to resemble the
refactorings in the Java Development Tools, at least in terms of their user interface.
However, the \textit{language infrastructure} for refactoring---the program representation
and the APIs for manipulating Fortran source code---are all unique to Photran.
Most of the code in a refactoring is dedicated to program analysis and source code
transformation; the APIs for doing this were described in the previous chapter on the
AST and VPG. In this chapter, we describe how automated refactorings are structured
and how they can be added to the Eclipse API.
\section{Overview}
{\footnotesize
\textit{This section is excerpted from
M.~M\'endez, J.~Overbey, A.~Garrido, F.~Tinetti, and R.~Johnson,
``A Catalog and Classification of Fortran Refactorings'' (2010).}
}
One of Photran's design objectives has been to make adding new refactorings as painless as possible. Often, it is possible to add a new refactoring by implementing a single Java class and adding one line of XML to a configuration file.
Photran divides refactorings into two categories. An \textbf{editor-based refactoring} requires the user to select part of a Fortran program (for example, he may select the header of a do loop) in a text editor in order to initiate the refactoring. A \textbf{resource refactoring} applies to entire files; the user can select several files at once, or even entire folders or projects, and the refactoring will be applied to all of the selected Fortran source files at once.
To create a new refactoring, the developer must decide whether it will be an editor-based refactoring or a resource refactoring. Photran provides different superclasses for each. The developer then creates a concrete subclass and adds a line of XML to a configuration file to make Photran aware of the new refactoring. The concrete subclass must define methods which:
\begin{itemize}
\item provide the name of the refactoring. This becomes its label in the Refactor menu, and it is also used to describe the refactoring in the Edit $>$ Undo menu item and in other user interface elements.
\item check initial preconditions. These are usually simple checks which verify that the user selected the correct construct in the editor, that the file is not read-only, etc.
\item acquire user input. For example, a refactoring to add a parameter to a subprogram must ask the user to supply the new parameter's name and type.
\item check final preconditions. These validate user input and perform any additional checks necessary to ensure that the transformation can be performed, the resulting code will compile, and it will retain the behavior of the original program.
\item perform the transformation. Once all preconditions have been checked, this method determines what files will be changed, and how.
\end{itemize}
Thanks to Java's reflective facilities, much of the user interface for a refactoring comes ``for free.'' Photran automatically adds the refactoring to the appropriate parts of the Eclipse user interface, and it provides a wizard-style dialog box which allows the user to interact with the refactoring. This dialog includes a \textit{diff}-like preview, which allows the user to see what changes the refactoring will make before committing it.
\section{Getting Started}
Perhaps the best way to create a new refactoring is to start with an existing refactoring and
modify it to do what you want. The following refactorings are very simple and should be used as
starting points:
\begin{itemize}
\item \textit{InterchangeLoopsRefactoring} -- An editor-based refactoring that swaps the headers of two perfectly-nested \texttt{DO}-loops
\item \textit{RepObsOpersRefactoring} -- A resource refactoring that replaces old-style operators like \texttt{.LT.} and \texttt{.EQ.} with their newer, symbolic counterparts (i.e., \texttt{$<$} and \texttt{==}).
\item \textit{KeywordCaseRefactoring} -- A resource refactoring that changes all keywords to either upper- or lower-case.
\end{itemize}
All of these refactorings make fairly simple replacements in an AST. A slightly less trivial
example is \textit{IntroImplicitNoneRefactoring}, which illustrates how to insert new code into
an AST.
\textit{\textbf{Hint:} To find any of these classes quickly, click on Navigate $>$ Open Type in the
Java perspective, and start typing the class name.}
\section{How to Create a Refactoring}
\begin{enumerate}
\item
Decide whether your refactoring will be an editor-based refactoring or a resource refactoring, and then
create a new subclass of either \textit{FortranEditorRefactoring} or
\textit{FortranResourceRefactoring}.\footnote{Both of these classes inherit from
of an Eclipse Platform class called \textit{Refactoring}.
The \textit{Refactoring} class is provided by a Platform
component called the Eclipse Language ToolKit, or LTK. The
LTK is a language-neutral API for supporting refactorings in the Eclipse
environment. It provides the wizard dialog used by most refactorings,
including the diff view shown in the preview page. This allows refactorings
for Java, C/C++, and Fortran to all have the same look and feel. See
``\href{http://www.eclipse.org/articles/Article-LTK/ltk.html}{The Language
Toolkit: An API for Automated Refactorings in Eclipse-based IDEs}'' for an
introduction to the LTK.}
Use one of the simpler refactorings listed above as a starting point, and look at the JavaDoc
for each method you must override.
\item
Add the refactoring to the \texttt{plugin.xml} file. Be sure that you give the correct
fully-qualified name for your refactoring class, and be sure you correctly identify it as
either a resource refactoring or an editor refactoring based on its superclass!
\item
When you run Photran, your refactoring should now be visible in the user interface.
\end{enumerate}
Refactoring classes inherit a large number of
\texttt{protected} utility methods common among refactorings, such as a method
to determine if a token is a uniquely-bound identifier, a method to parse
fragments of code that are not complete programs, and a \textit{fail} method
which is simply shorthand for throwing a \textit{PreconditionFailure}.
Before writing your own utility methods, use content assist to scan through the
list of methods available, or look at a refactoring with similar preconditions
to determine if the functionality you need already exists.
\section{Changing Source Code: AST Rewriting}
\label{sec:ast_rewriting}
Instead of manipulating the text in the source files directly, Photran's refactorings modify
source code by manipulating the program's abstract syntax tree (AST), and then
reproducing source code from the modified AST.
The previous chapter provided a general introduction to Photran's AST. (If you have not read it,
please do so before continuing.) The remainder of this section will discuss how the AST
is used in refactorings.
\subsection{Whitetext Affixes}
Photran's AST is \textit{concretized} as described in J.~Overbey and R.~Johnson, ``Generating
Rewritable Abstract Syntax Trees'' (SLE 2008). Its API appears to be that of an ordinary AST,
but in fact every token returned by the lexer is actually stored in the AST. Moreover, every
token has whitetext (comments, whitespace, line continuations, etc.) affixed to it.
In general, all of the comments, whitespace, etc.~are affixed to the \textit{following} token.
(This works well in Photran since it treats a newline as its own token.)
This whitetext can be retrieved by invoking \texttt{token.getWhiteBefore()}. However, if there
is whitetext at the \textit{end} of a file, it is affixed to the \textit{last} token in the file
and can be retrieved by invoking \texttt{token.getWhiteAfter()}.
\subsection{Mutable AST}
Photran's AST is \textit{mutable.} When you want to change the source code produced by the AST,
you modify the AST itself. That is, you add, remove, or replace parts of the AST as necessary to
effect the change you want. (Note that this is \textit{different} from the Eclipse JDT and CDT,
which use an \textit{ASTRewriter} to \textit{describe} source code changes in terms of AST nodes
but do not actually modify the AST.)
Note that the \textit{toString()} method can be invoked on any AST node to display its source code.
This means that, if you are stepping through a refactoring in the debugger, you can watch the
AST (or any subtree) to see the effects of each change that is made to it.
\subsection{Modifying, Moving, Removing, and Replacing AST Nodes}
In the org.eclipse.photran.internal.core.refactoring package, there is a class
called \textit{\_AST\_VPG\_HOWTO} which contains several small ``snippets''
illustrating how to perform common tasks with the AST/VPG. This includes how
to traverse the AST using a visitor, how to find nodes of a particular type,
how to find all of the references to a particular name, etc. It also includes
several examples of how to move, copy and delete nodes in an AST.
\subsection{Inserting new AST Nodes}
Some refactorings require inserting new AST nodes into the current program. For
instance, the ``Intro Implicit None Refactoring'' inserts new declaration
statements to make the type of each variable more explicit.
There are \emph{three} steps involved in inserting a new AST node:
\begin{enumerate}
\item Constructing the new AST node.
\item Inserting the new AST node into the correct place.
\item Re-indenting the new AST node to fit within the current file.
\end{enumerate}
\paragraph{Constructing the new AST node} A refactoring class inherits several
convenience methods for constructing new AST nodes. These are all named \textit{parseLiteral\dots}.
For instance,
the \textit{parseLiteralStatement} methods constructs a list of AST nodes for
use in the ``Intro Implicit None'' refactoring.
\paragraph{Inserting the new AST node} Examples for how to insert the new AST node are
included in the \textit{\_AST\_VPG\_HOWTO} class mentioned above.
\paragraph{Re-indenting the new AST node} It might be necessary to re-indent the
newly inserted AST node so that it conforms with the indentation at its
insertion point. The \textit{Reindenter} utility class provides the static
method \textit{reindent} to perform this task.
\subsection{Committing Changes}
After all of the changes have been made to a file's AST,
\textit{addChangeFromModifiedAST} has to be invoked to %actually
%commit the changes. This convenience function creates a new
%\textit{TextFileChange} for the \emph{entire} content of the file. The
%underlying Eclipse infrastructure performs a \textit{diff} internally to
%determine what parts have actually changed and present those changes to the user
%in the preview dialog.
inform the Eclipse refactoring infrastructure that the source code should
be replaced with the revised source code in the modified AST. \textit{If
you forget to invoke this method, your refactoring will appear to do
nothing!}
\section{Caveats}
\label{sec:refactoring_caveats}
\textbf{CAUTION:} Internally, the AST is changed only enough to reproduce
correct source code. After making changes to an AST, most of the accessor
methods on \textit{Token}s (\textit{getLine(), getOffset(),} etc.) will return
\textit{incorrect} or \emph{null} values.
Therefore, \textit{all program analysis should be done first}. Affected
\textit{Token}s should be recorded as \textit{TokenRef}s and affected
\textit{Definition}s should be stored
\textit{prior} to making any modifications to the AST. In general, ensure that
all analyses (and storing of important information from \textit{Token}s) is
done in the \textit{doCheckInitialConditions} and
\textit{doCheckFinalConditions} methods of your refactoring \textit{before} the
\textit{doCreateChange} method is invoked and before the AST is modified in any way.
\vspace{-0.2in}
\subsection{\textit{Token} or \textit{TokenRef}?}
\label{sub:token_or_tokenref}
\textit{Token}s form the leaves of the AST -- therefore they exist as part of
the Fortran AST. Essentially this means that holding on to a reference to a
\textit{Token} object requires the entire AST to be present in memory.
\textit{TokenRef}s are lightweight descriptions of tokens in an AST. They
contain only three fields: filename, offset and length. These three fields
uniquely identify a particular token in a file. Because they are not part of the
AST, storing a \textit{TokenRef} does not require the entire AST to be present
in memory.
If a refactoring only modifies one or two files,
using either \textit{Token}s or \textit{TokenRef}s does
not make much of a difference. However, in a refactoring like ``Rename''
that could potentially modify hundreds of files, it is impractical
to store all ASTs in memory at once. Because of the complexity of the Fortran
language itself, its ASTs can be rather large and complex. Therefore storing
references to \textit{TokenRef}s would minimize the number of ASTs that must be
in memory.
To retrieve an actual \textit{Token} from a \textit{TokenRef}, call the
\textit{findToken()} method in \textit{PhotranTokenRef}, a subclass
of \textit{TokenRef}.
To create a \textit{TokenRef} from an actual \textit{Token}, call the \textit{getTokenRef} method in \textit{Token}.
\newif\ifzzz
\ifzzz
\noindent \textbf{In an AST, how do I find an ancestor node that is of a
particular type?}
\\ Sometimes it might be necessary to traverse the AST \emph{upwards} to look
for an ancestor node of a particular type. Instead of traversing the AST
manually, you should call the \textit{findNearestAncestor(TargetASTNode.class)}
method on a \textit{Token} and pass it the \textbf{class} of the ASTNode that
you are looking for.
\noindent \textbf{How would I create a new AST node from a string?}
\\ Call the \textit{parseLiteralStatement(String string)} or
\textit{parseLiteralStatementSequence(String string)} method in
\textit{AbstractFortranRefactoring}. The former takes a \textit{String} that represents
a single statement while the latter takes a \textit{String} that represents a
sequence of statements.
\noindent \textbf{How do I print the text of an AST node and all its children
nodes?}
\\ Call the \textit{SourcePrinter.getSourceCodeFromASTNode(IASTNode node)}
method. This method returns a \textit{String} representing the source code of
its parameter; it includes the user's comments, capitalization and whitespace.
\fi
\section{Testing a Refactoring}
Writing JUnit tests for a refactoring is simple (starting with Photran 6 anyway).
Any of the tests suites in the org.eclipse.photran.internal.tests.refactoring
package can be used as examples.
Each refactoring has a corresponding test suite in that package. The test suite
is a class which inherits from \textit{PhotranRefactoringTestSuiteFromMarkers}.
The test suite is constructed by importing files from a directory in the
source tree, searching its \textit{markers} (comma-separated lines starting with
\texttt{!$<<<<<$}), and adding one test case to the suite for each marker.
Markers are expected to have one of the following forms:
\begin{enumerate}
\item \texttt{!$<<<<<$}) \textit{line, col, \dots}
\item \texttt{!$<<<<<$}) \textit{fromLine, fromCol, toLine, toCol, \dots}
\end{enumerate}
That is, the first two fields in each marker are expected to be a line and column number; the
text selection passed to the refactoring will be the offset of that line and column. The third
fourth fields may also be a line and column number; then, the selection passed to the refactoring
will extend from the first line/column to the second line/column.
The line and column numbers may be followed by an arbitrary number of fields that contain data
specific to the refactoring being invoked. Many refactorings don't require any additional data;
the Extract Local Variable test suite uses one field for the new variable declaration; the Add
ONLY to USE Statement test suite uses these fields to list the module entities to add; etc.
The final field must be either \texttt{pass}, \texttt{fail-initial}, or
\texttt{fail-final}, indicating whether the refactoring should succeed,
ail its initial precondition check, or fail its final precondition check.
If the refactoring is expected to succeed, the Fortran program will be compiled and run before
and after the refactoring in order to ensure that the refactoring actually preserved behavior.
To create a test suite for a new refactoring:
\begin{enumerate}
\item In the org.eclipse.photran.internal.tests.refactoring package,
copy an existing test suite class, like DataToParameterTestSuite.java.
\item Change the name of the Refactoring class in the list of type parameters and in the constructor.
\item If your markers will contain information other than fromLine, fromCol, toLine, toCol, and pass/fail, override the \textit{configureRefactoring} method.
See \textit{ExtractLocalVariableTestSuite} for an example.
\item Inside the \textit{refactoring-test-code} folder, create a new subfolder named after your
refactoring. We will call this the \textit{base folder.}
In the test suite class, change the string variable \texttt{DIR} to point to the
base folder.
\item Inside the base folder, create one or more subfolders. Each subfolder will
be imported into the runtime workspace as a single Eclipse project, so it may
contain only one Fortran file, or it may contain several.
The refactoring will be run according to the markers in its files.
Remember, markers have the basic form
\texttt{!$<<<<<$} \textit{startLine, startCol, endLine, endCol, result.}
\item If there is only one marker anywhere in the subfolder, and it indicates that
the refactoring should succeed, you may also create
a \textit{.result} file. For example, if the folder contains \textit{hello.f90},
it may also contain \textit{hello.f90.result}. This file should contain the
\textit{exact text} the file is expected to have after the refactoring has been
performed. The \textit{.result} file will be compared character-by-character with
the result of the refactoring, so spaces, tabs, and newlines \textit{do} matter;
the test will fail even if a single space is incorrect. (This is because
a production refactoring tool must produce correctly-indented code and handle
comments correctly, and this is an easy way to test for this.) Note that you
will probably need to include the marker in both the original file and the
result file, based on how comments are handled by the refactoring (since it is,
after all, simply another comment in the Fortran code).
\end{enumerate}