blob: 0cd4eba8d4ebd89c60ff1e545e790c3dcfcc4c4f [file] [log] [blame]
\clearpage
\chapter{Orchestration Workflow}
\label{chp:Workflow}
The previous chapters provided a detailed discussion on a number of task-specific languages, each one addressing an individual model management task. However, in practice, model management tasks are seldom carried out in isolation; instead, they are often combined together to form complex workflows. Therefore, of similar importance to the existence of individual task-specific management languages is the provision of a mechanism that enables developers to compose modular and reusable tasks into complex automated processes. In a broader context, to facilitate implementation of seamless workflows, an appropriate MDE workflow mechanism should also support mainstream development tasks such as file management, version control management, source code compilation and invocation of external programs and services.
\section{Motivation}
\label{sec:Workflow.Motivation}
As a motivating example, an exemplar workflow that consists of both MDD tasks (1-4, 6) and mainstream software development tasks (5, 7) is displayed below.
\begin{enumerate}
\item Load a UML model
\item Validate it
\item Transform it into a Database Schema model
\item Generate Java code from the UML model
\item Compile the Java code
\item Generate SQL code from the Database model
\item Deploy the SQL code in a Database Management System (DBMS)
\end{enumerate}
In the above workflow, if the validation step (2) fails, the entire process should be aborted and the identified errors should be reported to the user. This example demonstrates that to be of practical use, a task orchestration framework needs to be able to coordinate both model management and mainstream development tasks and provide mechanisms for establishing dependencies between different tasks.
This chapter presents such a framework for orchestrating modular model management tasks implemented using languages of the Epsilon platform. As the problem of task coordination is common in software development, many technical solutions have been already proposed and are widely used by software practitioners. In this context, designing a new general-purpose workflow management solution was deemed inappropriate. Therefore, the task orchestration solution discussed here has been designed as an extension to the robust and widely used ANT \cite{ANT} framework. A brief overview of ANT as well as a discussion on the choice to design the orchestration workflow of Epsilon atop it is provided below.
\section{The ANT Tool}
\label{sec:Workflow.ANT}
ANT, named so because it is \textit{a little thing that can be used to build big things} \cite{AntBook}, is a robust and widely-used framework for composing automated workflows from small reusable activities. The most important advantages of ANT, compared to traditional build tools such as \emph{gnumake} \cite{GnuMake}, is that it is platform independent and easily extensible. Platform independence is achieved by building atop Java, and extensibility is realized through a lightweight binding mechanism that enables developers to contribute custom tasks using well defined interfaces and extension points.
Although a number of tools with functionality similar to ANT exist in the Java community, only Maven \cite{Maven} is currently of comparable magnitude in terms of user-basis size and robustness. Outlining the discussion provided in \cite{AntVsMaven}, ANT is considered to be easier to learn and to enable low-level control, while Maven is considered to provide a more elaborate task organization scheme. Nevertheless, the two frameworks are significantly similar and the ANT technical solution discussed in this chapter can easily be ported to work with the latter.
This section provides a brief discussion of the structure and concrete syntax of ANT workflows, as well as the extensibility mechanisms that ANT provides to enable users contribute custom tasks.
\subsection{Structure}
In ANT, each workflow is captured as a \textit{project}. A simplified illustration of the structure of an ANT project is displayed in Figure \ref{fig:AntCore}. Each ANT project consists of a number of \textit{targets}. The one specified as the \emph{default} is executed automatically when the project is executed. Each \emph{target} contains a number of \emph{tasks} and \emph{depends} on other targets that must be executed before it. An ANT task is responsible for a distinct activity and can either succeed or fail. Exemplar activities implemented by ANT tasks include file system management, compiler invocation, version management and remote artefact deployment.
\begin{figure}
\centering
\includegraphics{images/AntCore.png}
\caption{Simplified ANT object model}
\label{fig:AntCore}
\end{figure}
\subsection{Concrete Syntax}
In terms of concrete syntax, ANT provides an XML-based syntax. In Listing \ref{lst:ANTExample}, an exemplar ANT project that compiles a set of Java files is illustrated. The project contains one target (\emph{main}) which is also set to be the \emph{default} target. The \emph{main} target contains one \emph{javac} task that specifies attributes such as \emph{srcdir}, \emph{destdir} and \emph{classpath}, which define that the Java compiler will compile a set of Java files contained into the \emph{src} directory into classes that should be placed in the \emph{build} directory using \emph{dependencies.jar} as an external library.
\begin{lstlisting}[caption=Compiling Java classes using the javac task, label=lst:ANTExample]
<project default="main">
<target name="main"/>
<javac srcdir="${src}"
destdir="${build}"
classpath="dependencies.jar"
debug="on"
source="1.4"/>
</target>
</project>
\end{lstlisting}
\subsection{Extending ANT}
Binding between the XML tags that describe the tasks and the actual implementations of the tasks is achieved through a light-weight mechanism at two levels. First, the tag (in the example of Listing \ref{lst:ANTExample}, \emph{javac}) is resolved to a Java class that extends the \emph{org.apache.ant.Task} abstract class (in the case of \emph{javac}, the class is \emph{org.apache.tools.ant.taskdefs.Javac}) via a configuration file. Then, the attributes of the tasks (e.g. \emph{srcdir}) are set using the reflective features that Java provides. Finally, the \emph{execute()} method of the task is invoked to perform the actual job.
This lightweight and straightforward way of defining tasks has rendered ANT particularly popular in the Java development community and currently there is a large number of tasks contributed by ANT users \cite{AntExternalTasks}, ranging from invoking tools such as code generators and XSLT processors, to emulating logical control flow structures such as \emph{if} conditions and \emph{while} loops. The AMMA platform \cite{AMMA} also provides integration of model driven engineering tools such as TCS \cite{TCS} and ATL \cite{ATL} with ANT.
ANT also supports more advanced features including nested XML elements and \emph{filesets}, however providing a complete discussion is beyond the scope of this chapter. For a definitive guide to ANT readers can refer to \cite{AntBook}.
\section{Integration Challenges}
\label{sec:Workflow.IntegrationChallenges}
A simple approach to extending ANT with support for model management tasks would be to implement one standalone task for each language in Epsilon. However, such an approach demonstrates a number of integration and performance shortcomings which are discussed below.
Since models are typically serialized in the file system, before a task is executed, the models it needs to access/modify must be parsed and loaded in memory. In the absence of a more elaborate framework, each model management task would have to take responsibility for loading and storing the models it operates on. Also, in most workflows, more than one task operates on the same models sequentially, and needlessly loading/storing the same models many times in the context of the same workflow is an expensive operation both time and memory-wise, particularly as the size of models increases.
Another weakness of this primitive approach is limited inter-task communication. In the absence of a communication framework that allows model management tasks to exchange information with each other, it is often the case that many tasks end up performing the same (potentially expensive) queries on models. By contrast, an inter-task communication framework would enable time and resource intensive calculations to be performed once and their results to be communicated to all interested subsequent tasks.
Having discussed ANT, Epsilon and the challenges their integration poses, the following sections presents the design of a solution that enables developers to invoke model management tasks in the context of ANT workflows. The solution consists of a core framework that addresses the challenges discussed in Section \ref{sec:Workflow.IntegrationChallenges}, a set of specific tasks, each of which implements a distinct model management activity, and a set of tasks that enable developers to initiate and manage transactions on models using the respective facilities provided by the model connectivity layer discussed in Section \ref{sec:EMC.ModelTransactionSupport}.
\section{Framework Design and Core Tasks}
\label{sec:Workflow.Framework}
The role of the core framework, illustrated in Figure \ref{fig:Core}, is to provide model loading and storing facilities as well as runtime communication facilities to the individual model management tasks that build atop it. This section provides a detailed discussion of the components it consists of.
%\begin{landscape}
\begin{sidewaysfigure}
\centering
\includegraphics{images/AntEpsilon.png}
\caption{Core Framework}
\label{fig:Core}
\end{sidewaysfigure}
%\end{landscape}
\begin{figure}
\centering
\includegraphics{images/AntEpsilonModels.png}
\caption{Core Models Framework}
\label{fig:CoreModels}
\end{figure}
\subsection{The EpsilonTask task}
An ANT task can access the project in which it is contained by invoking the \emph{Task.getProject()} method. To facilitate sharing of arbitrary information between tasks, ANT projects provide two convenience methods, namely \emph{addReference(String key, Object ref)} and \emph{getReference(String key) : Object}. The former is used to add key-value pairs, which are then accessible using the latter from other tasks of the project.
To avoid loading models multiple times and to enable on-the-fly management of models from different Epsilon modules without needing to store and re-load the models after each task, a reference to a project-wide model repository has been added to the current ANT project using the \emph{addReference} method discussed above. In this way, all the subclasses of the abstract \emph{EpsilonTask} can invoke the \emph{getProjectRepository()} method to access the project model repository.
Also, to support a variable sharing mechanism that enables inter-task communication, the same technique has been employed; a shared context, accessible by all Epsilon tasks via the \emph{getProjectContext()} method, has been added. Through this mechanism, model management tasks can export variables to the project context (e.g. traces or lists containing results of expensive queries) which other tasks can then reuse.
\emph{EpsilonTask} also specifies a \emph{profile} attribute that defines if the execution of the task must be profiled using the profiling features provided by Epsilon. Profiling is a particularly important aspect of workflow execution, especially where model management languages are involved. The main reason is that model management languages tend to provide convenient features which can however be computationally expensive (such as the \emph{allInstances()} EOL built-in feature that returns all the instances of a specific metaclass in the model) and when used more often than really needed, can significantly degrade the overall performance.
%\subsection{Tasks for Loading and Storing Models}
\input{tasks/LoadModel}
\input{tasks/StoreModel}
\input{tasks/DisposeModel}
The workflow leverages the model-transaction services provided by the model connectivity framework of Epsilon by providing three tasks for managing transactions in the context of workflows.
\subsection{The StartTransaction Task}
The \emph{epsilon.startTransaction} task defines a \emph{name} attribute that identifies the transaction. It also optionally defines a comma-separated list of model names (\emph{models}) that the transaction will manage. If the \emph{models} attribute is not specified, the transaction involves all the models contained in the common project model repository.
\subsection{The CommitTransaction and RollbackTransaction Tasks}
The \emph{epsilon.commitTransaction} and \emph{epsilon.rollbackTransaction} tasks define a \emph{name} attribute through which the transaction to be committed/rolled-back is located in the project's active transactions. If several active transactions with the same name exist the more recent one is selected.
The example of Listing \ref{lst:ANTTransactionsExample} demonstrates an exemplar usage of the \emph{epsilon.startTransaction} and \emph{epsilon.rollbackTransaction} tasks. In this example, two empty models Tree1 and Tree2 are loaded in lines 1,2. Then, the EOL task of line \ref{line:initialQuery} queries the models and prints the number of instances of the \emph{Tree} metaclass in each one of them (which is 0 for both). Then, in line \ref{line:transactionStart}, a transaction named T1 is started on model Tree1. The EOL task of line \ref{line:newInstances}, creates a new instance of Tree in both Tree1 and Tree2 and prints the number of instances of Tree in the two models (which is 1 for both models). Then, in line \ref{line:transactionRollback}, the T1 transaction is rolled-back and any changes done in its context to model Tree1 (but not Tree2) are undone. Therefore, the EOL task of line \ref{line:finalQuery}, which prints the number of instances of Tree in both models, prints 0 for Tree1 but 1 for Tree2.
\begin{lstlisting}[caption=Exemplar usage of the \emph{epsilon.startTransaction} and \emph{epsilon.rollbackTransaction} tasks, label=lst:ANTTransactionsExample, language=XML]
<epsilon.loadModel name="Tree1" type="EMF">...</epsilon.loadModel>
<epsilon.loadModel name="Tree2" type="EMF">...</epsilon.loadModel>
<epsilon.eol>/*@\label{line:initialQuery}@*/
<![CDATA[
Tree1!Tree.allInstances.size().println(); // prints 0
Tree2!Tree.allInstances.size().println(); // prints 0
]]>
<model ref="Tree1"/>
<model ref="Tree2"/>
</epsilon.eol>
<epsilon.startTransaction name="T1" models="Tree1"/> /*@\label{line:transactionStart}@*/
<epsilon.eol> /*@\label{line:newInstances}@*/
<![CDATA[
var t1 : new Tree1!Tree;
Tree1!Tree.allInstances.size().println(); // prints 1
var t2 : new Tree2!Tree;
Tree2!Tree.allInstances.size().println(); // prints 1
]]>
<model ref="Tree1"/>
<model ref="Tree2"/>
</epsilon.eol>
<epsilon.rollbackTransaction name="T1"/> /*@\label{line:transactionRollback}@*/
<epsilon.eol>/*@\label{line:finalQuery}@*/
<![CDATA[
Tree1!Tree.allInstances.size().println(); // prints 0
Tree2!Tree.allInstances.size().println(); // prints 1
]]>
<model ref="Tree1"/>
<model ref="Tree2"/>
</epsilon.eol>
\end{lstlisting}
%\subsection{Model Management Tasks}
\input{tasks/EpsilonExecutableModule}
\section{Model Management Tasks}
\label{sec:Workflow.ModelManagementTasks}
Having discussed the core framework, this section presents the model management tasks that have been implemented atop it, using languages of the Epsilon platform.
\begin{figure}[ht!]
\centering
\includegraphics{images/Tasks.png}
\caption{Model Management Tasks}
\label{fig:Tasks}
\end{figure}
\input{tasks/Eol}
\input{tasks/Evl}
\input{tasks/Etl}
\input{tasks/Ecl}
\input{tasks/Eml}
\input{tasks/Egl}
\input{tasks/Flock}
\input{tasks/Epl}
\section{Miscellaneous Tasks}
\subsection{Java Class Static Method Execution Task}
The \emph{epsilon.java.executeStaticMethod} task executes a parameter-less static method, defined using the \emph{method} attribute, of a Java class, defined using the \emph{javaClass} attribute. This task can be useful for setting up the infrastructure of Xtext-based languages.
\subsection{Adding a new Model Management Task}
\label{sec:Workflow.NewModelManagementTask}
As discussed in Section \ref{sec:Design.ImplementingANewLanguage}, additional task-specific languages are likely to be needed in the future for tasks that are not effectively supported by existing task-specific languages. In addition to designing and implementing the syntax and execution semantics of a new language, it is also important to provide integration with the workflow -- if the nature of the language permits execution within a workflow. As a counter-example, no workflow task has been provided for EWL since its execution semantics is predominately user-driven and as such, it makes little sense to execute EWL in the context of an automated workflow.
To implement support for a new task-specific language to the workflow, a new extension of the abstract \emph{ExecutableModuleTask} needs to be provided (similarly to what has been done for existing task-specific languages). By extending \emph{ExecutableModuleTask}, the task is automatically provided with access to the essential features of the workflow such as the shared model repository, and runtime context. Additional configuration options for the task need to specified as new ANT \emph{attributes} and/or \emph{nested elements}, similarly to what has been done for the tasks presented in Sections \ref{sec:EolTask}--\ref{sec:EglTask}.
\section{Chapter Summary}
This chapter has presented the detailed design of an ANT-based framework for integrating and orchestrating mainstream software development tasks with model management tasks implemented using model management languages in Epsilon. In Section \ref{sec:Workflow.Framework}, the core framework that provides features such as centralized model loading/storing facilities, a shared model repository and a mechanism through which individual tasks can communicate at runtime has been illustrated. Then, Section \ref{sec:Workflow.ModelManagementTasks} has provided a discussion on the integration of the task specific languages with the framework and also provided guidance for adding support for additional languages that are likely to be developed in the future atop Epsilon.