h2. Partitioning
The AMALTHEA partitioning approach supports various features that can be combined and configured in order to exploit software parallelism in different ways. Besides mandatory model enhancements, such as the label access analysis that yields a constraints model with @RunnableSequencingConstraints@ and the cycle elimination that provides directed acyclic graphs, features like activation grouping or independent graph grouping can be performed optionally. Subsequently, one of two partitioning approaches cuts the created graphs into partitions (@ProcessPrototypes@) that can later be transformed into tasks. These tasks can be executed efficiently in parallel on a target platform, since ordering constraints are considered and task execution time as well as inter-task communication are minimized.
On the one hand, the __CPP__ approach uses graph-theoretical mechanisms to initially reveal a critical path of @runnables@ (also denoted as nodes) and dependencies (also denoted as edges) from a source node to a sink node. This initial path is assigned to the first partition; branch nodes are then identified and assigned to additional partitions with respect to ordering constraints, execution cycles and communication overheads. The approach provides an automatic @runnable@ distribution with an efficient parallelism factor for an unbounded number of tasks while preserving causal runnable orders, i.e. dependencies. On the other hand, the __ESSP__ approach performs a timing-based partitioning with respect to a bounded number of partitions (configurable by the user), since this is often mandatory in order to meet system requirements.
Both approaches read, adapt and write the AMALTHEA constraints and software models and represent the first approaches within the AMALTHEA platform to automatically exploit system parallelism and utilize parallel resources. Instead of manually assigning runnables to tasks in a complex, error-prone and time-consuming process, the user triggers a configurable partitioning that performs this assignment automatically, using different algorithms and optimization criteria in order to distribute the existing software effectively.
h3. Usage of Partitioning
The following configuration window shows the different features that are available for the partitioning process.
!../pictures/partitioning-guide/Config.png!
Each of these features is described in the following sections.
In order to start the partitioning, right-click an __.amxmi__ file and select the __Perform Partitioning__ command as shown in the following figure.
!../pictures/partitioning-guide/click_small.png!
Alternatively, a workflow can be written (e.g. using the __EASE__ script technology) to start the partitioning process and to define further workflows for different inputs and outputs (see the example at the end of this guide).
Depending on the partitioning configuration, the partitioning process performs various model adaptations, as shown in the following figure and described in the next sections.
!(scale)../pictures/partitioning-guide/psteps_small.png!
h3. Pre-Partitioning
In order to perform the actual partitioning based on <i>DAWG</i>s (directed acyclic weighted graphs), various processing steps need to be performed on the given input, i.e. a set of runnables with label accesses, execution cycles and activations. The required steps are the creation of a constraints model with @RunnableSequencingConstraints@ and the cycle elimination, which creates an @AccessPrecedence@ for each dependency that has to be decomposed from a @RunnableSequencingConstraint@ in order to eliminate one or more cycles. Further optional adaptations can create @ProcessPrototypes@ in order to group runnables based on activation references or based on independent graphs.
h4. Label Access Analysis
The label access analysis compares the __read__ and __write__ accesses of all runnables in order to identify dependencies. For example, in case runnable __A__ writes a label and another runnable __B__ reads the same label, runnable __B__ depends on runnable __A__. This dependency is saved as a @RunnableSequencingConstraint@. These @RunnableSequencingConstraints@ are the basis for the subsequent __DAG__ (directed acyclic graph) analysis and allow the determination of concurrency, since they provide information about fork and join dependencies and thereby about @runnables@ that can be calculated in parallel. Furthermore, the label access analysis allows deriving the memory structure from common data accesses. The @RunnableSequencingConstraints@ are stored within a new or an existing constraints model. An example is given in the following picture:
!../pictures/partitioning-guide/dependency_new.png!
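At its core, the analysis can be pictured as a comparison of the write and read sets per label. The following minimal sketch illustrates this idea with simplified placeholder types; it is not part of the AMALTHEA implementation, which operates on the software and constraints metamodels.
bc..
import java.util.*;

/** Minimal sketch: derive runnable dependencies from label accesses. */
public class LabelAccessAnalysisSketch {

    // Hypothetical, simplified runnable representation (not the AMALTHEA metamodel).
    record SimpleRunnable(String name, Set<String> reads, Set<String> writes) {}

    /** Returns (writer -> reader) pairs, i.e. the raw material for sequencing constraints. */
    static List<String[]> deriveDependencies(List<SimpleRunnable> runnables) {
        List<String[]> dependencies = new ArrayList<>();
        for (SimpleRunnable writer : runnables) {
            for (SimpleRunnable reader : runnables) {
                if (writer == reader) continue;
                for (String label : writer.writes()) {
                    if (reader.reads().contains(label)) {
                        // reader depends on writer via this label
                        dependencies.add(new String[] { writer.name(), reader.name() });
                        break;
                    }
                }
            }
        }
        return dependencies;
    }

    public static void main(String[] args) {
        List<SimpleRunnable> runnables = List.of(
                new SimpleRunnable("A", Set.of(), Set.of("L1")),
                new SimpleRunnable("B", Set.of("L1"), Set.of()));
        // Prints: A -> B
        deriveDependencies(runnables).forEach(d -> System.out.println(d[0] + " -> " + d[1]));
    }
}
p. Each resulting pair would become a @RunnableSequencingConstraint@ stating that the reading runnable has to be executed after the writing runnable.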
h4. Cycle Elimination
The cycle elimination is a mandatory step for all subsequent methods and features. Topological and graph-theoretical calculations require <i>DAWG</i>s, so that a cycle elimination has to be performed in advance. A cycle may occur in case two runnables share the same resource (label) in both directions, i.e. both runnables read and write the same label, or in case runnable __A__ reads label __BA__ and writes label __AB__ while runnable __B__ reads label __AB__ and writes label __BA__. Furthermore, a cycle may be formed across multiple runnables. For the purpose of finding such cycles, the external __JGraphT__ library has been added to the project, which supports finding all of these cycles. After all cycles have been identified, a specific mechanism (configurable through 'Minimal dependency decompositions for cycle elimination' in the preference page) detects edges (dependencies) that occur in multiple cycles. This mechanism iterates over such edges in descending order, i.e. it starts with the edge that occurs in the most cycles, in order to ensure a minimal number of eliminated edges. In order to retain a dependency that has been selected for elimination, the selected edge is transformed from a @RunnableSequencingConstraint@ into an @AccessPrecedence@. After the edge-sharing cycles have been decomposed, all cycles that do not share any edges have to be decomposed as well. For each of these cycles, an edge can be identified that provides an optimal retaining graph (configurable through 'Increase parallelization potential for cycle elimination result' in the preference page). The following figure shows 7 cycle elimination possibilities for an example graph.
!(scale)../pictures/partitioning-guide/cycle_small.png!
Red transitions indicate edges that are decomposed into @AccessPrecedences@ for the corresponding solution. The retaining graphs are shown on the right side of each solution, indicated by a dashed rectangle. For illustration purposes, equal execution time is assumed for each runnable (letter). Solutions 1 and 5 (green dashed rectangles) feature the minimal runtime for two tasks. This assessment is based on the topological graph structure, i.e. the __span__ of a graph (critical path) compared with its __parallelism__.
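The cycle detection itself can be reproduced directly with __JGraphT__. The following minimal sketch (plain string nodes instead of runnables, assuming a current JGraphT version) enumerates all simple cycles and greedily selects the edge that occurs in the most cycles, which corresponds to the minimal dependency decomposition described above.
bc..
import java.util.*;
import org.jgrapht.Graph;
import org.jgrapht.alg.cycle.JohnsonSimpleCycles;
import org.jgrapht.graph.DefaultDirectedGraph;
import org.jgrapht.graph.DefaultEdge;

/** Sketch: find cycles and pick the edge shared by most cycles for decomposition. */
public class CycleEliminationSketch {
    public static void main(String[] args) {
        Graph<String, DefaultEdge> g = new DefaultDirectedGraph<>(DefaultEdge.class);
        List.of("A", "B", "C").forEach(g::addVertex);
        g.addEdge("A", "B");
        g.addEdge("B", "C");
        g.addEdge("C", "A"); // closes the cycle A -> B -> C -> A
        g.addEdge("B", "A"); // closes the cycle A -> B -> A

        // Enumerate all simple cycles (each cycle as a sequence of vertices).
        List<List<String>> cycles = new JohnsonSimpleCycles<>(g).findSimpleCycles();

        // Count how often each edge participates in a cycle.
        Map<DefaultEdge, Integer> edgeCount = new HashMap<>();
        for (List<String> cycle : cycles) {
            for (int i = 0; i < cycle.size(); i++) {
                String from = cycle.get(i);
                String to = cycle.get((i + 1) % cycle.size());
                DefaultEdge e = g.getEdge(from, to);
                if (e != null) edgeCount.merge(e, 1, Integer::sum);
            }
        }

        // The edge occurring in the most cycles is the first candidate for decomposition.
        edgeCount.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .ifPresent(e -> System.out.println("Decompose first: " + e.getKey()));
    }
}
p. In the partitioning plug-in, the selected edge is not removed but transformed from a @RunnableSequencingConstraint@ into an @AccessPrecedence@, so the dependency information is retained.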
h4. Activation Partitioning
In embedded software, code fragments typically need to be executed at different intervals, triggered for example by timers, interrupts or events. Sensors or actuators, for instance, must often be read within short intervals for precise and accurate executions. In contrast, processing fragments that do not affect the system's stability can be executed at longer intervals, since they have less impact on the system's performance. Such activations can either be referenced by tasks via the stimulation model or by runnables via activations in the software model. By assessing these model entities and references, a temporal classification can be derived. The activation analysis feature creates a @ProcessPrototype@ for each activation and assigns the referencing runnables correspondingly by creating @TaskRunnableCalls@ within the @ProcessPrototypes@.
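Conceptually, this feature is a grouping of runnables by their activation reference. The following minimal sketch illustrates the idea with hypothetical placeholder types and values; the actual feature creates @ProcessPrototypes@ with @TaskRunnableCalls@ in the AMALTHEA model.
bc..
import java.util.*;
import java.util.stream.Collectors;

/** Sketch: group runnables by their activation reference (placeholder types). */
public class ActivationGroupingSketch {

    record SimpleRunnable(String name, String activation) {}

    public static void main(String[] args) {
        List<SimpleRunnable> runnables = List.of(
            new SimpleRunnable("ReadSensor", "10ms"),
            new SimpleRunnable("Control", "10ms"),
            new SimpleRunnable("Diagnostics", "100ms"));

        // One group (future ProcessPrototype) per activation:
        // 10ms -> [ReadSensor, Control], 100ms -> [Diagnostics]
        Map<String, List<String>> groups = runnables.stream()
            .collect(Collectors.groupingBy(SimpleRunnable::activation,
                     Collectors.mapping(SimpleRunnable::name, Collectors.toList())));

        System.out.println(groups);
    }
}
p. Each resulting group corresponds to one @ProcessPrototype@ that is handed to the subsequent partitioning steps.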
h4. Independent Graph Partitioning
The independent graph identification, also denoted as global graph partitioning, can be executed after the cycle elimination in case the corresponding entry has been selected in the configuration ('Create independent graph partitions'). This methodology looks for graphs or single nodes that do not share any label with other runnables, or for complete graphs that do not share labels with other graphs. Such a methodology allows forming tasks that can be distributed to different cores or even to completely different systems, as shown in the following figure:
!(scale)../pictures/partitioning-guide/GGP_small.png!
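The independent graphs correspond to the (weakly) connected components of the dependency graph. The following minimal sketch uses __JGraphT__ with plain string nodes instead of runnables and assumes a current JGraphT version.
bc..
import java.util.List;
import java.util.Set;
import org.jgrapht.Graph;
import org.jgrapht.alg.connectivity.ConnectivityInspector;
import org.jgrapht.graph.DefaultDirectedGraph;
import org.jgrapht.graph.DefaultEdge;

/** Sketch: identify independent graphs as (weakly) connected components. */
public class IndependentGraphSketch {
    public static void main(String[] args) {
        Graph<String, DefaultEdge> g = new DefaultDirectedGraph<>(DefaultEdge.class);
        List.of("A", "B", "C", "D", "E").forEach(g::addVertex);
        g.addEdge("A", "B"); // graph 1: A -> B -> C
        g.addEdge("B", "C");
        g.addEdge("D", "E"); // graph 2: D -> E (no shared labels with graph 1)

        // Each connected set can become its own partition: {A, B, C} and {D, E}.
        List<Set<String>> independentGraphs = new ConnectivityInspector<>(g).connectedSets();
        System.out.println(independentGraphs);
    }
}
p. Since no labels are shared across the identified sets, each set can form its own partition without introducing inter-partition communication.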
h4. Tag Partitioning
Tags can be used to group runnables similar to the __Activation Partitioning__ approach. A typical use case for tags are software components. This mechanism allows arbitrary groupings in order to keep related runnables within the same partition respectively task.
h4. ASIL-level Partitioning
@ASIL-levels@ and @Tags@ (e.g., for software components) are considered equally, such that separate groups are formed prior to the partitioning process, which always splits the partition with the highest instruction sum first. Consequently, the generated partitions are aligned regarding their instruction sums as far as possible. In other words, all runnables referencing the same tag are grouped into a partition that may be split further if the instruction sum of all runnables contained in this group is higher than the instruction sums of the other partitions. An important aspect of this mechanism is its influence on the overall software distribution. In order to keep the amount of generated @AccessPrecedences@ (i.e., dissolutions of a direct cause-and-effect relation between two runnables) low, i.e., to keep the program's causality at a high level, existing groups can be merged with other groups or partitions.
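The split criterion can be illustrated with a minimal sketch that always selects the group with the highest instruction sum as the next split candidate. The types and values are hypothetical placeholders, not the AMALTHEA metamodel.
bc..
import java.util.*;

/** Sketch: always pick the group with the highest instruction sum for the next split. */
public class LargestGroupFirstSketch {

    record Group(String tag, long instructionSum) {}

    static Group nextSplitCandidate(List<Group> groups) {
        return groups.stream()
            .max(Comparator.comparingLong(Group::instructionSum))
            .orElseThrow();
    }

    public static void main(String[] args) {
        List<Group> groups = List.of(
            new Group("ASIL-D", 1200), new Group("ASIL-B", 400), new Group("QM", 900));
        // Prints: ASIL-D (largest instruction sum, split first)
        System.out.println(nextSplitCandidate(groups).tag());
    }
}
p. Repeatedly splitting the heaviest group keeps the instruction sums of the resulting partitions as balanced as possible.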
h4. Consideration of Runnable Pairing Constraints
The following figure outlines the consideration of @RunnablePairingConstraints@ via runnable cumulation.
!(scale)../pictures/partitioning-guide/pairing_small.png!
Each @RunnablePairingConstraint@ merges the corresponding runnables into one node for the subsequent graph algorithms (cumulation) and decomposes this node back into the original structure after the partitions have been formed (reconstruction). Consequently, causality, i.e. the positions and sequences of the paired runnables within the partitions, is considered.
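The cumulation step can be thought of as collapsing each pairing group into a single node whose weight is the sum of its members; after partitioning, the node is expanded back to its members at the same position. A minimal sketch with hypothetical placeholder types:
bc..
import java.util.*;

/** Sketch: cumulate paired runnables into one node and reconstruct them afterwards. */
public class PairingCumulationSketch {

    record WeightedNode(List<String> members, long weight) {}

    /** Collapse a pairing group into a single node carrying the summed weight. */
    static WeightedNode cumulate(Map<String, Long> weights, List<String> pairingGroup) {
        long sum = pairingGroup.stream().mapToLong(weights::get).sum();
        return new WeightedNode(List.copyOf(pairingGroup), sum);
    }

    public static void main(String[] args) {
        Map<String, Long> weights = Map.of("A", 10L, "B", 5L, "C", 7L);
        WeightedNode cumulated = cumulate(weights, List.of("A", "C"));
        // The graph algorithms see one node of weight 17; the reconstruction later
        // re-inserts A and C at the node's position in the resulting partition.
        System.out.println(cumulated.members() + " weight=" + cumulated.weight());
    }
}
p. Because the graph algorithms never separate the cumulated node, paired runnables always end up in the same partition.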
h3. Partitioning Algorithms
Each of the algorithms described in the following creates @ProcessPrototypes@, i.e. task precursors.
h4. Critical Path Partitioning (CPP)
The __CPP__ approach considers node weights (i.e. arbitrary computation / execution cycles / instructions of @runnables@) and partitions <i>DAWG</i>s, whereby @runnables@ are equally distributed among an automatically determined number of partitions. The partitioning's objective is to reduce the overall execution time and the inter-task communication. The subsequent mapping methodology further considers resource constraints (availability of hardware and software resources, namely program and data memory). The system's critical path is assigned to the first partition and branches of the graph are subsequently assigned to additional partitions. The approach has been chosen because the critical path features a mandatory sequential ordering that cannot be computed in parallel. Thus, the weight of the critical path provides a lower bound on the total time to perform all jobs. A __CPP__ example is shown in the following figure (left = input graph; right = partitioning result):
!(scale)../pictures/partitioning-guide/CPP_small.png!
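The critical path of a <i>DAWG</i> can be computed with a longest-path dynamic program over a topological order. The following minimal sketch uses plain string nodes and hard-coded weights; the actual implementation operates on runnable instruction counts from the AMALTHEA model.
bc..
import java.util.*;

/** Sketch: longest weighted path (critical path) of a DAG via topological order. */
public class CriticalPathSketch {

    public static void main(String[] args) {
        // Node weights (e.g. instruction counts) and directed edges of an acyclic graph.
        Map<String, Long> weight = Map.of("A", 4L, "B", 2L, "C", 6L, "D", 3L);
        Map<String, List<String>> successors = Map.of(
            "A", List.of("B", "C"),
            "B", List.of("D"),
            "C", List.of("D"),
            "D", List.of());

        // Topological order, hard-coded here for brevity.
        List<String> topo = List.of("A", "B", "C", "D");

        Map<String, Long> longest = new HashMap<>();   // longest path ending at node
        Map<String, String> parent = new HashMap<>();  // predecessor on that path
        for (String node : topo) longest.put(node, weight.get(node));
        for (String node : topo) {
            for (String succ : successors.get(node)) {
                long candidate = longest.get(node) + weight.get(succ);
                if (candidate > longest.get(succ)) {
                    longest.put(succ, candidate);
                    parent.put(succ, node);
                }
            }
        }

        // Reconstruct the critical path backwards from the node with the largest value.
        String end = Collections.max(longest.entrySet(), Map.Entry.comparingByValue()).getKey();
        Deque<String> path = new ArrayDeque<>();
        for (String n = end; n != null; n = parent.get(n)) path.addFirst(n);
        System.out.println("Critical path: " + path + " length=" + longest.get(end));
    }
}
p. The nodes on this path form the first partition; the remaining branch nodes are distributed to additional partitions.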
h4. Earliest Start Schedule Partitioning (ESSP)
The __ESSP__ partitioning has been developed to allow the user to restrict the number of partitions. This may be important and useful in very large systems in order to keep the task creation and inter-task communication costs low. The __ESSP__ partitioning balances runnables across partitions with respect to their causal orders. For this purpose, each runnable's @eit@ (earliest initial time) value is calculated, which is defined as the instruction sum of its longest preceding path. The methodology picks the partition with the lowest schedule length, calculates the set of assignable @runnables@, whose predecessors are already scheduled, and assigns a runnable to the current partition such that the schedule length is minimized.
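The assignment loop can be sketched as a simple list scheduling onto a fixed number of partitions. The following minimal example uses plain string nodes, hard-coded weights and a simplified selection rule (it picks the lightest ready runnable instead of using the @eit@ values), so it only approximates the actual heuristic.
bc..
import java.util.*;

/** Sketch: ESSP-style list scheduling onto a fixed number of partitions. */
public class EsspSketch {

    public static void main(String[] args) {
        // Node weights and predecessor relation of a small DAG.
        Map<String, Long> weight = Map.of("A", 4L, "B", 2L, "C", 6L, "D", 3L);
        Map<String, List<String>> predecessors = Map.of(
            "A", List.of(),
            "B", List.of("A"),
            "C", List.of("A"),
            "D", List.of("B", "C"));

        int numberOfPartitions = 2;
        long[] load = new long[numberOfPartitions];   // current schedule length per partition
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numberOfPartitions; i++) partitions.add(new ArrayList<>());
        Set<String> scheduled = new HashSet<>();

        while (scheduled.size() < weight.size()) {
            // The partition with the lowest schedule length receives the next runnable.
            int target = 0;
            for (int i = 1; i < load.length; i++) if (load[i] < load[target]) target = i;

            // Ready set: runnables whose predecessors are all scheduled.
            List<String> ready = predecessors.keySet().stream()
                .filter(r -> !scheduled.contains(r))
                .filter(r -> scheduled.containsAll(predecessors.get(r)))
                .toList();

            // Simplified choice: the lightest ready runnable
            // (the real heuristic additionally uses the eit values).
            String pick = Collections.min(ready, Comparator.comparingLong(weight::get));
            partitions.get(target).add(pick);
            load[target] += weight.get(pick);
            scheduled.add(pick);
        }
        System.out.println(Arrays.toString(load) + " " + partitions);
    }
}
p. The real implementation selects among the ready runnables based on the @eit@ values described above in order to minimize the resulting schedule length.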
h3. Further Features
Two further features provide graph visualizations and a different graph representation through @RunnableSequencingConstraints@.
h4. Applet Generation
The applet generation can be triggered via a separate command in the context menu when right-clicking an __.amxmi__ file. In case the file contains a software model with runnables and a constraints model with @RunnableSequencingConstraints@, a Java file is generated that can be executed as an applet with the help of the __JGraphT__ library. Such an applet visualizes a graph and can be adapted in an applet viewer. The following figure shows such an applet.
!(scale)../pictures/partitioning-guide/applet_small.png!
h4. Dependency Alternative
This file generation feature focuses on a constraints model approach that features a different dependency representation compared with the result of the label access analysis, which always features two @RunnableGroups@ with one @RunnableGroupEntry@ each within a @RunnableSequencingConstraint@. The alternative representation can feature more @RunnableGroups@ and more @RunnableGroupEntries@ and consequently fewer @RunnableSequencingConstraints@, derived from the same graph. An example is shown in the following figure.
!../pictures/partitioning-guide/TA_small.png!
h3. Partitioning Examples
The following sections present some basic examples to illustrate the different __CPP__ and __ESSP__ results and to provide an easy starting point for developers performing multiple partitioning steps across multiple models via a single start of an __EASE__ script.
h4. Algorithm Comparison
The following figure illustrates the resulting partitions of the previously described algorithms based on an example graph shown on the left. On the right, the partitioning results are shown for @CPP@ (top), @ESSP@ configured with 2 partitions (middle), and @ESSP@ configured with 3 partitions (bottom).
!(scale)../pictures/partitioning-guide/partexample.png!
h4. Workflow Example
The features described above can also be triggered via a workflow, as exemplified by the following __EASE__ script:
bc..
Workflow {
//basic setup
loadModule('/System/Resources')
loadModule('/APP4MC/Workflow')
//Importing needed packages
importPackage(org.eclipse.app4mc.amalthea.workflow.component)
importPackage(org.eclipse.app4mc.amalthea.workflow.core)
importPackage(org.eclipse.app4mc.multicore.partitioning.workflow)
//Configure logging
addLoggerToConsole("org.eclipse.app4mc.multicore")
addLoggerToConsole("org.eclipse.app4mc.amalthea.workflow")
print("Starting Workflow ...")
//general setup
const BASE = "platform:/resource"
const PROJECT = BASE + "/org.eclipse.app4mc.amalthea.example.democar.mapping"
const MODEL_LOCATION1 = PROJECT + "/model/AMALTHEA_Democar_MappingExample.amxmi"
var ctx = new DefaultContext()
//Reader
var reader = new ModelReader()
reader.addFileName(MODEL_LOCATION1)
reader.run(ctx)
//prepartitioning
var prepart = new PrePartitioningWrkflw()
prepart.setAa(false)
prepart.setGgp(false)
prepart.setMinimEdge(false)
prepart.setEffEdge(false)
prepart.run(ctx)
//result is saved in modelslot prePartitioning
//Writer
var writer = new ModelWriter()
writer.setModelSlot("prePartitioning")
writer.setFileName("prePartitioning")
writer.setSingleFile(true)
writer.setOutputDir(PROJECT + "/workflow-output")
writer.run(ctx)
//partitioning
var part = new GeneratePartitioning()
part.setModelLoc(MODEL_LOCATION1)
part.setModelSlot("prePartitioning")
part.setPartitioningAlg("essp")
part.setNumberOfPartitions("4")
part.run(ctx)
//Writer
writer = new ModelWriter()
writer.setModelSlot("partitioning")
writer.setFileName("partitioning")
writer.setSingleFile(true)
writer.setOutputDir(PROJECT + "/workflow-output")
writer.run(ctx)
print("Finished Workflow")
ctx.clear()
endWorkflow()
p.
h3. Plugin dependencies
bc..
org.eclipse.core.runtime,
org.eclipse.emf.ecore,
org.eclipse.emf.ecore.xmi,
org.eclipse.sphinx.emf,
org.eclipse.ui,
org.jgrapht,
org.eclipse.app4mc.amalthea.model,
org.eclipse.app4mc.multicore.openmapping
p.