tools/gem/org.eclipse.ptp.gem.help/html/overview.html - gerrit/ptp/org.eclipse.ptp - Git at Google

 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

 <html>
 <head>
     <link rel="stylesheet" type="text/css" href="help.css">
     <title>GEM - Overview and Features</title>
 </head>

 <body>
 <a name="top"></a>
 <table cellspacing="5">
     <tr>
         <td>
             <img src="images/trident_transparent.png">
         </td>
         <td>
             <h1>GEM - Overview and Features</h1>
         </td>
     </tr>
 </table>

 <hr>

 <p>
 Graphical Explorer of MPI Programs (GEM), is a graphical front end for
 In-situ Partial Order (ISP), a dynamic formal verification tool for MPI C/C++
 programs, developed at the University of Utah. GEM greatly enhances the usability
 of ISP by providing an intuitive graphical way to track down bugs in MPI C/C++
 programs, examine post-runtime results and to analyze MPI runtime behavior.
 </p>

 <p>
 GEM can be used by anyone who can write simple MPI C/C++ programs, and requires no
 special training. GEM allows you to formally verify your MPI C/C++ programs automatically
 without any extra efforts on your part. No code instrumentation is necessary. GEM
 will definitively flag the following errors and warnings:
 </p>

 <ul>
     <li>Deadlocks</li>
     <li>Functionally Irrelevant Barriers</li>
     <li>MPI Object Leaks</li>
     <li>Local Assertion Violations</li>
     <li> MPI Type Mismatches</li>
 </ul>

 <h3>Analyzer View</h3>
 <p>
 The <a href="analyzerView.html">Analyzer View</a> is a unique user interface that visually displays the output
 from the log file that was generated at runtime (by ISP) by highlighting lines representing match sets (p2p and collective send-recv matches) in the source file.
 It shows both the current MPI call, and any matching point-to-point or collective operation.
 The <a href="analyzerView.html">Analyzer View</a> also allows the user to examine
 MPI calls for a particular rank with an easy to use <b>Rank Locking</b> feature.
 </p>

 <p>
 The Analyzer View offers the user the ability to &#34;step&#34; through execution of the code in debug fashion
 examining the execution of the MPI calls in the program as they happened at runtime. The user can view this in
 two different orderings&#58;
 </p>
 <ul>
 	<li>
 		Internal Issue Order - The order in which the MPI calls were issued to the MPI runtime.
 	</li>
 	<li>
 		Program Order - The order in which the MPI calls appear in the source code.
 	</li>
 </ul>

 <p>
 The source code analyzer also offers a feature to browse calls by rank and by interleaving as well as the ability to
 launch the <a href="happensBeforeViewer.html">Happens Before Viewer</a>, another visualization tool provided by GEM.
 </p>

 <br>
 <img src="images/analyzer.png" width="858" height="581"/>
 <br>

 <p>
 In addition, it helps you understand as well as step through all relevant process interleavings
 (schedules). Notice our use of the word <b>relevant</b>: even a short MPI program may have too many
 (an &#34;exponential number&#34;) interleavings. For example, an MPI program with five processes, each
 containing five MPI calls, can have well in excess of 1000 interleavings. However, ISP generates
 a new interleaving only when it represents a truly new (as yet unexamined) behavior in your MPI program.
 </p>

 <h3>As Examples:</h3>
 <ul>
     <li>
         If an MPI program consisting of two processes issues point-to-point sends and non-wildcard
         receives to each other, then there is no observable difference (apart from performance or
         resource consumption) whether the sends are posted before each matching receive or vice
         versa; in this case, ISP will generate only one interleaving;
         </li>
     <li>
         If an MPI program consisting of three processes is such that the second process issues a
         sequence of wildcard receives, and the first and third process issue point-to-point sends
         to process two, then the exact non-deterministic choices of matches made at each step may
         affect the ensuing computation (including the conditional outcomes). In such programs, it
         is possible to force an exponential number of interleavings. In practice, here is how the
         results look (these examples come with the ISP distribution).
     </li>
     <li>
         For Parmetis [1], a 14,000+ line MPI program that makes 56,990 MPI calls for four ranks and
         171,956 calls for eight ranks, ISP needed to examine only one interleaving!
     </li>
     <li>
         For many tests in the Umpire test suite [2], conventional MPI testing tools missed deadlocks
         despite examining many schedules. ISP determined a minimal list of interleavings to examine,
         finding deadlocks whenever they existed (see our table of results at the URL given under [3]).
     </li>
     <li>
         For Madre [4], a naive algorithm can result in n! interleavings. An improved algorithm
         called POE B, present in the current release of ISP reduces this to one interleaving.
         Much like with existing partial order [5], ISP does not guarantee a
         minimal number of interleavings, although it comes pretty close to it.
     </li>
 </ul>

 <h3>Browser View</h3>

 <p>
 The <a href="browserView.html">Browser View</a> is a highly functional view that summarizes and categorically groups all
 errors and warnings found by GEM for a particular MPI verification run. The summary label (just above the tabs)
 gives the user a quick overview of problems found. Errors and warnings are mapped to the corresponding
 source code lines in the Eclipse editor. The Browser View&#39;s tabs help the user to quickly localize
 and fix MPI errors and warnings flagged by GEM within their source code.
 </p>

 <ul>
 	<li>
 		<b>Deadlocks:</b>&nbsp;&nbsp;
 		This tab allows the user to visit all calls that did not complete due
 		to a detected deadlock.
 	</li>
 	<li>
 		<b>Assertion Violations:</b>&nbsp;&nbsp;
 		This tab lists all local assertion violations found. Clicking any
 		one of these will take the user to the line of source code with the failed assertion.
 	</li>
 	<li>
 		<b>Leaks:</b>&nbsp;&nbsp;
 		Clicking any of these will take the user to the line of source code where the
 		corresponding un-freed MPI resource was allocated (these are usually MPI communicators and MPI DataTypes).
 	</li>
 	<li>
 		<b>Irrelevant Barriers:</b> &nbsp;&nbsp;
         This tab is reserved for Functionally Irrelevant Barriers. Clicking any these will take the user to
         the corresponding MPI_Barrier call itself, which can then be safely removed without changing the
         runtime behavior of the MPI application. This could potentially speed the application up with the
         absence of unnecessary synchronization.
 	</li>
 	<li>
 		<b>Type Mismatches:</b>&nbsp;&nbsp;
 		These are warnings about the send type not matching the receive type. In some cases this may be
 		intentional, but often times it is a programming mistake.
 	</li>
 </ul>

 <p>
 At the top left is a brief summary of the results indicating the number of errors and warnings found.
 In the top right are buttons provided with the ability to set the number of processes and re-run GEM
 on the current MPI project.
 </p>

 <br>
 <img src="images/browser.png" width="755" height="368"/>
 <br>

 <h3>How GEM Works:</h3>
 <p>
 GEM employs the use of In-situ Partial Order (ISP), the underlying dynamic formal verification tool, to verify
 MPI C/C++ programs. Here we describe ISP in detail to provide a better understanding of how it works.
 </p>
 <p>
 ISP works by intercepting the MPI calls made by the target program and making decisions on when
 to send these MPI calls to the MPI library. This is accomplished by the two main components of ISP:
 the Profiler and the Scheduler. An overview of ISP&#146;s components and their interaction with the
 program as well as the MPI library is provided in Figure 1 below.
 </p>

 <center>
 <img src="images/ISP-diagram.gif"><br>
 <b>Figure 1</b>
 </center>

 <h3>Profiler:</h3>
 <p>
 The interception of MPI calls is accomplished by compiling the ISP Profiler together with
 the target program&#146;s source code. The Profiler makes use of MPI&#146;s profiling mechanism
 (PMPI). Then the Profiler provides its own version of MPI <i>f</i> (replacing the original) for each corresponding MPI function <i>f</i>. Within
 each of these MPI <i>f</i> , the profiler communicates with the scheduler using TCP sockets to
 send information about the MPI call the process wants to make. It will then wait for the
 scheduler to make a decision on whether to send the MPI call to the MPI library or to
 postpone it until later. When the permission to fire <i>f</i> is given from the scheduler, the
 corresponding PMPI <i>f</i> will be issued to the MPI run-time. Since all MPI libraries come
 with functions such as PMPI <i>f</i> for every MPI function <i>f</i>, this approach provides a portable
 and light-weight instrumentation mechanism for MPI programs being verified [6].
 </p>

 <h3>Scheduler:</h3>
 <p>
 The ISP Scheduler helps carry out the POE algorithm. As it turns out, the POE algorithm is
 based on exploiting MPI&#146;s out-of-order completion semantics. In other words,
 </p>

 <ul>
     <li>
         (i) Not all MPI operations issued by a process complete in that order.
     </li>
     <li>
         (ii) A proper modeling of this out of-order completion semantics is essential in
         order to meet goals G1 and G2. For example, two MPI_Isend commands issued in
         succession by an MPI process P1 to the same target process (say P2) are forced to
         match in order, whereas if these MPI_Isends are targeted to two different MPI
         processes, then they may match contrary to the issue order. As another example,
         any operation following an MPI_Barrier must complete only after the barrier has
         completed, while an operation issued before the barrier may linger across the
         barrier, and actually complete after the barrier [6]!
     </li>
 </ul>

 <h3>Description of POE Algorithm:</h3>
 <p>POE stands for &#34;POR with out-of-order and elusive interleavings" </P>
 <P>
 MPI programs suffer from &#34;elusive&#34; interleavings in the presence of MPI
 wildcard Receive <I>R</I>(*) . Issuing the MPI sends in different orders does not
 help as the MPI runtime provides no guarantees based on when the Send is
 actually issued. POE employs dynamic source re-writing in place of wildcard
 so that every interleaving explored by POE algorithm is deterministic. Also,
 the presence of collectives such as <CODE>MPI\_Barrier</CODE> can cause traditional
 runtime verification techniques such as &#34;Dynamic Partial Order Reduction" to
 fail since it is not always possible to issue instructions in different orders
 in the presence of collectives. POE also addresses this issue by creating
 &#34;match sets".
 </P>

 <p>
 POE algorithm works by creating a graph structure with the MPI operations of
 processes as nodes and by adding <EM>Intra-Completes-Before</EM> (IntraCB) edges
 between these nodes. As the name suggests, IntraCB edges are only added between
 the MPI operations within the MPI processes. The IntraCB edges are added
 between the MPI operations based on the MPI semantics of the operations. MPI
 functions like <CODE>MPI_Barriers</CODE>, <CODE>MPI_Wait</CODE> etc have the semantics,
 that no later MPI operations can be issued until these MPI operations complete.
 We call such operations as <EM>fence operations</EM>. Note that instructions
 issued before the fence operations can linger even after the fence operation
 is complete. </P><P>The IntraCB edges are added based on the following rules:
 Let <I>i</I> and <I>j</I> be the MPI operations of a process <I>P</I> such that <I>i</I> &lt; <I>j</I> (i.e.,
 <I>i</I> comes before <I>j</I> in the program order. There is an IntraCB edge between <I>i</I>
 and <I>j</I> is one of the following is true:
 </P>

 <UL>
     <LI><I>i</I> is a <EM>fence</EM> operation.</LI>
     <LI><I>i</I> and <I>j</I> are send operations and the destination of <I>i</I> and <I>j</I> are the same</LI>
     <LI><I>i</I> and <I>j</I> are receive operations and the source of <I>i</I> and <I>j</I> are the same</LI>
     <LI><I>i</I> is a wildcard receive operation and <I>j</I> is a receive operation.</LI>
 </UL>

 <P>
 If there is a path from <I>i</I> to <I>j</I> then <I>i</I> is called the ancestor of <I>j</I>. POE
 algorithm uses these IntraCB edges to form <EM>match-sets</EM> to be issued into
 the MPI runtime. The following is the (informal) algorithm:</P><P>An MPI operation is &#34;matched&#34;
 if it has been issued to the MPI runtime by the POE algorithm.
 </P>

 <OL CLASS="enumerate" type=1>
     <LI>Run the MPI processes until they all reach fence operations.</LI>
     <LI>Let <I>e</I> denote the MPI operations of all processes such that for each <I>i</I> &#8712; <I>e</I>, all ancestors of <I>i</I> have been matched.</LI>
     <LI>Let <I>m</I> denote the match-sets formed out of <I>e</I> as follows:
     <UL>
         <LI>
             <B>SR-Match:</B> <I>i</I>, <I>j</I> &#8712; <I>e</I> such that <I>i</I> and <I>j</I> are send and receive
             operations where source (destination) of <I>i</I> is equal to the destination
             (source)j, then &#9001; <I>i</I>, <I>j</I>&#9002; form a match set.
         </LI>
         <LI>
             <B>SR*-Match</B> If <I>i</I> is a wildcard receive and <I>j</I><SUB>1</SUB>,<I>j</I><SUB>2</SUB>&#8230; <I>j</I><SUB><I>n</I></SUB> are the matching
             sends, then create the match-sets as {&#9001; <I>i</I>, <I>j</I><SUB>1</SUB>&#9002;, &#9001; <I>i</I>,
             <I>j</I><SUB>2</SUB>&#9002;, &#8230; &#9001; <I>i</I>, <I>j</I><SUB><I>n</I></SUB>&#9002;}. Each element of the set is issued as a different interleaving.
         </LI>
         <LI>
             <B>C-Match:</B> If <I>i</I><SUB>1</SUB>,<I>i</I><SUB>2</SUB>&#8230; <I>i</I><SUB><I>n</I></SUB> are the same collective operation of <I>n</I> processes
             then &#9001; <I>i</I><SUB>1</SUB>, <I>i</I><SUB>2</SUB>, &#8230;, <I>i</I><SUB><I>n</I></SUB>&#9002; form a match set.
         </LI>
         <LI>
             <B>W-Match</B>If <I>i</I> is an <CODE>MPI_Wait</CODE>, then &#9001; <I>i</I> &#9002; forms a match set.
         </LI>
     </UL>
     </LI>
     <LI>
         Issue the match sets to MPI runtime in the following priority order: <EM>C-Match &gt; W-Match &gt; SR-Match &gt; SR*-Match</EM>.
     </LI>
     <LI>
         If any of the processes issued a fence operation then goto step 1. Else, goto step 4.
     </LI>
     <LI>
         If there are no match sets and all processes have issued <CODE>MPI_Finalize</CODE>, check if any wildcard match sets are yet to be explored. If yes, then restart the processes and goto step 1, otherwise, exit.
     </LI>
     <LI>
         If there are no match sets and there is at least one process that has not issued <CODE>MPI_Finalize</CODE>, then there is a deadlock. Flag a deadlock and kill all the MPI processes.
     </LI>
 </OL>

 <h3>Obtaining ISP and Examples</h3>

 ISP can be downloaded from the Gauss Group&#146;s <a href="http://www.cs.utah.edu/formal_verification/ISP-release">ISP Release Page</a>.

 <p>
     What you will receive are the full sources as well as the following examples in the directory isp-tests:
 </p>

 <ul>
     <li>
         12 examples pertaining to the Game of Life (in directory life, courtesy of Bill Gropp of UIIC
         and Rusty Lusk of Argonne from their EuroPVM/MPI 2007 tutorials)
     </li>
     <li>
         74 examples of the Umpire test suite (in directory Umpire), courtesy of Bronis de Supinski from LLNL.
     </li>
     <li>
         12 examples under others developed by various authors.
     </li>
     <li>
         16 examples under madre-0.3 courtesy of Stephen F. Siegel (U. of Delaware) from his Madre release (see our PPoPP 2009 paper).
     </li>
     <li>
         The Parmetis 3.1 hypergraph partitioner (see our PPoPP 2009 paper).
     </li>
 </ul>

 <table cellspacing="5">
     <tr>
         <td>
             <img src="images/trident_transparent.png">
         </td>
         <td>
             <h3>Why the Trident?</h3>
             <p>
                 Because It is a universal symbol for &#34;slaying the evil bug&#34;!
             </p>
         </td>
     </tr>
 </table>

 <br>
 <hr>

 <h3>References</h3>
 <ul>
     <li>
         [1] ParMETIS - Parallel graph partitioning and fill-reducing matrix ordering.
     </li>
     <li>
         [2] Jeffrey S. Vetter and Bronis R. de Supinski. Dynamic software testing of MPI
         applications with Umpire. In Supercomputing &#146;00: Proceedings of the 2000 ACM/IEEE
         Conference on Supercomputing (CDROM). IEEE Computer Society, 2000. Article 51.
     </li>
     <li>
        [3] Sarvani Vakkalanka, Ganesh Gopalakrishnan, and Robert M. Kirby. Dynamic verification
        of MPI programs with reductions in presence of split operations and relaxed orderings.
        In Aarti Gupta and Sharad Malik, editors, CAV, volume 5123 of Lecture Notes in
        Computer Science. Springer, 2008.
     </li>
     <li>
         [4] Stephen F. Siegel. The MADRE web page.
     </li>
     <li>
         [5] Edmund M. Clarke, Orna Grumberg, and Doron A. Peled. Model Checking. MIT Press, 2000.
     </li>
     <li>
         [6] Anh Vo, Sarvani Vakkalanka, Rajeev Thakur, Michael DeLisi, Ganesh Gopalakrishnan and Robert M. Kirby.
         Formal Verification of Practical MPI Programs. PPoPP &#146;09, February 14&#150;18, 2009, Raleigh, North Carolina, USA.
     </li>
 </ul>

 <br/>
 <hr/>

 <br/><br/>
 <p><a href="#top">Back to Top</a> | <a href="toc.html">Back to Table of Contents</a></p>
 <br/><br/>

 <hr/>
 <center>
 <p>
 School of Computing * 50 S. Central Campus Dr. Rm. 3190 * Salt Lake City, UT
 84112 * <A href="mailto:isp-dev@cs.utah.edu">isp-dev@cs.utah.edu</a><br>
 <a href="http://www.eclipse.org/org/documents/epl-v10.php">License</a>
 </p>
 </center>

 </body>
 </html>
	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

	<html>
	<head>
	<link rel="stylesheet" type="text/css" href="help.css">
	<title>GEM - Overview and Features</title>
	</head>

	<body>
	<a name="top"></a>
	<table cellspacing="5">
	<tr>
	<td>
	<img src="images/trident_transparent.png">
	</td>
	<td>
	<h1>GEM - Overview and Features</h1>
	</td>
	</tr>
	</table>

	<hr>

	<p>
	Graphical Explorer of MPI Programs (GEM), is a graphical front end for
	In-situ Partial Order (ISP), a dynamic formal verification tool for MPI C/C++
	programs, developed at the University of Utah. GEM greatly enhances the usability
	of ISP by providing an intuitive graphical way to track down bugs in MPI C/C++
	programs, examine post-runtime results and to analyze MPI runtime behavior.
	</p>

	<p>
	GEM can be used by anyone who can write simple MPI C/C++ programs, and requires no
	special training. GEM allows you to formally verify your MPI C/C++ programs automatically
	without any extra efforts on your part. No code instrumentation is necessary. GEM
	will definitively flag the following errors and warnings:
	</p>

	<ul>
	<li>Deadlocks</li>
	<li>Functionally Irrelevant Barriers</li>
	<li>MPI Object Leaks</li>
	<li>Local Assertion Violations</li>
	<li> MPI Type Mismatches</li>
	</ul>

	<h3>Analyzer View</h3>
	<p>
	The <a href="analyzerView.html">Analyzer View</a> is a unique user interface that visually displays the output
	from the log file that was generated at runtime (by ISP) by highlighting lines representing match sets (p2p and collective send-recv matches) in the source file.
	It shows both the current MPI call, and any matching point-to-point or collective operation.
	The <a href="analyzerView.html">Analyzer View</a> also allows the user to examine
	MPI calls for a particular rank with an easy to use <b>Rank Locking</b> feature.
	</p>

	<p>
	The Analyzer View offers the user the ability to "step" through execution of the code in debug fashion
	examining the execution of the MPI calls in the program as they happened at runtime. The user can view this in
	two different orderings:
	</p>
	<ul>
	<li>
	Internal Issue Order - The order in which the MPI calls were issued to the MPI runtime.
	</li>
	<li>
	Program Order - The order in which the MPI calls appear in the source code.
	</li>
	</ul>

	<p>
	The source code analyzer also offers a feature to browse calls by rank and by interleaving as well as the ability to
	launch the <a href="happensBeforeViewer.html">Happens Before Viewer</a>, another visualization tool provided by GEM.
	</p>

	<br>
	<img src="images/analyzer.png" width="858" height="581"/>
	<br>

	<p>
	In addition, it helps you understand as well as step through all relevant process interleavings
	(schedules). Notice our use of the word <b>relevant</b>: even a short MPI program may have too many
	(an "exponential number") interleavings. For example, an MPI program with five processes, each
	containing five MPI calls, can have well in excess of 1000 interleavings. However, ISP generates
	a new interleaving only when it represents a truly new (as yet unexamined) behavior in your MPI program.
	</p>

	<h3>As Examples:</h3>
	<ul>
	<li>
	If an MPI program consisting of two processes issues point-to-point sends and non-wildcard
	receives to each other, then there is no observable difference (apart from performance or
	resource consumption) whether the sends are posted before each matching receive or vice
	versa; in this case, ISP will generate only one interleaving;
	</li>
	<li>
	If an MPI program consisting of three processes is such that the second process issues a
	sequence of wildcard receives, and the first and third process issue point-to-point sends
	to process two, then the exact non-deterministic choices of matches made at each step may
	affect the ensuing computation (including the conditional outcomes). In such programs, it
	is possible to force an exponential number of interleavings. In practice, here is how the
	results look (these examples come with the ISP distribution).
	</li>
	<li>
	For Parmetis [1], a 14,000+ line MPI program that makes 56,990 MPI calls for four ranks and
	171,956 calls for eight ranks, ISP needed to examine only one interleaving!
	</li>
	<li>
	For many tests in the Umpire test suite [2], conventional MPI testing tools missed deadlocks
	despite examining many schedules. ISP determined a minimal list of interleavings to examine,
	finding deadlocks whenever they existed (see our table of results at the URL given under [3]).
	</li>
	<li>
	For Madre [4], a naive algorithm can result in n! interleavings. An improved algorithm
	called POE B, present in the current release of ISP reduces this to one interleaving.
	Much like with existing partial order [5], ISP does not guarantee a
	minimal number of interleavings, although it comes pretty close to it.
	</li>
	</ul>

	<h3>Browser View</h3>

	<p>
	The <a href="browserView.html">Browser View</a> is a highly functional view that summarizes and categorically groups all
	errors and warnings found by GEM for a particular MPI verification run. The summary label (just above the tabs)
	gives the user a quick overview of problems found. Errors and warnings are mapped to the corresponding
	source code lines in the Eclipse editor. The Browser View's tabs help the user to quickly localize
	and fix MPI errors and warnings flagged by GEM within their source code.
	</p>

	<ul>
	<li>
	<b>Deadlocks:</b>
	This tab allows the user to visit all calls that did not complete due
	to a detected deadlock.
	</li>
	<li>
	<b>Assertion Violations:</b>
	This tab lists all local assertion violations found. Clicking any
	one of these will take the user to the line of source code with the failed assertion.
	</li>
	<li>
	<b>Leaks:</b>
	Clicking any of these will take the user to the line of source code where the
	corresponding un-freed MPI resource was allocated (these are usually MPI communicators and MPI DataTypes).
	</li>
	<li>
	<b>Irrelevant Barriers:</b>
	This tab is reserved for Functionally Irrelevant Barriers. Clicking any these will take the user to
	the corresponding MPI_Barrier call itself, which can then be safely removed without changing the
	runtime behavior of the MPI application. This could potentially speed the application up with the
	absence of unnecessary synchronization.
	</li>
	<li>
	<b>Type Mismatches:</b>
	These are warnings about the send type not matching the receive type. In some cases this may be
	intentional, but often times it is a programming mistake.
	</li>
	</ul>

	<p>
	At the top left is a brief summary of the results indicating the number of errors and warnings found.
	In the top right are buttons provided with the ability to set the number of processes and re-run GEM
	on the current MPI project.
	</p>

	<br>
	<img src="images/browser.png" width="755" height="368"/>
	<br>

	<h3>How GEM Works:</h3>
	<p>
	GEM employs the use of In-situ Partial Order (ISP), the underlying dynamic formal verification tool, to verify
	MPI C/C++ programs. Here we describe ISP in detail to provide a better understanding of how it works.
	</p>
	<p>
	ISP works by intercepting the MPI calls made by the target program and making decisions on when
	to send these MPI calls to the MPI library. This is accomplished by the two main components of ISP:
	the Profiler and the Scheduler. An overview of ISPs components and their interaction with the
	program as well as the MPI library is provided in Figure 1 below.
	</p>

	<center>
	<img src="images/ISP-diagram.gif"><br>
	<b>Figure 1</b>
	</center>

	<h3>Profiler:</h3>
	<p>
	The interception of MPI calls is accomplished by compiling the ISP Profiler together with
	the target programs source code. The Profiler makes use of MPIs profiling mechanism
	(PMPI). Then the Profiler provides its own version of MPI <i>f</i> (replacing the original) for each corresponding MPI function <i>f</i>. Within
	each of these MPI <i>f</i> , the profiler communicates with the scheduler using TCP sockets to
	send information about the MPI call the process wants to make. It will then wait for the
	scheduler to make a decision on whether to send the MPI call to the MPI library or to
	postpone it until later. When the permission to fire <i>f</i> is given from the scheduler, the
	corresponding PMPI <i>f</i> will be issued to the MPI run-time. Since all MPI libraries come
	with functions such as PMPI <i>f</i> for every MPI function <i>f</i>, this approach provides a portable
	and light-weight instrumentation mechanism for MPI programs being verified [6].
	</p>

	<h3>Scheduler:</h3>
	<p>
	The ISP Scheduler helps carry out the POE algorithm. As it turns out, the POE algorithm is
	based on exploiting MPIs out-of-order completion semantics. In other words,
	</p>

	<ul>
	<li>
	(i) Not all MPI operations issued by a process complete in that order.
	</li>
	<li>
	(ii) A proper modeling of this out of-order completion semantics is essential in
	order to meet goals G1 and G2. For example, two MPI_Isend commands issued in
	succession by an MPI process P1 to the same target process (say P2) are forced to
	match in order, whereas if these MPI_Isends are targeted to two different MPI
	processes, then they may match contrary to the issue order. As another example,
	any operation following an MPI_Barrier must complete only after the barrier has
	completed, while an operation issued before the barrier may linger across the
	barrier, and actually complete after the barrier [6]!
	</li>
	</ul>

	<h3>Description of POE Algorithm:</h3>
	<p>POE stands for "POR with out-of-order and elusive interleavings" </P>
	<P>
	MPI programs suffer from "elusive" interleavings in the presence of MPI
	wildcard Receive <I>R</I>(*) . Issuing the MPI sends in different orders does not
	help as the MPI runtime provides no guarantees based on when the Send is
	actually issued. POE employs dynamic source re-writing in place of wildcard
	so that every interleaving explored by POE algorithm is deterministic. Also,
	the presence of collectives such as <CODE>MPI\_Barrier</CODE> can cause traditional
	runtime verification techniques such as "Dynamic Partial Order Reduction" to
	fail since it is not always possible to issue instructions in different orders
	in the presence of collectives. POE also addresses this issue by creating
	"match sets".
	</P>

	<p>
	POE algorithm works by creating a graph structure with the MPI operations of
	processes as nodes and by adding <EM>Intra-Completes-Before</EM> (IntraCB) edges
	between these nodes. As the name suggests, IntraCB edges are only added between
	the MPI operations within the MPI processes. The IntraCB edges are added
	between the MPI operations based on the MPI semantics of the operations. MPI
	functions like <CODE>MPI_Barriers</CODE>, <CODE>MPI_Wait</CODE> etc have the semantics,
	that no later MPI operations can be issued until these MPI operations complete.
	We call such operations as <EM>fence operations</EM>. Note that instructions
	issued before the fence operations can linger even after the fence operation
	is complete. </P><P>The IntraCB edges are added based on the following rules:
	Let <I>i</I> and <I>j</I> be the MPI operations of a process <I>P</I> such that <I>i</I> < <I>j</I> (i.e.,
	<I>i</I> comes before <I>j</I> in the program order. There is an IntraCB edge between <I>i</I>
	and <I>j</I> is one of the following is true:
	</P>

	<UL>
	<LI><I>i</I> is a <EM>fence</EM> operation.</LI>
	<LI><I>i</I> and <I>j</I> are send operations and the destination of <I>i</I> and <I>j</I> are the same</LI>
	<LI><I>i</I> and <I>j</I> are receive operations and the source of <I>i</I> and <I>j</I> are the same</LI>
	<LI><I>i</I> is a wildcard receive operation and <I>j</I> is a receive operation.</LI>
	</UL>

	<P>
	If there is a path from <I>i</I> to <I>j</I> then <I>i</I> is called the ancestor of <I>j</I>. POE
	algorithm uses these IntraCB edges to form <EM>match-sets</EM> to be issued into
	the MPI runtime. The following is the (informal) algorithm:</P><P>An MPI operation is "matched"
	if it has been issued to the MPI runtime by the POE algorithm.
	</P>

	<OL CLASS="enumerate" type=1>
	<LI>Run the MPI processes until they all reach fence operations.</LI>
	<LI>Let <I>e</I> denote the MPI operations of all processes such that for each <I>i</I> ∈ <I>e</I>, all ancestors of <I>i</I> have been matched.</LI>
	<LI>Let <I>m</I> denote the match-sets formed out of <I>e</I> as follows:
	<UL>
	<LI>
	<B>SR-Match:</B> <I>i</I>, <I>j</I> ∈ <I>e</I> such that <I>i</I> and <I>j</I> are send and receive
	operations where source (destination) of <I>i</I> is equal to the destination
	(source)j, then 〈 <I>i</I>, <I>j</I>〉 form a match set.
	</LI>
	<LI>
	<B>SR*-Match</B> If <I>i</I> is a wildcard receive and <I>j</I><SUB>1</SUB>,<I>j</I><SUB>2</SUB>… <I>j</I><SUB><I>n</I></SUB> are the matching
	sends, then create the match-sets as {〈 <I>i</I>, <I>j</I><SUB>1</SUB>〉, 〈 <I>i</I>,
	<I>j</I><SUB>2</SUB>〉, … 〈 <I>i</I>, <I>j</I><SUB><I>n</I></SUB>〉}. Each element of the set is issued as a different interleaving.
	</LI>
	<LI>
	<B>C-Match:</B> If <I>i</I><SUB>1</SUB>,<I>i</I><SUB>2</SUB>… <I>i</I><SUB><I>n</I></SUB> are the same collective operation of <I>n</I> processes
	then 〈 <I>i</I><SUB>1</SUB>, <I>i</I><SUB>2</SUB>, …, <I>i</I><SUB><I>n</I></SUB>〉 form a match set.
	</LI>
	<LI>
	<B>W-Match</B>If <I>i</I> is an <CODE>MPI_Wait</CODE>, then 〈 <I>i</I> 〉 forms a match set.
	</LI>
	</UL>
	</LI>
	<LI>
	Issue the match sets to MPI runtime in the following priority order: <EM>C-Match > W-Match > SR-Match > SR*-Match</EM>.
	</LI>
	<LI>
	If any of the processes issued a fence operation then goto step 1. Else, goto step 4.
	</LI>
	<LI>
	If there are no match sets and all processes have issued <CODE>MPI_Finalize</CODE>, check if any wildcard match sets are yet to be explored. If yes, then restart the processes and goto step 1, otherwise, exit.
	</LI>
	<LI>
	If there are no match sets and there is at least one process that has not issued <CODE>MPI_Finalize</CODE>, then there is a deadlock. Flag a deadlock and kill all the MPI processes.
	</LI>
	</OL>

	<h3>Obtaining ISP and Examples</h3>

	ISP can be downloaded from the Gauss Groups <a href="http://www.cs.utah.edu/formal_verification/ISP-release">ISP Release Page</a>.

	<p>
	What you will receive are the full sources as well as the following examples in the directory isp-tests:
	</p>

	<ul>
	<li>
	12 examples pertaining to the Game of Life (in directory life, courtesy of Bill Gropp of UIIC
	and Rusty Lusk of Argonne from their EuroPVM/MPI 2007 tutorials)
	</li>
	<li>
	74 examples of the Umpire test suite (in directory Umpire), courtesy of Bronis de Supinski from LLNL.
	</li>
	<li>
	12 examples under others developed by various authors.
	</li>
	<li>
	16 examples under madre-0.3 courtesy of Stephen F. Siegel (U. of Delaware) from his Madre release (see our PPoPP 2009 paper).
	</li>
	<li>
	The Parmetis 3.1 hypergraph partitioner (see our PPoPP 2009 paper).
	</li>
	</ul>

	<table cellspacing="5">
	<tr>
	<td>
	<img src="images/trident_transparent.png">
	</td>
	<td>
	<h3>Why the Trident?</h3>
	<p>
	Because It is a universal symbol for "slaying the evil bug"!
	</p>
	</td>
	</tr>
	</table>

	<br>
	<hr>

	<h3>References</h3>
	<ul>
	<li>
	[1] ParMETIS - Parallel graph partitioning and fill-reducing matrix ordering.
	</li>
	<li>
	[2] Jeffrey S. Vetter and Bronis R. de Supinski. Dynamic software testing of MPI
	applications with Umpire. In Supercomputing 00: Proceedings of the 2000 ACM/IEEE
	Conference on Supercomputing (CDROM). IEEE Computer Society, 2000. Article 51.
	</li>
	<li>
	[3] Sarvani Vakkalanka, Ganesh Gopalakrishnan, and Robert M. Kirby. Dynamic verification
	of MPI programs with reductions in presence of split operations and relaxed orderings.
	In Aarti Gupta and Sharad Malik, editors, CAV, volume 5123 of Lecture Notes in
	Computer Science. Springer, 2008.
	</li>
	<li>
	[4] Stephen F. Siegel. The MADRE web page.
	</li>
	<li>
	[5] Edmund M. Clarke, Orna Grumberg, and Doron A. Peled. Model Checking. MIT Press, 2000.
	</li>
	<li>
	[6] Anh Vo, Sarvani Vakkalanka, Rajeev Thakur, Michael DeLisi, Ganesh Gopalakrishnan and Robert M. Kirby.
	Formal Verification of Practical MPI Programs. PPoPP 09, February 1418, 2009, Raleigh, North Carolina, USA.
	</li>
	</ul>

	<br/>
	<hr/>

	<br/><br/>
	<p><a href="#top">Back to Top</a> \| <a href="toc.html">Back to Table of Contents</a></p>
	<br/><br/>

	<hr/>
	<center>
	<p>
	School of Computing * 50 S. Central Campus Dr. Rm. 3190 * Salt Lake City, UT
	84112 * <A href="mailto:isp-dev@cs.utah.edu">isp-dev@cs.utah.edu</a><br>
	<a href="http://www.eclipse.org/org/documents/epl-v10.php">License</a>
	</p>
	</center>

	</body>
	</html>