<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- Copyright (c) 2007 IBM Corporation. -->
<!-- All rights reserved. This program and the accompanying materials -->
<!-- are made available under the terms of the Eclipse Public License v1.0 -->
<!-- which accompanies this distribution, and is available at -->
<!-- http://www.eclipse.org/legal/epl-v10.html -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Running IBM LoadLeveler Batch Jobs</title>
<link rel="stylesheet" type="text/css" href="help.css">
</head>
<body>
<h1>Running IBM LoadLeveler Batch Jobs</h1>
<h2><A name=introduction></A>Introduction</h2>
<p>The IBM LoadLeveler (LL) Eclipse plug-in, part of the
<a href="PLUGINS_ROOT/org.eclipse.ptp.help/html/toc.html">Parallel Tools Platform</a> (PTP),
allows you to run a parallel or serial application (job) using the Tivoli
Workload Scheduler LoadLeveler from within the PTP framework. In addition to running parallel
and serial jobs, you can monitor their status and control their execution from within PTP.</p>
<h2><A name=requirements></A>Requirements</h2>
<p>In addition to Eclipse and PTP, you need the following software installed:</p>
<ul>
<li>IBM Tivoli Workload Scheduler LoadLeveler 3.4 or later</li>
</ul>
<p>See the following IBM Tivoli Workload Scheduler LoadLeveler V3.4 guides for specific LoadLeveler
installation and configuration requirements:</p>
<ul>
<li>
<a href="http://publib.boulder.ibm.com/epubs/pdf/am2in206.pdf">
Tivoli Workload Scheduler LoadLeveler V3.4 Installation Guide</a>
</li>
<li>
<a href="http://publib.boulder.ibm.com/epubs/pdf/am2ug305.pdf">
Tivoli Workload Scheduler LoadLeveler V3.4 Using and Administering Guide</a>
</li>
<li>
<a href="http://publib.boulder.ibm.com/epubs/pdf/am2mg105.pdf">
Tivoli Workload Scheduler LoadLeveler V3.4 Diagnosis and Messages Guide</a>
</li>
</ul>
<h2><A name=starting></A>Getting Started</h2>
<p>Define an appropriate project, either a C/C++ project using CDT or a FORTRAN project using Photran. You can define
either a standard make project or a managed make project.</p>
<p>To build a parallel program that runs with the LoadLeveler proxy, use the following compiler invocation commands,
either in your makefile or when specifying the compiler and linker in a managed make project.</p>
<ul>
<li>AIX
<ul>
<li>mpcc_r for C programs</li>
<li>mpCC_r for C++ programs</li>
<li>mpxlf_r for FORTRAN programs</li>
</ul>
</li>
<li>Linux
<ul>
<li>mpcc for C programs</li>
<li>mpCC for C++ programs</li>
<li>mpfort for FORTRAN programs</li>
</ul>
</li>
</ul>
<p>The following image shows how the compiler is specified for a managed make project. Specify the matching linker
as well, in the settings for the linker.</p>
<img src="images/project_create.jpg">
<h2><A name=preferences></A>Setting Up Preferences</h2>
<p>Once you have created your project, switch to the PTP runtime perspective, set up the default
preferences for LoadLeveler, and specify the required resource manager. This setup only needs to be done once.</p>
<p>To set up the default preferences, select <b>Window &gt; Preferences &gt; PTP &gt; Resource Managers &gt; LoadLeveler</b>.
The IBMLL preferences panel is displayed. Here, you specify the default path of the resource manager proxy to be used for LoadLeveler.</p>
<img src="images/ibmll_preferences.jpg">
<p>In addition to the location of the proxy, you may specify parameters that control the execution of the
proxy and the front-end GUI.</p>
<ul>
<li><b>Library Proxy Private Library Path</b> - If you have a nonstandard LoadLeveler installation, you can specify the
directory containing the LoadLeveler shared library libllapi.a (AIX) or libllapi.so (Linux).</li>
<li><b>LoadLeveler Proxy Message Options</b> - Turn on or off different categories of messages generated by
the LoadLeveler proxy.</li>
<li><b>LoadLeveler Proxy Debug Options</b> - If you are debugging the proxy, you can force the proxy into a spin loop
at its main entry point so that a debugger can be attached. Once the debugger is attached, set any breakpoints you need, then
end the loop by setting the "debug_loop" variable to 0 in the debugger and continuing execution.</li>
<li><b>LoadLeveler Gui Message Options</b> - Turn on or off different categories of messages generated by
the LoadLeveler Resource Manager Launch.</li>
<li><b>LoadLeveler Multicluster Options</b> - With LoadLeveler version 3.4.1.2 or later, LoadLeveler
informs the proxy whether or not it is configured to run Multicluster. With an earlier version of LoadLeveler,
you must tell the proxy by forcing the Local or Multicluster mode.</li>
<li><b>LoadLeveler Job Command File Template Override</b> - When submitting jobs using the basic mode, parameters are
substituted into a template file before the job is submitted to LoadLeveler. You can specify your own template file
containing a mix of the keywords to be substituted and specific job command file parameters of your choice.</li>
<li><b>LoadLeveler Basic Template Options</b> - You can instruct the proxy to write the template file it is using to /tmp each time
the proxy is started and the main entry point is entered, or to never write the template file, for those cases where you are
using only advanced mode or are specifying your own template file.</li>
<li><b>LoadLeveler Proxy Polling Options</b> - To determine node and job status, the proxy must poll
LoadLeveler. The polling options specify the number of seconds between queries. Node polling takes a minimum and a maximum interval.
Each time the proxy polls LoadLeveler for node status and finds no changes since the last poll, the interval
is increased incrementally until the maximum is reached. This helps to minimize the impact on LoadLeveler of
continuously polling for data.</li>
</ul>
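<p>The node polling behavior described above can be sketched as follows. This is only an illustration, not the proxy's
actual code: the function names, the step size, and the default values are hypothetical, and resetting the interval to
the minimum after a change is an assumption that this page does not confirm.</p>

```python
def next_poll_interval(current, changed, minimum=30, maximum=300, step=10):
    """Return the number of seconds to wait before the next node poll."""
    if changed:
        # Assumption: a detected change resets the interval to the minimum.
        return minimum
    # No change since the last poll: lengthen the interval, capped at the maximum.
    return min(current + step, maximum)


def simulate(changes, minimum=30, maximum=300, step=10):
    """Yield the interval chosen after each poll result in `changes`."""
    interval = minimum
    for changed in changes:
        interval = next_poll_interval(interval, changed, minimum, maximum, step)
        yield interval
```

<p>For example, with a minimum of 30, a maximum of 50, and a step of 10, three polls without changes would give
intervals of 40, 50, and 50 seconds under these assumptions.</p>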
<h2><A name=createmanager></A>Creating a Resource Manager</h2>
<p>Once you have set up preferences, create the resource manager for LoadLeveler. To do this, right-click in the
resource managers view and select <b>Add Resource Manager</b> from the menu. A resource manager wizard is
displayed. Select <b>IBMLL</b> from the list of resource manager types and click <b>Next</b>. The next panel configures
the IBMLL resource manager. Select the remote service provider and proxy server location from the dropdown lists. You can
also create a new proxy server location by clicking the <b>New</b> button in this panel. Specify the path to the proxy
server executable, or accept the default value filled in from the preferences panel. Uncheck the <b>Launch server
manually</b> checkbox and click <b>Next</b> again. On the final page of the resource manager configuration wizard, either accept the
default resource manager name, or uncheck the checkbox and enter a name and description for the resource manager.
Then click <b>Finish</b> to create the resource manager entry.</p>
<p>The images below show the series of IBMLL proxy configuration panels of the resource manager wizard.</p>
<p>Configure the resource manager.</p>
<img src="images/ibmll_proxy_config_1.jpg">
<p>Override preference settings specifically for this instance of the resource manager.</p>
<img src="images/ibmll_proxy_config_2.jpg">
<p>Name this instance of the resource manager.</p>
<img src="images/ibmll_proxy_config_3.jpg">
<p>You may have as many resource managers defined as you require for your system configuration.</p>
<h2><A name=startmanager></A>Starting a Resource Manager</h2>
<p>Once you have defined a resource manager, you must start it before you can use it to run a parallel application.
To start a resource manager, right-click the name of the resource manager in the resource managers
view, then select <b>Start resource manager</b> from the popup menu. Once started, the icon for the selected resource
manager turns green.</p>
<p>If necessary, you can stop a resource manager by right-clicking its name in the resource managers view and selecting
<b>Stop resource manager</b> from the popup menu. Normally, a resource manager shuts down automatically when PTP is
shut down, and if it was running when PTP was shut down, it starts automatically when PTP is started again.</p>
<img src="images/ibmll_start_resource_manager.jpg">
<h2><A name=running></A>Running a Parallel Application</h2>
<p>Once an application has been compiled, the first step in running it is to make sure the resource manager for
LoadLeveler is running on a node in the LoadLeveler cluster where you intend to start the application. To do this, check the resource
manager's entry in the resource managers view. The icon for the resource manager should be green. If it is not green,
right-click the resource manager's name and select <b>Start resource manager</b> from the popup menu to start the
resource manager.</p>
<p>Before running an application, it must have a launch configuration defined for it. A launch configuration contains
all of the settings required to run the application. The values entered in a launch configuration are saved so that the
next time the application is run using the same launch configuration, those values do not need to be entered again.</p>
<p>To create a launch configuration, click the <b>Run</b> menu in the Eclipse menubar and select <b>Open Run Dialog</b>.
Alternatively, click the dropdown next to the run icon in the Eclipse toolbar and select <b>Open Run Dialog</b>.
When the run dialog is displayed, right-click <b>Parallel Application</b> and click <b>New</b> from the menu.
This creates a new launch configuration. Set its name to the desired value. Select the <b>Main</b> tab and fill in
the name of the project and the application program.</p>
<img src="images/ibmll_run_application_1.jpg">
<p>Make sure that the debugger tab has an appropriate debugger selected.
Fill in the Arguments tab with any application program arguments. Fill in the Environment tab with any environment variable
settings required by the application.</p>
<p>The <b>Resources</b> tab is where invocation options unique to LoadLeveler are specified. When it is initially displayed,
the <b>advanced</b> option is selected, allowing you to specify an existing, complete LoadLeveler job command
file (no substitutions permitted). This is intended for experienced LoadLeveler users who have constructed their own
complex job command file to run parallel or serial jobs.</p>
<img src="images/ibmll_run_advanced_application_2.jpg">
<p>When you select <b>Run</b>, the job command file is passed to the LoadLeveler resource manager proxy, which submits it
to LoadLeveler for execution. When the job starts, it appears in the PTP view.</p>
<img src="images/ibmll_run_advanced_application_3.jpg">
<p>If the <b>Basic</b> option is selected, you can specify a template
file for parameter value substitution. Multiple tabbed panels are activated to allow specification
of a subset of LoadLeveler job command file parameters. If you need to specify a parameter not supported by the tabbed panels,
you must override the template file being used and add the parameter to your own template.</p>
<img src="images/ibmll_run_basic_application_2.jpg">
<p>Example of the <b>Nodes/Network</b> tab:</p>
<img src="images/ibmll_run_basic_application_3.jpg">
<h2><A name=template></A>Template File For Basic Mode</h2>
<p>Currently, only a subset of LoadLeveler job command file parameters can be specified on the tabbed panels. The following
is an example of the template file used by the LoadLeveler resource manager, listing all of the possible parameters,
including those that are not supported by the resource manager (marked NOT SUPPORTED). If
you want to use any parameter not supported by the tabbed panels, copy the template file to your own file
and customize that copy.<br><br>
<small>
#!/bin/sh<br>
# @ account_no = &lt;&lt;&lt;LL_PTP_ACCOUNT_NO&gt;&gt;&gt;<br>
# @ arguments = &lt;&lt;&lt;progArgs&gt;&gt;&gt;<br>
#(NOT SUPPORTED)# @ bg_connection<br>
#(NOT SUPPORTED)# @ bg_partition<br>
#(NOT SUPPORTED)# @ bg_requirements<br>
#(NOT SUPPORTED)# @ bg_route<br>
#(NOT SUPPORTED)# @ bg_shape<br>
#(NOT SUPPORTED)# @ bg_size<br>
# @ blocking = &lt;&lt;&lt;LL_PTP_BLOCKING&gt;&gt;&gt;<br>
# @ bulkxfer = &lt;&lt;&lt;LL_PTP_BULKXFER&gt;&gt;&gt;<br>
# @ checkpoint = &lt;&lt;&lt;LL_PTP_CHECKPOINT&gt;&gt;&gt;<br>
# @ ckpt_dir = &lt;&lt;&lt;LL_PTP_CKPT_DIR&gt;&gt;&gt;<br>
# @ ckpt_execute_dir = &lt;&lt;&lt;LL_PTP_CKPT_EXECUTE_DIR&gt;&gt;&gt;<br>
# @ ckpt_file = &lt;&lt;&lt;LL_PTP_CKPT_FILE&gt;&gt;&gt;<br>
# @ ckpt_time_limit = &lt;&lt;&lt;LL_PTP_CKPT_TIME_LIMIT_HARD&gt;&gt;&gt;,&lt;&lt;&lt;LL_PTP_CKPT_TIME_LIMIT_SOFT&gt;&gt;&gt;<br>
# @ class = &lt;&lt;&lt;LL_PTP_CLASS&gt;&gt;&gt;<br>
# @ cluster_input_file = &lt;&lt;&lt;LL_PTP_CLUSTER_INPUT_FILE_1&gt;&gt;&gt;<br>
# @ cluster_input_file = &lt;&lt;&lt;LL_PTP_CLUSTER_INPUT_FILE_2&gt;&gt;&gt;<br>
# @ cluster_input_file = &lt;&lt;&lt;LL_PTP_CLUSTER_INPUT_FILE_3&gt;&gt;&gt;<br>
# @ cluster_list = &lt;&lt;&lt;LL_PTP_CLUSTER_LIST&gt;&gt;&gt;<br>
# @ cluster_output_file = &lt;&lt;&lt;LL_PTP_CLUSTER_OUTPUT_FILE_1&gt;&gt;&gt;<br>
# @ cluster_output_file = &lt;&lt;&lt;LL_PTP_CLUSTER_OUTPUT_FILE_2&gt;&gt;&gt;<br>
# @ cluster_output_file = &lt;&lt;&lt;LL_PTP_CLUSTER_OUTPUT_FILE_3&gt;&gt;&gt;<br>
# @ comment = &lt;&lt;&lt;LL_PTP_COMMENT&gt;&gt;&gt;<br>
# @ core_limit = &lt;&lt;&lt;LL_PTP_CORE_LIMIT_HARD&gt;&gt;&gt;,&lt;&lt;&lt;LL_PTP_CORE_LIMIT_SOFT&gt;&gt;&gt;<br>
#(NOT SUPPORTED)# @ coschedule<br>
# @ cpu_limit = &lt;&lt;&lt;LL_PTP_CPU_LIMIT_HARD&gt;&gt;&gt;,&lt;&lt;&lt;LL_PTP_CPU_LIMIT_SOFT&gt;&gt;&gt;<br>
# @ data_limit = &lt;&lt;&lt;LL_PTP_DATA_LIMIT_HARD&gt;&gt;&gt;,&lt;&lt;&lt;LL_PTP_DATA_LIMIT_SOFT&gt;&gt;&gt;<br>
#(NOT SUPPORTED)# @ dependency<br>
# @ env_copy = &lt;&lt;&lt;LL_PTP_ENV_COPY&gt;&gt;&gt;<br>
# @ environment = &lt;&lt;&lt;LL_PTP_ENVIRONMENT&gt;&gt;&gt;<br>
# @ error = &lt;&lt;&lt;LL_PTP_ERROR&gt;&gt;&gt;<br>
# @ executable = &lt;&lt;&lt;execPath&gt;&gt;&gt;/&lt;&lt;&lt;execName&gt;&gt;&gt;<br>
# @ executable = &lt;&lt;&lt;LL_PTP_EXECUTABLE&gt;&gt;&gt;<br>
# @ file_limit = &lt;&lt;&lt;LL_PTP_FILE_LIMIT_HARD&gt;&gt;&gt;,&lt;&lt;&lt;LL_PTP_FILE_LIMIT_SOFT&gt;&gt;&gt;<br>
# @ group = &lt;&lt;&lt;LL_PTP_GROUP&gt;&gt;&gt;<br>
#(NOT SUPPORTED)# @ hold<br>
# @ image_size = &lt;&lt;&lt;LL_PTP_IMAGE_SIZE&gt;&gt;&gt;<br>
# @ initialdir = &lt;&lt;&lt;workingDir&gt;&gt;&gt;<br>
# @ initialdir = &lt;&lt;&lt;LL_PTP_INITIALDIR&gt;&gt;&gt;<br>
# @ input = &lt;&lt;&lt;LL_PTP_INPUT&gt;&gt;&gt;<br>
# @ job_cpu_limit = &lt;&lt;&lt;LL_PTP_JOB_CPU_LIMIT_HARD&gt;&gt;&gt;, &lt;&lt;&lt;LL_PTP_JOB_CPU_LIMIT_SOFT&gt;&gt;&gt;<br>
# @ job_name = &lt;&lt;&lt;LL_PTP_JOB_NAME&gt;&gt;&gt;<br>
# @ job_type = &lt;&lt;&lt;LL_PTP_JOB_TYPE&gt;&gt;&gt;<br>
# @ large_page = &lt;&lt;&lt;LL_PTP_LARGE_PAGE&gt;&gt;&gt;<br>
#(NOT SUPPORTED)# @ max_processors<br>
# @ mcm_affinity_options = &lt;&lt;&lt;LL_PTP_MCM_AFFINITY_OPTIONS&gt;&gt;&gt;<br>
#(NOT SUPPORTED)# @ min_processors<br>
# @ network.MPI = &lt;&lt;&lt;LL_PTP_NETWORK_MPI&gt;&gt;&gt;<br>
# @ network.LAPI = &lt;&lt;&lt;LL_PTP_NETWORK_LAPI&gt;&gt;&gt;<br>
# @ network.MPI_LAPI = &lt;&lt;&lt;LL_PTP_NETWORK_MPI_LAPI&gt;&gt;&gt;<br>
# @ node = &lt;&lt;&lt;LL_PTP_NODE_MIN&gt;&gt;&gt;,&lt;&lt;&lt;LL_PTP_NODE_MAX&gt;&gt;&gt;<br>
# @ node_usage = &lt;&lt;&lt;LL_PTP_NODE_USAGE&gt;&gt;&gt;<br>
# @ notification = &lt;&lt;&lt;LL_PTP_NOTIFICATION&gt;&gt;&gt;<br>
# @ notify_user = &lt;&lt;&lt;LL_PTP_NOTIFY_USER&gt;&gt;&gt;<br>
# @ output = &lt;&lt;&lt;LL_PTP_OUTPUT&gt;&gt;&gt;<br>
# @ preferences = &lt;&lt;&lt;LL_PTP_PREFERENCES&gt;&gt;&gt;<br>
# @ requirements = &lt;&lt;&lt;LL_PTP_REQUIREMENTS&gt;&gt;&gt;<br>
# @ resources = &lt;&lt;&lt;LL_PTP_RESOURCES&gt;&gt;&gt;<br>
# @ restart = &lt;&lt;&lt;LL_PTP_RESTART&gt;&gt;&gt;<br>
# @ restart_from_ckpt = &lt;&lt;&lt;LL_PTP_RESTART_FROM_CKPT&gt;&gt;&gt;<br>
#(NOT SUPPORTED)# @ restart_on_same_nodes<br>
# @ rset = &lt;&lt;&lt;LL_PTP_RSET&gt;&gt;&gt;<br>
# @ shell = &lt;&lt;&lt;LL_PTP_SHELL&gt;&gt;&gt;<br>
# @ smt = &lt;&lt;&lt;LL_PTP_SMT&gt;&gt;&gt;<br>
# @ stack_limit = &lt;&lt;&lt;LL_PTP_STACK_LIMIT_HARD&gt;&gt;&gt;,&lt;&lt;&lt;LL_PTP_STACK_LIMIT_SOFT&gt;&gt;&gt;<br>
# @ start_date = &lt;&lt;&lt;LL_PTP_START_DATE&gt;&gt;&gt;<br>
#(NOT SUPPORTED)# @ step_name<br>
# @ task_geometry = &lt;&lt;&lt;LL_PTP_TASK_GEOMETRY&gt;&gt;&gt;<br>
# @ tasks_per_node = &lt;&lt;&lt;LL_PTP_TASKS_PER_NODE&gt;&gt;&gt;<br>
# @ total_tasks = &lt;&lt;&lt;LL_PTP_TOTAL_TASKS&gt;&gt;&gt;<br>
# @ user_priority = &lt;&lt;&lt;LL_PTP_USER_PRIORITY&gt;&gt;&gt;<br>
# @ wall_clock_limit = &lt;&lt;&lt;LL_PTP_WALLCLOCK_HARD&gt;&gt;&gt;,&lt;&lt;&lt;LL_PTP_WALLCLOCK_SOFT&gt;&gt;&gt;<br>
# @ queue<br>
</small>
</p>
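<p>To make the substitution step concrete, the following is a minimal sketch of how keywords of the form
&lt;&lt;&lt;NAME&gt;&gt;&gt; could be replaced in a template. It is not the proxy's implementation: the function name
<code>fill_template</code> is hypothetical, and dropping any line whose keyword has no value is an assumption made
for this illustration.</p>

```python
import re

# Matches one substitution keyword such as <<<LL_PTP_CLASS>>> or <<<execPath>>>.
KEYWORD = re.compile(r"<<<(\w+)>>>")


def fill_template(template, values):
    """Substitute <<<NAME>>> keywords from `values`, a dict of strings."""
    out = []
    for line in template.splitlines():
        # Replace each keyword that has a value; leave unknown keywords as-is.
        filled = KEYWORD.sub(lambda m: values.get(m.group(1), m.group(0)), line)
        if KEYWORD.search(filled):
            # Assumption: a line with an unresolved keyword is omitted entirely.
            continue
        out.append(filled)
    return "\n".join(out)
```

<p>Lines without keywords, such as <code>#!/bin/sh</code>, <code># @ queue</code>, and any fixed job command file
parameters added to a custom template, pass through this sketch unchanged.</p>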
<h2><A name=restrictions></A>Restrictions</h2>
<ul>
<li>All pathnames specified in the Resources tab of the launch configuration must be absolute pathnames.</li>
</ul>
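<p>A quick way to check this restriction before filling in the Resources tab is sketched below. The function name and
the example setting names are hypothetical, and POSIX path rules are assumed because LoadLeveler is described here for
AIX and Linux.</p>

```python
import posixpath


def relative_paths(settings):
    """Return the names of settings whose path is not an absolute POSIX path."""
    return sorted(
        name for name, path in settings.items() if not posixpath.isabs(path)
    )
```

<p>An empty result means every pathname in the mapping is absolute.</p>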
</body>
</html>