Configuration file keyword descriptions

Tivoli Workload Scheduler LoadLeveler V3.4.3 Using and Administering
SA22-7881-07

 

This topic provides an alphabetical list of the keywords you can use in a LoadLeveler configuration file. It also provides examples of statements that use these keywords.

ACCT

Turns the accounting function on or off.

Syntax:

ACCT = flag ...

The available flags are:

A_DETAIL

Enables extended accounting. Using this flag causes LoadLeveler to record detailed resource consumption by machine and by events for each job step. This flag also enables the -x flag of the llq command, permitting users to view resource consumption for active jobs.

A_RES

Turns reservation data recording on.

A_OFF

Turns accounting data recording off.

A_ON

Turns accounting data recording on. If specified without the A_DETAIL flag, the following is recorded:

·        The total amount of CPU time consumed by the entire job

·        The maximum memory consumption of all tasks (or nodes).

A_VALIDATE

Turns account validation on.

Default value: A_OFF

Example: This example turns accounting on, collects extended accounting data, and enables the -x flag of the llq command.

ACCT = A_ON A_DETAIL

ACCT_VALIDATION

Identifies the executable called to perform account validation.

Syntax:

ACCT_VALIDATION = program

Where program is a validation program.

Default value: $(BIN)/llacctval (the accounting validation program shipped with LoadLeveler).

ACTION_ON_MAX_REJECT

Specifies the state in which jobs are placed when their rejection count has reached the value of the MAX_JOB_REJECT keyword. HOLD specifies that jobs are placed in User Hold status; SYSHOLD specifies that jobs are placed in System Hold status; CANCEL specifies that jobs are canceled. When a job is rejected, LoadLeveler sends a mail message stating why the job was rejected.

Syntax:

ACTION_ON_MAX_REJECT = HOLD | SYSHOLD | CANCEL

Default value: HOLD

ACTION_ON_SWITCH_TABLE_ERROR

Points to an administrator-supplied program that will be run when DRAIN_ON_SWITCH_TABLE_ERROR is set to true and a switch table unload error occurs.

Syntax:

ACTION_ON_SWITCH_TABLE_ERROR = program

Default value: The default is to not run a program.

ADMIN_FILE

Points to the administration file containing user, class, group, machine, and adapter stanzas.

Syntax:

ADMIN_FILE = directory

Default value: $(tilde)/admin_file

AFS_GETNEWTOKEN

Specifies a filter that, for example, can be used to refresh an AFS® token.

Syntax:

AFS_GETNEWTOKEN = full_path_to_executable

Where full_path_to_executable is an administrator-supplied program that receives the AFS authentication information on standard input and writes the new information to standard output. The filter is run when the job is scheduled to run and can be used to refresh a token which expired when the job was queued.

Default value: The default is to not run a program.

AGGREGATE_ADAPTERS

Allows an external scheduler to specify per-window adapter usages.

Syntax:

AGGREGATE_ADAPTERS = YES | NO

When this keyword is set to YES, the resources from multiple switch adapters on the same switch network are treated as one aggregate pool available to each job. When this keyword is set to NO, the switch adapters are treated individually and a job cannot use resources from multiple adapters on the same network.

Set this keyword to NO when you are using an external scheduler; otherwise, set to YES (or accept the default).

Default value: YES

ALLOC_EXCLUSIVE_CPU_PER_JOB

Specifies the way CPU affinity is enforced on Linux® platforms. When this keyword is not specified or when an unrecognized value is assigned to it, LoadLeveler will not attempt to set CPU affinity for any application processes spawned by it.

Note: This keyword is ignored by LoadLeveler on platforms on which the LoadLeveler for Linux CPU affinity feature is not available.

The ALLOC_EXCLUSIVE_CPU_PER_JOB keyword can be specified in the global or local configuration files. It can also be specified in both configuration files, in which case the setting in the local configuration file will override that of the global configuration file. The keyword cannot be turned off in a local configuration file if it has been set to any value in the global configuration file.

Changes to ALLOC_EXCLUSIVE_CPU_PER_JOB will not take effect at reconfiguration. The administrator must stop and restart or recycle LoadLeveler when changing ALLOC_EXCLUSIVE_CPU_PER_JOB.

Syntax:

ALLOC_EXCLUSIVE_CPU_PER_JOB = LOGICAL | PHYSICAL

Default value: By default, when this keyword is not specified, CPU affinity is not set.

Example: When the value of this keyword is set to LOGICAL, only one LoadLeveler job step will run on each of the processors available on the machine:

ALLOC_EXCLUSIVE_CPU_PER_JOB = LOGICAL

Example: When the value of this keyword is set to PHYSICAL, all logical processors (or physical cores) configured in one physical CPU package will be allocated to one and only one LoadLeveler job step.

ALLOC_EXCLUSIVE_CPU_PER_JOB = PHYSICAL

For more information related to this keyword, see Linux CPU affinity support.

ARCH

Indicates the standard architecture of the system. The architecture you specify here must be specified in the same format in the requirements and preferences statements in job command files. The administrator defines the character string for each architecture.

Syntax:

ARCH = string

Default value: Use the command llstatus -l to view the default.

Example: To define a machine as an RS/6000®, the keyword would look like:

  ARCH = R6000

BG_ALLOW_LL_JOBS_ONLY

Specifies whether only jobs submitted through LoadLeveler will be accepted by the Blue Gene® job launcher program.

Syntax:

BG_ALLOW_LL_JOBS_ONLY = true | false

Default value: false

BG_CACHE_PARTITIONS

Specifies whether allocated partitions are to be reused for Blue Gene jobs whenever possible.

Syntax:

BG_CACHE_PARTITIONS = true | false

Default value: true

BG_ENABLED

Specifies whether Blue Gene support is enabled.

Syntax:

BG_ENABLED = true | false

If the value of this keyword is true, the Central Manager will load the Blue Gene control system libraries and query the state of the Blue Gene system so that jobs of type bluegene can be scheduled.

Default value: false

BG_MIN_PARTITION_SIZE

Specifies the smallest number of compute nodes in a partition.

Syntax:

BG_MIN_PARTITION_SIZE = 32 | 128 | 512 (for Blue Gene/L)
 
BG_MIN_PARTITION_SIZE = 16 | 32 | 64 | 128 | 256 | 512 (for Blue Gene/P)

The value for this keyword must not be smaller than the minimum partition size supported by the physical Blue Gene hardware. If the number of compute nodes requested in a job is less than the minimum partition size, LoadLeveler will increase the requested size to the minimum partition size.

If the max_psets_per_bp value is set in the DB_PROPERTY file, the value for the BG_MIN_PARTITION_SIZE must be set as described in Table 1:

Table 1. BG_MIN_PARTITION_SIZE values

max_psets_per_bp value    BG_MIN_PARTITION_SIZE    BG_MIN_PARTITION_SIZE
in DB_PROPERTY file       for Blue Gene/L™         for Blue Gene/P™

4                         >= 128                   >= 128
8                         >= 128                   >= 64
16                        >= 32                    >= 32
32                        >= 32                    >= 16

Default value: 32
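
The two rules described for BG_MIN_PARTITION_SIZE (raising a small request to the minimum, and the Table 1 constraint) can be illustrated with a short Python sketch. This is not LoadLeveler code; the function names and the "L"/"P" model labels are hypothetical and chosen only for illustration.

```python
# Smallest BG_MIN_PARTITION_SIZE allowed for each max_psets_per_bp value,
# copied from Table 1. The keys "L" and "P" are illustrative labels only.
MIN_SIZE_BY_PSETS = {
    "L": {4: 128, 8: 128, 16: 32, 32: 32},
    "P": {4: 128, 8: 64, 16: 32, 32: 16},
}

def effective_partition_size(requested_nodes, bg_min_partition_size=32):
    """Raise a request that is below the minimum up to the minimum."""
    return max(requested_nodes, bg_min_partition_size)

def min_size_is_valid(bg_min_partition_size, max_psets_per_bp, model):
    """Check a BG_MIN_PARTITION_SIZE setting against Table 1."""
    return bg_min_partition_size >= MIN_SIZE_BY_PSETS[model][max_psets_per_bp]
```

For example, with the default minimum of 32, a job that requests 20 compute nodes would be treated as a request for 32.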

BIN

Defines the directory where LoadLeveler binaries are kept.

Syntax:

BIN = $(RELEASEDIR)/bin

Default value: $(tilde)/bin

CENTRAL_MANAGER_HEARTBEAT_INTERVAL

Specifies the amount of time, in seconds, that defines how frequently the primary and alternate central managers communicate with each other.

Syntax:

CENTRAL_MANAGER_HEARTBEAT_INTERVAL = number

Default value: The default is 300 seconds or 5 minutes.

CENTRAL_MANAGER_TIMEOUT

Specifies the number of heartbeat intervals that an alternate central manager will wait before declaring that the primary central manager is not operating.

Syntax:

CENTRAL_MANAGER_TIMEOUT = number

Default value: The default is 6.
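
Because CENTRAL_MANAGER_TIMEOUT is expressed in heartbeat intervals, the wait in seconds is the product of the two keywords. The following Python sketch shows only that arithmetic; the function name is hypothetical.

```python
def failover_wait_seconds(heartbeat_interval=300, timeout_intervals=6):
    """Seconds an alternate central manager waits before declaring
    the primary central manager not operating."""
    return heartbeat_interval * timeout_intervals

# With both defaults: 300 seconds * 6 intervals = 1800 seconds (30 minutes).
print(failover_wait_seconds())
```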

CKPT_CLEANUP_INTERVAL

Specifies the interval, in seconds, at which the Schedd daemon will run the program specified by the CKPT_CLEANUP_PROGRAM keyword.

Syntax:

CKPT_CLEANUP_INTERVAL = number

number must be a positive integer.

Default value: -1

CKPT_CLEANUP_PROGRAM

Identifies an administrator-provided program which is to be run at the interval specified by the CKPT_CLEANUP_INTERVAL keyword. The intent of this program is to delete old checkpoint files created by jobs running under LoadLeveler during the checkpoint process.

Syntax:

CKPT_CLEANUP_PROGRAM = program

Where program is the fully qualified name of the program to be run. The program must be accessible and executable by LoadLeveler.

A sample program to remove checkpoint files is provided in the /usr/lpp/LoadL/full/samples/llckpt/rmckptfiles.c file.

Default value: No default value is set.

CKPT_EXECUTE_DIR

Specifies the directory where the job step's executable will be saved for checkpointable jobs. You can specify this keyword in either the configuration file or the job command file; different file permissions are required depending on where this keyword is set. For additional information, see Planning considerations for checkpointing jobs.

Syntax:

CKPT_EXECUTE_DIR = directory

This directory cannot be the same as the current location of the executable file, or LoadLeveler will not stage the executable. In this case, the user must have execute permission for the current executable file.

Default value: By default, the executable of a checkpointable job step is not staged.

CLASS

Determines whether a machine will accept jobs of a certain job class. For parallel jobs, you must define a class instance for each task you want to run on a node using one of two formats:

·        The format, CLASS = class_name (count), defines the CLASS names using a statement that names the classes and sets the number of tasks for each class in parentheses.

With this format, the following rules apply:

o       Each class can have only one entry

o       If a class has more than one entry or there is a syntax error, the entire CLASS statement will be ignored

o       If the CLASS statement has a blank value or is not specified, it will be defaulted to No_Class (1)

o       The number of instances for a class specified inside the parentheses ( ) must be an unsigned integer. If the number specified is 0, it is correct syntactically, but the class will not be defined in LoadLeveler

o       If the number of instances for all classes in the CLASS statement is 0, the default No_Class(1) will be used

·        The format, CLASS = { "class1" "class2" "class2" "class2"}, defines the CLASS names using a statement that names each class and sets the number of tasks for each class based on the number of times that the class name is used inside the {} operands.

Note: With both formats, the class names list is blank delimited.
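
The two formats describe the same information. As an illustration only, the following Python sketch expands a value written in the count format into the equivalent brace format, applying the rules listed above. The function name is hypothetical, and raising an error for a repeated class is a choice made for this sketch (LoadLeveler itself is described as ignoring the whole statement).

```python
import re

def expand_class_counts(value):
    """Expand a value such as 'class1(1) class2(3)' into the brace format."""
    entries = re.findall(r"([A-Za-z_]\w*)\s*\(\s*(\d+)\s*\)", value)
    names = [name for name, _ in entries]
    if len(names) != len(set(names)):
        # Per the rules above, a repeated class invalidates the statement.
        raise ValueError("each class can have only one entry")
    expanded = []
    for name, count in entries:
        expanded.extend([name] * int(count))  # a count of 0 defines nothing
    if not expanded:
        expanded = ["No_Class"]  # documented default, No_Class(1)
    return "{ " + " ".join('"%s"' % n for n in expanded) + " }"

print(expand_class_counts("class1(1) class2(3)"))
```

Here class1(1) class2(3) expands to one "class1" entry and three "class2" entries, which matches the brace-format example shown earlier.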

For a LoadLeveler job to run on a machine, the machine must have a vacancy for the class of that job. If the machine is configured for only one No_Class job and a LoadLeveler job is already running there, then no further LoadLeveler jobs are started on that machine until the current job completes.

You can have a maximum of 1024 characters in the class statement. You cannot use allclasses as a class name, since this is a reserved LoadLeveler keyword.

You can assign multiple classes to the same machine by specifying the classes in the LoadLeveler configuration file (called LoadL_config) or in the local configuration file (called LoadL_config.local). The classes, themselves, should be defined in the administration file. See Setting up a single machine to have multiple job classes and Defining classes for more information on classes.

Syntax:

CLASS = { "class_name" ... } | {"No_Class"} | class_name (count) ...

Default value: {"No_Class"}

CLIENT_TIMEOUT

Specifies the maximum time, in seconds, that a daemon waits for a response over TCP/IP from a process. If the waiting time exceeds the specified amount, the daemon tries again to communicate with the process. In general, you should use the default setting unless you are experiencing delays due to an excessively loaded network. If so, you should try increasing this value.

Syntax:

CLIENT_TIMEOUT = number

Default value: The default is 30 seconds.

CLUSTER_METRIC

Indicates the installation exit to be run by the Schedd to determine where a remote job is distributed. If a remote job is submitted with a list of clusters or the reserved word any and the installation exit is not specified, the remote job is not submitted.

Syntax:

CLUSTER_METRIC = full_pathname_to_executable

The installation exit is run with the following parameters passed as input. All parameters are character strings.

·        The job ID of the job to be distributed

·        The number of clusters in the list of clusters

·        A blank-delimited list of clusters to be considered

If the user specifies the reserved word any as the cluster_list during job submission, the job is sent to the first outbound Schedd defined for the first configured remote cluster. If the user specifies a list of clusters as the cluster_list during job submission, the job is sent to the first outbound Schedd defined for the first specified remote cluster. In both cases, the CLUSTER_METRIC is executed on this machine to determine where the job will be distributed. If this machine is not the outbound_hosts Schedd for the assigned cluster, the job will be forwarded to the correct outbound_hosts Schedd.

Note: The list of clusters may contain a single entry of the reserved word any, which indicates that the CLUSTER_METRIC installation exit must determine its own list of clusters to select from. This can be all of the clusters available using the data access API or a predetermined list set by the administrator. If any is specified in place of a cluster list, the metric will receive a count of 1 followed by the keyword any.

The installation exit must write the name of the remote cluster to which the job is submitted to standard output and exit with a value of 0. An exit value of -1 indicates an error in determining the cluster for distribution, and the job is not submitted. Returned cluster names that are not valid also prevent the job from being submitted. STDERR from the exit is written to the Schedd log.

LoadLeveler provides a set of sample exits for use in distributing jobs by the following metrics:

·        The number of jobs in the idle queue

·        The number of jobs in the specified class

·        The number of free nodes in the cluster

The installation exit samples are available in the ${RELEASEDIR}/samples/llcluster directory.
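
Independently of the shipped samples, the calling convention described for CLUSTER_METRIC can be sketched in Python as follows. This is a minimal illustration, not one of the LoadLeveler samples: the fallback cluster names and the "take the first candidate" policy are hypothetical, and whether the cluster list arrives as one blank-delimited string or as separate arguments is an assumption that the sketch handles both ways.

```python
import sys

# Hypothetical candidate list used when the reserved word "any" is received.
# A real exit might build this list through the data access API instead.
FALLBACK_CLUSTERS = ["cluster_a", "cluster_b"]

def choose_cluster(args):
    """Pick a cluster from args = [job_id, cluster_count, cluster, ...]."""
    if len(args) < 3:
        raise ValueError("expected job ID, cluster count, and cluster list")
    count = int(args[1])
    # Accept the list as one blank-delimited string or as separate arguments.
    clusters = " ".join(args[2:]).split()
    if len(clusters) != count:
        raise ValueError("cluster count does not match the cluster list")
    if clusters == ["any"]:
        clusters = FALLBACK_CLUSTERS
    return clusters[0]  # placeholder policy: take the first candidate

def main(args):
    """Write the chosen cluster to standard output; return the exit value."""
    try:
        print(choose_cluster(args))
    except ValueError as err:
        print(err, file=sys.stderr)  # STDERR is written to the Schedd log
        return -1
    return 0
```

A deployed exit would end with sys.exit(main(sys.argv[1:])). On POSIX systems, a Python exit value of -1 is reported to the caller as status 255.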

CLUSTER_REMOTE_JOB_FILTER

Indicates the installation exit to be run by the inbound Schedd for each remote job request to filter the user's job command file statements during submission or move job. If the keyword is not specified, no job filtering is done.

Syntax:

CLUSTER_REMOTE_JOB_FILTER = full_pathname_to_executable

The installation exit is run with the submitting user's ID. All parameters are character strings.

This installation exit is executed on the inbound_hosts of the local cluster when receiving a job submission or move job request.

The executable specified is called with the submitting user's unfiltered job command file statements as the standard input. The standard output is submitted to LoadLeveler. If the exit returns with a nonzero exit code, the remote job submission or job move will fail. A submit filter can only make changes to LoadLeveler job command file statements.

The data access API can be used by the remote job filter to query the Schedd for the job object received from the sending cluster.

If the local submission filter on the submitting cluster has added or deleted steps from the original user's job command file, the remote job filter must add or delete the same number of steps. The job command file statements returned by the remote job filter must contain the same number of steps as the job object received from the sending cluster.

Changes to the following job command file keyword statements are ignored:

·        executable

·        environment

·        image_size

·        cluster_input_file

·        cluster_output_file

·        cluster_list

The following job command file keyword will have different behavior:

·        initialdir – If not set by the remote job filter or the submitting user's unfiltered job command file, the default value will remain the current working directory at the time the job was submitted. Access to the initialdir will be verified on the cluster selected to run the job. If access to initialdir fails, the submission or move job will fail.

To maintain compatibility between the SUBMIT_FILTER and CLUSTER_REMOTE_JOB_FILTER programs, the following environment variables are set when either exit is invoked:

·        LOADL_ACTIVE – The LoadLeveler version.

·        LOADL_STEP_COMMAND – The location of the job command file passed as input to the program. This job command file only contains LoadLeveler keywords.

·        LOADL_STEP_ID – The job identifier, generated by the submitting LoadLeveler cluster.

Note: The environment variable name is LOADL_STEP_ID although the value it contains is a "job" identifier. This name is used to be compatible with the local job filter interface.

·        LOADL_STEP_OWNER – The owner (UNIX® user name) of the job.
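
The filter contract described for CLUSTER_REMOTE_JOB_FILTER (job command file statements on standard input, filtered statements on standard output, nonzero exit code on failure) can be sketched in Python. The policy shown, forcing every class statement to a class named remote, is purely hypothetical; the sketch leaves all other lines, including queue statements, untouched so the number of steps does not change.

```python
import re
import sys

# Matches a job command file statement of the form "# @ class = value".
CLASS_LINE = re.compile(r"^(\s*#\s*@\s*class\s*=\s*).*$", re.IGNORECASE)

def filter_job_command_file(text, remote_class="remote"):
    """Return the job command file text with class statements rewritten."""
    out = []
    for line in text.splitlines():
        match = CLASS_LINE.match(line)
        out.append(match.group(1) + remote_class if match else line)
    return "\n".join(out) + "\n"

def main():
    """Read statements from standard input and write the filtered result."""
    sys.stdout.write(filter_job_command_file(sys.stdin.read()))
    return 0
```

A deployed filter would call sys.exit(main()). It could also read LOADL_STEP_OWNER from the environment to apply a per-user policy.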

CLUSTER_USER_MAPPER

Indicates the installation exit to be run by the inbound Schedd for each remote job request to determine the user mapping of the cluster. This keyword implies that user mapping is performed. If the keyword is not specified, no user mapping is done.

Syntax:

CLUSTER_USER_MAPPER = full_pathname_to_executable

The installation exit is run with the following parameters passed as input. All parameters are character strings.

·        The user name to be mapped

·        The cluster name where the user originated from

This installation exit is executed on the inbound_hosts of the local cluster when receiving a job submission, move job request or remote command.

The installation exit must write the new user name as standard output and exit with a value of 0. An exit value of -1 indicates an error and the job is not submitted. STDERR from the exit is written to the Schedd log. An exit value of 1 indicates that the user name returned for this job was not mapped.
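
The exit values described above can be illustrated with a minimal Python sketch of a user mapper. The mapping table and all names in it are hypothetical. Writing the original user name unchanged when no mapping exists is an assumption of this sketch.

```python
import sys

# Hypothetical table: (remote user name, originating cluster) -> local user.
USER_MAP = {("alice", "cluster_a"): "alice_local"}

def map_user(user, cluster):
    """Return (name to write to standard output, exit value)."""
    mapped = USER_MAP.get((user, cluster))
    if mapped is None:
        return user, 1  # exit value 1: the returned name was not mapped
    return mapped, 0

def main(args):
    """args = [user_name, cluster_name]; returns the exit value."""
    if len(args) != 2:
        print("expected user name and cluster name", file=sys.stderr)
        return -1  # error: the job is not submitted
    name, status = map_user(args[0], args[1])
    print(name)
    return status
```

A deployed exit would end with sys.exit(main(sys.argv[1:])).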

CM_CHECK_USERID

Specifies whether the central manager will check the existence of user IDs that sent requests through a command or API on the central manager machine.

Syntax:

CM_CHECK_USERID = true | false

Default value: true

CM_COLLECTOR_PORT

Specifies the port number used when connecting to a daemon.

Syntax:

CM_COLLECTOR_PORT = port number

Default value: The default is 9612.

COMM

Specifies a local directory where LoadLeveler keeps special files used for UNIX domain sockets for communicating among LoadLeveler daemons running on the same machine. This keyword allows the administrator to choose a file system other than /tmp for these files. If you change the COMM option, you must stop and then restart LoadLeveler using the llctl command.

Syntax:

COMM = local directory

Default value: The default location for the files is /tmp.

CONTINUE

Determines whether suspended jobs should continue execution.

Syntax:

CONTINUE: expression that evaluates to T or F (true or false)

When T, suspended LoadLeveler jobs resume execution on the machine.

Default value: No default value is set.

For information about time-related variables that you may use for this keyword, see Variables to use for setting times.

CUSTOM_METRIC

Specifies a machine's relative priority to run jobs.

Syntax:

CUSTOM_METRIC = number

This is an arbitrary number which you can use in the MACHPRIO expression. Negative values are not allowed.

Default value: If you specify neither CUSTOM_METRIC nor CUSTOM_METRIC_COMMAND, CUSTOM_METRIC = 1 is assumed. For more information, see Setting negotiator characteristics and policies.

For more information related to using this keyword, see Defining a LoadLeveler cluster.

CUSTOM_METRIC_COMMAND

Specifies an executable and any required arguments. The exit code of this command is assigned to CUSTOM_METRIC. If this command does not exit normally, CUSTOM_METRIC is assigned a value of 1. This command is forked every (POLLING_FREQUENCY * POLLS_PER_UPDATE) period.

Syntax:

CUSTOM_METRIC_COMMAND = command

Default value: No default is set; LoadLeveler does not run any command to determine CUSTOM_METRIC.
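
Because the exit code of the command becomes the value of CUSTOM_METRIC, the command must reduce whatever it measures to a non-negative exit code. The following Python sketch shows one hypothetical way to do that, using spare CPU capacity as the measurement. The scaling factor and the choice of measurement are illustrative assumptions, and how the resulting number affects scheduling depends entirely on the MACHPRIO expression.

```python
import os

def metric_from_load(load_1min, cpu_count):
    """Map spare CPU capacity to an exit code in the range 0 to 255."""
    spare = cpu_count - load_1min
    return max(0, min(255, int(spare * 10)))

def main():
    """Measure the local machine and return the value to exit with."""
    load_1min = os.getloadavg()[0]  # available on UNIX-like systems only
    return metric_from_load(load_1min, os.cpu_count() or 1)
```

A deployed command would call sys.exit(main()) after importing sys. The clamp to 255 reflects the range of exit codes on POSIX systems.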

DCE_AUTHENTICATION_PAIR

Specifies a pair of installation-supplied programs that are used to authenticate DCE security credentials.

Restriction: DCE security is not supported by LoadLeveler for Linux.

Syntax:

DCE_AUTHENTICATION_PAIR = program1, program2

Where program1 and program2 are LoadLeveler- or installation-supplied programs that are used to authenticate DCE security credentials. program1 obtains a handle (an opaque credentials object), at the time the job is submitted, which is used to authenticate to DCE. program2 uses the handle obtained by program1 to authenticate to DCE before starting the job on the executing machines.

Default value: See Handling DCE security credentials for information about defaults.

DEFAULT_PREEMPT_METHOD

Specifies the default preemption method for LoadLeveler to use when a preempt method is not specified in a PREEMPT_CLASS statement or in the llpreempt command. LoadLeveler also uses this default preemption method to preempt job steps that are running on reserved machines when a reservation period begins.

Restrictions:

·        This keyword is valid only for the BACKFILL scheduler.

·        The suspend method of preemption (the default) might not be supported on your level of Linux. If you want to preempt jobs that are running where process tracking is not supported, you must use this keyword to specify a method other than suspend.

Syntax:

DEFAULT_PREEMPT_METHOD = rm | sh | su | vc | uh

Valid values are:

rm

LoadLeveler preempts the jobs and removes them from the job queue. To rerun the job, the user must resubmit the job to LoadLeveler.

sh

LoadLeveler ends the jobs and puts them into System Hold state. They remain in that state on the job queue until an administrator releases them. After being released, the jobs go into Idle state and will be rescheduled to run as soon as resources for the job are available.

su

LoadLeveler suspends the jobs and puts them in Preempted state. They remain in that state on the job queue until the preempting job has terminated, and resources are available to resume the preempted job on the same set of nodes. To use this value, process tracking must be enabled.

vc

LoadLeveler ends the jobs and puts them in Vacate state. They remain in that state on the job queue and will be rescheduled to run as soon as resources for the job are available.

uh

LoadLeveler ends the jobs and puts them into User Hold state. They remain in that state on the job queue until an administrator releases them. After being released, the jobs go into Idle state and will be rescheduled to run as soon as resources for the job are available.

Default value: su (suspend method)

For more information related to using this keyword, see Steps for configuring a scheduler to preempt jobs.

DRAIN_ON_SWITCH_TABLE_ERROR

Specifies whether the startd should be drained when the switch table fails to unload. This will flag the administrator that intervention may be required to unload the switch table. When DRAIN_ON_SWITCH_TABLE_ERROR is set to true, the startd will be drained when the switch table fails to unload.

Syntax:

DRAIN_ON_SWITCH_TABLE_ERROR = true | false

Default value: false

ENFORCE_RESOURCE_MEMORY

Specifies whether the AIX® Workload Manager is configured to limit, as precisely as possible, the real memory usage of a WLM class. For this keyword to be valid, ConsumableMemory must be set through the ENFORCE_RESOURCE_USAGE keyword.

Syntax:

ENFORCE_RESOURCE_MEMORY = true | false

Default value: false

ENFORCE_RESOURCE_POLICY

Specifies what type of resource entitlements will be assigned to the AIX Workload Manager classes. If the value specified is shares, it means a share value is assigned to the class based on the job step's requested resources (one unit of resource equals one share). This is the default policy. If the value specified is soft, it means a percentage value is assigned to the class based on the job step's requested resources and the total machine resources. This percentage can be exceeded if there is no contention for the resource. If the value specified is hard, it means a percentage value is assigned to the class based on the job step's requested resources and the total machine resources. This percentage cannot be exceeded regardless of the contention for the resource. If desired, this keyword can be used in the LoadL_config.local file to set up a different policy for each machine. The ENFORCE_RESOURCE_USAGE keyword must be set for this keyword to be valid.

Syntax:

ENFORCE_RESOURCE_POLICY = hard | soft | shares

Default value: shares

ENFORCE_RESOURCE_SUBMISSION

Indicates whether jobs submitted should be checked for the resources and node_resources keywords. If the value specified is true, LoadLeveler will check all jobs at submission time for the resources and node_resources keywords. The job command file resources and node_resources keywords combined need to have at least the resources specified in the ENFORCE_RESOURCE_USAGE keyword in order for the job to be submitted successfully. When RSET_MCM_AFFINITY is enabled, the task_affinity or parallel_threads keyword can be used instead of the resources and node_resources keywords when the resource being enforced is ConsumableCpus.

If the value specified is false, no checking will be done and jobs submitted without the resources or node_resources keywords will not have resources enforced. In this instance, those jobs might interfere with other jobs whose resources are enforced.

Syntax:

ENFORCE_RESOURCE_SUBMISSION = true | false

Default value: false

ENFORCE_RESOURCE_USAGE

Specifies that the AIX Workload Manager should be used to enforce CPU or real memory resources. This keyword accepts the predefined resources ConsumableCpus and ConsumableMemory. Either memory or CPUs or both can be enforced but the resources must also be specified on the SCHEDULE_BY_RESOURCES keyword. If deactivate is specified, LoadLeveler will deactivate AIX Workload Manager on all the nodes in the LoadLeveler cluster.

Restriction: WLM enforcement is ignored by LoadLeveler for Linux.

Syntax:

ENFORCE_RESOURCE_USAGE = ConsumableCpus ConsumableMemory | deactivate

EXECUTE

Specifies the local directory to store the executables of jobs submitted by other machines.

Syntax:

EXECUTE = local directory/execute

Default value: $(tilde)/execute

FAIR_SHARE_INTERVAL

Specifies, in units of hours, the time interval it takes for resource usage in fair share scheduling to decay to 5% of its initial value. Historic fair share data collected before the most recent time interval of this length will have little impact on fair share scheduling.

Syntax:

FAIR_SHARE_INTERVAL = hours

Default value: The default value is 168 hours (one week). If a negative value or 0 is specified, the default value is used.
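
To get a feel for the effect of FAIR_SHARE_INTERVAL, the following Python sketch estimates how much weight past usage still carries after a given number of hours. The text states only that usage decays to 5% over the interval; the exponential shape of the curve is an assumption made for this sketch, and the function name is hypothetical.

```python
def remaining_fraction(hours_elapsed, fair_share_interval=168):
    """Fraction of recorded usage still counted after hours_elapsed.

    Assumes exponential decay that reaches 5% after one interval.
    """
    if fair_share_interval <= 0:
        fair_share_interval = 168  # a negative value or 0 selects the default
    return 0.05 ** (hours_elapsed / fair_share_interval)

# Under this assumption, usage that is half an interval old (84 hours
# with the default) still counts at about 22% of its original value.
print(round(remaining_fraction(84), 4))
```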

FAIR_SHARE_TOTAL_SHARES

Specifies the total number of shares that the cluster CPU or Blue Gene resources are divided into. If this value is less than or equal to 0, fair share scheduling is turned off.

Syntax:

FAIR_SHARE_TOTAL_SHARES = shares

Default value: The default value is 0.

FEATURE

Specifies an optional characteristic to use to match jobs with machines. You can specify unique characteristics for any machine using this keyword. When evaluating job submissions, LoadLeveler compares any required features specified in the job command file to those specified using this keyword. You can have a maximum of 1024 characters in the feature statement.

Syntax:

Feature = {"string" ...}

Default value: No default value is set.

Example: If a machine has licenses for installed products ABC and XYZ, you can enter the following in the local configuration file:

Feature = {"abc" "xyz"}

When submitting a job that requires both of these products, you should enter the following in your job command file:

requirements = (Feature == "abc") && (Feature == "xyz")

Note: You must define a feature on all machines that will be able to run dynamic simultaneous multithreading (SMT). SMT is only supported on POWER6™ and POWER5™ processor-based systems.

Example: When submitting a job that requires the SMT function, first specify smt = yes in the job command file (or select a class that has smt = yes defined). Next, specify node_usage = not_shared and, last, enter the following in the job command file:

requirements = (Feature == "smt") 

FLOATING_RESOURCES

Specifies which consumable resources are available collectively on all of the machines in the LoadLeveler cluster. The count for each resource must be an integer greater than or equal to zero, and each resource can only be specified once in the list. Any resource specified for this keyword that is not already listed in the SCHEDULE_BY_RESOURCES keyword will not affect job scheduling. If any resource is specified incorrectly with the FLOATING_RESOURCES keyword, then all floating resources will be ignored. ConsumableCpus, ConsumableMemory, and ConsumableVirtualMemory may not be specified as floating resources.

Syntax:

FLOATING_RESOURCES = name(count) name(count) ... name(count)

Default value: No default value is set.

FS_INTERVAL

Defines the number of minutes used as the interval for checking free file system space or inodes. If your file system receives many log messages or copies large executables to the LoadLeveler spool, the file system will fill up more quickly and you should perform file size checking more frequently by setting the interval to a smaller value. LoadLeveler will not check the file system if the value of FS_INTERVAL is:

·        Set to zero

·        Set to a negative integer

Syntax:

FS_INTERVAL = minutes

Default value: If FS_INTERVAL is not specified but any of the other file-system keywords (FS_NOTIFY, FS_SUSPEND, FS_TERMINATE, INODE_NOTIFY, INODE_SUSPEND, INODE_TERMINATE) are specified, the FS_INTERVAL value will default to 5 and the file system will be checked. If no file-system or inode keywords are set, LoadLeveler does not monitor file systems at all.

For more information related to using this keyword, see Setting up file system monitoring.

FS_NOTIFY

Defines the lower and upper amounts, in bytes, of free file-system space at which LoadLeveler is to notify the administrator:

·        If the amount of free space becomes less than the lower threshold value, LoadLeveler sends a mail message to the administrator indicating that logging problems may occur.

·        When the amount of free space becomes greater than the upper threshold value, LoadLeveler sends a mail message to the administrator indicating that the problem has been resolved.

Syntax:

FS_NOTIFY = lower threshold, upper threshold

Specify space in bytes with the unit B. A metric prefix such as K, M or G may precede the B. The valid range for both the lower and upper thresholds is -1B and all positive integers. If the value is set to -1, the transition across the threshold is not checked.

Default value: In bytes: 1KB, -1B

For more information related to using this keyword, see Setting up file system monitoring.
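
The lower and upper thresholds of FS_NOTIFY act on transitions in free space rather than on its absolute level. The following Python sketch illustrates that behavior for two consecutive checks. It is not LoadLeveler code; the function name and the "warn" and "resolved" labels are hypothetical, and all values are plain byte counts.

```python
def notify_action(free_before, free_now, lower, upper):
    """Return the mail message, if any, triggered between two checks.

    A threshold of -1 means that transition is not checked.
    """
    if lower != -1 and free_before >= lower and free_now < lower:
        return "warn"      # free space fell below the lower threshold
    if upper != -1 and free_before <= upper and free_now > upper:
        return "resolved"  # free space rose above the upper threshold
    return None
```

With an upper threshold of -1, for example, free space falling below the lower threshold still produces the warning message, but no message is sent when space is recovered.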

FS_SUSPEND

Defines the lower and upper amounts, in bytes, of free file system space at which LoadLeveler drains and resumes the Schedd and startd daemons running on a node.

·        If the amount of free space becomes less than the lower threshold value, then LoadLeveler drains the Schedd and the startd daemons if they are running on a node. When this happens, logging is turned off and mail notification is sent to the administrator.

·        When the amount of free space becomes greater than the upper threshold value, LoadLeveler signals the Schedd and the startd daemons to resume. When this happens, logging is turned on and mail notification is sent to the administrator.

Syntax:

FS_SUSPEND = lower threshold, upper threshold

Specify space in bytes with the unit B. A metric prefix such as K, M or G may precede the B. The valid range for both the lower and upper thresholds is -1B and all positive integers. If the value is set to -1, the transition across the threshold is not checked.

Default value: In bytes: -1B, -1B

For more information related to using this keyword, see Setting up file system monitoring.

FS_TERMINATE

Defines the lower and upper amounts, in bytes, of free file system space at which LoadLeveler is terminated. This keyword sends the SIGTERM signal to the Master daemon which then terminates all LoadLeveler daemons running on the node.

·        If the amount of free space becomes less than the lower threshold value, all LoadLeveler daemons are terminated.

·        An upper threshold value is required for this keyword. However, since LoadLeveler has been terminated at the lower threshold, no action occurs.

Syntax:

FS_TERMINATE = lower threshold, upper threshold

Specify space in bytes with the unit B. A metric prefix such as K, M or G may precede the B. The valid range for the lower threshold is -1B and all positive integers. If the value is set to -1, the transition across the threshold is not checked.

Default value: In bytes: -1B, -1B

For more information related to using this keyword, see Setting up file system monitoring.

GLOBAL_HISTORY

Identifies the directory that will contain the global history files produced by the llacctmrg command when no directory is specified as a command argument.

Syntax:

GLOBAL_HISTORY = directory

Default value: The default value is $(SPOOL) (the local spool directory).

For more information related to using this keyword, see Collecting the accounting information and storing it into files.

GSMONITOR

Location of the gsmonitor executable (LoadL_GSmonitor).

Restriction: This keyword is ignored by LoadLeveler for Linux.

Syntax:

GSMONITOR = directory

Default value: $(BIN)/LoadL_GSmonitor

GSMONITOR_COREDUMP_DIR

Local directory for storing LoadL_GSmonitor core dump files.

Restriction: This keyword is ignored by LoadLeveler for Linux.

Syntax:

GSMONITOR_COREDUMP_DIR = directory

Default value: The /tmp directory.

For more information related to using this keyword, see Specifying file and directory locations.

GSMONITOR_DOMAIN

Specifies the peer domain on which the GSMONITOR daemon will execute.

Restriction: This keyword is ignored by LoadLeveler for Linux.

Syntax:

GSMONITOR_DOMAIN = PEER 

Default value: No default value is set.

For more information related to using this keyword, see The gsmonitor daemon.

GSMONITOR_RUNS_HERE

Specifies whether the gsmonitor daemon will run on the host.

Restriction: This keyword is ignored by LoadLeveler for Linux.

Syntax:

GSMONITOR_RUNS_HERE = TRUE | FALSE

Default value: FALSE

For more information related to using this keyword, see The gsmonitor daemon.

HISTORY

Defines the path name where a file containing the history of local LoadLeveler jobs is kept.

Syntax:

HISTORY = directory

Default value: $(SPOOL)/history

For more information related to using this keyword, see Collecting the accounting information and storing it into files.

HISTORY_PERMISSION

Specifies the owner, group, and world permissions of the history file associated with a LoadL_schedd daemon.

Syntax:

HISTORY_PERMISSION = permissions | rw-rw----

permissions must be a string with a length of nine characters, consisting of the characters r, w, x, or -.

Default value: The default settings are 660 (rw-rw----). LoadL_schedd will use the default setting if the specified permissions are less than rw-------.

Example: A specification such as HISTORY_PERMISSION = rw-rw-r-- will result in permission settings of 664.
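
To check which octal mode a nine-character permission string corresponds to, a small Python sketch such as the following can be used. The function name is hypothetical, and the positional check (r, w, and x only in their usual positions) is an assumption based on conventional permission strings rather than something this topic states.

```python
# Hypothetical helper: convert a nine-character permission string to a mode.
def permission_string_to_mode(permissions):
    """Return the integer mode for a string such as 'rw-rw-r--'."""
    expected = "rwxrwxrwx"
    if len(permissions) != 9:
        raise ValueError("permissions must be exactly nine characters")
    mode = 0
    for actual, allowed in zip(permissions, expected):
        # Assumption: each position holds either '-' or its usual letter.
        if actual not in ("-", allowed):
            raise ValueError("unexpected character: " + actual)
        mode = (mode << 1) | (1 if actual != "-" else 0)
    return mode

print(oct(permission_string_to_mode("rw-rw-r--")))  # 0o664
print(oct(permission_string_to_mode("rw-rw----")))  # 0o660
```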

INODE_NOTIFY

Defines the lower and upper amounts, in inodes, of free file-system inodes at which LoadLeveler is to notify the administrator:

·        If the number of free inodes becomes less than the lower threshold value, LoadLeveler sends a mail message to the administrator indicating that logging problems may occur.

·        When the number of free inodes becomes greater than the upper threshold value, LoadLeveler sends a mail message to the administrator indicating that the problem has been resolved.

Syntax:

INODE_NOTIFY = lower threshold, upper threshold

The valid range for both the lower and upper thresholds is -1 and all positive integers. If the value is set to -1, the transition across the threshold is not checked.

Default value: In inodes: 1000, -1

For more information related to using this keyword, see Setting up file system monitoring.

INODE_SUSPEND

Defines the lower and upper amounts, in inodes, of free file system inodes at which LoadLeveler drains and resumes the Schedd and startd daemons running on a node.

·        If the number of free inodes becomes less than the lower threshold value, then LoadLeveler drains the Schedd and the startd daemons if they are running on a node. When this happens, logging is turned off and mail notification is sent to the administrator.

·        When the number of free inodes becomes greater than the upper threshold value, LoadLeveler signals the Schedd and the startd daemons to resume. When this happens, logging is turned on and mail notification is sent to the administrator.

Syntax:

INODE_SUSPEND = lower threshold, upper threshold

The valid range for both the lower and upper thresholds is -1 and all positive integers. If the value is set to -1, the transition across the threshold is not checked.

Default value: In inodes: -1, -1

For more information related to using this keyword, see Setting up file system monitoring.

INODE_TERMINATE

Defines the lower and upper amounts, in inodes, of free file system inodes at which LoadLeveler is terminated. This keyword sends the SIGTERM signal to the Master daemon which then terminates all LoadLeveler daemons running on the node.

·        If the number of free inodes becomes less than the lower threshold value, all LoadLeveler daemons are terminated.

·        An upper threshold value is required for this keyword. However, since LoadLeveler has been terminated at the lower threshold, no action occurs.

Syntax:

INODE_TERMINATE = lower threshold, upper threshold

The valid range for the lower threshold is -1 and all positive integers. If the value is set to -1, the transition across the threshold is not checked.

Default value: In inodes: -1, -1

For more information related to using this keyword, see Setting up file system monitoring.

JOB_ACCT_Q_POLICY

Specifies the amount of time, in seconds, that determines how often the startd daemon updates the Schedd daemon with accounting data of running jobs. This controls the accuracy of the llq -x command.

Syntax:

JOB_ACCT_Q_POLICY = number

Default value: 300 seconds

For more information related to using this keyword, see Gathering job accounting data.

JOB_EPILOG

Path name of the epilog program.

Syntax:

JOB_EPILOG = program name

Default value: No default value is set.

For more information related to using this keyword, see Writing prolog and epilog programs.

JOB_LIMIT_POLICY

Specifies the interval, in seconds, at which LoadLeveler checks whether job_cpu_limit has been exceeded. The smaller of JOB_LIMIT_POLICY and JOB_ACCT_Q_POLICY is used to control how often the startd daemon collects resource consumption data on running jobs, and how often the job_cpu_limit is checked.

Syntax:

JOB_LIMIT_POLICY = number

Default value: The default for JOB_LIMIT_POLICY is POLLING_FREQUENCY multiplied by POLLS_PER_UPDATE.
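
The interaction between these keywords can be worked through with a short Python sketch. It uses the default values documented in this topic (POLLING_FREQUENCY 5, POLLS_PER_UPDATE 24, JOB_ACCT_Q_POLICY 300); the function name and the dictionary-based configuration are hypothetical.

```python
# Illustrative calculation only; names are not LoadLeveler APIs.
def collection_interval(config):
    """Return the interval, in seconds, at which resource data is collected."""
    polling_frequency = int(config.get("POLLING_FREQUENCY", 5))
    polls_per_update = int(config.get("POLLS_PER_UPDATE", 24))
    job_acct_q_policy = int(config.get("JOB_ACCT_Q_POLICY", 300))
    # JOB_LIMIT_POLICY defaults to POLLING_FREQUENCY * POLLS_PER_UPDATE.
    job_limit_policy = int(
        config.get("JOB_LIMIT_POLICY", polling_frequency * polls_per_update)
    )
    # The smaller of the two values controls collection and limit checking.
    return min(job_limit_policy, job_acct_q_policy)

print(collection_interval({}))                           # 120
print(collection_interval({"JOB_LIMIT_POLICY": "600"}))  # 300
```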

JOB_PROLOG

Path name of the prolog program.

Syntax:

JOB_PROLOG = program name

Default value: No default value is set.

For more information related to using this keyword, see Writing prolog and epilog programs.

JOB_USER_EPILOG

Path name of the user epilog program.

Syntax:

JOB_USER_EPILOG = program name

Default value: No default value is set.

For more information related to using this keyword, see Writing prolog and epilog programs.

JOB_USER_PROLOG

Path name of the user prolog program.

Syntax:

JOB_USER_PROLOG = program name

Default value: No default value is set.

For more information related to using this keyword, see Writing prolog and epilog programs.

KBDD

Location of kbdd executable (LoadL_kbdd).

Syntax:

KBDD = directory

Default value: $(BIN)/LoadL_kbdd

KBDD_COREDUMP_DIR

Local directory for storing LoadL_kbdd daemon core dump files.

Syntax:

KBDD_COREDUMP_DIR = directory

Default value: The /tmp directory.

For more information related to using this keyword, see Specifying file and directory locations.

KILL

Determines whether or not vacated jobs should be sent the SIGKILL signal and replaced in the queue. It is used to remove a job that is taking too long to vacate.

Syntax:

KILL: expression that evaluates to T or F (true or false)

When T, vacated LoadLeveler jobs are removed from the machine with no attempt to take checkpoints.

For information about time-related variables that you may use for this keyword, see Variables to use for setting times.

LIB

Defines the directory where LoadLeveler libraries are kept.

Syntax:

LIB = directory

Default value: $(RELEASEDIR)/lib

LL_RSH_COMMAND

Specifies an administrator-provided executable to be used by llctl start when starting LoadLeveler on remote machines in the administration file. The value of the LL_RSH_COMMAND keyword can be any executable that can be used as a substitute for /usr/bin/rsh. The llctl start command passes arguments to the executable specified by LL_RSH_COMMAND in the following format:

LL_RSH_COMMAND hostname -n llctl start options

Syntax:

LL_RSH_COMMAND = full_path_to_executable

Default value: /usr/bin/rsh. This keyword must specify the full path name to the executable provided. If no value is specified, LoadLeveler will use /usr/bin/rsh as the default when issuing a start. If an error occurs while locating the executable specified, an error message is displayed.

Example: This example shows that using the secure shell (/usr/bin/ssh) is the preferred method for the llctl start command to communicate with remote nodes. Specify the following in the configuration file:

LL_RSH_COMMAND=/usr/bin/ssh
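
When writing a substitute executable, it can help to see the argument list it will receive. The Python sketch below assembles that list from the format shown above; the function name, the dictionary-based configuration, and the host name node01 are hypothetical.

```python
# Illustrative only: builds the argument vector in the documented format
#   LL_RSH_COMMAND hostname -n llctl start options
def remote_start_argv(config, hostname, options=()):
    # /usr/bin/rsh is the documented default when the keyword is not set.
    rsh_command = config.get("LL_RSH_COMMAND", "/usr/bin/rsh")
    return [rsh_command, hostname, "-n", "llctl", "start", *options]

print(remote_start_argv({"LL_RSH_COMMAND": "/usr/bin/ssh"}, "node01"))
```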

LOADL_ADMIN

Specifies a list of LoadLeveler administrators.

Syntax:

LOADL_ADMIN = list of user names

Where list of user names is a blank-delimited list of those individuals who will have administrative authority. These users are able to invoke the administrator-only commands such as llctl, llfavorjob, and llfavoruser. These administrators can also invoke the administrator-only GUI functions. For more information, see Using LoadLeveler's GUI to perform administrator tasks.

Default value: No default value is set, which means no one has administrator authority until this keyword is defined with one or more user names.

Example: To grant administrative authority to users bob and mary, enter the following in the configuration file:

LOADL_ADMIN = bob mary

For more information related to using this keyword, see Defining LoadLeveler administrators.

LOCAL_CONFIG

Specifies the path name of the optional local configuration file containing information specific to a node in the LoadLeveler network.

Syntax:

LOCAL_CONFIG = directory

Default value: No default value is set.

Examples:

·        If you are using a distributed file system like NFS, some examples are:

LOCAL_CONFIG = $(tilde)/$(host).LoadL_config.local
LOCAL_CONFIG = $(tilde)/LoadL_config.$(host).$(domain)
LOCAL_CONFIG = $(tilde)/LoadL_config.local.$(hostname)

See LoadLeveler variables for information about the tilde, host, and domain variables.

·        If you are using a local file system, an example is:

LOCAL_CONFIG = /var/LoadL/LoadL_config.local

LOG

Defines the local directory to store log files. It is not necessary to keep all the log files created by the various LoadLeveler daemons and programs in one directory, but you will probably find it convenient to do so.

Syntax:

LOG = local directory/log

Default value: $(tilde)/log

LOG_MESSAGE_THRESHOLD

Specifies the maximum amount of memory, in bytes, for the message queue. Messages in the queue are waiting to be written to the log file. When the message logging thread cannot write messages to the log file as fast as they arrive, the memory consumed by the message queue can exceed the threshold. In this case, LoadLeveler will curtail logging by turning off all debug flags except D_ALWAYS, thereby reducing the amount of logging that takes place. If the threshold is exceeded by the curtailed message queue, message logging is stopped. Special log messages indicating that some messages are missing are written to the log file. Mail is also sent to the administrator indicating that messages are missing. A value of -1 for this keyword turns off the buffer threshold, meaning that the threshold is unlimited.

Syntax:

LOG_MESSAGE_THRESHOLD = bytes

Default value: 20*1024*1024 (bytes)

MACHINE_AUTHENTICATE

Specifies whether machine validation is performed. When set to true, LoadLeveler only accepts connections from machines specified in the administration file. When set to false, LoadLeveler accepts connections from any machine.

When set to true, every communication between LoadLeveler processes will verify that the sending process is running on a machine which is identified via a machine stanza in the administration file. The validation is done by capturing the address of the sending machine when the accept function call is issued to accept a connection. The gethostbyaddr function is called to translate the address to a name, and the name is matched with the list derived from the administration file.

Syntax:

MACHINE_AUTHENTICATE = true | false 

Default value: false

For more information related to using this keyword, see Defining a LoadLeveler cluster.
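
The validation step described above (translate the peer address to a name, then match the name against the machine list) can be sketched in Python as follows. This is a simplified illustration, not LoadLeveler's implementation: the function name is hypothetical, the resolver is injectable so the logic can be exercised without DNS, and matching against aliases as well as the primary name is an assumption.

```python
import socket

# Illustrative sketch of address-to-name validation; not LoadLeveler code.
def is_known_machine(address, known_names, resolver=socket.gethostbyaddr):
    """Return True if the address resolves to a name in known_names."""
    try:
        hostname, aliases, _addresses = resolver(address)
    except OSError:
        # Lookup failed (socket.herror and socket.gaierror are OSError subclasses).
        return False
    # Assumption: aliases are accepted as well as the primary host name.
    return any(name in known_names for name in (hostname, *aliases))

# Usage with a stand-in resolver instead of a real DNS lookup.
def fake_resolver(address):
    return ("node01.example.com", ["node01"], [address])

print(is_known_machine("192.0.2.10", {"node01"}, resolver=fake_resolver))
```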

MACHINE_UPDATE_INTERVAL

Specifies the time, in seconds, during which machines must report to the central manager.

Syntax:

MACHINE_UPDATE_INTERVAL = number

Where number specifies the time period, in seconds, during which machines must report to the central manager. Machines that do not report in this number of seconds are considered down. number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is 300 seconds.

For more information related to using this keyword, see Setting negotiator characteristics and policies.

MACHPRIO

Machine priority expression.

Syntax:

MACHPRIO = expression

You can use the following LoadLeveler variables in the MACHPRIO expression:

·        LoadAvg

·        Connectivity

·        Cpus

·        Speed

·        Memory

·        VirtualMemory

·        Disk

·        CustomMetric

·        MasterMachPriority

·        ConsumableCpus

·        ConsumableMemory

·        ConsumableVirtualMemory

·        PagesFreed

·        PagesScanned

·        FreeRealMemory

For detailed descriptions of these variables, see LoadLeveler variables.

Default value: (0 - LoadAvg)

Examples:

·        Example 1

This example orders machines by the Berkeley one-minute load average.

MACHPRIO : 0 - (LoadAvg)

Therefore, if LoadAvg equals .7, this example would read:

MACHPRIO : 0 - (.7)

The MACHPRIO would evaluate to -.7.

·        Example 2

This example orders machines by the Berkeley one-minute load average normalized for machine speed:

MACHPRIO : 0 - (1000 * (LoadAvg / (Cpus * Speed)))

Therefore, if LoadAvg equals .7, Cpus equals 1, and Speed equals 2, this example would read:

MACHPRIO : 0 - (1000 * (.7 / (1 * 2)))

This example further evaluates to:

MACHPRIO : 0 - (350)

The MACHPRIO would evaluate to -350.

Notice that if the speed of the machine were increased to 3, the equation would read:

MACHPRIO : 0 - (1000 * (.7 / (1 * 3)))

The MACHPRIO would evaluate to approximately -233. Therefore, as the speed of the machine increases, the MACHPRIO also increases.

·        Example 3

This example orders machines accounting for real memory and available swap space (remembering that Memory is in Mbytes and VirtualMemory is in Kbytes):

MACHPRIO : 0 - (10000 * (LoadAvg / (Cpus * Speed))) +
(10 * Memory) + (VirtualMemory / 1000)

·        Example 4

This example sets a relative machine priority based on the value of the CUSTOM_METRIC keyword.

MACHPRIO : CustomMetric

To do this, you must specify a value for the CUSTOM_METRIC keyword or the CUSTOM_METRIC_COMMAND keyword in either the LoadL_config.local file of a machine or in the global LoadL_config file. To assign the same relative priority to all machines, specify the CUSTOM_METRIC keyword in the global configuration file. For example:

CUSTOM_METRIC = 5

You can override this value for an individual machine by specifying a different value in that machine's LoadL_config.local file.

·        Example 5

This example gives master nodes the highest priority:

MACHPRIO : (MasterMachPriority * 10000)

·        Example 6

This example gives nodes with the highest percentage of switch adapters with connectivity the highest priority:

MACHPRIO : Connectivity

For more information related to using this keyword, see Setting negotiator characteristics and policies.
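
The arithmetic in Examples 1 and 2 above can be reproduced with a few lines of Python. The function names are hypothetical and the code only re-evaluates the expressions shown; it does not model how the negotiator applies the result.

```python
# Re-evaluates the MACHPRIO expressions from Examples 1 and 2.
def machprio_example_1(load_avg):
    return 0 - load_avg

def machprio_example_2(load_avg, cpus, speed):
    return 0 - (1000 * (load_avg / (cpus * speed)))

print(machprio_example_1(0.7))               # -0.7
print(round(machprio_example_2(0.7, 1, 2)))  # -350
print(round(machprio_example_2(0.7, 1, 3)))  # -233
```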

MAIL

Name of a local mail program used to override default mail notification.

Syntax:

MAIL = program name

Default value: No default value is set.

For more information related to using this keyword, see Using your own mail program.

MASTER

Location of the master executable (LoadL_master).

Syntax:

MASTER = directory

Default value: $(BIN)/LoadL_master

For more information related to using this keyword, see How LoadLeveler daemons process jobs.

MASTER_COREDUMP_DIR

Local directory for storing LoadL_master core dump files.

Syntax:

MASTER_COREDUMP_DIR = directory

Default value: The /tmp directory.

For more information related to using this keyword, see Specifying file and directory locations.

MASTER_DGRAM_PORT

The port number used when connecting to the daemon.

Syntax:

MASTER_DGRAM_PORT = port number

Default value: The default is 9617.

For more information related to using this keyword, see Defining network characteristics.

MASTER_STREAM_PORT

Specifies the port number to be used when connecting to the daemon.

Syntax:

MASTER_STREAM_PORT = port number

Default value: The default is 9616.

For more information related to using this keyword, see Defining network characteristics.

MAX_CKPT_INTERVAL

The maximum number of seconds between checkpoints for running jobs.

Syntax:

MAX_CKPT_INTERVAL = number

Default value: 7200 (2 hours)

For more information related to using this keyword, see LoadLeveler support for checkpointing jobs.

MAX_JOB_REJECT

Determines the number of times a job is rejected before it is canceled or put in User Hold or System Hold status.

Syntax:

MAX_JOB_REJECT = number

number must be a numerical value and cannot be an arithmetic expression. MAX_JOB_REJECT may be set to unlimited rejects by specifying a value of -1.

Default value: The default value is 0, which indicates a rejected job will immediately be canceled or placed on hold.

For related information, see the NEGOTIATOR_REJECT_DEFER keyword.

MAX_RESERVATIONS

Specifies the maximum number of reservations that this LoadLeveler cluster can have. Only reservations in waiting and in use are counted toward this limit; LoadLeveler does not count reservations that have already ended or are in the process of being canceled.

Note: Having too many reservations in a LoadLeveler cluster can have performance impacts. Administrators should select a suitable value for this keyword.

Syntax:

MAX_RESERVATIONS = number

The value for this keyword can be 0 or a positive integer.

Default value: The default is 10.

MAX_STARTERS

Specifies the maximum number of tasks that can run simultaneously on a machine. In this case, a task can be a serial job step or a parallel task. MAX_STARTERS defines the number of initiators on the machine (the number of tasks that can be initiated from a startd).

Syntax:

MAX_STARTERS = number

Default value: If this keyword is not specified, the default is the number of elements in the Class statement.

For more information related to using this keyword, see Specifying how many jobs a machine can run.

MAX_TOP_DOGS

Specifies the maximum total number of top dogs that the central manager daemon will allocate. When scheduling jobs, after MAX_TOP_DOGS total top dogs have been allocated, no more will be considered.

Syntax:

MAX_TOP_DOGS = k | 1 

where: k is a non-negative integer specifying the global maximum top dogs limit.

Default value: The default value is 1.

For more information related to using this keyword, see Using the BACKFILL scheduler.

MIN_CKPT_INTERVAL

The minimum number of seconds between checkpoints for running jobs.

Syntax:

MIN_CKPT_INTERVAL = number

Default value: 900 (15 minutes)

For more information related to using this keyword, see LoadLeveler support for checkpointing jobs.

NEGOTIATOR

Location of the negotiator executable (LoadL_negotiator).

Syntax:

NEGOTIATOR = directory

Default value: $(BIN)/LoadL_negotiator

For more information related to using this keyword, see How LoadLeveler daemons process jobs.

NEGOTIATOR_COREDUMP_DIR

Local directory for storing LoadL_negotiator core dump files.

Syntax:

NEGOTIATOR_COREDUMP_DIR = directory

Default value: The /tmp directory.

For more information related to using this keyword, see Specifying file and directory locations.

NEGOTIATOR_CYCLE_DELAY

Specifies the minimum time, in seconds, the negotiator delays between periods when it attempts to schedule jobs. This time is used by the negotiator daemon to respond to queries, reorder job queues, collect information about changes in the states of jobs, and so on. Delaying the scheduling of jobs might improve the overall performance of the negotiator by preventing it from spending excessive time attempting to schedule jobs.

Syntax:

NEGOTIATOR_CYCLE_DELAY = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is 0 seconds.

NEGOTIATOR_CYCLE_TIME_LIMIT

Specifies the maximum amount of time, in seconds, that LoadLeveler will allow the negotiator to spend in one cycle trying to schedule jobs. The negotiator cycle will end, after the specified number of seconds, even if there are additional jobs waiting for dispatch. Jobs waiting for dispatch will be considered at the next negotiator cycle. The NEGOTIATOR_CYCLE_TIME_LIMIT keyword applies only to the BACKFILL scheduler.

Syntax:

NEGOTIATOR_CYCLE_TIME_LIMIT = number

Where number must be a positive integer or zero and cannot be an arithmetic expression.

Default value: If the keyword value is not specified or a value of zero is used, the negotiator cycle will be unlimited.

NEGOTIATOR_INTERVAL

The time interval, in seconds, at which the negotiator daemon updates the status of jobs in the LoadLeveler cluster and negotiates with machines that are available to run jobs.

Syntax:

NEGOTIATOR_INTERVAL = number

Where number specifies the interval, in seconds, at which the negotiator daemon performs a “negotiation loop” during which it attempts to assign available machines to waiting jobs. A negotiation loop also occurs whenever job states or machine states change. number must be a numerical value and cannot be an arithmetic expression.

When this keyword is set to zero, the central manager's automatic scheduling activity is disabled, and LoadLeveler will not attempt to schedule any jobs unless instructed to do so through the llrunscheduler command or ll_run_scheduler subroutine.

Default value: The default is 30 seconds.

For more information related to using this keyword, see Controlling the central manager scheduling cycle.

NEGOTIATOR_LOADAVG_INCREMENT

Specifies the value the negotiator adds to the startd machine's load average whenever a job in the Pending state is queued on that machine. This value is used to compensate for the increased load caused by starting another job.

Syntax:

NEGOTIATOR_LOADAVG_INCREMENT = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default value is .5

NEGOTIATOR_PARALLEL_DEFER

Specifies the amount of time, in seconds, that defines how long a job stays out of the queue after it fails to get the correct number of processors. This keyword applies only to the default LoadLeveler scheduler. This keyword must be greater than the NEGOTIATOR_INTERVAL value; if it is not, the default is used.

Syntax:

NEGOTIATOR_PARALLEL_DEFER = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is NEGOTIATOR_INTERVAL multiplied by 5.

NEGOTIATOR_PARALLEL_HOLD

Specifies the amount of time, in seconds, that defines how long a job is given to accumulate processors. This keyword applies only to the default LoadLeveler scheduler. This keyword must be greater than the NEGOTIATOR_INTERVAL value; if it is not, the default is used.

Syntax:

NEGOTIATOR_PARALLEL_HOLD = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is NEGOTIATOR_INTERVAL multiplied by 5.
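
NEGOTIATOR_PARALLEL_DEFER and NEGOTIATOR_PARALLEL_HOLD share the same fallback rule: a value that is not greater than NEGOTIATOR_INTERVAL is replaced by the default. The Python sketch below illustrates that rule; the function name and the dictionary-based configuration are hypothetical.

```python
# Illustrative model of the fallback rule; not LoadLeveler code.
def parallel_timer(config, keyword):
    """Return the effective value, in seconds, for a NEGOTIATOR_PARALLEL_* keyword."""
    interval = int(config.get("NEGOTIATOR_INTERVAL", 30))  # documented default
    default = interval * 5
    if keyword not in config:
        return default
    value = int(config[keyword])
    # The value must be greater than NEGOTIATOR_INTERVAL; otherwise the default is used.
    return value if value > interval else default

print(parallel_timer({}, "NEGOTIATOR_PARALLEL_HOLD"))                                    # 150
print(parallel_timer({"NEGOTIATOR_PARALLEL_DEFER": "20"}, "NEGOTIATOR_PARALLEL_DEFER"))  # 150
print(parallel_timer({"NEGOTIATOR_PARALLEL_DEFER": "200"}, "NEGOTIATOR_PARALLEL_DEFER")) # 200
```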

NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL

Specifies the amount of time, in seconds, between calculation of the SYSPRIO values for waiting jobs. Recalculating the priority can be CPU-intensive; specifying low values for the NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL keyword may lead to a heavy CPU load on the negotiator if a large number of jobs are running or waiting for resources. A value of 0 means the SYSPRIO values are not recalculated.

You can use this keyword to base the order in which jobs are run on the current number of running, queued, or total jobs for a user or a group.

Syntax:

NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is 120 seconds.

NEGOTIATOR_REJECT_DEFER

Specifies the amount of time in seconds the negotiator waits before it considers scheduling a job to a machine that recently rejected the job.

Syntax:

NEGOTIATOR_REJECT_DEFER = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is 120 seconds.

For related information, see the MAX_JOB_REJECT keyword.

NEGOTIATOR_REMOVE_COMPLETED

Specifies the amount of time, in seconds, that you want the negotiator to keep information regarding completed and removed jobs so that you can query this information using the llq command.

Syntax:

NEGOTIATOR_REMOVE_COMPLETED = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is 0 seconds.

NEGOTIATOR_RESCAN_QUEUE

Specifies the amount of time, in seconds, that the negotiator waits before it rescans the job queue for machines that have bypassed jobs which could not run due to conditions that may change over time. This keyword must be greater than the NEGOTIATOR_INTERVAL value; if it is not, the default is used.

Syntax:

NEGOTIATOR_RESCAN_QUEUE = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is 900 seconds.

NEGOTIATOR_STREAM_PORT

Specifies the port number used when connecting to the daemon.

Syntax:

NEGOTIATOR_STREAM_PORT = port number

Default value: The default is 9614.

For more information related to using this keyword, see Defining network characteristics.

OBITUARY_LOG_LENGTH

Specifies the number of lines from the end of the log file that are appended to the mail message. The master daemon mails this log to the LoadLeveler administrators when one of the daemons dies.

Syntax:

OBITUARY_LOG_LENGTH = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is 25.

POLLING_FREQUENCY

Specifies the interval, in seconds, with which the startd daemon evaluates the load on the local machine and decides whether to suspend, resume, or abort jobs. This time is also the minimum interval at which the kbdd daemon reports keyboard or mouse activity to the startd daemon.

Syntax:

POLLING_FREQUENCY = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is 5.

POLLS_PER_UPDATE

Specifies how often, in POLLING_FREQUENCY intervals, the startd daemon updates the central manager. Due to the communication overhead, it is impractical to do this with the frequency defined by the POLLING_FREQUENCY keyword. Therefore, the startd daemon only updates the central manager every nth (where n is the number specified for POLLS_PER_UPDATE) local update. Change POLLS_PER_UPDATE when changing the POLLING_FREQUENCY.

Syntax:

POLLS_PER_UPDATE = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is 24.

PRESTARTED_STARTERS

Specifies how many prestarted starter processes LoadLeveler will maintain on an execution node to manage jobs when they arrive. The startd daemon starts the number of starter processes specified by this keyword. You may specify this keyword in either the global or local configuration file.

Syntax:

PRESTARTED_STARTERS = number

number must be less than or equal to the value specified through the MAX_STARTERS keyword. If the value of PRESTARTED_STARTERS specified is greater than MAX_STARTERS, LoadLeveler records a warning message in the startd log and assigns PRESTARTED_STARTERS the same value as MAX_STARTERS.

If the value of PRESTARTED_STARTERS is zero, no starter processes will be started before jobs arrive on the execution node.

Default value: The default is 1.
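
The capping behavior described above amounts to the following, shown as a minimal Python sketch. The function name is hypothetical, and the Python logging call merely stands in for the warning that LoadLeveler records in the startd log.

```python
import logging

# Illustrative model of the PRESTARTED_STARTERS cap; not LoadLeveler code.
def effective_prestarted_starters(prestarted, max_starters):
    if prestarted > max_starters:
        # Stand-in for the warning message recorded in the startd log.
        logging.warning(
            "PRESTARTED_STARTERS (%d) exceeds MAX_STARTERS (%d); using %d",
            prestarted, max_starters, max_starters,
        )
        return max_starters
    return prestarted

print(effective_prestarted_starters(8, 4))  # 4
print(effective_prestarted_starters(0, 4))  # 0
```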

PREEMPT_CLASS

Defines the preemption rule for a job class.

Syntax: The following forms illustrate correct syntax.

PREEMPT_CLASS[incoming_class] = ALL[:preempt_method] { outgoing_class1 [outgoing_class2 ...] }

Using this form, ALL indicates that job steps of incoming_class have priority and will not share nodes with job steps of outgoing_class1, outgoing_class2, or other outgoing classes. If a job step of the incoming_class is to be started on a set of nodes, all job steps of outgoing_class1, outgoing_class2, or other outgoing classes running on those nodes will be preempted.

Note: The ALL preemption rule does not apply to Blue Gene jobs.

PREEMPT_CLASS[incoming_class] = ENOUGH[:preempt_method] { outgoing_class1 [outgoing_class2 ...] }

Using this form, ENOUGH indicates that job steps of incoming_class will share nodes with job steps of outgoing_class1, outgoing_class2, or other outgoing classes if there are sufficient resources. If a job step of the incoming_class is to be started on a set of nodes, one or more job steps of outgoing_class1, outgoing_class2, or other outgoing classes running on those nodes may be preempted to get needed resources.

Combinations of these forms are also allowed.

Note:

1.     The optional specification preempt_method indicates which method LoadLeveler is to use to preempt the jobs; this specification is valid only for the BACKFILL scheduler. Valid values for this specification in keyword syntax are the highlighted abbreviations in parentheses:

o       Remove (rm)

o       System hold (sh)

o       Suspend (su)

o       Vacate (vc)

o       User hold (uh)

For more information about preemption methods, see Steps for configuring a scheduler to preempt jobs.

2.     Using the "ALL" value in the PREEMPT_CLASS keyword places implied restrictions on when a job can start. See Planning to preempt jobs for more information.

3.     The incoming class is designated inside [ ] brackets.

4.     Outgoing classes are designated inside { } curly braces.

5.     The job classes on the right-hand (outgoing) side of the statement must be different from the incoming class, or the outgoing side may be allclasses. If the outgoing side is defined as allclasses, then all job classes are preemptable, with the exception of the incoming class specified within brackets.

6.     A class name or allclasses should not be in both the ALL list and the ENOUGH list. If it is, the entire statement will be ignored. An example of a statement that will be ignored is:

PREEMPT_CLASS[Class_A]=ALL{allclasses} ENOUGH {allclasses}

7.     If you use allclasses as an outgoing (preemptable) class, then no other class names should be listed on the right-hand side; otherwise, the entire statement will be ignored. An example of a statement that will be ignored is:

PREEMPT_CLASS[Class_A]=ALL{Class_B} ENOUGH {allclasses}

8.     More than one ALL statement and more than one ENOUGH statement may appear at the right hand side. Multiple statements have a cumulative effect.

9.     Each ALL or ENOUGH statement can have multiple class names inside the curly braces. However, a blank space delimiter is required between each class name.

10. Both the ALL and ENOUGH statements can include an optional specification indicating the method LoadLeveler will use to preempt the jobs. Valid values for this specification are listed in the description of the DEFAULT_PREEMPT_METHOD keyword. If a value is specified on the PREEMPT_CLASS ALL or ENOUGH statement, that value overrides the value set on the DEFAULT_PREEMPT_METHOD keyword, if any.

11. ALL and ENOUGH may be specified in mixed case.

12. Spaces are allowed around the brackets and curly braces.

13. PREEMPT_CLASS [allclasses] will be ignored.

Default value: No default value is set.

Examples:

PREEMPT_CLASS[Class_B]=ALL{Class_E Class_D} ENOUGH {Class_C}

This indicates that all Class_E jobs and all Class_D jobs and enough Class_C jobs will be preempted to enable an incoming Class_B job to run.

PREEMPT_CLASS[Class_D]=ENOUGH:VC {Class_E}

This indicates that zero, one, or more Class_E jobs will be preempted using the vacate method to enable an incoming Class_D job to run.

PREEMPTION_SUPPORT

For the BACKFILL or API schedulers only, specifies the level of preemption support for a cluster.

Syntax:

PREEMPTION_SUPPORT = full | no_adapter | none

·        When set to full, preemption is fully supported.

·        When set to no_adapter, preemption is supported but the adapter resources are not released by preemption.

·        When set to none, preemption is not supported, and preemption requests will be rejected.

Note:

1.     If the value of this keyword is set to any value other than none for the default scheduler, LoadLeveler will not start.

2.     For the BACKFILL or API scheduler, when this keyword is set to full or no_adapter and preemption by the suspend method is required, the configuration keyword PROCESS_TRACKING must be set to true.

Default value: The default value for all schedulers is none; if you want to enable preemption under the BACKFILL or API scheduler, you must set this keyword to full or no_adapter.
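
Example: The following sketch shows one way the keywords described in the notes above might be combined to allow preemption by the suspend method under the BACKFILL scheduler. The combination is illustrative only; it is assembled from the rules stated in this topic rather than taken from a shipped configuration.

```
# Preemption is supported only under the BACKFILL or API scheduler
SCHEDULER_TYPE = BACKFILL
PREEMPTION_SUPPORT = full
# Required when jobs are to be preempted by the suspend method
PROCESS_TRACKING = TRUE
```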

PROCESS_TRACKING

Specifies whether or not LoadLeveler will cancel any processes left behind (throughout the entire cluster) when a job terminates.

Syntax:

PROCESS_TRACKING = TRUE | FALSE

When set to TRUE, LoadLeveler ensures that when a job is terminated, no processes created by the job continue running.

Note: It is necessary to set this keyword to true to do preemption by the suspend method with the BACKFILL or API scheduler.

Default value: FALSE

PROCESS_TRACKING_EXTENSION

Specifies the directory containing the kernel module LoadL_pt_ke (AIX) or proctrk.ko (Linux).

Syntax:

PROCESS_TRACKING_EXTENSION = directory

Default value: The directory $HOME/bin

For more information related to using this keyword, see Tracking job processes.
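
Example: A minimal sketch that turns process tracking on and names the directory that holds the kernel module. The use of $(BIN) as the directory is an assumption made for illustration; substitute the directory where the module is actually installed.

```
PROCESS_TRACKING = TRUE
# Hypothetical location of LoadL_pt_ke (AIX) or proctrk.ko (Linux)
PROCESS_TRACKING_EXTENSION = $(BIN)
```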

PUBLISH_OBITUARIES

Specifies whether or not the master daemon sends mail to the administrator when any daemon it manages ends abnormally. When set to true, this keyword specifies that the master daemon sends mail to the administrators identified by the LOADL_ADMIN keyword.

Syntax:

PUBLISH_OBITUARIES = true | false

Default value: true

REJECT_ON_RESTRICTED_LOGIN

Specifies whether the user's account status will be checked on every node where the job will be run by calling the AIX loginrestrictions function with the S_DIST_CLNT flag.

Restriction: Login restriction checking is ignored by LoadLeveler for Linux.

Login restriction checking includes:

·        Does the account still exist?

·        Is the account locked?

·        Has the account expired?

·        Do failed login attempts exceed the limit for this account?

·        Is login disabled via /etc/nologin?

If the AIX loginrestrictions function indicates a failure, the user's job will be rejected and then processed according to the LoadLeveler configuration parameters MAX_JOB_REJECT and ACTION_ON_MAX_REJECT.

Syntax:

REJECT_ON_RESTRICTED_LOGIN = true | false

Default value: false
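
Example: The following sketch shows how this keyword might be used together with the two rejection keywords named above. The rejection limit of 3 and the choice of SYSHOLD are illustrative assumptions, not recommended values. With these settings, a job that is rejected because of a login restriction would be placed in System Hold status once its rejection count reaches 3.

```
# Check login restrictions on every node where the job will run (AIX only)
REJECT_ON_RESTRICTED_LOGIN = true
# Illustrative values for handling the resulting rejections
MAX_JOB_REJECT = 3
ACTION_ON_MAX_REJECT = SYSHOLD
```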

RELEASEDIR

Defines the directory where all the LoadLeveler software resides.

Syntax:

RELEASEDIR = release directory

Default value: $(RELEASEDIR)

RESERVATION_CAN_BE_EXCEEDED

Specifies whether LoadLeveler will schedule job steps that are bound to a reservation when their end times (based on hard wall-clock limits) exceed the reservation end time.

Syntax:

RESERVATION_CAN_BE_EXCEEDED = true | false

When this keyword is set to false, LoadLeveler schedules only those job steps that will complete before the reservation ends. When set to true, LoadLeveler schedules job steps to run under a reservation even if their end times are expected to exceed the reservation end time. When the reservation ends, however, the reserved nodes no longer belong to the reservation, and so these nodes might not be available for the jobs to continue running. In this case, LoadLeveler might preempt the running jobs.

Note that this keyword setting does not change the actual end time of the reservation. It only affects how LoadLeveler manages job steps whose end times exceed the end time of the reservation.

Default value: true

RESERVATION_HISTORY

Defines the name of a file that is to contain the local history of reservations.

Syntax:

RESERVATION_HISTORY = file name

LoadLeveler appends a single line to the reservation history file for each reservation. For an example, see Collecting accounting data for reservations.

Default value: $(SPOOL)/reservation_history

RESERVATION_MIN_ADVANCE_TIME

Specifies the minimum time, in minutes, between the time at which a reservation is created and the time at which the reservation is to start.

Syntax:

RESERVATION_MIN_ADVANCE_TIME = number of minutes

By default, the earliest time at which a reservation may start is the current time plus the value set for the RESERVATION_SETUP_TIME keyword.

Default value: 0 (zero)

RESERVATION_PRIORITY

Specifies whether LoadLeveler administrators may reserve nodes on which running jobs are expected to end after the reservation start time. This keyword value applies only for LoadLeveler administrators; other reservation owners do not have this capability.

Syntax:

RESERVATION_PRIORITY = NONE | HIGH

When you set this keyword to HIGH, before activating the reservation, LoadLeveler preempts the job steps running on the reserved nodes (Blue Gene job steps are handled the same way). The only exceptions are non-preemptable jobs; LoadLeveler will not preempt those jobs because of any reservations.

Default value: NONE

RESERVATION_SETUP_TIME

Specifies how much time, in seconds, LoadLeveler may use to prepare for a reservation before it is to start. The tasks that LoadLeveler performs during this time include checking and reporting node conditions, and preempting job steps still running on the reserved nodes.

For a given reservation, LoadLeveler uses the RESERVATION_SETUP_TIME keyword value that is set at the time that the reservation is created, not whatever value might be set when the reservation starts. If the start time of the reservation is modified, however, LoadLeveler uses the RESERVATION_SETUP_TIME keyword value that is set at the time of the modification.

Syntax:

RESERVATION_SETUP_TIME = number of seconds

Default value: 60
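
Example: The reservation keywords described in this topic are often easier to read side by side. The following sketch shows one possible combination; all of the numeric values are illustrative assumptions.

```
# A reservation must be created at least 30 minutes before it starts
RESERVATION_MIN_ADVANCE_TIME = 30
# Allow 120 seconds to check nodes and preempt job steps still running
RESERVATION_SETUP_TIME = 120
# Let administrators reserve nodes whose running jobs end after the start time
RESERVATION_PRIORITY = HIGH
# Schedule only bound job steps that will complete before the reservation ends
RESERVATION_CAN_BE_EXCEEDED = false
```

Note that RESERVATION_MIN_ADVANCE_TIME is expressed in minutes, while RESERVATION_SETUP_TIME is expressed in seconds.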

RESTARTS_PER_HOUR

Specifies how many times the master daemon attempts to restart a daemon that dies abnormally. Because one or more of the daemons may be unable to run due to a permanent error, the master only attempts $(RESTARTS_PER_HOUR) restarts within a 60 minute period. Failing that, it sends mail to the administrators identified by the LOADL_ADMIN keyword and exits.

Syntax:

RESTARTS_PER_HOUR = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is 12.

RESUME_ON_SWITCH_TABLE_ERROR_CLEAR

Specifies whether or not the startd that was drained when the switch table failed to unload will automatically resume once the unload errors are cleared. The unload error is considered cleared after LoadLeveler can successfully unload the switch table. For this keyword to work, the DRAIN_ON_SWITCH_TABLE_ERROR option in the configuration file must be turned on and not disabled. Flushing, suspending, or draining of a startd manually or automatically will disable this option until the startd is manually resumed.

Syntax:

RESUME_ON_SWITCH_TABLE_ERROR_CLEAR = true | false

Default value: false
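
Example: Because this keyword takes effect only when DRAIN_ON_SWITCH_TABLE_ERROR is turned on, the two keywords would typically be set together. This is a sketch of that pairing, not a recommended setting.

```
# Drain the startd when the switch table fails to unload
DRAIN_ON_SWITCH_TABLE_ERROR = true
# Resume the startd automatically once the switch table unloads successfully
RESUME_ON_SWITCH_TABLE_ERROR_CLEAR = true
```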

RSET_SUPPORT

Indicates the level of RSet support present on a machine.

Restriction: RSet support is not available on Linux platforms.

Syntax:

RSET_SUPPORT = option

The available options are:

RSET_CONSUMABLE_CPUS

Indicates that the jobs scheduled to the machine will be attached to RSets with the number of CPUs specified by the consumableCPUs variable.

RSET_MCM_AFFINITY

Indicates that the machine can run jobs requesting MCM (memory or adapter) and processor (cache or SMT) affinity.

RSET_NONE

Indicates that LoadLeveler RSet support is not available on the machine.

RSET_USER_DEFINED

Indicates that the machine can be used for jobs with a user-created RSet in their job command file.

Default value: RSET_NONE
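
Example: A minimal sketch, for an AIX machine, of a setting that lets the machine run jobs requesting MCM and processor affinity. The choice of option is illustrative.

```
RSET_SUPPORT = RSET_MCM_AFFINITY
```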

SAVELOGS

Specifies the directory in which log files are archived.

Syntax:

SAVELOGS = directory

Where directory is the directory in which log files will be archived.

Default value: No default value is set.

For more information related to using this keyword, see Configuring recording activity and log files.
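
Example: A minimal sketch that archives log files to a dedicated directory. The path /var/loadl/savelogs is hypothetical; use a directory that exists on your system.

```
# Hypothetical archive directory
SAVELOGS = /var/loadl/savelogs
```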

SAVELOGS_COMPRESS_PROGRAM

Specifies the program used to compress logs after they are copied to the SAVELOGS directory. If this keyword is not specified, the logs are copied to the SAVELOGS directory but are not compressed.

Syntax:

SAVELOGS_COMPRESS_PROGRAM = program

Where program is a specific executable program. It can be a system-provided facility (such as /bin/gzip) or an administrator-provided executable program. The value must be a full path name and can contain command-line arguments. LoadLeveler will call the program as: program filename.

Default value: If blank, the logs are not compressed.

Example: In this example, LoadLeveler will run the gzip -f command. The log file in SAVELOGS will be compressed after it is copied to SAVELOGS. If the program cannot be found or is not executable, LoadLeveler will log the error and SAVELOGS will remain uncompressed.

SAVELOGS_COMPRESS_PROGRAM = /bin/gzip -f

SCHEDD

Location of the Schedd executable (LoadL_schedd).

Syntax:

SCHEDD = directory

Default value: $(BIN)/LoadL_schedd

For more information related to using this keyword, see How LoadLeveler daemons process jobs.

SCHEDD_COREDUMP_DIR

Specifies the local directory for storing LoadL_schedd core dump files.

Syntax:

SCHEDD_COREDUMP_DIR = directory

Default value: The /tmp directory.

For more information related to using this keyword, see Specifying file and directory locations.

SCHEDD_INTERVAL

Specifies the interval, in seconds, at which the Schedd daemon checks the local job queue and updates the negotiator daemon.

Syntax:

SCHEDD_INTERVAL = number

number must be a numerical value and cannot be an arithmetic expression.

Default value: The default is 60 seconds.

SCHEDD_RUNS_HERE

Specifies whether the Schedd daemon runs on the host. If you do not want to run the Schedd daemon, specify false.

This keyword does not designate a machine as a public scheduling machine. Unless configured as a public scheduling machine, a machine configured to run the Schedd daemon will only accept job submissions from the same machine running the Schedd daemon. A public scheduling machine accepts job submissions from other machines in the LoadLeveler cluster. To configure a machine as a public scheduling machine, see the schedd_host keyword description in Administration file keyword descriptions.

Syntax:

SCHEDD_RUNS_HERE = true | false

Default value: true

SCHEDD_SUBMIT_AFFINITY

Specifies whether job submissions are directed to a locally running Schedd daemon. When the keyword is set to true, job submissions are directed to a Schedd daemon running on the same machine where the submission takes place, provided there is a Schedd daemon running on that machine. In this case the submission is said to have "affinity" for the local Schedd daemon. If there is no Schedd daemon running on the machine where the submission takes place, or if this keyword is set to false, the job submission will only be directed to a Schedd daemon serving as a public scheduling machine. In this case, if there are no public scheduling machines configured the job cannot be submitted. A public scheduling machine accepts job submissions from other machines in the LoadLeveler cluster. To configure a machine as a public scheduling machine, see the schedd_host keyword description in Administration file keyword descriptions.

Installations with a large number of nodes should consider setting this keyword to false to more evenly distribute dispatching of jobs among the Schedd daemons. For more information, see Scaling considerations.

Syntax:

SCHEDD_SUBMIT_AFFINITY = true | false 

Default value: true
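
Example: The following sketch shows how SCHEDD_RUNS_HERE and SCHEDD_SUBMIT_AFFINITY might be set in a large cluster where job submissions are to be handled only by public scheduling machines. Where each statement is placed (global or local configuration file) is an assumption made for illustration.

```
# In the local configuration file of a node that should not run the Schedd daemon
SCHEDD_RUNS_HERE = false
# Direct job submissions only to public scheduling machines
SCHEDD_SUBMIT_AFFINITY = false
```

With these settings, at least one public scheduling machine must be configured through the schedd_host keyword in the administration file; otherwise, jobs cannot be submitted.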

SCHEDD_STATUS_PORT

Specifies the port number used when connecting to the Schedd daemon.

Syntax:

SCHEDD_STATUS_PORT = port number

Default value: The default is 9606.

For more information related to using this keyword, see Defining network characteristics.

SCHEDD_STREAM_PORT

Specifies the port number used when connecting to the Schedd daemon.

Syntax:

SCHEDD_STREAM_PORT = port number

Default value: The default is 9605.

For more information related to using this keyword, see Defining network characteristics.

SCHEDULE_BY_RESOURCES

Specifies which consumable resources are considered by the LoadLeveler schedulers. Each consumable resource name may be an administrator-defined alphanumeric string, or may be one of the following predefined resources:

·        ConsumableCpus

·        ConsumableMemory

·        ConsumableVirtualMemory

·        RDMA

Each string may only appear in the list once. These resources are either floating resources, or machine resources. If any resource is specified incorrectly with the SCHEDULE_BY_RESOURCES keyword, then all scheduling resources will be ignored.

Syntax:

SCHEDULE_BY_RESOURCES = name name ... name

Default value: No default value is set.
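
Example: A minimal sketch that asks the schedulers to consider two of the predefined resources listed above. The choice of resources is illustrative.

```
# Each resource name may appear only once in the list
SCHEDULE_BY_RESOURCES = ConsumableCpus ConsumableMemory
```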

SCHEDULER_TYPE

Specifies the LoadLeveler scheduling algorithm:

LL_DEFAULT

Specifies the default LoadLeveler scheduling algorithm. If SCHEDULER_TYPE has not been defined, LoadLeveler will use the default scheduler (LL_DEFAULT).

BACKFILL

Specifies the LoadLeveler BACKFILL scheduler. When you specify this value, you should use only the default settings for the START expression and the other job control expressions described in Managing job status through control expressions.

API

Specifies that you will use an external scheduler. External schedulers communicate to LoadLeveler through the job control API. For more information on setting an external scheduler, see Using an external scheduler.

Syntax:

SCHEDULER_TYPE = LL_DEFAULT | BACKFILL | API 

Default value: LL_DEFAULT

Note:

1.     If a scheduler type is not set, LoadLeveler will start, but it will use the default scheduler.

2.     If you have set SCHEDULER_TYPE with an option that is not valid, LoadLeveler will not start.

3.     If you change the scheduler option specified by SCHEDULER_TYPE, you must stop and restart LoadLeveler using llctl or recycle using llctl.

For more information related to using this keyword, see Defining a LoadLeveler cluster.

SEC_ADMIN_GROUP

When security services are enabled, this keyword points to the name of the UNIX group that contains the local identities of the LoadLeveler administrators.

Restriction: CtSec security is not supported on LoadLeveler for Linux.

Syntax:

SEC_ADMIN_GROUP = name of lladmin group

Default value: No default value is set.

For more information related to using this keyword, see Configuring LoadLeveler to use cluster security services.

SEC_ENABLEMENT

Specifies the security mechanism to be used.

Restriction: Do not set this keyword to CtSec in the configuration file for a Linux machine. CtSec security is not supported on LoadLeveler for Linux.

Syntax:

SEC_ENABLEMENT = COMPAT | CTSEC

Default value: No default value is set.

SEC_SERVICES_GROUP

When security services are enabled, this keyword specifies the name of the LoadLeveler services group.

Restriction: CtSec security is not supported on LoadLeveler for Linux.

Syntax:

SEC_SERVICES_GROUP = group name

Where group name defines the identities of the LoadLeveler daemons.

Default value: No default value is set.
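
Example: The following sketch shows how the security keywords might appear together on an AIX machine that uses cluster security services. The group names lladmin and llservices are hypothetical and must correspond to groups that exist on your system.

```
# AIX only; CtSec is not supported on LoadLeveler for Linux
SEC_ENABLEMENT = CTSEC
# Hypothetical UNIX group containing the LoadLeveler administrators
SEC_ADMIN_GROUP = lladmin
# Hypothetical group defining the identities of the LoadLeveler daemons
SEC_SERVICES_GROUP = llservices
```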

SEC_IMPOSED_MECHS

Specifies a blank-delimited list of LoadLeveler's permitted security mechanisms when Cluster Security (CtSec) services are enabled.

Restriction: CtSec security is not supported on LoadLeveler for Linux.

Syntax: Specify a blank delimited list containing combinations of the following values:

none

If this is the only value specified, then users will run unauthenticated and, if authorization is necessary, the job will fail. If this is not the only value specified, then users may run unauthenticated and, if authorization is necessary, the job will fail.

unix

If this is the only value specified, then UNIX host-based authentication will be used; otherwise, other mechanisms may be used.

Default value: No default value is set.

Example:

SEC_IMPOSED_MECHS = none unix

SPOOL

Defines the local directory where LoadLeveler keeps the local job queue and checkpoint files.

Syntax:

SPOOL = local directory/spool

Default value: $(tilde)/spool

START

Determines whether a machine can run a LoadLeveler job.

Syntax:

START: expression that evaluates to T or F (true or false)

When the expression evaluates to T, LoadLeveler considers dispatching a job to the machine. When you use a START expression that is based on the CPU load average, the negotiator may evaluate the expression as F even though the load average indicates the machine is Idle. This is because the negotiator adds a compensating factor to the startd machine's load average every time the negotiator assigns a job. For more information, see the NEGOTIATOR_INTERVAL keyword.

Default value: No default value is set, which means that no jobs will be started.

For information about time-related variables that you may use for this keyword, see Variables to use for setting times.
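
Example: Because no default value is set, some START expression must be defined before any job can be started. The simplest possible sketch uses the constant T, so that the machine is always considered for dispatch.

```
# Always consider this machine when dispatching jobs
START : T
```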

START_CLASS

Specifies the rule for starting a job of the incoming_class. The START_CLASS rule is applied whenever the BACKFILL scheduler decides whether a job step of the incoming_class should start or not.

Syntax:

START_CLASS[incoming_class] = (start_class_expression) [ && (start_class_expression) ...]

Where start_class_expression takes the form:

run_class < number_of_tasks

Which indicates that a job step of the incoming_class is only allowed to run on a node when the number of tasks of run_class running on that node is less than number_of_tasks.

Note:

1.     START_CLASS [allclasses] will be ignored.

2.     The job class specified by run_class may be the same as or different from the class specified by incoming_class.

3.     You can also define run_class as allclasses. If you do, the total number of all job tasks running on that node cannot exceed the value specified by number_of_tasks.

4.     A class name or allclasses should not appear twice on the right-hand side of the keyword statement. However, you can use other class names with allclasses on the right hand side of the statement.

5.     If there is more than one start_class_expression, you must use && between adjacent start_class_expressions.

6.     Both the START keyword and the START_CLASS keyword have to be true before a new job can start.

7.     Parentheses ( ) are optional around start_class_expression.

For information related to using this keyword, see Planning to preempt jobs.

Default value: No default value is set.

Examples:

START_CLASS[Class_A] = (Class_A < 1)

This statement indicates that a Class_A job can only start on nodes that do not have any Class_A jobs running.

START_CLASS[Class_B] = allclasses < 5

This statement indicates that a Class_B job can only start on nodes with a maximum of 4 tasks running.
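
The two forms above can also be combined with &&, as described in the notes. In the following sketch, Class_C is a hypothetical class name and the task limits are illustrative.

```
START_CLASS[Class_C] = (Class_C < 2) && (allclasses < 8)
```

Read against the syntax rules above, this statement would allow a Class_C job step to start on a node only when fewer than 2 Class_C tasks and fewer than 8 tasks of all classes are running on that node.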

START_DAEMONS

Specifies whether to start the LoadLeveler daemons on the node.

Syntax:

START_DAEMONS = true | false 

Default value: true

When true, the daemons are started. In most cases, you will probably want to set this keyword to true. An example of why this keyword would be set to false is if you want to run the daemons on most of the machines in the cluster but some individual users with their own local configuration files do not want their machines to run the daemons. The individual users would modify their local configuration files and set this keyword to false. Because the global configuration file has the keyword set to true, their individual machines would still be able to participate in the LoadLeveler cluster.

Also, to define the machine as strictly a submit-only machine, set this keyword to false.
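
Example: A minimal sketch of the statement that might be placed in the local configuration file of a machine that is to be strictly a submit-only machine.

```
# Local configuration file: do not start the LoadLeveler daemons on this machine
START_DAEMONS = false
```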

STARTD

Location of the startd executable (LoadL_startd).

Syntax:

STARTD = directory

Default value: $(BIN)/LoadL_startd

For more information related to using this keyword, see How LoadLeveler daemons process jobs.

STARTD_COREDUMP_DIR

Local directory for storing LoadL_startd core dump files.

Syntax:

STARTD_COREDUMP_DIR = directory

Default value: The /tmp directory.

For more information related to using this keyword, see Specifying file and directory locations.

STARTD_DGRAM_PORT

Specifies the port number used when connecting to the startd daemon.

Syntax:

STARTD_DGRAM_PORT = port number

Default value: The default is 9615.

For more information related to using this keyword, see Defining network characteristics.

STARTD_RUNS_HERE

Specifies whether the startd daemon runs on the host. If you do not want to run the startd daemon, specify false.

Syntax:

STARTD_RUNS_HERE = true | false 

Default value: true

STARTD_STREAM_PORT

Specifies the port number used when connecting to the startd daemon.

Syntax:

STARTD_STREAM_PORT = port number

Default value: The default is 9611.

For more information related to using this keyword, see Defining network characteristics.
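
Example: For reference, the following sketch sets the Schedd and startd port keywords described in this topic explicitly to their documented default values. A site that needs different ports would change the numbers; the ones shown are simply the defaults restated.

```
SCHEDD_STATUS_PORT = 9606
SCHEDD_STREAM_PORT = 9605
STARTD_DGRAM_PORT = 9615
STARTD_STREAM_PORT = 9611
```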

STARTER

Location of the starter executable (LoadL_starter).

Syntax:

STARTER = directory

Default value: $(BIN)/LoadL_starter

For more information related to using this keyword, see How LoadLeveler daemons process jobs.

STARTER_COREDUMP_DIR

Local directory for storing LoadL_starter core dump files.

Syntax:

STARTER_COREDUMP_DIR = directory

Default value: The /tmp directory.

For more information related to using this keyword, see Specifying file and directory locations.

SUBMIT_FILTER

Specifies the program you want to run to filter a job script when the job is submitted.

Syntax:

SUBMIT_FILTER = full_path_to_executable

Where full_path_to_executable is called with the job command file as the standard input. The standard output is submitted to LoadLeveler. If the program returns with a nonzero exit code, the job submission is canceled. A submit filter can only make changes to LoadLeveler job command file keyword statements.

Default value: No default value is set.

Multicluster use: In a multicluster environment, if you specified a valid cluster list with either the llsubmit -X option or the ll_cluster API, then the SUBMIT_FILTER will instead be invoked with a modified job command file that contains a cluster_list keyword generated from either the llsubmit -X option or the ll_cluster API.

The modified job command file will contain an inserted # @ cluster_list = cluster statement just prior to the first # @ queue statement. This cluster_list statement takes precedence and overrides all previous specifications of any cluster_list statements from the original job command file.

Example: SUBMIT_FILTER in a multicluster environment

The following job command file, job.cmd, requests to be run remotely on cluster1:

#!/bin/sh
# @ cluster_list = cluster1
# @ error = job1.$(Host).$(Cluster).$(Process).err
# @ output = job1.$(Host).$(Cluster).$(Process).out
# @ queue

After issuing llsubmit -X cluster2 job.cmd, the modified job command file statements will be run on cluster2:

#!/bin/sh
# @ cluster_list = cluster1
# @ error = job1.$(Host).$(Cluster).$(Process).err
# @ output = job1.$(Host).$(Cluster).$(Process).out
# @ cluster_list = cluster2
# @ queue

For more information related to using this keyword, see Filtering a job script.
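
Example: A minimal sketch of the configuration statement itself. The path /u/loadl/bin/submit_filter is hypothetical; it stands for an administrator-provided executable that reads the job command file from standard input and writes the filtered job command file to standard output.

```
# Hypothetical full path to the filter program
SUBMIT_FILTER = /u/loadl/bin/submit_filter
```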

SUSPEND

Determines whether running jobs should be suspended.

Syntax:

SUSPEND: expression that evaluates to T or F (true or false)

When T, LoadLeveler temporarily suspends jobs currently running on the machine. Suspended LoadLeveler jobs will either be continued or vacated. This keyword is not supported for parallel jobs.

Default value: No default value is set.

For information about time-related variables that you may use for this keyword, see Variables to use for setting times.

SYSPRIO

System priority expression.

Syntax:

SYSPRIO : expression

You can use the following LoadLeveler variables to define the SYSPRIO expression:

·        ClassSysprio

·        GroupQueuedJobs

·        GroupRunningJobs

·        GroupSysprio

·        GroupTotalJobs

·        GroupTotalShares

·        GroupUsedBgShares

·        GroupUsedShares

·        JobIsBlueGene

·        QDate

·        UserPrio

·        UserQueuedJobs

·        UserRunningJobs

·        UserSysprio

·        UserTotalJobs

·        UserTotalShares

·        UserUsedBgShares

·        UserUsedShares

For detailed descriptions of these variables, see LoadLeveler variables.

Default value: 0 (zero)

Note:

1.     The SYSPRIO keyword is valid only on the machine where the central manager is running. Using this keyword in a local configuration file has no effect.

2.     It is recommended that you do not use UserPrio in the SYSPRIO expression, since user jobs are already ordered by UserPrio.

3.     The string SYSPRIO can be used as both the name of an expression (SYSPRIO: value) and the name of a variable (SYSPRIO = value).

To specify the expression to be used to calculate job priority you must use the syntax for the SYSPRIO expression. If the variable is mistakenly used for the SYSPRIO expression, which requires a colon (:) after the name, the job priority value will always be 0 because the SYSPRIO expression has not been defined.

4.     When the UserRunningJobs, GroupRunningJobs, UserQueuedJobs, GroupQueuedJobs, UserTotalJobs, GroupTotalJobs, GroupTotalShares, GroupUsedShares, UserTotalShares, UserUsedShares, GroupUsedBgShares, JobIsBlueGene, and UserUsedBgShares variables are used to prioritize the queue based on current usage, you should also set NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL so that the priorities are adjusted according to current usage rather than usage only at submission time.

Examples:

·        Example 1

This example creates a FIFO job queue based on submission time:

SYSPRIO : 0 - (QDate)

·        Example 2

This example accounts for Class, User, and Group system priorities:

SYSPRIO : (ClassSysprio * 100) + (UserSysprio * 10) + (GroupSysprio * 1) - (QDate)

·        Example 3

This example orders the queue based on the number of jobs a user is currently running. The user who has the fewest jobs running is first in the queue. You should set NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL in conjunction with this SYSPRIO expression.

SYSPRIO : 0 - UserRunningJobs

·        Example 4

This example shows one possible way to set up the SYSPRIO expression for fair share scheduling. For those jobs whose owner has no unused shares ($(UserHasShares) = 0), the job priority depends only on QDate, making it a simple FIFO queue as in Example 1.

For those jobs whose owner has unused shares ($(UserHasShares) = 1), job priority depends not only on QDate, but also on a uniform boost of 31536000 (the number of seconds in a 365-day year, which is equivalent to the job being submitted one year earlier). These jobs still have priority differences because of submit time differences. The effect is to form two priority tiers: the higher priority tier for jobs with unused shares and the lower priority tier for jobs without unused shares.

SYSPRIO: 31536000 * $(UserHasShares) - QDate

·        Example 5

This example divides the jobs into three priority tiers:

o       Those jobs whose owner and group both have unused shares are at the top tier

o       Those jobs whose owner or group has unused shares are at the middle tier

o       Those jobs whose owner and group both have no shares remaining are at the bottom tier

A user can submit two jobs to two different groups, the first job to a group with shares remaining and the second job to a group without any unused shares. If the user has unused shares, the first job will belong to the top tier and the second job will belong to the middle tier. If the user has no shares remaining, the first job will belong to the middle tier and the second job will belong to the bottom tier. The jobs in the top tier will be considered to run first, then the jobs in the middle tier, and lastly the jobs in the bottom tier.

SYSPRIO: 31536000 * ($(UserHasShares)+$(GroupHasShares)) - (QDate)

For more information related to using this keyword, see Setting negotiator characteristics and policies.

SYSPRIO_THRESHOLD_TO_IGNORE_STEP

Specifies a threshold value for system priority. When the system priority assigned to a job step is less than the value set for this keyword, the scheduler ignores the job step, which remains in Idle state.

Syntax:

SYSPRIO_THRESHOLD_TO_IGNORE_STEP = integer

Any integer is a valid value.

Default value: INT_MIN

For more information related to using this keyword, see Controlling the central manager scheduling cycle.
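
Example: The following sketch pairs a threshold with a SYSPRIO expression to show how the two might interact. Both the expression and the threshold of 0 are illustrative assumptions.

```
# Priority built only from class and user system priorities
SYSPRIO : (ClassSysprio * 100) + (UserSysprio * 10)
# Job steps whose system priority is less than 0 are ignored and remain Idle
SYSPRIO_THRESHOLD_TO_IGNORE_STEP = 0
```

A threshold such as this should be chosen with the SYSPRIO expression in mind. For instance, an expression that subtracts QDate, as in the SYSPRIO examples, generally yields negative values, so a threshold of 0 would cause those job steps to be ignored.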

TRUNC_GSMONITOR_LOG_ON_OPEN

When true, specifies that the log file is restarted with every invocation of the daemon.

Syntax:

TRUNC_GSMONITOR_LOG_ON_OPEN = true | false

Default value: false

For more information related to using this keyword, see Configuring recording activity and log files.

TRUNC_KBDD_LOG_ON_OPEN

When true, specifies the log file is restarted with every invocation of the daemon.

Syntax:

TRUNC_KBDD_LOG_ON_OPEN = true | false

Default value: false

For more information related to using this keyword, see Configuring recording activity and log files.

TRUNC_MASTER_LOG_ON_OPEN

When true, specifies the log file is restarted with every invocation of the daemon.

Syntax:

TRUNC_MASTER_LOG_ON_OPEN = true | false

Default value: false

For more information related to using this keyword, see Configuring recording activity and log files.

TRUNC_NEGOTIATOR_LOG_ON_OPEN

When true, specifies the log file is restarted with every invocation of the daemon.

Syntax:

TRUNC_NEGOTIATOR_LOG_ON_OPEN = true | false

Default value: false

For more information related to using this keyword, see Configuring recording activity and log files.

TRUNC_SCHEDD_LOG_ON_OPEN

When true, specifies the log file is restarted with every invocation of the daemon.

Syntax:

TRUNC_SCHEDD_LOG_ON_OPEN = true | false

Default value: false

For more information related to using this keyword, see Configuring recording activity and log files.

TRUNC_STARTD_LOG_ON_OPEN

When true, specifies that the startd log file is restarted with every invocation of the daemon.

Syntax:

TRUNC_STARTD_LOG_ON_OPEN = true | false

Default value: false

For more information related to using this keyword, see Configuring recording activity and log files.

TRUNC_STARTER_LOG_ON_OPEN

When true, specifies that the starter log file is restarted with every invocation of the daemon.

Syntax:

TRUNC_STARTER_LOG_ON_OPEN = true | false

Default value: false

For more information related to using this keyword, see Configuring recording activity and log files.

UPDATE_ON_POLL_INTERVAL_ONLY

Specifies whether the LoadLeveler startd daemons send machine update transactions to the central manager only at polling intervals. Normally, the startd daemons running on executing nodes send transactions to the central manager at various times to update machine information. An update is sent every polling interval. The polling interval is calculated by multiplying the values of the two keywords POLLING_FREQUENCY and POLLS_PER_UPDATE, which are specified in the LoadLeveler configuration file.

In addition, updates are sent at other times, such as when new jobs start and when jobs terminate on the executing node. If you have a large and highly active cluster (that is, the workload consists of a large number of short-running jobs), the normal method for updating the central manager can add excessive network traffic. UPDATE_ON_POLL_INTERVAL_ONLY can help reduce this source of network traffic.

When true is specified, the LoadLeveler startd daemon sends machine updates to the central manager only at each polling interval and not at other times.

Syntax:

UPDATE_ON_POLL_INTERVAL_ONLY = false | true

Default value: false
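
Example: This example specifies that the startd daemons send machine updates only at each polling interval. The POLLING_FREQUENCY and POLLS_PER_UPDATE values shown are illustrative assumptions, not recommendations. Assuming POLLING_FREQUENCY is expressed in seconds, the polling interval with these values would be 5 multiplied by 24, or 120 seconds.

POLLING_FREQUENCY = 5

POLLS_PER_UPDATE = 24

UPDATE_ON_POLL_INTERVAL_ONLY = true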

VACATE

Determines whether suspended jobs should be vacated.

Syntax:

VACATE: expression that evaluates to T or F (true or false)

When T, suspended LoadLeveler jobs are removed from the machine and placed back into the queue (provided you specify restart=yes in the job command file). If a checkpoint was taken, the job restarts from the checkpoint. Otherwise, the job restarts from the beginning.

Default value: No default value is set.
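
Example: This example is a minimal sketch that uses a constant expression rather than time-related variables. It specifies that suspended jobs are never vacated.

VACATE: F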

For information about time-related variables that you may use for this keyword, see Variables to use for setting times.

VM_IMAGE_ALGORITHM

Specifies the virtual memory algorithm used for checking the image_size requirement. This keyword is used together with the large_page job command file keyword to specify which algorithm the central manager uses to decide whether a machine has enough virtual memory to run a job step.

This keyword is critical for job steps that must use Large Page memory (specified by the job command file keyword large_page=M). If the VM_IMAGE_ALGORITHM keyword is set to FREE_PAGING_SPACE, the Large Page job step will never be scheduled to run. This keyword must be set to FREE_PAGING_SPACE_PLUS_FREE_REAL_MEMORY to run Large Page jobs.

When FREE_PAGING_SPACE is specified, LoadLeveler considers only free paging space when determining if a machine has enough virtual memory to run a job step.

When FREE_PAGING_SPACE_PLUS_FREE_REAL_MEMORY is specified and the job step specifies:

·        large_page=N (does not use Large Page memory), LoadLeveler considers free paging space and free regular memory when determining if a machine has enough virtual memory to run a job step.

·        large_page=Y (uses Large Page memory, if available), LoadLeveler considers free paging space, free regular memory, and free Large Page memory when determining if a machine has enough virtual memory to run a job step, although Large Page memory is only considered for machines configured to exploit the Large Page feature.

·        large_page=M (must use Large Page memory), LoadLeveler considers only Large Page memory when determining if a machine has enough virtual memory to run a job step. Only machines configured to exploit the Large Page feature are considered.

IBM® suggests that you set this keyword to the value FREE_PAGING_SPACE_PLUS_FREE_REAL_MEMORY because more types of virtual memory are considered, which increases the chances of finding a machine with enough virtual memory to run the job step.

Syntax:

VM_IMAGE_ALGORITHM = FREE_PAGING_SPACE | FREE_PAGING_SPACE_PLUS_FREE_REAL_MEMORY

Default value: FREE_PAGING_SPACE
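
Example: This example specifies that free paging space and free real memory are both considered, which is the setting required for job steps that specify large_page=M to be scheduled.

VM_IMAGE_ALGORITHM = FREE_PAGING_SPACE_PLUS_FREE_REAL_MEMORY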

WALLCLOCK_ENFORCE

Specifies whether the job command file keyword wall_clock_limit is enforced. The WALLCLOCK_ENFORCE keyword is valid only when an external scheduler is enabled.

Syntax:

WALLCLOCK_ENFORCE = true | false

Default value: true
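
Example: This example specifies that the wall_clock_limit keyword is not enforced. The setting applies only when an external scheduler is enabled.

WALLCLOCK_ENFORCE = false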

X_RUNS_HERE

Specifies whether the kbdd (keyboard) daemon runs on the host. If you do not want to run the kbdd daemon, specify false.

Syntax:

X_RUNS_HERE = true | false

Default value: true
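
Example: This example specifies that the kbdd daemon does not run on the host.

X_RUNS_HERE = false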
