sge_conf(5) - Linux man pages

NAME
xxqs_name_sxx_conf - xxQS_NAMExx configuration files
DESCRIPTION
xxqs_name_sxx_conf defines the global and local xxQS_NAMExx configurations. They can be shown and modified with the -sconf and -mconf options of qconf(1). Only root or the cluster administrator may modify xxqs_name_sxx_conf.
At its initial start-up, xxqs_name_sxx_qmaster(8) checks whether a valid xxQS_NAMExx configuration is available at a well-known location in the xxQS_NAMExx internal directory hierarchy. If so, it loads that configuration information and proceeds. If not, it writes a generic configuration containing default values to that same location. The xxQS_NAMExx execution daemons retrieve their configuration from xxqs_name_sxx_qmaster(8) upon start-up.
The actual configuration for both xxqs_name_sxx_qmaster(8) and xxqs_name_sxx_execd(8) is a superposition of a global configuration and a local configuration pertinent to the host on which the master or execution daemon resides. If a local configuration is available, its entries override the corresponding entries of the global configuration. Note: the local configuration does not have to contain all valid configuration entries, only those which need to differ from the global ones.
FORMAT
The paragraphs that follow provide brief descriptions of the individual parameters that compose the global and local configurations for a xxQS_NAMExx cluster:
execd_spool_dir
The execution daemon spool directory path. A feasible spool directory requires read/write access permission for root. The entry in the global configuration for this parameter can be overridden by execution host local configurations, i.e. each xxqs_name_sxx_execd(8) may have a private spool directory with a different path, in which case it needs to provide read/write permission for the root account of the corresponding execution host only.
Under execd_spool_dir a directory named corresponding to the unqualified hostname of the execution host is opened and contains all information spooled to disk. Thus, it is possible for the execd_spool_dirs of all execution hosts to physically reference the same directory path (the root access restrictions mentioned above need to be met, however).
Changing the global execd_spool_dir parameter set at installation time is not supported in a running system. If the change must nevertheless be made, all affected execution daemons have to be restarted. Make sure running jobs have finished beforehand, otherwise they will be lost.
The default location for the execution daemon spool directory is $xxQS_NAME_Sxx_ROOT/$xxQS_NAME_Sxx_CELL/spool.
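For example, an execution host can be given a private spool area on a local disk through its host-local configuration; the host name and path below are illustrative only:

```
# qconf -mconf node01   (edits the local configuration of host node01)
execd_spool_dir /local/disk/sge/spool
```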
mailer
mailer is the absolute pathname of the electronic mail delivery agent on your system. An optional prefix "user@" specifies the user under which the agent is to be started; the default is root. The mailer must accept the following syntax:
- mailer -s subject-of-mail-message recipient
Each xxqs_name_sxx_execd(8) may use a private mail agent. Changing mailer takes immediate effect.
The default for mailer depends on the operating system of the host on which the xxQS_NAMExx master installation was run. Common values are /bin/mail or /usr/bin/Mail. Note that since the mail is sent by compute hosts, not the master, it may be necessary to take steps to route it appropriately, e.g. by using a cluster head node as a "smart host" for the private network.
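Any replacement mailer must honor the `-s subject recipient` calling convention, with the message body arriving on stdin. The sketch below is a hypothetical stand-in (not part of xxQS_NAMExx) that logs messages to a file instead of sending them, which is handy when validating the interface on a compute host:

```shell
# fake_mailer: stand-in obeying the calling convention xxQS_NAMExx expects:
#   mailer -s subject-of-mail-message recipient
# The message body arrives on stdin; this version appends it to a log file
# instead of sending mail (all names here are illustrative).
LOG=$(mktemp)

fake_mailer() {
    [ "$1" = "-s" ] || { echo "usage: fake_mailer -s subject recipient" >&2; return 1; }
    subject=$2
    recipient=$3
    { echo "To: $recipient"; echo "Subject: $subject"; cat; echo "--"; } >> "$LOG"
}

# Simulate the notification a daemon would send on job completion.
echo "Job 42 exited with status 0" | fake_mailer -s "Job 42 complete" admin@example.com
```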
xterm
xterm is the absolute pathname of the X Window System terminal emulator, xterm(1).
Changing xterm will take immediate effect.
The default for xterm is system-dependent.
load_sensor
A comma-separated list of executable shell scripts or programs to be started by xxqs_name_sxx_execd(8) in order to retrieve site-configurable load information (e.g. free space on a certain disk partition).
Each xxqs_name_sxx_execd(8) may use a set of private load_sensor programs or scripts. Changing load_sensor will take effect after one load report interval (see load_report_time), or two if DEMAND_LS is 0 in execd_params. A load sensor will be restarted automatically if the file modification time of the load sensor executable changes.
The global configuration entry for this value may be overwritten by the execution host local configuration.
In addition to the load sensors configured via load_sensor, xxqs_name_sxx_execd(8) searches for an executable file named qloadsensor in the execution host's xxQS_NAMExx binary directory path. If such a file is found, it is treated like the configurable load sensors defined in load_sensor. This facility is intended for pre-installing a default load sensor.
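A load sensor runs in a loop: it blocks on stdin until the execution daemon writes a line, then answers with a begin/end-delimited block of host:resource:value records, and exits on "quit". The sketch below writes such a sensor to a temporary file and drives one measurement cycle the way the daemon would; the resource name "tmpfree" is an assumption and would have to exist as a complex attribute before the reported value could be used:

```shell
# Minimal load sensor sketch, written to a temp file so it can be exercised.
# Protocol: block on stdin, answer each request with a begin/end block of
# host:resource:value lines, exit on "quit". The resource name "tmpfree"
# is hypothetical.
SENSOR=$(mktemp)
cat > "$SENSOR" <<'EOF'
#!/bin/sh
HOST=$(hostname)
while read -r line; do
    [ "$line" = "quit" ] && exit 0
    free_kb=$(df -Pk /tmp | awk 'NR==2 {print $4}')
    echo "begin"
    echo "$HOST:tmpfree:$free_kb"
    echo "end"
done
EOF

# Drive one measurement cycle the way the execution daemon would.
out=$(printf '\nquit\n' | sh "$SENSOR")
echo "$out"
```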
prolog
The path of an executable, with optional arguments, that is started before execution of xxQS_NAMExx jobs, with the same environment setting as that for the xxQS_NAMExx jobs to be started afterwards. The prolog command is started directly, not in a shell. An optional prefix "user@" specifies the user under which this procedure is to be started; in that case see the SECURITY section below concerning the security issues of running as a privileged user. The procedure's standard output and error output streams are written to the same file as used for the standard output and error output of each job.
This procedure is intended as a means for the xxQS_NAMExx administrator to automate the execution of general site-specific tasks, like the preparation of temporary file systems, with a need for the same context information as the job. For a parallel job, only a single instance of the prolog is run, on the master node. Each xxqs_name_sxx_execd(8) may use a private prolog. Correspondingly, the global or execution host local configuration can be overridden by the queue configuration (see queue_conf(5)). Changing prolog will take immediate effect.
The default for prolog is the special value NONE, which prevents execution of a prolog.
The following special variables, expanded at runtime, can be used (besides any other strings which have to be interpreted by the procedure) to compose a command line:
- $host - The name of the host on which the prolog or epilog procedures are started.
- $ja_task_id - The array job task index (0 if not an array job).
- $job_owner - The user name of the job owner.
- $job_id - xxQS_NAMExx's unique job identification number.
- $job_name - The name of the job.
- $queue - The cluster queue name of the master queue instance, i.e. the cluster queue in which the prolog and epilog procedures are started.
- $stdin_path - The pathname of the stdin file. This is always /dev/null for prolog, pe_start, pe_stop and epilog. In the job script it is the pathname of the job's stdin file. When delegated file staging is enabled, this path is set to $fs_stdin_tmp_path. When delegated file staging is not enabled, it is the stdin pathname given via DRMAA or qsub.
- $stdout_path / $stderr_path - The pathname of the stdout/stderr file. This always points to the output/error file. When delegated file staging is enabled, this path is set to $fs_stdout_tmp_path/$fs_stderr_tmp_path. When delegated file staging is not enabled, it is the stdout/stderr pathname given via DRMAA or qsub.
- $merge_stderr - If this flag is 1, stdout and stderr are merged in one file, the stdout file. Otherwise (the default), no merging is done. Merging of stderr and stdout can be requested via the DRMAA job template attribute 'drmaa_join_files' (see drmaa_attributes(3)) or the qsub parameter '-j y' (see qsub(1)).
- $fs_stdin_host - When delegated file staging is requested for the stdin file, this is the name of the host where the stdin file has to be copied from before the job is started.
- $fs_stdout_host / $fs_stderr_host - When delegated file staging is requested for the stdout/stderr file, this is the name of the host where the stdout/stderr file has to be copied to after the job has run.
- $fs_stdin_path - When delegated file staging is requested for the stdin file, this is the pathname of the stdin file on the host $fs_stdin_host.
- $fs_stdout_path / $fs_stderr_path - When delegated file staging is requested for the stdout/stderr file, this is the pathname of the stdout/stderr file on the host $fs_stdout_host/$fs_stderr_host.
- $fs_stdin_tmp_path - When delegated file staging is requested for the stdin file, this is the destination pathname of the stdin file on the execution host. The prolog must copy the stdin file from $fs_stdin_host:$fs_stdin_path to localhost:$fs_stdin_tmp_path to establish delegated file staging of the stdin file.
- $fs_stdout_tmp_path / $fs_stderr_tmp_path - When delegated file staging is requested for the stdout/stderr file, this is the source pathname of the stdout/stderr file on the execution host. The epilog must copy the stdout file from localhost:$fs_stdout_tmp_path to $fs_stdout_host:$fs_stdout_path (the stderr file from localhost:$fs_stderr_tmp_path to $fs_stderr_host:$fs_stderr_path) to establish delegated file staging of the stdout/stderr file.
- $fs_stdin_file_staging / $fs_stdout_file_staging / $fs_stderr_file_staging - When delegated file staging is requested for the stdin/stdout/stderr file, the flag is set to "1", otherwise it is set to "0" (see delegated_file_staging for how to enable delegated file staging). These three flags correspond to the DRMAA job template attribute 'drmaa_transfer_files' (see drmaa_attributes(3)).
If the prolog is written as a shell script, the usual care must be exercised, e.g. when expanding values from the command line or the environment which are user-supplied. In particular, note that the job name could be of the form "; evil doings;". Also, use absolute path names for commands if inheriting the user's environment.
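A defensive prolog treats every expanded value as untrusted data. The sketch below assumes a hypothetical configuration line such as `prolog /usr/local/sge/prolog.sh $host $job_owner $job_id $job_name`; the scratch directory path is likewise an assumption:

```shell
# Prolog sketch showing defensive handling of user-influenced variables.
# Assumed (hypothetical) configuration:
#   prolog /usr/local/sge/prolog.sh $host $job_owner $job_id $job_name
prolog() {
    host=$1 owner=$2 job_id=$3 job_name=$4
    # Quote every expansion: $job_name is user-chosen and may contain
    # shell metacharacters such as "; evil doings;".
    scratch="${TMPDIR:-/tmp}/sge_scratch_$job_id"   # per-job scratch dir (path is an assumption)
    mkdir -p -- "$scratch"
    printf 'prolog: job %s (%s) for %s on %s, scratch %s\n' \
        "$job_id" "$job_name" "$owner" "$host" "$scratch"
}

# A hostile job name passes through as inert data, not as a command.
out=$(prolog node01 alice 4711 '; evil doings;')
echo "$out"
```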
The global configuration entry for this value may be overwritten by the execution host local configuration.
epilog
The path of an executable, with optional arguments, that is started after execution of xxQS_NAMExx jobs, with the same environment setting as that for the xxQS_NAMExx job that has just completed, with the addition of the variable SGE_JOBEXIT_STAT, which holds the exit status of the job. The epilog command is started directly, not in a shell. An optional prefix "user@" specifies the user under which this procedure is to be started; in that case see the SECURITY section below concerning the security issues of running as a privileged user. The procedure's standard output and error output streams are written to the same file used for the standard output and error output of each job.
The same special variables can be used to compose a command line as for the prolog.
This procedure is intended as a means for the xxQS_NAMExx administrator to automate the execution of general site-specific tasks, like the cleaning up of temporary file systems, with a need for the same context information as the job. For a parallel job, only a single instance of the epilog is run, on the master node. Each xxqs_name_sxx_execd(8) may use a private epilog. Correspondingly, the global or execution host local configurations can be overridden by the queue configuration (see queue_conf(5)). Changing epilog will take immediate effect.
The default for epilog is the special value NONE, which prevents execution of an epilog.
The same considerations (above) apply as for a prolog when an epilog is written in shell script.
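An epilog typically cleans up per-job state and reacts to the job's exit status via SGE_JOBEXIT_STAT. A minimal sketch, in which the scratch directory path and the single job-id argument are assumptions, not fixed xxQS_NAMExx conventions:

```shell
# Epilog sketch: remove the job's scratch directory and report failures.
# SGE_JOBEXIT_STAT is set in the epilog's environment by xxQS_NAMExx;
# the scratch path below is an assumption.
epilog() {
    job_id=$1
    scratch="${TMPDIR:-/tmp}/sge_scratch_$job_id"
    rm -rf -- "$scratch"
    if [ "${SGE_JOBEXIT_STAT:-0}" -ne 0 ]; then
        echo "job $job_id failed with exit status $SGE_JOBEXIT_STAT"
    else
        echo "job $job_id finished cleanly"
    fi
}

# Exercise it as if a job had been killed with signal 9 (exit status 137).
SGE_JOBEXIT_STAT=137
out=$(epilog 4711)
echo "$out"
```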
shell_start_mode
Note: Deprecated; may be removed in a future release.
This parameter defines the mechanisms which are used to actually invoke the job scripts on the execution hosts. The following values are recognized:
- unix_behavior - If a user starts a job shell script under UNIX interactively by invoking it just with the script name, the operating system's executable loader uses the information provided in a comment such as `#!/bin/csh' in the first line of the script to detect which command interpreter to start to interpret the script. This mechanism is used by xxQS_NAMExx when starting jobs if unix_behavior is defined as shell_start_mode.
- posix_compliant - POSIX does not consider first-line comments such as `#!/bin/csh' significant. The POSIX standard for batch queueing systems (P1003.2d) therefore requires a compliant queueing system to ignore such lines and to use user-specified or configured default command interpreters instead. Thus, if shell_start_mode is set to posix_compliant, xxQS_NAMExx will use either the command interpreter indicated by the -S option of the qsub(1) command or the shell parameter of the queue to be used (see queue_conf(5) for details).
Setting the shell_start_mode parameter either to posix_compliant or unix_behavior requires you to set the umask in use for xxqs_name_sxx_execd(8) such that every user has read access to the active_jobs directory in the spool directory of the corresponding execution daemon. If you have prolog and epilog scripts configured, they also need to be readable by any user who may execute jobs.
If this violates your site's security policies you may want to set shell_start_mode to script_from_stdin. This will force xxQS_NAMExx to open the job script as well as the epilog and prolog scripts for reading into STDIN as root (if xxqs_name_sxx_execd(8) was started as root) before changing to the job owner's user account. The script is then fed into the STDIN stream of the command interpreter indicated by the -S option of the qsub(1) command or the shell parameter of the queue to be used (see queue_conf(5) for details).
Thus setting shell_start_mode to script_from_stdin also implies posix_compliant behavior. Note, however, that feeding scripts into the STDIN stream of a command interpreter may cause trouble if commands like rsh(1) are invoked inside a job script as they also process the STDIN stream of the command interpreter. These problems can usually be resolved by redirecting the STDIN channel of those commands to come from /dev/null (e.g. rsh host date < /dev/null). Note also, that any command-line options associated with the job are passed to the executing shell. The shell will only forward them to the job if they are not recognized as valid shell options.
Changes to shell_start_mode will take immediate effect. The default for shell_start_mode is posix_compliant.
login_shells
UNIX command interpreters like the Bourne shell (see sh(1)) or the C shell (see csh(1)) can be used by xxQS_NAMExx to start job scripts. The command interpreters can either be started as login shells (i.e. all system and user default resource files like .login or .profile will be executed when the command interpreter is started, and the environment for the job will be set up as if the user had just logged in) or just for command execution (i.e. only shell-specific resource files like .cshrc will be executed and a minimal default environment is set up by xxQS_NAMExx). The parameter login_shells contains a comma-separated list of the executable names of the command interpreters to be started as login shells. Shells in this list are only started as login shells if the parameter shell_start_mode (see above) is set to posix_compliant.
Changes to login_shells will take immediate effect. The default for login_shells is sh,bash,csh,tcsh,ksh.
min_uid
min_uid places a lower bound on the user IDs that may use the cluster. Users whose user ID (as returned by getpwnam(3)) is less than min_uid will not be allowed to run jobs on the cluster.
Changes to min_uid will take immediate effect. The default is 0 but, if CSP or MUNGE security is not in use, the installation script sets it to 100 to prevent unauthorized access by root or system accounts.
min_gid
This parameter sets a lower bound on the group IDs that may use the cluster. Users whose default group ID (as returned by getpwnam(3)) is less than min_gid will not be allowed to run jobs on the cluster.
Changes to min_gid will take immediate effect. The default is 0 but, if CSP security is not in use, the installation script sets it to 100 to prevent unauthorized access by root or system accounts.
user_lists
The user_lists parameter contains a comma-separated list of user access lists as described in access_list(5). Each user contained in at least one of the access lists has access to the cluster. If the user_lists parameter is set to NONE (the default) any user has access who is not explicitly excluded via the xuser_lists parameter described below. If a user is contained both in an access list in xuser_lists and in user_lists, the user is denied access to the cluster.
Changes to user_lists will take immediate effect.
This value is a global configuration parameter insofar as it restricts access to the whole cluster, but the execution host local configuration may define a value to restrict access to that host further.
xuser_lists
The xuser_lists parameter contains a comma-separated list of user access lists as described in access_list(5). Each user contained in at least one of the access lists is denied access to the cluster. If the xuser_lists parameter is set to NONE (the default) any user has access. If a user is contained both in an access list in xuser_lists and in user_lists (see above) the user is denied access to the cluster.
Changes to xuser_lists will take immediate effect.
This value is a global configuration parameter insofar as it restricts access to the whole cluster, but the execution host local configuration may define a value to restrict access to that host further.
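A global configuration restricting cluster access might combine both parameters; the access list names below are examples and must exist as access lists (see access_list(5)):

```
# Members of deptA or deptB may use the cluster, unless they also
# appear in the "blocked" access list.
user_lists   deptA,deptB
xuser_lists  blocked
```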
administrator_mail
administrator_mail specifies a comma-separated list of the electronic mail address(es) of the cluster administrator(s) to whom internally generated problem reports are sent. The mail address format depends on your electronic mail system and how it is configured; consult your system's configuration guide for more information.
Changing administrator_mail takes immediate effect. The default for administrator_mail is an empty mail list.
projects
The projects list contains all projects which are granted access to xxQS_NAMExx. Users not belonging to one of these projects cannot submit jobs. If users belong to projects in both the projects list and the xprojects list (see below), they also cannot submit jobs.
Changing projects takes immediate effect. The default for projects is none.
xprojects
The xprojects list contains all projects that are denied access to xxQS_NAMExx. Users belonging to one of these projects cannot use xxQS_NAMExx. If users belong to projects in both the projects list (see above) and the xprojects list, they also cannot use the system.
Changing xprojects takes immediate effect. The default for xprojects is none.
load_report_time
System load is reported periodically by the execution daemons to xxqs_name_sxx_qmaster(8). The parameter load_report_time defines the time interval between load reports.
Each xxqs_name_sxx_execd(8) may use a different load report time. Changing load_report_time will take immediate effect.
Note: Be careful when modifying load_report_time. Reporting load too frequently might block xxqs_name_sxx_qmaster(8), especially if the number of execution hosts is large. Moreover, since the system load typically increases and decreases smoothly, frequent load reports hardly offer any benefit.
The default for load_report_time is 40 seconds.
reschedule_unknown
Determines whether jobs on hosts in an unknown state are rescheduled, and thus sent to other hosts. Hosts are registered as unknown if xxqs_name_sxx_qmaster(8) cannot establish contact with the xxqs_name_sxx_execd(8) on those hosts (see max_unheard). Likely reasons are a breakdown of the host or a breakdown of the network connection in between, but xxqs_name_sxx_execd(8) may also simply not be running on such hosts.
In any case, xxQS_NAMExx can reschedule jobs running on such hosts to another system. reschedule_unknown controls the time which xxQS_NAMExx will wait before jobs are rescheduled after a host became unknown. The time format specification is hh:mm:ss. If the special value 00:00:00 is set, then jobs will not be rescheduled from this host.
Rescheduling is only initiated for jobs which have activated the rerun flag (see the -r y option of qsub(1) and the rerun option of queue_conf(5)). Parallel jobs are only rescheduled if the host on which their master task executes is in an unknown state. The behavior of reschedule_unknown for parallel jobs and for jobs without the rerun flag set can be adjusted using the qmaster_params settings ENABLE_RESCHEDULE_KILL and ENABLE_RESCHEDULE_SLAVE.
Checkpointing jobs will only be rescheduled when the when option of the corresponding checkpointing environment contains an appropriate flag (see checkpoint(5)). Interactive jobs (see qsh(1), qrsh(1), qlogin(1)) are not rescheduled.
The default for reschedule_unknown is 00:00:00.
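For instance, to reschedule rerunnable jobs 15 minutes after their host enters the unknown state, and to extend this behavior to the slave tasks of parallel jobs, the configuration could contain:

```
reschedule_unknown  00:15:00
qmaster_params      ENABLE_RESCHEDULE_SLAVE=1
```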
max_unheard
If xxqs_name_sxx_qmaster(8) could not contact, or was not contacted by, the execution daemon of a host for max_unheard seconds, all queues residing on that particular host are set to status unknown. xxqs_name_sxx_qmaster(8) should, at least, be contacted by the execution daemons in order to get the load reports. Thus, max_unheard should be greater than load_report_time (see above).
Changing max_unheard takes immediate effect. The default for max_unheard is 5 minutes.
loglevel
This parameter specifies the level of detail that xxQS_NAMExx components such as xxqs_name_sxx_qmaster(8) or xxqs_name_sxx_execd(8) use to produce informative, warning or error messages, which are logged to the messages files in the master and execution daemon spool directories (see the description of the execd_spool_dir parameter above). The following message levels are available:
- log_error - All error events recognized are logged.
- log_warning - All error events recognized, and all detected signs of potentially erroneous behavior, are logged.
- log_info - All error events recognized, all detected signs of potentially erroneous behavior, and a variety of informative messages are logged.
Changing loglevel will take immediate effect.
The default for loglevel is log_warning.
max_aj_instances
This parameter defines the maximum number of array tasks to be scheduled to run simultaneously per array job. An instance of an array task will be created within the master daemon when it gets a start order from the scheduler. The instance will be destroyed when the array task finishes. Thus the parameter provides control mainly over the memory consumption of array jobs in the master daemon. It is most useful for very large clusters and very large array jobs. The default for this parameter is 2000. The value 0 will deactivate this limit and allow the scheduler to start as many array job tasks as suitable resources are available in the cluster.
Changing max_aj_instances will take immediate effect.
max_aj_tasks
This parameter defines the maximum number of array job tasks within an array job. xxqs_name_sxx_qmaster(8) will reject all array job submissions which request more than max_aj_tasks array job tasks. The default for this parameter is 75000. The value 0 will deactivate this limit.
Changing max_aj_tasks will take immediate effect.
max_u_jobs
The number of active (not finished) jobs which each xxQS_NAMExx user can have in the system simultaneously is controlled by this parameter. A value greater than 0 defines the limit. The default value 0 means "unlimited". If the max_u_jobs limit is exceeded by a job submission then the submission command exits with exit status 25 and an appropriate error message.
Changing max_u_jobs will take immediate effect.
max_jobs
The number of active (not finished) jobs simultaneously allowed in xxQS_NAMExx is controlled by this parameter. A value greater than 0 defines the limit. The default value 0 means "unlimited". If the max_jobs limit is exceeded by a job submission then the submission command exits with exit status 25 and an appropriate error message.
Changing max_jobs will take immediate effect.
max_advance_reservations
The number of active (not finished) Advance Reservations simultaneously allowed in xxQS_NAMExx is controlled by this parameter. A value greater than 0 defines the limit. The default value 0 means "unlimited". If the max_advance_reservations limit is exceeded by an Advance Reservation request then the submission command exits with exit status 25 and an appropriate error message.
Changing max_advance_reservations will take immediate effect.
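Because all three limits surface to the user as exit status 25 of the submission command, a submission wrapper can distinguish "limit reached, try again later" from hard failures. A sketch, in which fake_qsub merely stands in for the real submit command so the retry logic can be demonstrated:

```shell
# Retry-on-limit sketch: xxQS_NAMExx submit commands exit with status 25
# when max_u_jobs / max_jobs / max_advance_reservations rejects a request.
# submit_with_retry re-runs the given command while it keeps returning 25.
submit_with_retry() {
    tries=0
    while [ "$tries" -lt 5 ]; do
        "$@" && return 0
        [ $? -eq 25 ] || return 1     # some other failure: give up
        tries=$((tries + 1))
        sleep 0                       # real back-off would go here
    done
    return 25
}

# Stand-in for qsub (hypothetical): fails twice with status 25, then succeeds.
COUNT_FILE=$(mktemp); echo 0 > "$COUNT_FILE"
fake_qsub() {
    n=$(cat "$COUNT_FILE"); echo $((n + 1)) > "$COUNT_FILE"
    [ "$n" -ge 2 ] || return 25
    echo "Your job 4712 has been submitted"
}

out=$(submit_with_retry fake_qsub)
echo "$out"
```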
enforce_project
If set to true, users are required to request a project whenever submitting a job. See the -P option of qsub(1) for details.
Changing enforce_project will take immediate effect. The default for enforce_project is false.
enforce_user
If set to true, a user object (see user(5)) must exist to allow for job submission. Jobs are rejected if no corresponding user exists.
If set to auto, a user object for the submitting user will automatically be created during job submission, if one does not already exist. The auto_user_oticket, auto_user_fshare, auto_user_default_project, and auto_user_delete_time configuration parameters will be used as default attributes of the new user object.
Changing enforce_user will take immediate effect. The default for enforce_user is auto.
auto_user_oticket
The number of override tickets to assign to automatically created user objects. User objects are created automatically if the enforce_user attribute is set to auto.
Changing auto_user_oticket will affect any newly created user objects, but will not change user objects created in the past.
auto_user_fshare
The number of functional shares to assign to automatically created user objects. User objects are created automatically if the enforce_user attribute is set to auto.
Changing auto_user_fshare will affect any newly created user objects, but will not change user objects created in the past.
auto_user_default_project
The default project to assign to automatically created user objects. User objects are created automatically if the enforce_user attribute is set to auto.
Changing auto_user_default_project will affect any newly created user objects, but will not change user objects created in the past.
auto_user_delete_time
The number of seconds of inactivity after which automatically created user objects will be deleted. User objects are created automatically if the enforce_user attribute is set to auto. If the user has no active or pending jobs for the specified amount of time, the object will automatically be deleted. A value of 0 indicates that the automatically created user object is permanent and should not be automatically deleted.
Changing auto_user_delete_time will affect the deletion time for all users with active jobs.
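A configuration using automatic user creation might look like this; the ticket, share, project name, and lifetime values are examples only:

```
# Create user objects on first submission; delete them after one day
# (86400 seconds) without active or pending jobs.
enforce_user              auto
auto_user_oticket         0
auto_user_fshare          100
auto_user_default_project default
auto_user_delete_time     86400
```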
set_token_cmd
Note: If the qmaster spool area is world-readable by non-admin users, you must take steps to encrypt the credentials, since they are stored there after job submission.
set_token_cmd points to a command which sets and extends AFS tokens for xxQS_NAMExx jobs. It is invoked with two command line parameters:
set_token_cmd user token_extend_after_seconds
It uses the two auxiliary programs SetToken and forge, which are provided by your distributor as source code. The script looks as follows:

--------------------------------
#!/bin/sh
# set_token_cmd
forge -u $1 -t $2 | SetToken
--------------------------------
Since it is necessary for forge to read the secret AFS server key, a site might wish to replace the set_token_cmd script with a command that connects to a custom daemon on the AFS server. The token must be forged at the AFS server and returned to the local machine, where SetToken is executed.
Changing set_token_cmd will take immediate effect. The default for set_token_cmd is none.
pag_cmd
The path to your pagsh is specified via this parameter. The shepherd process and the job are started in a pagsh. Please ask your AFS administrator for details.
Changing pag_cmd will take immediate effect. The default for pag_cmd is none.
token_extend_time
The token_extend_time is the time period for which AFS tokens are periodically extended. xxQS_NAMExx will call the token extension 30 minutes before the tokens expire, until the jobs have finished and the corresponding tokens are no longer required.
Changing token_extend_time will take immediate effect. The default for token_extend_time is 24:0:0, i.e. 24 hours.
shepherd_cmd
Alternative path to the shepherd_cmd binary. Typically used to call the shepherd binary via a wrapper script or command. If used in production, the wrapper must take care to handle signals the way the shepherd would; otherwise, for instance, jobs will not be killed correctly.
Changing shepherd_cmd will take immediate effect. The default for shepherd_cmd is none.
gid_range
The gid_range is a comma-separated list of range expressions of the form m-n, where m and n are integer numbers greater than 99, and m alone is an abbreviation for m-m. These numbers are used by xxqs_name_sxx_execd(8) to identify processes belonging to the same job.
Each xxqs_name_sxx_execd(8) may use a separate set of group ids for this purpose. All numbers in the group id range have to be unused supplementary group ids on the system where the xxqs_name_sxx_execd(8) is started.
Changing gid_range will take immediate effect. There is no default for gid_range. The administrator will have to assign a value for gid_range during installation of xxQS_NAMExx.
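Since every id in the range must be an unused supplementary group id on each execution host, it can be worth probing a candidate range before configuring it. A sketch (the range 20000-20010 is an arbitrary example):

```shell
# Sketch: check that a candidate gid_range is free on this host before
# configuring it. getent(1) exits non-zero for group ids with no entry,
# so any hit below marks a gid that must not be handed to xxQS_NAMExx.
check_gid_range() {
    lo=$1; hi=$2; used=0; gid=$lo
    while [ "$gid" -le "$hi" ]; do
        if getent group "$gid" >/dev/null 2>&1; then
            echo "gid $gid is already in use"
            used=$((used + 1))
        fi
        gid=$((gid + 1))
    done
    echo "checked $((hi - lo + 1)) gids, $used in use"
}

# Probe a small illustrative range.
out=$(check_gid_range 20000 20010)
echo "$out"
```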
qmaster_params
A list of additional parameters can be passed to the xxQS_NAMExx qmaster. The following values are recognized:
- ENABLE_ENFORCE_MASTER_LIMIT - If this parameter is set, the s_rt and h_rt limits of a running job are tested and enforced by the master daemon when the execution daemon where the job is running is in an unknown state.
After the s_rt or h_rt limit of a job has expired, the master daemon will wait an additional time defined by DURATION_OFFSET (see sched_conf(5)). If the execution daemon still cannot be contacted when this additional time has elapsed, the master daemon will force the deletion of the job (see the -f option of qdel(1)).
For jobs which will be deleted that way, an accounting record will be created. For usage, the record will contain the last reported online value when the execution daemon could contact qmaster. The failed state in the record will be set to 37 to indicate that the job was terminated by a limit enforced by the master daemon.
After a restart of xxqs_name_sxx_qmaster(8), limit enforcement will only be triggered after twice the biggest load_report_time interval defined in the cluster configuration has elapsed. This gives the execution daemons enough time to re-register with the master daemon.
- ENABLE_FORCED_QDEL_IF_UNKNOWN - If this parameter is set, a deletion request for a job is automatically interpreted as a forced deletion request (see the -f option of qdel(1)) if the host where the job is running is in an unknown state.
- ENABLE_FORCED_QDEL - If this parameter is set, non-administrative users can force deletion of their own jobs via the -f option of qdel(1). Without this parameter, forced deletion of jobs is only allowed by the xxQS_NAMExx manager or operator.
Note: Forced deletion of jobs is executed differently depending on whether the user is a xxQS_NAMExx administrator or not. For administrative users, the jobs are removed from the internal database of xxQS_NAMExx immediately. For regular users, the equivalent of a normal qdel(1) is executed first, and deletion is forced only if the normal cancellation was unsuccessful.
- FORBID_RESCHEDULE - If this parameter is set, re-queueing of jobs cannot be initiated by the job script, which is under the control of the user. Without this parameter, jobs exiting with the value 99 are rescheduled. This can be used to cause the job to be restarted on a different machine, for instance if there are not enough resources on the current one.
- FORBID_APPERROR - If this parameter is set, the application cannot set itself to the error state. Without this parameter, jobs exiting with the value 100 are set to the error state (and therefore can be manually rescheduled by clearing the error state). This can be used to set the job to the error state when a starting condition of the application is not fulfilled before the application itself has been started, or when a clean-up procedure (e.g. in the epilog) decides that it is necessary to run the job again. To do so, return 100 in the prolog, pe_start, job script, pe_stop or epilog script.
- DISABLE_AUTO_RESCHEDULING - Deprecated; may be removed in a future release. If set to "true" or "1", the reschedule_unknown parameter is not taken into account.
- ENABLE_RESCHEDULE_KILL - If set to "true" or "1", the reschedule_unknown parameter also affects jobs which do not have the rerun flag activated (see the -r y option of qsub(1) and the rerun option of queue_conf(5)); however, such jobs are just finished, as they cannot be rescheduled.
- ENABLE_RESCHEDULE_SLAVE - If set to "true" or "1", xxQS_NAMExx also triggers job rescheduling when the host where the slave tasks of a parallel job execute is in an unknown state, if the reschedule_unknown parameter is activated.
- MAX_DYN_EC - Sets the maximum number of dynamic event clients (as used by qsub -sync y and by xxQS_NAMExx DRMAA API library sessions). The default is 1000. The number of dynamic event clients should not be more than half the number of file descriptors the system provides. The file descriptors are shared among the connections to all exec hosts, all event clients, and the file handles that the qmaster needs.
- MONITOR_TIME - Specifies the time interval at which monitoring information should be printed. Monitoring is disabled by default and can be enabled by specifying an interval. Monitoring is per thread and is written to the messages file or displayed by qping(1) with the option -f. Example: MONITOR_TIME=0:0:10 generates and prints the monitoring information approximately every 10 seconds. The specified time is a guideline only, not a fixed interval; the interval that is actually used is printed. In this example, the interval could be anything between 9 and 20 seconds.
- By default, monitoring information is logged into the messages files. If monitoring is always enabled, the messages files can become quite large. This switch disables logging into the messages files, making qping -f the only source of monitoring data.
Profiling provides system measurements which can be useful for debugging or optimizing the system. The profiling output is written to the messages file.
- Enables profiling for the qmaster signal thread (e.g. PROF_SIGNAL=true).
- Enables profiling for the qmaster worker threads (e.g. PROF_WORKER=true).
- Enables profiling for the qmaster listener threads (e.g. PROF_LISTENER=true).
- Enables profiling for the qmaster event deliver thread (e.g. PROF_DELIVER=true).
- Enables the profiling for the qmaster timed event thread (e.g. PROF_TEVENT=true).
- Enables profiling for the qmaster scheduler thread (e.g. PROF_SCHEDULER=true).
Please note that the CPU utime and stime values contained in the profiling output are not per-thread CPU times but per-process statistics. The printed CPU profiling values therefore mean "CPU time consumed by sge_qmaster (all threads) while the reported profiling level was active".
- Sets the time interval for spooling the sharetree usage. The default is 00:04:00. The setting accepts a colon-separated time string (HH:MM:SS) or a number of seconds. There is no setting to turn sharetree spooling off. (e.g. STREE_SPOOL_INTERVAL=00:02:00)
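Such a time value can be converted to seconds as in the following sketch. This is an illustration of the colon-separated format only, not xxQS_NAMExx code, and the function name is invented:

```python
def parse_time_specifier(spec):
    """Convert "HH:MM:SS" (or a plain number of seconds) to seconds."""
    parts = spec.split(":")
    if len(parts) > 3:
        raise ValueError("too many fields: " + spec)
    seconds = 0
    for part in parts:
        # each colon-separated field shifts the accumulated value by base 60
        seconds = seconds * 60 + int(part)
    return seconds

# STREE_SPOOL_INTERVAL=00:02:00 corresponds to 120 seconds.
```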
- Sets how long the qmaster will spend deleting jobs. After this time, the qmaster will continue with other tasks and schedule the deletion of the remaining jobs at a later time. The default value is 3 seconds and is used if no value is entered. Valid values are greater than 0 and at most 5 seconds. (e.g. MAX_JOB_DELETION_TIME=1)
- Sets how long the communication library will wait for GDI send/receive operations. (GDI is the Grid Engine Database Interface for interacting with objects managed by the qmaster.) The default value is 60 seconds. After this time, the communication library will retry receiving the GDI request if gdi_retries is configured; otherwise it returns a "gdi receive failure". (e.g. gdi_timeout=120 sets the timeout to 120 seconds.)
- Sets how often the GDI receive call will be repeated before a GDI receive error is reported. The default is 0, meaning the call is made once with no retry. A value of -1 retries indefinitely. In combination with the gdi_timeout parameter it is possible to configure a system with, e.g., slow NFS so that all jobs are reliably submitted. (e.g. gdi_retries=4)
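The interplay of gdi_timeout and gdi_retries can be modeled as follows. This is a simplified sketch of the documented semantics, not the actual communication-library code; the function name and the TimeoutError on exhausted retries are illustrative:

```python
def gdi_receive_with_retries(receive, gdi_timeout=60, gdi_retries=0):
    """Model the documented retry semantics.

    receive(timeout) stands in for one GDI receive attempt and
    returns a response, or None on timeout.
    gdi_retries=0 means a single attempt; -1 means retry forever.
    """
    attempt = 0
    while True:
        response = receive(gdi_timeout)
        if response is not None:
            return response
        attempt += 1
        if gdi_retries != -1 and attempt > gdi_retries:
            raise TimeoutError("gdi receive failure")

# With gdi_timeout=120 and gdi_retries=4, the worst case before
# failure is 5 attempts of 120 seconds each, i.e. 600 seconds.
```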
- Turns a communication library ping on or off. This parameter creates additional debug output showing the error messages returned by the communication library and information about the application status of the qmaster. For example, if the reason for GDI timeouts is unclear, this output may contain useful messages. The default value is false (off) (i.e. cl_ping=false).
- Setting this parameter allows the scheduler GDI event acknowledge timeout to be configured manually. The default value depends on the current scheduler configuration; with the default scheduler configuration it is 10 minutes, and only the default value is limited to between 600 and 1200 seconds. The SCHEDULER_TIMEOUT value is specified in seconds.
- This parameter sets a timeout on the response time of the server JSV. If the JSV takes longer than this to respond, it is re-started. The default timeout is 10 seconds; if modified, the value must be greater than 0. If the timeout is reached, the JSV is re-started only once; if the timeout is reached again, an error occurs.
- This parameter sets a threshold on the time it takes to perform a server job verification. If a verification takes longer than this value, a message is logged in the qmaster messages file at the INFO level. Setting this value to 0 causes all jobs to be logged in the qmaster messages file. The value is specified in milliseconds and defaults to 5000.
- Beginning with version 8.0.0 of xxQS_NAMExx the scheduling behavior changed for jobs that are rescheduled by users. Rescheduled jobs are no longer put at the beginning of the pending job list. Instead, the submit time of such a job is set to the end time of its previous run, so the job is appended to the pending job list as if a new job had been submitted. To restore the old behavior, set the parameter OLD_RESCHEDULE_BEHAVIOR. Please note that this parameter is deprecated and might be removed with the next minor release.
- Beginning with version 8.0.0 of xxQS_NAMExx the scheduling behavior changed for array job tasks that are rescheduled by users. As soon as an array job task gets rescheduled, all remaining pending tasks of that job are put at the end of the pending job list. To restore the old scheduling behavior, set the parameter OLD_RESCHEDULE_BEHAVIOR_ARRAY_JOB. Please note that this parameter is deprecated and might be removed with the next minor release.
Bypass execd communication in qmaster (e.g. for throughput tests with fake hosts). "Unknown" queue states are suppressed, but suitable load thresholds must be used to avoid queues going into an alarm state, since load values are not simulated. Submitted jobs are dispatched and act as if they run for a time determined by the job's first argument, after 3s spent in the "transferring" state; i.e. there is a simulated 10s runtime for a command such as qsub -b y sleep 10. In this mode job deletion works, but at least interactive jobs, tightly integrated parallel jobs, and job suspension do not. The execution hosts configured need not exist, but must have resolvable network names.
- Don't do authentication when GSSAPI security is enabled. This, and the following parameter, determine the GSS global configuration, which can be overridden with the execd_params of the global or host-specific configuration.
- Don't store and forward credentials if GSSAPI security is enabled.
- If GNU malloc is in use (rather than jemalloc, which is usually used on GNU/Linux), enable the facility for recording all memory allocation/deallocation. Requires MALLOC_TRACE to be set in the environment.
- Used by the test suite to block the worker thread for five seconds after handling a request to ensure another worker thread will handle a subsequent request.
- Allow monitoring statistics if xxQS_NAMExx is built to use the allocator. The information is usually obtained with the -info option, but it is generated by the daemons and cannot be controlled by the client. The default is false, since the output is verbose and might confuse programs parsing the traditional format. The parameter can also be set in execd_params and affects both qmaster and execd daemons.
Changing qmaster_params will take immediate effect, except that gdi_timeout, gdi_retries, and cl_ping will take effect only for new connections. The default for qmaster_params is none.
execd_paramsThis is used for passing additional parameters to the xxQS_NAMExx execution daemon. The following values are recognized:
If this parameter is set to true, the usage of "reserved" (allocated) resources is reported in the accounting entries cpu, mem and maxvmem instead of the measured usage; the live usage values are affected similarly. This means that the "wall clock" time (end minus start) is reported instead of CPU time, memory usage is the memory allocation times the wall clock time (which is only computable if the job requests h_vmem or s_vmem), and maxvmem is the requested h_vmem or s_vmem; the same scaling by slots is done as without this option.
Note that both the wall clock and CPU times are normally available, so this option loses information, and "reserved" here has nothing to do with advance/resource reservation. See also SHARETREE_RESERVED_USAGE below.
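The reserved-usage arithmetic described above can be sketched as follows. This is an illustrative model only, not xxQS_NAMExx code: the function name is invented, and the assumptions that "mem" is accounted in gigabyte-seconds and that maxvmem scales with the slot count should be checked against your installation.

```python
def reserved_usage(start, end, h_vmem_bytes, slots):
    """Illustrative model of "reserved" accounting values.

    start/end: job start and end times in epoch seconds
    h_vmem_bytes: requested h_vmem (or s_vmem) per slot, in bytes
    slots: number of slots granted to the job
    """
    wallclock = end - start             # reported as "cpu" instead of CPU time
    maxvmem = h_vmem_bytes * slots      # requested memory, scaled by slots
    # "mem" = memory allocation integrated over wall clock time,
    # here assumed to be expressed in GB * seconds
    mem = (maxvmem / 2**30) * wallclock
    return {"cpu": wallclock, "maxvmem": maxvmem, "mem": mem}

# A one-hour job on 4 slots requesting h_vmem=2G per slot:
usage = reserved_usage(0, 3600, 2 * 2**30, 4)
```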
- If this parameter is set to true, Windows Domain accounts (WinDomAcc) are used on Windows hosts. If this parameter is set to false, or is not set, local Windows accounts are used. On non-Windows hosts this parameter is ignored.
- If a user is assigned to NGROUPS_MAX-1 supplementary groups, so that xxQS_NAMExx is not able to add one for job tracking, then the job goes into an error state when it is started (NGROUPS_MAX is the system limit on supplementary groups). Administrators who want to prevent the system from doing so can set this parameter; in that case the NGROUPS_MAX limit is ignored and the additional group (see gid_range) is not set. As a result, no online usage is available for those jobs, and the parameter ENABLE_ADDGRP_KILL has no effect. Please note that using this parameter is not recommended; instead, the group membership of the submitting user should be reduced.
- This value should only be set for debugging purposes. If set to true, the execution daemon will not remove the spool directory maintained by for a job, or cgroup directories if cgroups are in use under Linux.
- PTF_MIN_PRIORITY, PTF_MAX_PRIORITY
The maximum/minimum priority which xxQS_NAMExx will assign to a job. Typically this is a negative/positive value in the range of -20 (maximum) to 19 (minimum) for systems which allow priorities to be set via a system call; other systems may provide different ranges.
The default priority range (which varies from system to system) is installed either by removing the parameters or by setting a value of -999.
See the "messages" file of the execution daemon for the predefined default value on your hosts. The values are logged during the startup of the execution daemon.
- Enables the profiling for the execution daemon (e.g. PROF_EXECD=true).
- This parameter allows you to change the notification signal sent before SIGKILL (see the -notify submit option). The parameter accepts either signal names (use the -l option of kill(1)) or the special value none. If set to none, no notification signal is sent. If it is set to TERM, for instance, or another signal name, then that signal is sent as the notification signal.
- With this parameter it is possible to modify the notification signal sent before SIGSTOP (see the -notify submit option). The parameter accepts either signal names (use the -l option of kill(1)) or the special value none. If set to none, no notification signal is sent. If it is set to TSTP, for instance, or another signal name, then that signal is sent as the notification signal.
If this parameter is set to true, the usage of "reserved" resources is taken for the xxQS_NAMExx share tree consumption instead of measured usage. See the description of ACCT_RESERVED_USAGE above for details.
Note: When running tightly integrated jobs with SHARETREE_RESERVED_USAGE set, and with accounting_summary enabled in the parallel environment, reserved usage will only be reported by the master task of the parallel job. No per-task usage records are sent from execd to qmaster, which can significantly reduce the load on qmaster when running large tightly integrated parallel jobs.
If this parameter is set to true, the primary group id active when a job was submitted becomes the primary group id for job execution. If the parameter is not set, the primary group id defined for the job owner in the execution host passwd database is used.
The feature is only available for jobs submitted via the interactive clients, and it only works if builtin communication is used or if the rsh and rshd components provided with xxQS_NAMExx are used.
- S_DESCRIPTORS, H_DESCRIPTORS, S_MAXPROC, H_MAXPROC, S_MEMORYLOCKED, H_MEMORYLOCKED, S_LOCKS, H_LOCKS
Specify soft and hard resource limits as implemented by the setrlimit(2) system call; see that manual page on your system for more information. These parameters complete the list of limits set by the RESOURCE LIMITS parameter of the queue configuration. Unlike the resource limits in the queue configuration, these resource limits are set for every job on this execution host. If a value is not specified, the resource limit is inherited from the execution daemon process. Because this would lead to unpredictable results if only one limit of a resource is set (soft or hard), the corresponding other limit is set to the same value.
S_DESCRIPTORS and H_DESCRIPTORS specify a value one greater than the maximum file descriptor number that can be opened by any process of a job.
S_MAXPROC and H_MAXPROC specify the maximum number of processes that can be created by the job user on this execution host.
S_MEMORYLOCKED and H_MEMORYLOCKED specify the maximum number of bytes of virtual memory that may be locked into RAM. This typically needs to be set to "unlimited" for use with openib Infiniband, and possibly similar transports.
S_LOCKS and H_LOCKS specify the maximum number of file locks any process of a job may establish.
All of these values can be specified using the multiplier letters k, K, m, M, g and G; see for details. Limits can be specified as "infinity" to remove limits (if possible), per setrlimit(2).
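The multiplier letters can be interpreted as in the following sketch. This is illustrative only, not the actual implementation; it assumes the common xxQS_NAMExx convention that lowercase letters are decimal multiples (k = 1000) and uppercase letters binary multiples (K = 1024), which should be verified against your documentation:

```python
# Assumed multiplier semantics: lowercase decimal, uppercase binary.
MULTIPLIERS = {
    "k": 10**3, "K": 2**10,
    "m": 10**6, "M": 2**20,
    "g": 10**9, "G": 2**30,
}

def parse_limit(value):
    """Parse a limit such as "4G" or "infinity" into bytes (None = no limit)."""
    if value == "infinity":
        return None  # corresponds to RLIM_INFINITY in setrlimit(2)
    if value and value[-1] in MULTIPLIERS:
        return int(value[:-1]) * MULTIPLIERS[value[-1]]
    return int(value)
```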
- This parameter indicates whether the shepherd should allow the environment inherited by the execution daemon from the shell that started it to be inherited by the job it's starting. When true, any environment variable that is set in the shell which starts the execution daemon at the time the execution daemon is started will be set in the environment of any jobs run by that execution daemon, unless the environment variable is explicitly overridden, such as PATH or LOGNAME. If set to false, each job starts with only the environment variables that are explicitly passed on by the execution daemon, such as PATH and LOGNAME. The default value is true.
- This parameter tells the execution daemon whether to add the xxQS_NAMExx shared library directory to the library path of executed jobs. If set to true, and INHERIT_ENV is also set to true, the xxQS_NAMExx shared library directory will be prepended to the library path which is inherited from the shell which started the execution daemon. If INHERIT_ENV is set to false, the library path will contain only the xxQS_NAMExx shared library directory. If set to false, and INHERIT_ENV is set to true, the library path exported to the job will be the one inherited from the shell which started the execution daemon. If INHERIT_ENV is also set to false, the library path will be empty. After the execution daemon has set the library path, it may be further altered by the shell in which the job is executed, or by the job script itself. The default value for SET_LIB_PATH is false.
- If this parameter is set then xxQS_NAMExx uses the supplementary group ids (see gid_range) to identify all processes which are to be terminated when a job is deleted, or when cleans up after job termination. This currently only works under GNU/Linux, Solaris, Tru64, FreeBSD, and Darwin. The default value is on. Irrelevant with cpuset support (see USE_CGROUPS below).
This parameter defines the interval (default 1 second) between runs of the PDC (Portable Data Collector) by the execution daemon. The PDC is responsible for enforcing the resource limits s_cpu, h_cpu, s_vmem and h_vmem, and for job usage collection. The parameter can be set to a time_specifier, to PER_LOAD_REPORT, or to NEVER. If set to PER_LOAD_REPORT, the PDC is triggered at the same interval as load_report_time (see above); if set to NEVER, the PDC is never triggered. The default is 1 second.
Note: A PDC run is quite compute intensive, and may degrade the performance of the running jobs. However, if the PDC runs less often, or never, the online usage can be incomplete or totally missing (for example online usage of very short running jobs might be missing) and the resource limit enforcement is less accurate or would not happen if PDC is turned off completely.
- If this parameter is set, xxQS_NAMExx enables the core binding module within the execution daemon to apply binding parameters specified at submission time of a job. This parameter is set by default if xxQS_NAMExx was compiled with support for core binding. More information about job-to-core binding can be found in the description of the -binding submit option.
- Allow the simulation of jobs. (Job spooling and execution on the execd side is disabled.)
- Turn off authentication for the relevant host(s) when the authentication GSSAPI security feature is enabled globally.
- Turn on authentication for the relevant host(s) when the authentication GSSAPI security feature is enabled globally.
- Turn off storing and forwarding of credentials when the GSSAPI security feature is enabled globally.
- Write messages to the system logger (see syslog(3)) rather than into the spool directory.
- Automatically start any executable named qidle present in the architecture-dependent binary directory as a load sensor, similarly to qloadsensor (which is run unconditionally). It is intended to determine whether a workstation is "idle" or not, i.e. whether it has an interactive load. See e.g. the idle time HOWTO <URL: https://gridengine.sourceforge.io/SGE/howto/idle.html > or sources/experimental/qidle in the source repository, but it may be better to check the screensaver state.
- [Linux only.] Use cgroups/cpusets for resource management if the system supports them and the necessary directories exist in the relevant filesystems (possibly created by util/resources/scripts/setup-cgroups-etc). Makes ENABLE_ADDGRP_KILL irrelevant. This option is experimental, and at least the default is likely to change in future. Default is no.
- [Linux only.] Read processes' smaps file in the proc(5) filesystem to obtain PSS usage for most accurate memory accounting, or to obtain the swap usage on older systems which don't report PSS. That can be slow when processes have very many maps (observed with an FEM code), significantly increasing the load from execd, so the default is no. Without smaps, usage is reported as RSS+swap, instead of PSS+swap, or simply as the VMsize if the swap value isn't available.
- Use setrlimit(2) to limit the amount of memory address space usable by each process in the job. Available on systems where RLIMIT_VMEM or RLIMIT_AS exists ( RLIMIT_VMEM used preferentially). When a requested memory limit is exhausted, will tend to result in job termination during memory allocation (if true) or memory initialisation (if false). Default is true.
- See qmaster_params above.
- If true, generate load sensor reports just before sending them, making the data fresher. The default is true. The switch is provided in case slow sensors are found to have a bad effect on the execd.
Changing execd_params will take effect after it is propagated to the execution daemons. The propagation is done in one load report interval. The default for execd_params is none.
reporting_paramsUsed to define the behavior of reporting modules in the xxQS_NAMExx qmaster. Changes to the reporting_params take immediate effect. The following values are recognized:
- If this parameter is set to true, the accounting file is written. The accounting file is a prerequisite for
- If this parameter is set to true, the reporting file is written. The reporting file contains data that can be used for monitoring and analysis, like job accounting, job log, host load and consumables, queue status and consumables, and sharetree configuration and usage. Attention: Depending on the size and load of the cluster, the reporting file can become quite large. Only activate the reporting file if you have a process running that will consume the reporting file! See for further information about the format and contents of the reporting file.
- The contents of the reporting file are buffered in the xxQS_NAMExx qmaster and flushed at a fixed interval. This interval can be configured with the flush_time parameter. It is specified as a time value in the format HH:MM:SS or a number of seconds. Sensible values range from a few seconds to a minute. Setting it too low may slow down the qmaster. Setting it too high will make the qmaster consume large amounts of memory for buffering data. The reporting file is opened and closed for each flush. Default 15s.
- The contents of the accounting file are buffered in the xxQS_NAMExx qmaster and flushed at a fixed interval. This interval can be configured with the accounting_flush_time parameter. It is specified as a time value in the format HH:MM:SS. Sensible values range from a few seconds to one minute. Setting it too low may slow down the qmaster. Setting it too high will make the qmaster consume large amounts of memory for buffering data. Setting it to 0 disables buffering; as soon as a record is generated, it will be written to the accounting file. If this parameter is not set, the accounting data flush interval will default to the value of the flush_time parameter. The accounting file is opened and closed for each flush.
- If this parameter is set to true, the reporting file will contain job logging information. See for more information about job logging.
- The xxQS_NAMExx qmaster can dump information about sharetree configuration and use to the reporting file. The parameter sharelog sets an interval in which sharetree information will be dumped. It is set in the format HH:MM:SS or a number of seconds. A value of 0 (default) configures qmaster not to dump sharetree information. Intervals of several minutes up to hours are sensible values for this parameter. See for further information about sharelog.
- This parameter controls writing of consumable resources to the reporting file. When log_consumables=true, information about all consumable resources (their current usage and their capacity) is written to the reporting file whenever a consumable resource changes in definition or capacity, or when the usage of any consumable resource changes. When log_consumables is set to false (the default), only those variables configured in report_variables in the exec host configuration, and whose definition or value actually changed, are written to the reporting file. This parameter is deprecated and will be removed in the next major release.
finished_jobsNote: Deprecated, may be removed in a future release.
xxQS_NAMExx stores a certain number of just-finished jobs to provide post mortem status information via qstat -s z. The finished_jobs parameter defines the number of finished ("zombie") jobs stored. If this maximum number is reached, the oldest finished job is discarded for every new job added to the finished job list. (The zombie list is not spooled, and so is lost on a qmaster restart.)
Changing finished_jobs will take immediate effect. The default for finished_jobs is 100.
rsh_commandThese three pairs of entries define a remote startup method for interactive jobs: a login-style session, an interactive session without a command, and an interactive request with a command. The last startup method is also used to start tasks on a slave exec host of a tightly integrated parallel job. Each pair for one startup method must contain matching communication methods. All entries can contain the value builtin (which is the default), or a full path to a binary which should be used, plus additional arguments to this command if necessary.
The entries for the three ..._command definitions can, in addition, contain the value NONE in case a particular startup method should be disabled.
Changing any of these entries will take immediate effect.
The global configuration entries for these values may be overwritten by an execution host local configuration.
delegated_file_stagingThis flag must be set to "true" when the prolog and epilog are ready for delegated file staging, so that the DRMAA attribute 'drmaa_transfer_files' is supported. To establish delegated file staging, use the variables beginning with "$fs_..." in prolog and epilog to move the input, output and error files from one host to the other. When this flag is set to "false", no file staging is available for the DRMAA interface. File staging is currently implemented only via the DRMAA interface. When an error occurs while moving the input, output and error files, return error code 100 so that the error handling mechanism can handle the error correctly. (See also FORBID_APPERROR.)
reprioritizeNote: Deprecated, may be removed in future release.
This flag enables or disables the reprioritization of jobs based on their ticket amount. The reprioritize_interval in takes effect only if reprioritize is set to true. To turn off job reprioritization, the reprioritize flag must be set to false and the reprioritize_interval to 0, which is the default.
jsv_urlThis setting defines a server JSV instance which will be started and triggered by the qmaster process. This JSV instance is used to verify job specifications of jobs before they are accepted and stored in the internal master database. The global configuration entry for this value cannot be overwritten by execution host local configurations.
Find more details concerning JSV in and
jsv_allowed_modIf there is a server JSV script defined with the jsv_url parameter, then all modification requests for jobs are rejected by qmaster. With the jsv_allowed_mod parameter an administrator can allow a set of switches which can then be used with clients to modify certain job attributes. The value for this parameter has to be a comma-separated list of JSV job parameter names, or the value none to indicate that no modification should be allowed. Please note that even if none is specified, the switches -w and -t are allowed for qalter.
libjvm_pathlibjvm_path is usually set during qmaster installation and points to the absolute path of libjvm.so (or the corresponding library depending on your architecture - e.g. /usr/java/jre/lib/i386/server/libjvm.so). The referenced libjvm version must be at least 1.5. It is needed by the JVM qmaster thread only. If the Java VM needs additional starting parameters they can be set in additional_jvm_args. Whether the JVM thread is started at all can be defined in the file. If libjvm_path is empty, or an incorrect path, the JVM thread fails to start.
additional_jvm_argsadditional_jvm_args is usually set during qmaster installation. Details about possible values of additional_jvm_args can be found in the help output of the accompanying Java command. This setting is normally not needed.
SECURITYIf prolog or epilog is specified with a user@ prefix, security considerations apply. The methods are run in a user-supplied environment (via -V or -v) which provides a mechanism to run arbitrary code as user (which might well be root) by setting variables such as LD_LIBRARY_PATH and LD_PRELOAD to affect the running of the dynamically linked programs, such as shells, which are used to implement the methods.
To combat this, known problematic variables are removed from the environment before starting the methods other than as the job owner, but this may not be foolproof on arbitrary systems with obscure variables. The environment can be safely controlled by running the methods under a statically-linked version of env(1), such as typically available using busybox(1), for example. Use
- /bin/busybox env -u ...
to unset sensitive variables, or
- /bin/busybox env -i name=value...
to set only specific variables. On some systems, such as recent Solaris, it is essentially impossible to build static binaries. In that case it is typically possible to use a setuid wrapper, relying on the dynamic linker to do the right thing. An example is the safe_exec wrapper which is available from <URL: https://sourceforge.net/projects/gridengine/files/SGE/support/ > at the time of writing. When using a non-shell scripting language wrapper for the method daemon, try to use options which avoid interpreter-specific environmental damage, such as Perl's -T and Python's -E. Privileged shell script wrappers should be avoided if possible, and should be written carefully if they are used - e.g. invoke programs with full file names - but if is used, it should be run with the -p option.
It is not currently possible to specify the variables to be unset, e.g. as a host-dependent execd parameter, but certain system-dependent ones are selected. The list of sensitive variables is taken mostly from GNU libc. It includes known system-dependent dynamic linker variables, sensitive locale variables, and others like TMPDIR, but does not attempt to deal with interpreter-specific variables such as PYTHONPATH. The locale specification is also sanitized. See the source file source/libs/uti2/sge_execvlp.c for details. Note that TMPDIR is one of the variables affected, and may need to be recreated (typically as /tmp/$JOB_ID.$TASK_ID.$SGE_CELL).
COPYRIGHTSee for a full statement of rights and permissions.