
Job Management Guide

Master job management in S9S with this comprehensive guide covering submission, monitoring, and advanced job operations.

Job Overview

S9S provides a powerful interface for managing SLURM jobs with features that go beyond traditional command-line tools:

  • Real-time Monitoring: Live job status updates
  • Batch Operations: Manage multiple jobs simultaneously
  • Advanced Filtering: Find jobs quickly
  • Direct Output Access: View logs without leaving S9S
  • Job Templates: Reusable job configurations
  • Dependency Management: Visual dependency tracking

Submitting Jobs

Quick Submit

Press s in Jobs view to open the submission wizard:

  1. Choose Method:

    • New job from scratch (Custom Job)
    • From template (select a pre-configured template)
  2. Configure Resources:

    Job Name: my_analysis
    Partition: compute
    Nodes: 2
    Tasks per Node: 28
    Memory: 64GB
    Time Limit: 24:00:00
  3. Set Script:

    #!/bin/bash
    
    module load python/3.9
    python analyze.py
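
Put together, the three steps above correspond to a batch script like the following (a sketch of what the wizard assembles from the form values; see Job Script Preview below for the exact generated output):

```bash
#!/bin/bash
#SBATCH --job-name=my_analysis
#SBATCH --partition=compute
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=28
#SBATCH --mem=64G
#SBATCH --time=24:00:00

module load python/3.9
python analyze.py
```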

Job Submission Workflow

[Job Submission Demo: job submission wizard with step-by-step configuration]

The submission process guides you through all necessary options with helpful defaults and validation.

Submission Wizard Fields

The wizard supports 86 sbatch fields across the full SLURM OpenAPI spec. Fields are organized into three visibility tiers so the form stays manageable while still exposing every option when needed.

Always visible -- these 7 core fields appear on every new job form:

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| name | name | --job-name | Job name |
| script | script | (script body) | Batch script content |
| partition | partition | --partition | Target partition |
| timeLimit | time_limit | --time | Wall-clock time limit (HH:MM:SS or D-HH:MM:SS) |
| nodes | nodes | --nodes | Number of nodes |
| cpus | cpus | --cpus-per-task | CPUs per task |
| memory | memory | --mem | Memory per node (e.g., 4G, 1024M) |
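
Assuming the submission payload mirrors the JSON keys above (a sketch of the shape, not necessarily the exact wire format), a minimal job using only the core fields would look like:

```json
{
  "name": "my_analysis",
  "partition": "compute",
  "time_limit": "24:00:00",
  "nodes": 2,
  "cpus": 28,
  "memory": "64G",
  "script": "#!/bin/bash\nmodule load python/3.9\npython analyze.py"
}
```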

Visible by default -- these 8 fields are shown on the form unless explicitly hidden:

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| gpus | gpus | --gres=gpu:N | Number of GPUs |
| qos | qos | --qos | Quality of service |
| account | account | --account | Charge account |
| workingDir | working_directory | --chdir | Working directory |
| outputFile | output_file | --output | Stdout file path |
| errorFile | error_file | --error | Stderr file path |
| emailNotify | email_notify | --mail-type | Enable email notifications |
| email | email | --mail-user | Notification email address |

Hidden by default -- these 71 fields become visible when a template sets a value for them, when they are removed from the hiddenFields list in config, or when a per-template hiddenFields override brings them into view.

Job arrays & dependencies:

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| arraySpec | array | --array | Array job index spec (e.g., 1-100%10) |
| dependencies | dependencies | --dependency | Job dependencies (list of job IDs, submitted as afterok:id1:id2) |
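
The dependency formatting can be sketched in a few lines of shell (the function name is illustrative, not part of S9S; it just shows how a list of job IDs becomes the afterok string described above):

```shell
# Hypothetical sketch: turn a list of job IDs into the afterok dependency
# flag that S9S submits on your behalf.
format_dependencies() {
  local dep="afterok"
  for id in "$@"; do
    dep="${dep}:${id}"
  done
  echo "--dependency=${dep}"
}

format_dependencies 12345 12346   # → --dependency=afterok:12345:12346
```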

Resource controls:

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| exclusive | exclusive | --exclusive | Exclusive node access |
| requeue | requeue | --requeue | Requeue on failure |
| gres | gres | --gres | Generic resources (e.g., gpu:a100:2) |
| constraints | constraints | --constraint | Required node features |
| ntasks | ntasks | --ntasks | Total number of tasks |
| ntasksPerNode | ntasks_per_node | --ntasks-per-node | Tasks per node |
| memoryPerCPU | memory_per_cpu | --mem-per-cpu | Memory per CPU (e.g., 4G) |
| minimumCPUs | minimum_cpus | (API only) | Minimum total CPUs |
| minimumCPUsPerNode | minimum_cpus_per_node | --mincpus | Minimum CPUs per node |
| maximumNodes | maximum_nodes | --nodes (max) | Maximum node count |
| maximumCPUs | maximum_cpus | (API only) | Maximum CPU count |
| tmpDiskPerNode | tmp_disk_per_node | --tmp | Temporary disk per node (MB) |
| overcommit | overcommit | --overcommit | Overcommit resources |
| contiguous | contiguous | --contiguous | Require contiguous nodes |

Scheduling:

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| hold | hold | --hold | Submit in held state |
| priority | priority | --priority | Job priority |
| nice | nice | --nice | Priority adjustment |
| beginTime | begin_time | --begin | Deferred start (see Begin Time Formats below) |
| deadline | deadline | --deadline | Latest acceptable start time |
| immediate | immediate | --immediate | Fail if resources not available now |
| timeMinimum | time_minimum | --time-min | Minimum time for backfill scheduling |

Placement & topology:

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| distribution | distribution | --distribution | Task distribution (block, cyclic, etc.) |
| threadsPerCore | threads_per_core | --threads-per-core | Threads per core |
| tasksPerCore | tasks_per_core | --ntasks-per-core | Tasks per core |
| tasksPerSocket | tasks_per_socket | --ntasks-per-socket | Tasks per socket |
| socketsPerNode | sockets_per_node | --sockets-per-node | Sockets per node |
| cpuBinding | cpu_binding | --cpu-bind | CPU binding method |
| cpuBindingFlags | cpu_binding_flags | --cpu-bind (flags) | CPU binding flags (VERBOSE, etc.) |
| memoryBinding | memory_binding | --mem-bind | Memory NUMA binding |
| memoryBindingType | memory_binding_type | --mem-bind (type) | Memory binding type (LOCAL, RANK) |
| requiredNodes | required_nodes | --nodelist | Required specific nodes |
| excludeNodes | exclude_nodes | --exclude | Excluded nodes |

TRES (Trackable Resources):

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| cpusPerTRES | cpus_per_tres | --cpus-per-gpu | CPUs per GPU/TRES |
| memoryPerTRES | memory_per_tres | --mem-per-gpu | Memory per GPU/TRES |
| ntasksPerTRES | ntasks_per_tres | --ntasks-per-gpu | Tasks per GPU/TRES |
| tresPerTask | tres_per_task | --tres-per-task | TRES per task |
| tresPerSocket | tres_per_socket | --tres-per-socket | TRES per socket |
| tresPerJob | tres_per_job | --tres-per-job | TRES per job |
| tresBind | tres_bind | --tres-bind | TRES binding (e.g., gres/gpu:closest) |
| tresFreq | tres_freq | --tres-freq | TRES frequency control |

Signals & notifications:

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| signal | signal | --signal | Pre-termination signal (e.g., B:USR1@300) |
| killOnNodeFail | kill_on_node_fail | --no-kill | Kill job on node failure |
| waitAllNodes | wait_all_nodes | --wait-all-nodes | Wait for all nodes to boot |

Accounting & admin:

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| reservation | reservation | --reservation | Reservation name |
| licenses | licenses | --licenses | Required licenses |
| wckey | wckey | --wckey | Workload characterization key |
| comment | comment | --comment | Job comment |
| prefer | prefer | --prefer | Preferred (not required) features |

I/O & environment:

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| standardInput | standard_input | --input | Stdin file path |
| openMode | open_mode | --open-mode | Output file mode (append or truncate) |
| container | container | --container | OCI container path |

Advanced:

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| cpuFrequency | cpu_frequency | --cpu-freq | CPU frequency (low, medium, high, or KHz) |
| network | network | --network | Network specs |
| x11 | x11 | --x11 | X11 forwarding (batch, first, last, all) |
| burstBuffer | burst_buffer | --bb | Burst buffer specification |
| batchFeatures | batch_features | --batch | Batch node features |
| coreSpecification | core_specification | --core-spec | Reserved system cores |
| threadSpecification | thread_specification | --thread-spec | Reserved system threads |
| argv | argv | (script arguments) | Script arguments (space-separated) |
| flags | flags | --spread-job, etc. | Job flags (comma-separated) |
| profile | profile | --profile | Profiling (ENERGY, NETWORK, TASK) |

Cluster federation:

| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| requiredSwitches | required_switches | --switches | Required network switch count |
| waitForSwitch | wait_for_switch | --switches (timeout) | Switch wait timeout (seconds) |
| clusterConstraint | cluster_constraint | --cluster-constraint | Federation cluster constraint |
| clusters | clusters | --clusters | Target clusters (federation) |

Begin Time Formats

The beginTime field accepts all formats supported by sbatch --begin:

| Format | Example | Description |
|---|---|---|
| Named time | now, today, tomorrow | Current time, midnight today, midnight tomorrow |
| Named hour | midnight, noon, teatime | 00:00, 12:00, 16:00 (next occurrence) |
| Relative | now+1hour, now+30minutes | Offset from current time |
| Relative (seconds) | now+3600 | Bare number = seconds |
| ISO date | 2024-12-31 | Midnight on date |
| ISO datetime | 2024-12-31T14:30 | Specific date and time |
| US date | 12/31/24, 123124 | US date format |
| Time of day | 16:00, 4:00PM | Next occurrence of time |

Named times: midnight (00:00), noon (12:00), elevenses (11:00), fika (15:00), teatime (16:00).
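
The relative formats all reduce to "now plus an offset in seconds", which you can reproduce yourself when scripting around submissions (a sketch; the helper name is made up and GNU date is assumed for the epoch arithmetic):

```shell
# Sketch: compute the absolute epoch timestamp that a relative --begin
# value such as now+1hour resolves to.
relative_begin() {
  echo $(( $(date +%s) + $1 ))
}

begin=$(relative_begin 3600)   # the equivalent of --begin=now+1hour
```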

Template-Based Submission

To submit a job from a template, press s to open the submission wizard, then select "From template" and pick a template from the list. The selected template pre-fills the form with its configured defaults and controls which fields are visible.

You can also list available templates from the shell:

s9s templates list

Config-Driven Customization

All submission defaults and field visibility can be configured under views.jobs.submission in your config file (~/.s9s/config.yaml):

views:
  jobs:
    submission:
      # Global defaults applied to every new job
      formDefaults:
        partition: "compute"
        timeLimit: "04:00:00"
        nodes: 1
        cpus: 4
        memory: "8G"
        workingDir: "/scratch/%u"  # %u = username (SLURM substitution)
        outputFile: "slurm_%j.out"
        errorFile: "slurm_%j.err"

      # Fields to hide globally from the form
      hiddenFields:
        - arraySpec
        - exclusive
        - requeue

      # Restrict dropdown values (filters cluster-fetched values)
      fieldOptions:
        partition: ["compute", "gpu", "highmem"]
        qos: ["normal", "high"]
        account: ["research-a", "research-b"]

      # Control which template sources are loaded (default: all three)
      # Options: "builtin", "config", "saved"
      templateSources: ["builtin", "config", "saved"]

      # Define custom config templates (see Job Templates section below)
      templates:
        - name: "GPU Training Job"
          description: "PyTorch training on GPU partition"
          defaults:
            partition: "gpu"
            timeLimit: "24:00:00"
            cpus: 8
            memory: "32G"
            gpus: 2
            script: |
              #!/bin/bash
              module load cuda pytorch
              python train.py
          hiddenFields: ["arraySpec"]

Monitoring Jobs

Job States

S9S color-codes job states for quick identification:

| State | Color | Description |
|---|---|---|
| PENDING | Yellow | Waiting for resources |
| RUNNING | Green | Currently executing |
| COMPLETED | Cyan | Finished successfully |
| FAILED | Red | Exited with error |
| CANCELLED | Gray | Cancelled by user/admin (SLURM uses British spelling) |
| TIMEOUT | White | Exceeded time limit |
| SUSPENDED | Orange | Temporarily suspended |

Job Details

Press Enter on any job to view details:

  • Summary: ID, name, user, submission time
  • Resources: Nodes, CPUs, memory, GPUs
  • Timing: Start, elapsed, remaining time
  • Performance: CPU/memory efficiency
  • Output: Stdout/stderr file paths
  • Dependencies: Parent/child jobs

Note that d opens the job's dependencies, not the details view; use Enter for details.

Live Output Monitoring

View job output in real-time:

  1. Select job and press o
  2. Choose output type:
    • Standard output (stdout)
    • Standard error (stderr)
  3. Options:
    • f - Follow/tail output
    • s - Switch between stdout/stderr
    • Esc - Exit viewer

For more details, see the Jobs View Guide.

Job Operations

Single Job Actions

| Key | Action | Description |
|---|---|---|
| c/C | Cancel | Cancel job (with confirmation) |
| H | Hold | Prevent job from starting |
| r | Release | Release held job |
| R | Refresh | Refresh the jobs list |
| :requeue JOBID | Requeue | Resubmit failed job (use command mode) |
| d/D | Dependencies | View job dependencies |
| p/P | Toggle Pending | Toggle pending state filter |
| e/E | Export | Open export dialog |
| m/M | Auto-refresh | Toggle auto-refresh |

Batch Operations

Select multiple jobs with Space, then press b:

  1. Selection Methods:

    • Manual: Space on each job
    • Toggle multi-select: v or V
    • By filter: /PENDING then select visible jobs
  2. Batch Actions:

    • Cancel selected
    • Hold/Release selected
    • Requeue selected
    • Delete selected
    • Set Priority
    • Export output

Advanced Operations

Job Arrays

Array jobs are created by setting the arraySpec field in the submission wizard (for example, 1-100%10). Once submitted, individual array tasks appear in the jobs list and can be managed with standard job operations (cancel, hold, release) using the task's full job ID.

Dependencies

Job dependencies are set in the submission wizard via the dependencies field. Enter a comma-separated list of job IDs; S9S submits them as afterok:id1:id2 automatically. Dependency information is displayed in the job details view.

See #115 for planned command-mode enhancements to array and dependency management.

Advanced Filtering

Filter Syntax

S9S supports two filtering modes:

Quick Filter (/) -- plain text search across all visible columns. Type /gpu to find items containing "gpu". The only special prefix is p: for partition filtering.

Global Search (Ctrl+F) -- opens cross-resource search (available in all data views). The advanced filter bar supports field-specific field=value syntax with operators:

# Advanced filter examples
state=RUNNING                           # Running jobs
user=alice                              # Alice's jobs
state=PENDING user=bob                  # Bob's pending jobs (AND logic)
name~analysis                           # Jobs containing "analysis"
name=~"analysis.*2023"                  # Regex match
memory>4G cpus>=8                       # Resource comparisons
state!=FAILED                           # Not failed
state in (RUNNING,PENDING)              # In list
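
The operator syntax above can be illustrated with a small token parser (a hypothetical sketch, not S9S internals; it checks operators in the order !=, >=, =~, ~, =, and covers only those five for brevity):

```shell
# Illustrative parser for one field/operator/value token from the
# advanced filter bar. Plain text falls through as a quick-filter term.
parse_token() {
  case "$1" in
    *"!="*)  echo "field=${1%%!=*} op=ne value=${1#*!=}" ;;
    *">="*)  echo "field=${1%%>=*} op=ge value=${1#*>=}" ;;
    *"=~"*)  echo "field=${1%%=~*} op=regex value=${1#*=~}" ;;
    *"~"*)   echo "field=${1%%~*} op=contains value=${1#*~}" ;;
    *"="*)   echo "field=${1%%=*} op=eq value=${1#*=}" ;;
    *)       echo "text=$1" ;;
  esac
}

parse_token "state=RUNNING"    # → field=state op=eq value=RUNNING
parse_token "name~analysis"    # → field=name op=contains value=analysis
parse_token "state!=FAILED"    # → field=state op=ne value=FAILED
```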

Saved Filters

Saved filters are not yet implemented. Use the / quick filter for on-the-fly filtering. See #115 for planned saved-filter support.

Job Performance

Efficiency Metrics

S9S calculates job efficiency:

  • CPU Efficiency: Actual vs allocated CPU usage
  • Memory Efficiency: Peak vs allocated memory
  • GPU Utilization: GPU usage percentage
  • I/O Performance: Read/write statistics

View metrics by selecting a job and pressing Enter to see the job details view, which includes efficiency information when available.

Job Templates

Template System Overview

S9S uses a three-tier merge system to assemble the list of available templates. When two or more sources define a template with the same name, the higher-priority source wins.

| Priority | Source | Location |
|---|---|---|
| 1 (highest) | User-saved templates | ~/.s9s/templates/*.json |
| 2 | Config YAML templates | views.jobs.submission.templates in config |
| 3 (lowest) | Built-in templates | Hardcoded in S9S |

Built-in Templates

S9S ships with 8 built-in templates covering common job patterns:

| Template | Description |
|---|---|
| Basic Batch Job | Simple single-node batch job |
| MPI Parallel Job | Parallel job using MPI across multiple nodes |
| GPU Job | Job requiring GPU resources |
| Array Job | Array job for processing multiple similar tasks |
| Interactive Job | Interactive session for development and testing |
| Long-Running Job | Extended wall-time job |
| High Memory Job | Job requesting large memory allocation |
| Development/Debug Job | Short debug session with verbose output |

Config YAML Templates

Define custom templates in your config file under views.jobs.submission.templates. Each template can set default values for any form field and optionally hide irrelevant fields:

views:
  jobs:
    submission:
      templates:
        - name: "GPU Training Job"
          description: "PyTorch training on GPU partition"
          defaults:
            partition: "gpu"
            timeLimit: "24:00:00"
            cpus: 8
            memory: "32G"
            gpus: 2
            script: |
              #!/bin/bash
              module load cuda pytorch
              python train.py
          hiddenFields: ["arraySpec"]

        - name: "Genomics Pipeline"
          description: "High-memory genomics analysis"
          defaults:
            partition: "highmem"
            timeLimit: "48:00:00"
            memory: "256G"
            cpus: 32
          hiddenFields: ["gpus", "arraySpec"]

User-Saved Templates

User-saved templates are stored as individual JSON files in ~/.s9s/templates/ and have the highest priority in the merge order.

Saving from the Wizard

After configuring a job in the submission wizard, use the "Save as Template" flow to save the current form state as a new template in ~/.s9s/templates/.

Template JSON Format

Each saved template is a JSON file with the following structure:

{
  "name": "My Custom Template",
  "description": "Description of this template",
  "job_submission": {
    "name": "my_job",
    "partition": "compute",
    "time_limit": "04:00:00",
    "nodes": 2,
    "cpus": 8,
    "memory": "16G",
    "working_directory": "/scratch/%u",
    "output_file": "job_%j.out",
    "error_file": "job_%j.err",
    "script": "#!/bin/bash\nmodule load python\npython run.py"
  }
}

Naming conventions: Saved JSON templates use snake_case field names (matching Go struct JSON tags): time_limit, output_file, error_file, working_directory. Config YAML templates use camelCase: timeLimit, outputFile, errorFile, workingDir. Using the wrong convention will silently ignore the field. Script arguments (argv) are split on whitespace in both formats — quoted arguments with spaces are not supported.
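
For most keys the translation between the two conventions is mechanical, which a one-liner can sketch (a hypothetical helper, GNU sed assumed; note that the mapping is not purely mechanical everywhere: workingDir maps to working_directory, not working_dir):

```shell
# Hypothetical helper: convert a config-YAML camelCase key to the
# snake_case form expected in saved JSON templates.
camel_to_snake() {
  printf '%s\n' "$1" | sed -E 's/([A-Z])/_\l\1/g'
}

camel_to_snake timeLimit    # → time_limit
camel_to_snake outputFile   # → output_file
```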

Job Script Preview

Before submitting, press Preview to see the complete sbatch script that will be generated from your form values. The preview shows all #SBATCH directives derived from the form fields followed by your script body.

| Key | Action |
|---|---|
| ESC | Close preview |
| Ctrl+Y | Copy script to clipboard (via OSC 52) |

The clipboard copy produces a clean plain-text script with no color formatting, ready to paste into a file or terminal. OSC 52 clipboard support works in most modern terminals (iTerm2, kitty, tmux, Windows Terminal, Alacritty).
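
The OSC 52 mechanism itself is simple: the text is base64-encoded and wrapped in a terminal escape sequence (the sequence shown is the standard one; that S9S emits it exactly this way is an assumption):

```shell
# Sketch of an OSC 52 clipboard write: ESC ] 52 ; c ; <base64 payload> BEL.
# The terminal, not the shell, performs the actual clipboard update.
osc52_copy() {
  printf '\033]52;c;%s\a' "$(printf '%s' "$1" | base64 | tr -d '\n')"
}
```

Piping the output through `cat -v` shows the raw escape sequence, which is handy when checking whether a terminal or tmux passthrough is mangling it.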

CLI Commands

Manage templates from the command line:

# List all templates from all sources with source indicator (builtin/config/saved)
s9s templates list

# Export all built-in and config templates to ~/.s9s/templates/ as editable JSON
s9s templates export

# Export a single template by name
s9s templates export "GPU Job"

# Overwrite existing files during export
s9s templates export --force

# Export to a custom directory
s9s templates export --dir /path/to/templates

Template Workflow

A typical workflow for customizing templates:

  1. Export the built-ins to get editable copies:

    s9s templates export

    This writes all 8 built-in templates (plus any config templates) to ~/.s9s/templates/ as JSON files. Existing files are skipped unless --force is used.

  2. Edit the JSON files in ~/.s9s/templates/ to match your environment (change partitions, modules, default resources, etc.). The exported files use the same format that JobTemplateManager loads — changes take effect on the next wizard open without restarting s9s.

  3. Verify your templates are loaded:

    s9s templates list

    Edited templates show as saved source and override any built-in with the same name.

  4. Optionally restrict sources so only your edited templates appear in the wizard:

    views:
      jobs:
        submission:
          templateSources: ["saved"]
  5. Use templates in the wizard — press s to open the submission wizard, then pick from the template selector. Your custom templates appear with the values you set.

Controlling Template Sources

By default all three sources are loaded. Use the templateSources config option to control which sources appear:

views:
  jobs:
    submission:
      # Show only user-saved and config templates (hide built-ins)
      templateSources: ["config", "saved"]

Valid values: "builtin", "config", "saved".

Job Workflows

Job Chains

To create a chain of dependent jobs, submit each job with its dependencies set in the submission wizard's dependencies field. For example, submit a preprocessing job first, then submit an analysis job with the preprocessing job's ID in the dependencies field, and so on. S9S automatically formats these as afterok dependencies.

Job chains and recurring job scheduling via command mode are not yet available. See #115 for planned workflow enhancements.
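
For reference, the raw sbatch equivalent of a two-step chain looks like this (a sketch with sbatch stubbed out so it runs anywhere; on a real cluster, drop the stub and use the actual command):

```shell
# Two-step chain with raw sbatch; the stub stands in for a real cluster
# and always reports job ID 1001.
sbatch() { echo "1001"; }

pre_id=$(sbatch --parsable preprocess.sh)   # --parsable prints only the job ID
echo "analysis depends on job ${pre_id}"
sbatch --parsable --dependency=afterok:${pre_id} analyze.sh >/dev/null
```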

Job Reporting

Export Job Data

Export the current view by pressing e, which writes the visible data to a file (availability depends on the view). Command-mode export and report generation are not yet available. See #115 for planned reporting enhancements.

Tips & Best Practices

Efficiency Tips

  1. Use templates for repetitive jobs
  2. Set up filters for your common queries
  3. Monitor efficiency to optimize resource requests
  4. Use batch operations for multiple similar jobs
  5. Enable notifications for long-running jobs

Common Workflows

Debug Failed Jobs

# Use p/P to toggle pending filter, or / for text search
/FAILED                # Quick filter for failed jobs
Enter                  # View job details
o                      # Check output/errors
:requeue JOBID         # Requeue if needed (command mode)

Monitor GPU Jobs

/gpu                           # Quick filter for GPU-related jobs
Enter                          # View job details

Bulk Cancel User Jobs

/username              # Filter by user text
V                      # Toggle multi-select mode
Space                  # Select individual jobs
b                      # Batch operations
c                      # Cancel selected

Troubleshooting

Common Issues

Job Stuck in PENDING

  • Check reason code in job details
  • View partition limits
  • Check dependencies
  • Verify resource availability

Low Efficiency

  • Review resource requests
  • Check for I/O bottlenecks
  • Verify correct partition
  • Consider job profiling

Output Not Found

  • Verify output paths in job script
  • Check working directory
  • Ensure write permissions
  • Look for redirected output

Next Steps