Job Management Guide
Master job management in S9S with this comprehensive guide covering submission, monitoring, and advanced job operations.
Job Overview
S9S provides a powerful interface for managing SLURM jobs with features that go beyond traditional command-line tools:
- Real-time Monitoring: Live job status updates
- Batch Operations: Manage multiple jobs simultaneously
- Advanced Filtering: Find jobs quickly
- Direct Output Access: View logs without leaving S9S
- Job Templates: Reusable job configurations
- Dependency Management: Visual dependency tracking
Submitting Jobs
Quick Submit
Press s in the Jobs view to open the submission wizard:

1. Choose Method:
   - New job from scratch (Custom Job)
   - From template (select a pre-configured template)

2. Configure Resources:

   ```
   Job Name: my_analysis
   Partition: compute
   Nodes: 2
   Tasks per Node: 28
   Memory: 64GB
   Time Limit: 24:00:00
   ```

3. Set Script:

   ```bash
   #!/bin/bash
   module load python/3.9
   python analyze.py
   ```
Job Submission Workflow

Job submission wizard with step-by-step configuration
The submission process guides you through all necessary options with helpful defaults and validation.
Submission Wizard Fields
The wizard supports 86 sbatch fields across the full SLURM OpenAPI spec. Fields are organized into three visibility tiers so the form stays manageable while still exposing every option when needed.
Always visible -- these 7 core fields appear on every new job form:
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| name | name | --job-name | Job name |
| script | script | (script body) | Batch script content |
| partition | partition | --partition | Target partition |
| timeLimit | time_limit | --time | Wall-clock time limit (HH:MM:SS or D-HH:MM:SS) |
| nodes | nodes | --nodes | Number of nodes |
| cpus | cpus | --cpus-per-task | CPUs per task |
| memory | memory | --mem | Memory per node (e.g., 4G, 1024M) |
Visible by default -- these 8 fields are shown on the form unless explicitly hidden:
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| gpus | gpus | --gres=gpu:N | Number of GPUs |
| qos | qos | --qos | Quality of service |
| account | account | --account | Charge account |
| workingDir | working_directory | --chdir | Working directory |
| outputFile | output_file | --output | Stdout file path |
| errorFile | error_file | --error | Stderr file path |
| emailNotify | email_notify | --mail-type | Enable email notifications |
|  |  | --mail-user | Notification email address |
Hidden by default -- these 71 fields become visible when a template sets a value for them, when they are removed from the hiddenFields list in config, or when a per-template hiddenFields override brings them into view.
Job arrays & dependencies:
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| arraySpec | array | --array | Array job index spec (e.g., 1-100%10) |
| dependencies | dependencies | --dependency | Job dependencies (list of job IDs, submitted as afterok:id1:id2) |
Resource controls:
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| exclusive | exclusive | --exclusive | Exclusive node access |
| requeue | requeue | --requeue | Requeue on failure |
| gres | gres | --gres | Generic resources (e.g., gpu:a100:2) |
| constraints | constraints | --constraint | Required node features |
| ntasks | ntasks | --ntasks | Total number of tasks |
| ntasksPerNode | ntasks_per_node | --ntasks-per-node | Tasks per node |
| memoryPerCPU | memory_per_cpu | --mem-per-cpu | Memory per CPU (e.g., 4G) |
| minimumCPUs | minimum_cpus | (API only) | Minimum total CPUs |
| minimumCPUsPerNode | minimum_cpus_per_node | --mincpus | Minimum CPUs per node |
| maximumNodes | maximum_nodes | --nodes (max) | Maximum node count |
| maximumCPUs | maximum_cpus | (API only) | Maximum CPU count |
| tmpDiskPerNode | tmp_disk_per_node | --tmp | Temporary disk per node (MB) |
| overcommit | overcommit | --overcommit | Overcommit resources |
| contiguous | contiguous | --contiguous | Require contiguous nodes |
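As an illustration, several of these controls map one-to-one onto sbatch directives; a GPU job might combine them like this (the values and the "ib" feature name are made up for the example):

```bash
#SBATCH --exclusive            # whole-node access
#SBATCH --gres=gpu:a100:2      # generic resources
#SBATCH --constraint=ib        # required node feature
#SBATCH --ntasks=8             # total tasks
#SBATCH --mem-per-cpu=4G       # memory per CPU
```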
Scheduling:
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| hold | hold | --hold | Submit in held state |
| priority | priority | --priority | Job priority |
| nice | nice | --nice | Priority adjustment |
| beginTime | begin_time | --begin | Deferred start (see Begin Time Formats below) |
| deadline | deadline | --deadline | Latest acceptable start time |
| immediate | immediate | --immediate | Fail if resources not available now |
| timeMinimum | time_minimum | --time-min | Minimum time for backfill scheduling |
Placement & topology:
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| distribution | distribution | --distribution | Task distribution (block, cyclic, etc.) |
| threadsPerCore | threads_per_core | --threads-per-core | Threads per core |
| tasksPerCore | tasks_per_core | --ntasks-per-core | Tasks per core |
| tasksPerSocket | tasks_per_socket | --ntasks-per-socket | Tasks per socket |
| socketsPerNode | sockets_per_node | --sockets-per-node | Sockets per node |
| cpuBinding | cpu_binding | --cpu-bind | CPU binding method |
| cpuBindingFlags | cpu_binding_flags | --cpu-bind (flags) | CPU binding flags (VERBOSE, etc.) |
| memoryBinding | memory_binding | --mem-bind | Memory NUMA binding |
| memoryBindingType | memory_binding_type | --mem-bind (type) | Memory binding type (LOCAL, RANK) |
| requiredNodes | required_nodes | --nodelist | Required specific nodes |
| excludeNodes | exclude_nodes | --exclude | Excluded nodes |
TRES (Trackable Resources):
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| cpusPerTRES | cpus_per_tres | --cpus-per-gpu | CPUs per GPU/TRES |
| memoryPerTRES | memory_per_tres | --mem-per-gpu | Memory per GPU/TRES |
| ntasksPerTRES | ntasks_per_tres | --ntasks-per-gpu | Tasks per GPU/TRES |
| tresPerTask | tres_per_task | --tres-per-task | TRES per task |
| tresPerSocket | tres_per_socket | --tres-per-socket | TRES per socket |
| tresPerJob | tres_per_job | --tres-per-job | TRES per job |
| tresBind | tres_bind | --tres-bind | TRES binding (e.g., gres/gpu:closest) |
| tresFreq | tres_freq | --tres-freq | TRES frequency control |
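For GPU-centric jobs, the TRES fields let you express resources per GPU rather than per node; an illustrative set of directives (values are examples only):

```bash
#SBATCH --cpus-per-gpu=8
#SBATCH --mem-per-gpu=32G
#SBATCH --tres-bind=gres/gpu:closest
```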
Signals & notifications:
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| signal | signal | --signal | Pre-termination signal (e.g., B:USR1@300) |
| killOnNodeFail | kill_on_node_fail | --no-kill | Kill job on node failure (setting false maps to --no-kill) |
| waitAllNodes | wait_all_nodes | --wait-all-nodes | Wait for all nodes to boot |
Accounting & admin:
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| reservation | reservation | --reservation | Reservation name |
| licenses | licenses | --licenses | Required licenses |
| wckey | wckey | --wckey | Workload characterization key |
| comment | comment | --comment | Job comment |
| prefer | prefer | --prefer | Preferred (not required) features |
I/O & environment:
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| standardInput | standard_input | --input | Stdin file path |
| openMode | open_mode | --open-mode | Output file mode (append or truncate) |
| container | container | --container | OCI container path |
Advanced:
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| cpuFrequency | cpu_frequency | --cpu-freq | CPU frequency (low, medium, high, or KHz) |
| network | network | --network | Network specs |
| x11 | x11 | --x11 | X11 forwarding (batch, first, last, all) |
| burstBuffer | burst_buffer | --bb | Burst buffer specification |
| batchFeatures | batch_features | --batch | Batch node features |
| coreSpecification | core_specification | --core-spec | Reserved system cores |
| threadSpecification | thread_specification | --thread-spec | Reserved system threads |
| argv | argv | (script arguments) | Script arguments (space-separated) |
| flags | flags | --spread-job, etc. | Job flags (comma-separated) |
| profile | profile | --profile | Profiling (ENERGY, NETWORK, TASK) |
Cluster federation:
| Config Key | JSON Key | sbatch Flag | Description |
|---|---|---|---|
| requiredSwitches | required_switches | --switches | Required network switch count |
| waitForSwitch | wait_for_switch | --switches (timeout) | Switch wait timeout (seconds) |
| clusterConstraint | cluster_constraint | --cluster-constraint | Federation cluster constraint |
| clusters | clusters | --clusters | Target clusters (federation) |
Begin Time Formats
The beginTime field accepts all formats supported by sbatch --begin:
| Format | Example | Description |
|---|---|---|
| Named time | now, today, tomorrow | Current time, midnight today, midnight tomorrow |
| Named hour | midnight, noon, teatime | 00:00, 12:00, 16:00 (next occurrence) |
| Relative | now+1hour, now+30minutes | Offset from current time |
| Relative (seconds) | now+3600 | Bare number = seconds |
| ISO date | 2024-12-31 | Midnight on date |
| ISO datetime | 2024-12-31T14:30 | Specific date and time |
| US date | 12/31/24, 123124 | US date format |
| Time of day | 16:00, 4:00PM | Next occurrence of time |
Named times: midnight (00:00), noon (12:00), elevenses (11:00), fika (15:00), teatime (16:00).
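In the generated script these all land in a single --begin directive; a few illustrative forms:

```bash
#SBATCH --begin=now+1hour        # one hour after submission
#SBATCH --begin=2024-12-31T14:30 # specific date and time
#SBATCH --begin=teatime          # next 16:00
```

The same strings can be typed verbatim into the wizard's beginTime field.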
Template-Based Submission
To submit a job from a template, press s to open the submission wizard, then select "From template" and pick a template from the list. The selected template pre-fills the form with its configured defaults and controls which fields are visible.
You can also list available templates from the shell:

```shell
s9s templates list
```
Config-Driven Customization
All submission defaults and field visibility can be configured under views.jobs.submission in your config file (~/.s9s/config.yaml):
```yaml
views:
  jobs:
    submission:
      # Global defaults applied to every new job
      formDefaults:
        partition: "compute"
        timeLimit: "04:00:00"
        nodes: 1
        cpus: 4
        memory: "8G"
        workingDir: "/scratch/%u"   # %u = username (SLURM substitution)
        outputFile: "slurm_%j.out"
        errorFile: "slurm_%j.err"

      # Fields to hide globally from the form
      hiddenFields:
        - arraySpec
        - exclusive
        - requeue

      # Restrict dropdown values (filters cluster-fetched values)
      fieldOptions:
        partition: ["compute", "gpu", "highmem"]
        qos: ["normal", "high"]
        account: ["research-a", "research-b"]

      # Control which template sources are loaded (default: all three)
      # Options: "builtin", "config", "saved"
      templateSources: ["builtin", "config", "saved"]

      # Define custom config templates (see Job Templates section below)
      templates:
        - name: "GPU Training Job"
          description: "PyTorch training on GPU partition"
          defaults:
            partition: "gpu"
            timeLimit: "24:00:00"
            cpus: 8
            memory: "32G"
            gpus: 2
            script: |
              #!/bin/bash
              module load cuda pytorch
              python train.py
          hiddenFields: ["arraySpec"]
```
Monitoring Jobs
Job States
S9S color-codes job states for quick identification:
| State | Color | Description |
|---|---|---|
| PENDING | Yellow | Waiting for resources |
| RUNNING | Green | Currently executing |
| COMPLETED | Cyan | Finished successfully |
| FAILED | Red | Exited with error |
| CANCELLED | Gray | Cancelled by user/admin (SLURM uses British spelling) |
| TIMEOUT | White | Exceeded time limit |
| SUSPENDED | Orange | Temporarily suspended |
Job Details
Press Enter on any job to view details:
- Summary: ID, name, user, submission time
- Resources: Nodes, CPUs, memory, GPUs
- Timing: Start, elapsed, remaining time
- Performance: CPU/memory efficiency
- Output: Stdout/stderr file paths
- Dependencies: Parent/child jobs
Press d on a job to view its dependencies (distinct from the Enter details view).
Live Output Monitoring
View job output in real-time:
- Select job and press o
- Choose output type:
- Standard output (stdout)
- Standard error (stderr)
- Options:
- f - Follow/tail output
- s - Switch between stdout/stderr
- Esc - Exit viewer
For more details, see the Jobs View Guide.
Job Operations
Single Job Actions
| Key | Action | Description |
|---|---|---|
| c/C | Cancel | Cancel job (with confirmation) |
| H | Hold | Prevent job from starting |
| r | Release | Release held job |
| R | Refresh | Refresh the jobs list |
| :requeue JOBID | Requeue | Resubmit failed job (use command mode) |
| d/D | Dependencies | View job dependencies |
| p/P | Toggle Pending | Toggle pending state filter |
| e/E | Export | Open export dialog |
| m/M | Auto-refresh | Toggle auto-refresh |
Batch Operations
Select multiple jobs with Space, then press b:
1. Selection Methods:
   - Manual: Space on each job
   - Toggle multi-select: v or V
   - By filter: /PENDING, then select visible jobs

2. Batch Actions:
   - Cancel selected
   - Hold/Release selected
   - Requeue selected
   - Delete selected
   - Set Priority
   - Export output
Advanced Operations
Job Arrays
Array jobs are created by setting the arraySpec field in the submission wizard (for example, 1-100%10). Once submitted, individual array tasks appear in the jobs list and can be managed with standard job operations (cancel, hold, release) using the task's full job ID.
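As a sketch of how an array task typically consumes its index: SLURM exports SLURM_ARRAY_TASK_ID to each task, and the script derives its input from it (the data_N.csv naming here is hypothetical):

```shell
# Each array task selects its own input from the array index.
# SLURM sets SLURM_ARRAY_TASK_ID at run time; default it for local testing.
: "${SLURM_ARRAY_TASK_ID:=1}"
input="data_${SLURM_ARRAY_TASK_ID}.csv"
echo "processing ${input}"
```

With --array=1-100%10, one hundred such tasks run with at most ten active at a time.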
Dependencies
Job dependencies are set in the submission wizard via the dependencies field. Enter a comma-separated list of job IDs; S9S submits them as afterok:id1:id2 automatically. Dependency information is displayed in the job details view.
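The wizard entry is comma-separated while sbatch's dependency spec is colon-separated; a minimal shell sketch of the transformation S9S performs (illustrative, not the actual implementation):

```shell
# Turn a comma-separated ID list into an afterok dependency spec.
ids="1001,1002,1003"
dep="afterok:$(printf '%s' "$ids" | tr ',' ':')"
echo "$dep"
```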
See #115 for planned command-mode enhancements to array and dependency management.
Advanced Filtering
Filter Syntax
S9S supports two filtering modes:
Quick Filter (/) -- plain text search across all visible columns. Type /gpu to find items containing "gpu". The only special prefix is p: for partition filtering.
Global Search (Ctrl+F) -- opens cross-resource search (available in all data views). The advanced filter bar supports field-specific field=value syntax with operators:
```
# Advanced filter examples
state=RUNNING               # Running jobs
user=alice                  # Alice's jobs
state=PENDING user=bob      # Bob's pending jobs (AND logic)
name~analysis               # Jobs containing "analysis"
name=~"analysis.*2023"      # Regex match
memory>4G cpus>=8           # Resource comparisons
state!=FAILED               # Not failed
state in (RUNNING,PENDING)  # In list
```
Saved Filters
Saved filters are not yet implemented. Use the / quick filter for on-the-fly filtering. See #115 for planned saved-filter support.
Job Performance
Efficiency Metrics
S9S calculates job efficiency:
- CPU Efficiency: Actual vs allocated CPU usage
- Memory Efficiency: Peak vs allocated memory
- GPU Utilization: GPU usage percentage
- I/O Performance: Read/write statistics
View metrics by selecting a job and pressing Enter to see the job details view, which includes efficiency information when available.
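For intuition, CPU efficiency is conventionally used core-seconds divided by allocated core-seconds (elapsed time times allocated CPUs), the same ratio tools like seff report; a quick sketch with made-up numbers:

```shell
# CPU efficiency = total CPU seconds / (elapsed seconds * allocated CPUs).
# Illustrative numbers: 3600 CPU-seconds over 1800 s on 4 CPUs = 50%.
eff=$(awk 'BEGIN { printf "%.0f", 100 * 3600 / (1800 * 4) }')
echo "${eff}%"
```

A job showing low efficiency by this measure is usually over-requesting CPUs or bottlenecked elsewhere (I/O, memory bandwidth).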
Job Templates
Template System Overview
S9S uses a three-tier merge system to assemble the list of available templates. When two or more sources define a template with the same name, the higher-priority source wins.
| Priority | Source | Location |
|---|---|---|
| 1 (highest) | User-saved templates | ~/.s9s/templates/*.json |
| 2 | Config YAML templates | views.jobs.submission.templates in config |
| 3 (lowest) | Built-in templates | Hardcoded in S9S |
Built-in Templates
S9S ships with 8 built-in templates covering common job patterns:
| Template | Description |
|---|---|
| Basic Batch Job | Simple single-node batch job |
| MPI Parallel Job | Parallel job using MPI across multiple nodes |
| GPU Job | Job requiring GPU resources |
| Array Job | Array job for processing multiple similar tasks |
| Interactive Job | Interactive session for development and testing |
| Long-Running Job | Extended wall-time job |
| High Memory Job | Job requesting large memory allocation |
| Development/Debug Job | Short debug session with verbose output |
Config YAML Templates
Define custom templates in your config file under views.jobs.submission.templates. Each template can set default values for any form field and optionally hide irrelevant fields:
```yaml
views:
  jobs:
    submission:
      templates:
        - name: "GPU Training Job"
          description: "PyTorch training on GPU partition"
          defaults:
            partition: "gpu"
            timeLimit: "24:00:00"
            cpus: 8
            memory: "32G"
            gpus: 2
            script: |
              #!/bin/bash
              module load cuda pytorch
              python train.py
          hiddenFields: ["arraySpec"]
        - name: "Genomics Pipeline"
          description: "High-memory genomics analysis"
          defaults:
            partition: "highmem"
            timeLimit: "48:00:00"
            memory: "256G"
            cpus: 32
          hiddenFields: ["gpus", "arraySpec"]
```
User-Saved Templates
User-saved templates are stored as individual JSON files in ~/.s9s/templates/ and have the highest priority in the merge order.
Saving from the Wizard
After configuring a job in the submission wizard, use the "Save as Template" flow to save the current form state as a new template in ~/.s9s/templates/.
Template JSON Format
Each saved template is a JSON file with the following structure:
```json
{
  "name": "My Custom Template",
  "description": "Description of this template",
  "job_submission": {
    "name": "my_job",
    "partition": "compute",
    "time_limit": "04:00:00",
    "nodes": 2,
    "cpus": 8,
    "memory": "16G",
    "working_directory": "/scratch/%u",
    "output_file": "job_%j.out",
    "error_file": "job_%j.err",
    "script": "#!/bin/bash\nmodule load python\npython run.py"
  }
}
```
Naming conventions: Saved JSON templates use snake_case field names (matching Go struct JSON tags): `time_limit`, `output_file`, `error_file`, `working_directory`. Config YAML templates use camelCase: `timeLimit`, `outputFile`, `errorFile`, `workingDir`. Using the wrong convention silently ignores the field. Script arguments (`argv`) are split on whitespace in both formats; quoted arguments containing spaces are not supported.
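Side by side, the same two defaults expressed in each convention (illustrative values):

```yaml
# Config YAML template (camelCase)
defaults:
  timeLimit: "04:00:00"
  outputFile: "job_%j.out"
```

```json
{ "job_submission": { "time_limit": "04:00:00", "output_file": "job_%j.out" } }
```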
Job Script Preview
Before submitting, press Preview to see the complete sbatch script that will be generated from your form values. The preview shows all #SBATCH directives derived from the form fields followed by your script body.
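As an example of the preview's shape, a form filled with the quick-submit values from earlier might render roughly like this (the exact directive set depends on which fields are populated):

```bash
#!/bin/bash
#SBATCH --job-name=my_analysis
#SBATCH --partition=compute
#SBATCH --time=24:00:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=28
#SBATCH --mem=64G

module load python/3.9
python analyze.py
```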
| Key | Action |
|---|---|
| ESC | Close preview |
| Ctrl+Y | Copy script to clipboard (via OSC 52) |
The clipboard copy produces a clean plain-text script with no color formatting, ready to paste into a file or terminal. OSC 52 clipboard support works in most modern terminals (iTerm2, kitty, tmux, Windows Terminal, Alacritty).
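Under the hood, an OSC 52 copy is a terminal escape sequence carrying the base64-encoded text; a minimal shell sketch (assuming a terminal that honors OSC 52):

```shell
# Write text to the system clipboard via an OSC 52 escape sequence.
text='echo hello'
payload=$(printf '%s' "$text" | base64 | tr -d '\n')
printf '\033]52;c;%s\a' "$payload"
```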
CLI Commands
Manage templates from the command line:
```shell
# List all templates from all sources with source indicator (builtin/config/saved)
s9s templates list

# Export all built-in and config templates to ~/.s9s/templates/ as editable JSON
s9s templates export

# Export a single template by name
s9s templates export "GPU Job"

# Overwrite existing files during export
s9s templates export --force

# Export to a custom directory
s9s templates export --dir /path/to/templates
```
Template Workflow
A typical workflow for customizing templates:
1. Export the built-ins to get editable copies:

   ```shell
   s9s templates export
   ```

   This writes all 8 built-in templates (plus any config templates) to ~/.s9s/templates/ as JSON files. Existing files are skipped unless --force is used.

2. Edit the JSON files in ~/.s9s/templates/ to match your environment (change partitions, modules, default resources, etc.). The exported files use the same format that JobTemplateManager loads; changes take effect on the next wizard open without restarting s9s.

3. Verify your templates are loaded:

   ```shell
   s9s templates list
   ```

   Edited templates show as saved source and override any built-in with the same name.

4. Optionally restrict sources so only your edited templates appear in the wizard:

   ```yaml
   views:
     jobs:
       submission:
         templateSources: ["saved"]
   ```

5. Use templates in the wizard: press s to open the submission wizard, then pick from the template selector. Your custom templates appear with the values you set.
Controlling Template Sources
By default all three sources are loaded. Use the templateSources config option to control which sources appear:
```yaml
views:
  jobs:
    submission:
      # Show only user-saved and config templates (hide built-ins)
      templateSources: ["config", "saved"]
```
Valid values: "builtin", "config", "saved".
Job Workflows
Job Chains
To create a chain of dependent jobs, submit each job with its dependencies set in the submission wizard's dependencies field. For example, submit a preprocessing job first, then submit an analysis job with the preprocessing job's ID in the dependencies field, and so on. S9S automatically formats these as afterok dependencies.
Job chains and recurring job scheduling via command mode are not yet available. See #115 for planned workflow enhancements.
Job Reporting
Export Job Data
Press e to export the current view (in views that support export). This writes the visible data to a file. Command-mode export and report generation are not yet available. See #115 for planned reporting enhancements.
Tips & Best Practices
Efficiency Tips
- Use templates for repetitive jobs
- Set up filters for your common queries
- Monitor efficiency to optimize resource requests
- Use batch operations for multiple similar jobs
- Enable notifications for long-running jobs
Common Workflows
Debug Failed Jobs
```
# Use p/P to toggle pending filter, or / for text search
/FAILED          # Quick filter for failed jobs
Enter            # View job details
o                # Check output/errors
:requeue JOBID   # Requeue if needed (command mode)
```
Monitor GPU Jobs
```
/gpu    # Quick filter for GPU-related jobs
Enter   # View job details
```
Bulk Cancel User Jobs
```
/username   # Filter by user text
V           # Toggle multi-select mode
Space       # Select individual jobs
b           # Batch operations
c           # Cancel selected
```
Troubleshooting
Common Issues
Job Stuck in PENDING
- Check reason code in job details
- View partition limits
- Check dependencies
- Verify resource availability
Low Efficiency
- Review resource requests
- Check for I/O bottlenecks
- Verify correct partition
- Consider job profiling
Output Not Found
- Verify output paths in job script
- Check working directory
- Ensure write permissions
- Look for redirected output
Next Steps
- Learn about Jobs View Details
- Explore Performance Monitoring
- Set up Keyboard Shortcuts
- Master Advanced Filtering