
SSH Integration Guide

Interactive SSH access to cluster nodes directly from S9S for debugging, monitoring, and troubleshooting.

Overview

S9S provides direct SSH access to cluster nodes, allowing you to quickly open interactive terminal sessions for debugging jobs, inspecting node status, and performing administrative tasks.

Features:

  • One-click interactive SSH to cluster nodes
  • SSH connection testing and validation
  • Node information retrieval via SSH
  • Integration with job debugging workflows
  • SSH terminal session management

Quick SSH Access

Basic SSH Operations

From the Nodes View, press 's' on a selected node to open an interactive SSH session.

Key   Action                 Description
s     SSH to selected node   Direct interactive SSH connection
S     SSH to selected node   Same as lowercase s

SSH from Different Views

From Nodes View:

# Navigate to nodes view
:view nodes

# Select a node and press 's' to open SSH session
node001  IDLE    16/32 cores  64GB/128GB  ← [s] SSH here

From Jobs View:

# Navigate to jobs view
:view jobs

# Press 's' on a running job to SSH to the first of its allocated nodes
12345  alice  RUNNING  node[001-004]  ← [s] SSH to first job node

SSH Configuration

Basic Configuration

S9S uses your system's SSH configuration by default. Configure SSH settings in ~/.s9s/config.yaml:

ssh:
  # Default SSH user
  defaultUser: ${USER}

  # SSH key file
  keyFile: ~/.ssh/id_rsa

  # Connection options
  compression: true
  forwardAgent: true
  connectTimeout: 10s

  # Additional SSH arguments
  extraArgs: "-o StrictHostKeyChecking=ask -o ServerAliveInterval=60"
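
For orientation, the settings above correspond roughly to standard OpenSSH client flags. The sketch below prints a hand-written equivalent command line without running it. The function name `equivalent_ssh_command` is hypothetical, and the exact way S9S turns its configuration into `ssh` arguments is an assumption, not something this guide documents.

```shell
# Print an ssh command line that roughly mirrors the sample config above.
# Assumption: this key-to-flag mapping is illustrative, not S9S's actual logic.
equivalent_ssh_command() {
    node=$1
    user=${USER:-$(id -un)}     # defaultUser: ${USER}
    key="$HOME/.ssh/id_rsa"     # keyFile: ~/.ssh/id_rsa
    # -C = compression, -A = forwardAgent, ConnectTimeout = connectTimeout (seconds);
    # the remaining -o options are the extraArgs string, passed through unchanged.
    printf 'ssh -l %s -i %s -C -A -o ConnectTimeout=10 %s %s\n' \
        "$user" "$key" \
        '-o StrictHostKeyChecking=ask -o ServerAliveInterval=60' \
        "$node"
}
```

Calling `equivalent_ssh_command node001` gives a command you can paste into a terminal to try the same options by hand when a connection from S9S misbehaves.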

SSH Key Management

S9S leverages your existing SSH infrastructure:

ssh:
  keys:
    # Default key
    default: ~/.ssh/id_rsa

  # Key agent settings
  agent:
    useAgent: true
    addKeysOnConnect: true

Interactive SSH Sessions

Single Node SSH

Connect to individual nodes for interactive work:

# Basic SSH connection
s  # Press 's' on selected node in Nodes View

# SSH session opens in your terminal, suspending s9s temporarily
user@node001:~$

When you exit the SSH session, S9S resumes automatically.

SSH Terminal Manager

S9S includes an SSH Terminal Manager for working with several SSH sessions at once:

# From nodes view, select a node
# Choose "SSH Terminal Manager" option

# Features:
# - View active SSH sessions
# - Switch between multiple node connections
# - Monitor session status
# - Quick access to node information

SSH with Job Context

SSH directly to nodes running specific jobs:

# From jobs view, select a running job
12345  alice  RUNNING  node[001-004]  ← Select this

# Press 's' to SSH to the first node running this job
# Useful for debugging running jobs interactively
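
Outside of S9S, the same "first node of a job" lookup can be approximated with standard Slurm tools. This is a minimal sketch that assumes `squeue` and `scontrol` are on your PATH. The function name `job_first_node` is hypothetical, and this is not necessarily how S9S resolves the node internally.

```shell
# Print the first node allocated to a Slurm job.
job_first_node() {
    jobid=$1
    # %N prints the job's nodelist (e.g. node[001-004]); -h drops the header.
    nodelist=$(squeue -j "$jobid" -h -o '%N') || return 1
    if [ -z "$nodelist" ]; then
        echo "job $jobid has no allocated nodes" >&2
        return 1
    fi
    # Expand the compact nodelist to one hostname per line and keep the first.
    scontrol show hostnames "$nodelist" | head -n 1
}
```

A manual equivalent of pressing 's' on job 12345 would then be `ssh "$(job_first_node 12345)"`.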

SSH Features

Connection Testing

Test SSH connectivity to a node before opening a session:

# From the SSH options menu, select "Test Connection"
# S9S will verify:
# - SSH connectivity
# - Authentication
# - Basic command execution
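
The three checks listed above can be reproduced by hand with a single non-interactive `ssh` call. The sketch below is an approximation under stated assumptions: the function name `check_node_ssh` is hypothetical, the 10-second timeout is arbitrary, and the guide does not document which command S9S actually runs for its test.

```shell
# Run a no-op command over SSH and report which stage failed.
check_node_ssh() {
    node=$1
    status=0
    # BatchMode=yes makes ssh fail instead of prompting for a password;
    # `true` stands in for the "basic command execution" check.
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$node" true || status=$?
    case $status in
        0)   echo "$node: ok" ;;
        # ssh itself exits with 255 when the connection or authentication fails.
        255) echo "$node: connection or authentication failed" ;;
        *)   echo "$node: connected, but the remote command exited with $status" ;;
    esac
    return "$status"
}
```

Because the function returns the `ssh` exit status, it can gate other steps, for example `check_node_ssh node001 && ssh node001`.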

Node Information Retrieval

Gather basic node information via SSH:

# From the SSH options menu, select "Get Node Info"
# Retrieves:
# - Hostname
# - Uptime
# - Memory usage
# - CPU count
# - Disk usage
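
A comparable summary can be collected manually in one SSH round trip. The sketch below assumes Linux nodes where `nproc` and `free` are available. The names `NODE_INFO_SCRIPT` and `get_node_info` are hypothetical, and the commands S9S itself runs for "Get Node Info" are not documented in this guide.

```shell
# Commands run on the remote node; each one maps to an item in the list above.
NODE_INFO_SCRIPT='
echo "Hostname: $(hostname)"
echo "Uptime:   $(uptime)"
echo "CPUs:     $(nproc)"
echo "Memory:"; free -h
echo "Disk:";   df -h /
'

# Fetch the summary from a node; the whole script is sent as one remote command.
get_node_info() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" "$NODE_INFO_SCRIPT"
}
```

For example, `get_node_info node001` prints the summary for node001, or fails quickly if the node cannot be reached without a password prompt.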

SSH Options Menu

When initiating SSH from the Nodes View, S9S presents options:

  • SSH Terminal Manager - Advanced session management interface
  • Quick Connect - Direct SSH connection (fastest)
  • Test Connection - Verify SSH connectivity
  • Get Node Info - Retrieve basic node information

SSH Security

Authentication Methods

S9S uses SSH key authentication by default:

ssh:
  auth:
    method: key
    keyFile: ~/.ssh/id_rsa

Security Best Practices

ssh:
  security:
    # Require host key verification (recommended for production)
    strictHostKeyChecking: true
    knownHostsFile: ~/.ssh/known_hosts

    # Connection limits
    connectTimeout: 30s

    # Audit logging
    logConnections: true
    logFile: ~/.s9s/ssh.log

Important Security Note: By default, S9S disables strict host key checking for cluster environments where nodes are frequently rebuilt. For production use, enable strict host key checking in your configuration.

SSH Troubleshooting

Connection Issues

If SSH connection fails:

  1. Verify SSH connectivity manually:

    ssh <nodename>

  2. Check SSH agent (if using SSH agent):

    ssh-add -l

  3. Verify SSH key permissions:

    chmod 600 ~/.ssh/id_rsa

  4. Check S9S SSH configuration:

    ssh:
      keyFile: ~/.ssh/id_rsa  # Verify this path is correct
      defaultUser: ${USER}    # Verify username
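
Step 3 above can be turned into a quick check before changing anything. The sketch below reports whether a private key is accessible to group or other users. The function name `check_key_permissions` is hypothetical, and the default path simply mirrors the `keyFile` value used in this guide.

```shell
# Report whether a private key file exists and is private to its owner.
check_key_permissions() {
    key=${1:-$HOME/.ssh/id_rsa}
    if [ ! -f "$key" ]; then
        echo "missing: $key"
        return 1
    fi
    # Characters 5-10 of the `ls -l` mode string are the group and other bits;
    # all six must be '-' for ssh to accept the key.
    if [ "$(ls -ld "$key" | cut -c5-10)" = "------" ]; then
        echo "ok: $key"
    else
        echo "too open: $key (run: chmod 600 $key)"
        return 1
    fi
}
```

Run it with no argument to check the default key, or pass the path configured under `keyFile`.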

Common SSH Issues

Problem: "Permission denied (publickey)"

  • Solution: Ensure your SSH public key is authorized on the target node
  • Verify that ~/.ssh/authorized_keys on the node contains your public key

Problem: "Connection timeout"

  • Solution: Check network connectivity to the node
  • Verify the node is reachable:
    ping <nodename>

Problem: "Host key verification failed"

  • Solution: Update known_hosts file
  • Remove old key:
    ssh-keygen -R <nodename>
  • Or disable strict host key checking (less secure)
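
For the host key case, the removal step can be wrapped so that it only touches known_hosts when an entry for the node actually exists. This is a sketch using the standard `ssh-keygen -F` (find) and `-R` (remove) options; the function name `reset_host_key` is hypothetical.

```shell
# Remove a node's stale host key, but only if known_hosts has an entry for it.
reset_host_key() {
    node=$1
    # ssh-keygen -F exits non-zero when no entry for the host is found.
    if ssh-keygen -F "$node" >/dev/null 2>&1; then
        ssh-keygen -R "$node" >/dev/null 2>&1 && echo "removed stale key for $node"
    else
        echo "no known_hosts entry for $node"
    fi
}
```

After removal, the next connection prompts you to accept the node's new key. Compare its fingerprint against a trusted source before accepting it.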

Best Practices

SSH Usage

  1. Use SSH keys - Never use password authentication
  2. Keep keys secure - Protect private keys with file permissions (600)
  3. Use SSH agent - Avoid entering passphrases repeatedly
  4. Close sessions - Exit SSH sessions when done to free resources
  5. Verify node state - Check node status before SSH (avoid DOWN or DRAIN nodes)

Security

  1. Verify host keys - Use strict host key checking in production
  2. Monitor connections - Enable SSH connection logging
  3. Restrict access - Ensure only authorized users have SSH access to nodes
  4. Audit regularly - Review SSH logs for suspicious activity

Workflow Examples

Debug a Running Job

# 1. Navigate to jobs view
:view jobs

# 2. Find your running job
12345  alice  RUNNING  node[001-004]

# 3. Press 's' to SSH to a job node
# S9S suspends, SSH session opens

user@node001:~$ ps aux | grep <your_program>
user@node001:~$ htop -u alice
user@node001:~$ tail -f /path/to/job/output

# 4. Exit SSH session (Ctrl+D or 'exit')
# S9S resumes automatically

Check Node Health

# 1. Navigate to nodes view
:view nodes

# 2. Select a problematic node
# 3. Press 's' → choose "Get Node Info"

# S9S retrieves and displays:
# - Uptime
# - Memory usage
# - Disk space
# - CPU count

# Or choose "Quick Connect" for full SSH access

Investigate Failed Job

# 1. Find failed job in jobs view
12345  alice  FAILED  node003

# 2. Press 's' to SSH to the node where it failed
# 3. Investigate logs, check for errors

user@node003:~$ cd /scratch/alice/job_12345
user@node003:~$ less slurm-12345.out
user@node003:~$ dmesg | tail

Integration with S9S Workflows

SSH access is available from the main S9S cluster management workflows:

  • Job Debugging: SSH to nodes running specific jobs
  • Node Inspection: Quick access from node status screens
  • Troubleshooting: Direct access to nodes showing problems
  • Performance Analysis: Interactive exploration of node resources

Keyboard Reference

From Nodes View:

  • s or S - Open SSH to selected node

From Jobs View:

  • s or S - Open SSH to first node running selected job

From SSH Terminal Manager:

  • Enter - Connect to selected node/session
  • c - Create new SSH connection
  • i - Show node information
  • t - Open terminal session
  • s - Show system information
  • Esc - Close SSH interface

Next Steps