# Manual GPU Reservations

The `reserve` command allows you to manually reserve GPUs for a specific duration without immediately running a command. This is useful for interactive development, planning work sessions, or blocking GPUs for maintenance.

## Basic Usage

Defaults:

- `--gpus`: 1 GPU
- `--duration`: 8 hours

Options:

- `--gpus`, `-g`: Number of GPUs to reserve
- `--gpu-ids`: Specific GPU IDs to reserve (comma-separated, e.g., `1,3,5`)
- `--duration`, `-d`: How long to reserve the GPUs
## GPU Selection

- Use `--gpus` to let canhazgpu select GPUs using the LRU algorithm
- Use `--gpu-ids` when you need specific GPUs (e.g., for hardware requirements)
- You can use both options together if `--gpus` matches the GPU ID count or is 1 (the default)
## Duration Formats

canhazgpu supports flexible duration formats:

| Format | Description | Example |
|---|---|---|
| `30m` | 30 minutes | `--duration 30m` |
| `2h` | 2 hours | `--duration 2h` |
| `1d` | 1 day | `--duration 1d` |
| `0.5h` | 30 minutes (decimal) | `--duration 0.5h` |
| `90m` | 90 minutes | `--duration 90m` |
| `3.5d` | 3.5 days | `--duration 3.5d` |
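If you need to validate durations in your own tooling, the format above (a number, optionally decimal, followed by `m`, `h`, or `d`) can be parsed with a short helper. This is a hedged sketch of the format, not canhazgpu's actual parser:

```python
import re
from datetime import timedelta

# Suffixes accepted by the duration flags (assumed: minutes, hours, days)
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}

def parse_duration(text: str) -> timedelta:
    """Parse strings like '30m', '2h', '0.5h', '3.5d' into a timedelta."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([mhd])", text)
    if not match:
        raise ValueError(f"Invalid duration format: {text!r} "
                         "(use formats like 30m, 2h, 1d, 0.5h)")
    value, unit = float(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[unit]: value})
```

For example, `parse_duration("0.5h")` and `parse_duration("30m")` both yield a 30-minute timedelta, while `parse_duration("2hours")` raises `ValueError`.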
## Common Examples

### Quick Development Sessions

```bash
# Reserve 1 GPU for 2 hours
canhazgpu reserve --duration 2h

# Reserve 1 GPU for 30 minutes of testing
canhazgpu reserve --duration 30m
```

### Multi-GPU Development

```bash
# Reserve 2 GPUs for 4 hours
canhazgpu reserve --gpus 2 --duration 4h

# Reserve 4 GPUs for distributed development
canhazgpu reserve --gpus 4 --duration 6h

# Reserve specific GPU IDs
canhazgpu reserve --gpu-ids 0,2 --duration 4h
```

### Extended Work Sessions

```bash
# Full day development (8 hours, default)
canhazgpu reserve

# Multi-day project work
canhazgpu reserve --gpus 2 --duration 2d

# Week-long research sprint
canhazgpu reserve --gpus 1 --duration 7d
```
## Use Cases

### Interactive Development

Perfect for Jupyter notebooks, IPython sessions, or iterative model development:

```bash
# Reserve GPU for notebook session
canhazgpu reserve --duration 4h

# Note the GPU IDs from the output, e.g., "Reserved 1 GPU(s): [2]"
# Manually set CUDA_VISIBLE_DEVICES
export CUDA_VISIBLE_DEVICES=2

# Start Jupyter with the reserved GPU
jupyter notebook

# Your notebooks now have exclusive GPU access
```

### Batch Job Preparation

Reserve GPUs while you prepare and test your batch jobs:

```bash
# Reserve GPUs for job prep
canhazgpu reserve --gpus 2 --duration 2h

# Note the GPU IDs from the output, e.g., "Reserved 2 GPU(s): [1, 3]"
# Manually set CUDA_VISIBLE_DEVICES
export CUDA_VISIBLE_DEVICES=1,3

# Test your scripts with the reserved GPUs
python test_distributed.py

# Run the actual job (using the same GPUs)
python distributed_training.py

# Release when done
canhazgpu release
```

### Maintenance Windows

Block GPUs during system maintenance or updates:

```bash
# Block GPUs during driver updates
canhazgpu reserve --gpus 8 --duration 1h

# Perform maintenance
sudo apt update && sudo apt upgrade nvidia-driver-*

# Release after maintenance
canhazgpu release
```

### Meeting and Presentation Prep

Ensure GPUs are available for demos and presentations:

```bash
# Reserve before an important demo
canhazgpu reserve --gpus 1 --duration 3h

# Run demo applications
python demo_inference.py
jupyter notebook presentation.ipynb

# Release after the presentation
canhazgpu release
```
## How Manual Reservations Work

### Allocation Process

1. Validation: Checks actual GPU usage with `nvidia-smi`
2. Conflict Detection: Excludes GPUs in unreserved use
3. LRU Selection: Chooses the least recently used GPUs
4. Time-based Expiry: Sets an expiration time based on the duration
5. Persistent Storage: Saves the reservation in Redis
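The selection step can be illustrated with a minimal LRU pick over the GPUs that survive validation and conflict detection. A sketch under stated assumptions (the data layout here is illustrative, not the actual Redis schema):

```python
def pick_lru(available: dict[int, float], count: int) -> list[int]:
    """Pick `count` GPU IDs, preferring those released longest ago.

    `available` maps GPU ID -> last-released timestamp (0.0 = never used).
    """
    if count > len(available):
        raise RuntimeError(f"Not enough GPUs available. "
                           f"Requested: {count}, Available: {len(available)}")
    # Least recently used first: smallest last-released timestamp wins
    return sorted(available, key=available.get)[:count]
```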
### Environment Setup

Unlike `run` commands, manual reservations don't automatically set environment variables. You need to check which GPUs were allocated:

```bash
# Reserve GPUs
❯ canhazgpu reserve --gpus 2 --duration 4h
Reserved 2 GPU(s): [1, 3] for 4h 0m 0s

# Check current allocations
❯ canhazgpu status
GPU STATUS    USER     DURATION    TYPE    MODEL            DETAILS                    VALIDATION
--- --------- -------- ----------- ------- ---------------- -------------------------- ---------------------
1   in use    alice    30s         manual                   expires in 3h 59m 30s
3   in use    alice    30s         manual                   expires in 3h 59m 30s

# Manually set CUDA_VISIBLE_DEVICES
export CUDA_VISIBLE_DEVICES=1,3
python your_script.py
```
### Expiration and Cleanup

Manual reservations automatically expire after the specified duration:

```bash
❯ canhazgpu status
GPU STATUS    USER     DURATION    TYPE    MODEL            DETAILS                    VALIDATION
--- --------- -------- ----------- ------- ---------------- -------------------------- ---------------------
1   in use    alice    3h 58m 45s  manual                   expires in 1m 15s

# After expiration
❯ canhazgpu status
GPU STATUS    USER     DURATION    TYPE    MODEL            DETAILS                    VALIDATION
--- --------- -------- ----------- ------- ---------------- -------------------------- ---------------------
1   available          free for 5s
```
## Releasing Reservations

### Manual Release

Release all your manual reservations immediately with `canhazgpu release`.

### Checking Your Reservations

Use `status` to see your current reservations:

```bash
❯ canhazgpu status
GPU STATUS    USER     DURATION    TYPE    MODEL            DETAILS                    VALIDATION
--- --------- -------- ----------- ------- ---------------- -------------------------- ---------------------
0   available          free for 1h 15m 30s
1   in use    alice    45m 12s     manual                   expires in 3h 14m 48s     # Your reservation
2   in use    bob      1h 30m 0s   run     pytorch-model    heartbeat 5s ago
3   in use    alice    45m 12s     manual                   expires in 3h 14m 48s     # Your reservation
```
## Error Handling

### Insufficient GPUs

```bash
❯ canhazgpu reserve --gpus 4 --duration 2h
Error: Not enough GPUs available. Requested: 4, Available: 2 (2 GPUs in use without reservation - run 'canhazgpu status' for details)
```

Check the status and try again with fewer GPUs, or wait for others to finish.

### Invalid Duration Format

```bash
❯ canhazgpu reserve --duration 2hours
Error: Invalid duration format. Use formats like: 30m, 2h, 1d, 0.5h
```

Use the supported duration formats listed above.

### Allocation Lock Contention

Multiple users are trying to allocate GPUs at the same time. Wait a few seconds and try again.
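If your automation hits lock contention regularly, a small retry loop with exponential backoff is usually enough. A sketch; `attempt_reserve` here is a stand-in for however you invoke `canhazgpu reserve` (e.g. via `subprocess`), not part of canhazgpu itself:

```python
import time

def reserve_with_retry(attempt_reserve, retries: int = 5, delay: float = 2.0):
    """Call `attempt_reserve()` until it succeeds, backing off between tries.

    `attempt_reserve` should return a result on success and raise
    RuntimeError when the allocation lock is contended.
    """
    for i in range(retries):
        try:
            return attempt_reserve()
        except RuntimeError:
            if i == retries - 1:
                raise  # out of attempts; surface the error
            time.sleep(delay * (2 ** i))  # backoff: 2s, 4s, 8s, ...
```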
## Best Practices

### Duration Planning

- Start conservative: Reserve for shorter periods initially
- Extend if needed: Run `reserve` again to extend (requires releasing first)
- Plan for interruptions: Don't reserve longer than you'll actually use

### Resource Efficiency

```bash
# Good: Reserve what you need
canhazgpu reserve --gpus 1 --duration 2h

# Wasteful: Over-reserving
canhazgpu reserve --gpus 8 --duration 24h  # Only if you really need this
```

### Team Coordination

- Communicate: Let teammates know about long reservations
- Release early: Use `canhazgpu release` when you finish ahead of schedule
- Check conflicts: Use `canhazgpu status` before making large reservations
### Development Workflow

```bash
# Efficient development cycle
canhazgpu reserve --duration 1h           # Start small
# ... work for 45 minutes ...
canhazgpu reserve --duration 30m          # Extend if needed (after releasing)
# ... finish work ...
canhazgpu release                         # Clean up immediately
```
## Integration Examples

### Shell Scripts

```bash
#!/bin/bash
set -e

echo "Reserving GPUs for data processing..."
canhazgpu reserve --gpus 2 --duration 3h

echo "Starting data processing pipeline..."
python preprocess.py
python feature_extraction.py
python model_training.py

echo "Releasing GPUs..."
canhazgpu release
echo "Processing complete!"
```
### Python Integration

```python
import os
import re
import subprocess

def reserve_gpus(count=1, duration="2h"):
    """Reserve GPUs and return the allocated GPU IDs."""
    result = subprocess.run([
        "canhazgpu", "reserve",
        "--gpus", str(count),
        "--duration", duration
    ], capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"GPU reservation failed: {result.stderr}")
    # Parse GPU IDs from output like "Reserved 2 GPU(s): [1, 3] for 2h 0m 0s"
    match = re.search(r'Reserved \d+ GPU\(s\): \[([^\]]+)\]', result.stdout)
    if match:
        gpu_ids = [int(x.strip()) for x in match.group(1).split(',')]
        os.environ['CUDA_VISIBLE_DEVICES'] = ','.join(map(str, gpu_ids))
        return gpu_ids
    return []

def release_gpus():
    """Release all manual reservations."""
    subprocess.run(["canhazgpu", "release"], check=True)

# Usage
try:
    gpu_ids = reserve_gpus(2, "1h")
    print(f"Using GPUs: {gpu_ids}")
    # Your GPU work here
    import torch
    print(f"PyTorch sees {torch.cuda.device_count()} GPUs")
finally:
    release_gpus()
```
Manual reservations provide fine-grained control over GPU allocation, making them perfect for interactive development and planned work sessions.