@sosoihd sosoihd commented Nov 27, 2025

Add support for latest-generation Google Cloud machine families and boot disk type configuration

Problem

Two limitations currently prevent full use of Google Cloud Batch capabilities:

1. Missing support for latest-generation machine families

Google Cloud has introduced several new general-purpose machine families that are not currently supported by Nextflow: C4, C4A, C4D, N4, N4A, and N4D.

These families offer significant improvements in:

  • Price-performance ratio (up to 10% better than previous generations)
  • Memory bandwidth
  • Network throughput
  • Energy efficiency

Without this support, users cannot leverage:

  • Recent Intel Xeon processors (C4, N4)
  • Recent AMD EPYC processors (C4D, N4D) and Google Axion Arm processors (C4A, N4A)
  • Improved performance characteristics of these newer families

2. Inability to specify boot disk type

Currently, Nextflow only allows configuring the boot disk size via google.batch.bootDiskSize, but not the disk type. This creates several issues:

Compatibility problems:

  • The new C4, C4A, C4D, N4, N4A, and N4D families do not support pd-balanced disks (the Google Cloud default)
  • These families require a Hyperdisk type such as hyperdisk-balanced
  • Without a way to set the boot disk type, these machine families cannot be used at all

Performance optimization:

  • High-I/O workloads may benefit from pd-ssd (higher IOPS)
  • Cost-sensitive workflows may prefer pd-standard (lower cost)
  • Users cannot optimize disk performance for their specific workloads

Reference: Google Cloud Disk Types Documentation

Solution

This PR addresses both issues:

1. Add support for latest-generation machine families

Machine type recognition:

  • Added C4, C4A, C4D, N4, N4A, N4D families to GENERAL_PURPOSE_FAMILIES
  • Updated local SSD handling for C4/C4A/C4D families with -lssd suffix
  • Disabled local SSD for N4, N4A, N4D families (not supported by hardware)
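
The per-family local SSD rules above can be sketched as follows. This is a minimal illustration, not the code in this PR: the class name, constants, and method names are hypothetical, and treating any other family as "supported" is a simplifying assumption.

```groovy
// Hypothetical sketch; only the family names come from the PR description.
class LocalSsdSketch {
    // Families that get local SSD only via machine types ending in "-lssd"
    static final List<String> LSSD_SUFFIX_FAMILIES = ['c4', 'c4a', 'c4d']
    // Families with no local SSD support
    static final List<String> NO_LOCAL_SSD_FAMILIES = ['n4', 'n4a', 'n4d']

    // The family is the part of the machine type before the first dash
    static String familyOf(String machineType) {
        machineType.toLowerCase().tokenize('-')[0]
    }

    static boolean supportsLocalSsd(String machineType) {
        final family = familyOf(machineType)
        if (family in NO_LOCAL_SSD_FAMILIES)
            return false
        if (family in LSSD_SUFFIX_FAMILIES)
            return machineType.toLowerCase().endsWith('-lssd')
        // Assumption: other families are handled by existing rules elsewhere
        return true
    }
}

assert LocalSsdSketch.supportsLocalSsd('c4-standard-8-lssd')
assert !LocalSsdSketch.supportsLocalSsd('n4-standard-4')
```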

Testing:

  • Added comprehensive tests for all new machine families
  • Verified local SSD behavior (supported/not supported) per family

2. Add bootDiskType configuration option

New configuration parameter:

google {
    project = 'your-project-id'
    location = 'us-central1'
    batch {
        bootDiskSize = '50 GB'
        bootDiskType = 'hyperdisk-balanced'  // NEW: Specify disk type
    }
}

Supported disk types (Google Cloud documentation):

  • pd-standard - Standard persistent disk (HDD, lowest cost)
  • pd-balanced - Balanced persistent disk (SSD, default for most instances)
  • pd-ssd - SSD persistent disk (highest performance)
  • hyperdisk-balanced - Hyperdisk balanced (required for C4/N4 families)

Key features:

  • Optional configuration (backward compatible)
  • Works with bootDiskImage when both are specified
  • Ignored when using instance templates (with warning)
  • Enables use of new machine families that require specific disk types
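
As an illustration of the second feature, a configuration that sets both options might look like the fragment below. The image name `batch-debian` and the project and location values are placeholders chosen for this example, not values taken from this PR.

```groovy
// nextflow.config (illustrative values)
google {
    project = 'your-project-id'
    location = 'us-central1'
    batch {
        bootDiskImage = 'batch-debian'        // placeholder boot image
        bootDiskSize = '50 GB'
        bootDiskType = 'hyperdisk-balanced'   // applied together with the image
    }
}
```

When an instance template is used instead, the description above says bootDiskType is ignored and a warning is logged.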

Changes

Core Implementation

GoogleBatchMachineTypeSelector.groovy

  • Added GENERAL_PURPOSE_FAMILIES constant for C4/N4 family detection
  • Implemented isHyperdiskOnly() method to identify families requiring Hyperdisk
  • Updated findValidLocalSSDSize() to handle C4/C4D local SSD variants
  • Added logic to disable local SSD for N4/N4A/N4D families
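
A check like isHyperdiskOnly() could be sketched roughly as below. The PR description names the method but not its signature or body, so the class name, the String parameter, and the family set are assumptions made for illustration.

```groovy
// Hypothetical sketch of a Hyperdisk-only family check
class HyperdiskSketch {
    // Families the PR description lists as not supporting pd-balanced
    static final Set<String> HYPERDISK_ONLY_FAMILIES =
        ['c4', 'c4a', 'c4d', 'n4', 'n4a', 'n4d'] as Set

    static boolean isHyperdiskOnly(String machineType) {
        if (!machineType)
            return false
        // Compare the family prefix, i.e. the text before the first dash
        final family = machineType.toLowerCase().tokenize('-')[0]
        return family in HYPERDISK_ONLY_FAMILIES
    }
}

assert HyperdiskSketch.isHyperdiskOnly('c4a-standard-8')
assert !HyperdiskSketch.isHyperdiskOnly('n2-standard-4')
```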

BatchConfig.groovy

  • Added bootDiskType field with @ConfigOption annotation
  • Comprehensive documentation including machine family compatibility notes
  • Constructor initialization for the new parameter

GoogleBatchTaskHandler.groovy

  • Updated boot disk configuration logic to apply bootDiskType when specified
  • Added warning when bootDiskType is used with instance templates
  • Consolidated boot disk builder logic for clarity
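
The boot disk decision described above can be sketched in plain Groovy. This is not the handler code from this PR: the function name, the Map-based settings, and the warning text are hypothetical, and skipping all boot disk settings when an instance template is used is an assumption of the sketch.

```groovy
// Hypothetical helper: returns the boot disk settings to apply,
// or null when nothing should be customised.
Map resolveBootDisk(Map batch, boolean usesInstanceTemplate, List<String> warnings) {
    if (usesInstanceTemplate) {
        // The template defines the disk, so the option is only reported
        if (batch.bootDiskType)
            warnings << 'bootDiskType is ignored when an instance template is used'
        return null
    }
    def disk = [:]
    if (batch.bootDiskSize) disk.size = batch.bootDiskSize
    if (batch.bootDiskType) disk.type = batch.bootDiskType
    if (batch.bootDiskImage) disk.image = batch.bootDiskImage
    // An empty map is falsy in Groovy, so no options yields null
    return disk ?: null
}

def warnings = []
assert resolveBootDisk([bootDiskType: 'pd-ssd'], false, warnings) == [type: 'pd-ssd']
assert resolveBootDisk([bootDiskType: 'pd-ssd'], true, warnings) == null
assert warnings.size() == 1
```

Collecting all three options into one structure before building the request mirrors the "consolidated builder" idea: the type is applied whether or not an image is also set.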

Tests

GoogleBatchMachineTypeSelectorTest.groovy

  • Added tests for C4, C4A, C4D families with local SSD support
  • Added tests for N4, N4A, N4D families (no local SSD support)
  • Verified isHyperdiskOnly() behavior for all new families

BatchConfigTest.groovy

  • Added test for bootDiskType parsing from configuration
  • Added test for bootDiskType combined with other boot disk options
  • Verified default behavior (null when not specified)

GoogleBatchTaskHandlerTest.groovy

  • Added test for boot disk type configuration alone
  • Added test for boot disk type combined with boot disk image
  • Verified proper protobuf generation in job requests

Documentation

docs/reference/config.md

  • Added google.batch.bootDiskType to configuration reference
  • Documented all supported disk types with links to Google Cloud docs

docs/google.md

  • Updated disk directive documentation
  • Removed outdated limitation about disk type configuration

Compatibility

  • Fully backward compatible - All changes are additive
  • No breaking changes - Existing configurations work unchanged
  • Optional parameters - New features are opt-in
  • Instance template support - Proper handling with warnings

Testing

All tests pass:

  • ✅ 31 new tests for new machine families (local SSD handling)
  • ✅ 3 new tests for bootDiskType configuration
  • ✅ 110+ existing tests continue to pass
  • ✅ Integration with existing features verified

Test coverage:

  • Machine family detection and validation
  • Local SSD size validation per family
  • Boot disk type configuration parsing
  • Boot disk type in job request generation
  • Combined boot disk image + type scenarios
  • Instance template compatibility

Use Cases Enabled

1. Using latest-generation machines:

process myTask {
    machineType 'c4-standard-4'  // Now works!
    memory '16 GB'
    
    script:
    """
    # High-performance workload on a C4 (Intel Xeon) instance
    """
}

2. Optimizing for cost:

google.batch.bootDiskType = 'pd-standard'  // Lower cost HDD boot disk

3. Optimizing for performance:

google.batch.bootDiskType = 'pd-ssd'  // High-performance SSD boot disk

4. Using new machine families:

process highPerf {
    machineType 'c4a-standard-8'  // Google Axion (Arm)
    
    script:
    """
    # Requires a Hyperdisk boot disk such as hyperdisk-balanced
    """
}

google.batch.bootDiskType = 'hyperdisk-balanced'  // Compatible with C4A

References

docs: update Google Batch documentation to include bootDiskType option

fix: ensure compatibility with machine families requiring Hyperdisk

test(nf-google): add unit tests for boot disk type configurations

chore: update .gitignore to exclude 'mise.toml'

build: update build-info properties for nextflow module
Signed-off-by: Sofiane Ihaddadene <[email protected]>
@sosoihd sosoihd requested a review from a team as a code owner November 27, 2025 14:57
netlify bot commented Nov 27, 2025

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit 5d05ae5
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/692866d9cdae8f00085a4d46
😎 Deploy Preview https://deploy-preview-6616--nextflow-docs-staging.netlify.app

@pditommaso pditommaso requested review from jorgee and removed request for a team November 28, 2025 09:21