Scalable Nextflow Modules: Building a Template with Copier, CI/CD, and nf-test
Creating and maintaining a library of reusable Nextflow modules is a significant challenge for bioinformatics teams. Without a consistent structure, code quality standards, and automated testing, modules quickly become difficult to share, validate, and integrate into pipelines. The nf-modules-template solves this by providing a production-ready template that uses Copier to scaffold new module repositories, GitHub Actions for automated CI/CD workflows, pre-commit hooks for code quality, and nf-test with intelligent sharding for scalable module testing. This post explores how these technologies work together to enable reproducible, maintainable Nextflow module libraries.
1. The Problem: Fragmented Module Management
1.1. Why Module Repositories Matter
In complex bioinformatics workflows, teams often develop dozens or hundreds of reusable Nextflow processes. Without a centralized, well-structured module repository:
- Developers duplicate code across projects (violating DRY principles)
- Modules lack standardized documentation and metadata
- Testing is ad-hoc or missing entirely
- Code quality varies significantly between modules
- Integration with other pipelines is error-prone
- Version management and dependency tracking become chaotic
1.2. The Traditional Approach (Manual and Error-Prone)
Without a template, setting up a new module repository requires:
# Manual directory creation
mkdir -p my-modules/{modules,subworkflows,tests,docs,.github/workflows}
# Manual configuration file creation
touch {.pre-commit-config.yaml,ruff.toml,nf-test.config,Makefile}
# Manual GitHub workflow setup (copy-paste from other projects)
touch .github/workflows/lint.yml
touch .github/workflows/nf-test.yml
# Manual CI/CD pipeline configuration
# ... dozens of manual steps with many opportunities for error
Problems:
- Inconsistency across different module repositories
- Outdated workflows in old projects
- Difficult to enforce standards across teams
- Onboarding new developers takes time and is error-prone
- Changes to best practices require manual updates everywhere
1.3. The Better Approach: Copier + Templates
# One command to create a fully configured module repository
copier copy gh:nf-core/modules-template ./my-new-modules-library
# Follow interactive prompts
# → Repo host? GitHub
# → Organization name? my-org
# → License? MIT License
# → ... (5-10 quick prompts)
# One command to initialize
cd my-new-modules-library
bash ./project_init.sh
# Fully configured repository with:
# ✓ Pre-commit hooks (ruff, prettier, hadolint, jsonschema)
# ✓ GitHub Actions workflows (lint + nf-test)
# ✓ nf-test framework with sharding support
# ✓ Module structure and examples
# ✓ Documentation setup (MkDocs)
# ✓ Contributing guidelines
2. Understanding the Architecture
2.1. The Technology Stack
┌────────────────────────────────────────────────────┐
│ nf-modules-template Repository │
├────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Copier Configuration (copier.yml) │ │
│ │ - Template parameters │ │
│ │ - Jinja2 templating │ │
│ │ - Interactive prompts │ │
│ └─────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Generated Repository │ │
│ │ - Project structure │ │
│ │ - Module templates │ │
│ │ - Workflows │ │
│ │ - Configurations │ │
│ └─────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Code Quality (Pre-commit Hooks) │ │
│ │ - Ruff (Python linting) │ │
│ │ - Prettier (YAML/JSON formatting) │ │
│ │ - Hadolint (Dockerfile validation) │ │
│ │ - JSON Schema validation │ │
│ └─────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────┐ │
│ │ CI/CD Workflows (GitHub Actions) │ │
│ │ ┌─────────────────────────────────────────┐ │ │
│ │ │ Lint Workflow (PR/push) │ │ │
│ │ │ - Pre-commit linting │ │ │
│ │ │ - nf-core linting │ │ │
│ │ │ - YAML/JSON schema validation │ │ │
│ │ └─────────────────────────────────────────┘ │ │
│ │ ┌─────────────────────────────────────────┐ │ │
│ │ │ nf-test Workflow (parallelized) │ │ │
│ │ │ - Shard detection (get-shards action) │ │ │
│ │ │ - Parallel test execution │ │ │
│ │ │ - BAM/VCF/Utils plugins │ │ │
│ │ └─────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Module/Subworkflow (nf-test Framework) │ │
│ │ - Snapshot testing │ │
│ │ - Input/output validation │ │
│ │ - Plugin support (BAM, VCF) │ │
│ └─────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────┘
2.2. Repository Structure
A generated module repository looks like:
my-modules/
├── .github/
│ ├── workflows/
│ │ ├── lint.yml # Linting checks
│ │ └── nf-test.yml # Testing workflow
│ ├── CONTRIBUTING.md # Contributing guidelines
│ └── ISSUE_TEMPLATE/ # Issue templates
├── modules/
│ ├── nf-core/ # nf-core modules (shared)
│ │ └── examplemodule/
│ │ ├── main.nf # Process definition
│ │ ├── meta.yml # Module metadata
│ │ ├── environment.yml # Conda environment
│ │ ├── tests/
│ │ │ ├── main.nf.test.jinja # Test file
│ │ │ └── tags.yml # Test tags
│ │ └── nextflow.config # Module config
│ └── my-org/ # Organization-specific modules
│ └── custommodule/ # Your custom modules follow same structure
├── subworkflows/ # Composite workflows
│ ├── nf-core/
│ └── my-org/
├── tests/
│ ├── config/
│ │ ├── nf-test.config # Test configuration
│ │ └── pytest_modules.yml # Pytest config
│ ├── pytest.ini
│ └── demo_nextflow.config
├── .pre-commit-config.yaml # Pre-commit hooks configuration
├── .nf-core.yml # nf-core configuration
├── nf-test.config # nf-test framework config
├── ruff.toml # Python linter config
├── .prettierignore                  # Prettier ignore rules
├── .prettierrc.yml                  # Code formatter config
├── mkdocs.yml # Documentation config
├── main.nf # Example main workflow
├── nextflow.config # Main Nextflow config
├── Makefile # Development commands
├── README.md # Project documentation
└── LICENSE # License file
3. Copier: Templating System
3.1. What is Copier?
Copier is a template engine and scaffolding tool that:
- Uses Jinja2 templating for dynamic file generation
- Supports interactive prompts for collecting project metadata
- Can be updated incrementally (regenerate from updated templates)
- Works with version control systems
3.2. The copier.yml Configuration
The template's copier.yml defines prompts and template variables:
repo_host:
type: str
help: "What is the host of your code repository?"
default: "https://github.com"
choices:
GitHub: "https://github.com"
Gitlab: "https://gitlab.com"
repo_org_name:
type: str
help: "What is your organization/user name?"
placeholder: "demo-org"
required: true
validator: >-
{% if not (repo_org_name | regex_search('^[a-zA-Z][a-zA-Z0-9\-_]+$')) %}
repo_org_name must start with a letter...
{% endif %}
short_org_name:
type: str
help: "What is your abbreviated org name?"
default: "{{ repo_org_name }}"
validator: >-
{% if not (short_org_name | regex_search('^[a-z\-_]+$')) %}
short_org_name must be lowercase...
{% endif %}
ci:
type: str
help: "What CI provider will you use?"
default: "github"
choices:
Github Actions: "github"
None: "none"
when: "{{ repo_host == 'https://github.com' }}"
copyright_license:
type: str
help: "Your project's license"
default: "MIT License"
choices:
- "MIT License"
- "Apache License 2.0"
- "GPL v3.0"
# ... many more options
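The validator expressions above are ordinary regular expressions, so candidate names can be sanity-checked outside of Copier. A minimal Python sketch (the function name and error messages are illustrative, not part of the template):

```python
import re

# Regexes mirroring the copier.yml validators above.
REPO_ORG_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9\-_]+$")
SHORT_ORG_RE = re.compile(r"^[a-z\-_]+$")

def validate_names(repo_org_name: str, short_org_name: str) -> list[str]:
    """Return a list of validation errors; empty means both names are valid."""
    errors = []
    if not REPO_ORG_RE.match(repo_org_name):
        errors.append("repo_org_name must start with a letter and use letters, digits, '-' or '_'")
    if not SHORT_ORG_RE.match(short_org_name):
        errors.append("short_org_name must contain only lowercase letters, '-' or '_'")
    return errors

print(validate_names("my-org", "myorg"))
# → []
```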
3.3. Using Copier to Generate a Repository
# Install Copier
pipx install copier
# Generate from the template
copier copy --vcs-ref main gh:nf-core/modules-template ./my-modules
# Interactive prompts:
# repo_host? [https://github.com]:
# repo_org_name? [demo-org]: my-org
# short_org_name? [my-org]: myorg
# ci? [github]: github
# repo_name? [modules]: bioinformatics-modules
# description? [An nf-core modules repository...]: My custom Nextflow modules
# copyright_holder? [Author or Organization Name]: My Organization
# default_branch? [main]: main
3.4. Jinja2 Templating in Action
Files with .jinja extension are template files that get rendered:
# .pre-commit-config.yaml.jinja
repos:
- repo: https://github.com/pre-commit/mirrors-prettier
rev: "v3.1.0"
hooks:
- id: prettier
- repo: https://github.com/python-jsonschema/check-jsonschema
rev: 0.35.0
hooks:
- id: check-jsonschema
# Validate modules for {{ short_org_name }}
files: ^modules/{{ short_org_name }}/.*/meta\.yml$
args: ["--schemafile", "modules/yaml-schema.json"]
When rendered with short_org_name = "myorg", becomes:
# .pre-commit-config.yaml (rendered)
repos:
- repo: https://github.com/pre-commit/mirrors-prettier
rev: "v3.1.0"
hooks:
- id: prettier
- repo: https://github.com/python-jsonschema/check-jsonschema
rev: 0.35.0
hooks:
- id: check-jsonschema
# Validate modules for myorg
files: ^modules/myorg/.*/meta\.yml$
args: ["--schemafile", "modules/yaml-schema.json"]
Conditional templating also works:
# .github/workflows folder structure
{% if 'github' in repo_host %}.github{% endif %}/
{% if ci == 'github' %}actions{% endif %}/
This renders the .github/ directory only for GitHub-hosted repositories, and its actions/ subdirectory only when GitHub Actions is selected as the CI provider.
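Conceptually, the variable substitution Copier performs is straightforward. The sketch below is a deliberately simplified stand-in (Copier uses the full Jinja2 engine, with filters, conditionals, and validators this toy version ignores):

```python
import re

def render(template: str, context: dict) -> str:
    """Replace {{ name }} placeholders with values from `context`.
    A toy illustration of Jinja2-style substitution, not the real engine."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(context[m.group(1)]),
        template,
    )

line = r"files: ^modules/{{ short_org_name }}/.*/meta\.yml$"
print(render(line, {"short_org_name": "myorg"}))
# → files: ^modules/myorg/.*/meta\.yml$
```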
4. Repository Organization: Modules and Subworkflows
4.1. Modules Directory Structure
Modules are self-contained processes with their own tests and documentation:
modules/
├── nf-core/ # Official nf-core modules
│ ├── bwa/
│ │ ├── index/
│ │ │ ├── main.nf # Module process
│ │ │ ├── meta.yml # Metadata
│ │ │ ├── environment.yml # Dependencies
│ │ │ └── tests/
│ │ │ ├── main.nf.test.jinja
│ │ │ ├── tags.yml
│ │ │ └── nextflow.config
│ │ └── mem/
│ └── fastqc/
├── my-org/ # Organization-specific modules
│ └── custom_qc/
│ ├── main.nf
│ ├── meta.yml
│ ├── environment.yml
│ └── tests/
└── yaml-schema.json # Schema for validating meta.yml
4.2. Module Anatomy: A Complete Example
// modules/my-org/examplemodule/main.nf
process EXAMPLEMODULE {
tag "$meta.id"
label 'process_medium'
// Conda environment for reproducibility
conda "${moduleDir}/environment.yml"
// Container support (Docker or Singularity)
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/fastqc:0.12.1--hdfd78af_0' :
'biocontainers/fastqc:0.12.1--hdfd78af_0' }"
// Inputs with metadata
input:
tuple val(meta), path(reads)
// Outputs with emit labels
output:
tuple val(meta), path("*.html"), emit: html
tuple val(meta), path("*.zip") , emit: zip
path "versions.yml" , emit: versions
// Conditional execution
when:
task.ext.when == null || task.ext.when
script:
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"
def memory_in_mb = MemoryUnit.of("${task.memory}").toUnit('MB') / task.cpus
def fastqc_memory = memory_in_mb > 10000 ? 10000 : (memory_in_mb < 100 ? 100 : memory_in_mb)
"""
fastqc \\
$args \\
--threads $task.cpus \\
--memory $fastqc_memory \\
$reads
cat <<-END_VERSIONS > versions.yml
"${task.process}":
fastqc: \$( fastqc --version | sed '/FastQC v/!d; s/.*v//' )
END_VERSIONS
"""
stub:
def prefix = task.ext.prefix ?: "${meta.id}"
"""
touch ${prefix}.html
touch ${prefix}.zip
cat <<-END_VERSIONS > versions.yml
"${task.process}":
fastqc: 0.12.1
END_VERSIONS
"""
}
4.3. Module Metadata: meta.yml
# modules/my-org/examplemodule/meta.yml
name: examplemodule
description: "Brief description of what the module does"
keywords:
- fastqc
- quality-control
- sequencing
tools:
- fastqc:
description: "A quality control tool for high throughput sequence data"
homepage: "https://www.bioinformatics.babraham.ac.uk/projects/fastqc"
documentation: "https://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/"
tool_dev_url: "https://github.com/s-andrews/FastQC"
doi: "10.1093/bioinformatics/btp324"
licence: ["GPL v3"]
version: "0.12.1"
input:
- meta:
type: map
description: Metadata map with sample ID
pattern: "id"
- reads:
type: file
description: "Input FASTQ file"
pattern: "*.{fq,fastq}{,.gz}"
output:
- meta:
type: map
description: Metadata map
pattern: "id"
- html:
type: file
description: "FastQC HTML report"
pattern: "*.html"
authors:
- "@your-github-handle"
maintainers:
- "@your-github-handle"
4.4. Environment and Dependencies
# modules/my-org/examplemodule/environment.yml
name: examplemodule
channels:
- conda-forge
- bioconda
dependencies:
- bioconda::fastqc=0.12.1
- conda-forge::openjdk=17
4.5. Subworkflows: Composite Modules
Subworkflows combine multiple modules into reusable workflows:
subworkflows/
├── nf-core/
│ └── bam_qc/
│ ├── main.nf # Composite workflow
│ ├── meta.yml # Metadata
│ └── tests/
│ ├── main.nf.test
│ └── tags.yml
└── my-org/
└── variant_calling_pipeline/
// subworkflows/my-org/variant_calling_pipeline/main.nf
workflow VARIANT_CALLING_PIPELINE {
take:
bam
reference_fasta
main:
// Call multiple modules in sequence
SAMTOOLS_INDEX(bam)
BCFTOOLS_MPILEUP(bam, reference_fasta)
BCFTOOLS_CALL(BCFTOOLS_MPILEUP.out.vcf)
TABIX_BGZIP(BCFTOOLS_CALL.out.vcf)
emit:
vcf = TABIX_BGZIP.out.vcf
tbi = TABIX_BGZIP.out.tbi
}
5. Pre-commit Hooks: Automated Code Quality
5.1. What are Pre-commit Hooks?
Pre-commit hooks are scripts that run automatically before each git commit. They:
- Catch style issues before they're committed
- Enforce consistent formatting across the codebase
- Validate YAML, JSON, and Dockerfile syntax
- Prevent pushing broken code
5.2. The .pre-commit-config.yaml Setup
# .pre-commit-config.yaml
repos:
# Code formatter for YAML, JSON, Markdown, etc.
- repo: https://github.com/pre-commit/mirrors-prettier
rev: "v3.7.3"
hooks:
- id: prettier
entry: prettier --experimental-cli --write --ignore-unknown
exclude: |
(?x)^(
.*\.snap$
)$
additional_dependencies:
- prettier@3.7.3
# General file fixes
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v6.0.0
hooks:
- id: trailing-whitespace
args: [--markdown-linebreak-ext=md]
- id: end-of-file-fixer
# YAML/JSON Schema validation
- repo: https://github.com/python-jsonschema/check-jsonschema
rev: 0.35.0
hooks:
# Validate module meta.yml files
- id: check-jsonschema
name: "Validate meta.ymls (modules/my-org)"
files: ^modules/my-org/.*/meta\.yml$
types: [yaml]
args: ["--schemafile", "modules/yaml-schema.json"]
# Validate environment.yml files
- id: check-jsonschema
name: "Validate environment.ymls"
files: ^modules/my-org/.*/environment\.yml$
types: [yaml]
args: ["--schemafile", "modules/environment-schema.json"]
# Validate GitHub workflows
- id: check-github-workflows
# Python linting and formatting
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.14.7
hooks:
- id: ruff
files: \.py$
args: [--fix, --exit-non-zero-on-fix]
- id: ruff-format
# Dockerfile linting
- repo: https://github.com/hadolint/hadolint
rev: v2.14.0
hooks:
- id: hadolint-docker
args: ["--failure-threshold", "error"]
5.3. Installation and Usage
# Install pre-commit
pip install pre-commit
# Set up hooks in your repository
pre-commit install
# Manually run hooks on all files (useful for CI)
pre-commit run --all-files
# Skip hooks for a specific commit (if absolutely necessary)
git commit --no-verify
5.4. What Each Hook Does
| Hook | Purpose | Example |
|---|---|---|
| prettier | Formats YAML, JSON, Markdown to consistent style | Indentation, line length |
| trailing-whitespace | Removes trailing spaces | Cleans up editor artifacts |
| end-of-file-fixer | Ensures files end with newline | Fixes missing final newline |
| check-jsonschema | Validates YAML/JSON against schema | Ensures meta.yml structure |
| ruff | Python linting (import sorting, errors) | Fixes import order, catches undefined vars |
| ruff-format | Python code formatter | Consistent spacing, line length |
| hadolint-docker | Dockerfile linting | Catches Docker best practice violations |
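As a mental model for the simpler hooks, trailing-whitespace and end-of-file-fixer amount to a few lines of text normalization. A rough Python equivalent (not the hooks' actual implementation):

```python
def fix_text(text: str) -> str:
    """Approximate what trailing-whitespace and end-of-file-fixer do:
    strip trailing spaces on each line and ensure the file ends with
    exactly one newline."""
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(lines).rstrip("\n") + "\n"

print(repr(fix_text("name: demo   \nvalue: 1")))
# → 'name: demo\nvalue: 1\n'
```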
6. GitHub Actions CI/CD Workflows
6.1. Lint Workflow (lint.yml)
The lint workflow runs on every pull request to catch code quality issues:
# .github/workflows/lint.yml (simplified)
name: Run Linting
on:
pull_request:
branches: [main]
merge_group:
types: [checks_requested]
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
pre-commit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- name: Run pre-commit checks
uses: j178/prek-action@v1 # Runs pre-commit hooks
nf-core-changes:
runs-on: ubuntu-latest
outputs:
modules: ${{ steps.filter.outputs.modules }}
subworkflows: ${{ steps.filter.outputs.subworkflows }}
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 2
# Detect which files changed
- uses: dorny/paths-filter@v3
id: filter
with:
filters: |
modules:
- added|modified: 'modules/my-org/**'
subworkflows:
- added|modified: 'subworkflows/my-org/**'
# Extract module names from changed files
- name: Get module names
id: module_names
uses: actions/github-script@v8
with:
script: |
return [...new Set(${{ steps.filter.outputs.modules_files }}
.filter(x => x.endsWith('main.nf'))
.map(path => path.split('/')[2]))];
# Run nf-core linting on changed modules
- name: Run nf-core lint
uses: nf-core/lint-action@v2
with:
modules: ${{ steps.module_names.outputs.result }}
Key features:
- Concurrency control: Cancels previous runs when a new push happens
- Path filtering: Only lints files that actually changed
- Module detection: Automatically finds which modules were modified
- Pre-commit integration: Runs all pre-commit hooks
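The module-name extraction in the github-script step boils down to simple path manipulation. An equivalent Python sketch (assuming the same modules/&lt;org&gt;/&lt;module&gt;/main.nf layout; note that nested modules such as bwa/index would report their top-level directory, exactly as in the original step):

```python
def changed_module_names(changed_files: list[str]) -> list[str]:
    """Keep changed main.nf files under modules/ and return the unique
    module directory names, mirroring the github-script step above."""
    names = {
        path.split("/")[2]
        for path in changed_files
        if path.startswith("modules/") and path.endswith("main.nf")
    }
    return sorted(names)

files = [
    "modules/my-org/custom_qc/main.nf",
    "modules/my-org/custom_qc/meta.yml",
    "modules/my-org/align/main.nf",
]
print(changed_module_names(files))
# → ['align', 'custom_qc']
```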
6.2. nf-test Workflow (nf-test.yml)
The nf-test workflow runs module tests in parallel using intelligent sharding:
# .github/workflows/nf-test.yml (simplified)
name: Run nf-test
on:
push:
paths-ignore:
- "**/meta.yml"
pull_request:
branches: [main]
workflow_dispatch:
env:
NFT_VER: "0.9.3"
NXF_VER: "25.04.8"
NXF_ANSI_LOG: false
jobs:
nf-test-changes:
name: Detect changes and set shards
runs-on: ubuntu-latest
outputs:
modules: ${{ steps.components.outputs.modules }}
subworkflows: ${{ steps.components.outputs.subworkflows }}
shard: ${{ steps.set-shards.outputs.shard }}
total_shards: ${{ steps.set-shards.outputs.total_shards }}
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
# Detect nf-test files that changed
- name: List nf-test files
id: list
uses: adamrtalbot/detect-nf-test-changes@v0.0.6
with:
head: ${{ github.sha }}
base: ${{ github.event.pull_request.base.sha || 'origin/main' }}
exclude_tags: "gpu" # Exclude GPU tests
# Extract module and subworkflow names
- name: Get changed components
id: components
uses: actions/github-script@v8
with:
script: |
const paths = '${{ steps.list.outputs.components }}'.split('\n');
const modules = paths.filter(p => p.includes('modules/'));
const subworkflows = paths.filter(p => p.includes('subworkflows/'));
return { modules, subworkflows };
# Calculate shards for parallel execution
- name: Calculate shards
id: set-shards
uses: nf-core/get-shards-action@v1
with:
components: ${{ steps.list.outputs.components }}
total_shards: 4 # Run tests across 4 parallel jobs
# Run tests in parallel across multiple jobs
nf-test:
needs: nf-test-changes
runs-on: ubuntu-latest
strategy:
matrix:
shard: [1, 2, 3, 4]
steps:
- uses: actions/checkout@v6
- name: Run nf-test (shard ${{ matrix.shard }}/4)
uses: nf-core/nf-test-action@v1
with:
shard: ${{ matrix.shard }}
total_shards: 4
profile: docker
Key features:
- Intelligent sharding: Tests are automatically distributed across 4 parallel jobs
- Change detection: Only runs tests for modules that were actually modified
- Tag filtering: Excludes GPU tests on standard runners
- Docker profile: Tests run in Docker containers for consistency
6.3. How Sharding Works
Sharding divides tests across parallel jobs to reduce total runtime:
Pull Request with 100 tests
│
├─ Job 1 (Shard 1/4): Tests 1-25 [8 min]
├─ Job 2 (Shard 2/4): Tests 26-50 [8 min]
├─ Job 3 (Shard 3/4): Tests 51-75 [8 min]
└─ Job 4 (Shard 4/4): Tests 76-100 [8 min]
└─ Total Time: ~8 minutes (vs 32 minutes serial)
The get-shards-action automatically detects changed test files and assigns them to shards.
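A round-robin split captures the idea (the real get-shards-action may balance shards differently; this is only an illustration):

```python
def assign_shards(tests: list[str], total_shards: int) -> dict[int, list[str]]:
    """Distribute test files across shards round-robin, an illustrative
    sketch of the kind of split a sharding action computes."""
    shards: dict[int, list[str]] = {i: [] for i in range(1, total_shards + 1)}
    for idx, test in enumerate(tests):
        shards[idx % total_shards + 1].append(test)
    return shards

tests = [f"test_{i:03d}" for i in range(1, 101)]  # 100 tests
shards = assign_shards(tests, 4)
print([len(v) for v in shards.values()])
# → [25, 25, 25, 25]
```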
7. nf-test Framework and Testing Strategy
7.1. What is nf-test?
nf-test is a testing framework specifically designed for Nextflow. It provides:
- Snapshot testing: Compare outputs to previous golden outputs
- Input/output validation: Verify module behavior
- Plugin support: Specialized validators for BAM, VCF, and utility files
- Parallel execution: Run tests concurrently
- Integration with CI/CD: Seamless GitHub Actions integration
7.2. The nf-test.config File
// nf-test.config
config {
// Location of all nf-tests
testsDir "."
// Directory for temporary test files
workDir System.getenv("NFT_WORKDIR") ?: ".nf-test"
// Optional Nextflow config specific for tests
configFile "tests/config/nf-test.config"
// Profile to use for tests (docker, singularity, local)
profile ""
// Load testing plugins for specialized formats
plugins {
load "nft-bam@0.6.0" # BAM file validation
load "nft-utils@0.0.7" # Utility functions
load "nft-vcf@1.0.7" # VCF file validation
}
}
7.3. Writing a Test File
// modules/my-org/examplemodule/tests/main.nf.test
nextflow_process {
name "Test EXAMPLEMODULE"
script "../main.nf"
process "EXAMPLEMODULE"
test("test_fastqc_single_end") {
when {
process {
"""
input[0] = [
[id: 'test'],
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/fastq/test_1.fastq.gz')
]
"""
}
}
then {
assertAll(
{ assert process.success },
{ assert snapshot(process.out).match() }
)
}
}
test("test_fastqc_paired_end") {
when {
process {
"""
input[0] = [
[id: 'test'],
[
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/fastq/test_1.fastq.gz'),
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/fastq/test_2.fastq.gz')
]
]
"""
}
}
then {
assertAll(
{ assert process.success },
{ assert snapshot(process.out).match() },
{ assert file(process.out.html[0][1]).exists() }
)
}
}
}
7.4. Test Plugins for Specialized Validation
The nf-test plugins provide specialized validators:
// VCF validation example
then {
assertAll(
{ assert process.success },
{ assert file(process.out.vcf[0][1]).vcf.variantCount == 42 },
{ assert file(process.out.vcf[0][1]).vcf.header.samples.size() == 3 }
)
}
// BAM validation example
then {
assertAll(
{ assert process.success },
{ assert file(process.out.bam[0][1]).bam.readCount == 1000 },
{ assert file(process.out.bam[0][1]).bam.mapped == 950 }
)
}
7.5. Running Tests Locally
# Install nf-test
curl -fsSL https://get.nf-test.com | bash
export PATH=$PATH:~/.nf-test/bin
# Configure Nextflow for tests
export NXF_VER="25.04.8"
# Run all tests
nf-test test
# Run tests for a specific module
nf-test test modules/my-org/examplemodule/
# Run tests with Docker profile
nf-test test --profile docker
# Run with verbose output
nf-test test --verbose
# Update snapshots after intentional changes
nf-test test --update-snapshots
7.6. Snapshot Testing Workflow
1. First run:
└─ Test executes module process
└─ Outputs saved next to the test file (tests/main.nf.test.snap)
2. Subsequent runs:
└─ Test executes module again
└─ Outputs compared to snapshot
└─ If outputs match: ✓ Test passes
└─ If outputs differ: ✗ Test fails (intentional change?)
3. After code improvements:
└─ Developer reviews diffs
└─ If changes are good: nf-test test --update-snapshots
└─ Snapshots updated, tests pass again
Example snapshot file:
# modules/my-org/examplemodule/tests/main.nf.test.snap
{
"test_fastqc_single_end": {
"0": {
"0": {
"id": "test"
},
"1": [
{
"0": "test_fastqc.html"
},
{
"0": "test_fastqc.zip"
}
],
"2": [
{
"0": "versions.yml"
}
]
}
}
}
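On later runs, the comparison nf-test performs is essentially "parse the stored JSON and compare structures". A simplified Python sketch of that check (function name is illustrative):

```python
import json

def snapshot_matches(stored_snap: str, test_name: str, current: dict) -> bool:
    """Sketch of the check done on subsequent runs: parse the stored
    .snap JSON and compare the recorded outputs with the current ones."""
    stored = json.loads(stored_snap)
    return stored.get(test_name) == current

snap = '{"test_fastqc_single_end": {"0": [{"id": "test"}, "test_fastqc.html"]}}'
print(snapshot_matches(snap, "test_fastqc_single_end",
                       {"0": [{"id": "test"}, "test_fastqc.html"]}))
# → True
```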
8. Linting and Code Quality Tools
8.1. Ruff: Python Linting and Formatting
Ruff is a fast Python linter written in Rust that checks for:
- Import sorting and organization
- Unused imports
- Undefined names
- Code style violations
Configuration in ruff.toml:
# ruff.toml
line-length = 120
target-version = "py311"
[lint]
# Select which rules to enforce
select = ["I", "E1", "E4", "E7", "E9", "F", "UP", "N"]
# I: isort (import sorting)
# E1/E4/E7/E9: pycodestyle errors
# F: pyflakes (undefined names, etc.)
# UP: pyupgrade (modern Python syntax)
# N: pep8-naming (variable naming)
ignore = ["E501"] # Ignore line too long (handled by formatter)
[format]
quote-style = "double"
indent-style = "space"
line-ending = "auto"
Pre-commit runs ruff:
# Check for issues
ruff check modules/
# Auto-fix issues
ruff check --fix modules/
# Format code
ruff format modules/
8.2. Prettier: YAML and JSON Formatting
Prettier provides opinionated code formatting for configuration files:
# .prettierrc.yml
semi: true
singleQuote: false
trailingComma: all
bracketSpacing: true
tabWidth: 2
useTabs: false
Prettier enforces consistency in:
- YAML indentation
- JSON formatting
- Markdown line breaks
- Comment spacing
8.3. Hadolint: Dockerfile Linting
Hadolint catches Docker best practice violations:
# ❌ Bad (flagged by hadolint)
FROM ubuntu:latest
RUN apt-get update
RUN apt-get install -y python3
RUN apt-get install -y git
# ✅ Good (passes hadolint)
FROM ubuntu:22.04
RUN apt-get update && \
apt-get install -y --no-install-recommends \
python3 \
git && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
Common hadolint rules:
- DL3006: Pin base images to specific versions instead of using latest
- DL3009: Delete the apt-get lists to reduce layer size
- DL3015: Avoid additional packages by specifying --no-install-recommends
- SC2086: Quote variables in shell commands
8.4. JSON Schema Validation
The template validates all meta.yml files against a JSON Schema:
// modules/yaml-schema.json (simplified)
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "nf-core module schema",
"type": "object",
"required": ["name", "description", "input", "output"],
"properties": {
"name": {
"type": "string",
"description": "Module name"
},
"description": {
"type": "string",
"description": "Module description"
},
"keywords": {
"type": "array",
"items": { "type": "string" }
},
"tools": {
"type": "object",
"description": "Tools used in the module"
},
"input": {
"type": "array",
"description": "Input ports"
},
"output": {
"type": "array",
"description": "Output ports"
}
}
}
Validation happens in pre-commit:
check-jsonschema --schemafile modules/yaml-schema.json modules/my-org/*/meta.yml
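The required-fields portion of that validation is easy to reason about. A tiny stand-in (the real check is the full JSON Schema evaluation performed by check-jsonschema):

```python
def check_required(meta: dict, required: list[str]) -> list[str]:
    """Return the required keys missing from a parsed meta.yml mapping.
    Only illustrates the 'required' part of JSON Schema validation."""
    return [key for key in required if key not in meta]

meta = {"name": "examplemodule", "description": "QC module", "input": [], "output": []}
print(check_required(meta, ["name", "description", "input", "output"]))
# → []
print(check_required({"name": "x"}, ["name", "description", "input", "output"]))
# → ['description', 'input', 'output']
```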
9. Putting It All Together: End-to-End Workflow
9.1. Creating a New Module Repository
# Step 1: Install tools
pipx install copier
# Step 2: Generate repository from template
copier copy gh:nf-core/modules-template ./my-bioinformatics-modules
# Follow prompts:
# repo_host: https://github.com
# repo_org_name: my-research-lab
# short_org_name: mylab
# ci: github
# repo_name: bioinformatics-modules
# description: Nextflow modules for genomic analysis
# copyright_holder: My Research Lab
# default_branch: main
# Step 3: Navigate to generated repository
cd my-bioinformatics-modules
# Step 4: Initialize as git repository
bash ./project_init.sh
# Step 5: Set up pre-commit hooks
pip install pre-commit
pre-commit install
# Step 6: Push to GitHub
git remote add origin https://github.com/my-research-lab/bioinformatics-modules.git
git branch -M main
git push -u origin main
9.2. Creating a New Module
# Copy the example module structure
mkdir -p modules/mylab/my_custom_process
cp -r modules/mylab/examplemodule/{main.nf,meta.yml,environment.yml,tests} modules/mylab/my_custom_process/
# Edit the process
vim modules/mylab/my_custom_process/main.nf
# Edit metadata
vim modules/mylab/my_custom_process/meta.yml
# Edit dependencies
vim modules/mylab/my_custom_process/environment.yml
# Write tests
vim modules/mylab/my_custom_process/tests/main.nf.test
9.3. Workflow: Add → Test → Pre-commit → Push → CI/CD
# 1. Stage changes
git add modules/mylab/my_custom_process/
# 2. Pre-commit hooks run automatically
# ✓ Prettier formats YAML
# ✓ Ruff sorts imports and checks Python
# ✓ JSON schema validates meta.yml
# ✓ Hadolint checks Dockerfile (if any)
# If issues found, they're fixed. Review and re-add.
# 3. Commit
git commit -m "feat: add my_custom_process module"
# 4. Push a feature branch and open a pull request
git push origin feat/add-my-custom-process
# 5. GitHub Actions CI/CD runs:
# a. Lint workflow:
# - Pre-commit checks
# - nf-core linting
# - YAML/JSON validation
# b. nf-test workflow:
# - Detects changed tests
# - Runs tests across 4 shards in parallel
# - Validates outputs via snapshots
# 6. Pull request review
# ✓ All checks passed
# ✓ Module is ready for use
9.4. Using Modules in a Workflow
// workflows/my_pipeline.nf
// Include modules
include { FASTQC } from '../modules/nf-core/fastqc/main'
include { BWA_INDEX } from '../modules/nf-core/bwa/index/main'
include { BWA_MEM } from '../modules/nf-core/bwa/mem/main'
include { MY_CUSTOM_PROCESS } from '../modules/mylab/my_custom_process/main'
workflow {
    ch_input = Channel.fromFilePairs("data/*_{1,2}.fastq.gz")
FASTQC(ch_input)
reference_fasta = file("reference.fa")
BWA_INDEX(reference_fasta)
BWA_MEM(ch_input, BWA_INDEX.out.index)
MY_CUSTOM_PROCESS(BWA_MEM.out.bam)
}
10. Key Takeaways
- Copier enables reproducible scaffolding — Create consistent module repositories with a single command and interactive prompts
- Repository organization enforces standards — Modules and subworkflows follow a defined structure with required metadata
- Pre-commit hooks catch issues early — Linting, formatting, and validation run automatically before commits
- GitHub Actions automate CI/CD — Lint and test workflows run on every PR, with intelligent test sharding
- nf-test provides reliable testing — Snapshot testing, BAM/VCF validation, and parallel execution
- Code quality tools maintain consistency — Ruff, Prettier, and Hadolint enforce best practices
- Everything works together — Copier + structure + pre-commit + CI/CD + nf-test = production-ready modules
By using the nf-modules-template, teams can:
- Onboard new developers quickly
- Maintain consistent code quality across modules
- Catch bugs before they reach production
- Share modules reliably across the ecosystem
- Scale to hundreds of modules with confidence
References
- nf-core/modules-template — Official template repository
- Copier Documentation — Template scaffolding tool
- nf-test Documentation — Testing framework for Nextflow
- GitHub Actions Documentation — CI/CD automation
- nf-core/modules — Community module repository
- Nextflow Documentation — Workflow engine
- Ruff Documentation — Python linter and formatter
- Prettier Documentation — Code formatter
- Hadolint Documentation — Dockerfile linter