
Scalable Nextflow Modules: Building a Template with Copier, CI/CD, and nf-test

· 19 min read
Thanh-Giang Tan Nguyen
Founder at G Labs

Creating and maintaining a library of reusable Nextflow modules is a significant challenge for bioinformatics teams. Without a consistent structure, code quality standards, and automated testing, modules quickly become difficult to share, validate, and integrate into pipelines. The nf-modules-template solves this by providing a production-ready template that uses Copier to scaffold new module repositories, GitHub Actions for automated CI/CD workflows, pre-commit hooks for code quality, and nf-test with intelligent sharding for scalable module testing. This post explores how these technologies work together to enable reproducible, maintainable Nextflow module libraries.

1. The Problem: Fragmented Module Management

1.1. Why Module Repositories Matter

In complex bioinformatics workflows, teams often develop dozens or hundreds of reusable Nextflow processes. Without a centralized, well-structured module repository:

  • Developers duplicate code across projects (violating DRY principles)
  • Modules lack standardized documentation and metadata
  • Testing is ad-hoc or missing entirely
  • Code quality varies significantly between modules
  • Integration with other pipelines is error-prone
  • Version management and dependency tracking become chaotic

1.2. The Traditional Approach (Manual and Error-Prone)

Without a template, setting up a new module repository requires:

# Manual directory creation
mkdir -p my-modules/{modules,subworkflows,tests,docs,.github/workflows}

# Manual configuration file creation
touch {.pre-commit-config.yaml,ruff.toml,nf-test.config,Makefile}

# Manual GitHub workflow setup (copy-paste from other projects)
touch .github/workflows/lint.yml
touch .github/workflows/nf-test.yml

# Manual CI/CD pipeline configuration
# ... dozens of manual steps with many opportunities for error

Problems:

  • Inconsistency across different module repositories
  • Outdated workflows in old projects
  • Difficult to enforce standards across teams
  • Onboarding new developers takes time and is error-prone
  • Changes to best practices require manual updates everywhere

1.3. The Better Approach: Copier + Templates

# One command to create a fully configured module repository
copier copy gh:nf-core/modules-template ./my-new-modules-library

# Follow interactive prompts
# → Repo host? GitHub
# → Organization name? my-org
# → License? MIT License
# → ... (5-10 quick prompts)

# One command to initialize
cd my-new-modules-library
bash ./project_init.sh

# Fully configured repository with:
# ✓ Pre-commit hooks (ruff, prettier, hadolint, jsonschema)
# ✓ GitHub Actions workflows (lint + nf-test)
# ✓ nf-test framework with sharding support
# ✓ Module structure and examples
# ✓ Documentation setup (MkDocs)
# ✓ Contributing guidelines

2. Understanding the Architecture

2.1. The Technology Stack

┌──────────────────────────────────────────────────────┐
│            nf-modules-template Repository            │
├──────────────────────────────────────────────────────┤
│                                                      │
│  ┌────────────────────────────────────────────────┐  │
│  │ Copier Configuration (copier.yml)              │  │
│  │  - Template parameters                         │  │
│  │  - Jinja2 templating                           │  │
│  │  - Interactive prompts                         │  │
│  └────────────────────────────────────────────────┘  │
│                          ↓                           │
│  ┌────────────────────────────────────────────────┐  │
│  │ Generated Repository                           │  │
│  │  - Project structure                           │  │
│  │  - Module templates                            │  │
│  │  - Workflows                                   │  │
│  │  - Configurations                              │  │
│  └────────────────────────────────────────────────┘  │
│                          ↓                           │
│  ┌────────────────────────────────────────────────┐  │
│  │ Code Quality (Pre-commit Hooks)                │  │
│  │  - Ruff (Python linting)                       │  │
│  │  - Prettier (YAML/JSON formatting)             │  │
│  │  - Hadolint (Dockerfile validation)            │  │
│  │  - JSON Schema validation                      │  │
│  └────────────────────────────────────────────────┘  │
│                          ↓                           │
│  ┌────────────────────────────────────────────────┐  │
│  │ CI/CD Workflows (GitHub Actions)               │  │
│  │  ┌──────────────────────────────────────────┐  │  │
│  │  │ Lint Workflow (PR/push)                  │  │  │
│  │  │  - Pre-commit linting                    │  │  │
│  │  │  - nf-core linting                       │  │  │
│  │  │  - YAML/JSON schema validation           │  │  │
│  │  └──────────────────────────────────────────┘  │  │
│  │  ┌──────────────────────────────────────────┐  │  │
│  │  │ nf-test Workflow (parallelized)          │  │  │
│  │  │  - Shard detection (get-shards action)   │  │  │
│  │  │  - Parallel test execution               │  │  │
│  │  │  - BAM/VCF/Utils plugins                 │  │  │
│  │  └──────────────────────────────────────────┘  │  │
│  └────────────────────────────────────────────────┘  │
│                          ↓                           │
│  ┌────────────────────────────────────────────────┐  │
│  │ Module/Subworkflow (nf-test Framework)         │  │
│  │  - Snapshot testing                            │  │
│  │  - Input/output validation                     │  │
│  │  - Plugin support (BAM, VCF)                   │  │
│  └────────────────────────────────────────────────┘  │
│                                                      │
└──────────────────────────────────────────────────────┘

2.2. Repository Structure

A generated module repository looks like:

my-modules/
├── .github/
│   ├── workflows/
│   │   ├── lint.yml                # Linting checks
│   │   └── nf-test.yml             # Testing workflow
│   ├── CONTRIBUTING.md             # Contributing guidelines
│   └── ISSUE_TEMPLATE/             # Issue templates
├── modules/
│   ├── nf-core/                    # nf-core modules (shared)
│   │   └── examplemodule/
│   │       ├── main.nf             # Process definition
│   │       ├── meta.yml            # Module metadata
│   │       ├── environment.yml     # Conda environment
│   │       ├── tests/
│   │       │   ├── main.nf.test.jinja  # Test file
│   │       │   └── tags.yml        # Test tags
│   │       └── nextflow.config     # Module config
│   └── my-org/                     # Organization-specific modules
│       └── custommodule/           # Your custom modules follow the same structure
├── subworkflows/                   # Composite workflows
│   ├── nf-core/
│   └── my-org/
├── tests/
│   ├── config/
│   │   ├── nf-test.config          # Test configuration
│   │   └── pytest_modules.yml      # Pytest config
│   ├── pytest.ini
│   └── demo_nextflow.config
├── .pre-commit-config.yaml         # Pre-commit hooks configuration
├── .nf-core.yml                    # nf-core configuration
├── nf-test.config                  # nf-test framework config
├── ruff.toml                       # Python linter config
├── .prettierignore & .prettierrc.yml  # Code formatter config
├── mkdocs.yml                      # Documentation config
├── main.nf                         # Example main workflow
├── nextflow.config                 # Main Nextflow config
├── Makefile                        # Development commands
├── README.md                       # Project documentation
└── LICENSE                         # License file

3. Copier: Templating System

3.1. What is Copier?

Copier is a template engine and scaffolding tool that:

  • Uses Jinja2 templating for dynamic file generation
  • Supports interactive prompts for collecting project metadata
  • Can be updated incrementally (regenerate from updated templates)
  • Works with version control systems

3.2. The copier.yml Configuration

The template's copier.yml defines prompts and template variables:

repo_host:
  type: str
  help: "What is the host of your code repository?"
  default: "https://github.com"
  choices:
    GitHub: "https://github.com"
    GitLab: "https://gitlab.com"

repo_org_name:
  type: str
  help: "What is your organization/user name?"
  placeholder: "demo-org"
  required: true
  validator: >-
    {% if not (repo_org_name | regex_search('^[a-zA-Z][a-zA-Z0-9\-_]+$')) %}
    repo_org_name must start with a letter...
    {% endif %}

short_org_name:
  type: str
  help: "What is your abbreviated org name?"
  default: "{{ repo_org_name }}"
  validator: >-
    {% if not (short_org_name | regex_search('^[a-z\-_]+$')) %}
    short_org_name must be lowercase...
    {% endif %}

ci:
  type: str
  help: "What CI provider will you use?"
  default: "github"
  choices:
    GitHub Actions: "github"
    None: "none"
  when: "{{ repo_host == 'https://github.com' }}"

copyright_license:
  type: str
  help: "Your project's license"
  default: "MIT License"
  choices:
    - "MIT License"
    - "Apache License 2.0"
    - "GPL v3.0"
    # ... many more options
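Those validator patterns are plain regular expressions, so you can try them out directly before committing to them. A quick Python check (the sample names here are made up for illustration):

```python
import re

# The two validator patterns from copier.yml, checked the same way the
# Jinja2 `regex_search` filter would apply them.
ORG_RE = re.compile(r'^[a-zA-Z][a-zA-Z0-9\-_]+$')   # repo_org_name
SHORT_RE = re.compile(r'^[a-z\-_]+$')               # short_org_name

for name in ["my-org", "9lab", "My_Lab"]:
    print(name, bool(ORG_RE.match(name)))
# my-org True / 9lab False / My_Lab True

for name in ["mylab", "my-org", "MyOrg"]:
    print(name, bool(SHORT_RE.match(name)))
# mylab True / my-org True / MyOrg False
```

Note that `short_org_name` is stricter: it rejects uppercase letters and digits, since it is interpolated into directory names and file-matching regexes.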

3.3. Using Copier to Generate a Repository

# Install Copier
pipx install copier

# Generate from the template
copier copy --vcs-ref main gh:nf-core/modules-template ./my-modules

# Interactive prompts:
# repo_host? [https://github.com]:
# repo_org_name? [demo-org]: my-org
# short_org_name? [my-org]: myorg
# ci? [github]: github
# repo_name? [modules]: bioinformatics-modules
# description? [An nf-core modules repository...]: My custom Nextflow modules
# copyright_holder? [Author or Organization Name]: My Organization
# default_branch? [main]: main

3.4. Jinja2 Templating in Action

Files with a .jinja extension are templates that Copier renders at generation time:

# .pre-commit-config.yaml.jinja
repos:
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: "v3.1.0"
    hooks:
      - id: prettier

  - repo: https://github.com/python-jsonschema/check-jsonschema
    rev: 0.35.0
    hooks:
      - id: check-jsonschema
        # Validate modules for {{ short_org_name }}
        files: ^modules/{{ short_org_name }}/.*/meta\.yml$
        args: ["--schemafile", "modules/yaml-schema.json"]

When rendered with short_org_name = "myorg", this becomes:

# .pre-commit-config.yaml (rendered)
repos:
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: "v3.1.0"
    hooks:
      - id: prettier

  - repo: https://github.com/python-jsonschema/check-jsonschema
    rev: 0.35.0
    hooks:
      - id: check-jsonschema
        # Validate modules for myorg
        files: ^modules/myorg/.*/meta\.yml$
        args: ["--schemafile", "modules/yaml-schema.json"]

Conditional templating also works:

# .github/workflows folder structure
{% if 'github' in repo_host %}.github{% endif %}/
{% if ci == 'github' %}actions{% endif %}/

This creates .github/actions/ only for GitHub-based repositories.
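For intuition, here is a toy Python renderer that mimics just the two Jinja2 features used above: variable substitution and if-blocks. Copier uses the real Jinja2 engine; this is only an illustration of what rendering does.

```python
import re

def render(template: str, ctx: dict) -> str:
    """Toy renderer for {{ var }} substitution and {% if expr %}...{% endif %}.
    (eval is acceptable in a demo like this, never for untrusted input.)"""
    # Resolve conditional blocks: keep the body only if the expression is truthy.
    def _if(m):
        return m.group(2) if eval(m.group(1), {}, ctx) else ""
    out = re.sub(r"{%\s*if\s+(.+?)\s*%}(.*?){%\s*endif\s*%}", _if, template)
    # Substitute {{ var }} placeholders from the context.
    return re.sub(r"{{\s*(\w+)\s*}}", lambda m: str(ctx[m.group(1)]), out)

ctx = {"repo_host": "https://github.com", "ci": "github", "short_org_name": "myorg"}
print(render("{% if 'github' in repo_host %}.github{% endif %}/", ctx))  # .github/
print(render("^modules/{{ short_org_name }}/.*/meta\\.yml$", ctx))
```

The second call reproduces the rendered `files:` pattern shown earlier, `^modules/myorg/.*/meta\.yml$`.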


4. Repository Organization: Modules and Subworkflows

4.1. Modules Directory Structure

Modules are self-contained processes with their own tests and documentation:

modules/
├── nf-core/                  # Official nf-core modules
│   ├── bwa/
│   │   ├── index/
│   │   │   ├── main.nf       # Module process
│   │   │   ├── meta.yml      # Metadata
│   │   │   ├── environment.yml  # Dependencies
│   │   │   └── tests/
│   │   │       ├── main.nf.test.jinja
│   │   │       ├── tags.yml
│   │   │       └── nextflow.config
│   │   └── mem/
│   └── fastqc/
├── my-org/                   # Organization-specific modules
│   └── custom_qc/
│       ├── main.nf
│       ├── meta.yml
│       ├── environment.yml
│       └── tests/
└── yaml-schema.json          # Schema for validating meta.yml

4.2. Module Anatomy: A Complete Example

// modules/my-org/examplemodule/main.nf
process EXAMPLEMODULE {
    tag "$meta.id"
    label 'process_medium'

    // Conda environment for reproducibility
    conda "${moduleDir}/environment.yml"

    // Container support (Docker or Singularity)
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/fastqc:0.12.1--hdfd78af_0' :
        'biocontainers/fastqc:0.12.1--hdfd78af_0' }"

    // Inputs with metadata
    input:
    tuple val(meta), path(reads)

    // Outputs with emit labels
    output:
    tuple val(meta), path("*.html"), emit: html
    tuple val(meta), path("*.zip") , emit: zip
    path "versions.yml"            , emit: versions

    // Conditional execution
    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"
    def memory_in_mb = MemoryUnit.of("${task.memory}").toUnit('MB') / task.cpus
    def fastqc_memory = memory_in_mb > 10000 ? 10000 : (memory_in_mb < 100 ? 100 : memory_in_mb)

    """
    fastqc \\
        $args \\
        --threads $task.cpus \\
        --memory $fastqc_memory \\
        $reads

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        fastqc: \$( fastqc --version | sed '/FastQC v/!d; s/.*v//' )
    END_VERSIONS
    """

    stub:
    def prefix = task.ext.prefix ?: "${meta.id}"
    """
    touch ${prefix}.html
    touch ${prefix}.zip
    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        fastqc: 0.12.1
    END_VERSIONS
    """
}
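The memory arithmetic in the script block divides the task's allocation across threads and clamps the per-thread value to the range FastQC accepts. The same calculation in Python, with illustrative inputs:

```python
def fastqc_memory(task_memory_mb: int, cpus: int) -> int:
    """Mirror of the clamp in the module's script block: divide task memory
    across threads, then keep the result within FastQC's 100-10000 MB range."""
    memory_in_mb = task_memory_mb // cpus
    return min(10000, max(100, memory_in_mb))

print(fastqc_memory(36864, 6))   # 6144  -> within range, used as-is
print(fastqc_memory(131072, 2))  # 10000 -> 65536 clamped down
print(fastqc_memory(512, 8))     # 100   -> 64 raised to the floor
```

This kind of defensive arithmetic keeps a module portable: it behaves sensibly whether the scheduler grants 512 MB or 128 GB.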

4.3. Module Metadata: meta.yml

# modules/my-org/examplemodule/meta.yml
name: examplemodule
description: "Brief description of what the module does"
keywords:
  - fastqc
  - quality-control
  - sequencing

tools:
  - fastqc:
      description: "A quality control tool for high throughput sequence data"
      homepage: "https://www.bioinformatics.babraham.ac.uk/projects/fastqc"
      documentation: "https://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/"
      tool_dev_url: "https://github.com/s-andrews/FastQC"
      doi: "10.1093/bioinformatics/btp324"
      licence: ["GPL v3"]
      version: "0.12.1"

input:
  - meta:
      type: map
      description: Metadata map with sample ID
      pattern: "id"
  - reads:
      type: file
      description: "Input FASTQ file"
      pattern: "*.{fq,fastq}{,.gz}"

output:
  - meta:
      type: map
      description: Metadata map
      pattern: "id"
  - html:
      type: file
      description: "FastQC HTML report"
      pattern: "*.html"

authors:
  - "@your-github-handle"
maintainers:
  - "@your-github-handle"

4.4. Environment and Dependencies

# modules/my-org/examplemodule/environment.yml
name: examplemodule
channels:
  - conda-forge
  - bioconda
dependencies:
  - bioconda::fastqc=0.12.1
  - conda-forge::openjdk=17

4.5. Subworkflows: Composite Modules

Subworkflows combine multiple modules into reusable workflows:

subworkflows/
├── nf-core/
│   └── bam_qc/
│       ├── main.nf           # Composite workflow
│       ├── meta.yml          # Metadata
│       └── tests/
│           ├── main.nf.test
│           └── tags.yml
└── my-org/
    └── variant_calling_pipeline/

// subworkflows/my-org/variant_calling_pipeline/main.nf
workflow VARIANT_CALLING_PIPELINE {
    take:
    bam
    reference_fasta

    main:
    // Call multiple modules in sequence
    SAMTOOLS_INDEX(bam)
    BCFTOOLS_MPILEUP(bam, reference_fasta)
    BCFTOOLS_CALL(BCFTOOLS_MPILEUP.out.vcf)
    TABIX_BGZIP(BCFTOOLS_CALL.out.vcf)

    emit:
    vcf = TABIX_BGZIP.out.vcf
    tbi = TABIX_BGZIP.out.tbi
}

5. Pre-commit Hooks: Automated Code Quality

5.1. What are Pre-commit Hooks?

Pre-commit hooks are scripts that run automatically before each git commit. They:

  • Catch style issues before they're committed
  • Enforce consistent formatting across the codebase
  • Validate YAML, JSON, and Dockerfile syntax
  • Prevent pushing broken code

5.2. The .pre-commit-config.yaml Setup

# .pre-commit-config.yaml
repos:
  # Code formatter for YAML, JSON, Markdown, etc.
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: "v3.7.3"
    hooks:
      - id: prettier
        entry: prettier --experimental-cli --write --ignore-unknown
        exclude: |
          (?x)^(
            .*\.snap$
          )$
        additional_dependencies:
          - prettier@3.7.3

  # General file fixes
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v6.0.0
    hooks:
      - id: trailing-whitespace
        args: [--markdown-linebreak-ext=md]
      - id: end-of-file-fixer

  # YAML/JSON Schema validation
  - repo: https://github.com/python-jsonschema/check-jsonschema
    rev: 0.35.0
    hooks:
      # Validate module meta.yml files
      - id: check-jsonschema
        name: "Validate meta.ymls (modules/my-org)"
        files: ^modules/my-org/.*/meta\.yml$
        types: [yaml]
        args: ["--schemafile", "modules/yaml-schema.json"]

      # Validate environment.yml files
      - id: check-jsonschema
        name: "Validate environment.ymls"
        files: ^modules/my-org/.*/environment\.yml$
        types: [yaml]
        args: ["--schemafile", "modules/environment-schema.json"]

      # Validate GitHub workflows
      - id: check-github-workflows

  # Python linting and formatting
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.14.7
    hooks:
      - id: ruff
        files: \.py$
        args: [--fix, --exit-non-zero-on-fix]
      - id: ruff-format

  # Dockerfile linting
  - repo: https://github.com/hadolint/hadolint
    rev: v2.14.0
    hooks:
      - id: hadolint-docker
        args: ["--failure-threshold", "error"]

5.3. Installation and Usage

# Install pre-commit
pip install pre-commit

# Set up hooks in your repository
pre-commit install

# Manually run hooks on all files (useful for CI)
pre-commit run --all-files

# Skip hooks for a specific commit (if absolutely necessary)
git commit --no-verify

5.4. What Each Hook Does

| Hook                | Purpose                                        | Example                                    |
| ------------------- | ---------------------------------------------- | ------------------------------------------ |
| prettier            | Formats YAML, JSON, Markdown to a consistent style | Indentation, line length               |
| trailing-whitespace | Removes trailing spaces                        | Cleans up editor artifacts                 |
| end-of-file-fixer   | Ensures files end with a newline               | Fixes missing final newline                |
| check-jsonschema    | Validates YAML/JSON against a schema           | Ensures meta.yml structure                 |
| ruff                | Python linting (import sorting, errors)        | Fixes import order, catches undefined vars |
| ruff-format         | Python code formatter                          | Consistent spacing, line length            |
| hadolint-docker     | Dockerfile linting                             | Catches Docker best-practice violations    |

6. GitHub Actions CI/CD Workflows

6.1. Lint Workflow (lint.yml)

The lint workflow runs on every pull request to catch code quality issues:

# .github/workflows/lint.yml (simplified)
name: Run Linting
on:
  pull_request:
    branches: [main]
  merge_group:
    types: [checks_requested]
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - name: Run pre-commit checks
        uses: j178/prek-action@v1 # Runs pre-commit hooks

  nf-core-changes:
    runs-on: ubuntu-latest
    outputs:
      modules: ${{ steps.filter.outputs.modules }}
      subworkflows: ${{ steps.filter.outputs.subworkflows }}
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 2

      # Detect which files changed
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            modules:
              - added|modified: 'modules/my-org/**'
            subworkflows:
              - added|modified: 'subworkflows/my-org/**'

      # Extract module names from changed files
      - name: Get module names
        id: module_names
        uses: actions/github-script@v8
        with:
          script: |
            return [...new Set(${{ steps.filter.outputs.modules_files }}
              .filter(x => x.endsWith('main.nf'))
              .map(path => path.split('/')[2]))];

      # Run nf-core linting on changed modules
      - name: Run nf-core lint
        uses: nf-core/lint-action@v2
        with:
          modules: ${{ steps.module_names.outputs.result }}

Key features:

  • Concurrency control: Cancels previous runs when a new push happens
  • Path filtering: Only lints files that actually changed
  • Module detection: Automatically finds which modules were modified
  • Pre-commit integration: Runs all pre-commit hooks
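The github-script step that extracts module names boils down to a small path transformation. A Python equivalent of the same logic (the file list here is illustrative):

```python
def changed_module_names(paths):
    """From a list of changed files, keep the main.nf entries and dedupe the
    module name, which is the third path segment in
    modules/<org>/<module>/main.nf."""
    return sorted({p.split("/")[2] for p in paths if p.endswith("main.nf")})

changed = [
    "modules/my-org/custom_qc/main.nf",
    "modules/my-org/custom_qc/meta.yml",
    "modules/my-org/aligner/main.nf",
    "modules/my-org/aligner/tests/main.nf.test",
]
print(changed_module_names(changed))  # ['aligner', 'custom_qc']
```

Deduping matters because a single module change usually touches several files (main.nf, meta.yml, tests), but the linter should run once per module.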

6.2. nf-test Workflow (nf-test.yml)

The nf-test workflow runs module tests in parallel using intelligent sharding:

# .github/workflows/nf-test.yml (simplified)
name: Run nf-test
on:
  push:
    paths-ignore:
      - "**/meta.yml"
  pull_request:
    branches: [main]
  workflow_dispatch:

env:
  NFT_VER: "0.9.3"
  NXF_VER: "25.04.8"
  NXF_ANSI_LOG: false

jobs:
  nf-test-changes:
    name: Detect changes and set shards
    runs-on: ubuntu-latest
    outputs:
      modules: ${{ steps.components.outputs.modules }}
      subworkflows: ${{ steps.components.outputs.subworkflows }}
      shard: ${{ steps.set-shards.outputs.shard }}
      total_shards: ${{ steps.set-shards.outputs.total_shards }}
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 0

      # Detect nf-test files that changed
      - name: List nf-test files
        id: list
        uses: adamrtalbot/detect-nf-test-changes@v0.0.6
        with:
          head: ${{ github.sha }}
          base: ${{ github.event.pull_request.base.sha || 'origin/main' }}
          exclude_tags: "gpu" # Exclude GPU tests

      # Extract module and subworkflow names
      - name: Get changed components
        id: components
        uses: actions/github-script@v8
        with:
          script: |
            const paths = '${{ steps.list.outputs.components }}'.split('\n');
            const modules = paths.filter(p => p.includes('modules/'));
            const subworkflows = paths.filter(p => p.includes('subworkflows/'));
            return { modules, subworkflows };

      # Calculate shards for parallel execution
      - name: Calculate shards
        id: set-shards
        uses: nf-core/get-shards-action@v1
        with:
          components: ${{ steps.list.outputs.components }}
          total_shards: 4 # Run tests across 4 parallel jobs

  # Run tests in parallel across multiple jobs
  nf-test:
    needs: nf-test-changes
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v6

      - name: Run nf-test (shard ${{ matrix.shard }}/4)
        uses: nf-core/nf-test-action@v1
        with:
          shard: ${{ matrix.shard }}
          total_shards: 4
          profile: docker

Key features:

  • Intelligent sharding: Tests are automatically distributed across 4 parallel jobs
  • Change detection: Only runs tests for modules that were actually modified
  • Tag filtering: Excludes GPU tests on standard runners
  • Docker profile: Tests run in Docker containers for consistency

6.3. How Sharding Works

Sharding divides tests across parallel jobs to reduce total runtime:

Pull Request with 100 tests

├─ Job 1 (Shard 1/4): Tests 1-25    [8 min]
├─ Job 2 (Shard 2/4): Tests 26-50   [8 min]
├─ Job 3 (Shard 3/4): Tests 51-75   [8 min]
└─ Job 4 (Shard 4/4): Tests 76-100  [8 min]

Total time: ~8 minutes (vs ~32 minutes serial)

The get-shards-action automatically detects changed test files and assigns them to shards.
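The idea behind the split can be sketched in a few lines of Python. This is a plain round-robin for illustration only; the actual get-shards-action decides its own distribution:

```python
def assign_shards(tests, total_shards):
    """Distribute test files across shards round-robin, so each parallel
    CI job receives a near-equal share of the work."""
    shards = [[] for _ in range(total_shards)]
    for i, test in enumerate(tests):
        shards[i % total_shards].append(test)
    return shards

tests = [f"test_{n:03d}" for n in range(1, 101)]  # 100 hypothetical tests
shards = assign_shards(tests, 4)
print([len(s) for s in shards])  # [25, 25, 25, 25]
```

With uniform test runtimes this yields the ~4x speedup shown above; in practice a few slow tests can skew one shard, which is why keeping individual module tests small pays off.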


7. nf-test Framework and Testing Strategy

7.1. What is nf-test?

nf-test is a testing framework specifically designed for Nextflow. It provides:

  • Snapshot testing: Compare outputs to previous golden outputs
  • Input/output validation: Verify module behavior
  • Plugin support: Specialized validators for BAM, VCF, and utility files
  • Parallel execution: Run tests concurrently
  • Integration with CI/CD: Seamless GitHub Actions integration

7.2. The nf-test.config File

// nf-test.config
config {
    // Location of all nf-tests
    testsDir "."

    // Directory for temporary test files
    workDir System.getenv("NFT_WORKDIR") ?: ".nf-test"

    // Optional Nextflow config specific for tests
    configFile "tests/config/nf-test.config"

    // Profile to use for tests (docker, singularity, local)
    profile ""

    // Load testing plugins for specialized formats
    plugins {
        load "nft-bam@0.6.0"   // BAM file validation
        load "nft-utils@0.0.7" // Utility functions
        load "nft-vcf@1.0.7"   // VCF file validation
    }
}

7.3. Writing a Test File

// modules/my-org/examplemodule/tests/main.nf.test
nextflow_process {

    name "Test EXAMPLEMODULE"
    script "../main.nf"
    process "EXAMPLEMODULE"

    test("test_fastqc_single_end") {

        when {
            process {
                """
                input[0] = [
                    [id: 'test'],
                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/fastq/test_1.fastq.gz')
                ]
                """
            }
        }

        then {
            assertAll(
                { assert process.success },
                { assert snapshot(process.out).match() }
            )
        }
    }

    test("test_fastqc_paired_end") {

        when {
            process {
                """
                input[0] = [
                    [id: 'test'],
                    [
                        file(params.modules_testdata_base_path + 'genomics/homo_sapiens/fastq/test_1.fastq.gz'),
                        file(params.modules_testdata_base_path + 'genomics/homo_sapiens/fastq/test_2.fastq.gz')
                    ]
                ]
                """
            }
        }

        then {
            assertAll(
                { assert process.success },
                { assert snapshot(process.out).match() },
                { assert file(process.out.html[0][1]).exists() }
            )
        }
    }
}

7.4. Test Plugins for Specialized Validation

The nf-test plugins provide specialized validators:

// VCF validation example
then {
    assertAll(
        { assert process.success },
        { assert file(process.out.vcf[0][1]).vcf.variantCount == 42 },
        { assert file(process.out.vcf[0][1]).vcf.header.samples.size() == 3 }
    )
}

// BAM validation example
then {
    assertAll(
        { assert process.success },
        { assert file(process.out.bam[0][1]).bam.readCount == 1000 },
        { assert file(process.out.bam[0][1]).bam.mapped == 950 }
    )
}

7.5. Running Tests Locally

# Install nf-test
curl -fsSL https://get.nf-test.com | bash
export PATH=$PATH:~/.nf-test/bin

# Configure Nextflow for tests
export NXF_VER="25.04.8"

# Run all tests
nf-test test

# Run tests for a specific module
nf-test test modules/my-org/examplemodule/

# Run tests with Docker profile
nf-test test --profile docker

# Run with verbose output
nf-test test --verbose

# Update snapshots after intentional changes
nf-test test --update-snapshots

7.6. Snapshot Testing Workflow

1. First run:
   └─ Test executes the module process
      └─ Outputs saved to .nf-test/modules/*/main.nf.test.snap

2. Subsequent runs:
   └─ Test executes the module again
      └─ Outputs compared to the snapshot
         ├─ Outputs match: ✓ test passes
         └─ Outputs differ: ✗ test fails (intentional change?)

3. After code improvements:
   └─ Developer reviews the diffs
      └─ If the changes are correct: nf-test test --update-snapshots
         └─ Snapshots updated, tests pass again
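The three-phase cycle above can be sketched as a small Python helper. This is a toy model for intuition; nf-test's real snapshot format and matching rules are richer:

```python
import json
import tempfile
from pathlib import Path

def check_snapshot(snap_file: Path, test_name: str, observed: dict) -> bool:
    """First run: record the observed output as the golden snapshot.
    Later runs: compare the new output against what was recorded."""
    snaps = json.loads(snap_file.read_text()) if snap_file.exists() else {}
    if test_name not in snaps:  # first run: write the golden output
        snaps[test_name] = observed
        snap_file.write_text(json.dumps(snaps, indent=2))
        return True
    return snaps[test_name] == observed  # later runs: outputs must match

snap = Path(tempfile.mkdtemp()) / "main.nf.test.snap"
out = {"html": "test_fastqc.html", "versions": "versions.yml"}
print(check_snapshot(snap, "test_fastqc_single_end", out))  # True (snapshot created)
print(check_snapshot(snap, "test_fastqc_single_end", out))  # True (matches snapshot)
print(check_snapshot(snap, "test_fastqc_single_end", {"html": "changed.html"}))  # False
```

The "update snapshots" step then simply amounts to deleting the stale entry and letting the next run re-record it, after a human has reviewed the diff.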

Example snapshot file:

# modules/my-org/examplemodule/tests/main.nf.test.snap

{
  "test_fastqc_single_end": {
    "0": {
      "0": {
        "id": "test"
      },
      "1": [
        {
          "0": "test_fastqc.html"
        },
        {
          "0": "test_fastqc.zip"
        }
      ],
      "2": [
        {
          "0": "versions.yml"
        }
      ]
    }
  }
}

8. Linting and Code Quality Tools

8.1. Ruff: Python Linting and Formatting

Ruff is a fast Python linter written in Rust that checks for:

  • Import sorting and organization
  • Unused imports
  • Undefined names
  • Code style violations

Configuration in ruff.toml:

# ruff.toml
# In a standalone ruff.toml, the `[tool.ruff]` prefix used in pyproject.toml
# is omitted; settings live at the top level.
line-length = 120
target-version = "py311"

[lint]
# Select which rules to enforce
select = ["I", "E1", "E4", "E7", "E9", "F", "UP", "N"]
# I:  isort (import sorting)
# E1/E4/E7/E9: pycodestyle errors
# F:  pyflakes (undefined names, etc.)
# UP: pyupgrade (modern Python syntax)
# N:  pep8-naming (variable naming)

ignore = ["E501"] # Ignore line-too-long (handled by the formatter)

[format]
quote-style = "double"
indent-style = "space"
line-ending = "auto"

Pre-commit runs ruff:

# Check for issues
ruff check modules/

# Auto-fix issues
ruff check --fix modules/

# Format code
ruff format modules/

8.2. Prettier: YAML and JSON Formatting

Prettier provides opinionated code formatting for configuration files:

# .prettierrc.yml
semi: true
singleQuote: false
trailingComma: all
bracketSpacing: true
tabWidth: 2
useTabs: false

Prettier enforces consistency in:

  • YAML indentation
  • JSON formatting
  • Markdown line breaks
  • Comment spacing

8.3. Hadolint: Dockerfile Linting

Hadolint catches Docker best practice violations:

# ❌ Bad (flagged by hadolint)
FROM ubuntu:latest
RUN apt-get update
RUN apt-get install -y python3
RUN apt-get install -y git

# ✅ Good (passes hadolint)
FROM ubuntu:22.04
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        python3 \
        git && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

Common hadolint rules:

  • DL3006: Use specific base image versions (not latest)
  • DL3009: Delete apt-get cache to reduce layer size
  • DL3015: Avoid additional packages with apt-get install
  • SC2086: Quote variables in shell commands

8.4. JSON Schema Validation

The template validates all meta.yml files against a JSON Schema:

// modules/yaml-schema.json (simplified)
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "nf-core module schema",
  "type": "object",
  "required": ["name", "description", "input", "output"],
  "properties": {
    "name": {
      "type": "string",
      "description": "Module name"
    },
    "description": {
      "type": "string",
      "description": "Module description"
    },
    "keywords": {
      "type": "array",
      "items": { "type": "string" }
    },
    "tools": {
      "type": "object",
      "description": "Tools used in the module"
    },
    "input": {
      "type": "array",
      "description": "Input ports"
    },
    "output": {
      "type": "array",
      "description": "Output ports"
    }
  }
}

Validation happens in pre-commit:

check-jsonschema --schemafile modules/yaml-schema.json modules/my-org/*/meta.yml
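At its core, the required-keys part of that validation is simple. A minimal Python sketch (real JSON Schema validation, as done by check-jsonschema, also checks types, patterns, and nesting):

```python
def validate_required(doc: dict, schema: dict):
    """Minimal slice of JSON Schema validation: enforce the schema's
    'required' list against a parsed meta.yml document."""
    missing = [k for k in schema.get("required", []) if k not in doc]
    return (len(missing) == 0, missing)

# Mirrors the "required" list from the simplified schema above.
schema = {"required": ["name", "description", "input", "output"]}

meta_ok = {"name": "examplemodule", "description": "...", "input": [], "output": []}
meta_bad = {"name": "examplemodule", "description": "..."}

print(validate_required(meta_ok, schema))   # (True, [])
print(validate_required(meta_bad, schema))  # (False, ['input', 'output'])
```

Running this kind of check in pre-commit means a module with a half-filled meta.yml never reaches CI, let alone a release.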

9. Putting It All Together: End-to-End Workflow

9.1. Creating a New Module Repository

# Step 1: Install tools
pipx install copier

# Step 2: Generate repository from template
copier copy gh:nf-core/modules-template ./my-bioinformatics-modules

# Follow prompts:
# repo_host: https://github.com
# repo_org_name: my-research-lab
# short_org_name: mylab
# ci: github
# repo_name: bioinformatics-modules
# description: Nextflow modules for genomic analysis
# copyright_holder: My Research Lab
# default_branch: main

# Step 3: Navigate to generated repository
cd my-bioinformatics-modules

# Step 4: Initialize as git repository
bash ./project_init.sh

# Step 5: Set up pre-commit hooks
pip install pre-commit
pre-commit install

# Step 6: Push to GitHub
git remote add origin https://github.com/my-research-lab/bioinformatics-modules.git
git branch -M main
git push -u origin main

9.2. Creating a New Module

# Copy the example module structure
mkdir -p modules/mylab/my_custom_process
cp -r modules/mylab/examplemodule/{main.nf,meta.yml,environment.yml,tests} modules/mylab/my_custom_process/

# Edit the process
vim modules/mylab/my_custom_process/main.nf

# Edit metadata
vim modules/mylab/my_custom_process/meta.yml

# Edit dependencies
vim modules/mylab/my_custom_process/environment.yml

# Write tests
vim modules/mylab/my_custom_process/tests/main.nf.test

9.3. Workflow: Add → Test → Pre-commit → Push → CI/CD

# 1. Stage changes
git add modules/mylab/my_custom_process/

# 2. Pre-commit hooks run automatically
# ✓ Prettier formats YAML
# ✓ Ruff sorts imports and checks Python
# ✓ JSON schema validates meta.yml
# ✓ Hadolint checks Dockerfile (if any)
# If issues are found, the hooks fix what they can; review and re-stage.

# 3. Commit
git commit -m "feat: add my_custom_process module"

# 4. Push
git push origin main

# 5. GitHub Actions CI/CD runs:
# a. Lint workflow:
# - Pre-commit checks
# - nf-core linting
# - YAML/JSON validation
# b. nf-test workflow:
# - Detects changed tests
# - Runs tests across 4 shards in parallel
# - Validates outputs via snapshots

# 6. Pull request review
# ✓ All checks passed
# ✓ Module is ready for use

9.4. Using Modules in a Workflow

// workflows/my_pipeline.nf

// Include modules
include { FASTQC } from '../modules/nf-core/fastqc/main'
include { BWA_INDEX; BWA_MEM } from '../modules/nf-core/bwa/main'
include { MY_CUSTOM_PROCESS } from '../modules/mylab/my_custom_process/main'

workflow {
    // Pair *_1/*_2 FASTQ files by sample prefix
    ch_input = Channel.fromFilePairs("data/*_{1,2}.fastq.gz")

    FASTQC(ch_input)

    reference_fasta = file("reference.fa")
    BWA_INDEX(reference_fasta)

    BWA_MEM(ch_input, BWA_INDEX.out.index)

    MY_CUSTOM_PROCESS(BWA_MEM.out.bam)
}

10. Key Takeaways

  1. Copier enables reproducible scaffolding — Create consistent module repositories with a single command and interactive prompts
  2. Repository organization enforces standards — Modules and subworkflows follow a defined structure with required metadata
  3. Pre-commit hooks catch issues early — Linting, formatting, and validation run automatically before commits
  4. GitHub Actions automate CI/CD — Lint and test workflows run on every PR, with intelligent test sharding
  5. nf-test provides reliable testing — Snapshot testing, BAM/VCF validation, and parallel execution
  6. Code quality tools maintain consistency — Ruff, Prettier, and Hadolint enforce best practices
  7. Everything works together — Copier + structure + pre-commit + CI/CD + nf-test = production-ready modules

By using the nf-modules-template, teams can:

  • Onboard new developers quickly
  • Maintain consistent code quality across modules
  • Catch bugs before they reach production
  • Share modules reliably across the ecosystem
  • Scale to hundreds of modules with confidence
