Part 3: Add an existing nf-core module
In this third part of the Hello nf-core training course, we show you how to add an existing nf-core module to your pipeline.
One of the great advantages of nf-core pipelines is the ability to leverage pre-built, tested modules from the nf-core/modules repository. Rather than writing every process from scratch, you can install and use community-maintained modules that follow best practices.
In this section, we'll replace the custom `collectGreetings` module with the `cat/cat` module from nf-core/modules.
Note
This section assumes you have completed Part 2: Rewrite Hello for nf-core and have a working `core-hello` pipeline.
1. Find and explore the cat/cat module
The `collectGreetings` process in our pipeline uses the Unix `cat` command to concatenate multiple greeting files into one. This is a perfect use case for the nf-core `cat/cat` module, which is designed specifically for concatenating files.
1.1. Browse available modules on the nf-core website
The nf-core project maintains a centralized catalog of modules at https://nf-co.re/modules.
Navigate to the modules page in your web browser and use the search bar to search for "cat".
You should see `cat/cat` in the search results. Click on it to view the module documentation.
The module page shows:
- A description: "A module for concatenation of gzipped or uncompressed files"
- Installation command: `nf-core modules install cat/cat`
- Input and output channel structure
- Available parameters
1.2. List available modules from the command line
You can also search for modules directly from the command line using nf-core tools.
Listing the remote modules displays every module available in the nf-core/modules repository. You can scroll through the list or pipe it to `grep` to find specific modules.
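A minimal sketch of this step, assuming nf-core/tools is installed and that its `nf-core modules list remote` subcommand prints one module per row. The helper name `search_modules` is hypothetical and only wraps the listing and the `grep` filter described above.

```shell
# Hypothetical helper (not part of nf-core/tools): filter the remote
# module listing by keyword, case-insensitively.
# Assumes `nf-core modules list remote` is available on PATH.
search_modules() {
    nf-core modules list remote | grep -i -- "$1"
}
```

Running `search_modules cat` should then show only the rows that mention "cat", including `cat/cat`.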
1.3. Get detailed information about the module
To see detailed information about a specific module, use the `info` command, for example `nf-core modules info cat/cat`.
This displays documentation about the module, including its inputs, outputs, and basic usage information.
Takeaway
You now know how to find and explore available nf-core modules using both the website and command-line tools.
What's next?
Learn how to install the module in your pipeline.
2. Install and import the module
Now that we've identified the `cat/cat` module as a suitable replacement for our custom `collectGreetings` process, let's install it in our pipeline.
2.1. Install the cat/cat module
From your `core-hello` directory, run the installation command shown on the module page: `nf-core modules install cat/cat`.
The tool will prompt you to confirm the installation. Press Enter to accept the default options.
INFO Installing 'cat/cat'
INFO Include statement: include { CAT_CAT } from '../modules/nf-core/cat/cat/main'
The command automatically:
- Downloads the module files to `modules/nf-core/cat/cat/`
- Updates `modules.json` to track the installed module
- Provides you with the correct `include` statement to use in your workflow
2.2. Verify the module installation
Let's check that the module was installed correctly by listing its directory (for example with `tree modules/nf-core/cat`):
modules/nf-core/cat
└── cat
    ├── environment.yml
    ├── main.nf
    ├── meta.yml
    └── tests
        ├── main.nf.test
        ├── main.nf.test.snap
        ├── nextflow.config
        └── tags.yml
You can also verify the installation by listing locally installed modules with `nf-core modules list local`:
INFO Modules installed in '.':
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Module Name ┃ Repository ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ cat/cat │ nf-core/modules │
└────────────────────────────┴─────────────────────────────┘
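Both checks can also be scripted. The sketch below is an illustration rather than an nf-core feature: the function name `check_cat_module` is hypothetical, and it assumes that `modules.json` records the module under the key `"cat/cat"`.

```shell
# Hypothetical helper: confirm that cat/cat is present in a pipeline directory.
# Assumes modules.json records the module under the key "cat/cat".
check_cat_module() {
    pipeline_dir="${1:-.}"
    # The module's main.nf must exist and modules.json must mention the module.
    [ -f "$pipeline_dir/modules/nf-core/cat/cat/main.nf" ] &&
        grep -q '"cat/cat"' "$pipeline_dir/modules.json"
}
```

From the `core-hello` directory, `check_cat_module . && echo "cat/cat is installed"` prints the message only when both conditions hold.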
2.3. Add the import statement to your workflow
Open `core-hello/workflows/hello.nf` and add the `include` statement for the `CAT_CAT` module, as printed by the install command, to the imports section.
The nf-core convention is to use uppercase for module names when importing them.
Note how the path for the nf-core module differs from the local modules:
- nf-core module: `'../modules/nf-core/cat/cat/main'` (includes the tool name twice and references `main.nf`)
- Local module: `'../modules/local/collectGreetings.nf'` (single file reference)
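To see how such an include path maps onto a file on disk, here is a small sketch. It assumes that Nextflow resolves include paths relative to the including script and appends `.nf` when the extension is omitted. The function name `resolve_include` is hypothetical.

```shell
# Hypothetical helper: print the file an include path points to, if it exists.
# Assumes paths are relative to the including script and that a missing
# ".nf" extension is added automatically.
resolve_include() {
    workflow_file="$1"
    include_path="$2"
    base_dir=$(dirname "$workflow_file")
    case "$include_path" in
        *.nf) target="$base_dir/$include_path" ;;
        *)    target="$base_dir/$include_path.nf" ;;
    esac
    [ -f "$target" ] && echo "$target"
}
```

For example, `resolve_include core-hello/workflows/hello.nf ../modules/nf-core/cat/cat/main` prints the path to the module's `main.nf` once the module is installed, and returns a non-zero status otherwise.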
Takeaway
You know how to install nf-core modules using the command-line tools and add the appropriate import statements to your workflow.
What's next?
Learn how to use the module in your workflow.
3. Wire up the module to the workflow
Now we need to replace the call to `collectGreetings` with a call to `CAT_CAT`, adapting the inputs and outputs to match the module's interface.
3.1. Examine the cat/cat module interface
Let's look at the `cat/cat` module's `main.nf` file to understand its interface.
The key parts of the module are:
modules/nf-core/cat/cat/main.nf (excerpt)
The module expects:
- Input: A tuple containing metadata (`meta`) and input file(s) (`files_in`)
- Output: A tuple containing metadata and the concatenated output file, plus a versions file
3.2. Compare with collectGreetings interface
Our custom `collectGreetings` module has a simpler interface:
modules/local/collectGreetings.nf (excerpt)
The main differences are:
- `CAT_CAT` requires a metadata map, while `collectGreetings` doesn't
- `CAT_CAT` outputs a tuple, while `collectGreetings` outputs a simple path
- `CAT_CAT` requires a filename prefix via the `meta.id` field
3.3. Adapt the workflow to use CAT_CAT
We need to modify our workflow code to:
- Create a metadata map with an appropriate ID
- Combine the metadata with the collected files into a tuple
- Call `CAT_CAT` instead of `collectGreetings`
- Adapt downstream processes to handle the tuple output
Open `core-hello/workflows/hello.nf` and modify the workflow logic in the `main` block.
Let's break down what we changed:
- Created metadata: `def meta = [ id: params.batch ]` creates a map with an ID field set to our batch name
- Created a tuple channel: `ch_for_cat = convertToUpper.out.collect().map { files -> tuple(meta, files) }` combines the metadata and collected files into the tuple format expected by `CAT_CAT`
- Called CAT_CAT: Replaced `collectGreetings(...)` with `CAT_CAT(ch_for_cat)`
- Extracted file from tuple: Modified the cowpy call to extract just the file from the output tuple using `.map{ meta, file -> file }`
- Removed count view: The `cat/cat` module doesn't emit a count, so we removed that line
Note
We removed the `collectGreetings.out.count.view { ... }` line because the nf-core `cat/cat` module doesn't provide a count of files. If you want to keep this functionality, you would need to count the files before calling `CAT_CAT`.
3.4. Update the emit block
Check whether the `emit` block needs updating to reflect the new output.
In this case, the `emit` block doesn't need to change because we're still emitting the cowpy output.
3.5. Test the updated workflow
Let's test that our workflow still works with the nf-core module:
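A sketch of the launch command, inferred from the parameter summary in the output that follows (output directory `core-hello-results`, profiles `test,docker`, `validate_params` set to `false`). The wrapper name `run_core_hello` is hypothetical, and the sketch assumes you run it from the directory that contains `core-hello`.

```shell
# Hypothetical wrapper around the test run, reconstructed from the
# parameter summary in the log output; adjust if your setup differs.
run_core_hello() {
    nextflow run core-hello \
        --outdir core-hello-results \
        -profile test,docker \
        --validate_params false \
        "$@"
}
```

Extra arguments are passed through, so `run_core_hello -resume` would reuse cached results from an earlier run.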
N E X T F L O W ~ version 24.10.4
Launching `core-hello/main.nf` [curious_davinci] DSL2 - revision: c31b966b36
Input/output options
input : core-hello/assets/greetings.csv
outdir : core-hello-results
Institutional config options
config_profile_name : Test profile
config_profile_description: Minimal test dataset to check pipeline function
Generic options
validate_params : false
Core Nextflow options
runName : curious_davinci
containerEngine : docker
profile : test,docker
!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
executor > local (8)
[a1/2f8d9c] CORE_HELLO:HELLO:sayHello (1) | 3 of 3 ✔
[e2/9a8b3d] CORE_HELLO:HELLO:convertToUpper (2) | 3 of 3 ✔
[c4/7e1b2a] CORE_HELLO:HELLO:CAT_CAT | 1 of 1 ✔
[f5/3d9c8b] CORE_HELLO:HELLO:cowpy | 1 of 1 ✔
-[core/hello] Pipeline completed successfully-
Notice that `CAT_CAT` now appears in the process execution list instead of `collectGreetings`.
3.6. Verify the outputs
Check that the outputs in `core-hello-results` look correct.
You should still see the concatenated and cowpy output files, though the naming may be slightly different since `CAT_CAT` uses the metadata ID for the output filename.
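One way to inspect the results is a sorted listing of every file under the output directory. The helper below is a hypothetical convenience that defaults to the `core-hello-results` directory used in the test run.

```shell
# Hypothetical helper: list every file under the results directory, sorted.
# Defaults to the core-hello-results directory used in the test run.
list_outputs() {
    find "${1:-core-hello-results}" -type f | sort
}
```

Piping the listing through `grep`, for example `list_outputs | grep -i cowpy`, narrows it down to one step's files.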
Takeaway
You know how to adapt your workflow to use an nf-core module, including creating the appropriate metadata structures and handling tuple-based inputs and outputs.
What's next?
Clean up by optionally removing the now-unused local module.
4. Optional: Clean up unused local modules
Now that we're using the nf-core `cat/cat` module, the local `collectGreetings` module is no longer needed.
4.1. Remove the collectGreetings import
Remove or comment out the import line for `collectGreetings` in `core-hello/workflows/hello.nf`.
4.2. Optionally remove the module file
You can optionally delete the `collectGreetings.nf` file from `modules/local/`.
However, you might want to keep it as a reference for understanding the differences between local and nf-core modules.
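If you do delete it, it is safer to confirm first that nothing still refers to the module. The sketch below is one cautious way to do that, not a required step. The function name `remove_collect_greetings` is hypothetical, and it assumes the workflow files live under `workflows/` and the module under `modules/local/`.

```shell
# Hypothetical helper: delete the local collectGreetings module only when
# no workflow file still mentions it (commented-out lines count as mentions).
remove_collect_greetings() {
    pipeline_dir="${1:-core-hello}"
    if grep -rq "collectGreetings" "$pipeline_dir/workflows"; then
        echo "collectGreetings is still referenced in workflows/; keeping the file" >&2
        return 1
    fi
    rm -f "$pipeline_dir/modules/local/collectGreetings.nf"
}
```

Because a commented-out import still matches, the helper refuses to delete the file until the import line has been removed entirely.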
Takeaway
You know how to replace custom local modules with nf-core modules and clean up unused code.
What's next?
Continue to Part 4: Input validation to learn how to add schema-based input validation to your pipeline, or explore other nf-core modules you might add to enhance your pipeline further.