I am new to Nextflow. I am trying to build a mitochondrial DNA variant pipeline, using FastQC for quality checking and Trimmomatic for trimming low-quality sequences. I have written the script below; it runs, but produces no output.
```nextflow
#!/usr/bin/env nextflow

params {
    fastq_dir = "/mnt/e/Bioinformatics_ppt_learning/mtDNA/nextflow_scripts/*.fastq.gz"
    fastqc_dir = "/mnt/e/Bioinformatics_ppt_learning/mtDNA/nextflow_scripts/fastqc_report"
    trimmed_dir = "/mnt/e/Bioinformatics_ppt_learning/mtDNA/nextflow_scripts/trimmed_fastq"
    trimmomatic_jar = "/mnt/e/Bioinformatics_ppt_learning/mtDNA/nextflow_scripts/trimmomatic-0.39.jar"
}

process FastQC {
    tag "Running FastQC on ${fastq}"
    publishDir "${fastqc_dir}/${fastq.baseName}"
    input: path fastq
    script:
    """
    fastqc -o ${fastqc_dir} ${fastq}
    """
}

process Trimmomatic {
    tag "Trimming ${fastq.baseName}"
    input:
    path read1 from FastQC.output
    output:
    file(joinPath(trimmed_dir, "${read1.baseName}_trimmed.fastq.gz"))
    script:
    """
    java -jar ${params.trimmomatic_jar} PE -threads 4 \
    ${read1} ${joinPath(trimmed_dir, "${read1.baseName}_trimmed.fastq.gz")} \
    ${joinPath(trimmed_dir, "${read1.baseName}_unpaired.fastq.gz")} \
    ${joinPath(trimmed_dir, "${read1.baseName}_unpaired.fastq.gz")}
    """
}

workflow {
    fastq_files = Channel.fromPath(params.fastq_dir)
    fastq_files.each {
        FastQC(fastq: it)
        Trimmomatic(read1: FastQC.output)
    }
}
```
`publishDir` works by emitting items in the process `output` declaration to the path provided. You haven't provided an output declaration for either process, so it doesn't think there is anything to publish.

Also, unless you're using it for checkpointing, you don't need the output from `FastQC` for `Trimmomatic`; the two processes can run in parallel.

Don't use `joinPath` or any absolute path in your processes. That's not what Nextflow is designed for, and it will often lead to errors. Plus, by putting an absolute path in the output declaration, you're telling the process to look in the output directory for the file generated in the process. Use `publishDir` to emit files.

The `file` qualifier is deprecated. Use `path` instead. The Nextflow documentation is amazing. It's a steep learning curve, but the documentation is very good at describing how things work.

So here is an updated script:
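The following is a minimal sketch of how the script could look with these changes applied. It is written under stated assumptions: DSL2 syntax, paired-end reads named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz` (so a hypothetical `params.reads` glob replaces `fastq_dir`), and illustrative trimming steps (`SLIDINGWINDOW:4:20 MINLEN:36`) that you should adjust to your data.

```nextflow
#!/usr/bin/env nextflow
nextflow.enable.dsl = 2

// Defaults set by plain assignment; each can be overridden on the command line (e.g. --reads).
// ASSUMPTION: reads are paired and named <sample>_R1.fastq.gz / <sample>_R2.fastq.gz.
params.reads           = "/mnt/e/Bioinformatics_ppt_learning/mtDNA/nextflow_scripts/*_R{1,2}.fastq.gz"
params.fastqc_dir      = "/mnt/e/Bioinformatics_ppt_learning/mtDNA/nextflow_scripts/fastqc_report"
params.trimmed_dir     = "/mnt/e/Bioinformatics_ppt_learning/mtDNA/nextflow_scripts/trimmed_fastq"
params.trimmomatic_jar = "/mnt/e/Bioinformatics_ppt_learning/mtDNA/nextflow_scripts/trimmomatic-0.39.jar"

process FastQC {
    tag "${sample_id}"
    // Absolute paths appear only here, never in the script or output blocks.
    publishDir "${params.fastqc_dir}/${sample_id}", mode: 'copy'

    input:
    tuple val(sample_id), path(reads)

    output:
    // Declared relative to the task work directory; publishDir copies them out.
    path "*_fastqc.{html,zip}"

    script:
    """
    fastqc -o . ${reads}
    """
}

process Trimmomatic {
    tag "${sample_id}"
    publishDir params.trimmed_dir, mode: 'copy'

    input:
    tuple val(sample_id), path(reads)
    path trimmomatic_jar   // staged into the work directory like any other input

    output:
    tuple val(sample_id), path("${sample_id}_R{1,2}_trimmed.fastq.gz"), emit: paired
    path "${sample_id}_R{1,2}_unpaired.fastq.gz", emit: unpaired

    script:
    // PE mode takes two inputs and four outputs, followed by at least one trimming step.
    // The trimming steps below are illustrative values, not a recommendation.
    """
    java -jar ${trimmomatic_jar} PE -threads 4 \\
        ${reads[0]} ${reads[1]} \\
        ${sample_id}_R1_trimmed.fastq.gz ${sample_id}_R1_unpaired.fastq.gz \\
        ${sample_id}_R2_trimmed.fastq.gz ${sample_id}_R2_unpaired.fastq.gz \\
        SLIDINGWINDOW:4:20 MINLEN:36
    """
}

workflow {
    // Emits one [sample_id, [read1, read2]] tuple per pair; fails early if nothing matches.
    reads_ch = Channel.fromFilePairs(params.reads, checkIfExists: true)

    // Both processes consume the same channel, so they run in parallel for every sample.
    FastQC(reads_ch)
    Trimmomatic(reads_ch, file(params.trimmomatic_jar, checkIfExists: true))
}
```

If your reads are single-end, the same structure should apply with `Channel.fromPath` mapped to a `[name, file]` tuple and Trimmomatic's `SE` mode, which takes one input and one output file.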
In the workflow, you shouldn't need to tell the processes to iterate over each element. This is the default behaviour of the tool. I've added some commands to the channel generation to highlight some redundancy you can add.
EDIT: Missed some of the absolute paths. Updated the input to be a tuple instead, since it is better at handling names this way, and adjusted the tags.