HiC Meta pipeline available

How to use the Nextflow HiC meta pipeline
guide
Author

Mia Croiset

Published

December 6, 2024

Here’s a quick overview of the HiC meta pipeline, developed following the HiC pipeline project.

The HiC meta pipeline brings together several Hi-C pipelines for the analysis of chromosome conformation capture data.

The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. The pipeline is based on the nf-core/hic pipeline and the hicstuff pipeline.

pipeline schema

Usage

Prepare the environment

If you want to run the pipeline on the PSMN, you first need to set up your PSMN environment if it’s not already done.

Then clone this repository into your scratch/Bio directory, or locally on your computer:

git clone git@gitbio.ens-lyon.fr:LBMC/hub/hic.git

Then cd into the cloned repository directory (named hic by default).

Get started

Prepare a samplesheet with your input data that looks as follows:

samplesheet.csv:

sample,fastq_1,fastq_2
HIC_ES_4,SRR5339783_1.fastq.gz,SRR5339783_2.fastq.gz

Each row represents one sample with its pair of FASTQ files (paired-end reads). Now, you can run the pipeline using:

nextflow run main.nf \
   -profile psmn \
   --workflow <hicpro/hicstuff> \
   --input samplesheet.csv \
   --fasta <path/to/genome.fasta> \
   --outdir <OUTDIR> \
   --digestion <dpnii/hindiii/arima/mboi>
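Before launching the command above, it can help to confirm that the samplesheet has the expected shape. The following is a minimal sketch, assuming the three-column header shown in the example is what the pipeline expects; `check_samplesheet` is a hypothetical helper written for this guide and is not part of the pipeline. It does not check that the FASTQ files exist.

```shell
# Sketch only: check_samplesheet is a hypothetical helper, not part of the pipeline.
# The body runs in a subshell "( ... )" so its variables do not leak to the caller.
check_samplesheet() (
    sheet="$1"
    # Assumption: the header must match the example samplesheet exactly.
    expected="sample,fastq_1,fastq_2"

    # Read the first line and strip a possible Windows carriage return.
    header="$(head -n 1 "$sheet" | tr -d '\r')"
    if [ "$header" != "$expected" ]; then
        echo "unexpected header: '$header'" >&2
        exit 1
    fi

    # Every non-empty data row must contain exactly three non-empty fields.
    tr -d '\r' < "$sheet" | awk -F',' '
        NR > 1 && NF > 0 {
            if (NF != 3 || $1 == "" || $2 == "" || $3 == "") {
                printf("line %d: expected 3 non-empty fields\n", NR) > "/dev/stderr"
                bad = 1
            }
        }
        END { exit bad + 0 }'
)
```

Running `check_samplesheet samplesheet.csv` returns a zero exit status when the file looks well formed, and otherwise reports the offending header or line numbers on standard error.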

If you're not running the pipeline on the PSMN, make sure you have Docker installed and use -profile docker instead.

You may want to use different options, such as cutsite from hicstuff or our new and optimized parasplit.
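Because only the profile changes between the two setups, the command can be assembled and reviewed before it is run. The sketch below prints the command rather than executing it; `hic_command` is a hypothetical helper, and it accepts only the profile, workflow and digestion values mentioned in this guide, which is an assumption (the repository may define others).

```shell
# Sketch only: hic_command is a hypothetical helper, not part of the pipeline.
# It prints the run command instead of executing it, so it can be reviewed first.
# The body runs in a subshell "( ... )" so its variables do not leak to the caller.
hic_command() (
    profile="$1"; workflow="$2"; digestion="$3"
    input="$4"; fasta="$5"; outdir="$6"

    # Assumption: only the two profiles named in this guide are accepted here.
    case "$profile" in
        psmn|docker) ;;
        *) echo "unknown profile: $profile" >&2; exit 1 ;;
    esac

    # Workflow and digestion values are those listed in the run command above.
    case "$workflow" in
        hicpro|hicstuff) ;;
        *) echo "unknown workflow: $workflow" >&2; exit 1 ;;
    esac
    case "$digestion" in
        dpnii|hindiii|arima|mboi) ;;
        *) echo "unknown digestion: $digestion" >&2; exit 1 ;;
    esac

    # Arguments containing spaces are printed unquoted; quote them before reuse.
    printf 'nextflow run main.nf -profile %s --workflow %s --input %s --fasta %s --outdir %s --digestion %s\n' \
        "$profile" "$workflow" "$input" "$fasta" "$outdir" "$digestion"
)
```

For example, `hic_command docker hicstuff dpnii samplesheet.csv genome.fasta results` prints the Docker variant of the command, where `genome.fasta` and `results` are placeholder paths to replace with your own.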

For detailed options, please refer to the parameter documentation on the git page.

Pipeline output

The pipeline takes reads (.fastq) and a genome (.fasta) as input and produces contact matrices (.cool and .mcool), balanced or not depending on the options, along with images of the matrices (.pdf). Intermediate files such as .bam files can also be output with the appropriate options.
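Once a run has finished, a quick way to see what was produced is to count the result files by extension. This is a minimal sketch: `list_outputs` is a hypothetical helper, and because the directory layout under the output directory is not described here, it simply searches the whole tree.

```shell
# Sketch only: list_outputs is a hypothetical helper, not part of the pipeline.
# It counts files per extension anywhere under the given output directory.
list_outputs() (
    outdir="$1"
    for ext in cool mcool pdf bam; do
        # Count regular files and symlinks, since published results may be links.
        count="$(find "$outdir" \( -type f -o -type l \) -name "*.$ext" \
            | wc -l | tr -d '[:space:]')"
        printf '%s\t%s\n' "$ext" "$count"
    done
)
```

Calling `list_outputs <OUTDIR>` prints one tab-separated line per file type; a count of zero for .bam is expected unless intermediate files were requested.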