Here’s a quick overview of the Hi-C meta pipeline, developed following the Hi-C pipeline project.
The meta Hi-C pipeline brings together several Hi-C pipelines for the analysis of chromosome conformation capture data.
The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. The pipeline is based on the nf-core/hic pipeline and the hicstuff pipeline.
Usage
Prepare the environment
If you want to run the pipeline on the PSMN, you first need to set up your PSMN environment if it’s not already done.
Then clone this repository into your scratch/Bio directory, or locally on your computer:
git clone git@gitbio.ens-lyon.fr:LBMC/hub/hic.git
Then cd into the git directory.
Get started
Prepare a samplesheet with your input data that looks as follows:
samplesheet.csv:
sample,fastq_1,fastq_2
HIC_ES_4,SRR5339783_1.fastq.gz,SRR5339783_2.fastq.gz
Each row represents one pair of FASTQ files (paired-end reads). Now, you can run the pipeline using:
nextflow run main.nf \
-profile psmn \
--workflow <hicpro/hicstuff> \
--input samplesheet.csv \
--fasta <path/to/genome.fasta> \
--outdir <OUTDIR> \
--digestion <dpnii/hindiii/arima/mboi>
If you're not running the pipeline on the PSMN, make sure you have Docker installed and use -profile docker instead.
You may want to use different options, such as cutsite from hicstuff or our new and optimized parasplit.
For detailed options, please refer to the parameter documentation on the git page.
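If you have many paired-end libraries, the samplesheet described above can be generated rather than typed by hand. The sketch below is not part of the pipeline: make_samplesheet is a hypothetical helper, and it assumes your files follow the <prefix>_1.fastq.gz / <prefix>_2.fastq.gz naming used in the example. It also assumes the file prefix is an acceptable sample name, so edit the first column afterwards if you want names like HIC_ES_4.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): print a samplesheet for
# every <prefix>_1.fastq.gz / <prefix>_2.fastq.gz pair found in a directory.
make_samplesheet() {
    local dir="${1:-.}" r1 r2 sample
    # Header expected by the pipeline, as shown in the example samplesheet.
    echo "sample,fastq_1,fastq_2"
    for r1 in "$dir"/*_1.fastq.gz; do
        # If the glob matched nothing, the pattern is returned literally: skip it.
        [ -e "$r1" ] || continue
        r2="${r1%_1.fastq.gz}_2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "warning: no mate found for $r1, skipping" >&2
            continue
        fi
        # Assumption: the file prefix is used as the sample name.
        sample="$(basename "${r1%_1.fastq.gz}")"
        echo "$sample,$r1,$r2"
    done
}
```

For example, make_samplesheet path/to/fastq > samplesheet.csv writes one row per complete pair and reports unpaired files on stderr.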
Pipeline output
The pipeline takes reads (.fastq) and a genome (.fasta) as input, and outputs matrices (.cool and .mcool), balanced or not depending on the options, as well as images of the matrices (.pdf). You can also output intermediate files, such as .bam files, with options.
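After a run, it can be handy to check which of these files were actually produced. The sketch below is a hypothetical helper, not something shipped with the pipeline. It makes no assumption about the layout of the output directory and simply searches it recursively for the extensions mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list result files under an output directory,
# grouped by the extensions mentioned in this README.
list_outputs() {
    local outdir="$1" ext
    for ext in cool mcool pdf bam; do
        echo "== .$ext =="
        # Recursive search; no particular subdirectory layout is assumed.
        find "$outdir" -type f -name "*.$ext" | sort
    done
}
```

Calling list_outputs <OUTDIR> with the directory passed to --outdir prints one section per file type. An empty .bam section is expected unless intermediate files were requested.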