SLUPipe processes workflows using a JSON configuration file. The configuration specifies the samples, genome references, chromosome range, variant callers, and output directory used for a workflow.
Requirements Before Running a Workflow:
1. Placing Samples in Input Directory
Place all .bam files to be processed in src/input (SLUPipe will automate generation of .bai files within this directory). Sample files for testing can be found here: https://drive.google.com/drive/folders/1QdtVEronIzs04L37BFkw29TLjNWcyOpf
Sample Files From GATK Tutorial Data 9183 Somatic Variants:
- hcc1143_T_subset50K.bam
- hcc1143_T_subset50K.bai
- hcc1143_N_subset50K.bam
- hcc1143_N_subset50K.bai
Important: Users are free to create additional input directories beyond the one described here, as long as they are specified in the JSON configuration file.
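Before starting a workflow, it can help to confirm which .bam files are actually present in the input directory. The following is a minimal Python sketch, not part of SLUPipe itself: the helper names `list_bam_files` and `missing_indexes` are hypothetical, and the index naming it checks (`sample.bai` or `sample.bam.bai`) is an assumption based on the sample files listed above.

```python
from pathlib import Path


def list_bam_files(input_dir):
    """Return the sorted names of all .bam files directly inside input_dir."""
    return sorted(
        p.name for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix == ".bam"
    )


def missing_indexes(input_dir):
    """Return the .bam names that have no index file next to them yet.

    Assumption: an index is named either sample.bai (as in the sample
    files above) or sample.bam.bai.
    """
    directory = Path(input_dir)
    missing = []
    for name in list_bam_files(directory):
        bam = directory / name
        short_index = bam.with_suffix(".bai")     # sample.bai
        long_index = directory / (name + ".bai")  # sample.bam.bai
        if not short_index.exists() and not long_index.exists():
            missing.append(name)
    return missing
```

Calling `list_bam_files("src/input")` lists the samples found in the input directory. Since SLUPipe generates .bai files automatically, a non-empty result from `missing_indexes` is informational only and not an error.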
2. Creating The JSON Configuration File
ALL config files must be constructed with the following structure:
Configuration File Structure Format (JSON):
[
{
"Pipeline_Mode":"-T",
"Variant_Callers":["Pindel","Platypus"],
"Input_Directory":"/student/foo/SLUPipe/src/input",
"Output_Directory":"/student/foo/SLUPipe/src/output",
"Chromosome_Range": "chr1:16,000,000-215,000,000",
"vep_ScriptPath": "/your_path_to/.conda/envs/SLUPipe/share/ensembl-vep-95.3-0",
"vep_CachePath": "/your_path_to/foo/.vep",
"reference_directory": "/student/foo/referenceFiles",
"cpuCores": "8"
}
]
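The structure above can also be generated from a script, which avoids hand-editing mistakes such as missing commas or quotes. This is a hedged sketch that uses only Python's standard `json` module; `build_config` and `write_config` are hypothetical helper names, not part of SLUPipe. The key names are copied from the example above.

```python
import json


def build_config(mode, callers, input_dir, output_dir, chromosome_range,
                 vep_script_path, vep_cache_path, reference_dir, cpu_cores):
    """Return the configuration as a one-element list, mirroring the structure above."""
    return [{
        "Pipeline_Mode": mode,
        "Variant_Callers": list(callers),
        "Input_Directory": input_dir,
        "Output_Directory": output_dir,
        "Chromosome_Range": chromosome_range,
        "vep_ScriptPath": vep_script_path,
        "vep_CachePath": vep_cache_path,
        "reference_directory": reference_dir,
        # The example above stores the core count as a string, so do the same.
        "cpuCores": str(cpu_cores),
    }]


def write_config(config, path):
    """Write the configuration to path as indented JSON."""
    with open(path, "w") as handle:
        json.dump(config, handle, indent=2)
```

For example, `write_config(build_config("-T", ["Pindel", "Platypus"], "/student/foo/SLUPipe/src/input", "/student/foo/SLUPipe/src/output", "chr1:16,000,000-215,000,000", "/your_path_to/.conda/envs/SLUPipe/share/ensembl-vep-95.3-0", "/your_path_to/foo/.vep", "/student/foo/referenceFiles", 8), "config.json")` reproduces the example file; the output file name `config.json` is only a placeholder.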
Pipeline Mode & Variant Callers are indicated in the JSON file as follows:
Non-paired Mode (Tumor Only) = "-T"
Paired Mode (Normal Mode) = "-N"
MuSE = "Muse"
MuTect2 = "Mutect"
Varscan = "Varscan"
Somatic Sniper = "Sniper"
Strelka 2 = "Strelka2"
Pindel = "Pindel"
Platypus = "Platypus"
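A configuration entry can be checked against the values listed above before a workflow is launched. The sketch below is an illustration only: `validate_entry` is a hypothetical helper, the sets are transcribed from this document, and it assumes the names are case-sensitive. It does not describe any validation SLUPipe performs internally.

```python
# Allowed values, transcribed from the lists above.
PIPELINE_MODES = {"-T", "-N"}
VARIANT_CALLERS = {
    "Muse", "Mutect", "Varscan", "Sniper", "Strelka2", "Pindel", "Platypus",
}
# Keys taken from the example configuration structure.
REQUIRED_KEYS = {
    "Pipeline_Mode", "Variant_Callers", "Input_Directory", "Output_Directory",
    "Chromosome_Range", "vep_ScriptPath", "vep_CachePath",
    "reference_directory", "cpuCores",
}


def validate_entry(entry):
    """Return a list of problems found in one configuration entry.

    An empty list means no problems were found by these checks.
    """
    problems = []
    missing = sorted(REQUIRED_KEYS - set(entry))
    if missing:
        problems.append("missing keys: " + ", ".join(missing))
    mode = entry.get("Pipeline_Mode")
    if mode not in PIPELINE_MODES:
        problems.append("unknown Pipeline_Mode: {!r}".format(mode))
    unknown = sorted(set(entry.get("Variant_Callers", [])) - VARIANT_CALLERS)
    if unknown:
        problems.append("unknown Variant_Callers: " + ", ".join(unknown))
    return problems
```

Because the configuration file is a JSON list, a typical use is to load it with `json.load` and call `validate_entry` on each element, printing any problems before running the pipeline.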
Interacting With The Output Directory
SLUPipe’s workflow results will be placed in src/output/. Each sample result will have its files organized with the following directory structure:
-Sample_1:
  ->annotated_vcfs:
    ->mutect_output
      -sample_1_muse.annotated.vcf
    ->strelka_output
      -sample_1_strelka.annotated.vcf
  ->mafs:
    -sample_1_muse.maf
    -sample_1_strelka.maf
  ->vcfs
    ->mutect_output
      -sample_1.vcf
    ->strelka_output
      -sample_1.vcf
Important: Users are free to create additional output directories beyond the one described here, as long as they are specified in the JSON configuration file.
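Given the directory structure described above, the MAF files for every sample can be gathered with a short script. This is a minimal sketch under the assumption that each sample directory contains a `mafs` subdirectory as shown; `collect_mafs` is a hypothetical helper name and is not provided by SLUPipe.

```python
from pathlib import Path


def collect_mafs(output_dir):
    """Map each sample directory name to the sorted list of its .maf file names.

    Sample directories without a mafs subdirectory are skipped.
    """
    results = {}
    for sample_dir in sorted(Path(output_dir).iterdir()):
        maf_dir = sample_dir / "mafs"
        if maf_dir.is_dir():
            results[sample_dir.name] = sorted(
                p.name for p in maf_dir.glob("*.maf")
            )
    return results
```

For the example layout, `collect_mafs("src/output")` would map `Sample_1` to its two MAF files, one per variant caller.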