A summary of the bioinformatics software currently installed on our Linux cluster.

Available software

We have a wide range of software already installed on the server, covering QC analysis and preprocessing of raw sequence data, transcriptome analysis from RNAseq data, 16S and shotgun metagenomics pipelines, WGS tools, and more. If you have an account on our cluster, you already have access to all of the software below, so get started!

If you’re looking for a piece of software and don’t find it below, just reach out to Dan Beiting to inquire about getting it installed.

| software | category | how to run |
|---|---|---|
| anvi’o | metagenomics, visualization | source activate anvio4 |
| Bandage | genome assembly, visualization | Bandage |
| bcftools | SNP/variant discovery and genotyping | bcftools |
| bcl2fastq2 | converts Illumina BCL base call files to FASTQ (demultiplexing) | bcl2fastq |
| bedtools | tools for working with genomic interval files (BAM, BED, GFF/GTF, etc.) | bedtools |
| blast | sequence search | options include blastn, blastp, or blastx |
| bowtie2 | sequence alignment | bowtie2 |
| bwa | sequence alignment | bwa |
| Circos | general visualization | circos |
| CheckM, GroopM, BamM | metagenomics, genome assembly | checkm, groopm, bamm |
| clust | co-expression modules | clust |
| Cytoscape | network analysis | navigate to the /home/shared/softwares/Cytoscape_v3.5.1 folder and double-click to open the program |
| deeptools | analysis of HTS data | deeptools |
| diamond | accelerated BLAST-compatible local sequence aligner | diamond |
| EMIRGE | reconstructs full-length rRNA genes from short read data | emirge.py or emirge_amplicon.py |
| FastQC | generates quality control reports for fastq files | fastqc |
| FastQ Screen | screens fastq files against multiple reference genomes | fastq_screen |
| genome analysis toolkit (gatk) | SNP/variant discovery and genotyping | gatk |
| GraPhlAn | metagenomics, visualization | graphlan.py or graphlan_annotate.py |
| HUMAnN2 | metagenomics | humann2 |
| HTSeq | summarizes alignments to get counts per gene or exon | htseq-count or htseq-qa |
| iRep | determines replication rates for bacteria from metagenomic data | iRep or bPTR |
| Kallisto | mapping reads to transcripts | kallisto |
| KneadData | removes host or ‘contaminating’ reads from a dataset | kneaddata |
| Kraken2 | taxonomic classification system | kraken2 |
| LotuS | taxonomic classification system for 16S data | perl lotus.pl [options] |
| Mash | comparative genomics | mash |
| MetaPhlAn | metagenomics | metaphlan2.py |
| MinPath | biological pathway reconstruction using protein family predictions | MinPath1.4.py |
| Mothur | 16S marker gene analysis | mothur |
| MultiQC | summarizes log files produced by other tools (e.g. STAR, kallisto, FastQC, etc.) | multiqc |
| Nextflow | workflow management | nextflow |
| Picard tools | handling HTS file formats | java -jar /usr/local/bin/picard/build/libs/picard.jar |
| Prokka | genome annotation for prokaryotes | prokka |
| QIIME2 | microbial ecology pipeline for 16S rRNA data | source activate qiime2 |
| QUAST | quality assessment tool for genome assemblies | quast.py |
| ROP | Read Origin Protocol for finding the identity of unmapped reads | /usr/local/bin/rop/rop.sh |
| rsem | mapping reads to transcripts | rsem-(function) |
| samtools | handling HTS file formats | samtools |
| seqtk | handling HTS file formats | seqtk |
| seqKit | handling HTS file formats | seqkit |
| Sickle | quality trimming of raw sequences | sickle se or sickle pe |
| Sourmash | computes and compares MinHash signatures for DNA data | source activate sourmash |
| SPAdes | genome assembly from Illumina short reads | spades.py [options] -o <output_dir> |
| SQUID | finds transcriptomic structural variants from RNAseq data | squid |
| SRA toolkit | tools for accessing sequence data stored on SRA | e.g. fasterq-dump, and many more |
| STAR | sequence alignment | STAR (all caps) |
| Sunbeam | metagenomics | source activate sunbeam |
| Trimmomatic | quality trimming of raw sequences | java -jar /usr/local/bin/trimmomatic-0.33.jar |
| Trinity | de novo transcriptome assembly | Chrysalis (uppercase ‘C’), inchworm |
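
Before asking for a new installation, it can help to confirm whether a command from the list above is already on your PATH. The following is a minimal sketch, not an official cluster utility: the function name check_tools is hypothetical, and it assumes the tools are exposed as plain commands, which may not hold for every entry.

```shell
#!/bin/sh
# check_tools is a hypothetical helper (not part of the cluster setup).
# It reports whether each named command can be found on the current PATH.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
        fi
    done
}

# Example: check a few of the commands listed above.
check_tools samtools bcftools bowtie2 kallisto STAR
```

A "missing" result is expected for tools that are listed with source activate (anvi’o, QIIME2, Sourmash, Sunbeam) until their environment has been activated, and for tools that are run through java -jar or a full path rather than a bare command name.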