Sequencing

Whether you are a novice or an expert in next-generation sequencing (NGS) technologies, GTAC can help you design a study, prepare DNA or RNA libraries, sequence samples, and analyze the data.

Analysis Pipelines

Exome/Targeted Capture Analysis

Our pipeline supports the main goal of exome or targeted sequencing: identifying SNPs within the target sequence. We use proven tools, such as Novoalign for fast, accurate read alignment and SAMtools for SNP detection. The pipeline also identifies potential indels, though less accurately than it identifies SNPs. Identified SNPs are annotated using SeattleSeq (human) or snpEff (mouse). Data is returned as a flat file or spreadsheet of annotated variants relative to the reference genome sequence. Additional downstream analyses, or filtering of SNPs against other samples, may be available for a nominal fee.

Genome Analysis

Reads are aligned to the genome using Novoalign. PCR duplicates are removed and coverage statistics are reported. SNPs are identified using SAMtools and annotated using snpEff, a tool that predicts the effects of variants on genes.
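To illustrate what duplicate removal and coverage reporting involve, here is a minimal Python sketch. It is not GTAC's pipeline code: the read representation (a dict with `chrom`, `start`, `strand`, and `length`), the rule that two reads are duplicates when they share chromosome, start, and strand, and the function names are all assumptions made for this example.

```python
def remove_duplicates(reads):
    """Keep only the first read seen at each (chrom, start, strand).

    Assumption: reads with identical chromosome, start position, and
    strand are treated as PCR duplicates.
    """
    seen = set()
    unique = []
    for read in reads:
        key = (read["chrom"], read["start"], read["strand"])
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique


def coverage_stats(reads, region_start, region_end):
    """Mean depth and covered fraction over [region_start, region_end).

    Assumption: all reads lie on the same chromosome as the region.
    """
    depth = [0] * (region_end - region_start)
    for read in reads:
        lo = max(read["start"], region_start)
        hi = min(read["start"] + read["length"], region_end)
        for pos in range(lo, hi):
            depth[pos - region_start] += 1
    covered = sum(1 for d in depth if d > 0)
    return {
        "mean_depth": sum(depth) / len(depth),
        "fraction_covered": covered / len(depth),
    }


# Hypothetical toy data: the second read duplicates the first.
reads = [
    {"chrom": "chr1", "start": 0, "strand": "+", "length": 5},
    {"chrom": "chr1", "start": 0, "strand": "+", "length": 5},
    {"chrom": "chr1", "start": 3, "strand": "+", "length": 5},
]
unique_reads = remove_duplicates(reads)
stats = coverage_stats(unique_reads, 0, 10)
print(len(unique_reads), stats)
```

Production tools work on aligned BAM files and take mate positions and base qualities into account; the sketch only shows the idea behind the two reported steps.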


RNA-Seq

FASTQ files are aligned to the transcriptome and the whole genome with STAR. When biological replicates are present, our analytical pipeline also performs standard EdgeR and Sailfish analyses of gene-level and exon-level features. These analyses use the count of uniquely aligned, unambiguous reads per feature, as determined by Subread:featureCounts, and apply a negative-binomial test or a generalized linear negative-binomial model, respectively. Genes or exons are first filtered to those that are expressed, and the results are then filtered to those found to be differentially expressed.

Here is an example of expression analysis output: https://htcf.wustl.edu/files/3kMJmRXA/
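As a rough illustration of the "expressed features" filter described above, the following Python sketch converts raw per-feature read counts to counts per million (CPM) and keeps features that reach a minimum CPM in a minimum number of samples. This is a common convention, not something the text specifies: the thresholds (`min_cpm`, `min_samples`), the function names, and the toy counts are assumptions, and the actual differential expression testing is left to the tools named above.

```python
def counts_per_million(counts):
    """Convert raw counts to CPM.

    counts: dict mapping feature name -> list of per-sample read counts.
    Assumption: every sample has a non-zero total read count.
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(c[i] for c in counts.values()) for i in range(n_samples)]
    return {
        feature: [c[i] * 1e6 / lib_sizes[i] for i in range(n_samples)]
        for feature, c in counts.items()
    }


def filter_expressed(counts, min_cpm=1.0, min_samples=2):
    """Keep features with CPM >= min_cpm in at least min_samples samples."""
    cpm = counts_per_million(counts)
    return {
        feature: counts[feature]
        for feature, values in cpm.items()
        if sum(v >= min_cpm for v in values) >= min_samples
    }


# Hypothetical counts for three genes across two replicate samples.
raw_counts = {
    "geneA": [900, 950],
    "geneB": [100, 50],
    "geneC": [0, 0],
}
expressed = filter_expressed(raw_counts)
print(sorted(expressed))
```

Filtering before testing reduces the number of hypotheses tested and removes features with too few reads to support a reliable estimate.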

For more information see:
STAR
EdgeR
Sailfish
Subread
Authors

 

ChIP-Seq

Sequencing reads are aligned with Novoalign. Peaks are detected using MACS. If the investigator provides sample/control information, we can include it in the analysis.

 

Demultiplexing

Because the HiSeq can produce tremendous amounts of sequence in a single lane, multiple samples are often mixed into a single lane of a flow cell. To make this possible, each sample is tagged in a process known as multiplexing. The library core handles the multiplex labeling, and the analytic team 'demultiplexes' the reads, assigning each of them to the appropriate sample. Demultiplexing is included in all of the analytic pipelines.
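The assignment step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code the analytic team runs: it assumes each read arrives with a short tag sequence (called a barcode here), that a sample sheet maps each barcode to a sample name, and that one mismatch is tolerated. The names `demultiplex`, `barcode_to_sample`, and `max_mismatches` are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, barcode_to_sample, max_mismatches=1):
    """Assign each (barcode, sequence) read to a sample.

    A read is assigned only if exactly one known barcode lies within
    max_mismatches of the observed barcode; otherwise it is placed in
    the "undetermined" bin.
    """
    bins = {sample: [] for sample in barcode_to_sample.values()}
    bins["undetermined"] = []
    for barcode, sequence in reads:
        matches = [
            sample
            for known, sample in barcode_to_sample.items()
            if len(known) == len(barcode)
            and hamming(known, barcode) <= max_mismatches
        ]
        if len(matches) == 1:
            bins[matches[0]].append(sequence)
        else:
            bins["undetermined"].append(sequence)
    return bins


# Hypothetical sample sheet and reads.
sample_sheet = {"ACGT": "sample1", "TGCA": "sample2"}
lane_reads = [
    ("ACGT", "AAAA"),  # exact match to sample1
    ("ACGA", "CCCC"),  # one mismatch from sample1's barcode
    ("TGCA", "GGGG"),  # exact match to sample2
    ("GGGG", "TTTT"),  # matches no barcode closely enough
]
assigned = demultiplex(lane_reads, sample_sheet)
print(assigned)
```

Requiring a single unambiguous match keeps a sequencing error in the tag from sending a read to the wrong sample.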

 

Sequence Alignment

GTAC can align your data using Novoalign, a fast and accurate gapped aligner. Alignment is the first step in many of our analysis pipelines, but it can also be run as a stand-alone service.