Genomics in Practice 01: Introduction

1. What is GATK?

GATK stands for Genome Analysis Toolkit. It is a collection of command-line tools for analyzing high-throughput sequencing data, with a primary focus on variant discovery. The tools can be used individually or chained together into complete workflows. The GATK team also provides end-to-end workflows, called GATK Best Practices, tailored for specific use cases.

*Starting with version 4.0, GATK contains a copy of the Picard toolkit, so all Picard tools are available from within GATK itself.*

2. Analysis phases

(1) Data pre-processing is the first phase in all cases. It takes the raw sequence data (provided in FASTQ or uBAM format) and produces analysis-ready BAM files. This involves alignment to a reference genome as well as some data cleanup operations to correct for technical biases and make the data suitable for analysis.

(2) Variant discovery proceeds from analysis-ready BAM files and produces variant calls. This involves identifying genomic variation in one or more individuals and applying filtering methods appropriate to the experimental design. The output is typically in VCF format, although some classes of variants (such as CNVs) are difficult to represent in VCF and may therefore be represented in other structured text-based formats.

(3) Additional steps such as filtering and annotation may be required to produce a callset ready for downstream genetic analysis, depending on the application. This typically involves using resources of known variation, truthsets and other metadata to assess and improve the accuracy of the results as well as attach additional information.
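The three phases above can be sketched as a sequence of GATK command lines. The following is a minimal illustration, not the official Best Practices workflow: it assumes reads have already been aligned and coordinate-sorted into a BAM file, the function name `build_germline_commands` and all file names are hypothetical, and the single `QD < 2.0` hard filter is only an example threshold. The commands are built as argument lists and printed, not executed.

```python
import shlex
from typing import Dict, List


def build_germline_commands(
    sample: str, reference: str, known_sites: str
) -> Dict[str, List[List[str]]]:
    """Return GATK command lines grouped by analysis phase (nothing is run).

    Assumes `<sample>.aligned.bam` already exists and is coordinate-sorted.
    All file names below are hypothetical placeholders.
    """
    aligned = f"{sample}.aligned.bam"
    dedup = f"{sample}.dedup.bam"
    recal_table = f"{sample}.recal.table"
    ready = f"{sample}.analysis_ready.bam"
    raw_vcf = f"{sample}.raw.vcf.gz"
    filtered_vcf = f"{sample}.filtered.vcf.gz"

    return {
        # Phase 1: cleanup steps that turn an aligned BAM into an analysis-ready BAM.
        "pre_processing": [
            # MarkDuplicates is a Picard tool, called here through the gatk wrapper.
            ["gatk", "MarkDuplicates", "--INPUT", aligned, "--OUTPUT", dedup,
             "--METRICS_FILE", f"{sample}.dup_metrics.txt"],
            ["gatk", "BaseRecalibrator", "-R", reference, "-I", dedup,
             "--known-sites", known_sites, "-O", recal_table],
            ["gatk", "ApplyBQSR", "-R", reference, "-I", dedup,
             "--bqsr-recal-file", recal_table, "-O", ready],
        ],
        # Phase 2: call variants from the analysis-ready BAM.
        "variant_discovery": [
            ["gatk", "HaplotypeCaller", "-R", reference, "-I", ready, "-O", raw_vcf],
        ],
        # Phase 3: one illustrative hard filter; real projects tune this step.
        "filtering": [
            ["gatk", "VariantFiltration", "-R", reference, "-V", raw_vcf,
             "-O", filtered_vcf,
             "--filter-expression", "QD < 2.0", "--filter-name", "QD2"],
        ],
    }


if __name__ == "__main__":
    commands = build_germline_commands("sample1", "ref.fasta", "known_sites.vcf.gz")
    for phase, cmds in commands.items():
        print(f"# {phase}")
        for cmd in cmds:
            print(shlex.join(cmd))
```

Keeping each command as a list of arguments makes the chaining explicit: the output file of one step is the input of the next, which is the sense in which the tools are "chained together into complete workflows".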

3. Clinical Whole Genome Sequencing Workflow

4. Experimental designs

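| Strategy | Panel | Exome (WES) | Genome (WGS) |
| --- | --- | --- | --- |
| Size of target space (Mbp) | ~0.5 | ~50 | ~3200 |
| Average read depth | 500–1000× | 100–150× | ~30–60× |
| Relative cost | $ | $$ | $$$ |
| SNV/indel detection | ++ | ++ | ++ |
| CNV detection | + | + | ++ |
| SV detection | | | + |
| Low VAF | ++ | + | + |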
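To see why relative cost rises from panel to genome even though read depth falls, one can multiply target size by average depth to get a rough sequencing yield per sample. The sketch below uses only the target sizes and depth ranges from the table above. It is a back-of-the-envelope estimate that ignores off-target reads, duplicates, and capture efficiency, and the names `STRATEGIES` and `required_gbp` are made up for this example.

```python
# Target size (Mbp) and average depth range (min, max), taken from the table above.
STRATEGIES = {
    "Panel": (0.5, 500, 1000),
    "Exome (WES)": (50, 100, 150),
    "Genome (WGS)": (3200, 30, 60),
}


def required_gbp(target_mbp: float, depth: float) -> float:
    """Approximate on-target bases to sequence, in Gbp: target size x depth."""
    return target_mbp * depth / 1000  # Mbp -> Gbp


if __name__ == "__main__":
    for name, (size_mbp, depth_min, depth_max) in STRATEGIES.items():
        low = required_gbp(size_mbp, depth_min)
        high = required_gbp(size_mbp, depth_max)
        print(f"{name}: about {low:g} to {high:g} Gbp per sample")
```

Under these assumptions a panel needs well under 1 Gbp per sample, an exome several Gbp, and a genome on the order of 100 Gbp or more.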

References

https://gatk.broadinstitute.org/hc/en-us/sections/360007226651-Best-Practices-Workflows
https://www.nature.com/articles/s41525-022-00295-z
https://doi.org/10.1007/s00441-017-2636-6
https://genomemedicine.biomedcentral.com/counter/pdf/10.1186/s13073-020-00791-w.pdf
https://mp.weixin.qq.com/s/8bux7uTeZC5a23yVgExLIw