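From genotype to phenotype: by filtering layer by layer in the forward direction, we narrow the variants down to a relatively small set. From phenotype to genotype: starting from the clinical description of the disease, we identify mutations in genes known to be associated with it, and look for novel candidate mutations to validate (Sanger…).
SNP/InDel filtering (Depth, Genotyping Quality, …)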
The basic rule is to remove sequencing artifacts and other systematic errors
Filtering low quality genotypes improves concordance
For GQ, we selected a minimum threshold of 20, corresponding to a Phred quality score with 99% accuracy.
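The relationship between a Phred-scaled GQ and call accuracy can be checked with a few lines of Python. This is a minimal sketch; the function names and the `None` convention for a missing GQ are assumptions made for illustration, not part of any particular pipeline.

```python
MIN_GQ = 20  # minimum genotype quality quoted in the text


def phred_to_accuracy(q: float) -> float:
    """Probability that a call is correct, given its Phred-scaled quality."""
    return 1.0 - 10.0 ** (-q / 10.0)


def passes_gq(gq, min_gq: int = MIN_GQ) -> bool:
    """Keep a genotype only if its GQ is present and at least min_gq.

    Treating a missing GQ (None) as a failure is an assumption of this sketch.
    """
    return gq is not None and gq >= min_gq


print(round(phred_to_accuracy(MIN_GQ), 4))  # 0.99
```

A GQ of 20 therefore means a 1 in 100 chance that the genotype call is wrong, and a GQ of 30 would mean 1 in 1000.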
Filtering low quality variants improves the Ti/Tv ratio
For DP, we selected a minimum threshold of eight reads, corresponding to a 2 × (½)^8 chance (<1%) that a biallelic variant would appear to be monoallelic by random chance, assuming a two-tailed binomial model where each allele of a biallelic variant has a 50% chance of being in each read.
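The depth threshold follows directly from the binomial argument: with two alleles each equally likely to appear in a read, all `n` reads show the same allele with probability 2 × (½)^n. The sketch below reproduces that arithmetic; the function names are illustrative only.

```python
def monoallelic_by_chance(depth: int) -> float:
    """Probability that all `depth` reads of a heterozygous site show one allele.

    Two-tailed: either allele may be the one that is seen exclusively.
    """
    return 2 * 0.5 ** depth


def min_depth_for(max_prob: float) -> int:
    """Smallest depth at which the chance of a false monoallelic call is below max_prob."""
    depth = 1
    while monoallelic_by_chance(depth) >= max_prob:
        depth += 1
    return depth


print(monoallelic_by_chance(8))  # 0.0078125
print(min_depth_for(0.01))       # 8
```

At seven reads the probability is still about 1.6%, which is why eight is the first depth that satisfies the <1% criterion.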
The order of the filtering steps is important: applying VQSR filtering as the final step in our method provided the highest quality variant dataset.
GATK’s best practices include a variant filtering step following Variant Quality Score Recalibration (VQSR). This “VQSR filter” uses annotation metrics, such as quality by depth, mapping quality, variant position within reads and strand bias, from “true” variants (variants found in HapMap phase 3 release 3) to generate an adaptive error model. It then applies this model to the remaining variants to calculate the probability that each variant is real. Using this recalibrated quality score, users can filter out lower quality variants.
Filter out known 1000 Genomes/dbSNP variants, filter out synonymous mutations (except at splice sites), filter out intergenic and intronic loci, and filter out any variant whose overall depth is less than 15 reads.
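The annotation-based filters above can be sketched as a single predicate over annotated variant records. The record layout here (a dict with `in_1000g`, `in_dbsnp`, `region`, `effect`, `splice_site` and `depth` keys) is a hypothetical format chosen for illustration; real annotation tools use their own field names and region labels.

```python
MIN_TOTAL_DEPTH = 15                         # overall depth cutoff from the text
EXCLUDED_REGIONS = {"intergenic", "intronic"}  # assumed region labels


def keep_variant(v: dict) -> bool:
    """Return True if a variant survives all four annotation filters."""
    # 1. Drop variants already catalogued in 1000 Genomes or dbSNP.
    if v["in_1000g"] or v["in_dbsnp"]:
        return False
    # 2. Drop variants with fewer than 15 reads of overall depth.
    if v["depth"] < MIN_TOTAL_DEPTH:
        return False
    # 3. Drop intergenic and intronic loci.
    if v["region"] in EXCLUDED_REGIONS:
        return False
    # 4. Drop synonymous changes unless they fall on a splice site.
    if v["effect"] == "synonymous" and not v["splice_site"]:
        return False
    return True


variants = [
    {"id": "v1", "in_1000g": False, "in_dbsnp": False, "region": "exonic",
     "effect": "missense", "splice_site": False, "depth": 40},
    {"id": "v2", "in_1000g": False, "in_dbsnp": True, "region": "exonic",
     "effect": "missense", "splice_site": False, "depth": 40},
    {"id": "v3", "in_1000g": False, "in_dbsnp": False, "region": "exonic",
     "effect": "synonymous", "splice_site": True, "depth": 22},
    {"id": "v4", "in_1000g": False, "in_dbsnp": False, "region": "intronic",
     "effect": "none", "splice_site": False, "depth": 60},
    {"id": "v5", "in_1000g": False, "in_dbsnp": False, "region": "exonic",
     "effect": "missense", "splice_site": False, "depth": 9},
]

kept = [v["id"] for v in variants if keep_variant(v)]
print(kept)  # ['v1', 'v3']
```

In this sketch a splice-site variant is assumed to carry its own region label rather than "intronic"; if the annotator labels splice sites as intronic, the region check would need the same splice-site exception as the synonymous check.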
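ClinVar Clinical Significance (Pathogenic / Likely pathogenic / Benign / Likely benign / Uncertain significance)
OMIM (phenotype/disease name, mode of inheritance, associated gene/locus region)
HGMD phenotype
Orphanet (incidence)
HPO (phenotype associations with Decipher/OMIM/Orphanet)
Proband/pedigree analysis: combine the heterozygous/homozygous state of each site with the mode of inheritance listed in OMIM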
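As an illustration of combining zygosity with the OMIM mode of inheritance, the following sketch checks whether a proband's candidate variants in one gene are compatible with a dominant or recessive model. It is deliberately simplified and rests on stated assumptions: only the codes "AD" and "AR" are handled, zygosity is given as "het" or "hom", and two heterozygous variants are treated as a possible compound heterozygote without any phasing or parental data.

```python
def consistent_with_inheritance(mode: str, zygosities: list) -> bool:
    """Check a gene's candidate variants in the proband against an inheritance mode.

    mode       -- "AD" (autosomal dominant) or "AR" (autosomal recessive)
    zygosities -- one "het" or "hom" entry per candidate variant in the gene
    """
    n_het = zygosities.count("het")
    n_hom = zygosities.count("hom")
    if mode == "AD":
        # One altered copy is enough under a dominant model.
        return n_het + n_hom >= 1
    if mode == "AR":
        # Needs a homozygous variant, or two hets (possible compound het, unphased).
        return n_hom >= 1 or n_het >= 2
    raise ValueError(f"unsupported inheritance mode: {mode}")


print(consistent_with_inheritance("AR", ["het"]))         # False
print(consistent_with_inheritance("AR", ["het", "het"]))  # True
print(consistent_with_inheritance("AD", ["het"]))         # True
```

With pedigree data available, the compound-heterozygous case should be confirmed by checking that the two variants were inherited from different parents.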
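Detect possible CNVs/SVs, especially common exon deletions and duplications, and use DGV/ExAC CNV/ClinGen/ClinVar/Decipher/CNVD to annotate each detected CNV/SV as a benign polymorphism or a pathogenic variant.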
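One common way to match a detected CNV against database entries is reciprocal overlap. The sketch below assumes that approach; the 50% cutoff, the half-open coordinates, and the record fields (`chrom`, `start`, `end`, `type`, `source`, `label`) are all illustrative assumptions, and the example entries are made up rather than taken from any database.

```python
def reciprocal_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> float:
    """Smaller of the two overlap fractions for half-open intervals [start, end)."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def annotate_cnv(cnv: dict, known: list, min_overlap: float = 0.5) -> list:
    """Collect (source, label) pairs from known CNVs that match the detected one."""
    hits = []
    for k in known:
        same_kind = k["chrom"] == cnv["chrom"] and k["type"] == cnv["type"]
        if same_kind and reciprocal_overlap(
            cnv["start"], cnv["end"], k["start"], k["end"]
        ) >= min_overlap:
            hits.append((k["source"], k["label"]))
    return hits


# Made-up example records, for illustration only.
detected = {"chrom": "chr1", "start": 1000, "end": 2000, "type": "DEL"}
known_cnvs = [
    {"chrom": "chr1", "start": 1200, "end": 2200, "type": "DEL",
     "source": "DGV", "label": "benign"},
    {"chrom": "chr1", "start": 1900, "end": 5000, "type": "DEL",
     "source": "ClinVar", "label": "pathogenic"},
    {"chrom": "chr1", "start": 1000, "end": 2000, "type": "DUP",
     "source": "Decipher", "label": "pathogenic"},
]

print(annotate_cnv(detected, known_cnvs))  # [('DGV', 'benign')]
```

Requiring the overlap to be reciprocal prevents a small detected event from inheriting the label of a much larger database entry that merely contains it.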