
Poll about Gene Panel Usage

Hello everyone,

Gene panels comprising hundreds of genes are becoming more and more popular for analyzing patients with rare inherited disorders. Currently, enrichment or amplification kits from Agilent (SureSelect), Illumina (TruSight), and Ion Torrent (AmpliSeq) seem to be the most widely used.

We are working on a really great new feature that will make it almost fun to analyze variants from such gene panels. The new prioritization feature will be platform agnostic and we will support all kinds of gene panels. However, we would first like to run a short survey on the approaches used in the GeneTalk community.

We would highly appreciate it if you could participate in an online poll. It won’t take more than 30 seconds. Imagine a patient walks into your clinic and you suspect a monogenic disorder, but you are not exactly sure which gene to analyze first. What is your first choice for the diagnostic work? The following poll has three preset answers, each a gene panel for this use case:

Ion Torrent AmpliSeq Inherited Disease (328 genes)

Illumina TruSight Inherited Disease (552 genes)

Agilent SureSelect Inherited Disease (2932 genes)

The performance of the Agilent panel was analyzed by Robinson et al. and showed a diagnostic yield of more than 30%.

Please also indicate if you are using another gene panel for solving the case by selecting “others”. This could be e.g. the TruSight One panel, an exome, or anything else.

Enough explanations, now let’s start the voting!

 

 

 

 

First experiences with the Watchlist

The Watchlist – a customer-pull request

I just read “Running Lean” by Ash Maurya and learned many new words! Now I can refer to some of the stuff we recently released with the appropriate terms! The research variant database was clearly a customer-pull request. We heard from many different users about the dilemma of finding a second patient. So we moved this feature quickly from backlog to development and released it around Christmas.

Over the recent weeks we got feedback about the first user experiences with the new Watchlist and created a new whitepaper for the research variant database that also tries to address some frequently asked questions. Please let us know what isn’t covered yet!

GeneTalk Rankings

Thanks to all of you who contributed valuable annotations, comments and rankings for variants over the recent months! Many of you pointed out that some variants were ranked too high in the category medical relevance. Well, from an epidemiologist’s point of view it is often difficult to decide whether an allele is merely disease associated or actually disease causing. A small association does not mean that there is no causal effect. However, the smaller the association, the less likely it is to be causal.

With the recent data from large population studies we reviewed the Bradford Hill criterion “strength” for all variants. If an allele occurs in at least one percent of the healthy population, or if at least 6 individuals in 60,000 show a homozygous genotype for the allele and do not suffer from a rare disorder, then we reassessed the variant as probably not disease causing. However, if a variant is mentioned in an article, then we point this out by a 2 star ranking in medical relevance and link to the paper in the annotation.
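The reassessment rule described above can be sketched in a few lines of Python. This is only an illustration of the two thresholds, not GeneTalk's actual implementation: the function name and its parameters are hypothetical, and "occurs in at least one percent of the healthy population" is read here as an allele frequency of at least 0.01.

```python
def probably_not_disease_causing(allele_frequency, healthy_homozygotes,
                                 cohort_size=60_000):
    """Hypothetical helper: True if the allele looks too common to be causal.

    allele_frequency    -- allele frequency among healthy individuals (assumed reading)
    healthy_homozygotes -- homozygous carriers without a rare disorder
    cohort_size         -- number of individuals those carriers were counted in
    """
    # Rule 1: the allele occurs in at least 1% of the healthy population.
    too_common = allele_frequency >= 0.01
    # Rule 2: at least 6 healthy homozygotes per 60,000 individuals.
    # Compared with integers to avoid floating point rounding.
    too_many_homozygotes = healthy_homozygotes * 60_000 >= 6 * cohort_size
    return too_common or too_many_homozygotes


if __name__ == "__main__":
    print(probably_not_disease_causing(0.02, 0))    # common allele
    print(probably_not_disease_causing(0.001, 6))   # rare, but 6 healthy homozygotes
    print(probably_not_disease_causing(0.001, 5))   # rare, below both thresholds
```

A variant flagged this way would still keep its 2 star ranking and paper link if it is mentioned in an article, as described above.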

Now it depends again on the GeneTalk community to further clarify the status of such variants! As Bradford Hill phrased it in his second criterion: “Consistent findings observed by different persons in different places with different samples strengthens the likelihood of an effect”. As described in more detail in our whitepaper about annotations, the ranking of a variant that is used in the filter changes if the majority of GeneTalk users think that another classification is appropriate.

Research Variants Watchlist

 

Dear GeneTalk Community,

thank you all so much for the annotations, comments and rankings on variants of uncertain clinical significance, VUCS, that you contributed over the recent months! We are so happy that we could help to establish many new research collaborations that resulted in fascinating articles this year: People *) came together **) looked at mutations and ***) solved challenging cases in a joint effort [link to vici paper]! 

Now we are proud to present a major upgrade in GeneTalk that will allow us to continue this success story in the next year. First of all, we completely rethought the way you analyze cases. No matter whether it’s a single VCF file or multiple VCF files that you upload, in the end it’s cases that you want to solve. That’s why we now list all the samples that you have in your GeneTalk account. It sounds so obvious, but it really needed some tough reengineering of the platform, as we had to extract the samples from all your files and link them with the phenotype information that you provided.

We think that the new focus on samples instead of files will actually boost your productivity in an additional way, as it will allow you to start a well-defined query for any candidate mutation in any sample. Besides the annotations and comments that you already know in GeneTalk and that are usually visible to all GeneTalk users, we designed a Research Variant Watchlist. The Watchlist is basically a secured environment that will establish research collaborations between GeneTalk users with a common interest, which is a VUCS in a patient. If you like this new feature we will certainly extend it to gene and phenotype matching as well. So we are looking forward to your feedback on the user experience!

Best wishes and a happy holiday,

Your GeneTalk Team

Stay tuned!

Web 2.0 brought us many new ways to communicate and GeneTalk wants you to use all this fancy technology for your research. Let’s take Twitter for example: If a user decides to make an annotation in GeneTalk that’s visible to the public, then we will automatically generate a tweet via the GeneTalk account @Gene_Talk. Nadja Ehmke, for example, decided to make her annotation about a disease-causing mutation in TGDS public. Her manuscript showing that this mutation can cause Catel-Manzke syndrome was just accepted by the American Journal of Human Genetics and her work will appear in the December issue of that journal.
However, if you follow the GeneTalk account you could learn about this interesting finding way ahead of the general public!

p.s.: In case you don’t have a Twitter account nor a GeneTalk account, this is the GeneTalk annotation Nadja’s tweet is referring to:

http://gene-talk.de/annotations/887512

Convinced of its usefulness? So ask your grandma to sign you up at Twitter or GeneTalk, …

 

GeneTalk’s frequency filter now based on more than 65,000 exomes

Standing on the shoulders of giants

Power does not consist in striking with force but in striking with frequency! We slightly adapted Balzac’s catechism so that it can now be applied to sequence variants. We updated GeneTalk’s frequency filter with the genotype data of more than 65,000 exomes from the Exome Aggregation Consortium (ExAC). Thanks to Daniel MacArthur and all the others involved for this great data set!

Many of us analyze rare diseases. So rare that often fewer than one individual in ten thousand is affected. For a patient of consanguineous parents who suffers from a recessive illness, a common filtering approach was to apply a frequency filter of 0.001 for homozygous genotypes. The rationale behind this parameter setting is the following: A highly penetrant disorder cannot be caused by an allele that occurs in a homozygous state more frequently than the incidence of the disease. Well, if this holds true, why are we not using a filtering cutoff of 0.0001? It basically didn’t make a difference so far, because the control group consisted of only a few thousand individuals from the 1000 Genomes Project. There was simply very little power to detect homozygous genotypes with a frequency of 0.0001. However, this now changes with the 65,000 exomes from ExAC. The probability that we will find at least one healthy individual who is homozygous for a non-deleterious allele of frequency 0.01 is above 50%. (This is the calculation: Assuming Hardy-Weinberg equilibrium for the allele, the probability that none of 65,000 individuals is homozygous is (1-(0.01)^2)^65,000, and the probability of finding at least one is one minus that value.)
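For those who want to play with the numbers, here is a minimal Python sketch of the calculation in the parentheses above. It assumes Hardy-Weinberg equilibrium and independent individuals; the function name is made up for this example, and the cohort of 2,500 is just a stand-in for "a few thousand" controls.

```python
def prob_at_least_one_homozygote(allele_frequency, n_individuals):
    """Probability that at least one of n individuals is homozygous for the allele.

    Assumes Hardy-Weinberg equilibrium, so the homozygous genotype
    frequency is the allele frequency squared.
    """
    homozygous_frequency = allele_frequency ** 2
    # Probability that none of the n individuals carries the homozygous genotype.
    p_none = (1.0 - homozygous_frequency) ** n_individuals
    return 1.0 - p_none


if __name__ == "__main__":
    # 2,500 is an illustrative cohort size, 65,000 is the ExAC figure from the text.
    for cohort in (2_500, 65_000):
        p = prob_at_least_one_homozygote(0.01, cohort)
        print(f"{cohort:>6} individuals: P(at least one homozygote) = {p:.4f}")
```

With 65,000 individuals the probability comes out far above the 50% mentioned above (more than 0.99), whereas with a cohort of 2,500 it stays below one in four. That is the gain in power the larger control group brings.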

That means you will tremendously benefit in your analysis from the new frequency filter in GeneTalk. So strike it with frequency!

Quality and Coverage Filters

Many of you asked us for a quality filter, so here it is! Compared to all the other filters it wasn’t actually that hard to implement, so you might wonder what took us so long. Well, the problem basically starts right away with the term “quality”. What is actually meant by quality? My best explanation would be: The creators of the VCF format wanted to include something like an error rate or p-value for the trustworthiness of a variant call. However, when they realized that this is not that easy, they included something in that direction, but definitely no proper probability, and called it the quality column.

Some people say the value in the quality column is something like a Phred score. That is, minus ten times the base-10 logarithm of the probability that the variant call is wrong. A quality score of 30 would then mean a 1:1000 chance that the variant call is a false positive. Sounds good, so what don’t I like about it then? Well, this requires a probability model that is reasonable and this is simply not the case for most variant callers. An easy example: Most probability models assume diploid organisms with either heterozygous or homozygous genotypes. Thus the quality value is not applicable if you are interested in somatic mutations or mosaics. Another scenario where the quality value is usually meaningless is any value above 100. Most probability models in the variant callers ignore the fact that the DNA fragments are amplified before they are sequenced. Consider for example a position in the genome for which DNA was extracted from around 50 cells. If the genotype at this position is heterozygous we would have 50 “ref” alleles and 50 “alt” alleles. However, if we sequence with a sequencing depth of around 200, the quality value would suggest that this call is much more trustworthy than one with a sequencing depth of only 100. But in this case a binomial model for the distribution of the sequence fragments simply doesn’t apply anymore.
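To make the Phred interpretation concrete, here is a small Python sketch of the conversion in both directions. It only shows the arithmetic under the assumption that the quality value really is a Phred-scaled error probability, which, as argued above, is often not the case. The function names are made up for this example.

```python
import math


def phred_to_error_probability(quality):
    """Error probability implied by a Phred-scaled quality value."""
    return 10 ** (-quality / 10)


def error_probability_to_phred(probability):
    """Phred-scaled quality for a given error probability (must be > 0)."""
    return -10 * math.log10(probability)


if __name__ == "__main__":
    for quality in (10, 20, 30, 100):
        p = phred_to_error_probability(quality)
        print(f"QUAL {quality:>3}: implied error probability {p:.0e}")
```

A quality of 100 would imply an error probability of 1 in 10 billion, which no real sequencing experiment can back up. That is one way to see why very high values should not be taken literally.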

In these cases it would be safer to have a look at the coverage instead of the quality value. Here, most variant callers provide information with either the DP, AD, or DP4 flag. The DP flag tells you how many sequence reads cover a certain position; the AD flag lists the number of sequence reads with the reference allele and with the alternative allele. The flag that I like best is DP4. Here the numbers of reads with the reference or the alternative allele that have been aligned forward or reverse are listed separately, in the order ref-forward, ref-reverse, alt-forward, alt-reverse. This allows you to see whether there is e.g. a strand-specific sequencing artifact: DP4:0,0,10,0 looks suspicious whereas 0,0,5,5 looks very promising. As you know, all that information is shown when you move the mouse over the variant in the VCFviewer, if it was annotated in your VCF file.
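The DP4 check described above is easy to script. The following Python sketch parses a DP4 value (ref-forward, ref-reverse, alt-forward, alt-reverse) and flags calls whose alternative allele is seen on one strand only. The function names and the `min_alt_reads` threshold are assumptions for illustration, not part of GeneTalk or of any variant caller.

```python
def parse_dp4(value):
    """Split a DP4 string such as '0,0,10,0' into four integer read counts."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in value.split(","))
    return ref_fwd, ref_rev, alt_fwd, alt_rev


def alt_on_single_strand(value, min_alt_reads=5):
    """True if all alt reads come from one strand (possible artifact).

    min_alt_reads is an arbitrary illustrative threshold: with very few
    alt reads, seeing them on one strand only is not surprising.
    """
    _, _, alt_fwd, alt_rev = parse_dp4(value)
    enough_reads = alt_fwd + alt_rev >= min_alt_reads
    return enough_reads and (alt_fwd == 0 or alt_rev == 0)


if __name__ == "__main__":
    for dp4 in ("0,0,10,0", "0,0,5,5"):
        verdict = "suspicious" if alt_on_single_strand(dp4) else "looks fine"
        print(dp4, "->", verdict)
```

This is deliberately crude. A proper strand bias test would compare the strand distribution of ref and alt reads statistically rather than only checking for zeros.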

So to summarize, the quality filter and the coverage filter require a healthy skepticism on the part of the user and some knowledge about how these values were created.

But to get started there are some rules of thumb: If you are looking for homozygous variants in your data, a minimum coverage of 5 should be a good trade-off between noise and a true signal; for heterozygous variants you should rather set the minimum to 10. The quality values depend highly on the probability model, as stated, but a minimum value of 30 is a good starting point.
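As a starting point, the rules of thumb above could be written down like this in Python. This is a sketch, not the GeneTalk filter itself: the function names are hypothetical, and the genotype is assumed to be given as a VCF-style GT string such as "0/1" or "1|1".

```python
import re


def is_homozygous(genotype):
    """True if all alleles in a GT string like '1/1' or '0|0' are identical."""
    alleles = re.split(r"[/|]", genotype)
    return "." not in alleles and len(set(alleles)) == 1


def passes_rules_of_thumb(quality, depth, genotype):
    """Apply the starter thresholds: QUAL >= 30, depth >= 5 (hom) or >= 10 (het)."""
    if "." in genotype:
        # Missing genotype calls are rejected in this sketch.
        return False
    min_depth = 5 if is_homozygous(genotype) else 10
    return quality >= 30 and depth >= min_depth


if __name__ == "__main__":
    calls = [(45, 6, "1/1"), (45, 6, "0/1"), (20, 40, "0/1")]
    for quality, depth, genotype in calls:
        print(quality, depth, genotype, passes_rules_of_thumb(quality, depth, genotype))
```

Treat the three numbers as defaults to tune for your own data, with the caveats about quality values from above in mind.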

Please let us know about your experience and discuss it with the community in this blog entry!

 

39 novel pathogenic mutations for Usher Syndrome

It took us quite a while, but we finally got our paper accepted. That means 39 variants of unknown clinical significance can now be regarded as disease causing. You will find all annotations in GeneTalk, as well as in ClinVar in one of the next releases.

New case solved with the help of GeneTalk

Hey Mateusz, congratulations on your new paper:
“Missense variant in CCDC22 causes X-linked
recessive intellectual disability with features
of Ritscher-Schinzel/3C syndrome”.
And thank you for contributing valuable annotations to our knowledge base!

ESHG 2014

Yes, you are right, that’s a jar of pickled gherkins in front of us and it was completely empty after the ESHG conference in Milan. So no slack season for GeneTalk! Thanks for all your interest!