
The frequency filter Hardy and Weinberg dreamed of

You suspect a recessive mode of inheritance. What cutoff should you select for the frequency filter? Well, it depends: are you looking for a homozygous pathogenic mutation or for compound heterozygotes? This makes a big difference, and the Hardy-Weinberg principle might help you decide. Let’s review the math: in a sufficiently large population there is a relationship between allele and genotype frequencies if certain conditions are met (such as random mating and no selection). If f(a) is the frequency of allele ‘a’, the frequency of the genotype ‘aa’ should be close to g(aa)=f(a)*f(a). Let’s have a look at the figure: there are 10 individuals; 1 of them shows genotype ‘aa’, 4 are heterozygous, and the remaining 5 show the wildtype genotype ‘AA’. One of the heterozygous individuals inherited the allele ‘a’ from the mother and the other 3 from the father. Thus, the genotype frequency of ‘aa’ is g(aa)=1/10. For the allele frequency we have to count the total number of ‘a’s (2 from the homozygote plus 4 from the heterozygotes) and divide it by the number of all copies of the gene: f(a)=6/20. In this example we could state that the allele and genotype frequencies are in equilibrium, as f(a)*f(a)=36/400 is close to g(aa)=1/10.
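The counting in the ten-individual example can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name and its parameters are made up for this post and are not part of any filter implementation.

```python
from fractions import Fraction

def allele_and_genotype_frequencies(n_AA, n_Aa, n_aa):
    """Return (f(a), g(aa)) from genotype counts (hypothetical helper)."""
    n_individuals = n_AA + n_Aa + n_aa
    n_gene_copies = 2 * n_individuals      # every individual carries two copies
    n_a = n_Aa + 2 * n_aa                  # heterozygotes carry one 'a', homozygotes two
    f_a = Fraction(n_a, n_gene_copies)
    g_aa = Fraction(n_aa, n_individuals)
    return f_a, g_aa

# Counts from the example: 5 'AA', 4 heterozygous, 1 'aa'
f_a, g_aa = allele_and_genotype_frequencies(n_AA=5, n_Aa=4, n_aa=1)
expected_g_aa = f_a ** 2                   # Hardy-Weinberg expectation for 'aa'

print(f"f(a) = {f_a}, observed g(aa) = {g_aa}, expected g(aa) = {expected_g_aa}")
```

Exact fractions are used so that 6/20 and 36/400 show up in reduced form (3/10 and 9/100) without rounding noise.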

Now, what will happen if there is selective pressure on individuals with genotype ‘aa’? This is certainly the case for pathogenic alleles in recessive disease genes, and the homozygous individual in the example above is already fading. In this case the ‘aa’s are removed from the pool, but the effect on the allele frequency is not so overwhelming: it only drops from 6/20 to f(a)=4/18. However, the allele and genotype frequencies are no longer in Hardy-Weinberg equilibrium, because f(a)*f(a)=16/324 would predict some homozygotes, yet g(aa) is now 0. In fact, Hardy-Weinberg disequilibrium is often a strong indication of pathogenicity, and the mere existence of homozygotes in a healthy control group is a strong argument for ruling out a candidate mutation.
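To make the effect of selection concrete, here is a minimal sketch that removes the homozygous individual from the ten-individual example and recomputes the frequencies. It assumes complete selection against ‘aa’ (all homozygotes drop out of the pool), which is a simplification chosen for illustration.

```python
from fractions import Fraction

# Genotype counts from the example before selection
n_AA, n_Aa, n_aa = 5, 4, 1

# Assumption: selection removes every 'aa' individual from the pool
n_aa_after = 0
n_individuals_after = n_AA + n_Aa + n_aa_after           # 9 individuals remain

f_a_after = Fraction(n_Aa + 2 * n_aa_after, 2 * n_individuals_after)
observed_g_aa = Fraction(n_aa_after, n_individuals_after)
expected_g_aa = f_a_after ** 2                            # what Hardy-Weinberg would predict

print(f"f(a) after selection = {f_a_after}")              # 4/18, printed reduced as 2/9
print(f"observed g(aa) = {observed_g_aa}, expected g(aa) = {expected_g_aa}")
```

The allele frequency barely moves, while the gap between observed and expected homozygotes is what gives the pathogenic allele away.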

Now let’s think about how you can use that information for your filtering strategy. Let’s assume the recessive disease you are trying to elucidate has an incidence of about 1 in 10,000 individuals. If a single allele were responsible for all cases, its frequency would be about 1 in 100, which means the carrier rate could be as high as 2%, or two in a hundred healthy individuals. However, if there are more than e.g. 6 homozygotes in 60,000 controls, which is already the number the disease incidence itself would predict, you should wonder whether this is really the disease-causing mutation.
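The mental arithmetic behind these numbers can be written down as a short sketch. It assumes Hardy-Weinberg equilibrium and, as an upper bound, that one single allele causes all cases of the disease; with several pathogenic alleles in the gene, each individual allele would be rarer. The function name is made up for this illustration.

```python
import math

def carrier_rate_from_incidence(incidence):
    """Upper bound on the heterozygous carrier rate for a recessive disease.

    Assumes Hardy-Weinberg equilibrium and a single causal allele
    (hypothetical helper, not part of any filter implementation).
    """
    q = math.sqrt(incidence)   # incidence = q * q for homozygous 'aa'
    p = 1.0 - q
    return 2.0 * p * q         # frequency of heterozygous carriers

incidence = 1 / 10_000
n_controls = 60_000

carrier_rate = carrier_rate_from_incidence(incidence)
expected_homozygotes = incidence * n_controls

print(f"carrier rate is about {carrier_rate:.2%}")
print(f"homozygotes expected among {n_controls} individuals: {expected_homozygotes:.1f}")
```

Note that the 6 homozygotes are what you would expect in an unselected population. Healthy controls should contain fewer, which is why exceeding that number speaks against the candidate.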

When designing the new frequency filter, which works on the genotype frequencies of several thousand healthy controls, we could almost hear your complaints: “Come on, do you really expect me to do this mental arithmetic every time I analyze a case?” That’s why we tried to be smart about the ‘aa’s: once you set the cutoff for the heterozygous genotype frequency, we automatically set the homozygous genotype frequency to a reasonably lower value and leave only the fine adjustment to you. Enjoy!