
Computing the probability of an offspring having at least one dominant allele


I am trying to solve the 'Mendel's First Law' problem on http://rosalind.info/

I have tried several different approaches, but I can't get my solution to return the same answer as the sample output on their page, which I know is correct.

Here is what I have:

data Genotype = Dominant | Heterozygous | Recessive

traitProb :: Int -> Int -> Int -> Double
traitProb k m n = getProb list
      where list = cartProd genotypes genotypes
            genotypes = (replicate k Dominant) ++ (replicate m Heterozygous) ++ (replicate n Recessive)
            getProb = sum . map ((flip (/)) total . getMultiplier)
            total = fromIntegral $ length list
            getMultiplier (Dominant, Dominant) = 1.0
            getMultiplier (Recessive, Dominant) = 1.0
            getMultiplier (Dominant, Recessive) = 1.0
            getMultiplier (Dominant, Heterozygous) = 1.0
            getMultiplier (Heterozygous, Dominant) = 1.0
            getMultiplier (Heterozygous, Heterozygous) = 0.75
            getMultiplier (Heterozygous, Recessive) = 0.5
            getMultiplier (Recessive, Heterozygous) = 0.5
            getMultiplier (Recessive, Recessive) = 0.0

I am not sure whether the code is wrong or my method of computing the probability is wrong. The idea is to build a list of all possible pairs of parents and then, based on whether each parent is homozygous dominant, heterozygous or homozygous recessive, compute the probability of that pair producing a child with at least one dominant allele. Each result is then divided by the total number of pairs of parents, and the results are summed. My answer comes out close to the expected one, but not equal to it.

Can anyone point me in the right direction?

EDIT: cartProd is the 'cartesian product' of the two lists passed to it, if you will.

cartProd :: [a] -> [a] -> [(a, a)]
cartProd xs ys = [ (x, y) | x <- xs, y <- ys ]

Solution

  • I suggest making a slight change in your thinking by doing the calculation in three steps:

    1. What is the probability of getting genotype X for the first parent? (Also, how many different choices are there for X?)

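    2. What is the probability of getting genotype Y for the second parent, given that the first parent has already been picked? (The two parents are distinct organisms, so the second one is drawn from the remaining population.)

    3. Given the genotypes X and Y of the parents, what is the probability of a child having at least one dominant allele?

    Multiply the three probabilities for each (X, Y) pair, then sum the products over all pairs.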

    When I drew the tree diagram by hand, I found it easier to calculate the probability of a child NOT having the dominant allele. There are fewer choices to sum and then you can subtract this sum from 1.
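The steps above can be sketched in Haskell roughly as follows. This is only an illustration, not the original poster's code: the names `childDominant`, `dominantProb` and `dominantProbComplement` are made up for this example, and it assumes the two parents are drawn without replacement from a population of at least two organisms. The first function multiplies the three probabilities for every ordered (X, Y) pair and sums the products; the second computes the probability of a fully recessive child and subtracts it from 1.

```haskell
-- Illustrative sketch; all names here are hypothetical.
data Genotype = Dominant | Heterozygous | Recessive
  deriving (Eq, Show)

-- Step 3: probability that a child of parents X and Y has at least
-- one dominant allele.
childDominant :: Genotype -> Genotype -> Double
childDominant Recessive    Recessive    = 0.0
childDominant Heterozygous Recessive    = 0.5
childDominant Recessive    Heterozygous = 0.5
childDominant Heterozygous Heterozygous = 0.75
childDominant _            _            = 1.0

-- Steps 1-3 combined: for each ordered pair (X, Y), multiply
-- P(first = X), P(second = Y | first = X) and P(child dominant | X, Y),
-- then sum over all pairs.
-- Assumption: parents are chosen without replacement, so the second
-- parent comes from the remaining (total - 1) organisms.
dominantProb :: Int -> Int -> Int -> Double
dominantProb k m n = sum
  [ pFirst * pSecond * childDominant x y
  | (x, cx) <- counts
  , (y, cy) <- counts
  , let pFirst = cx / total
  , let remaining = if x == y then cy - 1 else cy
  , let pSecond = remaining / (total - 1)
  ]
  where
    counts :: [(Genotype, Double)]
    counts = [ (Dominant, fromIntegral k)
             , (Heterozygous, fromIntegral m)
             , (Recessive, fromIntegral n) ]
    total :: Double
    total = fromIntegral (k + m + n)

-- Complement approach: sum only the cases that can give a child with
-- no dominant allele, then subtract from 1.
dominantProbComplement :: Int -> Int -> Int -> Double
dominantProbComplement k m n = 1 - recessive
  where
    t  = fromIntegral (k + m + n) :: Double
    m' = fromIntegral m :: Double
    n' = fromIntegral n :: Double
    orderedPairs = t * (t - 1)           -- ordered pairs of distinct parents
    recessive =
      ( 0.25 * m' * (m' - 1)             -- Heterozygous x Heterozygous
      + 0.5 * 2 * m' * n'                -- Heterozygous x Recessive, either order
      + n' * (n' - 1)                    -- Recessive x Recessive
      ) / orderedPairs
```

For k = m = n = 2, both functions evaluate to 47/60, roughly 0.78333. Pairing the population with itself through a full cartesian product, as in the question, also counts each organism mating with itself, and for the same input that gives 27/36 = 0.75.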