I am trying to solve the 'Mendel's First Law' problem on http://rosalind.info/
I have tried several different approaches, but I just can't get my solution to return the same answer as the sample problem on their page. I know their sample output is correct though.
Here is what I have:
traitProb :: Int -> Int -> Int -> Double
traitProb k m n = getProb list
  where
    list = cartProd genotypes genotypes
    genotypes = replicate k Dominant ++ replicate m Heterozygous ++ replicate n Recessive
    getProb = sum . map ((flip (/)) total . getMultiplier)
    total = fromIntegral $ length list
    getMultiplier (Dominant, Dominant) = 1.0
    getMultiplier (Recessive, Dominant) = 1.0
    getMultiplier (Dominant, Recessive) = 1.0
    getMultiplier (Dominant, Heterozygous) = 1.0
    getMultiplier (Heterozygous, Dominant) = 1.0
    getMultiplier (Heterozygous, Heterozygous) = 0.75
    getMultiplier (Heterozygous, Recessive) = 0.5
    getMultiplier (Recessive, Heterozygous) = 0.5
    getMultiplier (Recessive, Recessive) = 0.0
I am not sure whether the code is wrong or my method of computing the probability is wrong. The idea is to build a list of all possible pairs of parents and then, based on whether each parent is homozygous dominant, heterozygous, or homozygous recessive, compute the probability of that pair producing a child with at least one dominant allele. I divide each result by the total number of pairs of parents and then sum the list. But my answer is off by a small amount.
Can anyone point me in the right direction?
EDIT: cartProd is the 'cartesian product' of the two lists passed to it, if you will.
cartProd :: [a] -> [a] -> [(a, a)]
cartProd xs ys = [ (x, y) | x <- xs, y <- ys ]
I suggest making a slight change in your thinking by doing the calculation in three steps:
1. What is the probability of getting genotype X for the first parent? (Also, how many different choices are there for X?)
2. What is the probability of getting genotype Y for the second parent, given that the first parent has already been taken out of the population?
3. Given the genotypes X and Y of the parents, what is the probability of a child displaying the dominant phenotype?
Multiply the probabilities from steps 1-3 for each (X, Y) pair, then sum the products over all pairs.
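As a rough sketch of the steps above (not necessarily how you would want to structure your final solution), the product of the three probabilities can be summed over all genotype pairs as below. The names dominantProb, childDominant, pFirst, pSecond and count are my own, and I am assuming the two parents are distinct organisms drawn without replacement from a population of at least two.

```haskell
data Genotype = Dominant | Heterozygous | Recessive
  deriving (Eq, Show, Enum, Bounded)

-- Step 3: probability that a child of parents x and y shows the dominant trait.
childDominant :: Genotype -> Genotype -> Double
childDominant Recessive    Recessive    = 0.0
childDominant Heterozygous Recessive    = 0.5
childDominant Recessive    Heterozygous = 0.5
childDominant Heterozygous Heterozygous = 0.75
childDominant _            _            = 1.0

-- Sum over all ordered genotype pairs (x, y) of
--   P(first parent is x) * P(second parent is y | first is x) * P(dominant child | x, y).
-- Assumes k + m + n >= 2, otherwise the division by (total - 1) is meaningless.
dominantProb :: Int -> Int -> Int -> Double
dominantProb k m n = sum
  [ pFirst x * pSecond x y * childDominant x y
  | x <- [minBound .. maxBound]
  , y <- [minBound .. maxBound]
  ]
  where
    total = fromIntegral (k + m + n) :: Double

    count :: Genotype -> Double
    count Dominant     = fromIntegral k
    count Heterozygous = fromIntegral m
    count Recessive    = fromIntegral n

    -- Step 1: the first parent is drawn from the whole population.
    pFirst x = count x / total

    -- Step 2: the second parent is drawn from the remaining (total - 1) organisms,
    -- so one organism of genotype x is no longer available.
    pSecond x y = (count y - (if x == y then 1 else 0)) / (total - 1)
```

Compare step 2 here with cartProd genotypes genotypes in the question: the cartesian product also pairs every organism with itself, which is what the conditional probability in pSecond rules out.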
When I drew the tree diagram by hand, I found it easier to calculate the probability of a child NOT having the dominant allele. There are fewer choices to sum and then you can subtract this sum from 1.
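To illustrate the complement idea, here is a minimal sketch that counts only the ordered parent pairs able to produce a recessive child. The function name dominantProbComplement is hypothetical, and I am again assuming two distinct parents drawn from a population of at least two.

```haskell
-- Only three kinds of pairing can produce a child with no dominant allele:
--   Recessive x Recessive       -> always recessive,  n * (n - 1) ordered pairs
--   Heterozygous x Recessive    -> recessive 1/2,     2 * m * n ordered pairs
--   Heterozygous x Heterozygous -> recessive 1/4,     m * (m - 1) ordered pairs
-- Assumes k + m + n >= 2.
dominantProbComplement :: Int -> Int -> Int -> Double
dominantProbComplement k m n = 1 - recessive / pairs
  where
    m'    = fromIntegral m :: Double
    n'    = fromIntegral n :: Double
    total = fromIntegral (k + m + n) :: Double

    -- Number of ordered pairs of distinct organisms.
    pairs = total * (total - 1)

    -- Expected number of ordered pairs yielding a recessive child.
    -- The middle term is 2 * m * n * 0.5, which simplifies to m * n.
    recessive = n' * (n' - 1) + m' * n' + 0.25 * m' * (m' - 1)
```

For k = m = n = 2 the recessive term is 2 + 4 + 0.5 = 6.5 out of 30 ordered pairs, so the result is 1 - 6.5/30, roughly 0.78333.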