Lecture 4: One-locus selection
Lecture overview
- Review of Hardy-Weinberg
- One-locus haploid selection in discrete time
- One-locus diploid selection in discrete time
- Comparing haploid and diploid selection
- One-locus haploid selection in continuous time
- Comparing haploid selection in discrete and continuous time
- The models we've covered
1. Review of Hardy-Weinberg
In Lecture 2 we saw that one round of random union among gametes (equivalently, random mating among diploids) causes the frequency of diploid genotypes at a locus with two alleles, \(A\) and \(a\), to become
- \(AA\): \(p^2\)
- \(Aa\): \(2pq\)
- \(aa\): \(q^2\)
where \(p\) is the frequency of allele \(A\) and \(q=1-p\) is the frequency of allele \(a\).
A population with these diploid genotype frequencies is said to be in Hardy-Weinberg equilibrium. Furthermore, we showed that under this model the allele frequencies do not change, \(p' = p\).
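As a quick numerical check of this review, the following minimal sketch computes the Hardy-Weinberg genotype frequencies for a given \(p\) and confirms that the allele frequency is unchanged. The function names and the starting value \(p = 0.3\) are illustrative assumptions, not part of the lecture.

```python
def hardy_weinberg(p):
    """Genotype frequencies (AA, Aa, aa) after one round of random union of gametes."""
    q = 1 - p
    return p**2, 2 * p * q, q**2


def allele_frequency(f_AA, f_Aa, f_aa):
    """Frequency of A: every allele in AA plus half of the alleles in Aa."""
    return f_AA + f_Aa / 2


p = 0.3  # illustrative starting frequency (assumed value)
f_AA, f_Aa, f_aa = hardy_weinberg(p)
p_next = allele_frequency(f_AA, f_Aa, f_aa)

print("genotype frequencies:", f_AA, f_Aa, f_aa)
print("allele frequency before and after:", p, p_next)
```

Algebraically, \(p^2 + \tfrac{1}{2}(2pq) = p(p+q) = p\), which is why the two printed allele frequencies agree up to floating-point rounding.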
But what if there is natural selection?
2. One-locus haploid selection in discrete time
We begin by examining a model where selection acts during the haploid phase of the life-cycle. This is a little simpler than diploid selection because there are only two haploid genotypes, \(A\) and \(a\). It is also a very relevant model for species with long haploid phases (e.g., some algae and fungi) or with strong competition during the haploid phase (e.g., pollen competing for ovules).
Life-cycle diagram
We first illustrate the structure of the model with a life-cycle diagram.
graph LR;
A((p)) --gamete union--> B((p'));
B --meiosis--> C((p''));
C --selection--> A;
We will census the population at the beginning of the haploid phase (immediately after meiosis).
Deriving the equations
Let the number of haploid individuals with each allele be
- \(n_A(t) =\) number of individuals with the \(A\) allele in generation \(t\)
- \(n_a(t) =\) number of individuals with the \(a\) allele in generation \(t\)
The frequency of \(A\) is therefore \(p(t) = \frac{n_A(t)}{n_A(t) + n_a(t)}\).
Now, let’s assume that during selection each haploid individual has reproductive factor
- \(W_A =\) reproductive factor of individuals with the \(A\) allele
- \(W_a =\) reproductive factor of individuals with the \(a\) allele
These reproductive factors are referred to as the absolute fitnesses as they determine the (absolute) numbers of individuals after selection, \(n_i' = W_i n_i(t)\) for \(i=A\) and \(i=a\). To relate back to the models of population growth in Lecture 3, this is exponential growth of each allele.
After selection these alleles randomly pair, go through the diploid phase of the life cycle, and then segregate back into haploids after meiosis. From our analysis of Hardy-Weinberg we know random mating and segregation don't affect allele frequency. The frequency of \(A\) in the next generation is therefore
\[p(t+1) = \frac{n_A'}{n_A' + n_a'} = \frac{W_A n_A(t)}{W_A n_A(t) + W_a n_a(t)}.\]
To make this a recursion equation we need to write \(p(t+1)\) in terms of \(p(t)\), so that if we know \(p\) at some point in time we can recursively calculate it at all future times. To do this we divide both the numerator and denominator by the total number of individuals, \(n_A(t) + n_a(t)\), and simplify
\[p(t+1) = \frac{W_A p(t)}{W_A p(t) + W_a q(t)},\]
where \(q(t) = 1 - p(t)\) is the frequency of \(a\).
This is now a recursion equation for the allele frequency in our model of haploid selection.
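To see that the frequency recursion really is just bookkeeping on the counts, here is a minimal sketch that tracks \(n_A\) and \(n_a\) directly and compares the resulting frequency with the recursion \(p(t+1) = W_A p(t) / (W_A p(t) + W_a (1-p(t)))\). The fitness values and starting counts are assumed for illustration.

```python
def haploid_step_counts(n_A, n_a, W_A, W_a):
    """Selection on counts: each type is multiplied by its absolute fitness."""
    return W_A * n_A, W_a * n_a


def haploid_step_freq(p, W_A, W_a):
    """Haploid selection recursion written in terms of allele frequency."""
    return W_A * p / (W_A * p + W_a * (1 - p))


W_A, W_a = 1.5, 1.2      # assumed absolute fitnesses
n_A, n_a = 10.0, 990.0   # assumed starting counts
p = n_A / (n_A + n_a)    # starting frequency of A

max_gap = 0.0  # largest disagreement between the two ways of tracking p
for t in range(20):
    n_A, n_a = haploid_step_counts(n_A, n_a, W_A, W_a)
    p = haploid_step_freq(p, W_A, W_a)
    p_from_counts = n_A / (n_A + n_a)
    max_gap = max(max_gap, abs(p - p_from_counts))

print("final frequency of A:", p)
print("largest gap between counts and recursion:", max_gap)
```

The gap should be at the level of floating-point rounding, since the two calculations are algebraically identical.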
Simplifying
Our current recursion is a function of two parameters, the absolute fitnesses \(W_A\) and \(W_a\). Now notice that if we divide both the numerator and denominator by one of these fitnesses, say \(W_a\), then we can reduce the recursion to a function of only one parameter, \(w_A = W_A/W_a\), the fitness of \(A\) relative to the fitness of \(a\),
\[p(t+1) = \frac{w_A p(t)}{w_A p(t) + q(t)},\]
where \(q(t) = 1 - p(t)\).
Note that the allele frequency dynamics depend only on the relative fitnesses, and not on the absolute fitnesses, meaning that evolution does not depend on how the size of the population changes. It is therefore possible to study evolution while ignoring population dynamics under this simple model.
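The point about relative fitness can be checked numerically. In this sketch (parameter values are assumptions chosen so that \(W_A/W_a = 1.1\) in every case) one population grows, one shrinks, and one is written directly in relative fitnesses, yet all three should give the same allele frequency trajectory.

```python
def haploid_trajectory(W_A, W_a, p0=0.01, generations=50):
    """Iterate the haploid selection recursion and return the list of frequencies."""
    ps = [p0]
    for _ in range(generations):
        p = ps[-1]
        ps.append(W_A * p / (W_A * p + W_a * (1 - p)))
    return ps


growing = haploid_trajectory(2.2, 2.0)     # both types increase in number
shrinking = haploid_trajectory(0.55, 0.5)  # both types decrease in number
relative = haploid_trajectory(1.1, 1.0)    # relative fitnesses only

largest_difference = max(
    max(abs(g - r), abs(s - r))
    for g, s, r in zip(growing, shrinking, relative)
)
print("largest difference between trajectories:", largest_difference)
```

Any difference printed here is floating-point rounding, not biology.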
Let's plot the recursion to get a sense of the dynamics.
```python
import matplotlib.pyplot as plt

# calculate allele frequency over time with recursion
w_A = 1.1  # relative fitness of allele A
p_now = 0.01  # initial allele frequency, p_0
ps = []  # list to hold allele frequencies over time
ts = range(100)  # list of time steps
for t in ts:  # for each time
    ps.append(p_now)  # add current allele frequency to list
    p_now = w_A*p_now/(w_A*p_now + (1 - p_now))  # update allele frequency with our recursion equation

# plot
plt.scatter(ts, ps)  # plot the (t,p) pairs
plt.xlabel("generation, $t$")  # label axes
plt.ylabel("frequency of $A$ allele, $p(t)$")
plt.show()
```
3. One-locus diploid selection in discrete time
Since we are all currently in the diploid phase of our life-cycle, it is natural to ask: Does selection in the diploid phase work the same way?
Life-cycle diagram
We now have the following life-cycle diagram.
graph LR;
A((p)) --gamete union--> B((p'));
B --selection--> C((p''));
C --meiosis--> A;
We will census at the beginning of the diploid phase (immediately after gamete union).
Deriving the equations
Let the number of diploid individuals with each genotype be
- \(n_{AA}(t) =\) number of individuals with the \(AA\) genotype in generation \(t\)
- \(n_{Aa}(t) =\) number of individuals with the \(Aa\) genotype in generation \(t\)
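- \(n_{aa}(t) =\) number of individuals with the \(aa\) genotype in generation \(t\)
The frequency of allele \(A\) is then
\[p(t) = \frac{n_{AA}(t) + \frac{1}{2} n_{Aa}(t)}{n_{AA}(t) + n_{Aa}(t) + n_{aa}(t)},\]
since every allele carried by an \(AA\) individual, and half of those carried by an \(Aa\) individual, is an \(A\).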
Now, let’s assume that during selection each diploid individual has reproductive factor
- \(W_{AA} =\) reproductive factor of individuals with the \(AA\) genotype
- \(W_{Aa} =\) reproductive factor of individuals with the \(Aa\) genotype
- \(W_{aa} =\) reproductive factor of individuals with the \(aa\) genotype
These reproductive factors are again referred to as the absolute fitnesses as they determine the (absolute) numbers of individuals after selection, \(n_i' = W_i n_i(t)\) for \(i=AA\), \(i=Aa\), and \(i=aa\).
After selection these genotypes segregate into haploids via meiosis, go through the haploid phase of the life cycle, and then randomly join to create diploids again. Again, we know random union and segregation don't affect allele frequency. The frequency of \(A\) in the next generation is therefore
\[p(t+1) = \frac{W_{AA} n_{AA}(t) + \frac{1}{2} W_{Aa} n_{Aa}(t)}{W_{AA} n_{AA}(t) + W_{Aa} n_{Aa}(t) + W_{aa} n_{aa}(t)}.\]
As above, we want a recursion equation in terms of allele frequency, so we want to replace the \(n_i\)'s on the right-hand side of this equation with \(p\)'s. To do this we note that with the random union of gametes the diploid offspring are in Hardy-Weinberg proportions, so that
\[n_{AA}(t) = p(t)^2 n(t), \quad n_{Aa}(t) = 2 p(t) q(t) n(t), \quad n_{aa}(t) = q(t)^2 n(t),\]
where \(n(t) = n_{AA}(t) + n_{Aa}(t) + n_{aa}(t)\) is the total population size.
Substituting these Hardy-Weinberg proportions in and simplifying, the total population size cancels out and we can rewrite the above equation in terms of allele frequency alone,
\[p(t+1) = \frac{W_{AA} p(t)^2 + W_{Aa} p(t) q(t)}{W_{AA} p(t)^2 + 2 W_{Aa} p(t) q(t) + W_{aa} q(t)^2}.\]
This is a recursion equation for allele frequency in our model of diploid selection.
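The derivation can be checked by doing the same calculation two ways: once on genotype counts (Hardy-Weinberg proportions, then selection, then counting \(A\) alleles) and once with the diploid recursion as listed in the summary table at the end of this lecture. This is a minimal sketch; the function names, fitness values, and population size are assumptions for illustration.

```python
def diploid_step_freq(p, W_AA, W_Aa, W_aa):
    """Diploid selection recursion in terms of allele frequency."""
    q = 1 - p
    # the denominator is the mean fitness of the population
    mean_fitness = W_AA * p**2 + 2 * W_Aa * p * q + W_aa * q**2
    return (W_AA * p**2 + W_Aa * p * q) / mean_fitness


def diploid_step_counts(p, n, W_AA, W_Aa, W_aa):
    """Same step, tracked through genotype counts."""
    q = 1 - p
    # offspring formed by random union are in Hardy-Weinberg proportions
    n_AA, n_Aa, n_aa = p**2 * n, 2 * p * q * n, q**2 * n
    # selection multiplies each genotype count by its absolute fitness
    n_AA, n_Aa, n_aa = W_AA * n_AA, W_Aa * n_Aa, W_aa * n_aa
    # frequency of A among survivors: all of AA plus half of Aa
    return (n_AA + n_Aa / 2) / (n_AA + n_Aa + n_aa)


p, n = 0.2, 1000.0                 # assumed frequency and population size
W_AA, W_Aa, W_aa = 1.8, 1.5, 1.2   # assumed absolute fitnesses
p_freq = diploid_step_freq(p, W_AA, W_Aa, W_aa)
p_counts = diploid_step_counts(p, n, W_AA, W_Aa, W_aa)
print(p_freq, p_counts)
```

Changing `n` should leave both results unchanged, which is the cancellation of total population size described above.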
Simplifying
As in the haploid selection case, we can divide by one of the absolute fitnesses, say \(W_{aa}\), to remove one parameter from the model
\[p(t+1) = \frac{w_{AA} p(t)^2 + w_{Aa} p(t) q(t)}{w_{AA} p(t)^2 + 2 w_{Aa} p(t) q(t) + q(t)^2}.\]
This recursion is a function of only two relative fitnesses, \(w_{AA} = W_{AA}/W_{aa}\) and \(w_{Aa} = W_{Aa}/W_{aa}\).
Let's plot the recursion to get a sense of the dynamics.
```python
import matplotlib.pyplot as plt

# calculate allele frequency over time with recursion
w_AA = 1.2  # relative fitness of genotype AA
w_Aa = 1.1  # relative fitness of genotype Aa
p_now = 0.01  # initial allele frequency, p_0
ps = []  # list to hold allele frequencies over time
ts = range(100)  # list of time steps
for t in ts:  # for each time
    ps.append(p_now)  # add current allele frequency to list
    q_now = 1 - p_now  # frequency of allele a
    w_bar = w_AA*p_now**2 + 2*w_Aa*p_now*q_now + q_now**2  # denominator of the recursion
    p_now = (w_AA*p_now**2 + w_Aa*p_now*q_now)/w_bar  # update allele frequency with our recursion equation

# plot
plt.scatter(ts, ps)  # plot the (t,p) pairs
plt.xlabel("generation, $t$")  # label axes
plt.ylabel("frequency of $A$ allele, $p(t)$")
plt.show()
```
4. Comparing haploid and diploid selection
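So, returning to our original question, let's compare evolution under haploid selection
\[p(t+1) = \frac{W_A p(t)}{W_A p(t) + W_a q(t)}\]
to evolution under diploid selection
\[p(t+1) = \frac{W_{AA} p(t)^2 + W_{Aa} p(t) q(t)}{W_{AA} p(t)^2 + 2 W_{Aa} p(t) q(t) + W_{aa} q(t)^2}.\]
To facilitate this, let’s assume the fitness of a diploid genotype is the product of the haploid fitnesses, i.e., \(W_{AA} = W_A W_A\), \(W_{Aa} = W_A W_a\), and \(W_{aa} = W_a W_a\).
It then happens that our diploid recursion reduces to the haploid recursion,
\[p(t+1) = \frac{W_A^2 p(t)^2 + W_A W_a p(t) q(t)}{W_A^2 p(t)^2 + 2 W_A W_a p(t) q(t) + W_a^2 q(t)^2} = \frac{W_A p(t) \left(W_A p(t) + W_a q(t)\right)}{\left(W_A p(t) + W_a q(t)\right)^2} = \frac{W_A p(t)}{W_A p(t) + W_a q(t)}.\]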
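This shows that we need twice as much selection under diploid selection relative to that under haploid selection for evolution to proceed as quickly: the fitness of \(AA\) relative to \(aa\) must be the square of the haploid relative fitness, \(W_{AA}/W_{aa} = (W_A/W_a)^2\). For example, a haploid relative fitness of \(1.1\) corresponds to a diploid relative fitness of \(1.1^2 = 1.21\), roughly double the advantage. Why is evolution slower under diploid selection?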
Let's check this visually (try breaking the assumption above, \(W_{ij}=W_i W_j\), to see what happens).
```python
import matplotlib.pyplot as plt

# calculate allele frequency over time with recursion
w_A = 1.1  # relative fitness of allele A in the haploid model
w_AA = w_A*w_A  # relative fitness of genotype AA in the diploid model
w_Aa = w_A  # relative fitness of genotype Aa in the diploid model
p_now_hap = 0.01  # initial allele frequency, p_0
p_now_dip = p_now_hap
ps_hap, ps_dip = [], []  # lists to hold allele frequencies over time
ts = range(100)  # list of time steps
for t in ts:  # for each time
    ps_hap.append(p_now_hap)  # add current allele frequency to list
    p_now_hap = w_A*p_now_hap/(w_A*p_now_hap + (1 - p_now_hap))  # update with haploid recursion equation
    ps_dip.append(p_now_dip)  # add current allele frequency to list
    q_now_dip = 1 - p_now_dip  # frequency of allele a in the diploid model
    w_bar = w_AA*p_now_dip**2 + 2*w_Aa*p_now_dip*q_now_dip + q_now_dip**2  # denominator of the recursion
    p_now_dip = (w_AA*p_now_dip**2 + w_Aa*p_now_dip*q_now_dip)/w_bar  # update with diploid recursion equation

# plot
plt.scatter(ts, ps_hap, label='haploid')
plt.scatter(ts, ps_dip, s=10, label='diploid')
plt.xlabel("generation, $t$")
plt.ylabel("frequency of $A$ allele, $p(t)$")
plt.legend()  # add legend to know which points belong to which model
plt.show()
```
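The plot suggests the two models coincide, but overlapping points cannot distinguish exact equality from a close match. As a complementary sketch, the check below uses exact rational arithmetic (Python's `fractions.Fraction`) so that equality can be tested without rounding. It also breaks the multiplicative assumption in one illustrative way, by giving the heterozygote the same fitness as \(AA\); that choice is an assumption made here, not something specified in the lecture.

```python
from fractions import Fraction


def haploid_step(p, W_A, W_a):
    """One generation of haploid selection."""
    return W_A * p / (W_A * p + W_a * (1 - p))


def diploid_step(p, W_AA, W_Aa, W_aa):
    """One generation of diploid selection."""
    q = 1 - p
    return (W_AA * p**2 + W_Aa * p * q) / (W_AA * p**2 + 2 * W_Aa * p * q + W_aa * q**2)


W_A, W_a = Fraction(11, 10), Fraction(1)  # haploid fitnesses (w_A = 1.1)
p0 = Fraction(1, 100)

# multiplicative diploid fitnesses: W_ij = W_i * W_j
p_hap = p_dip = p0
identical = True
for _ in range(10):
    p_hap = haploid_step(p_hap, W_A, W_a)
    p_dip = diploid_step(p_dip, W_A * W_A, W_A * W_a, W_a * W_a)
    identical = identical and (p_hap == p_dip)
print("multiplicative diploid model matches haploid model exactly:", identical)

# break the assumption: heterozygote as fit as AA (illustrative choice)
p_hap_one = haploid_step(p0, W_A, W_a)
p_dom_one = diploid_step(p0, W_A * W_A, W_A * W_A, W_a * W_a)
print("after one generation:", float(p_hap_one), float(p_dom_one))
```

With the assumption broken, the two models no longer agree even after a single generation.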
5. One-locus haploid selection in continuous time
To model haploid selection in continuous-time we assume that during the selective phase each haploid individual has growth rate
- \(r_A =\) growth rate of individuals with the \(A\) allele
- \(r_a =\) growth rate of individuals with the \(a\) allele
and the relative numbers of each type don't change otherwise (i.e., during union, diploidy, or meiosis).
We therefore have exponential growth of both genotypes, \(\frac{\mathrm{d} n_i}{\mathrm{d}t} = r_i n_i\) for \(i=A\) and \(i=a\).
At any particular point in time, \(t\), the frequency of allele \(A\) is, \(p(t) = n_A(t)/(n_A(t) + n_a(t))\).
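We can therefore derive the rate of change in the frequency of allele \(A\), \(\mathrm{d}p/\mathrm{d}t\), using the quotient rule (see Appendix 2 in the text for help with this and other math tricks)
\[\frac{\mathrm{d}p}{\mathrm{d}t} = \frac{\frac{\mathrm{d}n_A}{\mathrm{d}t}(n_A + n_a) - n_A\left(\frac{\mathrm{d}n_A}{\mathrm{d}t} + \frac{\mathrm{d}n_a}{\mathrm{d}t}\right)}{(n_A + n_a)^2} = \frac{r_A n_A (n_A + n_a) - n_A (r_A n_A + r_a n_a)}{(n_A + n_a)^2} = (r_A - r_a)\frac{n_A n_a}{(n_A + n_a)^2} = s_c\, p(1-p),\]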
where \(s_c = r_A - r_a\) is the continuous-time selection coefficient of allele \(A\).
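To get a feel for the continuous-time dynamics, the sketch below integrates \(\mathrm{d}p/\mathrm{d}t = s_c p (1-p)\) with a simple forward Euler scheme and compares the result with the closed-form solution \(p(t) = p_0 e^{s_c t} / (1 - p_0 + p_0 e^{s_c t})\). That closed form is the standard logistic solution and is not derived in this lecture; the step size and parameter values are likewise assumptions.

```python
import math


def logistic_solution(p0, s_c, t):
    """Closed-form solution of dp/dt = s_c * p * (1 - p) (not derived in the lecture)."""
    growth = math.exp(s_c * t)
    return p0 * growth / (1 - p0 + p0 * growth)


def euler_integrate(p0, s_c, t_end, dt=0.001):
    """Forward Euler integration of dp/dt = s_c * p * (1 - p) up to time t_end."""
    p = p0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        p += dt * s_c * p * (1 - p)
    return p


p0, s_c, t_end = 0.01, 0.1, 100.0  # assumed initial frequency, selection coefficient, end time
p_exact = logistic_solution(p0, s_c, t_end)
p_euler = euler_integrate(p0, s_c, t_end)
print("closed form:", p_exact)
print("Euler:      ", p_euler)
```

Forward Euler is used only because it is easy to read; a smaller `dt` brings the two numbers closer together.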
A similar equation can be derived for the model of diploid-selection in continuous time, but we will not study it (see Problem 3.16 in the text if you are curious).
6. Comparing haploid selection in discrete and continuous time
Are the discrete- and continuous-time models of haploid selection as different as they look?
Not really. Discrete and continuous time models generally behave in a similar fashion when changes occur slowly over time. For this model of haploid selection, this implies that the discrete and continuous models will be similar when the fitnesses of the two alleles are nearly equal, i.e., when \(W_A - W_a\) is small. This is called "weak selection".
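In the discrete model, the change in the allele frequency is
\[\Delta p = p(t+1) - p(t) = \frac{W_A p(t)}{W_A p(t) + W_a q(t)} - p(t) = \frac{(W_A - W_a)\, p(t) q(t)}{W_A p(t) + W_a q(t)}.\]
Now define \(s_d = (W_A - W_a)/W_a\) as the discrete time selection coefficient. Subbing this in as \(W_A = s_d W_a + W_a\) gives
\[\Delta p = \frac{s_d W_a\, p(t) q(t)}{(s_d W_a + W_a) p(t) + W_a q(t)} = \frac{s_d\, p(t) q(t)}{s_d p(t) + 1},\]
where the last step uses \(p(t) + q(t) = 1\).
Now assume weak selection, i.e., that \(s_d\) is small. This implies \(s_d p(t) + 1 \approx 1\). Making this approximation we have
\[\Delta p \approx s_d\, p(t)\left(1 - p(t)\right).\]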
This is equivalent to the continuous-time model when the selection coefficients are equal, \(s_c = s_d\).
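The weak-selection argument can be illustrated numerically. This sketch iterates the discrete recursion with \(w_A = 1 + s_d\) and compares it, generation by generation, with the closed-form solution of the continuous model, \(p(t) = p_0 e^{s_c t} / (1 - p_0 + p_0 e^{s_c t})\), setting \(s_c = s_d\). The closed form is the standard logistic solution rather than something derived in this lecture, and the two selection coefficients compared (0.01 and 1.0) are assumed values.

```python
import math


def discrete_trajectory(p0, s_d, generations):
    """Haploid selection in discrete time with relative fitness w_A = 1 + s_d."""
    ps = [p0]
    for _ in range(generations):
        p = ps[-1]
        ps.append((1 + s_d) * p / ((1 + s_d) * p + (1 - p)))
    return ps


def continuous_solution(p0, s_c, t):
    """Closed-form solution of dp/dt = s_c * p * (1 - p)."""
    growth = math.exp(s_c * t)
    return p0 * growth / (1 - p0 + p0 * growth)


def max_gap(p0, s, generations):
    """Largest difference between the two models when s_c = s_d = s."""
    discrete = discrete_trajectory(p0, s, generations)
    return max(abs(p - continuous_solution(p0, s, t)) for t, p in enumerate(discrete))


gap_weak = max_gap(0.01, 0.01, 2000)  # weak selection, long time horizon
gap_strong = max_gap(0.01, 1.0, 20)   # strong selection, short time horizon
print("largest gap under weak selection:  ", gap_weak)
print("largest gap under strong selection:", gap_strong)
```

The gap under weak selection should be far smaller than under strong selection, even though the weak-selection run covers many more generations.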
7. The models we've covered
To summarize the last two lectures, we've derived four of the most classic models in ecology and evolution:
Model | Discrete time | Continuous time |
---|---|---|
Exponential growth | \(n(t+1) = R n(t)\) | \(\frac{\mathrm{d}n}{\mathrm{d}t} = r n(t)\) |
Logistic growth | \(n(t+1) = (1 + r(1 - \frac{n(t)}{K}))n(t)\) | \(\frac{\mathrm{d}n}{\mathrm{d}t} = r(1 - \frac{n(t)}{K})n(t)\) |
Haploid selection | \(p(t+1) = \frac{W_A p(t)}{W_A p(t) + W_a q(t)}\) | \(\frac{\mathrm{d}p}{\mathrm{d}t} = s_c p(t)(1-p(t))\) |
Diploid selection | \(p(t+1) = \frac{W_{AA}p(t)^2 + W_{Aa}p(t) q(t)}{W_{AA}p(t)^2 + 2 W_{Aa} p(t) q(t) + W_{aa} q(t)^2}\) | Not derived |
See textbook Sections 3.4 and 3.5 for models of interacting species and epidemiology, respectively, which we won't cover in class.