Vector control with driving Y chromosomes: modelling the evolution of resistance

Background The introduction of new malaria control interventions has often led to the evolution of resistance, both of the parasite to new drugs and of the mosquito vector to new insecticides, compromising the efficacy of the interventions. Recent progress in molecular and population biology raises the possibility of new genetic-based interventions, and the potential for resistance to evolve against these should be considered. Here, population modelling is used to determine the main factors affecting the likelihood that resistance will evolve against a synthetic, nuclease-based driving Y chromosome that produces a male-biased sex ratio. Methods A combination of deterministic differential equation models and stochastic analyses involving branching processes and Gillespie simulations is utilized to assess the probability that resistance evolves against a driving Y that otherwise is strong enough to eliminate the target population. The model considers resistance due to changes at the target site such that they are no longer cleaved by the nuclease, and due to trans-acting autosomal suppressor alleles. Results The probability that resistance evolves increases with the mutation rate and the intrinsic rate of increase of the population, and decreases with the strength of drive and any pleiotropic fitness costs of the resistant allele. In seasonally varying environments, the time of release can also affect the probability of resistance evolving. Trans-acting suppressor alleles are more likely to suffer stochastic loss at low frequencies than target site resistant alleles. Conclusions As with any other intervention, there is a risk that resistance will evolve to new genetic approaches to vector control, and steps should be taken to minimize this probability. Two design features that should help in this regard are to reduce the rate at which resistant mutations arise, and to target sequences such that if they do arise, they impose a significant fitness cost on the mosquito. Electronic supplementary material The online version of this article (doi:10.1186/s12936-017-1932-7) contains supplementary material, which is available to authorized users.

, where the total female population is ( ) = ( ) + ( ) + ( ). Thus the total recruitment rate is dependent on the number and fitness of females, and the assumption that all participating females are fertilized [2]. Time is normalized with generation time, since the equations then depend on only one growth parameter, = / (the intrinsic growth rate of the population) rather than on two (the density-independent birth rate and death rate ).
The equations for the individual genotype populations are now considered, following a previous model [1], but here differentiating between males and females. It is assumed that individuals that participate in mating choose partners at random and, for simplicity, assume that each offspring is from an independent mating event, such that the genotypic composition of offspring is equal to draws of pairs of gametes from the available 'pool'. For example, among all offspring produced, the fraction of mutants ( ) (driving Y males with the resistant X) is given by the proportion of gametes contributed by participating females with a resistant X chromosome, [ , ] ( ), and the proportion of gametes contributed by participating males with a driving Y chromosome, [ , ] ( ): [ , ] ( ) [ , ] (  then multiplied by [ , ] ( ) [ , ] ( ) to obtain the recruitment rate and the equation for the rate of change of ( ): ( ) = 2[ − ( )] ̅ ( ) ( ) [ , ] ( ) [ , ] ( ) − ( ) Similar equations are constructed for all other genotypes, with the fraction of offspring ( ) calculated as the fraction of births for which the resistant X is contributed by the female and the wild-type X by the males, and vice versa: [ , ] ( ) [ , ] ( ) + [ , ] ( ) [ , ] ( ), taking into account the fraction of Xs from driving Y males that are resistant. Selection therefore occurs through the relative participation of different genotypes in mating and reproduction. Since total recruitment rates are dependent on the female population (A1.1), fitness cost to the resistant X therefore causes an effective reduction in fertility for females, and reduced participation in mating for males, as given by the resulting nonlinear system of seven differential equations:

A1.1a Seasonal variation
Periodic variation is modelled with a sinusoidal time dependence in the density-dependent part of the mosquito recruitment rate, γ( ) = γ 0 [ ] (1 + sin [ 2 ]) . is the period of the seasonality, and is the amplitude of the oscillations. To avoid a negative recruitment rate, which may occur for large seasonal variations in the population, it is specified that the total seasonal recruitment rate Λ( ) cannot go below zero: To find the pre-release wild-type equilibrium, the deterministic equations are solved numerically for the wild-type population [(1) in the main article], with non-negative recruitment rate given by Λ( ), until the solution has converged to its long-time periodic equilibrium ( , In the equilibria below, the populations are normalized with 0 , and = ( − 1)/ 0 is used.
When the solutions are expressed as fractions of individual types in the total population (e.g. = / ), it is found that for fitness ≥ 1 [ , ] (independent of ), the proportions of non-mutants and heterozygote females in the total population are zero. Therefore, 1 is the demarcation between the region where resistant mutations are fixed in the population ( 1 ≤ ≤ 1) and the region of intermediate equilibrium (0 < < 1 ). We derive 1 by setting the solutions for the proportions of non-mutants and heterozygote females to zero to obtain: Population suppression/extinction. The region where the total population is eliminated is now considered. We start with the equation for the total population (A1.1) at equilibrium, which includes all the female populations. It is more convenient to work with populations normalized with 0 , e.g. ̃( ) , and it is found that 0 cancels out and vanishes in (A1.1) except inside the density-dependent recruitment rate: If 0 = ( − 1) is substituted above, one obtains the recruitment rate in terms of normalized population: Henceforth, for the equations in this section, instead of referring to all normalized populations with a tilde sign (e.g. ̃( )), which would be cumbersome, it is specified that all populations , below refer to populations normalized by 0 . We obtain from (A1.1) for normalized populations: Since substituting in zero for all the populations in this equation does not yield a solution, we rewrite it in terms of the population frequencies, e.g. = , = : 2 ( + + 2 ) − 1 2( + + 2 )( − 1) = We then set = 0 above, which gives the condition for zero total population: For 1 ≤ ≤ 1, where the mutation is fixed, the total population is zero to the left of a line that is defined as the 'extinction' line, = ex [ ] (see Fig. 3 in main article). To the right of ex , the population is non-zero and with the mutation fixed in the population (composed 50:50 of females and males and with , , , equal to zero): To obtain the fitness below which there is 100% suppression of the population, = 1/2 and , = 0 are substituted into (A1.3) to obtain ex = 1/√ for 1 ≤ ≤ 1 (equivalently, one can substitute = 0 in (A1.4) and solve for = ex ).
For a given , this could be solved numerically for ex ; in practice for plotting ( Fig. 3 in main article), it is easier to calculate for a given ex < 1 .  (100% population suppression) as a function of the fitness parameter (for heterozygous females, with fitness 2 for homozygous females and hemizygous males) and the intrinsic rate of increase of the population ( ). Here, = 10 −6 . Below and to the left of = for each , the total population is zero; above and to the right, the population is nonzero. For each , the population is always nonzero for , where the Y drive is not sufficient to eliminate the population.
In addition to the equation for ex , we also calculate the lines = [ , , , ] that represent the values for which the total female population is a fraction of its value of 1 2 when there is no fitness cost: Note that =0 = ex . For 1 ≤ ≤ 1, where the mutation is fixed in the population, the total female population is half of the total population (A1.4) and substituting into (A1.6) above yields: This can be rewritten to give = 1/√ − ( − 1). where 4 is too complex to show here (and reduces to 1 / 2 for = 0). Since this equation yields for a given , it is used to construct Fig. 3 in the main paper. Each constant female population line = intersects the line = 1 (that separates the total fixation and intermediate equilibrium regions) at = * ( ). We obtain * by substituting 1 in (A1.7) above: To find the region where the total population persists for all fitnesses, one can substitute = 0 in the solution for for 0 ≤ < 1 (A1.5) to obtain:

A1.2 Model II
For this model, populations of heterozygotes with one trans-acting suppressor allele are denoted by ( ), ( ), ( ) and homozygotes by ( ), ( ), ( ). Of all births (male or female), the chance of the suppressor mutation arising on the relevant autosome is (0 ≤ ≤ 1) per individual per autosome. For example, for births from parents with only non-mutant alleles, a fraction 2 of their offspring will have a suppressor mutation, since either autosome in the offspring could mutate independently (the probability of 2 of two mutations occurring at the same time is ignored, since the focus is on ≪ 1). For costly resistance, each mutant type with one or two suppressor alleles has a decreased fitness of , , , , , ≤ 1. The same approach as for Model I above is used, except that gametes with or without the suppressor allele that are contributed by males are further differentiated by whether they also have an X, Y or driving Y chromosome and therefore result in female, wild-type male or driving Y male progeny. The time-dependent ODEs for the mutant and nonmutant populations are (with an exemplar solution for baseline parameters and no fitness cost in Fig.  A1.3): and total male population given by ( ) = ( ) + ( ) + ( ) + ( ) + ( ) + ( ). The differential equation for the total population is given by: where, as before, ̅ is the average relative female fitness and ( ) is the total number of females.  In the equilibria below, the populations are normalized with 0 , and = −1 0 is used.

Fixation/intermediate equilibrium.
As with Model I, there is a fitness ≥ 1 [ , ] (independent of ), that separates the region where resistant mutations are fixed in the population ( 1 ≤ ≤ 1) and the region of intermediate equilibrium (0 < < 1 ). Although we are not able to obtain analytical solutions for each population in the intermediate equilibrium region, the nine equilibrium equations are reduced to two equations for the fitness-weighted proportions of males and of females (not shown due to complexity). 1 is found by setting these two quantities to their fixation region value of 2 /2 (i.e., population of non-mutants and heterozygotes to zero and of homozygotes to ½) to obtain: Population suppression/extinction. The focus is now on the two regions of zero and non-zero population. To obtain ex , one starts with the long-time limit of equation (A1.8) for the total population, which includes all the female populations: This can be rewritten in terms of the proportions, e.g. = , = , = ; by solving for and then setting = 0, the condition for zero total population is obtained: where the mutation is fixed, the total population is zero below the extinction line = ex [ ] (Fig. 7, main article). Above ex , the population is non-zero and with the mutation fixed in the population (composed 50:50 of homozygote females and males and with non-mutants and heterozygotes , , , , , equal to zero): For 0 < < 1 , at intermediate equilibrium, the condition for zero population is given by (A1.10).
For this more complex case, analytical solutions for the female proportions were unobtainable, so ex [ , , ] is calculated using a different approach. The full set of equilibrium equations for the individual populations is solved numerically using Wolfram Mathematica FindRoot [3] for a given , , , . An iterative numerical method is then used to find = ex , the maximum fitness for which the population is eliminated (Fig. 7, main paper).
We also determine the lines = [ , , , ] for which the total female population is a fraction of its value of 1 2 when there is no fitness cost: where the mutation is fixed in the population, the total female population is half of the total population (A1.11), and substituting into (A1.12) above yields, as for Model I: Below 1 , in the region of intermediate equilibrium, the equilibrium equations for the individual populations are solved numerically using Wolfram Mathematica FindRoot for a given , , , . As above for ex , we iterate to find = where the total female population is /2 (Fig. 7, main paper).

A2. Branching process method
The time-inhomogeneous branching process method is used to calculate probabilities of stochastic loss. It is assumed in this model that stochastic loss of a new mutation occurs in the early stages when the mutant population size is extremely small (i.e., once the mutation has survived this early phase, it will establish with high probability), and therefore the numbers of individuals without the resistant mutation are much greater than the numbers of non-mutants (see [4]). As is standard, we therefore treat the nonmutant populations, ( ), ( ), ( ), deterministically using (2) in the main article, and treat the mutants with the resistant gene stochastically via a branching process model. For the calculation of the probabilities of stochastic loss, one may then ignore second order terms and homozygous individuals ( ). .

A2.1 Model I
By applying the linearization assumptions above in the equations for heterozygotes with the resistant gene (A1.2), one may construct a branching process for the resistant types based on birth and death rates: Above, ( ) = ( ) + ( ) + ( ) and ( ) = ( ) + ( ). ( ), ( ), ( ) are solved numerically from the deterministic equation (2) for non-mutants in the main article, with time zero corresponding to introduction of the driving Y in a carrying capacity landscape of wild types. The first term in the expression for the birth term for ( ) above is the population-wide rate at which female mutants first arise from mutation, Note that if the fitnesses of mutant males with and without the driving Y are the same, We now introduce the probability generating function (see [5]), defined as: These equations for multiple mutant types, with , , given by numerical solution of (A2.6), are the analogous expressions to Uecker and Hermisson's Eq. (16) [4] for ( ) for a single mutant type; a closed-form integral solution is obtainable for the latter simpler case [4,5].
Probability of mutation arising and surviving, . The focus is now on recovery of the population due to establishment of mutants created by cleavage-induced mutation with nonzero mutation parameter , and no pre-existing mutation. The probability that least one mutant is present at time is 1 ( ), and 1 = 1 ( → ∞) is the probability that at least one mutation has arisen and has survived stochastic loss and the population therefore persists indefinitely. Since it is specified that there are no resistant mutants introduced at any time , the initial condition (i.e. the first factor in (A2.8)) is simply (for any time ): ( , ( − ), ( − ), ( − )) = ( − ) 0 ( − ) 0 ( − ) 0 = 1, leaving only the exponential factor in (A2.8) for cleavage-induced mutations that can arise at any time after introduction of the driving Y ( = 0), hence the integration limits from 0 to : The probability 1 ( ) that at least one resistant allele is present at time after introduction of the driving Y is given by: 1 ( ) = 1 − 0,0,0 ( ) = 1 − ( , 0,0,0). From (A2.10), and transforming the integration variable as → − , we obtain: The probability 1 that at least one mutation will arise and establish over all time is 1 ( → ∞), and substituting lim →∞ (1 − ( − )) = , ( ) from (A2.9) above yields: This is Eq. (3) in the main text, with , ( ) calculated from (A2.9) and following the solution of (A2.6). Note that only , ( ) (derived from the solution for ( ) corresponding to mutants) appears in the integral, because X chromosome resistant mutations can only spontaneously arise in daughters of driving-Y males; however, it is still necessary to solve the coupled equations (A2.6) for all quantities ( ), ( ) and ( ). The probability that at least one successful mutation will arise is the rate of creation of the mutant ( ) over all times, weighted by the time-varying probability of establishment of a mutant arising at time , , ( ) (see Eq. (A7) in [4]).

Probability of mutation
. It is assumed that no mutations are present before the driving Y is introduced, but subsequently cleavage-induced mutations can first occur in female progeny from driving Y males at a rate ( ), ( ≠ 0). The probability that at least one resistant X has been created naturally (for > 0) by time is denoted as ( ), and for a nonhomogeneous Poisson process it is given by: where ( ) is the time-dependent population-wide rate at which individuals arise by mutation (A2.2). Cleavage-induced mutations can arise at any time between introduction of the driving Y and total population extinction. = ( → ∞) is defined as the probability that at least one mutation occurs before the population is extinguished (the mutation may subsequently have gone extinct, or survived and fixed). Eq. (A2.11) then becomes Eq. (5) in the main paper: where two times the integral above, represents the total number of progeny from driving Y fathers after release and before elimination in the absence of resistant mutations, of which (1 − ) are females and are driving Y males, normalized by 0 .
Exemplar results for these time-dependent probabilities for baseline parameters are presented in Fig.  A2.1. Fig. A2.2 shows the variation of , ( ) with the intrinsic growth and with (strength of Y drive). 1 ( ), that at least one mutant has arisen and is present at time (calculated using the branching process), is lower than ( ), because a proportion of mutants that arise will not survive due to stochastic loss. The long-term limit of 1 ( ) is 1 , the probability that at least one successful mutation occurs, spreads, and the total population recovers. (b) Probability of establishment , ( ) of a single new mutation in a female, , introduced at time after driving Y release at = 0, which is higher if it arises at later times after driving Y introduction (up to a plateau for baseline parameters). At later times, when the population is declining and driving Y frequency is high, the recruitment rate is higher due to less density-dependent competition, and consequently a newly-arisen female mutant has a higher probability of passing on the mutation (and passing it to a Y drive male offspring) before dying, and thus the resistant allele has a greater probability of establishing, although less than 50%. For = 6, = 0.95, 0 = 1, ℎ 0 = 0.05, = 1.
The probability of establishment of a single copy of the mutation, , ( ), is affected by the intrinsic rate of increase of the population, , and the transmission rate of the driving Y, (Fig. A2.2). Overall, , ( ) increases with increasing , and decreases as increases for the values investigated, most markedly for mutations introduced at later times . For different values of (strength of Y drive) and = 6. Interestingly, the stronger the drive, the lower the probability at later times. This is because the stronger drive produces a more extreme male bias, which in turn means the expected reproductive success of an individual resistant male will be lower (due to the greater relative scarcity of females to mate with), and therefore the overall probability of mutant establishment will be lower. For both plots, ℎ 0 = 0.05, 0 = 1, = 1.

A2.2 Model II
The same linearization method as for the resistant model is followed to construct a branching process for the resistant types based on the birth and death events from (A1.8): Note that if it is assumed that = = , the expressions above simplify to: Returning to the general case, we follow a similar procedure as above and introduce the probability generating function ( , , , ), where , , correspond to , , heterozygotes: = Probability of mutation . Unlike new target-resistant X-chromosome mutations which appear only in female progeny of driving Y males, for the suppressor mutation, every birth from non-mutants has the potential for a new suppressor mutation to arise on an autosome. The total combined rate at which , and individuals first arise is given by: The time evolution of the rates at which , and individuals first arise, and their total, is shown in Fig. A2.3 for baseline parameters, along with the mutation rate for the resistant X-chromosome mutation (with and chosen to give the same total probability that a mutation occurs before population extinction, to facilitate comparison). The total probability that at least one mutation (in a female, male or driving Y male) will have arisen by time is given by substituting the rate ( ) above into (A2.11): We then arrive at (6) in the main paper, which is the probability of at least one mutation having arisen before the population is eliminated: Above, the integral, times a factor of two, represents the total number of progeny of all types after release and before elimination if no resistance evolves, normalized by 0 .
Exemplar plots of the time-evolution of ( ), ( ), 1 ( ), and ( ) for baseline parameters are shown in Fig. A2.4. Figure A2.4. (a) Time evolution of ( ), ( ), and 1 ( ) for the suppressor mutation, calculated via the branching process method. The long-term limit 1 is the probability that at least one successful mutation occurs, spreads, and the total population recovers. (b) Time evolution of probability of establishment ( ) for a single new suppressor mutation at time , in a heterozygote female (black line) and in a heterozygote male, driving Y or wild-type (blue line), assuming that wild-type and driving Y male fitnesses are the same. For these parameters, the probability of survival for new mutations arising in a female at time rises and plateaus, but for those arising in a male it does not change significantly. Similarly to the resistant mutation model, declining population at later times means less density-dependent competition and thus higher chance of a female mutant establishing, while the higher male sex bias at later times results in male mutants making a decreased per-capita contribution to passing on the mutation, and thus males have a lower probability of establishment than female mutants. For = 6, = 0.95, ℎ 0 = 0.05, 0 = 0.1, = 1.

A2.3 Applicability of branching process method
Second-order terms and homozygote populations are ignored in the linearization of the equations that leads to the branching process formulation, but since early stochastic loss generally happens when mutant populations are small, this approximation usually holds for calculating the probability that a mutation establishes. However, for our equations with time-varying rates, for certain parameters, the solution of the linearized equations is zero at long times after driving Y release, and thus growing mutant populations will eventually fall back to zero before the probability of stochastic loss has converged. Here, the branching process cannot be used, and full simulations are used to calculate the probabilities of the mutation surviving stochastic loss (see Section A3 below). In all other regions, the computationally faster and more convenient branching process equations are used and values are confirmed with simulations as needed.
Model I. To find the parameter space where linearization for the resistant X mutation cannot be used, we consider equations (A1.2), linearized and in the long-time limit, where it can be assumed that ≫ ( ), , , , ≅ 0 and ( → ∞)/ ( → ∞) ≅ (1 − )/ . While the linearization approach implicitly assumes that once a mutant population is established, it will continue growing, there exists a range of parameters for which the solutions of these equations eventually decay with time such that ( → ∞) = ( → ∞) = 0 (for supercritical ≥ = 1 − 1 2 ). Such non-monotonic solutions occur if the following condition (for = and = 2 ) is satisfied: This condition can be also written either in terms of m or R : The equations for the branching process are thus less likely to hold at conditions of low , high m, and/or high cost to the mutation. . Again, for conditions of low , high m, and/or high cost to the mutation, the branching process method cannot be used (full simulations are carried out instead).

A3. Simulation method
Stochastic simulations are carried out using the direct Gillespie algorithm. In this approach, ( ) represent integer numbers of individuals (Model I is  . The transition rates are designated according to the basic model (A1.2) for the birth and death of each individual: We define propensity functions ( ) = ( ( )) and +7 ( ) = ( ) for = 1,7. The waiting time for the next transition, i.e. the next birth or death event of a single individual of any type, is = − is the total propensity function and 1 is a random number uniformly distributed in [0,1]. To decide which event (indexed by = 1, 14) takes place at time + , a second random number 2 is generated and the index that satisfies ∑ ( ) (with 0 ( ) = 0) is chosen. If 1 ≤ ≤ 7, a birth for has occurred and ( + ) = ( ) + 1; if 8 ≤ ≤ 14, a death for −7 has occurred and −7 ( + ) = −7 ( ) − 1. The same steps are repeated until reaches an upper limit or the populations have reached desired limits (see below). For the suppressor mutation, the same approach is used for the nine different types , , , , , , , , with transition rates based on Eq. (A1.8).
A carrying capacity of equal numbers of wild-type males and females ( = = 0 /2) is specified, typically 0 = 10 6 , and driving Y introduction ℎ 0 = 0.05 0 at time zero (a quantity high enough to ensure no early stochastic loss of the driving Y). For calculation of 1 and , we specify non-zero , , and for calculation of ( ) we set , = 0 and introduce a single mutation of specified type (female, male or wild-type male, as appropriate) at a given time after driving Y introduction. To find the percentage of runs (total runs = 10 6 ) for which no individuals are present after long times, simulations are run until either (1) the total population is extinct (either no males or no females) or (2) the mutant population reaches the equilibrium level that is predicted by deterministic equations; in practice for considerations of speed, calculations are stopped at a percentage (between 50-80%, but no less than 50,000 total population size) of that level sufficiently high that the mutation can be considered established.
A large number of individuals ( 0 = 10 6 ) is used in simulations in order to confirm the results of the branching process method. The latter assumes a very large baseline population size; the close agreement between the two calculation methods suggests that 10 6 indeed counts as large. For a subset of parameter values, simulations with 0 = 10 4 and 10 5 were also run (for equivalent population-wide mutation rates) and there was little effect on 1 . However, there are still several parameter regions where there are stochastic effects of finite numbers of discrete individuals. Firstly, in a small percentage of runs at very low , all driving Y males are lost around the time when the population is almost eliminated, leaving behind a few wild-type males and females. In some cases, these wild-types re-establish and the wild-type population recovers. Secondly, for conditions of very weak drive (close to ), an 'infinite' population will only go to zero in the limit of → ∞, but for finite numbers of individuals the population will eventually reach zero (at roughly the time when the normalised population reaches 1/ 0 in the deterministic models). Thus for a simulation of a finite number of individuals, the probability of a mutation will be expected to be smaller than the limit for an 'infinite' number. As our main interest is in strong drive, this latter stochastic effect does not unduly concern us here, and in any case, it results in a smaller probability of resistance evolving.
For the resistant X-chromosome (Model I) for baseline parameters ( 0 = 10 6 ), exemplar simulation results of randomly selected runs for which the mutation arose, established and the population recovered are shown in Fig. A3.1 along with the solution of the full deterministic equations (A1.2) in the presence of mutation (dashed line). The deterministic equations always predict recovery for these parameters (cost-free mutation), while the simulations (10 6 runs) and branching process method predict only a 6.5% probability of evolution of resistance and population recovery. Using the full simulations, the distribution of times that the total female population is suppressed to a given percentage of its pre-driving-Y release value is also investigated, for the resistant X-chromosome (Model I) for baseline parameters (Fig. A3.2). Shown are the distributions of suppression times for two values of (strength of Y drive): supercritical (population is eliminated in absence of mutation) and subcritical (drive not sufficient to eliminate population).