Heritable genetic variation in dispersal
We first estimated the variance in dispersal (i.e. natal dispersal between any of the study islands) that was attributable to additive genetic effects and several environmental effects using a basic genetic groups animal model (basic GGAM), in which individuals born in the farm and non-farm island habitat types were allowed to differ in mean breeding values for dispersal but the additive genetic variance of dispersal was assumed to be the same in the two habitat types (Muff et al., 2019; Wolak & Reid, 2017). Next, we formulated an extended genetic groups animal model (extended GGAM; Aase et al., 2022; Muff et al., 2019) in which the additive genetic variance in dispersal was allowed to differ between the farm and non-farm island habitat types. The two genetic groups corresponded to the farm and non-farm island habitat types, and the genomes of individuals were proportionally assigned to their origin from either the farm or the non-farm genetic group. The proportional assignment to farm or non-farm genetic group origin was based on the metapopulation-level pedigree, which included all 3116 successfully SNP-typed individuals and an additional 440 dummy individuals that were assigned as parents to identify known relationships among recruits (such as full- or half-sibling relationships) when one or both of the true genetic parents were not genotyped (Niskanen et al., 2020). More specifically, the genomes of these 3556 real or dummy individuals were assigned to the two genetic groups based on the assumed natal island habitat type of the phantom (i.e. unknown) parents of individuals in our metapopulation-level pedigree. To obtain the assumed natal island habitat type of phantom parents, we first identified the known or most likely natal island habitat type of each individual in the pedigree. For 2741 of the 3116 real individuals, the natal island was known from either ecological or genetic assignment data (Saatoglu et al., 2021).
Because most house sparrows in our study metapopulation are resident individuals (Ranke et al., 2021; Saatoglu et al., 2021), we used the island on which they were first recorded as the most likely proxy for the natal island of the remaining 375 real individuals. Furthermore, dummy individuals that had at least one known parent (N = 169) were assigned the same natal island as their parent(s). Finally, dummy individuals without any known parent(s) (N = 271) were assigned the island where their offspring were born as their natal island. In the metapopulation-level pedigree, 592 real and 303 dummy individuals had either both parents (N = 646), their mother (N = 93) or their father (N = 156) missing. These unknown parents represent the pedigree’s phantom parents, and were assigned to the same natal island habitat type as their (real or dummy) offspring. Finally, the proportional genetic group contributions ($q_{ij}$) to the farm and non-farm genetic groups were calculated for each individual in the metapopulation-level pedigree, based on the phantom parents’ assumed natal island habitat types, using the “ggcontrib” function from the R package nadiv (Wolak, 2012).
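The logic behind the proportional genetic group contributions can be illustrated with a minimal Python sketch. It is not the “ggcontrib” implementation; it only assumes the usual recursion in which an individual's contributions are the average of its parents' contributions, and a phantom parent contributes entirely to the genetic group of its assumed natal habitat type. The toy pedigree, the identifiers and the function name `group_contributions` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy pedigree rows: (id, dam, sire, habitat assumed for phantom parents).
# None marks an unknown (phantom) parent.
Pedigree = List[Tuple[str, Optional[str], Optional[str], str]]

GROUPS = ("farm", "non-farm")


def group_contributions(pedigree: Pedigree) -> Dict[str, Dict[str, float]]:
    """Return q_ij for each individual i and genetic group j.

    Assumes the pedigree is ordered so that parents appear before offspring.
    """
    q: Dict[str, Dict[str, float]] = {}
    for ind, dam, sire, habitat in pedigree:
        contrib = {g: 0.0 for g in GROUPS}
        for parent in (dam, sire):
            if parent is None:
                # Phantom parent: its whole genome is assigned to one group.
                parent_q = {g: 1.0 if g == habitat else 0.0 for g in GROUPS}
            else:
                parent_q = q[parent]
            for g in GROUPS:
                # Each parent passes on half of its own group contributions.
                contrib[g] += 0.5 * parent_q[g]
        q[ind] = contrib
    return q


toy = [
    ("A", None, None, "farm"),      # both phantom parents assumed farm
    ("B", None, None, "non-farm"),  # both phantom parents assumed non-farm
    ("C", "A", "B", "farm"),        # both parents known
    ("D", "C", None, "non-farm"),   # sire unknown, assumed non-farm
]
q = group_contributions(toy)
print(q["D"])
```

In this toy example the contributions of each individual sum to one across the two groups, which is the property the models rely on when the farm group is used as the reference.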
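Our basic GGAM, which partitions variation in dispersal probability while allowing for differences in group-specific mean breeding values, was defined as a binomial regression model with a logistic link function. The linear predictor for the dispersal probability $p_i$ of individual $i$ was given as
$$\operatorname{logit}(p_i) = \mu + X_i^{\top}\beta + \mathrm{island}_i + \mathrm{hyear}_i + u_i , \qquad (1)$$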
where $\mu$ is the intercept, $X_i$ is a vector of the fixed covariates of individual $i$ and $\beta$ is a vector of fixed effects. Individual sex was included as a fixed effect to account for differences between sexes in dispersal propensity (Saatoglu et al., 2021), and the proportional genetic contribution from the non-farm genetic group was included as a fixed effect (continuous covariate) to account for any mean differences in dispersal probability between the genetic groups. Both fixed-effect variables were mean centered. The random effects included individual $i$'s natal island ($\mathrm{island}_i \sim N(0, \sigma^2_{\mathrm{island}})$) and hatch year ($\mathrm{hyear}_i \sim N(0, \sigma^2_{\mathrm{year}})$), which captured the variance in dispersal attributable to spatio-temporal environmental variation. Furthermore, the total additive genetic effect of individual $i$ is given as $u_i$, which is the genetic group mean effect for group 2 ($g_2$; we defined group 2 as the non-farm genetic group) weighted by the genetic group 2 proportion of individual $i$ ($q_{i2}$), plus the breeding value $a_i$, distributed as $a = (a_1, \ldots, a_n) \sim N(0, \sigma_A^2 A)$ with additive genetic variance $\sigma_A^2$ and additive genetic relatedness matrix $A$ that represents the relatedness among individuals (Kruuk, 2004). The genetic group mean effect for the farm group was set to 0 (i.e. $g_1 = 0$) for identifiability reasons, so that the estimate for $g_2$ is the difference in the non-farm group’s mean total additive genetic effect compared to the farm group’s mean total additive genetic effect. Note that, because our animal models were formulated as logistic regression models, there is no residual variance component (de Villemereuil, Schielzeth, Nakagawa, & Morrissey, 2016).
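The basic GGAM was extended to allow estimation of group-specific additive genetic variances. Our extended GGAM was thus formulated as a logistic regression model for the dispersal probability $p_i$ of individual $i$, with linear predictor given as
$$\operatorname{logit}(p_i) = \mu + X_i^{\top}\beta + \mathrm{island}_i + \mathrm{hyear}_i + u_i , \qquad (2)$$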
where the total additive genetic effect of individual $i$ is again given as $u_i$, which is now the sum of the genetic group mean effect for group 2 ($g_2$; the non-farm genetic group) multiplied by the genetic group 2 proportion of individual $i$ ($q_{i2}$), plus group-specific additive genetic values of group 1 ($a_{i1}$, the farm genetic group) and group 2 ($a_{i2}$, the non-farm genetic group). As in model (1), the genetic group mean effect for the farm group was set to 0 (i.e. $g_1 = 0$) so that the estimate for $g_2$ is the difference in the non-farm group’s mean total additive genetic effect compared to the farm group’s mean total additive genetic effect. However, the breeding value $a_i$ in model (1) is now split into two group-specific components $a_{i1}$ and $a_{i2}$, with $a_j = (a_{1j}, \ldots, a_{nj}) \sim N(0, \sigma_{A_j}^2 A_j)$ for both groups $j = 1, 2$, where $\sigma_{A_j}^2$ is the additive genetic variance in group $j$, and $A_j$ are group-specific relatedness matrices calculated as in Muff et al. (2019). We denote $a_{i1}$ and $a_{i2}$ as the partial breeding values, because they represent the contributions to the breeding value of individual $i$ that are inherited from group 1 and 2, respectively.
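How the components of the extended GGAM combine into a dispersal probability can be sketched in a few lines of Python. This is only an illustration of the verbal definitions above, not the fitted model: all numerical values are hypothetical, the function names are our own, and the argument `fixed` is assumed to hold the contribution of the remaining fixed effects (e.g. sex) only.

```python
import math


def inv_logit(eta: float) -> float:
    """Map a linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def total_additive_effect(q_i2: float, g2: float, a_i1: float, a_i2: float) -> float:
    """u_i = q_i2 * g2 + a_i1 + a_i2, with the farm group mean g1 fixed at 0."""
    return q_i2 * g2 + a_i1 + a_i2


def dispersal_probability(mu: float, fixed: float, island_i: float,
                          hyear_i: float, u_i: float) -> float:
    """Sum the terms of the linear predictor and apply the inverse logit link."""
    eta = mu + fixed + island_i + hyear_i + u_i
    return inv_logit(eta)


# Hypothetical values chosen only to make the arithmetic easy to follow.
u_i = total_additive_effect(q_i2=0.5, g2=0.5, a_i1=0.25, a_i2=-0.5)
p_i = dispersal_probability(mu=-1.0, fixed=0.5, island_i=0.25, hyear_i=0.25, u_i=u_i)
print(u_i, p_i)
```

Setting `a_i2` to zero and reading `a_i1` as the single breeding value $a_i$ gives the corresponding calculation for the basic GGAM.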
Narrow-sense heritabilities for dispersal probability were obtained from the variance component estimates of i) the basic GGAM for the whole study population combined and ii) the extended GGAM for the farm and non-farm genetic groups separately, using the formula (shown for the extended GGAM case)
$$h_j^2 = \frac{\sigma_{A_j}^2}{\sigma_{A_j}^2 + \sigma^2_{\mathrm{island}} + \sigma^2_{\mathrm{year}} + \pi^2/3} , \qquad (3)$$
where the variances are defined as above, and the residual variance was approximated by $\pi^2/3$ (Nakagawa & Schielzeth, 2010). The heritability estimate for dispersal from the basic GGAM ($h^2$) was obtained by using $\sigma_A^2$ instead of $\sigma_{A_j}^2$ in formula (3). The proportions of phenotypic variance in dispersal explained by natal island and hatch year were also estimated for the basic GGAM and the extended GGAM using the same formulas, but with $\sigma^2_{\mathrm{island}}$ or $\sigma^2_{\mathrm{year}}$ as the numerator, respectively (instead of $\sigma_A^2$ or $\sigma_{A_j}^2$). Note that, for heritabilities and other proportions of phenotypic variance explained, we assumed that the island and year variances were the same within the farm and non-farm habitats.
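The heritability calculation described above can be sketched in Python as follows. The sketch assumes the denominator is the sum of the additive genetic, natal island and hatch year variances plus the $\pi^2/3$ approximation of the residual variance, as stated in the text; the variance values and function names are hypothetical and are not estimates from this study.

```python
import math

# Residual variance of the logistic distribution on the latent (logit) scale.
LOGIT_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def phenotypic_variance(var_a: float, var_island: float, var_year: float) -> float:
    """Total latent-scale variance used as the denominator."""
    return var_a + var_island + var_year + LOGIT_RESIDUAL_VARIANCE


def heritability(var_a: float, var_island: float, var_year: float) -> float:
    """Proportion of latent-scale variance due to additive genetic effects."""
    return var_a / phenotypic_variance(var_a, var_island, var_year)


def island_proportion(var_a: float, var_island: float, var_year: float) -> float:
    """Same denominator, with the natal island variance as the numerator."""
    return var_island / phenotypic_variance(var_a, var_island, var_year)


# Hypothetical variance components, for illustration only.
h2_farm = heritability(var_a=0.5, var_island=0.25, var_year=0.25)
h2_nonfarm = heritability(var_a=1.0, var_island=0.25, var_year=0.25)
print(round(h2_farm, 3), round(h2_nonfarm, 3))
```

Passing a group-specific additive genetic variance gives the group-specific heritability, while passing the single additive genetic variance of the basic GGAM gives the population-wide estimate.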
The basic GGAM and the extended GGAM were fitted in a Bayesian framework with integrated nested Laplace approximations using R-INLA (Rue, Martino, & Chopin, 2009), which is a fast and accurate alternative to MCMC (Holand, Steinsland, Martino, & Jensen, 2013; Steinsland, Larsen, Roulin, & Jensen, 2014). In order to prevent overfitting, a penalized complexity prior was used for the precisions of the environmental random components (with u = 2, α = 0.02) (Simpson, Rue, Riebler, Martins, & Sørbye, 2017).
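To make the prior specification more concrete, the following Python sketch shows what the parameters (u, α) imply under the standard definition of the penalized complexity prior for a precision (Simpson et al., 2017), in which the standard deviation σ of the random effect follows an exponential distribution scaled so that P(σ > u) = α. The function names are our own, and the sketch does not call R-INLA.

```python
import math


def pc_prior_rate(u: float, alpha: float) -> float:
    """Rate of the exponential prior on sigma implied by P(sigma > u) = alpha."""
    return -math.log(alpha) / u


def tail_probability(rate: float, threshold: float) -> float:
    """P(sigma > threshold) under an exponential prior with the given rate."""
    return math.exp(-rate * threshold)


rate = pc_prior_rate(u=2.0, alpha=0.02)
prior_median_sigma = math.log(2.0) / rate  # median of an exponential distribution
print(round(rate, 3), round(prior_median_sigma, 3))
print(round(tail_probability(rate, 2.0), 3))
```

With u = 2 and α = 0.02, the prior therefore places only 2% probability on a random-effect standard deviation above 2 on the logit scale, which shrinks the variance components towards zero unless the data support larger values.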