Introduction
Currently, a large part of the genetic transfer material in cattle occurs through the use of female- mediated reproductive biotechnologies. In this regard, the Brazilian market for in vitro production of cattle embryos accounts for approximately 70% of the global production (IETS, 2016). The main impact female-mediated assisted reproductive technologies (ARTs), such as multiple ovulation and embryo transfer (MOET) and ultrasound-guided follicular puncture, or ovum pick-up (OPU), followed by in vitro fertilization and embryo transfer (IVF) is focused on the creation and dissemination of the genetic gain, as well as reduction of the interval between generations in the populations of the animals involved, through the production of superior individuals (Pryce et al., 2010).
The greater participation of females in genetic improvement programs associated with ARTs can be attributed, among other aspects, to the ease of estimating the genetic merit for traits directly linked to sex (Dassonneville et al., 2012).
Recent studies using stochastic and deterministic simulation models have shown that the genetic gain obtained with the use of ARTs in cattle populations can be significant. Increases from 23% (with MOET) to 98% (with OPU-IVF) have been obtained (Pedersen et al., 2012; Pryce et al., 2010). However, due to the small number of progenitors that contribute to the next generation, the additional genetic gain can be accompanied by an increase in endogamy within the population and among descendants, with the consequent reduction of genetic diversity (Pedersen et al., 2012).
Considering that sufficient genetic variation is necessary in animal populations, both for adaptation and resistance, and for ongoing genetic improvement of traits with economic importance (Biscarini et al., 2015), the objective of this work was to assess the genetic diversity in elite female cattle populations used in commercial in vitro embryo production.
Material and methods
Ethical considerations
This study was approved by the Ethics Committee on Animal Experimentation of Norte Fluminense State University (UENF Protocol no. 243, March 11, 2014).
Population studied
Fifty adult dairy cows of the Gir breed (Bos taurus indicus) were evaluated, used as oocyte donors in commercial programs existing for more than four years in the Brazilian industry for in vitro embryo production (Table 1).
Na= number of different alleles; Ne= number of effective alleles; Np = number of private alleles; AR= allelic richness; pAR = private allelic richness; Pl= percentage of polymorphic loci; Ho= observed heterozygosity; He= expected heterozygosity, and F= fixation index.
The samples were collected in three populations on farms located in three municipalities in the state of Rio de Janeiro, Brazil: Campos dos Goytacazes (Population 1, n= 6; 21°45′ 16″ South and 41°19′28″ West), Rio das Flores (Population 2, n= 27; 22° 9’ 32″ South and 43° 34’ 52″ West) and Valença (Population 3, n= 27; 22º14’44″ South and 43º42’ 01″ West). To assure representativeness of the samples, all donor cows of each farm were included. The animals studied were born in each farm; they were self-sufficient herds in repositioning of females.
DNA extraction and genotyping
The genomic DNA was obtained from tail hair follicles. The extraction and purification steps were carried using the NucleoSpin tissue kit (Macherey- Nagel, Düren, Germany), following the manufacturer’s instructions. The DNA concentrations were measured with a spectrophotometer (NanoDropTM 2000-Thermo Science).
The animals were genotyped using 20 microsatellite markers recommended for routine identification of kinship and paternity in cattle by the International Society for Animal Genetics (ISAG), 14 of them belonging to the main group of markers (BM1818, BM1824, BM2113, ETH10, ETH225, ETH30, INRA23, INRA63, SPS115, TGLA122, TGLA126, TGLA227, TGLA53 and TGLA57), and 6 to the additional group (CSSM66, ETH152, ILSTS06, INRA05, INRA172 and AE129). The genotyping was performed with a capillary sequencer (MegaBACE 1000 DNA Analysis System - GE Healthcare). The marker ILTS06 was excluded from the analysis due to its scarcity of amplification in the genotyped animals.
Population indicators of genetic diversity
The basic descriptive statistics of the population genetics were used to estimate the diversity within the breed divided into sub-populations, using the GenAlEx software, version 6.502 (Peakall and Smouse, 2012). The following parameters were estimated: number of different alleles (Na), number of effective alleles (Ne), number of private alleles (Np), percentage of polymorphic loci (Pl), observed heterozygosity (Ho), expected heterozygosity (He), and fixation index (F). To evaluate the genetic diversity and estimate the variance components of the populations, analysis of molecular variance (AMOVA) was applied based on microsatellite loci.
Initially, the genetic diversity distribution between the populations was studied by analysis of the genetic distances of Nei (Nei, 1977), and the F-statistics of Wright (FIS-f, FST-θ and FIT-F) (Wright, 1921); the significance was tested using bootstrapping of loci after 1,000 permutations of alleles within a population. Fisher’s exact test was applied to detect global and population deviations (per locus) from Hardy-Weinberg equilibrium (HWE).
The population allelic richness (AR) and private allelic richness (pAR) for the 19 markers tested were estimated by the rarefaction method using the HP- RARE 1.0 software. Analysis of variance (ANOVA) was applied to the allelic richness measures involving the 19 markers. Statistical differences between populations were evaluated with the SNK test using the MIXED procedure of the SAS® software, versión 9.2 (SAS Institute Inc., Cary, NC, USA) (2009).
Gene flow and population structure
The gene flow between populations was estimated by the indirect method based on the number of migrants per generation (Nm), using the value of FST in the formula Nm = [(1/ F ST ) - 1] / 4 (Gaggiotti et al., 1999).
The genetic differentiation pattern between individuals of the three populations was established by analysis of genetic distances of Nei, calculated with the GenAlEx software, version 6.502, using a shared allele distance matrix, which in turn was used to construct the dendrogram by means of the unweighted pair group method with arithmetic mean (UPGMA). The result of the matrix was visualized in a heat map (1-RD) (heatmap.2 function gplots in the R package), which revealed genetic relations between individuals of the populations.
An alternative model to analyze the population structure based on Bayesian grouping was used. For this analysis, the STRUCTURE 2.3.4 program was used. The simulation was carried out using the ancestry by mixture model (admixture) with correlated frequencies and burn-in period of 200,000 rounds followed by 500,000 MCMC (Markov chain Monte Carlo) iterations. Independent executions of “K” testing from 1 to 10 clusters with 25 repetitions were performed to confirm the consistency of the results. The ideal “K” value was chosen after analysis of the results file by the Evanno method with the web software STRUCTURE Harvester, versión 0.6.94. After identification of the ideal number of sub-populations, the last analysis was conducted by selecting the optimal “K” to generate the plot illustrating the population structure.
Results
Main characteristics of the markers
Among the 19 microsatellite markers tested, 18 were polymorphic in all the populations, for a total of 129 alleles detected. Only INRA63, in Population 1, was monomorphic. The markers were highly informative (polymorphic information content - PIC >0.5). The mean PIC value for the three populations was 0.58. The percentage of polymorphic loci (Pl) varied from 95 (Population 1) to 100% (Populations 2 and 3).
Deviations from Hardy-Weinberg equilibrium (HWE) were significant (p<0.001) in Populations 2 and 3 (markers CSSM66 and TGLA126), and in Population 3 (marker INRA23, p <0.05).
Genetic differentiation of the populations
Basic population statistics. The descriptive statistics of the genetic diversity of the three populations of oocyte donor cows used in the commercial programs for in vitro production of embryos are reported in Table 1. Variations were observed from 1.0 to 9.0 in the number of effective alleles (Ne), from 1.0 to 5.45 in the number of different alleles (Na), and from 0 to 6 in the number of private alleles (Np).
The potential for long-term adaptability and persistence of the populations was estimated by evaluating the allelic richness (AR) (Greenbaum et al., 2014). The final estimates of AR were significantly lower (p<0.05) in Population 1 (3.53) than in Population 3 (4.77). The average allelic richness value in the three populations was 4.2 alleles. The estimate of private allelic richness (pAR), showed variations from 0.21 (Population 1) to 1.06 alleles (Population 3), and was also significantly lower (p<0.005) in Population 1 compared to Population 3. The total value of pAR including all the individuals and all the loci tested was 1.88 alleles (Table 1).
The estimates of observed heterozygosity (Ho) were larger than the expected heterozygosity (He) in Populations 1 and 2, while in Population 3 the observed genetic diversity was lower than expected (Ho of 0.57 and He of 0.62).
The fixation index (F) presented the same pattern described for heterozygosity. It was negative in Populations 1 and 2 (because Ho > He), and 0.08 in Population 3 (Table 1).
Distribution of genetic diversity
Fixation indices F-statistics of Wright. The estimates of the F-statistics of Wright allow measuring the deficiency or excess of heterozygotes in a population in various ways: within sup-populations (FIS), between populations in relation to the expected diversity in the total population (FST), and of a single individual in relation to the total population (FIT) (Nei, 1977). Negative values of FIS and FIT were found in Populations 1 and 2 (indicating high heterozygosity), while Population 3 was endogamic, with values of 0.091 and 0.035, respectively. In the case of FST, the values were negative in Populations 2 and 3, indicating weak or no differentiation between these populations, but in Population 1 the value was 0.086 (FST values greater than 0.05 for microsatellite markers are considered indicative of differentiation between populations, on a scale from zero to one (Holsinger & Weir, 2009). Table 2 presents the averages per population and the global values (including all the markers and all the individuals as a single population) of the F-statistics as well as the number of migrants per generation (Nm).
* Negative FST values indicate weak or no differentiation, and are interpreted as zero. ** The result of the number of migrants (Nm) (indirect method to estimate the genetic flow between populations) when the FST value is negative lacks biological significance. FIS, the correlation between alleles within an individual relative to the sub-population to which that individual belongs; FIT, the correlation between alleles within an individual relative to the entire population; and FST, the correlation between alleles chosen randomly from within the same subpopulation relative to the entire population.
Gene flow and population structure
Gene flow is estimated from the number of migrants per generation (Nm), so an increase in this parameter is inversely related to the degree of diversity between geographically separated populations. A low Nm value leads to divergence between populations via selection and drift, possibly leading to speciation (Slatkin, 1985). When populations are small and the number of microsatellites is < 20, the value of FST is considered the best estimator or Nm in a population (Gaggiotti et al., 1999). The average Nm estimated between the populations in this study was 11,036 individuals, indicating a high genetic exchange rate.
Analysis of molecular variance (AMOVA)
The distribution of the variability of genetic diversity between and within populations measured using the genetic distance matrix, including all the pairs of genotypes found by the AMOVA, revealed that 8% of the total variation was due to differences between individuals within the populations, 91% was due to heterozygosity in the individuals, and that variation between the populations only represented 1% of the total variation (Table 3).
Table 4 presents results of the diversity measures comparing the populations. The values of scaled diversity, FST and Nm between the populations were significant (p<0.001) only between Populations 2 and 3. In general, the Shannon information analysis revealed low differentiation of the three populations.
*FST: (genetic distance between populations) values based on allelic distance matrix from F-statistics analysis.
** Nm: (Number of migrants) values based on FST values.
The relationship between Populations 2 and 3 can be considered the least narrow due to the low Nm value (17.36), versus 89.91 for Populations 1 and 2, and 34.32 between Populations 1 and 3. The highest “species” diversity was found between Populations 2 and 3 (1.244).
Genetic distances and relations between individuals
The dendrogram based on Nei’s genetic distances grouped 50 individuals of the three populations in three clusters, two of them grouping the great majority of individuals (Clusters 2 and 3); with Cluster 1 only containing 10% of the animals (Figure 1). The proximity between individuals can also be assessed in the heat map, which shows the low level of structuring of the populations, evidenced by the mixture among the components in the three clusters formed (Figure 1).
The alternative approach used in delineating the clusters of individuals based on their genotypes as revealed by the 19 microsatellite markers, using the Bayesian method (employed in the STRUCTURE software), resulted in similar grouping to that obtained by the principal component analysis in relation to the structure. In turn, the grouping identified K= 2 as the most probable number of clusters (or sub-populations) in the individuals studied (Figure 2). In the plot, it can be observed that most of the population presents similar distribution of the genetic groups (in green and red in Figure 2), with proportions varying from about 30 to 60%.
Discussion
Selection schemes based on reproductive biotechnologies are commonly used in cattle breeding. These are based on the continuous use of elite animals to generate products to supply markets for beef and milk around the world (Biscarini et al., 2015).
The results of this study (Table 1) suggest that the populations analyzed have variable levels of heterozygosis based on their different allele frequencies. Genetically, these differences in the frequencies of the loci evaluated indicate selection pressure that is also variable (natural or artificial) in each population. This can lead to fixation of some homozygous alleles (endogamy) (Pedersen et al., 2012) as in Population 3. Since a fixation index near zero is expected in random coupling, and positive values indicate fixation of alleles in homozygosis (endogamy), due mainly to the population’s genetic structure (Peakall & Smouse, 2012), it can be stated that Population 3 had the least diversity.
It should be mentioned that the type of selective pressure applied by the use of reproductive biotechnologies such as OPU-IVF can quickly modify the gene and allele frequencies in the populations due to the short intergenerational interval or the reduced number of bulls used for the mating between elite donors, among other reasons (Dassonneville et al., 2012). The specific effect of the use of ARTs on genetic diversity is unknown, however, it is expected that this will be decreased after several inbreeding mating.
Other researchers analyzing microsatellite markers in Gir cattle reported observed heterozygosity values in all the loci, ranging from 0.60 to 0.62, in the case of Brazilian Gir cattle (Egito et al., 2007; Villalobos-Cortés et al., 2015), and 0.67 in a study carried out in India (Kale et al., 2010). These results are similar to those presented in this study for Populations 1 (0.59) and 2 (0.64).
Considering the concept of allelic richness and the statistical differences between the populations (Populations 1 and 3), it can be concluded that the lower value found in the first population would reduce the chances of long-run adaptation and persistence, since the limits of selection are determined by the initial allelic composition rather than by the heterozygosity (Petit et al., 2008) (a rare allele that is lost in a founder event probably will not greatly affect the heterozygosity, but the loss will reduce the allelic richness (Greenbaum et al., 2014).
Measures of population allelic richness are not only considered important for preservation of breeds or species, but also in marker-assisted selection, since the existence of alleles, instead of their frequencies, determines a significant part of the potential to respond to selection (Petit et al., 2008). The average value of AR obtained in this study (4.2 alleles) was similar to the average obtained in a study conducted in 2015 (AR= 4.11), also with Brazilian Gir cattle (Villalobos-Cortés et al., 2015). Average values of AR in seven indigenous Vietnamese breeds were higher than those found in this study (8.72 alleles), but the authors used a larger number of microsatellites (27 loci) (Pham et al., 2013). The total value of pAR involving the three populations (1.88), also considered a function of genetic diversity, was greater than that reported in other studies (0.002 to 0.006) (Gautier et al., 2010; Porto-Neto et al., 2013; Wang, 2015). However, these studies used single nucleotide polymorphism (SNP) markers, which are considered less polymorphic than microsatellites, and hence contain a smaller number of alleles distributed in the populations (Wang, 2015).
In relation to Np and its association with pAR, the latter estimate is considered more accurate, since the statistical technique of rarefaction considers the sample size, compensating the sampling disparity (Kalinowski, 2005).
When evaluating the level of endogamy within the populations (FIS) and the individuals in relation to the total number of alleles of the populations (FIT), it can be stated that for the first two populations studied, the result is within the range reported in other studies of the Gir breed, where FIS values using genetic markers or pedigree analysis were negative (O’Brien et al., 2015; Wang, 2015), smaller than 1% (Egito et al., 2007; Porto-Neto et al., 2013) or varied between 1 and 3% (Faria et al., 2009; Santana et al., 2014). Nevertheless, the endogamy within Population 3 (9.1%) due to the fixation of alleles in homozygosis lies out above the values reported for this breed recently (Wang, 2015), indicating a delicate situation regarding genetic management. The level and rate of endogamy are generally utilized as parameters for variation within breeds and are negatively correlated with the effective population size (Biscarini et al., 2015). The greater allelic richness observed in this population should allow directed mating, seeking a balance between genetic merit and diversity. This can be attained by using tools such as optimum contribution selection.
In contrast to a recent study that compared seven Brazilian Gir breeds (Wang, 2015), the elite cow populations evaluated in this study did not present a clear pattern of differentiation between populations (values of FST <0.05, AMOVA and Shannon diversity with variance between 1 and 7%, respectively). This low diversity might have been the result of common ancestors in the establishment of the three populations (identical by descend) (Slatkin, 1985).
It should be mentioned that the low values of FST could also represent purifying selective forces, which are applied simultaneously in the populations in the same direction, imposing strong similarity between the compared groups and resulting in low differentiation. In this case, deleterious genes also can be selected jointly, affecting traits of common interest (Porto-Neto et al., 2013).
In the reproductive management of elite populations involving ARTs, it is routine to use the same reproducers in different farms due to traits such as high genetic value, reproductive efficiency and common selection objectives, thus facilitating the wide dissemination of the same alleles in different populations without the apparent influence of geographic isolation. This is confirmed by observing Populations 2 and 3, which although being nearest geographically, presented a higher value of FST (0.014 and p<0.001) compared to the more distant population pairs.
We therefore consider all the populations studied to be similar genetically due to the high gene flow between them (high migration rate) revealed in the analysis of the F-statistics (Table 2), and the Shannon diversity (Table 4) including all the markers, where the values of Nm fluctuated between 11.03 and 89.91 individuals per generation. The similarity of the populations is mainly due to migration rather than genetic drift.
It is common to evaluate the Nm in studies of diversity within breeds, but no consensus exists about the ideal value in this type of study. However, it is known that the higher the value of Nm is, the smaller the genetic separation between populations is.
Values of Nm reported among five Indian breeds varied from 5.40 to 32.80 individuals according to the degree of genetic proximity (Sharma et al., 2013).
In turn, in Brazil values of Nm and FST published in previous studies involving Zebu Gir and Brahman breeds reported a low level of differentiation (Nm=7.34 and FST=0.042) (Villalobos-Cortés et al., 2015). The reported value of Nm was considered high by those authors. The pattern of low genetic differentiation between Brazilian Zebu breeds was initially confirmed in 2007 (Egito et al., 2007), and more recently in 2015 (O’Brien et al., 2015) through the FST values (<0.05) using different classes of markers.
From the standpoint of evolutionary genetics, the genetic structure of a population is characterized by the morphological and quantitative variability existing between individuals, the reproductive system, gene flow patterns and adaptive strategies to local environments (Holsinger & Weir, 2009). The lack of structure can be an indication of the influence of selective forces acting naturally or artificially on populations, which modify the allelic frequencies in favor of homozygosis (Slatkin, 1985).
The identification of the most likely number of clusters carried out in the STRUCTURE program included the allelic frequency model correlated between the populations in the previously established parameters. This frequency model has greater power to detect distinct populations that are particularly related, presenting the same results as the independent allele frequency model in the absence of high levels of correlation between the populations (Porras-Hurtado et al., 2013).
Besides the mentioned relevance of the stratification of the populations regarding the aspects of conservation of genetic resources and genetic diversity related to the long-term adaptation of populations (Slatkin, 1985), the differences in the allele frequencies between individuals in a population, derived from the systematic variation of the allele frequencies in their ancestors through sub- populations, are highly prized in searching for QTLs (quantitative trait loci), employing the genome-wide association study (GWAS) approach (Liu, et al., 2013).
The oocyte donor cows included in the in vitro embryo production programs analyzed here presented low genetic differentiation between populations. Within the populations, they presented low to moderate diversity and inbreeding, indicating genetic management and non-standardized mating.
We suggest the use of mating guidance tools that use measures of genetic diversity and richness in elite populations dedicated to production of genetic material of interest to animal breeders. New studies assessing the effect of the levels of endogamy on traits of interest in in vitro embryo production are necessary.