Sheep are an important component of the environment based on economic and conservation perspectives (Knapik et al. 2017). These animals can adapt to landscapes that are difficult to cultivate and play an important role in food security, household economies, and cultural development of Colombian indigenous and farmer communities (Corpoica 2007). Traditionally, in the country, Colombian hair sheep have been used for meat production in extensive systems (Zuleta et al. 2009). These animals are highly diverse since crosses between different breeds date back to the colonization (Vergara et al. 2017). There are two varieties of sheep, which producers refer to as Ethiopian, or cherry red-haired sheep, and Sudan, or yellow-haired sheep; however, polychromy is common among sheep, with animals found in cherry red, yellow, white, black, and combinations of the above, due to the introduction of animals from West Africa, Ethiopia, Aruba, Curacao, and other Caribbean islands (Zuleta et al. 2009). Furthermore, in Colombia, exotic breeds are also used and crossed with Creole breeds to obtain hybrid vigor for meat production (Vergara-Garay et al. 2016).
The increasing popularity of sheep meat production worldwide has prompted the sheep industry, producers, and geneticists to address the importance of growth and meat production in this species (Zuleta et al. 2009). Growth variables, such as body weight at different stages (AI-Mamun et al. 2015; Gholizadeh et al. 2015), have been evaluated as important growth indicators (Kemper et al. 2012). Given the low productive parameters in Colombian hair sheep (Romero et al. 2002), strategies to understand and improve these indicators must be implemented. In this regard, genomics-based approaches represent a valuable tool (Casas and White 2015).
The development of high-throughput SNP (single nucleotide polymorphism) genotyping has enabled the development of GWAS to identify candidate genes for traits of interest (AI-Mamun et al. 2015; Gholizadeh et al. 2015; Kominakis et al. 2017). Despite the discovery of various candidate genes through GWAS in recent years, a small number of these have been found in sheep due to the limited availability of genomic information on sheep (Zhang et al. 2013). Furthermore, most research has been conducted on exotic breeds, while studies on native sheep are scarce (Casas and White 2015). GWAS are expected to contribute to identifying SNPs associated with traits of interest; therefore, this study aimed to analyze associations between growth traits and genomic traits using the Illumina OvineSNPs50 BeadChip array.
MATERIALS AND METHODS
Study animals
The study population comprised 167 individuals of Colombian hair sheep (females and males) belonging to Sudan (OPCS), Pelibuey (OPCP), and Ethiopian (OPCE) breeds. The number of individuals from each breed was 63, 60, and 44, respectively. The animals were in the departments of Cesar, Cordoba, and Valle del Cauca. The technical, scientific, and administrative procedures for research with animals, were adjusted to the regulations of Law 84 of 1989 and with resolution 8430 of 1993 (National Congress of Colombia).
Growth traits
The phenotypic growth traits assessed were birth weight (BW) measured during the first 24, adjusted weaning weight to 90 days (AWW) (equation 1), adjusted final weight adjusted to 1-year-old (AFW) (equation 2), pre-weaning daily weight gain (WGPRE) and post-weaning daily weight gain (WGPOS), fat thickness (FT), loin depth (LD), and loin eye area (LEA). The last three traits were measured at approximately 1 year of age, through a MyLabTMOne VET ultrasonograph (Esaote, Maastricht, Netherlands).
Where AWW is adjusted weaning weight to 90 days, WW is weaning weight, BW is birth weight, and WA is weaning age (days).
Where AFW is adjusted final weight adjusted to 1-year-old, FW is final weight, WW is weaning weight, and AFWD is the age at the final weight (days).
Samples and genotyping
Blood samples (10 mL from each animal) were obtained through jugular venipuncture and collected in vacutainer tubes with EDTA. Genomic DNA was extracted from the blood samples using Gene JET Genomic DNA purification (Thermo Fisher Scientific). The DNA was quantified through spectrophotometry using NanoDrop® ND-1000 (Thermo Fisher Scientific). DNA genotyping was performed on an Illumina OvineSNPs50 BeadChip array, targeting 54,241 SNPs, according to the Infinium® Assay Super II Illumina® protocol for use on the HiScan®SQ System.
Quality control was performed in PLINK v1.9b5.2 (Purcell et al. 2007) by excluding individuals who were missing more than 5% of genotyping information, as well as SNPs with call rates below 95%, allele frequencies below 5%, and Hardy-Weinberg equilibrium P<0.001.
Genome association analysis
The association analysis between the genotype and eight phenotypic traits was based on a linear regression model (equation 3) using 44,396 filtered SNPs. The equation was adjusted according to the genomic location of the markers using Wald’s standard statistic to calculate the P value for each SNP.
Where Y is the vector of n phenotypic observations (BW, AWW, AFW, WGPRE, WGPOS, FT, LD, and LEA) for 167 individuals, X is a covariable matrix (n x p); where n are individuals and p are SNPs, β is a vector of p effects (SNPs), and ∈ is a vector of n residuals.
Post-GWAS analysis
Using Haploview v4.2 (Barrett et al. 2005), SNPs were filtered for each trait based on a probability value of P<0.001 to obtain a matrix of significantly associated SNPs, which was used in further analyses. Next, SNPs were classified according to P-value and correlation coefficient (R2) using GWASTools (Gogarten et al. 2012) in R version 3.1 to identify the 10 most relevant SNPs associated with each phenotype studied.
Additionally, Genome Data Viewer was used to search the Ovis aries genome version 4 for growth genes, namely calpain (CAPN1), calpastatin (CAST), growth hormone (GH), growth hormone receptor (GHR), leptin (LEP), myostatin (MSTN), type 1 insulin-like growth factor (IGF-1), fatty acid-binding protein 9 (FABP9), and fatty acid-binding protein 4 (FABP4). SNPs located in these genes were identified for each trait assessed. On the other hand, a linkage disequilibrium value was established with a LD=0.001, under a window of 50,000 bp, using the SNPRelate 0.9.12 package (Zheng et al. 2012) from Bioconductor in the R program, to establish a set of filtered SNPs in equilibrium to avoid the strong influence of the groups of SNPs in the principal component analysis (PCA).
Functional analysis
The most significant genomic associations were searched in the KEGG database using Ovis aries genome v.3.1 in ReactomePA Bioconductor package (Yu and He 2016) in R to infer the biological processes of the genes involved. Furthermore, the DAVID database was used to obtain biological functions from additional databases, such as Gene Ontology, UniProt Keywords, InterPro, Kegg Pathway, Smart, and Pir SuperFamily. Finally, the KEGG identifiers retrieved were mapped to metabolic pathways in Ovis aries.
RESULTS AND DISCUSSION
In this research, a GWAS was conducted in 167 animals belonging to three breeds of Colombian hair sheep to determine associations between eight phenotypic traits and a panel of 54,241 SNPs using next-generation sequencing. The sex chromosome (OAR) and mitochondrial-associated contig were excluded from the analysis; furthermore, 44,396 SNPs were retained after discarding missing data for individuals and SNPs, low allele frequencies, and Hardy-Weinberg equilibrium (i.e., quality control).
For the association analysis, the SNPs associated with each trait were filtered according to P<0.001; as a result, 7,000 significant SNPs were identified (Table 1). After ordering by lowest P-value and highest R2, the 10 most significant SNPs showing the highest association with each of the eight traits were determined (Tables 2, 3, and 4) (Zhang et al. 2013), which resulted in a total of 80 SNPs for all traits. This study is the first approach in Colombia to identify candidate functional genes containing SNPs that are significantly associated with growth traits; furthermore, these SNPs may be involved in phenotypic determination. Despite the small sample size for a GWAS (Zhang et al. 2013; Peng et al. 2017), more than 10,000 SNPs were found to be significantly associated with eight growth traits (e.g., BW, AWW, WGPRE, AFW, and WGPOS), as well as traits that were measured by ultrasonography (e.g., FT, LD, and LEA). Moreover, this research represents a pioneer study in Colombia, thereby, contributing essential information to the ovine sector in the country.
The analysis indicated that chromosome OAR6 showed most significant associations (P<0.001), accounting for 12.5% of SNPs associated with AWW, WGPRE, AFW, WGPOS, FT, and LEA, followed by chromosome OAR1 with 11% of SNPs for AWW, WGPRE, AFW, FT, and LEA, and chromosome OAR5 with 9% of SNPs for BW, AWW, AFW, and WGPOS. Similarly, Zhang et al. (2013), suggested that chromosomes OAR1 and OAR3 are important for growth traits and meat production in sheep (i.e., for pre-weaning and post-weaning weight gain, daily weight gain, thoracic perimeter, shin circumference, weight at six months old, and weaning weight).
Association analysis for pre-weaning growth traits
Table 2 shows the 10 most significant SNPs associated with each pre-weaning phenotypic trait: BW, AWW, and WGPRE. For BW, annotation showed that 50% of SNPs were found in introns, 40% in the distal region, and 10% in promoters. Furthermore, six out of 10 SNPs were in five genes; however, functions were inferred for only three of these genes: LOC101107266, PAM, and GABRG2.
Association analysis for post-weaning growth traits
The 10 most significant SNPs associated with phenotypic traits AFW and WGPOS are found in Table 3. For AFW, 70% of SNPs were annotated in the distal region and 30% in introns. Five out of 10 SNPs were located in CEP135, EMCN, PRDM13, PIAS2, and SLC44A3 genes. Moreover, for WGPOS, 60% of SNPs were found in the distal region and 40% in introns. Six out of 10 SNPs were located in EFNA5, MAP3K7, CEP135, EMCN, PIAS2, and MTAP genes.
Association analysis for growth traits measured by ultrasonography
The 10 most significant SNPs associated with traits FT, LD, and LEA are shown in Table 4. For phenotypic trait FT, 80% of SNPs were annotated in the distal region, 10% in intron 2 of 20, and 10% were located downstream (1-2 Kb). Only one out of ten SNPs were located in a known gene, namely CEP135. Moreover, for LD, 80% of SNPs were annotated in the distal region and 20% in introns. Furthermore, four out of 10 SNPs were located in SUGP1, LOC101104424, PDYN, and TGFB3 genes. Finally, for LEA, 40% of SNPs were found in the distal region, 30% in introns, 20% in promoters, and 10% were located downstream. Five out of 10 SNPs were found in INPP4B, TBRG4, C17H4orf46, DDX24, and KIRREL genes.
Genomic location and importance of the most relevant SNPs
By annotating the 80 most significantly associated SNPs and mapping them to the chromosomes of the Ovis aries genome v3.1, eight relevant SNPs were identified on three chromosomes, namely OAR5, OAR6, and OAR23. In chromosome 5, SNPs OAR5_107968125.1 and OAR5_107977075.1 were associated with BW and located in the PAM gene (positions 99153391 bp and 99165665 bp, respectively). Furthermore, in chromosome 6, SNPs OAR6_77919148.1 (position 71481367 bp) and OAR6_27552838.1 (position 24157625 bp) were located in CEP135 and EMCN genes, respectively, and were associated with AFW and WGPOS. In particular, the first of these SNP (OAR6_77919148.1) was also found associated with FT. Finally, in chromosome 23, SNP OAR23_49635171_X.1 (position 46823961 bp), located in the PIAS2 gene, was associated with traits AFW and WGPOS. Additionally, SNPs OAR5_112451694.1 and OAR8_50320412.1 were found in chromosomes OAR5 and OAR8. These SNPs were located in functional genes EFNA5 and MAP3K7, which interact in a metabolic pathway associated with WGPOS and are involved in cell signaling processes, according to the KEGG annotation.
Previous studies that have used the Illumina OvineSNPs50 BeadChip array reported several SNPs, including OAR6_41003295.1 and OAR6_42945420.1, that were found to be significantly associated with traits such as body weight (AI-Mamun et al. 2015). In this study, both of these SNPs were associated with BW. Moreover, Gholizadeh et al. (2015), reported that SNP OAR16_46544413.1 was associated with weight at six months old; similarly, this SNP was associated with AWW in this study. Additionally, SNP s52984.1 has been associated with daily weight gain and weight at six months old, while SNPs s55067.1 and s34745.1 have been associated with weaning weight (Zhang et al. 2013). Furthermore, Ai-Mamun et al. (2015) reported that SNP OAR6_40409402.1 was associated with body weight. In this study, these four SNPs were associated with AFW. Moreover, SNPs s34745.1 and s55067.1, reported by Zhang et al. (2013), were associated with pre-weaning weight gain; similar to the results of this study that showed the association between these SNPs and WGPRE. Finally, OAR16_61248510.1 has been associated with post-weaning weight gain and s52984.1 with daily weight gain and weight at six months old (Zhang et al. 2013); in this case, both SNPs were associated with WGPOS.
Among the 80 most significant SNPs (Table 2, 3, and 4), those that were associated with more than one trait were identified as the most relevant SNPs. One SNP was located near two genes (CEP135 and EMCN), one in a single gene (PIAS2), and two SNPs in the same gene (PAM). Overall, four candidate genes were detected, which could be most influential on the growth traits assessed. However, 27 additional genes were also promising, including LOC101107266, GABRG2, TMEM178A, USH2A, LOC101113879, LOC105605154, CNGB3, CD200, PSD3, PDE4DIP, SLITRK1, PLPPR4, CHAMP1, CIPC, LOC105601856, PRDM13, SLC44A3, MTAP, SUGP1, LOC101104424, PDYN, TGFB3, INPP4B, TBRG4, C17H4orf46, DDX24, and KIRREL.
SNP OAR6_77919148.1 was located on the distal region of the CEP135 (centrosomal protein 135) gene, which belongs to the family of centrosomal proteins that are an active component of the centrosome and are involved in the biogenesis of the centriole and control of cell cycle progression (Kumar et al. 2013). In humans, alterations in this gene cause reduced cell growth rates (Hussain et al. 2012). Furthermore, in Chlamydomonas, mutation of Bld10, an ortholog of CEP135, resulted in abnormal microtubules in interphase and the mitotic spindle; these defects occurred during cell division and significantly reduced cell growth rate (Matsuura et al. 2004). Based on this, the CEP135 gene is involved in cell growth rate and can influence AFW, WGPOS, and FT.
SNP OAR6_27552838.1 is located in the distal intergenic region of EMCN (Endomucin), which encodes a sialic-rich glycoprotein rich in type I O-glycoproteins and is specifically expressed in the venous and capillary endothelium. Recent studies suggest that this gene is a potent regulator of capillary formation from existing blood vessels, or angiogenesis (Park-Windhol et al. 2017). Furthermore, EMCN plays a key role in tissue damage and wound repair, the endometrial cycle, and the adaptation of the striated muscle to stress and exercise (Olfert et al. 2015). Therefore, there is a functional relationship between this gene and the traits assessed given the involvement of angiogenesis in the skeletal muscle and the association found here between EMCN and AFW and WGPOS.
SNPs OAR5_107968125.1 and OAR5_107977075.1 are located in introns of the PAM gene (Peptidylglycine Alpha-Amidating Monooxygenase), which catalyzes COOH-terminal amidation of peptide hormones (Traci et al. 2005). In mammals, amidation activity is presumed to be coded by a single gene product with two catalytic domains (PHM and PAL) that act sequentially to amidated peptides (Prigge et al. 2000). Amide peptides function as hormones, neuromodulators, and autocrine growth factors (Prigge et al. 2000; Merkler 1994). Therefore, this gene could be associated with BW.
OAR23_49635171_X.1 was found in an intron of the PIAS2 gene (Protein Inhibitor of Activated STAT 2), which belongs to the family of STAT protein inhibitors that regulate the activity of many proteins and influence diverse processes such as the immune response, cancer formation, and cell cycle progression. Furthermore, a study on Xenopus indicated that this gene family could play an important role in embryogenesis regulation (Burn et al. 2011). Therefore, according to the literature, PIAS2 is involved in early growth processes, and in this study, it was found in association with AFW and WGPOS, which were measured during growth and development stages and in the adult stage. These findings suggest that this gene is not only involved in early growth stages but also later periods; however, this was not conclusive.
The APC showed that the three sheep breeds were separated, the OPCP (yellow) from the OPCE (blue) and OPCS (red), however, some OPCS individuals overlapped with the OPCE (Figure 1). Which could be used in the model to check the population stratification.
MAP3K7 (Mitogen-Activated Protein Kinase Kinase Kinase 7) and EFNA5 (Efrin-A5) genes
MAP3K7 and EFNA5 genes are found in shared metabolic pathways and are involved in signaling processes. MAPK (Mitogen-Activated Protein Kinases) are members of the serine/threonine kinase protein family activated by growth and stress factors. These proteins play an important role in intracellular signal transduction, which enables the cell to integrate different extracellular signals. Therefore, MAPK participates in signaling cascades that regulate cell growth, differentiation, proliferation, and cell death (Roberts and Der 2007).
EFNA5 gene belongs to Eph receptors and their ligands (i.e., efrins), which constitute the largest family of tyrosine kinase receptors (Pasquale 2005; Kullander and Klein 2002). Additionally, this gene behaves as a chemotactic molecule that participates in the correct positioning and formation of the neuromuscular junction (Li and Johnson 2013). Accordingly, satellite cells are a resident population of stem cells of the adult skeletal muscle tissue, which is responsible for growth and regeneration. These cells generally aggregate near the ends of muscle fibers, as well as the neuromuscular junction. Li and Johnson (2013) examined the effects of EFNA5 signaling on satellite cells in bovines and found that chemokines and growth factors participate in the localization of these cells. MAP3K7 and EFNA5 genes participate in metabolic pathways involved in cellular processes; furthermore, both genes were associated with WGPOS, which was measured during the growth and development stage until the adult stage. Therefore, a coherent relationship between these genes and the growth trait was established.
Search for growth-associated genes reported in the literature
The search for CAPN1, CAST, GH, GHR, LEP, MSTN, IGF-1, FABP9, and FABP4 genes in the Ovis aries genome v.4.0 allowed identifying SNPs located in CAPN1, CAST, GHR, and FABP9. CAPN1 gene contained SNPs s30026.1, which showed association with traits BW, AFW, and FT (P<0.05). CAST gene contained two SNPs: OAR5_101792466.1 was significantly associated with traits BW, AFW, and FT, while s59216 was associated with AWW and LEA. Furthermore, there were five SNPs located in GHR, including three significantly associated SNPs, namely OAR16_34620156.1 for WGPOS, OAR16_34694443.1 for LD and LEA, and OAR16_34857607.1 for BW. Moreover, SNP OAR9_60512150.1, located in the FABP9 gene, was significantly associated with WGPRE and FT.
The annotation of the candidate genes indicated that several of these show some degree of association with cell growth, apoptosis, angiogenesis, and metabolic pathways that are directly or indirectly involved in muscle growth in different species. Therefore, several of these genes could play a similar role in sheep. The sheep genome has been poorly studied compared with other species; therefore, these results contribute to an exploratory analysis of candidate novel genes that can be used in future studies to elucidate the participation of these genes in productive traits.
CONCLUSIONS
This study is one of the first to conduct a genomic analysis of Creole hair sheep in Colombia to evaluate genomic traits associated with muscle growth traits, although the sample size is small, this is pioneering research, and its information represents a fundamental input for the country’s sheep sector. In conclusion, four candidate genes, namely CEP135, EMCN, PAM, and PIAS2, were associated with eight growth traits. The CEP135 gene was the most relevant because it was associated with three of the eight traits evaluated post-weaning (AFW, WGPOS, and FT). The EMCN and PIAS2 genes were associated with the same traits (AFW and WGPOS), while the PAM gene was associated only with one preweaning growth trait (BW). It is important to highlight that the functions of these genes indicate their involvement mainly in cellular growth, apoptosis, and angiogenesis. These results can be used as a basis for future exploratory research, metabolic network analysis, and functional validation of the candidate genes.