Introduction
Cacao (Theobroma cacao L.) is a tree belonging to the Malvaceae family, native to the central and northern humid tropical regions of South America (López-Medina, 2017). According to historical records, the domestication, cultivation, and consumption of cacao were initiated by the Mayans in Mexico and Central America, who consumed it as a drink called xocoatl, a precursor of modern chocolate (Motamayor et al., 2002; Young, 2008). However, genomic analyses indicate that it is native to the Amazon basin in northwestern South America (Zarrillo et al., 2018).
Currently, cacao is grown in hot and rainy climates in the Americas, Africa, and Asia (Bhattacharjee & Kumar, 2007; Rodríguez-Medina et al., 2019) in agroforestry systems that allow it to be combined with other crops and native species, in balance with the environment (Jagoret et al., 2011; Navarro-Prado & Mendoza-Alonso, 2006; Roa-Romero et al., 2009; Romero & Urrego, 2016). Cacao is of great economic importance because of its versatility: it is used in artisanal confectionery, the cosmetic industry, and the food industry for the production of chocolates, oils, and liquors, respectively (Bennett, 2003). In Colombia, cacao cultivation has increased in recent years, and by 2017 the country ranked as the seventh largest producer worldwide (Food and Agriculture Organization [FAO], 2017). This scenario makes cacao one of the most promising sustainable options in the national and international markets from the environmental, social, and economic points of view, because it benefits every link in the production chain and helps to secure a prosperous economy, mainly in post-conflict regions, promoting peace and the substitution of illicit crops (Rodríguez-Medina et al., 2019).
Three types of cacao are known: Criollo, originating in South America and Central America; Forastero, which comes from the Amazon basin; and Trinitario, which emerged in Trinidad & Tobago as a hybrid of the first two types (De La Cruz-Medina et al., 2012). Criollo cacao is characterized organoleptically by a bitter, acidic, and fruity taste, and it has white cotyledons (Andrade-Aguirre & Angulo-Reynoso, 2007). Forastero is the most widely cultivated cacao type, estimated to account for about 85% of world production because it is more resistant to diseases and pests; its flavor is strong, bitter, and slightly acidic (Romero & Urrego, 2016). Trinitario is more resistant and productive than Criollo, but of lower quality (Andrade-Aguirre & Angulo-Reynoso, 2007).
Until the end of the 19th century, Criollo cacao was the most cultivated type in Colombia, but diseases decimated this crop, which was replaced by Forastero (Rodríguez-Medina et al., 2019). Since then, native materials have been recovered, characterized, and conserved in germplasm banks; however, commercial cultivation of the Criollo type is currently restricted (Aranzazu et al., 2009; Oicatá, 1986; Perea et al., 2013; Rodríguez-Medina et al., 2019).
Plant varieties differ in physiological, morphological, and genotypic characteristics; hence the importance of characterizing them to establish the particular traits of an individual or population and to develop plant improvement programs (Aranguren et al., 2010). Reliable morphological variables that allow differentiation between groups must be used for this characterization. These variables are established in technical guidelines for varietal description, such as those issued by the International Union for the Protection of New Varieties of Plants (UPOV, 2011). Germplasm characterization is accompanied by the determination of genetic traits that allow genetic variability to be measured (Aranguren et al., 2010; Núñez-Colín & Espodedo-López, 2014). For this purpose, molecular markers are used; these are DNA fragments that, alone or in combination, can be used to determine genetic diversity through the detection of their polymorphisms (Azofeifa-Delgado, 2006; Rocha, 2003).
Single nucleotide polymorphisms (SNPs) are widely used for genotyping and marker-assisted selection in plant breeding, based on the sequencing and analysis of specific regions of the genome (Chagné et al., 2007; De Wever et al., 2019; Poland & Rife, 2012). Internal transcribed spacer (ITS) regions are ribosomal DNA sequences that are easy to amplify and align; they are used to study intra- and inter-population relationships because they show evolutionary differences and provide phylogenetic and taxonomic information on individuals (Avendaño-Sánchez et al., 2015; Quijada et al., 2017).
In this study, samples of cacao fruits were taken in the town of Mingueo, department of La Guajira, where commercial cacao cultivars are planted alongside native cacao plants that were collected in Sierra Nevada and are identified by farmers as native Criollo-type cacaos. However, there is no evidence that these are of the Criollo type, as this germplasm has not been agronomically characterized. Therefore, all these cultivars were phenotypically characterized using UPOV descriptors, and a genotypic evaluation was carried out through ITS sequences to identify kinship relationships between the native cacaos of the region and the commercial ones.
Materials and methods
The study was carried out on materials grown in the cacao germplasm collection of the farm Brisas del Mar, which belongs to the Association of Organic Producers of the Municipality of Dibulla (APOMD, for its acronym in Spanish) and is located at 48 m above sea level in the district (corregimiento) of Mingueo, Dibulla, in the department of La Guajira. The farm is located in a tropical forest area, 10 km northeast of the town of Mingueo. Ten samples of cacao plants were collected, identified by the producers as four native cacaos, five commercial clones, and an unknown hybrid (table 1). Additionally, a sample of Criollo-type cacao cultivated in Becerril, César, was obtained and used as a reference. Simple random sampling was used to collect information and samples. The morphological data of the cacao plants were taken in situ, according to the criteria established for T. cacao by UPOV (2011). In addition, young leaves were collected for genetic analysis.
Leaf characterization was carried out on five leaves per plant for each variable. Leaves with good phytosanitary status were selected from the fifth node of branches located at breast height, during the fruiting period. The length and width of the leaf blade were measured, and the following qualitative features were recorded: shape of the base, shape of the apex, and intensity of the green color. Likewise, physiologically mature fruits without disease symptoms were collected, and their length, diameter, epicarp thickness, concentration of soluble solids in the pulp, and pH were measured. Similarly, the following qualitative characteristics were assessed: pod shape, apex shape, base shape, pod color, surface, depth of ridges, and color of the pulp. In addition, the whole seeds in each fruit were counted. Five seeds were taken from each fruit, and their length, width, and thickness were evaluated, as well as cotyledon color and seed shape in longitudinal section. All fruit and seed measurements were made with a digital vernier caliper. The concentration of soluble solids was measured with a refractometer, and the pH with a portable pH meter. Qualitative features such as shapes and colors were identified using UPOV's descriptors (UPOV, 2011) and the Catalog of cacao cultivars (García, 2009).
The data collected in the phenotypic characterization were tabulated in Microsoft Excel™ and later analyzed with statistical tools. The quantitative variables of commercial hybrid and native cacaos were compared using the non-parametric Mann-Whitney U test. Additionally, cluster analyses were performed using PC-ORD version 5.0. For this, the quantitative data were normalized, matrices of Euclidean distances were calculated, and the groupings were obtained with Ward's method, from which dendrograms were constructed. Subsequently, multivariate statistics (Principal Component Analysis [PCA]) were applied using all the evaluated traits.
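The clustering steps described above (normalization, Euclidean distances, Ward's method) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the PC-ORD procedure itself. Z-score standardization is assumed as the normalization, and the function names and the two-trait measurements in `demo_rows` are hypothetical; they are not data from this study.

```python
import math


def zscore_columns(rows):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    scaled_cols = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        scaled_cols.append([(v - mean) / sd for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def ward_linkage(rows):
    """Agglomerative clustering with Ward's criterion.

    Works on squared Euclidean distances and applies the Lance-Williams
    update for Ward's method after every merge. Returns the merges in
    order as (members_a, members_b, merge_distance).
    """
    clusters = {i: (i,) for i in range(len(rows))}
    d2 = {
        (i, j): sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
        for i in clusters for j in clusters if i < j
    }
    merges = []
    next_id = len(rows)
    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest Ward distance.
        (i, j), dij = min(d2.items(), key=lambda item: item[1])
        ni, nj = len(clusters[i]), len(clusters[j])
        updated = {}
        for k, members in clusters.items():
            if k in (i, j):
                continue
            nk = len(members)
            dik = d2[(min(i, k), max(i, k))]
            djk = d2[(min(j, k), max(j, k))]
            updated[(k, next_id)] = (
                (ni + nk) * dik + (nj + nk) * djk - nk * dij
            ) / (ni + nj + nk)
        merges.append((clusters[i], clusters[j], math.sqrt(max(dij, 0.0))))
        d2 = {pair: d for pair, d in d2.items() if i not in pair and j not in pair}
        d2.update(updated)
        clusters[next_id] = tuple(sorted(clusters.pop(i) + clusters.pop(j)))
        next_id += 1
    return merges


# Hypothetical seed length and width (mm) for six plants; illustrative only.
demo_rows = [
    [27.9, 14.6], [28.8, 15.0], [29.5, 14.1],  # plants 0-2
    [22.0, 10.8], [21.7, 11.5], [23.1, 11.1],  # plants 3-5
]
demo_scaled = zscore_columns(demo_rows)
demo_merges = ward_linkage(demo_scaled)
for left, right, height in demo_merges:
    print(left, right, round(height, 3))
```

The last merge printed joins the two main groups, which corresponds to the top split of a dendrogram. Standardizing first keeps a trait measured on a larger scale from dominating the Euclidean distances.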
For the genotypic characterization, young leaves of each cacao plant were preserved in tubes with silica gel. Then, DNA was extracted following the protocol of Lodhi et al. (1994) with modifications. The DNA was quantified and its purity determined by spectrophotometry using a NanoDrop ND-2000. Using the primers available in the primer bank of the Microbiology Research Laboratory of Universidad Simón Bolívar, an in silico analysis was carried out on the different sequenced genomes of T. cacao, using the BLAST tool (Ye et al., 2012) and the Clustal W algorithm from the BioEdit Sequence Alignment Editor program, version 7.0.5.3.
Then, the primers were tested by PCR, and the most specific pair with the best amplification performance was selected. The selected primers were 86F 5'-GTGAATCATCGAATCTTTGAA-3' and 4R 5'-TCCTCCGCTTATTGATATGC-3' (White et al., 1990). Subsequently, the amplification reaction was optimized by adding DMSO (Miranda et al., 2010). The PCR products were purified and sequenced by the Sanger method by Macrogen in South Korea. Once the sequences were obtained, the chromatograms were analyzed to verify their quality, and the low-quality ends were trimmed using the BioEdit program. Then, a BLAST search was carried out for each sequence (Ye et al., 2012) to verify its identity and rule out contamination.
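As an illustration of what an in silico primer check involves, the following sketch computes basic properties of the two primer sequences given above and looks for their binding sites in a template. It is a simplified stand-in, not the BLAST and Clustal W workflow used in the study: it only finds exact matches, and the helper names and the template `toy_template` are hypothetical constructions for the example.

```python
PRIMER_86F = "GTGAATCATCGAATCTTTGAA"  # forward primer as given in the text
PRIMER_4R = "TCCTCCGCTTATTGATATGC"    # reverse primer as given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def find_amplicon(template, forward, reverse):
    """Return the region bounded by exact matches of both primers, or None.

    The reverse primer anneals to the opposite strand, so its reverse
    complement is searched for on the template strand.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    site = template.find(reverse_site, start + len(forward))
    if site == -1:
        return None
    return template[start:site + len(reverse_site)]


# Hypothetical template built for the example; it is not a cacao sequence.
toy_template = (
    "ACGT" * 3
    + PRIMER_86F
    + "ACGTTGCA" * 5
    + reverse_complement(PRIMER_4R)
    + "TTTT"
)
toy_amplicon = find_amplicon(toy_template, PRIMER_86F, PRIMER_4R)
print(len(toy_amplicon), round(gc_percent(PRIMER_86F), 1), gc_percent(PRIMER_4R))
```

A real specificity check would also tolerate mismatches and search both strands of every candidate genome, which is what alignment tools such as BLAST are designed for.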
Finally, dendrograms were constructed by distance methods (UPGMA and neighbor-joining) and phylogenetic methods (maximum likelihood and maximum parsimony) using MEGA X (Kumar et al., 2018) to establish the kinship relationships between the ITS sequences of each cultivar. The best-fitting evolutionary model was determined and, from the phylogenetic tree obtained by the maximum likelihood method under the Tamura 3-parameter model with 1,500 bootstrap replicates (Kumar et al., 2018), a consensus tree was built to validate the results obtained by the other grouping methods. Furthermore, the cacao ITS sequence (JQ228376) reported in GenBank was used as an outgroup.
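To make the evolutionary model concrete, the sketch below computes the pairwise distance under the Tamura three-parameter model (Tamura, 1992), which corrects the observed proportions of transitions (P) and transversions (Q) for G+C content. It follows the commonly cited formula d = -h ln(1 - P/h - Q) - (1/2)(1 - h) ln(1 - 2Q), with h = θ1 + θ2 - 2θ1θ2 and θ1, θ2 the G+C proportions of the two sequences. The function name and the two short sequences are hypothetical, and this pairwise distance is only one ingredient of the tree-building carried out in MEGA X.

```python
import math

PURINES = frozenset("AG")


def t92_distance(seq1, seq2):
    """Tamura 3-parameter distance between two aligned DNA sequences.

    Positions where either sequence has a gap or an ambiguous base are
    ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable positions")
    transitions = transversions = 0
    for a, b in pairs:
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    p = transitions / n
    q = transversions / n
    gc1 = sum(a in "GC" for a, _ in pairs) / n
    gc2 = sum(b in "GC" for _, b in pairs) / n
    h = gc1 + gc2 - 2 * gc1 * gc2
    if h <= 0:
        raise ValueError("distance undefined for this G+C content")
    term1 = 1 - p / h - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for this model")
    return -h * math.log(term1) - 0.5 * (1 - h) * math.log(term2)


# Hypothetical aligned fragments: one transition and one transversion.
toy_a = "AACCGGTTAACCGGTTAACC"
toy_b = "GACCGGATAACCGGTTAACC"
print(round(t92_distance(toy_a, toy_b), 4))
```

With a G+C content of 50%, h equals 0.5 and the expression reduces to the Kimura two-parameter distance, which is a convenient sanity check. The corrected distance is slightly larger than the raw proportion of differing sites (2 of 20) because the model accounts for multiple substitutions at the same site.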
In addition, an in silico search was made for other markers that could be used in future work to discriminate between cacao cultivars. For this, the matK and rbcL genes, as well as the intergenic spacers psbA-trnH and rpl32-trnL, were searched for in the cacao genomes available in the databases. These sequences were then analyzed using phylogenetic methods.
Results and discussion
Phenotypic characterization
The shape of the leaves was uniform within each individual, and each type of cacao evaluated. The length of the leaf varied between 271.2 and 392.6 mm, while the width had values between 90.0 and 132.6 mm. In total, two forms of the base and two of the apex were observed. Similarly, in the intensity of the leaf color, two variants were found. The leaf sample recorded from Becerril, identified with code 008, had the smallest size with 287.0 mm in length and 90.6 mm in width. Moreover, most of the leaves of the evaluated cultivars have an obtuse base, acuminated apex, with medium color intensity. Although the size of the leaf is not used to differentiate cacao cultivars, the differences observed in this study, especially in the width of the blade, reflect that there is a degree of variability between them.
Likewise, the shape of the fruits was homogeneous within each individual evaluated. The length in all cultivars varied between 110 and 277 mm; the diameter values ranged between 55 and 200 mm, and the thickness of the epicarp between 4.35 and 17.97 mm. Diversity was observed in fruit shapes, and the oblong shape was the most frequent (figure 1). In most cultivars, the basal fruit constriction was absent or very weak; two apex forms were observed, and the predominant fruit surface was moderately rough. The fruit color varied, with yellow being the most common. The pulp colors recorded were white and light cream (table 2). This diversity of forms between cultivars shows the variability that exists between them, becoming traits that allow their differentiation and identification.
Concerning the number of seeds per fruit, the CCN51 clone had the highest number, followed by Criollo de Becerril (008) with 42 seeds; the cultivar with the fewest seeds was hybrid ICS95 (table 2). The average number of seeds was 30; seed length ranged between 21.53 and 29.76 mm, width between 10.59 and 15.17 mm, and thickness between 7.9 and 12.2 mm. In this sense, the seed size of the native cacaos found in Mingueo was larger than that of the hybrids. Most of the seeds were oval or oblong, and the predominant cotyledon color was dark purple, followed by white. All the cacao samples identified as native or creole have white cotyledons (table 2). The morphological variants of the observed seeds are typical of each cultivar and are reflected in the productive and organoleptic characteristics of these cacaos. In this way, the coloration of creole cacaos stands out in relation to commercial hybrids (García, 2009).
The cacao cultivars studied were grouped into native and non-native cacaos, and the quantitative traits were compared between the two groups. According to the Mann-Whitney test, the null hypothesis (H0) is not rejected for the variables fruit length, blade length, epicarp thickness, number of seeds, and pH; that is, no significant differences were detected between the two cacao groups at the 0.01 significance level. On the other hand, for the variables blade width, fruit diameter, °Brix, and seed length, width, and thickness, the null hypothesis is rejected, indicating significant differences with 99% confidence (α = 0.01) between creole or native and hybrid cacaos.
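For readers unfamiliar with the test, the sketch below shows how a Mann-Whitney comparison between two small groups can be computed. It ranks the pooled values, derives the U statistic for the first group, and obtains an exact two-sided p-value by enumerating every possible assignment of ranks to that group, which is feasible only for small samples. The source does not state how the p-values were calculated, so this is one possible approach; the function names and the seed-width values are hypothetical and are not measurements from this study.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average
        pos = end + 1
    return ranks


def mann_whitney_exact(x, y):
    """Return (U for x, exact two-sided p-value) by full enumeration."""
    ranks = midranks(list(x) + list(y))
    n1, n2 = len(x), len(y)

    def u_statistic(indices):
        return sum(ranks[i] for i in indices) - n1 * (n1 + 1) / 2

    u_observed = u_statistic(range(n1))
    center = n1 * n2 / 2  # expected U under the null hypothesis
    observed_dev = abs(u_observed - center)
    extreme = total = 0
    for indices in combinations(range(n1 + n2), n1):
        total += 1
        if abs(u_statistic(indices) - center) >= observed_dev - 1e-12:
            extreme += 1
    return u_observed, extreme / total


# Hypothetical seed widths (mm) for two groups; illustrative values only.
native_widths = [13.9, 14.2, 14.6, 15.0, 15.1]
hybrid_widths = [10.6, 11.0, 11.4, 11.9, 12.3, 12.8]
u_value, p_value = mann_whitney_exact(native_widths, hybrid_widths)
print(u_value, round(p_value, 4))
```

In this toy example the two groups do not overlap at all, so U takes its maximum value and the p-value falls below 0.01, the case in which the null hypothesis would be rejected at the significance level used here.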
The cluster analysis using quantitative phenotypic characteristics through Ward's method forms two large clades (figure 2): in one clade, the creole cultivars were ordered, including the Criollo cacao variety used as a control, and in the other, the hybrid cultivars.
The PCA was performed using both the qualitative and quantitative characteristics (figure 3). The dispersion of the samples indicates that the parameters with the greatest influence on the variability are GS (seed thickness), CC (cotyledon color), LF (fruit length), CF (fruit color), FF (fruit shape), PLF (depth between ridges), LS (seed length), and FSLS (longitudinal shape of the seed). Furthermore, the segregation of creole cacaos from commercial cultivars is observed (figure 3).
Genotypic analysis
Initially, some DNA extractions were carried out; however, the yield and quality were very poor. The first organic extractions did not separate the nucleic acids well from the cellular residues: at the end of the procedure, large amounts of a white pellet were observed, and it was not possible to amplify the ITS sequence from these DNAs. For this reason, several adjustments and tests were carried out using different concentrations of PVP, without significant improvement. Then, the concentration of β-mercaptoethanol was increased to 1%, and the number of organic extractions with chloroform:IAA was increased to up to four. Better results were thus obtained, which were verified by spectrophotometric quantification.
One of the factors that affect the extraction of nucleic acids from T. cacao leaves, and that interferes with the quality and quantity of DNA, is the high concentration of polyphenols and polysaccharides in the foliar tissue of the plant (Henao et al., 2017; Martínez et al., 2013; Schrader et al., 2012). Once the adjustments to the extraction protocol were made, DNA was obtained. According to the 260/280 ratio, the extracts still contained some contaminants. Given the presence of these residues, the samples were diluted to reduce the inhibitory effect and to reach a final DNA concentration of 10 ng/mL. As some authors have expressed, this type of problem is common when working with this species (Chia-Wong, 2009; Martínez et al., 2013; Ruiz, 2014). Then, the thermal profile of the PCR amplification was adjusted, and all the fragments were visible in the electrophoretic runs. The amplified fragments were 380 bp long and corresponded to the ITS2 region, located between the 5.8S gene and the large-subunit ribosomal RNA gene (White et al., 1990). This spacer region was selected because it allows intraspecific variation to be observed, as it is under less selective pressure than the coding regions (Quijada et al., 2017; Zambrano, 2017).
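The purity check and the dilution step mentioned above come down to simple arithmetic, sketched here. The absorbance ratio A260/A280 is compared against a window of roughly 1.8 to 2.0, a common rule of thumb for DNA with little protein contamination, and the dilution uses C1·V1 = C2·V2. The function names, the acceptance window, and all numeric inputs are assumptions for illustration; the source reports only the final concentration, not the stock concentrations or volumes.

```python
def purity_ratio(a260, a280):
    """Ratio of absorbance at 260 nm to absorbance at 280 nm."""
    return a260 / a280


def within_purity_window(ratio, low=1.8, high=2.0):
    """Rule-of-thumb check; the window limits are an assumption."""
    return low <= ratio <= high


def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock volume, diluent volume) from C1*V1 = C2*V2.

    Concentrations must share one unit and volumes another; the result
    is expressed in the unit used for final_volume.
    """
    if not 0 < target_conc <= stock_conc:
        raise ValueError("target must be positive and not exceed the stock")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


# Hypothetical readings and volumes, for illustration only.
demo_ratio = purity_ratio(0.90, 0.60)
demo_volumes = dilution_volumes(stock_conc=250.0, target_conc=10.0, final_volume=50.0)
print(round(demo_ratio, 2), within_purity_window(demo_ratio), demo_volumes)
```

Diluting the template lowers the concentration of co-extracted inhibitors along with the DNA, which is why it can rescue a PCR even when the purity ratio remains outside the usual window.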
Once the sequences were obtained, grouping analyses were performed by distance methods (UPGMA and neighbor-joining) and phylogenetic methods (maximum likelihood and maximum parsimony). The nucleotide substitution models for these sequences were evaluated, and the Tamura 3-parameter model was identified as the evolutionary model with the best fit (Tamura, 1992). With these parameters, the dendrograms were constructed. In all the methods, a clade was formed in which the creole cacaos were grouped (including the Becerril reference), separate from the commercial hybrid cacao clade. Also, most clades are statistically supported by all methods. Thus, a tree was built using the maximum likelihood method, locating the consensus clades and indicating the support from 1,500 bootstrap replicates for each method (figure 4).
Even though the ITS sequences used in the current work allowed the group of creole or native cacaos to be differentiated from the commercial ones, they do not resolve the entire evolutionary history. For this reason, a prospective in silico analysis of other markers was carried out to complement this analysis and lay the foundations for future research. Initially, the chloroplast genomes of cacao cultivars and other species of the Malvaceae family deposited in GenBank were retrieved (Benson et al., 2012). The same cacao cultivars were used for all analyses, and the species of the family varied for each marker. In these genomes, an in silico search was carried out, and phylogenetic methods showed that the markers matK and psbA-trnH have enough variability to reveal differences between cacao types. On the other hand, rbcL and rpl32-trnL do not resolve the differences between cacao cultivars.
In the area of Sierra Nevada de Santa Marta, the municipality of Dibulla has a biological diversity and environmental characteristics that favor a growing and dynamic agricultural activity. In this sense, it is necessary to carry out training programs for the production of quality cacao crops through the identification, selection, and use of internationally recommended materials and, in turn, promote the rescue of outstanding Criollo-type genotypes. This would allow the conservation and multiplication of germplasm to initiate genetic improvement programs to increase the quality of cacaos, as well as promoting efficient and sustainable cultivation systems over time. This work allows identifying the types of cacaos that are grown in the region to promote their cultivation and boost their final products.
When characterizing and identifying a cultivar, the phenotypic traits are the most distinctive. The phenotype is understood as the set of observable traits, whether morphological, physiological, or behavioral, within a species or population; this depends on the genotype, and can be influenced by environmental and nutritional factors (Botero & Arias, 2018). In this sense, the analyzed individuals were located in the same plot, in a uniform environment, under slightly variable temperature, humidity and radiation conditions, and with the utilization of the same fertilization regime and organic phytosanitary practices; furthermore, they were collected and measured on the same day. Therefore, the observed phenotypic variation should have little environmental influence.
Within the native cacaos found in Dibulla, typical Criollo type cacaos are found, and these are different from commercial cacaos (figures 2, 3 and 4). These are characterized by having leaves whose shape at the base and apex are obtuse and acuminate, respectively; the length and width of these leaves are uniform. On the other hand, most of the fruits of this cacao have an oblong shape, the surface is moderately rough, and the basal constriction is weak (table 2). Additionally, two fruit shapes were found, one with a notched apex and the other with an acute apex. Regarding the size, the fruits are smaller in length, width and thickness compared to the commercial ones. The seeds are uniform, oval in shape, and moderately elongated, and their size is similar to that of the commercial ones (table 2). As for the commercial cultivars evaluated, they retain most of the morphological characteristics described by García (2009).
Likewise, one of the phenotypic traits used to distinguish creole or native cacaos from other types is the color of the cotyledon (Avendaño et al., 2014; Ventura et al., 2004). The cotyledon color showed considerable variation, which is reflected in the distribution of the samples in the PCA (figure 3). In this sense, the commercial cacao cultivars showed dark purple coloration, and the native cacaos called criollos had white cotyledons (table 2).
In the case of Creole 006, the qualitative phenotypic characterization showed that it shares characteristics with some commercial cacaos, such as the color of the cotyledon and the shape of the base of the leaf that is acute, which suggests that this creole cacao is not pure, and that it is related to a hybrid. Likewise, Creole 008 from Becerril has some characteristics similar to clone CCN 51, indicating that it originates from crosses between Forasteros and Criollos (García, 2009). However, few phenotypic characteristics relate these cultivars, and these may be polygenic; therefore, additional genetic analyses should be carried out to determine the degree of kinship. For this reason, it is necessary to carry out population studies to cover a higher number of samples, a wider distribution, and the segregation of traits in the offspring, to examine distinction, homogeneity, and stability.
The ITS sequences used in this work allowed the group of creole cacaos to be differentiated from the commercial ones. However, the results do not resolve the entire evolutionary history (Zambrano, 2017). The small nuclear region chosen does not represent the entire genome and, therefore, does not account for the entire genetic evolution of the species. The analysis of chloroplast DNA (cpDNA) markers has been used increasingly in population genetics to determine population structure, gene flow, haplotype frequency, and phylogenetic relationships (Gutiérrez-López et al., 2016). For this reason, an in silico evaluation of other sequence markers of the chloroplast genome was made, which will allow more in-depth and precise analyses between cacao cultivars in the future. The phylogenetic analyses carried out on the markers matK and trnH-psbA could better resolve the differences between cacao cultivars and would allow a better understanding of the origin and kinship relationships of cacaos in the region. Likewise, in future studies, other markers such as microsatellites could be used, which would allow genetic diversity to be estimated (Aranguren-Díaz et al., 2018; Lanaud & Risterucci, 1999).
Conclusions
The collection of creole cacaos of Asociación de Productores Orgánicos del Municipio Dibulla is an important plant genetic resource that must be preserved. The phenotypic and genotypic analyses show that these native cacao cultivars have differences compared to commercial cultivars. Furthermore, this native germplasm can be classified within the genetic group of Criollo-type cacaos. However, more studies must be carried out to categorize and certify these.
The current work provides a baseline for the knowledge of the creole or native cacao of Sierra Nevada de Santa Marta, which will allow the development of new studies to deepen the denomination and certification of these types of cacao. In this sense, we propose to carry out more studies using a higher number of creole cacao individuals and the matK and trnH-psbA markers, to better establish the relationships between cacao cultivars and intra-population variations.
Finally, the conservation of native cacao cultivars is of vital importance since it contributes to the persistence of the species' variability over time and can favor genetic improvement programs, for which phenotypic and genotypic characterization is essential.