I. INTRODUCTION
Software development companies constantly seek strategies and mechanisms to improve the processes that support their projects, because they need to deliver high-quality products and services in short time intervals [1]. To achieve this, companies dedicate their efforts to the definition, application, and continuous improvement of their processes and practices [1]. As a result, a set of solutions has emerged, which can be classified as traditional or agile. Traditional solutions include CMMI [2], RUP [3], and Waterfall [4], while agile solutions include Scrum [5], Lean Software [6], TDD [7], and XP [8]. In addition, hybrid solutions that seek to apply the best of both approaches have been proposed; among the best known are Scrum & XP [9], Scrumban [10], and Scrum & CMMI [11]. However, traditional and agile solutions propose elements related to the construction of software products (Dev), leaving aside practices related to operation/infrastructure (Ops), which are addressed by solutions such as ITIL [12] and COBIT [13], and by standards such as ISO/IEC 20000 [14].
Advances in the automation of practices associated with the software development life cycle led to the emergence of frameworks that integrate the best practices for Dev and Ops, known as DevOps, which help improve the productivity, quality, and competitiveness of software development companies [15], [16]. The DevOps concept is not new: it was introduced in 2009 [17] with the objective of proposing a set of practices and activities necessary to close the existing gap between software development and operations, and in this way deliver value faster, with optimal functionality and excellent quality [18]. It does so through mechanisms focused on the use of technology, human talent, and processes that allow the automation of all the stages involved in the development of software projects. In this sense, DevOps focuses on promoting practices related to continuous integration [19], change management [20], automated tests [21], continuous deployment [22], and continuous maintenance [23], among others. However, adopting DevOps in software companies is not an easy task [24]. To minimize the risk of error in its implementation, companies must have the necessary elements to quantify and evaluate their degree of implementation during software development, with the aim of establishing a process of continuous improvement [25] that allows them to continually recognize enhancement opportunities through the evaluation of their processes.
The article is organized as follows: Section II describes the methodology used to conduct the systematic literature mapping; Section III presents the results obtained from the mapping; Section IV discusses the most important observations based on the results obtained, together with the limitations and implications for this field; Section V presents the conclusions and future work.
II. METHODOLOGY
The systematic literature mapping (SLM) is a method used to identify relevant studies in an area of interest and the subsequent analysis of the information obtained from a set of criteria defined by the authors. The systematic mapping was carried out following the methodological guide proposed in [26-30], applying the following stages in an orderly manner: (i) Planning, (ii) Execution, and (iii) Documentation.
A. Planning Stage
The planning stage includes the following activities: (i) objectives and research questions; (ii) research strategy; (iii) inclusion/exclusion criteria; (iv) quality evaluation criteria, and (v) execution stage.
1) Objectives and research questions. The set of research questions was established following the Goal-Question-Metric (GQM) methodology. This approach suggests a measurement model composed of three levels of abstraction: (i) conceptual level (Objective); (ii) operational level (Question); and (iii) quantitative level (Metric). At the conceptual level, the research questions were designed so that they are aligned with the objectives, which made it possible to focus, characterize, and structure the information related to the area of interest. The research questions and their motivation can be consulted at https://bit.ly/3tZr7kD.
2) Research Strategy. To search for primary studies, a search string was built by combining terms with the logical connectors "AND" and "OR". The search string was run on the following search engines: Google Scholar, IEEEXplore, Scopus, and SpringerLink. In addition, studies provided by experts in the field and used as gray literature were analyzed. The applied search string was: (devops OR "develop and operation" OR "development and operation") AND (capability OR maturity OR evaluation OR assessment OR measure OR measurement OR appraisal OR metric) AND ("reference model" OR tool OR process OR technique OR method).
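The structure of the search string (terms joined by OR within a group, groups joined by AND) can be illustrated with a minimal sketch. The term groups are copied from the string reported above; the function names `quote` and `build_query` are hypothetical helpers introduced only for illustration, and each search engine may require its own syntax adjustments.

```python
# Term groups copied from the search string reported in the text.
TERM_GROUPS = [
    ["devops", "develop and operation", "development and operation"],
    ["capability", "maturity", "evaluation", "assessment", "measure",
     "measurement", "appraisal", "metric"],
    ["reference model", "tool", "process", "technique", "method"],
]


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def build_query(groups) -> str:
    """Join terms with OR inside a group, then join the groups with AND."""
    clauses = ["(" + " OR ".join(quote(t) for t in group) + ")" for group in groups]
    return " AND ".join(clauses)


query = build_query(TERM_GROUPS)
print(query)
```

Keeping the groups as data makes it straightforward to add or remove a synonym and regenerate the same string for every search source.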
3) Inclusion/Exclusion Criteria. Studies were assessed according to their title, abstract, and keywords. The studies selected as relevant were evaluated using the following criteria: (i) studies in English that propose mechanisms to assess DevOps; and (ii) studies published from 2009 (when the term DevOps was first defined [17]) to 2021 in high-impact journals, conferences, and congresses. On the other hand, studies that meet at least one of the following exclusion criteria were discarded: (i) studies that do not contribute to the DevOps assessment, (ii) studies that do not have a sufficient level of detail, (iii) discussion studies submitted as an abstract or presentation, (iv) studies without publication date, and (v) duplicate studies.
4) Quality Evaluation Criteria. To measure the quality of the primary studies, a questionnaire with a three-point scoring scale (1, 0, and -1) was defined. The criteria used for the evaluation of articles can be consulted at https://bit.ly/3qLMkwn. The sum of the scores obtained by each study forms its final score (a value between -6 and +6). The scores obtained do not represent an exclusion criterion for the primary articles; they are used as an indicator to identify which studies may have greater relevance in the future. The table presenting the quality evaluation of the studies can be consulted at https://bit.ly/3tDRBb4.
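The scoring scheme can be sketched as follows. The number of criteria (six) is an assumption inferred from the reported range of -6 to +6 on a three-point scale; the criteria themselves are listed only at the link above, and the example answers are hypothetical.

```python
ALLOWED_SCORES = {-1, 0, 1}   # three-point scale described in the text
NUM_CRITERIA = 6              # assumption: inferred from the -6..+6 range


def quality_score(answers):
    """Return the final score of one study as the sum of its criterion scores."""
    if len(answers) != NUM_CRITERIA:
        raise ValueError(f"expected {NUM_CRITERIA} answers, got {len(answers)}")
    if any(a not in ALLOWED_SCORES for a in answers):
        raise ValueError("each answer must be -1, 0, or 1")
    return sum(answers)


# Hypothetical answers for a single study, for illustration only.
example_answers = [1, 1, 0, -1, 1, 0]
print(quality_score(example_answers))
```

Because the score is only an indicator and not an exclusion criterion, the function reports a value and leaves any ranking decision to the reviewer.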
5) Execution Stage. The selection of studies consisted of five iterations, one for each search source. For this, the following activities were conducted: (a) review of 7 studies corresponding to the gray literature; (b) selection of studies that meet the inclusion criteria; (c) selection of studies that answer the research questions; and (d) elimination of duplicate studies. As a result, 1211 related studies were identified, and a total of 24 primary studies were obtained after completing all the iterations.
III. RESULTS
This section presents the most relevant findings for each of the research questions defined for the SLM, together with the corresponding references so that the reader can analyze them in greater depth.
Q1: What is the temporal distribution of primary studies? It was identified that there is a growing interest as of 2014 in relation to the definition of proposals to assess DevOps. In 2019, the largest number of contributions was made with a total of 8 studies (33.3%) ([31, 32, 33, 34, 35, 36, 37, 38]), followed by 2020 with 5 studies (20.8%) ([39, 40, 41, 42, 43]), and 2018 with 4 studies (16.7%) ([44, 45, 46, 47]). On the other hand, in 2016 and 2017, 2 studies were conducted per year for a total of 4 studies (16.7%) ([48, 49, 50, 51]). Finally, in 2014, 2015, and 2021, one study was conducted per year for a total of 3 studies (12.5%) ([52, 53, 54]).
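The percentages reported for Q1 follow from dividing each count by the 24 primary studies. A minimal sketch of that arithmetic is shown below; the yearly counts are taken from the text, while the variable and function names are illustrative.

```python
# Number of primary studies per publication year, as reported for Q1.
STUDIES_PER_YEAR = {
    2014: 1, 2015: 1, 2016: 2, 2017: 2,
    2018: 4, 2019: 8, 2020: 5, 2021: 1,
}


def percentage_shares(counts):
    """Return each count as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {key: round(100 * n / total, 1) for key, n in counts.items()}


shares_by_year = percentage_shares(STUDIES_PER_YEAR)
for year, share in sorted(shares_by_year.items()):
    print(year, STUDIES_PER_YEAR[year], f"{share}%")
```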
Q2: What is the geographical distribution of primary studies? It was observed that most of the studies were conducted in Europe with a total of 15 (62.5%), out of which 5 ([35, 36, 44, 51, 52]) were proposed in the Netherlands, followed by Norway with 2 related studies ([31, 39]), and finally Germany, Austria, Spain, Finland, Italy, Lithuania, Portugal, and Sweden with 1 study each, for a total of 8 related studies ([32, 33, 34, 41, 42, 47, 49, 53]). On the other hand, (i) 3 related studies (12.5%) were identified in South America, out of which 2 were proposed in Colombia ([38, 43]) and 1 in Brazil ([50]); (ii) 3 studies (12.5%) were conducted in Asia, out of which 2 were proposed by authors located in Turkey ([46, 48]) and 1 in Saudi Arabia ([37]); (iii) 2 studies (8.3%) were conducted in Africa, both in South Africa ([40, 45]); and (iv) 1 related study (4.2%) was carried out in North America, specifically in the United States ([54]).
Q3: What are the most cited primary studies? According to the results, it was possible to observe that the most cited study was [53] with a total of 137 citations, followed by [31] with 16 citations. On the other hand, [48] and [51] were cited 15 times each. [37, 44, 47] were cited 9 times each, [45] was cited 6 times, [39] was cited 4 times, [35, 38, 49] were cited 3 times, [52] was cited 2 times, [33, 36, 50] were cited 1 time each. Finally, [32, 34, 40, 41, 42, 43, 54] have not been cited because they were recently published and have not been sufficiently disseminated to be identified by the scientific community.
Q4: What are the research methodologies or instruments used in the literature? From the results it was observed that: (i) 10 papers (41.7%) ([34, 37, 39, 40, 42, 45, 47, 49, 55, 56]) carry out exploratory studies through SLM to establish the state of the art in the use of models, processes, techniques, tools, or reference frameworks for DevOps assessment; (ii) 7 studies (29.2%) ([32, 33, 48, 50, 54, 57, 58]) propose solutions to assess DevOps through case studies in software companies; (iii) 4 studies (16.7%) ([35, 41, 46, 52]) propose metrics following the action-research model; (iv) 2 papers (8.3%) ([44, 51]) were conducted as systematic reviews of the literature; and (v) 1 study (4.2%) ([31]) was carried out through empirical research models proposed by its authors.
Q5: What is the type of proposed solution? In relation to the type of solution, it was identified that: (i) in [49] an exploratory study analyzing different tools to assess DevOps in Small and Medium-sized Enterprises (SMEs) dedicated to software is carried out; (ii) in [38, 39, 43, 55] an SLM was carried out to identify the elements to be considered for applying DevOps in software companies; (iii) in [34, 37, 40, 51] studies were carried out to establish the state of the art in relation to the use of maturity models to assess DevOps; (iv) in [47, 52, 56, 57] metrics are proposed to evaluate specific practices such as construction, integration, and continuous deployment during the different stages of DevOps adoption; (v) in [32, 45] competence models are proposed; (vi) in [34, 35, 37, 40, 41, 42, 44, 46, 50, 51] maturity models are proposed; (vii) in [32] a collaboration model is proposed; (viii) in [50] an evaluation model adapted to DevOps based on SMM (Scrum Maturity Method) is proposed; (ix) in [33] a method to certify the use of DevOps best practices was applied; (x) in [31] a model to evaluate development, security, and operations (DevSecOps) through the values and principles proposed in DevOps is suggested; and (xi) in [54] a standard for the adoption of DevOps in software companies is proposed.
Q6: Have technological tools been proposed to assess DevOps? The analyzed studies were segmented into two categories: (i) studies that propose methodological solutions to assess DevOps (87.5%) ([31, 33, 34, 35, 39, 40, 41, 42, 44, 45, 46, 47, 50, 51, 52, 54, 55, 57, 58]); and (ii) studies that perform a comparative analysis of tools suggested to assess DevOps (12.5%) ([32, 48, 49]). To establish a broader state of knowledge regarding the use of tools, an exploratory study was carried out based on the methodology proposed in [59], in which a total of 13 tools developed by different companies seeking to assess DevOps were identified. The tools were analyzed according to aspects such as: (i) accessibility (A1), that is, whether the tool is free to access, free with a trial period, or paid; (ii) method used for evaluation (A2), that is, whether the evaluation is carried out through surveys, frameworks, consulting, reference models, or other means; and (iii) objective or scope of the evaluation (A3), that is, whether the tool evaluates the process, practices, activities, roles, tasks, principles, or other elements. In relation to accessibility (A1), it was observed that 7 tools (53.8%) ([60, 61, 62, 63, 64, 65, 66]) are free and offer their service through surveys or methodological guides, followed by 5 tools (38.5%) ([67, 68, 69, 70, 71]) which are paid, and 1 tool (7.7%) ([72]) that offers a trial period to the user and requests a subscription to access all the services it provides. Regarding the evaluation method used by the tools (A2), 6 of them (46.2%) ([60, 61, 62, 63, 65, 66]) carry out the evaluation through surveys; on the other hand, 5 tools (38.5%) ([68, 69, 70, 71, 72]) carry out the evaluation through custom consulting, and 2 tools (15.4%) ([64, 67]) assess DevOps through methodological guides and specialized frameworks.
Regarding the objective or scope of the evaluation (A3), 6 tools (46.2%) ([67, 68, 69, 70, 71, 72]) assess DevOps as a process, taking into account the set of principles, values, tasks, activities, and roles carried out by a company; 5 tools (38.5%) ([60, 62, 63, 64, 65]) carry out an evaluation based on practices such as construction, integration, and continuous deployment; and 2 tools (15.4%) ([61, 66]) assess DevOps according to compliance with the Culture, Automation, Measurement, and Sharing (CAMS) principles proposed by DevOps.
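The classification of the 13 tools under aspects A1 to A3 can be checked with a short sketch. The reference numbers [60] to [72] are used here as tool identifiers, and the groupings are transcribed from the text; the category labels and helper names are illustrative. Percentages are recomputed with standard rounding to one decimal place, and the sketch also verifies that each aspect assigns every tool to exactly one category.

```python
# Reference numbers [60]..[72] stand in for the 13 tools identified.
TOOLS = set(range(60, 73))

ASPECTS = {
    "A1 accessibility": {
        "free": {60, 61, 62, 63, 64, 65, 66},
        "paid": {67, 68, 69, 70, 71},
        "trial": {72},
    },
    "A2 method": {
        "survey": {60, 61, 62, 63, 65, 66},
        "consulting": {68, 69, 70, 71, 72},
        "guides/frameworks": {64, 67},
    },
    "A3 scope": {
        "process": {67, 68, 69, 70, 71, 72},
        "practices": {60, 62, 63, 64, 65},
        "CAMS principles": {61, 66},
    },
}


def is_partition(categories, universe):
    """True when the categories are disjoint and together cover the universe."""
    members = [tool for group in categories.values() for tool in group]
    return len(members) == len(set(members)) and set(members) == universe


def shares(categories, total):
    """Percentage of tools in each category, rounded to one decimal place."""
    return {name: round(100 * len(group) / total, 1)
            for name, group in categories.items()}


for aspect, categories in ASPECTS.items():
    assert is_partition(categories, TOOLS), aspect
    print(aspect, shares(categories, len(TOOLS)))
```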
Q7: What types of companies engage in the related studies? To conduct the analysis, the classification of large, medium, and small companies according to the number of employees defined by the European Union in Regulation (EU) No 651/2014 [73] was used as a criterion, complemented by the definition of micro-enterprise proposed in [74]. As a result, 14 studies (58.3%) ([31, 34, 37, 38, 39, 42, 43, 45, 46, 47, 48, 51, 52, 53]) were not applied in software development companies, 4 studies (16.7%) ([33, 36, 40, 50]) were applied only in large companies, 1 study (4.2%) ([35]) was applied in a medium-sized company, and 1 study (4.2%) ([41]) belongs to an unknown category. On the other hand, studies evaluated in multiple companies were also conducted: 1 study (4.2%) ([49]) evaluated its proposal in 1 medium-sized and 1 small company; 1 study (4.2%) ([44]) carried it out in a medium-sized and a large company; 1 study (4.2%) ([32]) carried out multiple case studies in 3 large, 3 medium-sized, and 3 small companies; and 1 study (4.2%) ([54]) proposes a standard that can be applied transversally in companies of any type.
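A headcount-based classification of this kind can be sketched as below. The thresholds of 50 and 250 employees correspond to the staff headcount ceilings for small and medium-sized enterprises in Annex I of Regulation (EU) No 651/2014. The micro-enterprise limit is exposed as a parameter because the definition in [74] is not reproduced here; its default of 10 employees is an assumption. The sketch ignores the turnover and balance-sheet criteria that the regulation also defines.

```python
def classify_company(employees: int, micro_limit: int = 10) -> str:
    """Classify a company by headcount only.

    micro_limit is an assumption standing in for the micro-enterprise
    definition of [74]; 50 and 250 are the EU headcount ceilings for
    small and medium-sized enterprises.
    """
    if employees < 0:
        raise ValueError("employees must be non-negative")
    if employees < micro_limit:
        return "micro"
    if employees < 50:
        return "small"
    if employees < 250:
        return "medium"
    return "large"


# Hypothetical headcounts, for illustration only.
for headcount in (5, 30, 120, 800):
    print(headcount, classify_company(headcount))
```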
IV. DISCUSSION
This section presents the analysis of the results obtained from the execution of the SLM.
A. Main Observations
During the last decade, significant advances have been made in defining methodological solutions and tools to assess DevOps in software companies. However, a high degree of heterogeneity was observed in the proposed solutions, since there is no clear consensus on the definitions and concepts associated with DevOps [38]. Proposals such as [35, 41, 42, 46, 55] suggest capability and maturity models supported by the process elements proposed by CMMI, unlike [58], which follows the process elements proposed by ITIL. On the other hand, in [50] a maturity model supported by the set of values and good practices proposed by SMM is suggested, and in [54] a standard for the adoption of DevOps is proposed. It was also identified that the industry has focused its efforts on implementing tools to assess DevOps through instruments such as surveys, frameworks, methodological guides, and tailored consulting services. As a result, DevOps assessment solutions based on values, principles, activities, practices, roles, and tasks have been proposed. However, each author or company defines its own evaluation criteria based on the DevOps practices or elements it considers appropriate, because no general reference model or standard exists that can be applied transversally. Although all solutions pursue the same objective, namely to evaluate the capacity, maturity, and/or degree of competence/implementation of DevOps, there is no general consensus on how to assess DevOps in a clear and unambiguous way, which generates confusion. Hence, a company can obtain different results after applying multiple evaluations to the same process.
On the other hand, a strong interest in the use and validation of the proposed solutions in companies of different sizes was identified. Most of the efforts have focused on case studies in large and medium-sized companies, leaving aside micro and small software companies, perhaps because (i) these companies have not contemplated the institutionalization of practices related to DevOps, and/or (ii) they do not have the necessary resources in terms of capital and human talent to adopt assessment models optimally.
B. Limitations
The results of the exploratory study are limited by the capabilities of the scientific search engines used. The inclusion criteria used as a starting point in the search for primary studies are limited to studies written in English. In addition, the results obtained serve as a starting point for a more exhaustive version that seeks to identify gaps and elements that were not considered during this study.
V. CONCLUSIONS
In the last decade, DevOps has become one of the main focuses of interest for the scientific community and for the industry, which constantly seeks to improve its development processes. To achieve this, companies invest resources and time in defining good practices that allow a clear adoption of DevOps. As a result, they engage in a continuous improvement process to assess whether they are applying DevOps appropriately. To this end, processes; capacity, competence, and maturity models; reference frameworks; methodological guides; metrics; tools; and techniques have been proposed. However, there is a high degree of heterogeneity in these solutions, which is inconvenient for companies, since they do not have a clear picture of which one they should adopt to ensure that their assessment is correct. In this sense, a metrics model supported by a reference model is being developed to conduct an objective DevOps assessment.