I. INTRODUCTION
One of the most important processes in the software project life cycle is requirements engineering [1], whose objective is to define business needs clearly and precisely and to translate customer needs into tasks that can be implemented at later stages of solution development [2]. Requirements engineering is crucial to the success of a software development project because it helps avoid rework and cost overruns, such as those arising from defects introduced by ill-defined requirements and from the additional effort that results from little or no requirements management during project execution. In this sense, requirements engineering equips projects with a set of tools to ensure the quality of the requirements [3]. In general, the quality of requirements is assessed following the guidelines of the ISO/IEC/IEEE 29148 standard, which proposes that the conformity of a requirement be assessed by systematically identifying a concept known as a “smell” [1].
In the context of requirements, a smell is defined as an indicator of a quality violation, which can lead to a defect, with a specific location and detection mechanism [4]. ISO/IEC/IEEE 29148 defines a set of bad smells, including: subjective language, ambiguous adverbs and adjectives, loopholes, open-ended, non-verifiable terms, superlatives, comparative phrases, negative statements, vague pronouns, and incomplete references, among others [4]. Currently, requirements are typically written in natural language; therefore, quality control of a requirement is carried out through peer reviews [3].
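To make the notions of "location" and "detection mechanism" concrete, the following is a minimal sketch of a dictionary-based detector. The phrase lists, the `Finding` type, and the `detect_smells` function are hypothetical and purely illustrative; they are not taken from ISO/IEC/IEEE 29148 or from any tool discussed in this mapping.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical, deliberately tiny dictionaries (illustrative assumption only).
SMELL_DICTIONARY = {
    "subjective language": ["user-friendly", "easy to use", "cost-effective"],
    "superlatives": ["best", "fastest", "most"],
    "loopholes": ["if possible", "as appropriate", "as far as possible"],
}


@dataclass(frozen=True)
class Finding:
    smell: str   # smell category that the matched phrase belongs to
    phrase: str  # text as it appears in the requirement
    start: int   # character offset where the match begins
    end: int     # character offset just past the match


def detect_smells(requirement: str) -> List[Finding]:
    """Return every dictionary phrase found in the requirement, ordered by position."""
    findings = []
    for smell, phrases in SMELL_DICTIONARY.items():
        for phrase in phrases:
            # Word boundaries avoid matching inside longer words.
            pattern = r"\b" + re.escape(phrase) + r"\b"
            for match in re.finditer(pattern, requirement, flags=re.IGNORECASE):
                findings.append(Finding(smell, match.group(0), match.start(), match.end()))
    return sorted(findings, key=lambda f: f.start)


text = "The system shall be user-friendly and, if possible, use the fastest algorithm."
for finding in detect_smells(text):
    print(finding.smell, "->", repr(finding.phrase), "at", (finding.start, finding.end))
```

A lookup of this kind only flags candidate phrases; whether a flagged phrase is a genuine defect still has to be decided by a reviewer.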
Identifying these smells early in the requirements development process can help detect and correct quality defects before they have a major impact on projects. Conversely, poor management of the smells present in the requirements leads to a phenomenon known as Requirements Debt, which can be defined as the distance between the optimal requirements specification and the actual system implementation, under domain assumptions and constraints [19]. Requirements debt negatively affects projects, generating cost overruns, rework, and the additional effort needed to mitigate or resolve the resulting defects.
To obtain an up-to-date view of the proposals, studies, and initiatives on the identification and classification of requirements smells, and of how these smells can cause or help avoid requirements debt, this article presents the results of a systematic mapping of the literature in which the initiatives and proposals on this subject were documented and analyzed. The article is structured as follows: Section II presents the research method used for the mapping, as well as the execution of the search. Section III presents the results obtained in response to the research questions. Finally, Section IV presents the discussion of the results and the main observations, together with the conclusions and future work.
II. METHOD
A systematic mapping of the literature (hereinafter SML) is a process for collecting, categorizing, and structuring the existing information on a topic of research interest, mainly in the area of Software Engineering [5]. The mapping presented in this paper follows the protocol proposed by Petersen et al. [5], [6], which describes the guidelines for conducting a systematic mapping in Software Engineering. In addition, the guidelines proposed by Kitchenham [7] and Budgen et al. [8] were followed for the research protocol, which consists of the following stages: (i) definition of research questions; (ii) search for primary studies; (iii) screening of primary studies against inclusion and exclusion criteria; (iv) quality assessment of the primary studies; and (v) data extraction. Figure 1 shows the relationship between the stages performed in the systematic mapping.
A. Definition of Research Questions
To conduct the SML, three (3) research questions (hereinafter RQs) were defined; they are presented in Table 1. The RQs categorize the information identified on smells in software development and help identify existing research gaps.
ID | Research Question | Incentive (Purposes) |
---|---|---|
RQ1 | What kinds of solutions have been proposed? | To know the contribution in one or more of the following categories: (i) conceptual definition, (ii) causes, effects, impacts, and limitations, (iii) evaluation techniques, (iv) technological tools, (v) validation in the industry, (vi) documentation methodologies, (vii) others. |
RQ2 | Which results have been achieved with the proposals made? | To identify the impact of the proposals made based on the results obtained during their validation in the software industry. |
RQ3 | Which benefits and challenges does research on the subject entail? | To determine the benefits and challenges for companies to detect and reduce requirements smells associated with the software development life cycle. |
B. Conduct Search for Primary Studies
To search for studies, keywords identified in a previous review on smells and requirements debt in software development were combined using the logical operators “AND” and “OR”. As a result, the following basic search string was obtained: (“Requirement Smell” OR “Requirement Smells” OR “Requirements Smells”) OR “Requirements debt” AND (“Software development” OR “Software engineering” OR “requirements engineering”). The string was adapted and applied in seven (7) scientific databases: IEEE Xplore, Science Direct, Scopus, Google Scholar, Springer Link, ACM, and Web of Science (WoS).
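As printed, the basic search string mixes OR and AND without outer parentheses, so its evaluation depends on the operator precedence of each database. The sketch below builds a fully parenthesized variant, assuming the intended reading is "any topic term AND any context term"; that grouping, and the helper names `any_of` and `all_of`, are assumptions made for illustration and are not stated in the protocol.

```python
def any_of(terms):
    """Quote each term, join the terms with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def all_of(groups):
    """Join already-parenthesized groups with AND."""
    return " AND ".join(groups)


# Keywords copied from the basic search string; the split into two groups is assumed.
topic_terms = [
    "Requirement Smell",
    "Requirement Smells",
    "Requirements Smells",
    "Requirements debt",
]
context_terms = [
    "Software development",
    "Software engineering",
    "requirements engineering",
]

query = all_of([any_of(topic_terms), any_of(context_terms)])
print(query)
```

Making the grouping explicit keeps the query's meaning stable when the string is adapted to databases with different precedence rules.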
C. Screening of Primary Studies for Inclusion and Exclusion Criteria
The selection of relevant studies was carried out at three levels: (i) review of the title, (ii) review of the abstract, introduction, and conclusions, and (iii) review of the full text to determine whether the study met all the inclusion criteria (IC) described in Table 2. Subsequently, studies that met at least one of the exclusion criteria (EC) described in Table 3 were discarded.
ID | Inclusion Criteria (IC) |
---|---|
IC1 | Articles whose focus is bad smells in software development requirements. |
IC2 | Articles whose main subject is the quality of requirements in software development. |
IC3 | Articles whose subject is related to requirements engineering. |
IC4 | Articles published in journals or in prestigious peer-reviewed congresses or conferences. |
ID | Exclusion Criteria (EC) |
---|---|
EC1 | Duplicate articles (considering only the most complete and recent that can be evidenced). |
EC2 | Articles where the research topic is superficially addressed. |
EC3 | Articles not related to requirements debt during software development. |
EC4 | Articles of debate type or available only in presentation form or abstracts. |
EC5 | Articles that are books or book chapters. |
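The screening rule described above (a study is kept only if it meets every IC and no EC) can be sketched as a simple predicate. The function name and the representation of a study as two sets of criterion IDs are hypothetical choices made for illustration.

```python
INCLUSION_CRITERIA = {"IC1", "IC2", "IC3", "IC4"}
EXCLUSION_CRITERIA = {"EC1", "EC2", "EC3", "EC4", "EC5"}


def is_primary_study(met_ic, met_ec):
    """Keep a study only if it meets every IC and none of the ECs."""
    meets_all_ic = INCLUSION_CRITERIA <= set(met_ic)
    meets_any_ec = bool(EXCLUSION_CRITERIA & set(met_ec))
    return meets_all_ic and not meets_any_ec


# Hypothetical screening records: (criteria met, exclusion criteria triggered).
print(is_primary_study({"IC1", "IC2", "IC3", "IC4"}, set()))    # kept
print(is_primary_study({"IC1", "IC2", "IC3", "IC4"}, {"EC5"}))  # discarded: book chapter
print(is_primary_study({"IC1", "IC3"}, set()))                  # discarded: misses IC2 and IC4
```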
D. Quality Assessment of Primary Studies
The quality of the primary studies was also assessed to estimate their potential future relevance. The assessment was based on the instrument proposed by Kitchenham [11] and an adaptation of the assessment system proposed in [12]. As a result, a questionnaire of eleven (11) criteria was constructed and organized into five (5) categories: clarity, quality, credibility, relevance, and rigor. To evaluate the criteria, a three-value scoring system (-1, 0, +1) was defined, which is presented in detail at https://tinyurl.com/29bs5lgs. Note that the score obtained by an article is not an exclusion criterion; it is used only to gauge the relevance that the article could have in the future.
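As an illustration of how such a questionnaire could be aggregated, the sketch below validates eleven scores on the (-1, 0, +1) scale and adds them up. Summing the scores into a single total is an assumption made here; the actual instrument and its aggregation rule are the ones documented at the link above.

```python
ALLOWED_SCORES = (-1, 0, 1)
NUMBER_OF_CRITERIA = 11


def quality_score(scores):
    """Validate one score per criterion and return their sum (range -11 to +11)."""
    if len(scores) != NUMBER_OF_CRITERIA:
        raise ValueError(f"expected {NUMBER_OF_CRITERIA} scores, got {len(scores)}")
    if any(score not in ALLOWED_SCORES for score in scores):
        raise ValueError("each score must be -1, 0, or +1")
    return sum(scores)


# Hypothetical assessment of a single article.
example_scores = [1, 0, -1, 1, 1, 0, 0, 1, -1, 1, 1]
print(quality_score(example_scores))
```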
E. Data Extraction
Relevant information was extracted from each study using a template that records elements such as the problem addressed, the type and methodology of the research, and the type of solution proposed, among others. The template standardized and facilitated the extraction of relevant information from each study. It can be consulted at https://tinyurl.com/2chsxq5u.
III. RESULTS
In total, seven (7) search iterations were performed, one for each database. Since each database has its own configuration, the original search string had to be adapted to each one (the adapted strings can be consulted at https://tinyurl.com/27btxspz). According to the results presented in Table 4, 533 studies were identified, of which 132 were initially selected as relevant. After a detailed review, 46 repeated studies were eliminated, leaving a total of 86 relevant studies after applying the ICs. Subsequently, applying the ECs eliminated 62 studies. Finally, a total of 24 studies were obtained, which are considered primary studies (hereafter PSs).
No. | Data Source | Identified Studies | Relevant Studies | Repeated Studies | Primary Studies |
---|---|---|---|---|---|
1 | Google Scholar | 275 | 56 | 0 | 19 |
2 | Scopus | 128 | 61 | 31 | 5 |
3 | Science Direct | 24 | 4 | 4 | 0 |
4 | Springer Link | 46 | 2 | 2 | 0 |
5 | IEEE Xplore | 41 | 6 | 6 | 0 |
6 | ACM | 18 | 2 | 2 | 0 |
7 | WoS | 1 | 1 | 1 | 0 |
Total | | 533 | 132 | 46 | 24 |
In addition, backward snowballing [17] was performed on the references of the PSs. As a result of this review, four (4) additional studies were selected, bringing the total to 28 PSs. Due to space limitations, details of all PSs are provided at https://tinyurl.com/2cqzcn2p; hereafter, articles are referenced with the acronym A, as listed there. The contribution of each primary study to answering the research questions posed in Table 1 is described at https://tinyurl.com/25xxj3bg. The following subsections present the results for each research question defined in this systematic mapping.
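The selection funnel can be re-derived from the per-database counts in Table 4. The following sketch recomputes the column totals and the intermediate figures quoted in the text; the variable names are illustrative.

```python
# Per-database counts from Table 4: (identified, relevant, repeated, primary).
table_4 = {
    "Google Scholar": (275, 56, 0, 19),
    "Scopus": (128, 61, 31, 5),
    "Science Direct": (24, 4, 4, 0),
    "Springer Link": (46, 2, 2, 0),
    "IEEE Xplore": (41, 6, 6, 0),
    "ACM": (18, 2, 2, 0),
    "WoS": (1, 1, 1, 0),
}

# Column-wise totals across the seven databases.
identified, relevant, repeated, primary = (sum(column) for column in zip(*table_4.values()))

after_deduplication = relevant - repeated  # relevant studies left after removing repeats
removed_by_ec = 62                         # studies eliminated by the exclusion criteria
after_exclusion = after_deduplication - removed_by_ec
snowballed = 4                             # studies added through backward snowballing
final_primary_studies = after_exclusion + snowballed

print(identified, relevant, repeated, primary)
print(after_deduplication, after_exclusion, final_primary_studies)
```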
A. What Kinds of Solutions Have Been Proposed?
As shown in Figure 2, 3.6% of the studies (A1) provide a conceptual definition, proposing a set of smells associated with requirements: subjective language, ambiguous adverbs and adjectives, loopholes, open-ended, non-verifiable terms, superlatives, comparative phrases, negative statements, vague pronouns, and incomplete references. A further 3.6% of the studies (A13) fall into the category of causes, effects, impacts, or limitations.
28.6% of the studies (A3, A7, A11, A12, A14, A16, A17, A19) focus on the definition of assessment methods or techniques. For example, A3 analyzed the ISO/IEC/IEEE 29148 [43] and IEEE-830-1998 [44] standards to assess whether early smell detection and correction can help improve quality reviews. A7 analyzed and categorized 870 requirements, determining how many of them contain requirements smells and how these relate to the use of subjective language, incomplete references, or non-verifiable terms; the results were compared on a set of examples against assessments made by humans, with the purpose of identifying ambiguous terms across different domains and ranking them by ambiguity score.
32.1% of the studies (A2, A5, A15, A20, A21, A23, A24, A25, A28) concentrate on the development of technological tools. For instance, A15 proposes the Quality User Story framework to detect quality defects and suggest possible solutions based on 13 quality criteria for user stories (US). These criteria include that a US be well-structured, atomic, minimalist, conceptually sound, problem-oriented rather than solution-oriented, unambiguous, non-conflicting, written as a complete and well-formed sentence, unique, uniform, independent, and complete.
21.4% of the studies (A4, A6, A10, A18, A26, A27) concentrate on industry validation. For example, A6 uses a Natural Language Processing (NLP) tool to empirically validate the effect that quality defects have on test cases designed in later stages of a project. A18 studies the problems that arise when interpreting semantic relations of the type "If A and B then C" in the wording of requirements, since they are a source of ambiguity.
Finally, 10.7% of the studies (A8, A9, A22) are systematic literature reviews. A9 sought to identify the criteria that should be considered when evaluating the quality of requirements in agile software development, starting from the INVEST mnemonic (Independent, Negotiable, Valuable, Estimable, Small, and Testable) proposed by Bill Wake in 2003, which sets out the characteristics that should be considered to ensure the quality of a user story [45]. That study also included additional criteria such as the completeness, consistency, and uniformity of a user story.
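The percentages reported for RQ1 follow directly from the number of studies per category out of the 28 PSs. The sketch below recomputes them from the article identifiers listed in the text; the dictionary keys are shortened category labels chosen for illustration.

```python
# Article identifiers per solution category, as listed in the text.
categories = {
    "conceptual definition": ["A1"],
    "causes, effects, impacts or limitations": ["A13"],
    "assessment methods or techniques": ["A3", "A7", "A11", "A12", "A14", "A16", "A17", "A19"],
    "technological tools": ["A2", "A5", "A15", "A20", "A21", "A23", "A24", "A25", "A28"],
    "industry validation": ["A4", "A6", "A10", "A18", "A26", "A27"],
    "systematic literature reviews": ["A8", "A9", "A22"],
}

total_studies = sum(len(articles) for articles in categories.values())

# Share of each category, rounded to one decimal place as in the text.
shares = {
    name: round(100 * len(articles) / total_studies, 1)
    for name, articles in categories.items()
}

for name, share in shares.items():
    print(f"{name}: {share}%")
```

Each article appears in exactly one category, so the category counts add up to the 28 PSs.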
B. Which Results Have Been Achieved with the Proposals Made?
As shown at https://tinyurl.com/25rlkcp8, several studies (A1, A2, A3, A5, A6, A8, A9) define bad smells in requirements as indicators of a quality violation that can lead to a defect. Studies A1, A2, A3, A5, A7, A8, A9, and A23 rely on ISO/IEC/IEEE 29148 to describe the following smells: subjective language, ambiguous adverbs and adjectives, loopholes, open-ended terms, superlatives, comparative phrases, negative statements, vague pronouns, and incomplete references. In addition, A7 relies on the ISO/IEC 25010 standard to define smells in terms of the following quality characteristics: functional suitability, performance efficiency, compatibility, usability, reliability, security, maintainability, and portability. In A8, smells are defined as signs of inaccuracy or ambiguity in the requirements statement. In A9 and A22, some sources of poor requirements quality are defined, such as incompleteness and ambiguity. In A22 and A25, smells are defined using concepts such as vagueness, language problems, and ambiguity. As a result, these studies propose a classification of requirements smells according to their lexical, syntactic, semantic, and pragmatic conformity. Finally, only one of the studies (A4) proposes a definition related to the concept of requirements debt, describing it as the distance between the optimal requirements and the actual system implementation.
The analysis of the related works identified 15 smells in the literature, some of which are based on the taxonomy of the ISO/IEC/IEEE 29148 standard. They include: subjective language, ambiguous adverbs and adjectives, loopholes, open-ended, non-verifiable terms, superlatives, comparative phrases, negative statements, vague pronouns, and incomplete references, among others. Details of each of the smells identified in the SML are available at https://tinyurl.com/2aquzp8y. Furthermore, a total of 10 causes of bad smells in software requirements were identified, including: requirements written in natural language or in the passive voice, limited or non-existent knowledge of the software engineering domain, use of conditionals in defining requirements, and informal analysis of requirements, among others. In addition, a total of 11 critical effects that occur in the presence of bad smells in requirements were identified, including: cost and time overruns in the software development life cycle, rework, misunderstandings between the authors and readers of requirements documents that affect test cases, and problems in the processes that follow the definition of software requirements, among others (the details of causes and effects can be consulted at https://tinyurl.com/2de6lftk). Similarly, a total of 5 good practices proposed in the primary studies were observed to prevent bad smells in requirements and anticipate their effects: automatic review of requirements, defining measurable requirements, validating requirements for bad smells before defining test cases, taking positive and negative cases into account in automatically generated test cases, and preprocessing requirements written in natural language before moving to the design phase. In general, all of these practices focus on preventing bad smells in software requirements.
Regarding the use of tools, the studies mention different implementations for smell detection and quality assessment in the requirements (A1, A15, A23, A25, A26, A27, A28). Details of these practices and tools are available at https://tinyurl.com/263jynnb.
Finally, A11 proposes a Natural Language Processing approach to identify ambiguous terms across different domains and rank them by ambiguity score. A16 proposes a series of steps to ensure the quality of the requirements. A22 shows a state-of-the-art review of different NLP-based strategies for ambiguity detection, and A19 presents a set of requirements data to use in NLP tool development processes.
C. What Are the Benefits and Challenges of Researching This Topic?
According to the results presented in A2, A3, A6, A24, and A25, thoroughly identifying smells in the requirements early in the software development process reduces the incidence of defects in later stages of development, saving money, effort, and time. Moreover, test cases can be generated early (A6) and project estimates are more accurate (A9). In addition, approaches or techniques that apply automatic natural language processing help mitigate human limitations in detecting quality defects in the requirements (A24). NLP approaches also speed up the review of quality defects; e.g., the tool defined in A25 detects four times more genuine ambiguities than an average human analyst.
Regarding the challenges, the development of new techniques for detecting smells in requirements has been proposed to increase the understanding of which defects these smells can reveal and which they cannot (A1, A3). In A13, six (6) challenges were identified when using NLP systems for requirements analysis: (i) improvement of coreference resolution, (ii) extension of the scope of inputs for NLP systems (e.g., visual and auditory inputs, among others), (iii) reduction of ambiguity, (iv) support for the use of domain ontologies, (v) improvement of the accuracy of natural language processing in requirements, and (vi) improvement of algorithms for detecting ambiguity. A5 concludes that the existing requirements data sets for natural language processing need to be extended in multiple applied studies. A4 and A11 make explicit the need to establish strategies to improve the detection of ambiguities according to the business domain in the requirements gathering and analysis phases. Additionally, challenges were identified related to the need to study different smell detection techniques using criteria such as subjective language, comparative phrases, and superlatives, among others, and to determine their impact on the quality of the artifacts resulting from the requirements specification. Finally, the primary studies are based on the detection of smells in requirements written in English, which limits the study of smell identification or classification in other languages.
IV. DISCUSSION AND CONCLUSIONS
This SML presented results that reveal the current state of research on requirements smells and their consequences in the software development life cycle. According to the analysis of the results, the studies focus on smell identification and classification, but no extended application in exhaustive case studies was observed. In this sense, further application in real-world environments is required so that future software development projects can understand, manage, and improve how they detect and mitigate existing smells during the requirements specification stage. Furthermore, the SML allowed us to identify multiple definitions of the term "smell" in requirements engineering and to define smells (in general) as concrete symptoms of quality defects in requirements specifications. In addition, fifteen (15) requirements smells (https://tinyurl.com/2aquzp8y), ten (10) causes (https://tinyurl.com/2de6lftk), eleven (11) effects (https://tinyurl.com/2de6lftk), and five (5) good practices (https://tinyurl.com/263jynnb) were identified. Similarly, different methods and software tools have been proposed to automate the identification of smells in requirements (https://tinyurl.com/263jynnb). The tools found concentrate on identifying the smells defined by the ISO/IEC/IEEE 29148 standard (https://tinyurl.com/23x4j8so) and implement detection mechanisms for each of them (https://tinyurl.com/2aquzp8y); the most common mechanisms are dictionaries, morphological analysis, and part-of-speech (POS) tagging. However, no evidence was found of studies that support the detection of bad smells in projects where the requirements specification is written in a language other than English.
Finally, this work reveals several opportunities for future work, such as: (i) broadening the conceptual definition of smells in software requirements and their usefulness for quality control, (ii) establishing new techniques or models capable of measuring the level of uncertainty in the quality artifacts used during the requirements specification phase of software development, and (iii) improving natural language processing techniques by seeking solutions to the limitations and challenges currently present in NLP models.