Introduction
Teams create methods and practices to address the growing demand of the software engineering industry, so they can produce high-quality software on time and on budget [1]-[5]. However, some circumstances in running the software engineering endeavor lead teams to continuously tailor their own methods and practices, so the previously gained knowledge is abandoned. Consequently, knowledge transference among teams is getting harder [1], [6]-[8]. Aiming to improve such transference, the SEMAT (Software Engineering Method and Theory) community is promoting a software engineering theory (the Essence standard) focused on identifying universal elements covering all software engineering endeavors. Such elements are expressed in terms of a simple and structured language, thus allowing for the definition of methods and practices that can be easily transferred, tailored, measured, and compared among teams [6], [7]. The universal elements forming the Essence Kernel are known as alphas, activity spaces, and competencies. Software engineering as a discipline includes a set of concepts for easing communication among teams. Such concepts are included in the specific terminology for software engineering [6].
Nedobity [9] warns about human-to-machine and machine-to-machine communication problems arising from deficient terminologies. Accordingly, a theory with uniformity problems is unable to provide guidance to the procedures of a discipline. Such a problem generates gaps between the real progress of the team and the progress assessed by the theory. According to Cabré [10], a theory should have three degrees of adequacy: observational, descriptive, and predictive. When some of the elements of a theory lack uniformity, the theory fails to reach such degrees and is unable to support new concepts generated within a discipline. Also, lack of uniformity makes it impossible to compare information across the documents of a discipline [11]. The process aimed at eliminating ambiguity and improving communication among teams is called terminology unification [10]. Terminology problems are reported in domains like nursing [11], research and teaching [12], archive [13], automotive industry [14], and natural language requirements [15]. Such problems can be so severe that they sometimes need the intervention of governmental offices in order to achieve terminological systems [16].
We can find terminology problems in both the Essence standard and the language in which it is described [17], [18]. Constructs and definitions are affected by such problems. Consequently, in this paper we apply terminology unification to the Essence standard by selecting some base models and definitions for structuring terms, identifying disunited terms by comparing the base models and definitions, unifying terms among the base models and definitions, and measuring the gap between the current standard terms and the proposed changes. By solving such problems, we can turn the Essence standard into a uniform theory that allows for the completion of the Essence kernel. Furthermore, a uniform terminology should help avoid ambiguity and improve communication among people and teams practicing the software engineering discipline.
This paper is organized as follows: in Section 2, we describe the Essence Kernel and its full set of elements; in Section 3, we present a review of terminology problems in some domains; in Section 4, we improve the Essence standard by applying terminology unification; and finally, we discuss some conclusions and future work in Section 5.
Theoretical framework
Growth of the software engineering industry has created the need for new development teams skilled enough to supply high-quality, on-time, and on-budget software systems that cover the industry demand [1]-[5]. Teams are creating their own elements (methods and practices) in order to fulfill this purpose. Such elements are intended to provide guidance on the processes and objects to be used in the methods [1], [6], [7], so teams can produce high-quality software systems. However, some circumstances when running the software engineering endeavor (e.g., tight deadlines, poor cost estimation, quality demands, and volatile requirements) lead teams to entirely misuse their original methods and practices. They are often forced to tailor their own methods and practices and learn new ways of working [8], so the knowledge and experience gained are abandoned. Consequently, knowledge transference among teams is getting harder.
SEMAT is an initiative aimed at meeting the software engineering challenges we face nowadays. To reach this goal, the SEMAT community is promoting a scalable, actionable standard, called Essence, based on proven principles and best practices [1], [6], [7]. Such a standard makes software engineering methods and practices easier to transfer, tailor, measure, and compare.
The Essence standard [6], [19] includes a set of elements and a structured language, known as the Essence kernel and language for software engineering methods. Elements contained in the Essence kernel are intended to be constructs for covering all software engineering endeavors [6], [7]: alphas (attributes for assessing the health and progress of the software engineering endeavor, by using states and checklists); activity spaces (groups of activities always present in any software engineering endeavor); and competencies (what is needed for performing the work, including abilities and knowledge) [6]. Such constructs are grouped into three areas of concern: customer (related to the opportunity and the stakeholder), solution (a technical area including requirements and the software system itself), and endeavor (related to the work, the team, and the way of working) [6].
The Essence standard has inspired work on some areas like teaching [20], software startups [21], the anatomy of software requirements [22], and adaptive software engineering [23].
Background
Cabré [10] establishes two degrees for the adequacy of a theory: observational, for describing the observed data, and descriptive, for describing the non-observed data. Theories with the two degrees are predictive. The lack of uniformity prevents a theory from achieving the degrees of adequacy. Goosen [11] argues that uniformity is yet to be reached in the nursing terminology, so data over time and documents coming from different sources are difficult to compare.
According to Nedobity [9], concepts and conceptual systems are representations of reality and elaborations of the world. Thus, teams have created specialized terminologies and conceptual systems that allow them to communicate among themselves. Such terminologies are composed of concepts representing objects of the world; concepts represent physical objects, as well as properties and relations of those objects. Terminology differences associated with a concept are the result of the diversity of languages. Nedobity [9] believes that deficient terminologies endanger the information flow among people and machine-to-machine communication. Similarly, several designations for the same object result from the alternative usages of an object [10]. However, Cabré, citing Wüster, states that scientists and technicians should have a characterized and unambiguous terminology [10]. She claims ambiguity in technical languages can be removed by unifying terminology, so that scientists and technicians can establish an effective communication.
Terminology problems are common to different domains. Goosen [11] reports difficulties for mapping concepts between nursing terminologies and classifications, even though some international standards are defined for such a discipline. Goosen [11] shows that cross-mapping is still possible, but lack of uniformity can be demonstrated in this domain. Slisko and Dikstra [12] reveal the lack of a well-defined scientific language for use in research and teaching; they exemplify the problems with some science-related terms whose misconception and misuse show the need for improving teaching with a uniform, defined terminology. Dryden [13] summarizes all the effort devoted to standardizing terminology related to the archive domain and the main difficulties linked to this task: different languages, technological change, and the recent emergence of this discipline as a professional field. Sauberer et al. [14] claim that terminology should be self-explanatory in engineering environments, since time spent discussing the meaning of terms can delay the work to be done; they suggest the development and implementation of a corporate terminology policy and exemplify it in the context of the automotive industry. However, such policies are difficult to spread among several companies, thus causing lack of uniformity in the terms used in the whole environment. Finally, Misra [15] shows that problems related to the usage of terms in requirements specification lead to misunderstanding of such specification along the software development lifecycle; he advocates for a careful review of specifications in order to generate a term-alias glossary for document interpretation. Even though this is a kind of terminology unification, we need to select a unique term for representing the concepts instead of dealing with all the possible aliases of a term.
Sonneveld and Loening [24] assert that new terms are constantly being created to express new ways of working. Such an assertion makes sense in the software engineering discipline too. New methods, practices, and thinking frameworks are constantly created, and they commonly result from transformations made to existing methods, practices, and thinking frameworks. Such ways of working bring about the creation of new terms in the software engineering discipline. However, the theory fails to provide an unambiguous standard where the minimal parts forming either a method or a practice are terminologically uniform. Consequently, the theory is unsupportive of new concepts, i.e., we can say, according to Cabré [10], that the Essence standard is descriptively and predictively inadequate. Elkin [25] says that the relationship between concepts should be uniform across parallel domains within the terminology. We look for such uniformity for the Essence standard in the next Section by using terminology unification.
Solution
As we previously mentioned, problems related to the uniformity of terminology should be solved, so software engineering teams can use the same terminology for improving their technical communication. In this Section, we propose an improvement to the Essence standard by solving such problems. We apply a four-stage method described in the following sub-sections.
Selection of base models and definitions
Ward [26] develops a method for addressing terminology problems. The first three stages of the method are devoted to developing a taxonomy and a glossary to be used for detecting terminology problems in the fourth stage. Similarly, Goosen [11] employs the ISO reference terminology model for nursing diagnosis and some definitions coming from different standards. Consequently, we select some base models and definitions for applying the remainder of the method. Since the Essence standard [6] has structured models for alphas and activity spaces, we select such models (see Figs. 1 and 2) as the basis for our analysis. We also use the terms and definitions included in the fourth section of the Essence standard [6]. Some checklists of the alpha states are also reviewed [6].
A third model is selected in order to include expert judgement in the analysis. Morales-Trujillo et al. [27] reported a terminological analysis made to the Essence standard by using a pre-conceptual schema (see Fig. 3). In this figure, the terms of the Essence standard are colored in blue and yellow. Since the pre-conceptual schema was validated by some of the Essence standard authors, we can use it as a pivot for evaluating some of the terms used throughout the models, definitions, and checklists of the Essence standard.
Identification of terminology problems
We review the selected definitions, checklists, and models in order to establish the usage of terms in the Essence standard. Initially, a specific sample of terms and definitions from the Essence standard is presented in Table I.
Throughout the Essence standard, activity space is defined as “descriptions of the challenges a team faces when developing, maintaining, and supporting software systems” [6] p. 15. However, such a description is excluded from the activity space definition. This lack of uniformity can be found in many of the Essence standard terms and definitions. Relationships of the opportunity alpha are excluded from the definition. The term opportunity should contain alpha relationships associated with the opportunity alpha for providing a better understanding of its definition. Finally, the Essence standard [6] exhibits a deeper definition of the work item by means of the alternative usages of this kind of element (e.g., elements in which the work is broken down, elements with clear definitions of done, user stories from a sprint backlog). When the definition of work item is compared to the work product definition, “an artifact of value and relevance for a software engineering endeavor; a document or a piece of software” [6] p. 92, we realize that work item and work product are the same term. Work product is defined as an element representing “concrete things to work with, providing evidence for the states an alpha is in” [6] p. 69. Furthermore, a deeper definition of work product can be inferred from the alternative usages of this kind of element (e.g., document where the user requirements are documented, use cases, product backlog or sprint backlog). Therefore, the term work item should be renamed as work product. Also, we need a definition of completion criteria (see Fig. 3), since the definition of work item includes the phrase “complete the work” [6] p. 7.
Another source of terminology problems can be related to constructs of the Essence standard like the alpha state checklists. Lack of uniformity can be found in several alpha state checklists; the value established state of the opportunity alpha, seeded state of the team alpha, and bounded state of the requirements alpha are shown in Table II. The value established state of the opportunity alpha is defined as “the value of a successful solution has been established” [6] p. 27, but solution is an area of concern and the closest construct for solution in this context is the software system alpha. In fact, solution and software system seem to be interchangeable in the context of the alpha state checklist in Table II when the value established state is detailed. The same problems can be detected in the other alpha state checklists included in Table II with terms like mission, mechanisms, commitment, governance rules, leadership model, success, prioritization, and assumptions. Some mentions of other constructs are unclear. For example, leadership is a competency of the endeavor area of concern with some levels, so the expression “leadership model is selected” is probably intended to be interpreted as “leadership level is determined.” The requirements alpha is another example of the misuse of the terminology. The checklist item “the way the requirements will be described is agreed upon” is related to a requirements state, but described is outside the set of the requirements states, i.e., conceived, bounded, coherent, acceptable, addressed, and fulfilled.
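The check on the term described can be illustrated with a small sketch. It assumes only the six requirements states listed above; the function name `unknown_state_terms` is hypothetical and is not part of the Essence standard or of the method applied in this paper.

```python
# Requirements alpha states as listed in the text above.
REQUIREMENTS_STATES = (
    "conceived", "bounded", "coherent", "acceptable", "addressed", "fulfilled",
)


def unknown_state_terms(candidates, states=REQUIREMENTS_STATES):
    """Return the candidate terms that are not names of known states.

    Hypothetical helper: it only compares lower-cased strings, so it
    flags terms for manual review rather than deciding anything.
    """
    known = {state.lower() for state in states}
    return [term for term in candidates if term.lower() not in known]


# "described" is used like a state name in a checklist item,
# but it is outside the set of requirements states.
flagged = unknown_state_terms(["described", "bounded"])
print(flagged)
```

A term flagged in this way is only a candidate for review; deciding which standard term should replace it still requires reading the checklist item in context.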
The aforementioned terminology problems could generate mistakes in the way we assess the progress of the team by using the Essence standard. So, at some point, the team could think they are in an advanced state of a certain alpha when they should be in a previous one. Such mistakes can lead the team to a work bottleneck as the software engineering endeavor time goes on. Terminology problems generated in the alpha checklists can lead to completeness problems affecting the uniformity of the theory.
Regarding the models selected in the previous stage, terminology problems can also arise from the relationships among alphas shown in Fig. 1 and alpha descriptions provided by the Essence standard [6]. We identify alpha relationships with terminology problems in Table III. The relationship between the alphas opportunity and requirements exhibits terminology problems with the alpha description provided in the Essence standard [6]. Opportunity is “the set of circumstances that makes it appropriate to develop or change a software system” [6] p. 5 and also “the opportunity articulates the reason for the creation of the new, or changed, software system (…) It represents the team’s shared understanding of the stakeholders’ needs, and helps shape the requirements for the new software system by providing justification for its development” [6] p. 17. As a matter of fact, the description of the actual relationship between the alphas opportunity and requirements (i.e., focuses) is excluded from the description of the alpha, and other relationships (i.e., shape) are excluded from Table III. Terminology problems arising from the relationship among alphas and alpha descriptions are represented by the existing and excluded relationships, thus leading to a completeness problem; e.g., the description of opportunity makes it clear that the relationship “opportunity makes appropriate creates, updates, or changes software system” was omitted by the authors of the Essence standard [6].
Table II. Full checklists for the value established, seeded, and bounded states of the Essence kernel [6]
The Essence standard should provide a detailed description of the challenges faced by a team when running activity spaces of a software engineering endeavor. However, if such elements exhibit problems, the team is unable to address those challenges in a proper way. In fact, the usage of several designations for the same object or action can be mistaken by teams and produce undesired results. Moreover, terminology problems lead to misunderstanding the completeness of the theory, but by addressing terminology problems we can realize that the theory is incomplete, so the completeness problem should be solved. We can identify activity spaces with terminology problems in Table IV.
The phrase explore possibilities comprises a verb and a noun. In order to provide an accurate name for the activity space, the noun should be included in the specialized terminology defined by the Essence standard [6], as expressed in the pre-conceptual schema of Fig. 3, and possibility is outside such terminology. Something similar occurs to system. Be advised that system was considered a name for the software system alpha, but it was rejected because it “was considered to be too general” and “the consensus was that all engineering disciplines produce some kind of system, and therefore software engineering needs to produce something more specialized than just a system” [19] p. 11. In this way, the activity spaces related to the software system should be named after the name of the alpha. The same applies for understand stakeholder needs. Need (absent from Fig. 3) was also considered a name for an alpha (in this case the requirements alpha included in the Essence standard), but it was rejected because it was “considered too confusing when compared and contrasted with requirements” [8] p. 19.
Terminology problems associated with coordinate activity, support the team, and track progress are related to activity space descriptions. The description of coordinate activity is to “co-ordinate and direct the team’s work,” and “this includes all ongoing planning and re-planning of the work, and adding any additional resources needed to complete the formation of the team” [6] p. 19; the description of support the team is to “help the team members to help themselves, collaborate, and improve their way of working” [6] p. 20; and track progress is to “measure and assess the progress made by the team” [6] p. 20. Accordingly, the activity space names should be redefined to be consistent with their actual descriptions.
Unification of terms
Terminology unification is first applied to the definitions in Table I. We need to add information and make uniform use of terms, as we propose in Table V. Regarding the definition of activity space and the opportunity definition, we need to add some information for making uniform usage of the terms in the standard. Also, as we stated before, work product and work item seem to be the same construct according to their definitions, so we propose to create just one single definition and use the term work product throughout the Essence standard. We also propose to add a definition to the completion criteria, since this is a term used several times in the standard.
Terminology problems in the alpha checklists are solved by changing non-standard terms, excluding redundant information, and including some missing information in the checklist items (see Table VI).
Some of the terms we are proposing to change (and by which ones) are: new system (software system), solution (software system), success criteria (completion criteria), operation (way of working), grow (form), composition (form), impact on the solution is understood (satisfied in use), identified (recognized), shared understanding (in agreement), extent (value), and described (bounded). Some of the changes follow the definitions of the Essence standard constructs; for example, we change the term impact on the solution is understood because no states are named in this way. However, we have a clear definition of the satisfied in use state of stakeholders. Something similar happens to the term described, which is related to requirements; the next state to such a description is the bounded state.
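The term changes listed above can be pictured as a lookup from non-standard terms to standard ones. The following is a minimal sketch under stated assumptions: it uses only a subset of the pairs listed in the paragraph, the names `REPLACEMENTS` and `unify` are hypothetical, and blind textual substitution is a simplification of the manual, context-aware review applied in this paper.

```python
import re

# Subset of the term changes listed above (non-standard -> standard).
REPLACEMENTS = {
    "new system": "software system",
    "solution": "software system",
    "success criteria": "completion criteria",
    "identified": "recognized",
    "described": "bounded",
}


def unify(text, replacements=REPLACEMENTS):
    """Replace every listed non-standard term with its standard term.

    Longer terms are tried first so that multi-word terms are not
    shadowed by shorter ones; matching is case-insensitive and
    restricted to whole words.
    """
    terms = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: replacements[m.group(1).lower()], text)


unified = unify("the value of a successful solution has been established")
print(unified)
```

Such a substitution ignores context: the word solution also names an area of concern, where it should stay unchanged, so the output is only a starting point for review.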
The redundant information we propose to exclude is the following:
“The impact of the solution on the stakeholders is understood” is a description included in the definition of the satisfied in use state of stakeholders.
“Team responsibilities are outlined,” “the level of team commitment is clear,” “the team size is determined,” “governance rules are defined,” and “leadership model is selected” are checklist items related to our proposal: management and leadership of the team are clear. In this case, we are using two competencies of the endeavor area of concern for covering all redundant topics.
“The mechanisms for managing the requirements are in place,” “the prioritization scheme is clear,” “constraints are identified and considered,” and “assumptions are clearly stated” are checklist items related to our proposal: management of the requirements is clear.
The missing information we are including is the following: the work products related to the solution are those related to the software system in the operational state; we always know the required competencies, since the Essence standard only recognizes six of them, but what we need to identify is the competency level required by the team members.
We propose a solution to terminology problems of the alpha relationships in Fig. 4. Also, the details of the alpha relationships are summarized in Table VII. Most of the changes are proposed after reviewing the activity spaces defined in the Essence standard. For example, produces is very short for describing how the team is related to the software system, since we have five activity spaces related to software system in the solution area of concern: shape, develop, test, deploy, and support. Some other changes are related to the definitions included in the Essence standard. For example, way of working is defined as “the tailored set of practices and tools used by a team to guide and support their work” [6] p. 57, so the team tailors and applies the way of working, and the way of working guides and supports the work. Some other relationships from Fig. 1 are omitted, but we can recover them by reviewing some other constructs of the Essence standard. For example, we find that the team captures and understands the requirements in the definition of the analysis competency.
We propose a solution for terminology problems of the activity spaces in Fig. 5 and Table VIII. As we said before, activity spaces represent the placeholders for activities to be done in a software engineering endeavor, i.e., what the team should perform to produce a software system. Thus, changes made to activity spaces should be reflected in the elements related to the alphas. Some of the changes we propose for activity spaces are related to the names of the alphas involved. We discussed in Section 4.2 how the name system was rejected and the name software system was adopted. However, the older name is still applied in five out of six activity spaces belonging to the solution area of concern. This assertion is ratified by the experts by including software system as an alpha in Fig. 3. For this reason, we propose changing “shape the system” to “shape the software system.” In the same way, we propose replacing possibilities and stakeholder needs with the adequate alpha name (see Figs. 1 and 3): opportunity. Some other changes are related to the definitions included in the Essence standard. For example, the team alpha is defined as “the group of people actively engaged in the development, maintenance, delivery, or support of a specific software system” [6] p. 6; in this way, we propose replacing the verb implement with develop and the verb operate with support. Finally, we propose changing “coordinate activity” to “coordinate the work,” since activity is not considered an alpha and the closest alpha name is work.
Solving terminology problems within the Essence standard implies that it is incomplete. When we solve such problems, some terms are still outside of the terminology provided by the Essence standard. This leads to the definition of new elements and terms. Some evidence of such definition is the support the team activity space. When we solve the terminology problems, we need a new activity space named improve the way of working (see Fig. 5), which is used to describe activities related to the way of working alpha for promoting the advance in the way of working states. Given the above, the activity space completion criteria (see the proposed definition in Table V) are compromised, since the description is outside the work to be done for achieving the checklists associated with the collaborating state of the team alpha; the completion criteria include the collaborating state of the team alpha and the in place state of the way of working alpha. Based on such facts, we propose the redefinition of the support the team and improve the way of working activity spaces as follows:
Support the team
Description: Support the team to make it work as a cohesive unit, make the communication open and honest, inform each other and focus on achieving the team mission [6].
Input: Team.
Entry criteria: Team::Formed.
Completion criteria: Team::Collaborating.
Improve the way of working
Description: “Help the team members to help themselves, collaborate, and improve their way of working” [6] p. 20.
Input: Way of Working.
Entry criteria: Team::Formed, Way of Working::Foundation Established.
Completion criteria: Way of Working::In place.
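The two redefinitions above can be read as small records with entry and completion criteria over alpha states. The sketch below is illustrative only: the class `ActivitySpace` and the helpers `reached`, `can_start`, and `is_complete` are hypothetical names, and the state orderings for the team and way of working alphas are stated as we understand the Essence kernel, so they should be treated as an assumption to be checked against the standard [6].

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed state orderings (to be verified against the Essence standard).
ALPHA_STATES = {
    "Team": ["Seeded", "Formed", "Collaborating", "Performing", "Adjourned"],
    "Way of Working": [
        "Principles Established", "Foundation Established",
        "In Use", "In Place", "Working Well", "Retired",
    ],
}


@dataclass(frozen=True)
class ActivitySpace:
    name: str
    inputs: tuple[str, ...]
    entry_criteria: tuple[tuple[str, str], ...]       # (alpha, state) pairs
    completion_criteria: tuple[tuple[str, str], ...]  # (alpha, state) pairs


def reached(current: dict[str, str], alpha: str, state: str) -> bool:
    """True if the alpha is at the given state or a later one."""
    order = ALPHA_STATES[alpha]
    return alpha in current and order.index(current[alpha]) >= order.index(state)


def can_start(space: ActivitySpace, current: dict[str, str]) -> bool:
    return all(reached(current, a, s) for a, s in space.entry_criteria)


def is_complete(space: ActivitySpace, current: dict[str, str]) -> bool:
    return all(reached(current, a, s) for a, s in space.completion_criteria)


SUPPORT_THE_TEAM = ActivitySpace(
    "Support the team", ("Team",),
    (("Team", "Formed"),), (("Team", "Collaborating"),),
)
IMPROVE_THE_WAY_OF_WORKING = ActivitySpace(
    "Improve the way of working", ("Way of Working",),
    (("Team", "Formed"), ("Way of Working", "Foundation Established")),
    (("Way of Working", "In Place"),),
)

progress = {"Team": "Formed", "Way of Working": "Foundation Established"}
print(can_start(SUPPORT_THE_TEAM, progress), is_complete(SUPPORT_THE_TEAM, progress))
```

Separating the two activity spaces keeps each completion criterion tied to a single alpha, which is the point of the redefinition proposed above.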
Measurement of the gap between the current standard terms and the proposed changes
Misra [15] proposes a combination of a latent semantic analysis and a dissimilarity degree between two-word chains as a final stage of the terminological inconsistency analysis of natural language requirements. Similarly, Dalpiaz et al. [28] propose semantic similarity as a measure of the relatedness of two terms when analyzing terminological problems in a specification. Consequently, we include this fourth stage in our method and select the lexical semantic relatedness [29] for evaluating changes in terms and the Levenshtein distance for evaluating changes in word chains as a way to measure the proposed changes and their impact in the Essence standard.
We calculate the distance between the original and the proposed terms by using lexical semantic relatedness [29], a measure of how two words are related in meaning. To this effect, we use the online calculator included in www.olesk.com and summarize the results in Table IX. Even though some of the meanings are close (more than 90 %), we can see from Table IX we are improving the accuracy in the terminology by using the right words.
As Misra [15] suggests, we use the Levenshtein distance for evaluating the smallest number of insertion, deletion, and substitution operations required to change one word chain into the other. We also compare the dissimilarity as a percentage of the longer word chain, as summarized in Table X. We calculate the Levenshtein distance by using the calculator included in https://es.planetcalc.com/1721/?language_select=es. As Table X shows, lower dissimilarity values are associated with shorter distances between word chains. Identical word chains yield zero, the lowest dissimilarity value. Again, we are improving the accuracy of the Essence standard terminology by adding uniform information into some constructs of the standard.
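The distance and the dissimilarity percentage can be reproduced with a short sketch. It assumes a character-level Levenshtein distance and a dissimilarity computed as the distance divided by the length of the longer word chain; the function names are hypothetical, and the example pair is taken from the activity space names discussed earlier, not from Table X.

```python
def levenshtein(a: str, b: str) -> int:
    """Smallest number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def dissimilarity(a: str, b: str) -> float:
    """Levenshtein distance as a percentage of the longer word chain
    (assumed formula: 100 * distance / max length)."""
    longer = max(len(a), len(b))
    if longer == 0:
        return 0.0
    return 100.0 * levenshtein(a, b) / longer


original = "shape the system"
proposed = "shape the software system"
print(levenshtein(original, proposed), dissimilarity(original, proposed))
```

For this pair, the only change is the insertion of the nine characters of "software " into a chain of 25 characters, which gives a distance of 9 and a dissimilarity of 36 %.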
Due to the space requirements of this paper, we exemplify the terminology problems the Essence standard exhibits in the current version. Be advised that we have selected some of the constructs included in the standard and just a small number of each construct. For example, the standard has 27 reported definitions, and we work with just three of them. We work with three out of 41 states reported in the standard. The coverage is bigger in the case of the alphas and the activity spaces, but we demonstrated that terminology problems are linked to many constructs of the Essence standard. Fortunately, as we propose in this paper, the solutions to such problems can be easily achieved and they can drive improvements in accuracy, as we show with the lexical semantic relatedness and dissimilarity we calculate. For this reason, we strongly believe the guidelines we propose in this paper could help to solve the problems in question.
Conclusions and future work
In this paper we proposed the solution to some terminology problems in the Essence standard. We used a method based on terminology unification in order to intervene in constructs like definitions, alpha state checklist items, relationships between alphas, and activity spaces. We identified main problems such as the use of non-standard terms for naming standard elements, the addition of redundant information, and the lack of pertinent information related to some constructs. After solving the aforementioned problems, we evaluated the accuracy of our solution with two metrics related to semantic and morphological distance. Even though we sampled the problems with some constructs, we believe the presented guidelines could help to solve other problems in the Essence standard. Also, as a result of such problem resolutions, we proposed two new alpha relationships excluded from the Essence standard, one term definition, and one new activity space. We redefined three terms, three alpha state checklists, five alpha relationships, and 11 activity spaces. Such problem resolution provides a better understanding of the Essence standard. Consequently, software engineering practitioners could have a specialized terminology that allows unambiguous communication with each other. Also, we contribute to reducing the gaps between the real progress of the team and the progress assessed by using the terminology defined in the Essence standard.
As future work, we can define the following lines of work:
Developing a focus group in order to validate the proposed changes with practitioners and experts of the software engineering field of knowledge.
Applying the solution to the rest of the constructs defined in the Essence standard.
Completing the terms defined in the Essence standard by following the guidelines defined in this paper. We believe we should have more than the 27 current definitions of the Essence standard. Some constructs such as competency level, milestone, phase, etc. are still missing in the standard.
Detecting other problems arising from terminology of the Essence standard. We can suppose we can discover more additions and deletions for the standard while reviewing the rest of the constructs.
Applying the method followed in this paper to other bodies of knowledge and standards related to software engineering (e.g., the Software Engineering Body of Knowledge, SWEBoK) and to other disciplines, like the Project Management Body of Knowledge (PMBoK).