I. INTRODUCTION
Higher Education Institutions (HEIs) must continuously monitor the quality of the service provided to their students. The National Accreditation Council (NAC) is the Colombian entity in charge of establishing whether or not HEIs that undergo evaluation carry out their functions with high quality. The NAC evaluates institutions based on twelve factors contained in Agreement 03 of 2014 [1]. One of these refers to the research processes carried out by the institution, which include such activities as internal and external funding calls, groups, seedbeds and research projects. One of the steps in the process of receiving the high-quality recognition consists of the institution submitting a self-assessment report with quantitative data for each factor. In most cases, this is a task that can last months since it involves collecting and centralizing data from different sources and is done almost manually by a group selected from teaching staff that make up the accreditation committee or the Accreditation Office [2].
Data warehouses (DW) offer an alternative solution for this type of process. A DW allows the centralization of information from various sources (relational databases, Excel files, etc.) [3]. A DW further supports decision-making through reports whose graphs present the data first in general and then in more detail, depending on the needs of the data analyst. This would allow HEIs to make decisions to correct shortcomings detected in the data [4], [5].
This work was carried out using the Iterative Research Pattern (IRP) methodology. In the first stage, Observation, DW articles were identified at the national level [6], covering funding calls, groups, seedbeds and research projects, and at the international level [7]-[12], covering publications, projects and research funding. The robustness of the models found in the literature was also assessed by comparing the number of dimensions and attributes they present with those identified in the NAC requirements.
The second stage consists of Problem identification. For this, the related articles were analyzed to identify possible dimensions and measures, and the self-assessment reports presented by some of the public HEIs to the NAC were reviewed. In addition, a group of experts in institutional accreditation from NAC validated the analytical requirements arising from this stage.
The third stage is Solution development, for which the MiPymes (MBD) methodology [13] was used to design the proposed dimensional models based on the requirements of the previous stage. These models can be adapted to the context of each institution since they consider elements that meet the requirements most frequently used by public HEIs in self-assessment reports, elements of the least frequent requirements (if the institution has a sufficient amount of information), and additional requirements proposed by the authors and experts in accreditation that comply with the aspects to be evaluated by the NAC. The models proposed in this work are therefore adapted to the availability of data from the HEI.
The last stage is Solution testing, addressed using a focus group of DW modeling experts who validated the proposed models according to a test that measures the adaptability of these models to the requirements identified by the authors and the experts in accreditation.
This article first presents the research methodology used. Next, it describes the six models obtained and their validation by three modeling experts through a focus group. Finally, it presents the conclusions and future work.
II. METHODOLOGY
This study uses the Iterative Research Pattern (IRP) proposed by [14], a methodology for research projects that involves computational solutions. According to [14], cycles made up of stages must be defined: observation, problem identification, solution development and solution testing. In each cycle, a product must be delivered.
In this case, a cycle was defined for the design of the proposed dimensional models adaptable to the investigative processes of an HEI based on the institutional accreditation research factor of the National Accreditation Council (NAC) in Colombia [15].
A. Observation
1) Literature Review. A literature review is carried out to understand how the DW for HEIs focused on research areas is modeled. The planning proposed by [16] is used, which consists of:
Objective: to define a set of dimensional models on research that is consistent with the factor of research of the NAC high-quality accreditation.
Resources: ACM Digital Library, IEEE Xplore, Science Direct, Scopus, Springer Link, and Web of Knowledge databases.
Research question: Q1: What dimensional modeling elements have been related to the factor of research processes?; Q2: What design situations used for research processes could be incorporated into the proposed model?
Search string: (dimension* model* OR design) AND (business intelligence OR data warehouse*) AND (educat* OR high* educat* OR academic*) AND (research* OR investigation)
Inclusion/exclusion criteria: The criterion for including a candidate study was that the study must NOT have been published before 2014.
The criteria for excluding a candidate study were as follows: (1) The study is NOT relevant to the development of Data Warehouses (DW) for higher education; (2) The study does NOT present a dimensional model related to research.
The search string was applied to previously established databases, finding seven articles that met the inclusion/exclusion criteria.
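The screening rule described above can be sketched as a simple filter. This is only an illustration: the `Study` record, its field names, and the sample candidates are hypothetical and do not correspond to the articles actually retrieved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    title: str                            # hypothetical identifier
    year: int                             # publication year
    about_he_dw: bool                     # relevant to DW development for higher education
    has_research_dimensional_model: bool  # presents a dimensional model related to research


def passes_screening(study: Study, min_year: int = 2014) -> bool:
    """Inclusion: not published before min_year.
    Exclusion: either relevance flag is False."""
    if study.year < min_year:
        return False
    return study.about_he_dw and study.has_research_dimensional_model


# Hypothetical candidates, used only to exercise the rule.
candidates = [
    Study("A", 2013, True, True),   # too old
    Study("B", 2017, True, False),  # no research dimensional model
    Study("C", 2018, True, True),
    Study("D", 2014, False, True),  # not about DW for higher education
    Study("E", 2014, True, True),
]

selected = [s.title for s in candidates if passes_screening(s)]
print(selected)  # ['C', 'E']
```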
2) Analysis of the Studies. In 2014, [7] proposed DW models, dashboards, and the use of data mining for HEIs, presenting a Publications star diagram whose dimensions are Program, Professor, and Publication Type, and a Research star diagram whose dimensions are Research Category, Lead Researcher, Academic Position and Period of Research Activity. The resulting dashboards and models do not present design situations or validation mechanisms.
In 2015, [8] defined the architecture of a Business Intelligence system for academic organizations, showing a Publication model with the dimensions Publisher, Typology, Year, Author, Academic Unit, Knowledge Area, Department, and Institutional Role of the Author. This model uses many-to-many dimensions and sub-dimensions but does not have validation mechanisms.
In 2017, [9] proposed a DW for the measurement of disciplinary development in library and information sciences in academic institutions. The proposed Projects model presents design cases as sub-dimensions and many-to-many dimensions. It presents the dimensions: Participants, Financing, Area of Knowledge, Responsible for the Project, Project Status, and Participating Institution. No model validation mechanism is mentioned.
In 2018, [10] proposed a research information system to analyze bibliometric indicators. Sub-dimensions are used as design cases. The Publications model presents the dimensions Author, Indexing in Scopus and Web of Science, Department, and Citation Metrics. The model is not validated within the study.
In this same year, [11] described the development of a multidimensional model with two models, Publications and Projects. The indicators managed by the study consist of Number of Publications and Projects. The dimensions handled in the model are Time, Study Level, Research Activity, Area of Knowledge, Type of Project, Project Status, Type of Publication, and Category of Hiring of Professors. Design cases are not presented, nor is the proposed model validated.
Regarding gray literature, also in 2018, [12] built a DW for financial data together with a visualization tool. No design cases were presented in the star diagram. The Financing model handles the Fund, Financial Unit, Account, and Time dimensions. The resulting model was not validated by experts, but a query performance test and a usability test were carried out with five users.
Another gray-literature study, developed in 2022 [6], designs dimensional models for Funding Calls, Groups, Seedbeds, and Research Projects. The models handle the following design cases: role-playing, many-to-many, and degenerate dimensions. They use the conformed dimensions Project, Group, Person, Date, and Project Status. The Project model contains the dimensions: Project documents, Funding call and the Participants bridge tables, Research line, and Members; the Financing model contemplates the dimensions Item, Financial Entity, and Type of Financing, and the bridge table Type of Expenditure; the Group model presents the Discipline, Ranking, and Group Location dimensions and the Project bridge; and the Seedbed model presents the Integrating Role, Work Plan, Seedbed Documents, Status, and Program dimensions. Regarding validation, this study evaluated the level of satisfaction of officials of the research division of the HEI.
Table 1 summarizes the models found in the review, taking into account the use of design situations (Des. Sit.), the evaluation of the result (Eval.), the analysis capacity of the models measured in the number of dimensions and attributes, and measures.
Ref. | Year | Model | Des. Sit. | Eval. | Analysis capacity (dimensions and attributes) | Measures |
---|---|---|---|---|---|---|
[7] | 2014 | Publications and Research | No | No | Few dimensions. | Number of publications and research activities. |
[8] | 2015 | Publications | Yes | No | Few dimensions and very few attributes. | Number of publications. |
[9] | 2017 | Projects | Yes | No | No attributes. | Number of research projects. |
[10] | 2018 | Publications | Yes | No | Few dimensions and few attributes. | Number of publications. |
[11] | 2018 | Publications and Projects | No | No | Few dimensions and no attributes. | Number of publications and research projects. |
[12] | 2018 | Financing | No | Yes | Few dimensions. | Financial amount. |
[6] | 2022 | Projects, Financing, Groups, Seedbeds | Yes | Yes | Handles a greater number of dimensions and attributes. | Funding value and the number of projects, groups, and seedbeds. |
As can be seen in Table 1, most of the studies found in the literature (57.14%, four of seven) include a Publications model, three deal with issues related to Projects, and two contain models for Financing, Research Groups, and Seedbeds. 57.14% of the studies handle design situations such as many-to-many dimensions, sub-dimensions and role-playing. Regarding the evaluation of the models, only 28.57% of the studies (two of seven) carried out some external consultation to validate the model or the proposed implementation, which shows the need for studies that validate the results they present.
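The percentages quoted above follow directly from the seven rows of Table 1. The short check below transcribes the Model, Des. Sit. and Eval. columns by publication year; the tuple layout and the helper name `share` are choices made for this sketch only.

```python
# Rows transcribed from Table 1:
# (year, models covered, uses design situations, result evaluated)
TABLE_1 = [
    (2014, {"Publications", "Research"}, False, False),
    (2015, {"Publications"}, True, False),
    (2017, {"Projects"}, True, False),
    (2018, {"Publications"}, True, False),
    (2018, {"Publications", "Projects"}, False, False),
    (2018, {"Financing"}, False, True),
    (2022, {"Projects", "Financing", "Groups", "Seedbeds"}, True, True),
]


def share(rows, predicate) -> float:
    """Percentage of rows that satisfy predicate, rounded to two decimals."""
    hits = sum(1 for row in rows if predicate(row))
    return round(100 * hits / len(rows), 2)


publications = share(TABLE_1, lambda r: "Publications" in r[1])
design_situations = share(TABLE_1, lambda r: r[2])
evaluated = share(TABLE_1, lambda r: r[3])

print(publications, design_situations, evaluated)  # 57.14 57.14 28.57
```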
Although the dimensions, attributes and measurements of the studies found can be taken as a basis for the construction of the dimensional models of the present study, in Table 1, it can be observed that most of the dimensional models of these studies present few dimensions, attributes, and measures. This limits their analysis capacity. Furthermore, none of the models mentions the use of a quality assurance model.
3) NAC Institutional Accreditation Model. The Research and artistic and cultural creation factor of the institutional accreditation model and the eighteen available self-assessment reports presented by public HEIs that have participated in high-quality accreditation processes of the NAC were reviewed. The aim was to identify the aspects to be evaluated that were found to contain quantitative data within the reports, which are shown underlined in Table 2.
B. Problem Identification
The requirements for the modeling of a DW for investigative processes in HEIs are defined, taking into account the aspects to be evaluated underlined in Table 2, the eighteen institutional self-assessment reports of public HEIs, and the results of the review of the state of the art.
The requirements were divided into three categories according to their occurrence in the self-assessment reports: most frequent, repeated in more than five reports; least frequent, repeated in three to five reports; and proposed requirements, which, although not mentioned frequently, were suggested by accreditation experts. This division of the requirements seeks to define the adaptability of the models, which can serve both the HEI interested in only the most frequent requirements and the one that wants to use a more extensive version of the dimensional model. The choice will depend on the provisions internal to the institution and the availability of data.
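The categorization rule can be written down as a small function. This is a sketch under stated assumptions: the function name and parameters are hypothetical, and the handling of a requirement that appears in fewer than three reports and was not suggested by experts (returning `None`, meaning it is not retained) is an assumption, since the text does not state what happens in that case.

```python
from typing import Optional


def classify_requirement(report_count: int,
                         proposed_by_experts: bool = False) -> Optional[str]:
    """Classify a requirement by the number of self-assessment reports
    in which it appears."""
    if report_count > 5:
        return "most frequent"
    if 3 <= report_count <= 5:
        return "least frequent"
    if proposed_by_experts:
        return "proposed"
    return None  # assumption: infrequent and not proposed, so not retained


print(classify_requirement(12))       # most frequent
print(classify_requirement(4))        # least frequent
print(classify_requirement(1, True))  # proposed
print(classify_requirement(1))        # None
```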
The identified requirements were validated by the group of experts shown in Table 3 (identified as A1, A2 and A3), who reviewed them and made recommendations, allowing the final requirements to be reached.
Expert ID | Occupation | Studies | Experience |
---|---|---|---|
A1 | Academic quality assessment coordination - Universidad Industrial de Santander. | PhD in telecommunications engineering. | Five (5) years of experience as a NAC counselor at the national level. |
A2 | Academic quality assessment coordination - Universidad Industrial de Santander. | Systems engineer. | More than ten (10) years of experience in Educational Quality Assessment. |
A3 | On-staff lecturer - Universidad Industrial de Santander. | PhD in technological innovation management. | More than ten (10) years of experience in program and institutional accreditation. |
C. Solution Development
The design of the DW model was carried out using the data warehouse development methodology for MiPymes (MBD) [13], which is characterized by an iterative and incremental approach, user participation and detailed phases. It is pertinent to use this methodology because it is designed for small workgroups, handling a few roles and artifacts.
The work cycle used is made up of the following phases: (1) Initiation: the set of research processes to be modeled is selected; (2) Planning: the work plan for the design of the selected research processes is defined; (3) Analysis and design: based on the requirements extracted from the NAC research factor, self-assessment reports, and the dimensions and attributes identified in the state of the art, the DW models proposed for HEIs were designed.
In the Observation stage previously detailed, a review that serves as a starting point for the Initiation phase of the MBD methodology [13] was carried out, and the business processes with the greatest viability and impact were identified. With this input, the Planning phase begins, in which the implementation of the models is justified based on the data collection problems arising in the accreditation processes. Subsequently, the third phase, Analysis and Design, is applied. It begins with the Requirement Gathering subphase, carried out based on the design elements identified in the review, the aspects to be evaluated under Agreement 03 of 2014 [1], and the opinion of experts in institutional accreditation. The second subphase is Design, where the attributes of the data warehouse are preliminarily defined, specifying whether they are primary or foreign keys. At the end of this subphase, the review and user acceptance activity is carried out, in which a group of DW modeling experts reviews and rates the dimensional modeling performed. The subsequent phases proposed in the MBD methodology are Development, Maintenance and Growth, and Project Management. These last three phases are outside the scope of the study and are proposed as future work.
D. Solution Testing
Validation of the dimensional models is carried out through a focus group of experts in DW modeling (identified as M1, M2 and M3), detailed in Table 4.
ID | Occupation | Studies | Experience |
---|---|---|---|
M1 | DW lecturer - University of Magdalena. | PhD in computer science. | |
M2 | DW lecturer - National University of Colombia. | PhD in computer science. | |
M3 | Data Architect - PRAGMA S.A. | Master in data science. | |
Reading material containing details of the dimensional models was sent to the experts by email prior to conducting the focus group. In the course of the focus group, a presentation of these models was made, and the experts had a space for feedback. Finally, they filled out the questionnaire in Table 5, evaluating the models based on the sub-characteristic of adaptability of the ISO/IEC 25010 standard [17].
III. RESULTS
This section presents the six adaptable dimensional models proposed based on the Research factor of the NAC for HEIs, and their validation through a focus group.
A. Dimensional Models
Tables 6 to 11 present the six proposed adaptable dimensional models together with their requirements. The measures and attributes shown in gray correspond to the least frequent and proposed requirements; those shown in black correspond to the most frequent requirements. This distinction shows the models' adaptability: an HEI that wants to consider only the most frequent requirements can use only the measures and attributes in black.
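One way to picture this adaptability is to tag every attribute and measure with the category of the requirement it serves and to derive a reduced model by filtering on that tag. The sketch below is illustrative only: the class names, the `adapt` helper, and the element names are hypothetical and are not taken from Tables 6 to 11.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    MOST_FREQUENT = "most frequent"    # shown in black in the models
    LEAST_FREQUENT = "least frequent"  # shown in gray
    PROPOSED = "proposed"              # shown in gray


@dataclass(frozen=True)
class Element:
    name: str
    kind: str  # "attribute" or "measure"
    category: Category


def adapt(model: list, allowed: set) -> list:
    """Keep only the elements whose requirement category is allowed."""
    return [e for e in model if e.category in allowed]


# Hypothetical elements of a seedbed-participation model.
seedbed_model = [
    Element("program_name", "attribute", Category.MOST_FREQUENT),
    Element("number_of_students", "measure", Category.MOST_FREQUENT),
    Element("work_plan_status", "attribute", Category.LEAST_FREQUENT),
    Element("percentage_of_students", "measure", Category.PROPOSED),
]

# An HEI with limited data keeps only the most frequent requirements.
core = adapt(seedbed_model, {Category.MOST_FREQUENT})
print([e.name for e in core])  # ['program_name', 'number_of_students']
```

An institution with more data available would pass a larger set of categories to `adapt` and obtain the extended version of the same model.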
The Participation in Seedbed dimensional model (see Table 6) uses Program as a role-playing dimension, because a student from one program may belong to a seedbed of another program. It also uses the Academic Unit sub-dimension, which is reused in the requirements of other models. As an example, the attributes and the measure of the most frequent requirement are highlighted in a green box in the model.
The Participation in Research Projects model (Table 7) uses role-playing in Academic Unit, Person, and Person and Date Indicator, because some attributes can correspond both to the project and to its participant. It also uses the many-to-many Participating Groups dimension, which manages both the Research Group Indicator dimension (a mini-dimension) and Research Group. As in the Participation in Seedbed model, the attributes and the measure of the most frequent requirement are shown in a green box.
In the Use of Laboratory model (Table 8), all the attributes and measures are gray because no most frequent requirements were identified for this model. Therefore, the attributes and the measure of the first requirement (a least frequent one) are shown in an orange box.
In the dimensional model of Participation in Research Groups, a role play is handled for the Period dimension due to the existence of an academic period and a calendar period. The attributes and the measure of the first requirement are also shown in a green box.
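The role-playing Period dimension described above can be sketched with a single physical dimension table that the fact table references twice, once per role. The example uses Python's built-in `sqlite3` module; all table names, column names, and sample rows are hypothetical and are not taken from the proposed models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_period (
    period_key  INTEGER PRIMARY KEY,
    period_type TEXT NOT NULL,   -- 'academic' or 'calendar'
    label       TEXT NOT NULL
);
-- The fact table points at dim_period twice: one foreign key per role.
CREATE TABLE fact_group_participation (
    group_name          TEXT NOT NULL,
    academic_period_key INTEGER NOT NULL REFERENCES dim_period(period_key),
    calendar_period_key INTEGER NOT NULL REFERENCES dim_period(period_key),
    member_count        INTEGER NOT NULL
);
INSERT INTO dim_period VALUES
    (1, 'academic', '2022-1'),
    (2, 'academic', '2022-2'),
    (3, 'calendar', '2022');
INSERT INTO fact_group_participation VALUES
    ('G1', 1, 3, 8),
    ('G1', 2, 3, 10),
    ('G2', 1, 3, 5);
""")

# Each role is obtained by joining the same dimension under a different alias.
rows = conn.execute("""
SELECT ap.label, cp.label, SUM(f.member_count)
FROM fact_group_participation AS f
JOIN dim_period AS ap ON ap.period_key = f.academic_period_key
JOIN dim_period AS cp ON cp.period_key = f.calendar_period_key
GROUP BY ap.label, cp.label
ORDER BY ap.label
""").fetchall()

print(rows)  # [('2022-1', '2022', 13), ('2022-2', '2022', 10)]
```

Keeping one physical table and aliasing it per role avoids duplicating period data while still letting reports slice by either kind of period.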
The dimensional model of Intellectual Production (Table 10) has the attributes and the measure of the most frequent requirement indicated in a green box.
The Project Financing dimensional model presents the attributes and the measure in a purple box because they correspond to a proposed requirement.
B. Validation of the Dimensional Models
The validation of the proposed models was carried out through the focus group of experts presented in Table 4, using the closed questions in Table 5. The results are shown in Figure 1, which gives the number of experts who responded at each level of the Likert scale.
For the group of experts, the proposed models are 100% adaptable to the most frequent, least frequent, and proposed requirements identified and can serve as support for strategic decision-making in research areas.
Based on the experts' responses to the open questions in Table 5, the improvement actions listed in Table 12 were applied to the models.
Expert ID | Comment | Improvement Action/Justification |
---|---|---|
M2 | Add publication geography to the intellectual production fact table. | The Location dimension is added within the model of Table 10. |
M3 | Bear in mind that percentage measures have performance flaws when making a query for large amounts of data. | In the deployment stage, the performance of the queries must be measured and it must be determined whether it is necessary to generate aggregations in the cube or indexes in the relational database. |
 | The bridge table of project participants has measures. It is recommended to handle it as a fact table. | |
 | The use of the many-to-many "Bridges" tables could be studied in depth since, in the future, it would be a relationship between tables with too many records. | The use of many-to-many dimensions in the models is avoided; only the bridge table of research groups is maintained in the research projects model in Table 7, since the research groups do not change with high frequency. |
IV. CONCLUSIONS
The dimensional models of research in HEIs found in the literature focus mainly on publications and research projects; they are built around the needs of particular HEIs and do not contemplate high-quality requirements in education. In this article, six adaptable dimensional models are proposed, which take into account the quantitative quality requirements of the research factor for the high-quality accreditation of the NAC, the eighteen available self-assessment reports presented by public HEIs to the NAC, the dimensions and attributes found in the review of the literature, and the opinion of experts in the accreditation of educational quality.
Of the models found in the literature review, a mere 28.57% of studies went on to evaluate the model through an external consultation that would validate the model or the proposed implementation, revealing a need to create studies that go on to validate their results.
Validation of the proposed adaptable dimensional models was carried out by means of a focus group of experts in dimensional modeling, which found that the models are 100% adaptable to the most frequent, least frequent, and proposed requirements. A group of accreditation experts also validated the requirements. These adaptable models will allow HEIs to appropriate them according to the information available in the different data sources, and they can serve as support for strategic decision-making in research fields.
As future work, it is proposed to address the other quality factors established by the NAC to propose dimensional models adaptable to HEIs to support these factors.
The proposed dimensional models should also be validated by a broader focus group, that is, with the participation of more HEIs in the country. In addition, the models are to be implemented through the Development and the Maintenance and Growth phases of the MBD methodology, allowing the generation of the reports requested by the NAC in the Research and artistic and cultural creation factor in a self-assessment process for institutional accreditation.