I. INTRODUCTION
Introduction to Programming (CS1) is a mandatory course for Systems Engineering students [1]. In this course, students should acquire logical skills and apply them within a programming paradigm to solve computational problems using a programming language [2], [3]. Students demonstrate these skills through learning activities that measure their achievement in the programming course [4]-[7].
In the Systems Engineering program of Universidad del Valle [8], the traditional teaching-learning methodology of a programming course consists of face-to-face classes that include theory and practice (laboratories) according to the content of the CS1 course. Students who pass this course continue to CS2; however, between 30% and 50% of students do not pass [9], [10]. Therefore, it is necessary to design learning strategies that help students improve their grades, acquire logical skills, work collaboratively, and increase their motivation [11]-[13].
In recent years, there has been great interest in the educational context in the design of collaborative tools that enhance the development of students' learning skills. In turn, these tools allow other experiences and work dynamics to emerge within the classroom. This interest began early in countries such as the United States, England, and Australia, whose educational experiences reflect the benefits of working in groups compared to individual work [14]-[16].
Collaboration as a learning strategy in the classroom has yielded satisfactory results in programming courses, improving academic aspects and personal skills [17], [18]. However, identifying the individual work that each student contributes to a group programming activity is a complicated task for the teacher [19], [20]. Pereira mentions in [21] that one way to address this problem is to develop tools that provide feedback based on summative and formative evaluation.
In the literature, one of the most popular learning approaches is Computer Supported Collaborative Learning (CSCL) [18], [22], which emerges as support for traditional learning [15], [23] and seeks to guide the learning process to ensure that students acquire knowledge collaboratively [24], [25]. This approach is based on the formation of groups and on how these can be supported by technology to improve learning and teaching. These processes can be adjusted to adapt to specific needs [26], [27].
This paper presents a strategy based on the CSCL methodology to form workgroups automatically in an introductory programming course (CS1). The purpose is to motivate students to develop programming skills and collaborate. Several experiments were conducted to assess whether collaborative work improves academic grades compared to individual work.
This paper is organized as follows: Section II presents the related work; Section III describes the methodology, where the questions of interest, the selection of the CS1 course sample, and the phases of the experiments are presented in detail; the subsequent section presents the results of the experiments; and the final section discusses the results and concludes the work.
II. RELATED WORK
CodeBench [20] is a system intended to include collaboration and evaluation of programming concepts. It is based on three stages: 1) task specification (what to test, how to test, coding plan); 2) distribution of tasks among group members (publish in a public repository, determine approval requirements, define the testers, define inputs and capture outputs, calculate grades); 3) testing the programs using the tool’s automatic evaluator.
CourseMaker [21] is an early warning system designed to support the academic process. It employs learning analytics to categorize performance and group effort. Teachers use this tool to provide timely assistance to students performing poorly or at risk of failing in their classes. Groups are formed to provide targeted advice and support for multiple university students.
TRAKLA [22] is an automatic source code evaluation tool that enables group formation based on dynamic evaluation and feedback among students. This tool enables workgroup formation but does not evaluate the group source code.
I-MINDS [23] is software for the intelligent management of online classrooms or groups. It enables programming activities to be reviewed in real time and offline, thus facilitating student practice. Its technology is based on intelligent agents that interact with users autonomously through a chatbot. Source code is evaluated using the virtual judge DOMJudge [24], where students submit their solutions to the posed programming problems and receive immediate feedback.
The UNCode tool [25] offers the option of grouping programming students according to three criteria: 1) by similar grades; 2) randomly; 3) applying the “pair programming” concept. The groups are assessed by evaluating the submitted source code, which indicates whether the syntax and efficiency of the program are correct or contain errors. Based on this evaluation, a group grade and an individual grade are assigned. Mechanisms such as static analysis and feedback with questionnaires (mini tests) are included; they help students improve the proposed programming solutions.
III. METHODOLOGY
This paper implements an algorithm to form workgroups automatically from a learning activity in a CS1 programming course. The process was based on questions of interest, population and sample, data collection, implementation of the automatic grouping algorithm, and interpretation of the results.
A. Research Questions
This work is based on the course CS1: Fundamentals of Object-Oriented Programming (FPOO). This course has a high academic failure rate [9], [28]; therefore, it is important to integrate learning strategies based on collaboration and tools that help improve students’ academic performance [21], [29]-[31].
The strategy was implemented by answering the questions of interest. RQ1: Does the final grade of a student improve using the automatic formation of workgroups compared with traditional group formation? RQ2: What are the results in grades when activities are developed without group formation?
B. Population and Sample
The grades of 68 students were collected. They correspond to the laboratories, one exam, and the final project of the FPOO course in the second semester of 2021. This course is offered every semester at the School of Systems and Computing Engineering (EISC) of Universidad del Valle (Cali-Colombia). It consists of 4 hours of face-to-face class per week (2 theoretical and 2 practical), and 8 hours of autonomous work (development outside the classroom), for a total of 192 hours each semester (16 weeks).
In this course, students develop their skills in an object-oriented programming language. Learning outcomes are based on the design, development, documentation, and implementation of solutions that include object-oriented programming concepts (class, object, inheritance, polymorphism). Collaborative learning is based on the strengthening of attitudes toward teamwork and the integral development of the student in the cognitive and sociocognitive dimensions. The 68 students enrolled in the FPOO course were selected as the sample for this study. They were randomly divided into two groups: 34 students as a control group (CG) and 34 as an experimental group (EG). In the CG, 84.85% of students passed the course, and in the EG, 83.15%. The number of students and the percentages achieved show that the groups are homogeneous (Table 1).
C. Automatic Formation of Workgroups Using CSCL
FPOO students must solve programming problems as a group. In this course, workgroups of 3 or 4 students were formed. However, the traditional way of grouping students has biases and is not equitable; this is evidenced in the course's final grades [32]. For this reason, an algorithm that automatically creates homogeneous groups from the learning activities has been developed. The strategy is described below.
D. Dataset
The input to the algorithm that forms workgroups automatically is an initial learning activity that must be conducted individually. In this case, the grades obtained in laboratory 1 by the EG were used. A total of 110 submissions were made through the M-IDEA automatic evaluation platform. On average, each student made 2 submissions. Grades are between 3.4 and 4.9 (on a scale from 0.0 to 5.0). Finally, the data array is generated with the grades obtained in laboratory 1 and the code of each student.
E. Algorithm for Automatic Formation of Workgroups
The algorithm takes the previously generated data array and allocates each student a partner: the highest-performing students are paired with the lowest-performing ones. Then, each pair is grouped with another pair, thus creating a group of four students. If there is an odd number of students in total, the algorithm takes the student who has not been assigned to a group and includes them in one of the previously defined groups at random. The process uses the uniformity criterion, which consists of forming groups with the same number of students for the completion of activities. If the number of students not assigned to a group is equal to or less than three, the algorithm prompts the user to select whether to create a new group from these students or distribute them among the existing groups (Figure 1).
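The pairing procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `form_groups`, the `(code, grade)` tuple layout, the sample data, and the choice to merge consecutive pairs are all assumptions. The user prompt for leftover students is modeled by the hypothetical boolean parameter `leftovers_form_new_group`.

```python
import random
from typing import List, Tuple

Student = Tuple[str, float]  # (student code, laboratory 1 grade); layout is assumed


def form_groups(students: List[Student],
                leftovers_form_new_group: bool = False,
                seed: int = 0) -> List[List[Student]]:
    """Pair highest with lowest grades, then merge pairs into groups of four."""
    rng = random.Random(seed)
    ranked = sorted(students, key=lambda s: s[1], reverse=True)

    # Pair the i-th best student with the i-th worst one.
    pairs = []
    lo, hi = 0, len(ranked) - 1
    while lo < hi:
        pairs.append([ranked[lo], ranked[hi]])
        lo += 1
        hi -= 1
    # With an odd number of students, the middle one has no partner.
    unassigned = [ranked[lo]] if lo == hi else []

    # Merge consecutive pairs into groups of four (merge order is an assumption).
    groups = [pairs[i] + pairs[i + 1] for i in range(0, len(pairs) - 1, 2)]
    if len(pairs) % 2 == 1:
        unassigned = pairs[-1] + unassigned  # a pair left without a partner pair

    # At most three students remain: create a new group or spread them out.
    if unassigned:
        if leftovers_form_new_group or not groups:
            groups.append(unassigned)
        else:
            order = groups[:]
            rng.shuffle(order)  # random choice of receiving groups
            for i, student in enumerate(unassigned):
                order[i % len(order)].append(student)
    return groups


# Hypothetical grades within the 3.4 to 4.9 range reported for laboratory 1.
demo_students = [("s01", 4.9), ("s02", 4.8), ("s03", 4.7), ("s04", 4.5),
                 ("s05", 4.4), ("s06", 4.2), ("s07", 4.0), ("s08", 3.9),
                 ("s09", 3.7), ("s10", 3.5), ("s11", 3.4)]

for group in form_groups(demo_students, leftovers_form_new_group=True):
    print([code for code, _ in group])
```

With eleven students, choosing to create a new group from the leftovers yields two groups of four and one group of three, which is in line with the group sizes of 3 or 4 mentioned earlier.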
F. Experiments
Four experiments were conducted in this study using the grades obtained by the CG and EG students’ submissions. The first experiment covers laboratory 1; the second includes laboratories 2, 3 and 4; the third covers the exam; and the fourth covers the final project (see Table 2). Each experiment is described below.
Experiment | Tasks | CG | EG |
---|---|---|---|
1 | Laboratory 1 | Individually | Individually |
2 | Laboratories 2, 3, 4 | Traditional group | Automatic group |
3 | Exam | Individually | Individually |
4 | Final project | Traditional group | Automatic group |
1) Test 1 (lab 1). CG and EG students complete Laboratory 1 individually with a time limit of 2 hours (Table 3).
2) Test 2 (lab 2, 3 and 4). Students complete 3 laboratory activities with a time limit of 2 hours each (Table 4). The automatic formation of workgroups was used for the EG, and the traditional formation of groups was used for the CG (by affinity between students).
Task | Description | Time of assessment | % of course |
---|---|---|---|
Labs 2 and 3 | Learning objectives focus on the use of standard libraries of the C++ programming language, inheritance, and source code refactoring | 2 hours | |
Lab 4 | Learning objectives assess inheritance, polymorphism, and refactoring of source code | 2 hours | |
3) Test 3 (exam). In this experiment, EG and CG students take the exam individually with a time limit of 1 hour (Table 5). The exam for both groups consists of two components. The first includes multiple-choice questions with a single answer, which evaluate the conceptual learning outcomes of the course. The second includes a programming exercise that the student must develop and submit in the INGInious M-IDEA automatic source code evaluation tool, which assesses the practical learning outcomes of the course.
4) Test 4 (final project). In this experiment, students develop the final project in groups. This activity has a deadline of 12 weeks counted from week 4, when the project statement is shared with the students (Table 6). For the EG, the automatic formation of workgroups was used, while for the CG, the traditional formation of groups was used (by affinity between students).
IV. RESULTS
This section presents the results obtained by the CG and EG in the experiments described in the methodology. These results allow us to answer the defined questions of interest. Subsection A presents the results obtained by the students in laboratories 2, 3, 4 and the final project, which were carried out with automatic and traditional formation of workgroups. Subsection B presents the grades obtained by the students in laboratory 1 and the exam, developed without group formation. In Subsection C, the final grades of the CG and the EG are compared.
The results are reported using the median of the grades and the Mann-Whitney statistical test, which compares the grade distributions of the EG and CG under a null hypothesis H0 of no difference between the groups. The test yields a p-value that is compared against the significance level: if the p-value is less than or equal to 0.05, the null hypothesis is rejected, and it is concluded that the grades of the EG and CG differ at the 5% significance level. If the p-value is greater than 0.05, the null hypothesis is not rejected, indicating that the grades of the two groups do not differ significantly.
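The decision rule above can be illustrated with a small, self-contained sketch of a two-sided Mann-Whitney U test. The source does not state which software or variant of the test was used, so the normal approximation with tie and continuity corrections, the function name `mann_whitney_u`, and the sample grades are all assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        t = j - i                      # size of this group of tied values
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)

    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    # Continuity-corrected z statistic, clamped at zero.
    z = max((abs(u - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    p_value = math.erfc(z / math.sqrt(2.0))  # two-sided tail probability
    return u, p_value


# Hypothetical grades on the 0.0 to 5.0 scale (not the study's data).
cg_grades = [3.9, 3.0, 3.5, 4.0, 3.2, 3.8, 3.6, 3.1]
eg_grades = [4.8, 4.9, 4.7, 5.0, 4.6, 4.8, 4.9, 4.5]

u_stat, p = mann_whitney_u(cg_grades, eg_grades)
print("reject H0" if p <= 0.05 else "do not reject H0")
```

For small samples an exact test may be preferable to the normal approximation; the approximation is used here only to keep the sketch free of external dependencies.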
A. Qualification of Students Using Group Formation (Automatic and Traditional)
To respond to RQ1, the grades students obtained in the learning activities corresponding to laboratories 2, 3, 4 and the final project were analyzed. Activities were conducted with automatic and traditional group formation. Figure 2 shows the results obtained in these laboratories for the CG and EG. In laboratory 2, the CG presents a median of 3.9, and the EG of 4.8. In laboratory 3, the CG reached 3.0 in the median of the grades, and the EG obtained 4.9. Finally, in laboratory 4, the CG reached a median of 3.9, while the EG obtained 4.8.
The median grade of the EG is higher than that of the CG in laboratories 2, 3 and 4; across these laboratories, the medians average 4.8 for the EG and 3.6 for the CG (on a scale of 0.0 to 5.0). This suggests that the implemented automatic formation of workgroups has positive effects on programming activities. The learning objectives are related to correct coding style, documentation, source code debugging, use of C++ standard libraries, inheritance, polymorphism, and source code refactoring. However, it is necessary to carry out other experiments to support this idea.
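As a quick check of the averages quoted above, the following sketch averages the per-laboratory medians reported for Figure 2. Rounding to one decimal place is an assumption about how the figures were obtained.

```python
from statistics import mean

# Medians reported for laboratories 2, 3 and 4.
cg_medians = {"lab2": 3.9, "lab3": 3.0, "lab4": 3.9}
eg_medians = {"lab2": 4.8, "lab3": 4.9, "lab4": 4.8}

# Average of the three medians, rounded to one decimal (assumed rounding).
cg_avg = round(mean(list(cg_medians.values())), 1)
eg_avg = round(mean(list(eg_medians.values())), 1)

print(cg_avg, eg_avg)
```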
Figure 3 presents the results obtained in the final project. The CG reached a median of 4.0, while the EG obtained 5.0. With the results of the experiment, it was observed that the automatic formation of workgroups generated positive results in the EG, reaching higher grades compared to the CG. This indicates that the strategy can be used to conduct activities where the use of relationships between objects, polymorphism, property changes, controller class and extensibility are evaluated.
In laboratories 2 and 3 and the final project, the p-values were 1.29e-09, 1.64e-06, and 1.2e-05, respectively. In these cases, the null hypothesis is rejected: the grades differ between the EG and CG at the 5% significance level. However, in laboratory 4, the p-value was 0.69. Because this value is greater than 0.05, the null hypothesis is not rejected, and the grades of the two groups (EG and CG) do not differ significantly.
B. Students’ Grades in Laboratory 1 and Exam (Without Group Formation)
To respond to RQ2, we analyzed the grades of the students in laboratory 1 and the exam, conducted without group formation. In laboratory 1, the CG reached a median of 4.6, while that of the EG was 4.0. In this learning activity, which evaluates correct coding style, documentation, and debugging of C++ source code, the CG grades were 12% higher than those of the EG (Figure 4).
In the exam, all the theoretical and practical concepts developed during the academic semester were evaluated. The CG reached a median of 4.2, while the EG obtained 4.0 (Figure 5). It is probable that students of the two groups reach this grade because the activity was developed individually (without group formation). However, it is necessary to conduct new experiments and activities to validate this argument.
In laboratory 1, the p-value was 0.50, and in the exam, it was 0.20. The null hypothesis is not rejected for either learning activity because both p-values are greater than 0.05, indicating that the grades of the two groups (EG and CG) do not differ significantly.
C. GPA Comparison of the CG and EG
Finally, the medians of the final grades across all the learning activities were compared between the CG and EG. The CG reached a median of 3.1, while the EG reached a median of 4.7 (Figure 6).
The results indicate that the EG obtained better grades than the CG. This may be due to the implementation of the automatic formation of workgroups. It is necessary to conduct other experiments that allow us to discuss this idea.
The p-value was 1.34e-07. Since this value is less than 0.05, the null hypothesis is rejected: the grades differ between the EG and CG students at the 5% significance level.
V. DISCUSSION AND CONCLUSIONS
This article presents a strategy based on the CSCL methodology to form workgroups automatically using a learning activity conducted in a programming course (CS1). For RQ1, it was determined that EG students improved their grades by 22% compared to the CG in programming activities related to correct coding style, documentation, source code debugging, inheritance, polymorphism, and code refactoring. We also observed that the EG improved their grades by 20% in relation to CG in the development of the final project. There, the use of relationships between objects, polymorphism, property changes, controller class, and extensibility were evaluated.
However, in laboratory 4, the grades of the two groups (EG and CG) do not differ significantly according to the Mann-Whitney statistical test. This is consistent with what Böhne and Kardan describe in their works [19], [20]: identifying the work each student contributes to the group in a programming activity is a complex task for the teacher.
When analyzing the activities without group formation, for RQ2, we observed that the CG achieved better results compared to the EG in laboratory 1 and the exam. In laboratory 1, the CG scores were 12% higher than the EG scores, while in the exam the CG was 4% higher than the EG. However, when comparing the final grade of all the activities in the EG, we observed that the results of laboratories 2, 3, 4 exceeded the results obtained in laboratory 1 and the exam by 16%, while the final project surpassed laboratory 1 and the exam by 20% (see Table 7). This shows that the automatic formation of workgroups implemented in laboratories 2, 3, 4 and the final project is effective for this type of activity.
Tasks | CG Grade | EG Grade |
---|---|---|
Laboratory 1 | 4.6 | 4.0 |
Laboratories 2, 3, 4 | 3.7 | 4.8 |
Final project | 4.0 | 5.0 |
Exam | 4.2 | 4.0 |
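The percentages quoted in this discussion are consistent with grade differences expressed relative to the 5.0-point scale. That reading is an assumption inferred from the numbers in Table 7, not a formula stated in the text; the helper name `gap_percent` is hypothetical.

```python
SCALE_MAX = 5.0  # grades are reported on a scale from 0.0 to 5.0

# (CG grade, EG grade) per task, copied from Table 7.
table7 = {
    "Laboratory 1": (4.6, 4.0),
    "Laboratories 2, 3, 4": (3.7, 4.8),
    "Final project": (4.0, 5.0),
    "Exam": (4.2, 4.0),
}


def gap_percent(higher: float, lower: float) -> int:
    """Difference between two grades as a percentage of the grading scale."""
    return round((higher - lower) / SCALE_MAX * 100)


for task, (cg, eg) in table7.items():
    leader = "EG" if eg > cg else "CG"
    print(f"{task}: {leader} ahead by {gap_percent(max(cg, eg), min(cg, eg))}%")

# Within the EG: group activities compared with the individual ones (both 4.0).
eg_labs_vs_individual = gap_percent(table7["Laboratories 2, 3, 4"][1], 4.0)
eg_project_vs_individual = gap_percent(table7["Final project"][1], 4.0)
```

Under this reading, the differences are percentage points of the scale rather than relative increases over the lower grade.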
Authors such as Pereira, Oliveira, and Fonseca [21], [29]-[31] mention that it is important to integrate learning strategies into automatic source code evaluation tools because they can improve academic performance. In the INGInious M-IDEA source code evaluation tool used in our experiments, automatic feedback was generated on each submission. This made it possible to monitor the learning process and to identify whether the students applied the programming concepts in the source code and corrected errors. Likewise, the learning results of the course were more homogeneous. Group programming skills were also stimulated, which helped improve final grades through the automatic formation of workgroups.
The design of strategies that integrate collaboration, learning analytics, and technological tools can improve student grades and reduce the likelihood that only one student ends up doing the work of the whole group. In addition, it improves interpersonal skills that encourage sharing and enjoying learning computer programming.