Providing feedback on the quality of student programming assignments is a tedious and laborious task for the instructor. In this paper, we use a few object-oriented software metrics, together with a reference code provided by the instructor, to analyze student programs and provide feedback. The empirical study identifies the software metrics that can be applied to the considered programming assignments and shows how the reference code helps the instructor assess them. This approach helps the instructor easily identify quality issues in student programs, and feedback on such assignments can be provided using the guidelines we discuss. We also perform an experimental study on programming assignments of sophomore students enrolled in an object-oriented programming course to validate our approach.
Assessment of students' programming assignments is mostly done using criteria such as functionality, design, and programming style. Moreover, giving feedback on students' assignments is a laborious task because the instructor needs to inspect every submission. Many computer-aided approaches (CAAs) have been proposed for assessing student assignments and giving feedback [
Software metrics are measures of certain aspects of a program that help to analyze and control its quality. Researchers have devised a large number of software metrics that analyze different aspects of source code. Instructors can also use these metrics to analyze students' programming assignments and give feedback. Most of the metrics used so far relate to the complexity and size of the programs [
As we proceed, we answer three questions pertaining to our research. Can object-oriented metrics assess student programs? How can these metrics, together with a reference code, help the instructor analyze student programs? Will this approach help the students?
We have organized this paper into different sections. In Section
The main approaches to assessing student programming assignments can be categorized as formative and summative. Formative assessment provides feedback to the student while learning, whereas summative assessment evaluates the learning process after completion. Another approach is diagnostic assessment, which is used to identify student learning difficulties and areas of strength [
An automated programming assessment system (APAS) can analyze and assess student programs, thereby reducing the instructor's workload. Presently, several APASs are used in many computer science courses. Along with program analysis, some of these APASs provide feedback and some do not; students need feedback on their programs in order to improve. In this study, we focus on approaches that use static assessment centered on software metrics, model codes, and providing feedback to students.
Software metrics are considered to be general measurements which characterize the computer programs [
The static approach does not require that the code be compiled and executed, whereas the dynamic approach does. In the static approach, assessment is performed directly on the code. The programming-style feature assesses the readability and maintainability of a program; these are greatly enhanced by criteria such as comments, indentation, line spacing, program layout, and meaningful variable names. Checkstyle is one tool that checks Java code against such criteria. Error detection in a program relies on the syntax and semantics of the programming language. A programming language's syntax is checked against the grammar specification of that language, while its semantics determines which logically correct statements conform to that grammar. These errors are generally detected by the language's IDE (integrated development environment); Eclipse is one such IDE for Java. Another method of static assessment is structure analysis, which matches the structures of programs: the assignment's structure is generally compared with the model program's structure, using program analysis, transformation, and program matching [
The process of giving feedback was also automated in the systems which evolved later. FrontDesk [
Software metrics have not been widely used in educational institutions for assessing student programs [
Cardell-Oliver [
Fuzzy logic was used in assessing student programs by applying it on software metrics and test cases [
As mentioned above, we focus our study on approaches that use software metrics, model codes, and feedback to students. Among these approaches, object-oriented metrics have received comparatively little attention in assessing student programs. So, in this research, we investigate whether object-oriented metrics can help in assessing student programs.
Chidamber and Kemerer [
Proposed approach.
Our idea is to accompany the software metrics with a reference code provided by the instructor and to analyze the assignments in order to provide feedback on their quality. We assume that all assignments are functionally sound and tested before their metrics are calculated. We use the CKJM (Chidamber and Kemerer Java Metrics) [
Finally, we compare the time the instructor needs to analyze the assignments with and without our approach for issues relating to cohesion and coupling complexity. We also survey the students on how well they can comprehend the feedback when provided with the reference code.
We perform the empirical study to examine two critical issues relating to this research. The first is whether the quality metrics of coupling and cohesion are capable of assessing student programs. Most metrics are insensitive to student assignments because of their small size: students are generally given assignments that can be completed with a few classes. Hence, most of the object-oriented metrics obtained from the CKJM tool showed little variation. Some metric values were similar across all the assignments and could not be used to detect deviations from the reference code. For example, the inheritance and coupling metrics were similar in all the assignments because they all targeted a common functionality and were small in size.
The second issue concerns how these metrics help the instructor analyze the student assignments for giving feedback. The metric values obtained from the tool are merely measurements derived from the way the code is written; on their own they signify little, and it is difficult to judge from established standards whether a given value is good or bad. For example, the LCOM3 (Lack of Cohesion in Methods) value, which ranges from 0 to 2, is considered harmful if it exceeds 1. Such standards are too general for small student assignments. Hence, we need another standard for interpreting the metrics obtained from student assignments. We obtain this standard in the form of a reference code provided by the instructor: the reference code's metric values serve as the baseline against which the student assignments are compared.
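The comparison step described above can be sketched as follows. The metric names, values, and tolerance in this snippet are illustrative only, not taken from the experiment:

```java
import java.util.*;

class MetricCompare {
    // Flags each metric whose assignment value deviates from the
    // reference code's value by more than the given tolerance.
    static List<String> flag(Map<String, Double> assignment,
                             Map<String, Double> reference, double tol) {
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, Double> e : reference.entrySet()) {
            double v = assignment.getOrDefault(e.getKey(), 0.0);
            if (Math.abs(v - e.getValue()) > tol) flagged.add(e.getKey());
        }
        return flagged;
    }
}
```

The point of the sketch is that the reference code's values, not fixed industry thresholds, define "too far": an assignment's LCOM3 of 1.4 is flagged only because the reference scores 0.8.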
We considered a sample of actual assignments from sophomore Computer Science students enrolled in the object-oriented programming course. These students had basic knowledge of object-oriented programming, such as creating a class, creating objects of a class, and invoking members of another class. They were given a Banking Assignment in which they were asked to create two classes: a Main class and an Account class. Their program had to display a menu of basic banking activities: deposit, withdrawal, transactions, interest calculation, balance display, and retrieval of the interest rate and added interest. The user should be able to select options from the menu for the desired operation. The instructor was asked to provide a reference code for the assignment, to be used in our approach. The experiment was performed on 11 student assignments that were functionally sound. We now try to optimize the students' programming approach toward the same functionality using our approach of software metrics and reference code.
The CKJM tool was used to obtain the metrics from the student assignments and the reference code. This tool along with the Chidamber and Kemerer metrics [
We now compare the CAM, LCOM3, RFC, and CC metrics of the assignments and the reference code for deviations. The grader was given guidelines on the four considered metrics: how they vary, the possible reasons for their variation, and suggested feedback to the student. These guidelines help the grader compare the assignments with the reference code.
The values of the metrics sensitive to smaller assignments, namely RFC, CAM, LCOM3, and CC, obtained from the CKJM tool, were analyzed for each student assignment against the reference code. We also suggested feedback for each metric, derived by inspecting the parts of the assignment causing the deviation.
The RFC metric measures the number of distinct methods that can be executed when an object of the class receives a message. A higher RFC indicates that the overall class design is complex. Therefore, to improve the class design, one must reduce the number of methods that can be invoked when a class object receives a message.
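As a toy illustration of how the response set grows (the class and method names here are ours, not the instructor's reference code), the single method below already gives the driver class a response set of four methods, itself plus the three Account methods it can invoke:

```java
class Account {
    private double balance;
    void deposit(double amount)  { balance += amount; }
    void withdraw(double amount) { balance -= amount; }
    double getBalance()          { return balance; }
}

class BankDriver {
    // Response set of BankDriver: {run, deposit, withdraw, getBalance},
    // i.e., an RFC contribution of 4. Placing user-interaction logic
    // (menus, input loops) here as well would add further calls and
    // inflate this class's RFC.
    static double run() {
        Account a = new Account();
        a.deposit(100.0);
        a.withdraw(40.0);
        return a.getBalance();
    }
}
```

Shifting responsibilities between classes thus shifts RFC between them, which is what the per-class comparison in the next subsection detects.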
The assignments in this study had two classes, a Main class and an Account class, to perform the user's banking functionality depending on the user's choice of input. There was no significant structural difference between these classes and those of the reference code. The reference code's Main class used an object to invoke the methods of the Account class, and the Account class contained the methods to display the menu, deposit money, withdraw money, obtain transactions, calculate the interest, display the balance, and obtain the interest value.
Among the student assignments, some showed significantly higher RFC in the Main class than in the Account class. On inspecting all such assignments, we found that the logic for displaying the menu and taking input from the user resided in the Main class. Figure
Sample of a student programming assignment.
Cyclomatic complexity measures the number of linearly independent paths in a program: the higher the CC value, the more complex the program. In this study, the Account class of the reference code and of the assignments did not show much variation, so we considered the Main class's CC for analyzing the quality of the assignments.
As the Main class had to invoke the Account class's methods depending on the user's choice, a decision statement was needed. The reference code's Main class used a switch statement for the required functionality. On examining the CC metric values of the student assignments, we found that 36% of them had significantly higher complexity than the reference Main class. All of these assignments used a different decision statement for the same functionality, which increased their complexity: they used an if-else-if ladder instead of the switch statement that would have been more effective for the assignment's solution. Figure
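A minimal sketch of the two dispatch styles follows; the operation names are illustrative, not the assignment's actual menu. Both behave identically, but how a metric tool scores them differs: in our measurements the ladder form was the one associated with higher CC:

```java
class MenuDispatch {
    // switch-based dispatch, the style used in the reference code's Main class
    static String bySwitch(int choice) {
        switch (choice) {
            case 1:  return "deposit";
            case 2:  return "withdraw";
            case 3:  return "balance";
            default: return "invalid";
        }
    }

    // if-else-if ladder, the style found in the higher-CC assignments
    static String byLadder(int choice) {
        if (choice == 1) return "deposit";
        else if (choice == 2) return "withdraw";
        else if (choice == 3) return "balance";
        else return "invalid";
    }
}
```

Since the two are behaviorally equivalent, the CC deviation here points at style rather than functionality, which is exactly the kind of issue the feedback targets.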
Sample of a student programming assignment.
Figure
Comparison graph of assignments with reference code.
The CAM metric measures the cohesion among a class's methods by considering the types of parameters those methods use. The CAM value ranges from 0 to 1, and according to the standards, a value closer to 1 (higher cohesion) is desirable. However, if a class's CAM value is 0.5, we cannot say that the class is half cohesive and half uncohesive. In our study, we use the CAM value of the reference code as the standard for judging the quality of student assignments. We analyzed the CAM values of the Account class of all the assignments and of the reference code, as the Account class has more functionality than the Main class and better cohesion is expected there.
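The intuition behind CAM can be sketched with a toy computation. This assumes the simplified form (sum of each method's distinct parameter types) / (number of methods × total distinct parameter types in the class); the CKJM tool's exact computation may differ in details:

```java
import java.util.*;

class CamSketch {
    // methodParamTypes: one set of parameter-type names per method.
    static double cam(List<Set<String>> methodParamTypes) {
        Set<String> all = new HashSet<>();
        for (Set<String> p : methodParamTypes) all.addAll(p);
        if (all.isEmpty()) return 1.0;      // no parameters anywhere: treat as fully cohesive
        double sum = 0;
        for (Set<String> p : methodParamTypes) sum += p.size();
        return sum / (methodParamTypes.size() * all.size());
    }
}
```

Under this sketch, three methods that all take a `double` give CAM = 1.0, while adding a method with unrelated `String` and `int` parameters pulls the value down, mirroring how unnecessary constructors and parameters lowered CAM in the assignments.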
The reference code’s Account class had a constructor to initialize the class attributes and other methods for performing the user’s banking operations. On examining the sample of student assignments, we found that 9% had significantly lower CAM values than the reference code. On close inspection, we found that these assignments had unnecessary constructors and used method parameters that were not required. The grader can use suggested feedback 2 to advise the student on how the code could be improved. Figure
Sample of a student programming assignment.
The LCOM3 metric is an improved variant of the LCOM metric. It measures the cohesion of a class’s methods in terms of how effectively they use the class’s attributes, giving a more specific measurement than LCOM. Its value ranges from 0 to 2, and a value below 1 is desired. In this study, we use the reference code’s LCOM3 as the standard for analyzing the student assignments. We considered the Account class of the student assignments and the reference code, as it has more functionality than the Main class.
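A common formulation of LCOM3 (Henderson-Sellers) is LCOM3 = (m − (1/a)·Σ μ(Aj)) / (m − 1), where m is the number of methods, a the number of attributes, and μ(Aj) the number of methods that access attribute Aj. Assuming that form, the value can be sketched as:

```java
class Lcom3Sketch {
    // m: number of methods; access[j]: how many of the m methods use attribute j.
    static double lcom3(int m, int[] access) {
        double avg = 0;
        for (int a : access) avg += a;
        avg /= access.length;           // average number of methods per attribute
        return (m - avg) / (m - 1.0);   // 0 = fully cohesive, up to 2 = worst
    }
}
```

If every one of three methods touches both attributes, LCOM3 is 0; an attribute that no method uses drags the average down and pushes the value up, which is why the unused declarations found in the assignments inflate the metric.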
We found that 27% of the student assignments had significantly higher LCOM3 values than the reference code. On examining all such assignments, we found that some had declared unnecessary class attributes, while others had not used the class attributes they declared. So, the instructor can inspect these assignments for such problems and provide feedback to the student using our guideline 3 and suggested feedback 3. Figure
Sample of a student programming assignment.
Figure
Comparison graph of assignments with reference code.
When using software metrics to assess student programs, one must consider metric dependencies. For example, the metrics for coupling and complexity can be traded off against each other: code can be written to reduce complexity by increasing the number of modules, which in turn increases the coupling between them. Such metric dependencies should be kept in mind during code inspection and when giving feedback to the student.
We also noticed small deviations among the metric values. Very small deviations from the reference code in the RFC and CC values can be ruled out: as the RFC count includes calls to methods in the class libraries, small variations could simply result from using or not using such methods in the student assignments, and minor CC deviations can likewise be ignored as insignificant. In the case of LCOM3 and CAM, however, even small variations could result from an unnecessary method or attribute and should be considered.
The grader was asked to find the quality issues related to cohesion and coupling complexity in the student assignments. The time to review each assignment was measured both in the usual way and using our approach of software metrics with reference code. Figure
Inspection timing graph without proposed approach.
The times taken by the grader to inspect the assignments using our suggested approach are shown in Figure
Inspection timing graph with proposed approach.
We also surveyed the students after giving them the feedback, to find out whether our approach helped them. The students were asked to answer three questions regarding the feedback provided by the instructor. Figure
Survey results without giving out reference code.
64% of the students had difficulty interpreting the feedback, and most felt that more details were necessary to understand it. So, we also provided them with the reference code and reviewed their responses again. Figure
Survey results after giving out reference code.
Now, most of the students understood, after looking at their feedback and the reference code, how they could improve their code. Thus, the proposed approach can help both the grader and the student in reviewing code quality.
The cohesion and coupling metrics considered in our experiment, namely CAM, LCOM3, and RFC, can help in assessing student programs. We have given guidelines for the grader on reviewing the assignments and providing possible feedback when they are compared against the reference code's metrics. The reviewing times reveal that our approach can help the instructor identify the classes that need inspection, thereby saving time. The survey of students showed that providing the reference code along with the instructor's feedback can improve their understanding of that feedback.
The student sample considered in our experiment is small, so we suggest that the experiment be repeated with a larger sample and a different assignment that has more than two classes. The Ca, Ce, and some other class metrics may vary if larger assignments with more classes are used. The metric values can then be analyzed by a similar procedure and verified against our approach.