A substantial number of research studies have investigated the separate influence of working memory, attention, motivation, and learning strategies on mathematical performance and on self-regulation in general. There is still little understanding of their joint impact on performance, of their interactions, and of how much each contributes to the prediction of mathematical performance. With the emergence of new methodologies and technologies, such as modelling with predictive systems, it is now possible to study these effects with approaches that use a wide range of data, including student characteristics, to estimate future performance without the need for traditional testing (Boekaerts and Cascallar, 2006). This research examines the different cognitive patterns and the complex relations between cognitive, motivational, and background variables associated with different levels of mathematical performance, using artificial neural networks (ANNs). A sample of 800 entering university students was used to develop three ANN models to identify the expected future level of performance in a mathematics test. These ANN models achieved a high degree of precision in the correct classification of future levels of performance and revealed differences in the pattern of relative predictive weights among those variables. The implications for educational quality, improvement, and accountability are highlighted.

Although there is substantial research which has investigated the influences of (a) working memory [

Artificial neural networks (ANNs) have been used in several different fields of research and in applied environments, such as biology [

However, the literature shows very few studies applying neural networks in education and in educational assessment in particular [

The purpose of this research was to develop predictive classification models that could identify with sufficient precision three groups of students corresponding to the highest 30%, lowest 30%, and middle 30% of estimated future performance in a mathematics test, utilizing only cognitive, motivational, and background variables, with no consideration of the mathematics content present in the test or of any measure of previous mathematics performance. Finally, in order to compare the predictive power of this ANN-based approach with more classical statistical methods, discriminant analyses were used.

It was expected that results would enable the development of an “early warning” system which could allow early and prompt intervention with those students most in need of support and remediation in mathematics (at the level of exit from secondary education and/or at the beginning of university studies). Similarly, this approach could serve to identify top or advanced students and improve their placement and/or career choice.

A large body of literature shows working memory as a very important construct in several areas, and several studies have shown its important role in a wide range of complex cognitive behaviors, such as comprehension, reasoning, and problem solving [

Working memory capacity refers to the temporary representation of information that was just experienced or just retrieved from long-term memory but no longer exists in the external environment, and it will be operationalized by the overall measure of the automated operation span [

Mathematical cognition involves complex mechanisms or processes such as identification of relevant quantities, encoding into an internal representation, mental comparisons, and calculations [

There is some supportive but not extensive literature on the critical role of working memory in mathematical performance [

One of the most recent approaches to working memory develops computational models that simulate the effects of individual differences and/or working memory load on participants’ performance on various cognitive tasks. Interesting areas of this approach include the model of mental algebra [

In cognitive models, attention has been traditionally involved in the control of intended actions. In this sense, attentional control has been identified as an important domain in self-regulation [

Specifically, attention problems have been related to mathematical performance. Inattention is considered as a risk factor for poor math achievement [

Current research findings suggest that attention involves different mechanisms which involve separate brain areas. In particular, attention encompasses three subsystems: (a) orienting, (b) alerting, and (c) executive control. The orienting network allows the selection of information from sensory input, the alerting network refers to a system that achieves and maintains an alert state, and executive control is responsible for resolving conflict among responses [

Previous research on self-regulated learning focuses primarily on the learning strategies that students need to use in order to guide their learning [

Motivational self-regulation includes motivational beliefs, motivation strategies, and motivational regulatory strategies. Motivational beliefs involve (a) the values that students attach to a particular domain; (b) the students’ opinions of the efficiency and effectiveness of learning and teaching methods; (c) beliefs about internal control, transformed into self-efficacy beliefs (opinions that students hold about their own ability in relation to a specific domain); (d) outcome expectations, that is, beliefs about the success or failure of specific actions; (e) goal orientation, that is, orientation toward learning tasks versus ego orientation, in which the intention is to demonstrate success (approach ego-orientation) or to hide failure (avoidance ego-orientation); and (f) effort beliefs. Domain-specific self-efficacy beliefs influence effort investment, and not the other way round [

Research shows that epistemic and motivational beliefs that students hold play an important role in self-regulation [

Other categories of beliefs have been identified about the self in relation to mathematical learning: achievement goal orientation [

Conceptually, a neural network is a computational structure consisting of several highly interconnected computational elements, known as neurons, perceptrons, or nodes. Each neuron carries out a very simple operation on its inputs and transfers the output to a subsequent node or nodes in the network topology [

Predictive streams analyses [

The ANN learns by examining each individual training case, generating a prediction for it, and adjusting the weights whenever it makes an incorrect prediction. Information is passed back through the network in iterations, gradually changing the weights. As training progresses, the network becomes increasingly accurate in replicating the known outcomes. This process is repeated many times, and the network continues to improve its predictions until one or more stopping criteria have been met. A minimum level of accuracy can be set as the stopping criterion, although additional stopping criteria may be used as well (e.g., number of iterations and amount of time). Once trained, the network can be applied to future cases (validation or holdout sample) for validation and implementation [
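This train-until-a-stopping-criterion-is-met scheme can be sketched with a single perceptron on a toy task. This is a deliberately minimal illustration, not the multilayer SPSS models used in the study; the task (logical AND), learning rate, and iteration cap are invented for the example:

```python
# Minimal sketch of iterative error-driven training with stopping criteria:
# adjust weights only on incorrect predictions, stop when an accuracy
# target or an iteration cap is reached. Toy single-perceptron example.

def train_perceptron(cases, max_iters=100, target_accuracy=1.0):
    w = [0.0, 0.0]
    b = 0.0
    lr = 0.1
    accuracy = 0.0
    for iteration in range(max_iters):              # stopping criterion: iteration cap
        errors = 0
        for (x1, x2), target in cases:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            if pred != target:                      # adjust weights only on mistakes
                errors += 1
                w[0] += lr * (target - pred) * x1
                w[1] += lr * (target - pred) * x2
                b += lr * (target - pred)
        accuracy = 1 - errors / len(cases)
        if accuracy >= target_accuracy:             # stopping criterion: accuracy target
            break
    return w, b, accuracy

# Toy "training cases": logical AND, learnable by a single perceptron.
cases = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b, acc = train_perceptron(cases)
```

Once the loop exits, the fitted weights play the role of the trained network that is then applied to the holdout cases.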

In order to evaluate the performance of the neural network system, there are a number of measures used which provide a means of determining the quality of the solutions offered by the various network models tried. The traditional measures include the determination of actual numbers and rates for true positive (TP), true negative (TN), false positive (FP), and false negative (FN) outcomes, as products of the ANN analysis. In addition, certain summative evaluative algorithms have been developed in this field of work, to assess overall quality of the predictive system.

These overall measures include Recall, which represents the proportion of correctly identified targets out of all targets presented in the set, and is computed as Recall = TP/(TP + FN), and Precision, which represents the proportion of correctly identified targets out of all cases classified as targets, and is computed as Precision = TP/(TP + FP).

Testing phase of the neural network predicting lowest 30% math scores.

| Observed performance | Predicted: ~30% lowest | Predicted: 30% lowest |
|---|---|---|
| ~30% lowest | 71.40% | 28.60% |
| 30% lowest | 0% | 100% |

In addition, the evaluation of ANN performance is carried out with a summative measure, which is used to account for the somewhat complementary relationship between Precision and Recall. This measure is defined as F1 = (2 * Precision * Recall)/(Precision + Recall). Such a definitional expression of F1 assumes equal weights for Precision and Recall. This assumption can be modified to favor either Precision or Recall, according to the utility and cost/benefit ratio of outcomes favoring either Precision or Recall for any given predictive circumstance.
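The evaluation measures above can be expressed directly in code. The counts in the example call are invented for illustration, and the `beta` parameter sketches the weighted generalization of F1 mentioned at the end of the paragraph:

```python
# Precision, Recall, and the (optionally weighted) F measure, computed from
# raw true-positive (TP), false-positive (FP), and false-negative (FN)
# counts. The example counts are made up for illustration.

def precision(tp, fp):
    return tp / (tp + fp)   # correct targets out of all cases flagged as targets

def recall(tp, fn):
    return tp / (tp + fn)   # correct targets out of all actual targets

def f_score(p, r, beta=1.0):
    """beta = 1 weighs Precision and Recall equally (the F1 above);
    beta > 1 favors Recall, beta < 1 favors Precision."""
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)

p = precision(30, 10)   # 0.75
r = recall(30, 0)       # 1.0 (no missed targets)
f1 = f_score(p, r)
```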

The sample included 800 university students, of both genders, ages between 18 and 25, enrolled in the first year in several different disciplines (psychology, engineering, medicine, law, social communication, business, and marketing), in three universities, during the 2009-2010 academic year.

This task provides a measure for each of the three anatomically defined attentional networks: alerting, orienting, and executive. Participants are asked to determine whether a central arrow points left or right. Responses to the ANT were collected via two mouse buttons (left/right). Participants were instructed to focus on a centrally located fixation cross throughout the task and to respond as quickly and accurately as possible. During the practice trials, but not during the experimental trials, subjects received feedback from the computer on their speed and accuracy. The practice trials took approximately 2 minutes, and each of the three experimental blocks was approximately 5 minutes long. The whole experiment took about 20 minutes. The measure for (general) attention is the average response time regardless of the cues or flankers. To analyse the effect of the three attentional networks, a set of cognitive subtractions described by Fan et al. [
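The cognitive subtractions of Fan et al.'s ANT can be sketched as follows; the reaction-time values in the example dictionary are invented for illustration:

```python
# Sketch of the ANT cognitive-subtraction scores (Fan et al.), computed
# from mean reaction times (RTs, in ms) per cue/flanker condition.
# The RT values below are made up for illustration.

def ant_scores(rt):
    return {
        "alerting":  rt["no_cue"] - rt["double_cue"],       # benefit of a warning cue
        "orienting": rt["center_cue"] - rt["spatial_cue"],  # benefit of a spatial cue
        "executive": rt["incongruent"] - rt["congruent"],   # cost of flanker conflict
    }

rt = {"no_cue": 560, "double_cue": 520, "center_cue": 540,
      "spatial_cue": 500, "congruent": 510, "incongruent": 610}
scores = ant_scores(rt)
```

Larger alerting and orienting scores indicate a greater benefit from the cues, while a larger executive score indicates greater difficulty resolving response conflict.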

This is a computer-administered version of the Ospan instrument [

A validated Spanish version was administered. It is a 77-item questionnaire with 10 scales that assesses the students’ awareness about, and use of, learning and study strategies related to skill, will, and self-regulation components of strategic learning. The Attitude Scale assesses students’ attitudes and interest in college and academic success. It examines how facilitative or debilitative their approach to college and academics is for helping them get their work done and for succeeding in college (sample item: I feel confused and undecided as to what my educational goals should be). The Motivation Scale assesses students’ diligence, self-discipline, and willingness to exert the effort necessary to successfully complete academic requirements (sample item: When work is difficult I either give up or study only the easy parts). The Time Management Scale assesses students’ application of time management principles to academic situations (sample item: I only study when there is the pressure of a test). The Anxiety Scale assesses the degree to which students worry about school and their academic performance. Students who score low on this scale are experiencing high levels of anxiety associated with school (note that this scale is reverse scored). The Concentration Scale assesses students’ ability to direct and maintain attention on academic tasks (sample item: I find that during lectures I think of other things and do not really listen to what is being said). The Information Processing Scale assesses how well students can use imagery, verbal elaboration, organization strategies, and reasoning skills as learning strategies to help build bridges between what they already know and what they are trying to learn and remember, that is, knowledge acquisition, retention, and future application (sample item: I translate what I am studying into my own words). 
The Selecting Main Ideas Scale assesses students’ skill at identifying important information for further study from among less important information and supporting details (sample item: Often when studying I seem to get lost in details and cannot see the forest for the trees). The Study Aids Scale assesses students’ use of supports or resources to help them learn or retain information (sample item: I use special helps, such as italics and headings that are in my textbooks). The Self-Testing Scale assesses students’ use of reviewing and comprehension monitoring techniques to determine their level of understanding of the information to be learned (sample item: I stop periodically while reading and mentally go over or review what was said). The Test Strategies Scale assesses students’ use of test preparation and test taking strategies (sample item: In taking tests, writing themes, etc., I find I have misunderstood what is wanted and lose points because of it). Items were scored on a 5-point Likert scale ranging from “Always” to “Never.”

The last version of the On-Line Motivation Questionnaire, namely, the OMQ91 [

This test consisted of 65 multiple choice items with four or five options and only one correct answer (50 items were taken from a national test [

In addition, a questionnaire was administered in order to collect background variables: gender, highest level of education of mother and father (i.e., did not complete mandatory primary school, completed primary school, completed secondary school, completed undergraduate university studies, completed postgraduate studies), occupation of parents, and secondary school from which the student graduated (i.e., public, private religious school, private nonreligious school, bilingual school, foreign community school).

The ANN model used was a backpropagation multilayer perceptron, that is, a multilayer network composed of nonlinear units, each of which computes its activation level by summing all the weighted activations it receives and then transforms its activation into a response via a nonlinear transfer function. During the training phase, these systems evaluate the effect of the weight patterns on the precision of their classification of outputs and then, through backpropagation, adjust those weights in a recursive fashion until they maximize the precision of the resulting classifications. A predictive classification architecture based on ANN models was developed for each targeted future mathematical performance group: the lowest 30%, middle 30%, and highest 30% of student performance. ANN parameters and variable groupings, as well as all other network architecture parameters, were manipulated to maximize predictive precision and total accuracy. Confusion matrices were determined for each ANN, as well as receiver operating characteristic (ROC) curves to determine the discrimination level of each model. ROC analyses provide a very useful measure to establish the performance of the classifier at various levels of true positive and true negative rates, using sensitivity and specificity values. Parameters such as learning rate, momentum, number of hidden layers, stopping rules, transfer functions, and number of nodes were specified and manipulated in the model construction phase in order to maximize the overall performance of the models.
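The ROC analysis mentioned above amounts to sweeping a decision threshold over the network's output scores and recording the true-positive rate (sensitivity) against the false-positive rate (1 − specificity) at each level. A minimal sketch, with scores and labels invented for illustration:

```python
# Sketch of ROC-curve construction: for each candidate threshold over the
# model's output scores, compute the false-positive and true-positive
# rates. The scores and labels below are made up for illustration.

def roc_points(scores, labels):
    """scores: model outputs; labels: 1 = target group, 0 = rest."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))   # (1 - specificity, sensitivity)
    return points

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1,   1,   0,   1,   0,   0]
pts = roc_points(scores, labels)
```

A classifier that separates the groups well yields points hugging the top-left corner; the curve always ends at (1.0, 1.0), where every case is classified as a target.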

Three different neural networks (ANNs) were developed as predictive systems for the mathematics task of this study. ANN1 was developed to maximize the predictive classification of the lowest 30% of students, which would be scoring the lowest in the mathematics test. ANN2 was developed to maximize the predictive classification of the highest 30% of students, which would be scoring the highest in the mathematics test. ANN3 was developed to predict the middle 30% of students, which would be scoring in the middle level of performance in the mathematics test. The specific architecture of each of the three neural networks developed is as follows.

ANN1 (low 30%): all cognitive, motivational, and background variables were introduced in the analysis. They were used for the development of the vector-matrix containing all predictor variables for each student. The resulting network contained all the input predictors, some of them collapsed into subscales to maximize predictive classification, with a total of 36 input units. The model built contained one hidden layer, with 8 units. The output layer contained two units (categories corresponding to “belongs to lowest 30%” or “belongs to highest 70%”). A standardized method for the rescaling of covariates was used. The hidden layer had a hyperbolic tangent activation function, which is the most common activation function used for neural networks because of its greater numeric range (from −1 to 1) and the shape of its graph. For the output layer, the activation function chosen was identity, and the error function the sum of squares.

ANN2 (high 30%): all cognitive, motivational, and background variables were introduced in the analysis. They were used for the development of the vector-matrix containing all predictor variables for each student. The resulting network contained all the input predictors; some of them collapsed into subscales to maximize predictive classification, with a total of 36 input units. The model built contained two hidden layers, with 8 and 6 units, respectively, and an output layer with two units (categories corresponding to “belongs to highest 30%” or “belongs to lowest 70%”). A standardized method for the rescaling of covariates was used. The hidden layer and output layer had a hyperbolic tangent activation function, and the error function the sum of squares.

ANN3 (middle 30%): all cognitive, motivational, and background variables were introduced in the analysis. They were used for the development of the vector-matrix containing all predictor variables for each student. The resulting network contained all the input predictors; some of them collapsed into subscales to maximize predictive classification, with a total of 36 input units. The model built contained one hidden layer with 1 unit and one output layer with two units (categories corresponding to “belongs to middle 30%” or “belongs to extreme 30%’s”). A standardized method for the rescaling of covariates was used. The hidden layer had a hyperbolic tangent activation function, and the output layer applied a softmax activation function.
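The activation functions named in the three architectures can be illustrated with a minimal forward pass. The weights here are random placeholders rather than the fitted SPSS models, and only the ANN3-style topology (36 inputs → 1 tanh hidden unit → 2-unit softmax output) is shown:

```python
# Sketch of the forward pass implied by the architectures above: tanh
# hidden units, with identity (ANN1), tanh (ANN2), or softmax (ANN3)
# output activations. Weights are random placeholders, not fitted values.
import math
import random

def layer(inputs, weights, biases, activation):
    out = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        out.append(activation(z))
    return out

def identity(z):
    return z

def softmax(zs):
    m = max(zs)                                # subtract max for stability
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

random.seed(0)
def rand_weights(n_out, n_in):
    return [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

# ANN3-style: 36 inputs -> 1 tanh hidden unit -> 2-unit softmax output.
x = [random.random() for _ in range(36)]
h = layer(x, rand_weights(1, 36), [0.0], math.tanh)
logits = layer(h, rand_weights(2, 1), [0.0, 0.0], identity)
probs = softmax(logits)   # two class "probabilities" summing to 1
```

The softmax output makes the two units directly interpretable as membership probabilities for "middle 30%" versus "extreme 30%'s", which is why it suits ANN3's binary categorical output.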

The software used was SPSS v.19, Neural Network Module, for the development and analysis of all predictive models in this study. The usual three development phases of the predictive system were carried out: training of the network, testing of the network developed, and validation of the network. During the training phase several models were attempted, and several modifications of the neural network parameters were tried, manipulating learning persistence, learning rate, momentum, and other criteria. These tests continued until achieving desired levels of classification, maximizing the benefits of the model chosen. In this analysis both precision and recall, as outcome measures of the network, were given equal weight. There was no need to trim the number of predictor inputs in the three models.

Discriminant analyses (DA) were carried out using the same data and the same categories of mathematical performance used in the neural network analyses. DA1 was performed to discriminate between the students belonging to the lowest 30% of mathematical performance and those not in that category; DA2 focused on identifying students in the highest 30% versus those not in that group; and DA3 was calculated to discriminate between the students belonging to the middle 30% and those not in that category. In order to give every variable the opportunity to contribute significantly to the prediction, a stepwise discriminant analysis was calculated for each category, including all independent variables. In addition, we calculated three discriminant analyses, one for each category, including the independent variables of the maximised neural network for each category.
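For readers unfamiliar with the technique, a two-group linear discriminant in the Fisher sense can be sketched as follows. The two-feature data are invented for illustration, and this is a bare-bones sketch rather than the SPSS stepwise procedure used in the study:

```python
# Minimal sketch of a two-group linear discriminant (Fisher's criterion):
# find the direction w = Sw^{-1}(m1 - m0) that best separates the group
# means relative to the pooled within-group scatter. Toy data only.

def mean(rows):
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

def scatter(rows, m):
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(g0, g1):
    m0, m1 = mean(g0), mean(g1)
    s0, s1 = scatter(g0, m0), scatter(g1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]       # 2x2 matrix inverse
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    return [inv[0][0] * dm[0] + inv[0][1] * dm[1],
            inv[1][0] * dm[0] + inv[1][1] * dm[1]]

g0 = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2]]   # toy "not in target group"
g1 = [[3.0, 3.5], [3.2, 3.9], [2.8, 3.6]]   # toy "target 30% group"
w = fisher_direction(g0, g1)
# Projecting each case onto w separates the two toy groups.
```

Note that, unlike an ANN, this projection is strictly linear, which is precisely the limitation the comparison in this study is designed to expose.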

The ANN1 was able to reach 100% correct identification of all students belonging to the target group (lowest 30%) in both the training and testing phases. The precision of ANN1 equalled .75 out of a maximum of 1 (see Table

Table

Relative importance of the top variables participating in the model for the predictive classification of the lowest 30% of scores in the mathematics test.

| Independent variable (low 30% group) | Importance | Normalized importance |
|---|---|---|
| Gender | .035 | 34.2% |
| Mother’s educational level | .028 | 28.2% |
| Father’s educational level | .024 | 23.9% |
| Mother’s occupation | .065 | 64.5% |
| Father’s occupation | .059 | 58.8% |
| Age | .062 | 61.5% |
| Competence-related attribution for success | .041 | 40.3% |
| Personal relevance of task | .029 | 28.3% |
| Subjective competence | .043 | 42.7% |
| Task attraction | .042 | 41.8% |
| Learning intention | .052 | 51.8% |
| Reported effort | .062 | 61.1% |
| Expected result of assessment | .099 | 97.7% |
| Emotional state | .062 | 61.3% |
| Alerting attention | .029 | 29.1% |
| Orienting attention | .018 | 17.4% |
| Executive attention | .067 | 66.4% |
| Working memory | .081 | 80.6% |
| Reaction time (operations) | .101 | 100.0% |

Normalized importance of the top variables participating in the model for the predictive classification of the lowest 30% of scores in the mathematics test.

The ANN2 reached an accuracy of 90% and 100% in the training and testing phases, respectively. The precision of ANN2 equalled .80 out of a maximum of 1 (see Table

Testing phase of the neural network predicting highest 30% math scores.

| Observed performance | Predicted: ~30% highest | Predicted: 30% highest |
|---|---|---|
| ~30% highest | 66.70% | 33.30% |
| 30% highest | 0% | 100% |

Figure

Normalized importance of the top variables participating in the model for the predictive classification of the highest 30% of scores in the mathematics test.

Both networks showed interesting differences in the pattern of relative normalized importance of those variables with the highest participation in the predictive model. For the low performers (those predicted to be in the lowest 30% of scores), several basic cognitive variables were most important in attaining a correct classification, such as “reaction-time,” “working memory capacity,” and the closely related “executive attention,” all having to do with the control and the speed of processing. In fact, three out of the top four variables in terms of relative predictive importance correspond to basic cognitive processing variables, with high relative values. Among the self-regulation variables, only “expected results of the assessment” appeared among the most predictive.

On the other hand, in the predictive model for those expected to be in the highest 30% of the scores, the top three predictors with the most significant participation were “task attraction,” “father’s occupation,” and “reported effort,” all among the self-regulation and background variables. Only “working memory capacity” (as measured by “absolute AOSPAN”) among the basic cognitive processing variables appeared among the top five predictors, and then with a much lower relative importance than for the low 30% group. The relatively lower importance of all cognitive control and speed-of-processing variables is quite evident; these variables do not discriminate well in the predictive classification of the highest 30% group. It is also worth noting the relatively high importance of parents’ occupation in both the low and high groups, particularly in the first neural network.

The ANN3 showed an accuracy of 74.5% and 70.6% in the training and testing phases, respectively. The precision of ANN3 equalled .70 out of a maximum of 1 (see Table

Testing phase of the neural network predicting for middle 30% math scores.

| Observed performance | Predicted: ~30% middle | Predicted: 30% middle |
|---|---|---|
| ~30% middle | 67.6% | 32.4% |
| 30% middle | 29.4% | 70.6% |

The most important variables for the prediction of ANN3 (middle 30%) were positive learning strategies and study techniques, reaction time (natural logarithm) of attentional networks, time management, and subjective competence (see Figure

Normalized importance of the top variables participating in the model for the predictive classification of the middle 30% of scores in the mathematics test.

DA1 focused on the lowest 30% of the students versus the rest. One restriction of this analysis is that the assumption of equality of covariance matrices is, in this case, violated (Box’s

DA2 was calculated to discriminate between the highest 30% of mathematical performance and the remaining 70% of students, entering the same independent variables used in ANN2. Results show that the independent variables were not able to discriminate between the two groups of students. The Box’s

DA3 involved the same variables as ANN3 to predict the middle 30% of mathematical performance. The assumption of equality of covariance matrices was violated (Box’s

It is clear from these results that, besides the high predictive power of the three neural networks in modelling the expected performance of the low-, middle-, and high-performance groups of students, this methodology has also detected important differences in the factors that seem to underlie the students’ performance. Among the students in the lowest 30% of math performance, the main determinants of performance appear to be basic cognitive processing variables, indicating that these are the areas of relative weakness in this group and the most discriminating with respect to the rest of the students. This seems to indicate that it is the area of basic cognitive abilities, in other words, the basic processing capacity of the cognitive system in these students, that best provides the information necessary to correctly identify this group. There is an extensive literature indicating a strong correlation between poor math performance and low working memory [

On the other hand, among the student groups with the highest 30% of math performance, the main determinants of performance appear to be self-regulation and background variables (particularly, how interested students were in the task and social indicators such as parents’ occupation). In this group cognitive processing variables had much lower levels of importance in terms of their predictive weights, probably due to the fact that this group was much stronger in its levels of cognitive processing, and therefore these variables are less discriminating when the model attempts to classify the students according to their performance level. Working memory, reaction time, and attentional networks seem to be much less discriminating among students who reach certain threshold levels needed for basic mathematical problem solving. The items about self-regulated learning were important for students with high performance, reflecting their appreciation of the content and context of the math test. The model of adaptive learning [

The prediction for the middle 30% level of math performance group of students shows a particular pattern involving learning strategies and self-efficacy (as important motivational beliefs), together with attentional resources, as important predictors. Moreover, working memory does not seem to improve the prediction of performance for this middle group of students, indicating that their mathematical performance is more determined by processes related to self-regulated learning (i.e., learning strategies, motivational beliefs, and attention). These are variables more related to environmental, instructional, and training constructs, rather than to basic cognitive processes such as working memory.

The results of the discriminant analyses (DA) confirm the lack of significant linear relations between the independent variables analysed here and mathematical performance. Neural network models have an important advantage in this area, because ANN models are able to model nonlinear and complex relationships among variables. Another assumption required for traditional statistical predictive models (e.g., equality of covariance matrices) was violated for the three stepwise discriminant analyses that were performed to predict a specific category (lowest 30% or not, highest 30% or not, and middle 30% or not). Even with this restriction, the amount of variance explained was low in the three DA analyses. None of the variables were able to discriminate between the different categories of mathematical performance. When we compare these results with the ANNs analysed in this study, it can be concluded that ANNs are more robust and perform significantly better than other classical techniques, as prior studies have indicated [

The predictive systems approach allows for the conceptualization and development of new modes of assessment which could facilitate breaking away from traditional forms of testing while at the same time improving the quality of the assessment process [