With the spread of mobile platforms such as smartphones and tablets, E-Learning has developed rapidly. However, because completion rates on E-Learning platforms are low, it is necessary to analyze the behavioral characteristics of online learners in order to adjust online education strategies intelligently and improve the quality of learning. In this paper, we analyze the related indicators of E-Learning to build a student profile and propose countermeasures. Using similarity computation based on the Jaccard coefficient, we design a system model that cleans and mines educational data, including students’ learning attitudes and the duration of their learning behaviors, to establish the student profile. Based on E-Learning resources and learner behaviors, we also present an intelligent guide model that helps both the E-Learning platform and learners improve learning. The study of the student profile can help the E-Learning platform understand and guide students’ learning behavior more closely, provide personalized learning, and promote the optimization of E-Learning.
As an effective form of education, E-Learning supports a wider range of knowledge and skills than traditional education and, building on new information and communication technologies, is free of the restrictions of time and space [
E-Learning has developed rapidly. Figure
2010–2015 E-Learning user scale.
However, although more and more people are paying attention to E-Learning platforms, only 7%–9% of learners complete a MOOC course, according to Coursera statistics [
The student profile is a portrait of the learner built from big data and labeling. We collect, process, and analyze the data generated by learners’ behavior to produce an informational description of individual students or groups. According to behavioral psychology, analyzing student behavior data through the student profile can reveal students’ behavioral characteristics and psychodynamics. For example, the Education Big Data Research Institute of UESTC (University of Electronic Science and Technology of China) cooperates with other departments in developing the Student Profile System, which can give early warning of exam failure [
Given the available E-Learning data, we use big data technology to analyze the characteristics of E-Learning. The main research contents of this paper are as follows: (
The student profile describes learning characteristics from multiple dimensions and angles. It covers analysis indicators and influencing factors such as student behavior, data collection, data cleaning, and the building and analysis of the student profile [
The main subjects of student profile research are the students in a school or on an E-Learning platform. Assume the student set is as follows:
Learners’ age segment table.
Years | Symbol | Example |
---|---|---|
<17 | | It means student |
18–24 | | It means student |
25–34 | | It means student |
35–54 | | It means student |
>55 | | It means student |
Based on age, we can predict learner profile information and further mine the characteristics of students’ learning.
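As a rough illustration of how the age segments could be applied, the sketch below maps an age to one of the five ranges in the table. The segment names (`under_17`, `18_24`, and so on) are hypothetical placeholders, not the symbols used in the table. Ages 17 and 55 fall between the printed ranges, so this sketch returns `None` for them rather than guessing a segment.

```python
from typing import Optional

# Age ranges copied from the learners' age segment table.
# The segment names are hypothetical placeholders, not the table's symbols.
AGE_SEGMENTS = [
    ("under_17", None, 16),  # "<17"
    ("18_24", 18, 24),
    ("25_34", 25, 34),
    ("35_54", 35, 54),
    ("over_55", 56, None),   # ">55"
]


def age_segment(age: int) -> Optional[str]:
    """Return the segment name for an age, or None if no table row covers it."""
    for name, low, high in AGE_SEGMENTS:
        if (low is None or age >= low) and (high is None or age <= high):
            return name
    return None  # ages 17 and 55 are not covered by the ranges as printed
```

For example, `age_segment(20)` falls in the 18–24 range, which is the group analyzed later in the paper.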
Online learning behavior comprises the kinds of learning behavior that take place in a network environment. We focus on extracting the characteristics of learners from their online learning behavior in order to understand student performance. The core of learning behavior is the operation of online learning behaviors [
Preset 12 kinds of online learning behavior.
| ID | Learning behavior |
|---|---|
| | Browse learning goals |
| | Text learning |
| | Multimedia learning |
| | Practice online |
| | Search & view reference |
| | Make notes |
| | Download courseware |
| | Question online |
| | Exchange interaction |
| | Communicate through E-mail |
| | Rest or listen to music |
| | Talk about QQ when learning |
Since online learning is a process that extends over a period of time, the duration of online learning behavior is an important parameter for evaluating the quality of online learning. In particular, it reflects the learner’s degree of focus. The duration set (timeslot) in the student profile is defined as follows:
According to the above definitions,
The student profile has a complete model that guides the analysis of students’ online learning process. The student profile model (Figure
Student profile model.
Data acquisition covers four categories: student user registration data, web log data, learning behavior data, and learning content preference data. The student user registration data is mainly used to analyze the characteristics of the learners and includes user name, sex, date of birth, geography, occupation, and hobbies. The web log data reflects the operation of the E-Learning platform and includes the number of active users, page views, access time, activation rate, and learning path. The learning behavior data supports statistical analysis of online learning performance and includes learning time, learning activities, learning resources, and examination results. The learning content preference data can be used to analyze preferences for courses or teachers and includes browsed or collected content, reviewed content, and interactive content. It helps the platform recommend courses accurately.
Data cleaning preprocesses the original data, removes redundant data, retains the useful data for the analysis, and organizes the data into a standard format. Because the interference of abnormal values often results in data mining distortion [
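The cleaning steps named above (removing redundant data, keeping only usable records, and organizing them into a standard format) could be sketched as follows. This is a minimal sketch, not the system's implementation: the field names `student_id`, `behavior`, and `minutes`, and the rule that a duration is abnormal when it is negative or longer than `max_minutes`, are assumptions made for illustration.

```python
import math
from typing import Dict, List

Record = Dict[str, object]


def clean_records(raw: List[Record], max_minutes: float = 24 * 60) -> List[Record]:
    """Deduplicate, drop incomplete or abnormal rows, and normalize the rest."""
    seen = set()
    cleaned: List[Record] = []
    for row in raw:
        student = row.get("student_id")
        behavior = row.get("behavior")
        minutes = row.get("minutes")
        # Drop rows that lack any field needed for the later analysis.
        if student is None or behavior is None or minutes is None:
            continue
        try:
            minutes = float(minutes)
        except (TypeError, ValueError):
            continue
        # Assumed rule: NaN, negative, or implausibly long durations are abnormal.
        if math.isnan(minutes) or minutes < 0 or minutes > max_minutes:
            continue
        # Normalize into one standard format.
        normalized = {
            "student_id": str(student).strip(),
            "behavior": str(behavior).strip().lower(),
            "minutes": minutes,
        }
        key = (normalized["student_id"], normalized["behavior"], normalized["minutes"])
        if key in seen:  # redundant copy of a row already kept
            continue
        seen.add(key)
        cleaned.append(normalized)
    return cleaned
```

Normalizing before deduplicating means that two rows differing only in case or surrounding whitespace are treated as the same record.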
Attribute induction is the most important step in preprocessing the collected source data. Suppose the original data field to
in which
In this section, we calculate the similarity between the behavior sets of different students. Using the Jaccard coefficient similarity algorithm, we compare the online behavior characteristics and durations of learners, group learners with similar properties into one class, and assign learners with different properties to different classes.
Because the behavioral characteristics of different students are nonnumeric objects, we adopt the Jaccard coefficient to calculate their similarity [
User similarity is defined as
in which
According to similarity calculation, we obtain
It is an upper triangular matrix, where
The Jaccard coefficient algorithm is described in Algorithm
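The similarity computation can be illustrated with a short sketch. The Jaccard coefficient of two sets is the size of their intersection divided by the size of their union. The code below applies it to each pair of students' behavior sets and stores the results in an upper triangular matrix. The function names, the use of behavior names as set elements, the inclusion of the diagonal, and the convention that two empty sets have similarity 0 are assumptions of this sketch, not details taken from the algorithm listing.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard coefficient |a & b| / |a | b|; taken as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similarity_matrix(behaviors: Dict[str, Set[str]]) -> List[List[float]]:
    """Pairwise similarities as an upper triangular matrix (zeros below the diagonal)."""
    students = sorted(behaviors)  # fixed row and column order
    n = len(students)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            matrix[i][j] = jaccard(behaviors[students[i]], behaviors[students[j]])
    return matrix


# Hypothetical behavior sets for three students.
example = {
    "s1": {"text learning", "make notes", "practice online"},
    "s2": {"text learning", "practice online", "question online"},
    "s3": {"rest or listen to music"},
}
sim = similarity_matrix(example)
```

Here `s1` and `s2` share two of four distinct behaviors, so their similarity is 0.5, while `s1` and `s3` share none and have similarity 0. Only the upper triangle is filled because the coefficient is symmetric.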
Based on the result of the Jaccard coefficient algorithm, we can label the learners. Suppose that the calculation considers two dimensions, learning behavior and duration. A student can be labeled “depth learning type” if the learning activity lasts more than 60 minutes. Similarly, a student can be labeled “tasted type” if the learning activity lasts less than 10 minutes. Additionally, based on the frequency of online questions and online training, we use the labels “inquisitive type,” “application type,” and “perseverance type.” The specific labeling method is not detailed here; we only present the Jaccard coefficient algorithm and the labeling idea.
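The two duration thresholds stated above can be written as a small labeling rule. This sketch covers only the duration dimension. It assumes the duration is given in minutes for whatever unit of learning activity is being labeled, and it returns `None` between the two thresholds because the text assigns no duration label there. The frequency-based labels are omitted because their criteria are not specified.

```python
from typing import Optional


def duration_label(minutes: float) -> Optional[str]:
    """Label a learner by learning duration, using the two thresholds from the text."""
    if minutes > 60:
        return "depth learning type"
    if minutes < 10:
        return "tasted type"
    return None  # no duration label is given for 10 to 60 minutes
```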
Students’ learning attitude affects the learning outcome. We collected data from the 18–24-year-old student group and analyzed it statistically, for example, whether a student has a clear learning goal and whether a student has a learning plan. As shown in Figure
Learning attitude analysis.
In the 18–24-year-old group, about 94.3% of learners believe that E-Learning courses are helpful to them. Only 18.47% of learners have clear learning objectives, and 58.6% have clear learning objectives only occasionally. These figures suggest that most students take E-Learning courses without a clear direction. In addition, 45.9% of learners have no learning plan, and 55.73% learn online while doing other things, such as chatting on QQ and listening to music. According to Figure
The learning behavior of online learners is diverse. To a certain extent, the frequency of a learning behavior reflects the attention learners pay to the learning resources. According to the frequency statistics [
Learners’ online learning behavior statistics.
In Figure
MOOCs are a popular kind of E-Learning platform, and the course pass rate is an important indicator of their effectiveness. In view of the low pass rate on MOOCs [
We suppose that the behavioral data were collected from the learners registered in the Data Structures and Algorithm Analysis (DSAA) course during the first 5–7 weeks. After the behavior of unregistered learners is filtered out of the data records, the sample statistics are as shown in Table
Number of samples.
Course | 1–5 weeks | 1–6 weeks | 1–7 weeks |
---|---|---|---|
DSAA | 9401 | 9543 | 9990 |
Define each course having
Predictive value is
in which
We chose course features that affect learners’ study results. From Table
For this course, the data set is randomly divided into a training set, a validation set, and a test set in the ratio 3 : 1 : 1. In each experiment, we fit the model parameters on the training set, select the optimal parameters on the validation set, and then compute the indicators on the test set. We use three classification models: linear discriminant analysis (LDA), logistic regression (LR), and linear support vector machine (LSVM). They are used to predict course results, and the experimental results are shown in Table
Comparison of the forecasting results.
Course | Classifier | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|---|
DSAA | LDA | 99.6 | 50.0 | 88.9 | 64.0 |
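The evaluation protocol can be sketched in two parts: a random 3 : 1 : 1 split, and a check of the reported indicators. The split function below is a minimal sketch. The fixed seed and the rule that the test set absorbs any remainder are assumptions, and the classifiers themselves are not reproduced. The second function assumes that the last value in the LDA row is an F1 score, the harmonic mean of precision and recall, and checks that it agrees with the other two values.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_3_1_1(samples: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle the samples and split them into training, validation, and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability (assumption)
    n = len(shuffled)
    n_train = (3 * n) // 5
    n_val = n // 5
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the test set absorbs any remainder
    return train, val, test


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# 9990 is the 1-7 weeks sample count for DSAA; the integers stand in for samples.
train, val, test = split_3_1_1(range(9990))
# Precision and recall from the LDA row, in percent.
reported_f1 = round(f1_score(50.0, 88.9), 1)
```

With 9990 samples the split yields 5994, 1998, and 1998 items, and the computed F1 rounds to 64.0, which matches the last value in the row.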
The experimental results show that the three classifiers perform consistently and achieve high accuracy. Figure
Sequential variation of DSAA course.
Achievement prediction can help the E-Learning platform discover abnormal situations so that it can intervene and guide students in a timely manner. Because online learners mainly study independently, they are isolated and lack emotional communication, which leaves them without emotional support and makes it difficult to maintain long-term enthusiasm for learning [
According to the E-Learning resources and learners’ behaviors, we can present an evaluation model, supported by the duration, frequency of access, concentration, and other parameters, to evaluate learners’ emotions, as shown in Figure
Emotional evaluation model.
In this paper, we study online learning behavior in depth and build the student profile with big data processing technology. First, we analyze the characteristics of learners and the factors that influence learning behavior, and we use the method of attribute reduction to clean the data. Then, we calculate the similarity of students’ behavior and use the Jaccard coefficient algorithm to classify the students. Finally, we establish the student profile and analyze it visually. We confirm that an E-Learning course requires a definite objective, inner motivation, synchronous feedback, and independence on the part of the learners. The student profile helps students understand their learning situation, find their own problems, and improve the completion rate of online courses. With the continuous accumulation and in-depth development of education data, the student profile can be expected to promote the healthy development of E-Learning. In the future, we will conduct an in-depth study of fragmented knowledge aggregation online.
The authors declare that there are no conflicts of interest regarding the publication of this paper.
This work was supported by the Tianjin University of Science and Technology Youth Innovation Foundation (no. 2016LG28).