Keystroke dynamics based authentication is a prevention mechanism used to protect accounts from criminals' illegal access. In this authentication mechanism, keystroke dynamics capture patterns in a user's typing behavior. Sequence alignment has been shown to be an effective algorithm for keystroke dynamics based authentication: it compares sequences of keystroke data to detect an imposter's anomalous sequences. Previous research used a static divisor for sequence generation from the keystroke data, that is, a fixed number used to divide the time range of the keystroke data into equal-length subintervals. After the division, the subintervals are mapped to alphabet letters to form sequences. One major drawback of the static divisor is that the amount of data available for subinterval generation is often insufficient, which leads to premature termination of subinterval generation and consequently causes inaccurate sequence alignment. To alleviate this problem, we introduce sequence alignment with dynamic divisor (SADD) in this paper. In SADD, we use the mean of Horner's rule technique to generate dynamic divisors and apply them to produce subintervals of different lengths. Comparative experiments with SADD and other existing algorithms indicate that SADD is usually comparable to and often outperforms the other algorithms.
In an era full of electronic services, people want more convenient and faster ways to meet their needs: reading email, searching for information online, transferring files, and paying bills. For example, online file transfer involves storage mechanisms in a cloud system. To use these services, we must register a login ID and password; for online bill payment, we must log in with our ID and password before we can pay or transfer money to other users. However, criminals may discover our login ID and password and use our credentials to commit crimes such as stealing important files or money. Therefore, stronger and more secure authentication mechanisms have to be designed and implemented to prevent these issues.
A considerable number of authentication mechanisms have been introduced. One example is a biometric system [
Keystroke dynamics concerns timing details of people’s typing data [
The common timing details that we can obtain from keystroke dynamics are dwell time and flight time. Dwell time, also known as duration time [
A considerable number of machine learning algorithms have been introduced for keystroke dynamics, such as naïve Bayes [
To the best of our knowledge, no single algorithm has yet become the standard in keystroke dynamics research. However, in Revett's research [
The next section describes sequence alignment algorithm. Section
Sequence alignment is an algorithm that calculates the similarity among two or more sequences [
Keystroke dynamics data are generated in timestamp format (milliseconds). Since timestamp values vary widely and are unbounded, it is inappropriate to apply them to a sequence alignment algorithm directly. Therefore, we have to discretize the timestamps into subintervals, each representing a different category. This process is similar to questionnaire construction: we usually let a user choose among a few options, such as “strongly disagree,” “disagree,” “neither disagree nor agree,” “agree,” and “strongly agree,” and sometimes we shorten the scale to three options (“disagree,” “neither disagree nor agree,” and “agree”). We rarely use more than five or six categories, because too many categories make the responses hard to analyze later. Revett [
We explain the algorithm design of sequence alignment in the following paragraphs. First, we obtain the time interval difference of a feature, for instance, dwell time. This difference is the gap between the maximum and minimum times. The maximum dwell time is the longest time a user takes to press and release a key; the minimum dwell time, on the other hand, is the shortest. The formula is defined by
After we obtain the time interval difference of an attribute, we divide it into twenty equal subintervals. The length of each subinterval is defined by
Once a row of data (entry) is converted to the corresponding alphabet letters, we run the sequence alignment algorithm. One point is scored if the labels match for an attribute; otherwise, no point is scored. The score is described as
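As an illustration, the discretization and match scoring described above can be sketched as follows. This is a minimal sketch; the particular letter alphabet, the clamping of out-of-range values into the last subinterval, and the example timings are our own assumptions, not taken from the paper.

```python
import string

def to_sequence(times, t_min, t_max, bins=20):
    """Split [t_min, t_max] into `bins` equal-length subintervals
    (static divisor) and map each timing to its subinterval's letter."""
    width = (t_max - t_min) / bins           # length of one subinterval
    letters = string.ascii_uppercase[:bins]  # one label per subinterval
    # Clamp values that fall outside the training range into the last bin.
    return "".join(letters[min(int((t - t_min) // width), bins - 1)]
                   for t in times)

def match_score(seq_a, seq_b):
    """One point for every attribute whose labels match, zero otherwise."""
    return sum(a == b for a, b in zip(seq_a, seq_b))

# Hypothetical dwell times (ms) for a 4-key password; range 90-150 ms.
model = to_sequence([105, 130, 98, 142], t_min=90, t_max=150)   # "FNCR"
probe = to_sequence([106, 129, 97, 150], t_min=90, t_max=150)   # "FNCT"
print(match_score(model, probe))  # 3 of 4 attributes match
```

With twenty bins the subinterval length here is (150 − 90)/20 = 3 ms, so two timings land on the same letter whenever they fall in the same 3 ms subinterval.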
[Training-phase pseudocode (steps 1–14); the listing was lost in extraction.]
[Testing-phase pseudocode (steps 1–10); the listing was partially lost in extraction. Recoverable steps: (1) convert the testing data using the values obtained in the training phase, then run steps 9–14 of the training phase once; (9) compute Checking_Score; (10) return Checking_Score.]
In Algorithm
In the testing phase, the algorithm converts the given testing data to alphabet letter format based on the minimum point obtained at step 3 and the range difference from step 7 in Algorithm
After the conversion, for the actual recognition, the algorithm calculates a score (Final_Score) for each row in the model against the test data. This score is the summation of the match scores (score in Algorithm
In this paper, we propose the sequence alignment with dynamic divisor generation (SADD) algorithm. SADD checks the degree of sufficiency of the dataset and then provides a proper divisor instead of the static divisor shown in (
As humans, we are not immediately familiar with a new activity; we have to practice several times to get accustomed to it. For example, consider an athlete who wants to run 100 meters in 10 seconds. This is nearly impossible if she is a beginner; she has to train hard and practice regularly. Her time record from the first day until the day she manages to run 100 meters in 10 seconds could be illustrated as the graph shown in Figure
The time of practicing a new action versus the day of practicing it.
The phenomenon discussed previously (i.e., the realm point) applies to most activities, including the typing speed of a password. Unfortunately, it is difficult to know how many hours, days, weeks, months, or years are needed to reach the realm point of typing speed. Every user takes a different amount of time to reach the realm point, even with all other conditions fixed. We do not know whether the users have reached the realm point at the beginning of the experiment, and real authentication systems do not know this either. In the best case, the collected data cover the period from the first day to the realm point (or beyond it). In the worst case, they cover only the period from the first day to somewhere in the middle, as shown in Figure
The worst case of the data used in the experiment or in real authentication system.
For (
(a) The best case after the data is converted into a sequence. (b) The worst case after the data is converted into a sequence.
Data mapping with the amino acid letters in the worst case. Note that the mapping starts from the middle of the letters.
Now, we explain the proposed algorithm design. Firstly, (
Calculation of mean with Horner’s rule.
From line 11 in Figure
Once we obtain the ratio, we calculate the divisor by
After that, we replace the twenty values from (
Consider the worst case shown in Figure
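Since the exact divisor equations are not reproduced here, the following is only a hedged sketch of the idea: the mean is accumulated incrementally (Horner's rule), and the number of subintervals shrinks when the data cover only part of the expected range. The `ratio`-to-divisor mapping below is our own hypothetical stand-in for the paper's equation, not the published formula.

```python
def horner_mean(values):
    """Running mean computed incrementally (Horner's rule):
    m_k = m_{k-1} + (x_k - m_{k-1}) / k."""
    m = 0.0
    for k, x in enumerate(values, start=1):
        m += (x - m) / k
    return m

def dynamic_divisor(values, t_min, t_max, max_bins=20):
    """Hypothetical sketch: when the data cover only part of the
    [t_min, t_max] range (insufficient data), use fewer, longer
    subintervals instead of the fixed twenty."""
    ratio = (horner_mean(values) - t_min) / (t_max - t_min)
    bins = max(1, min(max_bins, round(2 * ratio * max_bins)))
    return (t_max - t_min) / bins  # length of each subinterval

print(horner_mean([1, 2, 3, 4]))  # 2.5
```

The incremental form matters here because it lets the mean be updated one sample at a time, which is how the degree of sufficiency can be assessed as entries arrive.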
[SADD training-phase pseudocode (steps 1–20); the listing was lost in extraction.]
[SADD testing-phase pseudocode (steps 1–10); the listing was partially lost in extraction. Recoverable steps: (1) convert the testing data using the values obtained in the training phase, then run steps 15–20 of the training phase once; (9) compute Checking_Score; (10) return Checking_Score.]
In the training phase of Algorithm
In keystroke dynamics, there are six common timing elements between two consecutive keystrokes. Hold (H) is the duration (dwell time) of pressing a key: H1 is the holding time of the first key and H2 that of the second key. Up-down (UD) is the duration (flight time) between the key-up of the first key and the key-down of the second key. Down-down (DD) is the duration between the key-downs of the two keys; it is the sum of H1 and UD. Up-up (UU) is the duration between the key-ups of the two keys; it is the sum of UD and H2. Down-up (DU) is the duration between the key-down of the first key and the key-up of the second key; it is the sum of DD and H2.
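These definitions can be written directly as code. The sketch below assumes we have raw key-down (press) and key-up (release) timestamps for two consecutive keys; the function name and timestamp values are our own illustration.

```python
def timing_features(down1, up1, down2, up2):
    """Six common timing elements for two consecutive keystrokes,
    derived from key-down/key-up timestamps (milliseconds)."""
    return {
        "H1": up1 - down1,    # hold (dwell) time of the first key
        "H2": up2 - down2,    # hold (dwell) time of the second key
        "UD": down2 - up1,    # up-down (flight time); negative if keys overlap
        "DD": down2 - down1,  # down-down = H1 + UD
        "UU": up2 - up1,      # up-up     = UD + H2
        "DU": up2 - down1,    # down-up   = DD + H2
    }

f = timing_features(0, 100, 150, 230)
# The sum identities from the text hold by construction:
assert f["DD"] == f["H1"] + f["UD"]
assert f["UU"] == f["UD"] + f["H2"]
assert f["DU"] == f["DD"] + f["H2"]
```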
In our experiment, we use the CMU benchmark dataset [
Assume that Instance number 1 is data collected from a genuine user and used as the model, and Instance number 2 is new data entered by an imposter. Since this is a short example, we omit the procedure for converting the timestamp format into a consensus sequence (label format). As explained above, the sequence alignment algorithm checks element by element. The first element it checks is H1. Based on (
Therefore, in order to discover the effectiveness of the DD elements in authentication, we create another dataset from the benchmark dataset that has no DD elements (by removing all DD attributes from the original dataset). We also want to evaluate how effective it is to use all elements in authentication; hence, we create a further dataset from the benchmark dataset with extra UU and DU elements (by adding UU and DU attributes to the original dataset). The difference between these three datasets is thus the number of attributes used in the experiment: the first dataset (Dataset number 1) consists of 31 attributes, the second (Dataset number 2) of 21 attributes, and the last (Dataset number 3) of 51 attributes.
Each of the three datasets (the benchmark dataset and the two we created) contains 20,400 password-typing entries from the 51 subjects involved in this experiment, with 400 entries per subject. During the training phase, we select one subject as the genuine user and the remaining 50 subjects as imposters. The first 200 entries of the chosen subject are used as training data, while the remaining 200 entries are used as testing data. In addition, the first five entries of each of the remaining 50 subjects are used as testing data, giving 450 entries in total for the testing phase. Based on Killourhy and Maxion [
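The split described above can be sketched as follows. The per-subject grouping and the synthetic entry labels are our own scaffolding; the timing attributes of each entry are omitted.

```python
def split_for_subject(entries_by_subject, genuine):
    """Training: first 200 entries of the genuine subject.
    Testing: the genuine subject's remaining 200 entries plus the
    first 5 entries of every other (imposter) subject -> 450 total."""
    train = entries_by_subject[genuine][:200]
    test = list(entries_by_subject[genuine][200:])
    for subject, entries in entries_by_subject.items():
        if subject != genuine:
            test.extend(entries[:5])
    return train, test

# Synthetic stand-in: 51 subjects with 400 entries each.
data = {s: [f"s{s}_e{i}" for i in range(400)] for s in range(51)}
train, test = split_for_subject(data, genuine=0)
print(len(train), len(test))  # 200 450
```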
The main performance evaluation in our experiment uses receiver operating characteristic (ROC) curves. We compare our algorithm with the other approaches by plotting their ROC curves. Examples of ROC curves are shown in Figure
Examples of ROC curves from any 6 subjects in dataset number 1. Note that D stands for dataset and S stands for subject.
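The equal error rates reported in the tables below can be read off such a ROC-style threshold sweep. The following is a generic sketch, not the paper's implementation, and it assumes higher scores indicate the genuine user:

```python
def equal_error_rate(genuine_scores, imposter_scores):
    """Sweep a decision threshold over all observed scores and return the
    point where the false-rejection rate (genuine scored below threshold)
    and the false-acceptance rate (imposter scored at/above it) are closest."""
    best_gap, eer = float("inf"), None
    for thr in sorted(set(genuine_scores) | set(imposter_scores)):
        frr = sum(s < thr for s in genuine_scores) / len(genuine_scores)
        far = sum(s >= thr for s in imposter_scores) / len(imposter_scores)
        if abs(frr - far) < best_gap:
            best_gap, eer = abs(frr - far), (frr + far) / 2
    return eer

# Perfectly separable scores give an EER of 0.
print(equal_error_rate([0.9, 0.8, 0.7], [0.3, 0.2, 0.1]))  # 0.0
```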
Instance number 1 is the data collected from a genuine user and stored as a model in database and Instance number 2 is the data collected from an imposter.
As mentioned above, we run three different datasets in our experiment: the first is the CMU benchmark dataset, the second is the dataset without the DD elements, and the last is the dataset with extra UU and DU elements. Table
The average of equal error rates with the standard deviation of equal error rate (after plus-minus sign) for six algorithms in three datasets. The highest performance of the algorithm is in bold.
| Algorithm | EER (Dataset number 1) | EER (Dataset number 2) | EER (Dataset number 3) |
|---|---|---|---|
| SADD | — | — | — |
| Sequence alignment | — | — | — |
| Median vector proximity | — | — | — |
| Mahalanobis | — | — | — |
| Manhattan | — | — | — |
| Euclidean | — | — | — |

(The numeric EER ± standard deviation values of this table were lost in extraction and are marked —.)
In order to observe the performance of each algorithm for each subject, we report the area under the curve (AUC) in Tables
The result of AUC for six algorithms in Dataset number 1. The highest value of AUC result is in bold.
| Subject | SADD | Sequence alignment | Median vector proximity | Mahalanobis | Manhattan | Euclidean |
|---|---|---|---|---|---|---|
| 1 | 0.91294 | — | 0.91255 | 0.84844 | 0.81886 | 0.79240 |
| 2 | — | 0.96491 | 0.96794 | 0.88018 | 0.87374 | 0.86002 |
| 3 | — | 0.97690 | 0.95917 | 0.96926 | 0.89572 | 0.84620 |
| 4 | 0.99472 | 0.99249 | — | 0.97216 | 0.96588 | 0.93968 |
| 5 | 0.97759 | 0.97771 | — | 0.96218 | 0.93860 | 0.95596 |
| 6 | 0.98896 | 0.98872 | — | 0.97972 | 0.94190 | 0.92976 |
| 7 | — | 0.99835 | 0.99800 | 0.98944 | 0.99448 | 0.99088 |
| 8 | 0.99395 | 0.99359 | — | 0.96762 | 0.95878 | 0.95658 |
| 9 | 0.99918 | 0.99846 | — | 0.97772 | 0.97190 | 0.93394 |
| 10 | 0.99970 | — | 0.99311 | 0.96102 | 0.96404 | 0.94450 |
| 11 | 0.98905 | — | 0.96766 | 0.93818 | 0.87728 | 0.85674 |
| 12 | 0.89676 | 0.85888 | 0.87452 | — | 0.81641 | 0.82154 |
| 13 | — | 0.99626 | 0.99809 | 0.99184 | 0.98328 | 0.97100 |
| 14 | — | 0.98363 | 0.98269 | 0.92130 | 0.89508 | 0.87682 |
| 15 | 0.99442 | — | 0.99430 | 0.97628 | 0.95916 | 0.93830 |
| 16 | — | 0.95875 | 0.95498 | 0.93130 | 0.81956 | 0.83250 |
| 17 | 0.96347 | 0.96203 | — | 0.94258 | 0.94638 | 0.93536 |
| 18 | 0.97977 | 0.97453 | — | 0.98998 | 0.85226 | 0.63850 |
| 19 | 0.96628 | 0.96097 | 0.99002 | — | 0.91926 | 0.86964 |
| 20 | — | 0.98169 | 0.97971 | 0.95914 | 0.94242 | 0.91234 |
| 21 | 0.98708 | 0.97760 | — | 0.97928 | 0.95504 | 0.94054 |
| 22 | — | 0.98747 | 0.99111 | 0.96448 | 0.90074 | 0.85908 |
| 23 | — | 0.98614 | 0.98325 | 0.98912 | 0.94938 | 0.91076 |
| 24 | 0.98912 | 0.98902 | — | 0.96422 | 0.96178 | 0.92958 |
| 25 | 0.90524 | 0.89650 | — | 0.91256 | 0.92642 | 0.88396 |
| 26 | 0.86154 | 0.85887 | — | 0.84346 | 0.76234 | 0.73782 |
| 27 | 0.91178 | — | 0.90287 | 0.79592 | 0.83082 | 0.84098 |
| 28 | 0.88261 | 0.87920 | 0.89901 | — | 0.81372 | 0.71792 |
| 29 | 0.98055 | — | 0.92300 | 0.89790 | 0.80930 | 0.76686 |
| 30 | — | 0.94897 | 0.93700 | 0.93106 | 0.81390 | 0.80098 |
| 31 | 0.99604 | 0.99028 | 0.99899 | — | 0.98006 | 0.96314 |
| 32 | 0.98061 | — | 0.95802 | 0.94178 | 0.90500 | 0.87796 |
| 33 | 0.97399 | 0.96602 | — | 0.95766 | 0.93116 | 0.90708 |
| 34 | — | 0.98958 | 0.99164 | 0.94522 | 0.92062 | 0.89656 |
| 35 | 0.90270 | 0.86283 | — | 0.87956 | 0.70992 | 0.61772 |
| 36 | 0.98176 | 0.97069 | — | 0.93214 | 0.90228 | 0.87042 |
| 37 | 0.99873 | 0.99707 | — | 0.96974 | 0.98362 | 0.96498 |
| 38 | 0.99488 | 0.98635 | — | 0.99290 | 0.95578 | 0.89508 |
| 39 | 0.97015 | 0.95877 | 0.95811 | — | 0.92448 | 0.89638 |
| 40 | — | 0.96143 | 0.95883 | 0.93078 | 0.89306 | 0.90874 |
| 41 | 0.89218 | — | 0.85444 | 0.80208 | 0.70720 | 0.76250 |
| 42 | — | 0.99112 | 0.99190 | 0.97140 | 0.95162 | 0.93066 |
| 43 | 0.92320 | 0.91166 | — | 0.93964 | 0.55174 | 0.52548 |
| 44 | — | 0.98114 | 0.98077 | 0.95566 | 0.90678 | 0.89028 |
| 45 | 0.96613 | 0.96170 | — | 0.94778 | 0.94792 | 0.95978 |
| 46 | — | 0.99930 | 0.99831 | 0.99286 | 0.99016 | 0.97922 |
| 47 | — | 0.99910 | 0.99904 | 0.99086 | 0.99130 | 0.97598 |
| 48 | — | 0.96864 | 0.97639 | 0.95596 | 0.96432 | 0.96334 |
| 49 | — | 0.99994 | 0.99900 | 0.99504 | 0.99894 | 0.99572 |
| 50 | — | 0.99344 | 0.98778 | 0.94522 | 0.93800 | 0.93418 |
| 51 | — | 0.98566 | 0.98125 | 0.95238 | 0.92136 | 0.93976 |
| Total | — | 49.28132 | 49.43454 | 48.25168 | 46.03375 | 44.84610 |

(Cells marked — were lost in extraction; per the caption, these were the boldfaced, highest values.)
The result of AUC for six algorithms in Dataset number 2. The highest value of AUC result is in bold.
| Subject | SADD | Sequence alignment | Median vector proximity | Mahalanobis | Manhattan | Euclidean |
|---|---|---|---|---|---|---|
| 1 | 0.88624 | — | 0.87691 | 0.84844 | 0.82584 | 0.79356 |
| 2 | 0.95560 | — | 0.93895 | 0.88018 | 0.86560 | 0.85034 |
| 3 | — | 0.97993 | 0.96152 | 0.96926 | 0.90524 | 0.85016 |
| 4 | 0.99222 | 0.99102 | — | 0.97216 | 0.96684 | 0.93098 |
| 5 | 0.94865 | — | 0.94441 | 0.96218 | 0.93528 | 0.95348 |
| 6 | 0.98901 | — | 0.98706 | 0.97972 | 0.95880 | 0.93774 |
| 7 | — | 0.99776 | 0.99636 | 0.98944 | 0.99494 | 0.98914 |
| 8 | — | 0.98971 | 0.98889 | 0.96762 | 0.96574 | 0.96284 |
| 9 | 0.99926 | 0.99839 | — | 0.97772 | 0.98964 | 0.95676 |
| 10 | 0.99645 | — | 0.98651 | 0.96102 | 0.96532 | 0.94166 |
| 11 | 0.98800 | — | 0.95766 | 0.93818 | 0.88172 | 0.85402 |
| 12 | — | 0.87188 | 0.88252 | 0.91010 | 0.82332 | 0.81468 |
| 13 | — | 0.99646 | 0.99836 | 0.99184 | 0.98646 | 0.97188 |
| 14 | — | 0.98281 | 0.98131 | 0.92130 | 0.90544 | 0.88206 |
| 15 | 0.99350 | — | 0.98994 | 0.97628 | 0.96714 | 0.93850 |
| 16 | — | 0.97288 | 0.95340 | 0.93130 | 0.85616 | 0.84848 |
| 17 | 0.95874 | — | 0.94673 | 0.94258 | 0.94676 | 0.93334 |
| 18 | 0.99080 | 0.98800 | — | 0.98998 | 0.90300 | 0.68836 |
| 19 | 0.97484 | 0.97116 | 0.99244 | — | 0.94462 | 0.87930 |
| 20 | — | 0.98143 | 0.97688 | 0.95914 | 0.94462 | 0.91040 |
| 21 | — | 0.98306 | 0.98878 | 0.97928 | 0.96214 | 0.94290 |
| 22 | — | 0.98849 | 0.98830 | 0.96448 | 0.92518 | 0.86614 |
| 23 | — | 0.98886 | 0.97813 | 0.98912 | 0.96066 | 0.91082 |
| 24 | 0.98587 | — | 0.98557 | 0.96422 | 0.96584 | 0.93000 |
| 25 | 0.91843 | 0.90906 | — | 0.91256 | 0.92510 | 0.86990 |
| 26 | 0.84544 | 0.85266 | — | 0.84346 | 0.76024 | 0.73304 |
| 27 | 0.87100 | — | 0.82663 | 0.79592 | 0.81164 | 0.83336 |
| 28 | 0.93937 | 0.93577 | — | 0.93832 | 0.82262 | 0.6955 |
| 29 | 0.97335 | — | 0.91598 | 0.89790 | 0.82654 | 0.77378 |
| 30 | — | 0.94329 | 0.89873 | 0.93106 | 0.81544 | 0.78808 |
| 31 | 0.99886 | 0.99779 | 0.99926 | — | 0.98600 | 0.96808 |
| 32 | 0.97698 | — | 0.94639 | 0.94178 | 0.92584 | 0.88910 |
| 33 | — | 0.97630 | 0.97367 | 0.95766 | 0.93922 | 0.90810 |
| 34 | — | 0.99215 | 0.99290 | 0.94522 | 0.93410 | 0.90136 |
| 35 | 0.95729 | 0.92842 | — | 0.87956 | 0.75252 | 0.61538 |
| 36 | 0.99026 | 0.98381 | — | 0.93214 | 0.92598 | 0.87978 |
| 37 | — | 0.99773 | 0.99888 | 0.96974 | 0.98948 | 0.96940 |
| 38 | 0.99652 | 0.99269 | — | 0.99290 | 0.96702 | 0.90490 |
| 39 | — | 0.97180 | 0.95770 | 0.97590 | 0.94078 | 0.90424 |
| 40 | — | 0.96871 | 0.94501 | 0.93078 | 0.91412 | 0.91472 |
| 41 | 0.83326 | — | 0.78742 | 0.80208 | 0.68962 | 0.74806 |
| 42 | — | 0.99254 | 0.99048 | 0.97140 | 0.95918 | 0.93112 |
| 43 | 0.95977 | 0.94853 | — | 0.93964 | 0.60102 | 0.52690 |
| 44 | — | 0.97880 | 0.97269 | 0.95566 | 0.92728 | 0.89918 |
| 45 | 0.94854 | 0.95723 | 0.94442 | 0.94778 | 0.94986 | — |
| 46 | — | 0.99959 | 0.99874 | 0.99286 | 0.99284 | 0.98012 |
| 47 | — | 0.99902 | 0.99711 | 0.99086 | 0.99500 | 0.97770 |
| 48 | 0.96373 | 0.96001 | 0.94336 | 0.95596 | — | 0.96292 |
| 49 | — | 0.99742 | 0.99736 | 0.99504 | 0.99928 | 0.99722 |
| 50 | — | 0.98757 | 0.97627 | 0.94522 | 0.94132 | 0.93674 |
| 51 | 0.98904 | — | 0.97148 | 0.95238 | 0.92646 | 0.93746 |
| Total | — | 49.47285 | 49.02047 | 48.25168 | 46.53034 | 44.94286 |

(Cells marked — were lost in extraction; per the caption, these were the boldfaced, highest values.)
The result of AUC for six algorithms in Dataset number 3. The highest value of AUC result is in bold.
| Subject | SADD | Sequence alignment | Median vector proximity | Mahalanobis | Manhattan | Euclidean |
|---|---|---|---|---|---|---|
| 1 | 0.91878 | 0.91561 | — | 0.84844 | 0.82134 | 0.79844 |
| 2 | 0.97017 | 0.96301 | — | 0.88018 | 0.88380 | 0.85948 |
| 3 | — | 0.97933 | 0.95722 | 0.96926 | 0.90408 | 0.85802 |
| 4 | 0.99602 | 0.99316 | — | 0.97216 | 0.96790 | 0.94638 |
| 5 | 0.98845 | 0.98843 | — | 0.96218 | 0.95524 | 0.96498 |
| 6 | — | 0.98988 | 0.99043 | 0.97972 | 0.92956 | 0.92516 |
| 7 | — | 0.99884 | 0.99874 | 0.98944 | 0.99544 | 0.99210 |
| 8 | 0.99434 | 0.99422 | — | 0.96762 | 0.95000 | 0.94710 |
| 9 | — | 0.99693 | 0.99721 | 0.97772 | 0.95844 | 0.92222 |
| 10 | 0.99970 | — | 0.99689 | 0.96102 | 0.96640 | 0.94618 |
| 11 | 0.98862 | — | 0.96965 | 0.93818 | 0.87660 | 0.86392 |
| 12 | 0.89050 | 0.86445 | 0.84490 | — | 0.82128 | 0.83304 |
| 13 | — | 0.99702 | 0.99727 | 0.99184 | 0.97992 | 0.96996 |
| 14 | — | 0.98127 | 0.97807 | 0.92130 | 0.88548 | 0.87208 |
| 15 | 0.99006 | 0.99071 | — | 0.97628 | 0.94972 | 0.93500 |
| 16 | — | 0.94369 | 0.94444 | 0.93130 | 0.80648 | 0.83034 |
| 17 | 0.96680 | 0.96742 | — | 0.94258 | 0.95020 | 0.93990 |
| 18 | 0.96225 | 0.95484 | 0.98715 | — | 0.80096 | 0.60398 |
| 19 | 0.95516 | 0.95233 | 0.98263 | — | 0.90200 | 0.86606 |
| 20 | — | 0.98131 | 0.97650 | 0.95914 | 0.94202 | 0.91498 |
| 21 | 0.97955 | 0.97219 | — | 0.97928 | 0.95442 | 0.94144 |
| 22 | 0.99033 | 0.98696 | — | 0.96448 | 0.89646 | 0.86752 |
| 23 | — | 0.98497 | 0.98092 | 0.98912 | 0.94574 | 0.91436 |
| 24 | 0.99158 | 0.99109 | — | 0.96422 | 0.95720 | 0.92892 |
| 25 | 0.87958 | 0.87253 | — | 0.91256 | 0.93030 | 0.89172 |
| 26 | 0.87085 | 0.87509 | — | 0.84346 | 0.77044 | 0.74542 |
| 27 | 0.92324 | — | 0.92630 | 0.79592 | 0.84516 | 0.84584 |
| 28 | 0.86118 | 0.87980 | 0.88506 | — | 0.85035 | 0.77736 |
| 29 | 0.97665 | — | 0.92037 | 0.89790 | 0.80344 | 0.76698 |
| 30 | — | 0.95189 | 0.95176 | 0.93106 | 0.82536 | 0.81706 |
| 31 | 0.98530 | 0.97686 | 0.99673 | — | 0.97476 | 0.95788 |
| 32 | — | 0.97706 | 0.95275 | 0.94178 | 0.88974 | 0.87134 |
| 33 | — | 0.97639 | 0.97776 | 0.95766 | 0.93554 | 0.91100 |
| 34 | — | 0.98795 | 0.98872 | 0.94522 | 0.91766 | 0.90142 |
| 35 | 0.86485 | 0.84880 | — | 0.87956 | 0.73278 | 0.64826 |
| 36 | 0.9760 | 0.96774 | — | 0.93214 | 0.90458 | 0.88226 |
| 37 | 0.99892 | 0.99692 | — | 0.96974 | 0.98150 | 0.96516 |
| 38 | 0.98589 | 0.97523 | — | 0.99290 | 0.94534 | 0.88950 |
| 39 | 0.97196 | 0.96555 | 0.94897 | — | 0.91350 | 0.89376 |
| 40 | — | 0.96767 | 0.96202 | 0.93078 | 0.88326 | 0.90798 |
| 41 | 0.91268 | — | 0.88391 | 0.80208 | 0.72702 | 0.77560 |
| 42 | — | 0.99096 | 0.99086 | 0.97140 | 0.95082 | 0.93314 |
| 43 | 0.89182 | 0.89006 | — | 0.93964 | 0.54252 | 0.54492 |
| 44 | — | 0.98154 | 0.98108 | 0.95566 | 0.89562 | 0.88752 |
| 45 | 0.97411 | 0.97339 | — | 0.94778 | 0.94900 | 0.96054 |
| 46 | — | 0.99942 | 0.99838 | 0.99286 | 0.98820 | 0.97910 |
| 47 | 0.99952 | — | 0.99873 | 0.99086 | 0.98826 | 0.97440 |
| 48 | 0.98405 | 0.97524 | — | 0.95596 | 0.96146 | 0.96468 |
| 49 | — | 0.99994 | 0.99991 | 0.99504 | 0.99734 | 0.99464 |
| 50 | — | 0.99190 | 0.98780 | 0.94522 | 0.93488 | 0.92860 |
| 51 | — | 0.98596 | 0.98243 | 0.95238 | 0.92398 | 0.94170 |
| Total | 49.36191 | 49.21733 | — | 48.25168 | 45.96349 | 44.99934 |

(Cells marked — were lost in extraction; per the caption, these were the boldfaced, highest values.)
Besides step 9 in the testing phase from Algorithms
The average of equal error rates with the standard deviation of equal error rate (after plus-minus sign) by using different statistical metrics in both sequence alignment algorithm and SADD. The highest performance of the algorithm and statistical metric is in bold.
| Statistical metric | SADD (Dataset 1) | SADD (Dataset 2) | SADD (Dataset 3) | Sequence alignment (Dataset 1) | Sequence alignment (Dataset 2) | Sequence alignment (Dataset 3) |
|---|---|---|---|---|---|---|
| Minimum | 0.273 | 0.335 | 0.264 | 0.400 | 0.472 | 0.325 |
| Maximum | 0.083 | 0.092 | 0.084 | 0.088 | 0.098 | 0.086 |
| Mean | — | — | — | 0.084 | 0.079 | 0.087 |
| Median | 0.087 | 0.095 | 0.091 | 0.102 | 0.115 | 0.099 |
| Mode | 0.134 | 0.135 | 0.146 | 0.157 | 0.153 | 0.160 |

(Cells marked — and the ± standard deviations were lost in extraction; per the caption, the lost cells were the boldfaced values.)
In addition, we conduct an experiment with different numbers of bins (numbers of categories) to test the effect of the bin count on our proposed algorithm and on the sequence alignment algorithm with a static divisor. We show the result in Table
The average of equal error rates with the standard deviation of equal error rate (after plus-minus sign) by using different number of bins in both sequence alignment algorithm and SADD. The highest performance of the algorithms in appropriate number of the bins is in bold.
| Bins | SADD (Dataset 1) | SA (Dataset 1) | SADD (Dataset 2) | SA (Dataset 2) | SADD (Dataset 3) | SA (Dataset 3) |
|---|---|---|---|---|---|---|
| 10 | 0.083 | 0.089 | 0.079 | 0.077 | 0.095 | 0.101 |
| 20 | — | 0.084 | — | — | 0.082 | 0.087 |
| 30 | 0.077 | — | 0.082 | 0.083 | — | — |
| 40 | 0.077 | 0.086 | 0.082 | 0.087 | 0.079 | 0.084 |
| 50 | 0.079 | 0.086 | 0.084 | 0.087 | 0.079 | 0.085 |

(Cells marked — and the ± standard deviations were lost in extraction; per the caption, the lost cells were the boldfaced values.)
In Revett’s research [
Al-Jarrah [
Giot and Rosenberger’s [
Syed et al. [
In this paper, we have proposed sequence alignment with dynamic divisor generation (SADD) for user authentication using keystroke dynamics. Based on the experiments we have conducted, our algorithm produces promising results and mostly outperforms previous work. We also empirically show that the dynamic divisor generally outperforms the static divisor. We believe that the dynamic divisor plays an important role in the sequence alignment algorithm because it measures the degree of sufficiency of the dataset (using the mean of Horner's rule) and then provides an appropriate divisor for each attribute. These dynamic divisors help prevent the genuine user's data from digressing out of the legal categories.
Based on Giot and Rosenberger’s [
The authors declare that there is no conflict of interest regarding the publication of this paper.
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (no. NRF-2013R1A1A2013401).