Data mining techniques have numerous applications in malware detection, and classification is one of the most popular of them. In this paper we present a data mining classification approach to detect malware behavior. We evaluate different classification methods for detecting malware based on the features and behavior of each malware sample. A dynamic analysis method is presented for identifying the malware features, together with a program that converts a malware behavior execution history XML file into a suitable WEKA tool input. To illustrate the performance of the approach on training and test data, we apply the proposed methods to a real case study data set using the WEKA tool. The evaluation results demonstrate the feasibility of the proposed data mining approach. They also indicate that the approach detects malware efficiently and that behavioral classification of malware can be useful for detecting malware in a behavioral antivirus.
1. Introduction
Malicious code, known as malware, is one of the serious threats on the Internet [1]. Malware is a malicious application that is deliberately designed to damage networks and computers [2]. Traditional malware detection depends on a signature database [3, 4]. For example, a file can be examined by comparing its bytes against the signature database. If the bytes match a signature, the suspicious file is recognized as malicious [5, 6]. Several studies consider signature-based malware detection less than fully dependable, because it cannot handle dynamic modification of malware behavior and cannot identify hidden malware. In contrast, behavior-based malware detection can observe the real behavior of a malicious file [7, 8].
The objectives of data mining include improving marketing capabilities, detecting irregular patterns, and predicting future outcomes from past experience [9]. These capabilities can be used to identify suspicious programs with destructive content for computer systems, such as viruses, worms, and Trojans [10]. The term malware is used in [11, 12] for such a destructive file. Data mining techniques rely on data sets that contain distinguishing characteristics of malicious files and benign software in order to construct classification methods for malware detection [13, 14].
Because the volume of malware keeps growing, protection against unknown malware is an essential topic in machine-learning-based malware detection. Generally, data mining approaches use both malicious executables and benign software programs, collected in the wild, as their data sets [13, 15, 16]. Data mining algorithms can usually be categorized into two forms: supervised and unsupervised learning. Supervised learning methods, such as classification algorithms, require a labeled data set for training [13, 17]. In contrast, unsupervised learning methods, such as clustering algorithms, attempt to organize data into different clusters [18, 19].
Malware programs are usually classified into categories such as Worm, Virus, Trojan, Spyware, Backdoor, and Rootkit [10, 20–22]. Traditional approaches to identifying malware are based on signature techniques. In recent years, the failure of these older methods to detect unrecognized or polymorphic malicious files has frustrated researchers, who have attempted to develop more dependable approaches that detect malware by its behavior [23]. Malware is detected with two types of analysis: static analysis and dynamic analysis. Analyzing software without running its code is called static analysis; it can detect malicious code and assign it to one of the available categories using different learning methods [24]. In static analysis, malicious files are detected based on their binary code. The main disadvantage of static analysis is that the source code of the program is usually unavailable. It is worth noting that extracting features from binary code is a relatively complex task.
In contrast, dynamic analysis detects malicious code according to its runtime behavior [10]. Analyzing code at runtime is called dynamic analysis, also known as behavior analysis; it observes the behavior of a program and its interaction with the system [23]. Dynamic analysis needs to execute the infected files in a virtual machine [21]. It can be combined with classification and clustering methods to cope with the increasing volume and variety of malware. Malware classification methods help to assign unknown malware to recognized families [7, 20]. Malware classification is therefore used to filter unknown cases, which decreases the cost of analysis [8, 25–29].
The contributions of this paper are as follows:
Proposing a behavioral analysis mechanism for malware detection.
Presenting a converter program for transforming a malware behavior execution history XML file into a suitable WEKA input.
Discussing some classification methods on a real case study of malware.
Comparing experimental results such as Correctly Classified Instances, mean absolute error, and true optimistic ratio on the real data set using the WEKA tool.
Testing the best classification method based on the important features in malware detection in order to develop a behavioral antivirus.
The rest of this paper is organized as follows. Section 2 discusses background and related work on malware detection and data mining techniques. Section 3 describes the malware behavioral analysis. In that section we propose a new approach for analyzing malware behavior and translating malicious files into data mining files, using a real case study. The section also describes the classification and prediction approaches on the data mining platform, and applies some popular classification methods to our real case study using the WEKA tool. The evaluation and experimental results are reported in Section 4. Section 5 concludes the discussion and outlines future work.
2. Related Works
This section gives a brief background and discusses related work on malware detection with data mining methods. We first briefly review data mining approaches based on classification methods, both for malware and for other systems. Recently, several researchers have presented different approaches to malware analysis. Schultz et al. [30] proposed a data mining method to recognize new malicious files at runtime. Their method was based on three types of DLL features: the list of DLLs used by the binary, the list of DLL function calls, and the number of different system calls used within each DLL. They also examined byte sequences extracted from the hex-dump (a hexadecimal representation of computer data) of an executable file using signature methods. The method is built mainly on the Naive-Bayes (NB) algorithm. They compared their experimental results with traditional signature-based methods.
Kolter and Maloof [31] presented a data mining approach with n-gram analysis to identify malicious executable files based on a signature approach. They used a hex-dump utility to translate each executable file into hexadecimal code in ASCII format. Their main data set consisted of clean programs and malicious programs. They evaluated the approach with several popular classification methods: instance-based learner, TFIDF, Naive-Bayes, support vector machines, decision tree, boosted Naive-Bayes, and boosted decision tree. In other research, Siddiqui et al. [32] proposed data mining techniques for recognizing malware programs such as Worms. Their approach considered variable-length instruction sequences. Their main data set included Windows files and Worms. In their experiments, sequence reduction removed 97% of the sequences, and the random forest decision tree model performed slightly better than the others.
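The n-gram analysis over a hex-dump mentioned above can be illustrated with a short Python sketch. This is only a minimal illustration under our own assumptions, not the procedure of [31]: the function name `byte_ngrams` and the sample bytes are hypothetical, and feature selection and classifier training are omitted.

```python
from collections import Counter


def byte_ngrams(data, n):
    """Count overlapping n-grams of bytes, written as hex tokens."""
    hex_text = data.hex()
    # Split the hex dump into two-character tokens, one per byte.
    tokens = [hex_text[i:i + 2] for i in range(0, len(hex_text), 2)]
    # Slide a window of n tokens over the byte sequence.
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


# Hypothetical five-byte fragment used only for illustration.
counts = byte_ngrams(b"\x4d\x5a\x90\x4d\x5a", 2)
for gram, freq in sorted(counts.items()):
    print(gram, freq)
```

Each distinct n-gram can then serve as one candidate feature of an executable.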
Other research has applied data mining methodologies to different domains. For example, in [33] the researchers presented various data mining methods developed for cancer diagnosis. That research focused on using clinical information that can be obtained without surgery as a substitute for the pathology report. The authors explored the association between the clinical information and the pathology report in order to support lung cancer pathologic staging diagnosis using data mining techniques. In other research [1, 34], the authors proposed a data mining approach to analyze student careers. Their approach is based on clustering and sequential methods, with the aim of finding strategies for improving exam scheduling and student performance. They analyzed a real case study using K-means clustering in the WEKA tool. Likewise, [26] presented a new data mining method for detecting phishing websites using an associative classification method called multilabel classifier, which generates rules with multiple labels. They analyzed the experimental results with various patterns in the WEKA software. The researchers in [35] analyzed several decision tree models for classifying patients in hospital surveillance data as a real case study. Their experimental results showed that the approach led to a more even distribution of instances across classes. Other related work [36] used a neuro-fuzzy data mining approach for classification with generalized bell-shaped membership functions. The technique was applied to ten standard real data sets from the UCI machine learning repository and evaluated with the Kappa statistic; it was simulated in MATLAB. Some research has also focused on host behavior classification methods [37–40].
For example, [29] presented a novel supervised discretization technique for analyzing multivariate time series, which uses frequent temporal patterns as classification features and is geared toward improving classification accuracy. That paper used a temporal abstraction classification approach and time interval mining for multivariate time series. Also, [38] presented a novel mechanism based on Artificial Neural Networks (ANN) for discovering computer Worms from behavioral computer events. Using estimates of different parameters of the infected computers, the ANN, decision tree, and K-nearest neighbors classification techniques were compared. In [41], the authors presented a mechanism based on extracted computer measurements for identifying unknown computer Worm activity in the operating system using support vector machines. That paper ran a series of experiments to test the technique under several computer configurations and activities.
To the best of our knowledge, no existing approach analyzes malware behavior directly on a data mining platform, and no existing approach converts a malware behavior execution history XML file into a suitable WEKA tool input. To fill this gap, we present a new approach that translates a malicious file to the data mining platform. We then evaluate our approach with several classification methods based on malware behavior. The approach can serve as the basis of a behavioral antivirus.
3. Malware Behavior Analysis
In this section, we propose a malware behavioral analysis mechanism, shown in Figure 1. In this mechanism, an XML file containing the execution history of malware behavior is converted to a nonsparse matrix using our suggested application, which is written in VB.Net. Figure 2 shows a snapshot of the conversion of XML to a nonsparse matrix with this application. Converting each XML file to a suitable WEKA input involves two elements: the number of calls to library files that are attacked by the malware and their volume. For example, in Box 1 the library file ntdll.dll has been called 16 times by the malware which are between (0,2). We then translate this matrix to a WEKA input data set. Training is carried out with several classification algorithms, and the classifier with the best performance is chosen for testing on a new malware data set. Finally, this procedure can be used for developing a behavioral antivirus. To describe the behavioral model of malware, we download the XML files, which are available from PIL (http://dws.informatik.uni-mannheim.de) [38–40]. We use 7155 XML files, split into data set 1 and data set 2. Data set 1 contains 4024 XML files and data set 2 contains 3131 XML files. Data set 1 has 89 properties and data set 2 has 91 properties for each malware.
Box 1: A sample part of an XML file containing a malware behavior.
<?xml version="1.0" ?>
<!--
This analysis was created by CWSandbox (c) CWSE GmbH/Sunbelt Software
Figure 1: The behavioral analysis of malware detection mechanism.
Figure 2: A snapshot of XML convertor to nonsparse matrix.
Then, we convert this XML file to a nonsparse matrix using our suggested program. Each entry of the nonsparse matrix consists of two numbers: the first identifies the property and the second gives its importance. The first row of this matrix is shown as follows:
The last entry of this row, 88 T1, indicates the kind of malware.
Finally, we analyze the execution history of the malware in the WEKA environment. The execution history can be produced by tools such as a SandBox tool and a virtual machine, which allow malware to be executed safely and prevent it from spreading [28, 38–41]. The XML file includes useful information such as system library file calls; creation, search, and modification of files; registry modification; main process information; creation of mutexes (a mutex is an application object that permits multiple program threads to share the same resource); virtual memory modification; email sending; registry operations; and switch communications. The suggested program reads all of this information and saves it as a nonsparse matrix.
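The conversion step described above can be sketched in Python. This is a minimal illustration, not the authors' VB.Net converter: the element and attribute names (`load_dll`, `dll`, `create_mutex`, `send_email`), the sample report, and the feature list are hypothetical, and a real CWSandbox report may use a different schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical report layout, assumed only for this sketch.
SAMPLE_REPORT = """<?xml version="1.0" ?>
<analysis>
  <process>
    <load_dll dll="ntdll.dll"/>
    <load_dll dll="kernel32.dll"/>
    <load_dll dll="ntdll.dll"/>
    <create_mutex name="m1"/>
  </process>
</analysis>
"""


def count_behaviors(xml_text):
    """Count library loads per DLL and other operations per tag name."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for element in root.iter():
        if element.tag == "load_dll":
            counts[element.get("dll", "unknown")] += 1
        elif element.tag not in ("analysis", "process"):
            counts[element.tag] += 1
    return counts


def to_row(counts, feature_names, label):
    """Build one dense row: a count for every feature, then the class label."""
    return [counts.get(name, 0) for name in feature_names] + [label]


# Hypothetical feature order shared by all rows of the matrix.
FEATURES = ["ntdll.dll", "kernel32.dll", "create_mutex", "send_email"]
row = to_row(count_behaviors(SAMPLE_REPORT), FEATURES, "T1")
print(row)
```

Because every row lists a value for every feature, including zeros, the resulting matrix is dense (nonsparse), which maps directly onto one WEKA instance per malware sample.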
The matrix is then converted to the standard WEKA input format, an .arff file, for data set 1 and data set 2. This standard form is shown in Box 2.
Box 2: An example of standard form for WEKA input.
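As a minimal sketch of what an ARFF file of this kind looks like, the following Python function writes a dense matrix in ARFF syntax (`@relation`, `@attribute`, `@data`). The relation name, attribute names, class labels, and values are hypothetical and chosen only for illustration; they are not taken from the data sets of this paper.

```python
def to_arff(relation, feature_names, class_values, rows):
    """Render rows of numeric features plus a nominal class as ARFF text."""
    lines = ["@relation " + relation, ""]
    for name in feature_names:
        # Quote attribute names so that dots or spaces are safe.
        lines.append("@attribute '%s' numeric" % name)
    lines.append("@attribute class {%s}" % ",".join(class_values))
    lines.append("")
    lines.append("@data")
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical two-sample data set with two features and two classes.
text = to_arff(
    "malware",
    ["ntdll.dll", "kernel32.dll"],
    ["T1", "T2"],
    [[16, 2, "T1"], [0, 5, "T2"]],
)
print(text)
```

The class attribute is declared last, which is the position WEKA conventionally treats as the class when a file is opened in the Explorer.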
This section describes the classification methods applied to two real case studies, data set 1 and data set 2. We first analyze the data mining results on data set 1 and data set 2 with the WEKA classification algorithms. To characterize the performance of classification methods in WEKA, we briefly describe some useful measures [27]. The Correctly Classified Instances (CCI) value is the percentage of test cases that were correctly classified. The Incorrectly Classified Instances (ICI) value is the percentage of test cases that were incorrectly classified.
The relative absolute error (RAE) is measured relative to the error of a simple predictor that always outputs the average of the actual values. In the RAE, the error is the total absolute error rather than the total squared error.
Definition 1.
A relative absolute error is a 3-tuple $RAE_i = (F_{i,j}, V_j, \bar{T})$ in formula (1), where $F_{i,j}$ is the value predicted by the individual program $i$ for sample case $j$ (out of $k$ sample cases), $V_j$ is the objective value for sample case $j$, and $\bar{T}$ is the mean of the objective values given by formula (2):
$$RAE_i = \frac{\sum_{j=1}^{k} \lvert F_{i,j} - V_j \rvert}{\sum_{j=1}^{k} \lvert V_j - \bar{T} \rvert}, \tag{1}$$
$$\bar{T} = \frac{1}{k} \sum_{j=1}^{k} V_j. \tag{2}$$
The mean absolute error (MAE) measures the average magnitude of the errors in a set of predictions, without considering their direction. It expresses the accuracy of predictions for continuous variables as the average of the absolute differences between each forecast and the corresponding observation. The MAE is a linear score, which means that all the individual differences are weighted equally in the average [42–44].
Definition 2.
A mean absolute error is a 2-tuple $MAE_i = (P_j, T_j)$ in formula (3), where $P_j$ is the predicted value and $T_j$ is the true value for sample case $j$. This measure gives the average error in the classification procedure:
$$MAE_i = \frac{1}{k} \sum_{j=1}^{k} \lvert P_j - T_j \rvert. \tag{3}$$
We can also measure classifier proficiency using the true optimistic ratio (TOR) in formula (4), where $N_C$ is the number of correctly detected malware programs and $N_I$ is the number of incorrectly detected malware programs. The TOR reflects the cost of the estimated classification, which is significant for setting the cost of malware classification [45]:
$$TOR = \frac{N_C}{N_C + N_I}. \tag{4}$$
There are also two error rates for measuring classification performance. The False Acceptance Rate (FAR) is the ratio of the number of test cases that are incorrectly accepted by a given model to the total number of cases; that is, it shows the percentage of invalid inputs that are incorrectly accepted. The False Rejection Rate (FRR) is the ratio of the number of test cases that are incorrectly rejected by a given model to the total number of cases; that is, it shows the percentage of valid inputs that are incorrectly rejected [46]. Using these factors we can calculate the Total Error Rate (TER) as follows [47]:
$$TER = \frac{FAR + FRR}{N_C + N_I}. \tag{5}$$
In the classification process, we use the NaiveBayes, BayesNet, IB1, J48, and classification via regression algorithms. NaiveBayes and BayesNet are probabilistic supervised learning algorithms that require only a small amount of training data to estimate their parameters. The IB1 algorithm is based on lazy learning. The J48 algorithm is based on decision tree methods. Finally, the classification via regression algorithm is based on the Meta approach, a newer approach in data mining methods. Regression analysis is a statistical method for data analysis, and it is usually applied together with correlation analysis.
Correlation analysis evaluates the degree of association between two quantitative data sets [37]. For example, Figure 3 shows the classification result of the NaiveBayes algorithm in the WEKA tool. The following section describes the experimental results of the classification algorithms in WEKA. Measures such as Correctly Classified Instances, Incorrectly Classified Instances, mean absolute error, and relative absolute error are compared in order to find the best classification algorithm for developing a behavioral antivirus.
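The error measures of Definitions 1 and 2 can be written as a short Python sketch. It assumes the usual reading of the formulas with absolute differences; the function names and the four-sample toy data are ours and are used only for illustration.

```python
def relative_absolute_error(predicted, actual):
    """RAE: total absolute error divided by that of the mean predictor."""
    mean_actual = sum(actual) / len(actual)
    numerator = sum(abs(f - v) for f, v in zip(predicted, actual))
    denominator = sum(abs(v - mean_actual) for v in actual)
    return numerator / denominator


def mean_absolute_error(predicted, actual):
    """MAE: average absolute difference between prediction and truth."""
    total = sum(abs(p - t) for p, t in zip(predicted, actual))
    return total / len(actual)


# Toy example: one of four predictions is wrong.
predicted = [1.0, 0.0, 1.0, 1.0]
actual = [1.0, 0.0, 0.0, 1.0]
rae = relative_absolute_error(predicted, actual)
mae = mean_absolute_error(predicted, actual)
print(rae, mae)
```

In this toy case the mean of the actual values is 0.5, so the mean predictor has a total absolute error of 2, the classifier has a total absolute error of 1, and the RAE is 0.5 while the MAE is 0.25.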
Figure 3: The snapshot of NaiveBayes classification algorithm in WEKA.
4. Experimental Results and Discussion
In this section, we implement our approach using the WEKA tool. We use a system with an Intel Core i3 2.13 GHz CPU and 4 GB RAM for the classification methods. The analysis is carried out with classification algorithms such as NaiveBayes, BayesNet, IB1, J48, Support Vector Machine (SVM), and the logistic regression method. We compare the performance of the classification methods on two malware data sets.
Tables 1 and 2 give the statistical analysis of data sets 1 and 2 for the classification methods. The factors compared are Correctly Classified Instances, Incorrectly Classified Instances, Kappa statistic, mean absolute error, relative absolute error, root mean squared error, and root relative squared error. This comparison shows that the classification via regression method has the best performance in malware detection. For example, in data set 1, 3051 of the 4024 malware programs are correctly classified, and in data set 2, 3069 of the 3131 malware programs are correctly classified.
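The relation between the counts and percentages of classified instances, and the Kappa statistic, can be sketched in Python. This is a minimal illustration under stated assumptions: the function names are ours, the Kappa computation follows the standard Cohen's kappa definition on a confusion matrix, and the 2-by-2 confusion matrix is a toy example rather than data from this study.

```python
def classified_percentages(n_correct, n_total):
    """Return (CCI %, ICI %) rounded to four decimals."""
    cci = 100.0 * n_correct / n_total
    return round(cci, 4), round(100.0 - cci, 4)


def kappa_statistic(confusion):
    """Cohen's kappa for a square confusion matrix (rows: actual class)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1.0 - expected)


# Counts reported for the regression method on data set 1.
print(classified_percentages(3051, 4024))

# Toy confusion matrix, assumed only for illustration.
print(kappa_statistic([[20, 5], [10, 15]]))
```

For the regression method on data set 1, 3051 correct instances out of 4024 correspond to 75.8201% correctly classified and 24.1799% incorrectly classified instances.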
Table 1: The statistical analysis of data set 1 for specified classification methods.

| Algorithm | Correctly Classified Instances (number, %) | Incorrectly Classified Instances (number, %) | Mean absolute error | Relative absolute error | Kappa statistic | Root mean squared error | Root relative squared error | Total number of instances |
|---|---|---|---|---|---|---|---|---|
| NaiveBayes | 1107, 27.5099% | 2917, 72.4901% | 0.0069 | 90.0871% | 0.2526 | 0.0754 | 122.8107% | 4024 |
| BayesNet | 2662, 66.1531% | 1362, 33.8469% | 0.0032 | 42.4047% | 0.5979 | 0.0479 | 78.1282% | 4024 |
| IB1 | 2802, 69.6322% | 1222, 30.3678% | 0.0028 | 37.2325% | 0.6199 | 0.0533 | 86.8274% | 4024 |
| J48 | 2908, 72.2664% | 1116, 27.7336% | 0.0032 | 41.6312% | 0.6379 | 0.0454 | 73.9957% | 4024 |
| Regression | 3051, 75.8201% | 973, 24.1799% | 0.0011 | 21.0201% | 0.6859 | 0.0392 | 63.9686% | 4024 |
| SVM | 2251, 64.1571% | 1773, 35.8429% | 0.0039 | 42.0019% | 0.5743 | 0.4758 | 84.9596% | 4024 |
Table 2: The statistical analysis of data set 2 for specified classification methods.

| Algorithm | Correctly Classified Instances (number, %) | Incorrectly Classified Instances (number, %) | Mean absolute error | Relative absolute error | Kappa statistic | Root mean squared error | Root relative squared error | Total number of instances |
|---|---|---|---|---|---|---|---|---|
| NaiveBayes | 2678, 85.5318% | 453, 14.4682% | 0.012 | 15.3329% | 0.8459 | 0.1026 | 51.8792% | 3131 |
| BayesNet | 2874, 91.7918% | 257, 8.2082% | 0.0073 | 9.3575% | 0.9127 | 0.0747 | 37.7504% | 3131 |
| IB1 | 3028, 96.7103% | 103, 3.2897% | 0.0027 | 3.5032% | 0.965 | 0.0524 | 26.472% | 3131 |
| J48 | 3008, 96.0715% | 123, 3.9285% | 0.0043 | 5.5353% | 0.9581 | 0.0527 | 26.652% | 3131 |
| Regression | 3069, 98.321% | 62, 1.679% | 0.0021 | 2.2102% | 0.9578 | 0.0543 | 27.4333% | 3131 |
| SVM | 1698, 54.2319% | 1433, 45.7681% | 0.0046 | 5.7993% | 0.5011 | 0.1942 | 98.1954% | 3131 |
According to Tables 1 and 2, the percentage of Correctly Classified Instances of the logistic regression algorithm is higher than that of the other classification methods in both data sets. Likewise, its percentage of Incorrectly Classified Instances is lower than that of the other classification methods in both data sets.
After the data mining process, we test new malware cases with the regression classification algorithm. We downloaded 100 binary malware programs from NetLux (http://vxheaven.org/), analyzed their behavior with the CW-Sandbox tool, and obtained their XML files [38]. We then added these 100 malware programs to the new data set and computed the quality of their classification as the true optimistic ratio. As expected, classification via regression detected 88 of the malware programs, so classification via regression can be used to develop a behavioral antivirus.
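As a worked instance of formula (4), and assuming that the 12 samples not detected by classification via regression all count as incorrectly detected ($N_I = 12$, an assumption that the text does not state explicitly), the true optimistic ratio for this test would be

$$TOR = \frac{N_C}{N_C + N_I} = \frac{88}{88 + 12} = 0.88.$$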
Figure 4 depicts the true optimistic ratio percentage for malware detection in the new data set. The true optimistic ratio of the regression method is higher than that of the other classification methods.
Figure 4: The true optimistic ratio for the classifications test in the new data set.
For the new case study with 100 malware programs, Table 3 reports the number of False Acceptance Rate (FAR) cases and the number of False Rejection Rate (FRR) cases. Platforms such as STAC (http://tec.citius.usc.es/stac/) [48] are available for statistical comparison of the tested algorithms, but we use the WEKA tool for the statistical and experimental results on our data sets.
Table 3: The statistical analysis of the FAR and FRR number of cases in the new test case study.

| Algorithm | Number of FAR cases | Number of FRR cases | Total number of instances |
|---|---|---|---|
| NaiveBayes | 5 | 6 | 100 |
| BayesNet | 4 | 2 | 100 |
| IB1 | 2 | 1 | 100 |
| J48 | 3 | 2 | 100 |
| Regression | 1 | 0 | 100 |
| SVM | 3 | 2 | 100 |
According to Table 3, the regression method does not incorrectly reject any valid input, whereas the NaiveBayes method incorrectly rejects 6 valid inputs.
In this test case, the regression method produces one false acceptance, that is, one input incorrectly accepted as malware. Figure 5 shows the Total Error Rate (TER) for our new test case.
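As a worked instance of formula (5) for the regression row of Table 3, taking $N_C + N_I$ to be the 100 test instances (an assumption about how the denominator is counted):

$$TER = \frac{FAR + FRR}{N_C + N_I} = \frac{1 + 0}{100} = 0.01.$$

Under the same assumption, the NaiveBayes row of Table 3 gives $(5 + 6)/100 = 0.11$.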
Figure 5: The Total Error Rate (TER) for the classifications test in the new data set.
5. Conclusion and Future Work
In this paper, we proposed a new data mining approach based on classification methodologies for detecting malware behavior. First, a malware behavior execution history XML file is converted to a nonsparse matrix using our suggested application. This matrix is then translated to a WEKA input data set. To illustrate the performance of the approach, we applied it to a real case study data set using the WEKA tool. Training was carried out with classification algorithms such as NaiveBayes, BayesNet, IB1, J48, and regression. The regression classification method had the best performance for malware detection, and we also used it to analyze a new data set. The evaluation results demonstrate the feasibility of the proposed data mining approach and indicate that it detects malware efficiently. Based on the experimental results, classification of malware behavioral features can be a convenient method for developing a behavioral antivirus. In future work, we will try to develop and analyze a real behavioral antivirus platform based on the classification via regression algorithm.
Competing Interests
The authors declare that they have no competing interests.
References
[1] Ekta Gandotra D. B., Sofat S. Malware analysis and classification: a survey. 2014; 5: 56–64.
[2] Wang P., Wang Y.-S. Malware behavioural detection and vaccine development by using a support vector model classifier. 2015; 81(6): 1012–1026. doi:10.1016/j.jcss.2014.12.014.
[3] Ollmann G. The evolution of commercial malware development kits and colour-by-numbers custom malware. 2008; 2008(9): 4–7. doi:10.1016/S1361-3723(08)70135-0.
[4] Ghiasi M., Sami A., Salehi Z. Dynamic VSA: a framework for malware detection based on register contents. 2015; 44: 111–122. doi:10.1016/j.engappai.2015.05.008.
[5] Bruschi D., Martignoni L., Monga M. Detecting self-mutating malware using control-flow graph matching. In: Büschkes R., Laskov P. (eds.). 2006; vol. 4064. Berlin, Germany: Springer; 129–143.
[6] Chouchane M. R., Lakhotia A. Using engine signature to detect metamorphic malware. Proceedings of the 4th ACM Workshop on Recurring Malcode; 2006; Alexandria, Va, USA.
[7] Kuzurin N., Shokurov A., Varnovsky N., Zakharov V. On the concept of software obfuscation in computer security. In: Garay J., Lenstra A., Mambo M., Peralta R. (eds.). 2007; vol. 4779. Berlin, Germany: Springer; 281–298.
[8] Christodorescu M., Jha S. Testing malware detectors. 2004; 29(4): 34–44. doi:10.1145/1013886.1007518.
[9] Mehedy Masud L. K., Thuraisingham B. 2012; 1st edition. CRC Press.
[10] Egele M., Scholte T., Kirda E., Kruegel C. A survey on automated dynamic malware-analysis techniques and tools. 2008; 44: 1–42.
[11] Monire Norouzi S. P., Mahjur A. A new approach for formal behavioral modeling of protection services in antivirus systems. 2014; 4: 57–67.
[12] Safarkhanlou A., Souri A., Norouzi M., Sardroud S. E. H. Formalizing and verification of an antivirus protection service using model checking. 2015; 57: 1324–1331.
[13] Santos I., Brezo F., Ugarte-Pedrero X., Bringas P. G. Opcode sequences as representation of executables for data-mining-based unknown malware detection. 2013; 231: 64–82. doi:10.1016/j.ins.2011.08.020.
[14] Abdelhamid N., Ayesh A., Thabtah F. Phishing detection based associative classification data mining. 2014; 41(13): 5948–5959. doi:10.1016/j.eswa.2014.03.019.
[15] Jacob G., Debar H., Filiol E. Behavioral detection of malware: from a survey towards an established taxonomy. 2008; 4(3): 251–266. doi:10.1007/s11416-008-0086-0.
[16] Shabtai A., Moskovitch R., Elovici Y., Glezer C. Detection of malicious code by applying machine learning classifiers on static features: a state-of-the-art survey. 2009; 14(1): 16–29. doi:10.1016/j.istr.2009.03.003.
[17] Kotsiantis S. B. Supervised machine learning: a review of classification techniques. Proceedings of the Conference on Emerging Artificial Intelligence Applications in Computer Engineering: Real Word AI Systems with Applications in eHealth, HCI, Information Retrieval and Pervasive Technologies; 2007; IOS Press; 3–24.
[18] Jain A. K., Murty M. N., Flynn P. J. Data clustering: a review. 1999; 31(3): 264–323. doi:10.1145/331499.331504.
[19] Chen L., Jiang Q., Wang S. Model-based method for projective clustering. 2012; 24(7): 1291–1305. doi:10.1109/TKDE.2010.256.
[20] Ravi C., Manoharan R. Malware detection using Windows Api sequence and machine learning. 2012; 43(17): 12–16. doi:10.5120/6194-8715.
[21] Rizwan R., Hazarika G. C., Chetia G. Malware threats and mitigation strategies: a survey. 2011; 29(2): 69–73.
[22] Mathur K., Saroj H. A survey on techniques in detection and analyzing malware executables. 2012; 4: 42.
[23] Doherty N. F., Anastasakis L., Fulford H. The information security policy unpacked: a critical study of the content of university policies. 2009; 29(6): 449–457. doi:10.1016/j.ijinfomgt.2009.05.003.
[24] Tahan G., Rokach L., Shahar Y. Automatic malware detection using common segment analysis and meta-features. 2012; 13: 949–979.
[25] Bailey M., Oberheide J., Andersen J., Mao Z. M., Jahanian F., Nazario J. Automated classification and analysis of internet malware. In: Kruegel C., Lippmann R., Clark A. (eds.). 2007; vol. 4637. Berlin, Germany: Springer; 178–197.
[26] Bayer U., Moser A., Kruegel C., Kirda E. Dynamic analysis of malicious code. 2006; 2(1): 67–77. doi:10.1007/s11416-006-0012-2.
[27] Kolter J. Z., Maloof M. A. Learning to detect and classify malicious executables in the wild. 2006; 7: 2721–2744.
[28] Trinius P., Willems C., Holz T., Rieck K. 2009.
[29] Moskovitch R., Shahar Y. Classification-driven temporal discretization of multivariate time series. 2015; 29(4): 871–913. doi:10.1007/s10618-014-0380-z.
[30] Schultz M. G., Eskin E., Zadok E., Stolfo S. J. Data mining methods for detection of new malicious executables. Proceedings of the IEEE Symposium on Security and Privacy, S&P; 2001; Oakland, Calif, USA; 38–49. doi:10.1109/SECPRI.2001.924286.
[31] Kolter J. Z., Maloof M. A. Learning to detect malicious executables in the wild. Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '04); August 2004; Seattle, Wash, USA; ACM; 470–478.
[32] Siddiqui M., Wang M. C., Lee J. Detecting internet worms using data mining techniques. 2008; 6: 48–53.
[33] Yang H., Chen Y.-P. P. Data mining in lung cancer pathologic staging diagnosis: correlation between clinical and pathology information. 2015; 42(15-16): 6168–6176. doi:10.1016/j.eswa.2015.03.019.
[34] Campagni R., Merlini D., Sprugnoli R., Verri M. C. Data mining models for student careers. 2015; 42(13): 5508–5521. doi:10.1016/j.eswa.2015.02.052.
[35] Rahman R. M., Md Hasan F. R. Using and comparing different decision tree classification techniques for mining ICDDR,B Hospital Surveillance data. 2011; 38(9): 11421–11436. doi:10.1016/j.eswa.2011.03.015.
[36] Ghosh S., Biswas S., Sarkar D., Sarkar P. P. A novel Neuro-fuzzy classification technique for data mining. 2014; 15(3): 129–147. doi:10.1016/j.eij.2014.08.001.
[37] Moskovitch R., Shahar Y. Fast time intervals mining using the transitivity of temporal relations. 2015; 42(1): 21–48. doi:10.1007/s10115-013-0707-x.
[38] Stopel D., Boger Z., Moskovitch R., Shahar Y., Elovici Y. Application of artificial neural networks techniques to computer worm detection. Proceedings of the International Joint Conference on Neural Networks (IJCNN '06); July 2006; 2362–2369.
[39] Stopel D., Moskovitch R., Boger Z., Shahar Y., Elovici Y. Using artificial neural networks to detect unknown computer worms. 2009; 18(7): 663–674. doi:10.1007/s00521-009-0238-2.
[40] Moskovitch R., Gus I., Pluderman S., Stopel D., Feher C., Glezer C., Shahar Y., Elovici Y. Detection of unknown computer worms activity based on computer behavior using data mining. Proceedings of the 1st IEEE Symposium on Computational Intelligence and Data Mining (CIDM '07); April 2007; Honolulu, Hawaii, USA; IEEE; 202–209. doi:10.1109/cidm.2007.368873.
[41] Nissim N., Moskovitch R., Rokach L., Elovici Y. Detecting unknown computer worm activity via support vector machines and active learning. 2012; 15(4): 459–475. doi:10.1007/s10044-012-0296-4.
[42] Karthik N., Arul R., Prasad M. J. H. Modeling of wind turbine power curves using firefly algorithm. In: Kamalakannan C., Suresh L. P., Dash S. S., Panigrahi B. K. (eds.). 2015; vol. 326. New Delhi, India: Springer; 1407–1414.
[43] Galton F. 1892. Macmillan and Company.
[44] Eugenio B. D., Glass M. The kappa statistic: a second look. 2004; 30(1): 95–101. doi:10.1162/089120104773633402.
[45] Mohammad M. N., Sulaiman N., Muhsin O. A. A novel intrusion detection system by using intelligent data mining in weka environment. 2011; 3: 1237–1242.
[46] Kantardzic M. 2002. John Wiley & Sons.
[47] Deshmukh M., Prasad M. N. K. Partial segmentation and matching technique for iris recognition. In: Jain L. C., Behera H. S., Mandal J. K., Mohapatra D. P. (eds.). 2015; vol. 31. Springer India; 77–86.
[48] Rodríguez-Fdez I., Canosa A., Mucientes M., Bugarín A. STAC: a web platform for the comparison of algorithms using statistical tests. Proceedings of the IEEE International Conference on Fuzzy Systems; August 2015; Istanbul, Turkey; 1–8. doi:10.1109/FUZZ-IEEE.2015.7337889.