Trusted Measurement Model Based on Multitenant Behaviors

With the rapid growth of pervasive computing, especially cloud computing, behaviour measurement plays a vital role. A new behaviour measurement tailored to multitenants in cloud computing is urgently needed to establish a trust relationship. Based on our previous research, we propose an improved trust relationship scheme that captures the setting of cloud computing, where multitenants share the same physical computing platform. We first present the related work on multitenant behaviour; secondly, we give the scheme of behaviour measurement, in which the decoupling of multitenants is taken into account; thirdly, we explain our decoupling algorithm for multitenants; fourthly, we introduce a new way of calculating similarity for deviation control, which fits the coupled multitenants under study well; lastly, we design experiments to test our scheme.


Introduction
Cloud computing has recently attracted considerable attention and has been dubbed the "next best thing" in information and communication technologies (ICT) [1]. As an intrinsic feature of cloud computing, multitenancy brings the concept of sharing to almost all information technologies, such as sharing computing resources, storage resources, and the network. Coresident clients might have no preestablished trust relationship and might have no knowledge of the existence or identities of other clients. In such a setting, if one coresident attacks the others, the attack is much more likely to succeed and is difficult to detect. Therefore, this risk, which calls for trusted measurement of multitenants, is a barrier to the acceptance of cloud computing. A cloud computing system, such as Amazon's Elastic Compute Cloud (EC2), Microsoft's Azure, or Rackspace's Mosso, is in fact a large-scale system of the kind that has long been studied in cybernetics. Here we leverage generalized predictive control, a branch of cybernetics, to solve the problem of measuring the behavior of multitenants on the same physical server, a problem brought by the new paradigm of cloud computing.

Background
This section consists of two parts: the multitenancy threat and a brief introduction to generalized predictive control.

Multitenancy Threat.
It is important to consider the unique security risks introduced by multitenancy, which is intrinsic to the new paradigm of cloud computing, in order to derive adequate security solutions. As more and more applications are exported to third-party compute clouds, it becomes increasingly important to quantify any threats to confidentiality that exist in this setting [2,3]. An obvious threat to these consumers of cloud computing is malicious behavior by the cloud provider, who is certainly in a position to violate customer confidentiality or integrity. However, this is a known risk with obvious analogs in virtually any industry practicing outsourcing. In this work, we consider the provider and its infrastructure to be trusted. This also means we do not consider attacks that rely upon subverting a cloud's administrative functions, via insider abuse or vulnerabilities in the cloud management systems (e.g., virtual machine monitors). In our threat model, adversaries are non-provider-affiliated malicious parties. Victims are multitenants running confidentiality-requiring services in the cloud. A traditional threat in such a setting is direct compromise, where an attacker attempts remote exploitation of vulnerabilities in the software running on the system. Of course, this threat exists for cloud applications as well. These kinds of attacks (while important) are a known threat and the risks they present are understood.
We instead focus on where third-party cloud computing gives attackers novel abilities, implicitly expanding the attack surface of the victim. We assume that, like any customer, a malicious party can run and control many instances in the cloud, simply by contracting for them. Further, because the economies offered by third-party compute clouds derive from multiplexing physical infrastructure, we assume (and later validate) that an attacker's instances might even run on the same physical hardware as potential victims. From this vantage, an attacker might manipulate shared physical resources (e.g., CPU caches, branch target buffers, network queues, etc.) to learn otherwise confidential information.

Generalized Predictive Control.
In a general sense, predictive control, regardless of the specific algorithm, is based on the following three basic principles [4].
(1) Predictive Model. Predictive control is also referred to as model-based control, and the model in question is called the predictive model. The predictive model predicts the future output of the object from historical information and inputs. What matters is the function of the model rather than its structure. Therefore, traditional models such as the state equation and the transfer function can be used as predictive models. Similarly, nonparametric models such as the step response and the impulse response can also be used directly as predictive models.
(2) Rolling Optimization. Predictive control is an optimal control algorithm that determines future actions through an optimal performance index. However, the optimization in predictive control differs from optimal control in the traditional sense: it is a rolling optimization over a limited time. At each sampling instant, the performance index covers only a limited period starting from that instant. At the next sampling instant, this optimization period moves forward. At different instants, the relative form of the performance index is the same, but its absolute form, that is, the time span it covers, is different. Therefore, in predictive control the optimization is not conducted offline once but is repeated online. This is the core of rolling optimization and the fundamental difference between predictive control and traditional optimal control.
(3) Feedback Correction. Predictive control is a closed-loop control algorithm. Optimization yields a series of future control actions, but predictive control performs only the present action rather than the whole series. In this way, deviation from the ideal state caused by model mismatch or environmental interference can be avoided. At the next sampling time, the actual output of the object is first detected; this real-time information is then used to correct the model-based prediction; finally, a new optimization is conducted. Therefore, the optimization in predictive control is based not only on the model but also on feedback information, which constitutes a closed-loop optimization.
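The three principles can be sketched together in a few lines. The example below is only an illustration, not the controller used in this paper: it assumes a scalar first-order model, a brute-force search over a grid of candidate inputs, and a simple additive bias as the feedback correction. All names (`predict`, `choose_action`, `run`) and all numeric values are hypothetical.

```python
def predict(y0, u, a, b, horizon, bias=0.0):
    """Predictive model: roll y[k+1] = a*y[k] + b*u + bias forward over the horizon."""
    y, outputs = y0, []
    for _ in range(horizon):
        y = a * y + b * u + bias
        outputs.append(y)
    return outputs


def choose_action(y0, setpoint, a, b, horizon, bias, candidates):
    """Rolling optimization: pick the input with the lowest finite-horizon cost."""
    def cost(u):
        return sum((y - setpoint) ** 2
                   for y in predict(y0, u, a, b, horizon, bias))
    return min(candidates, key=cost)


def run(steps=30, setpoint=1.0):
    a_model, b_model = 0.8, 0.5    # model known to the controller (assumed values)
    a_true, b_true = 0.85, 0.4     # "real" object, deliberately mismatched
    candidates = [i / 200 for i in range(-400, 401)]  # inputs in [-2, 2]
    y, bias, history = 0.0, 0.0, []
    for _ in range(steps):
        # The optimization is repeated at every sampling instant.
        u = choose_action(y, setpoint, a_model, b_model, 5, bias, candidates)
        y_predicted = predict(y, u, a_model, b_model, 1, bias)[0]
        # Only the present action is applied to the real object.
        y = a_true * y + b_true * u
        # Feedback correction: fold the prediction error back into the model.
        bias += y - y_predicted
        history.append(y)
    return history


if __name__ == "__main__":
    print(round(run()[-1], 3))
```

Despite the mismatch between the model and the simulated object, the bias term absorbs the prediction error, so the output settles near the setpoint.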

Related Work
Several measurement models exist, such as Tripwire [5], AEGIS [6], trusted box [7], and the trust chain model proposed by the Trusted Computing Group (TCG). These models focus on different measurement aspects of the system or file program, but they all belong to static integrity measurement of the resource. As a result, they cannot capture the dynamic trustworthiness of the system.
Researchers have further put forward the following schemes to realize dynamic measurement. In [8], there is a coprocessor-based kernel integrity monitor. The monitor periodically checks system memory and detects whether malicious programs change the host system kernel. Binding instructions and data (BIND) binds data with the corresponding process block in order to give the verifier a basis for tracing data processing. However, it cannot cope with many attacks when the system is running [9]. Policy reduced integrity measurement architecture (PRIMA) focuses on the flow of information when the system is running [10], but the model trusts information flows that come from the trusted subjects in mandatory access control (MAC). It is therefore still based on role-based privilege. The measurement mode is too simple and does not conform to the definition of trust. Behavior based trustworthiness attestation mode (BTAM) is a trusted proof model based on the behavior of the system [11]. This model first determines whether a system behavior is related to the trustworthiness of the platform state. For the large number of behaviors that cannot be determined, this model has not yet given a solution. Therefore, dynamic trusted measurement theory and technology are urgently needed for the development of cloud computing [12].
Gong [13] first introduced generalized predictive control theory to analyze and measure tenants' behaviour in an information system. The scheme greatly increases the trustworthiness and security of the information system and opens a new direction for behaviour measurement [13]. However, the new features brought by cloud computing, as mentioned above, were not considered. This paper improves that model and adapts it to the multitenancy introduced by cloud computing.

Model Design
Traditional authorization and authentication mainly address whether the user's identity is trusted, but they are ineffective in determining whether the user's behavior is trusted. Behavior is the original driver of changes in system status [14]. Therefore, trusted measurement of behavior reflects the trustworthiness of the system more precisely than trusted measurement of identity. The design of our model is consistent with the trustworthiness defined by the Trusted Computing Group (TCG); that is, an entity is trusted if its behavior can be expected [13]. According to this definition, we propose the measurement model for virtual machine behavior shown in Figure 1.
The first step: the sharing of resources in cloud computing brings advantages but also leads to security problems. So it is necessary to conduct decoupling control over the behavior of the virtual machines on the same physical platform. The decoupling control aims to simplify the control of many virtual machines sharing the resources of the same physical computing node into individual control loops, one for each virtual machine and its corresponding customer.
The second step: according to the decoupling control algorithm, the inputs and outputs of the virtual machines on the same physical computer are decoupled. The decoupled inputs and outputs of each virtual machine can then be controlled by the generalized predictive control algorithm. Specifically, the future behavior of the virtual machine is predicted from its past and present behavior.
The third step: the predicted behavior is matched against the characteristics list of malicious behaviors so as to obtain the similarity value/deviation value. If the deviation value is less than the threshold predetermined by the system, the behavior is trusted; otherwise, it is untrusted.

Model Implementation
The multiple tenants studied here are those who share the same physical resources, such as the network card and bandwidth. Because of multitenant sharing, cloud computing becomes much more complicated. In order to better predict the tenants' complicated behaviors, we use multivariable generalized predictive control to capture those behaviors. In this section, we first depict the cloud computing system from the viewpoint of generalized predictive control; secondly, we describe behaviors in cloud computing; thirdly, we introduce how the list of malicious tenant behaviors is established; fourthly, we give the decoupling algorithm for multitenant behaviors in both private and public clouds, which yields generalized predictive control without coupling; finally, we give the similarity calculation used in our scheme for deviation control, which confirms whether a suspected behavior is trusted.

Description of Controlled Object.
From the view of control theory, a physical computing node on which several virtual machines are colocated can be taken as a multi-input, multi-output information flow control system. Figure 2 shows a physical computing node hosting four colocated virtual machines from the perspective of generalized predictive control theory. The eight behavioral measurement points serve as the inputs of the information system; the outputs are the behaviors of the four virtual machines, as captured by the eight behavioral measurement points, each of which should be in line with its expectation. Each virtual machine is one of the outputs of the entire physical computing node, while the four virtual machines together account for the total input of the node, such that the total traffic of all four virtual machines should equal the traffic of the physical computing node.

Description of Tenant's Behavior.
There exist monitoring components in the virtualized trusted computing platform based on the dual-system architecture proposed by our research team. These monitoring components can identify measurement indicators of the behavior performance of a virtual machine. Several commonly used monitoring components are as follows: (1) memory and CPU monitor: it monitors memory usage and CPU call rate and reports the monitoring results to the behavioral data collector; (2) a component that analyzes the communication packets of suspicious ports, so that the role of a suspicious port and the behavior of the process bound to it can be determined. If a suspicious user process is found to listen on a suspicious port and to exchange messages frequently, it is necessary to temporarily suspend the process and to report it to the Cloud Security Management Center; (3) network traffic detector: it monitors the flow of network communication, in particular, the network traffic coming out of a virtual machine. The monitor is deployed on every virtual node, so that both denial of service (DoS) attacks and worms can be detected. As a matter of fact, DoS and worm attacks lead to a sharp rise in network traffic. If a computing task of a virtual machine is found to send out, unusually and frequently, many packets with the same content, the task needs to be suspended, that is, the virtual machine user task is prevented from executing, and then reported to the Cloud Security Management Center.
In the cloud computing model, we studied the related results of both foreign researchers such as Khorshed et al. [15] and local researchers such as Li et al. [16], and we chose the following indicators to depict the virtual machine behavior: the number of transmitted packets, the number of received packets, the number of lost packets, disk read speed, disk write speed, memory usage, CPU usage, and the number of failed login attempts. Here, these eight performance indicators are named measurement points, abbreviated as MP. In this paper, the behavior measurement vector of a running virtual machine consists of these eight measurement points; see Table 1.

Table 1: Virtual machine behavior metric vector.

Measurement point    Measured object
MP 1                 Number of packets transmitted
MP 2                 Number of packets received
MP 3                 Number of packets lost
MP 4                 Disk read rate
MP 5                 Disk write rate
MP 6                 Memory usage
MP 7                 CPU usage
MP 8                 Number of failed administrative logon attempts
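As a concrete illustration, the eight measurement points of Table 1 can be held in a small fixed-order record. The sketch below is hypothetical: the class name, the field names, and the units are assumptions made for the example and are not prescribed by the scheme.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class BehaviorVector:
    """One sample of the eight measurement points (field names are illustrative)."""
    packets_transmitted: float  # MP 1
    packets_received: float     # MP 2
    packets_lost: float         # MP 3
    disk_read_rate: float       # MP 4 (unit assumed, e.g. MB/s)
    disk_write_rate: float      # MP 5 (unit assumed, e.g. MB/s)
    memory_usage: float         # MP 6 (assumed fraction in [0, 1])
    cpu_usage: float            # MP 7 (assumed fraction in [0, 1])
    failed_logins: float        # MP 8

    def as_list(self):
        """Return the measurement points in MP 1..MP 8 order."""
        return list(astuple(self))


if __name__ == "__main__":
    sample = BehaviorVector(120, 95, 3, 10.5, 4.2, 0.63, 0.41, 0)
    print(sample.as_list())
```

Keeping the order fixed means a sample can be treated as a point in the eight-dimensional behavior space without further bookkeeping.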

List of Tenant's Malicious Behaviors.
Researchers from the University of California, San Diego, and the Massachusetts Institute of Technology, Cambridge [17] conducted a thorough experimental study on Amazon's Elastic Compute Cloud [18]. The results show that the cloud infrastructure can be mapped out and that the position of a specific virtual machine can be located. They also point out that this information can be exploited to mount side channel attacks that collect information from a target virtual machine located on the same physical machine. In a recent study, Rocha and Correia [19] investigated how malicious insiders steal confidential data and demonstrated these attacks in videos, showing that insiders can easily obtain passwords, encryption keys, and documents. Chonka et al. [20] reproduced the scenarios of some recent attacks in cloud computing and demonstrated how HTTP-DoS and XML-DoS attacks occur there. Khorshed et al. found that there exist some common factors behind these attack models [17][18][19], because all the attackers use similar attack tools and follow a certain attack process. Khorshed et al. first collected relevant attack tools such as Hping, socket programming, httping Unix shell script, and side channel attacks. Next they collected a variety of attack scenarios related to network security by browsing relevant websites and blogs, such as Danchev [21] and Grossman [22], as well as their own research work [23][24][25], and then generated attack scripts from these materials.
Based on the above steps, Khorshed et al. designed an experiment to collect data in the cloud computing environment. The type of data determines the kind of data collection tools. In the attack scenarios, the most common data types are eight performance indicators, such as the number of transmitted and received data packets, processing time, the round-trip time, and CPU usage. Khorshed et al. adopted machine learning techniques to classify the attacks related to malicious use of resources in cloud computing. Through a large number of experiments, they obtained eight measurement points of behavioral performance: the number of transmitted packets, the number of received packets, the number of lost packets, disk read speed, disk write speed, memory usage, CPU usage, and the number of failed login attempts. Further, they summarized the behavioral characteristics of classic attacks in terms of these eight measurement points [26].

Decoupling Algorithm.
To maximize efficiency, multiple VMs, each corresponding to one tenant, may be simultaneously assigned to execute on the same physical server, which is supported by virtualization technology. As a result, tenants share the physical resources (e.g., CPU caches, branch target buffers, network queues, etc.) to accomplish their computation tasks. From the angle of generalized predictive control (GPC), the cloud computing system under study corresponds to a multiple-input, multiple-output system in cybernetics, which differs from the single-input, single-output system studied in [13]. The essential difference is the coupling between tenants on the same physical server, which should be studied thoroughly. In this section, we first use GPC theory to capture the multitenant behavior and then derive the decoupling algorithm, which is given at the end of this part.
The multitenant behavior in cloud computing can be described by the model (1), where $\{u(t)\}$ and $\{y(t)\}$ indicate the coresident tenants' inputs and outputs, and $\xi(t)$ is an $n$-dimensional independent random disturbance vector with zero mean and a fixed variance. Without loss of generality, suppose $A(z^{-1})$ is a diagonal matrix.
$B(z^{-1})$ is divided into two parts, namely,

$$B(z^{-1}) = \bar{B}(z^{-1}) + \tilde{B}(z^{-1}), \tag{2}$$

where $\bar{B}(z^{-1})$ is a diagonal matrix polynomial and $\tilde{B}(z^{-1})$ is a matrix polynomial whose diagonal is zero. Equation (2) indicates that $\bar{B}(z^{-1})$ is the direct relation between a tenant's inputs and outputs, and $\tilde{B}(z^{-1})$ is the mutual coupling part of the communication channels.
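The split into a direct part and a coupling part can be illustrated on a single constant coefficient matrix. The sketch below is only an illustration under that assumption (a matrix polynomial has one such coefficient matrix per power of the delay operator); the function name `split_coupling` and the example values are hypothetical.

```python
def split_coupling(matrix):
    """Split a square matrix into its diagonal part and its zero-diagonal part."""
    n = len(matrix)
    # Direct part: each tenant's own input-output relation.
    direct = [[matrix[i][j] if i == j else 0.0 for j in range(n)]
              for i in range(n)]
    # Coupling part: the influence of the other tenants on the same server.
    coupling = [[matrix[i][j] if i != j else 0.0 for j in range(n)]
                for i in range(n)]
    return direct, coupling


if __name__ == "__main__":
    b = [[0.9, 0.1, 0.0],
         [0.2, 0.8, 0.1],
         [0.0, 0.3, 0.7]]
    direct, coupling = split_coupling(b)
    print(direct)
    print(coupling)
```

If the coupling part were exactly zero, the three tenants in the example could be controlled by three independent loops, which is the situation the decoupling algorithm tries to approximate.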
Using (1) and (2), we obtain (3). The performance index function is given by (4), which is built from the generalized outputs, a fixed reference vector, and a norm weighted by a symmetric positive definite matrix. Unlike the performance index of common generalized predictive control, (4) contains an additional term whose matrix polynomial has a zero diagonal; this term can be used to eliminate the coupling effect between channels. Similarly, the weighted constant matrix can be divided into two parts: a diagonal matrix and a matrix whose diagonal is zero, the latter playing the same role as the zero-diagonal matrix polynomial.
We use the methods in [27] to achieve the decoupling algorithm.
Substituting the above formula into (3), we obtain the closed-loop system equation, in which a separate term collects the mutual coupling between channels. According to (18), the closed-loop system is decoupled if and only if this coupling term is zero. Because the number of variables is less than the number of equations, the unknown zero-diagonal matrices in (19) can only be obtained by the least squares method; consequently, the coupling term is not exactly zero, and the decoupling is approximate.
Moreover, the controlled object of formula (1) is a CARMA model. To ensure that there is no steady-state error in the outputs of the closed-loop system, it is necessary to determine the weighting matrix of the performance index (4). For simplicity, the weighting parameters are set equal to each other, and the matrix is then obtained from formula (18). Substituting the results into (16) yields the decoupling control law, and with it the generalized predictive control based decoupling algorithm.
The output of the algorithm is the predicted value of each individual virtual machine, after decoupling, on the virtualized cloud computing platform.

Decoupling Algorithm for Public Cloud.
The parameters used above are known when the users of the virtual machines are fixed, whereas the decoupling algorithm above is not applicable when the users are not fixed. For example, the users in public cloud computing are not fixed, so the parameters related to users' behavior are unknown. In such a public cloud, it is necessary to use parameter estimation to obtain the parameters of the corresponding controlled object and then to conduct the predictive control algorithm mentioned above.
Then the method to deal with (22) may become very complex.

In this paper, to solve for the newest parameter polynomial matrix, we introduce the weighted least squares method.
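One common way to weight least squares so that recent data dominate is recursive least squares with an exponential forgetting factor. The sketch below uses that technique as a stand-in, since the exact weighting is not given here; the scalar model $y(t) = a\,y(t-1) + b\,u(t-1)$, the excitation signal, the forgetting factor 0.98, and the function names are all assumptions made for the example.

```python
import math


def rls_update(theta, P, phi, y, lam=0.98):
    """One recursive least squares step with forgetting factor lam (0 < lam <= 1)."""
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [p / denom for p in P_phi]
    error = y - sum(theta[i] * phi[i] for i in range(n))
    theta = [theta[i] + gain[i] * error for i in range(n)]
    # P stays symmetric, so phi^T P equals (P phi)^T.
    P = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)]
         for i in range(n)]
    return theta, P


def estimate(a_true=0.7, b_true=0.3, steps=200):
    """Estimate (a, b) of y(t) = a*y(t-1) + b*u(t-1) from simulated, noise-free data."""
    theta = [0.0, 0.0]
    P = [[1000.0, 0.0], [0.0, 1000.0]]  # large initial covariance: weak prior
    y_prev, u_prev = 0.0, 0.0
    for t in range(steps):
        y = a_true * y_prev + b_true * u_prev
        theta, P = rls_update(theta, P, [y_prev, u_prev], y)
        # Sine plus square wave keeps the regressors sufficiently exciting.
        u_prev = math.sin(0.5 * t) + (0.5 if t % 7 < 3 else -0.5)
        y_prev = y
    return theta


if __name__ == "__main__":
    print([round(x, 4) for x in estimate()])
```

A forgetting factor below 1 discounts old samples geometrically, which suits a public cloud where the tenant behind a virtual machine, and therefore the parameters, may change over time.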

Deviation Control.
The behavior of the virtual machine can be mapped to a point in the space that consists of the eight behavioral measurement points. The model of behavioral trusted measurement can determine whether the behavior of the virtual machine is outside the security border, that is, whether the behavior is malicious. Mathematically, this amounts to obtaining the distance between two points in the 8-dimensional space spanned by the eight behavior measurement points. This is a problem of calculating the similarity between two objects. Similarity calculation is widely used in intrusion detection and other technologies. Typical solutions include the inner product, the Dice coefficient, the cosine function, and the Jaccard coefficient method [28].
In this paper, gray relational analysis is adopted to calculate the deviation value. The predicted value of the virtual machine behavior is unknown, whereas the historical and present behavior of the virtual machine is known, so that the known and unknown information together constitute a gray system [29]. At present, gray system theory has been extended to many fields such as industry, agriculture, society, economy, energy, geology, and petroleum, where it has solved a large number of practical problems in production, daily life, and scientific research.
Gray relational analysis is a branch of gray system theory. Its basic idea is to judge whether data sequences are similar by the degree of similarity of the geometric curves they form. In terms of mathematics, the gray relational degree is used here to reflect the degree of similarity. The closer two curves are, the greater the degree of correlation of the two corresponding data sequences, and vice versa. In a specific analysis, it is desirable to replace the continuous curve with a discrete approximation (a data array), which is a great convenience when dealing with a large number of practical problems.
Combined with the characteristics of a distributed computing environment based on virtual architectures, gray relational analysis is adopted in this paper, and the specific calculation steps are as follows.
(1) Create the reference sequence of the virtual machine behavior according to the measurement points of that behavior. Suppose the data sequences form the following matrix:

$$(X_0, X_1, \ldots, X_n) = \begin{pmatrix} x_0(1) & x_1(1) & \cdots & x_n(1) \\ x_0(2) & x_1(2) & \cdots & x_n(2) \\ \vdots & \vdots & & \vdots \\ x_0(m) & x_1(m) & \cdots & x_n(m) \end{pmatrix}.$$

The data sequence $X_0$ is known as the reference sequence, which reflects the characteristics of the behavior of the system. The data sequences $X_1, \ldots, X_n$ are known as comparison sequences, which are composed of the factors that affect the behavior of the system.
(2) Determine the comparison and reference sequences. The goal of the behavior of the virtual machine decides the values of the behavioral measurement points and thus determines the comparison sequence that has an impact on the behavior of the system. The reference data sequence should be a standard for the comparison. Here, the reference data sequence comes from the list of behavioral characteristics over the measurement points of Table 1, written as

$$X_0 = (x_0(1), x_0(2), \ldots, x_0(8)).$$

(3) Nondimensionalize the reference sequence and the comparison sequences. Because the factors in the system have various physical meanings, their dimensions differ as well. As a result, the factors are difficult to compare directly, and a correct conclusion cannot be obtained. Gray relational analysis therefore generally requires nondimensionalization of the data. The commonly used methods are the equalization method and the initialization method, given in (31):

$$x_i'(k) = \frac{x_i(k)}{\frac{1}{m}\sum_{l=1}^{m} x_i(l)}, \qquad x_i'(k) = \frac{x_i(k)}{x_i(1)}. \tag{31}$$

After nondimensionalization, the data sequences are as follows:

$$(X_0', X_1', \ldots, X_n') = \bigl(x_i'(k)\bigr). \tag{32}$$

Here the initialization method is adopted to conduct nondimensionalization.
(4) Calculate the absolute differences. In our scheme, the comparison sequence refers to the behavioral measurement vector of the virtual machine to be measured. For every behavior of the virtual machine, the absolute difference between the comparison sequence and the reference sequence is calculated; that is, $|x_0(k) - x_i(k)|$, where $k = 1, \ldots, 8$; $i = 1, \ldots, n$, and $n$ is the number of sampling values of the object to be measured during a given period.
(5) Calculate the relational coefficient. Through formula (33), the coefficient between the corresponding elements of every comparison sequence and the reference sequence is calculated. The relational coefficient represents the degree of difference between two curves in terms of geometry, and the degree of difference reflects the degree of relationship:

$$\xi_i(k) = \frac{\Delta_1 + \rho\,\Delta_2}{|x_0(k) - x_i(k)| + \rho\,\Delta_2}, \tag{33}$$

where $\Delta_1 = \min_i \min_k |x_0(k) - x_i(k)|$, $\Delta_2 = \max_i \max_k |x_0(k) - x_i(k)|$, and $\rho$ is the identification coefficient, $0 < \rho < 1$, usually $\rho = 0.5$.
(6) Calculate the degree of relationship. The relational coefficient reflects the degree of relationship between the comparison sequence and the reference sequence at each moment, so there is more than one value and these values are dispersed. Therefore, it is necessary to use a single value to summarize all of the relational coefficients. Here the average value is chosen to represent the degree of relationship between the comparison sequence and the reference sequence:

$$r_i = \frac{1}{8}\sum_{k=1}^{8} \xi_i(k).$$
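The calculation steps above can be sketched as follows. This is a minimal illustration that assumes the standard form of the gray relational coefficient, the initialization method for nondimensionalization, and an identification coefficient of 0.5; the function names and the sample vectors are hypothetical and do not come from the experiments.

```python
def initialize(sequence):
    """Initialization method: divide every element by the first element."""
    first = sequence[0]
    if first == 0:
        raise ValueError("initialization method needs a nonzero first element")
    return [x / first for x in sequence]


def gray_relational_degrees(reference, comparisons, rho=0.5):
    """Return one gray relational degree per comparison sequence."""
    ref = initialize(reference)
    cmps = [initialize(c) for c in comparisons]
    # Absolute differences between reference and each comparison sequence.
    diffs = [[abs(r - x) for r, x in zip(ref, c)] for c in cmps]
    d_min = min(min(row) for row in diffs)
    d_max = max(max(row) for row in diffs)
    if d_max == 0:
        return [1.0] * len(cmps)  # every sequence coincides with the reference
    degrees = []
    for row in diffs:
        coefficients = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        # The degree is the average of the relational coefficients.
        degrees.append(sum(coefficients) / len(coefficients))
    return degrees


if __name__ == "__main__":
    reference = [100, 90, 2, 30, 20, 60, 70, 1]          # hypothetical MP 1..MP 8
    same_shape = [200, 180, 4, 60, 40, 120, 140, 2]      # reference scaled by 2
    different = [100, 20, 40, 5, 80, 10, 95, 30]
    print(gray_relational_degrees(reference, [same_shape, different]))
```

Because the initialization method divides by the first element, a sequence whose first measurement point is zero needs a different nondimensionalization, for example the equalization method.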

Simulation and Results
In this paper, NetLogo is used to simulate the virtual machines on the virtualized cloud computing platform and to analyze their behavior; see Table 2.
A major function of the proposed scheme is to detect a variety of malicious behaviors of the virtual machine. To guarantee the trustworthiness of the group as much as possible, this paper uses the successful detection rate (abbreviated as MSR) of malicious behavior to reflect the detection ability of our scheme against malicious behaviors.
Within a period $\Delta t$, suppose the system contains a certain number of computing nodes with malicious behavior and a certain number of computing nodes with trusted behavior; MSR% is defined in terms of these two counts.
This paper simulates the attack process of a worm virus and then tests the effectiveness of our scheme by detecting the behavior of the worm. A worm virus has characteristics such as breaking into antivirus software, compromising the security model of the system, and implanting a Trojan into the downloader. The typical invasion action of the virus [30] is denoted by Attack Behavior.
According to the description of the behavior in Table 1, worm virus attacks can be abstracted as a behavioral vector, Attack Behavior.
In order to verify the effectiveness of the trusted measurement method for virtual machine behavior, we take the scheme without decoupling proposed in [13] as the baseline. In our experiments, the initial ratios of infected virtual machines are set to 30%, 50%, and 70%, respectively. For behavioral trusted measurement based on generalized predictive control, both with and without decoupling, the experimental simulations are carried out three times.
When the percentages of malicious nodes are 30%, 50%, and 70%, the corresponding experimental results are shown in Figures 3 to 5. From the analysis of Figures 3, 4, and 5, the following conclusions can be drawn.
(1) Across the three figures, the simulation system for the behavioral measurement model with decoupling reaches a steady state faster than the one without decoupling. The steady state is the state in which the number of malicious nodes within the simulation system is 0. In our experiments, one of the parameters is the recovery chance, which indicates the probability that an infected node recovers to normal. In practical applications, infected compute nodes eventually recover to normal through various measures, for example, antivirus software. Reaching the steady state faster means that the corresponding scheme of behavioral trusted measurement is more accurate than its counterpart; that is, the user can detect and stop the spread of the malicious worm virus in a more timely manner.
(2) In Figures 3, 4, and 5, the red line (decoupling algorithm) is almost always below the black line (traditional algorithm), which indicates that, at any time, the scheme with decoupling proposed here reflects the trusted state of the virtual machine more accurately and allows timely measures to be taken to restrict the spread of the worm virus.
(3) In Figure 5, the distance between the red line (decoupling algorithm) and the black line (traditional algorithm) is larger than in the previous two figures, which indicates that the advantage of the behavioral trusted measurement proposed here over the scheme in [13] grows as the proportion of malicious nodes in the system increases.
In summary, the experimental simulation shows that the trusted measurement scheme proposed here can effectively predict and control the behaviors of the virtual machine, so that attack behavior resulting from the abuse of resources in cloud computing can be found in time and well restricted; that is, the security of the entire group can be well guaranteed.

Conclusion
The scheme for trusted measurement of dynamic multitenant behavior in the cloud computing environment put forward here addresses the problem of resource sharing in cloud computing. By extending our previous model to multiple tenants who share the same resource, we can further use generalized predictive control to depict complicated behavior in cloud computing. Thanks to the advantages of generalized predictive control, such as rolling optimization and feedback correction, the complicated behaviors of multitenants are well controlled. Further, the problems incurred by coupling between multitenants are solved effectively by the decoupling algorithm of generalized predictive control. As a result, malicious behaviors between multitenants are restricted on the cloud computing platform. In other words, our scheme mitigates the threats introduced by multitenancy in cloud computing. In the future, we will refine our scheme and take into account the nonlinear behaviors between multiple tenants in order to deal with the behavior of tenants much more precisely.