Role-based access control (RBAC) is widely adopted in network security management, and role mining technology has been extensively used to automatically generate user roles from datasets in a bottom-up way. However, almost all role mining methods discover user roles from existing user-permission assignments and therefore neglect the dependency relationships between user permissions. To extend the ability of role mining technology, this paper proposes a novel role mining framework based on multi-domain information. The framework estimates the similarity between permissions based on fundamental information in the physical, network, and digital domains and attaches interdependent permissions to the same role. Three simulated network scenarios with different multi-domain configurations are used to validate the effectiveness of the method. The experimental results show that the method not only captures the interdependent relationships between permissions but also detects user roles and permissions more reasonably.
Access control is a fundamental concern in network security management. Role-based access control (RBAC) has become the dominant model for both commercial and research fields [
Existing role mining approaches mainly discover a proper user-role assignment relation
To address the above-mentioned issues, this paper proposes a novel role mining framework named RMMDI from the perspective of network security management. Instead of mining user roles from user-permission assignments, the framework discovers user roles from the fundamental information in multiple domains, including the physical domain, network domain, and digital domain. The framework aims to output a flat RBAC state that divides user permissions into several disjoint subsets. Permissions in the same subset tend to be interdependent, while permissions in different subsets tend to be independent. If each permission subset is assigned to one user role, a user who holds some roles is unlikely to gain extra permissions that belong to other roles. As such, potential security risks in the user-permission assignment process can be avoided.
The rest of this paper is organized as follows. In Section
RBAC has become a dominating model for access control in network security. Instead of assigning permissions to the user directly, RBAC introduces the concept of roles to make access control system more compact and comprehensive [
Kuhlmann et al. first proposed the concept of role mining for finding roles from user-permission assignment data [
Besides those traditional role mining algorithms, there are also many important approaches that emerged in recent years. For example, Frank et al. proposed a probabilistic approach to improve the role mining process by taking account of the business information. The approach utilized the similarity between user-permission relations to detect exceptional assignments and wrong assignments [
With regard to goodness measure, several metrics have been proposed in the literature, including minimizing the number of roles [
Although many effective role mining approaches exist, most of them neglect the relationships between user permissions. From the perspective of network security management, user permissions are not independent. A user or potential attacker may derive extra permissions from preassigned ones, which may introduce serious risks to network security. Hence, in RMMDI, we model the interrelationships between user permissions from multi-domain configuration information and obtain more reasonable user roles, mitigating vulnerabilities and strengthening network security.
Traditional network security analysis mainly concentrates on the network domain and pays little attention to other domains. However, as research on insider threats has deepened, an increasing number of studies have shown that attackers target the network not only through digital means but also through the physical and social domains.
The existing methods of joint modeling of network multi-domain information mainly define multi-domain information by using the formalized methods and then make inference based on the logical rules to judge whether the system can reach the unsafe state. Probst et al. proposed a formal model for describing scenarios that span the physical and digital domain [
In this paper, we take possible interaction effects among multi-domain permissions into consideration, which are the basis of similar permission finding and role mining based on multiple domain information.
The community is a universal property in many complex networks, which means that network nodes can be divided into small groups [
Nonnegative Matrix Factorization (NMF) [
In the framework RMMDI, we use the Pairwise Coregularized NMF clustering algorithm proposed in [
In this paper, we propose a role mining framework based on multi-domain information, named RMMDI. The framework aims to divide possible user permissions into several disjoint subsets and assign each subset to a user role. Users are then assigned one or more necessary roles according to the permissions they require. The structure of RMMDI is shown in Figure
Role mining framework based on multi-domain information.
The basic information acquisition module obtains the necessary basic information from the target network, including multi-domain entity information and relationship information. The relationship network construction module constructs eight networks based on the obtained basic information, including the intermediate networks and ultimate networks. The community detection and role definition module detects permission communities on the ultimate networks by a multi-view community detection method and defines possible user roles.
The basic information acquisition module collects basic network information, including the entities and entity relationships in the physical domain, network domain, and information domain, which are the foundation of relationship network construction.
There are five kinds of entities involved in the framework, i.e., space, object, service, info, and user.
Entity space represents specific physical space such as city, campus, building, or room, which is in the physical domain. All the space entities are represented as a set
There are seven kinds of relationships involved in the framework, i.e., spatial similarity relationships, containment relationships, service access relationships, local management relationships, remote management relationships, service domination relationships, and info domination relationships.
Spatial similarity relationships are described by the matrix
where
Device containment relationships are described by the matrix
Service access relationships are described by the matrix
Local management relationships are described by the matrix
Remote management relationships are described by the matrix
Service domination relationships are described by the matrix
Info domination relationships are described by the matrix
where symbol
The relationship network construction module is to construct basic relationship networks based on the obtained basic information. As shown in Figure
Relationship networks constructed in RMMDI.
The five intermediate networks are described as undirected weighted graphs, whose adjacency matrices are constructed from the seven basic relationship matrices.
where the symbol
The three ultimate networks are also described as undirected weighted graphs, whose adjacency matrices are constructed from the seven basic relationship matrices and five intermediate networks.
where
As the matrix
where
After building the ultimate networks, services can be divided into community relations through multi-view clustering algorithm, where all service permissions are divided into a community division
In multiview service community discovery, we use the Pairwise Coregularized NMF clustering algorithm (PCoNMF) proposed in [
The hypothesis behind PCoNMF is to regularize the coefficient matrices of the different views to a common consensus, which is then used for clustering. PCoNMF also adopts alternating optimization to minimize the objective function. The optimization works as follows:
According to [
Hence, the permission community detection algorithm is shown as Algorithm
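As a rough illustration of the coregularization idea described above, and not necessarily the exact objective of the cited PCoNMF algorithm, a generic pairwise coregularized NMF can be written as follows. All symbols here are assumptions introduced for this sketch: $A^{(v)}$ is the nonnegative adjacency matrix of the $v$-th ultimate network ($V = 3$ views), $W^{(v)}$ and $H^{(v)}$ are the nonnegative basis and coefficient matrices of that view, and $\lambda$ weights the pairwise disagreement between coefficient matrices.

```latex
\begin{aligned}
\min_{W^{(v)} \ge 0,\; H^{(v)} \ge 0}\quad
  & \sum_{v=1}^{V} \bigl\| A^{(v)} - W^{(v)} H^{(v)} \bigr\|_F^2
    + \lambda \sum_{1 \le v < w \le V} \bigl\| H^{(v)} - H^{(w)} \bigr\|_F^2 \\[4pt]
\text{alternating updates:}\quad
  & W^{(v)} \leftarrow W^{(v)} \odot
    \frac{A^{(v)} H^{(v)\top}}{W^{(v)} H^{(v)} H^{(v)\top}}, \\[4pt]
  & H^{(v)} \leftarrow H^{(v)} \odot
    \frac{W^{(v)\top} A^{(v)} + \lambda \sum_{w \ne v} H^{(w)}}
         {W^{(v)\top} W^{(v)} H^{(v)} + \lambda (V-1)\, H^{(v)}}
\end{aligned}
```

Here $\odot$ and the fraction bars denote elementwise multiplication and division. In this sketch, each update keeps the factors nonnegative, and a service would be assigned to the community with the largest entry in its column of the consensus (for example, averaged) coefficient matrix.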
In this section, we evaluate our role mining method based on the multi-domain information of a simulated network, which is the simplification of the inner network of Corporation M.
We built a simulation network for experiments, including a router, a firewall, an Intrusion Prevention System (IPS), 3 switches (Switch1, Switch2, and Switch3), 6 servers (WServer, DServer, FServer, GServer, OServer, and IServer), 3 gate machines (GM1, GM2, and GM3), and 13 terminals (T1, T2, T3, …, T13). We used a HUAWEI S7706 as the core router, three HUAWEI S5700 as switches, a TOPSEC NGFW 4000-UF as the firewall, a TOPSEC IDP 3000 as the IPS, and computers from Dell and HP as the servers and terminals. The router enabled 3-layer routing, and the firewall was configured with bidirectional access control lists. All the servers and terminals ran different versions of Windows, including Windows 2003 Server, Windows XP, and Windows 7. We deployed an entrance guard system including 3 gate machines and a server (GServer). The gate machines used face recognition technology to determine whether a person could pass. An office automation system was deployed on the OServer, whose database was deployed on the DServer. We also deployed two websites and an FTP using IIS (Internet Information Services) on WServer, IServer, and FServer. Similarly, the websites depended on the same database deployed on DServer. The physical link relationships among devices are shown in Figure
An example network.
All the devices are distributed across 12 rooms in 3 buildings. 9 devices are located in building 1: terminals T1, T2, and T3 are in room 1-1; T4 and T5 are in room 1-4; T6 and T7 are in room 1-5; Switch1 is in room 1-2; and GM1 is in the hall of building 1 (room 1-3). 8 devices are located in building 2: terminals T8 and T9 are in room 2-1; T10 and T11 are in room 2-4; T12 and T13 are in room 2-5; Switch2 is in room 2-2; and GM2 is in the hall of building 2 (room 2-3). 11 devices are located in building 3: the router, firewall, IPS, Switch3, and all servers are in room 3-1, and GM3 is in the hall of building 3 (room 3-2).
There were 34 services in the network, including 28 management services and 6 business services. The management services were used for device management, while the business services were used for corporation business. Each device was managed by a management service. The router and switches enabled the SSH service. The servers and terminals enabled the Remote Desktop Service. In addition, the gate machines enabled web-based management interfaces. The website deployed on WServer provided a web service on port 80 named WS_W, which was used to publish public information. The FServer provided an FTP service on port 21 named FS_F, which was used by Network Administrators to share information. The GServer provided a data transmission service on port 8080 named GS_T, which was used to synchronize data between the gate machines and GServer. The OServer provided a web service on port 80 named OS_W, which was used for document circulation by all users. The IServer provided a web service on port 80 named IS_W, which was used by Server Administrators to share information. The DServer provided a database service on port 1433 named DS_D, which provided underlying support for WS_W, OS_W, and IS_W.
There were 33 passwords in the analysis. Each service, except for WS_W and OS_W, has a password. Besides,
Device related information.
Device | Location | Services | Password | Device | Location | Services | Password |
---|---|---|---|---|---|---|---|
T1 | R1-1 | T1_M | T1_M_P | GM2 | R2-3 | G2_M | G2_M_P |
T2 | R1-1 | T2_M | T2_M_P | GM3 | R3-2 | G3_M | G3_M_P |
T3 | R1-1 | T3_M | T3_M_P | Switch1 | R1-2 | S1_M | S1_M_P |
T4 | R1-4 | T4_M | T4_M_P | Switch2 | R2-2 | S2_M | S2_M_P |
T5 | R1-4 | T5_M | T5_M_P | Switch3 | R3-1 | S3_M | S3_M_P |
T6 | R1-5 | T6_M | T6_M_P | Router | R3-1 | R_M | R_M_P |
T7 | R1-5 | T7_M | T7_M_P | Firewall | R3-1 | F_M | F_M_P |
T8 | R2-1 | T8_M | T8_M_P | IPS | R3-1 | IPS_M | IPS_M_P |
T9 | R2-1 | T9_M | T9_M_P | WServer | R3-1 | WS_W, WS_M | -, WS_M_P |
T10 | R2-4 | T10_M | T10_M_P | DServer | R3-1 | DS_D, DS_M | DS_D_P, DS_M_P |
T11 | R2-4 | T11_M | T11_M_P | FServer | R3-1 | FS_F, FS_M | FS_F_P, FS_M_P |
T12 | R2-5 | T12_M | T12_M_P | GServer | R3-1 | GS_T, GS_M | GS_T_P, GS_M_P |
T13 | R2-5 | T13_M | T13_M_P | OServer | R3-1 | OS_W, OS_M | -, OS_M_P |
GM1 | R1-3 | G1_M | G1_M_P | IServer | R3-1 | IS_W, IS_M | IS_W_P, IS_M_P |
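The device locations in the table above are the raw material for the containment relationships. The following is a minimal sketch, assuming a plain 0/1 room-by-device matrix; the exact matrix definition used by the framework may differ, and the names `LOCATIONS` and `containment` are illustrative only.

```python
from collections import Counter

# Device -> room, transcribed from the device information table.
LOCATIONS = {
    "T1": "R1-1", "T2": "R1-1", "T3": "R1-1", "T4": "R1-4", "T5": "R1-4",
    "T6": "R1-5", "T7": "R1-5", "T8": "R2-1", "T9": "R2-1", "T10": "R2-4",
    "T11": "R2-4", "T12": "R2-5", "T13": "R2-5",
    "GM1": "R1-3", "GM2": "R2-3", "GM3": "R3-2",
    "Switch1": "R1-2", "Switch2": "R2-2", "Switch3": "R3-1",
    "Router": "R3-1", "Firewall": "R3-1", "IPS": "R3-1",
    "WServer": "R3-1", "DServer": "R3-1", "FServer": "R3-1",
    "GServer": "R3-1", "OServer": "R3-1", "IServer": "R3-1",
}

devices = sorted(LOCATIONS)
rooms = sorted(set(LOCATIONS.values()))

# Assumed encoding: containment[i][j] == 1 iff room i contains device j.
containment = [[1 if LOCATIONS[d] == r else 0 for d in devices] for r in rooms]

# The second character of a room name such as "R3-1" is the building number.
per_building = Counter(room[1] for room in LOCATIONS.values())

print(len(rooms), len(devices))            # 12 28
print(dict(sorted(per_building.items())))  # {'1': 9, '2': 8, '3': 11}
```

Each column of `containment` has exactly one nonzero entry, since every device sits in exactly one room.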
There were 13 users involved in analysis named from User1 to User13, who used terminals T1 to T13 and knew passwords T1_M_P to T13_M_P, respectively. Using the top-down approaches, the network security administrators had gotten 5 user roles for the business information, which were named as Ordinary User, Server Administrator, Database Administrator, Network Administrator, and Security Administrator. The role-permission assignments are listed in Table
User role-permission assignments by top-down methods.
Roles | Service Permissions |
---|---|
Ordinary User | WS_W, OS_W |
Server Administrator | WS_M, FS_M, GS_M, OS_M, IS_M, DS_M, IS_W |
Database Administrator | DS_D |
Network Administrator | S1_M, S2_M, S3_M, R_M, FS_F |
Security Administrator | F_M, IPS_M, G1_M, G2_M, G3_M, GS_T |
To demonstrate the effectiveness of our method, we compare our approach with two groups of baselines. The first group comprises 5 clustering methods: 2 single-view methods and 3 multiview methods. The second group comprises 4 traditional role mining methods: ORCA (OFFIS Role mining tool with Cluster Analysis), CM (Complete Miner), HPr (HP Role Minimization), and HPe (HP Edge Minimization).
SP (Spectral Clustering). SP [
SymNMF. SymNMF [
PCoSpec (Pairwise Coregularized Spectral clustering) and CCoSpec (Center-wise Coregularized Spectral clustering). Two coregularization schemes are adopted in spectral clustering framework [
CCoNMF (Cluster-wise Coregularized NMF clustering). CCoNMF extends NMF for multiview clustering by jointly factorizing the multiple matrices through cluster-wise coregularization [
RMSC (Robust Multiview Spectral Clustering). RMSC [
ORCA. ORCA [
Complete Miner (CM). CM [
HP Role Minimization and HP Edge Minimization. HP Role Minimization (HPr) and HP Edge Minimization (HPe) [
To validate our framework and method, we built 3 scenarios named as Scenario1 (S1), Scenario2 (S2), and Scenario3 (S3) based on the basic experimental environment shown in Figure
User-role assignments in different scenarios.
Scenario | Ordinary User | Server Administrator | Database Administrator | Network Administrator | Security Administrator |
---|---|---|---|---|---|
S1 | User1, User2, | User4, User5 | User6, User7 | User8, User9, User10, User11 | User12, User13 |
S2 | All Users | User7, User8 | User11, User12, User13 | User4, User5 | User9, User10 |
S3 | User1, User2, User5, User6, User9, User10, User13 | User4, User7 | User8 | User11 | User12 |
For each scenario, we first configured the gate machines and firewall according to Tables
It should be noted that there were potential conflicts among the multi-domain configurations at the semantic level. Take User4 in S1 as an example. User4 was a Server Administrator and should not have been able to access service DS_D, and the firewall forbade T4 from accessing DS_D directly. However, T4 was permitted to access service WS_M, and there was no firewall between WServer and DServer. Thus, User4 could use T4 to log in to WServer remotely and then access DS_D (the password DS_D_P could be obtained from the configuration files on WServer). This is a typical semantic conflict within the network access control lists. Similarly, as User4 was able to access DS_D physically by entering room 3-1, there was another conflict between the network access control list and the spatial access control list. Such conflicts may result in extra permissions for users.
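The User4 example can be read as a reachability question: starting from the permissions a role grants, which further permissions can be derived? The sketch below illustrates this with a breadth-first search. The `GRANTS` map is a hypothetical encoding introduced only for illustration; it contains the single dependency stated in the example (holding WS_M leads to DS_D) and is not a structure defined by the framework.

```python
from collections import deque

# Hypothetical dependency map: holding the key permission lets a user derive
# the listed permissions. Only the dependency from the User4 example is encoded:
# WServer's configuration files expose DS_D_P, and no firewall separates
# WServer from DServer.
GRANTS = {
    "WS_M": ["DS_D"],
}

def effective_permissions(assigned, grants):
    """Return the assigned permissions plus everything reachable through grants."""
    seen = set(assigned)
    queue = deque(assigned)
    while queue:
        perm = queue.popleft()
        for derived in grants.get(perm, []):
            if derived not in seen:
                seen.add(derived)
                queue.append(derived)
    return seen

# Server Administrator permissions from the top-down role table.
server_admin = ["WS_M", "FS_M", "GS_M", "OS_M", "IS_M", "DS_M", "IS_W"]
extra = effective_permissions(server_admin, GRANTS) - set(server_admin)
print(sorted(extra))  # ['DS_D']
```

A role whose derived set is larger than its assigned set signals exactly the kind of semantic conflict described above.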
Then, we extracted basic information from the network and established the necessary relationship matrices. Note that there were 16 space entities, 28 object entities, 34 service entities, 32 information entities, and 13 user entities in all the 3 scenarios. The relationship matrices
Finally, we detected the user roles by RMMDI and compared the results with the two groups of baseline methods. On the one hand, we performed the role mining baseline methods based on the user-permission assignment (UPA) matrices constructed from the firewall configurations and compared the results with RMMDI. On the other hand, we studied the best parameters for each clustering method and then compared the effectiveness of RMMDI with the clustering baseline methods. Accuracy and normalized mutual information (NMI) [
Firstly, we performed the baseline role mining methods based on the firewall configurations. As the firewall only conducted the network access control lists, it can only reflect the accessibility between devices and services. Since each terminal was assigned to a user, we can get the 3 different UPA matrices from it. As we wanted to find disjoint service subnets, we used
Role mining results of baselines on all scenarios.
Scenario | Method | Role | Permissions |
---|---|---|---|
S1 | ORCA | Role1 | WS_W, OS_W |
S1 | ORCA | Role2 | WS_M, FS_M, GS_M, OS_M, IS_M, DS_M, IS_W |
S1 | ORCA | Role3 | DS_D |
S1 | ORCA | Role4 | S1_M, S2_M, S3_M, R_M, FS_F |
S1 | ORCA | Role5 | F_M, IPS_M, GS_T |
S2 | HPr | Role1 | WS_W, OS_W |
S2 | HPr | Role2 | WS_W, OS_W, WS_M, FS_M, GS_M, OS_M, IS_M, DS_M, IS_W |
S2 | HPr | Role3 | WS_W, OS_W, DS_D |
S2 | HPr | Role4 | WS_W, OS_W, S1_M, S2_M, S3_M, R_M, FS_F |
S2 | HPr | Role5 | WS_W, OS_W, F_M, IPS_M, GS_T |
S2 | ORCA | Role1 | WS_W, OS_W |
S2 | ORCA | Role2 | WS_M, FS_M, GS_M, OS_M, IS_M, DS_M, IS_W |
S2 | ORCA | Role3 | WS_W, OS_W, DS_D |
S2 | ORCA | Role4 | S1_M, S2_M, S3_M, R_M, FS_F |
S2 | ORCA | Role5 | F_M, IPS_M, GS_T |
S3 | ORCA | Role1 | WS_W, OS_W |
S3 | ORCA | Role2 | WS_M, FS_M, GS_M, OS_M, IS_M, DS_M, IS_W |
S3 | ORCA | Role3 | DS_D |
S3 | ORCA | Role4 | S1_M, S2_M, S3_M, R_M, FS_F |
S3 | ORCA | Role5 | F_M, IPS_M, GS_T |
Then, we also performed the RMMDI on all scenarios with the role number
The most common result of RMMDI when k=5.
Method | Role | Permissions |
---|---|---|
RMMDI | Role1 | WS_W, OS_W |
RMMDI | Role2 | WS_M, FS_M, GS_M, OS_M, IS_M, DS_M, IS_W, DS_D |
RMMDI | Role3 | GS_T |
RMMDI | Role4 | S1_M, S2_M, S3_M, R_M, FS_F |
RMMDI | Role5 | F_M, IPS_M |
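Whether a mined role set forms a flat state with disjoint permission subsets can be checked mechanically. The following is a small sketch using the RMMDI result for k=5 and the HPr result for S2 as transcribed from the tables in this section; the helper name `overlapping_pairs` is illustrative only.

```python
from itertools import combinations

def overlapping_pairs(roles):
    """Return (role_a, role_b, shared permissions) for every pair of roles that overlap."""
    clashes = []
    for (a, perms_a), (b, perms_b) in combinations(roles.items(), 2):
        shared = set(perms_a) & set(perms_b)
        if shared:
            clashes.append((a, b, sorted(shared)))
    return clashes

# Most common RMMDI result for k=5.
rmmdi_k5 = {
    "Role1": ["WS_W", "OS_W"],
    "Role2": ["WS_M", "FS_M", "GS_M", "OS_M", "IS_M", "DS_M", "IS_W", "DS_D"],
    "Role3": ["GS_T"],
    "Role4": ["S1_M", "S2_M", "S3_M", "R_M", "FS_F"],
    "Role5": ["F_M", "IPS_M"],
}

# HPr result on S2 from the baseline table.
hpr_s2 = {
    "Role1": ["WS_W", "OS_W"],
    "Role2": ["WS_W", "OS_W", "WS_M", "FS_M", "GS_M", "OS_M", "IS_M", "DS_M", "IS_W"],
    "Role3": ["WS_W", "OS_W", "DS_D"],
    "Role4": ["WS_W", "OS_W", "S1_M", "S2_M", "S3_M", "R_M", "FS_F"],
    "Role5": ["WS_W", "OS_W", "F_M", "IPS_M", "GS_T"],
}

print(overlapping_pairs(rmmdi_k5))     # []
print(len(overlapping_pairs(hpr_s2)))  # 10
```

In the HPr result every pair of roles shares WS_W and OS_W, whereas the RMMDI roles are pairwise disjoint.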
Finally, we changed the role number
The most common result of RMMDI when k=4.
Method | Role | Permissions |
---|---|---|
RMMDI | Role1 | WS_W, OS_W |
RMMDI | Role2 | WS_M, FS_M, GS_M, OS_M, IS_M, DS_M, IS_W, DS_D |
RMMDI | Role3 | S1_M, S2_M, S3_M, R_M, FS_F |
RMMDI | Role4 | F_M, IPS_M, GS_T |
We studied the parameters used in RMMDI as well as the baseline clustering methods. We performed a series of experiments for a series of different parameters and tried to find out the optimal parameters. The experiments were conducted under S2 with role number
We first studied the parameters used in baseline methods, including
Then, we studied the parameters in PCoNMF and CCoNMF. There are 3 parameters:
Evaluating the accuracy and NMI of PCoNMF and CCoNMF on varying
Evaluating the accuracy and NMI of PCoNMF and CCoNMF on varying
Finally, we studied the parameter
Evaluating the accuracy and NMI of RMMDI on varying
Evaluating the accuracy of RMMDI on varying
Evaluating the NMI of RMMDI on varying
We found that the curves showed a downward trend in whole, and the accuracy and NMI got greater values when
We also conducted experiments to compare the effectiveness of RMMDI with the clustering baseline methods. We performed all algorithms 200 times on each scenario and compared results with the ground truth shown in Table
Accuracy for different methods on 3 scenarios.
Scenario | SP | SymNMF | PCoSpec | CCoSpec | PCoNMF | CCoNMF | RMSC |
---|---|---|---|---|---|---|---|
S1 | 0.9010 | 0.9229 | 0.7121 | 0.9057 | | 0.9190 | 0.6952 |
S2 | 0.8981 | 0.9276 | 0.6675 | 0.9072 | | 0.9365 | 0.5762 |
S3 | 0.9210 | 0.9181 | 0.7070 | 0.9035 | | 0.9333 | 0.6333 |
NMI for different methods on 3 scenarios.
Scenario | SP | SymNMF | PCoSpec | CCoSpec | PCoNMF | CCoNMF | RMSC |
---|---|---|---|---|---|---|---|
S1 | 0.8605 | 0.8571 | 0.5827 | 0.8414 | | 0.8579 | 0.6382 |
S2 | 0.8506 | 0.8703 | 0.4869 | 0.8497 | | 0.8744 | 0.479 |
S3 | 0.8692 | 0.8538 | 0.5958 | 0.8464 | | 0.8719 | 0.4890 |
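For readers unfamiliar with the two metrics, the sketch below shows one common way to compute clustering accuracy (best one-to-one matching of clusters to reference classes) and NMI. Several choices are assumptions made for illustration: the top-down roles are used as the reference partition, the gate-machine management permissions are left out because they do not appear in the mined role tables, and NMI is normalized by the geometric mean of the two entropies. The values produced are not those reported in the tables above.

```python
from collections import Counter
from itertools import permutations
from math import log, sqrt

def clustering_accuracy(truth, pred):
    """Fraction of items matched under the best one-to-one mapping of clusters to classes."""
    classes = sorted(set(truth))
    clusters = sorted(set(pred))
    classes += [None] * max(0, len(clusters) - len(classes))  # pad if more clusters than classes
    overlap = Counter(zip(pred, truth))
    best = max(
        sum(overlap[(c, t)] for c, t in zip(clusters, perm))
        for perm in permutations(classes, len(clusters))
    )
    return best / len(truth)

def nmi(truth, pred):
    """Mutual information normalized by sqrt(H(truth) * H(pred)) (assumed variant)."""
    n = len(truth)
    ct, cp, joint = Counter(truth), Counter(pred), Counter(zip(truth, pred))
    mi = sum(c / n * log(c * n / (ct[t] * cp[p])) for (t, p), c in joint.items())
    ht = -sum(c / n * log(c / n) for c in ct.values())
    hp = -sum(c / n * log(c / n) for c in cp.values())
    if ht == 0 or hp == 0:
        return 1.0 if ht == hp else 0.0
    return mi / sqrt(ht * hp)

reference = {  # top-down roles, gate-machine management permissions omitted
    "Ordinary User": ["WS_W", "OS_W"],
    "Server Administrator": ["WS_M", "FS_M", "GS_M", "OS_M", "IS_M", "DS_M", "IS_W"],
    "Database Administrator": ["DS_D"],
    "Network Administrator": ["S1_M", "S2_M", "S3_M", "R_M", "FS_F"],
    "Security Administrator": ["F_M", "IPS_M", "GS_T"],
}
mined = {  # most common RMMDI result for k=5
    "Role1": ["WS_W", "OS_W"],
    "Role2": ["WS_M", "FS_M", "GS_M", "OS_M", "IS_M", "DS_M", "IS_W", "DS_D"],
    "Role3": ["GS_T"],
    "Role4": ["S1_M", "S2_M", "S3_M", "R_M", "FS_F"],
    "Role5": ["F_M", "IPS_M"],
}

def labels(roles, perms):
    """Map each permission to the name of the role that contains it."""
    lookup = {p: role for role, ps in roles.items() for p in ps}
    return [lookup[p] for p in perms]

perms = sorted(p for ps in reference.values() for p in ps)
truth, pred = labels(reference, perms), labels(mined, perms)
acc = clustering_accuracy(truth, pred)
score = nmi(truth, pred)
print(round(acc, 4))  # 0.8889
```

Under this reference, 16 of the 18 permissions are matched: DS_D and GS_T are the two that the mined roles place differently from the top-down roles.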
Runtime for different methods on 3 scenarios (s).
Scenario | SP | SymNMF | PCoSpec | CCoSpec | PCoNMF | CCoNMF | RMSC |
---|---|---|---|---|---|---|---|
S1 | 0.0178 | 0.0109 | 0.1367 | 0.1247 | 0.8533 | | 0.0580 |
S2 | 0.0145 | 0.0091 | 0.0973 | 0.1108 | 0.8518 | | 0.0440 |
S3 | 0.0127 | 0.0085 | 0.0929 | 0.1031 | 0.8494 | | 0.0230 |
We propose a novel role mining framework that mines user roles from multi-domain information rather than from a preassigned user-permission assignment matrix.
The experiments indicate that the framework is suitable for role mining. For the three scenarios used in the experiment, different users are assigned to different user roles. One user may be assigned one or more user roles, and one user role may be assigned to several users. For the results listed in Tables
More importantly, it is also demonstrated that the framework has the ability to find interdependent relationships between permissions, avoiding potential errors. From the experimental results in Section
It is also proved that the performances of different clustering methods vary in the framework. As shown in Tables
There are 4 parameters involved in the RMMDI in total and it is important to select proper values to the parameters. The first two parameters
In this paper, a novel role mining framework based on multi-domain information, named RMMDI, is proposed. The key idea of the framework is to mine user roles from multi-domain information rather than from existing user-permission assignment matrices. In the framework, information from the physical domain, network domain, and digital domain is used to find the relationships between user permissions, and multi-view community detection methods are used to integrate information from different domains. Experiments on 3 simulated network scenarios demonstrate that RMMDI can capture the interdependent relationships between permissions and perform user-role mining more effectively and reasonably.
The data used to support the findings of this study are available from the corresponding author upon request.
The authors declare that they have no conflicts of interest.
This work is supported by grants from the National Key R&D Program of China (Project No. 2017YFB0802800).