A Novel Concept Acquisition Approach Based on Formal Contexts

As an important tool for data analysis and knowledge processing, formal concept analysis (FCA) has been applied in many fields. In this paper, we introduce a new method for finding all formal concepts of a formal context. The method reduces the number of intent computations required, and a corresponding algorithm is proposed. The main theorems and the algorithm are each illustrated by examples. Finally, several real-life databases are analyzed to demonstrate the application of the proposed approach. Experimental results show that the proposed approach is simple and effective.


Introduction
Formal concept analysis (FCA), proposed by Wille in 1982 [1], is a field of applied mathematics based on the mathematization of concept and conceptual hierarchy. It thereby activates mathematical thinking for conceptual data analysis and knowledge processing. FCA starts with a formal context defined as a triple containing an object set, an attribute (property) set, and a binary relation between the object set and the attribute set. A formal concept is a pair (object subset, attribute subset) induced by the binary relation, and a concept lattice is an ordered hierarchical structure of formal concepts. A formal context in FCA corresponds to a special information system with input data being two-valued in rough set theory [2].
Formal concepts are the central notion of FCA, and intents and extents are their essential components. The set of intents (extents) is isomorphic to the corresponding concept lattice under the order relation "⊇" ("⊆"). So, once the set of intents is determined, the corresponding concept lattice is identified, and obtaining all intents or extents is therefore a key task. The basic way to obtain all intents or extents is directly from their definitions. If there are $n$ objects, then $2^n$ computations are needed to obtain all intents, one for each subset of the object set, so the computational cost is very high. To address this problem, we give a new method to obtain all intents, from which the formal concepts are determined.
This paper is organized as follows. In Section 2, we briefly review some basic notions related to FCA. In Section 3, a novel concept acquisition approach is introduced and some related conclusions are given. In Section 4, the corresponding algorithm is proposed and experimental results are shown to illustrate the validity of our method. Finally, conclusions are drawn in Section 5.

Preliminaries
In this section, we recall some basic notions and properties in FCA.
Definition 1 (see [24]). A formal context $(G, M, I)$ consists of two sets $G$ and $M$ and a relation $I$ between $G$ and $M$. The elements of $G$ are called the objects and the elements of $M$ are called the attributes of the context. In order to express that an object $g$ is in a relation $I$ with an attribute $m$, we write $gIm$ or $(g, m) \in I$ and read it as "the object $g$ has the attribute $m$."
With respect to a formal context $(G, M, I)$, Ganter and Wille [24] defined a pair of dual operators for any $X \subseteq G$ and $B \subseteq M$ by
$$X^* = \{ m \in M \mid \forall g \in X,\ gIm \}, \qquad B^* = \{ g \in G \mid \forall m \in B,\ gIm \}.$$
A formal context is called canonical if $\forall g \in G$, $g^* \neq \emptyset$, $g^* \neq M$, and $\forall m \in M$, $m^* \neq \emptyset$, $m^* \neq G$. We assume that all the formal contexts we study in the sequel are finite and canonical.
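The dual operators of Definition 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the three-object context and the names `intent_of`, `extent_of`, and `is_canonical` are hypothetical and chosen only for this example.

```python
# Hypothetical toy context (not from the paper): three objects, three attributes.
G = frozenset({1, 2, 3})
M = frozenset({"a", "b", "c"})
I = frozenset({(1, "a"), (1, "b"), (2, "b"), (2, "c"), (3, "a"), (3, "c")})


def intent_of(X):
    """X*: the attributes shared by every object in X."""
    return frozenset(m for m in M if all((g, m) in I for g in X))


def extent_of(B):
    """B*: the objects that have every attribute in B."""
    return frozenset(g for g in G if all((g, m) in I for m in B))


def is_canonical():
    """Check that no object row and no attribute column is empty or full."""
    rows_ok = all(intent_of({g}) not in (frozenset(), M) for g in G)
    cols_ok = all(extent_of({m}) not in (frozenset(), G) for m in M)
    return rows_ok and cols_ok


print(sorted(intent_of({1, 3})))  # attributes common to objects 1 and 3
print(sorted(extent_of({"a"})))   # objects having attribute "a"
print(is_canonical())
```

Note that the empty object set is mapped to the whole attribute set, since the universal condition holds vacuously.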
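A pair $(X, B)$ with $X \subseteq G$ and $B \subseteq M$ is called a formal concept if $X^* = B$ and $B^* = X$, where $X$ is called the extent of the formal concept and $B$ is called the intent of the formal concept. For any $g \in G$, the pair $(g^{**}, g^*)$ is a formal concept and is called an object concept. Similarly, for any $m \in M$, the pair $(m^*, m^{**})$ is a formal concept and is called an attribute concept. The family of all formal concepts of $(G, M, I)$ forms a complete lattice that is called the concept lattice and is denoted by $L(G, M, I)$. For any $(X_1, B_1), (X_2, B_2) \in L(G, M, I)$, the partial order is defined by
$$(X_1, B_1) \le (X_2, B_2) \iff X_1 \subseteq X_2 \ (\iff B_1 \supseteq B_2).$$
The infimum $\wedge$ and supremum $\vee$ of $(X_1, B_1)$ and $(X_2, B_2)$ are defined by
$$(X_1, B_1) \wedge (X_2, B_2) = (X_1 \cap X_2, (B_1 \cup B_2)^{**}), \qquad (X_1, B_1) \vee (X_2, B_2) = ((X_1 \cup X_2)^{**}, B_1 \cap B_2),$$
respectively.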
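As a hedged illustration of the lattice operations, the sketch below checks the concept condition and computes an infimum and a supremum. It assumes the standard definitions from [24] (a pair $(X, B)$ is a concept when $X^* = B$ and $B^* = X$). The toy context and all function names are hypothetical.

```python
# Hypothetical toy context (not from the paper).
G = frozenset({1, 2, 3})
M = frozenset({"a", "b", "c"})
I = frozenset({(1, "a"), (1, "b"), (2, "b"), (2, "c"), (3, "a"), (3, "c")})


def intent_of(X):
    return frozenset(m for m in M if all((g, m) in I for g in X))


def extent_of(B):
    return frozenset(g for g in G if all((g, m) in I for m in B))


def is_concept(X, B):
    """(X, B) is a formal concept when X* = B and B* = X."""
    return intent_of(X) == frozenset(B) and extent_of(B) == frozenset(X)


def leq(c1, c2):
    """Partial order: compare extents by inclusion."""
    return c1[0] <= c2[0]


def meet(c1, c2):
    """Infimum: intersect extents, close the union of intents."""
    return (c1[0] & c2[0], intent_of(extent_of(c1[1] | c2[1])))


def join(c1, c2):
    """Supremum: close the union of extents, intersect intents."""
    return (extent_of(intent_of(c1[0] | c2[0])), c1[1] & c2[1])


c1 = (frozenset({1, 3}), frozenset({"a"}))
c2 = (frozenset({1, 2}), frozenset({"b"}))
lower = meet(c1, c2)
upper = join(c1, c2)
print(lower, upper)
```

Both results are again formal concepts of the toy context, which is what makes the family of concepts a lattice.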

A Novel Concept Acquisition Approach
The basic way to obtain all intents or extents is directly from their definitions. If there are $n$ objects, then $2^n$ computations are needed to get all intents, so the amount of computation is very large. Our paper presents a new approach to this problem. In this section, we give the new method and some theorems that establish its rationality and validity. Before giving the method, we first propose a related definition. Since the method in this paper is aimed at obtaining all intents, we use subsets of $G$ to determine subsets of $M$. Conversely, if we want to obtain all extents, the subsets of $M$ can be used to determine subsets of $G$. This point is illustrated in the sequel.
Proof. Suppose ∈ +2 . By Definition 4, there exists +2 ∈ +2 such that = * +2 .
The other one is that ∈ . In this case, To sum up the above two cases, +2 ⊆ +1 holds.
Theorem 6 guarantees the convergence of Algorithm 2 involved in the sequel. Proof. According to the condition +1 ⊆ , we have +2 ⊆ +1 by Theorem 6. Using Theorem 6 repeatedly, we can easily obtain the following results: Proof. We will adopt the proof by contradiction.
Theorem 10 gives a sufficient and necessary condition and computation method to find ( , , ). Now, the process to calculate all intents is summarized as follows.
Step 1. (1 ≤ ≤ | |) continuously. The computation needs to stop at +1 which exactly meets +1 ⊆ . Meanwhile, the set of intents is The merit of our method is that we do not need to calculate all , 1 ≤ ≤ | | and the computation needs only to stop at +1 which exactly meets +1 ⊆ . Now all the intents have been found and there is no extra computing.
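The procedure above can be sketched as follows. This is only one reading of the method: it assumes that Definition 4 collects, for each size $k$, the intents $X^*$ of all $k$-element object subsets $X$, and that the stopping rule compares level $k+1$ with level $k$. The function name, the toy context, and the explicit addition of $M$ (the intent of the empty object set) are assumptions of this sketch.

```python
from itertools import combinations


def intents_levelwise(G, M, I):
    """Collect intents level by level; stop once a level adds nothing new.

    Assumption: level k holds X* for every k-element subset X of G, and the
    computation stops at the first level contained in the previous one.
    """
    def intent_of(X):
        return frozenset(m for m in M if all((g, m) in I for g in X))

    objects = list(G)
    found = {frozenset(M)}  # intent of the empty object set
    previous = None
    for k in range(1, len(objects) + 1):
        level = {intent_of(X) for X in combinations(objects, k)}
        if previous is not None and level <= previous:
            return found, k  # level k is redundant, so later levels are skipped
        found |= level
        previous = level
    return found, len(objects)


# Hypothetical toy context: four objects, two attributes.
G = {1, 2, 3, 4}
M = {"a", "b"}
I = {(1, "a"), (2, "b"), (3, "a"), (4, "b")}

intents, last_level = intents_levelwise(G, M, I)
print(sorted(sorted(b) for b in intents), last_level)
```

On this toy context the loop stops after the third level, so the single four-element subset is never evaluated.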
In the following, we use an example from [24] to examine the main results of the new method for finding all intents of formal concepts.

The Scientific World Journal
The formal context in Table 2 is a minor revision of the famous example of the film "Living Beings and Water" [24]. Since we require all formal contexts in this paper to be canonical, we delete the attribute $a$ (water) from the original formal context. The objects are living beings mentioned in the film and are denoted by $G = \{1, 2, 3, 4, 5, 6, 7, 8\}$, where 1 is leech, 2 is bream, 3 is frog, 4 is dog, 5 is spike-weed, 6 is reed, 7 is bean, and 8 is maize. The attributes in $M = \{b, c, d, e, f, g, h, i\}$ are the properties that the film emphasizes: $b$: lives in water, $c$: lives on land, $d$: needs chlorophyll to produce food, $e$: two seed leaves, $f$: one seed leaf, $g$: can move around, $h$: has limbs, and $i$: suckles its offspring. The corresponding concept lattice $L(G, M, I)$ of this formal context is shown in Figure 2.

Algorithms. Algorithm 1 follows Definition 1 directly.
Algorithm 2 is based on our approach as presented in Theorem 10. Compared with Algorithm 1, it adds a condition that terminates the program.
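For comparison, a definition-based baseline in the spirit of Algorithm 1 might look like the sketch below. It is an assumption about what "based on Definition 1" means in practice, namely computing $X^*$ for every subset $X$ of the object set. The function name and the toy context are hypothetical.

```python
from itertools import chain, combinations


def intents_bruteforce(G, M, I):
    """Compute X* for every subset X of G, with no early termination."""
    objects = list(G)
    subsets = chain.from_iterable(
        combinations(objects, k) for k in range(len(objects) + 1)
    )
    intents = set()
    evaluated = 0
    for X in subsets:
        intents.add(frozenset(m for m in M if all((g, m) in I for g in X)))
        evaluated += 1
    return intents, evaluated


# Hypothetical toy context: four objects, two attributes.
G = {1, 2, 3, 4}
M = {"a", "b"}
I = {(1, "a"), (2, "b"), (3, "a"), (4, "b")}

all_intents, evaluated = intents_bruteforce(G, M, I)
print(len(all_intents), evaluated)
```

The counter makes the cost visible: every one of the $2^{|G|}$ subsets is evaluated, which is the work a termination condition aims to avoid.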
The time complexity of Algorithm 2 is analyzed as follows.
Denote $n = \min\{|G|, |M|\}$; by Definition 4, the time complexity of Step $i$ in Algorithm 1 or 2 is $O\left(\binom{n}{i}\right)$. This yields the following two observations.
(2) Suppose that Algorithm 2 terminates in the $k$th step; then the time complexity of Algorithm 2 is $O\left(\sum_{i=1}^{k} \binom{n}{i}\right)$ by Theorem 10. It is easy to see that $O\left(\sum_{i=1}^{k} \binom{n}{i}\right) \le O(2^n)$.
We present an example demonstrating the performance of Algorithm 2. The database "patient and Ill symptoms" shown in Table 3 comes from the UCI Machine Learning Repository [25]. Suppose there are 12 patients, denoted by 1, ..., 12, and 8 symptoms, denoted by $a, \ldots, h$, where $a$ is headache, $b$ is fever, $c$ stands for painful limbs, $d$ represents swollen glands in neck, $e$ is cold, $f$ is stiff neck, $g$ is rash, and $h$ is vomiting. Input the formal context and run the program; we obtain the set of all intents when

Membership of Developing Countries in Supranational Group [24]. In this data set, 130 developing countries are the objects. Six properties (Group of 77, nonaligned, least developed countries, most seriously affected countries, Organization of Petroleum Exporting Countries, and African, Caribbean, and Pacific countries) are the attributes.
The results are shown in Table 4 and Figure 3, where Time 1 and Time 2 are the running time of Algorithms 1 and 2, respectively. | | presents the number of intents and the efficiency is equivalent to (Time 1 − Time 2)/Time 1. It can be seen that Algorithm 2 is much more efficient than Algorithm 1 along with the increase of | |.
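The efficiency measure used above is simple enough to state as code. The sketch below only restates the formula (Time 1 − Time 2)/Time 1. The function name and the sample timings are hypothetical and are not taken from Table 4.

```python
def efficiency(time_1, time_2):
    """Relative saving of Algorithm 2 over Algorithm 1: (Time 1 - Time 2) / Time 1."""
    if time_1 <= 0:
        raise ValueError("time_1 must be positive")
    return (time_1 - time_2) / time_1


# Hypothetical timings in seconds, for illustration only.
example = efficiency(2.0, 0.5)
print(example)
```

A value close to 1 means Algorithm 2 took a small fraction of the baseline time, and a value of 0 means no saving.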

Conclusion
Finding new methods for the difficult problem of concept lattice construction is an active research topic. Constructing concept lattices is a novel research branch for data processing and data analysis, and different methods play essential roles in different problems. This paper first defines some basic notions. Based on the basic notion of intents, we obtain a new criterion for finding all intents of formal concepts. Moreover, an example is given to illustrate the feasibility of this method. Finally, we give the corresponding algorithm and carry out experiments to illustrate its effectiveness. For Algorithm 2, the following observation applies to practical use. We can compare $|G|$ with $|M|$ of a formal context. If $|G| \le |M|$, then we use subsets of $G$ to determine subsets of $M$ and output the set of intents. Otherwise, according to the duality principle, subsets of $M$ can be used to determine subsets of $G$, and the set of extents is output. We will improve the corresponding algorithm in the future.
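The choice between intents and extents described above can be sketched as follows. For brevity the sketch enumerates all subsets of the smaller side instead of applying the termination condition of Algorithm 2. The names `derive_all` and `intents_or_extents` and the toy context are hypothetical.

```python
from itertools import chain, combinations


def derive_all(S, T, R):
    """All sets {t in T : (s, t) in R for every s in X}, over all subsets X of S."""
    items = list(S)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(t for t in T if all((s, t) in R for s in X)) for X in subsets}


def intents_or_extents(G, M, I):
    """Enumerate subsets of the smaller side, following the duality principle."""
    if len(G) <= len(M):
        return "intents", derive_all(G, M, I)
    flipped = {(m, g) for (g, m) in I}  # swap the roles of objects and attributes
    return "extents", derive_all(M, G, flipped)


# Hypothetical toy context with more objects than attributes.
G = {1, 2, 3, 4}
M = {"a", "b"}
I = {(1, "a"), (2, "b"), (3, "a"), (4, "b")}

kind, closed_sets = intents_or_extents(G, M, I)
print(kind, sorted(sorted(s) for s in closed_sets))
```

Here only the 4 subsets of the attribute set are examined, compared with 16 subsets of the object set, and the extents determine the same concept lattice as the intents would.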