Most classical search engines choose and rank advertisements (ads) based on their click-through rates (CTRs). To predict an ad’s CTR, historical click information is commonly used. Accurately predicting the CTRs of new ads is both challenging and critical for real-world applications, since plentiful historical data about these ads is not available. Adopting the Bayesian network (BN) as an effective framework for representing and inferring dependencies and uncertainties among variables, in this paper we establish a BN-based model to predict the CTRs of new ads. First, we build a Bayesian network over the keywords that describe the ads in a certain domain, called the keyword BN and abbreviated as KBN. Second, we propose an algorithm for approximate inference on the KBN to find keywords similar to those describing a new ad. Finally, based on the similar keywords, we obtain the similar ads and then calculate the CTR of the new ad by using the CTRs of the ads similar to it. Experimental results show the efficiency and accuracy of our method.
Search engines have become an important means of finding information on the Internet today. Most classical search engines are funded through textual advertising placed next to their search results. Search engine advertising has become a significant element of the Web browsing experience [
In recent years, CTR prediction has received wide attention in the academic community of computational advertising. For example, Agarwal et al. [
In general, the above methods are only suitable for ads that have plentiful historical click logs, not for new ads (which lack such logs). It is known that all advertising impressions and clicks follow a mathematical relationship known as the power-law distribution [
It is natural to consider the uncertainties in ads and CTR prediction, and uncertainty-related mechanisms have indeed been incorporated into CTR prediction in recent years [
To construct the BN from the keywords that describe the ads, we first find the keywords appearing in both the user queries and the ads. If the same keyword appears in a user query and an ad simultaneously, then the ad associated with this keyword may be clicked by the user. So, we use these keywords as the BN’s nodes, and the edges between nodes describe the relationships between similar keywords. The constructed BN is called the keyword BN, abbreviated as KBN.
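As a minimal illustration of this node-selection step (the function and variable names below are our own assumptions, not part of the proposed system), the KBN’s nodes are simply the keywords shared by the query log and the ad descriptions:

```python
# Illustrative sketch: a keyword becomes a KBN node only if it appears in
# both the user-query log and the ad descriptions.

def kbn_nodes(query_keywords, ad_keywords):
    """Return keywords shared by user queries and ads; these form the KBN nodes."""
    return sorted(set(query_keywords) & set(ad_keywords))

queries = {"shoes", "running", "cheap", "hotel"}
ads = {"shoes", "running", "sneakers", "discount"}
print(kbn_nodes(queries, ads))  # ['running', 'shoes']
```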
To predict the CTR of a new ad, we can use similar ads’ CTRs. To obtain similar ads, we first use probabilistic inference on the KBN to obtain similar keywords. The BN can thus reasonably be viewed as the underlying model of probabilistic inference for predicting the CTRs of new ads. Many algorithms for exact BN inference have been proposed [
Generally, the main contributions of this paper are as follows.
We propose an efficient method to construct the KBN from the keywords that describe the given ads, as the basis of probabilistic inference and CTR prediction.
We propose an algorithm for approximate KBN inference to predict the probability distributions of possible values for the known keywords, and we correspondingly give the idea for predicting the new ad’s CTR.
We implement the proposed algorithms and make preliminary experiments to test the feasibility of our method.
The remainder of this paper is organized as follows. In Section
Search engine advertising usually uses keywords to describe ads, and advertisers pay for the cost of these keywords. Thus, in this paper, we use the set of keywords to describe ads and user queries, defined as follows.
Let
If the same keyword
Actually, (
A BN is a DAG
Each node has a CPT that quantifies the effects that the parents have on the node. The parents of node
Based on the definition of a general BN, we now give the definition of the keyword Bayesian network. Formally, a KBN is a pair
We use a pair
Constructing the KBN from the given keywords amounts to constructing the DAG and calculating each node’s CPT. The critical and difficult step in KBN construction is constructing the DAG, which is consistent with general BN construction [
The set of directed edges connecting pairs of nodes in the KBN describes the relationship between similar keywords. This means that, to describe the relationship between similar keywords, we will have to address the following two problems:
For problem
Let
For problem
So, we can compare
(1)
(2)
(3) For
(4) For
(5) If
(6) If
(7) Else
(8) End If
(9) End If
(10) End For
(11) End For
(12) Return
It should be noted that no cycles will be generated when executing Algorithm
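A hedged sketch of this edge-construction loop follows (the similarity measure, threshold, and names below are illustrative assumptions; Algorithm 1’s actual conditions are not reproduced): an edge is added between sufficiently similar keywords only when it does not close a directed cycle, consistent with the cycle-free guarantee.

```python
# Illustrative DAG construction: keywords become similar when their ad sets
# overlap (Jaccard similarity is an assumed stand-in for the paper's measure),
# and an edge u -> v is skipped whenever it would create a directed cycle.

def jaccard(a, b):
    """Overlap of the ad sets in which two keywords appear."""
    return len(a & b) / len(a | b) if a | b else 0.0

def creates_cycle(edges, u, v):
    """Would adding u -> v close a cycle? (DFS for an existing path v -> u)."""
    stack, seen = [v], set()
    while stack:
        n = stack.pop()
        if n == u:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(w for (x, w) in edges if x == n)
    return False

def build_dag(keyword_ads, threshold=0.3):
    """keyword_ads: dict mapping keyword -> set of ad ids containing it."""
    edges = []
    kws = sorted(keyword_ads)
    for u in kws:
        for v in kws:
            if u != v and jaccard(keyword_ads[u], keyword_ads[v]) >= threshold:
                if not creates_cycle(edges, u, v):
                    edges.append((u, v))
    return edges
```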
It is worth noting that the computation of the CPTs in the KBN by (
Let
By Step 1 in Algorithm
By Step 5 in Algorithm
By Step 6 in Algorithm
A simple KBN.
To predict the CTR of a new ad, we first find keywords of ads with known CTRs that are similar to the keywords of the new ad. From this keyword similarity, the similarity of the ads’ functionality or semantics can be inferred. Consequently, we use the similarity between the new ad’s keywords and those of the ads with known CTRs to find the ads related to the new ad.
Although we can obtain the keywords with a direct similarity relationship using the ideas presented in Section
It is known that Gibbs sampling is a Markov chain Monte Carlo algorithm that generates a Markov chain of samples [
We generate a random sample consistent with the evidence nodes. In the KBN, the evidence nodes are the new ad’s keywords, where we assume that the new ad’s keywords are in the KBN. Then, for the query expressed as
Non-evidence nodes, that is, all the keywords in the KBN except the new ad’s keywords, are then sampled randomly. From the CPTs of the KBN, the conditional probability of each non-evidence node given its Markov blanket can be obtained, and the new state is generated accordingly. This process is iterated until the given sampling-time threshold is reached.
A set of samples can be generated, and the corresponding probability distributions can be achieved. The desired probabilities of
(1) Initialization:
(2) For
(3) Estimate
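The sampling loop can be sketched as follows on a tiny illustrative network (a three-node chain A → B → C with assumed CPTs, standing in for the KBN; all the probabilities below are our assumptions, not values from the paper): the evidence node is clamped, and each non-evidence node is resampled from its Markov blanket.

```python
# Hedged Gibbs-sampling sketch in the spirit of Algorithm 2.
# Evidence: C = 1 is clamped; A and B are resampled from their Markov
# blankets, and P(A = 1 | C = 1) is estimated from the sample counts.
import random

p_a1 = 0.3                  # P(A = 1)            (assumed CPT)
p_b1 = {0: 0.2, 1: 0.8}     # P(B = 1 | A = a)    (assumed CPT)
p_c1 = {0: 0.1, 1: 0.9}     # P(C = 1 | B = b)    (assumed CPT)

def gibbs_estimate(n_samples=20000, seed=0):
    rng = random.Random(seed)
    a, b = 0, 0             # arbitrary initial state; evidence C = 1 is fixed
    hits = 0
    for _ in range(n_samples):
        # Resample A from its Markov blanket {B}: weight(a) = P(a) * P(b | a).
        w1 = p_a1 * (p_b1[1] if b else 1 - p_b1[1])
        w0 = (1 - p_a1) * (p_b1[0] if b else 1 - p_b1[0])
        a = 1 if rng.random() < w1 / (w1 + w0) else 0
        # Resample B from its Markov blanket {A, C = 1}:
        # weight(b) = P(b | a) * P(C = 1 | b).
        w1 = p_b1[a] * p_c1[1]
        w0 = (1 - p_b1[a]) * p_c1[0]
        b = 1 if rng.random() < w1 / (w1 + w0) else 0
        hits += a
    return hits / n_samples  # exact posterior for these CPTs is about 0.55
```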
By Algorithm
Let
We say the process has reached its stationary distribution if
If the following equation holds, then
This guarantees the convergence and effectiveness of Algorithm
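For Gibbs samplers, stationarity is commonly verified via the detailed-balance condition π(x)T(x → x′) = π(x′)T(x′ → x); a minimal numeric illustration with an assumed two-state kernel (not the KBN’s) follows.

```python
# Illustrative check: a transition kernel T satisfying detailed balance
# pi[x] * T[x][y] == pi[y] * T[y][x] leaves pi stationary, i.e. pi stays a
# fixed point under one transition step.
pi = [0.25, 0.75]
T = [[0.7, 0.3],    # transition probabilities from state 0
     [0.1, 0.9]]    # transition probabilities from state 1

# Detailed balance for the off-diagonal pair: 0.25*0.3 == 0.75*0.1.
assert abs(pi[0] * T[0][1] - pi[1] * T[1][0]) < 1e-9

# Stationarity: applying the kernel once returns (numerically) the same pi.
new_pi = [pi[0] * T[0][0] + pi[1] * T[1][0],
          pi[0] * T[0][1] + pi[1] * T[1][1]]
print(new_pi)  # approximately [0.25, 0.75]
```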
We consider the query
By Step 1 in Algorithm
By Step 2 in Algorithm
If we suppose
By Step 3 in Algorithm
Based on the results of KBN inferences given in Section
This means that we simply use the average CTR of the similar ads (with known CTRs) as the prediction of the new ad’s CTR. The idea is illustrated by the following example.
Suppose
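The averaging step can be sketched as follows (the function name and the similar ads’ CTR values are illustrative assumptions):

```python
# Sketch of the CTR-prediction step: the new ad's CTR is taken as the mean
# CTR of the ads found to be similar via the KBN inference.

def predict_ctr(similar_ad_ctrs):
    """Average the known CTRs of the similar ads."""
    return sum(similar_ad_ctrs) / len(similar_ad_ctrs)

# Suppose KBN inference matched the new ad with three ads of known CTR:
print(predict_ctr([0.42, 0.50, 0.70]))  # approximately 0.54
```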
To test the feasibility of the ideas proposed in this paper, we implemented our methods for constructing and inferring the KBN, as well as the method for predicting a new ad’s CTR. We mainly tested the efficiency of KBN construction; the efficiency, correctness, and convergence of KBN inference; and the accuracy of the KBN-based CTR prediction.
In the experiments, we adopted the test data from KDD Cup 2012 (Track 2) [
We chose
Efficiency of KBN construction.
Thus, from the efficiency point of view, the above observations suggest that KBN construction can be further improved by incorporating data-intensive computing techniques for aggregation query processing. This is exactly our future work.
To test the precision of the KBN inference, we established the tests on the KBN in Example
Errors of KBN inference.
Netica (%)   KBN (%)   Error (%)
91.1         83.3       7.8
 8.91        16.7       7.8
73.0         83.2      10.2
27.0         16.8      10.2
75.1         83.9       8.8
24.9         16.1       8.8
34.8         23.9      10.9
65.2         76.1      10.9
69.6         77.9       8.3
30.4         22.1       8.3
33.1         24.2       8.9
66.9         75.8       8.9
To test the efficiency and convergence of Algorithm
Efficiency of KBN inference.
Convergence of KBN inference.
We randomly selected 500 ads from the test dataset, and we compared the ads’ CTRs predicted by (
Correctness of CTR prediction.
Ad ID       rCTR (%)   pCTR (%)   Error (%)
21442606    42.2       53.1       10.9
10950981    42         64         22
20302019    70.5       71.2        0.7
20158057    100        77         23
10757736    33.3       67         33.7
10110478    62.5       77         14.5
10228933    100        46         54
20174702    73.6       65.6        8
20277194    50         79         29
20745473    50         46          4
Predicting CTRs for new ads is extremely important and very challenging in the field of computational advertising. In this paper, we proposed an approach for predicting the CTRs of new ads by using other ads with known CTRs and the inherent similarity of their keywords. The similarity of the ad keywords reflects the similarity of the semantics or functionality of the ads. Adopting the BN as the framework for representing and inferring associations and uncertainties, we proposed methods for constructing and inferring the keyword Bayesian network. Theoretical and experimental results verify the feasibility of our methods.
To make our methods applicable in realistic situations, we will incorporate data-intensive computing techniques to improve the efficiency of aggregation query processing when constructing the KBN from a large-scale test dataset. Meanwhile, we will also improve the performance of KBN inference and the corresponding CTR prediction. In this paper, we assume that all of the new ad’s keywords are included in the KBN; this is the basis for exploring methods for the situation where not all keywords are included (i.e., some keywords of the new ads are missing). Moreover, we can further explore accurate user targeting and CTR prediction based on the ideas given in this paper. These are exactly our future work.
The authors declare that there is no conflict of interest regarding the publication of this paper.
This paper was supported by the National Natural Science Foundation of China (nos. 61163003, 61263043), the Yunnan Provincial Foundation for Leaders of Disciplines in Science and Technology (no. 2012HB004), the Natural Science Foundation of Yunnan Province (nos. 2011FB020, 2013FB010), and the Research Foundation of Key Laboratory of Software Engineering of Yunnan Province (no. 2012SE013).