ON A MULTIDIMENSIONAL OIL EXPLORATION PROBLEM

This paper is concerned with optimal strategies for drilling in an oil exploration model. An exploration area contains $n_1$ large and $n_2$ small oilfields, where $n_1$ and $n_2$ are unknown and are represented by a two-dimensional prior distribution $\pi$. A single exploration well discovers at most one oilfield, and the discovery process is governed by some probabilistic law. Drilling a single well costs $c$, and the values of a large and a small oilfield are $v_1$ and $v_2$, respectively, with $v_1 > v_2 > c > 0$. At each time $t = 1, 2, \ldots$, the operator is faced with the option of stopping drilling and retiring with no reward, or continuing drilling. In the event of drilling, the operator has to choose the number $k$, $0 \le k \le m$ ($m$ fixed), of wells to be drilled. Rewards are additive and discounted geometrically. Based on the entire history of the process and potentially on future prospects, the operator seeks the optimal strategy for drilling that maximizes the total expected return over the infinite horizon. We show that when $\pi \succeq \pi'$ in monotone likelihood ratio, the optimal expected return under prior $\pi$ is greater than or equal to the optimal expected return under $\pi'$. Finally, special cases where explicit calculations can be done are presented.


Introduction
In this paper, we consider the problem of finding optimal strategies for drilling in Beale's model of oil exploration, see Beale [2].
Benkherouf [3] showed that the form of the detection mechanism for new oilfields in Beale's model ensures that (i) the vector $(s_1, s_2, f)$, with $s_1$ denoting the number of successes in discovering large oilfields, $s_2$ the number of successes in discovering small oilfields, and $f$ the total number of failures, is a sufficient statistic; (ii) the posterior distribution of the number of undiscovered oilfields can always be written in a product form $M_1(n_1) M_2(n_2)$, with $n_1$ representing the number of large oilfields and $n_2$ the number of small oilfields. In other words, if the prior distribution of the number of undiscovered oilfields has the two-dimensional form $\pi = (\pi_1, \pi_2)$, then this form is preserved after drilling exploration wells in Beale's model. As we will see below, this is crucial in simplifying the analysis of the present paper.
Remark 1.1. (i) The form of the detection mechanism proposed in Beale [2] implies that large oilfields have some priority over small oilfields in the detection; see Benkherouf [3].
(ii) Beale's model can easily be generalized to more than two types of oilfields in the following manner; see Benkherouf [4]. Assume that we have an area that contains $n_i$ oilfields of type $i$ remaining to be discovered with probability $\pi_i(n_i)$, $i = 1, \ldots, k$. Given that there are $(n_1, \ldots, n_i, \ldots, n_k)$ oilfields of types $1, \ldots, i, \ldots, k$ to be discovered, the probability that an exploration well discovers an oilfield of type $i$ is $q_1^{n_1} \cdots q_{i-1}^{n_{i-1}} (1 - q_i^{n_i})$, and the probability of discovering nothing is $q_1^{n_1} \cdots q_k^{n_k}$, with $0 < q_i < 1$, $i = 1, \ldots, k$. For the purpose of this paper, we will only consider Beale's model. We will comment, where we feel it is appropriate, on the implications of our results for the more general model.
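The detection probabilities in Remark 1.1(ii) can be illustrated with a minimal sketch. The function name `discovery_probabilities` and the list-based inputs are illustrative choices, not part of the original model description.

```python
def discovery_probabilities(n, q):
    """Single-well outcome probabilities in the k-type detection model.

    n[i] is the number of undiscovered oilfields of type i + 1 and
    q[i] the matching parameter, with 0 < q[i] < 1.
    Returns (probs, p_nothing), where probs[i] is the probability that
    the well discovers an oilfield of type i + 1.
    """
    if len(n) != len(q):
        raise ValueError("n and q must have the same length")
    probs = []
    all_missed = 1.0  # running product q_1^{n_1} ... q_{i-1}^{n_{i-1}}
    for n_i, q_i in zip(n, q):
        probs.append(all_missed * (1.0 - q_i ** n_i))
        all_missed *= q_i ** n_i
    # After the loop, all_missed equals q_1^{n_1} ... q_k^{n_k}.
    return probs, all_missed


# Two types (Beale's model): 2 large and 3 small fields remaining.
probs, p_nothing = discovery_probabilities([2, 3], [0.5, 0.8])
```

The running product makes the priority structure visible: a type-$i$ discovery requires that all higher-priority types $1, \ldots, i-1$ are missed first.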
Next, we assume that a single exploration well costs $c$, and that the value of a large field is $v_1$ and that of a small field is $v_2$, with $v_1 > v_2 > c > 0$. Rewards are assumed to be additive and discounted geometrically by a factor $\theta$ ($0 < \theta < 1$). At each time $t = 1, 2, \ldots$, the operator is faced with the option of stopping drilling and retiring with no reward, or continuing to drill. Should the option of drilling be chosen, a number $k$, $0 \le k \le m$ ($m$ fixed), representing the number of exploration wells to be drilled must be selected, leading to an immediate expected return denoted by $R_k\{\pi(t)\}$, where $\pi(0) = (\pi_1, \pi_2)$ and $\pi(t+1)$ is deducible from $\pi(t)$ by an application of Bayes' theorem.
The operator is interested in finding an optimal strategy for drilling that maximizes the total expected reward over an infinite horizon and which is consistent with the possible courses of action just described.
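We will next illustrate how to update $\pi$ for the case of drilling a single exploration well. Let $\pi_{s_1}(n_1, n_2)$ be the posterior distribution of the number of undiscovered oilfields after discovering a large oilfield, $\pi_{s_2}(n_1, n_2)$ the posterior distribution after discovering a small oilfield, and $\pi_f(n_1, n_2)$ the posterior distribution after discovering nothing. Then we may write
$$\pi_{s_1}(n_1, n_2) = \frac{\pi_1(n_1+1)\,(1 - q_1^{n_1+1})\,\pi_2(n_2)}{\sum_{j \ge 0} \pi_1(j)\,(1 - q_1^{j})}, \quad (1.2)$$
$$\pi_{s_2}(n_1, n_2) = \frac{\pi_1(n_1)\,q_1^{n_1}\,\pi_2(n_2+1)\,(1 - q_2^{n_2+1})}{\sum_{j \ge 0} \pi_1(j)\,q_1^{j}\,\sum_{l \ge 0} \pi_2(l)\,(1 - q_2^{l})}, \quad (1.3)$$
$$\pi_f(n_1, n_2) = \frac{\pi_1(n_1)\,q_1^{n_1}\,\pi_2(n_2)\,q_2^{n_2}}{\sum_{j \ge 0} \pi_1(j)\,q_1^{j}\,\sum_{l \ge 0} \pi_2(l)\,q_2^{l}}. \quad (1.4)$$
Similarly, we can compute the posterior distribution representing the number of oilfields remaining to be discovered under any of the drilling strategies described above.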
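The single-well Bayes update can be sketched numerically as follows. This is a minimal illustration that assumes the detection probabilities of Remark 1.1(ii) with two types, priors truncated to a finite support, and the hypothetical function name `update_single_well`; it is not taken from the paper.

```python
def _normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def update_single_well(pi1, pi2, q1, q2, outcome):
    """Posterior marginals of the undiscovered large/small field counts.

    pi1[n] and pi2[n] are prior probabilities of n undiscovered large and
    small oilfields (finite lists, an assumption made for computation).
    outcome is "large", "small", or "none".
    The posterior keeps the product form, so the two marginals are returned.
    """
    if outcome == "large":
        # One large field found: n remaining means n + 1 were present.
        new1 = [pi1[n + 1] * (1 - q1 ** (n + 1)) for n in range(len(pi1) - 1)]
        new2 = list(pi2)
    elif outcome == "small":
        # All large fields were missed, and one small field was found.
        new1 = [p * q1 ** n for n, p in enumerate(pi1)]
        new2 = [pi2[n + 1] * (1 - q2 ** (n + 1)) for n in range(len(pi2) - 1)]
    elif outcome == "none":
        # Every remaining field of both types was missed.
        new1 = [p * q1 ** n for n, p in enumerate(pi1)]
        new2 = [p * q2 ** n for n, p in enumerate(pi2)]
    else:
        raise ValueError("outcome must be 'large', 'small', or 'none'")
    return _normalize(new1), _normalize(new2)


post1, post2 = update_single_well([0.2, 0.5, 0.3], [0.4, 0.6], 0.5, 0.5, "none")
```

Each outcome multiplies the two marginals by separate factors, which is why the product form $(\pi_1, \pi_2)$ survives the update.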
Note that $R_k\{\pi(t)\} \le k v_1 < \infty$, since drilling $k$ exploration wells can discover at most $k$ large oilfields. This implies that for Beale's model, and under the assumption of geometric discounting, according to the general theory of Markov decision processes (see, e.g., Ross [18]), an optimal strategy for drilling which is deterministic, stationary, and Markov exists.
The main contribution of this paper is twofold: (i) to enlarge the action space in the model of Benkherouf [3] to include more than a single exploration well at any given time; (ii) to extend the monotonicity result obtained in [3] and to examine the optimal strategies for drilling for some particular classes of prior distributions representing the number of undiscovered oilfields. In the next section, we will prove the main result of this paper, which asserts that if $\pi(0) := \pi \succeq \pi'(0) := \pi'$, with $\succeq$ denoting the monotone likelihood ratio ordering (to be defined below for vectors of probability distributions), then the optimal expected return for the model under $\pi$ is greater than or equal to the optimal expected return for the model under $\pi'$. This result generalizes earlier results by Benkherouf and Bather [7] and Benkherouf [3, 4]. In Section 3, we present some examples where some computations related to optimal strategies for drilling are possible. The paper concludes with some general remarks and some suggestions for possible future research.

Model and theoretical results
Recall that
$$(\pi_1(0), \pi_2(0)), \ldots, (\pi_1(n_1), \pi_2(n_2)), \ldots \quad (2.1)$$
is a sequence of probabilities representing the number of undiscovered large and small oilfields. Define a policy $S$ by where $S(t)$ refers to the number of exploration wells drilled at time $t$, with $1 \le S(t) \le m$, for all $t \in \mathbb{N}$. We require that $S(t)$ be dependent on the entire history of successes and failures up to time $(t - 1)$. Let $V_S(\pi, \tau)$ be the expected return obtained from following policy $S$ up to some stopping time $\tau$. Then, $V_S(\pi, \tau)$ may be written as where the expectation is taken over the stopping time $\tau$ and all possible realizations of the process under policy $S$. Write
$$V(\pi) = \max_{S, \tau} V_S(\pi, \tau). \quad (2.3)$$
Our goal is to determine a policy $S^*$ and a stopping time $\tau^*$ for which the maximum in (2.3) is attained. The pair $(S^*, \tau^*)$, as outlined above, exists.
Before we go on further, we make a remark regarding the preservation of the product form of the prior distribution or, equivalently, the vector form $\pi = (\pi_1, \pi_2)$. An examination of (1.2)-(1.4) reveals that this form is preserved when drilling a single exploration well. A similar argument can be used to show that this form is preserved under any strategy $S$ which is consistent with the requirement that $1 \le S(t) \le m$, $t \in \mathbb{N}$. In other words, we can always write the posterior distribution $\pi(t)$ in the form $(\pi_1(t), \pi_2(t))$.
The following definitions are needed before stating the main result of this paper.
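Definition 2.1. A probability distribution $\pi = (\pi_0, \pi_1, \ldots)$ is said to be greater than a distribution $\pi' = (\pi'_0, \pi'_1, \ldots)$ in monotone likelihood ratio, written $\pi \succeq \pi'$, if the ratio $\pi_n / \pi'_n$ is nondecreasing in $n$. Values of $n$ such that $\pi_n = \pi'_n = 0$ are excluded from the sequence of ratios, but values with $\pi_n / \pi'_n = \infty$ are included.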
For more information on monotone likelihood ratio ordering and other types of ordering, Ross [19] may be consulted.
Definition 2.2. A vector of probability distributions $\pi = (\pi_1, \pi_2)$ is said to be greater than a vector $\pi' = (\pi'_1, \pi'_2)$ in monotone likelihood ratio, written $\pi \succeq \pi'$, if $\pi_i \succeq \pi'_i$ in monotone likelihood ratio for $i = 1, 2$.
Theorem 2.3. If $\pi \succeq \pi'$, then $V(\pi) \ge V(\pi')$.
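The componentwise ordering of Definition 2.2 can be checked numerically on finite-support distributions. The sketch below assumes the usual convention that $\pi \succeq \pi'$ means the ratio $\pi_n/\pi'_n$ is nondecreasing in $n$, with $0/0$ terms skipped and ratios with a zero denominator treated as $+\infty$; the function names are illustrative.

```python
import math


def mlr_geq(pi, pi_prime):
    """True if pi is at least pi_prime in monotone likelihood ratio.

    pi and pi_prime are probability lists on the same support 0..N.
    Indices where both probabilities vanish are skipped; a positive
    numerator over a zero denominator counts as an infinite ratio.
    """
    ratios = []
    for p, p_prime in zip(pi, pi_prime):
        if p == 0 and p_prime == 0:
            continue
        ratios.append(math.inf if p_prime == 0 else p / p_prime)
    return all(a <= b for a, b in zip(ratios, ratios[1:]))


def vector_mlr_geq(pi_vec, pi_prime_vec):
    """Componentwise comparison for vectors such as (pi_1, pi_2)."""
    return all(mlr_geq(p, pp) for p, pp in zip(pi_vec, pi_prime_vec))


larger = ([0.1, 0.3, 0.6], [0.2, 0.8])
smaller = ([0.5, 0.3, 0.2], [0.5, 0.5])
ordered = vector_mlr_geq(larger, smaller)
```

Exact floating-point comparison is used for brevity; a tolerance may be preferable when the inputs come from repeated Bayes updates.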
The proof of Theorem 2.3 is lengthy and requires a number of preliminary results. Let us assume that the number of undiscovered oilfields in a given area is known to be $n_1$ and $n_2$, and let $P(n_1, n_2, i, j, k)$ be the probability that when $k$ exploration wells are drilled, we discover $i$ large oilfields and $j$ small oilfields. This probability equals zero for $i > n_1$, $j > n_2$, $i < 0$, $j < 0$, or $i + j > k$, where $n_1$ and $n_2$ are nonnegative integers.
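By considering what is found when the first well is drilled, we obtain, for $n_1 \ge 1$, $n_2 \ge 1$, $i \ge 0$, $j \ge 0$, and $i + j \le k$,
$$P(n_1, n_2, i, j, k) = (1 - q_1^{n_1})\,P(n_1 - 1, n_2, i - 1, j, k - 1) + q_1^{n_1}(1 - q_2^{n_2})\,P(n_1, n_2 - 1, i, j - 1, k - 1) + q_1^{n_1} q_2^{n_2}\,P(n_1, n_2, i, j, k - 1),$$
where some of the terms on the right-hand side may be zero (e.g., if $i = 0$ then terms involving $i - 1$ are zero, and if $i + j = k$, then the final term will be zero because then $i + j > k - 1$).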
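For $k = 1$ we have, for $n_1 \ge 0$ and $n_2 \ge 0$,
$$P(n_1, n_2, 1, 0, 1) = 1 - q_1^{n_1}, \quad P(n_1, n_2, 0, 1, 1) = q_1^{n_1}(1 - q_2^{n_2}), \quad P(n_1, n_2, 0, 0, 1) = q_1^{n_1} q_2^{n_2}. \quad (2.7)$$
Let $F(n_1, n_2, k)$ be the expected return from drilling $k$ wells when there are $n_1$ large oilfields and $n_2$ small oilfields, so that for $n_1 \ge 0$ and $n_2 \ge 0$.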
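The first-well conditioning argument for $P(n_1, n_2, i, j, k)$ can be sketched as a memoized recursion. This is an illustration under the detection probabilities of Remark 1.1(ii) with two types; the helper names are hypothetical, and the value computed at the end is a gross expected value of discoveries that deliberately leaves out drilling costs, since the exact form of $F$ is not assumed here.

```python
from functools import lru_cache


def make_outcome_probability(q1, q2):
    """Return P(n1, n2, i, j, k): chance of i large and j small finds in k wells."""

    @lru_cache(maxsize=None)
    def P(n1, n2, i, j, k):
        if i < 0 or j < 0 or i > n1 or j > n2 or i + j > k:
            return 0.0
        if k == 0:
            return 1.0  # only i = j = 0 reaches this point
        p_large = 1.0 - q1 ** n1
        p_small = q1 ** n1 * (1.0 - q2 ** n2)
        p_none = q1 ** n1 * q2 ** n2
        # Condition on what the first well finds.
        return (p_large * P(n1 - 1, n2, i - 1, j, k - 1)
                + p_small * P(n1, n2 - 1, i, j - 1, k - 1)
                + p_none * P(n1, n2, i, j, k - 1))

    return P


def gross_expected_value(P, n1, n2, k, v1, v2):
    """Expected value of discoveries from k wells, ignoring drilling costs."""
    return sum((i * v1 + j * v2) * P(n1, n2, i, j, k)
               for i in range(k + 1) for j in range(k + 1 - i))


P = make_outcome_probability(0.5, 0.5)
value = gross_expected_value(P, 2, 3, 1, 10.0, 4.0)
```

Memoization keeps the recursion cheap because the same states $(n_1, n_2, i, j, k)$ are reached along many discovery orders.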
For $k \ge 0$, we set
Proof. The proof is via induction on $r$. When $r = 1$, the left-hand side of (2.13) is and this is $\ge 0$ because $q_1$ and $q_2$ are between 0 and 1. Hence the result is true for $r = 1$. Now suppose that (2.13) holds for some $r \ge 1$. We aim to show that it holds for $r + 1$. We have, by Lemma 2.4, where we have used the inductive hypothesis in the final step. We also have using the inductive hypothesis again. So we obtain using the inductive hypothesis one last time.
Remark 2.6. Lemma 2.5 has a nice interpretation and a less formal proof. Assume that we are in position $(n_1, n_2)$ and are given the choice between accepting $F(n_1, n_2, r)$ or drilling $(r + 1)$ exploration wells under the constraint that the first trial is only for a small oilfield. The second choice gives a better expected return. This means that which leads to the required result.
We have a similar, but more complicated, lemma for n 1 .
Lemma 2.7. For $r \ge 1$,
Proof. This lemma is proved by induction on $r$. When $r = 1$, the left-hand side of (2.19) is 2 )v 2 ≥ 0, and since $q_1$ and $q_2$ are in $(0, 1)$, we see that the final expression is $\ge 0$. Thus, (2.19) holds for $r = 1$. Now assume that (2.19) holds for some $r \ge 1$. We want to show that (2.19) holds for $r + 1$. Using Lemma 2.4, and after some algebra, we find that (2.21) using the inductive hypothesis three times.
Remark 2.8. This lemma, like Lemma 2.5, has a nice interpretation as well as a less formal proof. We need first to rewrite the statement of the lemma as Here, we assume that we are in position $(n_1, n_2)$ and have on hand $(r + 1)$ exploration wells to drill. Further, we are given a choice between taking the amount $F(n_1, n_2, r + 1)$, or trying for a small field in the first trial. The first choice is clearly better, which gives which leads to the required result after rearrangement of the above expression.
Lemma 2.9. For $r \ge 1$, $F(n_1, n_2, r)$ is increasing in $n_1$ and in $n_2$.
Proof. We first show that $F(n_1, n_2, r)$ is increasing in $n_2$ by induction on $r$. When $r = 1$, we have which clearly increases as $n_2$ increases, for all $n_1 \ge 0$. For our inductive hypothesis, assume that for some $r \ge 1$, we have We want to show that From Lemma 2.4, we have where we used the inductive hypothesis three times at the penultimate step. By Lemma 2.5, the expression and this is $F(n_1, n_2, r + 1)$ from Lemma 2.4. Hence we have shown that We now show that $F(n_1, n_2, r)$ is increasing in $n_1$, using induction on $r$. When $r = 1$, we have (2.30) and, since $v_2 < v_1$, this is increasing in $n_1$.
We have a similar inductive hypothesis as for the $n_2$ case. We assume that we have We want to show that Following an approach similar to that for the $n_2$ case, we have n 1 ≥ n 1 . (2.33) Using Lemma 2.7, we see that $\{\cdots\} \ge 0$, so (2.33) is greater than or equal to (2.34) by Lemma 2.4. Hence we have shown that (2.35)
We are now in a position to proceed to the proof of Theorem 2.3. However, before that we need the following result.
For the proof of this theorem, see, for example, Ross [19].
Proof of Theorem 2.3. Expressions (2.3) and (2.4) imply that $V(\pi)$ may be written as where $I_k\{S(t)\}$ is equal to 1 if strategy $S$ chooses to drill $k$ wells at time $t$ and 0 otherwise. We next show that if $\pi \succeq \pi'$, then It can be shown that $R_k\{\pi(t)\}$ may be written as (2.37) Also, Bayes' theorem ensures that if $\pi \succeq \pi'$, then $\pi(t) \succeq \pi'(t)$ with probability one. It follows from Lemma 2.9 that (2.37) and Theorem 2.10 give which leads to the required result. Now, the proof of Theorem 2.3 is immediate from (2.36) and (2.37).
Remark 2.11. The technical details involved in the proof of Theorem 2.3 raise the question of whether it is possible to obtain a similar theorem for the general model alluded to in Remark 1.1. The answer is affirmative. The only new idea is to note that a series of lemmas similar to Lemmas 2.4-2.9 can be obtained, which can then be used to establish a theorem similar to Theorem 2.3. We next only sketch the ideas; the details are left to the reader.
Let $F(n_1, \ldots, n_k, r)$ be the expected return from drilling $r$ wells when there are $n_i$ oilfields of type $i$, $i = 1, \ldots, k$, to be discovered. Let (2.39) Also, let $v_i$ be the value of an oilfield of type $i$, $i = 1, \ldots, k$, where We may then write (2.41) An informal argument similar to that used for Remarks 2.6 and 2.8 can be shown to lead to (2.42) To see this, let Now, assume that we are in position $n$ and we have $r + 1$ exploration wells to drill. Further, we are given two options: in the first option, the first exploration well can hit any oilfield of type $j, j + 1, \ldots, k$, and subsequent wells any type of oilfield; in the second option, the first exploration well can hit oilfields of type $j + 1, \ldots, k$, and subsequent wells any type of oilfield. The first option is clearly better and this gives which leads to the required result. Now, Lemma 2.9 has an obvious multidimensional extension, namely, $F(n, r)$ is increasing in $n_i$, $i = 1, \ldots, k$, the proof of which follows the same lines as that of Lemma 2.9, making use of the extension results obtained above.
Note that if we were interested in finding the optimal strategy for drilling for the finite planning horizon problem, then (2.3) becomes where n is the planning horizon.
It is clear from the proof of Theorem 2.3 that the statement of the theorem remains valid in this case. Now, if $V(\pi) = 0$, then we call the state $\pi$ a stopping state; otherwise it is a continuation state. One implication of Theorem 2.3 is that if a state $\pi$ is found to be a stopping state and if we have identified some states of which $\pi$ is the largest one in monotone likelihood ratio, then these states must be stopping states.
Lemma 2.12. Assume that the number of exploration wells drilled is $k$. Let $\pi_f$ denote the posterior distribution after complete failure in discovering oilfields. Then $\pi \succeq \pi_f$.
Proof. The proof is immediate from the definition of the ordering and Bayes' theorem.
Now, Theorem 2.3 implies that complete failure makes future prospects worse. Let $\alpha^k_{i,j}(\pi)$ denote the probability of discovering $i$ large oilfields and $j$ small oilfields after drilling $k$ exploration wells, given that the number of undiscovered sources is represented by $\pi$. Also, let $\pi(i, j, k)$ be the posterior distribution of the number of undiscovered sources after drilling $k$ wells and discovering $i$ large oilfields and $j$ small oilfields, with $0 \le i + j \le k$.
Let $S_k$ be a policy that initially calls for drilling $k$ wells, then continuing optimally. Write $Z(S_k, \pi)$ for its expected return. Then it can be checked after some algebra that The following result is preliminary and will be useful later on.
Lemma 2.13. If $\pi$ is a continuation state, then (2.47)
Proof. The proof is immediate from (2.46) and noting that (2.49)
Proof. Note that $V(\pi)$ can be written as The lemma is then immediate from Lemma 2.13.

Optimal strategies for drilling for the Euler family of distributions
In this section, we will be concerned with obtaining optimal strategies for drilling when the prior distribution representing the number of undiscovered sources belongs to the Euler family of distributions. This family was proposed by Benkherouf and Bather [7].
In their paper, Benkherouf and Bather examined a simpler version of the model of the present paper in which a single type of oilfield was considered. They were able to characterize explicitly the optimal strategies for drilling when the prior distribution belongs to the Euler family. The passage to the multidimensional model makes the quest for an explicit characterization of the optimal strategy for drilling harder.
We remark here that the Euler family of distributions provides convenient priors for modeling the number of undiscovered oilfields, for which explicit computations of optimal strategies for drilling can be done. Further, they reflect well the useful lifetime of an oilfield: it is well known in the oil industry that oilfields go through three stages during exploration. In the first stage (sometimes called the immature stage), oil companies experience an increase in the rate of discoveries as oilfields are discovered. In the second stage, a steady state of discoveries is observed, and in the last stage, a continual decrease in the rate of discoveries is observed; see Kaufman et al. [10]. The simple Euler distribution models the second stage reasonably well. For the other stages, there exist a number of variants of the Euler distribution which model these stages well, some of which will be discussed below. One modification gives rise to the family of distributions defined below.
Definition 3.2. A univariate distribution $\pi = (\pi_0, \pi_1, \ldots)$, $\sum_{n \ge 0} \pi_n = 1$, is said to belong to the family of simple mixtures of Euler distributions with parameters $\lambda$, $q$, and $\rho$, $0 < \lambda < 1$, $0 < q < 1$, $\rho > 0$, if and this is denoted $\pi = EM(\lambda, \rho, q)$. Note that when $\rho = 0$, we recover the Euler distribution. In other words, the simple mixture of Euler distributions is a mixture of an atom at zero and an Euler distribution with weights $\rho/(1 + \rho)$ and $1/(1 + \rho)$, respectively.
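The mixture description of $EM(\lambda, \rho, q)$ can be made concrete with a short sketch. It rests on an assumption stated loudly here: the Euler distribution $E(\lambda, q)$ is taken in its usual form, with $\pi_n$ proportional to $\lambda^n / \prod_{i=1}^{n} (1 - q^i)$, and the support is truncated at a hypothetical cutoff `n_max` for computation. The function names are illustrative.

```python
def euler_pmf(lam, q, n_max):
    """Truncated Euler distribution on 0..n_max.

    Assumed form: pi_n proportional to lam**n / prod_{i=1..n} (1 - q**i),
    with 0 < lam < 1 and 0 < q < 1.
    """
    weights = []
    w = 1.0
    for n in range(n_max + 1):
        if n > 0:
            w *= lam / (1.0 - q ** n)
        weights.append(w)
    total = sum(weights)
    return [x / total for x in weights]


def euler_mixture_pmf(lam, rho, q, n_max):
    """Atom at zero (weight rho/(1+rho)) mixed with Euler (weight 1/(1+rho))."""
    base = euler_pmf(lam, q, n_max)
    mixed = [p / (1.0 + rho) for p in base]
    mixed[0] += rho / (1.0 + rho)
    return mixed


prior = euler_mixture_pmf(0.5, 2.0, 0.5, 200)
```

Setting `rho = 0` returns the plain truncated Euler probabilities, in line with the remark that the Euler distribution is recovered in that case.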
We will next find optimal strategies for drilling under the assumption that the numbers of undiscovered large and small oilfields are represented by Euler distributions.
Note that it can be shown using Bayes' theorem that the following transitions to $\pi$ occur after drilling a single well: after a success in a large oilfield, after a success in a small oilfield, and after total failure. For similar transitions, readers may consult Benkherouf and Bather [7]. It can also be shown that the posterior distribution of the number of undiscovered sources always has the form $(E(\cdot,\cdot), E(\cdot,\cdot))$, which we will call the two-dimensional Euler distribution. Also, it can be shown that drilling will always lead to a posterior distribution which is stochastically smaller in monotone likelihood ratio than the prior distribution (see Definition 2.1). Now, Theorem 2.3 means that drilling will always make things worse in terms of expected return. In other words, we are in the deteriorating case. The next theorem characterizes the stopping states, where drilling will no longer be profitable.
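The closure of the Euler family under single-well updates can be checked numerically for one component. The sketch below is an illustration under two assumptions: the Euler distribution $E(\lambda, q)$ has $\pi_n$ proportional to $\lambda^n / \prod_{i=1}^{n} (1 - q^i)$, and a well misses all $n$ fields of a type with probability $q^n$ (as in Remark 1.1(ii)). Under these assumptions a miss maps $E(\lambda, q)$ to $E(\lambda q, q)$, while a discovery leaves $E(\lambda, q)$ unchanged. Function names and the truncation level are hypothetical.

```python
def euler_pmf(lam, q, n_max):
    """Truncated Euler distribution on 0..n_max (assumed standard form)."""
    weights = []
    w = 1.0
    for n in range(n_max + 1):
        if n > 0:
            w *= lam / (1.0 - q ** n)
        weights.append(w)
    total = sum(weights)
    return [x / total for x in weights]


def after_miss(pi, q):
    """Posterior of one component when the well misses every field of that type."""
    weights = [p * q ** n for n, p in enumerate(pi)]
    total = sum(weights)
    return [w / total for w in weights]


def after_discovery(pi, q):
    """Posterior of the remaining count when one field of that type is found."""
    weights = [pi[n + 1] * (1.0 - q ** (n + 1)) for n in range(len(pi) - 1)]
    total = sum(weights)
    return [w / total for w in weights]


lam, q, n_max = 0.6, 0.5, 50
prior = euler_pmf(lam, q, n_max)
miss_gap = max(abs(a - b) for a, b in
               zip(after_miss(prior, q), euler_pmf(lam * q, q, n_max)))
find_gap = max(abs(a - b) for a, b in
               zip(after_discovery(prior, q), euler_pmf(lam, q, n_max - 1)))
```

Both gaps come out at rounding-error level, which is consistent with the posterior staying in the two-dimensional Euler form after any single-well outcome.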
Theorem 3.3. Let $\pi = (E(\lambda_1, q_1), E(\lambda_2, q_2))$. The state $\pi$ is a stopping state if and only if Assume first that $\pi$ is a stopping state, then clearly it is not profitable to drill any number of wells and in particular $Z(S_1, \pi) \le 0$. By definition of $Z(S_1, \pi)$, we have But $\pi$ is a stopping state. Thus It can also be checked that from which the "if" part follows.
To show the "only if" part, we will show that $R_1(\pi) \le 0$ implies that $R_k(\pi) \le 0$ for $2 \le k \le m$. We use here an inductive argument.
Let us assume that $R_1(\pi) \le 0$ implies that $R_l(\pi) \le 0$ for some $l < m$, and let us show that this implies that $R_{l+1}(\pi) \le 0$.
Assume that we are currently in state $\pi = (E(\lambda_1, q_1), E(\lambda_2, q_2))$. Then it is clear that after drilling a number of exploration wells, the number of undiscovered oilfields will still be represented by the same family of distributions. Namely, the posterior will have the form for some $a$ and $b$ less than or equal to $m$. The worst scenario occurs when complete failures occur in both oilfields. Define Then, one implication of Theorem 3.3 is that in this case drilling $s$ exploration wells, with $0 < s < k^*(\lambda_1, \lambda_2)$, will still be profitable. One suspects that in this case drilling $k^*(\lambda_1, \lambda_2) - 1$ is suboptimal in some sense. The following analysis sets out to prove this. Recall the definition of $Z(S_k, \pi)$, which is the expected return obtained by following the strategy that initially calls for drilling $k$ wells, then continuing optimally. Then, we have the following lemma.
Lemma 3.4. If $\pi = (E(\lambda_1, q_1), E(\lambda_2, q_2))$, then for all $k = 1, \ldots, k^* - 2$,
$$Z(S_{k+1}, \pi) \ge Z(S_k, \pi). \quad (3.16)$$
Proof. Note that by definition of $S_k$, we have Now, Lemma 2.14 and (3.17) imply that A bit of algebra shows that (3.18) and (3.19) lead to from which we get after a success in a small oilfield, after total failure. Note that similar results can be found in Benkherouf and Bather [7].
We remark here that a success in a large field changes the first component of $\pi$ to an Euler distribution while keeping the second component unchanged. A success in a small field keeps the first component in the same family but with different parameters, while the second component is changed to the Euler family. Finally, total failure keeps the two components in the family of simple mixtures of Euler distributions. Let in general. This in turn makes the search for a characterization of the optimal strategy for drilling harder. We will only consider the case $m = 2$ for simplicity.
The following theorem proposes a way for identifying some stopping states.
(i) If $\rho_1 = \rho_2 = 0$, then stop drilling if and only if (3.34)
Proof. We will only prove (i) and (ii). The proof of (iii) follows using an analogous argument.
To show (i) note that this case is equivalent to the pure simple Euler distributions from which the result is immediate from Theorem 3.3.
Proof of (ii). Assume that we are in state $\pi = (EM(\lambda_1, 0), EM(\lambda_2, \rho_2))$, which satisfies the hypothesis of the theorem, and that $\pi$ is a continuation state. Then, Lemma 2.14 implies that which leads to (3.36) The hypothesis of the theorem and the fact that which is in contradiction with Theorem 2.3. This completes the proof.

(3.38)
It can be shown that for all stopping times $\tau$, This in turn implies that all previous analysis remains valid here. One small but significant difference is in parts (ii) and (iii) of Theorem 3.10, where the "if" in the statement "then stop drilling if . . ." is replaced by "if and only if." We next sketch the proof of the "only if" part of (ii). Note that if $\pi = (EM(\lambda_1, 0); EM(\lambda_2, \rho_2))$ is a stopping state, then so is the state $(EM(\lambda_1 q_1, 0); EM(\lambda_2 q_2, \rho_2/(1 - \lambda_2)))$. Also, drilling a single exploration well is not profitable, giving which leads to the required result.
We remark at this stage that in Benkherouf and Bather [7] a new family of distributions called the Heine distributions was introduced.If the number of undiscovered oilfields is represented by the Heine distributions, then the posterior distributions will be only a function of the number of oilfields drilled in the case of a single area.All the computations in Section 3 for the Euler distributions and mixtures can be repeated for the Heine family with simpler results.We have not done this as they can easily be carried out by mimicking the approach of the Euler case.
Note that it would have been nice to have a complete characterization of the stopping region, that is, a theorem similar to Theorem 3.6. However, the best we could obtain is the above theorem. This theorem is a welcome aid in identifying some stopping states. It is also worth mentioning that if at any time we get successes in both large and small oilfields, then we are in the Euler case and Theorem 3.3 applies. Further, Lemma 2.12 implies that consecutive failures mean that the stopping region is approached faster.
To summarize, this paper dealt with finding optimal strategies for drilling for Beale's model of oil exploration, where the number of undiscovered oilfields was represented by a two-dimensional prior distribution $\pi = (\pi_1, \pi_2)$. At each discrete epoch of time, the operator was faced with the options of stopping and retiring with no reward or continuing drilling, in which case a number $k$, $1 \le k \le m$, representing the number of exploration wells to drill, must be selected. It was shown that if $\pi \succeq \pi'$ (in monotone likelihood ratio), then $V(\pi) \ge V(\pi')$, where $V(\pi)$ is the expected return obtained from following the optimal strategy when the prior is $\pi$. Also, special cases where the priors belonged to the Euler family were treated.
It is worth mentioning that although the analysis was carried out for the two-dimensional case, it is not difficult to extend it to the multidimensional case along the lines mentioned in Remark 2.11. Another possible line of investigation is related to the cost of drilling. To be precise, assume that drilling $k$ exploration wells costs $c_k > 0$, where $c_k + c_j > c_{k+j}$. This latter assumption means that drilling $(k + j)$ exploration wells in one trial is cheaper than drilling the same number of exploration wells in two trials. Most of the analysis up to Section 3 remains valid here. However, it is not clear what happens to the search for the optimal strategies for drilling in the Euler family and in the mixture of Euler cases. Further, it seems natural to consider as a next step in our investigation the multiarea problem, that is, the case where at each epoch of time the operator is faced with the choice of the area in which to drill as well as the number of exploration wells to drill. This problem is similar in structure to what are called superprocesses in the theory of Gittins indices. This will be the subject of a future investigation. For a simpler problem, see [8].