NONLINEAR FILTERING AND OPTIMAL QUALITY CONTROL

Some stochastic models of optimal decision processes in quality control problems are formulated, analyzed, and solved. It is assumed that costs, positive or negative, are assigned to various events in a simple manufacturing 
model, such as processing an item, producing a saleable item, discarding 
an item for salvage, selling a lemon, etc., and models are described by 
giving a sequence of events, some of which are decisions to process, to 
abandon, to accept, to restart, …. All the models have the rather unrealistic classical information pattern of cumulative data. The object is then to 
find optimal procedures for minimizing the total cost incurred, first in dealing with a single item, and second, in operating until an item first passes 
all the tests. The policies that appear as optimal depend on such matters 
as whether a conditional probability given certain data exceeds a ratio of 
prices, and on more complex functionals of the conditional expectations in 
the problem. Special sufficient classes of policies are discerned, which reduce the decision problem to finding one number.


1. Introduction
The need to compete in a global market, where only products of low cost and high quality can capture their share of sales, has stimulated renewed interest in quality control. Since the birth, in the 1940s, of what is known today as statistical quality control, research in this area has focused mainly on the development of standards, charts for process control, acceptance sampling plans, and inspection strategies. (See Abdel-Malek [1], Banks [2], and Kennedy [4].) As computer control of manufacturing permeated industry in the 1970s, the possibilities and advantages of sequential, on-line quality control became evident. Taguchi [5] is credited with pioneering this philosophy, which has led to improved performance for many manufacturing processes. However, not much attention has been paid to the sequential, multi-operational nature of manufacturing, nor to the roles of information structure and partial or noisy measurements. For this reason, we believe that the field of quality control can profit substantially from results and progress in nonlinear filtering and stochastic control, which can be used for describing the relevance of data, and for finding optimal decisions based on them, respectively. We propose to illustrate such methods by means of some elementary models. Even simple manufacturing situations, such as two machining operations and two inspections, can have a very complex information structure and an involved menu of possible options: whether to inspect; what to measure; whether and how to control by feedback or feedforward; whether to accept or pass an item on the basis of some particular information; or to reject it and channel it to some other disposition, like salvage. We do not intend to cover this gamut.
¹For Ryszard Syski.
The examples we consider have several features in common. They all embody the "classical information pattern" in which all information available for earlier decisions is also available for the current one. We are aware that not many realistic quality control situations have this structure. However, any other pattern seems to end up with pointwise minimizations that depend explicitly on the functional form of unknown decision rules, a difficulty first emphasized by Witsenhausen [6]. Within the classical pattern, all the solutions we arrive at follow from properties of conditional expectation, especially the averaging or smoothing property. Sometimes, it is possible to identify a special set of admissible policies which are "sufficient", in the sense that for any policy, one can easily find one in the special set that is at least as good. And finally, filtering enters the problem only because it is the way by which the conditional probabilities and expectations that arise in the solutions are calculated.
Thus, while we do not stress the filtering in this paper, it always underlies and may be the hardest part of the problem.
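To make concrete what such a filtering computation can look like, here is a minimal sketch, not taken from the paper's own calculations. It assumes a discrete unknown x with a known prior, an observation y = h(x) + n with standard Gaussian noise n independent of everything else, and an event of interest ("good") summarized by its probability given x. The names `prior`, `r`, `h`, and `posterior_good`, and all numbers, are hypothetical.

```python
import math

# Hypothetical model: x takes three values with the prior below, and
# r[x] = P(good | x) summarizes the event whose conditional probability
# we want given the noisy observation.
prior = {0.0: 0.2, 1.0: 0.5, 2.0: 0.3}
r = {0.0: 0.05, 1.0: 0.90, 2.0: 0.40}


def h(x):
    # Assumed observation function; the identity keeps the example simple.
    return x


def posterior_good(y):
    """P(good | y) when y = h(x) + n with n ~ N(0, 1) independent of x."""
    # Unnormalized weights exp(y*h(x) - h(x)^2 / 2): the Gaussian likelihood
    # of y given x, with the common factor exp(-y^2 / 2) cancelled.
    weights = {x: p * math.exp(y * h(x) - 0.5 * h(x) ** 2)
               for x, p in prior.items()}
    normalizer = sum(weights.values())
    return sum(weights[x] * r[x] for x in prior) / normalizer


for y in (-1.0, 1.0, 3.0):
    print(f"y = {y:+.1f}  P(good | y) = {posterior_good(y):.4f}")
```

For a continuous x the sums would become integrals, which is where the computational difficulty of filtering usually lies.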
2. Process or Discard? One Stage
A machine processes items (called "widgets") with an attribute $x$ for which the required tolerance is $x \in A$. The effect of the machine is to define another attribute $z$, with the specification $z \in B$. Both tolerances must be met for the product to be useful and have a sales value. The quantities $x$ and $z$ are random variables defined on a suitable probability space, and a limited amount of information is available, expressed by a $\sigma$-algebra $\mathcal{Y}$ defined on the space. The setup is as follows: On the basis of $\mathcal{Y}$, a decision is made whether to accept an item for processing, or to reject it. Processing for $z$ incurs a machine operation cost $c_z$. Rejecting an item generates a salvage cost $c_s$, which could be positive or negative. An accepted item that, after processing, meets the required tolerances is sold for a price (negative cost) $p$; if it does not meet the required tolerances, it incurs a high "lemon" cost $c_\ell$ attributed to extra overhead and hassle, cost of a replacement for the customer, loss of company standing, salvage, etc.
An admissible policy for making the decision is a function $v$, measurable on $\mathcal{Y}$, taking values in $\{0, 1\}$, with the interpretation that $v = 1$ means acceptance, and $v = 0$ means rejection. This model for a quality control situation is completed by choosing as criterion the expected total cost $C$ of a decision about one item, which for the policy $v$ takes the form:
$$C(v) = E\{v\,[c_z + c_\ell - (c_\ell + p)\,I\{x \in A, z \in B\}] + (1 - v)\,c_s\}, \tag{1}$$
where $E$ is the expectation operator, and $I$ the indicator function. By taking the conditional expectation with respect to the data $\mathcal{Y}$, it can be seen that $v$ should be 1 if:
$$c_z + c_\ell - c_s - (c_\ell + p)\,P\{x \in A, z \in B \mid \mathcal{Y}\} < 0, \tag{2}$$
and 0 otherwise. In other words, the decision rule should be: accept an item for processing if the conditional probability of meeting the specifications given the data exceeds the price ratio:
$$(c_z + c_\ell - c_s)/(c_\ell + p); \tag{3}$$
otherwise reject. We note that this example has an intuitively appealing solution, but call attention to the fact that the filtering is hidden in the calculation of the conditional probability, and that the example covers only one item in one stage. The filtering can be kept hidden because the solution we arrived at is just a consequence of the properties of conditional expectation. However, if a more concrete case would help, suppose that the datum $y$ has the form $h(x) + n$, with $n$ Gaussian $(0, 1)$ and independent of $(x, z)$. Then $\mathcal{Y} = \sigma\{h(x) + n\}$, and nonlinear filtering (see Jazwinski [3]) tells us that, as a function of $y$,
$$P\{x \in A, z \in B \mid \mathcal{Y}\} = N^{-1} E\{\exp\{y h(x) - \tfrac{1}{2} h(x)^2\}\, I\{x \in A, z \in B\}\},$$
$$N = E\{\exp\{y h(x) - \tfrac{1}{2} h(x)^2\}\},$$
where both expectations are taken over $(x, z)$ with $y$ held fixed, and the equation for the least cost $C$ is:
$$C = E\{\min\{0,\; c_z + c_\ell - c_s - (c_\ell + p)\,P\{x \in A, z \in B \mid \mathcal{Y}\}\}\} + c_s. \tag{4a}$$
3. Process or Discard? Go Until the First Acceptance
Being in business to manufacture a whole lot of widgets, we should ask the next natural question: How much does it cost to produce the first accepted item if we restart the process after a rejection with the same, but independent, statistics? This looks superficially like the same problem as in Section 2, and one might expect the same policy to be optimal, but such does not seem to be the case. In this kind of manufacturing situation, one does not get a chance at any sort of return until an item considered for processing is accepted. Applying a certain amount of hindsight, we can say that this new problem appears to be linear fractional in the policy, while the previous problem was linear. Therefore, we may guess that the right policy can be found by solving a linear problem involving an unknown constant, and then finding this constant, which actually turns out to be the optimal cost. We are going to keep trying items until our quality control plan results in an acceptance, so we must restrict the admissible policies to those that actually accept a widget with positive probability. With this proviso, and keeping in mind that our stochastic process has a renewal epoch at a rejection, at which we start over, we see that the cost $C(v)$ of a policy $v$ satisfies the equation in $C$:
$$C = N(v) + (1 - D(v))\,C, \quad \text{that is,} \quad C(v) = N(v)/D(v),$$
where
$$N(v) = c_s + E\{v\,(c_z + c_\ell - c_s - (c_\ell + p)\,I\{x \in A, z \in B\})\} \quad \text{and} \quad D(v) = E\{v\}.$$
The denominator $D$ makes the problem linear fractional, and that is why we needed the proviso on the admissible policies, to make $D(v) > 0$.
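The claim that the one-stage rule need not remain optimal under restarts can be checked on a toy case. The sketch below is illustrative only: it assumes the data take finitely many values j with probabilities `w[j]`, that `q[j]` is the conditional probability of meeting both tolerances given the data value j, and that the cost figures are made up. It evaluates N(v)/D(v) for the Section 2 rule and for every admissible 0/1 policy.

```python
from itertools import product

# Hypothetical costs (not from the paper): machine, lemon, salvage, price.
c_z, c_l, c_s, p = 1.0, 10.0, 0.5, 5.0

# Hypothetical finite data model: the data take value j with probability
# w[j], and q[j] = P(x in A, z in B | data = j).
w = [0.2, 0.2, 0.2, 0.2, 0.2]
q = [0.20, 0.50, 0.75, 0.90, 0.98]


def renewal_cost(v):
    """C(v) = N(v) / D(v) for a 0/1 policy v indexed by the data value."""
    N = c_s + sum(wj * vj * (c_z + c_l - c_s - (c_l + p) * qj)
                  for wj, vj, qj in zip(w, v, q))
    D = sum(wj * vj for wj, vj in zip(w, v))
    if D <= 0:
        raise ValueError("inadmissible policy: it never accepts")
    return N / D


# One-stage rule of Section 2: accept when q exceeds the price ratio.
ratio = (c_z + c_l - c_s) / (c_l + p)
one_stage = tuple(int(qj > ratio) for qj in q)

# Exhaustive search over admissible policies (feasible only for tiny models).
admissible = [v for v in product((0, 1), repeat=len(q)) if any(v)]
best = min(admissible, key=renewal_cost)

print("one-stage policy", one_stage, "cost", round(renewal_cost(one_stage), 4))
print("best policy     ", best, "cost", round(renewal_cost(best), 4))
```

With these made-up numbers the one-stage rule accepts the three most favorable data values, while the search keeps only the top two: when restarting is cheap relative to the expected return, it pays to be more selective.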
For an amusing digression, we let $K$ be a numerical constant, and we define a "special $K$" policy to be the policy $v$ of the form
$$v = I\big[E\{c_z + c_\ell - c_s - K - (c_\ell + p)\,I\{x \in A, z \in B\} \mid \mathcal{Y}\} < 0\big], \tag{7}$$
and for convenience, let $g(K)$ denote the conditional expectation in the last indicator.
Then we prove: Proposition: Let $u$ be an admissible policy, with cost $C(u)$, as defined above. The "special $K$" policy $v$, with $K = C(u)$, is at least as good as $u$: $C(v) \le C(u)$.
Proof: With $v$ defined from $u$ as in the hypothesis, we see that:
$$v\,g(C(u)) = \min\{0, g(C(u))\} \le u\,g(C(u)). \tag{8}$$
Adding $c_s$ and taking expectations, the left-hand side becomes $N(v) - D(v)C(u)$, and the right-hand side becomes $N(u) - D(u)C(u)$, which is zero and therefore also equals $N(v) - D(v)C(v)$. Since $D(v) > 0$, the result follows.
In principle, this result gives a policy improvement iteration scheme for the problem. However, it also implies that the problem reduces to finding the right value of $K$. The nature of the "special $K$" policy, as defined, suggests that the right $K$ is also the optimal cost.
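The policy improvement scheme suggested by the Proposition can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the data are assumed to take finitely many values with probabilities `w[j]`, `q[j]` is the conditional probability of meeting both tolerances, the costs are made-up numbers, and the iteration starts from the policy that accepts everything.

```python
# Hypothetical costs (not from the paper): machine, lemon, salvage, price.
c_z, c_l, c_s, p = 1.0, 10.0, 0.5, 5.0

# Hypothetical finite data model: value j has probability w[j], and
# q[j] = P(x in A, z in B | data = j).
w = [0.2, 0.2, 0.2, 0.2, 0.2]
q = [0.20, 0.50, 0.75, 0.90, 0.98]


def cost(v):
    """Renewal cost C(v) = N(v) / D(v) of a 0/1 policy v."""
    N = c_s + sum(wj * vj * (c_z + c_l - c_s - (c_l + p) * qj)
                  for wj, vj, qj in zip(w, v, q))
    D = sum(wj * vj for wj, vj in zip(w, v))
    if D <= 0:
        raise ValueError("inadmissible policy: it never accepts")
    return N / D


def special_k_policy(K):
    """Accept exactly where g(K) = c_z + c_l - c_s - K - (c_l + p) q < 0."""
    return tuple(int(c_z + c_l - c_s - K - (c_l + p) * qj < 0) for qj in q)


policy = (1,) * len(q)          # start by accepting everything
history = [cost(policy)]
for _ in range(100):            # generous cap; few steps are needed here
    improved = special_k_policy(cost(policy))
    if improved == policy:
        break
    policy = improved
    history.append(cost(policy))

print("final policy:", policy)
print("costs along the iteration:", [round(c, 4) for c in history])
```

Each step replaces the current policy by the "special K" policy built from its own cost, so the recorded costs never increase; the loop stops when the policy reproduces itself.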
To find the right $K$, we shall argue that an optimal policy may as well be a "special $K$" policy for some $K$, and that in fact, $K$ must be the optimal cost, say $C^*$. The corresponding "special $C^*$" policy is:
$$u^* = I[g(C^*) < 0]. \tag{10}$$
Now it can be seen that
$$N(u^*) = c_s + E\{u^*(c_z + c_\ell - c_s - (c_\ell + p)\,I\{x \in A, z \in B\})\} = c_s + E\{\min\{0, g(C^*)\}\} + C^* D(u^*) = C^* D(u^*).$$
The last equality gives an equation for $C^*$, namely $0 = c_s + E\{\min\{0, g(C^*)\}\}$. We shall claim that under reasonable conditions, the root $C^*$ exists, is unique, and is the optimal cost of trying items until the first acceptance.
Another way to see this is to argue that the cost of an optimal policy $v$ cannot be reduced by the policy improvement ploy of changing to the "special $K$" policy with $K = C(v)$, which is:
$$u = I[g(C(v)) < 0]. \tag{13}$$
Since $v$ is optimal and $u$ is at least as good, $C(u) = C(v)$. The formula for the cost of $u$ implies that
$$C(u) = E\{(1 - u)\,C(u) + c_s + u\,(c_z + c_\ell - c_s - (c_\ell + p)\,I\{x \in A, z \in B\})\}, \tag{15}$$
and we can move the conditional expectation in to conclude:
$$0 = c_s + E\{\min\{0, g(C(u))\}\} \tag{16}$$
for an optimal cost $C(u)$, as before.
Now that we have found the equation for what we think is the right K, that is, a necessary condition for optimality, it is straightforward to show that it is sufficient.
Consider any admissible policy $v$, that is, one with $E\{v\} > 0$, with cost $C(v)$ given by $N(v)/D(v)$. We calculate that:
$$c_s + u^* g(C^*) = c_s + \min\{0, g(C^*)\} \tag{17a}$$
$$\le c_s + v\,E\{c_z + c_\ell - c_s - C^* - (c_\ell + p)\,I\{x \in A, z \in B\} \mid \mathcal{Y}\}. \tag{17b}$$
By the equation for $C^*$, the left-hand side has an expectation of zero, and that of the right-hand side is $N(v) - C^* D(v)$, whence
$$0 \le N(v) - C^* D(v),$$
and hence $C^* \le C(v)$. Thus, $C^*$ is a lower bound on the achievable cost, and it is easy to see that the bound is achieved by the "special $K$" policy with $K = C^*$.
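As a numerical check of this lower-bound argument, the following sketch solves the equation 0 = c_s + E{min{0, g(C*)}} by bisection and compares the root with an exhaustive minimum of N(v)/D(v). It is a sketch under assumptions: a finite data model with probabilities `w[j]` and conditional probabilities `q[j]`, made-up costs, and a positive salvage cost, which is what guarantees a sign change on the bracketing interval used here.

```python
from itertools import product

# Hypothetical costs (not from the paper): machine, lemon, salvage, price.
c_z, c_l, c_s, p = 1.0, 10.0, 0.5, 5.0

# Hypothetical finite data model: value j has probability w[j], and
# q[j] = P(x in A, z in B | data = j).
w = [0.2, 0.2, 0.2, 0.2, 0.2]
q = [0.20, 0.50, 0.75, 0.90, 0.98]


def a(qj):
    """g(K) + K at a data value with conditional probability qj."""
    return c_z + c_l - c_s - (c_l + p) * qj


def F(K):
    """c_s + E[min(0, g(K))]; its root is the candidate optimal cost."""
    return c_s + sum(wj * min(0.0, a(qj) - K) for wj, qj in zip(w, q))


# F is continuous and non-increasing in K. With c_s > 0 and weights summing
# to one, F(lo) = c_s > 0 and F(hi) <= -1 < 0 for the bracket below.
lo = min(a(qj) for qj in q)
hi = max(a(qj) for qj in q) + c_s + 1.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if F(mid) > 0:
        lo = mid
    else:
        hi = mid
C_star = 0.5 * (lo + hi)


def cost(v):
    """Renewal cost N(v) / D(v) of an admissible 0/1 policy v."""
    N = c_s + sum(wj * vj * a(qj) for wj, vj, qj in zip(w, v, q))
    D = sum(wj * vj for wj, vj in zip(w, v))
    return N / D


best = min(cost(v) for v in product((0, 1), repeat=len(q)) if any(v))
print("root of the equation:", round(C_star, 6))
print("brute-force minimum: ", round(best, 6))
```

Bisection is used only because the left-hand side is monotone in K; any one-dimensional root finder would serve.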
4. Two Measurements and Two Decisions, and More
Problems with more than one decision make it necessary to specify the operative information structure by stating what data are available for what decision. The clearest case is usually the so-called "classical" information pattern, in which the decisions are made in a sequence, with the later decisions based on all the data available for the earlier ones, and possibly more. This case often permits use of dynamic programming methods, while most other cases are very difficult. Let us consider a more complex case, again involving the random variables $x$ and $z$. We initially receive data about $x$, say in the form of a random variable $y_1$. If we accept the item and process for $z$ as before, we receive some more information about one or both of $x, z$, in the form of a random variable $y_2$. We then again decide whether to accept the item, using both items of information for the second decision.
Here, an acceptance ends the decision process, and a rejection leads to starting over at a renewal epoch. We are interested in the minimal cost of producing the first item that is accepted (i.e., the first item that passes both tests). This is equivalent to having a quality control situation with two inspectors, with the first relaying his data to the second. Since there are two junctures at which an item may be discarded, one before and one after processing for $z$, we use a separate cost for each possibility; thus, $c_1$ is the cost of salvaging an item rejected after seeing $y_1$, and $c_2$ is the analogous cost for $y_2$.
For this scenario, an admissible policy is now a pair $(u, v)$ of functions, each with values in $\{0, 1\}$, with $u$ depending only on $y_1$, and $v$ depending on both $y_1$ and $y_2$. The cost equation for $(u, v)$ is
$$C = E\{uv\,I\{x \in A, z \in B\}(c_z - p)\} + E\{(1 - u)(c_1 + C)\} + E\{u(1 - v)(c_z + c_2 + C)\} + E\{uv\,(1 - I\{x \in A, z \in B\})(c_z + c_\ell)\}. \tag{19}$$
It appears that in spite of the greater complexity of costs and options, the solution of this two-inspection problem is very similar to that of the one-decision problem in Section 3. Indeed, for the interested and diligent reader, there is a whole class of problems in which successive measurements can be prescribed or opted for in sequence before an item is finally accepted, for which the minimal cost to a first acceptance has this same structure.
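One way to see the similarity is to rearrange the cost equation so that the policies enter linearly for a fixed value of C. The following worked step is a sketch of that rearrangement, offered as an aid to the reader rather than as the authors' own derivation; it uses only the costs and indicator already defined, writing $c_\ell$ for the lemon cost.

```latex
\begin{align*}
C &= E\{(1-u)(c_1 + C) + u(1-v)(c_z + c_2 + C)
      + uv\,(c_z + c_\ell - (c_\ell + p)\,I\{x \in A, z \in B\})\}, \\
0 &= c_1 + E\{u\,[\,c_2 - c_1 + c_z
      + v\,(c_\ell - c_2 - C - (c_\ell + p)\,I\{x \in A, z \in B\})\,]\}.
\end{align*}
% For fixed C, minimizing the right-hand side pointwise suggests:
%   v = 1 exactly when E{ c_l - c_2 - C - (c_l + p) I | y_1, y_2 } < 0,
%   u = 1 exactly when E{ c_2 - c_1 + c_z + min{0, (that quantity)} | y_1 } < 0.
```

The second line has the same shape as the one-decision relation of Section 3, with a nested minimization replacing the single one.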
The equation for the optimal cost $C^*$ is:
$$0 = c_1 + E\Big\{\Big[E\big\{c_2 - c_1 + c_z + \big[c_\ell - c_2 - C^* - (c_\ell + p)\,P\{x \in A, z \in B \mid y_1, y_2\}\big]^- \,\big|\, y_1\big\}\Big]^-\Big\}, \tag{20}$$
where $[t]^-$ denotes $\min\{0, t\}$. There is a class of "special $K$" policies which are sufficient in the sense used previously, and the "special $K$" policy with $K = C^*$ is optimal. The arguments for these claims closely parallel those used above in Section 3.
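A small numerical experiment can make the two-inspection equation tangible. The sketch below is an illustration under assumptions, not the paper's computation: the data y_1 and y_2 are assumed to take two values each with the hypothetical probabilities `w1` and `w2`, `q[i][k]` is the conditional probability of meeting both tolerances given both data values, the costs are made up, and the nested equation is taken in the form 0 = c_1 + E{[E{c_2 - c_1 + c_z + [c_l - c_2 - C - (c_l + p) q]^- | y_1}]^-}, where c_l stands for the lemon cost. The root found by bisection is compared with an exhaustive search over policy pairs (u, v).

```python
from itertools import product

# Hypothetical costs (not from the paper).
c_z, c_l, p = 1.0, 10.0, 5.0      # machine, lemon, price
c_1, c_2 = 0.5, 0.8               # salvage after y1, salvage after y2

# Hypothetical data model: P(y1 = i) = w1[i], P(y2 = k | y1 = i) = w2[i][k],
# and q[i][k] = P(x in A, z in B | y1 = i, y2 = k).
w1 = [0.5, 0.5]
w2 = [[0.5, 0.5], [0.5, 0.5]]
q = [[0.30, 0.70], [0.60, 0.95]]
m, n = 2, 2


def G(C):
    """Right-hand side of the nested equation as a function of C."""
    total = c_1
    for i in range(m):
        inner = sum(w2[i][k] * min(0.0, c_l - c_2 - C - (c_l + p) * q[i][k])
                    for k in range(n))
        total += w1[i] * min(0.0, c_2 - c_1 + c_z + inner)
    return total


# G is non-increasing in C; for these numbers G(lo) > 0 > G(hi).
lo, hi = -c_2 - p - 1.0, c_z + c_l + 1.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if G(mid) > 0:
        lo = mid
    else:
        hi = mid
C_star = 0.5 * (lo + hi)


def cost(u, v):
    """Cost to first acceptance; u[i] acts on y1, v[i * n + k] on (y1, y2)."""
    N = D = 0.0
    for i in range(m):
        for k in range(n):
            pr = w1[i] * w2[i][k]
            if not u[i]:
                N += pr * c_1                      # rejected before processing
            elif not v[i * n + k]:
                N += pr * (c_z + c_2)              # processed, then rejected
            else:
                N += pr * (c_z + c_l - (c_l + p) * q[i][k])
                D += pr                            # processed and accepted
    return N / D if D > 0 else float("inf")


best = min(cost(u, v)
           for u in product((0, 1), repeat=m)
           for v in product((0, 1), repeat=m * n))
print("root of the nested equation:", round(C_star, 6))
print("brute-force minimum:        ", round(best, 6))
```

The exhaustive search is feasible only because the toy model has 64 policy pairs; its purpose is to confirm that the scalar equation and the direct minimization agree on this example.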
Still, the non-classical case in which the second inspector knows only $y_1$ and $u$ remains unsolved, although two necessary optimality conditions can be written down.