A Solvable Time-Inconsistent Principal-Agent Problem

We consider a dynamic principal-agent contract model with time-inconsistent preferences to study the influence of time inconsistency on the optimal effort and the optimal reward mechanism. We show that when both the principal and the agent are time-consistent, the optimal effort and the optimal reward are decreasing functions of the uncertainty factor. When the agent is time-inconsistent, his impatience has a negative impact on the optimal contract: the higher the agent's discount rate, the lower the effort provided, as impatient agents prefer immediate enjoyment. In addition, when both the principal and the agent are time-inconsistent, their impatience can, in a special case, offset the impact of the uncertainty factor on the optimal contract, although the impatience itself then affects the contract.


Introduction
The principal-agent problem is a classic issue in optimal contracting and is widely applied in finance and economics. The principal and the agent, as the two parties to the contract, interact with each other. Under the constraints of the contract, the agent creates profits for the principal, and the principal pays the agent a salary as an incentive. In this paper, we introduce an optimal contract in which both the principal and the agent are time-inconsistent to solve the principal-agent problem under moral hazard in a dynamic environment. Throughout, the agent is regarded as risk-neutral and the principal as risk-averse.
In solving the optimal contract for the time-inconsistent principal-agent problem, there are three difficulties we need to face. The first is the solution of the principal-agent problem in continuous time. A continuous-time model in which the agent controls the drift rate of a Brownian motion over a time interval is studied by [1]. Later, [2,3] use martingale methods to develop the first-order approach to principal-agent problems under moral hazard with exponential utility. Reference [4] shows that the first-best sharing rule is also linear in output in the continuous-time principal-agent model with exponential utility. References [5,6] use the stochastic maximum principle to extend Holmstrom's model and discuss the optimal solution for an agent with private information in a continuous-time model. References [7,8] use forward-backward stochastic differential equations to consider the optimal contract under moral hazard. Reference [9] systematically expands the continuous-time principal-agent problem. However, the above methods solve the principal-agent problem over a continuous period of time under time-consistent preferences, which oversimplifies the actual situation. Therefore, it is natural to consider time inconsistency in the principal-agent model.
The second difficulty is how to find the optimal strategy when time preferences are inconsistent. Reference [10] proposes optimal contracts for a principal who contracts with dynamically inconsistent agents in a discrete-time setting. Their study includes exploitative contracts applied to naive agents to better explain real contractual arrangements. Building on that, [11,12] take the neutral agent as a benchmark to study the possibility that the principal manipulates the naive agent. The result shows that the naivety of the agent does not benefit the principal, and the maximum utility of the principal is the same whether he faces a neutral agent or a naive agent. The definitions of naive and neutral agents were first given in [13]. Reference [14] takes quasi-hyperbolic discounting as the agent's discount function and then discusses the optimal contract and the principal's profit level when the agent is neutral or naive. The above works mainly deal with time-inconsistent agents in discrete time. Less attention has been paid to continuous time, because it is complicated to solve for the closed-loop solution under nonconstant discounting. Reference [15] proposes optimal contract models based on the Pontryagin maximum principle for forward-backward stochastic differential equations to study a general continuous-time principal-agent problem in which the utility function is time-inconsistent.
The third difficulty is how to find the exact solution of the Hamilton-Jacobi-Bellman (HJB) equation. Solving the HJB equation via stochastic control is a complex mathematical process; in particular, adding control variables leads to increasingly complex nonlinear partial differential equations. By the Legendre transform, the problem can be converted into a dual problem that is more convenient to analyze, which makes some otherwise intractable models solvable. Reference [16] studies the portfolio problem under a general utility function and proves the effectiveness of the Legendre dual method for solving the HJB equation. Reference [17] uses Legendre transform-dual theory to solve the optimal investment problem under hyperbolic absolute risk aversion (HARA) preferences in the constant elasticity of variance (CEV) model. References [18,19] study the investment-consumption problem with HARA utility by the Legendre method.
In this paper, we study the optimal incentive contract under moral hazard in the framework of a continuous-time principal-agent problem with time inconsistency. We assume both the principal and the agent are time-inconsistent, where the principal is risk-averse with an exponential utility function and the agent is risk-neutral with a linear utility function. To describe the time inconsistency of the participants, we assume that each participant's discount rate is a function of time (not a constant) but still takes the form of exponential discounting, because the principal's utility function is exponential. Using the properties of the exponential function, we can divide the discount function into two parts: one part is the traditional discount function (with a constant discount rate) and the other part is the uncertainty. Under moral hazard, the principal can observe the output process but cannot observe the agent's effort or the random perturbations. Thus the principal treats part of the discount function as an unknown factor that affects the output. Reference [20] shows that the principal can continually learn and update his belief about the unknown factor (the uncertain part of the discount function) from current and historical information generated by the production process. Therefore, we transform the time-inconsistent principal into a time-consistent principal with a learning process. For the time-inconsistent agent, we employ the Markov subgame perfect Nash equilibrium method [21] to obtain his time-consistent strategy.
Under the above assumptions, we solve the optimal contract in two cases, where the principal is time-consistent and time-inconsistent, respectively. When the principal is time-consistent, we use the stochastic optimal control method to derive the nonlinear partial differential equation (HJB equation) for the principal's optimal value function. This partial differential equation is hard to solve for an exact closed-loop solution; however, in some cases the original problem can be transformed into a dual problem by applying the Legendre transform. To obtain the exact (closed-loop) solution of the optimal contract, we use Legendre transform-dual theory to derive explicit expressions for the optimal solution and the optimal contract. When the principal is time-inconsistent, we obtain a value function that takes time, the agent's private information, and the agent's utility as variables, and thereby derive a three-dimensional nonlinear second-order HJB equation. In this situation, we solve the HJB equation by guessing the form of the solution.
The general structure of this paper is as follows. Section 2 presents the model. The incentive compatibility conditions and their proofs are provided in Section 3. Section 4 studies the optimal contract for a time-consistent and a time-inconsistent principal. Section 5 provides a numerical simulation of the optimal strategy. Finally, Section 6 concludes.

The Model
2.1. The Agent. Suppose a principal signs a contract with a time-inconsistent agent to manage a production process (or invest in a risky project), and record the initial time of the contracting period as 0. Consider an infinite-horizon stochastic environment; let $\{B^0_t\}$ be a standard Brownian motion on the probability space $(\Omega, \mathcal{F}, \mathbb{P}^0)$. The risk process that pays a cumulative amount $X_t$ evolves on the period $[0,T]$ as
$$dX_t = e_t\,dt + \sigma\,dB^0_t, \tag{1}$$
where $\mathcal{A} \subseteq \mathbb{R}$ is a compact set, $e_t \in \mathcal{A}$ is the agent's effort choice, $c_t \in \mathbb{R}$ is the agent's salary (or his consumption), and $\sigma > 0$ is the project's (constant) volatility. The path of $X_t$ is observable by both the principal and the agent, but the path of $\{B^0_t\}$ is observable only by the agent, and the effort choice $e_t$ is unobservable by the principal.
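As a quick illustration, the output dynamics above can be simulated with an Euler-Maruyama scheme. This is only a sketch; the function name and the constant-effort choice are illustrative assumptions, not part of the paper.

```python
import numpy as np

def simulate_output(effort, sigma=0.5, T=1.0, n=1000, rng=None):
    """Euler-Maruyama sketch of dX_t = e_t dt + sigma dB_t (notation assumed)."""
    rng = rng or np.random.default_rng(0)
    dt = T / n
    t = np.linspace(0.0, T, n + 1)
    X = np.zeros(n + 1)
    for i in range(n):
        dB = rng.normal(0.0, np.sqrt(dt))               # Brownian increment
        X[i + 1] = X[i] + effort(t[i]) * dt + sigma * dB
    return t, X

t, X = simulate_output(lambda s: 1.0)  # constant effort e_t = 1
```

With `sigma=0` the path reduces to cumulative effort, which is a convenient sanity check on the discretization.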
At the initial time 0, the principal provides the agent with a contract (paying the salary according to the contract). Assume that the salary is composed of two parts, a continuous payment $c_t$ and a terminal payment $C_T$. Moreover, we assume that the agent is risk-neutral, that $u$ and $v$ (we give the explicit functional forms for $u$ and $v$ in the specific problem; see (26) and (27)) are utility functions, and that $u$ and $v$ are concave and twice continuously differentiable. The agent has a discount function $h$, where $h(t) = e^{-\int_0^t \rho(s)\,ds}$ (in this paper, for convenience, $h(t)$ is sometimes written as $h_t$) is a general discount function; see [22]. The agent is time-inconsistent if $\rho(\cdot)$ is time-dependent.
The agent's preferences as of time 0 read
$$\mathbb{E}^0\left[\int_0^T h(t)\,u(c_t, e_t)\,dt + h(T)\,v(C_T)\right]. \tag{2}$$

2.2. The Principal. In this paper, we assume that the principal is of the partially naive type (in a discrete time-inconsistent model, [23] disaggregates participants into sophisticated, naive, and partially naive types based on their cognitive differences about their own future preferences). This means that the principal knows he is time-inconsistent (his discount function is time-varying), but his current perception of the future discount rate is biased relative to the true value: at time $t$, he cannot be sure of the value of the discount rate $\lambda(s)$ (we may further assume $\rho(\cdot), \lambda(\cdot) : [0,T] \to [0,1]$) for $0 \le t < s \le T$. Therefore, the principal continuously updates his belief about the future discount rate based on past information. The detailed analysis is as follows.
The principal's preferences as of time 0 read
$$\mathbb{E}\left[\int_0^T \phi(t)\,U(X_t - c_t)\,dt + \phi(T)\,U(-C_T)\right], \tag{3}$$
where $U$ is the principal's utility function and $\phi(t) = e^{-\int_0^t \lambda(s)\,ds}$ is his discount function. Assume that the utility over salary (consumption) and effort is exponential, $U(X_t - c_t) = -e^{-\gamma(X_t - c_t)}$ and $U(-C_T) = -e^{-\gamma(-C_T)}$, where $\gamma$ is the absolute risk aversion coefficient. Hence, we rewrite (3) as
$$\mathbb{E}\left[-\int_0^T e^{-\lambda t}\,e^{-\gamma(X_t - c_t + m_t)}\,dt - e^{-\lambda T}\,e^{-\gamma(-C_T + m_T)}\right], \tag{4}$$
where $m_t = (\int_0^t \lambda(s)\,ds - \lambda t)/\gamma$ for $t \in [0,T]$. From (4), under the exponential utility function we can split the principal's discount factor $\phi$ into two parts: one part is a constant discount rate $\lambda$ and the other part is $m_t$. The purpose of this operation is that the principal estimates a suitable constant discount rate $\lambda$ instead of the time-varying discount rate $\lambda(t)$, and the principal does not know the exact value of this constant. Therefore, the principal constantly updates his assessment of $\lambda$ based on past information. $\lambda$ is the subjective choice of the principal, but $m_t$ reflects an objective reality: it captures the type of the principal (time-inconsistent or time-consistent) and does not depend on the principal's subjective choice. So we can treat $m_t$ as a part of the output (investment) process. In this way, we turn the principal's time-inconsistency problem into a problem with an unknown constant discount rate. If the principal is time-consistent, namely, if his discount rate is a constant $\lambda_0$, he can choose $\lambda_0$ as the constant discount rate $\lambda$; hence $\lambda_0 = \lambda$ (or $m_t = 0$). Under the principal's probability measure $\mathbb{P}$, we regard $m$ as an intrinsic influence on the risk project that is not subject to the principal's control but must be taken into account. Hence the risk process (1) becomes
$$dX_t = (e_t + m)\,dt + \sigma\,dB_t. \tag{5}$$
As discussed above, the process $X_t$ and the path of $B^0_t$ are observable by the agent; therefore the measure for the agent is $\mathbb{P}^0$, which means that the agent does not learn in secret, so the agent's beliefs are not a hidden state variable ([24]) (this does not mean that the agent cannot mislead the principal by choosing an effort, just that such actions do not cause persistent hidden states according to the agent's beliefs). From (1) we have
$$dX_t = e_t\,dt + \sigma\,dB^0_t. \tag{6}$$
Equation (5) expresses the principal's beliefs about the project and (6) expresses the agent's beliefs. The disagreement between the principal and the agent is caused by the principal's nonexponential discounting.
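The splitting of the discount factor described above is easy to verify numerically: any factor $e^{-\int_0^t \lambda(s)ds}$ equals a constant-rate part $e^{-\bar\lambda t}$ times a residual $e^{-r(t)}$ with $r(t) = \int_0^t \lambda(s)\,ds - \bar\lambda t$. The rate function below is an arbitrary illustrative choice, and the trapezoidal integration is only a sketch.

```python
import numpy as np

t = np.linspace(0.0, 10.0, 2001)
lam = 0.10 + 0.05 * np.exp(-t)     # an assumed time-varying discount rate
lam_bar = 0.10                     # the constant rate the principal estimates
# Cumulative trapezoidal integral of lam over t:
integral = np.concatenate(([0.0], np.cumsum((lam[1:] + lam[:-1]) / 2 * np.diff(t))))
phi = np.exp(-integral)            # true discount factor
residual = integral - lam_bar * t  # r(t), the "uncertain" part of the exponent
phi_split = np.exp(-lam_bar * t) * np.exp(-residual)
```

The two arrays `phi` and `phi_split` agree to machine precision, which is exactly the decomposition the principal exploits.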
At time $t$, the principal knows the exact value of $\int_0^t \lambda(s)\,ds$, because $\lambda(\cdot)$ is his own discount rate, but he does not know the exact value of $m$; therefore we use a one-sided Bayesian learning model after the contract is signed, and we assume that the prior about $m$ at time 0 is normally distributed with mean 0 and variance $v_0$. The agent does not update his beliefs because he has perfect information. If the agent follows the recommended effort choice $e$, the principal's posterior beliefs about $m$ depend on $X_t$ and on the cumulative effort $A_t = \int_0^t e_s\,ds$. According to the Kalman-Bucy filter (see [25]), the conditional expectation $\hat m_t = \mathbb{E}[m \mid X_t, A_t] = \sigma^{-2}(X_t - A_t)/\gamma_t$ and the precision of filtering $\gamma_t = (\mathbb{E}[(m - \hat m_t)^2])^{-1}$ satisfy the system of equations
$$d\hat m_t = \frac{\sigma^{-1}}{\gamma_t}\,dZ_t, \qquad \dot\gamma_t = \sigma^{-2},$$
where $\hat m_0 = 0$, $\gamma_0 = 0$, and $Z_t$ is a standard Brownian motion under the measure induced by the effort sequence.
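A discrete-time sketch of this filtering step: the principal extracts the unknown constant drift from output net of recommended effort. All parameter values are illustrative assumptions; the update rule is the standard Kalman-Bucy recursion for a constant hidden drift, not code from the paper.

```python
import numpy as np

def filter_drift(m_true, sigma=0.5, gamma0=1.0, T=50.0, n=5000, rng=None):
    """Learn a constant drift m from observations dY_t = m dt + sigma dB_t."""
    rng = rng or np.random.default_rng(1)
    dt = T / n
    m_hat, gamma = 0.0, gamma0                     # prior mean and precision
    for _ in range(n):
        dY = m_true * dt + sigma * rng.normal(0.0, np.sqrt(dt))
        gamma += dt / sigma**2                     # precision grows deterministically
        m_hat += (dY - m_hat * dt) / (sigma**2 * gamma)  # posterior-mean update
    return m_hat, gamma

m_hat, gamma = filter_drift(m_true=0.3)
```

The terminal precision is $\gamma_0 + T/\sigma^2$ regardless of the observed path, while the posterior mean converges toward the true drift as information accumulates.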

Incentive Compatible Conditions
In this section, we focus on the agent's problem. Since the agent's objective function relies on the consumption process $\{c_t\}$, which in turn relies on the history of the whole output, it is non-Markovian (for the specific proof, see [6]). Thus, the agent's optimization problem cannot be solved by the standard dynamic programming principle. We will instead employ the stochastic maximum principle for the weak formulation of the agent's problem. The main idea is to apply stochastic variational methods; the related papers [8,20,26] use a similar approach.
Define the agent's continuation value (promised utility) $W_t$ as the expected discounted utility from remaining in the contract from date $t$ forward, where $X^t \triangleq \{X_s : 0 \le s \le t\}$ is the output history. We use $\Gamma$ to relate the expectation operator $\mathbb{E}^e[\cdot]$ under the measure $\mathbb{Q}^e$ (because the agent's objective function depends on the consumption process $c_t$, which is non-Markovian since it depends on the whole output path $X^t$, the optimization problem (9) cannot be analyzed with standard methods, so we use a martingale approach; given a contract $(c, e^*)$, the agent controls the distribution of salary through his choice of effort; for the specific technical treatment see Appendix A). The agent's objective function can be recast as
$$W_t(e) = \mathbb{E}^e\left[\int_t^T h(s)\,u(c_s, e_s)\,ds + h(T)\,v(C_T)\,\Big|\,\mathcal{F}_t\right],$$
where $c_s = c(s, X^s)$ represents the salary paid by the principal based on the output history.
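The martingale property of the density process $\Gamma$ underlying the change of measure can be checked by Monte Carlo. The Girsanov kernel $\theta$ below is an assumed constant; in the paper's setting it would typically correspond to $e_t/\sigma$. This is a sketch, not the paper's construction.

```python
import numpy as np

def density_terminal(theta=1.0, T=1.0, n=200, paths=20000, rng=None):
    """Simulate Gamma_T = exp(int theta dB - 0.5 int theta^2 dt) path by path."""
    rng = rng or np.random.default_rng(2)
    dt = T / n
    logG = np.zeros(paths)
    for _ in range(n):
        dB = rng.normal(0.0, np.sqrt(dt), size=paths)
        logG += theta * dB - 0.5 * theta**2 * dt
    return np.exp(logG)

G = density_terminal()
# Gamma is an exponential martingale (Novikov's condition holds for
# bounded theta), so its terminal expectation should be close to 1.
```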
After the change of measure, the time-inconsistent agent's problem has only one control. We apply a stochastic maximum principle to characterize the agent's optimality condition, and we also use the dynamic programming equation to derive a stochastic maximum principle with general time-inconsistent preferences.
The agent's problem is to find an admissible control maximizing the expected reward $W_0(c, e)$. In other words, given $c$, the agent needs to solve
$$\sup_{e \in \mathcal{A}} W_0(c, e)$$
for all $0 \le t \le T$, subject to the dynamics above. Next we define the optimal effort for the time-inconsistent agent. Let $\varepsilon > 0$ and let $E_\varepsilon \subseteq [0,T]$ be a measurable set whose Lebesgue measure is $|E_\varepsilon| = \varepsilon$. Let $a_t \in \mathcal{A}$ be an arbitrary effort choice. We define
$$e^\varepsilon_t = \begin{cases} a_t, & t \in E_\varepsilon, \\ e^*_t, & t \notin E_\varepsilon, \end{cases}$$
and it is clear that $e^\varepsilon \in \mathcal{A}$. We refer to $e^\varepsilon$ as a needle variation of the effort choice $e^*$. Then, we have the following definition.

Definition 1. An effort choice $e^*$ is an optimal effort choice for the time-inconsistent agent if, for every needle variation $e^\varepsilon$ with $\varepsilon \in (0, T]$,
$$\lim_{\varepsilon \downarrow 0} \frac{W_0(c, e^\varepsilon) - W_0(c, e^*)}{\varepsilon} \le 0.$$
The optimal density process $\{\Gamma^*(t)\}_{t \in [0,T]}$ is a solution of the corresponding stochastic differential equation.
Through the above technical treatment, we convert the agent's time-inconsistent strategy into a time-consistent optimal strategy.
Next, we analyze the conditions for implementing the incentive contract. According to the previous analysis, the agent controls the distribution of salary by choosing his effort. The idea of using the distribution of salary as the control in principal-agent problems goes back to [27] and was expanded by [5,20]. The principal's learning process complicates our problem, as past effort affects not only the current salary but also the future expectations of both the agent and the principal. Therefore, we have to deal with a principal-agent problem with time inconsistency and a learning process. In Appendix B, we show how this difficulty can be handled through an extension of the proofs in [8,20]. The conclusion is the following.

Proposition 2. The agent's continuation value can be uniquely represented in the following differential form:
where $y_t$ is a square-integrable predictable process. The necessary and sufficient conditions for $e^*_t$ to be the optimal effort choice read as follows: (i) if $e^*_t$ is the optimal effort choice, then for every $t \in [0,T]$ there exists a solution $\{W_t, y_t\}_{t \in [0,T]}$ of (16) which satisfies the first-order condition below (in this paper, $u_e$ and $u_{ee}$ denote the first- and second-order partial derivatives of $u$ with respect to $e$, respectively); (ii) for almost all $t$, if the stated inequality holds, where $p$ is the predictable process defined below, then $e^*_t$ is the optimal effort choice.
According to (18), $p_t$ is a stochastic process capturing the value of private information, and we then obtain the solution for all $s \in [t, T]$.
In the following, at any time $t$ with $e^*_t > 0$, the process for $p$ reads as above; its coefficients are chosen by the principal to maximize his expected utility (the proof is given in [20]). According to (19), the process $p_t$ is the random fluctuation in the discounted sum of marginal utilities evaluated from time 0. Based on the stochastic differential equation of $p_t$, we can obtain its explicit form. Besides that, $c_t$, $e_t$, $y_t$, and $p_t$ are endogenous, which implies that we need to construct a contract satisfying the necessary conditions and then prove that it also meets the sufficient conditions. If the contract has no explicit solution, it is difficult to prove that it also satisfies the sufficient conditions. In this paper, the utility function of the principal is exponential and the utility function of the agent is linear. In the next section we employ the exponential utility function to obtain the closed-loop solution of the contract.

The Optimal Contract
This section explains in detail how to solve the principal's problem and derive the optimal contract in closed form when the principal's utility is exponential.
Eliminating $\hat m$ from the list of states. For a given contract $(c, e^*)$, the principal's expected utility from date $t$ forward reads as follows, and we define the auxiliary value function accordingly; we then have the following result.
The proof is in Appendix C.
Assume that Propositions 2 and 3 hold, so that the necessary condition is also sufficient. The principal's problem consists of solving for $V(t, W, p)$ subject to the two promise-keeping constraints (9) and (18) and the incentive constraint (17). Given that the posterior mean $\hat m$ does not enter directly into any of the constraints, it can be dropped as a state, leaving only the precision as a belief state. Furthermore, since $\gamma_t$ is deterministic, we may index the precision by $t$. The fact that the expected value of $m$ is immaterial to the principal's objective illustrates that incentives are designed to reward effort, not ability.

The Agent's utility function. To obtain the solution of the optimal contract, recall from our assumption in Section 2.1 that the agent is risk-neutral, so his utility functions are linear, i.e.,

$$u(c_t, e_t) = c_t - \frac{1}{2}e_t^2; \tag{26}$$
moreover, we make a particular assumption about the terminal utility of the agent, setting
$$v(C_T) = \theta\,C_T, \tag{27}$$
where $\theta$ is a constant. This assumption describes a situation in which an infinitely lived agent retires at the termination date $T$ of the contract and, after retirement, consumes a permanent annuity derived from $C_T$. Since we concentrate on problems where the contracting horizon goes to infinity, $T \to \infty$, this particular assumption on $v$ is not critical.

Incentive-providing contracts. Restate the principal's optimization problem as the maximization of his expected utility subject to the constraints above. Since the state variables $W_t$ and $p_t$ are Markovian processes, we can use the HJB equation to analyze the principal's optimal control problem. Take $V(t, W, p)$ as the principal's value function; this value function satisfies the HJB equation for $0 < t < T$:

Second Best Contracts for the Time-Consistent Principal.
In this section, we consider the model with a time-consistent principal under hidden action, meaning that the principal can observe the process $X_t$ but does not know the type of the agent and cannot observe the agent's effort $e_t$. For incentive contracts $(c, e)$ with $e_t > 0$ for any $t \ge 0$, the necessary condition for incentive compatibility (17) becomes $y_t = -u_e(c_t, e_t)$. When the principal is time-consistent, $\lambda(t) = \lambda$ for all $t \in [0,T]$; hence $m = 0$. There is then no need to infer $m$ and no scope for belief manipulation, which indicates that the information value $p$ is equal to zero.

The HJB Equation.
When time tends to infinity ($T \to \infty$), the agent's continuation value $W$ is the only state variable, and the principal's HJB equation reads as follows, with the terminal condition $V(T, W_T) = U(-C_T)$, where $W_T = v(C_T)$. Taking the first-order conditions with respect to $(e, c)$, we obtain the optimality conditions. Under full information, the principal can observe the agent's effort and consumption, and there is no private information in this case. Hence, the principal can freely choose $e_t$ as part of the contract; i.e., $e$ is independent of $W$ and $p$, and then we have the following proposition.
Proposition 4. Under full information, the optimal effort for the principal is a constant $e^* = 1$. We call $e^* = 1$ the first-best effort level.
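Under the quadratic effort-cost specification assumed in our reading of (26) (cost $e^2/2$ with unit marginal product of effort, which is an illustrative assumption), the first-best level can be confirmed by a one-line grid search:

```python
import numpy as np

# Surplus per unit time: output drift e minus effort cost e^2/2.
e = np.linspace(0.0, 2.0, 20001)
surplus = e - 0.5 * e**2
e_star = e[np.argmax(surplus)]  # maximized where 1 - e = 0, i.e. e* = 1
```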
In the hidden-action case, the optimal effort and consumption derived from (34) are as shown in (35). Putting (35) into (33), the HJB equation (33) for the value function $V(t, W)$ is rewritten accordingly. Since the principal has an exponential utility function, the HJB equation is a complex nonlinear partial differential equation, and it is difficult to solve it directly by the classical separation-of-variables method. Next, we employ the Legendre transform to turn the problem into a dual problem and obtain the optimal solution of the original problem by solving the dual one.

Legendre Transform. The dual function of $V$ is defined by
$$\hat V(t, z) = \sup_{W > 0}\{V(t, W) - zW\},$$
where $z < 0$ is the dual variable of $W$. The function $g(t, z)$ is closely related to $\hat V(t, z)$ and can also be used as a dual function of $V(t, W)$. In this paper, $g(t, z)$ is defined as the dual of $V(t, W)$ and satisfies
$$g(t, z) = \inf_{W > 0}\{W \mid V(t, W) \ge zW + \hat V(t, z)\}.$$
According to the definition of the dual function, we have $z = V_W(t, W)$ and $g(t, z) = -\hat V_z(t, z)$. Based on the conclusions in [28], the following transformation rules are obtained:
$$V_W = z, \qquad V_t = \hat V_t, \qquad V_{WW} = -\frac{1}{\hat V_{zz}}.$$
With the analysis in [29], the functions $v(\cdot)$ and $\hat v(\cdot)$ can likewise be related through the Legendre transform, and the relationship between the optimal values is $W^* = -\hat V_z(t, z^*)$. Applying these rules, the HJB equation for the dual problem follows; differentiating with respect to $z$ and combining with (35), two ordinary differential equations are established.

Proposition 5. Assume that (i) the principal is time-consistent and the agent is time-inconsistent, (ii) $u(c, e)$ and $v(\cdot)$ are as defined in (26) and (27), and (iii) $e^*_t > 0$ for all $t$, so that the incentive constraint (17) binds for almost all $t \in [0,T]$. Then the recommended effort and the agent's consumption are given explicitly in terms of the solution of the dual equations.
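The Legendre transform can be explored numerically on a toy concave function. Here $V(W) = -e^{-W}$ is an assumed stand-in for the value function, and the sign convention of the dual variable is chosen so the supremum is attained on the grid (the paper's convention may differ).

```python
import numpy as np

def legendre(V, W_grid, z):
    """Grid approximation of Vhat(z) = sup_W [V(W) - z*W]."""
    vals = V(W_grid) - z * W_grid
    i = np.argmax(vals)
    return vals[i], W_grid[i]          # (Vhat(z), maximizing W*)

V = lambda W: -np.exp(-W)
W_grid = np.linspace(0.0, 10.0, 100001)
z = 0.5
Vhat, W_star = legendre(V, W_grid, z)
# Closed form for this V: W* = -ln z and Vhat(z) = -z + z*ln z, so the
# duality rule W* = -Vhat'(z) can be verified directly.
```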

Second Best Contracts for the Time-Inconsistent Principal.
In this section, we discuss the case where the principal is time-inconsistent, which means the discount rate $\lambda$ is not a constant. In this case, the principal still cannot observe the agent's effort and consumption (moral hazard); hence the value of private information is not zero. As described in Section 3, the HJB equation is as follows, and we solve it by guessing the form of the solution. From the first-order conditions for $V(t, W, p)$, we obtain (55). Substituting (55) into (54) and denoting $\tilde V = (V_{WW}\,V_{pp} - (V_{Wp})^2)/V_{pp}$, we obtain (56), where $\tilde E = ((\lambda/\gamma)\tilde V + V_{Wp} - V_p)^2/(\sigma^2 \tilde V + V_{pp})$.
In particular, we suppose the value function has a separable form in terms of functions $a(t)$ and $b(t, p)$, with $a(T) = 1$ and $b(T, p) = 0$. The expressions for the optimal effort and consumption then follow, and eliminating the dependence on $W$ yields two differential equations. According to (60), we can obtain $a(t)$. From the above analysis, we still need the specific expression of $b(t, p)$ to obtain explicit expressions for the effort $e(t)$ and consumption $c(t)$.
Substituting $b(t, p) = \tilde A(t) + B(t)\,p$ into (61) and eliminating the dependence on $p$, we obtain two differential equations; the conclusion of Lemma 6 follows by solving them.
Proposition 7. When the principal is time-inconsistent, the expressions for the agent's optimal effort and consumption are as follows, where $a(t)$, $\tilde A(t)$, and $B(t)$ are given by (62) and (64), respectively.
It can be seen from the proposition that the second-best optimal effort is less than the first-best effort. The optimal consumption is a linear function of the agent's promised value and private information.

Numerical Simulation
In this section, we provide numerical simulations to characterize the dynamic behavior of the optimal strategies derived in the previous section. First, we simulate the optimal effort when the principal is time-consistent.
As shown in Figure 1, the agent's discount rate is taken to be constant, namely, $\rho(t) = \rho$. The optimal effort decreases as the volatility increases: the greater the uncertainty, the lower the agent's effort. In addition, all three curves decline over time, indicating that effort is a decreasing function of time.
If we instead take the agent's discount rate to be nonconstant, $\rho(t) = \rho + k e^{-t}$, where $k = 0.1$ and $\sigma = 0.1, 0.2, 0.5$, respectively, the effort curves are as drawn in Figure 2. In analogy with Figure 1, although the discount function is different, we reach a similar conclusion: the greater the uncertainty, the less effort any type of agent (whether time-consistent or time-inconsistent) provides. The reason is that, under moral hazard, the principal cannot distinguish the influence of the agent's effort from that of uncertainty on the risky project's return.
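The nonconstant rate used for Figure 2 implies uniformly heavier discounting than the constant-rate benchmark, which is one way to see the extra impatience. The closed-form integral below uses $\int_0^t k e^{-s}\,ds = k(1 - e^{-t})$; the values $\rho = 0.1$ and $k = 0.1$ follow the text, while the grid is an arbitrary choice.

```python
import numpy as np

rho, k = 0.1, 0.1
t = np.linspace(0.0, 10.0, 101)
h_const = np.exp(-rho * t)                           # constant rate rho
h_var = np.exp(-(rho * t + k * (1.0 - np.exp(-t))))  # rate rho + k*exp(-t)
# The time-inconsistent agent discounts every future date more heavily:
# h_var < h_const for all t > 0.
```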
Next, we simulate the optimal consumption (salary) under specific parameters. Let $\rho(t) = \rho = \lambda$ and $\sigma^2 = 0.25$. According to the expression for $c$, and since $c$ is a stochastic process, its mathematical expectation can be expressed in terms of $F(t) = v(c(t)) \triangleq k + \ln c(t)$, up to a constant $k$.
Then, substituting the relevant expressions into the above formula, the relationship between $c(t)$ and time $t$ can be simulated with the terminal condition $c(T) = 1$.
Consumption initially diminishes because effort decreases monotonically over time. After falling to a certain value, consumption bottoms out and rises, a trend explained by the terminal condition imposed on consumption. The overall consumption trend is shown in Figure 3.
As shown in Figure 3, in the period just after the contract begins (for example, $t \in [0, 5]$), the greater the volatility, the lower the consumption, because greater volatility leads to lower effort. In the latter part of the contract, the situation is reversed.
Finally, we simulate the optimal effort when the principal is time-inconsistent. Assuming the agent's discount rate is constant, the two effort curves are horizontal lines, as Figure 4 shows. This indicates that, for a given discount rate, the optimal effort does not change over time; the reason is that we hypothesize the principal's value function to be exponential under private information. Moreover, effort is a decreasing function of the agent's discount rate: if the agent attaches more weight to the present (immediate enjoyment), his discount rate is higher and he provides less effort.

Conclusion
In this paper, we studied a time-inconsistent principal-agent problem under both full information and moral hazard. In particular, the optimal contracts we discussed in detail assume that the principal is risk-averse and the agent is risk-neutral. We made two main contributions. First, we handled the time-inconsistent principal and agent, respectively: we transformed the time-inconsistent principal into a time-consistent one by splitting the time-varying discount rate, and we used the Markov subgame perfect Nash equilibrium method to obtain the time-consistent strategy of the time-inconsistent agent. Second, we used Legendre transform-dual theory to convert the HJB equation into its dual equation, from which the solution of the original HJB equation was obtained smoothly, yielding explicit expressions for the optimal effort and optimal consumption. Under moral hazard, we also obtained the exact solution of the original HJB equation by guessing the form of the solution. We found that the agent's optimal consumption is a linear function of the promised value $W$ and the private information $p$, and that the optimal effort is a function of the agent's discount rate. Finally, we considered the contractual relationship between the principal and the agent in a special circumstance. More general time-inconsistent contracting situations should be considered in future research.

A. The Change of Measure

so that $X_t$ is also a Brownian motion under $\mathbb{Q}$. Given a contract $(c, C_T)$, we define the drift of output as
$$f(t, X, e_t) = e_t - c(t, X) + \hat m_t. \tag{A.2}$$
Since expected output is linear in cumulative output, we define an $\mathcal{F}_t$-predictable process with an effort $e \in \mathcal{A}$ for $0 \le t \le T$, and $\Gamma^e$ is an $\mathcal{F}_t$-martingale with $\mathbb{E}(\Gamma^e(T)) = 1$ for all $e \in \mathcal{A}$.
By the Girsanov theorem, the new measure $\mathbb{Q}^e$ is defined via the density $\Gamma^e$, and the process $B^e_t$ defined by the corresponding shift is a Brownian motion under $\mathbb{Q}^e$; the triple $(X, B^e, \mathbb{Q}^e)$ is a weak solution of the associated SDE. Hence each effort choice $e$ results in a different Brownian motion. $\Gamma^e$ defined above satisfies $\Gamma^e_t = \mathbb{E}(\Gamma^e_T \mid \mathcal{F}_t)$, which is the density process for the change of measure.

B. The Proof of Proposition 2
Consider the agent's problem and suppose $(c, C_T)$ is given; to ease the presentation, we assume $m = 0$. Hence, (16) is satisfied.
There is no drift in $\Gamma$; in addition, since $\Gamma_t > 0$, we can state the conditions in terms of the Hamiltonian $\tilde H$ rather than $H$. Hence, the first-order condition for $e$ follows. Finally, the necessary condition for $e^*_t$ to be an optimal effort choice is $\tilde H_e = 0$, namely,

Figure 1: The efforts of constant discount rate for different uncertainty $\sigma$.

Figure 2: The efforts of nonconstant discount rate for different uncertainty $\sigma$.

Figure 3 :
Figure 3: The consumption of constant discount rate for different uncertainty $\sigma$.

Figure 4 :
Figure 4: The efforts with time-inconsistent principal for different uncertainty $\sigma$.