Dynamical Regulation of mRNA Distribution by Cross-Talking Signaling Pathways

Gene transcription is a random process in single cells manifested by the observed distribution of mRNA copy numbers in homogeneous cell populations. A central question is to understand how mRNA distribution is modulated under environmental changes. In this work, we initiate a theoretical study on mRNA distribution dynamics for the stochastic transcription model that involves cross-talking signaling pathways to direct gene activation in response to external signals. We first express the distribution in mathematical dynamical formulas under both moderate and high transcriptional upregulations. In each scenario, our further numerical examples display an observed dynamical transition type among three distribution modes for stress genes in yeast. In particular, the intermediate bimodal stage sustains within a certain length of early time and lasts much longer than that generated by the single pathway. /is shows the general and robust bimodal transcription regulated by the cross-talk of signaling pathways.


Introduction
Recent single-cell measurements have generated massive data on the histogram of mRNA copy numbers for the target gene in homogeneous cell populations [1,2].
is provides a good approximation for data fitting by the probability mass function P m (t), the probability that there are exactly m mRNA molecules of the gene of our concern at time t in one cell [3,4]. e distribution profile of P m (t) contains a panoramic information for distinct cellular phenotypes [5,6]. e commonly observed modes are the decaying distribution that P m (t) decreases in m, the unimodal distribution that P m (t) peaks uniquely at some m > 0, and the bimodal distribution that P m (t) takes exactly two peaks with the first one at m � 0 and the other one at some m > 0. A decaying or unimodal distribution suggests a phenotypic homogeneity, while a bimodal distribution supports a binary process that steers cells into subpopulations with distinct cell identities [1,7,8].
It has been a central question to understand how varying external signals influence the profile of P m (t) for inducible genes [1,2]. As suggested by the prevailing two-state model [1,9], it may involve intrinsic mechanisms that regulate a random switching between gene-off (inactive) and gene-on (active) states [6,10,11]. As shown in the following equation: the dwell times in the off and on states are independent and exponentially distributed with the activation rate λ > 0 and inactivation rate c > 0, respectively. In an active gene, RNA polymerase can bind to the promoter and traverses the template DNA strand to direct mRNA synthesis. e waiting time for the birth of a new mRNA molecule and its lifetime before death are independent and exponentially distributed with the synthesis rate v > 0 and degradation rate δ > 0, respectively. e two-state model (1) has emerged as a standard framework for quantifying the deviation of the mRNA level in individual cells from the mean, through calculating the noise (variance over mean squared), fano factor (variance over mean), and probability distribution P m (t) of random mRNA copy numbers [1,2,8]. ese indexes have been expressed as analytical formulas for the two-state model or its more elaborated extension regarding a larger degree of biological realism such as chromatin remodeling, mRNA maturation, and cell division [6,[12][13][14]. Together with experimental data, those formulas can be reversed to the variation of system parameters and, in turn, uncover a large spectrum of regulation modes that cells utilize under different cellular environments [2,4,[14][15][16].
On the contrary, the two-state model (1) implicitly assumes that the gene activation from the off state to the on state is directed by a single signaling pathway. Such assumption may not be justified for a large class of inducible genes which maintain a low transcription level induced by the spontaneous basal pathway under normal cellular growth conditions [17,18], while upregulating the transcription through specific signaling pathways in response to acute external changes [19,20]. Also, the activation of other genes that are involved in stem cell renewal, development, and immunity is often mediated by two signal transduction pathways [21][22][23]. In all these cases, the downstream specific transcription factors (TFs) in one pathway compete with the other TFs in another pathway for binding at their shared promoter sites to direct the formation of the intermediate complex [19,24]. erefore, the two competitive cross-talking pathways could be generated to activate the target gene.
By integrating competitive cross-talking pathways into the gene activation process [24,25], the two-state model (1) can be generalized as the following equation: e transcription of the gene can be either induced by the first pathway, or alternatively, the second pathway. e sojourn times in the two pathways are independent and exponentially distributed, with the induction strength λ 1 > 0 for the first pathway and the strength λ 2 > 0 for the other, satisfying We rename the gene-off state as O 1 if it is transformed to the gene-on state by the first pathway, and as O 2 otherwise. en, the pathway selection probabilities, denoted by satisfy 0 < q 1 , q 2 < 1, Also, the gene inactivation from the on state to the off state and the birth and the death of mRNA molecules are separately controlled by the first-order kinetic rates c, v, and δ, as modeled in the two-state model (1).
When t ⟶ ∞, the exact form for the stationary mass function lim t⟶∞ P m (t) was stated in [25]. e subsequent numerical examples showed that cross-talking pathways are more likely to generate bimodal distribution compared to the single pathway with q 1 � 1 or λ 1 � λ 2 . is naturally gives rise to an interesting question of whether the crosstalking regulation still maintains the robust bimodal distribution as time develops. When time t is finite, however, the analytical formulas of P m (t) have remained elusive due to its intrinsic complexity. Only a few papers considered the case of a single pathway and expressed P m (t) or the corresponding generating function in the form of integrals [12], infinite series [26,27], and hypergeometric functions [6,28] under certain parameter regions. We shall derive P m (t) generated by cross-talking pathways in simple mathematical formulas in Section 2, and then discuss their dynamical profiles and implications in Section 3. e main conclusion and its discussion are given in the last section.

Analytical Formulas for Dynamical Distribution
In this section, we have endeavored to calculate P m (t) by solving the system of master equations. e standard approach is to transform it into a system of first-order partial differential equations through the method of generating functions [6,12]. e analytical form of P m (t) has been found to be the solutions for several special types of thirdorder ordinary differential equations. Solving their corresponding initial value problems and then extracting P m (t) are in general formidable and pose a major obstacle in the study.
We have successfully derived many exact forms of P m (t) within a large range of system parameters. In particular, our first result assumes that the weaker signaling pathway is more frequently selected. In this case, the upregulation of the target gene may not be efficient due to the exposure of cells to mild external signals or the intrinsic mechanism that avoids overexuberant transcription of the target gene [17,20,29]. Theorem 1. If c � 2δ, q 1 > q 2 , and λ 2 � λ 1 + δ/(q 1 − q 2 ), then the probability mass function P m (t) in the cross-talking pathway model can be expressed as and for m � 1, 2, . . .,

Complexity
or alternatively, for m � 0, 1, . . ., Proof. Let P m,j (t), j � e, 1, 2, be the respective probabilities of m mRNA molecules at time t that the gene is residing in the on state and the off state with the jth pathway being selected. en, the total probability mass function is P m (t) � P m,1 (t) + P m,2 (t) + P m,e (t).
Following the standard procedure in the theory of stochastic processes [13,24], we find that the three partial mass functions in the cross-talking pathway model satisfy the following system of master equations: where, by convention, we set P − 1,e (t) � 0. Without loss of generality, we assume that the gene is inactive and the number of transcripts is zero at t � 0 with the initial conditions: Following the standard procedure [6,12], we introduce the probability generating functions to transform the initial value problem for the master equations (10)-(12) into the problem for the system of firstorder partial differential equations: If the initial value problem (15)-(18) can be solved analytically, then we may obtain P m (t) by adding the three solutions together and then applying the conversion formula: We continue from there and transform (15)-(17) into ordinary differential equations through the method of characteristics [12,28,30]. Let z 0 be a parameter. Let be the restriction of V 1 , V 2 , V e , and V on the characteristic curve z � z 0 e δt . en, Complexity 3 By introducing nondimensionalized system parameters, and from the further transformation, we find u i ′ (t) � δxw i ′ (x), and transform (21)-(23) into It is interesting to note that equations (26) and (27) are indeed identical with equations (17)- (19) in [25]. erefore, by introducing two real numbers α > β > 0 determined by algebra equations [24], the same calculation verifying (21) in [25] shows that w(x) is the unique solution of the initial value problem: for the linear operator L given by with real constants a, b, c, and d and a smooth function To proceed, we define by and utilize notations (24) and (29) and the assumption of theorem to verify We then reformulate L (30) in the form e substitution of (32) and (33) into the above equation gives y ′ (x) � y(x), which readily converts (30) to the simple problem: e unique solution of (35) is y � y(vz 0 )exp(x − vz 0 ). We replace its left side by (32) and the right side by the initial condition in (35). is gives an inhomogeneous equation: subject to the initial condition given in (30) Taking into account that the homogeneous counterpart of (36) is Euler's equation which possesses two independent solutions x − λ 1 and x − λ 2 , their linear combination, adding a particular solution of (36), constitutes the general solution of (36). Following the standard method of undetermined coefficients in the theory of ordinary differential equation [31], we find that one particular solution takes the following form: By fixing the coefficients in the linear combination through the initial condition (37), we obtain the unique solution of (36) and (37) in the following form: 4 Complexity Now, the transformations u(t) � V(z 0 e δt , t) defined in (20) and u(t) � w(vz 0 e δt ) defined in (25) convert (39) back to By making the change of variable s � vz 0 exp(δτ) in the integral, we rewrite V as In turn, by using conversion z 0 � ze − δt and then letting s � t − τ in the integral, we obtain We divide the rest of the proof into two parts. We derive (6) and (7) in the first part and (8) in the second part.
Derivation of (6) and (7): we first note that the second integral in (42) can be expressed in an equivalent form through integration by parts: which helps us rewrite the generating function V(z, t) as Complexity 5 To finally verify (6) and (7), we extract the probability mass function P m (t) from (44) through the conversion formula (19). Since the first two terms in (44) are eliminated by differentiation with respect to z, P 0 (t) in (6) needs to be derived from (44) directly as P 0 (t) � V (− 1, t). For m ≥ 1, By substituting v � v/δ and z � − 1 into this expression and dividing it by m!, we obtain (7).
Derivation of (8): by changing the first integral in (42) in the form we rewrite (42) as With the help of the elementary identity z m ze az ( ) zz m � (m + az)a m− 1 e az , we find en, (8) follows from (19) immediately. Next, we present a result for the case c � 0, for which gene activation is irreversible [26]. It may give an approximation to the case when the gene-on state is regulated to be very stable (c ≪ 1) under the highest induction level [15,16,32].
or alternatively, for m � 0, 1, . . ., Proof. As in the proof of eorem 1, our basic strategy is to solve the initial value problem (30). If c � 0, then α � λ 2 and β � λ 1 by definitions (24) and (29). Combining this with (31), we see that L According to the initial conditions in (30), we denote by en, (52) is Euler's equation of y(x) with two independent solutions y 1 (x) � x − λ 1 and y 2 (x) � x − λ 2 . Hence, y(x) can be expressed as a linear combination of y 1 (x) and y 2 (x). By determining the coefficients through initial conditions of y(x), it readily follows that Together with (53), we find that w(x) can be obtained by solving the simple problem: By a routine calculation of the first-order linear ordinary differential equation [31], we obtain By replacing x with vz and z 0 with ze − δt , we convert w(x) to the generating function V(z, t):

(58)
After substituting z 0 � ze − δt and then s � t − τ, we finally obtain Derivation of (50): through integration by parts, we express the last integral of (59) in an equivalent form: Complexity e substitution of this into (59) helps us rewrite the generating function V(z, t) as

Dynamical Bimodal Distribution for Inducible Genes
We elucidate the regulation of cross-talking pathways on dynamical mRNA distribution for inducible genes under acute stresses, which have to fulfill two requirements. First, a default or spontaneous mechanism maintains transcription at the basal level when cells are under normal cellular environments to avoid the overexuberant expression [17,29]. Second, the exposure of cells to the extracellular stresses results in the transient activation of dedicated intracellular signaling pathways, which enables cells to adapt immediately to the new environment for maximal cell survival [19,20]. For instance, the osmotic shock induces the MAPK HOG pathway in S. cerevisiae through the phosphorylation of MAPK Hog1 that leads to the rapid translocation of Hog1 into the nucleus to launch a transcriptional program [19]. erefore, the cross-talk between a weak basal pathway and an inducible signaling pathway could be generated to initiate the stress gene transcription.
We shall discuss the observed dynamical transition among three distribution modes for the STL1 gene in S. cerevisiae in response to acute osmotic stress [19]. Under mild or high NaCl concentrations, no expression was detected in most cells at the initial time referred to as the decaying distribution and full expression was detected at a large time scale referred to as the unimodal distribution. e scenario becomes more fascinated within the intermediate time as both nonexpressing and fully expressing cell subpopulations were presented, characterized as the bimodal distribution. It is believed that such bimodal transcription output is controlled by both the retention time and concentration of Hog1 in the nucleus and results in a larger phenotypic variability for cell survival in harsh environments [19].
We demonstrate through numerical examples that crosstalking pathways can generate dynamical bimodality of osmoresponsive gene transcription under mild or high stress level. Most mRNA half-lives in S. cerevisiae range around a median of 11 min [33], so we fix δ � ln(2)/11 ≈ 0.063 min − 1 . We set the transcription rate as its upper bound v/δ m � 10 [34], and thus v � 0.63 min − 1 . Also, we set a relative small strength rate λ 1 � 0.1 min − 1 for the basal pathway and a relative large strength rate λ 2 � 3 min − 1 for the signaling pathway. Such parameter set generates a prolonged region of the gene-off time (denoted by T off ): that covers the lag time of 1 to 3 minutes for the activation of most typical osmoresponsive genes, such as STL1, GRE2, or GPD1 [20]. e mild stress level results in a moderate increasing nuclear accumulation of activated Hog1 [19]. is leads to the well-matched competition between two pathways with their selected probabilities q 1 ≈ q 2 . We set q 1 � 0.51 and q 2 � 0.49 that generates transcriptional efficiency 1/T off � 0.19 min − 1 . On the other hand, the high stress level induces saturated nuclear Hog1 concentration [19] and thus may direct gene activation more frequently through the signaling pathway with q 2 > q 1 . We then set q 1 � 0.3 and q 2 � 0.7 that generates 1/T off � 0.3 min − 1 . Also, it was reported that the gene-on state could be significantly stabilized in response to increasing stress level, and thus c ≪ 1 [2,15,16,32]. We next take advantage of exact forms (8) and (51) to quantify dynamical mRNA distribution under mild and high stress levels, respectively. In both scenarios, we observe obvious transitions from the decaying distribution to unimodal distribution as time develops, going through a bimodal stage within a 30 min length of early time (Figure 1).
To test to what extent the cross-talking pathways modulate the bimodality, we reset q 1 � 1 so that the gene is activated by a single pathway with strength rate λ � λ 1 . In turn, we let λ be equal to transcriptional efficiencies for both cases of mild and high stress levels. Interestingly, it shows that the intermediate bimodality becomes much more fragile that lasts less than 5 min, see Figure 1 (inset).
is confirms that the crosstalking regulation indeed prolongs duration of mRNA bimodal distribution, reinforcing the general feature of bimodal transcription for stress genes [19,35].

Conclusion and Discussion
Recent single-cell studies have generated massive data on the distribution of mRNA copy numbers for many genes under various experimental conditions [1][2][3]. e most commonly observed modes for the distribution are the decaying distribution, unimodal distribution, and bimodal distribution. It is reported that the bimodal distribution could help differentiate an isogenic cell population into two dynamically stable groups with distinct phenotypes. For instance, the bimodal Nanog expression plays a key role in mediating cell differentiation in embryonic stem cells [7]; the transcriptional bimodality of Tat generates one HIV highly expressed subpopulation and the other with HIV latency [36]; the bimodal transcription pattern of stress genes in yeast increases phenotypic variability in response to unpredictable environmental changes [19]. In the classical two-state model (1), the bimodal distribution has been suggested to originate from the random switch between gene-off and gene-on states as bimodality would disappear when the gene is always active [7,28].
One of our major concerns is how the gene activation process modulates the bimodal distribution for inducible genes in face of external cues. We introduce a cross-talking pathway model (2) by integrating two competitive signaling pathways into the two-state model [24,25]. Comparing to other statistical concepts such as the noise or fano factor that can be computed relatively easily, exploring transparent analytical formulas for mRNA distribution remains a challenging task. For instance, a traditional approach for calculating mRNA distribution is to express the corresponding generating function in the form of hypergeometric functions [6,28]. However, this does not permit an easy tractable way to further transform the generating function to mRNA distribution when the time and system parameters are finite. We endeavor to express time-dependent mRNA distribution in simple mathematical formulas within a large range of finite parameters for the cross-talking pathway model. e formulas are obtained by solving the initial value problems of third-order ordinary differential equations, which are derived from the master equations of the crosstalking pathway model. We consider stress-responsive genes under acute stresses as the implement of dynamical analytical formulas for mRNA distribution. e stress gene can be activated either by a weak spontaneous basal pathway with strength λ 1 and selection probability q 2 , or alternately by an inducible signaling pathway with strength λ 2 and selection probability q 2 � 1 − q 1 . e probability q 1 (or q 2 ) may be governed by the number and availability of activated transcription factors (TFs) near the binding sites [19,20] and thus is related with the stress level; the inducible strength λ 2 can be governed by the binding strength between the TFs and the DNA sites [11,20], and thus may be tightly related with the type of external signals. We suggest that the signaling pathway is efficiently induced with On the assumption of eorem 1, (65) leads to q 1 ∼ q 2 , which shows a mild stress level that may induce a similar concentration of activated TFs in each pathway. e assumption of eorem 2 presents an active system in which the gene inactivation process is blocked. It suggests a very high stress level that may induce a significant decrease of the gene inactivation rate c [2,16,32]. For instance, compared with normal cellular conditions, the inductions with the highest strength lead to the decrease of c for over 10 6 folds for the Lac promoter in E. coli [32] and possible 10 5 fold change for the POL1 promoter in yeast [34].
Under both mild and high stress levels, we utilize the parameter rates for stress genes in yeast to understand their dynamical distributions. We reveal a typical dynamical transition from the decaying distribution to unimodal distribution as time develops, going through an intermediate bimodal stage with over 30 minutes long. is result strongly suggests the general and robust bimodal transcription of stress genes regulated by the cross-talking pathways in response to acute stresses [19,35]. First, noting that the system takes about 120 minutes to reach the steady state, bimodal distribution maybe clearly visible in the course of real-time imaging due to its relatively large time window of 30 minutes. Second, under the same durations for the gene-off and gene-on states, as well as the same mRNA synthesis and degradation rates, the two-state model (a single pathway) can only generate a much fragile intermediate bimodal stage that lasts less than 5 minutes. e general intermediate bimodal stage is consistent with recent observations that the cross-talking pathway model (2) is more likely to generate high transcription noise and bimodal distribution at the steady state compared with the two-state model (1) [24,25]. It suggests that natural selection may favor two or more parallel signaling pathways for gene activation to enhance expression variability which may provide a clear benefit in the face of an environmental stress [19,37]. On the other hand, the cross-talking pathway model (2) assumes the constant system rates and cannot take into account some inducible genes and immune genes regulated by the oscillating signal [38], temporal feedback [39], or different cell cycle phases [14]. Interestingly, the dynamical bimodality can be easily generated through the other very different mechanisms such as the system involves positive or negative genetic feedback loops [39]. e extensions of our model that include those salient biological features may require assuming time-varying parameters with biological realism accordingly. Future work will focus on how those extended models modulate the dynamical bimodal or even multimodal transcription distributions.

Data Availability
e data used to support findings of this study are included within the article: (1) the degradation rate of the mRNA molecule in S. cerevisiae equals ln(2)/11 ≈ 0.063 (1/min), which is calculated through the 11 min average half-life obtained in reference [33]. (2) e mRNA synthesis rate in S. cerevisiae equals 0.063 * 10�0.63 (1/min), which is calculated according to the mRNA degradation rate 0.063 (1/min) estimated in (1) and the upper bound of its transcription rate 10 (per mRNA lifetime) given in reference [34]. (3) e 1-3 min gene-off duration for most typical osmoresponsive genes in S. cerevisiae is reported in reference [20].

Conflicts of Interest
e authors declare that they have no conflicts of interest.