A short review on recent developments in TMD factorization and implementation

In the latest years the theoretical and phenomenological advances in the factorization of several collider processes using the transverse momentum dependent distributions (TMD)has greatly increased. I attempt here a short resume of the newest developments discussing also the most recent perturbative QCD calculations. The work is not strictly directed to experts in the field and it wants to offer an overview of the tools and concepts which are behind the TMD factorization and evolution. I consider both theoretical and phenomenological aspects, some of which have still to be fully explored. It is expected that actual colliders and the Electron Ion Collider (EIC) will provide important information in this respect.


Introduction
The knowledge of the structure of hadrons is a leitmotiv for the study of quantum chromodynamics (QCD) for decades. Apart from the notions of quarks and gluons (we call them generically "partons" in the following), the natural question is how the momenta of these particles are distributed inside the hadrons and how the spin of hadrons is generated. Phenomenologically it is possible to access at this problem only in some particular kinematical conditions, as provided, for instance, in experiments like (semi-inclusive) deep inelastic scattering, vector and scalar boson production, ℓ + ℓ − → hadrons, or jets. I review the basic principle which supports this investigation. Let us consider, to start with, the cross section for di-lepton production in a typical Drell-Yan process → ℓ + ℓ − + where includes all particles which are not directly measured. The cross section for this process can be written formally as where 2 is the virtual di-lepton invariant mass, 푖 are the parton momenta fraction along a light-cone direction or Bjorken variables, and are the parton distribution functions (PDF). The r.h.s. of (1) assumes several notions which, nowadays, can be found in textbooks. In fact a central hypothesis is a clear energy separation between the di-lepton invariant mass and the scale at which QCD cannot be treated perturbatively any more (we call it the hadronization scale Λ ∼ O(1) GeV), that is, 2 ≫ Λ 2 . Given this, one can factorize the cross section in a perturbatively calculable part H and the rest. Formula (1) represents just a first term of an "operator product expansion" of the cross section. The price to pay for this separation is the introduction of a factorization scale which can be used to resume logarithms in combination with renormalization group equations [1][2][3]. Another aspect, which is remarkable, is that the nonperturbative part of the cross section can be also expressed as the product of two parton distribution functions. This fact has two main consequences: on the one hand, all the nonperturbative information of the process is included in the PDFs; on the other hand, the partons belonging to different hadrons are completely disentangled. In these conditions so the longitudinal momenta of quarks and gluons can be reconstructed nonperturbatively and this fact has given rise to a large investigation whose review goes beyond the purpose of this writing.

Advances in High Energy Physics
The ideal description of the process in (1) however becomes more involved in the case of more differential cross sections [32][33][34]. So, for instance, one can wonder whether a formula like 2 2 has any physical consistency. (I use the notation 푇 for 2-dimensional impact parameter, − 2 푇 = 푇 2 ≥ 0, is the center of mass energy of the process, The answer to this question is necessarily more complex than in the case of (1) for the simple fact that a new kinematic scale, 푇 , the transverse momentum of the di-lepton pair, has now appeared. In this article I will concentrate on the description of the case which is interesting for a number of observables. The restriction to this kinematical regime represents also a limitation of the present approach which should be overcome with further studies.
The study of factorization [25,27,29,30,35,36] has lead finally to the conclusion that actually (2) in not completely correct because the cross section for these kind of processes should instead be of the form with 1 2 = 4 and 푖 being the rapidity scales. Formula (4) shows explicitly that the TMD functions contain nonperturbative QCD information different from the usual PDF, while they still allow completing disentangle QCD effects coming from different hadrons. These new nonperturbative QCD inputs can be written in terms of well-defined matrix elements of field operators which can be extracted from experiments or evaluated with appropriate theoretical tools. These objectives require some discussion, which I partially provide in this text. The scale is the authentic key stone of the TMD factorization. Its origin is different from the usual factorization scale and because of this it is allowed to perform a special resummation for this scale. This leads to the fact that a consistent and efficient implementation of the ( , ) evolution is crucial for the prediction and extraction of TMDs from data. A possible implementation of the TMD evolution is historically provided by Collins-Soper-Sterman (CSS) [32][33][34]. However a complete discussion of more efficient alternatives has started more recently [21-23, 26, 37]. The point is that the rapidity scale evolution has both a perturbative and nonperturbative input, as it is actually provided by (derivatives of) an operator matrix element (the so called soft function). An efficient implementation and scale choice so should separate as much as possible the nonperturbative inputs with different origin inside the cross sections. This target is not completely realized with the CSS implementation, while it can be achieved with the -prescription discussed in the text. This discussion is also relevant for multiple reasons. In fact various orders in perturbation theory are available already for unpolarized and polarized distribution and, in the future, one expects more results in this respect for many polarized distributions. When dealing with several perturbative orders, the convergence of the perturbative series can be seriously undermined by an inappropriate choice of scales, and this is a well-known problem that can affect the theoretical error of any result. A more subtle issue comes from the fact that the evolution corrections can also be of nonperturbative nature. It would be certainly clarifying a scheme in which the nonperturbative effects of the evolution are clearly separated from the intrinsic nonperturbative TMD effects. Such a request results to be important when several extraction of TMD from data are compared and also when a complete nonperturbative evaluation of TMD can be provided.
In the rest of this review I will try to give an idea on how all these problems can be consistently treated, which can be useful also to explore new and more efficient solutions. Several parts of this review use material that can be originally found in [4,23,38].

Factorization
The factorization of the cross sections into TMD matrix elements has been provided by several authors and it has been object of many discussions [25,27,29,30,[32][33][34][35][36]. We briefly review the main ideas here for the case of Drell-Yan. The process is characterized by two initial hadrons which come from opposite collinear directions and produce two leptons in the final state plus unmeasured radiation. We identify collinear (anticollinear) light-cone directions ( ) and 2 = 2 = 0, ⋅ = 1 for the momentum of colliding particles. The momentum of collinear particles is = ( + , − , ⊥ ) with ⋅ = − , ⋅ = + and ⊥ = − ( ⋅ ) − ( ⋅ ) and + ≫ ⊥ ≫ − . The momenta of collinear particles are characterized by the scaling ≃ (1, 2 , ) where is the di-lepton invariant mass and is a small parameter ∼ Λ 푄C퐷 / being Λ 푄퐶퐷 the hadronization scale. A reversed scaling of momentum is valid for anticollinear particles, say ≃ ( 2 , 1, ). The soft radiation which entangles collinear and anticollinear particles is homogeneous in momentum distribution (its momentum scales as ∼ ( , , )) and can be distinguished from the collinear radiation only for a different scaling of the components of the momenta. Given this, it is natural to divide the hadronic phase space in regions as in Figure 1. In this picture, the collinear and soft regions are necessarily separated by rapidity and they all share the same energy 2 ∼ Λ 2 .

Soft Interactions and Soft
Factor. Because the soft radiation is not finally measured, its interactions should be included (and resumed) in the collinear parts, which become sensitive to a rapidity scale which acts in a way similar to the usual factorization scale. It is possible to define the soft radiation through a "soft factor"; that is, by an operator matrix element, where we have used the Wilson line definitions [39][40][41] appropriate for a Drell-Yan process, The direct calculation of the soft factor is all but trivial and the way the calculation is performed can influence directly the final formal definition of the transverse momentum dependent distribution used by different authors. In fact a simple perturbative calculation shows that in the soft factor there are divergences which cannot be regularized dimensionally (say, they are not explicitly ultraviolet (UV) or infrared (IR)) which occur when the integration momenta are big and aligned on the light-cone directions. The divergences that arise in this configuration of momenta are generically called rapidity divergences and regulated by a rapidity regulator. One can understand the necessity of a specific regulator observing that the light-like Wilson lines are invariant under the coordinate rescaling in their own light-like directions. This invariance leads to an ambiguity in the definition of rapidity divergences. Indeed, the boost of the collinear components of momenta + → + , − → − / (with an arbitrary number) leaves the soft function invariant, while in the limit → ∞ one obtains the rapidity divergent configuration. Therefore the soft function cannot be explicitly calculated without a regularization which breaks its boost invariance. The coordinate space description of rapidity divergences as well as the counting rules for them have been derived in [42,43]. The nature of the divergences in the soft factor has been studied explicitly in [44] at one loop and in [45] at NNLO, which conclude that, once all contributions are included, the soft factor depends only on ultraviolet and rapidity divergences (and IR divergences are present only in the intermediate steps of the calculations, but not in the final result). Different regulators have also been shown to be more or less efficient within different approaches to the calculations of transverse momentum dependent distributions. For instance, NNLO perturbative calculations for unpolarized distributions, transversity, and pretzelosity have been performed using de -regulator of [4,15,19] while for the recent attempts of lattice calculations off-the-light-cone Wilson lines are preferred [46][47][48][49][50][51][52][53][54][55][56]. The discussion of the type of regulator involves usually another issue, which is also important for the complete definition of TMDs. While collinear and soft sectors can be distinguished by rapidity, the choice of a rapidity regulator forces a certain overlap of the two regions which should be removed, in order to arrive to a consistent formulation of the factorized cross section. This is called "zero-bin" problem in Soft Collinear Effective Theory (SCET) [57]) and its solution is usually provided in any formulation of the factorization theorem. The amount of the zero-bin overlap is usually fixed by the same soft function in some particular limit although it is generally impossible to define this subtraction in a unique (in the sense of regulator independent) form. Because of this overlap one can find in the literature that the soft function is used in a different way in different formulations of the factorization theorem. The evolution properties of TMDs however are independent of these subtleties and they are the same in all formulations. A possible rapidity renormalization schemedependence is traditionally fixed by requiring −1 −1 = 1 (for this notation see discussion on Section 2.2).
The factorization theorem to all orders in perturbation theory relies on the peculiar property of soft function of being at most linear in the logarithms generated by the rapidity divergences. Then it comes natural to factorize it in two pieces [30], and in turn this feature allows to define the individual TMDs. Using the -regulator one can write to all orders in perturbation theory, as well as to all orders in theexpansion (the UV divergences are regulated in dimensional regularization = 4 − 2 ) [45].
where tildes mark quantities calculated in coordinate space, ] is an arbitrary and positive real number that transforms as + under boosts, and we introduce the convenient notation Despite the fact that the soft function is not measurable per se, its derivative provides the so-called rapidity anomalous dimension, with l 훿 = ln( 2 /| + − |). Because of its definition the rapidity anomalous dimension D has both a perturbative (finite; calculable) part and a nonperturbative part. This fact should be always taken into account despite the fact that many experimental data are actually marginally sensitive to the nonperturbative nature of the rapidity anomalous dimension. A nonperturbative estimation of the evolution kernel with lattice has been recently proposed in [58] and I expect a deep discussion on this issue in the future. A renormalon based calculation has also provided some approximate value for this nonperturbative contribution [59].

TMD Operators.
Another fundamental ingredient in the formulation of the factorization theorem is represented by the definition of the TMD operators that are involved. We use here the notation of [4]. The TMDs which appear in a Drell-Yan process can be rewritten starting from the bare operators (here I consider only the quark case, for simplicity)  (6). The collinear and soft Wilson line should be distinguished because the gluons which define them have a different scaling in effective field theories and also because they should be regularized differently with respect to rapidity divergences (see the definition of -regulator in [45] for soft and collinear matrix elements). The hadronic matrix elements, define the bare or unsubtracted TMDs. These bare operators do not include for the moment any soft radiation and they are just collinear object (one can refer to them as "beam functions") and because of boost invariance they can be calculated in principle in any frame. A renormalization issue however appears because of rapidity divergences and overlap with the soft radiation (this problem is usually referred to as zero-bin problem in effective field theory [57]). The soft interactions can be incorporated in the definition of the TMD through an appropriate "rapidity renormalization factor". The final form of the rapidity renormalization factor ( in the following) is dictated by the factorization theorem. The renormalized operators and the TMD are defined, respectively, as and 푞 is the UV renormalization constant for TMD operators and 푞 is the rapidity renormalization factor. Both these factors are the same for particle and antiparticle; however they are different for quarks and gluons. These factors also occur in the same way in parton distribution functions and fragmentation functions. The scales and are the scales of UV and rapidity subtractions, respectively. The way these factors are ordered corresponds to first removing rapidity divergences and then the rest of UV divergences from the bare matrix elements as in [4]. It is possible to proceed Advances in High Energy Physics 5 also in a different way (for instance, in [29,60,61] they cancel the rapidity divergences from the beam functions and soft factors independently); however finally one achieves an equivalent resummation of rapidity logarithms. In [5,27,62] for TMDPDFs the soft function is hidden in the product of two TMDs. Some comments finally are necessary for the zero-bin overlap problem. In principle an overlap factor affects the rapidity renormalization factor as where 푓 ( 푇 ) is the soft function and Zb 푓 is the zero-bin contribution [25,30,36,57,63] and both are different in the quark and gluon cases. The zero-bin part assumes a particular form depending on the regulator for rapidity divergences. For instance, the modified -regularization [45] has been constructed such that the zero-bin subtraction is literally equal to the soft function: The definition is nontrivial because it implies a different regularized form for collinear Wilson lines 푛(푛) ( ) and for soft Wilson lines 푛(푛) ( ). In the modified -regularization, the expression for the rapidity renormalization factor is and this relation has been tested at NNLO in [4,24,45].
In the formulation of TMDs by Collins in [25] the rapidity divergences are handled by tilting the Wilson lines off-thelight-cone. Then the contribution of the overlapping regions and soft factors can be recombined into individual TMDs by the proper combination of different soft functions with a partially removed regulator. This combination gives the factor 푓 , The rest of logical steps remain the same as with theregulator. Notice that, due to the process independence of the soft function [25,30,36,63,64], the factor 푓 is also process independent. An important aspect of factorization is finally represented by the cancellation of unphysical modes, the Glauber gluons. A check of this cancellation has been provided in [25,[65][66][67] and I do not review it here.

Matching at Large 푇 (or Small-)
The practical implementation of the TMD for data analysis benefits from asymptotic limits of the distribution. These limits allow defining the TMDs at different scale and constraining the nonperturbative behavior of the TMDs. Commonly one starts with the large transverse momentum limit of the TMD. In this case one can refactorize the TMDs in terms of Wilson coefficient and collinear parton distribution functions (PDF), following the usual rules for operator product expansion (OPE). At operator level one finds and the symbol ⊗ is the Mellin convolution in variable or , and , 耠 enumerate the flavors of partons. The running on the scales , 푏 , and is independent of the regularization scheme and it is dictated by the renormalization group equations which we discuss later. In the case of initial states (17) reduces to where 푓 ←㨀푁 are the integrated collinear distributions, that is, the parton distribution functions (PDF) which depend only on the Bjorken variables ( for PDFs) and the renormalization scale . All dependence on the transverse coordinate 푇 and rapidity scale is contained in the matching coefficient and can be calculated perturbatively. I report the definition of the PDFs for completeness The calculation of the Wilson coefficients in (17) uses the standard methods of the operator product expansion which profit of the fact that the coefficients do not depend on the infrared limit of the matrix elements. The current status of these calculations for quark distributions is resumed in Table 1. Less information is generally available in the case of gluon TMDs. Basically the matching coefficients for unpolarized gluons are known at NNLO [6] and linearly polarized gluons at NLO [15]. In general the TMDs which match onto collinear twist-3 functions are much less known, which reflects the difficulty of the computations. It would be very useful to have a better knowledge of all these less known functions at higher perturbative order before the advent of Electron Ion Collider (EIC). In the rest of this Section 1 focuses on unpolarized quark distributions which offer also an important understanding on the power of the TMD factorization. The necessity of a complete NLO estimation of all TMDs is both theoretical and phenomenological. Actually a difficulty of the TMD extraction from data is due to the fact that it is a nontrivial function of two variables (Bjorken and transverse momentum) so that a complete mapping on a plane is necessary. This target is achievable 6 Advances in High Energy Physics t w -4 --- * The calculation is done in the momentum space. The result is given for the moments of distribution. * * The pretzelosity can in principle be a twist-2 observable; however its twist-2 matching coefficient has been found to be zero up to NNLO [19]. Therefore one can conjecture that pretzelosity is actually a twist-4 observable. Some arguments in favor of this can also be found in [20]. * * * The quark Sivers function at NLO has a long story [8][9][10][11][12][13]. A complete calculation is now available in [14].
thanks to the factorization of the cross section and the consequent extraction of the TMD evolution part, which is process independent. A second important information comes from the asymptotic limit of the TMD for large transverse momentum, which is perturbatively calculable. The simple LO expressions for the TMD in general do not provide much information (they are just constants), so that in order to achieve a wise modeling a NLO calculation is always necessary. The higher order calculations allow also testing the stability with respect to the scales that match the TMD perturbative and nonperturbative parts. For the unpolarized case a study in this sense can be found in [22] both for high energy and low-energy data. Using a LO calculations one cannot even quantify this error. Finally, another lesson that comes from the analysis of the unpolarized case is that a good portion of the TMD is tractable starting from their asymptotic expansion for large transverse momenta. In any case even a 10% average precision of the SIDIS cross section at EIC will need a NLO theoretical input.

Evolution
The evolution of the TMDs in the factorization, , scale is derived from their defining operators and from (15), in an usual way. Equation (20) is a standard renormalization group equation (which comes from the renormalization of the ultraviolet divergences), the function 퐹 ( , ) is called the TMD anomalous dimension, and it contains both single and double logarithms. The same (13) can be used to write the running with respect to the rapidity scale, . Because the rapidity scale evolution comes from soft interactions and more specifically from the soft factor (see discussion in [45,61] and, e.g., [24,42,43]) which is the same for initial and final states, the rapidity scale evolution is the same for TMD parton distribution functions and TMD fragmentation functions, and it is also spin-independent (so it is the same also for TMDs at higher twist), The function D( , 푇 ) is called the rapidity anomalous dimension and actually one has D( , 푇 ) ≡ D( , | 푇 |). The anomalous dimensions for these pair of evolution have been addressed with several names in the literature as it is shown in Table 2.
Quark and gluon rapidity anomalous dimensions are related up to three loops by the Casimir scaling (see [45]), From (22) one finds that the anomalous dimension is where we introduce the notation The large-푇 expansion of the TMD introduces also another evolution scale, which is needed for in the matching of Wilson coefficients with the collinear operators. In the case of the unpolarized TMDs this is provided by the DGLAP (DGLAP is an acronym for Dokshitzer, Gribov, Lipatov, Altarelli, Parisi where are the DGLAP kernels for the PDF. Similar equations hold for unpolarized TMD fragmentation functions. It The solutions of these differential equations are This defines the reduced matching coefficientŝwhose renormalization group evolution equations are with the kernel The perturbative expansion of all these functions provides consistency requirements for the logarithmic terms at a given order. Using the notation for the -th perturbative order, one finds that the knowledge of the coefficient at order − 1 provides all the terms with ̸ = 0 at order in this series. So finally any higher order calculation provides new information on terms (푛;0) 푓←㨀푓 . A resume of the present status of available calculations is provided in Table 1.

Implementation of TMD Formalism and TMD Extraction from Data
As an example of application of the TMD formalism I review the study of unpolarized TMD parton distribution functions in Drell-Yan and Z-boson production following [23]. Namely, I consider the process ℎ 1 + ℎ 2 → ( → 耠 ) + , where is the electroweak neutral gauge boson, * or . The incoming hadrons ℎ 푖 have momenta 1 and 2 with ( 1 + 2 ) 2 = . The gauge boson decays to the lepton pair with momenta 1 and 2 . The momentum of the gauge boson or equivalently the invariant mass of lepton pair is 2 = 2 = ( 1 + 2 ) 2 . The differential cross section for the Drell-Yan process can be written in the form [68,69] where 1/2 is the flux factor; Δ 퐺 is the (Feynman) propagator for the gauge boson . The hadron and lepton tensors are, respectively, where 퐺 휇 is the electroweak current. Within the TMD factorization, the unpolarized hadron tensor (see, e.g., [70]) is where 푇 is the transverse part of the metric tensor and the summation runs over the active quark flavors. The variable is the hard factorization scale. The variables 1,2 are the scales of soft-gluons factorization, and they fulfill the relation 1 2 = 4 . In the following, we consider the symmetric point 1 = 2 = = 2 . The factors 퐺퐺 푓푓 are the electro-weak charges and they are given explicitly in [23]. The factor 푉 is the matching coefficient of the QCD neutral current to the same current expressed in terms of collinear quark fields. The explicit expressions for 푉 can be found in [71][72][73].
In (34) I have not included power corrections to the TMD factorization (to be distinguished from the power corrections to the TMD operator product expansion). It is difficult to establish the amount of these corrections but a phenomenological study in [23] and a more formal study in the large-푐 limit (that is, the limit of large number of colors) in [74] have found a reasonable upper value ( 푇 / ) 푚푎푥 ∼ 0.2. A study which takes into account the structure of operators in the type of corrections has been started in [75]. In general the power corrections should be included when the dilepton invariant mass is of order a few GeV (this is the case, for instance, of HERMES experiment and, perhaps to a possibly less extent, COMPASS) or when the experimental precision is extreme (as it possibly happens with ATLAS experiment). This is issue is important phenomenologically and involves the study of cross sections with the inclusion of factorization breaking contributions. Some recent suggestion have appeared in [76,77] which have still to be tested phenomenologically. One should remark however that the implementation of these factorization breaking corrections strongly depends on the fact that the factorized part of the cross section is correctly realized and phenomenologically tested. More studies on this issue are necessary in the future.
Evaluating the lepton tensor and combining together all factors one obtains the cross section for the unpolarized Drell-Yan process at leading order of TMD factorization, in the form [25,27,34,36,78,79] 2 where is the rapidity of the produced gauge boson. The factor P is a part of the lepton tensor and contains information on the fiducial cuts. This factor provides important information on the actual measured leptons and should be always included when the relative experimental information is provided. The evolution of the TMDs play a special role in (35) and we collect of evolution equations here: and on the right hand side of these equation we have omitted the reference to flavor for simplicity. The main features of these anomalous dimensions that have been discussed in previous sections are the following: The TMD anomalous dimension 퐹 ( , ) contains both single and double logarithms and the anomalous dimension 푉 refers to the finite part of the renormalization of the vector form factor; see Table 2. Equation (36) cannot fix the logarithmic part of D entirely, but only order by order in perturbation theory, because the parameter is also responsible for the running of the coupling constant. The rapidity anomalous dimension D is a nonperturbative function (see the discussion about the renormalon in the perturbative series of this function in [32,59,80]), although it can be perturbatively expanded for small | 푇 |.
The double-evolution equation of the TMDs can be formulated as in [23] using a two-dimensional vector field notation, that i reproduces here. The procedure consists in introducing a convenient two-dimensional variable which treats scales and equally, , ln ( (1 GeV 2 ) ) , (37) where the dimension of the scale parameters is explicitly indicated and the bold font means the two-dimensional vectors. Then one defines the standard vector differential operations in the plane^, namely, the gradient and the curl The TMD anomalous dimensions can be all included in a vector evolution field E(^, 푇 ), Here and in the following, we use the vectors^as the argument of the anomalous dimensions for brevity, keeping Using this formalism, (36) are equivalent to the statement that the evolution flow is irrotational, and is the evolution scalar potential for TMD. According to the gradient theorem any line integral of the field E is pathindependent and equals the difference of values of potential at end-points. Therefore, the solution for the factor in (40) is and ] 1,2 are the first and second components of the vector^in (37), and the last term is an arbitrary -dependent function. The evolution field presented in the previous section is conservative only when the full perturbative expansion of the evolution equations is known. In practice only a few terms of the evolution are calculated, so that it is important to understand in which sense the evolution field remains conservative. Using the Helmholtz decomposition, the evolution field is split into two parts The fieldẼ is irrotational, the field Θ is divergence-free, and they are orthogonal to each other curlẼ = 0, with the notation curl(curl) = ∇ 2 . Then, one can write the irrotational fieldẼ, which is conservative, as the gradient of a scalar potentialẼ and the divergence-free part as the vector curl (see (38)) of another scalar potential Θ(^, 푇 ) = curl (^, 푇 ). The curl of the evolution field can be calculated using the definitions (36), The function Γ can be calculated order by order in perturbation theory. For instance, at order one finds is the -function with first terms removed. For instance, we have In these expressions the -function is not expanded because in applications one can find a different perturbative order with respect to the rest of the anomalous dimensions. The immediate consequence of the fact that the evolution field E is no more conservative is that the evolution factor [ 푇 ;^푓 →^푖] is dependent on the path chosen to join the initial and final points^푖,^푓 and this fact introduces a theoretical error which can be dominant in certain implementation of the evolution kernels. The difference between two solutions evaluated on different paths is ln where 1 ∪ 2 is the closed path built from paths 1 and 2 and Ω( 1 ∪ 2 ) is the area surrounded by these paths. Using the independence of Γ on the variable , (51) becomes ln where 1,2 ( ) is the -component of the path 1,2 at the scale . This equation shows that the difference between paths becomes bigger with largely separated rapidity scales 푖 .
The path independence of the evolution is crucial for the implementation of the perturbative formalism, as its absence can derive into uninterpretable extractions of TMDs or big theoretical errors. The path independence can be achieved observing that should hold order by order in perturbation theory. Once this is realized it is possible to define null-evolution lines in the ( , ) plane, which coincide with equipotential lines, and the evolution of TMD takes place only between two different lines. I resume here two possible solutions to this problem, following [23].
In the literature one can find a typical way to implement the evolution that one can call the improved D scenario which includes the Collins-Soper-Sterman formalism [21,22,25,26,29,81,82]. In this scenario one chooses a scale 0 such that In this way one obtains The situation in this scenario can be visualized in Figure 2. Any choice of 0 corresponds to a different scheme as 0 is the point where evolution flips from path 1 and path 2 in Figure 2.
The differences that can appear in the extraction of TMDs which depend on the choice of 0 can be numerically large, so that the selection of this scale can cause also some problems when a sufficient precision is required.
The presence of the intermediate scale 0 is not unavoidable in the implementation of the TMD evolution. In fact the integrability condition (53) can be restored by changing the anomalous dimension 퐹 to a modified value 푀 such that and the corresponding solution for the evolution factor reads improved solution: ln These expressions should be completed with the resummation of D by means of renormalization group equation (55) as it is not implicitly included in this scenario.

Prescription and Optimal
where 휇 is defined such that ( 푖 , 휇 (^퐵, b 푇 )) ∈ (^퐵, 푇 ) and ( 푓 , 푓 ) ¡ ∈ (^퐵, 푇 ). In order to minimize the evolution effect and so to have a more stable prediction/extraction of TMDs the initial and final evolution curves should be selected with care. Once this is settled, it remains to find an appropriate set of initial and final line which is sufficiently stable for all relevant processes. The final point of the rapidity evolution, 푓 , is as usual dictated by the hard subprocess. Concerning the starting line it is convenient to take into account also the matching of the TMD on the respective integrated distributions, which formally is where is PDF or FF and is the Wilson coefficient function. The coefficient function includes the dependence on 푇 within the logarithms L 휇 and L 휁 . The traditional choice, see, e.g., [25,26,83], 푖 = 2 푖 leaves uncanceled logarithmic factors in the coefficient function which explode at small 푇 . The damage caused by this choice is partially cured inserting more prescriptions like the * prescription [25], which however bias the modeling of the nonperturbative part of the TMD. In fact, among the others, this prescription correlated the nonperturbative part of the evolution factor with the intrinsic nonperturbative part of the TMD. Although this prescription may work for some initial studies it results to have serious drawbacks for more precise analysis.
The -prescription suggested in [22,23] provides an attempt to improve the stability of the perturbative series and keep separate the truly nonperturbative TMD distribution from the nonperturbative part of the evolution kernel. The advantage of this separation is that one can always use all known perturbative information for the evolution factor, even if the knowledge of the collinear matching is not at the same perturbative order. This reduces drastically the perturbative uncertainty in the extractions of TMDs from data and also facilitates the understanding of direct calculations of the TMD through nonperturbative QCD methods like lattice.
I provide here some description of the -prescription following [22,23]. A TMD distribution in the -prescription reads where 휇 is defined such that ( 푖 , 휇 (^퐵, 푇 )) ∈ (^퐵, 푇 ), that is, the value of 푖 is such that the initial scales of the TMD distribution ( 푖 , 휇 ) belong to the null-evolution curve (^퐵, 푇 ). At this stage let me rewrite (61) specifying the scales, where OPE is an intrinsic scale for the expansion of the TMD in terms of Wilson coefficients and PDFs and it is a free parameter. In general the values of OPE are restricted by the values of , except if^퐵 = (ln 2 saddle , ln saddle ) ⇒ OPE unrestricted. (65) The last choice give us much more freedom to model the nonperturbative part of the TMD and the definition of the initial scale is unique and nonperturbatively defined. The choice of saddle as the initial point is so optimal and consistent with the reexpression of TMDs using PDFs. This scheme fixes the optimal-TMD-distribution; that is, it fixes the initial special null-evolution curve. As a summery any initial point in the saddle curve is defined nonperturbatively and it is unique and performing this choice it is consistent to write optimal TMD simply as ( , 푇 ).
A plot for the -factor is given in Figure 3. At largethe shape of the rapidity anomalous dimension is nonperturbative as renormalon studies confirm [59,80] (see also [84]). So, at large-the expression for D should be extracted from data fitting, while at small-it should match the perturbative expression. We recall that the -prescription has, among its benefit, the one of separating the modeling of the nonperturbative part of the evolution factor from the rest of nonperturbative parts of the TMD. This implies that it always recommendable to use the highest nonperturbative order in the evolution factor, even if the matching coefficients of the TMD with the collinear functions are known to a lesser precision perturbatively.
The nonperturbative part of the evolution kernel can be modeled in different ways. For instance, one can introduce a simple ansatz like a the modification where = | 푇 | and max is a parameter, such that max < as suggested a long ago in [33], * ( ) = (1 + as part of the * prescription [25]. Let us stress that the choice of a * can be admissible separately for the evolution factor and that (66) does not imply * -prescription for the whole TMD distribution. With the choice max < the saddle point is always in the observable region, which allows determining the optimal TMD. In this case the evolution factor reads and the resumed expression for the TMD anomalous dimension as in [21] is understood in the last line. In (68), the scale saddle is -dependent and defined by the equation Evolution kernel at NNLO q T (GeV) q T (GeV) The expression for the cross section with the optimal TMD definition is particularly compact and reads The derivation of the saddle point using formula (69) is in practice done numerically, so that an efficient method to extract it or to approximate this point should be discussed as in [23]. A technical discussion of this issue is beyond the point of this paper.
Let me conclude this section discussing a comparison of the optimal TMD construction with a more traditional implementation on data following the recent fits in [22,85]. The absence of an intermediate scale 0 removes one (scheme dependent) source of error and at the same time it allows the path independence of the final result. In this way it is possible to directly compare D NP from different extractions and models. In the definition of the optimal TMD the low-energy normalization is defined "nonperturbatively" and uniquely by (69) which implies that the perturbative order of the evolution is completely unrelated to the perturbative order of matching of the TMD on the respective collinear functions. Because the evolution factor is known often at higher orders with respect to the Wilson coefficient matching factors, it is always possible to fully use all the available perturbative information. The theoretical uncertainty of TMDs is estimated with the variation of 푓 and OPE . The fact that the number of varied scales is different from more standard analysis does not necessarily imply a reduction of theoretical errors. The error in fact reshuffles in 푓 and OPE , but the description is now more coherent. One can appreciate this effect in Figure 4 taken from [22]. In this figure one compares for the ATLAS experiment a standard method to test the dependence on the scales and thus the stability of the perturbation theory prediction, multiplying each scale by a parameter [22,37,86,87], and varying the parameters nearby their central value. For example, in the notation of [22], one changes scales as and checks the variations of 푖 ∈ (1/2, 2). The variation of all these four parameters is consistent with a nonoptimal definition of TMDs, while in the optimal case only the variation of 2 and 4 is necessary.

Conclusions
The formulation of factorization theorems in terms of TMDs is a first fundamental step for the study of the structure of hadrons and the origin of spin. The use of the effective field theory appears essential to correctly order the QCD contributions. Properties of TMDs like evolution and their asymptotic limit at large values of transverse momentum can be systematically calculated starting from the definition of correct operators and the evaluation of the interesting matrix elements. A key point for the renormalization of TMDs is represented by the so-called soft matrix element which is common in the definition of all spin dependent leading twist TMD.
Still, all this is just a starting point for the study of TMDs. In fact a correct implementation of evolution requires a control of all renormalization scales that appear in the factorization theorem. I have described here some of these possibility putting the accent on some recent interesting developments which, at least theoretically, allow a better control of the resumed QCD series. The understanding of factorization allows also precisely defining the range of ideal experimental conditions where this formalism can be applied. A full analysis of present data using all the theoretical information collected so far is still missing and it will certainly be an object of research in the forthcoming years. The formalism  figure) and optimal (lower figure) TMD implementation. Here, the kinematics bin-integration, etc., is for the Z-boson production measure at ATLAS at 8 TeV [31]. described in this work is the one developed for unpolarized distributions. However the evolution factors are universal; that is, they are the same for polarized and unpolarized leading twist TMDs and they are valid in Drell-Yan, SIDIS experiments, and + − colliders, where the factorization theorem applies. All this formalism is expected to be tested on data in the near future. Nevertheless a lot of perturbative and nonperturbative information is still missing. Giving a look at Table 1 one can see that for many TMD one has only a lowest order perturbative calculation which should be improved in order to have a reliable description of data. While the information on the nonperturbative structure of TMD is still poor and still driven by phenomenological models, it is important to implement the TMD formalism in such a way that perturbative and nonperturbative effects are well separated. And among the nonperturbative effects, one should be able to distinguish the ones of the evolution kernel from the rest. In the text I have discussed a possible solution to this problem. Some prominent research lines which possibly will deserve more attention in the future include the cases where hadrons are measured inside the jets, see, for instance, [88][89][90] or outside a jet (say, hadron-jet interactions) [91][92][93] and lattice.

Conflicts of Interest
The authors declare that they have no conflicts of interest.