Analysis of top quark pair production signal from neutral 2HDM Higgs bosons at LHC

In this paper, top quark pair production events are analyzed as a source of neutral Higgs bosons of the type I two Higgs doublet model at the LHC. The production mechanism is $pp \to H/A \to t\bar{t}$, assuming a fully hadronic final state through $t \to Wb \to jjb$. To distinguish the signal from the main background, which is standard model $t\bar{t}$ production, we exploit the fact that the top quarks in signal events acquire a large Lorentz boost from the decay of the heavy neutral Higgs boson. This boost leads to three collinear jets (a fat jet), which serves as a discriminating tool for identifying top quarks from Higgs boson resonances. Events with two identified top jets are selected and the invariant mass of the top pair is calculated for both signal and background. It is shown that parts of the low $\tan\beta$ region that have not yet been excluded by flavor physics data can be covered by this analysis.

a e-mail: majid.hashemi@cern.ch, b e-mail: mahboob 72 00@yahoo.com
One of the motivations for the two Higgs doublet model (2HDM) is supersymmetry, in which each particle has a superpartner. Supersymmetry provides an elegant solution to gauge coupling unification, offers a dark matter candidate, and controls the radiative corrections to the Higgs boson mass through a natural tuning of parameters. In such a model, two Higgs doublets are required to give mass to the doubled particle spectrum [12][13][14].
There are four types of 2HDMs with different scenarios of Higgs-fermion couplings. The ratio of vacuum expectation values of the two Higgs doublets ($\tan\beta = v_2/v_1$) is a measure of the Higgs-fermion coupling in all 2HDM types [15].
In general, the 2HDM involves five physical Higgs bosons due to the extra degrees of freedom added to the model by the second Higgs doublet. The lightest Higgs boson, $h$, is like the SM Higgs boson. The rest are two neutral Higgs bosons, $H$ and $A$ (the subjects of this study), and two charged bosons, $H^\pm$. A review of the theory and phenomenology of the 2HDM can be found in [16].
In addition to direct searches for the 2HDM Higgs bosons at colliders, there are indirect searches based on flavor physics data, which look for deviations from SM predictions when processes involving 2HDM Higgs bosons are introduced [17]. Limits obtained from this type of study are among the strongest on $\tan\beta$ and on the masses of the charged and neutral Higgs bosons, and will be referred to when presenting the final results.
The scenario adopted in this analysis is a search for a heavy neutral Higgs boson with mass in the range 0.5-1 TeV at the LHC operating at $\sqrt{s} = 14$ TeV. All heavy Higgs bosons (CP-even, CP-odd and charged) are assumed to be degenerate, i.e., $m_H = m_A = m_{H^\pm}$. The region of interest is low $\tan\beta$ and the final results are limited to $\tan\beta < 2$. The signal process is $pp \to H/A \to t\bar{t} \to W^+ b\, W^- \bar{b} \to jjb\, jjb$. The fully hadronic final state is expected to result in two fat jets (each consisting of three subjets associated with a top quark), which are examined using the updated HEPTopTagger 2 [18,19]. Events that contain two identified (tagged) top jets are used to fill the top pair invariant mass histogram. The same approach is applied to background events, and a final shape discrimination is performed to evaluate the signal significance. Before the details of the analysis, a brief review of the theoretical framework is presented in the next section.

The Higgs sector of 2HDM
The 2HDM Lagrangian for neutral Higgs-fermion couplings is taken in the form introduced in [20], with $U$ ($D$) being the up(down)-type quarks, $L$ the lepton fields, $h$, $H$, $A$ the neutral Higgs boson fields, $\kappa_f = \sqrt{2}\, m_f / v$ for any fermion type $f$, $s_{\beta-\alpha} = \sin(\beta - \alpha)$ and $c_{\beta-\alpha} = \cos(\beta - \alpha)$. The $\rho_f$ parameters define the model type and are proportional to $\kappa_f$ as in Tab. 1 [21]. The four types of interactions (2HDM types) therefore depend on the values of $\rho_f$ [22].
In this study, we require $s_{\beta-\alpha} = 1$, which has two advantages. The first is that the $s_{\beta-\alpha}$ factor in the coupling of the lightest Higgs boson to gauge bosons is set to unity, while the heavier Higgs boson, $H$, decouples from gauge bosons [16]. The second is that the SM-like Higgs-fermion interactions become independent of $\tan\beta$.
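The first point can be made explicit with a short relation. It is written here in the standard 2HDM convention as a reminder rather than quoted from [16]; the symbols $g_{hVV}$, $g_{HVV}$ and $g^{\mathrm{SM}}_{VV}$ are introduced only for this illustration. At tree level, the couplings of the CP-even Higgs bosons to gauge boson pairs $V = W, Z$ scale relative to the SM coupling as

$$ g_{hVV} = s_{\beta-\alpha}\, g^{\mathrm{SM}}_{VV}, \qquad g_{HVV} = c_{\beta-\alpha}\, g^{\mathrm{SM}}_{VV}, \qquad s^2_{\beta-\alpha} + c^2_{\beta-\alpha} = 1 . $$

Setting $s_{\beta-\alpha} = 1$ therefore gives $g_{hVV} = g^{\mathrm{SM}}_{VV}$ and $g_{HVV} = 0$, which is the decoupling of $H$ from gauge bosons used in this study.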
According to Tab. 1, type I is interesting at low $\tan\beta$ because all couplings in the neutral Higgs sector are proportional to $\cot\beta$. This common factor therefore cancels in the branching ratios of Higgs boson decays to leptons and quarks. The fermion mass thus plays the decisive role in the decay rate and, as seen from Figs. 1 and 2, the Higgs boson decay to $t\bar{t}$ dominates for all relevant Higgs boson masses and $\tan\beta$ values. The decay to a pair of gluons proceeds predominantly through a top quark loop and is the second most important channel. The third channel is $H/A \to b\bar{b}$, which has been shown to be visible at the LHC [23]. The current study focuses on $H/A \to t\bar{t}$, whose branching ratio is near unity and independent of the Higgs boson mass (Fig. 1) and $\tan\beta$ (Fig. 2).

Signal and background cross sections
The signal process under study is heavy neutral Higgs boson production with Higgs boson masses in the range 500-1000 GeV. The masses of the three heavy Higgs bosons are set equal in order to minimize $\Delta\rho$ [24]. All selected points are checked for consistency with the potential stability, perturbativity and unitarity requirements and with the current experimental limits on Higgs boson masses using 2HDMC 1.6.3 [25,26].
There have been phenomenological searches for a leptophilic Higgs boson within the type IV 2HDM at the LHC [27] and at linear colliders [28,29]. These searches are based on leptonic decays of the Higgs boson. The type I 2HDM, on the other hand, can be considered a leptophobic model in which Higgs boson decays to quarks play an important role. At first glance, decays to all fermions are relevant at low $\tan\beta$ values. However, the fermion mass in the Higgs-fermion vertex enhances the top quark coupling dramatically compared to the other channels, because the common $\cot\beta$ factors cancel when calculating the branching ratios of Higgs decays to fermions. Therefore, in this analysis, the Higgs boson decay to $t\bar{t}$ is considered as the signal.
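The cancellation of $\cot\beta$ and the dominance of the top channel can be illustrated numerically. The sketch below is not the calculation behind Figs. 1 and 2 (those use 2HDMC). It uses the standard leading-order width $\Gamma(\phi \to f\bar{f}) \propto N_c\, m_f^2\, m_\phi\, \beta_f^p \cot^2\beta$ with $p = 3$ for a CP-even and $p = 1$ for a CP-odd state, compares only the $t\bar{t}$, $b\bar{b}$ and $\tau\tau$ channels, and ignores QCD corrections and the loop-induced $gg$ channel. The fermion mass values and all function names are assumptions made for this illustration.

```python
import math

G_F = 1.1663787e-5  # Fermi constant in GeV^-2
# Illustrative fermion masses in GeV (assumed values, not taken from the paper)
FERMIONS = {"tt": (172.5, 3), "bb": (4.18, 3), "tautau": (1.777, 1)}


def width_ff(m_higgs, m_f, n_c, tan_beta, cp_odd=False):
    """Leading-order width of a type I neutral Higgs to a fermion pair (GeV)."""
    if m_higgs <= 2.0 * m_f:
        return 0.0  # channel kinematically closed
    beta = math.sqrt(1.0 - 4.0 * m_f**2 / m_higgs**2)
    power = 1 if cp_odd else 3  # threshold behaviour: beta for A, beta^3 for H
    prefactor = n_c * G_F * m_f**2 * m_higgs / (4.0 * math.sqrt(2.0) * math.pi)
    return prefactor * beta**power / tan_beta**2  # common cot^2(beta) factor


def fermionic_fractions(m_higgs, tan_beta, cp_odd=False):
    """Share of each channel in the summed fermionic width."""
    widths = {
        name: width_ff(m_higgs, m_f, n_c, tan_beta, cp_odd)
        for name, (m_f, n_c) in FERMIONS.items()
    }
    total = sum(widths.values())
    return {name: w / total for name, w in widths.items()}
```

Calling `fermionic_fractions(500.0, 0.5)` and `fermionic_fractions(500.0, 2.0)` returns the same shares up to rounding, since $\cot^2\beta$ multiplies every width, while the individual widths differ by a factor of 16.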
While the neutral Higgs boson searches at LEP [30,31] lead to $m_A \geq 93.4$ GeV, the LHC results [32,33] indicate that neutral Higgs boson masses in the range $m_{H/A} = 200-400$ GeV are excluded for $\tan\beta \geq 5$. This result is based on the minimal supersymmetric standard model (MSSM), which has a different Higgs boson spectrum from the 2HDM due to supersymmetry constraints. Since our region of interest is Higgs boson masses above 500 GeV, no constraints from LEP or the LHC limit the current analysis.
There are also results from flavor physics data that impose a lower limit of 480 GeV on the charged Higgs mass in types II and III [34]. An update to this work is reported in [35], where low $\tan\beta$ values are excluded to some extent. Such analyses are based on the contribution of additional Feynman diagrams involving charged Higgs bosons, whose effect depends on the type of 2HDM. Types I and IV behave differently from types II and III as far as the charged Higgs coupling to quarks is concerned. In the former, the charged Higgs coupling to all quark types is suppressed as $\tan\beta$ increases, while in the latter, the coupling to at least one type of quark (up type or down type) is enhanced with $\tan\beta$. Charged Higgs limits from flavor physics in types I and IV are therefore very soft and essentially relevant only at $\tan\beta$ values as low as 2, which is the search region of this analysis. Although we are dealing with neutral Higgs bosons, the scenario under study is degenerate, with $m_H = m_A = m_{H^\pm}$, so limits on the charged Higgs boson are propagated into the final results.
The signal cross section times the branching ratio of the Higgs boson ($H/A$) decay to $t\bar{t}$ is shown in Figs. 3 and 4. It decreases with increasing Higgs boson mass as well as with increasing $\tan\beta$. The most suitable search region is therefore where both the mass and $\tan\beta$ are as low as possible.
The main SM background processes are $t\bar{t}$, gauge boson pair production ($WW$, $WZ$, $ZZ$), $s$-channel and $t$-channel single top, single $W$ and single $Z/\gamma^*$. The signal and background cross sections are listed in Tab. 2.

Signal selection and analysis
The generation of signal and background events starts with PYTHIA 8 [36] followed by jet reconstruction using FASTJET 2.8 [37,38].
The jet reconstruction algorithms are classified according to their subjet distance measures, which can be written as $d_{j_1 j_2} = \frac{\Delta R^2_{j_1 j_2}}{D^2} \min\left(p_{T,j_1}^{2n},\, p_{T,j_2}^{2n}\right)$ with $n = -1, 0, 1$ for the anti-$k_T$, Cambridge/Aachen (CA) and $k_T$ algorithms respectively. The $k_T$ algorithm first combines the soft and collinear subjets and is suitable for reconstructing the QCD splitting history in a top tagging algorithm. The anti-$k_T$ algorithm first combines the hardest subjets to obtain a stable jet with a clean jet boundary. The CA algorithm always combines the most collinear subjets while not being sensitive to soft splittings, and is therefore suitable for top tagging reconstruction. The algorithm adopted by HEPTopTagger is thus CA with a cone size of $\Delta R = 1.5$.
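As a minimal sketch of the distance measure above, the following function evaluates $d_{j_1 j_2}$ for the three choices of $n$. It covers only the pairwise distance, not the full sequential clustering or the beam distance, and it is not FASTJET code. The representation of a subjet as a $(p_T, y, \phi)$ tuple and the function names are assumptions made for this illustration.

```python
import math

# Exponent n in the distance measure for each algorithm
EXPONENT = {"anti-kt": -1, "cambridge-aachen": 0, "kt": 1}


def delta_r2(y1, phi1, y2, phi2):
    """Squared angular distance, with the azimuthal difference wrapped to [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (y1 - y2) ** 2 + dphi**2


def pair_distance(jet1, jet2, algorithm, big_d=1.5):
    """d_{j1 j2} = (Delta R^2 / D^2) * min(pT1^(2n), pT2^(2n)).

    Each jet is a (pT, rapidity, phi) tuple; this layout is a choice made here.
    """
    n = EXPONENT[algorithm]
    pt1, y1, phi1 = jet1
    pt2, y2, phi2 = jet2
    weight = min(pt1 ** (2 * n), pt2 ** (2 * n))
    return delta_r2(y1, phi1, y2, phi2) / big_d**2 * weight
```

For $n = 0$ the transverse momentum weight is 1, so the CA distance is purely angular, which is why CA merges the most collinear pair first regardless of how soft it is.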
Table 2 The signal and background cross sections at $\sqrt{s} = 14$ TeV. The "sts" and "stt" denote the $s$-channel and $t$-channel single top processes respectively.
HEPTopTagger is one of the recent algorithms introduced for boosted top quark reconstruction [39]. It is based on CA jet reconstruction with $\Delta R = 1.5$ and a top jet candidate $p_T$ above 200 GeV. The threshold can be lowered to 150 GeV without significant loss of efficiency [40,41]. Starting from the collection of fat jets, the top tagging algorithm first undoes the last clustering of the top jet candidate $j$ and requires the mass drop criterion $\min m_{j_i} < 0.8\, m_j$, where $j_i$ is the $i$th subjet of the jet $j$. Subjets with $m_j < 30$ GeV are not decomposed further, which ends the unclustering iteration.
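The unclustering step can be sketched on a toy clustering tree. This is an illustration, not the HEPTopTagger implementation: the `Node` class, the function name, and the rule of following only the heavier subjet when the mass drop criterion fails are assumptions made here, and a real tagger works on four-momenta from the CA clustering history rather than on bare masses.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Toy clustering-tree node: a jet mass and the two subjets it was merged from."""
    mass: float
    children: Optional[Tuple["Node", "Node"]] = None


def hard_substructures(jet: Node, drop: float = 0.8, m_min: float = 30.0) -> List[Node]:
    """Recursively undo clusterings and collect the remaining subjets."""
    found: List[Node] = []

    def uncluster(j: Node) -> None:
        # Subjets below m_min (or without recorded history) are not decomposed further
        if j.mass < m_min or j.children is None:
            found.append(j)
            return
        heavier, lighter = sorted(j.children, key=lambda c: c.mass, reverse=True)
        if min(heavier.mass, lighter.mass) < drop * j.mass:
            # Mass drop criterion satisfied: keep and decompose both subjets
            uncluster(heavier)
            uncluster(lighter)
        else:
            # Assumption of this sketch: otherwise follow only the heavier subjet
            uncluster(heavier)

    uncluster(jet)
    return found
```

For a toy top candidate of mass 175 GeV that splits into an 80 GeV subjet (itself made of two light subjets) and one light subjet, the function returns the three light subjets, which is the input expected by the next step.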
In the second step, a filtering is applied to find a three-subjet combination with a jet mass within $m_t \pm 25$ GeV.
In the last step, with the jets sorted in $p_T$, several requirements are applied to find the best combination of subjets, in which two subjets give the best $W$ boson invariant mass and all three subjets together are consistent with the top quark invariant mass. Details of these criteria are given in [40].
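The mass requirement of the second step can be illustrated with a short sketch that scans all three-subjet combinations and keeps the one closest to $m_t$ inside the $\pm 25$ GeV window. It does not reproduce the HEPTopTagger filtering itself or the $W$ mass criteria of [40]. The value of $m_t$, the $(E, p_x, p_y, p_z)$ tuple layout and the function names are assumptions made for this illustration.

```python
import math
from itertools import combinations

M_TOP = 172.5  # GeV, assumed value for illustration


def invariant_mass(vectors):
    """Invariant mass of a set of (E, px, py, pz) four-vectors in GeV."""
    e = sum(v[0] for v in vectors)
    px = sum(v[1] for v in vectors)
    py = sum(v[2] for v in vectors)
    pz = sum(v[3] for v in vectors)
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


def best_top_triplet(subjets, m_top=M_TOP, window=25.0):
    """Return (indices, mass) of the triplet closest to m_top, or None if none fits."""
    best = None
    for indices in combinations(range(len(subjets)), 3):
        mass = invariant_mass([subjets[i] for i in indices])
        if abs(mass - m_top) > window:
            continue  # outside the m_t +/- window requirement
        if best is None or abs(mass - m_top) < abs(best[1] - m_top):
            best = (indices, mass)
    return best
```

The same `invariant_mass` helper, applied to the four-vectors of the two tagged top jets, gives the Higgs boson candidate mass used in the remainder of the analysis.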
Running the algorithm yields a selection efficiency for each signal sample. The same procedure is applied to the background samples. An event is required to have two identified top jets. The invariant mass of the two top jets is calculated as the Higgs boson candidate mass. Both signal and background distributions of the top quark pair invariant mass are normalized according to the corresponding cross sections. The signal on top of the background is then plotted for each benchmark point, as seen in Figs. 5-10.
At this stage, since a large number of background events still populate the signal region, a mass window is applied to select the signal and increase the signal to background ratio. The position of the mass window (both left and right edges) is determined by an automatic search for the maximum signal significance: a loop over the histogram bins finds the left and right bins between which the signal significance is maximal. Table 3 shows the mass window position, the total efficiencies for signal and background events, the final numbers of signal and background events passing the mass window cut, their ratio, and the signal significance $S/\sqrt{B}$ for $\tan\beta = 0.5$ and 1. The integrated luminosity is set to 300 fb$^{-1}$. Table 3 clearly shows the high sensitivity of the signal significance to $\tan\beta$. The analysis is thus relevant to $\tan\beta$ values as low as $\sim 2$.
Figure 11 shows the signal significance as a function of the Higgs boson mass for different $\tan\beta$ values. The dashed horizontal line indicates the $5\sigma$ significance. Using the analysis results for Higgs boson masses from 500 GeV to 1000 GeV, one can obtain the 95% C.L. exclusion region and the $5\sigma$ discovery contours. Figure 12 shows the exclusion region at 95% C.L., including the recent result from [35]. The result reported in [35] is given as a limit on the charged Higgs mass as a function of $\tan\beta$; it is included in the current work as a limit on all heavy Higgs bosons, since their masses are equal in the scenario adopted in this analysis. The $5\sigma$ contour is shown in Fig. 13.
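The automatic mass window search described above can be sketched as a double loop over histogram bins. This is a minimal illustration under stated assumptions, not the analysis code: the histograms are taken to be plain lists of expected event counts per bin, already normalized to cross section and luminosity, and the function name is hypothetical.

```python
import math


def best_mass_window(signal, background):
    """Scan all contiguous bin ranges and return (left, right, S/sqrt(B)) at the maximum.

    `signal` and `background` are equal-length lists of expected event counts
    per invariant-mass bin. Windows with no background are skipped.
    """
    best = (None, None, 0.0)
    n_bins = len(signal)
    for left in range(n_bins):
        s_sum = 0.0
        b_sum = 0.0
        for right in range(left, n_bins):
            # Extend the window by one bin and update the running sums
            s_sum += signal[right]
            b_sum += background[right]
            if b_sum <= 0.0:
                continue  # significance undefined without background
            significance = s_sum / math.sqrt(b_sum)
            if significance > best[2]:
                best = (left, right, significance)
    return best
```

The scan costs $O(n^2)$ in the number of bins, which is negligible for a one-dimensional histogram. Because the window is chosen on the same distributions used to quote the significance, such a search tends to favor upward fluctuations when the samples are statistically limited.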
As seen from Figs. 12 and 13, both exclusion and discovery are possible in regions not yet excluded by any experimental or phenomenological analysis. Any excess of top pair events over the SM background could therefore be regarded as a sign of new physics, in particular the 2HDM. It should be noted that a full set of background processes was studied in this analysis. However, all background processes other than SM $t\bar{t}$ led to very small numbers of events, negligible compared to SM $t\bar{t}$. The final plots are therefore based on the signal on top of the $t\bar{t}$ distribution, without any sizable error from this simplification.

Conclusions
Sources of $t\bar{t}$ events beyond those expected in the standard model can arise in theories beyond the standard model, such as two Higgs doublet models. In the type I 2HDM, the decay of the heavy neutral (CP-even or CP-odd) Higgs boson to $t\bar{t}$ dominates over the other channels. In such a scenario, a proton-proton collision may create a neutral Higgs boson decaying to $t\bar{t}$. The signal from such a process can be observed as an excess of top pair events over what is expected from the SM. The discriminating tool can be the top pair invariant mass distribution, filled with events containing two top jets from both signal and background processes. The analysis performed in this work shows that such a signal is observable at an integrated luminosity of 300 fb$^{-1}$ for $\tan\beta$ values that depend on the Higgs boson mass. Exclusion at 95% C.L. is also possible at the same integrated luminosity for $\tan\beta < 2$, with $m_{H/A} = 600$ GeV as the best point.