Physics of Gamma-Ray Bursts Prompt Emission

In recent years, our understanding of gamma-ray burst (GRB) prompt emission has been revolutionized, due to a combination of new instruments, new analysis methods and novel ideas. In this review, I describe the most recent observational results and the current theoretical interpretation. Observationally, a major development is the rise of time-resolved spectral analysis. This led to (I) identification of a distinguished high energy component, with GeV photons often seen at a delay; and (II) firm evidence for the existence of a photospheric (thermal) component in a large number of bursts. These results triggered many theoretical efforts aimed at understanding the physical conditions in the inner jet regions from which the prompt photons are emitted, as well as the spectral diversity observed. I highlight some areas of active theoretical research. These include: (I) understanding the role played by magnetic fields in shaping the dynamics of GRB outflow and spectra; (II) understanding the microphysics of kinetic and magnetic energy transfer, namely accelerating particles to high energies in both shock waves and magnetic reconnection layers; (III) understanding how sub-photospheric energy dissipation broadens the "Planck" spectrum; and (IV) geometrical light aberration effects. I highlight some of these efforts, and point towards gaps that still exist in our knowledge as well as promising directions for the future.


Introduction
In spite of extensive study for nearly a generation, understanding of gamma-ray burst (GRB) prompt emission still remains an open question. The main reason for this is the nature of the prompt emission phase: the prompt emission typically lasts a few seconds (or less), without repetition and with a variable lightcurve. Furthermore, the spectra vary from burst to burst, and do not show any clear feature that could easily be associated with any simple emission model. This is in contrast to the afterglow phase, which lasts much longer, up to years, with (relatively) smooth, well characterized behavior. These features enable afterglow studies using long term, multi-waveband observations, as well as relatively easy comparison with theories.

A. Pe'er, Physics Department, University College Cork, Cork, Ireland. Tel.: +353-21-4902594. E-mail: a.peer@ucc.ie
Nonetheless, I think it is fair to claim that in recent years understanding of GRB prompt emission has been revolutionized. This follows the launch of the Swift satellite in 2004 and the Fermi satellite in 2008. These satellites enable much more detailed studies of the prompt emission, both in the spectral and temporal domains. The new data led to the realization that the observed spectra are composed of several distinct components. (I) A thermal component, identified on top of a non-thermal spectrum, was observed in a large number of bursts. This component shows a unique temporal behavior. (II) There is evidence that the very high energy (> GeV) part of the spectrum evolves differently than the lower energy part, and hence is likely to have a separate origin. (III) The sharp cutoff in the lightcurves of many GRBs observed by Swift enables a clear discrimination between the prompt and the afterglow phases.
The decomposition of the spectra into separate components, presumably with different physical origins, enabled an independent study of the properties of each component, as well as study of the complex connection between the different components. Thanks to these studies, we are finally reaching a critical point at which a self-consistent physical picture of the GRB prompt emission, more complete than ever, is emerging. This physical insight is of course a crucial link that connects the physics of GRB progenitor stars with that of their environments.
Many of the ideas gained in these studies are relevant to many other astronomical objects, such as active galactic nuclei (AGNs), X-ray binaries (XRBs) and tidal disruption events (TDEs). All these transient objects share the common feature of having (trans)-relativistic jetted outflows. Therefore, despite the obvious differences, many similarities between various underlying physical processes in these objects and in GRBs are likely to exist. These include the basic questions of jet launching and propagation, as well as the microphysics of energy transfer via magnetic reconnection and particle acceleration to high energies. Furthermore, understanding the physical conditions that exist during the prompt emission phase enables the study of other fundamental questions such as whether GRBs are sources of (ultra-high energy) cosmic rays and neutrinos, as well as the potential of detecting gravitational waves associated with GRBs.
In this review, I will describe the current (Dec. 2014) observational status, as well as the emerging theoretical picture. I will emphasize a major development of recent years, namely the realization that photospheric emission may play a key role, both directly and indirectly, as part of the observed spectra. I should stress, though, that in spite of several major observational and theoretical breakthroughs that took place in recent years, our understanding is still far from complete. I will discuss the gaps that still exist in our knowledge, and novel ideas raised in addressing them. I will point to current scientific efforts, which are focused on different, sometimes even orthogonal, directions.
Owing to the rapid progress in this field, the past decade has seen many excellent reviews covering various aspects of GRB phenomenology and physics. A partial list includes reviews by Waxman (2003); Piran (2004); Zhang and Mészáros (2004); Mészáros (2006); Nakar (2007); Zhang (2007); Fan and Piran (2008); Gehrels and Mészáros (2012); Bucciantini (2012); Daigne (2013); Zhang (2014); Kumar and Zhang (2014); Berger (2014); Mészáros and Rees (2014). My goal here is not to compete with these reviews, but to highlight some of the recent, and partially still controversial, results and developments in this field, as well as to point to current and future directions which are promising paths.
This review is organized as follows. In section 2 I discuss the current observational status. I discuss the lightcurves (§2.1), observed spectra (§2.2), polarization (§2.3), counterparts at high and low energies (§2.4) and notable correlations (§2.5). I particularly emphasize the different models used today in fitting the prompt emission spectra. Section 3 is devoted to theoretical ideas. In my opinion, the easiest way to understand the nature of GRBs is to follow the various episodes of energy transfer that occur during the GRB evolution. I thus begin by discussing models of GRB progenitors (§3.1), which provide the source of energy. This is followed by a discussion of models of relativistic expansion, both "hot" (photon-dominated) (§3.2) and "cold" (magnetic-dominated) (§3.3). I then discuss recent progress in understanding how dissipation of the kinetic and/or magnetic energy is used in accelerating particles to high energies (§3.4). I conclude with a discussion of the final stage of energy conversion, namely radiative processes by the hot particles as well as the photospheric contribution (§3.5), which lead to the observed signal. I close with a look into the future in §4.
2 Key observational properties

Lightcurves
The most notable property of GRB prompt emission lightcurves is that they are irregular, diverse and complex. No two GRB lightcurves are identical, a fact which obviously makes their study challenging. While some GRBs are extremely variable, with variability time scales in the millisecond range, others are much smoother. Some have only a single peak, while others show multiple peaks; see Figure 1. Typically, individual peaks are not symmetric, but show a "fast rise exponential decay" (FRED) behavior.
The total duration of GRB prompt emission is traditionally defined by the "$T_{90}$" parameter, which is the time interval between the epochs when 5% and 95% of the total fluence is detected. As thoroughly discussed by [Kumar and Zhang (2014)], this (arbitrary) definition is very subjective, for several reasons. (1) It depends on the energy range and sensitivity of the different detectors. (2) Intrinsic lightcurves differ: some are very spiky with gaps between the spikes, while others are smooth. (3) No discrimination is made between the "prompt" phase and the early afterglow emission. (4) It does not take into account the difference in redshifts between the bursts, which can be substantial.
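The definition above can be made concrete with a short sketch. It assumes a background-subtracted lightcurve with strictly positive counts in every bin (so the cumulative fluence is monotonic); the function names `t90` and `rest_frame_duration` are illustrative, not taken from any instrument pipeline. The second function addresses point (4) by dividing the observed duration by (1 + z).

```python
import numpy as np

def t90(times, counts):
    """Return (T90, t05, t95) for a background-subtracted lightcurve.

    times  : bin times in seconds, ascending
    counts : background-subtracted counts (or fluence) per bin, assumed > 0
    """
    times = np.asarray(times, dtype=float)
    counts = np.asarray(counts, dtype=float)
    # Normalized cumulative fluence, rising from ~0 to 1
    cumulative = np.cumsum(counts) / np.sum(counts)
    # Epochs at which 5% and 95% of the total fluence have been collected
    t05 = np.interp(0.05, cumulative, times)
    t95 = np.interp(0.95, cumulative, times)
    return t95 - t05, t05, t95

def rest_frame_duration(t90_observed, redshift):
    """Correct an observed duration for cosmological time dilation."""
    return t90_observed / (1.0 + redshift)

# Toy example: a flat 100 s lightcurve sampled in 1 s bins
bin_times = np.arange(0.5, 100.5, 1.0)
duration, t_start, t_stop = t90(bin_times, np.ones_like(bin_times))
```

Because the result depends on the background subtraction and on the energy band of `counts`, the same burst can yield different values in different detectors, which is point (1) above.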
In spite of these drawbacks, $T_{90}$ is still the most commonly used parameter in describing the total duration of the prompt phase. While $T_{90}$ is observed to vary between milliseconds and thousands of seconds (the longest to date is GRB111209A, with duration of ∼ $2.5 \times 10^4$ s [Gendre et al. (2013b)]), from the early 1990's, it was noted that the $T_{90}$ distribution of GRBs is bimodal. About ≲ 1/4 of GRBs in the BATSE catalog are "short", with average $T_{90}$ of ∼ 0.2 − 0.3 s, and roughly 3/4 are "long", with average $T_{90}$ ≈ 20 − 30 s [Paciesas et al. (1999)]. The boundary between these two distributions is at ∼ 2 s. Similar results are obtained by Fermi (see Figure 2), though the subjective definition of $T_{90}$ results in a somewhat different ratio, where only 17% of Fermi-GBM bursts are considered as "short", the rest being long [Qin et al. (2013); von ]. A similar conclusion, though with a much smaller sample and an even smaller fraction of short GRBs, is obtained from the Swift-BAT catalog and by INTEGRAL [Bošnjak et al. (2014)]. These results do not change if instead one uses the $T_{50}$ parameter, defined in a similar way. These results are accompanied by different hardness ratios (the ratio between the observed photon flux at the high and low energy bands of the detector), where short bursts are, on average, harder (higher ratio of energetic photons) than long ones.

Fig. 1 Light curves of 12 bright gamma-ray bursts detected by BATSE. Gamma-ray burst light curves display a tremendous amount of diversity and few discernible patterns. This sample includes short events and long events (duration ranging from milliseconds to minutes), events with smooth behavior and single peaks, and events with highly variable, erratic behavior with many peaks. Created by Daniel Perley with data from the public BATSE archive (http://gammaray.msfc.nasa.gov/batse/grb/catalog/).

Other clues to a different origin are the association of only the long GRBs with core collapse supernovae of type Ib/c [Galama et al. (1998); Hjorth et al. (2003); Stanek et al. (2003); Pian et al. (2006); Cobb et al. (2010); Starling et al. (2011)], which are not found in short GRBs [Kann et al. (2011)]; and the association of short GRBs with galaxies with little star formation (as opposed to long GRBs, which are found in star forming galaxies), where they reside at different locations within their host galaxies than long GRBs [Fox et al. (2005); Villasenor et al. (2005); Troja et al. (2008)]. Altogether, these results suggest two different progenitor classes. However, a more careful analysis reveals a more complex picture with many outliers to these rules [e.g., Zhang et al. (2007b); Nysewander et al. (2009); Norris et al. (2011); Berger (2011); Bromberg et al. (2013)]. It is therefore possible, maybe even likely, that the population of short GRBs has more than a single progenitor (or physical origin). In addition, there have been several claims for a small, third class of "intermediate" GRBs, with $T_{90}$ ∼ 2 s [Mukherjee et al. (1998); Horváth (1998); Horváth et al. (2006); Veres et al. (2010)], but this is still controversial [e.g., Hakkila et al. (2003); Bromberg et al. (2013)].
To further add to the confusion, the lightcurve itself varies with energy band (e.g., Figure 3). One of Fermi's most important results, in my view, is the discovery that the highest energy photons (in the LAT band) are observed to both (I) lag behind the emission at lower energies; and (II) last longer. Both these results are seen in Figure 3. Similarly, the width of individual pulses is energy dependent. It was found that the pulse width ω varies with energy as $\omega(E) \propto E^{-\alpha}$ with α ∼ 0.3 − 0.4 [Liang et al. (2006)].

Fig. 3 Light curves for GRB 080916C observed with the GBM and the LAT detectors on board the Fermi satellite, from lowest to highest energies. The top graph shows the sum of the counts, in the 8 to 260 keV energy band, of two NaI detectors. The second is the corresponding plot for BGO detector 0, between 260 keV and 5 MeV. The third shows all LAT events passing the on board event filter for gamma-rays. (Insets) Views of the first 15 s from the trigger time. In all cases, the bin width is 0.5 s; the per-second counting rate is reported on the right for convenience. Taken from Abdo et al. (2009c).
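As a minimal illustration of the two quantities used in this classification, the sketch below labels a burst by its duration using the ∼ 2 s boundary quoted above and computes a hardness ratio from two band fluxes. The function names and the choice of bands are assumptions made for the example; real catalogs define the bands per instrument.

```python
def duration_class(t90_seconds, boundary_seconds=2.0):
    """Label a burst 'short' or 'long' using the ~2 s boundary in T90."""
    return "short" if t90_seconds < boundary_seconds else "long"

def hardness_ratio(high_band_photon_flux, low_band_photon_flux):
    """Ratio of photon flux in the high energy band to that in the low band."""
    return high_band_photon_flux / low_band_photon_flux

# Hypothetical bursts with durations typical of the two populations
labels = [duration_class(t) for t in (0.3, 25.0)]
```

A hard cut at 2 s is only a convention: as discussed in the text, the two distributions overlap and the boundary depends on the detector.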
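The energy dependence of the pulse width can be illustrated numerically. The sketch below only evaluates the quoted scaling ω(E) ∝ E^(−α); the default index of 0.35 is simply the middle of the quoted range, and the function name is hypothetical.

```python
def pulse_width_ratio(energy_high, energy_low, alpha=0.35):
    """Return w(energy_high) / w(energy_low) for w(E) proportional to E**(-alpha)."""
    return (energy_high / energy_low) ** (-alpha)

# Width at 1 MeV relative to the width at 10 keV (energies in the same units)
ratio = pulse_width_ratio(1000.0, 10.0)
```

With α in the range 0.3 to 0.4, a factor of 100 in photon energy narrows the pulse by a factor of roughly 4 to 6, so pulses are always narrower at higher energies.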
Already in the BATSE era, several bursts were found to have "ultra-long" duration, with $T_{90}$ exceeding ∼ $10^3$ s [e.g., Nicastro et al. (2004)]. Recently, several additional bursts were found in this category (e.g., GRB 091024A, GRB 101225A, GRB 111209A, GRB 121027A and GRB 130925A [Gendre et al. (2013b); Stratta et al. (2013); Levan et al. (2014); Evans et al. (2014)]), which raised the idea of a new class of GRBs. If these bursts indeed represent a separate class, they may have a different progenitor than that of "regular" long GRBs [Gendre et al. (2013a); Nakauchi et al. (2013); Levan et al. (2014)]. However, recent analysis showed that bursts with duration $T_{90}$ ∼ $10^3$ s need not belong to a special population, while bursts with $T_{90}$ ≳ $10^4$ s may belong to a separate population [Gao and Mészáros (2014)].

"Band" model
In order to avoid biases towards a preferred physical emission model, GRB spectra are traditionally fitted with a mathematical function, which is known as the "Band" function (after the late David Band) [Band et al. (1993)]. This function has become the standard in this field, and is often referred to as the "Band model". The photon number spectra in this model are given by:

$$
N_E(E) = A \times
\begin{cases}
\left(\dfrac{E}{100\,{\rm keV}}\right)^{\alpha} \exp\left(-\dfrac{E}{E_0}\right), & E < (\alpha-\beta)E_0, \\[2ex]
\left[\dfrac{(\alpha-\beta)E_0}{100\,{\rm keV}}\right]^{\alpha-\beta} \exp(\beta-\alpha) \left(\dfrac{E}{100\,{\rm keV}}\right)^{\beta}, & E \geq (\alpha-\beta)E_0.
\end{cases}
$$

This model thus has 4 free parameters: the low energy spectral slope, α, the high energy spectral slope, β, the break energy, $E_0$, and an overall normalization, A. It is found that such a simplistic model, which resembles a "broken power law", is capable of providing good fits to many different GRB spectra; see Figure 4 for an example. Thus, this model is by far the most widely used in describing GRB spectra. Some variations of this model have been introduced in the literature. Examples are a single power law (PL), a "smooth broken power law" (SBPL), or a "Comptonized model" (Comp) [see, e.g., Kaneko et al. (2006); Nava et al. (2011b); Goldstein et al. (2012)]. These are very similar in nature, and do not, in general, provide a better physical insight.
On the down side, clearly, having only 4 free parameters, this model is unable to capture complex spectral behavior that is now known to exist, such as the different temporal behavior of the high energy emission discussed above. Even more importantly, as will be discussed below, the limited number of free parameters in this model can easily lead to wrong conclusions. Furthermore, this model is, on purpose, mathematical in nature, and therefore fitting the data with this model does not, by itself, provide any clue about the physical origin of the emission. In order to obtain such an insight, one has to compare the fitted results to the predictions of different theoretical models.

When using the "Band" model to fit a large number of bursts, the key model parameters (the low and high energy slopes α and β and the peak energy $E_{\rm peak}$) show surprisingly narrow distributions (see Figure 5). The spectral properties of the two categories, short and long GRBs, detected by BATSE, INTEGRAL and Fermi are very similar, with only minor differences (Preece et al. 2000; Kaneko et al. 2006; Nava et al. 2011a; Goldstein et al. 2012; Bošnjak et al. 2014). The low energy spectral slope is roughly in the range −1.5 < α < 0, averaging at α ≃ −1. The distribution of the high energy spectral slope peaks at β ≃ −2. While typically β < −1.3, many bursts show a very steep β, consistent with an exponential cutoff. The peak energy averages around $E_{\rm peak}$ ≃ 200 keV, and it ranges from tens of keV up to ∼MeV (and even higher, in a few rare, exceptional bursts).

Fig. 5 Histograms of the distributions of "Band" model free parameters: the low energy slope α (left), peak energy $E_{\rm peak}$ (center) and the high energy slope β (right). The data represent 3800 spectra derived from 487 GRBs in the first Fermi-GBM catalogue. The difference between the solid and dashed curves is the goodness of fit: the solid curves represent fits which were done under the minimum $\chi^2$ criterion, and the dashed curves are for all GRBs in the catalogue. Figure adopted from Goldstein et al. (2012).
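For readers who want to evaluate the four-parameter model numerically, here is a minimal sketch. It assumes the standard functional form of Band et al. (1993) with a 100 keV pivot energy, and it assumes α > β so that the break energy (α − β)E_0 is positive; the function names are illustrative. The helper `band_peak_energy` uses the standard relation E_peak = (2 + α)E_0 for the peak of the E²N(E) spectrum, valid when β < −2 < α.

```python
import numpy as np

def band_photon_spectrum(energy_kev, amplitude, alpha, beta, e0_kev, pivot_kev=100.0):
    """Photon number spectrum N(E) of the Band function (assumes alpha > beta)."""
    energy = np.asarray(energy_kev, dtype=float)
    e_break = (alpha - beta) * e0_kev
    # Low energy branch: power law with an exponential cutoff
    low = amplitude * (energy / pivot_kev) ** alpha * np.exp(-energy / e0_kev)
    # High energy branch: pure power law, normalized to join smoothly at e_break
    high = (amplitude * (e_break / pivot_kev) ** (alpha - beta)
            * np.exp(beta - alpha) * (energy / pivot_kev) ** beta)
    return np.where(energy < e_break, low, high)

def band_peak_energy(alpha, e0_kev):
    """Peak of the E^2 N(E) spectrum, valid for beta < -2 < alpha."""
    return (2.0 + alpha) * e0_kev

# Example with typical slopes: alpha = -1, beta = -2.3, E0 = 300 keV
grid_kev = np.logspace(1, 4, 2000)
spectrum = band_photon_spectrum(grid_kev, 1.0, -1.0, -2.3, 300.0)
numerical_peak_kev = grid_kev[np.argmax(grid_kev ** 2 * spectrum)]
```

The two branches are constructed to be continuous, with a continuous first derivative, at the break energy, which is why the model looks like a smoothly broken power law.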
As can be seen in Figures 4 and 5, the "Band" fits to the spectra have three key spectral properties. (1) The prompt emission extends to very high energies, ≳ MeV. This energy is above the threshold for pair production ($m_e c^2$ = 0.511 MeV), which is the original motivation for relativistic expansion of GRB outflows (see below). (2) The "Band" fits do not resemble a "Planck" function; this is why thermal emission, which was initially suggested as the origin of GRB prompt spectra [Goodman (1986); Paczynski (1986)], was quickly abandoned, and not considered as a valid radiation process for a long time. (3) The values of the free "Band" model parameters, and in particular the value of the low energy spectral slope α, are not easily fitted with any simple broad-band radiative process such as synchrotron or synchrotron self-Compton (SSC). Although in some bursts synchrotron emission could be used to fit the spectra [e.g., Tavani (1996); Cohen et al. (1997); Panaitescu et al. (1999); Frontera et al. (2000)], this is not the case in the vast majority of GRBs [Crider et al. (1997); Preece et al. (1998); Ghirlanda et al. (2003)]. This was noted already in 1998, with the term "synchrotron line of death" coined by R. Preece [Preece et al. (1998)] to emphasize the inability of the synchrotron emission model to provide good fits to the spectra of most GRBs.
Indeed, these three observational properties introduce a major theoretical challenge, as currently no simple physically motivated model is able to provide a convincing explanation of the observed spectra. However, as already discussed above, the "Band" fits suffer from several inherent major drawbacks, and therefore the obtained results must be treated with great care.
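Two of the properties listed above can be turned into simple numerical checks. The sketch below is illustrative only and relies on standard textbook values that are not quoted in the text: the two-photon pair production condition E1 E2 (1 − cos θ) ≥ 2 (m_e c²)², and the limiting low energy photon indices of optically thin synchrotron emission, −2/3 for slow cooling electrons (the "line of death") and −3/2 for fast cooling. Function and constant names are hypothetical.

```python
ELECTRON_REST_ENERGY_MEV = 0.511
SLOW_COOLING_LIMIT = -2.0 / 3.0   # hardest alpha allowed by synchrotron (assumed textbook value)
FAST_COOLING_INDEX = -1.5         # alpha expected for fast cooling electrons (assumed textbook value)

def can_pair_produce(e1_mev, e2_mev, cos_theta=-1.0):
    """True if two photons meeting at angle theta exceed the pair production threshold.

    cos_theta = -1 corresponds to a head-on collision.
    """
    return e1_mev * e2_mev * (1.0 - cos_theta) >= 2.0 * ELECTRON_REST_ENERGY_MEV ** 2

def violates_line_of_death(alpha, alpha_err=0.0, n_sigma=1.0):
    """True if the fitted low energy slope is harder than -2/3 by n_sigma or more."""
    return alpha - n_sigma * alpha_err > SLOW_COOLING_LIMIT
```

For example, a burst fitted with α = −0.3 ± 0.1 violates the limit, while the average α ≃ −1 does not, although it is still harder than the fast cooling value of −3/2.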

"Hybrid" model
An alternative model for fitting the GRB prompt spectra was proposed by F. Ryde [Ryde (2004, 2005)]. Being aware of the limitations of the "Band" model, when analyzing BATSE data, Ryde proposed a "hybrid" model that contains a thermal component (a Planck function) and a single power law to fit the non-thermal part of the spectra (presumably, resulting from Comptonization of the thermal photons). Ryde's hybrid model thus contains four free parameters, the same number of free parameters as the "Band" model: two parameters fit the thermal part of the spectrum (temperature and thermal flux) and two fit the non-thermal part. Thus, as opposed to the "Band" model, which is mathematical in nature, Ryde's model suggests a physical interpretation of at least part of the observed spectra (the thermal part). An example of the fit is shown in Figure 6.
Clearly, a single power law cannot be considered a valid physical model in describing the non-thermal part of the spectra, as it diverges. Nonetheless, it can be an acceptable approximation when considering a limited energy range, as was available when analyzing BATSE data. While the hybrid model was able to provide comparable, or even better, fits with respect to the "Band" model to several dozen bright GRBs [Ryde (2004, 2005); Ryde and Pe'er (2009); McGlynn et al. (2009)], it was shown that this model overpredicts the flux at low energies (X-ray range) for many GRBs [Frontera et al. (2013)]. This discrepancy, however, can easily be explained by the over-simplification of the use of a single power law as a way to describe the non-thermal spectra both above and below the thermal peak. From a physical perspective, one expects Comptonization to modify the spectra above the thermal peak, but not below it; see discussion below.
As Fermi enables a much broader spectral coverage than BATSE, in recent years Ryde's hybrid model could be confronted with data over a broader spectral range. Indeed, it was found that in several bursts (e.g., GRB090510 [Ackermann et al. (2010)], GRB090902B [Abdo et al. (2009b); Ryde et al. (2010)], GRB110721A [Axelsson et al. (2012); Iyyani et al. (2013)], GRB100724B [Guiriec et al. (2011)], GRB100507 [Ghirlanda et al. (2013)] or GRB120323A) the broad band spectra are best fitted with a combined "Band + thermal" model (see Figure 7). In these fits, the peak of the thermal component is always found to be below the peak energy of the "Band" part of the spectrum. This is consistent with the rising "single power law" that was used in fitting the band-limited non-thermal spectra.
The "Band + thermal" model fits require six free parameters, as opposed to the four free parameters in both the "Band" and the original "hybrid" models. While this is considered a drawback, this model has several notable advantages. First, this model does not suffer from the energy divergence of a single power law fit, as in Ryde's original proposal. Second, in comparison with "Band" model fits, it shows significant improvement in quality, both in statistical errors (reduced $\chi^2$) and, even more importantly, in the behavior of the residuals: when fitting the data with a "Band" function, the residuals to the fit often show a "wiggly" behavior, implying that they are not randomly distributed. This is solved when adding the thermal component to the fits.
Similar to Ryde's original model, fits with the "Band + thermal" model can provide a physical explanation for only the thermal part of the spectra; they still do not suggest a physical origin for the non-thermal part of the spectra. Nonetheless, the addition of the thermal part implies that the values of the free model parameters used in fitting the non-thermal part, such as the low energy spectral slope (α) and the peak energy $E_{\rm peak}$, are different from the values that would have been obtained by pure "Band" fits (namely, without the thermal component; see Basak and Rao (2014); Deng and Zhang (2014); Guiriec et al. (2015)). In some bursts, the new values obtained are consistent with the predictions of synchrotron theory, suggesting a synchrotron origin of the non-thermal part [Burgess et al. (2014b); Yu et al. (2015)]. However, in many cases this interpretation is insufficient [e.g., Burgess et al. (2014a)]; see further discussion below. Another (relatively minor) drawback of these fits is that from a theoretical perspective, even if a thermal component exists in the spectra, it is expected to have the shape of a gray-body rather than a pure "Planck", due to light aberration (see below).
One therefore concludes that the "Band + thermal" fits, which became very popular recently, can be viewed as an intermediate step towards fully physically motivated fits of the spectra. They contain a mix of a physically motivated part (the thermal part) with an additional mathematical function (the "Band" part) whose physical origin still needs clarification.
As of today, a pure "Planck" spectral component is clearly identified in only a very small fraction of bursts. Nonetheless, there is a good reason to believe that it is in fact very ubiquitous, and that the main reason it is not clearly identified is its distortion. A recent work [Axelsson and Borgonovo (2015)] examined the width of the spectral peak, quantified by W, the ratio of energies that define the full width half maximum (FWHM). The results of an analysis of over 2900 different BATSE and Fermi bursts are shown in Figure 8. The smaller W is, the narrower the spectral width. Imposed on the sample are a line representing the spectral width of a pure "Planck" (black), and a line representing the spectral width for slow cooling synchrotron (red). Fast cooling synchrotron results in a much wider spectral width, which would be shown to the far right of this plot. Thus, while virtually all the spectral widths are wider than "Planck", over ∼ 80% are narrower than allowed by the synchrotron model. On the one hand, "narrowing" a synchrotron spectrum is (nearly) impossible. However, there are various ways, which will be discussed below, in which a pure "Planck" spectrum can be broadened. Thus, although a "pure" Planck is very rare, these data suggest that broadening of the "Planck" spectra plays a major role in shaping the spectral shape of the vast majority of GRB spectra.

Fig. 7 The spectrum of GRB110721A is best fit with a "Band" model (peaking at $E_{\rm peak}$ ∼ 1 MeV) and a blackbody component (having temperature T ∼ 100 keV). The advantage over using just a "Band" function is evident when looking at the residuals. Taken from Iyyani et al. (2013).
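A minimal numerical sketch of such a four-parameter "Planck plus power law" photon spectrum is given below. The parametrization (temperature in keV, two normalizations, one power law index, and a 100 keV pivot) and the function name are assumptions made for illustration; they are not taken from Ryde's papers. The Planck photon number spectrum is written as N(E) ∝ E² / (exp(E/kT) − 1).

```python
import numpy as np

def hybrid_photon_spectrum(energy_kev, kt_kev, bb_norm, pl_norm, pl_index, pivot_kev=100.0):
    """Photon number spectrum: Planck function plus a single power law."""
    energy = np.asarray(energy_kev, dtype=float)
    # Thermal part: two parameters (temperature and normalization)
    planck = bb_norm * energy ** 2 / np.expm1(energy / kt_kev)
    # Non-thermal part: two parameters (normalization and index)
    power_law = pl_norm * (energy / pivot_kev) ** pl_index
    return planck + power_law

# Pure thermal case with kT = 30 keV: locate the peak of E^2 N(E)
grid_kev = np.logspace(0, 3.5, 3000)
thermal_only = hybrid_photon_spectrum(grid_kev, 30.0, 1.0, 0.0, -1.5)
thermal_peak_kev = grid_kev[np.argmax(grid_kev ** 2 * thermal_only)]
```

For the pure Planck case the E²N(E) spectrum peaks at about 3.92 kT, so a thermal peak near 100 keV corresponds to a temperature of roughly 25 to 30 keV.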
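The two quality measures mentioned above can be sketched as follows. The reduced χ² is standard; the count of sign runs in the residuals is one simple, assumed way to quantify "wiggly" (non-random) residuals and is not claimed to be the test used in the cited analyses. Function names are illustrative.

```python
import numpy as np

def reduced_chi_square(observed, model, sigma, n_free_params):
    """Chi-square per degree of freedom for a fitted spectrum."""
    residuals = (np.asarray(observed, dtype=float) - np.asarray(model, dtype=float)) / np.asarray(sigma, dtype=float)
    degrees_of_freedom = residuals.size - n_free_params
    return float(np.sum(residuals ** 2) / degrees_of_freedom)

def count_sign_runs(residuals):
    """Number of consecutive same-sign stretches in the residuals.

    Randomly scattered residuals change sign often (many runs); systematic
    wiggles produce long same-sign stretches (few runs).
    """
    signs = np.sign(np.asarray(residuals, dtype=float))
    signs = signs[signs != 0]
    if signs.size == 0:
        return 0
    return 1 + int(np.count_nonzero(signs[1:] != signs[:-1]))
```

When comparing a four-parameter fit with a six-parameter fit, `n_free_params` must be set accordingly, so that the extra freedom of the larger model is penalized in the degrees of freedom.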
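The width measure described above can be computed numerically for any single-peaked spectrum. The sketch below follows the description in the text (the ratio of the two energies at half maximum of the E²N(E) spectrum) and applies it to a pure Planck function; it assumes the spectrum rises monotonically to one peak and falls monotonically after it. The function name is hypothetical.

```python
import numpy as np

def fwhm_energy_ratio(energy, nu_f_nu):
    """Ratio E_high / E_low of the energies at half maximum of a single-peaked spectrum."""
    energy = np.asarray(energy, dtype=float)
    flux = np.asarray(nu_f_nu, dtype=float)
    i_peak = int(np.argmax(flux))
    half_max = 0.5 * flux[i_peak]
    # Rising side: flux increases with energy, as np.interp requires
    e_low = np.interp(half_max, flux[:i_peak + 1], energy[:i_peak + 1])
    # Falling side: reverse the arrays so that flux is increasing
    e_high = np.interp(half_max, flux[i_peak:][::-1], energy[i_peak:][::-1])
    return e_high / e_low

# Pure Planck spectrum in units of x = E / kT: E^2 N(E) is proportional to x^4 / (exp(x) - 1)
x = np.logspace(-2, 2, 4000)
planck_width = fwhm_energy_ratio(x, x ** 4 / np.expm1(x))
```

Any spectrum whose ratio falls between the Planck value and the slow cooling synchrotron value is too wide to be a pure blackbody yet too narrow for synchrotron emission, which is the situation described above for most bursts.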

Time resolved spectral analysis
Ryde's original analysis is based on time-resolved spectra. The lightcurve is cut into time bins (having typical duration ≳ 1 s), and the spectrum in each time bin is analyzed independently. This approach clearly limits the number of bursts that can be analyzed with this method to only the brightest ones, presumably those showing a smooth lightcurve over several to several tens of seconds (namely, mainly the long GRBs). However, its great advantage is that it makes it possible to detect temporal evolution in the properties of the fitted parameters; in particular, in the temperature and flux of the thermal component.
One of the key results of the analysis carried out by Ryde and Pe'er (2009) is the well defined temporal behavior of both the temperature and flux of the thermal component. Both the temperature and flux evolve as a broken power law in time: $T \propto t^{\alpha}$ and $F \propto t^{\beta}$, with α ≃ 0 and β ≃ 0.6 at $t < t_{\rm brk}$ ≈ few s, and α ≃ −0.68 and β ≃ −2 at later times (see Figure 9). This temporal behavior was found among all sources in which thermal emission could be identified. It may therefore provide a strong clue about the nature of the prompt emission, in at least those GRBs for which a thermal component was identified. In my personal view, these findings may hold the key to understanding the origin of the prompt emission, and possibly the nature of the progenitor.
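The broken power law evolution quoted above can be written as a small helper. The indices are those given in the text; the break time of 3 s and the normalizations at the break are placeholder values chosen only for the example, and the function name is hypothetical.

```python
import numpy as np

def broken_power_law(time_s, t_break_s, value_at_break, index_before, index_after):
    """Broken power law in time, continuous at the break time."""
    time = np.asarray(time_s, dtype=float)
    index = np.where(time < t_break_s, index_before, index_after)
    return value_at_break * (time / t_break_s) ** index

times = np.array([0.5, 1.0, 3.0, 10.0, 30.0])
# Temperature: flat before the break, decaying as t^-0.68 afterwards
temperature_kev = broken_power_law(times, 3.0, 100.0, 0.0, -0.68)
# Thermal flux: rising as t^0.6 before the break, decaying as t^-2 afterwards
thermal_flux = broken_power_law(times, 3.0, 1.0, 0.6, -2.0)
```

The flux therefore peaks at the break time, while the temperature stays constant until the break and only then starts to fall.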
Due to Fermi's much greater sensitivity, time resolved spectral analysis is today in broad use. This makes it possible to observe temporal evolution not only of the thermal component, but of other parts of the spectra as well (see, e.g., Figure 4). As an example, a recent analysis of GRB130427A reveals a temporal change in the peak energy during the first 2.5 s of the burst, which could be interpreted as due to synchrotron origin.

Fig. 8 Full width half maximum of the spectral peaks of over 2900 bursts fitted with the "Band" function (plot labels: Long GRBs; Synchrotron; Blackbody). The narrowest spectra are compatible with a "Planck" spectrum. About 80% of the spectra are too narrow to be fitted with the (slow cooling) synchrotron emission model (red line). When fast cooling is added, nearly 100% of the spectra are too narrow to be compatible with this model. As it is physically impossible to narrow the broad-band synchrotron spectra, these results suggest that the spectral peak is due to some widening mechanism of the Planck spectrum, which is therefore pronounced (indirectly) in the vast majority of spectra. Figure taken from Axelsson and Borgonovo (2015).

Distinguished high energy component
Prior to the Fermi era, time resolved spectral analysis was very difficult to conduct due to the relatively low sensitivity of the BATSE detector, and therefore its use was limited to bright GRBs with smooth lightcurves. However, Fermi's superb sensitivity makes it possible to carry out time resolved analysis of many more bursts. One of the findings is the delayed onset of GeV emission with respect to emission at lower energies, which is seen in a substantial fraction of LAT bursts (see, e.g., Figure 3). This delayed onset is further accompanied by a long lived emission (≳ $10^2$ s) and a separate lightcurve [Abdo et al. (2009a,b,c); Kumar and Barniol Duran (2010)]. The GeV emission decays as a power law in time, $L_{\rm GeV} \propto t^{-1.2}$ [Ghirlanda et al. (2010); Ackermann et al. (2013); Nava et al. (2014)]. Furthermore, the GeV emission shows a smooth decay (see Figure 10). This behavior naturally points towards a separate origin of the GeV and lower energy photons; see discussion below.
Thus, one can conclude that at this point in time (Dec. 2014), evidence exists for three separate components in GRB spectra: (I) a thermal component, peaking typically at ∼ 100 keV; (II) a non-thermal component, whose origin is not fully clear, peaking at ≲ MeV and, lacking a clear physical picture, fitted with a "Band" function; and (III) a third component, at very high energies (≳ 100 MeV), showing a separate temporal evolution [Guiriec et al. (2015)].
Not all three components are clearly identified in all GRBs; in fact, separate evolution of the high energy part is observed in only a handful of GRBs. The fraction of GRBs which show clear evidence for the existence of a thermal component is not fully clear; it seems to depend on the brightness, with bright GRBs more likely to show evidence for a thermal component (up to 50% of bright GRBs show clear evidence for a separate thermal component [Guiriec et al. (2015) and Larsson et al., in prep.]). Furthermore, this fraction is sensitive to the analysis method. Thus, final conclusions are still lacking.
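A power law decay index such as the one quoted above is commonly estimated from a straight-line fit in log-log space. The sketch below does this with synthetic, noise-free data generated with the quoted index of −1.2; the normalization and time range are arbitrary placeholders, and the function name is hypothetical.

```python
import numpy as np

def fit_decay_index(time_s, luminosity):
    """Least-squares slope of log10(L) versus log10(t), i.e. s in L proportional to t**s."""
    slope, _intercept = np.polyfit(np.log10(time_s), np.log10(luminosity), 1)
    return float(slope)

# Synthetic lightcurve between 10 s and 1000 s after the trigger
times = np.logspace(1, 3, 20)
synthetic_luminosity = 5.0e52 * times ** -1.2
fitted_index = fit_decay_index(times, synthetic_luminosity)
```

With real data the fit would have to account for measurement uncertainties and for the choice of the reference time, which strongly affects the slope at early times.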
Even more interestingly, it is not at all clear that the "bump" identified as a thermal component is indeed such; such a bump could have other origins as well (see discussion below). Thus, I think it is fair to claim that we are now in a transition phase: on the one hand, it is clear that fitting the data with a pure "Band" model is insufficient, and thus more complicated models, which are capable of capturing more subtle features of the spectra are being used. On the other hand, these models are still not fully physically motivated, and thus a full physical insight of the origin of prompt emission is still lacking.

Polarization
The leading models of the non-thermal emission, namely synchrotron emission and Compton scattering, both produce highly polarized emission [Rybicki and Lightman (1979)]. Nonetheless, due to the spherical assumption, the inability to spatially resolve the sources, and the fact that polarization was initially discovered only during the afterglow phase [Covino et al. (1999); Wijers et al. (1999)], polarization was initially discussed only in the context of GRB afterglow, but not the prompt phase [e.g., Loeb and Perna (1998); Gruzinov and Waxman (1999); Medvedev and Loeb (1999); Granot and Königl (2003)].
The first claim of highly linearly polarized prompt emission in a GRB, Π = (80 ± 20)% in GRB021206 by RHESSI [Coburn and Boggs (2003)], was disputed by a later analysis [Rutledge and Fox (2004)]. A later analysis of BATSE data showed that the prompt emission of GRB930131 and GRB96092 is consistent with having high linear polarization, Π > 35% and Π > 50%, though the exact degree of polarization could not be well constrained [Willis et al. (2005)]. Similarly, Kalemci et al. (2007); McGlynn et al. (2007) and Götz et al. (2009) showed that the prompt spectrum of GRB041219a observed by INTEGRAL is consistent with being highly polarized, but with no statistical significance.
Recently, high linear polarization, Π = (27 ± 11)%, was observed in the prompt phase of GRB 100826a by the GAP instrument on board the IKAROS satellite [Yonetoku et al. (2011)]. As opposed to former measurements, the significance level of this measurement is high, 2.9σ. A high linear polarization degree was further detected in GRB110301a (Π = 70 ± 22%) with 3.7σ confidence, and in GRB110721A ($\Pi = 84^{+16}_{-28}\%$) with 3.3σ confidence [Yonetoku et al. (2012)]. As of today, there is no agreed theoretical interpretation of the observed spectra (see discussion below). However, different theoretical models predict different levels of polarization, which are correlated with the different spectra. Therefore, polarization measurements have a tremendous potential in shedding new light on the different theoretical models, and may hold the key in discriminating between them.

Emission at other wavebands
Clearly, the prompt emission spectrum is not necessarily limited to those wavebands that can be detected by existing satellites. Although broad band spectral coverage is important in providing clues to the origin of the prompt emission and the nature of GRBs, the random occurrence and short duration of GRBs make it extremely difficult to observe the prompt emission without fast, accurate triggering.
As the physical origin of the prompt emission is not fully clear, it is difficult to estimate the flux at wavebands other than those observed. Naively, the flux is estimated by extrapolating the "Band" function to the required energy [e.g., Granot et al. (2010)]. However, as discussed above (and proved in the past), this method is misleading, as (1) the "Band" model is a very crude approximation to a more complicated spectrum; and (2) the values of the "Band" model low and high energy slopes change when new components are added. Thus, it is of no surprise that early estimates were not matched by observations.

High energy counterpart
At high energies, there has been one claim of possible TeV emission associated with GRB970417a [Atkins et al. (2000)]. However, since then, no other confirmed detection of high energy photons associated with any GRB prompt emission has been reported. Despite numerous attempts, only upper limits on the very high en-

Optical counterpart
At lower energies (optical to X-ray), there have been several long GRBs for which a precursor (or a very long prompt emission duration) enabled fast slew of ground based robotic telescopes (and/or Swift XRT and UVOT detectors) to the source during the prompt phase. The first ever detection of optical emission during the prompt phase of a GRB was that of GRB990123 [Akerlof et al. (1999)]. Other examples of optical detection are GRB041219A [Blake et al. (2005)], GRB060124, GRB 061121 [Page et al. (2007)] and the "naked eye" GRB080319B. The results are diverse. In some cases (e.g., GRB990123), the peak of the optical flux lags behind that of the γ-ray flux, while in other GRBs (e.g., GRB080319B), no lag is observed. This is shown in Figure 11. Similarly, while in some bursts, such as GRB080319B or GRB090727, the optical flux is several orders of magnitude higher than that obtained by direct extrapolation of the "Band" function from the x/γ ray band, in other bursts, such as GRB080928, it seems to be fitted well with a broken power law extending over all energies (see Figure 12). To further add to the confusion, some GRBs show complex temporal and spectral behavior, in which the optical flux and lightcurve change their properties (with respect to the x/γ emission) with time. Examples are GRB050820 [Vestrand et al. (2006)] and GRB110205A [Zheng et al. (2012)].
These different properties hint towards different origins of the optical emission. It should be stressed that due to the observational constraints, optical counterparts have been observed to date only in very long GRBs, with typical T_90 of hundreds of seconds (or more). Thus, the optical emission may be viewed as part of the prompt phase, but also as part of the early afterglow; it may result from the reverse shock which takes place during the early afterglow epoch. See further discussion below.

Correlations
There have been several claims in the past for correlations between various observables of the prompt GRB emission. Clearly, such correlations could potentially be extremely useful both in understanding the origin of the emission and in using GRBs as probes, e.g., as "standard candles" similar to type Ia supernovae. However, a word of caution is needed: as already discussed, many of the correlations are based on values of fitted parameters, such as E_pk, which are sensitive to the fitted model chosen (typically, the "Band" function). As more refined models (e.g., the addition of a thermal component) can change the peak energy, the claimed correlations may need to be modified. Since a final conclusion about the best physically motivated model that can describe the prompt emission spectra has not emerged yet, it is too early to know what modification of the claimed correlations may be required. Similarly, some of the correlations are based on the prompt emission duration, which is ill-defined.

The first correlation was found between the peak energy (identified as temperature) and luminosity of single pulses within the prompt emission [Golenetskii et al. (1983)]. They found L ∝ E_peak^α, with α ∼ 1.6. These results were confirmed by Kargatis et al. (1994), though the errors on α were large, α ≃ 1.5 − 1.7.
The Amati relation has been questioned by several authors, claiming that it is an artifact of a selection effect or biases [e.g., (2012)]. However, counter arguments are that even if such selection effects exist, they cannot completely exclude the correlation [Ghirlanda et al. (2005, 2008); Basak and Rao (2013)]. To conclude, it seems that current data (and analysis methods) do support some correlation, though with wide scatter. This scatter still needs to be understood before the correlation could be used as a tool, e.g., for cosmological studies [Virgili et al. (2012); Heussaff et al. (2013)].
There are a few other notable correlations that were found in recent years. One is a correlation between the (redshift corrected) peak energy E_{peak,z} and the isotropic luminosity in γ-rays at the peak flux, L_{γ,peak,iso} [Wei and Gao (2003); Yonetoku et al. (2004)]: E_{peak,z} ∝ L_{γ,peak,iso}^{0.52}. A second correlation is between E_{peak,z} and the geometrically-corrected gamma-ray energy, E_γ ≃ (θ_j^2/2) E_{γ,iso}, where θ_j is the jet opening angle (inferred from afterglow observations): E_{peak,z} ∝ E_γ^{0.7} [Ghirlanda et al. (2004)]. It was argued that this relation is tighter than the Amati relation; however, it relies on the correct interpretation of breaks in the afterglow lightcurve as jet breaks, which can be problematic [Liang et al. (2008); Kocevski and Butler (2008); Racusin et al. (2009); Ryan et al. (2015)].
Several other proposed correlations exist; I refer the reader to Kumar and Zhang (2014), for a full list.

Theoretical framework
Perhaps the easiest way to understand the nature of GRBs is to follow the different episodes of energy conversion. Although the details of the energy transfer are still highly debatable, there is a wide agreement, based on firm observational evidence, that there are several key episodes of energy conversion in GRBs. (1) Initially, a large amount of energy, ∼ 10^53 erg or more, is released in a very short time, in a compact region. The source of this energy must be gravitational. (2) A substantial part of this energy is converted into kinetic energy, in the form of relativistic outflow. This is the stage in which GRB jets are formed and accelerated to relativistic velocities. The exact nature of this acceleration process, and in particular the role played by magnetic fields in it, is still not fully clear. (3) (Part of) this kinetic energy is dissipated, and is used in producing the gamma rays that we observe in the prompt emission. Note that part of the observed prompt emission (the thermal part) may originate directly from photons emitted during the initial explosion; the energy carried by these photons is therefore not initially converted to kinetic form.
(4) The remainder of the kinetic energy (still in the form of a relativistic jet) runs into the interstellar medium (ISM) and heats it, producing the observed afterglow. The kinetic energy is thus gradually converted into heat, and the afterglow gradually fades away. A cartoon showing these basic ingredients in the context of the "fireball" model is shown in Figure 13, adapted from Meszaros and Rees (2014).

[Figure 13, caption partially recovered: (2) Part of this energy is used in producing the relativistic jet. This could be mediated by hot photons ("fireball"), or by magnetic field. (3) The thermal photons decouple at the photosphere. (4) Part of the jet kinetic energy is dissipated (by internal collisions, in this picture) to produce the observed γ rays. (5) The remaining kinetic energy is deposited into the surrounding medium, heating it and producing the observed afterglow. Cartoon is taken from Meszaros and Rees (2014).]

Progenitors
The key properties that are required from GRB progenitors are: (1) the ability to release a huge amount of energy, ∼ 10^52 − 10^53 erg (possibly even larger), within the observed GRB duration of a few seconds; (2) the ability to explain the fast time variability observed, δt ≳ 10^{-3} s, implying (via a light crossing time argument) that the energy source must be compact: R ∼ cδt ∼ 300 km, namely of stellar size.
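As a quick numerical check of the light-crossing argument, the following minimal Python sketch evaluates R ∼ cδt for the millisecond variability quoted above. The function name is illustrative only, and the estimate ignores relativistic and cosmological corrections.

```python
# Light-crossing-time estimate of the source size (cgs units).
C_CM_S = 2.998e10  # speed of light [cm/s]


def max_source_size_cm(delta_t_s: float) -> float:
    """Size scale of the emitting region, R ~ c * delta_t."""
    return C_CM_S * delta_t_s


size_km = max_source_size_cm(1e-3) / 1e5  # convert cm -> km
print(f"R ~ {size_km:.0f} km")
```

For δt = 1 ms this gives a size of a few hundred km, consistent with a stellar-size object.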
While 20 years ago, over a hundred different models were proposed in explaining possible GRB progenitors [see Nemiroff (1994)], natural selection (namely, confrontation with observations over the years) led to the survival of two main scenarios. The first is a merger of two neutron stars (NS-NS), or a black hole and a neutron star (BH-NS). The occurrence rate, as well as the expected energy released, ∼ GM^2/R ∼ 10^53 erg (using M ∼ M_⊙ and R ≳ R_sch, the Schwarzschild radius of a stellar-size black hole), are sufficient for extra-galactic GRBs [Eichler et al. (1989); Paczynski (1990); Narayan et al. (1991); Narayan et al. (1992)]. The alternative scenario is the core collapse of a massive star, accompanied by accretion into a black hole [Woosley (1993); Paczyński (1998a,b); Fryer et al. (1999); MacFadyen and Popham et al. (1999); Woosley and Bloom (2006) and references therein]. In this scenario, a similar amount of energy, up to ∼ 10^54 erg, may be released by tapping the rotational energy of a Kerr black hole formed in the core collapse, and/or the inner layers of the accretion disk.
The observational association of long GRBs with type Ib/c supernovae discussed above, as well as the time scale of the collapse event, ≲ 1 minute, which is similar to that observed in long GRBs, make the core collapse, or "collapsar", model the leading model for explaining long GRBs. The merger scenario, on the other hand, is currently the leading model in explaining short GRBs [see, e.g., discussions in Nakar (2007)].

Relativistic expansion and kinetic energy dissipation: the "fireball" model
A GRB event is associated with a catastrophic energy release of a stellar size object. The huge amount of energy, ∼ 10^52 − 10^53 erg, released in such a short time and compact volume, results in a copious production of neutrinos and antineutrinos (initially in thermal equilibrium) and a possible release of gravitational waves. These two, by far the dominant energy forms, have not yet been detected. A smaller fraction of the energy (of the order 10^{-3} − 10^{-2} of the total energy released) goes into high temperature (≳ MeV) plasma, containing photons, e± pairs, and baryons, known as a "fireball" [Cavallo and Rees (1978)]. The fireball may contain a comparable, or even larger, amount of magnetic energy, in which case it is Poynting flux dominated [Usov (1994); Thompson (1994); Katz (1997); Mészáros and Rees (1997); Lyutikov and Blandford (2003); Zhang and Yan (2011)].
The scaling laws that govern the expansion of the fireball depend on its magnetization. Thus, one must discriminate between photon-dominated (or magnetically-poor) outflow and magnetic dominated outflow. I discuss in this section the photon-dominated outflow ("hot fireball"). Magnetic dominated outflow ("cold fireball") will be discussed in the next section (section 3.3).

Photon dominated outflow
Let us consider first photon-dominated outflow. In this model, it is assumed that a large fraction of the energy released during the collapse / merger is converted directly into photons close to the jet core, at radius r_0 (which should be ≳ the Schwarzschild radius of the newly formed black hole). The photon temperature is

$$ T_0 = \left(\frac{L}{4\pi r_0^2 c a}\right)^{1/4} = 1.2\, L_{52}^{1/4}\, r_{0,7}^{-1/2}~\mathrm{MeV}, \tag{2} $$

where a is the radiation constant, L is the luminosity and the notation Q = 10^x Q_x in cgs units is used here and below. This temperature is above the threshold for pair production, implying that a large number of e± pairs are created via photon-photon interactions (and justifying the assumption of full thermalization). The observed luminosity is many orders of magnitude above the Eddington luminosity, L_E = 4πGM m_p c/σ_T = 1.25 × 10^38 (M/M_⊙) erg s^{-1}, implying that radiation pressure is much larger than self gravity, and the fireball must expand. The dynamics of the expected relativistic fireball were first investigated by Goodman (1986), Paczynski (1986) and Shemi and Piran (1990). The ultimate velocity it will reach depends on the amount of baryons (baryon load) within the fireball [Paczynski (1990)], which is uncertain. The baryon load can be deduced from observations: as the final expansion kinetic energy cannot exceed the explosion energy, the highest Lorentz factor that can be reached is Γ_max = E/Mc^2. Thus, the fact that GRBs are known to have high bulk Lorentz factors, Γ ≳ 10^2 at later stages (during the prompt and afterglow emission) [Krolik and Pier (1991); Fenimore et al. (1993); Woods and Loeb (1995); Baring and Harding (1997); Lithwick and Sari (2001)
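The order of magnitude of the initial temperature and of the super-Eddington ratio can be checked with a short sketch. It assumes that the temperature follows from the blackbody relation L = 4π r_0^2 c a T_0^4, and it uses the fiducial values L = 10^52 erg/s, r_0 = 10^7 cm and M = M_⊙, which are illustrative assumptions rather than values fixed by the text. The function name is hypothetical.

```python
import math

A_RAD = 7.5657e-15     # radiation constant [erg cm^-3 K^-4]
C = 2.998e10           # speed of light [cm/s]
K_B_MEV = 8.617e-11    # Boltzmann constant [MeV/K]
M_E_C2_MEV = 0.511     # electron rest energy [MeV]
L_EDD_SOLAR = 1.25e38  # Eddington luminosity for 1 solar mass [erg/s], as quoted in the text


def fireball_temperature_mev(lum: float, r0: float) -> float:
    """Initial temperature from L = 4 pi r0^2 c a T0^4 (assumed blackbody at r0)."""
    t_kelvin = (lum / (4.0 * math.pi * r0**2 * C * A_RAD)) ** 0.25
    return K_B_MEV * t_kelvin


t0 = fireball_temperature_mev(1e52, 1e7)  # fiducial L and r0 (assumptions)
edd_ratio = 1e52 / L_EDD_SOLAR            # luminosity in Eddington units, M = 1 Msun
print(f"T0 ~ {t0:.2f} MeV, L / L_E ~ {edd_ratio:.1e}")
```

With these inputs the temperature comes out slightly above 1 MeV, i.e. above the electron rest energy, and the luminosity exceeds the Eddington value by more than thirteen orders of magnitude.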

Scaling laws for relativistic expansion: instantaneous energy release
The scaling laws for the fireball evolution follow from conservation of energy and entropy. Let us assume first that the energy release is "instantaneous", namely within a shell of size δr ∼ r_0. Thus, the total energy contained within the shell (as seen by an observer outside the expanding shell) is

$$ E^{ob} \propto \Gamma(r)\, V'\, T'(r)^4 \propto \Gamma(r)\, r^2\, \Gamma(r)\, r_0\, T'(r)^4 = \mathrm{const}. \tag{3} $$

Here, T′(r) is the shell's comoving temperature, and V′ = 4πr^2 δr′ is its comoving volume. Note that the first factor of Γ(r) is needed in converting the comoving energy to the observed energy, and the second originates from transformation of the shell's width: the shell's comoving width (as measured by a comoving observer within it) is related to its width as measured in the lab frame (r_0) by δr′ = Γ(r) r_0. Starting from the fundamental thermodynamic relation, dS = (dU + pdV)/T, one can write the entropy of a fluid component with zero chemical potential (such as a photon fluid) in its comoving frame, S′ = V′(u′ + p′)/T′. Here, u′, p′ are the internal energy density and pressure measured in the comoving frame. For photons, p′ = u′/3 ∝ T′^4. Since initially, both the rest mass and energy of the baryons are negligible, the entropy is provided by the photons. Thus, conservation of entropy implies

$$ S' \propto r^2\, \Gamma(r)\, r_0\, T'(r)^3 = \mathrm{const}. \tag{4} $$

Dividing Equations 3 and 4, one obtains Γ(r)T′(r) = const, from which (using again these equations) one can write the scaling laws of the fireball evolution,

$$ T'(r) \propto r^{-1}; \quad \Gamma(r) \propto r; \quad V'(r) \propto r^3. \tag{5} $$

As the shell accelerates, the baryon kinetic energy Γ(r)Mc^2 increases, until it becomes comparable to the total fireball energy (the energy released in the explosion) at Γ = Γ_max ≃ η, at radius r_s ∼ ηr_0 (assuming that the outflow is still optically thick at r_s, and so the acceleration can continue until this radius). Here, η ≡ E/Mc^2 is the specific entropy per baryon. Note that during the acceleration phase, the shell's kinetic energy increase comes at the expense of the (comoving) internal energy, as is reflected by the fact that the comoving temperature drops.
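Beyond the saturation radius r_s, most of the available energy is in kinetic form, and so the flow can no longer accelerate, and it coasts. The spatial evolution of the Lorentz factor is thus

$$ \Gamma(r) = \begin{cases} r/r_0 & r \lesssim r_s \\ \eta & r \gtrsim r_s \end{cases} \tag{6} $$

Equation 4, which describes conservation of (comoving) entropy, holds in this regime as well; therefore, in the regime r > r_s one obtains r^2 T′(r)^3 = const, or

$$ T'(r) \propto r^{-2/3}; \quad V'(r) \propto r^2. \tag{7} $$

The observed temperature therefore evolves with radius as

$$ T^{ob}(r) = \Gamma(r)\, T'(r) = \begin{cases} T_0 & r < r_s \\ T_0 \times (r/r_s)^{-2/3} & r > r_s \end{cases} \tag{8} $$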
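A minimal sketch of the radial evolution described above, assuming the standard hot-fireball scalings: Γ ∝ r and T′ ∝ 1/r below the saturation radius r_s = ηr_0, and Γ = η with T′ ∝ r^{-2/3} above it. The function names and the fiducial numbers (r_0 = 10^7 cm, η = 300, T_0 = 1.2 MeV) are hypothetical choices for illustration.

```python
def lorentz_factor(r: float, r0: float, eta: float) -> float:
    """Bulk Lorentz factor: linear acceleration up to r_s = eta * r0, then coasting."""
    return r / r0 if r < eta * r0 else eta


def comoving_temperature(r: float, r0: float, eta: float, t0: float) -> float:
    """Comoving temperature: T' ~ 1/r below r_s, T' ~ r^(-2/3) above r_s."""
    r_s = eta * r0
    if r < r_s:
        return t0 * r0 / r
    return (t0 / eta) * (r / r_s) ** (-2.0 / 3.0)


def observed_temperature(r: float, r0: float, eta: float, t0: float) -> float:
    """Observed temperature, Gamma(r) * T'(r)."""
    return lorentz_factor(r, r0, eta) * comoving_temperature(r, r0, eta, t0)


r0, eta, t0 = 1e7, 300.0, 1.2  # cm, dimensionless, MeV (illustrative values)
r_s = eta * r0
for r in (1e8, r_s, 8.0 * r_s):
    print(f"r = {r:.1e} cm: Gamma = {lorentz_factor(r, r0, eta):.0f}, "
          f"T_ob = {observed_temperature(r, r0, eta, t0):.2f} MeV")
```

Below r_s the observed temperature stays at T_0 because the drop in comoving temperature is exactly compensated by the growing Doppler boost; above r_s only adiabatic cooling remains.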

Continuous energy release
Let us assume next that the energy is released over a longer duration, t ≫ r_0/c (as is the case in long GRBs). In this scenario, the progenitor continuously emits energy at a rate L (erg/s), and this emission is accompanied by mass ejected at a rate Ṁ = L/ηc^2. The analysis carried above is valid for each fluid element separately, provided that E is replaced by L and M by Ṁ, and thus the scaling laws derived above for the evolution of the (average) Lorentz factor and temperature as a function of radius hold. However, there are a few additions to this scenario. We first note the following [Waxman (2003)]. The comoving number density of baryons follows from mass conservation:

$$ n'_p(r) = \frac{\dot{M}}{4\pi r^2 m_p c\, \Gamma(r)} \tag{9} $$

(assuming spherical explosion). Below r_s, the (comoving) energy density of each fluid element is relativistic, aT′(r)^4/(n′_p m_p c^2) = η(r_0/r). Thus, the speed of sound in the comoving frame is c_s ≃ c/√3 ∼ c. The time it takes a fluid element to expand to radius r, r/c in the observer frame, corresponds to time t′ ∼ r/Γc in the comoving frame; during this time, sound waves propagate a distance t′c_s ∼ rc/Γc = r/Γ (in the comoving frame), which is equal to r/Γ^2 = r_0^2/r in the observer frame. This implies that at the early stages of the expansion, where r ≳ r_0, sound waves have enough time to smooth spatial fluctuations on scale ∼ r_0. On the other hand, regions separated by ∆r > r_0 cannot interact with each other. As a result, fluctuations in the energy emission rate would result in the ejection and propagation of a collection of independent sub-shells, each having a typical thickness r_0.
Each fluid element may have a slightly different density, and thus a slightly different terminal Lorentz factor; the standard assumption is δΓ ∼ η. This implies a velocity spread δv = v_1 − v_2 ≈ c/2η^2, where η is the characteristic value of the terminal Lorentz factor. If two such fluid elements originate within a shell (of initial thickness r_0), spreading between these fluid elements will occur after a typical time t_spread = r_0/δv, and at radius (in the observer's frame)

$$ r_{spread} = v\, t_{spread} \simeq c\, r_0 \left(\frac{c}{2\eta^2}\right)^{-1} \simeq 2\eta^2 r_0. \tag{10} $$

According to the discussion above, this is also the typical radius where two separate shells will begin to interact (sometimes referred to as the "collision radius", r_coll). The spreading radius is a factor ∼ η larger than the saturation radius. Thus, no internal collisions are expected during the acceleration phase, namely at r < r_s. Below the spreading radius, the thickness of an individual shell (in the observer's frame), δr, is approximately constant and equal to r_0. At larger radii, r > r_spread, it becomes δr = rδv/c ∼ r/η^2.
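Since the comoving radial width of each shell is δr′ = Γδr, it can be written as

$$ \delta r' \sim \begin{cases} r & r < r_s \\ \eta r_0 & r_s < r < r_{spread} \\ r/\eta & r > r_{spread} \end{cases} \tag{11} $$

The comoving volume of each sub-shell, V′ ∝ r^2 δr′, is thus

$$ V' \propto \begin{cases} r^3 & r < r_s \\ \eta r_0 r^2 & r_s < r < r_{spread} \\ r^3/\eta & r > r_{spread} \end{cases} \tag{12} $$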
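The characteristic radii and shell widths discussed above can be evaluated with the following sketch. It assumes r_s = ηr_0, δv = c/2η^2 and hence r_spread = 2η^2 r_0, and it keeps the factor of 2 in the lab-frame width above r_spread (δr = rδv/c = r/2η^2) so that the width is continuous there; the text quotes only the order of magnitude. The function names and fiducial values are hypothetical.

```python
def characteristic_radii(r0: float, eta: float) -> tuple[float, float]:
    """Return (saturation radius, spreading radius) in the same units as r0."""
    r_s = eta * r0
    r_spread = 2.0 * eta**2 * r0
    return r_s, r_spread


def lab_frame_width(r: float, r0: float, eta: float) -> float:
    """Shell width in the observer frame: r0 below r_spread, r * dv / c above it."""
    _, r_spread = characteristic_radii(r0, eta)
    return r0 if r < r_spread else r / (2.0 * eta**2)


def comoving_width(r: float, r0: float, eta: float) -> float:
    """Comoving shell width, Gamma(r) times the lab-frame width."""
    r_s, _ = characteristic_radii(r0, eta)
    gamma = r / r0 if r < r_s else eta
    return gamma * lab_frame_width(r, r0, eta)


r0, eta = 1e7, 300.0  # cm, dimensionless (illustrative values)
r_s, r_spread = characteristic_radii(r0, eta)
print(f"r_s = {r_s:.1e} cm, r_spread = {r_spread:.1e} cm")
for r in (1e8, 1e10, 1e14):
    print(f"r = {r:.0e} cm: comoving width = {comoving_width(r, r0, eta):.2e} cm")
```

For these inputs the saturation and spreading radii are separated by a factor 2η, so collisions between shells are only expected well after the acceleration has ended.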

Internal collisions as possible mechanism of kinetic energy dissipation
At radii r > r_spread = r_coll, spreading within a single shell, as well as interaction between two consecutive shells, becomes possible. The idea of shell collision was suggested early on [Paczynski and Xu (1994); Rees and Meszaros (1994); Sari and Piran (1997b); Kobayashi et al. (1997); Daigne and Mochkovitch (1998)], as a way to dissipate the jet kinetic energy, and convert it into the observed radiation.
The key advantages of the internal collision model are: (1) its simplicity: it is a very straightforward idea that naturally arises from the discussion above; (2) it is capable of explaining the rapid variability observed; and (3) the internal collisions are accompanied by (internal) shock waves. It is believed that these shock waves are capable of accelerating particles to high energies, via the Fermi mechanism. The energetic particles, in turn, can emit the high-energy, non-thermal photons observed, e.g., via synchrotron emission. Thus, the internal collisions are believed to be an essential part of the energy conversion chain that results in the production of γ-rays.
On the other hand, the main drawbacks of the model are (1) the very low efficiency of energy conversion; (2) by itself, the model does not explain the observed spectra, but only suggests a way in which the kinetic energy can be dissipated. In order to explain the observed spectra, one needs to add further assumptions about how the dissipated energy is used in producing the photons (e.g., assumptions about particle acceleration, etc.). Furthermore, as will be discussed in section 3.5 below, it is impossible to explain the observed spectra within the framework of this model using standard radiative processes (such as synchrotron emission or Compton scattering), without invoking additional assumptions external to it. (3) Another major drawback of this model is its lack of predictive power: while it does suggest a way of dissipating the kinetic energy, it does not provide many details, such as the time at which dissipation is expected, or the amount of energy that should be dissipated in each collision (only rough limits).
The basic assumption is that at radius r_coll = r_spread two shells collide. This collision dissipates part of the kinetic energy, and converts it into photons. The time delay of the produced photons (with respect to a hypothetical photon emitted at the center of expansion and traveling directly towards the observer) is

$$ \delta t^{ob} \simeq \frac{r_{coll}}{2\Gamma^2 c} \sim \frac{r_0}{c}, \tag{13} $$

namely, it is of the same order as the central engine variability time. Thus, this model is capable of explaining the rapid (≳ 1 ms) variability observed. On the other hand, this mechanism suffers a severe efficiency problem, as only the differential kinetic energy between two shells can be dissipated. Consider, e.g., two shells of masses m_1 and m_2, and initial Lorentz factors Γ_1 and Γ_2, undergoing a plastic collision. Conservation of energy and momentum implies that the final Lorentz factor of the combined shell is [Kobayashi et al. (1997)]

$$ \Gamma_f = \left(\frac{m_1\Gamma_1 + m_2\Gamma_2}{m_1/\Gamma_1 + m_2/\Gamma_2}\right)^{1/2} \tag{14} $$

(assuming that both Γ_1, Γ_2 ≫ 1).
The efficiency of kinetic energy dissipation is

$$ \eta_{diss} = 1 - \frac{(m_1+m_2)\Gamma_f}{m_1\Gamma_1 + m_2\Gamma_2} = 1 - \frac{m_1+m_2}{\sqrt{m_1^2 + m_2^2 + m_1 m_2\left(\frac{\Gamma_1}{\Gamma_2} + \frac{\Gamma_2}{\Gamma_1}\right)}}. \tag{15} $$

Thus, in order to achieve high dissipation efficiency, one ideally requires similar masses, m_1 ≃ m_2, and high contrast in Lorentz factors, (Γ_1/Γ_2) ≫ 1. Such high contrast is difficult to explain within the context of either the "collapsar" or the "merger" progenitor scenarios. Even under these ideal conditions, the combined shell's Lorentz factor, Γ_f, will be high; therefore the contrast between the Lorentz factors of a newly coming shell and the merged shell in the next collision will not be as high. As a numerical example, if the initial contrast is (Γ_1/Γ_2) = 10, for m_1 = m_2 one can obtain high efficiency of ≳ 40%; however, the efficiency of the next collision will drop to ∼ 11%. When considering an ensemble of colliding shells under various assumptions about the ejection properties of the different shells, typical values of the global efficiency are of the order of 1% − 10% [Mochkovitch et al. (1995); Kobayashi et al. (1997); Panaitescu et al. (1999); Kumar (1999); Spada et al. (2000); Guetta et al. (2001); Maxham and Zhang (2009)]. These values are in contrast to observational evidence of a much higher efficiency of kinetic energy conversion during the prompt emission, of the order of tens of percent (∼ 50%), which is inferred by estimating the kinetic energy using afterglow measurements [Lloyd-Ronning and Zhang (2004)]. While higher efficiency of energy conversion in internal shocks was suggested by a few authors [Beloborodov (2000); Kobayashi and Sari (2001)], we point out that these works assumed very large contrast in Lorentz factors, (Γ_1/Γ_2) ≫ 10 for almost all collisions; as discussed above, such a scenario is unlikely to be realistic within the framework of the known progenitor models.
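The numerical example above can be reproduced with a short sketch. It assumes the standard plastic-collision result for Γ_1, Γ_2 ≫ 1, Γ_f = [(m_1Γ_1 + m_2Γ_2)/(m_1/Γ_1 + m_2/Γ_2)]^{1/2}, and defines the efficiency as the fraction of the initial energy not retained as bulk kinetic energy of the merged shell. For the second collision, the sketch further assumes (hypothetically) that the dissipated energy is radiated away, so that the merged shell has rest mass m_1 + m_2, and that it is hit by a new shell identical to the first fast one; the exact value obtained depends on these choices.

```python
import math


def merged_lorentz_factor(m1: float, g1: float, m2: float, g2: float) -> float:
    """Lorentz factor of the merged shell after a plastic collision (g1, g2 >> 1)."""
    return math.sqrt((m1 * g1 + m2 * g2) / (m1 / g1 + m2 / g2))


def dissipation_efficiency(m1: float, g1: float, m2: float, g2: float) -> float:
    """Fraction of the initial energy that is dissipated in the collision."""
    g_f = merged_lorentz_factor(m1, g1, m2, g2)
    return 1.0 - (m1 + m2) * g_f / (m1 * g1 + m2 * g2)


# First collision: equal masses, Lorentz factor contrast of 10.
first = dissipation_efficiency(1.0, 1000.0, 1.0, 100.0)

# Second collision (assumption): a new fast shell hits the merged shell of mass 2.
g_merged = merged_lorentz_factor(1.0, 1000.0, 1.0, 100.0)
second = dissipation_efficiency(1.0, 1000.0, 2.0, g_merged)

print(f"first collision:  {first:.3f}")
print(f"second collision: {second:.3f}")
```

The first collision dissipates a little over 40% of the energy, while the second one, with its reduced contrast, dissipates only of the order of 10%.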
I further stress that the efficiency discussed in this section refers only to the efficiency in dissipating the kinetic energy. There are a few more episodes of energy conversion that are required before the dissipated energy is radiated as the observed γ-rays. These include (i) using the dissipated energy to accelerate the radiating particles (likely electrons) to high energies; (ii) converting the radiating particles' energy into photons; and (iii) finally, the detectors are sensitive only over a limited energy band, and thus part of the radiated photons cannot be detected. Thus, overall, the measured efficiency, namely the energy of the observed γ-ray photons relative to the kinetic energy, is expected to be very low in this model, inconsistent with observations.

An alternative idea for kinetic energy dissipation arises from the possibility that the jet composition may contain a large number of free neutrons. These neutrons, which are produced by dissociation of nuclei by γ-ray photons in the inner regions, decouple from the protons below the photosphere (see below) due to the lower cross section for proton-neutron collisions relative to the Thomson cross section [Derishev et al. (1999); Bahcall and Mészáros (2000); Mészáros and Rees (2000a); Rossi et al. (2006)]. This leads to friction between protons and neutrons as they have different velocities, which, in turn, results in production of e+ that follow the decay of pions (which are themselves produced by p − n interactions). These positrons inverse-Compton (IC) scatter the thermal photons, producing γ-ray radiation peaking at ∼ MeV [Beloborodov (2010)]. A similar result is obtained when non-zero magnetic fields are added, in which case the contribution of synchrotron emission becomes comparable to that of scattering the thermal photons [Vurm et al. (2011)].

Optical depth and photosphere
During the initial stages of energy release, a high temperature, ≳ MeV (see Equation 2), "fireball" is formed. At such high temperature, a large number of e± pairs are produced [Paczynski (1986); Goodman (1986); Shemi and Piran (1990)]. The photons are scattered by these pairs, and cannot escape. However, once the temperature drops to T′ ≲ 17 keV, the pairs recombine, and thereafter only a residual number of pairs is left in the plasma [Paczynski (1986)]. Provided that η ≲ 10^5, the density of residual pairs is much less than the density of "baryonic" electrons associated with the protons, n_e = n_p. (A large number of pairs may be produced later on, when kinetic energy is dissipated, e.g., by shell collisions). This recombination typically happens at r < r_s.
Equation 9 thus provides a good approximation to the number density of both protons and electrons in the plasma. Using this equation, one can calculate the optical depth by integrating the mean free path of photons emitted at radius r.
A 1-d calculation (namely, photons emitted on the line of sight) gives [Paczynski (1990); Abramowicz et al. (1991)]

$$ \tau = \int_r^\infty n'_e \sigma_T \Gamma (1-\beta)\, d\tilde{r} \simeq \frac{n'_e \sigma_T r}{2\Gamma}, \tag{16} $$

where β is the flow velocity, and σ_T is Thomson's cross section; the use of this cross section is justified since in the comoving frame, the photon's temperature is T′ = T^{ob}/Γ ≪ m_e c^2. The photospheric radius can be defined as the radius from which τ(r_ph) = 1,

$$ r_{ph} = \frac{\dot{M} \sigma_T}{8\pi m_p c\, \Gamma^2} = \frac{L \sigma_T}{8\pi m_p c^3\, \Gamma^2 \eta} \simeq 2 \times 10^{11}\, L_{52}\, \eta_{2.5}^{-3}~\mathrm{cm}. \tag{17} $$

In this calculation, I assumed constant Lorentz factor Γ = η, which is justified for r_ph > r_s. In the case of fluctuative flow resulting in shells, η represents an average value of the shell's Lorentz factor. Further note that an upper limit on η within the framework of this model is given by the requirement r_ph > r_s → η < (Lσ_T/8πm_p c^3 r_0)^{1/4} ≃ 10^3 L_52^{1/4} r_{0,7}^{-1/4}. This is because as the photons decouple from the plasma at the photosphere, for larger values of η the acceleration cannot continue above r_ph [Mészáros and Rees (2000b)]. In this scenario, the observed spectrum is expected to be (quasi)-thermal, in contrast to the observations.
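The photospheric radius and the upper limit on η can be estimated numerically with the sketch below. It assumes the on-axis optical depth τ ≃ n′_e σ_T r/2Γ with n′_e = Ṁ/(4πr^2 m_p cΓ), Ṁ = L/ηc^2 and a coasting flow with Γ = η, so that r_ph = Lσ_T/(8πm_p c^3 η^3). The function names and the fiducial L = 10^52 erg/s, η = 10^2.5, r_0 = 10^7 cm are illustrative assumptions.

```python
import math

SIGMA_T = 6.652e-25  # Thomson cross section [cm^2]
M_P = 1.6726e-24     # proton mass [g]
C = 2.998e10         # speed of light [cm/s]


def photospheric_radius(lum: float, eta: float) -> float:
    """On-axis photospheric radius [cm] for a coasting flow with Gamma = eta."""
    return lum * SIGMA_T / (8.0 * math.pi * M_P * C**3 * eta**3)


def max_eta(lum: float, r0: float) -> float:
    """Largest eta for which the photosphere lies above r_s = eta * r0."""
    return (lum * SIGMA_T / (8.0 * math.pi * M_P * C**3 * r0)) ** 0.25


r_ph = photospheric_radius(1e52, 10**2.5)
eta_max = max_eta(1e52, 1e7)
print(f"r_ph ~ {r_ph:.1e} cm, eta_max ~ {eta_max:.0f}")
```

For η = η_max the photospheric radius coincides with the saturation radius ηr_0, which is the condition used to derive the limit.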
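The observed temperature at the photosphere is calculated using Equations 2, 8 and 17,

$$ T^{ob} = T_0 \left(\frac{r_{ph}}{r_s}\right)^{-2/3} = 80\, L_{52}^{-5/12}\, \eta_{2.5}^{8/3}\, r_{0,7}^{1/6}~\mathrm{keV}. \tag{18} $$

Similarly, the observed thermal luminosity scales as L^{ob}_{th} ∝ r^2 Γ^2 T′^4 ∝ r^0 at r < r_s and L^{ob}_{th} ∝ r^{-2/3} at r > r_s [Mészáros and Rees (2000b)]. Thus,

$$ \frac{L_{th}}{L} = \left(\frac{r_{ph}}{r_s}\right)^{-2/3} = 6.6 \times 10^{-2}\, L_{52}^{-2/3}\, \eta_{2.5}^{8/3}\, r_{0,7}^{2/3}. \tag{19} $$

Note the very strong dependence of the observed temperature and luminosity on η.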
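The strong dependence on η can be illustrated with a short sketch. It assumes that both the observed temperature and the thermal fraction are reduced by the adiabatic factor (r_ph/r_s)^{-2/3} relative to their values at the base of the flow, with r_ph/r_s = Lσ_T/(8πm_p c^3 r_0 η^4) and an initial temperature obtained from L = 4πr_0^2 c a T_0^4. Function names and fiducial parameters are hypothetical.

```python
import math

SIGMA_T = 6.652e-25  # Thomson cross section [cm^2]
M_P = 1.6726e-24     # proton mass [g]
C = 2.998e10         # speed of light [cm/s]
A_RAD = 7.5657e-15   # radiation constant [erg cm^-3 K^-4]
K_B_KEV = 8.617e-8   # Boltzmann constant [keV/K]


def thermal_fraction(lum: float, eta: float, r0: float) -> float:
    """L_th / L = (r_ph / r_s)^(-2/3), valid when r_ph > r_s."""
    ratio = lum * SIGMA_T / (8.0 * math.pi * M_P * C**3 * r0 * eta**4)
    return ratio ** (-2.0 / 3.0) if ratio > 1.0 else 1.0


def observed_photospheric_temperature_kev(lum: float, eta: float, r0: float) -> float:
    """Observed temperature at the photosphere, T0 * (r_ph / r_s)^(-2/3)."""
    t0_kev = K_B_KEV * (lum / (4.0 * math.pi * r0**2 * C * A_RAD)) ** 0.25
    return t0_kev * thermal_fraction(lum, eta, r0)


lum, eta, r0 = 1e52, 10**2.5, 1e7  # illustrative values
frac = thermal_fraction(lum, eta, r0)
t_ob = observed_photospheric_temperature_kev(lum, eta, r0)
boost = thermal_fraction(lum, 2.0 * eta, r0) / frac  # effect of doubling eta
print(f"L_th/L ~ {frac:.3f}, T_ob ~ {t_ob:.0f} keV, doubling eta boosts both by {boost:.1f}")
```

Doubling η raises both quantities by a factor 2^{8/3}, i.e. by more than six, as long as the photosphere remains above the saturation radius.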
The results of Equation 19 show that the energy released as thermal photons may be a few % of the explosion energy. This value is of the same order as the efficiency of the dissipation of kinetic energy via internal shocks. However, as discussed above, only a fraction of the kinetic energy dissipated via internal shocks is eventually observed as photons, while no additional episodes of energy conversion (and losses) are added to the result in Equation 19. Furthermore, the result in Equation 19 is very sensitive to the uncertain value of η, via the ratio (r_ph/r_s): for high η, r_ph is close to r_s, reducing the adiabatic losses and increasing the fraction of thermal luminosity. In such a scenario, the internal shocks, if occurring, are likely to take place at r_coll ∼ ηr_s > r_ph, namely in the optically thin region. I will discuss the consequences of this result in section 3.5.3 below.
The calculation of the photospheric radius in Equation 17 was generalized by Pe'er (2008) to include photons emitted off-axis; in this case, the term "photospheric radius" should be replaced with "photospheric surface", which is the surface of last scattering of photons before they decouple from the plasma. Somewhat counter intuitively, for a relativistic (Γ ≫ 1) spherical explosion this surface assumes a parabolic shape, given by [Pe'er (2008)]

$$ r_{ph}(\theta; \Gamma) \simeq \frac{R_d}{2\pi}\left(\frac{1}{\Gamma^2} + \frac{\theta^2}{3}\right), \tag{20} $$

where R_d ≡ Ṁσ_T/(4m_p βc) depends on the mass ejection rate and velocity.
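To visualize how quickly the photospheric surface grows away from the line of sight, the sketch below evaluates the parabolic form r_ph(θ) ≃ (R_d/2π)(1/Γ^2 + θ^2/3), which is assumed here as the small-angle, Γ ≫ 1 limit. The function name and the normalization R_d = 1 are hypothetical; only ratios of radii are meaningful in this example.

```python
import math


def photospheric_surface(theta: float, gamma: float, r_d: float) -> float:
    """Photospheric radius at angle theta to the line of sight (small theta, gamma >> 1)."""
    return r_d / (2.0 * math.pi) * (1.0 / gamma**2 + theta**2 / 3.0)


gamma, r_d = 300.0, 1.0  # illustrative Lorentz factor; R_d in arbitrary units
on_axis = photospheric_surface(0.0, gamma, r_d)
for angle_in_units_of_inverse_gamma in (0.0, 1.0, 5.0, 10.0):
    theta = angle_in_units_of_inverse_gamma / gamma
    ratio = photospheric_surface(theta, gamma, r_d) / on_axis
    print(f"theta = {angle_in_units_of_inverse_gamma:4.1f}/Gamma: r_ph / r_ph(0) = {ratio:.1f}")
```

In this form the ratio to the on-axis radius is 1 + (Γθ)^2/3, so the surface is already twice as distant at θ = √3/Γ.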
An even closer inspection reveals that photons do not necessarily decouple from the plasma at the photospheric surface; this surface of τ(r, θ) = 1 simply represents a probability of e^{-1} for a photon to decouple from the plasma. Instead, the photons have a finite probability of decoupling from the plasma at every location in space. This is demonstrated in Figure 14, adapted from [Pe'er (2008)]. This realization led A. Beloborodov to coin the term "vague photosphere" [Beloborodov (2011)].
The immediate implication of this non-trivial shape of the photosphere is that the expected radiative signal emerging from the photosphere cannot have a pure "Planck" shape, but is observed as a gray-body, due to the different Doppler boosts and different adiabatic energy losses of photons below r_ph [Pe'er (2008); Pe'er and Ryde (2011)]. This is in fact the relativistic extension of the "limb darkening" effect known from stellar physics. As will be discussed in section 3.5.4 below, while in spherical outflow only a moderate modification to a pure "Planck" spectrum is expected, this effect becomes extremely pronounced when considering more realistic jet geometries, and can in fact be used to study GRB jet geometries.

The magnetar model
A second type of model assumes that the energy released during the collapse (or the merger) is not converted directly into photon-dominated outflow, but instead is initially used in producing very strong magnetic fields (Poynting flux dominated plasma). Only at a second stage is the energy stored in the magnetic field used both in accelerating the outflow to relativistic speeds (jet production and acceleration) and in heating the particles within the jet.
There are a few motivations for considering this alternative scenario. Observationally, one of the key discoveries of the Swift satellite is the existence of a long lasting "plateau" seen in the early afterglow of GRBs in the X-ray band [Rowlinson et al. (2013)]. This plateau is difficult to explain in the context of jet interaction with the environment, but can be explained by continuous central engine activity (though it may be explained by other mechanisms, e.g., reverse shock emission; see van Eerten (2014a,b)). A second motivation is the fact that magnetic fields have long been thought to play a major role in jet launching in other astronomical objects, such as AGNs, via the Blandford-Znajek [Blandford and Znajek (1977)] or the Blandford-Payne mechanisms [see Spruit (2010)]. It is thus plausible that they may play some role in the context of GRBs as well.
The key idea is that the core collapse of the massive star does not form a black hole immediately, but instead leads to a rapidly rotating proto-neutron star, with a period of ∼ 1 ms, and very strong surface magnetic fields (B ≳ 10^15 G). This is known as the "magnetar" model [Usov (1992); Thompson (1994); Kluźniak and Ruderman (1998); Spruit (1999); Wheeler et al. (2000); Thompson et al. (2004)]. The maximum energy that can be stored in a rotating neutron star is ∼ 2 × 10^52 erg, and the typical timescale over which this energy can be extracted is ∼ 10 s (for this value of the magnetic field). These values are similar to the values observed in long GRBs. The magnetic energy extracted drives a jet along the polar axis of the neutron star [Uzdensky and MacFadyen (2007); Bucciantini et al. (2008, 2009); Komissarov et al. (2009); Levinson (2013, 2014)]. Following this main energy extraction, residual rotational or magnetic energy may continue to power late time flaring or afterglow emission, which may be the origin of the observed X-ray plateau [Metzger et al. (2011)].
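The rotational energy budget quoted above can be checked with a one-line estimate, E_rot = IΩ^2/2. The sketch assumes a moment of inertia I = 10^45 g cm^2, a commonly used fiducial value for a neutron star that is not stated in the text; the function name is hypothetical.

```python
import math


def rotational_energy_erg(period_s: float, moment_of_inertia: float = 1e45) -> float:
    """Rotational energy E = I * Omega^2 / 2 of a rigidly rotating star (cgs units).

    The default moment of inertia (1e45 g cm^2) is an assumed fiducial value.
    """
    omega = 2.0 * math.pi / period_s
    return 0.5 * moment_of_inertia * omega**2


e_rot = rotational_energy_erg(1e-3)  # spin period of 1 ms
print(f"E_rot ~ {e_rot:.1e} erg")
```

For a 1 ms period this gives about 2 × 10^52 erg; since E_rot ∝ P^{-2}, a period of 2 ms would already reduce the available energy by a factor of four.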

Scaling laws for jet acceleration in magnetized outflows
Extraction of the magnetic energy leads to acceleration of particles to relativistic velocities. The evolution of the hydrodynamic quantities in these Poynting-flux dominated outflows was considered by several authors [Spruit et al. (2001); Drenkhahn (2002); Drenkhahn and Spruit (2002); Vlahakis and Königl (2003); Giannios (2005, 2006); Giannios and Spruit (2005); Mészáros and Rees (2011)]. The scaling laws of the acceleration can be derived by noting that, due to the high baryon load, the ideal MHD limit can be assumed [Spruit et al. (2001)].
In this model, there are two parts to the luminosity [Drenkhahn (2002)]: a kinetic part, $L_k = \Gamma \dot{M} c^2$, and a magnetic part, $L_M = 4\pi r^2 c \beta (B^2/4\pi)$, where $\beta$ is the outflow velocity (in units of $c$). Thus, $L = L_k + L_M$. Furthermore, in this model, throughout most of the jet evolution the dominant component of the magnetic field is the toroidal component, and so $B \perp \beta$.
An important physical quantity is the magnetization parameter, $\sigma$, which is the ratio of Poynting flux to kinetic energy flux:

$$\sigma \equiv \frac{L_M}{L_k} = \frac{\beta r^2 B^2}{\Gamma \dot{M} c}.$$

At the Alfvén radius, $r_0$ (at $r = r_0$, the flow velocity is equal to the Alfvén speed), the key assumption is that the flow is highly magnetized, and so the magnetization is $\sigma(r_0) \equiv \sigma_0 \gg 1$. The magnetization plays a similar role to that of the baryon loading in the classical fireball model. The basic idea is that the magnetic field in the flow changes polarity on a small scale, $\lambda$, which is of the order of the light cylinder in the central engine frame ($\lambda \approx 2\pi c/\Omega$), where $\Omega$ is the angular frequency of the central engine, either a spinning neutron star or black hole; see [Coroniti (1990)]. This polarity change leads to magnetic energy dissipation via the reconnection process. It is assumed that the dissipation of magnetic energy takes place at a constant rate, which is modeled by a fraction $\epsilon$ of the Alfvén speed. As the details of the reconnection process are uncertain, the value of $\epsilon$ is highly uncertain. Often a constant value $\epsilon \approx 0.1$ is assumed. This implies that the (comoving) reconnection time is

$$t'_{rec} \sim \frac{\lambda'}{\epsilon v'_A},$$

where $v'_A$ is the (comoving) Alfvén speed, and $\lambda' = \Gamma \lambda$. Since the plasma is relativistic, $v'_A \sim c$, and one finds that $t'_{rec} \propto \Gamma$. In the lab frame, $t_{rec} = \Gamma t'_{rec} \propto \Gamma^2$. Assuming that a constant fraction of the dissipated magnetic energy is used in accelerating the jet, the rate of kinetic energy increase is therefore given by

$$\frac{dE_k}{dr} \propto \frac{d\Gamma}{dr} \sim \frac{1}{c\, t_{rec}} \propto \Gamma^{-2},$$

from which one immediately finds the scaling law $\Gamma(r) \propto r^{1/3}$. The maximum Lorentz factor that can be achieved in this mechanism is calculated as follows. First, one writes the total luminosity as $L = L_k + L_M = (\sigma_0 + 1)\Gamma_0 \dot{M} c^2$, where $\Gamma_0$ is the Lorentz factor of the flow at the Alfvén radius.
Second, generalization of the Alfvénic velocity to relativistic speeds [Lichnerowicz (1967); Gedalin (1993)] reads

$$\gamma_A \beta_A = \sqrt{\sigma}.$$

By definition of the Alfvénic radius, the flow Lorentz factor at this radius is $\Gamma_0 = \gamma_A \simeq \sqrt{\sigma_0}$ (since at this radius the flow is Poynting-flux dominated, $\sigma_0 \gg 1$). Thus, the mass ejection rate is written as $\dot{M} \approx L/(\sigma_0^{3/2} c^2)$. As the luminosity is assumed constant throughout the outflow, the maximum Lorentz factor is reached when $L \sim L_k \gg L_M$, namely $L = \Gamma_{max} \dot{M} c^2$. Thus,

$$\Gamma_{max} \approx \sigma_0^{3/2}.$$

In comparison to the photon-dominated outflow, jet acceleration in the Poynting-flux dominated outflow model is thus much more gradual. The saturation radius is at $r_s = r_0 \sigma_0^3 \approx 10^{13.5}\, \sigma_2^3\, (\epsilon\Omega)_3^{-1}$ cm. Similar calculations to that presented above show the photospheric radius to be at [Giannios and Spruit (2005)] $r_{ph} = 6 \times 10^{11}\, L_{52}^{3/5}$ cm (up to additional factors that depend on $\epsilon\Omega$ and $\sigma_0$), which is similar (for the values of parameters chosen) to the photospheric radius obtained in the photon-dominated flow. Note that in this scenario, the photosphere occurs while the flow is still accelerating.

The model described above is clearly very simplistic. In particular, it assumes constant luminosity, and a constant rate of reconnection along the jet. As such, it is difficult to explain the observed rapid variability in the framework of this model. Furthermore, one still faces the need to dissipate the kinetic energy in order to produce the observed γ-rays. As was shown by several authors [e.g., Mimica et al. (2009); Mimica and Aloy (2010)], kinetic energy dissipation via shock waves is much less efficient in Poynting-flux dominated plasma relative to weakly magnetized plasma.
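The scaling laws above can be collected into a small numerical sketch. It assumes the idealized steady model described in the text: $\Gamma_0 = \sqrt{\sigma_0}$ at the Alfvén radius, growth as $r^{1/3}$, and saturation at $\Gamma_{max} = \sigma_0^{3/2}$. The function names are hypothetical, and the value $r_0 \approx 10^{7.5}$ cm is inferred from $r_s = r_0 \sigma_0^3 \approx 10^{13.5}$ cm at $\sigma_0 = 100$, so it should be read as illustrative only.

```python
def saturation_radius(r0_cm, sigma0):
    """Radius at which the acceleration ends, r_s = r_0 * sigma_0^3."""
    return r0_cm * sigma0 ** 3

def lorentz_factor(r_cm, r0_cm, sigma0):
    """Bulk Lorentz factor Gamma(r) in the steady reconnection-driven model.

    Gamma = sqrt(sigma_0) at r_0, grows as (r/r_0)^(1/3), and is capped
    at Gamma_max = sigma_0^(3/2).
    """
    gamma0 = sigma0 ** 0.5
    gamma_max = sigma0 ** 1.5
    if r_cm <= r0_cm:
        return gamma0
    return min(gamma0 * (r_cm / r0_cm) ** (1.0 / 3.0), gamma_max)

# Illustrative parameters (assumptions): sigma_0 = 100, r_0 = 10^7.5 cm
SIGMA0 = 100.0
R0 = 10 ** 7.5
R_SAT = saturation_radius(R0, SIGMA0)
for radius in (R0, 6.0e11, R_SAT, 10 * R_SAT):
    print(f"r = {radius:.2e} cm -> Gamma = {lorentz_factor(radius, R0, SIGMA0):.1f}")
```

With these numbers the flow at $r \sim 6 \times 10^{11}$ cm is still well below $\Gamma_{max}$, in line with the statement that the photosphere occurs while the flow is still accelerating.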
Moreover, even if this is the correct model in describing (even if only approximately) the magnetic energy dissipation rate, it is not known what fraction of the dissipated magnetic energy is used in accelerating the jet (increasing the bulk Lorentz factor), and what fraction is used in heating the particles (increasing their random motion). Lacking a clear theoretical model, it is often simply assumed that about half of the dissipated energy is used in accelerating the jet, and the other half in heating the particles [Spruit and Drenkhahn (2004)]. Clearly, all these assumptions can be questioned. Despite numerous efforts in recent years in studying magnetic reconnection [e.g., Uzdensky and McKinney (2011); McKinney and Uzdensky (2012); Cerutti et al. (2012, 2013); Werner et al. (2014)], this is still an open issue.
Being aware of these limitations, in recent years several authors have dropped the steady-state assumption, and considered models in which the acceleration of a magnetized outflow occurs over a finite, short duration [Contopoulos (1995); Tchekhovskoy et al. (2010b); Komissarov et al. (2010); Granot et al. (2011)]. The basic idea is that variability in the central engine leads to the ejection of magnetized plasma shells, which expand due to the internal magnetic pressure gradient once they lose causal contact with the source.
One suggestion is that, similar to the internal shock model, the shells collide at some radius $r_{coll}$. The collisions distort the ordered magnetic field lines entrained in the ejecta. Once a critical point is reached, fast reconnection seeds occur, which induce relativistic MHD turbulence in the interaction regions. This model, known as Internal-Collision-induced Magnetic Reconnection and Turbulence (ICMART) [Zhang and Yan (2011)], may be able to overcome the low efficiency difficulty of the classical internal shock scenario.

Particle acceleration
In order to produce the non-thermal spectra observed, one can in principle consider two mechanisms. The first is emission of radiation via various non-thermal processes, such as synchrotron, Compton, etc. This is the traditional way which is widely considered in the literature. A second way which was discussed only recently is the use of light aberration, to modify the (naively expected) Planck spectrum emitted at the photosphere. The potentials and drawbacks of this second idea will be considered in section 3.5.4. First, let me consider the traditional way of producing the spectra via non-thermal radiative processes 3 .
The internal collisions, magnetic reconnection, or possibly other unknown mechanism dissipate part of the outflow kinetic energy 4 . This dissipated energy, in turn, can be used to heat the particles (increase their random motion), and/or accelerate some fraction of them to a non-thermal distribution. Traditionally, it is also assumed that some fraction of this dissipated energy is used in producing (or enhancing) magnetic fields. Once accelerated, the high energy particles emit the non-thermal spectra.
The most widely discussed mechanism for acceleration of particles is the Fermi mechanism [Fermi (1949, 1954)], which requires particles to cross back and forth across a shock wave. Thus, this mechanism is naturally associated with internal shell collisions, where shock waves are expected to form. A basic explanation of this mechanism can be found in the textbook by [Longair]. For reviews see [Bell (1978); Blandford and Ostriker (1978); Blandford and Eichler (1987); Jones and Ellison (1991)]. In this process, the accelerated particle crosses the shock multiple times, and in each crossing its energy increases by a (nearly) constant fraction, $\Delta E/E \sim 1$. This results in a power law distribution of the accelerated particles, $N(E) \propto E^{-S}$ with power law index $S \approx 2.0 - 2.4$ [Kirk et al. (1998, 2000); Ellison et al. (1990); Achterberg et al. (2001); Ellison and Double (2004)]. Recent developments in particle-in-cell (PIC) simulations have made it possible to model this process from first principles, and study it in more detail [Silva et al. (2003); Nishikawa et al. (2003); Spitkovsky (2008b); Sironi and Spitkovsky (2009); Haugbølle (2011)]. As can be seen in Figure 15, taken from [Spitkovsky (2008b)], a power law tail above a low energy Maxwellian is indeed formed in the particle distribution.
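The link between a constant fractional energy gain per crossing and a power law distribution can be illustrated with the standard test-particle argument. If each cycle multiplies the particle energy by a factor $g = 1 + \Delta E/E$ and a particle remains in the acceleration region with probability $P$ per cycle, then after $k$ cycles $E = E_0 g^k$ and $N = N_0 P^k$, which gives $N(E) \propto E^{-S}$ with $S = 1 - \ln P/\ln g$. The sketch below evaluates this index; the return probability $P$ is not specified in the text, and the values used are illustrative assumptions.

```python
import math

def fermi_spectral_index(gain, return_probability):
    """Differential index S of N(E) ~ E^-S for repeated shock crossings.

    gain               : energy multiplication per cycle, g = 1 + dE/E (> 1)
    return_probability : chance P that a particle returns for another
                         cycle (0 < P < 1); an assumed, illustrative input
    """
    return 1.0 - math.log(return_probability) / math.log(gain)

# dE/E ~ 1 means the energy roughly doubles per cycle (g = 2)
for p_return in (0.5, 0.45, 0.4):
    s_index = fermi_spectral_index(2.0, p_return)
    print(f"P = {p_return:.2f} -> S = {s_index:.2f}")
```

For $g = 2$, return probabilities between roughly 0.4 and 0.5 give indices in the range $S \approx 2.0 - 2.3$, comparable to the values quoted above.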
The main drawback of the PIC simulations is that due to the numerical complexity of the problem, these simulations can only cover a tiny fraction ($\sim 10^{-8}$) of the actual emitting region in which energetic particles exist. Thus, these simulations can only serve as guidelines, and the problem is still far from being fully resolved. Regardless of the exact details, it is clear that particle acceleration via the Fermi mechanism requires the existence of shock waves, and is thus directly related to the internal dynamics of the gas, and possibly to the generation of magnetic fields, as mentioned above.
The question of particle acceleration in magnetic reconnection layers has also been extensively addressed in recent years [see, e.g., Romanova and Lovelace (1992); Zenitani and Hoshino (2001) for a partial list of works]. The physics of acceleration is somewhat more complicated than in non-magnetized outflows, and may involve several different mechanisms. The basic picture is that the dissipation of the magnetic field occurs in sheets. The first mechanism relies on the realization that within these sheets, there are regions of high electric fields; particles can therefore be accelerated directly by the strong electric fields. A second mechanism is based on instabilities within the sheets, which create "magnetic islands" (plasmoids) that move at close to the Alfvén speed (see Figure 16). Particles can therefore be accelerated via the Fermi mechanism by scattering between the plasmoids. A third mechanism is based on converging plasma flows towards the current sheets, which provide another way of accelerating particles via the first order Fermi process.
In addition, if the flow is Poynting-flux dominated, particles may also be accelerated in shock waves; however, it was argued that Fermi-type acceleration in shock waves that may develop in highly magnetized plasma may be inefficient [Sironi and Spitkovsky (2009, 2011)]. Thus, while the question of particle acceleration in magnetized outflows is clearly a very active research field, the numerical limitations imply that theoretical understanding of this process and its details (e.g., what fraction of the reconnected energy is used in accelerating particles, or the energy distribution of the accelerated particles) is still very limited.

Fig. 16 Results of an electromagnetic particle in cell (PIC) simulation TRISTAN-MP show the structure of the reconnection layer (left) and the accelerated particle distribution function (right). Left: structure of the reconnection layer. Shown are the particle densities (a), (b); magnetic energy fraction (c) and mean kinetic energy per particle (d). The plasmoids are clearly seen. Right: temporal evolution of the particle energy spectrum. The spectrum at late times resembles a power law with slope p = 2 (dotted red line), and clearly departs from a Maxwellian. The dependence of the spectrum on the magnetization is shown in the inset. Figure is taken from Sironi and Spitkovsky (2014).
Although the power law distribution of particles resulting from Fermi-type, or perhaps magnetic-reconnection, acceleration is the most widely discussed, we point out that alternative models exist. One such model involves particle acceleration by a strong electromagnetic potential, which can exceed $10^{20}$ eV close to the jet core [Lovelace (1976); Blandford (1976); Neronov et al. (2009)]. The accelerated particles may produce a high energy cascade of electron-positron pairs. An additional model involves stochastic acceleration of particles due to resonant interactions with plasma waves in the black hole magnetosphere [Dermer et al. (1996)].
Several authors have also considered the possibility that particles in fact have a relativistic quasi-Maxwellian distribution [Jones and Hardee (1979); Cioffi and Jones (1980); Wardziński and Zdziarski (2000); Pe'er and Casella (2009)]. Such a distribution, with the required temperature ($\sim 10^{11} - 10^{12}$ K), may be generated if particles are roughly thermalized behind a relativistic strong shock wave [e.g., Blandford and McKee (1977)]. While such a model is consistent with several key observations, it is difficult to explain the very high energy (GeV) emission without invoking very energetic particles, and therefore some type of particle acceleration mechanism must take place as part of the kinetic energy dissipation process.
3.5 Radiative processes and the production of the observed spectra.
Following jet acceleration, kinetic energy dissipation (either via shock waves or via magnetic reconnection) and particle acceleration, the final stage of energy conversion must produce the observed spectra. As the γ-ray spectra are both very broad and non-thermal (they do not resemble "Planck"), most efforts to date are focused on identifying the relevant radiative processes and physical conditions that enable the production of the observed spectra. The leading radiative models initially discussed are synchrotron emission, accompanied by synchrotron-self Compton at high energies. However, as already mentioned, it was shown that this model is inconsistent with the data, in particular with the low energy spectral slopes.
Various suggestions of ways to overcome this drawback by modifying some of the physical conditions and/or physical properties of the plasma were proposed in the last decade. However, a major revolution occurred with the realization that part of the spectra is thermal. This led to a new set of models in which part of the emission originates from below the photosphere (the optically thick region). It should be stressed that only part of the spectrum, but not all of it, is assumed to originate from the photosphere. Thus, in these models as well, there is room for optically thin (synchrotron and IC) emission, originating from a different location. Finally, a few of the most recent works on light aberration show that the contribution of the photospheric emission may be much broader than previously thought.

Optically thin model: synchrotron
Synchrotron emission is perhaps the most widely discussed model for explaining GRB prompt emission. It has several advantages. First, it has been extensively studied since the 1960's [Ginzburg and Syrovatskii (1965); Blumenthal and Gould (1970)] and is the leading model for interpreting non-thermal emission in AGNs, XRBs and emission during the afterglow phase of GRBs. Second, it is very simple: it requires only two basic ingredients, namely energetic particles and a strong magnetic field. Both are believed to be produced in shock waves (or in the magnetic reconnection phase), which ties it nicely to the general "fireball" (both "hot" and "cold") picture discussed above. Third, it is broad band in nature (as opposed, e.g., to the "Planck" spectrum), with a distinctive spectral peak that could be associated with the observed peak energy. Fourth, it provides a very efficient way of energy transfer, as for the typical parameters, energetic electrons radiate nearly 100% of their energy. These properties made synchrotron emission the most widely discussed radiative model in the context of GRB prompt emission [see Meszaros and Rees (1993); Meszaros et al. (1993); Mészáros et al. (1994); Paczynski and Xu (1994); Papathanassiou and Meszaros (1996); Tavani (1996); Cohen et al. (1997); Sari and Piran (1997a); Pilla and Loeb (1998); Daigne and Mochkovitch (1998) for a very partial list].
Consider a source at redshift $z$ which is moving at velocity $\beta \equiv v/c$ (corresponding Lorentz factor $\Gamma = (1 - \beta^2)^{-1/2}$) at angle $\theta$ with respect to the observer. The emitted photons are thus seen with a Doppler boost $D = [\Gamma (1 - \beta \cos\theta)]^{-1}$. Synchrotron emission from electrons having random Lorentz factor $\gamma_{el}$ in a magnetic field $B$ (all in the comoving frame) is observed at a typical energy

$$\epsilon_{ob} \simeq 1.75 \times 10^{-19}\, B\, \gamma_{el}^2\, \frac{D}{1+z}\ \mathrm{erg},$$

with $B$ measured in Gauss.
If this model is to explain the peak observed energy, $\epsilon_{ob} \approx 200$ keV, with typical Doppler boost $D \simeq \Gamma \sim 100$ (relevant for an on-axis observer), one obtains a condition on the typical electron Lorentz factor and magnetic field,

$$B \gamma_{el}^2 \simeq 3.6 \times 10^{10} \left(\frac{1+z}{2}\right)\ \mathrm{G}.$$

Thus, both a strong magnetic field and very energetic electrons are required in interpreting the observed spectral peak as due to synchrotron emission. Such high values of the electrons' Lorentz factor are not excluded by any of the known models for particle acceleration. High values of the magnetic fields may be present if the outflow is Poynting flux dominated. In the photon-dominated outflow, strong magnetic fields may be generated via two stream (Weibel) instabilities [Weibel (1959); Medvedev and Loeb (1999)]. One can therefore conclude that the synchrotron model is capable of explaining the peak energy. However, one alarming problem is that the high values of both $B$ and $\gamma_{el}$ required, when expressed as fractions of the available thermal energy (the parameters $\epsilon_e$ and $\epsilon_B$), are much higher than the (normalized) values inferred from GRB afterglow measurements [Wijers et al. (1997); Panaitescu and Kumar (2002); Santana et al. (2014); Barniol Duran (2014)]. This is of concern, since broad band GRB afterglow observations are typically well fitted with the synchrotron model, and the microphysics of particle acceleration and magnetic field generation should be similar in both prompt and afterglow environments 5 .
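The condition on $B\gamma_{el}^2$ can be reproduced numerically. The sketch below inverts the relation between the observed synchrotron peak energy and $B\gamma_{el}^2$, assuming a numerical coefficient of $1.75 \times 10^{-19}$ erg per Gauss; this coefficient is an assumption, chosen because it reproduces the condition quoted in the text for $\epsilon_{ob} = 200$ keV, $D = 100$ and $z = 1$. The function name is hypothetical.

```python
KEV_TO_ERG = 1.602e-9        # 1 keV in erg
SYNCH_COEFF_ERG = 1.75e-19   # assumed coefficient: peak energy per (G * gamma_el^2), erg

def required_b_gamma_sq(e_peak_kev, doppler, redshift):
    """Value of B * gamma_el^2 (in Gauss) needed to place the observed
    synchrotron peak at e_peak_kev, for a given Doppler boost and redshift."""
    e_peak_erg = e_peak_kev * KEV_TO_ERG
    return e_peak_erg * (1.0 + redshift) / (SYNCH_COEFF_ERG * doppler)

B_GAMMA_SQ = required_b_gamma_sq(e_peak_kev=200.0, doppler=100.0, redshift=1.0)
print(f"B * gamma_el^2 ~ {B_GAMMA_SQ:.2e} G")

# Example split of the product: electrons with gamma_el = 1e3 would need
print(f"B ~ {B_GAMMA_SQ / 1.0e3 ** 2:.2e} G for gamma_el = 1e3")
```

The second line of output shows how the same product can be met by very different combinations of field strength and electron energy, which is why additional constraints (such as the cooling time) are needed.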
The main concern, though, is the low energy spectral slope. As long as the electrons maintain their energy, the expected synchrotron spectrum below the peak energy is $F_\nu \propto \nu^{1/3}$ (corresponding photon number $N_E \propto E^{-2/3}$) [e.g., Rybicki and Lightman (1979)]. This is roughly consistent with the observed low energy spectral slope, $\langle\alpha\rangle = -1$ (see Section 2.2.2).
However, at these high energies, and with such a strong magnetic field, the radiating electrons rapidly cool by radiating their energy on a very short time scale:

$$t'_{cool} = \frac{E}{P} = \frac{\gamma_{el} m_e c^2}{(4/3)\, c\, \sigma_T\, u_B\, \gamma_{el}^2 (1+Y)} = \frac{6\pi m_e c}{\sigma_T B^2 \gamma_{el} (1+Y)}.$$

Here, $E = \gamma_{el} m_e c^2$ is the electron's energy, $P$ is the radiated power, $u_B \equiv B^2/8\pi$ is the energy density in the magnetic field, $\sigma_T$ is Thomson's cross section and $Y$ is the Compton parameter. The factor $(1 + Y)$ is added to account for cooling via both synchrotron and Compton scattering.
Using the values obtained in Equation 27, one finds the (comoving) cooling time to be

$$t'_{cool} \simeq 6.0 \times 10^{-13}\, \gamma_{el}^3\, (1+Y)^{-1} \left(\frac{1+z}{2}\right)^{-2}\ \mathrm{s}.$$

This time is to be compared with the comoving dynamical time, $t'_{dyn} \sim R/\Gamma c$. If the cooling time is shorter than the dynamical time, the resulting spectrum below the peak is $F_\nu \propto \nu^{-1/2}$ [e.g., Sari et al. (1996, 1998)], corresponding to $N_E \propto E^{-3/2}$. While values of the power law index smaller than $-3/2$, corresponding to shallower spectra, can be obtained by superposition of various emission sites, steeper values cannot be obtained. Thus, the observed low energy spectral slope of $\sim 85\%$ of the GRBs (see Figure 5), which show $\alpha$ larger than this value ($\langle\alpha\rangle = -1$), cannot be explained by the synchrotron emission model. This is the "synchrotron line of death" problem introduced above.
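The bookkeeping between the energy flux slope and the photon number index is easy to get wrong, so a minimal sketch may help. It uses the relation $N_E \propto F_\nu/E$, so that $F_\nu \propto \nu^{a}$ corresponds to a photon index $\alpha = a - 1$, and checks a measured low energy index against the two synchrotron limits discussed above. The function names and the small numerical tolerance are illustrative choices, not part of the original analysis.

```python
SLOW_COOLING_LIMIT = -2.0 / 3.0   # photon index for F_nu ~ nu^(1/3)
FAST_COOLING_LIMIT = -3.0 / 2.0   # photon index for F_nu ~ nu^(-1/2)

def photon_index_from_flux_index(flux_index):
    """Convert the slope a of F_nu ~ nu^a into the photon index of N_E ~ E^alpha."""
    return flux_index - 1.0

def synchrotron_compatible(alpha, fast_cooling=True, tol=1e-9):
    """True if a low energy photon index alpha is not harder than the
    relevant synchrotron limit (superposition can only soften the slope)."""
    limit = FAST_COOLING_LIMIT if fast_cooling else SLOW_COOLING_LIMIT
    return alpha <= limit + tol

MEAN_ALPHA = -1.0  # typical observed low energy photon index
print(photon_index_from_flux_index(1.0 / 3.0))                  # slow cooling limit
print(photon_index_from_flux_index(-0.5))                       # fast cooling limit
print(synchrotron_compatible(MEAN_ALPHA, fast_cooling=False))
print(synchrotron_compatible(MEAN_ALPHA, fast_cooling=True))
```

A typical index of $-1$ passes the slow cooling test but fails the fast cooling one, which is the tension described in the text.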
The condition for $t'_{cool} \gtrsim t'_{dyn}$ can be written as

$$\gamma_{el} \gtrsim 3.8 \times 10^{4}\, R_{14}^{1/3} (1+Y)^{1/3}, \qquad B \lesssim 25\, R_{14}^{-2/3} (1+Y)^{-2/3}\ \mathrm{G},$$

evaluated for $\Gamma = 100$ and $z = 1$. The value of the emission radius $R = 10^{14}$ cm is chosen as a representative value that enables variability over time scale $\delta t_{ob} \sim R/\Gamma^2 c \sim 0.3\, R_{14}\, \Gamma_2^{-2}$ s. Since $\gamma_{el}$ represents the characteristic energy of the radiating electrons, such high values of the typical Lorentz factor $\gamma_{el}$ are very challenging for theoretical modeling. However, a much more severe problem is that in this model, under these conditions, the energy content in the magnetic field must be very low (see Equation 27). In order to explain the observed flux, one must therefore demand a high energy content in the electron component, which is several orders of magnitude higher than that stored in the magnetic field [Kumar and McMahon (2008); Beniamini and Piran (2013); Kumar and Zhang (2014)]. This, in turn, implies that inverse Compton becomes significant, producing a $\sim$ TeV emission component that substantially increases the total energy budget. As was shown by [Kumar and McMahon (2008)], such a scenario can only be avoided if the emission radius is $R \gtrsim 10^{17}$ cm, in which case it is impossible to explain the rapid variability observed. Thus, the overall conclusion is that classical synchrotron emission as a leading radiative process fails to explain the key properties of the prompt emission of the vast majority of GRBs [e.g., Ghisellini et al. (2000)].
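The comparison between the cooling and dynamical times can be checked with a short numerical sketch. It uses the standard cooling time $t'_{cool} = 6\pi m_e c/[\sigma_T B^2 \gamma_{el}(1+Y)]$ together with $t'_{dyn} = R/\Gamma c$, and imposes the peak energy constraint $B\gamma_{el}^2 = 3.6 \times 10^{10}$ G (valid for $z = 1$, $\Gamma = 100$). The function names are hypothetical, the cgs constants are rounded values, and setting $Y = 0$ is an illustrative assumption.

```python
import math

M_E = 9.109e-28      # electron mass [g]
C = 2.998e10         # speed of light [cm/s]
SIGMA_T = 6.652e-25  # Thomson cross section [cm^2]

def cooling_time(gamma_el, b_gauss, compton_y=0.0):
    """Comoving synchrotron + inverse Compton cooling time [s]."""
    return 6.0 * math.pi * M_E * C / (SIGMA_T * b_gauss ** 2 * gamma_el * (1.0 + compton_y))

def dynamical_time(radius_cm, gamma_bulk):
    """Comoving dynamical time, R / (Gamma c) [s]."""
    return radius_cm / (gamma_bulk * C)

def min_gamma_for_slow_cooling(b_gamma_sq, radius_cm, gamma_bulk, compton_y=0.0):
    """Smallest gamma_el with t_cool >= t_dyn when B * gamma_el^2 is held fixed.

    With B = b_gamma_sq / gamma_el^2 the cooling time scales as gamma_el^3,
    so the condition can be solved for gamma_el in closed form.
    """
    k = 6.0 * math.pi * M_E * C / SIGMA_T
    t_dyn = dynamical_time(radius_cm, gamma_bulk)
    return (t_dyn * b_gamma_sq ** 2 * (1.0 + compton_y) / k) ** (1.0 / 3.0)

B_GAMMA_SQ_PEAK = 3.6e10  # from the peak energy condition [G]
GAMMA_MIN = min_gamma_for_slow_cooling(B_GAMMA_SQ_PEAK, radius_cm=1.0e14, gamma_bulk=100.0)
B_MAX = B_GAMMA_SQ_PEAK / GAMMA_MIN ** 2
print(f"gamma_el >~ {GAMMA_MIN:.2e}, B <~ {B_MAX:.1f} G")
print(f"t_cool = {cooling_time(GAMMA_MIN, B_MAX):.1f} s, "
      f"t_dyn = {dynamical_time(1.0e14, 100.0):.1f} s")
```

For these parameters the electrons must have $\gamma_{el}$ of a few times $10^4$ while the field is limited to a few tens of Gauss, which illustrates why the magnetic energy content must be low in this scenario.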

Suggested modifications to the classical synchrotron scenario
The basic synchrotron emission scenario thus fails to self-consistently explain both the energy of the spectral peak and the low energy spectral slope. In the past decade there have been several suggestions of ways in which the basic picture might be modified, so that the modified synchrotron emission, accompanied by inverse Compton scattering of the synchrotron photons (synchrotron-self Compton; SSC) would be able to account for these key observations.
The key problem is the fast cooling of the electrons, namely $t_{cool} < t_{dyn}$. However, in order for the electrons to rapidly cool they must be embedded in a strong magnetic field. The spatial structure of the magnetic field is not clear at all. Thus, it was proposed by [Pe'er and Zhang (2006)] that the magnetic field may decay on a relatively short length scale, and so the electrons would not be able to cool efficiently. This idea has gained interest recently [Zhao et al. (2014); Uhm and Zhang (2014)]. Its major drawback is the need for a high energy budget, as only a small part of the energy stored in the electrons is radiated. Another idea is that synchrotron self absorption may produce a steep low energy slope below the observed peak [Lloyd and Petrosian (2000)]. However, this requires an unrealistically high magnetic field. Typically, the synchrotron self absorption frequency is expected at the IR/optical band [e.g., Rybicki and Lightman (1979); Granot et al. (2000)]. Thus, synchrotron self absorption may be relevant in shaping the spectrum at X-rays only under very extreme conditions [e.g., Pe'er and Waxman (2004b)].
Looking into a different region of parameter space, it was suggested that the observed peak energy is not due to synchrotron emission, but due to inverse-Compton scattering of the synchrotron photons, which are emitted at much lower energies [e.g., Dermer and Böttcher (2000); Stern and Poutanen (2004)]. In these models, the steep low energy spectral slope can result from upscattering of synchrotron self absorbed photons. However, a careful analysis of this scenario (e.g., [Kumar and Zhang (2014)]) reveals requirements on the emission radius, $R \gtrsim 10^{16}$ cm, and on the optical flux (associated with the synchrotron seed photons) that are inconsistent with observations. Furthermore, a second scattering would lead to substantial TeV flux, resulting in an energy crisis [Derishev et al. (2001)]. Thus, this model as well is concluded not to be viable as the leading radiative model during the GRB prompt emission.
If the energy density in the photon field is much greater than in the magnetic field, then electron cooling by inverse Compton scattering of the low energy photons dominates over cooling by synchrotron radiation. The most energetic electrons cool less efficiently due to the Klein-Nishina (KN) decrease in the scattering cross section. Thus, in the parameter space where the KN effect is important, steeper low energy spectral slopes can be obtained [Derishev et al. (2001); Nakar et al. (2009)]. However, even under the most extreme conditions, the steepest slope that can be obtained is no harder than $F_\nu \propto \nu^{0}$ [Nakar et al. (2009); Barniol Duran et al. (2012)], corresponding to $N_E \propto E^{-1}$, which can explain at most $\sim 50\%$ of the low energy spectral slopes observed. Moreover, very high values of the Lorentz factor, $\gamma_{el} \gtrsim 10^{6}$, are assumed, which challenge theoretical models, as discussed above.
A different proposition was that the heating of the electrons may be slow; namely, the electrons may be continuously heated while radiating their energy as synchrotron photons. This way, the rapid electron cooling is avoided, and shallower spectra can be obtained [Ghisellini and Celotti (1999a,b); Kumar and McMahon (2008); Asano and Terasawa (2009); Murase et al. (2012)]. While there is no known mechanism that could continuously heat the electrons as they cross the shock wave and are advected downstream in the classical internal collision scenario, it was proposed that slow heating may result from MHD turbulence downstream of the shock front [Murase et al. (2012)]. Thus this may be an interesting alternative, though currently there are still large gaps in the physics involved in the slow heating process.
A different suggestion is emission by hadrons (protons). The key idea is that whatever mechanism is capable of accelerating electrons to high energies should accelerate protons as well; in fact, the fact that high energy cosmic rays are observed necessitates the existence of such a mechanism, although its details in the context of GRBs are unknown. Many authors have considered a possible contribution of energetic protons to the observed spectra [e.g., Bottcher and Dermer (1998); Totani (1998); Gupta and Zhang (2007); Razzaque et al. (2010); Crumley and Kumar (2013)]. Energetic protons contribute to the spectrum both via direct synchrotron emission, and indirectly via photo-pion production or photo-pair production.
Clearly, proton acceleration to high energies would imply that GRBs are potentially strong sources of both high energy cosmic rays and energetic neutrinos [Milgrom and Usov (1995); Waxman (1995); Waxman and Bahcall (1997); Waxman (2004)]. On the other hand, the main drawback of this suggestion is that protons are much less efficient radiators than electrons (as the ratio of proton to electron cross section for synchrotron emission is $\sim (m_e/m_p)^2$). Thus, in order to produce the observed luminosity in γ-rays, the energy content of the protons must be very high, with proton luminosity of $\sim 10^{55} - 10^{56}$ erg s$^{-1}$. This is at least 3 orders of magnitude higher than the requirement for leptonic models.
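The size of the suppression factor quoted above is easy to evaluate. The sketch below computes $(m_e/m_p)^2$, which is the ratio of the proton to electron synchrotron cross sections, and hence of the emitted powers when both particles have the same Lorentz factor and sit in the same magnetic field. That equal Lorentz factor comparison is an illustrative assumption; the masses are rounded cgs values.

```python
M_ELECTRON_G = 9.109e-28  # electron mass [g]
M_PROTON_G = 1.673e-24    # proton mass [g]

# Ratio of proton to electron synchrotron cross sections, ~ (m_e / m_p)^2
CROSS_SECTION_RATIO = (M_ELECTRON_G / M_PROTON_G) ** 2
print(f"(m_e/m_p)^2 ~ {CROSS_SECTION_RATIO:.2e}")
```

The ratio is of order a few times $10^{-7}$, which gives a sense of why hadronic models demand a much larger energy budget than leptonic ones.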

Photospheric emission
As discussed above, photospheric (thermal) emission is an inherent part of both the "hot" and "cold" (magnetized) versions of the fireball model. Thus, it is not surprising that the very early models of cosmological GRBs considered photospheric emission as a leading radiative mechanism [Goodman (1986); Paczynski (1986, 1990); Thompson (1994)]. However, following the observational evidence of non-thermal emission, and lacking clear evidence for a thermal component, this idea was abandoned for over a decade.
Renewed interest in this idea began in the early 2000's, with the realization that the synchrotron model, even after being modified, cannot explain the observed spectra. Thus, several authors considered the addition of thermal photons to the overall non-thermal spectra, being either dominant [Eichler and Levinson (2000); Daigne and Mochkovitch (2002)] or sub-dominant [e.g., Mészáros and Rees (2000b)]. Note that as neither the internal collision nor the magnetic reconnection models provide a clear indication of the location and the amount of dissipated kinetic energy that is later converted into non-thermal radiation, it is impossible to determine the expected ratio of thermal to non-thermal photons from first principles in the framework of these models. Lacking clear observational evidence, it was therefore thought that $r_{ph} \gg r_s$, in which case adiabatic losses lead to strong suppression of the thermal luminosity and temperature (see Equations 18,19).
However, as was pointed out by [Pe'er and Waxman (2004a)], in the scenario where $r_{ph} \gg r_s$ it is possible that a substantial fraction of the kinetic energy dissipation occurs below the photosphere (e.g., in the internal collision scenario, if $r_{coll} < r_{ph}$). In this case, the radiated (non-thermal) photons that are emitted as a result of the dissipation process cannot directly escape, but are advected with the flow until they escape at the photosphere. This triggers several events. First, multiple Compton scattering substantially modifies the optically thin (synchrotron) spectra, presumably emitted initially by the heated electrons. Second, the electrons in the plasma rapidly cool, mainly by IC scattering. However, they quickly reach a 'quasi steady state', and their distribution becomes quasi-Maxwellian, irrespective of their initial (accelerated) distribution. The temperature of the electrons is determined by a balance between heating (both external, as well as by direct Compton scattering of energetic photons) and cooling (adiabatic and radiative). The photon field is then modified by scattering from this quasi-Maxwellian distribution of electrons. The overall result is a regulation of the spectral peak at $\sim 1$ MeV (for dissipation that takes place at moderate optical depth, $\tau \sim$ a few to a few tens), and low energy spectral slopes consistent with observations [Pe'er and Waxman (2004a)].

Fig. 17 Time averaged broad band spectra expected following kinetic energy dissipation at various optical depths. For low optical depth, the two low energy bumps are due to synchrotron emission and the original thermal component, and the high energy bumps are due to inverse Compton scattering. At high optical depth, $\tau \geq 100$, a Wien peak is formed at $\sim 10$ keV, and is blue-shifted to the MeV range by the bulk Lorentz factor $\simeq 100$ expected in GRBs. In the intermediate regime, $0.1 < \tau < 100$, a flat energy spectrum above the thermal peak is obtained by multiple Compton scattering.
The addition of the thermal photons that originate from the initial explosion (this term is more pronounced if r ph > ∼ rs) significantly enhances these effects ]. The thermal photons serve as seed photons for IC scattering, resulting in rapid cooling of the non-thermal electrons that are heated in the sub-photospheric energy dissipation event. As the rapid IC cooling leads to a quasi-steady state distribution of the electrons, the outcome is a 'two temperature plasma', with electron temperature higher than the thermal photon temperature, T el > T ph . An important result of this model is that the electron temperature is highly regulated, and is very weakly sensitive to the model uncertainties; see ] for details. If the dissipation occurs at intermediate optical depth, τ ∼few -few tens, the emerging spectrum has a nearly 'top hat' shape (see Figure 17). Below T ph the spectrum is steep, similar to the Rayleigh-Jeans part of the thermal spectrum; in between T ph and T el , a nearly flat energy spectra, νfν ∝ ν 0 (corresponding N E ∝ E −2 ) is obtained, resulting from multiple Compton scattering; and an exponential cutoff is expected at higher energies.
Interestingly, the spectral slope obtained in the intermediate regime is similar to the obtained high energy spectral slope, β ∼ −2 (see discussion in section 2.2.2 and Figure 5). Thus, a simple interpretation is to associate the observed E pk with T ph . However, this is likely a too simplistic interpretation from the following reasons. First, the predicted low energy spectral slopes, being (modified) thermal are typically harder than the observed [Deng and Zhang (2014)]. Second, in GRB110721A, the peak energy is at ≈ 15 MeV at early times [Axelsson et al. (2012); Iyyani et al. (2013)], which is too high to be accounted for by T ph ; ]. Moreover, recent analysis of Fermi data show a thermal peak at lower energies than E pk (see, e.g., figure 7), which is consistent with the interpretation of the thermal peak being associated with T ph . The key result of this model, that T ph < T el is consistent with the observational result of E peak,th < E pk , which is applicable to all GRBs in which thermal emission was identified so far. This model thus suggests that E pk may be associated with T el , though it does not exclude synchrotron origin for E pk ; see further discussion below.
If the optical depth at which the kinetic energy dissipation takes place is $\tau \gtrsim 100$, the resulting spectrum is close to thermal; while if $\tau \lesssim$ a few, the result is a complex spectrum, with a synchrotron peak, a thermal peak and at least two peaks resulting from IC scattering (see Figure 17). Below the thermal peak, the main contribution is from synchrotron photons, which are emitted by the electrons at the quasi-steady distribution. Above the thermal peak, multiple IC scattering is the main emission process, resulting in a nearly flat energy spectrum. Thus, this model naturally predicts different spectral slopes below and above the thermal peak.
It should be noted that the above analysis holds for a single dissipation episode. In explaining the complex GRB lightcurve, multiple such episodes (e.g., internal collisions) are expected. Thus, a variety of observed spectra, which are superpositions of the different spectra obtained by dissipation at different optical depths, are expected [Keren and Levinson (2014)]. In spite of this success, this model still suffers from two main drawbacks. The first one, already discussed, is the need to explain low energy spectral slopes that are not as hard as the Rayleigh-Jeans part of a Planck spectrum. Further, this model needs to explain the high peak energy (> MeV) observed in some bursts in a self-consistent way. The second drawback is the inability of the sub-photospheric dissipation model to explain the very high energy (GeV) emission seen. Such high energy photons must originate from some dissipation above the photosphere.
There are two solutions to these problems. The first is geometric in nature, and takes into account the non-spherical nature of GRB jets to explain how low energy spectral slopes are modified. This will be discussed below. The second is the realization that the photospheric emission must be accompanied by at least one additional dissipation process that takes place above the photosphere. This conclusion, however, is aligned both with observations of different temporal behavior of the high energy component (see section 2.2.5), and with the basic idea of multiple dissipation episodes, inherent to both the "internal collision" model and the magnetic reconnection model. Indeed, in the one case in which detailed modeling was done by considering two emission zones (the photosphere and an external one), very good fits to the data of GRB090902B were obtained [Pe'er et al. (2012)]. These fits were done with a fully physically motivated model, which enables one to determine the physical conditions at both emission zones [Pe'er et al. (2012)]. This is demonstrated in Figure 18.

Geometrical Broadening
As was already discussed in section 3.2.5, the definition of the photosphere as the last scattering surface must be modified to incorporate the fact that photons have a finite probability of being scattered at every location in space where particles exist. This led to the concept of a "vague photosphere" (see Figure 14). The observational consequences of this effect were studied by several authors [Pe'er (2008); Beloborodov (2010, 2011); Aksenov et al. (2013); Vereshchagin (2014)]. In the spherical explosion case, the effect of the vague photosphere is not large; it somewhat modifies the Rayleigh-Jeans part of the spectrum, to read $F_\nu \propto \nu^{3/2}$ [Beloborodov (2011)]. However, for a non-spherical explosion, the effect becomes dramatic.
The scenario of an angle-dependent mass ejection rate, $\dot{M} = \dot{M}(\theta)$, has also been considered. While photospheric emission from the inner parts of the jet results in a mild modification to the black body spectrum, photons emitted from the outer jet's photosphere dominate the spectrum at low energies (see Figure 19). For narrow jets ($\theta_j \Gamma_0 \lesssim$ few), this leads to a flat low energy spectrum, $dN/dE \propto E^{-1}$, which is independent of the viewing angle, and very weakly dependent on the exact jet profile. This result thus suggests that the observed low energy slopes may in fact be part of the photospheric emission, and can in addition be used to infer the jet geometry.
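To compare the low energy slopes quoted in this section on a common footing, it may help to convert each of them to a photon index $\alpha$, defined through $dN/dE \propto E^{\alpha}$. The conversion below is a bookkeeping sketch only; the symbol $\alpha$ is introduced here for the comparison, and the slopes themselves are the ones stated in the text.

```latex
\begin{align*}
N_E \equiv \frac{dN}{dE} \propto E^{\alpha}
  \;&\Longrightarrow\;
  F_\nu \propto E\,N_E \propto \nu^{\alpha+1}, \qquad
  \nu F_\nu \propto \nu^{\alpha+2}, \\[4pt]
\text{Rayleigh-Jeans:}\quad
  & F_\nu \propto \nu^{2} \;\Rightarrow\; \alpha = +1, \\
\text{spherical, vague photosphere:}\quad
  & F_\nu \propto \nu^{3/2} \;\Rightarrow\; \alpha = +1/2, \\
\text{narrow jet } (\theta_j \Gamma_0 \lesssim \text{few}):\quad
  & F_\nu \propto \nu^{0} \;\Rightarrow\; \alpha = -1 .
\end{align*}
```

Written this way, the jet geometry softens the low energy photon index by two units relative to the Rayleigh-Jeans value, whereas the vague photosphere in the spherical case changes it by only one half.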
A second aspect of the model is that the photospheric emission can be observed to be highly polarized, with up to $\approx 40\%$ polarization [Lundman et al. (2014); Chang et al. (2014)]. While IC scattering produces highly polarized light, in spherical models the polarization from different viewing angles cancels. However, this cancellation is incomplete in jet-like models (observed off-axis). While the flux seen by an observer off the jet axis (who can see highly polarized light) is reduced, it is still high enough to be detected [Lundman et al. (2014)].
A third unique aspect that results from jet geometry (rather than spherical explosion) is photon energy gain by a Fermi-like process. As photons are scattered back and forth between the jet core and the sheath, on average they gain energy. This leads to a high energy power law tail (above the thermal peak) [Ito et al. (2013)]. This again may serve as a new tool in studying jet geometry, though the importance of this effect in determining the high energy spectra of GRBs is still not fully clear [Lundman et al., in prep.].
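The reason a repeated, Fermi-like energy gain produces a power law can be seen from the generic textbook argument, sketched below. It is not the detailed calculation of the cited works: the mean energy gain factor per core-sheath crossing cycle, $A$, and the probability $P$ that a photon undergoes another cycle rather than escaping, are hypothetical parameters introduced here and are taken to be energy independent.

```latex
\begin{align*}
E_k &= E_0\,A^{k}, \qquad N_k = N_0\,P^{k}
  && (A > 1,\; 0 < P < 1), \\
k &= \frac{\ln(E_k/E_0)}{\ln A}
  \;\Longrightarrow\;
  N(>E) \propto \left(\frac{E}{E_0}\right)^{-s}, \qquad
  s = \frac{\ln(1/P)}{\ln A}, \\
\frac{dN}{dE} &\propto E^{-(s+1)} .
\end{align*}
```

In this simplified picture the slope of the tail depends only on the ratio $\ln(1/P)/\ln A$, both of which are set by the jet structure, which is why the tail could in principle carry information about the geometry.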

A few implications of the photospheric term
A great advantage of the photospheric emission is its relative simplicity. By definition, the photosphere is the innermost region from which an electromagnetic signal can reach the observer. Thus, the properties of the emission site are much more constrained, relative, e.g., to synchrotron emission (whose emission radius, magnetic field strength and particle distribution are not known).
In fact, in the framework of the "hot" fireball model, the (1-d) photospheric radius is a function of only two parameters: the luminosity (which can be measured once the distance is known) and the Lorentz factor (see Equation 17). The photospheric radius is related to the observed temperature and flux via $r_{\rm ph}/\Gamma \propto (F^{\rm ob}_{\rm bb}/\sigma T^{{\rm ob}\,4})^{1/2}$, where $\sigma$ is Stefan's constant, and the extra factor of $\Gamma^{-1}$ is due to light aberration. Since $r_{\rm ph} \propto L\Gamma^{-3}$, measurements of the temperature and flux for bursts with known redshift enable an independent measurement of $\Gamma$, the Lorentz factor at the photosphere [Pe'er et al. (2007)]. This, in turn, can be used to determine the full dynamical properties of the outflow.
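As a rough illustration of how this inversion works, the sketch below assumes the commonly quoted coasting-phase expression $r_{\rm ph} = L\sigma_T/(8\pi m_p c^3 \Gamma^3)$ (the prefactor is an assumption here and should be checked against Equation 17), and treats the quantity $r_{\rm ph}/\Gamma$ as already inferred from the thermal flux and temperature. Order-unity geometrical factors and redshift corrections are omitted, the function names are hypothetical, and the input numbers are illustrative rather than taken from any burst.

```python
import math

# Physical constants in CGS units
SIGMA_T = 6.6524587e-25   # Thomson cross section [cm^2]
M_P = 1.67262192e-24      # proton mass [g]
C = 2.99792458e10         # speed of light [cm/s]


def photospheric_radius(luminosity, gamma):
    """r_ph = L sigma_T / (8 pi m_p c^3 Gamma^3) in cm (assumed prefactor)."""
    return luminosity * SIGMA_T / (8.0 * math.pi * M_P * C**3 * gamma**3)


def lorentz_factor(luminosity, r_ph_over_gamma):
    """Invert r_ph / Gamma = L sigma_T / (8 pi m_p c^3 Gamma^4) for Gamma."""
    gamma_4 = luminosity * SIGMA_T / (
        8.0 * math.pi * M_P * C**3 * r_ph_over_gamma
    )
    return gamma_4 ** 0.25


# Round trip with illustrative values: build r_ph / Gamma from a known
# Lorentz factor, then recover that Lorentz factor from the ratio.
L_ISO = 1e52              # isotropic equivalent luminosity [erg/s]
GAMMA_TRUE = 300.0
ratio = photospheric_radius(L_ISO, GAMMA_TRUE) / GAMMA_TRUE
gamma_recovered = lorentz_factor(L_ISO, ratio)
```

Because $\Gamma$ enters as a fourth root, a factor of two uncertainty in either the luminosity or the inferred $r_{\rm ph}/\Gamma$ changes the derived Lorentz factor by only about 19%.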
One interesting result obtained by using this method is that $r_0$, the size of the jet base, is $\sim 10^{8.5}$ cm, two to three orders of magnitude above the Schwarzschild radius [Pe'er et al. (2007); Ghirlanda et al. (2013); Iyyani et al. (2013); Larsson et al. (2015)]. Interestingly, this result is aligned with recent constraints found by [Vurm et al. (2013)], who showed that full thermalization takes place only if dissipation occurs at intermediate radii, $\sim 10^{10}$ cm, where the outflow Lorentz factor is mild, $\Gamma \sim 10$. Furthermore, this radius of $\sim 10^{8.5}$ cm is a robust radius where a jet collimation shock is observed in numerical simulations [Mizuta and Ioka (2013)]. These results thus point toward a new understanding of the early phases of jet dynamics.

Fig. 19 Left. The expected (observed) spectrum from a relativistic, optically thick outflow. The resulting spectrum does not resemble the naively expected "Planck" spectrum. Separate integration of the contributions from the inner jet (where $\Gamma \approx \Gamma_0$), outer jet (where $\Gamma$ drops with angle) and envelope is shown with dashed, dot-dashed and dotted lines, respectively. Right. The assumed jet profile.
A second interesting implication is an indirect way of constraining the magnetization of the outflow. It was shown by [Daigne and Mochkovitch (2002); Zhang and Pe'er (2009)] that for similar parameters, the photospheric contribution in highly magnetized outflows is suppressed. The lack of a pronounced thermal component can therefore be used to obtain a lower limit on the magnetization parameter, $\sigma$ [Zhang and Pe'er (2009)]. Furthermore, it was recently shown [Bégué and Pe'er (2014)] that in the framework of the standard magnetic reconnection model, conditions for full thermalization do not exist in the entire region below the photosphere. As a result, the produced photons are up-scattered, and the resulting peak of the Wien distribution formed is at $\gtrsim 10$ MeV. This again leads to the conclusion that identification of a thermal component at energies of $\lesssim 100$ keV must imply that the outflow cannot be highly magnetized.

Summary and conclusion
We are currently in the middle of a very exciting epoch in the study of GRB prompt emission. Because the prompt emission is very short, random and non-repetitive, its study is notoriously difficult. The fact that no two GRBs are similar makes it extremely difficult to draw firm conclusions that are valid for all GRBs. Nonetheless, following the launch of Swift and Fermi, ample observational and theoretical efforts have been put into understanding the elusive nature of these complex events. I think that it is fair to say that we are finally close to understanding the essence of it.
In my opinion, there are two parts to the revolution that has taken place in the last few years. The first is the rise of time-dependent spectral analysis, which enables a distinction between different spectral components that show different temporal evolution. A particularly good example is the temporal behavior of the high energy (GeV) part of the spectrum, which lags behind lower energy photons. This temporal distinction enables a separate study of each component, and points towards more than a single emission zone. This distinction, in fact, is aligned with the initial assumptions of the "fireball" model, in which internal collisions (or several episodes of magnetic energy dissipation) lead to multiple emission zones.
The second part of the revolution is associated with the identification of a thermal component on top of the non-thermal spectra. For many years, the standard fitting of GRB spectra was, and still is, carried out using a mathematical function, namely the "Band" model. Being mathematical in nature, this model does not have any "preferred" physical scenario, and its results can be interpreted in more than one way. As a result, it is difficult to obtain theoretical insight using these fits. As was pointed out over 15 years ago, basic radiative models, such as synchrotron, fail to provide a valid interpretation of the obtained results. Moreover, while a great advantage of this model is its simplicity, here also lies its most severe limitation: being very simple, it is not able to account for many spectral and temporal details, which are likely crucial in understanding the underlying physics of GRBs.
It was only in recent years, with the abandoning of the "Band" model as the sole model for fitting GRB prompt emission data, that rapid progress was enabled. The introduction of a thermal emission component played a key role in this revolution. First, it provides a strongly physically motivated explanation of at least part of the spectrum. Second, the values of the parameters describing the non-thermal part of the spectra are different from the values derived without the addition of a thermal component; this makes it easier to provide a physical interpretation of the non-thermal part. Third, the observed well defined temporal behavior opened a new window into exploring the temporal evolution of the spectra. These observational realizations triggered a wealth of theoretical ideas aimed at explaining both the observed spectral and temporal behaviors.
Currently, there is still no single theoretical model that is accepted by the majority of the community. This is due to the fact that although it is clear that synchrotron emission from optically thin regions cannot account for the vast majority of GRBs, a pure thermal component is only rarely observed. Furthermore, the very high energy (GeV band) emission clearly has a non-thermal origin, and therefore even if a thermal component does play an important role, there must be additional processes contributing to the high energy part. Moreover, while thermal photons are observed in some GRBs, there are others in which there is no evidence for such a component. Thus, whatever theoretical idea may be used to explain the data, it must be able to explain the diversity observed.
At present, there are three leading suggestions for explaining the variety of the data. The first is that the variety seen is due to differences in magnetization. It is indeed a very appealing idea, if it can be proved that the variety of observed spectra depends only on a single parameter. The second type of model considers the different jet geometries, and the different observing angles relative to the jet axis. This is a novel approach, never taken before, and as such there is ample room for continuing research in this direction. The third type of model considers sub-photospheric energy dissipation as a way of broadening the "Planck" spectrum. The observed spectra in these models thus mainly depend on the details of the dissipation process, and in particular the optical depth at which it takes place.
All of these models hold great promise, as they not only enable one to identify directly the key ingredients that shape the observed spectra, but also enable one to use observations to directly infer physical properties. These include the jet dynamics, Lorentz factor, geometry ($\Gamma$ as a function of $r$, $\theta$ and maybe also $\phi$), and even the magnetization. Knowledge of these quantities thus bears directly on basic questions of great interest to astronomy, such as jet launching, composition and collimation.
Thus, to conclude, my view is that we are in the middle of the 'prompt emission revolution'. It is too early to claim that we fully understand the prompt emission; indeed, we have reached no consensus yet about many of the key properties, as is reflected by the large number of different ideas. However, we understand various key properties of the prompt emission in a completely different way than only 5-10 years ago. Thus, I believe that another 5-10 years from now there is a good chance that we could reach a conclusive idea about the nature of the prompt emission, and would be able to use it as a great tool in studying many other important issues, such as stellar evolution, gravitational waves and cosmic rays.