Frontier Fields Clusters: Chandra and JVLA View of the Pre-Merging Cluster MACS J0416.1-2403

Merging galaxy clusters leave long-lasting signatures on the baryonic and non-baryonic cluster constituents, including shock fronts, cold fronts, X-ray substructure, radio halos, and offsets between the dark matter and the gas components. Using observations from Chandra, the Jansky Very Large Array, the Giant Metrewave Radio Telescope, and the Hubble Space Telescope, we present a multiwavelength analysis of the merging Frontier Fields cluster MACS J0416.1-2403 (z=0.396), which consists of NE and SW subclusters whose cores are separated on the sky by ~250 kpc. We find that the NE subcluster has a compact core and hosts an X-ray cavity, yet it is not a cool core. Approximately 450 kpc south-southwest of the SW subcluster, we detect a density discontinuity that corresponds to a compression factor of ~1.5. The discontinuity was most likely caused by the interaction of the SW subcluster with a less massive structure detected in the lensing maps SW of the subcluster's center. For both the NE and the SW subclusters, the dark matter and the gas components are well-aligned, suggesting that MACS J0416.1-2403 is a pre-merging system. The cluster also hosts a radio halo, which is unusual for a pre-merging system. The halo has a 1.4 GHz power of (1.06 +/- 0.09) x 10^{24} W Hz^{-1}, which is somewhat lower than expected based on the X-ray luminosity of the cluster. We suggest that we are either witnessing the birth of a radio halo, or have discovered a rare ultra-steep spectrum halo.


INTRODUCTION
Galaxy clusters grow by merging with other clusters and by accreting smaller mass structures from the intergalactic medium. Signs of these interactions are imprinted in the intracluster medium (ICM) and detected in X-ray observations as cold fronts, shock fronts, turbulence, and ram pressure-stripped gas (e.g., Markevitch & Vikhlinin 2007; Randall et al. 2008a; Zhuravleva et al. 2015). Other footprints of cluster interactions can be seen in the radio band as halos and relics (e.g., Feretti et al. 2012). If the merger is not in the plane of the sky, mergers can also be detected in optical observations based on multiple peaks in the radial velocity distribution and in the spatial galaxy distribution. Furthermore, comparison of X-ray and optical/lensing data also reveals signatures of merging events, most notably as offsets between the dark matter (DM) and the gas components of the colliding clusters (e.g., Markevitch et al. 2004; Clowe et al. 2006; Randall et al. 2008b; Merten et al. 2011; Dawson et al. 2012).
Radio halos, which have been detected in some mergers, are diffuse synchrotron-emitting sources with low surface brightness, steep spectral indices (α < −1, F_ν ∝ ν^α), and typical sizes of ∼ 1 Mpc. Because of their steep radio spectra, low-frequency radio observations play an important role in characterizing their properties. Two main models have been proposed to explain the origin of such halos:
• Reacceleration by turbulence: Large-scale turbulence generated during the merger event supplies the energy required to reaccelerate fossil cosmic rays (CRs) back to relativistic energies, at which time they become synchrotron-bright (e.g., Brunetti et al. 2001; Petrosian 2001).
• Acceleration by hadronic collisions: Inelastic collisions between CR protons trapped in the gravitational potential of a cluster (e.g., originating from supernova explosions, active galactic nuclei [AGN] outbursts, previous merger events, etc.) and thermal ICM protons give rise to a secondary population of CR electrons, which consequently are visible at radio frequencies (e.g., Blasi & Colafrancesco 1999; Dolag & Enßlin 2000).
However, hybrid models have also been postulated, in which radio halos may be produced by turbulent reacceleration of secondary particles resulting from hadronic proton-proton collisions (Brunetti & Blasi 2005;Brunetti & Lazarian 2011). The radio halos predicted by hybrid models are expected to be found in more relaxed clusters and to be underluminous for the masses of the hosting clusters (see also Brunetti & Jones 2014, for a review).
Hadronic radio halo models predict that magnetic fields should either be different in clusters with and without halos, or that gamma-ray emission should be detected in clusters hosting halos (e.g., Jeltema & Profumo 2011). However, there is no evidence that magnetic fields are stronger in clusters with radio halos (e.g., Bonafede 2010), and there has been no conclusive gamma-ray detection with the Fermi telescope (Ackermann et al. 2010). Therefore, current observational evidence disfavors hadronic models.
A textbook example of a merging cluster with a bright radio halo is the famous Bullet cluster, 1E 0657-56 (Elvis et al. 1992; Liang et al. 2000a; Markevitch et al. 2002; Shimwell et al. 2014). Chandra has revealed that this system has a bullet-like gas cloud moving through the disturbed ICM of the main cluster (Markevitch et al. 2002). The gas cloud is the surviving core of a subcluster whose outer layers were ram pressure-stripped in the collision. Immediately ahead of the core there is a density discontinuity associated with a cold front, while further out in the same direction there is a density discontinuity associated with a shock front (Markevitch et al. 2002; Owers et al. 2009). Recently, another shock front has been found on the opposite side of the Bullet cluster, behind the bullet-like gas cloud (Shimwell et al. 2015). Chandra observations of the Bullet cluster were followed up with optical/lensing data acquired, most notably, with the Very Large Telescope (VLT) and the Hubble Space Telescope (HST). Chung et al. (2010) have shown that the redshift distribution of the Bullet cluster is bi-modal, with two redshift peaks at z ∼ 0.21 and z ∼ 0.35, which imply a velocity difference ∼ 3000 km s^{-1} between the two merging subclusters. The gravitational lensing analysis has shown that the Bullet cluster is a dissociative merger, in which the essentially collisionless DM (σ_DM < 0.7 g cm^{-2}, Randall et al. 2008b) has decoupled from the collisional ICM (Markevitch et al. 2004; Randall et al. 2008b). Lage & Farrar (2014) have combined the X-ray and lensing data with numerical simulations to constrain the merger scenario of the Bullet cluster.
Here, we present results from Chandra, Jansky Very Large Array (JVLA), and Giant Metrewave Radio Telescope (GMRT) observations of another merging galaxy cluster: the HST Frontier Fields cluster MACS J0416.1-2403 (z=0.396; Ebeling et al. 2001; Mann & Ebeling 2012). Optical and lensing studies of this cluster have been presented recently by, e.g., Zitrin et al. (2013); Schirmer et al. (2014); Jauzac et al. (2014); Zitrin et al. (2015); Grillo et al. (2015). Most recently, Jauzac et al. (2015) mapped the DM distribution of this merging system using strong- and weak-lensing data. Their analysis revealed two mass concentrations associated with the two main subclusters involved in the merger, plus two smaller X-ray-dark mass structures NE and SW of the cluster center. Jauzac et al. (2015) combined their lensing data with archival Chandra observations to study the offsets between the DM and the gas components. They report good DM-gas alignment for the NE subcluster, but a significant offset for the SW one. Based on the lensing and X-ray results, Jauzac et al. (2015) proposed two possible scenarios for the merger event in MACS J0416.1-2403, one pre-merging and one post-merging, but were unable to distinguish between the two.
The analysis presented herein combines significantly deeper Chandra observations with recently acquired JVLA and GMRT data and with the optical/lensing results reported by Jauzac et al. (2015) to improve our understanding of the merger event in MACS J0416.1-2403. The Chandra, JVLA, and GMRT observations and data reduction procedures are described in Section 2. In Section 3 we discuss the background analysis of the Chandra data, and in Sections 4, 5, 6, and 7 we present the X-ray results. The radio results are presented in Section 8. In Section 9 we use the deeper Chandra data to revise the DM-gas offsets reported by Jauzac et al. (2015). In Section 10 we discuss the implications of our findings for the merger scenario of MACS J0416.1-2403. In Section 11 we examine the origin of the radio halo. A summary of our results is provided in Section 12.
Throughout the paper we assume a ΛCDM cosmology with H_0 = 70 km s^{-1} Mpc^{-1}, Ω_M = 0.3, and Ω_Λ = 0.7. At the redshift of the cluster, 1′ corresponds to approximately 320 kpc. Unless stated otherwise, the errors are quoted at the 90% confidence level.
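The quoted angular scale can be checked directly from the adopted cosmology. The following is a minimal sketch, assuming a flat ΛCDM model with no radiation term and a simple Simpson integration; the function names are illustrative and are not part of the analysis pipeline.

```python
import math

# Cosmological parameters adopted in the text.
H0 = 70.0            # km/s/Mpc
OMEGA_M = 0.3
OMEGA_L = 0.7
C_KM_S = 299792.458  # speed of light in km/s


def inverse_e(z):
    """1/E(z) for a flat LCDM model (radiation neglected: an assumption)."""
    return 1.0 / math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def angular_diameter_distance_mpc(z, steps=1000):
    """Angular diameter distance in Mpc via Simpson's rule (steps must be even)."""
    h = z / steps
    total = inverse_e(0.0) + inverse_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inverse_e(i * h)
    comoving = (C_KM_S / H0) * total * h / 3.0
    return comoving / (1.0 + z)


def kpc_per_arcmin(z):
    """Projected physical size, in kpc, subtended by one arcminute at redshift z."""
    arcmin_in_rad = math.radians(1.0 / 60.0)
    return angular_diameter_distance_mpc(z) * 1000.0 * arcmin_in_rad


if __name__ == "__main__":
    # For z = 0.396 this evaluates to about 320 kpc per arcminute.
    print(f"{kpc_per_arcmin(0.396):.1f} kpc per arcmin")
```

The same function gives the arcsecond scale after division by 60, which is convenient when converting the radio beam sizes quoted later.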

Chandra
The HST Frontier Fields cluster MACS J0416.1-2403 was observed with Chandra for 324 ks between June 2009 and December 2014. A summary of the observations is given in Table 1.
The data were reduced using CIAO v4.7 with the calibration files in CALDB v4.6.5. The particle background level of the VFAINT observations was lowered by filtering out cosmic ray events associated with significant flux in the 16 border pixels of the 5 × 5 event islands. Soft proton flares were screened out from all observations using the deflare routine, which is based on the lc_clean code written by M. Markevitch. The flare screening was done using an off-cluster region at the edges of the field of view (FOV), which was selected to include only pixels located at distances greater than 2.5 Mpc from the cluster center and to exclude any point sources detectable by eye. The exposure times remaining after this cleaning are listed in Table 1.
Inspection of an ObsID 10446 spectrum extracted from an off-cluster region showed a significant high-energy tail, which indicates that this observation is contaminated by flares that were not detected by the filtering routine. Given the relatively short exposure time of this observation, we decided to exclude ObsID 10446 from our analysis rather than attempt to model the flare component.
We reprojected the other five observations to a common reference frame, and created merged images in the energy bands 0.5 − 2, 0.5 − 3, 0.5 − 4, 0.5 − 7 keV, and 2 − 7 keV. The exposure-corrected, vignetting-corrected 0.5 − 4 keV mosaic image is shown in Figure 1. To detect point sources, we also created merged exposure-map-weighted PSF images (ECF = 90%) in each of the five energy bands. Point sources were detected in the individual bands with the CIAO task wavdetect, using the merged maps, wavelet scales of 1, 2, 4, 8, 16, and 32 pixels, a sigma threshold of 5 × 10^{-7}, and ellipses with 5σ axes. A few additional point sources that were missed by wavdetect were selected by eye and excluded from the data. All point sources detected by wavdetect were excluded from the data very conservatively, using the elliptical region that covered the largest area among the five elliptical regions identified for the different energy bands.
JVLA
JVLA observations of MACS J0416.1-2403 were obtained in the L-band in the BnA, CnB, and DnC array configurations. The observations were recorded with the default wide-band L-band setup, giving 16 spectral windows each having 64 channels, covering the entire 1-2 GHz band. An overview of the observations is given in Table 2.
The data were calibrated with the Common Astronomy Software Applications (CASA) package version 4.2.1. As a first step we flagged data affected by radio frequency interference (RFI) using AOFlagger (Offringa et al. 2010), after correcting for the bandpass shape. Data affected by other problems and antenna shadowing were flagged as well. After flagging, we applied the predetermined elevation dependent gain tables and antenna offset positions. The data were Hanning smoothed.
We determined initial gain solutions using 10 channels at the center of the spectral windows for the primary calibrators 3C147 and 3C138. We then recalibrated the bandpass and obtained delay terms (gaintype='K') using the unpolarized calibrator 3C147. A next step consisted of calibration of the cross-band delays (gaintype='KCROSS') using the polarized calibrator 3C138. We then solved again for the gains on the primary calibrators but now using all channels. The channel dependent leakage corrections were found using 3C147 and the polarization angles were set using 3C138. The gains were again re-determined for all calibrator sources which included the phase calibrator J0416-1851. Finally, the flux density-scale was bootstrapped from the primary calibrators to J0416-1851 and the calibration solutions were applied to the target field.
To refine the calibration for the target field, we performed three rounds of phase-only self-calibration and two final rounds of amplitude and phase self-calibration. W-projection (Cornwell et al. 2005) was employed during the imaging, taking the non-coplanar nature of the array into account. The self-calibration was independently performed for the three different datasets from the different array configurations. The full bandwidth was imaged using MS-MFS clean (nterms=2; Rau & Cornwell 2011). Briggs (1995) weighting with a robust factor of -0.75 was used for self-calibration. Clean masks were used, which were made with the PyBDSM source detection package.
After the self-calibration, the three datasets were combined. One final round of self-calibration was carried out on the combined dataset. The final images were corrected for the primary beam attenuation.
In addition, we imaged the dataset using an inner uv-range cut of 4.3 kλ (corresponding to a scale of about 1′, or 320 kpc). This model was then subtracted from the visibility data to allow a search for diffuse emission in the cluster. We imaged the dataset with the emission from compact sources subtracted using a Gaussian uv-taper of 20″ and employing multi-scale clean (Cornwell 2008). An overview of the resulting image properties, such as root-mean-square (rms) noise and resolution, is given in Table 3.
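The correspondence between the inner uv-range cut and the quoted angular scale can be illustrated with the order-of-magnitude relation θ ≈ 1/u, where u is the baseline length in wavelengths. The sketch below assumes that convention (other conventions differ by factors of order unity), and the function name is illustrative.

```python
import math


def uv_cut_to_arcsec(uv_min_klambda):
    """Approximate largest angular scale (arcsec) retained by an inner uv cut.

    Uses theta ~ 1/u with u in wavelengths; this convention is an assumption
    and is only accurate to a factor of order unity.
    """
    theta_rad = 1.0 / (uv_min_klambda * 1.0e3)
    return math.degrees(theta_rad) * 3600.0


if __name__ == "__main__":
    # A 4.3 klambda cut corresponds to roughly 48 arcsec, i.e. of order 1 arcmin.
    print(f"{uv_cut_to_arcsec(4.3):.1f} arcsec")
```

Emission on scales larger than this is filtered out of the compact-source model, so it survives the model subtraction and can be recovered in the tapered image.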

GMRT
GMRT 610 MHz data for MACS J0416.1-2403 were collected on December 3, 2013 using 26 antennas. The on-source time was 5.0 hr, and a total bandwidth of 33.3 MHz (RR correlations only), split into 512 channels, was recorded. NRAO's Astronomical Image Processing System (AIPS) was used to carry out the initial calibration of the visibility dataset. The primary calibrator 3C 147 was used to set the flux density scale and derive the bandpass solutions for all the antennas. The source J0409-179 was observed in 6 min scans at intervals of ∼ 25 min and used as the secondary phase and gain calibrator. About 20% of the data, mostly from short baselines, were affected by RFI and subsequently flagged. Gain solutions were obtained for the calibrator sources and, together with the bandpass solutions, applied to the target field. In total, 480 of the 512 channels were used; the rest of the channels were discarded as they were too noisy due to the bandpass roll-off. The 480 channels were averaged down to 48 channels to reduce the size of the data.
The visibility data were then imported into CASA to refine the calibration via the process of self-calibration. Three rounds of phase-only self-calibration and two final rounds of amplitude and phase self-calibration were applied. The phase-only self-calibration was carried out on a 30 s timescale. The amplitude and phase self-calibration was carried out on a 2 min timescale, pre-applying the phase-only solutions before solving for the amplitude and phases on the longer 2 min timescale. We employed W-projection during the imaging and used nterms=2.
A map with Briggs (robust=0.5) weighting of the field was produced, which resulted in a resolution of 7.6″ × 4.0″ and an rms noise level of 54 µJy beam^{-1} (Table 3). Similarly to the JVLA L-band data reduction, we imaged the dataset using an inner uv-range cut of 4.3 kλ to obtain a model of the compact sources in the field. After subtracting this model from the uv-data, we re-imaged with uniform weighting and a Gaussian uv-taper to obtain a beam size of 20″.

VLITE
A new commensal observing system called the VLA Low Band Ionospheric and Transient Experiment (VLITE; http://vlite.nrao.edu/) has been developed for the NRAO JVLA (Clarke et al., in prep.) and was operational during the observations of MACS J0416.1-2403. The VLITE correlator is a custom designed DiFX software correlator (Deller et al. 2011). The system processes 64 MHz of bandwidth centered on 352 MHz with 2 s temporal resolution and 100 kHz spectral resolution. VLITE operates during nearly all pointed JVLA observations with primary science goals at frequencies above 1 GHz, providing data simultaneously for 10 JVLA antennas using the low band receiver system (Clarke et al. 2011).
We processed the CnB configuration VLITE observation of MACS J0416.1-2403 from 12 January 2015 using a combination of the Obit software package (Cotton 2008) and AIPS (van Moorsel et al. 1996). VLITE data at frequencies ν > 360 MHz were removed due to the presence of strong RFI from a satellite downlink that is present during most operations. The data at ν < 360 MHz were flagged using the AIPS program RFLAG to remove the majority of the remaining RFI. The bandpass was flattened using observations of several calibrators taken near the time of the observations (3C48, 3C138, 3C147, and 3C286). These calibrator sources were taken in the same primary observing band as MACS J0416.1-2403. The delays were determined using these same calibrators. The delay-corrected data were then flux calibrated using these same calibrator sources. No phase calibrator is required for low band observations due to the large field of view (FWHM ∼ 2.3° at 320 MHz) and typical presence of sufficiently strong sources in the field that allow for self-calibration.
The initial imaging of the target field revealed additional RFI that was identified in the residual data set after all compact sources had been subtracted from the uv data. These baselines were excised from the full data set and we further refined the calibration through four rounds of phase-only self-calibration. Non-coplanar effects were taken into account in the imaging steps using small facets to make a fly-eye image of the full field out to the FWHM with additional facets placed on bright sources out to 20° from the field center. Data were imaged using small clean masks placed around all sources. The final VLITE image of MACS J0416.1-2403 has a resolution of 45.7″ × 31.1″ and an rms noise of 2.1 mJy beam^{-1}. The image was made using a Briggs robust parameter of 0 (see Table 3).
Figure caption (fragment): ... Table 1, while the top image zooms in on the brightest X-ray regions of the cluster, in a region of size 2.9 × 2.9 Mpc^2. The dashed grey circle marks the boundary of the R_{500} region of the cluster.

X-RAY FOREGROUND & BACKGROUND MODELLING
We analyzed the Chandra data using the Group F stowed background event files, which are appropriate for observations taken after September 21, 2009. The particle background was subtracted from the data, while the foreground and background sky components were modeled from off-cluster regions. For each of the cluster ObsIDs, we created corresponding stowed background event files, applied the VFAINT cleaning to those associated with cluster observations taken in this observing mode, and renormalized the stowed background observations such that the count rates in the energy band 10 − 12 keV matched those of the corresponding observation in the same energy band. We modeled the sky components as the sum of unabsorbed thermal emission from the Local Hot Bubble (LHB), thermal emission from the Galactic Halo (GH), and a power-law background component. The redshifts of the foreground components were set to 0, while the abundances were set to solar values assuming the abundance table of Anders & Grevesse (1989). The hydrogen column density was fixed to 2.89 × 10^{20} atoms cm^{-2} (Kalberla et al. 2005). The Chandra spectra were fit in the energy band 0.5 − 7 keV using Xspec v12.8.2; the lower limit was chosen to avoid calibration uncertainties at low energies, and the upper limit to increase the signal-to-noise ratio (SNR).
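The renormalization of the stowed background amounts to a single scale factor per observation. The following is a minimal sketch of that step, assuming the factor is the ratio of the 10 − 12 keV count rates; the function name and the numbers in the usage example are hypothetical and are not taken from the data.

```python
def stowed_scale_factor(obs_counts, obs_exposure_s, bkg_counts, bkg_exposure_s):
    """Factor by which the stowed background is rescaled so that its
    10-12 keV count rate matches that of the science observation.

    All inputs refer to the 10-12 keV band, where the signal is assumed
    to be dominated by the particle background.
    """
    if obs_exposure_s <= 0 or bkg_exposure_s <= 0 or bkg_counts <= 0:
        raise ValueError("exposures and background counts must be positive")
    obs_rate = obs_counts / obs_exposure_s
    bkg_rate = bkg_counts / bkg_exposure_s
    return obs_rate / bkg_rate


if __name__ == "__main__":
    # Hypothetical counts and exposures, for illustration only.
    factor = stowed_scale_factor(1320, 40_000.0, 9000, 300_000.0)
    print(f"scale factor = {factor:.3f}")
```

The rescaled stowed spectrum is then subtracted from the source spectrum, leaving only the sky components to be modeled.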
When fitting the Chandra spectra in the energy band 0.5 − 7 keV, the ∼ 0.1 keV LHB component cannot be constrained. Instead, we constrain this component using a ROentgen SATellite (ROSAT) All-Sky Survey (RASS) spectrum corresponding to an annulus with radii 0.15 and 1.0 degrees around the cluster. The ROSAT spectrum was fit in the energy band 0.1 − 2.4 keV, with the normalization of the power-law background component fixed to 8.85 × 10^{-7} photons keV^{-1} cm^{-2} s^{-1} arcmin^{-2} (Moretti et al. 2003). The temperatures and normalizations of the LHB and GH components were free in the fit. The results of the fit to the ROSAT spectrum are presented in Table 4.
Based on the ROSAT results, the parameters of the LHB components were fixed for the Chandra spectra. The five Chandra spectra were then fitted in parallel, assuming that they are described by the same foreground model. The power-law normalizations, on the other hand, were left to vary independently, in order to account for the varying exposure time across the merged observation that was used for point source detection. The energy sub-band 0.5 − 0.8 keV was ignored for ObsID 16237 due to negative spectral residuals caused by a change in the spectral shape of the background component that is not filtered by the VFAINT cleaning, which occurred between 2009 (the date of the Group F background files) and 2013 (the date of the observation; A. Vikhlinin, priv. comm.). The subtraction of the instrumental background resulted in small line residuals near the spectral positions of the Al Kα (E ≈ 1.49 keV), Si Kα (E ≈ 1.75 keV), and Au Mα, β (E ≈ 2.1 keV) fluorescent instrumental lines. Therefore, we excluded from the spectra very narrow bands (∆E = 0.10 keV) surrounding these lines.
The best-fitting foreground and background parameters are summarized in Table 4. The spectra were grouped to have at least 1 count per bin, and the fit was done using the extended C-statistic^{11} (Cash 1979).
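For readers unfamiliar with Poisson fit statistics, the basic form of the C-statistic for binned counts can be sketched as follows. This is the background-free form, C = 2 Σ [m − d + d ln(d/m)], and is an illustrative assumption; the extended statistic used in the fits also accounts for the background spectrum and is not reproduced here.

```python
import math


def cash_statistic(observed, model):
    """Basic C-statistic for Poisson-distributed counts.

    observed: sequence of integer counts per bin (d_i)
    model:    sequence of predicted counts per bin (m_i), all > 0
    Bins with d_i = 0 contribute only the m_i term.
    """
    total = 0.0
    for d, m in zip(observed, model):
        if m <= 0:
            raise ValueError("model counts must be positive")
        total += m - d
        if d > 0:
            total += d * math.log(d / m)
    return 2.0 * total


if __name__ == "__main__":
    # A model identical to the data gives C = 0 in the non-empty bins;
    # the empty bin with a predicted count of 1 contributes 2 * 1 = 2.
    print(cash_statistic([3, 0, 5], [3.0, 1.0, 5.0]))
```

Because the statistic remains well defined for bins with very few counts, grouping the spectra to a single count per bin is sufficient.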

GLOBAL X-RAY PROPERTIES
MACS J0416.1-2403 has been previously identified as a merging galaxy cluster at z = 0.396 (Mann & Ebeling 2012). The brightness of the NE subcluster core is significantly more peaked than that of the SW subcluster, which instead has a rather flat brightness distribution (Figure 1).
We determined the average cluster temperature by extracting spectra from a circular region of radius 1.2 Mpc (approximately R_{500}, Sayers et al. 2013) centered at α = 04h16m08.8s and δ = −24°04′14.0″ (J2000). The spectra extracted from the five Chandra ObsIDs were fit in parallel, assuming the cluster emission is described by a single-temperature thermal component^{12}. For this global fit, the sky background parameters were fixed to the values listed in Table 4, and the redshift was fixed to 0.396. We found T = 10.06^{+0.50}_{-0.49} keV and L_{X, 0.1−2.4 keV} = (9.14 ± 0.10) × 10^{44} erg s^{-1}. The average cluster parameters are shown in Table 5.

Mapping of the ICM Temperature
We mapped the cluster properties using contbin (Sanders 2006).^{13} The merged Chandra image was adaptively binned in regions of ∼ 3600 source plus sky background counts in the energy band 0.5 − 7 keV. The bins follow the surface brightness contours of the cluster, thus minimizing possible gas mixing in bins located near density discontinuities. Instrumental background and total spectra were extracted from regions corresponding to each bin, and grouped to have at least one count per bin. The net spectra were modelled as the sum of sky background components plus a single temperature absorbed thermal model (phabs×APEC) with free temperature and normalization. The metallicity was fixed to Z = 0.24 Z_⊙ (see Section 4). The sky background emission was fixed to the model summarized in Table 4. The spectra extracted from the five Chandra datasets were fitted in parallel, with the temperatures and normalizations linked between the different ObsIDs. The fits were done using the extended C-statistic. The resulting temperature map is shown in Figure 2. The temperature is high throughout the cluster, which causes the uncertainties on the measurements to be high. The temperature map does not show significant structure. Within the uncertainties, the temperatures out to ∼ 400 kpc from the cluster center are consistent with ∼ 10 keV.

11 https://heasarc.gsfc.nasa.gov/xanadu/xspec/manual/XSappendixStatistics.html
12 We tried adding an additional thermal component to fit the cluster emission, but the parameters of the second component could not be constrained.
13 We also created temperature maps using the codes described by Churazov et al. (2003) and Randall et al. (2008a), and obtained consistent results at the 90% confidence level.

Table 4
Foreground and background spectral models
a Temperature, in units of keV. b Normalization, in units of cm^{-5} arcmin^{-2} for the thermal components, and in units of photons keV^{-1} cm^{-2} s^{-1} arcmin^{-2} for the non-thermal components. c Fixed parameter.

Table 5
Cluster properties in R_{500}

Mapping of the ICM Pressure and Entropy
The electron number density can be derived either from the model normalization of the fitted spectrum, or from the surface brightness in the regions of interest. Extracting the number density from either of these quantities requires assumptions about the cluster geometry. Here, we estimated the electron number density from the surface brightness; we note that our results are unchanged when deriving the density from the spectral normalization.
The surface brightness, ζ, is essentially proportional to the emission measure,

$$ \zeta \propto \int n_e^2 \, dl, $$

where the integration is done along the line of sight and n_e is the electron number density. More accurately, the surface brightness is also temperature-dependent. This dependence introduces an uncertainty of 10% in the pressure and entropy maps, which is less than the uncertainties on the temperatures and does not affect our results.
We approximated the density as ζ^{1/2}, and defined the pseudo-pressure and pseudo-entropy as:

$$ P = T \zeta^{1/2}, \qquad K = T \zeta^{-1/3}. $$

The pseudo-pressure and pseudo-entropy maps are included in Figure 2. Strong jumps in pressure and entropy would indicate the presence of strong density discontinuities in the ICM. However, neither the pressure map nor the entropy map of MACS J0416.1-2403 shows evidence of such strong jumps between the regions. We note that while a pressure and entropy discontinuity can be seen between regions #3 and #0, region #0 is too large for this result to truly indicate that there is a density discontinuity at the boundary of the two regions.
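The construction of the pseudo-pressure and pseudo-entropy can be sketched in a few lines. The sketch assumes the usual scalings P ∝ nT and K ∝ T n^{-2/3} with the density approximated by the square root of the surface brightness; the function names and the input values are illustrative, and the outputs are in arbitrary units.

```python
def pseudo_density(surface_brightness):
    """Density proxy: square root of the surface brightness (arbitrary units)."""
    if surface_brightness <= 0:
        raise ValueError("surface brightness must be positive")
    return surface_brightness ** 0.5


def pseudo_pressure(temperature_kev, surface_brightness):
    """Pseudo-pressure, T * n, with n approximated by SB^(1/2)."""
    return temperature_kev * pseudo_density(surface_brightness)


def pseudo_entropy(temperature_kev, surface_brightness):
    """Pseudo-entropy, T * n^(-2/3), i.e. T * SB^(-1/3)."""
    return temperature_kev * pseudo_density(surface_brightness) ** (-2.0 / 3.0)


if __name__ == "__main__":
    # Hypothetical bin: T = 10 keV, surface brightness = 4 (arbitrary units).
    print(pseudo_pressure(10.0, 4.0), pseudo_entropy(10.0, 4.0))
```

Only ratios between bins are meaningful, since the proportionality constants depend on the unknown line-of-sight geometry.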
In Table 6 we list the best-fitting temperatures, spectral normalizations, and 0.5 − 2 keV count rates for the regions in Figure 2. The bin numbers necessary to relate Table 6 to Figure 2 are shown in Figure 3. Table 6 is available online at http://hea-www.cfa.harvard.edu/~gogrean/interactive/MACSJ0416_Tmap.html.

PROPERTIES OF THE INDIVIDUAL SUBCLUSTERS
The merging history of the individual subclusters is closely related to their degree of disturbance. In the following two sections, we perform the imaging analysis of the NE and SW subclusters in order to characterize their merging states. For both subclusters, we created surface brightness profiles in sectors centered on the respective X-ray peaks, and attempted to model the profiles with isothermal β-models (Cavaliere & Fusco-Femiano 1976, 1978):

$$ S(r) = S_0 \left[ 1 + \left( \frac{r}{r_c} \right)^2 \right]^{-3\beta + 1/2}, $$

where S_0 is the central surface brightness, r_c is the core radius, and r is the radius from the cluster centre.
Simple β-models provide good descriptions of the X-ray profiles of galaxy clusters that do not have strong ICM temperature gradients (e.g., Ettori 2000b,a). Based on Figure 2, there are indeed no strong temperature gradients in the NE and the SW subclusters of MACS J0416.1-2403.

Figure caption (fragment): ... Figure 2 with the fit values in Table 6. Numbers are region numbers. Black ellipses mark the regions where point sources have been removed. Right: Same surface brightness map as in Figure 1, with overlaid regions used for mapping the physical properties of the ICM.

Table 6
Best-fitting temperatures, normalizations, and 0.5 − 2 keV count rates for the regions in Figure 3.
Before fitting, all the profiles were binned to a minimum of 1 count/bin. The fits used Cash statistics. Fitting was done using a modified version of the PROFFIT package^{14} (Eckert et al. 2011). All the surface brightness profiles presented in the following subsections are in the energy band 0.5 − 4 keV. The profiles were instrumental background-subtracted, and exposure- and vignetting-corrected.
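The profile models used in the following subsections can be written compactly. The sketch below assumes the standard β-model form, S(r) = S_0 [1 + (r/r_c)^2]^{-3β + 1/2}, plus a constant sky background, and takes the double β-model to be the sum of two such components with independent parameters. That parametrization and all names are assumptions for illustration, and this is not the PROFFIT implementation.

```python
def beta_model(r, s0, r_c, beta):
    """Isothermal beta-model surface brightness at projected radius r."""
    return s0 * (1.0 + (r / r_c) ** 2) ** (-3.0 * beta + 0.5)


def double_beta_model(r, s0_1, r_c1, beta_1, s0_2, r_c2, beta_2, background=0.0):
    """Sum of two beta-model components plus a constant sky background.

    Treating the two components as fully independent is an assumption;
    some implementations tie the two beta values together.
    """
    return (beta_model(r, s0_1, r_c1, beta_1)
            + beta_model(r, s0_2, r_c2, beta_2)
            + background)


if __name__ == "__main__":
    # Hypothetical parameters: a compact core on top of an extended component.
    for radius in (0.0, 0.1, 0.5, 1.0, 3.0):
        print(radius, double_beta_model(radius, 5.0, 0.05, 0.6, 1.0, 0.8, 0.7, 0.01))
```

A compact first component raises the model only at small radii, which is the behaviour needed when a single β-model underestimates the central part of a profile.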

NE Subcluster
The sector used to create the surface brightness profile of the NE subcluster and the fits to this profile are shown in Figure 4. The sky background surface brightness level was determined by fitting a constant to the outer bins (radii 5.0′ − 10.7′) of the profile. The sky background level was kept fixed in the following fits. The very central part of the profile is significantly underestimated by the best-fitting β-model. Instead, the profile is described better by a double β-model, i.e., the sum of two β-model components:

$$ S(r) = S_{0,1} \left[ 1 + \left( \frac{r}{r_{c,1}} \right)^2 \right]^{-3\beta_1 + 1/2} + S_{0,2} \left[ 1 + \left( \frac{r}{r_{c,2}} \right)^2 \right]^{-3\beta_2 + 1/2}. $$

Double β-models provide good representations of cooling-core clusters (e.g., Jones & Forman 1984; Ota et al. 2013). However, the core of the NE subcluster in MACS J0416.1-2403 is far from cool, having a temperature > 10 keV based on the temperature map in Figure 2. To constrain a possible cooler component, we examined the possibility that the gas in bin #4 (see Figure 3) has two phases: one corresponding to a "hidden" cool core, and another corresponding to the hot plasma that increases the average core temperature to > 10 keV. We modelled the spectrum of bin #4 with a two-temperature APEC model. The normalizations of the two components were free in the fit, as was one of the temperatures; the temperature of the second, cooler thermal component was fixed to 5 keV. The fit sets an upper limit of 2.82 × 10^{-4} cm^{-5} arcmin^{-2} on the normalization of the cooler component (corresponding to a luminosity over 5 times lower than that of the NE subcluster), and provides no improvement in the statistics of the fit. A cool component with a temperature < 5 keV would need to be even fainter. Cooler gas in the NE core could be masked by inverse Compton (IC) emission from the active galactic nucleus (AGN) hosted by the NE subcluster brightest cluster galaxy (BCG). To examine this possibility, we measured the temperature in an annulus with radii 32 and 64 kpc around the AGN in the NE BCG. The best-fitting temperature in this annulus is very high, 15.65^{+5.03}_{-3.22} keV. In a smaller circle with a radius of 35 kpc around the NE core, the best-fitting temperature is 10.54^{+3.00}_{-2.11} keV, consistent with the temperature calculated in the annulus around the core. Therefore, we find no evidence that the NE core is cool.
14 Available upon request.
From the best-fitting double β-model, we calculated the central density of the NE core to be (1.4 ± 0.3) × 10^{-2} cm^{-3}. The density and temperature of the NE core hence imply a cooling time of 3.5^{+1.0}_{-0.9} Gyr, which further supports our conclusion that the NE subcluster cannot be classified as a cool core cluster based on currently available X-ray data.
To investigate whether the double β-model shape of the NE profile is caused by substructure in a particular direction, we divided the NE sector shown in Figure 4 into three subsectors, and modeled the profile of each subsector with β- and double β-models. The fits are shown in Figure 5. For each of the subsectors, a double β-model describes the profile better than a single β-model at a confidence level > 99.99%.

SW Subcluster
The surface brightness profile of the SW subcluster and the sector used in the extraction of this profile are shown in Figure 6. As for the NE profiles, the sky background level was modelled by fitting a constant to the outer bins of the profile, in the range r = 4.5′ − 10.7′. Unlike the NE profiles, the SW profile is modelled well by a simple β-model. The fit is shown in Figure 6.
A weak edge in the profile can be seen at r ∼ 1.5′. We divided the profile into three subsectors with equal opening angles (60°) to check whether the edge is seen along all three directions, and found that it is present only in the central subsector (position angles 280°−340°, measured from the W in a counterclockwise direction). To describe it, we fitted a projected broken power-law elliptical density model to the surface brightness profile between r = 0.5′−4.0′. In this model, n is the electron number density, C is the density compression, and r_d is the radius of the density jump. The fit is shown in Figure 7. If the density discontinuity is a shock front, then its magnitude corresponds to a shock with Mach number M = 1.40^{+0.14}_{-0.12}. Unfortunately, the count statistics are too poor to allow us to distinguish between a cold front and a shock front based on the temperature jump, and thus we cannot determine the nature of the surface brightness edge.
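The relation between a density jump and a shock Mach number can be sketched as follows. The broken power-law form below (with normalization n0 and slopes alpha and beta on either side of r_d) is an assumed, commonly used parametrization, and the Mach number uses the standard Rankine-Hugoniot density jump condition for a monatomic gas (γ = 5/3). Projection along the line of sight and ellipticity, both part of the actual fit, are omitted; all names and example values are hypothetical.

```python
import math


def broken_power_law_density(r, n0, compression, r_d, alpha, beta):
    """Electron density for a broken power law with a jump at r_d (r > 0).

    Inside the jump the density is boosted by the compression factor C;
    alpha and beta are the assumed power-law slopes inside and outside.
    """
    if r <= r_d:
        return compression * n0 * (r / r_d) ** (-alpha)
    return n0 * (r / r_d) ** (-beta)


def mach_from_compression(compression, gamma=5.0 / 3.0):
    """Mach number from the Rankine-Hugoniot density jump condition."""
    denom = (gamma + 1.0) - compression * (gamma - 1.0)
    if compression <= 1.0 or denom <= 0.0:
        raise ValueError("compression must lie between 1 and (gamma+1)/(gamma-1)")
    return math.sqrt(2.0 * compression / denom)


if __name__ == "__main__":
    for c in (1.3, 1.5, 1.6):
        print(f"C = {c:.2f} -> M = {mach_from_compression(c):.2f}")
```

Under these assumptions, a compression of about 1.5 to 1.6 maps to a Mach number of roughly 1.3 to 1.4, i.e. a weak shock if the edge is indeed a shock front.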

SUBSTRUCTURE IN THE ICM
The search for substructure in the ICM is motivated by the identification in the lensing maps of two less massive structures in addition to the main NE and SW subclusters (Jauzac et al. 2015). The positions of these mass structures are shown in Figure 8 and denoted by S1 and S2 for consistency with the notation of Jauzac et al. (2015). In the analysis of Jauzac et al. (2015), S1 and S2 were found to be X-ray-dark; however, the X-ray data presented here are ∼ 6 times deeper, which would make it easier to observe ICM substructure.
We searched for substructure using the unsharp-masked image of the cluster, which was created by dividing the difference of two 0.5−4 keV fluxed images convolved with Gaussians of widths 4″ and 10″ by their sum. The resulting image, shown in Figure 8, highlights substructure on scales of ∼ 20−50 kpc [footnote 15]. There is no excess X-ray emission at the positions of S1 and S2. However, the emission is elongated in the direction of both mass structures. In the south, the ICM appears elongated in the direction of S1, while in the north the emission is elongated along the line connecting the NE and SW subclusters and then appears to curve in the direction of S2. We note that if S1 and S2 have already merged with the NE and SW subclusters, the dark matter would have decoupled from the gas, and therefore we do not necessarily expect a spatial overlap between the dark matter and gas components.
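The unsharp-masking step described above, (A − B) / (A + B) with A and B the same image smoothed on a small and a large scale, can be sketched in pure Python as below. This is a minimal illustration rather than the pipeline used for Figure 8: the image is a small nested list, the Gaussian widths are treated as σ in pixels (an assumption), edges are handled by clamping, and all function names are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at 3 sigma."""
    half = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the edges."""
    half = len(kernel) // 2
    out = []
    for row in image:
        n = len(row)
        out.append([
            sum(w * row[min(max(x + k - half, 0), n - 1)]
                for k, w in enumerate(kernel))
            for x in range(n)
        ])
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_smooth(image, sigma):
    """Separable 2D Gaussian smoothing: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(smooth_rows(transpose(smooth_rows(image, kernel)), kernel))


def unsharp_mask(image, sigma_small, sigma_large):
    """Return (A - B) / (A + B) for small- and large-scale smoothed images."""
    a = gaussian_smooth(image, sigma_small)
    b = gaussian_smooth(image, sigma_large)
    return [[(p - q) / (p + q) if (p + q) > 0 else 0.0
             for p, q in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


if __name__ == "__main__":
    # Flat background with one compact bright feature in the center.
    img = [[1.0] * 21 for _ in range(21)]
    img[10][10] = 100.0
    result = unsharp_mask(img, 1.0, 3.0)
    print(f"center: {result[10][10]:+.3f}, corner: {result[0][0]:+.3f}")
```

Structure on scales between the two smoothing widths is enhanced (positive values at the compact feature), while a smooth background maps to values near zero.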
Interestingly, the unsharp-masked image enhances a small cavity with a diameter ∼ 50 kpc NW of the core of the NE subcluster. We examined the significance of the cavity from the azimuthal surface brightness profile in an annulus around the cluster center. The annulus was divided into 14 partial annuli with equal opening angles. The partial annuli and the azimuthal surface brightness profile are shown in Figure 9. The azimuthal surface brightness profile is lowest in 4 partial annuli located NW of the NE core. The largest dip is in the partial annulus that crosses the middle of the X-ray cavity seen in the unsharp-masked image; this partial annulus spans angles of 51−77 degrees. No radio emission fills the X-ray cavity, but the cavity was likely inflated by the AGN hosted by the NE brightest cluster galaxy (see Section 8).
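The azimuthal binning described above can be sketched as follows: an annulus is split into equal-angle sectors and events are counted per sector. Because all sectors of one annulus have the same area, the counts are proportional to surface brightness. The zero direction of the angle, the event list format, and the function names are assumptions made for illustration only.

```python
import math


def sector_edges(n_sectors):
    """Angular edges (degrees) of n equal-angle sectors covering 360 degrees."""
    width = 360.0 / n_sectors
    return [i * width for i in range(n_sectors + 1)]


def sector_index(dx, dy, n_sectors):
    """Index of the sector containing the direction (dx, dy)."""
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return int(angle // (360.0 / n_sectors)) % n_sectors


def azimuthal_profile(events, center, r_in, r_out, n_sectors):
    """Count events per sector inside the annulus r_in <= r < r_out."""
    counts = [0] * n_sectors
    cx, cy = center
    for x, y in events:
        dx, dy = x - cx, y - cy
        if r_in <= math.hypot(dx, dy) < r_out:
            counts[sector_index(dx, dy, n_sectors)] += 1
    return counts


if __name__ == "__main__":
    edges = sector_edges(14)
    print(f"sector 2 spans {edges[2]:.1f} to {edges[3]:.1f} degrees")
```

For 14 sectors each opening angle is about 25.7 degrees, so the third sector spans roughly 51 to 77 degrees, matching the range quoted for the partial annulus with the largest dip.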

RADIO RESULTS
The JVLA 1−2 GHz images reveal several compact sources in the cluster region (Figure 10). Two of these sources are associated with cD galaxies in the NE and SW subclusters. These sources are also detected in the GMRT 610 MHz image (Figure 11). These two point-like AGN have 1.5 GHz integrated flux densities [footnote 16] of 1.47 ± 0.08 (NE) and 0.27 ± 0.03 (SW) mJy. At 610 MHz we measure flux densities of 3.24 ± 0.34 (NE) and 0.33 ± 0.07 (SW) mJy for these sources.
We also find diffuse extended emission in the cluster. The diffuse emission reveals itself in the JVLA image as an increase in the "noise" in the general cluster area. This diffuse emission is more clearly visible in our low-resolution tapered image with emission from compact sources subtracted (Figure 10). The diffuse emission has an elongated shape measuring about 120″ × 45″ (0.65 by 0.24 Mpc) and is oriented along a NE-SW axis, following the overall distribution of the X-ray emission. We also find evidence for this diffuse emission in the GMRT 610 MHz image, although less clearly than in the JVLA 1.5 GHz image (Figure 11). In the low-resolution tapered GMRT image there is again evidence for diffuse emission, but the peak flux is only at a level of 3σ_rms.
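The conversion between the angular and physical size of the diffuse emission can be checked with a short calculation. The sketch below assumes a flat ΛCDM cosmology with H0 = 70 km s^{-1} Mpc^{-1}, Ωm = 0.3, and ΩΛ = 0.7; these values are an assumption of this example, and the function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, omega_l=0.7, steps=1000):
    """Line-of-sight comoving distance for flat LambdaCDM (Simpson's rule)."""
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    h = z / steps  # steps must be even for Simpson's rule
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    return (C_KM_S / h0) * total * h / 3.0


def kpc_per_arcsec(z, h0=70.0, omega_m=0.3, omega_l=0.7):
    """Proper transverse scale in kpc per arcsecond at redshift z."""
    d_a_mpc = comoving_distance_mpc(z, h0, omega_m, omega_l) / (1.0 + z)
    return d_a_mpc * 1000.0 * math.pi / (180.0 * 3600.0)


if __name__ == "__main__":
    scale = kpc_per_arcsec(0.396)
    print(f"scale: {scale:.2f} kpc/arcsec")
    print(f"120 arcsec -> {120 * scale / 1000:.2f} Mpc, "
          f"45 arcsec -> {45 * scale / 1000:.2f} Mpc")
```

Under these assumed parameters the scale at z = 0.396 is a little over 5.3 kpc per arcsecond, so 120″ × 45″ corresponds to roughly 0.64 by 0.24 Mpc, close to the size quoted in the text.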
For the integrated flux of the diffuse emission we measure 1.58 ± 0.13 mJy (at 1.5 GHz) in an ellipse with radii of 1′ by 0.5′ oriented at a position angle of 45° (following the overall brightness distribution). From the GMRT tapered image we estimate an integrated flux of 6.8 ± 3.0 mJy for the diffuse emission in the same area.
Taking the two flux measurements at 1.5 and 0.61 GHz, we obtain a spectral index of α_{610}^{1500} = −1.6 ± 0.5 for the diffuse emission. Based on the above results, we compute a radio power of P_{1.4 GHz} = (1.06 ± 0.09) × 10^{24} W Hz^{-1}, scaling with the spectral index of α = −1.6.
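The two-point spectral index and its uncertainty follow directly from the flux densities quoted above. The sketch below assumes the convention S ∝ ν^α and standard first-order error propagation with uncorrelated flux errors; the function name is hypothetical.

```python
import math


def spectral_index(s1, s1_err, nu1, s2, s2_err, nu2):
    """Two-point spectral index alpha (S ~ nu**alpha) and its 1-sigma error.

    Fluxes may be in any common unit; frequencies in any common unit.
    """
    log_ratio = math.log(nu1 / nu2)
    alpha = math.log(s1 / s2) / log_ratio
    alpha_err = math.sqrt((s1_err / s1) ** 2 + (s2_err / s2) ** 2) / abs(log_ratio)
    return alpha, alpha_err


if __name__ == "__main__":
    # Diffuse emission: 1.58 +/- 0.13 mJy at 1500 MHz, 6.8 +/- 3.0 mJy at 610 MHz.
    alpha, alpha_err = spectral_index(1.58, 0.13, 1500.0, 6.8, 3.0, 610.0)
    print(f"alpha = {alpha:.1f} +/- {alpha_err:.1f}")  # alpha = -1.6 +/- 0.5
```

The uncertainty is dominated by the 610 MHz measurement, whose fractional error is about 44%.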
VLITE shows emission coincident with the NE subcluster where the higher resolution 1−2 GHz images (Figure 10) reveal the compact sources and the brightest portion of the diffuse emission. We fit the 320−360 MHz VLITE emission with a single Gaussian component to measure the integrated flux. We measure a total flux of 17 ± 6 mJy, where we have included an 8% uncertainty in the absolute flux scale. The VLITE emission contains both the diffuse component and the compact emission from the two point sources associated with the NE and SW subclusters. We use the VLA and GMRT flux measurements to determine the spectral index of the two compact sources, assume that this spectral index extends to the central VLITE frequency, and estimate the contribution to the total VLITE emission from the compact sources. We subtract that from the VLITE flux and obtain an estimate of the diffuse component detected by VLITE of 11 ± 4 mJy. Comparing the VLITE flux to the JVLA flux, we calculate a spectral index of α_{340}^{1500} = −1.3 ± 0.3. The results are consistent with the spectral index estimate from the GMRT data. Deeper low-frequency data are required to better constrain the spectral properties of the diffuse emission.
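The compact-source subtraction described above can be reproduced approximately from the quoted flux densities. The sketch assumes single power-law spectra (S ∝ ν^α) for both compact sources between 610 MHz and 1.5 GHz, extrapolated to a central VLITE frequency of 340 MHz; uncertainties are not propagated, and the helper names are hypothetical.

```python
import math


def power_law_index(s_a, nu_a, s_b, nu_b):
    """Two-point spectral index for S ~ nu**alpha."""
    return math.log(s_a / s_b) / math.log(nu_a / nu_b)


def extrapolate_flux(s_ref, nu_ref, nu_target, alpha):
    """Flux density at nu_target for a power law through (nu_ref, s_ref)."""
    return s_ref * (nu_target / nu_ref) ** alpha


# Compact sources: (flux at 1500 MHz, flux at 610 MHz) in mJy, from the text.
COMPACT = {"NE": (1.47, 3.24), "SW": (0.27, 0.33)}
NU_VLITE = 340.0      # MHz, center of the 320-360 MHz band
TOTAL_VLITE = 17.0    # mJy, total VLITE flux
DIFFUSE_1500 = 1.58   # mJy, diffuse flux at 1.5 GHz

compact_340 = 0.0
for s1500, s610 in COMPACT.values():
    alpha = power_law_index(s1500, 1500.0, s610, 610.0)
    compact_340 += extrapolate_flux(s610, 610.0, NU_VLITE, alpha)

diffuse_340 = TOTAL_VLITE - compact_340
alpha_diffuse = power_law_index(DIFFUSE_1500, 1500.0, diffuse_340, NU_VLITE)

print(f"compact sources at 340 MHz: {compact_340:.1f} mJy")
print(f"diffuse emission at 340 MHz: {diffuse_340:.1f} mJy")
print(f"diffuse spectral index 340-1500 MHz: {alpha_diffuse:.1f}")
```

Under these assumptions the compact sources contribute roughly 6 mJy at 340 MHz, leaving about 11 mJy of diffuse emission and a spectral index near −1.3, in line with the values in the text.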
In the JVLA 1−2 GHz data we did not detect diffuse polarized emission from the cluster in the Stokes Q and U images. Halos are generally unpolarized at the few percent level or less, so this result is not surprising (e.g., Feretti et al. 2012). We note, however, that a proper search for diffuse polarized emission, taking full advantage of the large bandwidth, would require Faraday Rotation Measure Synthesis.

GAS-DM DECOUPLING
The combined X-ray and lensing analysis of Jauzac et al. (2015) determined that the gas component of the SW subcluster has decoupled from the DM component and lags behind the subcluster's galaxies as it travels towards the NE subcluster. In the NE subcluster, on the other hand, the DM and gas components were determined by Jauzac et al. (2015) to spatially overlap [footnote 17]. Offsets between the DM and gas components of merging galaxy clusters set crucial constraints on the merger state. In particular, significant offsets indicate a post-merging system, while a lack of offsets supports a pre-merging scenario. Below, we discuss the DM-gas offsets in light of the deeper Chandra data.
[Figure caption fragment: ... Jauzac et al. (2015) are marked as S1 and S2 with circles of diameters 100 kpc, as also done by Jauzac et al. (2015). Black crosses mark the centers of the DM halos of the two main subclusters (M. Jauzac, priv. comm.). The dashed arc shows the position of the density discontinuity detected near the SW subcluster. The location of the X-ray cavity is also marked.]
Footnote 17: An analysis of the DM-gas offset in MACS J0416.1-2403 was carried out more recently by Harvey et al. (2015). For the SW subcluster, their DM peak is offset by about 100 kpc (∼ 20″) from the DM peak determined by previous analyses (e.g., Jauzac et al. 2015; Grillo et al. 2015, models at https://archive.stsci.edu/pub/hlsp/frontier/macs0416/models/). We do not understand the cause of the offset, but choose to use the DM peaks determined by Jauzac et al. (2015), because their locations are consistent within 2″ with the locations of the DM peaks determined by other authors (e.g., Grillo et al. 2015). We also point out that the peak of the galaxy distribution chosen by Harvey et al. (2015) for the SW subcluster in MACS J0416.1-2403 is strongly biased by a foreground galaxy (z = 0.10 +0.12 ...
The centers of the DM halos of the two subclusters (M. Jauzac, priv. comm.) are marked on Figure 12, which shows an HST ACS/F814W image of the cluster with overlaid Chandra and JVLA contours. The centers of the X-ray cores were defined as the peaks in the X-ray emission, and were determined from the 0.5−4 keV Chandra surface brightness map of the cluster, convolved with a Gaussian of σ = 4″. We also considered the uncertainties on the surface brightness distribution, and defined the uncertainties on the X-ray peaks as the regions around the cores in which the X-ray brightness is consistent (within the propagated Poisson errors on the counts image) with the brightness of the corresponding peak. We note that our approach only provides a lower limit on the extent of the peak regions, because the background was included in the surface brightness map used for this analysis. The uncertainties on the DM centers are < 1″, significantly smaller than the uncertainties on the centers of the X-ray cores. In Figure 13, we compare the position of the DM centers with that of the X-ray peak regions. Rather than confirming the previously reported offset between the gas and DM components of the SW subcluster, Figure 13 shows that both X-ray peaks are, within the uncertainties, co-located with the DM centers. We speculate that the discrepancy between the results presented herein and those reported by Jauzac et al. (2015) is caused by a higher noise level in the significantly shallower X-ray data used in previous studies; our data are ≈ 6 times deeper, which is equivalent to a reduction in noise by a factor of ≈ 2.5.

MERGER SCENARIO
Jauzac et al. (2015) showed that the galaxy redshift distribution in MACS J0416.1-2403 is bimodal relative to the mean redshift, with the SW subcluster (mean redshift 0.3966) moving towards the observer and the NE subcluster (mean redshift 0.3990) moving away from the observer. Based on these observations, Jauzac et al. (2015) suggested two possible merger scenarios:
• MACS J0416.1-2403 is a pre-merging system, in which the SW subcluster comes from behind the NE subcluster, and is now seen near first core passage;
• MACS J0416.1-2403 is a post-merging system, in which the SW cluster approached the NE subcluster from above, and is now seen near its second core passage.
In this section, we try to distinguish between the two scenarios based on our X-ray and radio findings.

Cooler gas in the NE?
Approximately 300 kpc NE of the NE core, the temperature maps show a region (region #10; T = 8.41^{+1.37}_{-0.90} keV) that is cooler than the neighboring region closer to the cluster center (region #8; T = 12.18^{+2.41}_{-1.90} keV). The cool gas in this region could be pre-shock gas ahead of a shock front. We examined the surface brightness profile across the cooler region, but could not confirm a surface brightness discontinuity with a confidence ≥ 90%. A density jump could be masked by projection effects. Nonetheless, while the temperature difference between regions #8 and #10 is significant at 90% confidence (but not at 2σ) and there is a pseudo-pressure jump between the two regions, the lack of a clear density jump does not allow us to confirm a shock front NE of the NE core.

No gas-DM offset
In head-on binary mergers, the essentially collisionless DM travels ahead of the collisional gas. There are several scenarios that could explain the spatial overlap of the DM and gas in MACS J0416.1-2403:
(i) MACS J0416.1-2403 is not a head-on merger that progresses outside the plane of the sky, and the NE and SW subclusters are seen before first core passage and have yet to interact strongly with each other.
(ii) The NE and SW subclusters are seen after first core passage, and the DM and gas overlap only in projection. This requires either that one of the subclusters was essentially undisturbed by the merger and our line of sight aligns with the trajectory of the other subcluster, or that the subclusters' trajectories are parallel and both are aligned with our line of sight.
(iii) The NE and SW subclusters are seen after first core passage, and the gas of the SW subcluster experienced a slingshot and caught up with the DM halo after the two had separated near pericenter.
The radial velocity difference between the BCGs of the NE and SW subclusters was reported by Jauzac et al. (2015) to be ≈ 800 km s^{-1}. The sound speed in a cluster with a temperature of ≈ 10 keV is ≈ 1300 km s^{-1}, and typical collision velocities are 1−2 times the sound speed. Therefore, the radial velocity difference between the two BCGs in MACS J0416.1-2403 seems rather low. The velocity could be explained if the subclusters are seen well before or after first core passage, possibly near the turnaround point.

Substructure in the NE core
The surface brightness profile of the NE subcluster cannot be described by a single β-model, but instead by a double β-model composed of a very compact core and a more extended gas halo. Such a profile is typically used to model cool core clusters (e.g., Jones & Forman 1984; Ota et al. 2013), but it can also describe the surface brightness of merging clusters seen in projection (e.g., ZuHone 2009; Machacek et al. 2010). The core of the NE subcluster is very hot and has a long cooling time, which disfavors the possibility that the NE subcluster hosts a cool core. Instead, the observed substructure is more likely the result of a merger event.
The most straightforward scenario would be that the NE and SW subclusters have already merged, and while the core of the SW subcluster was destroyed in the collision, as evidenced by the subcluster's flat surface brightness, the core of the NE subcluster survived. Cool cores that survive a recent merger event preserve part of their cold gas. However, we found no evidence of cool gas in the NE core. Therefore, the NE core is unlikely to be the remnant of a recent interaction between the NE and SW subclusters.
An alternative scenario is that the NE subcluster is merging with the dark matter halo S2 (Figure 8). However, S2 is not associated with a concentration of galaxies, and thus is an unlikely candidate for a group of galaxies; instead, S2 is probably a cosmic filament seen in projection along the line of sight (Jauzac et al. 2015). No other mass concentration that the NE subcluster might be currently merging with has been detected.
One other possibility, suggested by the numerical simulations of Poole et al. (2008), is that the NE subcluster underwent a merger or a series of mergers more than ∼ 1 Gyr ago, and while its core has already relaxed back to a compact state, it has not yet recooled. In this scenario, the merger could not have been with the SW subcluster, because the two subclusters are involved in an ongoing merger. Instead, the NE subcluster needs to have interacted several Gyr ago with one or more clusters that are no longer directly detectable in the X-ray maps, in the lensing maps, or in the redshift distribution of the NE subcluster.
Alternatively, past mergers could also have heated the cool core without destroying it, by inducing sloshing in the central galaxy. The kinetic energy of the sloshing central galaxy would then have been dissipated as heat between successive mergers.

The AGN and the X-ray cavity in the NE core
The collision velocity between the NE and SW subclusters is unlikely to be highly supersonic. If the cluster is a post-merging system, a cavity could have had enough time to form since first core passage. However, given the short timescale and the intense turbulence following the moment of first core passage, forming a new cavity would require a strong AGN outburst. The expectation would then be to observe radio emission associated with the X-ray cavity. However, this radio emission is not observed. Instead, the presence of an X-ray cavity NW of the NE core supports a pre-merger scenario. The cavity was most likely inflated by a recent weak outburst of the AGN detected in the BCG of the NE subcluster.

SW density discontinuity
The density discontinuity detected S of the SW subcluster is located along the line connecting the NE subcluster, the SW subcluster, and S1. The discontinuity also appears coincident with the position of S1, but this could be a projection effect. There are two scenarios that would explain the origin of the density discontinuity: the merger of the SW subcluster with S1, or the merger of the SW subcluster with the NE subcluster.
If the discontinuity were triggered by a merger between the two main subclusters, then the merger must have already progressed past the moment of the first core passage. Furthermore, because the NE core is very compact and hosts an X-ray cavity, it is unlikely that it was significantly affected by a merger with the SW subcluster. Therefore, taking into account the constraints set by the lack of an offset between the gas and DM peaks of both subclusters, it is the SW subcluster whose trajectory must be aligned with our line of sight. In this geometry, a shock front would be expected ahead of the SW subcluster (with respect to its direction of motion), i.e. our line of sight should be perpendicular to the 3D shock "cap". Instead, however, the surface brightness discontinuity we detect is seen in projection only S-SW of the SW subcluster. If the detected density discontinuity is the appearance of the shock in projection, then we would expect to observe it over a larger angular opening. Most likely, the density discontinuity is caused by the interaction of the SW subcluster with S1.
In conclusion, while we cannot completely eliminate the possibility of MACS J0416.1-2403 being a post-merging cluster, we believe the sum of our findings favors a pre-merging scenario.

A NEWLY-DISCOVERED RADIO HALO
We classify the diffuse radio emission in the cluster as a halo, since the emission has low surface brightness, large extent, and is centrally located. The halo has a somewhat smaller size than other radio halos in merging clusters, which have typical sizes of 1−1.5 Mpc (e.g., Feretti et al. 2012). The size is more similar to that of radio mini-halos found in cool-core clusters (e.g., Giacintucci et al. 2014). However, an important difference from mini-halos is that neither the NE nor the SW subcluster contains a cool core. In addition, as shown in Figure 12, the halo seems to be associated with both subclusters.
The radio emission around the northern subcluster contains about a factor of 2 more flux than the emission around the southern one. An interesting question is whether we observe a single radio halo, caused by the merger between the NE and the SW subclusters, or two individual halos belonging to the two subclusters. In the latter case, these halos must have originated from previous merger events within the two subclusters themselves. This would make MACS J0416.1-2403 similar to the case of the radio halo pair in Abell 399 and Abell 401 (Murgia et al. 2010), with the difference that Abell 399/401 seems to be at a much earlier pre-merger stage.
For radio halos, a correlation is observed between the cluster's X-ray luminosity and the radio halo power (the L_X − P correlation), with the most X-ray luminous clusters corresponding to the most powerful radio halos (e.g., Liang et al. 2000b; Enßlin & Röttgering 2002; Cassano et al. 2006). The X-ray luminosity is used as a proxy for cluster mass. The distribution in the L_X − P plane is bimodal when clusters without halos are included. Merging clusters with radio halos follow the L_X − P correlation, while the upper limits on the radio power of other clusters are well below the correlation (Brunetti et al. 2007).
A correlation is also observed between the cluster's integrated Sunyaev-Zel'dovich (SZ) effect (i.e., the integrated Compton Y_SZ parameter) and the radio halo power (e.g., Basu 2012, 2013). Y_SZ is proportional to the volume integral of the pressure. The SZ signal should be less affected by the cluster dynamical state and therefore is expected to be a better indicator of the cluster's mass. Basu (2012, 2013) and Cassano et al. (2013) showed that there is good agreement between the Y_SZ − P scaling relation and the scaling relations based on X-ray data.
In Figure 14, we place MACS J0416.1-2403 on the L_X − P and Y_SZ − P diagrams, with the values for the other clusters taken from Cassano et al. (2013). The Y_{SZ,500} value for the cluster was recently calculated by Planck Collaboration et al. (2015), who found Y_{SZ,500} = (0.71 ± 0.24) × 10^{-4} Mpc^2. Specifically, this is the value of Y_{SZ,500} obtained from the union catalog based on the MMF1 [footnote 18] detection algorithm. We also computed the value of Y_{SZ,500} using the gNFW model fit to Bolocam presented by Czakon et al. (2014) and obtained Y_{SZ,500} = (1.49 ± 0.28) × 10^{-4} Mpc^2. The difference between the Planck and Bolocam measurements is not understood, but is likely due to some combination of random noise fluctuations, deviations from the assumed pressure profile used to extract Y_{SZ,500} (particularly at large radii), contamination from dust or other astronomical signals, and calibration uncertainties. Furthermore, we note that this cluster is not detected by PowellSnakes, one of the three Planck algorithms, and the values of Y_{SZ,500} obtained from the other two algorithms (MMF1 and MMF3 [footnote 19]) are significantly different, perhaps indicating that measurements of Y_{SZ,500} towards this cluster contain larger than typical systematic uncertainties. In Figure 14, we plot both the Planck and Bolocam values of Y_{SZ,500}.
Interestingly, the radio power falls on the low end of the L_X − P correlation, which means that the radio halo in MACS J0416.1-2403 is somewhat less luminous than expected from the X-ray luminosity of the cluster. However, using the Planck measurement of Y_{SZ,500}, we find that the halo power is consistent with the expected Y_SZ − P correlation, indicating the halo is not underluminous for the cluster mass. While the Bolocam value of Y_{SZ,500} does imply lower than expected halo power, the nominal Y_SZ − P relation used for comparison is derived solely from Planck SZ measurements. We therefore consider our interpretation based on the Planck value of Y_{SZ,500} to be more robust. Given this interpretation, MACS J0416.1-2403 could be just an outlier in the L_X − P diagram, given the significant scatter in the correlations.
It is also possible that the X-ray luminosity of MACS J0416.1-2403 is not a good indicator of the cluster's mass, and the luminosity is boosted by the merger. A boost in X-ray luminosity is predicted for approximately one sound-crossing time, which for MACS J0416.1-2403 is ≈ 1 Gyr (Ricker & Sarazin 2001). This boost would affect other clusters on the correlation as well, but it could affect MACS J0416.1-2403 more since we are seeing a cluster close to first core passage. The mass of the cluster within R_500 is (7.0 ± 1.3) × 10^{14} M_⊙ (Umetsu et al. 2014). Based on the scaling relations of Pratt et al. (2009), the mass corresponds to a 0.1−2.4 keV cluster luminosity of (1.1 ± 0.4) × 10^{45} erg s^{-1}, which is in agreement with the luminosity calculated from the Chandra data. Therefore, based on the M_500 − L_500 scaling relation, we have no indication that the cluster luminosity is higher than expected for a cluster of this mass.
Footnote 19: Matched multi-filter algorithm, implementation 3.
Another possibility is that we are seeing an ultra-steep spectrum radio halo (α ≲ −1.5, Brunetti et al. 2008). In this case, the radio halo is only underluminous at higher frequencies (i.e., above a GHz), while the integrated flux density rapidly increases at lower frequencies. The 320−360 MHz VLITE and 610 MHz GMRT data presented in Section 8 are not sufficient to confirm, or rule out, the presence of an ultra-steep spectrum radio halo. Low-frequency GMRT (ν < 610 MHz) and/or deep P-band JVLA observations are required to settle this case.
A third possibility is that we could be witnessing the first stages of the formation of this radio halo. We do not think that the radio halo is fading, because the NE and SW subclusters are still in the process of merging and most likely they have not yet undergone core passage (see Section 10). Therefore, when the merger progresses further, the radio halo could brighten, increase in size, and move onto the correlation.

SUMMARY
The HST Frontier Field cluster MACS J0416.1-2403 (z = 0.396) is a merging system that consists of two main subclusters and two additional smaller mass structures (Jauzac et al. 2015). Here, we presented results from deep Chandra, JVLA, and GMRT observations of the cluster. The main aims of our analysis were to identify signatures of merger activity in the ICM, constrain the merger scenario, and detect and characterize diffuse radio sources. Below is a summary of our main findings:
• The cluster has an average temperature of 10.06^{+0.50}_{-0.49} keV, an average metallicity of 0.24^{+0.05}_{-0.04} Z_⊙, and a 0.1−2.4 keV luminosity of L_X = (9.14 ± 0.09) × 10^{44} erg s^{-1}.
• The NE subcluster has a very compact core and a nearby X-ray cavity, but its temperature is very high (T ∼ 10 keV), we find no evidence of multiphase gas, and its cooling time is ≈ 3.5 Gyr. The presence of a compact core and of an X-ray cavity indicates that the NE subcluster was not disrupted by a recent major merger event.
• S-SW of the SW subcluster, there is a weak density discontinuity that is best fitted by a density compression of 1.56^{+0.38}_{-0.29}. The discontinuity is located along the line connecting the NE subcluster, the SW subcluster, and the mass structure S1 reported by Jauzac et al. (2015). Most likely, the discontinuity was caused by a collision between the SW subcluster and S1, rather than by a collision between the NE and SW subclusters.
• The DM and gas components of the NE and SW subclusters are well-aligned, which favors a scenario in which the two subclusters are seen before first core passage.
• MACS J0416.1-2403 has a radio halo with a 1.4 GHz power of (1.06 ± 0.09) × 10^{24} W Hz^{-1}, which is somewhat less luminous than predicted by the L_X − P correlation. However, the halo aligns well with the Y_{SZ,500} − P correlation, indicating that it is not underluminous for the cluster mass. We could be observing the cluster at a point in the merger event when the X-ray luminosity is significantly boosted. Alternatively, we could be observing the birth of a giant halo, or a very rare ultra-steep spectrum halo.