In-depth example

In this notebook we will go through an example of running imcascade in a realistic setting and discuss common issues along the way.

[1]:
#Load all necessary packages
import numpy as np
import matplotlib.pyplot as plt
import time
import sep

import astropy.units as u
from astropy.coordinates import SkyCoord

For this example we will run imcascade on HSC data for a MW-mass galaxy at z=0.25. The data is retrieved using the unagi python package, written by Song Huang. The cell below downloads the data and the PSF.

[2]:
from unagi import hsc
from unagi.task import hsc_psf,hsc_cutout
pdr2 = hsc.Hsc(dr='pdr2',rerun = 'pdr2_wide')

#Downloaded from the HSC archive: a MW-mass galaxy at z~0.25 at the sky location below
ra,dec = 219.36054754*u.deg, -0.40994375*u.deg
examp_coord = SkyCoord(ra = ra, dec = dec)
cutout = hsc_cutout(examp_coord, cutout_size=20*u.arcsec, filters='i', archive = pdr2, dr = 'pdr2', verbose=True, variance=True, mask=True, save_output = False)
psf = hsc_psf(examp_coord, filters='i', archive=pdr2, save_output = False)

#Retrieve science and variance images
img = cutout[1].data.byteswap().newbyteorder()
var = cutout[3].data.byteswap().newbyteorder()
psf_data = psf[0].data
# Get table list from /home/tbm/anaconda3/envs/py3/lib/python3.8/site-packages/unagi/data/pdr2_wide/pdr2_wide_tables.fits
# Retrieving cutout image in filter: i
# Retrieving coadd PSF model in filter: i

Setting up

Modelling the PSF

To use imcascade while accounting for the PSF, you need a Gaussian decomposition of the PSF. While this is available for some surveys, you can use the imcascade.psf_fitter module to derive one if you only have a pixelized version.

The following function first fits the PSF profile in 1D to decide on the best widths, then uses a 2D fit to find the corresponding weights.

[3]:
from imcascade.psf_fitter import PSFFitter
psf_fitter = PSFFitter(psf_data)
psf_sig, psf_a, chi2, fig = psf_fitter.fit_N(3, plot = True) #fit a 3-component model; the number of components can be varied
print (psf_sig,psf_a)
plt.show()
[1.56676938 3.15624179 7.16928138] [0.72289466 0.21654144 0.05830564]
[Figure (_images/example_6_1.png): PSF profile and the three-Gaussian fit]

We can see that a model with three Gaussians provides a pretty good fit! Generally we find 2-3 components work well for standard ground-based telescopes; for more complicated PSFs, like HST WFC3, we find 4 works well. There is some incentive to use a small number of Gaussians to define the PSF, as it decreases the time needed to render a model image. Additionally, it is good to check that the sum of the weights, psf_a, is close to one. This ensures the PSF, and therefore the fit, are properly normalized.
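That check is a one-liner on the array returned above:

print(np.sum(psf_a)) #should be ~1; here 0.7229 + 0.2165 + 0.0583 ≈ 0.998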

Organizing all the inputs

First let’s take a quick look at the science and variance images. We will fit a model to the science image, and the inverse of the variance image will be used as the pixel weights when fitting.

[4]:
fig, (ax1,ax2) = plt.subplots(1,2, figsize = (10,5))
ax1.imshow(img, vmin = -0.1, vmax = 0.2)
ax1.set_title('Science Image')
ax2.imshow(var,vmin = 0, vmax = 0.005)
ax2.set_title('Variance Image')
plt.show()
[Figure (_images/example_10_0.png): science image (left) and variance image (right)]

Additionally, we will build a mask to exclude contaminating sources that we don’t want affecting the fit.

[5]:
# Use sep to detect sources
bkg = sep.Background(img)
x_cent,y_cent = int(img.shape[0]/2.) , int(img.shape[1]/2.)
obj, seg = sep.extract(img - bkg.back(), 1.5, err = np.sqrt(var), deblend_cont = 0.005, segmentation_map = True)
seg[np.where(seg == seg[x_cent,y_cent])] = 0
mask_raw = seg > 0

#Convolve mask with a gaussian to expand it and make sure that all low-SB emission is masked
from imcascade.utils import expand_mask
mask = expand_mask(mask_raw, radius = 1.5)
mask = np.array(mask, dtype = bool)


fig, (ax1,ax2) = plt.subplots(1,2, figsize = (10,5))
ax1.imshow(mask, vmin = -0.1, vmax = 0.2)
ax1.set_title('Mask')
ax2.imshow(img*np.logical_not(mask),vmin = -0.01, vmax = 0.02)
ax2.set_title('Masked, Stretched Science Image')
plt.show()
[Figure (_images/example_12_0.png): mask (left) and masked, stretched science image (right)]

Choosing the widths for the Gaussian components

The next major decision is what set of widths to use for the Gaussian components. In general we recommend logarithmically spaced widths. This places more components at smaller radii, where the signal is largest and the profile changes most quickly. asinh scaling can also work.
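For illustration, a minimal sketch (not from the original notebook) of asinh-spaced widths, using the same endpoints chosen below:

#asinh spacing: roughly linear at small widths, logarithmic at large widths
sig_asinh = np.sinh(np.linspace(np.arcsinh(1.), np.arcsinh(60.), num = 9))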

Next we have to choose the start and end points. The smallest width should be 0.75-1 pixel (or roughly half the PSF width) and the largest roughly 8-10 times the effective radius. The estimate of the effective radius does not need to be precise; for example, the Kron radius from sep or SExtractor works well. This should also help decide the size of the cutout: in order to properly model the sky, the cutout size should be at least 3-4 times larger than the largest width, i.e. roughly 30-40 times the effective radius.
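As a rough sketch (not from the original notebook, reusing the sep catalog obj and background bkg from the masking cell above), the Kron radius of the central source could be estimated like this:

#Find the detection closest to the image center
i_cen = np.argmin(np.hypot(obj['x'] - x_cent, obj['y'] - y_cent))
#Kron radius in units of obj['a']; 6.0 is the standard integration radius
kronrad, krflag = sep.kron_radius(img - bkg.back(), obj['x'][i_cen:i_cen+1],
    obj['y'][i_cen:i_cen+1], obj['a'][i_cen:i_cen+1], obj['b'][i_cen:i_cen+1],
    obj['theta'][i_cen:i_cen+1], 6.0)
r_eff_est = kronrad[0]*obj['a'][i_cen] #rough size estimate in pixels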

Finally we have to choose the number of components. In our testing, somewhere around 9-11 components works well.

Note

These decisions are not trivial and can affect the outcome of an imcascade fit. However, reasonable changes within the confines discussed here shouldn’t greatly affect the results. You should run tests to ensure you have chosen a reliable set of widths. If the results are very sensitive to the choice of widths, you should be wary, as there may be other issues at play.

In this example we estimate the effective radius to be roughly 6 pixels, so we use 9 components with logarithmically spaced widths from 1 pixel to 60 pixels (~10 × r_eff) and a cutout size of 240 pixels, roughly 40 times the effective radius.

[6]:
sig = np.logspace(np.log10(1),np.log10(60), num = 9)

We can also specify initial conditions to help with the initial guesses. Here we specify the estimated half-light radius and total flux. The code makes some intelligent guesses about the initial conditions and the bounds, but this can help ensure a quick and reliable fit. It is also possible to specify guesses and bounds for individual components, sky values, etc., but this is more involved; see the user’s guide for more details.

[7]:
init_dict  = {'re':6., 'flux': 1000.}

Running imcascade

Least-squares minimization

To run imcascade we first need to initialize a Fitter instance with all the inputs discussed above. This class organizes all the data and contains all the methods used to fit the image.

[8]:
from imcascade import Fitter
fitter = Fitter(img,sig, psf_sig, psf_a, weight = 1./var, mask = mask, init_dict = init_dict)

Now we can run the least-squares minimization using the command below.

[9]:
min_res = fitter.run_ls_min()
2022-06-22 12:30:26,609 - Running least squares minimization
2022-06-22 12:30:53,368 - Finished least squares minimization
[10]:
print (min_res)
[ 1.20224028e+02  1.18955019e+02  9.11389758e-01  2.29047642e+00
  2.20703755e+00 -1.99999486e+00  2.13014635e+00  2.55804376e+00
  2.85067400e+00 -1.99999840e+00  1.44969096e+00 -1.99999947e+00
 -1.99999996e+00  7.17048936e-03 -1.24392340e-04 -2.20482392e-05]

Here we have printed the best-fit parameters. They are, in order: \(x_0\), \(y_0\), axis ratio, and position angle. The next 9 values are the best-fit weights for the Gaussian components. Note that by default imcascade explores these in log scale; this can be changed by passing the option log_weight_scale = False when initializing. The final three parameters describe the best-fit tilted-plane sky model. The sky model can also be disabled when initializing with sky_model = False, or set to a flat sky model with sky_type = 'flat'.
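For example, sketches of those variations (illustrative variable names, same inputs as above):

#Fit the component weights on a linear scale instead of log
fitter_lin = Fitter(img, sig, psf_sig, psf_a, weight = 1./var, mask = mask,
    init_dict = init_dict, log_weight_scale = False)
#Use a flat sky model instead of the default tilted plane
fitter_flat = Fitter(img, sig, psf_sig, psf_a, weight = 1./var, mask = mask,
    init_dict = init_dict, sky_type = 'flat')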

These aren’t super easy to parse as is, which is why we use the ImcascadeResults class, described below.

[11]:
from imcascade import ImcascadeResults

#Initialized using `imcascade.Fitter.fitter` instance
res_class = ImcascadeResults(fitter)
fig = res_class.make_diagnostic_fig()
/mnt/c/Users/timbl/Documents/files/research/packages/imcascade/imcascade/results.py:697: RuntimeWarning: divide by zero encountered in true_divide
  rms_med = np.median(1./np.sqrt(fitter.weight) )
[Figure (_images/example_25_1.png): diagnostic figure with observed image, model, sky, residuals, SBP and curve-of-growth]

Here we have used the make_diagnostic_fig() function to help diagnose the fit. The figure shows the (masked) observed image, best-fit model, sky and residuals, along with some facts about the fit and the intrinsic surface brightness profile and curve-of-growth. From this it is easy to diagnose if something has gone catastrophically wrong, such as a major source not being masked properly, poor convergence of the fit, or the curve-of-growth not asymptoting. Here we can see the fit performs pretty well, with no major issues!

Posterior estimation

Below we show the commands that can be used to explore the posterior distribution with Bayesian techniques. Specifically, we use the “express” method discussed in the paper, based on pre-rendered images. There are additional options when running dynesty with imcascade. In particular, we offer two choices of priors: the default is based on the results of the least-squares minimization, the other being uniform priors. We find the former runs quicker and more reliably as the priors are not as broad. It is also possible to set your own priors; see the Advanced section for more details.

>  fitter.run_dynesty(method = 'express')
>  fitter.save_results('./examp_results.asdf')

This is much quicker than the traditional method, but it still took about 15 min to run on my laptop, so I have run it previously and we will load the saved results.

Analyzing the results

Since the parameters fit by imcascade are the fluxes of each Gaussian component, the analysis is more involved than for other parameterized models, which fit directly for the total flux, radius, etc. To assist in this we have written the results module and the ImcascadeResults class. This class can be initialized in multiple ways. First, you can pass it a Fitter instance after running run_ls_min() and/or run_dynesty(). Alternatively, it can be passed a string giving the location of an ASDF file of saved results.

[12]:
#Initialized with saved file
res_class_w_post = ImcascadeResults('examp_results.asdf')

ImcascadeResults will default to using the posterior to derive morphological parameters if it is available.

There are a number of functions we have written to calculate various morphological quantities; please see the API reference for the full list. For many applications, one can simply run run_basic_analysis(), which calculates a series of common morphological quantities.

[13]:
#we can also specify the percentiles used to calculate the error bars, here we use the 5th-95th percentile
res_class_w_post.run_basic_analysis(zpt = 27, errp_lo = 5, errp_hi = 95)
[13]:
{'flux': array([1392.30063682,    8.20183351,    8.24031006]),
 'mag': array([1.91406674e+01, 6.40697201e-03, 6.41482156e-03]),
 'r20': array([2.45196295, 0.01713131, 0.01708627]),
 'r50': array([6.22617613, 0.03933737, 0.04009991]),
 'r80': array([11.42789865,  0.11402678,  0.11763424]),
 'r90': array([14.71073612,  0.22744139,  0.23404145]),
 'C80_20': array([4.66096268, 0.02856234, 0.02881198]),
 'C90_50': array([2.36285489, 0.02340729, 0.02374323])}
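As a quick sanity check, the magnitude follows from the flux and the zero-point in the usual way:

#mag = zpt - 2.5*log10(flux)
print(27 - 2.5*np.log10(1392.30063682)) #~19.14, matching 'mag' above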

In addition to these integrated quantities, we can calculate the surface brightness profile and curve-of-growth as a function of the semi-major axis.

[14]:
#Set of radii to calculate profiles at
rplot = np.linspace(0, 50, num = 100)

#Here we will calculate the posterior distribution of the surface brightness profile
sbp_all = res_class_w_post.calc_sbp(rplot)
print (sbp_all.shape)

#Here we calculate the curve-of-growth for the posterior
cog_all = res_class_w_post.calc_cog(rplot)
(100, 29206)

If you use res_class_w_post, where the posterior distribution is available, it will return a 2D array containing the SBP of each sample of the posterior.
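These sample arrays are straightforward to summarize with numpy, for example a median profile with a 16th-84th percentile envelope:

#Percentiles across the posterior samples (axis 1) at each radius
sbp_lo, sbp_med, sbp_hi = np.percentile(sbp_all, [16, 50, 84], axis = 1)
cog_lo, cog_med, cog_hi = np.percentile(cog_all, [16, 50, 84], axis = 1)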

Now we plot many grey lines which show individual samples from the posterior

[15]:
fig, (ax1,ax2) = plt.subplots(1,2, figsize = (12,5))
ax1.plot(rplot, sbp_all[:,::100], 'k-', alpha = 0.05)

ax1.set_yscale('log')
ax1.set_ylim([5e-4,5e1])
ax1.set_title('Surface Brightness Profile')
ax1.set_xlabel('Radius (pixels)')
ax1.set_ylabel('Intensity')

ax2.plot(rplot,cog_all[:,::100], 'k-', alpha = 0.05)
ax2.set_title('Curve-of-growth')
ax2.set_xlabel('Radius (pixels)')
ax2.set_ylabel(r'$F(<R)$')
plt.show()
[Figure (_images/example_36_0.png): surface brightness profile (left) and curve-of-growth (right) posterior samples]

If you are interested in a morphological quantity that is not included, it is likely easy to calculate and code up, so please contact us!
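For instance, here is a minimal sketch (r25 is a hypothetical quantity, not part of the package) computing the radius enclosing 25% of the light within the largest plotted radius, directly from the posterior curve-of-growth samples:

#Subsample the posterior for speed, then invert each curve-of-growth
cog_sub = cog_all[:, ::100]
r25_samps = np.array([np.interp(0.25*cog_sub[-1, i], cog_sub[:, i], rplot)
    for i in range(cog_sub.shape[1])])
print(np.percentile(r25_samps, [16, 50, 84]))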