Surveying the Universe
Research
On Gaussian approximations
Feb 6th
You have a function that you want to approximate as an N-dimensional (multivariate) Gaussian (normal) distribution. What do you do?
If you are me, you spend a couple of weeks deriving stuff, and then finally figure out the easy way.
But hopefully you are not me, so here is the easy way, to save you the bother.
First, find the peak of your function, and put a hat on the value(s) or the parameter(s) at this point. Now your (N-dimensional) function can be approximated as
$$f(x) \simeq f(\hat{x})\exp \left[-\frac{1}{2} (x-\hat{x})^\mathrm{T} \Sigma^{-1} (x-\hat{x}) \right]$$
where $\Sigma$ is the covariance matrix, and our task is to figure out what it is.
But fear not, help is at hand. If we find the following Hessian matrix, and if we assume that the covariance matrix is symmetric, we have
$$(\Sigma^{-1})_{ij} = -\left.\frac{\partial^2 \ln f}{\partial x_i \, \partial x_j}\right|_{x=\hat{x}}$$
That's it. Easy. Now you can even integrate the function
$$\int f(x) \, \mathrm{d}^N x \simeq f(\hat{x}) \sqrt{(2\pi)^N \det \Sigma}$$
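Here is a minimal numerical sketch of the recipe, assuming the Hessian in question is that of ln f evaluated at the peak, so that the inverse covariance is minus that Hessian. The helper names (`numerical_hessian`, `gaussian_approximation`) and the test function are made up for illustration, and the finite-difference step is a guess that may need tuning for a real function.

```python
import numpy as np
from scipy import optimize


def numerical_hessian(func, x0, step=1e-3):
    """Central finite-difference Hessian of a scalar function at x0."""
    x0 = np.asarray(x0, dtype=float)
    n = x0.size
    hess = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = step
            ej[j] = step
            hess[i, j] = (func(x0 + ei + ej) - func(x0 + ei - ej)
                          - func(x0 - ei + ej) + func(x0 - ei - ej)) / (4 * step ** 2)
    return hess


def gaussian_approximation(log_f, x_start):
    """Return the peak, the covariance matrix and the approximate integral of f."""
    # Find the peak of f by minimizing -ln f
    result = optimize.minimize(lambda x: -log_f(x), x_start)
    x_hat = result.x
    # Inverse covariance is minus the Hessian of ln f at the peak
    covariance = np.linalg.inv(-numerical_hessian(log_f, x_hat))
    n = x_hat.size
    integral = np.exp(log_f(x_hat)) * np.sqrt((2 * np.pi) ** n * np.linalg.det(covariance))
    return x_hat, covariance, integral


# Hypothetical test function: an unnormalized 2-D Gaussian with known covariance
true_mean = np.array([1.0, -2.0])
true_cov = np.array([[2.0, 0.5], [0.5, 1.0]])
true_inv = np.linalg.inv(true_cov)


def log_f(x):
    d = np.asarray(x, dtype=float) - true_mean
    return -0.5 * d @ true_inv @ d


x_hat, covariance, integral = gaussian_approximation(log_f, x_start=[0.0, 0.0])
print(x_hat)
print(covariance)
print(integral)
```

For a function that really is Gaussian, the recovered covariance should match the input one; for anything else, the result is only as good as the Gaussian approximation near the peak.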
HerMES point source catalogues
Sep 27th
Well, it's finally here: HerMES: point source catalogues from deep Herschel-SPIRE observations, by yours truly and lots of other people otherwise known as "Al".
Some of the catalogues are available here, if you're interested.
The biggest challenge was confusion (about which there is a great deal of confusion). For example, in this HerMES image, everything you can see is light from galaxies, and each galaxy produces a small round blob in the image. The number of galaxies is so high that light from any one galaxy is confused with light from numerous neighbouring galaxies.
So when the computer looks at the image and spots a blob with brightness 30 mJy, what does that mean? Does it mean there is one galaxy there with brightness 30 mJy? Or perhaps there are 10 galaxies, with brightnesses of 15 mJy or lower?
It's difficult to tell. But what you can do (and what we did in the paper) is to stick fake galaxies (blobs) into the image, and see what difference that makes. So if we have a fake galaxy with brightness 30 mJy, and we put it in at a random position, what effect will that have, on average? What is the probability that it will be detected? (This is approaching what we mean by the completeness of the catalogue, although it is far from easy to know what is meant by those simple words "it" and "detected"!) What brightness will it be measured to have, on average? 20 mJy? 30 mJy? 40 mJy? 100 mJy?
For very bright galaxies, it's relatively straightforward. But for fainter things, it can do your head in...
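To give a flavour of the fake-galaxy test, here is a deliberately crude toy in Python. It is not the procedure used in the paper: it assumes each source occupies a single pixel, that the background fluctuations are zero-mean Gaussian with an rms of 6 mJy, and that "detected" simply means "measured brightness above 20 mJy". All of those numbers and the function name `inject_and_measure` are invented for illustration.

```python
import numpy as np


def inject_and_measure(background, flux, n_trials, threshold, rng):
    """Add a fake source of known flux at random pixels and see what comes back.

    Toy model: the measured flux is the background value plus the injected flux.
    """
    flat = np.asarray(background, dtype=float).ravel()
    positions = rng.integers(0, flat.size, size=n_trials)
    measured = flat[positions] + flux
    detected = measured > threshold
    completeness = detected.mean()
    # Average measured flux of the fake sources that were detected
    mean_measured = measured[detected].mean() if detected.any() else float("nan")
    return completeness, mean_measured


rng = np.random.default_rng(1)
# Hypothetical background map (mJy): Gaussian fluctuations standing in for confusion
background = rng.normal(0.0, 6.0, size=(200, 200))

results = {}
for flux in (10.0, 30.0, 100.0):
    results[flux] = inject_and_measure(background, flux, 5000, threshold=20.0, rng=rng)
    print(flux, results[flux])
```

Even in this toy, the faint fake sources that do get detected come back brighter than they went in, because only those sitting on a positive fluctuation make it over the threshold.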
Bayesian number counts
Apr 30th
Here's a simple bit of statistics for a Friday lunchtime. You count the number of galaxies in a certain area on the sky (with the galaxies satisfying some specific properties, if you like). What is the true number density? Let the expected number be $\mu$ (the true number density multiplied by the area on the sky) and the measured number be $N$. Then, in true Bayesian fashion, what we want is
$$P(\mu \mid N) = \frac{P(N \mid \mu) \, P(\mu)}{P(N)}$$
Now, for the prior, $P(\mu)$, we assume a prior which is flat on a logarithmic scale. That is, we guess (before making the observation) that the expected number is as likely to lie between 1 and 10 as it is to lie between 1000 and 10,000. (The alternative, a flat prior on a linear scale, would mean that we guess the true density is just as likely to lie between 10,001 and 10,010 as it is to lie between 1 and 10, which is ridiculous.) So $P(\mu) \propto 1/\mu$. The likelihood, $P(N \mid \mu)$, is given by the Poisson distribution. So, ignoring the normalizing factor of $P(N)$,
$$P(\mu \mid N) \propto \frac{1}{\mu} \, \frac{\mu^N e^{-\mu}}{N!}$$
$$P(\mu \mid N) \propto \mu^{N-1} e^{-\mu}$$
And this is the Gamma distribution. Easy peasy.
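As a sketch of how one might use this in practice, assuming the posterior for the expected number is a Gamma distribution with shape equal to the measured number and unit scale (which is what a Poisson likelihood with a 1/expected-number prior gives), `scipy.stats.gamma` does the work. The function name `number_count_posterior` and the count of 10 are invented for the example.

```python
from scipy import stats


def number_count_posterior(n_observed):
    """Posterior for the expected number, given a Poisson count and a log-flat prior."""
    if n_observed < 1:
        # With zero counts the posterior cannot be normalized under this prior
        raise ValueError("need at least one counted galaxy for a proper posterior")
    return stats.gamma(a=n_observed)


# Hypothetical observation: 10 galaxies counted in the survey area
posterior = number_count_posterior(10)
low, high = posterior.interval(0.68)  # central 68 per cent credible interval
print(posterior.mean(), low, high)
```

Dividing the mean and the interval by the survey area would turn these into statements about the number density.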
Herschel for everyone
Mar 9th
I've just learned that the Herschel Science Archive has been opened up to the world, so any old Tom, Dick or Harry can download the data and start writing their own Nature papers. Well, okay, most of the data is (are) proprietary, but there's quite a bit of public data on there. Here are some links.
Astroinformatics
Sep 23rd
Data volumes from multiple sky surveys have grown from gigabytes into terabytes during the past decade, and will grow from terabytes into tens (or hundreds) of petabytes in the next decade. ... For astronomy to effectively cope with and reap the maximum scientific return from existing and future large sky surveys, facilities, and data-producing projects, we need our own information science specialists. We therefore recommend the formal creation, recognition, and support of a major new discipline, which we call Astroinformatics. ... Now is the time for the recognition of Astroinformatics as an essential methodology of astronomical research. The future of astronomy depends on it.
Pixelating a 2-D Gaussian with Python
Sep 4th
They're coming thick and fast now.
Here's a Python function to accompany the previous post. It's not maximally efficient, but should make sense...
```python
import math

from scipy import special


def gaussian_pixel(minxy, maxxy, sigma, meanxy=(0., 0.), norm=None):
    """Return the value of a pixel sampling a 2D Gaussian,
    normalized such that the area under the Gaussian is 1 (default)
    or such that the peak is given by norm."""
    x1, y1 = minxy
    x2, y2 = maxxy
    x0, y0 = meanxy
    if norm is None:
        # Peak height of a 2D Gaussian with unit area
        norm = 1. / 2 / math.pi / sigma ** 2
    # Integral of the Gaussian over the pixel, divided by the pixel area.
    # (1 - erfc(z)) / 2 is erf(z) / 2, so each bracket is a difference of
    # 1D Gaussian cumulative distributions between the two pixel edges.
    return norm * 2 * math.pi * sigma ** 2 / (x2 - x1) / (y2 - y1) * (
        (1 - special.erfc((x2 - x0) / math.sqrt(2) / sigma)) / 2. -
        (1 - special.erfc((x1 - x0) / math.sqrt(2) / sigma)) / 2.) * (
        (1 - special.erfc((y2 - y0) / math.sqrt(2) / sigma)) / 2. -
        (1 - special.erfc((y1 - y0) / math.sqrt(2) / sigma)) / 2.)
```
On the normalization of PRFs
Sep 4th
Yesterday I said that the PRF for a map in Jy/beam (or similar) should be normalized so that the peak is 1. But this is true only for an idealised (not pixelated) PRF, or if the map has infinitesimally small pixels.
If the pixels are larger than infinitesimal, as is generally the case, then the maximum value of the pixelated PRF will be the average value over the pixel, which will be less than 1.
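For example, if the PRF is a two-dimensional Gaussian, centred on $(x_0, y_0)$, with standard deviation $\sigma$, then the value in a pixel with $x_1 < x < x_2$ and $y_1 < y < y_2$ will be
$$P = \frac{1}{(x_2-x_1)(y_2-y_1)} \int_{x_1}^{x_2} \int_{y_1}^{y_2} \exp\left[-\frac{(x-x_0)^2 + (y-y_0)^2}{2\sigma^2}\right] \mathrm{d}y \, \mathrm{d}x$$
$$= \frac{\pi \sigma^2}{2 (x_2-x_1)(y_2-y_1)} \left[\mathrm{erf}\left(\frac{x_2-x_0}{\sqrt{2}\sigma}\right) - \mathrm{erf}\left(\frac{x_1-x_0}{\sqrt{2}\sigma}\right)\right]$$
$$\times \left[\mathrm{erf}\left(\frac{y_2-y_0}{\sqrt{2}\sigma}\right) - \mathrm{erf}\left(\frac{y_1-y_0}{\sqrt{2}\sigma}\right)\right]$$
Ugh. Let's make that simpler. For a PRF centred on $(0, 0)$, and a pixel $-a/2 < x, y < a/2$, this is
$$P = \frac{2\pi\sigma^2}{a^2} \, \mathrm{erf}^2\left(\frac{a}{2\sqrt{2}\sigma}\right)$$
As an example, the fairly-Gaussian beam for the Herschel Space Observatory SPIRE instrument has an FWHM of around 18", which corresponds to a standard deviation of around $7.6''$. If we make a Jy/beam map with pixel size 6", then the peak value for a 1 Jy point source in the centre of a pixel will be
$$P = \frac{2\pi (7.6)^2}{6^2} \, \mathrm{erf}^2\left(\frac{6}{2\sqrt{2} \times 7.6}\right) \simeq 0.95 \ \mathrm{Jy/beam}$$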
No big deal really...
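The SPIRE numbers are easy to check with a few lines of Python. This sketch assumes a unit-peak Gaussian PRF centred on a square pixel of side a, for which the pixel average is 2πσ²/a² × erf²(a / (2√2 σ)); the function name `centred_pixel_value` is made up for the example.

```python
import math


def centred_pixel_value(pixel_size, sigma):
    """Average of a unit-peak 2D Gaussian over a square pixel centred on the peak."""
    erf_term = math.erf(pixel_size / (2 * math.sqrt(2) * sigma))
    return 2 * math.pi * sigma ** 2 / pixel_size ** 2 * erf_term ** 2


fwhm = 18.0  # arcsec, approximate SPIRE beam quoted in the text
sigma = fwhm / (2 * math.sqrt(2 * math.log(2)))  # FWHM to standard deviation
peak = centred_pixel_value(6.0, sigma)
print(sigma, peak)
```

This gives a standard deviation of about 7.6" and a peak pixel value of about 0.95, and the value tends to 1 as the pixel size shrinks towards zero.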
Estimating the flux of a point source
Sep 3rd
You have a map and you know what a point source looks like. How do you filter the map so that the value of each pixel is now the most likely flux of a point source centred on that pixel? (An isolated point source, to be more precise.)
Easy.
First, find $P_i$, which is the point response function (PRF), telling you what a point source of flux 1 will look like in the map. This may be normalized so that the peak is 1 (if your map is in Jy/beam or similar), or so that $\sum_i P_i = 1$ (if your map is in Jy/pixel or similar). If your map is in MJy/sr ... well, figure it out and add a comment below. Basically, if you normalize your PRF correctly, you won't need to worry about the map units in what follows. Phew.
Now the measured value of each pixel around the point source, $d_i$, will be
$$d_i = f P_i + n_i$$
where $f$ is the flux of the source and $n_i$ is the noise, drawn from a normal distribution with mean zero and standard deviation $\sigma_i$.
Now the badness of the fit is measured by the $\chi^2$, which is given by
$$\chi^2 = \sum_i \frac{(d_i - f P_i)^2}{\sigma_i^2}$$
For the best-fitting value of $f$, $\chi^2$ will be at a minimum, so
$$\frac{\mathrm{d}\chi^2}{\mathrm{d}f} = -2 \sum_i \frac{P_i (d_i - f P_i)}{\sigma_i^2} = 0$$
Solving for $f$, we find the maximum likelihood solution
$$\hat{f} = \frac{\sum_i d_i P_i / \sigma_i^2}{\sum_i P_i^2 / \sigma_i^2}$$
Now just do this for each pixel in the map (corresponding to a point source centred on each pixel) and you're done.
Worked example. $P_i$ is 0.5, 1.0 and 0.5, for three adjacent pixels (you'll have realised that the map is in Jy/beam or similar), and $d_i$ is 1, 2 and 1 Jy/beam, for three adjacent pixels, with the same (tiny!) value of $\sigma_i$ for each pixel (in this case, we can ignore the value of $\sigma_i$ in what follows). So the flux at the central pixel is estimated to be
$$\hat{f} = \frac{0.5 \times 1 + 1.0 \times 2 + 0.5 \times 1}{0.5^2 + 1.0^2 + 0.5^2} = \frac{3}{1.5} = 2 \ \mathrm{Jy}$$
This is an example of a matched filter (I haven't read the page, but hopefully including the link will make me look clever). And, given that point sources are under no particular obligation to align themselves with the centres of the pixels of your map, $f$ can easily be re-estimated for a source with a certain offset from the pixel centre.
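The recipe can be sketched in a few lines of NumPy, assuming the maximum likelihood flux is the noise-weighted sum of data times PRF divided by the noise-weighted sum of the PRF squared. The function names are invented, the map is one-dimensional to keep things short, and the five-pixel map is just the worked example padded with zeros.

```python
import numpy as np


def point_source_flux(data, prf, sigma):
    """Maximum likelihood flux of a point source centred on the PRF peak."""
    data = np.asarray(data, dtype=float)
    prf = np.asarray(prf, dtype=float)
    weights = 1.0 / np.broadcast_to(np.asarray(sigma, dtype=float), data.shape) ** 2
    return np.sum(data * prf * weights) / np.sum(prf ** 2 * weights)


def filter_map_1d(map_values, prf, sigma):
    """Apply the same estimator with the PRF centred on every pixel of a 1D map."""
    map_values = np.asarray(map_values, dtype=float)
    prf = np.asarray(prf, dtype=float)
    weights = 1.0 / np.broadcast_to(np.asarray(sigma, dtype=float), map_values.shape) ** 2
    numerator = np.correlate(map_values * weights, prf, mode="same")
    denominator = np.correlate(weights, prf ** 2, mode="same")
    return numerator / denominator


prf = [0.5, 1.0, 0.5]
flux = point_source_flux([1.0, 2.0, 1.0], prf, sigma=0.01)
filtered = filter_map_1d([0.0, 1.0, 2.0, 1.0, 0.0], prf, sigma=0.01)
print(flux)      # 2.0 for the worked example
print(filtered)  # central pixel is 2.0
```

Pixels near the edge of the map only see part of the PRF, so their estimates rest on fewer pixels; a real map would also need the two-dimensional version of the same sums.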
The science of galaxy formation...
Aug 26th
...is the title of a provocative article by Gerry Gilmore(*) on today's astro-ph. There's a bit about the scientific method, such as:
The appropriate scientific methodology with which to address such questions is itself problematic: how does one apply what many consider the “traditional scientific method”, involving objective analysis of independent repeated experiments as a test of theory, when the Universe does not allow us to experiment, in the traditional laboratory physics sense; when we have no useful predictive theory for much of astrophysics; and when the nature of the Universe may restrict our observation to only a very small part of an unobservable larger whole? More specifically, is the observational test of prediction how science actually operates? Is that how astrophysics operates?
Good stuff. But the most cutting remarks come in his assessment of the current approach to modelling galaxy formation:
Such a long list of observations all inconsistent with apparently fundamental features of galaxy formation models suggests two approaches. In one approach, new complex physics (“feedback”) must be added, to “improve” agreement with observation. The appearances are to be saved. In another, common assumptions in the galaxy simulations could be examined further.
With the reference to the saving of appearances, the allusion is to Ptolemy's epicycles: making a misguided model seem more plausible by making it more contrived.
The specific problem Gilmore sees with cosmological simulations is the suppression of the "ultraviolet divergence", i.e., small-scale perturbations, by "numerical smoothing (‘finite resolution')": "It is unlikely that Nature does it that way." He suggests that many of the inconsistencies between galaxy formation models and observations could be a result of this poor handling of the small-scale power spectrum.
(*) Disclaimer: I will not be held responsible for any damage sustained to your eyes as a result of following links on this page.
Papers: your personal library of science
Jul 22nd
Looking for a piece of software for your Mac that will allow you to:
- keep track of PDFs of academic papers,
- search for papers using Google Scholar, ADS, arXiv, ...,
- search your personal library in an instant,
- read papers full-screen,
- add notes to papers,
- organise the papers using collections and smart collections,
- interact with BibTeX databases and citation keys,
- and do all the above in something that looks and feels like iTunes?
Here it is: Papers by mekentosj.com.
Galaxy Zoo: the independence of morphology and colour
Jun 3rd
Galaxies come in two types: red, elliptical galaxies that reside in high-density regions, and blue, spiral galaxies that reside in low-density regions. Right?
Actually, no.
At least, not according to this Galaxy Zoo paper, on the independence of morphology and colour (or here).
First of all, there's a sizeable population of galaxies that blatantly refuse to allow their colour to determine what shape they should be. There are red galaxies with beautiful spiral morphology and blue galaxies with plain old elliptical morphology.
Okay, but we know that red galaxies like to hang out in crowded places, and that elliptical galaxies are similarly gregarious, so clearly there's some connection between being red and being well-rounded?
Nope, wrong again!
The main reason that we see more red galaxies in dense environments is that the fraction of spiral galaxies that are red changes, and the fraction of elliptical galaxies that are blue changes. So in sparsely populated bits of the universe, most of the spiral galaxies are blue, but in densely populated regions, most of the spiral galaxies are red. It's similar for elliptical galaxies. In low-density regions, a large fraction (not quite half) of the elliptical galaxies are blue, whereas in dense environments the vast majority of elliptical galaxies are red.
So the morphology-density relation has really very little (directly) to do with the colour-density relation.
Moral: "elliptical/spiral" doesn't mean "red/blue"!
UKIDSS paper submitted
Jun 3rd
Well, the deed has been done, and the paper has finally been submitted to MNRAS and to astro-ph. You can read it if you really want to: Luminosity and surface brightness distribution of K-band galaxies from the UKIDSS Large Area Survey. Here's a picture from the paper:
This is the K-band luminosity function: the number of galaxies per volume as a function of their luminosity, with low luminosity at the left and high luminosity at the right. It's far from perfect, but hopefully a step in the right direction. There's quite a bit of incompleteness (missing galaxies) and uncertainty (due to small numbers of galaxies and large-scale structure) at the faint end (left-hand side of the plot). But perhaps more interesting is the disagreement at the bright end (right-hand side). All of the previous results shown on the plot used 2MASS imaging, so this might explain the different results we have found. Specifically, it could be that (1) we use Petrosian magnitudes rather than Kron or total magnitudes, (2) UKIDSS photometry is better than 2MASS photometry, (3) the evolution corrections are different, (4) something else or (5) any combination of the above.
Evolution of Schechter function ... so?
Apr 4th
This is some work in progress: K-band luminosity function from the UKIDSS Large Area Survey (LAS, black dots), showing the number of galaxies per unit volume depending on the luminosity of the galaxies, from faint (left) to bright (right). I.e., there are lots more small galaxies than big galaxies.
I've fit several Schechter functions to the data. This is a convenient way of describing the luminosity function in terms of three numbers: the slope of the faint end (alpha), the luminosity brighter than which the number of galaxies drops off rapidly (M-star) and the number of galaxies per unit volume at M-star (phi-star). To fit the Schechter functions I've used only a portion of the data, as shown in the figure. For example, for the green curve, I've used only the black points brighter than (to the right of) absolute magnitude -21.
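For reference, here is the Schechter function in its usual absolute-magnitude form as a small Python function. The parameter values in the example are purely illustrative and are not the fitted values from the figure.

```python
import numpy as np


def schechter_magnitude(abs_mag, alpha, m_star, phi_star):
    """Schechter luminosity function per unit magnitude.

    alpha is the faint-end slope, m_star the characteristic magnitude
    and phi_star the normalization (number per unit volume).
    """
    # Luminosity in units of the characteristic luminosity
    x = 10.0 ** (0.4 * (m_star - np.asarray(abs_mag, dtype=float)))
    return 0.4 * np.log(10.0) * phi_star * x ** (alpha + 1.0) * np.exp(-x)


# Hypothetical parameter values, chosen only to exercise the function
alpha, m_star, phi_star = -1.0, -23.0, 0.01
magnitudes = np.array([-25.0, -23.0, -21.0])
phi = schechter_magnitude(magnitudes, alpha, m_star, phi_star)
print(phi)
```

Brighter than M-star the exponential takes over and the counts fall away quickly, which is why a fit restricted to the bright end has so little to say about the faint-end slope.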
Now here's the point. At high redshift, it is possible to see only the brightest galaxies. So we would be able to plot only the black points towards the right-hand side of the figure. But what effect would this have on the Schechter function? Even if we assume the luminosity function does not vary with redshift, our Schechter function fits would! In fact, if we relied on the Schechter function fit to tell us how the galaxy population varied with redshift (a silly thing to do, but people do it all the time), we would infer that the high-redshift galaxy population was (1) brighter (2) more dominated by small galaxies and (3) less abundant than the low-redshift galaxy population.
(Now (1) and (3) are probably true, but we don't need the Schechter function to tell us. Not so sure about (2).)
Moral: don't rely on the Schechter function!
A galaxy being emitted by a star
Feb 28th
UKIDSS at ESO
Dec 20th
Just back from my first visit to Garching (near Munich). ESO, to be more specific. The reason for the visit: a three-day workshop on Science from UKIDSS.
Here's the gist of it. Lots of good results already, lots of work in progress, and a sense that UKIDSS has come of age: the needle-in-a-haystack hunters now have enough hay (they hope!) to find some record-breaking needles (the smallest, nearest or furthest known luminiferous objects in the Universe) and the (Galactic or extra-Galactic) Gallup pollers have now canvassed enough individuals (stars or galaxies) to be reasonably confident about the views of the whole population.
I'm one of the extra-Galactic Gallup pollers. Some slides from the talk I gave on the final morning are on my (small but growing!) publications page.
Next tasks:
- Investigate the problem with deblending of large galaxies
- Write paper
- Write thesis
- Get job
Filtering astro-ph with CosmoCoffee
Dec 14th
One of the things mentioned in Sarah Bridle's talk at YAM last week was a filter for arXiv.org provided by CosmoCoffee. I decided to sample it this week.
After creating an account on CosmoCoffee, you will need to edit the keywords in your profile to reflect your interests (well, I did!). Then click on Arxiv new filter and you're off!
Here are my settings:
- Arxives in order of interest: astro-ph
- Arxiv New search key strings: galaxy (redshift )?survey, luminosity (function|density), surface brightness, UKIDSS, UKIRT, VISTA, SDSS, Sloan, WFCAM, near infrared, stellar mass, star formation (rate|history), galax, Bayes, redshift, astro-ph, ADS, extragalactic
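Several of the key strings above are regular expressions. To see what they match, here is a small Python sketch using the standard `re` module. It assumes case-insensitive matching and ranking by the number of key strings matched, which is a guess at the behaviour and not a description of how CosmoCoffee actually works; the titles are made up.

```python
import re

# A few of the key strings from the settings above
KEY_STRINGS = [
    r"galaxy (redshift )?survey",
    r"luminosity (function|density)",
    r"surface brightness",
    r"UKIDSS",
]


def count_matches(title, key_strings=KEY_STRINGS):
    """Number of key strings found in a title (case-insensitive, an assumption)."""
    return sum(1 for pattern in key_strings
               if re.search(pattern, title, flags=re.IGNORECASE))


# Hypothetical paper titles
titles = [
    "Dust in nearby molecular clouds",
    "A new galaxy survey in the near infrared",
    "The luminosity function and surface brightness of UKIDSS galaxies",
]
kept = [title for title in titles if count_matches(title) > 0]
ranked = sorted(kept, key=count_matches, reverse=True)
print(ranked)
```

The optional group in `galaxy (redshift )?survey` is what lets the second title through, since it matches both "galaxy survey" and "galaxy redshift survey".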
And here are the results:
- Monday: 54 new on astro-ph, of which 21 made it through the filter. These were not only filtered but also sorted by CosmoCoffee so the most relevant were listed first. Very useful.
- Tuesday: 76 on astro-ph; 27 on CosmoCoffee. It missed The Future of Cosmology by George Efstathiou, which was a fun read. But I can't think of any way to adjust the CosmoCoffee filter to catch papers like this, without catching loads of other cosmology papers. But in the full astro-ph listings I skimmed over the paper on globular clusters and their host galaxies, which was ranked highly by CosmoCoffee.
- Wednesday: 42 on astro-ph; 14 on CosmoCoffee, filtered and sorted just right.
- Thursday: 52 on astro-ph; 22 on CosmoCoffee. Hmm, wish I knew more about dwarf galaxies.
- Friday: 30 on astro-ph; 7 on CosmoCoffee (must be getting near Christmas). Glad I skimmed through astro-ph, as my filter settings excluded this fascinating article on the history of dark energy. Apparently Newton thought of it (or something like it) 320 years ago!
Conclusion: based on this week's experience, I'm likely to miss interesting and relevant papers if I use either astro-ph or CosmoCoffee ... so I'll use both! Start each day adagio on CosmoCoffee, accelerando poco a poco, then prestissimo through astro-ph.
UKIDSS poster
Dec 10th
Last Friday was the RAS Young Astronomers Meeting up in Edinburgh. I presented a poster, A census of K-band galaxies from the UKIDSS Large Area Survey, which I've just put online on my (very short!) publications page.





I live in York and I'm a research fellow in