A global source of population density has been on my low-priority wish list for some time, so I was very excited when I found Steve Mosher’s work with the Nighlights data set. “Nightlights” refers to the artificial lights seen from space at night. Astronomers call it “light pollution” which is pretty accurate since it’s decidedly not the light which we all use to see and avoid peril at night. Rather, it’s the light (and energy) wasted in that pursuit by being emitted or reflected straight up into the sky.
Steve has since spent some quality time with other R packages like Rgooglemap exploring this data set and has noticed some problems with the geocoding of the nightlights data. I noticed the same thing, though much more naively, just trying to check out the data around my home:
library(RCurl) library(R.utils) library(rgdal) library(raster) url_radianceCalibrated = "ftp://ftp.ngdc.noaa.gov/DMSP/web_data/x_rad_cal/rad_cal.tar" calibratedLights = "rad_cal.tar" hiResTif = "world_avg.tif" download.file(url_radianceCalibrated,calibratedLights,mode="wb") untar(calibratedLights) gunzip(paste(hiResTif, '.gz', sep='') hiResLights = raster("world_avg.tif") # Eastern Mass., Cape Cod & Islands: e = extent(-71.5, -69.5, 41, 43) r = crop(hiResLights,e) plot(r)
which looks amazing… right up until you overlay the county boundaries from the standard ‘maps’ package:
library(maps) map("county",xlim=c(-71.5,-69.5),ylim=c(41,43),add=T)
Alas. To my eye, there’s a clear shift to the northwest (see Provincetown at the tip of Cape Cod), and perhaps a bit of a clockwise rotation as well (see the big bulge of Cape Ann north of Boston).
Newer = better… ?
I have a lot to learn about this data, but in my poking around, I did find more recent observations available on a “Version 4 DMSP-OLS Nighttime Lights Time Series” page. But warning — these files are big. The tar file I download next is over 350MB:
url = "http://www.ngdc.noaa.gov/dmsp/data/web_data/v4composites/F152007.v4.tar" dest = "F152007.v4.tar" tif = "F152007.v4b_web.stable_lights.avg_vis.tif" download.file(url, dest) untar(dest) gunzip( paste(tif, '.gz', sep='') ) f15 = raster(tif) e = extent(-71.5, -69.5, 41, 43) r = crop(f15, e) plot(r) map("county",xlim=c(-71.5,-69.5),ylim=c(41,43),add=T)
which looks a lot better, though still probably not perfect.
ggplot2 with raster
I am a huge fan of ggplot2, but since this was my first exposure to the raster package, I just copied and pasted Steve’s code to make the plots. But I couldn’t help myself to try to reproduce them in ggplot2.
Getting your data into a data.frame
is the key to using ggplot2. Fortunately, the raster package includes a rasterToPoints()
function which outputs a list which is easily cast to a data.frame:
p = rasterToPoints(r) df = data.frame(p) colnames(df) = c("lon", "lat", "dn")
which makes the actual plotting so easy, even qplot()
will do it:
library(ggplot2) qplot(lon, lat, color=dn, data=df) + scale_colour_gradient(low="black", high='white', transform='sqrt') + theme_bw() + borders("state", colour="yellow") + xlim(c(-71.5, -69.5)) + ylim(c(41, 43))
The only technical glitch is in the overlay, as zooming in truncates the northernmost coastline points but the geom_polygon()
layer created by borders()
seems compelled to close the shape and connects the northern Mass. coast with Rhode Island.
October 15, 2010 at 12:28 AM
Very cool.
a couple of us have been trying to figure out the geocoding errors with Nightlights. It may come down to a rounding issue. there appears to be a 1 cell shift in certain areas,
The other data you refer to is a product I hope to look at, but for now I am sticking with the radiance calibrated lights to finish this work
When I’m done I’ll post all the code.
Also, population density at 1km is available from 3 sources. I’ll be covering that in future posts.
October 21, 2010 at 7:58 PM
Hi Steve:
Thanks for visiting and commenting. I have enjoyed your posts on this topic and look forward to more.
I am particularly interested in any sources of population density you can recommend since I would like to use it as to compare the noise impact of various approach and departure procedures to airports. NextGen’s satellite-based navigation significantly increases the freedom to design new approaches; a readily-available source of population density could make it easier to be considerate of those on the ground.
Thanks again,
Jeffrey
October 23, 2010 at 5:29 PM
Ok do you have specific region you are thinking about?
Global datasets are available ( google GPW) and there is GRUMP and a third ( I’ll remeber the name)
Then also you’ll want to look perhaps At NCEP reanalysis for historical wind reconstructions. I’m not sure of the accuracy there
October 23, 2010 at 5:32 PM
if you like you can also just go from Nights to population density using Imoff’s catagories.
That is, the nightlights are cut into 7 categories, each corresponding to a different density of people per square ha.
October 15, 2010 at 2:31 PM
about the data shift:
The county data in the maps package are somewhat generalized and imprecise. For better comparisons, here is much more precise coastline (border) of the USA (but the shift remains)
library(raster_
usa <- getData('GADM', country="USA", level=0)
One minor thing to correct for imprecision because gdal reports the upper left corner and the resolution:
xmax(hiResLights) = 180
ymin(hiResLights) = -90
October 21, 2010 at 8:14 PM
Hi Robert:
Wow — that coastline data is beautiful. I’ll have to throw it in a new post to show it off.
Thanks for pointer!
Jeffrey
October 16, 2010 at 3:24 PM
I believe you could use
scale_x_continuous(limits=c(41, 43))
instead of xlim(c(41,43)) to get the axes truncated without the artifacts …
October 21, 2010 at 8:13 PM
Hi Ben:
Thanks for visiting and for the suggestion. Unfortuntely, that didn’t make a difference for me. The docs say that xlim(…) and ylim(…) are just shorthand for scale_x_continuous(limits=…) and scale_y_continuous(limits=…).
This code after that from the help page for borders() seems to fix the artifacts:
ma = map_data(“county”, “massachusetts”)
ggplot() + geom_polygon(data=ma, aes(long,lat,group=group), color=”yellow”) + xlim(c(-71.5, -69.5)) + ylim(c(41, 43))
But I think Robert (comment above) nails it by using higher quality coastline data.
Thanks,
Jeffrey
October 22, 2010 at 11:29 AM
[…] Mac users are bett…Marc Schwartz on vecLib: Why Mac users are bett…Jeffrey Breen on Nightlights: cool data, bad…Jeffrey Breen on Nightlights: cool data, bad…Jeffrey Breen on Nightlights: cool data, […]
July 13, 2011 at 11:31 PM
Just let me correct something:
ftp://ftp.ngdc.noaa.gov/DMSP/web_data/x_rad_cal/
July 14, 2011 at 8:57 AM
Hi Carlos:
Thanks for the correction — it must have moved. I’ll update it now.
Jeffrey
August 26, 2011 at 2:25 AM
I Have been trying to access again, however is not working now!