Astronomical Techniques - Data Archives

Astronomy has a long history of exploiting data originally taken by someone else or for another purpose, going back to the imperial Chinese records of nova and supernova outbursts, through the plate vaults at Lick, Mt. Wilson, and Palomar, and brought to real public availability by digital data and networking. For many astronomers, it was the IUE experience that brought home the value of an archive - by the end of the mission, more than 100,000 spectra had been taken, and the archive was producing more published results than were new data. The WWW has added value to the whole concept. Now that data are digital, they can be exactly duplicated (unlike photographic plates), and the HTTP interface avoids the plethora of individual account authorizations that were briefly common in the early 1990s. And by now, networks are fast enough that shovelling a GByte of data around is becoming routine (if not exactly friendly).

A useful data archive must be filled - that is, the actual data must be routinely retrievable. It must be searchable and preferably indexed on useful quantities such as position, time, and perhaps target identifiers. The spherical nature of celestial coordinates makes position searches a bit more complex than commercial databases support off the shelf. It must be documented - the user wasn't there at the time, so every factor that might affect data quality or interpretation must be saved along with the actual data bits. This makes data from space missions more amenable to archiving than data from many Earthbound sites with changeable atmospheric conditions.
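To see why position searches need more care than a simple rectangular range query, here is a minimal sketch of a cone search in Python. It is only an illustration, not how any particular archive does it: the function names, the record layout, and the sample positions are all assumptions made up for this example. The great-circle separation (haversine form) handles both the wrap of right ascension at 0°/360° and the convergence of RA lines near the poles, which a plain "RA between a and b" test does not.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two (RA, Dec) positions,
    all given in degrees. Uses the haversine form, which stays accurate
    for the small separations typical of archive searches."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp against rounding slightly above 1 before taking asin
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))

def cone_search(records, ra0, dec0, radius_deg):
    """Return the records lying within radius_deg of (ra0, dec0).
    A brute-force scan; a real archive would use a spatial index."""
    return [r for r in records
            if angular_separation_deg(ra0, dec0, r["ra"], r["dec"]) <= radius_deg]

# Hypothetical observation log (placeholder positions, not real data)
records = [
    {"id": "obs_a", "ra": 359.9, "dec": 0.0},   # just west of RA = 0
    {"id": "obs_b", "ra": 0.1,   "dec": 0.0},   # just east of RA = 0
    {"id": "obs_c", "ra": 180.0, "dec": 89.9},  # very close to the pole
    {"id": "obs_d", "ra": 10.0,  "dec": 0.0},
]

# Finds obs_a and obs_b, which sit on opposite sides of the RA wrap
equator_hits = cone_search(records, ra0=0.0, dec0=0.0, radius_deg=0.5)

# Near the pole, a 180-degree difference in RA can be a tiny angle on the sky
pole_hits = cone_search(records, ra0=0.0, dec0=90.0, radius_deg=0.5)
```

A naive query of the form "RA within 0.5° of 0" would miss obs_a entirely, and near the pole it would miss obs_c even though it lies only 0.1° from the search center.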

The standard in archiving is set by the HST project. The data are all uniform and of identifiable quality, and the project has more resources for managing them than anyone else. The archive offers not only a range of search and sorting options, but the ability to see a highly compressed image preview and a view of the field superimposed on the Digital Sky Survey. It then offers ftp delivery to your machine. This works so well that STScI has taken over management of the archives from the International Ultraviolet Explorer (IUE), Extreme Ultraviolet Explorer (EUVE), Far-UV Spectroscopic Explorer (FUSE), and Copernicus. The archive may be found at http://archive.stsci.edu with the multimission options at http://archive.stsci.edu/mast.html. This is also the server for the VLA FIRST 21-cm sky survey.

X-ray and γ-ray data are mostly to be found at the High Energy Astrophysics Science Archive Research Center (HEASARC). This does not include the full data set for the ROSAT all-sky scanning survey, which must be retrieved from the Max-Planck-Institut, but does include all ROSAT pointed observations.

Infrared data may be found at the IPAC IRSA site, including IRAS, ISO and Two Micron All Sky Survey (2MASS) products. IRAS had nearly full-sky coverage with a polar ecliptic scanning scheme. 2MASS uses many short exposures, stepping the telescope in between, while ISO was a more traditional observatory with many small targeted fields for imaging and spectroscopy. The actual ISO archive interface is at the ESA ISO Data Centre.

For ground-based data, the most extensive archive is the Digitized Sky Survey, comprising the Palomar Sky Survey and the ESO-SRC survey of the southern sky. There are two versions available, plus a CDROM set. The WWW sources are Skyview (http://skyview.gsfc.nasa.gov/skyview.html) and the HST proposal-preparation site (http://stdatu.stsci.edu/dss/dss_form_phase2.html). The PSS had a pivotal role in the astronomy of the 1960s, and is worth knowing about in some detail. The Sloan Digital Sky Survey (SDSS, http://www-sdss.fnal.gov:8000/) will have a similar impact at the start of the 21st century.

The PSS used the 1.2-m Schmidt at Palomar, deliberately built to survey potential targets for the Hale telescope. On 14-inch plates, it covered slightly more than a 6° × 6° field at once, so the survey has strips centered at δ = 0°, ±6°, ±12°... photographed in red (E) and blue (O) light. This is where the Abell and Zwicky catalogs came from (and in fact some of the plates were taken by George Abell as a graduate student, so he got the first look at many fields). The POSS was a huge advance, magnitudes deeper than the earlier photographic surveys. The limiting magnitude is about 21, and varies depending on the generation of glass or paper copy you can find. This was complemented in the 1980s by a southern survey (originally from -15° south, later extended to the equator) as a joint effort with the 1-m ESO Schmidt (red-light F plates) and the 1.2-m UK Schmidt in Australia (blue J plates). Better corrector plates and emulsions make these data about a magnitude deeper than the original sky survey, and they have better overlap, being taken on 5° centers. This scheme was repeated for the POSS-II, whose film copies are still being distributed, and for which the Palomar Schmidt was optically upgraded (and renamed Oschin) as well as using the new III-class plates. The original survey was reviewed by Lund and Dixon (1973 PASP 85, 230). Some catalogs still list rectangular (x,y) coordinates on the PSS prints, since this wasn't easy to use in the days before digitized images (you can still find the plastic coordinate overlays for each image). In the years just before the POSS-II, a short-exposure yellow-light survey ("Quick V") was done from Palomar to produce the HST Guide-Star Catalog, to reduce effects of proper motion since the late 1950s. The southern surveys were recent enough not to need such a repeat performance. There are efforts to catalog objects from scans of the POSS plates, such as the Minnesota APS effort and the Edinburgh COSMOS group.

Traditional ground-based observatories have lagged behind in producing useful archives, both because of expense and because the vagaries of weather and multiple observers make documenting the observations and data quality a real challenge. The Isaac Newton Group at La Palma has a usable archive, as does the Canada-France-Hawaii telescope. There has been heated discussion about whether to keep a Keck archive, since astronomers at one of its operating institutions are adamantly against the whole idea. The Gemini observatory has as a built-in requirement filling and maintaining an archive, as does ESO's Very Large Telescope (which, alas, requires a European domain address for access). KPNO and CTIO run a minimal "save-the-bits" operation which could be turned into an archive, but for now retrieval really is an emergency procedure.

Additional radio surveys online include the NRAO VLA Sky Survey, a 20-cm survey with different surface-brightness sensitivity than FIRST. Raw data have been archived at the VLA since very early in its operation, for those who want to reprocess them to modern standards. Single-dish surveys exist in both skymap and catalog form, having largely been done with scanning patterns in either right ascension or declination, while the VLA surveys use a large number of slightly overlapping short pointings.

A related issue is the availability and searchability of catalogs, that is, compilations of derived quantities. Most of these are in fact available electronically. Whether planning observations or writing a paper, it's a really good idea to make sure you're current on what's already known. Data collections keyed to specific objects are SIMBAD, primarily for stars (with a US mirror site), and the NASA/IPAC Extragalactic Database (NED) for extragalactic objects (galaxies and QSOs). One generally wants to access catalog data either horizontally (a set of data uniformly derived for all objects in a class) or vertically (a wide range of data from various sources for a single object), and the collection strategies must be tuned to these needs.
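The horizontal/vertical distinction can be made concrete with a toy example. This is only a sketch under invented assumptions: the object names, survey names, quantities, and values below are placeholders, not real measurements, and real services such as SIMBAD and NED are organized in far more elaborate ways. The point is just that the same pile of measurements gets indexed two different ways depending on the question being asked.

```python
from collections import defaultdict

# Hypothetical measurements: (object, source catalog, quantity, value).
# All names and numbers are placeholders for illustration only.
measurements = [
    ("obj_1", "survey_A", "mag_B", 14.2),
    ("obj_2", "survey_A", "mag_B", 15.7),
    ("obj_1", "survey_B", "flux_20cm_mJy", 3.1),
    ("obj_1", "survey_C", "redshift", 0.021),
]

by_source = defaultdict(dict)  # horizontal: one uniform quantity, many objects
by_object = defaultdict(dict)  # vertical: one object, many heterogeneous sources

for obj, source, quantity, value in measurements:
    by_source[(source, quantity)][obj] = value
    by_object[obj][(source, quantity)] = value

# Horizontal access: the same uniformly derived quantity for every object
uniform_mags = by_source[("survey_A", "mag_B")]

# Vertical access: everything known about a single object
everything_on_obj_1 = by_object["obj_1"]
```

A survey catalog is naturally stored the first way, while an object-keyed service has to build and maintain the second index by cross-identifying the same object across many sources.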

Easy access to these data resources has changed the practice of astronomical research, and this will only continue in coming years. Many projects once requiring dedicated observations with a 1-m telescope can now be done at your desktop using SDSS and 2MASS data. Particularly for imaging and high-dispersion spectroscopy, there's a great deal of information in HST data which goes beyond the goals of the original proposal, and by now some large programs are done in a public-service mode (a classic example is the pair of Hubble Deep Fields). The success of the HDF model makes large public projects more and more attractive. In the temporal domain, the International AGN Watch (http://www.astronomy.ohio-state.edu/~agnwatch/) includes observations by more than 100 astronomers spanning nearly 20 years.

The Virtual Observatory era



keel@bildad.astr.ua.edu
2006	  © 2000-2006