select m.pgc,m.objname,m.al2000,m.de2000,s.id,s.name,s.ra,s.dec into ledasn from leda.m000 as m, sn.sn as s where q3c_join(m.al2000,m.de2000,s.ra,s.dec,0.05);
It took about 90 seconds to join 2,000,000 galaxies with 3348 sne to produce about 20,000 results (yes, 0.05 is too much !) on very modest linux box: PIII 1Gz, 256Mb RAM, PostgreSQL 8.1.2, q3c 1.1
q3c doesn't supports ellipses yet, so for now I tried circle (should correct for positional angle !)
select m.pgc,m.objname,m.al2000,m.de2000,s.id,s.name,s.ra,s.dec into ledasn from leda.m000 as m, sn.sn as s where q3c_join(m.al2000,m.de2000,s.ra,s.dec,2*10^(m.logd25-1)/60.);
use coalesce() function to specify default joining diameter (if logd25 is null).
Well, I finally decided to abandon galaxy table from our catalogue and to use hyperleda as a reference data for galaxies. This will greatly simplify maintenance of our catalog and make it more transparent. But, I see a problem with support of several reference tables for galaxies. For example, hyperleda is ok for close galaxies, but modern sn surveys look much further and we certainly need other catalogs with faint galaxies. I need to figure out how to modify SQL scheme. Ideally, we want be able to:
CREATE TABLE history ( id bigint PRIMARY KEY, who text, when timestamp without time zone, what text, old ???? );
For now, I use triggers to store modification datetime in mod_date column.
The problem:It's needed to maintain catalog, which contains data collected from different catalogs (primary). We want to find out the best database schema which supports synchronization of data in such (secondary) catalog with primary catalogs.
It's possible to store every property of a galaxy as a separate row, so we could maintain separate table with references to the source catalog and record id in source catalog.
CREATE TABLE cats ( id integer PRIMARY KEY, name text ); -- a list of properties of galaxy CREATE TABLE properties ( id integer PRIMARY KEY, name text ); CREATE TABLE galaxy ( id integer PRIMARY KEY, name text ); CREATE TABLE galaxy_property ( gid integer REFERENCES galaxy(id), pid integer REFERENCES properties(id), value float, UNIQUE(gid,pid) ); CREATE TABLE map ( gid integer REFERENCES galaxy(id), -- galaxy id pid integer REFERENCES properties(id), -- property id cid integer REFERENCES cats(id), -- catalog id cidid integer, -- id in catalog cid UNIQUE(gid,pid,cid) );
t=# select * from galaxy; id | name ----+------ 1 | N224 t=# select * from galaxy_property where gid=1; gid | pid | value -----+-----+------- 1 | 1 | 0.74 1 | 2 | 41.3 1 | 3 | 0 t=# select * from properties; id | name ----+------ 1 | ra 2 | dec 3 | z t=# select * from cats; id | name ----+------- 1 | LEDA 2 | NED 3 | SDSS 4 | OTHER (4 rows)
Benefits:
Drawbacks:
Instead of maintaining separate row for each attribute, we use custom composite type, which encapsulates knowledge about value and source catalog where this value comes from. See table galaxy_properties instead of galaxy_property'.
CREATE TYPE val_cid as ( value float, cid integer); CREATE TABLE galaxy_properties ( gid integer REFERENCES galaxy(id), ra val_cid, dec val_cid, z val_cid );
Notice, composite type designated as BLOB on picture!
t=# select * from galaxy_properties; gid | ra | dec | z -----+----------+----------+------- 1 | (0.74,1) | (41.3,1) | (0,3) (1 row)
For composite type we should use special syntax to access specific component.
t=# select * from galaxy_properties where (z).cid = 3; gid | ra | dec | z -----+----------+----------+------- 1 | (0.74,1) | (41.3,1) | (0,3) (1 row)
Benefits:
Drawbacks:
Example query - identify source (source catalog, record id in source catalog) of redshift of galaxy with gid = 1.
t=# select g.id, g.name as galaxy, c.name as catalog,m.cidid from galaxy_properties gp, cats c, map m,galaxy g,properties p where (gp.z).cid = c.id and gp.gid = 1 and m.gid = gp.gid and m.cid = c.id and g.id = gp.gid and m.pid=p.id and p.name='z'; id | galaxy | catalog | cidid ----+--------+---------+------- 1 | N224 | SDSS | 178 (1 row)
What if we combine tables galaxy and galaxy_properties ? Then, if assume, that (gid,pid) uniquely connected with (cid,cidid), we could a bit easier identified the source of a specific column.
t=# select g.gid, g.name as galaxy, c.name as catalog,m.cidid from galaxies g, cats c, map m, properties p where g.gid = 1 and m.gid = g.gid and m.cid = c.id and m.pid=p.id and p.name='z'; gid | galaxy | catalog | cidid -----+--------+---------+------- 1 | N224 | SDSS | 178 (1 row)