When you set out to use a Data Dictionary to control your
database upgrades, you must consider not just columns and
tables, but also indexes and keys. The process for adding
indexes and keys is almost the same as that for columns and
tables, but there are a few wrinkles you have to be aware
of.
Review of Basic Ideas
In my "http://database-programmer.blogspot.com/2008/06/using-data-dictionary.html"
>First Essay On Data Dictionaries, one of the major points
was that the dictionary is easiest to use if it is in some
type of plaintext format that is in source control along with
the rest of your application files, and is the processed with
a "builder" program.
Last week we saw the "http://database-programmer.blogspot.com/2009/01/dictionary-based-database-upgrades.html"
>Basic compare operation used by the builder to
build and update tables.
You read in your data dictionary, query the information_schema
to determine the current structure of the database, and then
generate commands to add new tables and add new columns
to existing tables.
The Importance of Keys and Indexes
If a builder program is to be useful, it must be complete,
if it leaves you with manual tasks after a build then the
entire concept of automation is lost. The builder must be
able to build the entire structure of the database at
very least, and this means it must be able to work out
keys and indexes.
The Basic Steps
The basic steps of building indexes and keys are these:
- Load your dictionary to some format you can work with
easily. I prefer to load it to tables.
- Query the databases's INFORMATION_SCHEMA to determine
which indexes and keys already exist.
- Execute some type of diff to determine which indexes
need to be built.
- Build the indexes that are not there.
- If you like, drop the indexes that are not in the spec.
Sidebar: Dropping Indexes
A builder program can add things, and it can also drop things.
When it comes to destructive operations, I prefer not
to have my builder drop tables or columns, because the
consequences of a mistake can be unthinkable.
However, when it comes to indexes, it is much more likely to
be ok to drop a stray index. Dropping indexes does not destroy
user data. Also, extraneous indexes will slow down inserts
and updates, so getting rid of them is usually the Right Thing.
Step 1: Your Specification
Your data dictionary format must have some way of letting
you specify an index. It is also a good idea to allow you
to specify an ascending or descending property for each
column, and to specify if the index is to be unique (effectively
making it a unique constraint).
Here is an example of a very simple set of indexes:
table example:
description: My example table
index first_name:
column first_name:
index last_name:
column last_name:
index socialsec:
unique: "Y"
column socialsec:
# ... column definitions follow...
I am currently working on a program that requires frequent access
by three columns, where the first two are in descending order
but not the first. An index spec for this might look like:
table shipments:
description: Incoming Magazines
index history:
column bipad:
column year:
flag_asc: "N"
column issue:
flag_asc: "N"
# ... column definitions follow...
As far as loading this into memory, I covered that in some detail
"http://database-programmer.blogspot.com/2009/01/dictionary-based-database-upgrades.html"
>last week and will not dwell on it here. I will simply assume
you have code to parse and load the spec to a format that works
for you.
Step 2: The Information Schema or Server Tables
When I set out to write my builder, I found that the information_schema
was a bit more complicated than I needed. The server I was using,
Postgres, had a simpler way to get what I wanted. I also found I would
get all kinds of extraneous definitions of indexes on system tables
or tables that were not in my spec. The query below
was the easiest way to get index definitions that were limited to the
tables in my spec on the Postgres platform:
Select tablename,indexname,indexdef
FROM pg_indexes
JOIN zdd.tables on pg_indexes.tablename = zdd.tables.table_id
WHERE schemaname='public'
As far as primary keys and foreign keys go, the story is basically
the same, your server may provide them in a convenient way the
way Postgres gives index definitions, or you may have to dig a little
deeper to get precisely what you want.
Step 3: The Diff
So now we have a picture of the indexes we need to exist, and the
indexes that already exist. It is time to look at how to diff
them effectively. This step does not work the same way as it does
with columns and tables.
Before we go into how to do the diff, let's review how we did it with
tables and columns. We can basically diff tables and columns by
name. If our spec lists table CUSTOMERS and it does not appear
to exist in the database, we can build the table CUSTOMERS, simple
as that. But with indexes the name really does not mean anything,
what really matters is what columns are being indexed.
This is why we diff indexes on the column definitions, not on
their names. If you want a complete trail, you would begin with
this table that describes your own indexes:
SPEC_NAME | TABLE | COLUMNS
------------+-----------+-----------------
CUST1 | CUSTOMERS | zipcode:state
HIST1 | HISTORY | bipad:year:issue
ORDERS1 | ORDERS | bipad:year:issue
Then you pull the list of indexes from the server, and lets
say you get something like this:
DB_NAME | TABLE | COLUMNS
----------+-----------+-----------------
CUST1 | CUSTOMERS | zipcode:state
ABCD | HISTORY | bipad:year:issue
ORDER1 | ORDERS | year:issue
When you join these two together, you are matching on TABLE and
COLUMNS, we do not care about the index names. A query to join
them might look like this:
SELECT spec.spec_name,spec.table,spec.columns
,db.db_name,db.columns as db_cols
FROM spec
FULL OUTER JOIN db On spec.table = db.table
AND spec.columns = db.column
This query would give us the following output:
SPEC_NAME | TABLE | COLUMNS | DB_NAME | DB_COLS
------------+-----------+------------------+---------+---------------------
CUST1 | CUSTOMERS | zipcode:state | CUST1 | zipcode:state
HIST1 | HISTORY | bipad:year:issue | ABCD | bipad:year:issue
ORDERS1 | ORDERS | bipad:year:issue | |
| | | ORDER1 | year:issue
Now let us examine the results row by row.
- The first row shows that the index on zipcode+state on the
customers table is in the spec and in the database, we take
no action on that index.
- The second row shows that the index on bipad+year+issue is
also in both the database and the spec. This particular index
has a different name in the database, but we don't care.
(Maybe the programmer changed the name in the spec). We take
no action on this index.
- The third line shows an index on the ORDERS table that is not
in the database, we must build that index.
- The fourth line shows an index in the database that is not
in the spec, you can drop that if you want to.
The Rest of It
From here it is a simple matter to generate some commands to create
the indexes we need.
Keys work the same way, with a few obvious differences in how they
might be named.
We can add features from here to track if the columns are being
indexed in ascending or descending order.
Conclusion: Indexes Go By Definition
When writing a database upgrade "builder" program, they key thing
to understand about indexes and keys is that you are looking to
indentify and build indexes according to their definition,
and that names do not matter at all.
Update Contact :
No Wa/Telepon (puat) : 085267792168
No Wa/Telepon (fajar) : 085369237896
Email : Fajarudinsidik@gmail.com
No Wa/Telepon (puat) : 085267792168
No Wa/Telepon (fajar) : 085369237896
Email: Fajarudinsidik@gmail.com
atau Kirimkan Private messanger melalui email dengan klik tombol order dibawah ini :