Welcome to the Database Programmer, the blog for anybody who
wants to learn the practical realities of working
with databases.
There is a new entry every Monday morning, and the
>complete table of contents is here. This week we are
talking about tools again, specifically upgrades.
The Impermanent Primary Key
Sometimes you get a situation where you have a great
possibility for a natural key, except that the key will
change from time to time. I call this the "Impermanent
Primary Key" and we are going to look at this pattern today.
Magazines and BIPADS
I am currently writing a system for a magazine distributor,
while also maintaining his old system for him. When I first
started with him he explained that a magazine is identified
by a "BIPAD", a 5 digit number unique to the magazine. It
seemed to me that if you
take a BIPAD, and issue number, and a volume year, you have
a pretty good choice for a natural primary key.
Then one day he walked up to me and said, "I have fourteen
magazines with new bipads this month." This was a surprise,
as I had been fairly sure that a BIPAD and a magazine were
uniquely matched for all eternity.
Sidebar: Did the Requirement Change?
Before we get into the design pattern for this, it is worth
asking, is this a case where the customer changed the
requirement? The answer is a most emphatic no! The customer
told me, if I had listened precisely, that a magazine
"has a BIPAD". He never said that a magazine has only one
unique BIPAD, I inferred that myself. Such inferrences are
mistakes, and we cannot blame them on the customer, because
it was my job to ask all sides of the question, such as
"can a magazine have more than one BIPAD?" or "Is a BIPAD
permanent?" or "Can a BIPAD ever be used again on another
magazine?"
Spelling Out The Requirements
So the reality is that a BIPAD is assigned to only one
magazine, but a magazine may get more than one BIPAD, as
the bipad will change from time to time. My customer's
operational requirements are:
- The history of individual BIPADs must be preserved in
the transaction tables.
- The histories of multiple BIPADs for a single magazine
may need to be combined from time to time.
- During a BIPAD change, transactions will continue
for both BIPADs.
I should also note that transaction information comes from
outside sources and always contains the BIPAD. It is
our responsibility to make sure the BIPAD can be mapped
into our database correctly. We have no power to ask outside
parties to use numbers created or assigned by us.
Some Candidate Solutions
There are a few ways to handle this. Here are four choices that
are most likely to come up in a brainstorming session:
- Use BIPAD as a primary key in the BIPADS table,
and change the BIPAD as required.
- Create a From-To table that maps old BIPAD values
to new values.
- Use an integer primary key in the BIPADS table and make
BIPAD just a regular
column, so you can just change it as required.
- Make the table of BIPADs a child table to a table
of MAGAZINEs. When a BIPAD changes make a new entry
in the BIPADs table.
We can eliminate option one immediately, because foreign keys
do not allow this possibility. If you have a table of BIPADS,
with BIPAD as a primary key, then your transaction tables will
have foreign keys to this table. If you could change the
value of a BIPAD then suddenly those transactions that reference
the old value would not be valid. We say in every day language
that you would orphan the transactions. A foreign key
exists to prevent such things, so option one is out.
Option two looks pretty reasonable, but it is no good at
all. I included option two as an example of what happens
when you use a coding mentality to design tables. Option two
records accurately an action, the changing of a BIPAD,
by recording the old and new values. While this may seem harmless,
it makes a wreck out
the idea of using primary keys and foreign keys. What is the
primary key of such a table? What table and column do the
transaction tables reference? When an idea gives
problems for the basic building blocks of a database, we know
that nothing but trouble is going to follow.
This leaves options three and four, which we will now
consider in more detail.
The First Two Requirements in Detail
The first two requirements are that we must be able to examine
the individual histories and the combined histories as requested.
Options three and four both allow this, as both of them have
the BIPAD values in the transaction tables and both options also
have a table that can be used to combine histories.
Option four uses the BIPAD values natively in all tables,
so there is no need for any special planning or actions when
saving data or querying it. If you want to query for a single
BIPAD, then specify that. If you want to query for a magazine,
then JOIN to the MAGAZINES table and filter on your magazine
and there you go. Option four satisfies the requirements with
no fanfare.
Option three requires some additional storage. The transaction
tables need to keep track of the BIPAD for accuracy,
but they also need
that meaningless integer foreign key to the table of BIPADS.
This always strikes me as funny because the integer key is
promoted as a performance method, but the basic reality of a disk
drive is that performance goes as the amount of data you
have to read and write. Where option four writes only the
value of BIPAD, option three must write more data, and
therefore will always be slower.
Option three also requires more code and more disk
activity. Because we receive transaction data as BIPAD
values, but we are using a meaningless number as the
foreign key to the BIPADS table, we have to do a lookup into
the BIPADS table to find out the meaningless integer key.
This means we need a read operation that was not necessary
with Option four. Again this is rather ironic since the
integer key is promoted as a performance tool.
Finally, the problem of having to look up the value of
the BIPAD's integer key introduces application code, so now
with Option three I have to write code where for Option
four I do not.
But when all is said and done,
if your tools lock you into option three
and you have no choice, you can in fact satisfy
requirements one and two. You wll be able to examine
both the individual
histories and the combined histories. So we probably have
to say so far that it is a matter of taste, tools, and habits.
Disqualifying Option Three
The third requirement is that transactions will occur for
both BIPADs during a transition. For option four this is not
really an issue because there are two rows in the BIPADS table,
one for each BIPAD. Any transaction table that is a child
table of BIPADS is automatically covered by the foreign
key.
Option three however has a fatal problem. When a transaction
comes in with a BIPAD on it, we have to validate it by
looking up the BIPAD in the BIPADs table, so we can get
the meaningless integer primary key. Except that after
a new value is entered the old value will not be there.
We cannot satisfy the third requirement and Option three
is now disqualified.
Those who are comfortable using integer primary keys for
everything may suggest a way to rescue the situation, but
because Option four does not require any application code and
does not require rescuing, I will leave it to others to
rescue option three.
More Details On The Impermanent Primary Key
When we have an impermanent primary key, one that changes
from time to time, we can create a pair of tables. The
master entities are tracked in the top table, and the
child table tracks the impermanent values. This second table
is the parent of all of the transactions:
In the example above, which has to do with magazines and
BIPADs, the master table I built uses character primary keys,
and the child table uses the BIPAD values, so it looks
something like this:
MAGAZINE (PK) | DESCRIPTION
--------------+--------------------------
TVGUIDE | TV GUIDE
COSMO | COSMOPOLITAN
ASTORIES | AMAZING STORIES
CROSSW | CROSSWORDS
|
|
|
|
/|\
MAGAZINE | BIPAD (PK) | OTHERS...
---------+------------+-------------------
TVGUIDE | 12345 | XXXXX
TVGUIDE | 34345 | XXXXX
COSMO | 29830 | XXXXX
COSMO | 23813 | XXXXX
|
|
|
/|\
Transaction Tables have foreign keys
to this column
The SQL that would create these two tables might
look like this:
CREATE TABLE MAGAZINES (
magazine char(10)
,description varchar(35)
,primary key (magazine)
);
CREATE TABLE BIPADS (
magazine char(10)
,bipad char(5)
,primary key (biapd)
,foreign key (magazine) references magazines (magazine)
)
Conclusion
The "Impermanent Primary key" pattern occurs when some value
is permanent as far as individual transactions are concerned,
but may change over the lifetime of the master entity.
In these cases, we create a parent-child table pair.
The actual master
table sits at the very top, with a child table below it that
holds the semi-permanent values. All transactions are children
of the child table.
"http://database-programmer.blogspot.com/2008/03/requirements-are-always-wrong-or.html"
>Next Essay: The Requirements are Always Wrong, Or, Iterative Database Development.
Update Contact :
No Wa/Telepon (puat) : 085267792168
No Wa/Telepon (fajar) : 085369237896
Email : Fajarudinsidik@gmail.com
No Wa/Telepon (puat) : 085267792168
No Wa/Telepon (fajar) : 085369237896
Email: Fajarudinsidik@gmail.com
atau Kirimkan Private messanger melalui email dengan klik tombol order dibawah ini :