.:: Jasa Membuat Aplikasi Website,Desktop,Android Order Now..!! | | Order Now..!! Jasa Membuat Project Arduino,Robotic,Print 3D ::.

Database Skills: Third Normal Form and Calculated Values

0 komentar


بِسْــــــــــــــــمِ اﷲِالرَّحْمَنِ اارَّحِيم
bismillaahirrahmaanirrahiim

السَّلاَمُ عَلَيْكُمْ وَرَحْمَةُ اللهِ وَبَرَكَاتُهُ
Assalamu'alaikum warahmatullahi wabarakatuh


This essay is part of the Keys, Normalization and Denormalization series.
There is also a Complete Table of Contents and a >Skills-oriented Table Of Contents.



Update: There is a new essay on >Business Logic that also goes into how calculated values affect your code paradigm.



Third Normal Form Defined



Third normal form is pretty simple. In plain terms it means
that no column can depend on a non-key column. In technical
terms we say that there are no "transitive dependencies."



The example below shows a table that is not in third normal
form. The problem is that the letter grade depends on
the numeric grade:




SUBJECT | YEAR | TEACHER | STUDENT | GRADE | LETTER
---------+-------+---------+-------------+-------+-----------
HIST-101 | 2008 | RUSSELL | NIRGALAI | 80 | B
HIST-101 | 2008 | RUSSELL | JBOONE | 90 | A
HIST-101 | 2008 | RUSSELL | PCLAYBORNE | 95 | A


To repeat: The key is SUBJECT + YEAR + TEACHER + STUDENT, there are no
duplicate values of this combination of columns. In plain terms we
mean that no student can be listed in the same class twice, which
only makes sense after all. We note next that there
is only one actual property of the table, the GRADE column. This
is the student's final grade in a particular class.



The violation
is the last column, LETTER, which is not a property of the key
but is functionally dependent upon the GRADE value. The term
"functionally dependent" means that column LETTER is a function
of GRADE.



As always, normalization is meant to help us. The problem with
the table as-is is that you cannot guarantee that when the
GRADE column is updated that the LETTER column will be
correctly updated. The non-normalized table can lead
to what we call UPDATE ANOMALIES.



Problem: Calculated Values Are a Fact of Life



We have seen in prior essays that enforcing first normal
form and second normal form leads to more tables
and simpler code. We saw in both
cases that a programmer's natural tendency to use a
"simple" solution of fewer non-normalized
tables leads to more complicated and error-prone
code. But when we come to third normal form the
benefits do not seem so clear-cut.



Third normal form tells us to leave out
the LETTER column, but the plain fact is that the
LETTER column is very useful! The situation will only
get worse if there are a few dozen (or a few hundred!)
tables. If we leave out all calculated
values then we must provide for them outside of
the database, but if we put them into the tables
we risk all kinds of errors in live data.



This is where things get sticky. In this week's essay
I am going to explain the classic solutions to this
problem, and include my own as the last one. It is
my personal opinion that the problem of calculated values
is at the heart of many of the flame wars, arguments
and battles
that are fought in the arena of database application
development. You can't live with 'em but you can't live
without 'em. Some kind of logical approach is required that
gives us the benefits of normalization and the benefits
of calculated values.



Preserving 3NF With Another Table



In many cases something that looks like a calculation can
actually be removed out to another table and turned
into a foreign key. As a matter of fact, the example I gave
above can be moved out to another table:


GRADE | LETTER
-------+---------
100 | A
99 | A
98 | A
97 | A
96 | A
95 | A
94 | A
93 | A
92 | A
91 | A
90 | A
89 | B
88 | B
87 | B
86 | B
....
.... etc ....
....
1 | F
0 | F


Most programmers would not think of a solution like this
because it has too many rows!
Your basic programmer finds it ugly, so he will not use
it. This underscores a recurring theme of this series:
skills that lead to good coding will lead to bad
table design. Table design should be done according
to the principles of table design, not the principles
of coding. The customer will never know you have an
"ugly" table, they will just know your system works well.



There are certain benefits that you always get when you
normalize and when you put everything into tables.
The big advantage is that you can
change your system without modifying a program file. In
the United States we would leave my "ugly" table as it
is for public schools, but for many private schools
we would change it to 93-100 for an "A", 85-92 for a "B",
and so forth. We could make 70-100 "P" for pass and
0-69 "F" for fail. We can do all of this without
changing any program code, which is always a good thing.



Preserving 3NF With a View



A "database view" is a SELECT statement that is given
a name and stored permanently in the database. The nifty
thing about views is that they can be queried with
a SELECT statement just like a table. A view is a great
tool for preserving normalization because you can
keep the table itself fully normalized and put the
calculated values in the view. In most databases
a view is created with syntax like this:




CREATE OR REPLACE VIEW grades_extended AS
SELECT subject, year, teacher, student , grade
,CASE WHEN grade >= 90 THEN 'A'
WHEN grade >= 80 THEN 'B'
WHEN grade >= 70 THEN 'C'
WHEN grade >= 60 THEN 'D'
ELSE 'F' END as LETTER
FROM grades;

-- this command pulls only from the normalized table
SELECT * FROM grades

-- this command gives letter grades also
SELECT * FROM grades_extended


Because a view is simply a SELECT statement, a view
can JOIN information from multiple tables, and it
can also do aggregate functions like SUM, AVG and
so forth, so views are very powerful indeed.



Views are very popular and can really help out in
simple situations. I have found that they become
very difficult to manage as the table count increases,
and so I personally do not use them for this purpose
anymore, but I would not discourage anybody who wants
to try them out. They will probably benefit you
a lot.



Preserving 3NF by Calculating As Needed



Another approach is to keep the calculated columns out of
the database and calculate them as needed.



Some years ago I was made Lead Systems Architect
on a package that already had bout 750 tables when I
took the position.
The tables were in strict third normal form: no calculated
values, no exceptions.



All calculations were made as needed, and they were
all performed in program code.
This particular program was an ERP package, so there
were a great many monetary calculations. Absolutely nothing
about it was simple, every single calculation had many
possible formulas that took into account discounts, locations,
customer types, time of year, and as far as I could tell
the sunspot cycle and the phase of the moon.



The sales team was very successful, and so this program was
under heavy modification at all times by programmers on
two continents. The end result was this:

  • Many calculations involved calls to subroutines in
    2, 3 or 10 different programs, making them nearly impossible
    to debug.
  • The program was completely impossible to document,
    we didn't even try.
  • The program was impossible to test, because of
    course nobody really knew what it was supposed to be doing.
  • Performance was often terrible. The need to calculate
    every value on every access put a huge strain on the
    database as information was pulled over and over and cycles
    were spent doing calculations.
  • Only a few people "knew where the bodies were
    buried", which contradicts standard advice for keeping
    staff interchangeable.
  • Sometimes a customer would print an invoice, then print
    it a few months later after an upgrade and get
    different numbers, which we could not explain or
    correct except at great time and expense.


When somebody tells me to keep 3rd normal form religiously
and keep all calculations in program code I say
"been there, seen that, no thanks."



Discarding 3NF by Using Program Code



The mostly popular approach by far is to calculate and
store derived values using program code.



Most programmers use some kind of object-oriented language,
and are used to the idea that they will have a GRADES class
that handles the calculations for the GRADES table.
Whenever anybody inserts or updates the table, some method
looks at the GRADE column (a value from 0-100) and
re-calculates the LETTER code, then saves the whole thing
in the table. This violates 3NF but makes
the database more fully populated and therefore more
useful.



The major advantage to this approach is that most programmers
understand how to do it with little prompting or
explanation. Moreover, it does make the database indeed
much more useful.



The largest problem with this approach is that subvertible. Unless
the programmer is absolutely certain that nobody can access
the database except through her code, the possibility always
remains that a different program will mess up the values. In fact,
you must be able to guarantee that even within your program
no rogue programmer can accidentally
(or maliciously) issue commands that subvert the values.
If you get even one prima donna on your team that does not
believe in following rules, serious damage can be done.

Some programmers say, "that's ok, my team is small and
tight and only my program will
access the data," but this just means they never plan to sell
a program to a company that has their own IT department.
That company will demand access to the database, possibly
even write access, and then you have a problem.



However, when all is said and done, this is a perfectly valid
way to get working programs written. When it is reasonable
to expect that nobody else can access the database, and you
trust your team, then this
method may be the Right Thing.



Discarding 3NF With Triggers



The last approach I will explain is the one that I use
myself. In this approach you put a "trigger" on the
table that performs the calculation for you. A trigger is
a small program that resides in the database server itself
and is written is some kind of SQL-like extended language
(or Perl or C or anything supported by your server).
The trigger executes whenever certain events
occur in the database. Specifically, every trigger fires
for exactly one table, and might fire before or after
an event, and might fire for insert, update, or delete.
Triggers are ideal for performing calculations.



Trigger code tends to be highly specific to the platform
you are using. The code below works for PostgreSQL and will
definitely not work for MS SQL Server or MySQL, you will have
to look up the syntax for those systems and do a little
translation. But here it is:


CREATE OR REPLACE FUNCTION grades_ins_before_f RETURNS TRIGGER AS
$$
BEGIN
new.letter = CASE WHEN new.grade >= 90 THEN 'A'
WHEN new.grade >= 80 THEN 'B'
WHEN new.grade >= 70 THEN 'C'
WHEN new.grade >= 60 THEN 'D'
ELSE 'F' END;
END
$$ LANGUAGE PLPGSQL SECURITY DEFINER;

CREATE TRIGGER grades_ins_before_t BEFORE INSERT OR UPDATE
ON GRADES
FOR EACH ROW EXECUTE PROCEDURE grades_ins_before_f();


I should note that PostgreSQL's triggers are very powerful, but
their two-stage command to make a trigger is very annoying.
Other servers such as MS SQL Server, DB/2 and MySQL have
simpler syntax. I put up with PostgreSQL because it is so
powerful and it is free software.



One very big advantage of this system is that the calculated
value is non-subvertible, no rogue commands can corrupt
the calculated values. I personally love this because it
gives me all of the advantages of normalized data while also
getting useful derived data in the tables for easy access.



Conclusion: Denormalization and Sorting It Out



Third normal form itself is easy enough to understand,
but it brings us our first conflict between good
database principles and the real world. The principles
of normalization keep out calculated values, but
calculated values are part of every useful applicaiton.



This is why so many people say that it is necessary
to denormalize. But keep in mind always that there is
a difference between non-normalized and de-normalized.
We have no interest in a non-normalized database, where
the designer has not even bothered to identify the
correct tables, primary keys, and foreign keys.

What we have seen instead is that it is very useful
first to normalize a database by identifying the individual
things we want to keep track of, making an individual
table for each specific kind of thing, and then
identifying the primary properties of each of
these things. The process of de-normalization begins
when we define calculated values and begin to put
those into the tables. Once we do this, we must
choose a method to ensure that these values are
correct, using views, triggers, program code or
a combination of the three.




Other Posts




This essay is part of the Keys, Normalization and Denormalization series.
There is also a Complete Table of Contents and a >Skills-oriented Table Of Contents.


Update Contact :
No Wa/Telepon (puat) : 085267792168
No Wa/Telepon (fajar) : 085369237896
Email : Fajarudinsidik@gmail.com
NB :: Bila Sobat tertarik Ingin membuat software, membeli software, membeli source code, membeli hardware elektronika untuk kepentingan Perusahaan maupun Tugas Akhir (TA/SKRIPSI), Insyaallah Saya siap membantu, untuk Respon Cepat dapat menghubungi kami, melalui :

No Wa/Telepon (puat) : 085267792168
No Wa/Telepon (fajar) : 085369237896
Email: Fajarudinsidik@gmail.com


atau Kirimkan Private messanger melalui email dengan klik tombol order dibawah ini :

ٱلْحَمْدُ لِلَّهِ رَبِّ ٱلْعَٰلَمِين
Alhamdulilah hirobil alamin

وَ السَّلاَمُ عَلَيْكُمْ وَرَحْمَةُ اللهِ وَبَرَكَاتُهُ
wassalamualaikum warahmatullahi wabarakatuh


Artikel Database Skills: Third Normal Form and Calculated Values, Diterbitkan oleh scodeaplikasi pada Senin, 07 Januari 2008. Semoga artikel ini dapat menambah wawasan Anda. Website ini dipost dari beberapa sumber, bisa cek disini sumber, Sobat diperbolehkan mengcopy paste / menyebar luaskan artikel ini, karena segala yang dipost di public adalah milik public. Bila Sobat tertarik Ingin membuat software, membeli software, membeli source code ,Dengan Cara menghubungi saya Ke Email: Fajarudinsidik@gmail.com, atau No Hp/WA : (fajar) : 085369237896, (puat) : 085267792168.

Tawk.to