Data Normalization
Data normalization is an important step in any database development process. Through this often tedious process a developer can eliminate duplication and develop standards by which all data can be measured. This paper addresses the history and function of data normalization as it applies to the course at hand.

In 1970, Dr. E.F. Codd's seminal paper "A Relational Model of Data for Large Shared Data Banks" was published in Communications of the ACM. This paper introduced the topic of data normalization, reportedly so named because, around that time, President Nixon was normalizing relations with China.

Data normalization is a technique used during logical data modeling to ensure that there is only one way to know a fact. It does so by removing all structures that provide more than one way to know the same fact as represented in a database relation (table). The goal of normalization is to control and eliminate redundancy and to mitigate the effects of modification anomalies, chiefly insertion and deletion anomalies. (Insertion anomalies occur when storing information about one attribute requires additional information about a second attribute. Deletion anomalies occur when the deletion of one fact results in the loss of a second fact.)

There are six generally recognized normal forms of a relation: first normal form, second normal form, third normal form, Boyce-Codd normal form, fourth normal form, and fifth normal form, also called projection/join normal form. Other normal forms (e.g., Domain/Key) exist but will not be discussed here. The normal forms are hierarchical, i.e., each normal form builds upon its predecessor. Although many people consider a relation to be normalized only when it is in third normal form, technically speaking, a relation in only first normal form can be considered normalized.

First normal form (1NF) - All attributes must be atomic. That is, no attribute may hold a repeating group of values.
For example, in a relation that describes a student, the student's classes should not be stored in one field, separated by commas. Rather, the classes should be moved to their own relation, which should include a link back to the student relation (called a foreign key).

Second normal form (2NF) - A relation is in second normal form if it is in first normal form and each non-key attribute is fully functionally dependent on the entire primary key. That is, no proper subset of the key can determine a non-key attribute's value.

Third normal form (3NF) - A relation is in third normal form if it is in second normal form and each non-key attribute depends directly on the primary key, and not on any other non-key attribute. That is, no transitive dependencies exist among the attributes. A transitive dependency can be described as follows: "if A determines B, and B determines C, then A determines C." Here C depends on the key A only by way of the non-key attribute B.

Boyce-Codd normal form (BCNF) - A relation is in Boyce-Codd normal form if it is in third normal form and every determinant (an attribute or set of attributes that determines another attribute) is a candidate key. In other words, all candidate keys defined for the relation, not just the primary key, must satisfy the test for third normal form.

Fourth normal form (4NF) - There should not exist any nontrivial multi-valued dependencies in a relation. To move from BCNF to 4NF, move each independently multi-valued component of the primary key into its own relation. For example, if an employee can have many skills and many dependents, move the skill and dependent information to separate tables, since they repeat and since they are independent of each other.

Fifth normal form (5NF) - By now, you've seen that normalization results in splitting one table into two or more tables to eliminate anomalies. One tacit property of this splitting is that the designer could always reconstruct the original table by joining the new ones created during normalization. Fifth normal form differs from the definitions of the previous normal forms in that 5NF defines a goal to be reached, rather than the resolution of a particular anomaly.
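The reconstruction property mentioned above (joining the split tables back into the original) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the student/advisor/office table, its sample rows, and the helper names `relation`, `project`, and `natural_join` are all hypothetical and do not come from the paper. The sample data assumes that each advisor has exactly one office, so splitting on the advisor column should be safe, while splitting on the office column should not be.

```python
from itertools import product

# Hypothetical sample table; the dependency advisor -> office is assumed.
COLUMNS = ("student", "advisor", "office")
ROWS = [
    ("Ann", "Dr. Lee", "Room 101"),
    ("Bob", "Dr. Lee", "Room 101"),
    ("Dee", "Dr. Roy", "Room 101"),
    ("Cy", "Dr. Kim", "Room 202"),
]


def relation(columns, rows):
    """Represent each row as a frozenset of (column, value) pairs."""
    return {frozenset(zip(columns, row)) for row in rows}


def project(rel, columns):
    """Keep only the named columns; duplicate rows collapse automatically."""
    return {frozenset((c, v) for c, v in row if c in columns) for row in rel}


def natural_join(left, right):
    """Combine every pair of rows that agree on all shared columns."""
    joined = set()
    for l, r in product(left, right):
        dl, dr = dict(l), dict(r)
        if all(dl[c] == dr[c] for c in dl.keys() & dr.keys()):
            joined.add(frozenset({**dl, **dr}.items()))
    return joined


original = relation(COLUMNS, ROWS)

# Split on the determinant (advisor): the join rebuilds the original exactly.
good = natural_join(
    project(original, {"student", "advisor"}),
    project(original, {"advisor", "office"}),
)

# Split on office instead: the join invents rows that were never stored.
bad = natural_join(
    project(original, {"student", "office"}),
    project(original, {"advisor", "office"}),
)

print(good == original)          # True: lossless decomposition
print(len(bad) - len(original))  # 3 spurious rows: lossy decomposition
```

The second split fails because two advisors share Room 101, so joining on the office pairs every student in that room with both advisors. A split of this kind is one that could not be joined to recreate the original table.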
The goal to be reached with 5NF is to keep splitting the tables until either of two states is reached:

1. Further splitting would result in tables that could NOT be joined to recreate the original.
2. The only splits left are trivial.

Without trying to sound too "Zen", it would be fair to say that a thorough understanding of data normalization is a journey, not a destination. The topic is fundamental in that a robust database design cannot be achieved without normalization, yet it is a largely arcane subject that gets less intuitive as one progresses up through the normal forms. Despite its complexity, applying normalization is essential to virtually any database development undertaking.