irreducibility - no proper subset of K has the uniqueness property.
Primary key
The candidate key that is selected to identify tuples uniquely within the relation.
Foreign key
An attribute, or set of attributes, within one relation that matches the candidate key of some
(possibly the same) relation.
Relational Integrity
Every data model has a set of integrity rules, which ensures that the data is accurate.
Since every attribute has an associated domain, there are constraints (called
domain constraints) that form restrictions on the set of values allowed for the
attributes of relations.
In addition, there are two important integrity rules, which are constraints or restrictions that
apply to all instances of the database. The two principal rules for the relational model are
known as entity integrity and referential integrity.
Null
represents a value for an attribute that is
currently unknown or is not applicable for this tuple.
Entity integrity
in a base relation, no attribute of a primary key can be Null.
Referential integrity
If a foreign key exists in a relation, either the foreign key value must match
a candidate key value of some tuple in its home relation or the foreign key
value must be wholly Null.
Enterprise constraints
Additional rules specified by the users or database administrators of a database.
Entity-Relationship Model
Introduction
One of the most difficult aspects of a database design is the fact that designers, programmers,
and end-users tend to view data and its use in different ways. Unfortunately,
unless we gain a common understanding that reflects how the enterprise operates, the
design we produce will fail to meet the users' requirements. To ensure that we get a precise
understanding of the nature of the data and how it is used, we need to have a model
for communication that is non-technical and free of ambiguities. The Entity-Relationship model
is one such example.
ER model is a top-down approach to database design that begins by identifying
the important data called entities and relationships between
the data that must be represented in the model.
Entity type
A group of objects with the same properties, which are identified by the
enterprise as having an independent existence. (The basic concept of the ER model,
which represents a group of "objects" in the "real world" with the same properties.)
Entity occurrence
A uniquely identifiable objects of an entity type.
Example: Branch, Department, Staff, Skills
Relationship type
A set of meaningful associations among entity types
(each relationship is given a name that describes its functions).
Example: Branch
Has Department
Relationship occurrence
A uniquely identifiable association, which includes one occurrence from each
participating entity types.
Degree of a relationship type
The number of participating entity types in a relationship.
(binary, ternary, quaternary).
Example of a quaternary relation: A solicitor arranges a bid on behalf
of a buyer supported by a financial institute.
Recursive relationship
A relationship type where the same entity participates more than once in
different roles. (Relationships may be given role names to indicate
the purpose that each entity type plays in a relationship.)
Example:

Staff (Supervisor) Supervises Staff (Supervisee).
Attribute
A property of an entity or a relationship.
Attributes can be classified as:
or derived.
Attribute domain
A set of available values for one or more attributes.
Simple (or atomic) attribute
An attribute composed of a single component with an independent existence.
(Simple attribute can not be further subdivided into small components.)
Example: Salary.
Composite attribute
An attribute composed of multiple components, each with different existence.
(Can be further divided.)
Example: Address can be subdivided into
Street,
City,
State,
ZipCode.
Single-valued attribute
An attribute that holds a single value for each occurrence of an entity type.
Multi-valued attribute
An attribute that holds multiple values for each occurrence of an entity type.
Example: Phones (of an office),
Authors (of a book),
Skills (of an employee).
Derived attribute
An attribute that represents a value that is derivable from the values of a related attribute
or a set of attributes, not necessarily in the same entity.
Keys
Candidate key
The minimal set of attributes that uniquely identifies each occurrence of an entity
type.
Primary key
The candidate key that is selected to uniquely identify each occurrence of an entity
type.
Composite key
A candidate key that consists of more than one attribute.
Note: Relationship also may have attributes.
Example: Relationship: Parts SoldTo Client may have attributes When or/and HowMany.
Structural constraints
These constraints should reflect the restrictions on the relationships as perceived in the
"real world". The main type of constraints on relationships is called multiplicity.
Multiplicity
The number (or range) of possible occurrences of an entity type that may relate to a single occurrence
of an associated entity type through a particular relationship.
Binary relationship are generally referred to as being:
Part SoldTo Client (*:*)
Functional Dependencies
One of the main concepts associated with normalization is functional dependency,
which describes the relationship between attributes.
Characteristics of Functional Dependencies
Assume, that a relational schema
has attributes (A, B, C, D, ..., Z) and that the whole database is described by
a single universal relation called R = (A, B, C, D, ..., Z).
Functional dependency
Describes the relationship between attributes in a relation. For example, if A
and B are attributes of relation R, B is functionally
dependent on A (denote A->B), if each value of A
is associated with exactly one value of B. (A and B may each
consist of one or more attributes.)
Determinant
Refers to the attribute or a set of attributes on the left-hand side of the arrow
of a functional dependency.
See example.
Inference Rules for Functional Dependencies
Closure of
X
is the set of all
functional dependencies that are implied by a given set of functional dependencies
X (written X+).
A set of inference rules, called Armstrong's axioms, specifies how new functional dependencies can
be inferred from given ones.
| Name | Explanation | Example |
|
if B is a subset of A, then A->B
|
sName,Position->Position
|
| if A->B, then A,C->B,C |
bAddress->branchNo => bAddress,Position->branchNo,Position |
| If |
A->B |
, then A->C |
| and |
| B->C |
|
staffNo->branchNo branchNo->bAddress |
=> staffNo->bAddress |
|
Note that each of these three rules can be directly derived from the definition
of functional dependency. The rules are complete in that given a set
X of functional dependencies, all functional dependencies implied
by X can be derived from X using these rules.
In other words, the rules can be used to derived the closure of X+.
Several further rules can be derived from the three given above that simplify
practical task of computing X+.
|
| A->A |
|
| If A->B,C, then |
A->B and A->C
|
|
| staffNo->Position,Salary => |
staffNo->Position and staffNo->Salary |
|
| If | A->B and A->C |
, then A->B,C |
|
staffNo->Position and staffNo->Salary |
=> staffNo->Position,Salary |
|
| If | A->B and C->D |
, then A,C->B,D |
|
staffNo->sName and branchNo->bAddress |
=> staffNo,branchNo->Position,bAddress |
|
Minimal Set of Functional Dependencies
Definition 1:
A set of functional dependencies Y is covered by a set
of functional dependencies X, if every functional dependency in
Y is also in X+.
Definition 2:
A set of functional dependencies X
is set to be equivalent to a set of functional dependencies Y if
X is covered by Y and Y is covered by X.
Definition 3:
A set of functional dependencies
X is
minimal if it satisfies
the following conditions:
|
Every dependency in X has a single attribute on its right-hand side.
|
|
We can not replace any dependency A->B in X
with dependency C->B, where C is
a proper subset of A, and still have a set of dependencies that is equivalent
to X.
|
|
We can not remove any dependency from X and still have a set
of dependencies that is equivalent
to X.
|
Definition 4: Full Functional Dependency.
Indicates that if A and B are attributes
of a relation, B is fully functionally dependent on A if
B is functionally dependent on A,
but not on any proper subset of A.
Definition 5: Transitive dependency.
A condition where A, B, and C are attributes
of a relation such that A->B and B->C, then
C is transitively dependent on A via B (provided that A is not
functionally dependent on B or C).
Normalization
Normalization is a formal technique for analyzing relations based on their primary key
(or candidate key) and functional dependencies (Codd, 1972). The technique involves
a series of rules that can be used to test individual relations so that a database
can be normalized to any degree. When a requirement is not met, the relation violating
the requirement must be decomposed into relations that individually meet the requirements
of normalization.
First Normal Form
A relation is in First Normal Form if the intersection of of each row and column
contains one and only one value.
The domain of attributes must include only atomic (simple, indivisible) values and
the value of any attribute in a tuple must be a single value from the domain of that attribute.
Example.
Second Normal Form
A relation is in Second Normal Form if it is in 1NF and every non-primary-key attributes
is
fully functionally dependent on the primary key.
Example
Third Normal Form
A relation is in Third Normal Form if it is in 2NF and no non-primary-key attribute is
transitively dependent on the primary key.
Example
Boyce-Codd Normal Form
A relation is in BCNF, if and only if, every determinant is a candidate key.
Example
Before moving to higher level normal forms (which are not used very often) let's
review the normalization process.
Fourth Normal Form
Multi-valued dependency (MVD)
Represents a dependency between attributes (for example, A, B,
and C) in a relation, such that for each value of A there is a set of values for B
and a set values for C. However, the set of values for B and C are independent of
each other.
We represent a MVD between attributes A, B, and C using the following notation:
A ->> B
A ->> C
A multi-valued dependency A ->>B can be trivial if
(a) B is a subset A or
(b) A union B = R
and nontrivial otherwise.
A relation is in 4NF if it is in BCNF and contains no nontrivial multi-valued dependencies.
Example
Fifth Normal Form
Lossless-join dependency
A property of decomposition, which ensures that no spurious tuples are generated
when relations are reunited through a natural join operation.
Join dependency
Describes a type of dependency. For example, for a relation R with subsets of the attributes of R denoted as A, B,
..., Z, a relation R satisfies a join dependency if, and only if, every legal value of R is equal to the join of its projections on A, B,
..., Z.
We've seen a lot of examples where we decompose a relation into two relations. However, there are cases
where we require to perform a lossless-join decompose of a relation into more than two relations. These cases
are the focus of the lossless-join dependency and fifth normal form.
A relation is in 5NF if it has no join dependency.
Example