What is a relational database?
A British engineer Ted Codd working for IBM elucidated the theory of relational databases "A Relational Model of Data for Large Shared Data Banks" in the 1970s whilst working for IBM. At the core of this relational model is the concept of normalisation which is the separation of the logical and physical data models. This enables you to "see" the database in a completely different way to its underlying structure.The importance of this separation is that it makes relational databases extremely flexible unlike, for example, hierarchical databases. This flexibility means that either layer can be changed without affecting the other. With earlier database models, changes in business requirements requiring new data structures necessitated the complete re-design of the database.
A relational database, therefore, can be regarded as containing a set of 2-dimensional tables ("relations") with each table comprising rows ("tuples") and columns ("domains"). Relationships between database tables are created when one table has a column with the same meaning as a column in another table. The actual values of the columns are irrelevant but they must refer to/mean the same thing.
Let's take the example of a very simple database with just 2 tables (tuples):
Now that we can say that these two tables are related, the other part of relational database model - relational calculus (which is essentially set theory) enables relations (i.e. database tables) to be combined in various ways:
Employees
Departments
Joining employees and departments would produce the following result:
The number and data types of the columns must be the same for the union of relations so the departments table requires an extra column.
A relational database, therefore, can be regarded as containing a set of 2-dimensional tables ("relations") with each table comprising rows ("tuples") and columns ("domains"). Relationships between database tables are created when one table has a column with the same meaning as a column in another table. The actual values of the columns are irrelevant but they must refer to/mean the same thing.
Let's take the example of a very simple database with just 2 tables (tuples):
- employees
- departments
- employee_id
- employee_ name
- department_id
- department_id
- department_name
Now that we can say that these two tables are related, the other part of relational database model - relational calculus (which is essentially set theory) enables relations (i.e. database tables) to be combined in various ways:
- the union of 2 relations results in a set of data containing those elements that exist in one or other relation (or both relations);
- the result of the join (the intersection) of 2 relations is the set of elements that exist in both relations;
- the exclusive "OR" produces the set of items that are in either of the relations but not both
- an outer-join is the same as the join but also includes elements from one or both relations that are not in the other (depending on whether a full outer-join or a left or right outer join is performed)
- relations can also be subtracted from each other leaving the set of elements that were in the first relation but not the 2nd
Employees
Employee_Id | Employee_Name | Department_Id |
1 | Mike Jones | 3 |
2 | Phil Rogers | 6 |
3 | Dave Tanner | 3 |
4 | Paul Johnson | 2 |
5 | Raj Patel | 1 |
6 | Qamar Aziz | 5 |
Department_Id | Department_Name |
1 | Marketing |
2 | Sales |
3 | IT |
Joining employees and departments would produce the following result:
Department_Id | Department_Name | Employee_Id | Employee_Name |
1 | Marketing | 5 | Raj Patel |
2 | Sales | 4 | Paul Johnson |
3 | IT | 3 | Dave Tanner |
No comments:
Post a Comment