Relational Database Management Systems (RDBMS) are the backbone of modern data management, offering a structured and efficient way to store, retrieve, and manipulate data. Unlike other database models, RDBMS organizes data into tables, enabling easy access and management through Structured Query Language (SQL).
Relational databases are the powerhouse behind many essential business systems. Notably, websites rely heavily on databases for most of their functions. Here are some services that relational databases provide for websites and other business applications:
- User Authentication and Authorization: Relational databases store user credentials, profiles, and roles. When you log into an application, the database verifies your credentials and determines your access rights.
- Content Management Systems (CMS): Platforms like WordPress, Drupal, and Joomla use relational databases to manage and organize content, including articles, images, and user comments, enabling efficient content updates and retrieval.
- E-Commerce Transactions: Online stores utilize relational databases to manage product listings, inventories, customer orders, payment transactions, and user accounts, ensuring a smooth shopping experience.
- Social Media Platforms: Databases store user-generated content such as posts, likes, comments, and messages, and track user relationships (followers, friends) to facilitate social interactions and personalized feeds.
- Search Functionality: Many applications, including websites, often implement complex search features where relational databases index content and provide fast retrieval of relevant results based on user queries.
- Operational Data Management: All transactional data captured in an application, such as an ERP, gets stored in an RDBMS, enabling information to be made available to multiple business functions.
- Analytics and Reporting: Collected data can be made available for business analysis or for report generation.
- User Activity Tracking: Websites collect and analyze user interaction data, such as page views, click-through rates, and conversion metrics, which are stored in relational databases for generating reports and insights.
- Customer Relationship Management (CRM): CRM systems use relational databases to manage customer interactions, track sales, and maintain customer service records.
- Forum and Community Boards: Discussion forums and community boards store threads, posts, user profiles, and moderation data in relational databases to manage user interactions and maintain order.
These functions showcase the versatility and critical importance of relational databases in powering dynamic and data-driven websites, ensuring they run smoothly and efficiently.
This guide delves into the essential components, features, and benefits of RDBMS, providing valuable insights for both novices and experienced professionals. We examine key concepts such as normalization, data integrity, and transaction management. This guide shows how RDBMS ensures reliability, consistency, and scalability in data handling. Whether you’re looking to optimize your existing database or choosing the best RDBMS for your needs, this guide offers the knowledge and tools to make informed decisions.
The relational model
The “relational” term refers to the relational model, which was defined by Edgar Frank Codd, better known as E F Codd, in 1969 while he was working as a computer scientist for IBM.
There are 13 requirements in Codd’s definition for a database to be categorized as relational and almost none of the real-world implementations comply with all of these. However, the two main requirements for a relational database, in layman’s terms are:
- Data should be normalized.
- Data should be accessible through relational operators.
These two categories are explained further below.
Data normalization
Data is organized into “entities” and each entity has “attributes”. When planning a relational database, you gather all of the business’s documents and write down the headings that are there. For example, an invoice may have Invoice Number, Invoice Date, VAT or EIN Number, Business Name, Business Address, Customer Name, Customer Address, Invoice Line Number, Item Code, Item Description, Item Unit Price, Quantity, Line Total, and Invoice Amount. This collection of data has information at several levels. The process of sorting out the headings into different groups is called “normalization”.
There are many stages to normalization, but in practice, you only need to perform the first three. These are:
- First normal form (1NF): Separate out repeating groups
- Second normal form (2NF): Separate out partial key dependencies
- Third normal form (3NF): Separate out attributes that are better identified by fields other than the primary key.
By the end of these three steps, each attribute in the group is dependent on the key, the whole key, and nothing but the key.
First normal form
Let’s take a look at an invoice example. In first normal form, we separate out repeating groups. Clearly, there are two levels of data on the invoice: the header information and the line data. In order to work out which attributes belong in which group, pick a field that seems to be the primary data element of the document – this is a unique identifier for this invoice and so must be the Invoice Number field. This primary identifier is called a “constraint” because you have to force uniqueness for this field on all records in the database. This will become the “primary key” of the table.
Ask how many times each of the other fields have different data for each invoice number. You will end up with two groups: INVOICE HEADER with attributes Invoice Number, Invoice Date, VAT or EIN Number, Business Name, Business Address, Customer Name, Customer Address, and Invoice Amount; and the INVOICE LINE with attributes Invoice Line Number, Item Code, Item Description, Item Unit Price, Quantity, and Line Total.
Line Number would not uniquely identify each record in the database, so the INVOICE LINE group needs Invoice Number added to it. In this case, the table has a composite key. Invoice Number is also a foreign key to the INVOICE HEADER group.
In the case of the INVOICE HEADER group, the Invoice Number should be unique for each invoice. However, that can’t be guaranteed, so it is usual to create a hidden unique identifier field on the table, which will be populated from a sequence to ensure that there can never be duplicated values in this field.
By adding Invoice Header ID to the INVOICE HEADER table, we provide a better value as the foreign key in the INVOICE LINE group, so we add an Invoice Header ID field into that table. It is quicker to “join” on one field when trying to specify a single record, and so it would be better to create a unique identifier for records in the INVOICE LINE group, called Invoice Line ID. Invoice Header ID is now the primary key of the INVOICE HEADER group and Invoice Line ID is the primary key of the INVOICE LINE group. Now we call the groups “entities” and the fields “attributes”.
Second normal form
To get to the second normal form, you need to split out attributes that are not dependent on all of the elements in a composite key. This requires no work in the INVOICE HEADER entity. The INVOICE LINE entity did have a composite key (Invoice Number and Line Number) but this has been replaced by the unique identifier Invoice Line ID. So, there are no partial key dependencies here.
Third normal form
In this step, you should split out attributes that are better identified by another attribute in the group. In the case of INVOICE HEADER, the attribute Business Address is more dependent on Business Name than the primary key. So, this would create a table called BUSINESS DETAILS, which would have attributes Business Name, and Business Address. It would be better to split out the lines of the business address into Address Line 1, Address Line 2, City, State, Country, and Postal Code. The VAT or EIN Number should also be removed from INVOICE HEADER to this entity.
Customer Address is related to Customer Name more than Invoice Header ID, so these fields should be split out, creating a CUSTOMER entity and a CUSTOMER ADDRESS entity.
Business Name is not a unique identifier, so you should add a Business Details ID field. You might invoice from several addresses. If this is the case, then you need to split out the address details to another entity called BUSINESS BRANCH. Add a unique identifier to this entity: Business Branch ID.
The CUSTOMER and CUSTOMER ADDRESS tables also need unique identifiers Customer ID and Customer Address ID.
You need to put a reference in each of the entities that have lost attributes to new entities so that they can relate to them. Therefore, Business Branch ID needs to go into the BUSINESS DETAILS entity, Customer Address ID should be added to CUSTOMER, and Business Details ID and Customer ID need to be attributes of the INVOICE HEADER attribute.
In the INVOICE LINE entity, Invoice Number is no longer needed, because it is wholly dependent on Invoice Header ID and already exists in the INVOICE HEADER entity. Item Code, Item Description, and Item Unit Price are all attributes of the item, not the invoice line, so these fields need to be split out to a new entity, called ITEM. You need to put a unique identifier Item ID on the ITEM entity and add that attribute to the INVOICE LINE as a foreign key.
Normalized data
In the example above, you now have seven entities:
INVOICE HEADER
- Invoice Header ID
- Invoice Number
- Invoice Date
- Business Details ID
- Customer ID
- Invoice Amount
BUSINESS DETAILS
- Business Details ID
- Business Name
- Business Branch ID
- VAT Number
BUSINESS BRANCH
- Business Branch ID
- Branch Address Line 1
- Branch Address Line 2
- Branch City
- Branch State
- Branch Country
- Branch Postal Code
CUSTOMER
- Customer ID
- Customer Name
- Customer Address ID
CUSTOMER ADDRESS
- Customer Address ID
- Customer Address Line 1
- Customer Address Line 2
- Customer City
- Customer State
- Customer Country
- Customer Postal Code
INVOICE LINE
- Invoice Line ID
- Invoice Header ID
- Invoice Line Number
- Item ID
- Quantity
- Line Total
ITEM
- Item ID
- Item Code
- Item Description
- Item Unit Price
As you can see, a relatively simple document has become a lot of structures. However, splitting out the data in this way will greatly reduce the amount of data that needs to be stored.
Database design
The next step is to put all of these entities in a plan, showing the attributes of each and marking the primary keys. It is then necessary to add on the relationships between these entities, which run from the foreign key in one table to the primary key in another.
This relationship must either be one-to-many, or many-to-one. This means that for an instance of a record in one entity, there must be several records in the related table. If you discover a one-to-one relationship, those two entities should be merged. If you see a many-to-many relationship, you must break it up by inserting an intermediate entity that will have a many-to-one relationship with the each of the two original entities.
The plan that this work creates is the Entity-Relationship Diagram, which is the foundation document for the relational database.
Database creation
Each of the elements in the Entity-Relation Diagram translates into database objects:
- Entity = table
- Attribute = table column
- Constraint / primary key = unique index populated by a sequence
RDBMS have several command sets. Objects are created by a Data Description Language (DDL). The command to create an object is:
CREATE