Dealing with structured data

Since the 1970s, database systems have been based on relational principles. These are often referred to as SQL databases, after the language used to access the information. The simple example we have been looking at so far was designed as a set of relational tables, but other types of database, collectively known as NoSQL databases, are becoming popular. In this case we are using CouchDB, but the system also works identically with MongoDB.

Let's look at the same very simple application using both approaches to identify the differences. 

In the relational database example (above left), the structure of the data is constrained by three rules:

  1. The data is always stored in 'flat' records in a table, much like a spreadsheet. There is no nested structure within a record.
  2. Every record (called a 'row' for obvious reasons) has the same fields in the same order. These are pre-defined for the database in a 'schema', which you have to create before you can use the database.
  3. There are many exam papers for each subject, so they have to be stored in a separate table. You need to make sure that you can't delete a parent record (a subject) while there are child records (exam papers) that link to it. This issue is called referential integrity and is supported by SUDSJS.

The process of reducing the data model to individual tables is called normalisation.  This model has served the industry well for half a century. 
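
As a rough illustration of the three relational rules, here is a minimal TypeScript sketch. It is not SUDSJS code: the names SubjectRow, ExamPaperRow and deleteSubject, and the sample data, are hypothetical and chosen only to show flat rows, a link from child to parent, and a referential integrity check.

```typescript
// Hypothetical flat records: every row in a table has the same fields.
interface SubjectRow {
  id: number;
  name: string;
}

interface ExamPaperRow {
  id: number;
  subjectId: number; // link (foreign key) from the paper to its subject
  title: string;
}

// Two separate "tables" with made-up sample rows.
const subjects: SubjectRow[] = [
  { id: 1, name: "Mathematics" },
  { id: 2, name: "History" },
];

const examPapers: ExamPaperRow[] = [
  { id: 10, subjectId: 1, title: "Algebra" },
  { id: 11, subjectId: 1, title: "Geometry" },
];

// Referential integrity: refuse to delete a parent row that still has children.
function deleteSubject(id: number): boolean {
  const hasChildren = examPapers.some((paper) => paper.subjectId === id);
  if (hasChildren) {
    return false; // deletion blocked, exam papers still link to this subject
  }
  const index = subjects.map((subject) => subject.id).indexOf(id);
  if (index === -1) {
    return false; // no such subject
  }
  subjects.splice(index, 1);
  return true;
}
```

In this sketch, deleting subject 1 is refused because two papers still link to it, while subject 2 has no papers and can be removed. A real relational database would normally enforce this with a foreign key constraint rather than application code.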

In the document database example (above right) a lot is new. Data is stored in records called 'documents'. 

  1. The documents are structured, so if you want to store the date as day/month/year you define a date group and then day, month and year below it. You can't do that with a flat relational record, so you end up with structured field names such as start-day, start-month, etc.
  2. Different documents don't have to have the same information. If a subject needs different information to the rest, then it can be included. There is no fixed schema.
  3. Instead of links from the papers to the subjects, we can list the papers in the subject record. There is no fixed limit to the number of papers you could store in this way. This is sometimes called denormalised data and can improve performance. Note that the arrow is pointing in the opposite direction.
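
The three document features can be sketched in TypeScript as follows. This is an illustration under assumed names only: SubjectDocument, startDate, labEquipment and the sample values are hypothetical and are not taken from SUDSJS, CouchDB or MongoDB.

```typescript
// Hypothetical shape of a subject document.
interface ExamPaper {
  title: string;
}

interface SubjectDocument {
  name: string;
  // 1. Structured data: the date is a group with day, month and year below it.
  startDate?: { day: number; month: number; year: number };
  // 2. Variable content: only some subjects carry this extra field.
  labEquipment?: string[];
  // 3. Papers are listed inside the subject instead of linking back to it.
  papers: ExamPaper[];
}

const mathematics: SubjectDocument = {
  name: "Mathematics",
  startDate: { day: 1, month: 9, year: 2024 },
  papers: [{ title: "Algebra" }, { title: "Geometry" }],
};

// A document with different content from the one above.
const chemistry: SubjectDocument = {
  name: "Chemistry",
  labEquipment: ["burette", "fume cupboard"],
  papers: [{ title: "Organic chemistry" }],
};

// Reading the papers needs no second table and no join.
function paperTitles(subject: SubjectDocument): string[] {
  return subject.papers.map((paper) => paper.title);
}
```

Here the optional fields stand in for the absence of a fixed schema: the two documents carry different information, and each subject holds its own list of papers.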

We have used the system to model the same problem using each of these features:

  1. A relational model
  2. Structured data
  3. Variable record content

Next: The relational model