The main purpose of this post is to try to provide insights about what SQL and NoSQL are, what they currently offer, which kind of problems are targeting, and where are they going to in the near future.
Also, we will try to provide hints about how to decide which technology to use, or if it could be better to use both while trying to find a better solution
SQL stands for (Standard Query Language),
Created in the 1970s to allow people to use the relational data model without extensive programming or mathematic training.
SQL is the standard language for Relational Database System. Used to create objects, store, update, delete and retrieve data from databases.
Relational Database management system (RDBMS)
Data is stored in Columns which are attributes (name, age, address),
Rows are a set of columns on a single instance for these attributes (all the data for a single person)
Tables are a set of rows and columns and are related to other tables and tables grouped into Databases.
All data has strong declared datatypes on the table for example ID INT, name VARCHAR (String), Date (Datetime), etc. Also, the meaning of the columns should be decided at the design stage of the database. This causes the design time to increase considerably since all the data types, schema, relationships, constraints, business rules, etc. Have to be predefined and they have to be robust and tested enough to ensure the data model will work.
A database schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized, how the data is associated with these rules and relationships and formulates all the constraints that are to be applied to the data.
SQL databases also have a strongly fixed schema defined, which means its design can be changed but it may imply a having downtime on the Database and a lot of effort to accomplish these changes. Business rules are defined also at this level
In addition to tables and fields
Some things that a database schema defines are:
- Database links
Reducing anomalies and protects the integrity of your database by prescribing exactly how transactions interact with the database.
A business rule could be easily explained as the steps, rules and validations data should pass to be stored on the database.
Business rule Example: A sale on the sales table should validate that the client exists on Clients table, one new client should be stored on Client’s table, all clients should have its legal information complete, then it should validate that there is enough inventory for the items on the sale, so go to check inventory table and then continue with the sales process, etc.
It started to be known as No SQL databases but now is defined as Not Only SQL, this was because SQL Database can feed data to NoSQL databases.
The term started to be popular by the late ’90s and early 2000s
When relational databases reach a few GB, they start to become slow and any operation or scalability starts to be very expensive and difficult.
NoSQL started as a database meant to address these issues with a highly scalable and with a fast answer on processing huge amounts of data without failing or taking too long to complete.
Since NoSQL engines are free, these are the three main features NoSQL improved: scalability, speed, and cost.
NoSQL databases sacrificed ACID (Atomicity, Consistency, Isolation, and Durability) compliance for flexibility and processing speed.
Non-Relational Databases :
Usually grouped into four categories: Key-value stores, Column stores, Document stores, and Graph stores
Key-value: Stores data in a schema-less way that consists of indexed keys and values.
Column store: Stores data tables as columns rather than rows sectioning out columns allows for excellent scalability and high performance.
Document database: taking the key-value concept and adding more complexity, each document in this type of database has its own data, and its own unique key, which is used to retrieve it.
Graph database: Have data that are interconnected and best represented as a graph.
Schema Flexible (unstructured)
Data doesn’t have any datatype it just came as data as it comes, (Text, images, video, sound, SQL Databases, etc.)
On this example, the employee collection is creating on-demand and inserting one record directly
Present (What they are and what they are not)
|SQL databases also have a strong schema defined which means, all data have declared datatypes on the table for example ID INT, name VARCHAR (String), Date (Datetime), etc||Data doesn’t have any datatypes or formant, we need to process or store it just as it comes.|
|Vertically scalable (by the power of server’s hardware capacity)
You can manage the increasing load by increasing the CPU, RAM, SSD, etc,
|Horizontal scalable (adding servers in the pool to reduce the load)
You can just add a few more servers easily in your NoSQL database infrastructure to handle the large traffic.
|Uses SQL (language)||UnQL (Unstructured Query language), sometimes engines have their own language, or they may have a SQL-Like language to query on them or could be accessed using programming languages directly.|
|Complex queries compatible||Non-standard interfaces, difficult or impossible to perform complex queries and transactions|
|MySQL, PostgreSQL, SQL Server, Oracle||MongoDB, Cassandra, Hbase, BigTable, CouchDB, Redis, Neo4j|
Good things on SQL
- Powerful query language
- Can handle large numbers of transactions in a single query
- Keeps data integrity, unicity, and consistency (ACID Compliance).
- Uses transactional model
- Maturity of standards and visualization tools
- Many different choices of the database, and wide support
Not so good things on SQL
- The predefined and inflexible data model
- Bad design implies a lot of future work
- Performance decrease as the amount of data increase
- Long-time spent on the design stage
- High cost on licensing
- Realtime and streaming data performance is poor
Good things on NoSQL
- Flexible data models can be changed on the fly without affecting existing data
- Good at storing large datasets/objects
- Create a database without having to develop
- Lot of engines are opensource, so costs are low
- No need to develop a detailed model
Not so good things on NoSQL
- Lack of enterprise-level support and reporting tools
- Query languages are varied and often not very powerful
- Slow for searching and complex queries across multiple tables/collections
- Data retrieved isn’t always up to date
- Complex transactions are even more complex
Future (Trends and what we can expect)
We will talk about a term named “NewSQL”.
These databases are a relational model and Transactional databases, that seek to provide the same scalability, performance, and flexibility of NoSQL systems. Keeping the ACID (Atomicity, Consistency, Isolation, and Durability) from traditional databases, creating a whole new architecture to allow working as a cluster, advanced in-memory processing, high concurrency, and streaming capability.
There are a few databases engines that are now considered as NewSQL, such as Amazon Aurora, Google Spanner, Cockroach DB, Apache Ignite, Volt DB, MemSQL, Clustrix, Infobright, TokuDB, also, popular engines are releasing features to start competing on this field such as MySQL cluster or SQL Server (with columnstore and InMemory features).
As we have been reading on this post NoSQL is not meant to replace SQL databases, but NewSQL is aiming to do it eventually or that popular engines adapt these new architectures and features to work as NoSQL and maybe in the future those two would be merged as one.
Good things on NewSQL:
- Minimize application complexity stronger consistency and often full transactional support.
- Familiar SQL and standard tooling.
- Richer analytics leveraging SQL and extensions.
- Many systems offer “NoSQL-style” clustering with more traditional data and query models.
Not so good things on NewSQL
- No NewSQL systems are as general-purpose as traditional SQL systems set out to be.
- In-memory architectures may be inappropriate for volumes exceeding a few terabytes.
- It offers only partial access to the rich tooling of traditional SQL systems.
How and when to choose SQL vs NoSQL
- There are logically related data which can be identified up-front and is unlikely to change
- Transactions on your system must be reliable
- Data integrity is essential
- Consistency of data is more important than partition tolerance
- You want standards-based proven technology that developers are used to working with
- Commercial applications with good enterprise support are preferred
- Unrelated, indeterminate or evolving data requirements
- Simpler or looser project objectives, perfect for Agile projects where you can start coding immediately without design or definitions
- Speed and scalability are important
- Realtime answers needed
- Open source technology with good community support is preferred
NoSQL is not trying to replace SQL, it is targeting a part of the solutions when SQL was becoming slow and expensive. On data solutions, there always be lots of sources and in the near future there will be more kinds coming such as NewSQL and others that are still on the development stage.
On Intersys we are working with these database technologies and we have qualified people who can help you to find a solution on both SQL and NoSQL to improve the systems on your company.