Getting Started with NoSQL
Chapters
This course introduces you to NoSQL databases and touches on a range of related subjects. We will use CouchDB to explain concepts that are common to many NoSQL databases.
This course is divided into 16 chapters:
- Intro
- Prerequisites
- What is NoSQL
- What kinds of data stores are available
- Learn the CAP theorem
- When to use NoSQL databases
- CouchDB
- Install CouchDB on Windows
- Install CouchDB on Linux
- Storing data
- Retrieve stored data
- Querying data, creating custom views and using Map Reduce functions
- A word on attachments, querying attachments and deploying a simple webpage application on CouchDB
- Securing CouchDB
- Partitioning
- Other NoSQL databases and a word on where to go from here
We hope that you will enjoy this course. If you have any feedback, please send it through.
Author: Subject Coach
Added on: 2nd Jan 2015
NoSQL databases mainly serve the goals of data scalability and performance. The scaling strategy, which spreads data across multiple nodes, introduces an additional layer of complexity. This is where the CAP theorem comes into play.
The CAP theorem states that a distributed system can provide only two of the following three guarantees across a read and write pair:
Consistency, where a read is guaranteed to return the most recent write.
Availability, where a non-failing node returns a reasonable response within a reasonable time frame.
Partition tolerance, where the system keeps operating normally during a network partition.
One of the above must be sacrificed. Relational databases such as MySQL and Oracle promise consistency and availability.
Relational systems are the databases we have been using for a long time. RDBMSs, and other systems that support ACID (Atomicity, Consistency, Isolation, and Durability) and joins, are considered relational.
Key-value systems support get, put, and delete operations based on a primary key.
Column-oriented systems still use tables but have no joins (joins must be handled within your application). It's very easy to map data from object-oriented software to these systems.
Document-oriented systems store structured "documents" such as JSON or XML but do not support joins. It's very easy to push data to and retrieve data from these stores, and the documents can be mapped quite easily from object-oriented software.
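To make the key-value and document models more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `KeyValueStore`, `Article`, `to_document` and `from_document` are hypothetical names invented for this example and are not part of CouchDB or any other product. The store supports just get, put and delete by key, and an application object is mapped to a JSON document before being stored.

```python
import json
from dataclasses import dataclass, asdict


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        # Return True if the key existed, False otherwise.
        if key in self._data:
            del self._data[key]
            return True
        return False


@dataclass
class Article:
    """Hypothetical application object to be stored as a document."""
    title: str
    tags: list


def to_document(obj):
    """Serialize a dataclass instance to a JSON document string."""
    return json.dumps(asdict(obj))


def from_document(doc, cls):
    """Rebuild an object of type cls from a JSON document string."""
    return cls(**json.loads(doc))


store = KeyValueStore()
article = Article(title="Intro to NoSQL", tags=["nosql", "cap"])

# The object maps directly to one JSON document stored under one key.
store.put("article:1", to_document(article))
restored = from_document(store.get("article:1"), Article)
print(restored)
```

Note that nothing here resembles a join: to combine an article with, say, its author, the application would have to fetch both documents and combine them itself.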
Let me try to explain this in a friendlier way.
CAP is like a triangle with 3 vertices. Let's name these 3 vertices:
1. Consistency.
2. Availability.
3. Partition tolerance.
RDBMSs promise CA, which means they provide Consistency and Availability.
NoSQL databases fall into two categories. The first promises AP (Availability and Partition tolerance); examples are Cassandra and CouchDB.
The second category promises CP (Consistency and Partition tolerance); examples are MongoDB and Redis.
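The difference between the AP and CP categories can be sketched with a toy two-replica cluster in Python. This is a deliberately simplified model, not how any of the databases named above is implemented: the `Cluster` and `Replica` classes and the `partitioned` flag are hypothetical, and a real CP system would typically keep serving requests on the majority side of a partition rather than refusing everything.

```python
class Replica:
    """One node holding a single value (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.value = None


class Cluster:
    """Two replicas; mode is 'CP' or 'AP'; partitioned simulates a network split."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # The peer is unreachable: refuse the write rather than diverge.
                raise RuntimeError("unavailable during partition")
            # AP: accept the write locally; the peer is now stale.
            replica.value = value
            return
        # No partition: the write reaches both replicas.
        replica.value = value
        other.value = value

    def read(self, replica):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable during partition")
        return replica.value


# AP cluster: stays available during the partition but may return stale data.
ap = Cluster("AP")
ap.write(ap.a, "v1")
ap.partitioned = True
ap.write(ap.a, "v2")
print(ap.read(ap.a), ap.read(ap.b))

# CP cluster: stays consistent but rejects requests during the partition.
cp = Cluster("CP")
cp.write(cp.a, "v1")
cp.partitioned = True
try:
    cp.write(cp.a, "v2")
    cp_write_ok = True
except RuntimeError:
    cp_write_ok = False
print(cp_write_ok)
```

In the AP run the two replicas disagree until the partition heals; in the CP run they never disagree, at the cost of turning requests away.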
CAP tradeoffs may be irrelevant in some cases. If your data fits in a few megabytes, partitioning doesn't really make sense. On the other hand, if you have a high-traffic site and you are storing article comments in a NoSQL store, it doesn't really matter if a reader briefly sees data that is not fully current, because the latest comments will become available at some point. In this scenario you don't really have to worry about consistency.
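The article-comments scenario can be sketched as follows. This is a minimal Python illustration that assumes every comment has a unique ID and that comments are only ever added, never edited or deleted; the `sync` function and the node dictionaries are hypothetical and do not correspond to a real replication protocol.

```python
def sync(replica_a, replica_b):
    """Exchange missing comments so both replicas end up holding the union."""
    merged = {**replica_a, **replica_b}
    replica_a.clear()
    replica_a.update(merged)
    replica_b.clear()
    replica_b.update(merged)


# Two nodes of a comment store, each mapping comment ID -> comment text.
node_1 = {}
node_2 = {}

# Two visitors post comments that land on different nodes.
node_1["c1"] = "Great article!"
node_2["c2"] = "Thanks for sharing."

# A reader served by node_2 temporarily sees only part of the comments.
stale_view = sorted(node_2)
print(stale_view)

# Once the nodes synchronize, every reader sees the same set of comments.
sync(node_1, node_2)
print(sorted(node_1), node_1 == node_2)
```

The reader's view is briefly incomplete but never wrong in a way that matters for comments, which is why giving up strict consistency is acceptable here.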
The CAP theorem can be used as a guide for categorizing the tradeoffs between different databases. Consistency, availability, and partition tolerance are all desirable properties in a database.
You should be fine with either a CP or an AP system, depending on the kind of data you are storing and on your application.