Meet Redis – Memory, Persistence and Advanced Data Structures

 


In my previous post I introduced Redis and showed a very basic client application that interacts with it. With the basics covered, I want to touch on some more advanced, frequently misunderstood features of Redis, mainly persistence. Unlike many other in-memory NoSQL key-value stores, Redis actually offers data persistence. There are two main persistence options with Redis:

  • RDB. RDB persistence (the default mode) performs point-in-time snapshots of your dataset at specified intervals.
  • AOF. AOF (append-only file) persistence logs every write operation received by the server. The log can later be replayed at server startup, reconstructing the original dataset (commands are logged using the same format as the Redis protocol itself).
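To give a feel for what "the same format as the Redis protocol" means, here is a minimal Python sketch that encodes one write command as a RESP array of bulk strings. The function name `encode_command` and the sample key and value are assumptions for illustration only, and a real AOF file can contain additional entries (for example a SELECT for the database), so treat this as a sketch of a single log entry rather than of the whole file.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings.

    Illustrative helper (not part of any Redis client library).
    """
    # "*<n>" announces an array with n elements
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode("utf-8")
        # "$<len>" announces a bulk string of <len> bytes, followed by the bytes
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


# A write such as SET 56760 user0 would be logged roughly like this
entry = encode_command("SET", "56760", "user0")
print(entry)
```

Because each entry is just the command as the client sent it, replaying the file amounts to feeding those commands back to the server in order.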

As mentioned, RDB is the default. This method is great for the following reasons:

  • RDB is a very compact single-file point-in-time representation of your Redis data. RDB files are perfect for backups.
  • RDB is very good for disaster recovery: being a single compact file, it can be transferred to remote data centers or cloud providers (Azure Storage, for example).
  • It is the best-performing option. The only work the Redis parent process needs to do in order to persist is to fork a child that does all the rest.

However, RDB may not be the best choice in these cases:

  • It is not the best HA strategy in case of a power or server outage. All data written after the last snapshot is gone on restore. For example, if you snapshot every 10 minutes, you may lose up to the last 10 minutes of data.
  • RDB needs to fork() often in order to persist to disk using a child process. fork() can be time consuming if the dataset is big, and may cause Redis to stop serving clients for some milliseconds, or even for a second if the dataset is very big and CPU performance is not great.

AOF, on the other hand, logs every write operation and can be replayed on startup (it sounds a lot like SQL Server transaction logs or Oracle redo logging). Pluses of this approach:

  • Much more durable, and a more accepted approach to HA.
  • The AOF log is append-only, so there are no seeks and no corruption problems if there is a power outage. Even if the log ends with a half-written command for some reason (disk full or other reasons), the redis-check-aof tool is able to fix it easily.
  • Redis is able to automatically rewrite the AOF in the background when it gets too big. The rewrite is completely safe: while Redis continues appending to the old file, a completely new one is produced with the minimal set of operations needed to create the current dataset. Once this second file is ready, Redis switches the two and starts appending to the new one.
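The idea of "the minimal set of operations needed to create the current dataset" can be illustrated with a small Python model. This is only a sketch under simplifying assumptions: it knows just SET and DEL, keeps the log as a list of tuples, and the names `apply`, `replay` and `rewrite` are made up for the example. It shows that a long log and its rewritten form rebuild the same data.

```python
def apply(state, command):
    """Apply one logged command to an in-memory dict (SET and DEL only)."""
    op, key, *rest = command
    if op == "SET":
        state[key] = rest[0]
    elif op == "DEL":
        state.pop(key, None)
    else:
        raise ValueError(f"unsupported command: {op}")


def replay(log):
    """Rebuild the dataset by applying every logged command in order."""
    state = {}
    for command in log:
        apply(state, command)
    return state


def rewrite(dataset):
    """Produce a minimal log: one SET per key that currently exists."""
    return [("SET", key, value) for key, value in dataset.items()]


# Five logged writes, but only two keys survive
old_log = [
    ("SET", "a", "1"),
    ("SET", "a", "2"),
    ("SET", "b", "x"),
    ("DEL", "b"),
    ("SET", "c", "3"),
]
dataset = replay(old_log)
new_log = rewrite(dataset)
print(len(old_log), "commands rewritten to", len(new_log))
```

Replaying `new_log` yields the same dataset as replaying `old_log`, which is why the switch to the new file is safe.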

Minuses for AOF are quite obvious:

  • Performance overhead and slower write operations.
  • AOF files are usually bigger than the equivalent RDB files for the same dataset.

What should I use for my application?

It depends. If you are using Redis strictly as an in-memory distributed cache backed by an RDBMS such as SQL Server or MySQL, using the Cache-Aside pattern (http://msdn.microsoft.com/en-us/library/dn589799.aspx), you may not need persistence beyond an occasional backup (let's say every 24 hours) provided by RDB. You will obviously get better transaction performance by minimizing disk I/O. If, on the other hand, you want a degree of data safety roughly comparable to a database, you should use both persistence methods. If you are somewhere in the middle, caring a lot about your data but able to live with a few minutes of data loss in case of disaster, you can simply use RDB alone. The Redis docs discourage using AOF alone, since having an RDB snapshot from time to time is a great idea for database backups, for faster restarts, and in the event of bugs in the AOF engine. In my opinion, although Redis is a fine NoSQL in-memory cache product, it's no substitute for an RDBMS, as it lacks the HA and other features of a full RDBMS product like SQL Server, so don't attempt to replace an RDBMS with Redis. It's a different product for a different purpose.

So how do I configure persistence in Redis? We go back to the redis.conf configuration file; in my case, as I am running MSOpenTech Redis on Windows, it's redis.windows.conf.

Let's start Redis with redis-server.exe.

Let's configure RDB for my Redis instance to save every 60 seconds if at least 1000 keys changed. We will do it using the save parameter:

 

save 60 1000

Actually this is what I set up in my configuration file:

################################ SNAPSHOTTING  #################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   Note: you can disable saving at all commenting all the "save" lines.
#
#   It is also possible to remove all the previously configured save
#   points by adding a save directive with a single empty string argument
#   like in the following example:
#
#   save ""

save 60 1000
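The rule described in the configuration comments (a save point fires only when both the elapsed time and the number of changes reach their thresholds) can be modeled in a few lines of Python. This is a sketch of the rule as stated in the comments, not Redis source code; the function name `should_snapshot` and its parameters are assumptions, and the save points are the three from the comment's example.

```python
def should_snapshot(save_points, seconds_since_save, changes_since_save):
    """Return True if any (seconds, changes) save point is satisfied.

    A save point is satisfied when at least `seconds` have elapsed since the
    last snapshot AND at least `changes` writes happened in that time.
    """
    return any(
        seconds_since_save >= seconds and changes_since_save >= changes
        for seconds, changes in save_points
    )


# The three save points from the example in the config comments
save_points = [(900, 1), (300, 10), (60, 10000)]

# 999 changed keys after one minute: no save point is satisfied yet
print(should_snapshot(save_points, 60, 999))
# The same 999 changes after five minutes satisfy "save 300 10"
print(should_snapshot(save_points, 300, 999))
```

With a single save point such as `save 60 1000`, the list would be `[(60, 1000)]`, and a snapshot would need at least 1000 changes within the 60 seconds.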

Where does RDB dump that snapshot? That is controlled via the dbfilename setting. Note that I will be dumping into my Redis working directory (the dir configuration parameter), to the file dump.rdb:

# The filename where to dump the DB
dbfilename dump.rdb

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
# 
# The Append Only File and the QFork memory mapped file will also be created 
# inside this directory.
# 
# Note that you must specify a directory here, not a file name.
dir ./

Now I will change my application from the first post to write 1000 keys to Redis and check what happens.

using System;
using StackExchange.Redis;

namespace RedisTest
{
    class Program
    {
        static void Main(string[] args)
        {
            // Connect to the local Redis instance and select database 0
            ConnectionMultiplexer redis = ConnectionMultiplexer.Connect("localhost");
            IDatabase db = redis.GetDatabase(0);

            // Write 1000 users to Redis, enough to satisfy "save 60 1000"
            for (int counter = 0; counter < 1000; counter++)
            {
                string key = "5676" + counter;
                string value = "user" + counter;
                db.StringSet(key, value);
            }

            // Retrieve the keys and values from Redis and print them
            for (int counter = 0; counter < 1000; counter++)
            {
                string key = "5676" + counter;
                string value = db.StringGet(key);
                Console.WriteLine(key + "," + value);
            }

            // Keep the console window open until Enter is pressed
            Console.ReadLine();
        }
    }
}

About 60 seconds after execution, the dump file dump.rdb shows up in my Redis directory.

Here is how I can restore Redis from dump.rdb (the appendonly steps only apply if you have AOF enabled):

  • Stop Redis (because Redis overwrites the current RDB file when it exits).
  • Copy your backup RDB file to the Redis working directory (this is the dir option in your Redis configuration, as you can see above). Also make sure your backup filename matches the dbfilename config option.
  • Change the Redis configuration appendonly flag to no (otherwise Redis will ignore your RDB file when it starts).
  • Start Redis.
  • Run redis-cli BGREWRITEAOF to create a new append-only file.
  • Restore the Redis configuration appendonly flag to yes.

Now that we have RDB snapshots working, how exactly do they work? Here is what the docs state. Whenever Redis needs to dump the dataset to disk, this is what happens:

  • Redis forks. We now have a child and a parent process.
  • The child starts to write the dataset to a temporary RDB file.
  • When the child is done writing the new RDB file, it replaces the old one.

This method allows Redis to benefit from copy-on-write semantics.
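The "write to a temporary file, then replace the old one" part of those steps is a general technique, and a rough Python sketch of it looks like this. The fork is deliberately left out, JSON stands in for the real RDB format, and the names `snapshot` and `load` are assumptions for the example. The point is only that readers of the dump file see either the old complete snapshot or the new complete one, never a half-written file.

```python
import json
import os
import tempfile


def snapshot(dataset, path):
    """Write the dataset to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".rdb")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(dataset, tmp)  # JSON is a stand-in for the RDB format
            tmp.flush()
            os.fsync(tmp.fileno())   # make sure the bytes reach the disk
        os.replace(tmp_path, path)   # atomic replacement of the old snapshot
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load(path):
    """Read a snapshot back into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as workdir:
    dump_path = os.path.join(workdir, "dump.rdb")
    snapshot({"56760": "user0"}, dump_path)
    snapshot({"56760": "user0", "56761": "user1"}, dump_path)  # replaces the first
    restored = load(dump_path)
    files_left = os.listdir(workdir)

print(restored)
```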

What about AOF? You can turn on AOF via the appendonly configuration parameter by setting it to yes.

appendonly yes

After you set this, every write command, such as SET, is appended to the AOF. When you restart Redis, it replays the AOF to rebuild its state.
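To make the append-and-replay cycle concrete, here is a toy Python store that logs each write to a file and rebuilds itself from that file on startup. Everything in it is an assumption made for illustration: the class name `TinyStore`, the one-JSON-line-per-command log format (real Redis logs commands in its own protocol format), and the support for only SET and DEL.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store that persists writes to an append-only log."""

    def __init__(self, aof_path):
        self.aof_path = aof_path
        self.data = {}
        self._replay()  # rebuild state from the log on "startup"

    def _replay(self):
        if not os.path.exists(self.aof_path):
            return
        with open(self.aof_path, encoding="utf-8") as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, command):
        op, key, *rest = command
        if op == "SET":
            self.data[key] = rest[0]
        elif op == "DEL":
            self.data.pop(key, None)

    def _write(self, command):
        # Change the in-memory data and append the same command to the log
        self._apply(command)
        with open(self.aof_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(command) + "\n")

    def set(self, key, value):
        self._write(["SET", key, value])

    def delete(self, key):
        self._write(["DEL", key])


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "appendonly.log")
    store = TinyStore(path)
    store.set("56760", "user0")
    store.set("56761", "user1")
    store.delete("56760")

    restarted = TinyStore(path)  # simulates a restart: state comes from the log
    recovered = dict(restarted.data)

print(recovered)
```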

In the previous post I showed how to work with strings in Redis. But the beauty of Redis is its ability to work with more complex data structures, like lists and hashes. Let's fire up the Redis CLI and see.

Lists. Lists let you store and manipulate an array of values for a given key. You can add values to the list, get the first or last value, and manipulate values at a given index. Lists maintain their order and have efficient index-based operations. Here I use the LPUSH command to push a new user with the value gennadyk onto the front of the list users_list.

Now, using the LRANGE command, I get a subset of the list. It takes the index of the first element you want to retrieve as its first parameter and the index of the last element as its second parameter. Negative indexes count from the end of the list, so a range of 0 to -1 retrieves all elements in the list.
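The index rules of LRANGE are easy to get wrong, because both ends are inclusive and negative indexes count from the tail. The following plain-Python model (no Redis connection involved) mimics LPUSH and LRANGE on a deque. The helper names mirror the commands but are written for this example, and the second user, alice, is made up.

```python
from collections import deque


def lpush(items, value):
    """Insert at the head of the list and return the new length."""
    items.appendleft(value)
    return len(items)


def lrange(items, start, stop):
    """Return elements from start to stop, both inclusive.

    Negative indexes count from the end of the list (-1 is the last element).
    """
    values = list(items)
    n = len(values)
    if start < 0:
        start = max(n + start, 0)
    if stop < 0:
        stop = n + stop
    if start > stop or start >= n:
        return []
    return values[start:min(stop, n - 1) + 1]


users_list = deque()
lpush(users_list, "gennadyk")
lpush(users_list, "alice")  # hypothetical second user, now at the front

print(lrange(users_list, 0, -1))   # whole list
print(lrange(users_list, 0, 0))    # first element only
print(lrange(users_list, -1, -1))  # last element only
```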

Sets. Sets are used to store unique values and provide a number of set-based operations, like unions. Sets in Redis are not ordered, unless you use Sorted Sets. Here I use the SADD command to add US to a set called countries.

The SMEMBERS command returns all of the set members.
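Python's built-in set behaves much like a Redis set, so it makes a handy model for the uniqueness guarantee and for unions. In this sketch the helper `sadd` imitates how SADD reports the number of members that were actually new; the extra country codes and the second set `visited` are invented for the example.

```python
def sadd(target, *members):
    """Add members to the set and return how many were actually new."""
    before = len(target)
    target.update(members)
    return len(target) - before


countries = set()
added_first = sadd(countries, "US")  # new member
added_again = sadd(countries, "US")  # duplicate, the set is unchanged
sadd(countries, "CA")                # hypothetical extra member

visited = {"US", "MX"}               # hypothetical second set
union = countries | visited          # analogous to a set union in Redis

print(added_first, added_again)
print(sorted(countries))             # sorted only to make the output stable
print(sorted(union))
```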

Hashes. Hashes are a good example of why calling Redis a key-value store isn't quite accurate. In a lot of ways, hashes are like strings. The important difference is that they provide an extra level of indirection: a field. For example, I can track my superheroes, such as superman, spiderman and the mythical frogman, each with a power level field so they can be compared to each other.

So now I can retrieve not only my superman, but his power level field as well.
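The "extra level of indirection" can be pictured as a dictionary of dictionaries: the key selects the hash, and the field selects a value inside it. The Python model below is only an illustration. The helpers are named after the HSET, HGET and HGETALL commands but are written for this example, and the field name `power` and all the numbers are assumptions, since the original values are not given in the text.

```python
def hset(store, key, field, value):
    """Set one field of a hash; return 1 if the field is new, 0 if updated."""
    fields = store.setdefault(key, {})
    is_new = field not in fields
    fields[field] = value
    return 1 if is_new else 0


def hget(store, key, field):
    """Return one field of a hash, or None if the key or field is missing."""
    return store.get(key, {}).get(field)


def hgetall(store, key):
    """Return a copy of every field and value of a hash."""
    return dict(store.get(key, {}))


heroes = {}
hset(heroes, "superman", "power", "90")   # hypothetical power levels
hset(heroes, "spiderman", "power", "70")
hset(heroes, "frogman", "power", "10")

print(hgetall(heroes, "superman"))        # the whole superman hash
print(hget(heroes, "superman", "power"))  # just the power level field
```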

 

Sorted Sets. If hashes are like strings but with fields, then sorted sets are like sets but with a score. The score provides sorting and ranking capabilities. Perhaps my superheroes sample above would be better expressed as a sorted set than as hashes.
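Here is how the superheroes might look as a sorted set, again as a plain-Python model rather than real Redis calls. Members are unique, each carries a score, and ordering is by score with ties broken by member name. The helper names echo ZADD and ZRANK but are written for this example, and the power levels are made-up numbers.

```python
def zadd(zset, score, member):
    """Add or update a member's score; return 1 if the member is new."""
    is_new = member not in zset
    zset[member] = float(score)
    return 1 if is_new else 0


def ordered(zset):
    """Members sorted by score, ties broken by member name."""
    return sorted(zset, key=lambda member: (zset[member], member))


def zrank(zset, member):
    """Zero-based position of a member in score order, or None if absent."""
    if member not in zset:
        return None
    return ordered(zset).index(member)


power = {}
zadd(power, 10, "frogman")    # hypothetical power levels
zadd(power, 90, "superman")
zadd(power, 70, "spiderman")

print(ordered(power))            # weakest to strongest
print(zrank(power, "superman"))  # rank of superman in that order
```

Compared with the hash version, the ranking comes for free: there is no need to fetch every hero and sort on the client.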

There is a lot more to Redis, much more than I can cover without writing a book. The best reference for Redis commands is at http://redis.io/commands, and there is a very good blog post on Redis data structures and the CLI at http://stillimproving.net/2013/06/redis-101.

Mariusz covers Redis persistence really well in this blog post: http://mariuszprzydatek.com/2013/08/29/redis-persistence/. The official Redis persistence docs are at http://redis.io/topics/persistence, and there is a great deeper entry at http://oldblog.antirez.com/post/redis-persistence-demystified.html

Hope this helps.


3 thoughts on “Meet Redis – Memory, Persistence and Advanced Data Structures”

  1. Pingback: Meet Redis – Masters, Slaves and Scaling Out | A posteriori
    • Hello Anirudh ,
      Usually people configure either one, RDB or AOF, for persistence. Snapshots via SAVE/BGSAVE are very useful and are used with RDB as a sort of backup you can recover from. With AOF you recover from the append-only file instead of a snapshot. RDB is more performant with less overhead, but there is possible data loss between the time of the last snapshot and the time of an outage; with AOF there is little or no data loss (depending on the fsync policy), as the append-only file has all of the data, but there is much higher overhead for high-throughput scenarios. So usually you set up RDB with snapshots or AOF, not both.

      Hope this helps, for more see – http://redis.io/topics/persistence
