Forecast Cloudy – Troubleshooting and Debugging Azure PaaS Roles – Architecture and Built-In Logging

First of all, these posts would be pretty much impossible without the MSDN blog posts of Kevin Williamson and the awesome work done by my colleague Jeff Van Nortwick running Windows Debugger\WinDBG in Azure. They both moved me to dig deeper into this topic, and huge thanks to both for the information they shared with me.

So for a while all of us treated Azure PaaS as a “black box development platform”, and I am fairly sure in some ways it was intended to be so when introduced. Yet that went completely against my idea that in order to develop well on something you have to understand its internals – after all, I like to think of myself as a sort of DevOps character, residing somewhere between development and infrastructure with a good dose of DBA thrown in. It’s against my nature to state that “something doesn’t work well and there is nothing we can do about it”; instead, my favorite activity is to collect and pore through data to understand the cause – finding the virtual “needle in the haystack”. Therefore I was really excited when someone pointed me to Kevin Williamson’s blog.

What is great there is an awesome explanation of Azure architecture – not the big cubes in Visio that some architects draw with no details, but in terms of processes spawning more processes.  So how does Azure actually work?

At a high level, like this:

[Image: high-level Azure datacenter architecture diagram]

Azure considers each rack a ‘node’ of compute power and puts a switch on top of it. Each node – servers plus top-of-rack switch – is considered a ‘fault domain’, i.e., a possible point of failure. An aggregator and load balancers manage groups of nodes, and all feed back to the Fabric Controller (FC), the operational heart of Azure.

The FC gets its marching orders from the “Red Dog Front End” (RDFE). RDFE takes its name from nomenclature left over from Dave Cutler’s original Red Dog project that became Azure. The RDFE acts as a kind of router for requests and traffic to and from the load balancers and Fabric Controller.

The Fabric Controller does all the heavy lifting for Azure. It provisions, stores, delivers, monitors and commands the virtual machines (VMs) that make up Azure. It is a “distributed stateful application distributed across data center nodes and fault domains.”

In English, this means there are a number of Fabric Controller instances running in various racks. One is elected to act as the primary controller. If it fails, another picks up the slack. If the entire FC fails, all of the operations it started, including the nodes, keep running, albeit without much governance until it comes back online. If you start a service on Azure, the FC can fall over entirely and your service is not shut down.

The Fabric Controller automates pretty much everything, including new hardware installs. New blades are configured for PXE and the FC has a PXE boot server in it. It boots a ‘maintenance image,’ which downloads a host operating system (OS) that includes all the parts necessary to make it an Azure host machine. Sysprep is run, the system is rebooted as a unique machine and the FC sucks it into the fold.

[Image: Azure host provisioning and PXE boot workflow]

This is explained very well by Mark Russinovich in his various very good talks – http://channel9.msdn.com/Events/TechEd/NorthAmerica/2013/WAD-B402#fbid=, http://azure.microsoft.com/en-us/documentation/videos/mark-russinovich-windows-on-azure/ , http://blogs.technet.com/b/markrussinovich/archive/2012/08/22/3515679.aspx

However, what I was more interested in is the particular Azure role architecture, and here is where Kevin’s information was really helpful.  Kevin’s diagram is large and complex as you can see below, so let’s dig into its main parts:

[Image: Kevin Williamson’s Azure role architecture diagram]

  • RDFE is still the main communication path from you (the user) to the fabric. RDFE APIs are used by the portal, Visual Studio, SDKs, etc. Therefore all requests from the user to the fabric go through RDFE (Red Dog Front End).  FFE (Fabric Front End) is the layer which translates requests from RDFE into fabric commands. All requests from RDFE go through the FFE to reach the fabric controllers.
  • I also already mentioned the Fabric Controller above. Essentially it’s a watchdog responsible for monitoring and administering all of the VMs\resources in the data center.
  • But here is a new item – the Host Agent. As the name implies (and it only makes sense in a virtualized environment), this lives on the host and communicates with the guest agents (there are, as I now understand, two of them). Here comes the idea of a “heartbeat”, well known to many from Windows Clustering for example: the Host Agent receives a “heartbeat” from the guest, and if the guest doesn’t respond for 10 minutes its OS is restarted.
  • Guest agents – WaAppAgent and WindowsAzureGuestAgent. Now we are on the guest\role itself. WindowsAzureGuestAgent is responsible for guest OS configuration, communicating status\heartbeat to the host, and setting up the SID for the user account which the role will run under. Interestingly enough, WaAppAgent is the “overseer” of WindowsAzureGuestAgent, as it installs, configures and updates WindowsAzureGuestAgent.exe itself.
  • WindowsAzureGuestAgent spins up another process – WaHostBootstrapper. That process is responsible for the role configuration, starting up all of the appropriate tasks and processes to configure and run the role, monitoring its own child processes, and raising the StatusCheck event on the role host process.
  • Startup tasks are defined by the role model and started by WaHostBootstrapper. Startup tasks can be configured to run in Background mode asynchronously, where the host bootstrapper will start the startup task and then continue on to other startup tasks, or in Simple (default) mode, where the host bootstrapper will wait for the startup task to finish running and return a success (0) exit code before continuing on to the next startup task (see the sketch after this list). When expanded into startup tasks, DiagnosticsAgent and RemoteAccessAgent are unique in that they each define two startup tasks: one regular and one with a /blockStartup parameter. The normal startup task is defined as a Background task so that it can run while the role itself is running. The /blockStartup task is defined as a Simple task so that WaHostBootstrapper will wait for it to exit before continuing; it simply waits for the regular task to finish initializing and then exits, allowing the host bootstrapper to continue. The reason this is done is so that diagnostics and RDP access can be configured prior to the role processes starting up (via the /blockStartup task), and can continue running after the host bootstrapper has finished with startup tasks (via the normal task).
  • Now, what is most important for troubleshooting custom code on PaaS roles – the worker processes. WaWorkerHost is the standard host process for normal worker roles. This host process hosts all of the role’s DLLs and entry point code such as OnStart and Run. WaWebHost is the standard host process for web roles when they are configured to use the SDK 1.2 compatible Hostable Web Core (HWC). WaIISHost is the host process for role entry point code for web roles using Full IIS. This process will load the first DLL found which implements the RoleEntryPoint class (this DLL is defined in E:\__entrypoint.txt) and execute the code from this class (OnStart, Run, OnStop). Any RoleEnvironment events (i.e. StatusCheck, Changed, etc.) created in the RoleEntryPoint class will be raised in this process. W3WP is the standard IIS worker process, used when the role is configured for Full IIS. This will run the AppPool configured by IISConfigurator.
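To make the Simple vs. Background distinction above concrete, here is a minimal sketch of how startup tasks are declared in the role model (ServiceDefinition.csdef). The command file names are placeholders of my own, not from Kevin’s post:

<WorkerRole name="WorkerRole1">
  <Startup>
    <!-- Simple (default): WaHostBootstrapper blocks until this exits with code 0 -->
    <Task commandLine="setup.cmd" executionContext="elevated" taskType="simple" />
    <!-- Background: WaHostBootstrapper starts this and moves on -->
    <Task commandLine="monitor.cmd" executionContext="limited" taskType="background" />
  </Startup>
  ...
</WorkerRole>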

So that’s definitely a lot of technical theory, courtesy of Kevin. What’s more interesting is what he says about logs – yes, logs available to you and me to troubleshoot common Azure role issues, the Wa*.log files and event logs.  So now for a bit of practice – let’s get to these logs and look at them.

First, before we examine logs we need some sort of Azure PaaS role deployed, either a web or worker role. Let’s build a little test worker role and deploy it to Azure.
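As a reminder of what the role entry point code looks like, here is roughly the skeleton the Visual Studio worker role template generates – a class deriving from RoleEntryPoint, which is exactly what WaWorkerHost loads and calls OnStart and Run on (trace messages and names here are just the template defaults):

using System.Diagnostics;
using System.Net;
using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

namespace WorkerRole1
{
    public class WorkerRole : RoleEntryPoint
    {
        public override bool OnStart()
        {
            // Set the maximum number of concurrent connections
            ServicePointManager.DefaultConnectionLimit = 12;
            return base.OnStart();
        }

        public override void Run()
        {
            // This is a sample worker implementation. Replace with your logic.
            Trace.WriteLine("WorkerRole1 entry point called", "Information");
            while (true)
            {
                Thread.Sleep(10000);
                Trace.WriteLine("Working", "Information");
            }
        }
    }
}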

So I created a basic Azure worker role, named it testservicegennadyk and deployed it to Azure. In order to get to the logs I mentioned above I will need to RDP to my role; therefore when I publish my role I have to enable RDP. I can do that in the publish settings for my role in Visual Studio:

[Image: Remote Desktop settings in the Visual Studio publish dialog]

Now I can go to the Azure Portal and see\connect to my role:

[Image: the deployed role instance in the Azure Portal]

Now I can see the following:

Windows Azure Event Logs – via Event Viewer –> Applications and Services Logs –> Windows Azure. These contain important diagnostic output from the Windows Azure runtime, including information such as role starts/stops, startup tasks, OnStart start and stop, OnRun start, crashes, recycles, etc.

[Screenshot: Windows Azure event log in Event Viewer]

 

Application Event Logs – via Event Viewer –> Windows Logs –> Application. The standard troubleshooting source for any application development.

App Agent Runtime Logs – in C:\Logs\AppAgentRuntime.log. These logs are written by WindowsAzureGuestAgent.exe and contain information about events happening within the guest agent and the VM.  This includes information such as firewall configuration, role state changes, recycles, reboots, health status changes, role stops/starts, certificate configuration, etc.

[Screenshot: guest agent runtime log contents]

App Agent Heartbeat Logs – in C:\Logs\WaAppAgent.log. These logs are written by WaAppAgent.exe and contain information about the status of the health probes. They are typically useful for determining the current state of the role within the VM, as well as what the state was at some time in the past. You can definitely see great information on heartbeat pings in that log, like:

[00000004] [01/29/2015 19:36:01.91] [HEART] WindowsAzureGuestAgent Heartbeat.
[00000004] [01/29/2015 19:36:01.91] [INFO]  Role a95af93b891f4451ac81ac670af28d5f.WorkerRole1_IN_0 is reporting state Ready.
[00000009] [01/29/2015 19:36:04.49] [INFO]  Role a95af93b891f4451ac81ac670af28d5f.WorkerRole1_IN_0 has current state Started, desired state Started, and goal state execution status StartSucceeded.

Host Bootstrapper Logs – in C:\Resources\WaHostBootstrapper.log. This log contains entries for startup tasks (including plugins such as Caching or RDP) and health probes to the host process running your role entry point code (i.e. WebRole.cs code running in WaIISHost.exe). A new log file is generated each time the host bootstrapper is restarted (i.e. each time your role is recycled due to a crash, recycle, VM restart, upgrade, etc.), which makes these logs easy to use to determine how often and when your role recycled.

If this were a web role I would also be able to get to the IIS logs and HTTP.SYS logs, but with a worker role sample things will be more limited.  IIS logs would be in C:\Resources\Directory\{DeploymentID}.{Rolename}.DiagnosticStore\LogFiles\Web and HTTP.SYS logs in D:\Windows\System32\LogFiles\HTTPERR.

Hope this was helpful. In the next post I plan to actually install WinDBG on my sample role and attempt to debug it.


Forecast Cloudy – Using Microsoft Azure Redis Cache

 

In my previous posts I introduced Redis, in particular the Microsoft port of that open source technology to Windows by MSOpenTech. In this post I want to show how you can use the Azure Redis Cache based on that port, which went to general availability around the time of the Microsoft TechEd 2014 conference in May 2014.

Creating Azure Redis Cache:

First, log in with your credentials to the new Azure Portal at https://portal.azure.com. Pick Browse on the home page, hit the New button in the lower left corner, and pick Redis Cache:

[Image: creating a new Redis Cache in the Azure Portal]

 

First enter the DNS name; that will be the subdomain name to use for the cache endpoint. The endpoint must be a string between six and twenty characters, contain only lowercase letters and numbers, and must start with a letter.

Use Pricing Tier to select the desired cache size and features. Azure Redis Cache is available in the following two tiers:

  • Basic – single node, multiple sizes up to 53 GB.
  • Standard – Two node Master/Slave, 99.9% SLA, multiple sizes up to 53 GB.

For Subscription, select the Azure subscription that you want to use for the cache. In Resource group, select or create a resource group for your cache.

Use Geolocation to specify the geographic location in which your cache is hosted. For the best performance, Microsoft strongly recommends that you create the cache in the same region as the cache client application.

[Image: Redis Cache pricing tier and settings blades]

At this point I hit Create and my cache is created.  After a bit, here is what you see on the portal:

[Image: the newly created Redis Cache in the portal]

 

Using Azure Redis Cache

We start by getting the connection details; this requires two steps:

1. Get the URI, which is easy to get using the properties window, as shown below

[Image: cache properties window showing the host name]

2. Once you have copied the host name URL, copy the password, which you can grab from the Keys area.

Now that we have these two important bits that we need to connect, let’s start using our cache service.

Just like in my on-premises demo previously, I will use the StackExchange.Redis package from NuGet in my client.  Below is a very basic way to put\retrieve strings from Azure Redis (the host name and access key in the Connect call are deliberately blanked out – substitute your own):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using StackExchange.Redis;

namespace RedisTest
{
    class Program
    {
        static void Main(string[] args)
        {
            // Substitute your own cache host name and access key (blanked out here)
            ConnectionMultiplexer redis = ConnectionMultiplexer.Connect("<cachename>.redis.cache.windows.net,ssl=true,password=<accesskey>");
            IDatabase db = redis.GetDatabase();

            // Perform cache operations using the cache object...
            // Simple put of string data types into the cache
            db.StringSet("GM", "GENERAL MOTORS");
            db.StringSet("F", "FORD");
            db.StringSet("MSFT", "MICROSOFT");

            // Retrieve keys\values from Redis
            string value = db.StringGet("GM");
            System.Console.Out.WriteLine("GM" + "," + value);

            System.Console.ReadLine();
        }
    }
}

And yet the above demo is a bit boring, as you usually will not store cache items like this in real life. Therefore the snippet below will use complex classes with properties and the magic of JSON serialization to put these into Redis Cache:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using StackExchange.Redis;
using Newtonsoft.Json;

namespace RedisTest
{
    class Program
    {
        static void Main(string[] args)
        {
            ConnectionMultiplexer redis = ConnectionMultiplexer.Connect("gennadykredis.redis.cache.windows.net,ssl=true,password=nsq/4GhGUKu8OidX15Eo6raWhC/Z5KefSnu5uwy1IRo=");
            IDatabase db = redis.GetDatabase();

            // Perform cache operations using the cache object...
            // Put OBJECT data types into the cache as serialized JSON
            Product myProd = new Product() { Price = 127.99, Description = "Bicycle" };
            var serializedmyProd = JsonConvert.SerializeObject(myProd);
            db.StringSet("serializedmyProd", serializedmyProd);

            Product myProd2 = new Product() { Price = 4999.99, Description = "Motorcycle" };
            var serializedmyProd2 = JsonConvert.SerializeObject(myProd2);
            db.StringSet("serializedmyProd2", serializedmyProd2);

            // Retrieve keys\values from Redis and deserialize back into Products
            var myProd3 = JsonConvert.DeserializeObject<Product>(db.StringGet("serializedmyProd"));
            var myProd4 = JsonConvert.DeserializeObject<Product>(db.StringGet("serializedmyProd2"));

            System.Console.Out.WriteLine(myProd3.Description + " " + myProd4.Description);
            System.Console.ReadLine();
        }
    }

    class Product
    {
        public string Description { get; set; }
        public double Price { get; set; }
    }
}

So as with my previous blogs on Redis, the demo is pretty simple; however, it shows you the foundation to move deeper as necessary. For more on Redis Cache on Microsoft Azure see – http://azure.microsoft.com/en-us/documentation/services/cache/, http://msdn.microsoft.com/en-us/library/azure/dn690516.aspx, http://msdn.microsoft.com/en-us/library/azure/dn690523.aspx and https://sachabarbs.wordpress.com/2014/11/25/azure-redis-cache/

Hope this is helpful; it has been different and fun for me.

Meet Redis – Masters, Slaves and Scaling Out

redis

In my previous posts I introduced Redis and attempted to show how it can work with advanced data structures, as well as its persistence options. Another important Redis feature is master-slave asynchronous replication. Data from any Redis server can replicate to any number of slaves, and a slave may in turn be a master to another slave; this allows Redis to implement a single-rooted replication tree. Redis slaves can be configured to accept writes, permitting intentional and unintentional inconsistency between instances. Replication is useful for read (but not write) scalability and for data redundancy.

[Image: Redis master-slave replication topology]

How does Redis replication work? According to the Redis docs, this is the workflow for asynchronous replication:

  • If you set up a slave, upon connection it sends a SYNC command. It doesn’t matter if it’s the first time it has connected or if it’s a reconnection.
  • The master then starts background saving, and starts to buffer all new commands received that will modify the dataset. When the background saving is complete, the master transfers the database file to the slave, which saves it on disk, and then loads it into memory. The master will then send to the slave all buffered commands. This is done as a stream of commands and is in the same format of the Redis protocol itself.
  • Slaves are able to automatically reconnect when the master <-> slave link goes down for some reason. If the master receives multiple concurrent slave synchronization requests, it performs a single background save in order to serve all of them.
  • When a master and a slave reconnect after the link went down, a full resync is always performed. However, starting with Redis 2.8, a partial resynchronization is also possible.

So Redis master-slave replication can be useful in a number of scenarios:

  • Scaling performance by using the replicas for intensive read operations.
  • Data redundancy in multiple locations
  • Offloading data persistence costs, in terms of expensive disk IO (covered in the last post), from the master by delegating it to the slaves

So, if replication is pretty useful for read-only scale out – how do I configure it? Configuring replication is trivial: just add the following line to the slave configuration file (the slave instance’s redis.conf):

slaveof <masterip> <masterport>

Example:

slaveof 10.84.16.18 6379

More importantly, you can use the SLAVEOF command in the Redis CLI to switch replication on the fly – http://redis.io/commands/slaveof.  If a Redis server is already acting as a slave, the command SLAVEOF NO ONE will turn off replication, turning the Redis server into a master. In the proper form, SLAVEOF hostname port will make the server a slave of another server listening at the specified hostname and port.

Since Redis 2.6, slaves support a read-only mode that is enabled by default. This behavior is controlled by the slave-read-only option in the redis.conf file, and can be enabled and disabled at runtime using CONFIG SET.
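To illustrate, a quick redis-cli session combining SLAVEOF and the read-only setting might look like the following; the 10.84.16.* addresses are just the hypothetical hosts from the example above:

redis-cli -h 10.84.16.19 -p 6379
10.84.16.19:6379> SLAVEOF 10.84.16.18 6379
OK
10.84.16.19:6379> CONFIG GET slave-read-only
1) "slave-read-only"
2) "yes"
10.84.16.19:6379> SLAVEOF NO ONE
OK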

That’s great, but what if for HA purposes I need automated failover from master to slave? Enter Redis Sentinel – a system designed to help manage Redis instances.  It does the following:

  • Sentinel constantly checks if your master and slave instances are working as expected
  • Sentinel can notify the system administrator, or another computer program, via an API, that something is wrong with one of the monitored Redis instances.
  • If a master is not working as expected, Sentinel can start a failover process where a slave is promoted to master, the other additional slaves are reconfigured to use the new master, and the applications using the Redis server are informed about the new address to use when connecting.
  • Sentinel acts as a source of authority for client service discovery: clients connect to Sentinels in order to ask for the address of the current Redis master responsible for a given service. If a failover occurs, Sentinels will report the new address.

For more on Redis Sentinel see – http://redis.io/topics/sentinel. Unfortunately the MSOpenTech port of Redis on Windows doesn’t support this feature, so I couldn’t easily test it here; I hope that in a future blog entry testing Redis on a Linux flavor I can show you Sentinel configuration and failover.
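Still, for reference, a minimal sentinel.conf from the Redis docs looks roughly like the sketch below; mymaster is just the logical name you give the monitored master, and the trailing 2 in the monitor line is the quorum of Sentinels that must agree the master is down before a failover starts:

sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1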

However, even though the above are great features, there is one item missing here that was present, for example, in AppFabric Cache – a distributed cluster capable of linear scale out for write traffic. Yes, theoretically I can have multiple masters in Redis as well, however you would have to build some sort of sharding mechanism, as multiple folks did in Silicon Valley (Instagram and Facebook, I believe, did so) to scale out. Fortunately, there is a new Redis Cluster project. Redis Cluster provides a way to run a Redis installation where data is automatically sharded across multiple Redis nodes.

Commands dealing with multiple keys are not supported by the cluster, because this would require moving data between Redis nodes, making Redis Cluster unable to provide Redis-like performance and predictable behavior under load. Redis Cluster also provides some degree of availability during partitions – in practical terms, the ability to continue operations when some nodes fail or are not able to communicate. So here is what you get with Redis Cluster:

  • The ability to automatically split your dataset among multiple nodes (true scale out)
  • The ability to continue operations when a subset of the nodes are experiencing failures or are unable to communicate with the rest of the cluster.

Every Redis Cluster node requires two open TCP connections: the normal Redis TCP port used to serve clients, for example 6379, plus the port obtained by adding 10000 to the data port, so 16379 in the example. This second high port is used for the Cluster bus, a node-to-node communication channel using a binary protocol. The Cluster bus is used by nodes for failure detection, configuration updates, failover authorization and so forth. Clients should never try to communicate with the cluster bus port, but always with the normal Redis command port; however, make sure you open both ports in your firewall, otherwise Redis Cluster nodes will not be able to communicate.

To create a cluster, the first thing we need is to have a few empty Redis instances running in cluster mode. This basically means that clusters are not created using normal Redis instances; a special mode needs to be configured so that the Redis instance will enable the cluster-specific features and commands. Therefore we will add the following to the configuration (redis.conf):

port 6379
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes

As you can see, what enables the cluster mode is simply the cluster-enabled directive. Every instance also contains the path of a file where the configuration for this node is stored, by default nodes.conf. This file is never touched by humans; it is simply generated at startup by the Redis Cluster instances, and updated every time it is needed.

Note that the minimal cluster that works as expected requires at least three master nodes.

When the instances are set up and running in cluster mode, next you need to create the cluster using the Redis Cluster command line utility – redis-trib. The redis-trib utility is in the src directory of the Redis source code distribution. An example of use would be something like:

./redis-trib.rb create host1.domain.com:6379 host2.domain.com:6379 host3.domain.com:6379

As Redis Cluster is still a work in progress, check out the Redis Cluster spec – http://redis.io/topics/cluster-spec and the doc pages – http://redis.io/topics/cluster-tutorial. For internals and details on Redis Cluster also see this presentation from Redis – http://redis.io/presentation/Redis_Cluster.pdf

Hope this helps.

Meet Redis – Memory, Persistence and Advanced Data Structures

 

redis

In my previous post I introduced Redis and showed a very basic client application that interacts with it. With the basics covered by that post, I want to touch on some more advanced, frequently misunderstood features of Redis, mainly persistence. Unlike many other NoSQL in-memory distributed key-value stores, Redis actually offers data persistence.  There are two main persistence options with Redis:

  • RDB. The RDB persistence (default mode) performs point-in-time snapshots of your dataset at specified intervals
  • AOF. The AOF (append-only file) persistence logs every write operation received by the server; these operations can later be “played” again at server startup, reconstructing the original dataset (commands are logged using the same format as the Redis protocol itself).

As mentioned, RDB is the default. This method is great for the following:

  • RDB is a very compact single-file point-in-time representation of your Redis data. RDB files are perfect for backups
  • RDB is very good for disaster recovery, being a single compact file can be transferred to far data centers or cloud providers (Azure  Storage)
  • It is the best performing option. The only work the Redis parent process needs to do in order to persist is forking a child that will do all the rest.

However, RDB may not be best for:

  • It is not the best HA strategy in case of a power or server outage, as all of the data after the last snapshot is gone on restore. For example, if you snapshot every 10 minutes, you may lose up to the last 10 minutes of data.
  • RDB needs to fork() often in order to persist to disk using a child process. Fork() can be time consuming if the dataset is big, and may cause Redis to stop serving clients for some milliseconds, or even for one second if the dataset is very big and CPU performance is not great.

AOF, on the other hand, logs every write operation and can be replayed on startup (oh, how that sounds like SQL transaction logs or Oracle REDO logging). Pluses for this approach:

  • Much more durable, more accepted HA
  • The AOF log is an append-only log, so there are no seeks, nor corruption problems if there is a power outage. Even if the log ends with a half-written command for some reason (disk full or other reasons), the redis-check-aof tool is able to fix it easily.
  • Redis is able to automatically rewrite the AOF in background when it gets too big. The rewrite is completely safe as while Redis continues appending to the old file, a completely new one is produced with the minimal set of operations needed to create the current data set, and once this second file is ready Redis switches the two and starts appending to the new one.

Minuses for AOF are quite obvious:

  • Performance overhead and slower write operations.
  • AOF files are usually bigger than the equivalent RDB files for the same dataset.

What should I use for my application?

It depends. If you are using Redis strictly as an in-memory distributed cache store backed by some RDBMS, like SQL Server or MySQL, using the Cache-Aside pattern – http://msdn.microsoft.com/en-us/library/dn589799.aspx (sketched below) – you may not need persistence beyond an occasional backup (let’s say every 24 hours) provided by RDB. You will obviously get more transaction performance by minimizing any kind of disk IO. If, on the other hand, you want a degree of data safety roughly comparable to a database, you should use both persistence methods. If you are somewhere in the middle – you care a lot about your data, but can still live with a few minutes of data loss in case of disasters – you can simply use RDB alone. The Redis docs discourage use of AOF alone, since having an RDB snapshot from time to time is a great idea for doing database backups, for faster restarts, and in the event of bugs in the AOF engine. In my opinion, although Redis is a fine NoSQL in-memory cache product, it’s no substitute for an RDBMS, as it lacks HA and other features of a full RDBMS product like SQL Server – so don’t attempt to substitute an RDBMS with Redis. It’s a different product for a different purpose.
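To make the cache-aside idea concrete, here is a minimal sketch using StackExchange.Redis and Json.NET; LoadProductFromDatabase is a hypothetical placeholder standing in for your SQL Server or MySQL query, and Product mirrors the class from my Azure Redis post:

using System;
using StackExchange.Redis;
using Newtonsoft.Json;

class CacheAsideDemo
{
    static ConnectionMultiplexer redis = ConnectionMultiplexer.Connect("localhost");

    public static Product GetProduct(string id)
    {
        IDatabase db = redis.GetDatabase();
        string cached = db.StringGet("product:" + id);
        if (cached != null)
            return JsonConvert.DeserializeObject<Product>(cached); // cache hit

        Product p = LoadProductFromDatabase(id); // cache miss: go to the RDBMS
        // Repopulate the cache with a TTL so stale entries eventually expire
        db.StringSet("product:" + id, JsonConvert.SerializeObject(p), TimeSpan.FromMinutes(10));
        return p;
    }

    static Product LoadProductFromDatabase(string id)
    {
        // Hypothetical placeholder for a real database query
        return new Product { Description = "Bicycle", Price = 127.99 };
    }
}

class Product
{
    public string Description { get; set; }
    public double Price { get; set; }
}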

So how do I configure persistence in Redis? Well, we are going back to the redis.conf configuration file – in my case, as I am running MSOpenTech Redis on Windows, it’s redis.windows.conf:

[Screenshot: redis.windows.conf in the Redis directory]

Let’s start Redis with redis-server.exe:

[Screenshot: redis-server.exe starting up]

 

Let’s configure RDB for my Redis instance to save every 60 seconds if at least 1000 keys changed.  We will do it using the save parameter:

 

save 60 1000

Actually this is what I set up in my configuration file:

################################ SNAPSHOTTING  #################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   Note: you can disable saving at all commenting all the "save" lines.
#
#   It is also possible to remove all the previously configured save
#   points by adding a save directive with a single empty string argument
#   like in the following example:
#
#   save ""

save 60 1000

Where does RDB dump that snapshot? Well, that’s controlled via the dbfilename configuration parameter. Note I will be dumping into my Redis directory (the dir configuration parameter) to the file dump.rdb:

# The filename where to dump the DB
dbfilename dump.rdb

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
# 
# The Append Only File and the QFork memory mapped file will also be created 
# inside this directory.
# 
# Note that you must specify a directory here, not a file name.
dir ./

Now I will change my application from the first post to write 1000 keys to Redis and check what happens:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using StackExchange.Redis; 

namespace RedisTest
{
    class Program
    {
        static void Main(string[] args)
        {
            ConnectionMultiplexer redis = ConnectionMultiplexer.Connect("localhost");
            IDatabase db = redis.GetDatabase(0);
            int counter;
            string value;
            string key;
            // Create 1000 users and put them into Redis
            for (counter = 0; counter < 1000; counter++)
            {
                 value = "user" + counter.ToString();
                 key = "5676" + counter.ToString();
                 db.StringSet(key,value);
            }
      
        // Retrieve keys\values from Redis
            for (counter = 0; counter < 1000; counter++)
            {
                key = "5676" + counter.ToString();
                value = db.StringGet(key);
                System.Console.Out.WriteLine(key + "," + value);
            }
            System.Console.ReadLine();




        }
    }
}

And here is my dump file 60 seconds after execution:

[Screenshot: dump.rdb created in the Redis directory]
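By the way, if you don’t want to wait for a save point to trigger, you can force a snapshot from the command line and check when the last one completed – BGSAVE and LASTSAVE are standard Redis commands (the timestamp below is just an example; yours will differ):

redis-cli BGSAVE
Background saving started
redis-cli LASTSAVE
(integer) 1422560161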

Here is how I can restore Redis from dump.rdb (a rough command-line sketch follows this list):

  • Stop redis (because redis overwrites the current rdb file when it exits).
  • Copy your backup rdb file to the redis working directory (this is the dir option in your redis conf, as you can see above). Also make sure your backup filename matches the dbfilename config option.
  • Change the redis configuration appendonly flag to no (otherwise redis will ignore your rdb file when it starts).
  • Start Redis
  • Run redis-cli BGREWRITEAOF to create a new appendonly file.
  • Restore redis configuration appendonly flag to yes.
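Put together as a rough Windows command-line sketch (the backup path is an assumption of mine for illustration):

redis-cli SHUTDOWN NOSAVE
copy C:\backups\dump.rdb .\dump.rdb
rem edit redis.windows.conf and set: appendonly no
redis-server redis.windows.conf
redis-cli BGREWRITEAOF
rem set appendonly back to yes in redis.windows.conf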

Now that we have RDB snapshots working, how exactly do they work? Here is what the docs state. Whenever Redis needs to dump the dataset to disk, this is what happens:

  • Redis forks. We now have a child and a parent process.
  • The child starts to write the dataset to a temporary RDB file.
  • When the child is done writing the new RDB file, it replaces the old one.

This method allows Redis to benefit from copy-on-write semantics.

What about AOF?  Well, you can turn on AOF via the appendonly configuration parameter, setting it to yes:

appendonly yes

After you set this, every write issued via SET will be appended to the AOF. When you restart Redis, it will replay the AOF to rebuild the state.
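While we are in that part of the configuration file, it is worth knowing the related appendfsync directive, which controls how often the AOF is fsynced to disk. These are the three stock options from redis.conf, with everysec being the default and the usual compromise:

# fsync on every write: safest but slowest
# appendfsync always

# fsync once per second: the default and usual compromise
appendfsync everysec

# let the OS decide when to flush: fastest but least safe
# appendfsync no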

In the previous post I showed operating with strings in Redis. But the beauty of Redis is its ability to work with more complex data structures, like lists and hashes. Let’s fire up the Redis CLI and see.

Lists. Lists let you store and manipulate an array of values for a given key. You can add values to the list, get the first or last value, and manipulate values at a given index. Lists maintain their order and have efficient index-based operations. Here I push a new user onto the front of the list (users_list) with the value gennadyk.

[Screenshot: LPUSH in the Redis CLI]

Now, using the LRANGE command, I get a subset of the list. It takes the index of the first element you want to retrieve as its first parameter and the index of the last element you want to retrieve as its second parameter. A value of -1 for the second parameter means to retrieve all remaining elements in the list.

[Screenshot: LRANGE in the Redis CLI]
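In case the screenshots do not come through, here is roughly what those two list operations look like in the CLI, with the key and value taken from the prose above:

127.0.0.1:6379> LPUSH users_list gennadyk
(integer) 1
127.0.0.1:6379> LRANGE users_list 0 -1
1) "gennadyk"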

Sets. Sets are used to store unique values and provide a number of set-based operations, like unions. Sets in Redis are not ordered, unless you use sorted sets.  Here I use the SADD command to add US to a set called countries.

[Screenshot: SADD in the Redis CLI]

The SMEMBERS command returns all of the set members:

[Screenshot: SMEMBERS in the Redis CLI]
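Again, roughly what that pair of commands looks like in the CLI:

127.0.0.1:6379> SADD countries US
(integer) 1
127.0.0.1:6379> SMEMBERS countries
1) "US"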

Hashes. Hashes are a good example of why calling Redis a key-value store isn’t quite accurate. You see, in a lot of ways, hashes are like strings. The important difference is that they provide an extra level of indirection: a field. For example, here I am tracking my superheroes, such as superman, spiderman and the mythical frogman, with their power level property in relation to each other:

[Screenshot: HSET in the Redis CLI]

So now I can retrieve not only my superman, but his power level field as well:

[Screenshot: HGET in the Redis CLI]
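Roughly, in the CLI (the field name and power level here are my own made-up values):

127.0.0.1:6379> HSET superman power_level 100
(integer) 1
127.0.0.1:6379> HGET superman power_level
"100"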

 

Sorted Sets. If hashes are like strings but with fields, then sorted sets are like sets but with a score. The score provides sorting and ranking capabilities. Perhaps my superheroes sample above would be better expressed as a sorted set vs. hashes – something like the sketch below.
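A quick sketch of what that might look like, with made-up scores; ZRANGE with WITHSCORES returns the members ordered by score:

127.0.0.1:6379> ZADD superheroes 100 superman
(integer) 1
127.0.0.1:6379> ZADD superheroes 85 spiderman
(integer) 1
127.0.0.1:6379> ZADD superheroes 20 frogman
(integer) 1
127.0.0.1:6379> ZRANGE superheroes 0 -1 WITHSCORES
1) "frogman"
2) "20"
3) "spiderman"
4) "85"
5) "superman"
6) "100"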

There is a lot more to Redis, much more than I can cover without writing a book. The best reference to Redis commands is here – http://redis.io/commands, and a very good blog on Redis data structures and the CLI is here – http://stillimproving.net/2013/06/redis-101.

Marius covers Redis persistence really well in this blog – http://mariuszprzydatek.com/2013/08/29/redis-persistence/, the official Redis persistence docs are here – http://redis.io/topics/persistence, and there is a great deeper entry here – http://oldblog.antirez.com/post/redis-persistence-demystified.html

Hope this helps.