Meet Memcached in the Clouds – Setting Up Memcached as a Service via Amazon ElastiCache


 

In my previous post I introduced you to Memcached, an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering. This continued my interest in in-memory NoSQL cache systems like AppFabric Cache and Redis.

Both Redis and Memcached are offered by AWS as a managed cloud (PaaS) service called ElastiCache. With ElastiCache, you can quickly deploy your cache environment without having to provision hardware or install software. You can choose from Memcached or Redis protocol-compliant cache engine software, and let ElastiCache perform software upgrades and patch management for you automatically. For enhanced security, ElastiCache runs in the Amazon Virtual Private Cloud (Amazon VPC) environment, giving you complete control over network access to your cache cluster. With just a few clicks in the AWS Management Console, you can add resources to your ElastiCache environment, such as additional nodes or read replicas, to meet your business needs and application requirements.

Existing applications that use Memcached or Redis can use ElastiCache with almost no modification; your applications simply need to know the host names and port numbers of the ElastiCache nodes that you have deployed. The ElastiCache Auto Discovery feature lets your applications identify all of the nodes in a cache cluster and connect to them, rather than having to maintain a list of available host names and port numbers; in this way, your applications are effectively insulated from changes to cache node membership.

Before I show you how to set up an AWS ElastiCache cluster, let's go through the basics:

Data Model

The Amazon ElastiCache data model concepts include cache nodes, cache clusters, security configuration, and replication groups. The ElastiCache data model also includes resources for event notification and performance monitoring; these resources complement the core concepts.

Cache Nodes and Cluster

A cache node is the smallest building block of an ElastiCache deployment. Each node has its own memory, storage and processor resources, and runs a dedicated instance of cache engine software, either Memcached or Redis. ElastiCache provides a number of different cache node configurations for you to choose from, depending on your needs. You can use these cache nodes on an on-demand basis, or take advantage of reserved cache nodes at significant cost savings.

A cache cluster is a collection of one or more cache nodes, each of which runs its own instance of supported cache engine software. You can launch a new cache cluster with a single ElastiCache operation (CreateCacheCluster), specifying the number of cache nodes you want and the runtime parameters for the cache engine software on all of the nodes. Each node in a cache cluster has the same compute, storage and memory specifications, and they all run the same cache engine software (Memcached or Redis). The ElastiCache API lets you control cluster-wide attributes, such as the number of cache nodes, security settings, version upgrades, and system maintenance windows.

Cache parameter groups are an easy way to manage runtime settings for supported cache engine software. Memcached has many parameters to control memory usage, cache eviction policies, item sizes, and more; a cache parameter group is a named collection of Memcached-specific parameters that you can apply to a cache cluster.

Memcached clusters contain from 1 to 20 nodes, across which you can horizontally partition your data.
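To make "horizontally partition" a bit more concrete, here is a minimal sketch of how a client could decide which node owns a key. It uses a plain hash-modulo scheme, which is my simplification; many real Memcached clients use consistent hashing instead, so that adding or removing a node remaps fewer keys. The node host names are made up for illustration.

```python
import hashlib

def pick_node(key: str, nodes: list[str]) -> str:
    """Map a key to one node by hashing the key modulo the node count."""
    if not nodes:
        raise ValueError("at least one node is required")
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]

# Hypothetical node endpoints, for illustration only.
NODES = [
    "node-0001.example.internal:11211",
    "node-0002.example.internal:11211",
    "node-0003.example.internal:11211",
]

for key in ("user:42", "session:abc", "page:/home"):
    print(key, "->", pick_node(key, NODES))
```

The same key always lands on the same node as long as the node list does not change, which is what lets every application server find the cached value without any coordination.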


To create a cluster via the ElastiCache console, follow these steps:

1. Open the Amazon ElastiCache console at https://console.aws.amazon.com/elasticache/.

2. Pick Memcached from the Dashboard on the left.

3. Choose Create.

4. Complete the Settings section.

 


As you enter the settings, please note the following:

1. In Name, enter the desired cluster name. It must begin with a letter and contain 1 to 20 alphanumeric characters or hyphens; it cannot contain two consecutive hyphens or end with a hyphen.

2. In Port, you can accept the default of 11211. If you have a reason to use a different port, type the port number.

3. For Parameter group, choose the default parameter group, pick an existing parameter group you want to use with this cluster, or choose Create new to create a new parameter group for this cluster.

4. For Number of nodes, choose the number of nodes you want for this cluster. You will partition your data across the cluster's nodes. If you need to change the number of nodes later, scaling horizontally is quite easy with Memcached.

5. Choose how you want the Availability zone(s) selected for this cluster. You have two options:

  1. No Preference. ElastiCache selects an availability zone for each node in your cluster.
  2. Specify availability zones. You specify an availability zone for each node in your cluster.

6. For Security groups, choose the security groups you want to apply to this cluster.

7. The Maintenance window is the time each week, generally an hour in length, when ElastiCache schedules system maintenance for your cluster. You can let ElastiCache choose the day and time for your maintenance window (No preference), or you can choose the day, time, and duration yourself.

8. Now check all of the settings and choose Create.
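The same settings can also be sent through the CreateCacheCluster operation mentioned earlier. The sketch below assumes the boto3 SDK is installed and configured with credentials; the node type value and the helper names are my own choices for illustration, and the 1 to 20 limits simply mirror the rules quoted in this post.

```python
import re

# Name rule from the settings list: starts with a letter, letters, digits or
# hyphens only, no two consecutive hyphens, no trailing hyphen.
_NAME_RE = re.compile(r"[A-Za-z](?:-?[A-Za-z0-9])*")

def is_valid_cluster_name(name: str) -> bool:
    return len(name) <= 20 and _NAME_RE.fullmatch(name) is not None

def build_create_request(name: str, num_nodes: int,
                         node_type: str = "cache.t3.micro",  # assumed node type
                         port: int = 11211) -> dict:
    """Assemble keyword arguments for the CreateCacheCluster operation."""
    if not is_valid_cluster_name(name):
        raise ValueError(f"invalid cluster name: {name!r}")
    if not 1 <= num_nodes <= 20:
        raise ValueError("a Memcached cluster here holds 1 to 20 nodes")
    return {
        "CacheClusterId": name,
        "Engine": "memcached",
        "CacheNodeType": node_type,
        "NumCacheNodes": num_nodes,
        "Port": port,
    }

def create_cluster(request: dict) -> dict:
    """Send the request to AWS. Needs boto3 and valid AWS credentials."""
    import boto3  # third-party SDK, imported lazily so the helpers above work without it
    client = boto3.client("elasticache")
    return client.create_cache_cluster(**request)

print(build_create_request("demo-cache", 3))
```

Calling create_cluster(build_create_request("demo-cache", 3)) would start a real, billable cluster, so I only print the request here.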


For more information on the Memcached-specific parameters you can set on your ElastiCache cluster, see http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/ParameterGroups.Memcached.html

For clusters running the Memcached engine, ElastiCache supports Auto Discovery: the ability for client programs to automatically identify all of the nodes in a cache cluster, and to initiate and maintain connections to all of these nodes. From the application's point of view, connecting to the cluster configuration endpoint is no different from connecting directly to an individual cache node.

Process of Connecting to Cache Nodes

1. The application resolves the configuration endpoint’s DNS name. Because the configuration endpoint maintains CNAME entries for all of the cache nodes, the DNS name resolves to one of the nodes; the client can then connect to that node.

2. The client requests the configuration information for all of the other nodes. Since each node maintains configuration information for all of the nodes in the cluster, any node can pass configuration information to the client upon request.

3. The client receives the current list of cache node hostnames and IP addresses. It can then connect to all of the other nodes in the cluster.

The configuration information for Auto Discovery is stored redundantly in each cache cluster node. Client applications can query any cache node and obtain the configuration information for all of the nodes in the cluster.
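The steps above can be sketched in a few lines of Python. As far as I understand the ElastiCache documentation, a client asks any node for the node list with the text command "config get cluster" and gets back a version number followed by space-separated hostname|ip|port entries. Treat the exact response layout, the helper names and the sample host names below as assumptions for illustration.

```python
import socket

def parse_cluster_config(payload: bytes):
    """Return (config_version, [(hostname, ip, port), ...]) from a reply."""
    lines = [ln.strip() for ln in payload.decode("ascii").splitlines() if ln.strip()]
    if len(lines) < 3 or not lines[0].startswith("CONFIG cluster"):
        raise ValueError("unexpected Auto Discovery response")
    version = int(lines[1])
    nodes = []
    for entry in lines[2].split():
        host, ip, port = entry.split("|")
        nodes.append((host, ip, int(port)))
    return version, nodes

def fetch_cluster_config(host: str, port: int = 11211, timeout: float = 2.0):
    """Ask the configuration endpoint (or any node) for the current node list."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"config get cluster\r\n")
        data = b""
        while not data.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_cluster_config(data)

# Made-up sample reply; the length field (136) is illustrative and not validated.
SAMPLE = (
    b"CONFIG cluster 0 136\r\n"
    b"12\n"
    b"node1.example.cache.amazonaws.com|10.0.0.11|11211 "
    b"node2.example.cache.amazonaws.com|10.0.0.12|11211\n"
    b"\r\n"
    b"END\r\n"
)

version, nodes = parse_cluster_config(SAMPLE)
print(version, nodes)
```

In practice you would normally let an Auto Discovery aware client library do this for you; the sketch is only meant to show what travels over the wire.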

For more information see – http://cloudacademy.com/blog/amazon-elasticache/, http://www.allthingsdistributed.com/2011/08/amazon-elasticache.html, https://www.sitepoint.com/amazon-elasticache-cache-on-steroids/

Hope this helps.

Meet Memcached – Introduction to Another In-Memory Caching System


After writing in this blog about Microsoft AppFabric Cache and Redis, I will be following up with Memcached. Similarly to the above-mentioned systems, Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering. Just like them, it is intended for use in speeding up dynamic web applications by alleviating database load.

Memcached was originally developed by Danga Interactive for LiveJournal, but is now used by many other systems, including MocoSpace, YouTube, Reddit, Facebook, Tumblr and Wikipedia. Engine Yard and Jelastic are using Memcached as part of their platform as a service technology stack, and Heroku offers several Memcached services as part of their platform as a service. Google App Engine, AppScale, Microsoft Azure and Amazon Web Services also offer a Memcached service through an API.

So Why Memcached?

  • Free & Open source
  • High performance
  • Simple to set up
  • Ease of development
  • APIs are available for most popular languages

Important to note what Memcached is NOT:

  • a persistent data store
  • a database
  • application-specific
  • a large object cache
  • fault-tolerant or
  • highly available

 

Memcached's primary storage algorithm is a hash table.

A hash table is basically a power-of-2 sized array of pointers to entries. Collisions in the hash table are resolved via separate chaining with linked chains of entries: each entry consists of a pointer to the key, a pointer to the value and a pointer to the next chained entry. The hash table size is chosen simply as the ceiling power of 2 closest to the doubled number of entries the hash table needs to contain. This means the effective load factor varies from 0.5 to 1.0, with an average of 0.75. Incidentally, the same load factor management strategy is implemented in the unordered_map of the GNU C++ standard library (libstdc++).
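To illustrate the structure described above, here is a toy hash table with separate chaining in Python. It is only a sketch of the idea and not Memcached's actual C code: it never resizes, and the class and method names are mine.

```python
class Entry:
    """One chained entry: key, value and a link to the next entry in the bucket."""
    __slots__ = ("key", "value", "next")

    def __init__(self, key, value, next_entry=None):
        self.key = key
        self.value = value
        self.next = next_entry

class ChainedHashTable:
    def __init__(self, power: int = 4):
        # Power-of-two sized bucket array, every bucket starts as an empty chain.
        self.buckets = [None] * (1 << power)
        self.count = 0

    def _index(self, key) -> int:
        # With a power-of-two size, a bit mask replaces the modulo operation.
        return hash(key) & (len(self.buckets) - 1)

    def set(self, key, value) -> None:
        i = self._index(key)
        entry = self.buckets[i]
        while entry is not None:          # walk the chain looking for the key
            if entry.key == key:
                entry.value = value
                return
            entry = entry.next
        # Not found: prepend a new entry to the bucket's chain.
        self.buckets[i] = Entry(key, value, self.buckets[i])
        self.count += 1

    def get(self, key, default=None):
        entry = self.buckets[self._index(key)]
        while entry is not None:
            if entry.key == key:
                return entry.value
            entry = entry.next
        return default

table = ChainedHashTable(power=2)         # only 4 buckets, so chains must form
for n in range(10):
    table.set(f"key{n}", n)
print(table.get("key7"), table.count)
```

With 10 keys in 4 buckets, several keys necessarily share a bucket, and lookups still succeed by walking the chain.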


Memcached supports multithreaded access to the store. It controls access to the resources via bare POSIX thread mutexes. Operations on hash table buckets are guarded by one of the pthread_mutex objects in a power-of-two sized array. The size of this array cannot be larger than the hash table size. The index of the mutex for a bucket is determined as bucket index % bucket mutex array size, i.e. each mutex is responsible for hash table size / bucket mutex array size buckets.
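This technique is often called lock striping. Below is a small Python sketch of the indexing rule, using threading.Lock in place of pthread mutexes; the class name and the sizes are made up for the example.

```python
import threading

def _is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0

class StripedLocks:
    """A fixed array of locks shared by all buckets of a hash table."""

    def __init__(self, table_size: int, lock_count: int):
        if not (_is_power_of_two(table_size) and _is_power_of_two(lock_count)):
            raise ValueError("sizes must be powers of two")
        if lock_count > table_size:
            raise ValueError("lock array cannot be larger than the hash table")
        self.table_size = table_size
        self.locks = [threading.Lock() for _ in range(lock_count)]

    def lock_for_bucket(self, bucket_index: int):
        # bucket index % bucket mutex array size
        return self.locks[bucket_index % len(self.locks)]

    def buckets_per_lock(self) -> int:
        # hash table size / bucket mutex array size
        return self.table_size // len(self.locks)

stripes = StripedLocks(table_size=16, lock_count=4)
counters = [0] * 16                       # stands in for the hash table buckets

def worker():
    for i in range(1000):
        bucket = i % 16
        with stripes.lock_for_bucket(bucket):
            counters[bucket] += 1

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(counters), stripes.buckets_per_lock())
```

Threads touching buckets that map to different locks do not block each other, which is the point of having an array of mutexes rather than one global lock.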

Installing Memcached on Ubuntu – really easy.

To install Memcached on Ubuntu, open a terminal and type the following commands:

$ sudo apt-get update
$ sudo apt-get install memcached

Once installed, Memcached should be running on the default port, 11211. To check whether Memcached is currently running, run the command given below:

$ ps aux | grep memcached
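If you would rather check from code than from the process list, a simple alternative is to try opening a TCP connection to the port. This is a minimal sketch with a function name of my own choosing; it only tells you that something is listening on the port, not that it is really Memcached.

```python
import socket

def is_listening(host: str = "127.0.0.1", port: int = 11211,
                 timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_listening() with no arguments probes the default Memcached port on the local machine.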

Memcached was originally a Linux application, but since it is open source, it has also been compiled for Windows. There are two major sources for the pre-built Windows binaries, Jellycan and Northscale, and both versions can be used. The following are the download links for the Memcached Windows binaries:

http://code.jellycan.com/files/memcached-1.2.5-win32-bin.zip

http://code.jellycan.com/files/memcached-1.2.6-win32-bin.zip

http://downloads.northscale.com/memcached-win32-1.4.4-14.zip

http://downloads.northscale.com/memcached-win64-1.4.4-14.zip

http://downloads.northscale.com/memcached-1.4.5-x86.zip

http://downloads.northscale.com/memcached-1.4.5-amd64.zip

A full tutorial on installing Memcached on Windows is available here – https://commaster.net/content/installing-memcached-windows

To connect to a Memcached server, use the telnet command, providing the HOST name and PORT number.

Using basic telnet

$ telnet HOST PORT

Here, HOST and PORT are the IP address of the machine and the port number, respectively, on which the Memcached server is running.

Here I will connect to a local Memcached running on the default port, store some data with a basic set command and read it back with get:

$ telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
set test 0 900 9
memcached
STORED
get test
VALUE test 0 9
memcached
END
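What you type in the telnet session is exactly what a client library sends over the socket. The sketch below builds and parses those same bytes in Python; the function names are mine, and it deliberately handles only the single-key set and get shown above.

```python
def build_set(key: str, value: bytes, flags: int = 0, ttl: int = 900) -> bytes:
    """Build a 'set' command: header line, then the data block, each ending in CRLF."""
    header = f"set {key} {flags} {ttl} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"

def build_get(key: str) -> bytes:
    return f"get {key}\r\n".encode("ascii")

def parse_get(response: bytes):
    """Return the value from a single-key 'get' reply, or None on a cache miss."""
    if response == b"END\r\n":
        return None
    header, _, rest = response.partition(b"\r\n")
    parts = header.split()
    if len(parts) != 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected response header")
    length = int(parts[3])               # the size field tells us how many bytes to read
    return rest[:length]

print(build_set("test", b"memcached"))
print(parse_get(b"VALUE test 0 9\r\nmemcached\r\nEND\r\n"))
```

The size in the header (9 here) has to match the number of bytes in the value, which is why the helper computes it from the value instead of asking the caller for it.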

Popular basic CLI commands

Command | Description | Example
get | Reads a value based on its key | get mykey
set | Sets a key unconditionally | set mykey 0 60 5
add | Adds a new key | add newkey 0 60 5
replace | Replaces the value of an existing key | replace mykey 0 60 5
delete | Deletes an existing key | delete mykey

In the set example: 0 => no flags, 60 => TTL in seconds, 5 => size of the value in bytes.
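The difference between set, add and replace is easiest to see side by side. Here is a toy in-memory model of their success and failure replies; it ignores flags and expiry entirely and the class name is made up, so read it as an illustration of the semantics rather than a client.

```python
class ToyCache:
    """Dictionary-backed model of the basic storage commands (no TTL, no flags)."""

    def __init__(self):
        self._items = {}

    def set(self, key, value) -> str:
        self._items[key] = value          # always stores
        return "STORED"

    def add(self, key, value) -> str:
        if key in self._items:            # only stores when the key is new
            return "NOT_STORED"
        self._items[key] = value
        return "STORED"

    def replace(self, key, value) -> str:
        if key not in self._items:        # only stores when the key already exists
            return "NOT_STORED"
        self._items[key] = value
        return "STORED"

    def get(self, key):
        return self._items.get(key)

    def delete(self, key) -> str:
        if key in self._items:
            del self._items[key]
            return "DELETED"
        return "NOT_FOUND"

cache = ToyCache()
print(cache.replace("mykey", "a"))        # key does not exist yet
print(cache.add("mykey", "b"))
print(cache.add("mykey", "c"))            # key already exists
print(cache.get("mykey"))
print(cache.delete("mykey"), cache.delete("mykey"))
```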

 

See details on these and other commands here – http://blog.elijaa.org/2010/05/21/memcached-telnet-command-summary/

In the near future I will go into more detail on Memcached, including available clients, configuration, Java development with clients and AWS ElastiCache implementation. For now, you can also explore more here – https://memcached.org/, https://smarttechie.org/2013/07/20/memcached-a-distributed-memory-object-caching-system/, http://code.tutsplus.com/tutorials/turbocharge-your-website-with-memcached–net-23939, and https://wincent.com/wiki/Testing_memcached_with_telnet

Hope this helps.