December 2011

Live Blogging at MongoSV

Dwight Meriman, CEO of 10gen, speaks about the MongoDB community growing.The conference has doubled in size from 500 to 1100+ attendees.

Eliot Horowitz, CTO of 10gen, demos the MongoDB 2.2 Aggregation Framework. Simplifies aggregating data in MongoDB. He pulls in mongodb twitter feed to populate data and sums using: runCommand({aggregate: … })

The “aggregate” command in nightly builds tonight.

Cooper Bethea, Site Reliability Engineer, Foursquare, speaks on Experiences Deploying MongDB on AWS.

All data stored in MongoDB
8 production MongDB clusters
Two of the larger shards:
8 shards of users, 12 shards of check-ins.
Checkins: ~80 inserts/sec, ~2.5k ops/sec, 45/MB/s outbound at peak.
Users: ~250 updates/sec, ~4k ops/sec, 46MB/s outbound at peak
Only one unsharded cluster. Other fully sharded using replica sets.

All servers in EC2
mongoS is on mongoD instances
mongoCs are on three instances
mongoD working set contained in RAM
MongoD backing store: 4 EBS volumes with RAID0

Problem: fragmentaion leads to bloat
mongoD RAM footprints grows.
Data size, index size, storage size.

Solution: order replicaset by dataSize + indexSize, uptime DESC. --repair secondary nodes one at a time. Primary nodes require stepDown() which is more delicate.

Problem: EBS performance degrades
Symptoms: ioutil % on one volume > 90
qr/qw counts spike
fault rates > 10 in monostat
sometimes:  topless counts spike

Solution:
KILL IT! Stop mongoD process if secondary node, stepDown() + stop if primary.
Rebuild from scratch.

How long does it take? ~1 hour
Working set in RAM

Problem: fresh mongoD has not paged in all data
Solution: run queries
db.checkins.find({unused_key:1}).explain()

cat > /dev/null works too, unless your dataset size is larger then RAM.

Hashing Algorithm in MySQL PASSWORD()

Recently we had a question from a customer: what is the hashing algorithm implemented in PASSWORD() ?

The manual doesn't give a straight answer in any of these two pages:

http://dev.mysql.com/doc/refman/5.1/en/encryption-functions.html#function_password

http://dev.mysql.com/doc/refman/5.1/en/password-hashing.html

 

It is enough to dig a bit more to find the solution in http://forge.mysql.com/wiki/MySQL_Internals_ClientServer_Protocol#4.1_and_later that specifies "mysql.user.Password stores SHA1(SHA1(password))" .

 

Instead of blindly trusting the documentation (even if I believe it is correct), I did some tests and was confused by the first result:

mysql> SELECT PASSWORD("this_is_a_random_string") `pass`\G 

pass: *12E76A751EFA43A177049262A2EE36DA327D8E50

mysql> SELECT SHA1(SHA1("this_is_a_random_string")) `pass`\G 

pass: 9b653fd9fb63e1655786bfa3b3e00b0913dfc177

So it looked like SHA1(SHA1(password)) wasn't PASSWORD(password)), at least in this test.

The best documentation ever is the source code, so I read the source code and understood why my previous test was incorrect: the second SHA1() is applied to the binary data returned by the first SHA1() and not to its hex representation. Therefore in SQL I have to UNHEX() it before applying the second SHA1. In fact: 

mysql> SELECT SHA1(UNHEX(SHA1("this_is_a_random_string"))) `pass`\G 

pass: 12e76a751efa43a177049262a2ee36da327d8e50

 

So yes, I confirmed that mysql.user.password stores SHA1(SHA1(password)) . I also hope this post is useful to understand how MySQL implements PASSWORD().

Building libMemcached RPMs

A client running CentOS 5.4 Amazon EC2 instances needed the latest libMemcached version installed. With the inclusion of the "make rpm" target, libMemcached makes it easy to build the libMemcached RPMs by doing the following:

Spin up a new CentOS Amazon EC2 instance,

As root on the new instance:

yum install @development-tools
yum install fedora-packager
/usr/sbin/useradd makerpm

Now change to the makerpm user and build the RPMs:

su - makerpm
rpmdev-setuptree
tree
wget http://launchpad.net/libmemcached/1.0/1.0.2/+download/libmemcached-1.0.2.tar.gz
tar -zxf libmemcached-1.0.2.tar.gz
./configure && make rpm
find . -name '*rpm*'

References:
http://libmemcached.org
http://fedoraproject.org/wiki/How_to_create_an_RPM_package
https://launchpad.net/libmemcached/+download

Using HAProxy for MySQL failovers

There are a number of solutions available for MySQL architecture that provide automatic failover, but the vast majority involve potentially significant and complex changes to existing configurations. Fortunately, HAProxy can be leveraged for this purpose with a minimum of impact to the current system.  Alex Williams posted this clever solution a couple years ago in his blog.  Here we take a closer look at the details of implementing it into an already existing system.

HAProxy is a freeware load balancer, proxy, and high availability application for the TCP and HTTP protocols.  Since it was built mostly to handle web traffic, it has robust rule writing using HTTP components to check on the health of systems.  The key to Alex's solution was creating xinetd daemons on the database servers that send out HTTP messages.  The HTTP check to determine database statuses works like this:

  1. The HAProxy server sends HTTP requests to the database servers at configured intervals to specified ports
  2. The /etc/services file on the database servers maps those ports to services in their /etc/xinetd.d directory
  3. The services can call any specified script, so we build scripts that connect to the databases and checks for whatever conditions we choose.

The services then return an OK or a Service Unavailable response per the conditions.  Code for these scripts is in included in the Alex's article.

Our database configuration for this implementation is two pairs of Master-Slave databases in an Active-Passive relationship with Master-Master replication between the sets.  Siloing the passive Master-Slave provides a hot spare as well as continuous up-time during deployments provided we have a means of swapping the active/passive roles of each pair.  To accomplish the latter, we built two HAProxy configuration files, haproxy_pair1.cfg and haproxy_pair2.cfg, the only difference between the two being which Master-Slave pair is active.  Having the two files indicate which pair is active also allows immediate visibility of the current configuration.

Just as in Alex's sample configuration, our web application uses two DNS entries, appmaster and appslave to call writes and reads respectively.  The IPs for these addresses are then attached to our HAProxy server, allowing HAProxy to bind to them when it starts and then route them to the appropriate database server.

On our HAProxy server we then customized the /etc/init.d/haproxy script to handle an additional parameter of "flipactive" which provides us the capability to swap the database pairs:

flipactive() {
        # first detect which cfg file haproxy is using
        ps_out=`ps -ef |grep haproxy|grep cfg`
        # if pair2 then use pair1
        if [[ "$ps_out" =~ pair2 ]]  then
             # the -sf does a friendly reread of the config file
             /usr/local/sbin/haproxy -f /etc/haproxy/haproxy_pair1.cfg -p /var/run/haproxy.pid -st $(cat /var/run/haproxy.pid)
        # if pair1 then use pair2
        else
             /usr/local/sbin/haproxy -f /etc/haproxy/haproxy_pair2.cfg -p /var/run/haproxy.pid -st $(cat /var/run/haproxy.pid)
        fi
}

The rest of the configuration details are covered pretty well in Alex's article as well as in the HAProxy documentation.

We successfully developed and implemented this technique with the devops team at SlideShare last month to build rolling DDL scripting to multiple databases using Capistrano.  It allowed them to have explicit control over which database was being updated, thereby giving them the means necessary to update one database while other served the current application code.

Syndicate content
Website by Digital Loom