November 2010

Percona XtraDB Information_schema Changing

I've been working a lot with xtradb versions of mysql over the last year, and more and more have begun to attempt to leverage the information_schema instrumentation that has been put in place. 

Today, I went into a system with this install (admittedly a very recent one) 

Server version: 5.1.51-rel11.5-log Percona Server (GPL), 11.5, Revision 132

Having been considering upgrading RAM on the server, I went to look at how the innodb_buffer_pool was currently being used.   The information_schema tables seem to have changed dramatically, particularly around the innodb_buffer_pool related tables found here.

At this point, I'm at a loss of how to get information on which tables/indexes are using space (and how much) in the buffer pool.  I had used INNODB_BUFFER_POOL_PAGES_INDEX to do this in the past.  Hopefully documentation will be forthcoming on the new table structures.  Admittedly, I haven't dug too deeply into the new structures (since they are not documented).  I'm curious if anyone else has worked with the newer table structures:

INNODB_BUFFER_POOL_PAGES_INDEX\G
*************************** 1. row ***************************
       Table: INNODB_BUFFER_POOL_PAGES_INDEX
Create Table: CREATE TEMPORARY TABLE `INNODB_BUFFER_POOL_PAGES_INDEX` (
  `index_id` bigint(21) unsigned NOT NULL DEFAULT '0',
  `space_id` bigint(21) unsigned NOT NULL DEFAULT '0',
  `page_no` bigint(21) unsigned NOT NULL DEFAULT '0',
  `n_recs` bigint(21) unsigned NOT NULL DEFAULT '0',
  `data_size` bigint(21) unsigned NOT NULL DEFAULT '0',
  `hashed` bigint(21) unsigned NOT NULL DEFAULT '0',
  `access_time` bigint(21) unsigned NOT NULL DEFAULT '0',
  `modified` bigint(21) unsigned NOT NULL DEFAULT '0',
  `dirty` bigint(21) unsigned NOT NULL DEFAULT '0',
  `old` bigint(21) unsigned NOT NULL DEFAULT '0',
  `lru_position` bigint(21) unsigned NOT NULL DEFAULT '0',
  `fix_count` bigint(21) unsigned NOT NULL DEFAULT '0',
  `flush_type` bigint(21) unsigned NOT NULL DEFAULT '0'
) ENGINE=MEMORY DEFAULT CHARSET=utf8

Events and Replication, part 1

MySQL events have been in GA for 2 years.  Events are not used as much as I would like -- often organizations will have cron jobs or scheduled tasks to perform ETL, maintenance and small aggregation tasks.  For example, a website that shows how many users are logged in might update a "cache table" with a count of active sessions every 5 minutes.

Events are great because they are a part of the database.  They only run when the database is running, and they are backed up with the database.  There is no need to store a password in a file or in a cron listing.

Events and replication can be a tricky matter.  Events can change data, and data changes are replicated.  In the example of an event that updates a table with a count of active sessions every 5 minutes, let's assume that the event itself would run on a primary master (write) server.  All slaves (including a secondary master) would receive the table update.

The tricky part is that CREATE EVENT statements are replicated.  There are three ways to avoid an event that changes data running on more than just the master:

0) Do not replicate the CREATE EVENT, or replicate it but then delete the event on any slaves.  I do not like this approach because if you promote one of the slaves to be a master, the event will not be in place.

1) Replicate the CREATE EVENT, knowing that the SLAVESIDE_DISABLED bit will be set even if you do not specify DISABLE ON SLAVE in the CREATE statement.  If you promote a slave to be a master, all you need to do is ALTER EVENT event_name ENABLED for each event that needs to be turned on.  This is the method I prefer, even though it still requires knowledge to turn on the events when a slave is promoted, and turn off events on the old master.

2) Replicate the CREATE EVENT, and enable the event right away, but set the event scheduler off.  This is easier if there are lots of events, because all that needs to be done when a slave is promoted to a master is:

mysql> SET GLOBAL event_scheduler=ON;

However, this means that you cannot have any events running on the slave itself - it is either all the events running, or not.

Part 2 will discuss some specific issues I have run into with events and replication.

Oracle is not removing InnoDB!

 

I was asked "What's the deal with Oracle removing InnoDB?"  I had not heard this, so I did some research.  It took less than 5 minutes to figure out what happened, and it was easy to see where the confusion is.

On the MySQL products page at http://mysql.com/products/ the matrix of MySQL editions includes "MySQL Classic" which is free, "MySQL Standard" which costs $2k per year, "MySQL Enterprise" which costs $5k per year and "MySQL Cluster Carrier Grade" which costs $10k per year.

Indeed, the "MySQL Classic" does not include InnoDB.  What happened was that folks assumed that, because it was free, it was the MySQL Community edition we all know and love.

This is not true.  How do I know?  Because just above the matrix is a set of links to each edition, and if you click the "MySQL Classic" link you get to http://mysql.com/products/classic/ which explains "MySQL Classic Edition is the ideal embedded database for ISVs, OEMs and VARs developing read-intensive applications using the MyISAM storage engine."

 

So calm down, folks. 

Cassandra RandomPartitioner Tokenizing

 

So I'm creating a new cluster, and after setting up I needed to get my tokens.  As we're told in http://wiki.apache.org/cassandra/Operations:

 

Token selection:

Using a strong hash function means RandomPartitioner keys will, on average, be evenly spread across the Token space, but you can still have imbalances if your Tokens do not divide up the range evenly, so you should specify InitialToken to your first nodes as i * (2**127 / N) for i = 0 .. N-1. In Cassandra 0.7, you should specify initial_token in cassandra.yaml.  

Here's a nice simple code snippet to figure out your RandomPartitioner tokens based on the size of your cluster:

 

#! /usr/bin/python

 

#nodes = int(raw_input( "How many nodes?" ))

import sys

nodes=int(sys.argv[1])

def tokens(nodes):

    for i in range(1, nodes + 1):

        print (i * (2 ** 127 - 1) / nodes)

 

 

This should give something like this:

 

[root@cassandra06 conf]# ./tokenizer.py 6

28356863910078205288614550619314017621

56713727820156410577229101238628035242

85070591730234615865843651857942052863

113427455640312821154458202477256070484

141784319550391026443072753096570088105

170141183460469231731687303715884105727

Using SSH Proxying with Jconsole to remote Cassandra instances

This guide leans heavily on the work of http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html — what I’ve done is collect his work into something a little more manageable for our environment.

Prerequisites

This guide assumes a few things are set up:

  1. That you’ve got ssh keys pushed around to do passwordless logins between your machine and your intermediate client machine
  2. That you’ve got cassandra up and running remotely
  3. That cassandra is listening on 8080 for it’s JMX service port

The Meat

This hunk of bash script is the meat of making this work.  Put the following in your .bashrc.  Make sure to edit proxy_host= to match your environment.

function jc {
    # set this to the host you'll proxy through.
    proxy_host="remoteuser@remotehost -p 22"    host=$1

    jmxport=8080
    proxy_port=${2:-8123}

    if [ "x$host" = "x" ]; then
        echo "Usage: jc <remote server> [proxy port]"
        return 1
    fi 

    # start up a background ssh tunnel on the desired port
    ssh -N -f -D$proxy_port $proxy_host 

    # if the tunnel failed to come up, fail gracefully.
    if [ $? -ne 0 ]; then
        echo "Ssh tunnel failed"
        return 1
    fi

    ssh_pid=`ps awwwx | grep "[s]sh -N -f -D$proxy_port" \
        | awk '{print $1}'`     echo "ssh pid = $ssh_pid"     # Fire up jconsole to your remote host     jconsole -J-DsocksProxyHost=localhost -J-DsocksProxyPort=$proxy_port \         service:jmx:rmi:///jndi/rmi://${host}:${jmxport}/jmxrmi     # tear down the tunnel     kill $ssh_pid }

Then, either close your shell, or source your .bashrc.  Then you should be able simply to call your function like so:

    host$ jc cassandra-host01

Jconsole will pop up, and log you in.

-Gabriel

Originally available at http://gabrielcain.com/blog/

How to Stop Hating MySQL

Last week my presentation from PICC10 on "How to Stop Hating MySQL" was uploaded to YouTube.  PICC10 (Professional IT Community Conference) was held May 7-8 in New Brunswick, NJ.  The video can be found on youtube at http://www.youtube.com/user/TomOnTime#p/u/6/xPHJnB0COjo and the slides can be downloaded from http://technocation.org/files/doc/2010_05_stophatingmysql.pdf

Pictures from OpenSQLCamp Boston 2010

I have not seen folks posting any pics they took at OpenSQLCamp here, so I will post the few I have -- I have a few pictures of the Indexing Panel on Saturday (the participants and some audience shots), and one picture of Matt and Monty during the Friday night social event.  You can see them here.

(The indexing panel included MySQL, Postgres, Cassandra, CouchDB and MongoDB folks)

Syndicate content
Website by Digital Loom