MySQL Static Cache Daemon
5 January 2010 // Filed under scripts
The problem that this program was designed to solve is a rapidly changing MySQL table in use by a high-traffic website.
For instance, stock quotes on the front page of a bank’s or investments firm’s site.
The data is constantly updated by a service on the backend, and is referred to by some ajax widget or by impatient client, repeatedly refreshing the page (F5-F5-F5-F5-F5).
In order to prevent hammering of the database with SELECT queries, the scripts creates a static page every few seconds, so that users get a relatively updated version of the data (near-live data = good enough).
I chose python for the application, with bash as a wrapper for the daemon init script.
The MySQL data is retrieved with the help of the MySQLdb library (known as python-mysqldb in aptitude repos), also used in a previous script of mine.
The init script is a chkconfig compatible script, although it could be easily modified to a debian/ubuntu styled script as well.
For your convenience, here are all the source files in a tarball in addition to posting the actual code.
I split the script into several files for the sake of sanity in reading and managing the code:
- config.py – contains all the configurable variables.
- dbpasswd.py – is just a little hackish script to use sometime to parse .my.cnf files in case you don’t want to include the db password in the code.
- logger.py – define an instance of the python “logging” class to use throughout the script.
- methods.py – actual handling of data and building output.
- main.py – the executable that calls everything together.
- static_quotesd – init script for daemonizing
config.py
import dbpasswd
pidfile = "/var/run/static_quotesd.pid"
#select output method; either "xml" or "php"
outputpath = "/var/www/webapp/static"
outputmethod = "php"
phpoutputfile = "static_output.php"
#outputmethod = "xml"
#xmloutputfile = "static_output.xml"
#db details
dbhost = "localhost"
dbuser = "root"
dbname = "database_name"
dbtable = "tablename"
# in case you dont want to include the password in the code
# dbpasswd = dbpasswd.getpassword()
# where to find mysql password
#mycnf="/root/.my.cnf"
dbpasswd = "secretpassword"
# log directory
logdir = "/var/log/static_quotes"
# logfile name
logfile = "static_quotes.log"
# log level (0 is normal, 1 is debug)
loglevel = 1
# email adress for reporting
email = "admin@domain.com"
# no. of seconds between updates of file
frequency = 4
Here we find everything configurable and this file is included by the rest. Not much to explain here, except perhaps the “email” configuration, which is simply for future implementation – not in use yet…
dbpasswd.py
import config
from logger import logger
mycnf = config.mycnf
# simple file parser - can be used in a more general sense as well
def getpassword():
try:
mycnfopen = open(mycnf , "r")
except:
# logger.error("could not find my.cnf")
sys.exit()
mycnflines = mycnfopen.readlines()
for line in mycnflines:
if line.find("password") > -1:
splitline = line.split("=")
password = splitline[1]
# here is a little hack used to make the newline character disappear
if password[-1:] == "\n":
password = password[:-1]
return password
Like I said earlier, this is just a little hack I use sometimes when I don’t want to include the MySQL connection password in the source files, and it is usually already stored for convenience in some .my.cnf file.
We can choose to use it or not in the config file.
logger.py
# define logger
# get file location from config
logfilename = "%s/%s" % (config.logdir, config.logfile)
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s %(levelname)-8s %(message)s',
datefmt='%a, %d %b %Y %H:%M:%S',
filename=logfilename,
filemode='a')
logger = logging.getLogger("static_quotes")
# check config file for reporting level
if config.loglevel == 1:
logger.setLevel(logging.DEBUG)
else:
logger.setLevel(logging.ERROR)
This is my first time using the python logging module, and it is quite convenient, previously i found myself deploying my own, which is very un-pythonic.
The formatting itself is very close to the basic example given by the Python documentation, my main addition here is setting the log level according to the setting in the config file.
methods.py
from logger import logger
def makexml(metadata, data):
"""method to create xml output for given data from database"""
from xml.dom.minidom import Document
xmlfile = config.outputpath + config.xmloutputfile
# Create the minidom document
doc = Document()
# Create the <quotes> base element
quotes = doc.createElement("quotes")
doc.appendChild(quotes)
for elem in data:
# start parsing db data and creating elements
# Create elements and fill with data
counter = 0
while counter <= (len(elem)-1):
if counter == 0:
symbol = doc.createElement(str(metadata[counter][0]))
symbol.setAttribute("id",str(elem[counter]))
quotes.appendChild(symbol)
else:
quote_array = doc.createElement(str(metadata[counter][0]))
symbol.appendChild(quote_array)
quote_array_data = doc.createTextNode(str(elem[counter]))
quote_array.appendChild(quote_array_data)
counter += 1
# Output newly created XML
try:
# if config.loglevel == 1:
# print doc.toprettyxml(indent=" ")
xmlout = open(xmlfile, 'w')
xmlout.write(doc.toprettyxml(indent = " "))
xmlout.close()
except:
logger.error("error writing xml to file")
raise("error writing xml to file")
def makephp(metadata, data):
"""method to create php output for given data from database"""
phpfile = config.outputpath + config.phpoutputfile
phpcontent = ""
for elem in data:
# Create elements and fill with data
counter = 0
while counter <= (len(elem)-1):
if counter == 0:
phpcontent = phpcontent + "$symbol['%s'] = array(" % elem[counter]
elif counter <= (len(elem)-2):
phpcontent = phpcontent + "'%s' => '%s', " % (metadata[counter][0], elem[counter])
elif counter == (len(elem)-1):
phpcontent = phpcontent + "'%s' => '%s'" % (metadata[counter][0], elem[counter])
counter += 1
phpcontent = phpcontent + ");\n"
# Output newly created PHP
try:
# if config.loglevel == 1:
# print(phpcontent)
phpout = open(phpfile, 'w')
# here we must use some horrible formatting for php
phpout.write("<?php\n$symbol=array();\n%s\n?>" % phpcontent)
phpout.close()
except:
logger.error("error writing php to file")
raise("error writing php to file")
Ok, here stuff actually happens.
There are two methods defined: XML output, and PHP output. They both make use of two sets of data which we retrieve in main.py: the metadata, which is the result of a DESCRIBE query, and the data, which is the resuly of a SELECT query.
For the XML output, something that helped me was a quick Google search, and then clicking the first link.
The nice thing about the makexml method, is that it allows us to take even more load off of the server side, by accessing the XML file with js on the client side. That is, instead of every client making DB requests on every hit or refresh or AJAX refresh, they only make a simple HTTP GET request and parse it in their browser, alleviating stress on the MySQL server and the web server.
This should be the preferred method in my opinion.
However, to accommodate our client’s needs, and perhaps somebody else’s as well, included also is another method – makephp.
This method outputs the DB data to a PHP file, formatted as a multi-dimensional associative array, where each row is set as top-level key, and in the second level, each key is the table colum names, and the values are the data.
Hopefully this will be a little clearer with an example, to be provided later (will probably split that into a new post, as it will require some work – creating fictitious MySQL data for the sake of the explanation).
The PHP method does have one advantage, which may be of necessity to some – the ability to manipulate the data on the server side, and handle only final results to the client. The downside of this obviously being more CPU usage by the web server (solved in the XML method).
Ok, on to the actual program!
main.py
# system modules
import sys, MySQLdb, time, os
# self made modules
import config, methods, logger
logger = logger.logger
logger.debug("main process pid is: %s" % os.getpid())
while True:
try:
pid = open(config.pidfile,'w')
pid.write(str(os.getpid()))
pid.close()
except:
logger.error("error writing pid file")
# begin
logger.debug("starting loop")
# connection details
db_host_master = config.dbhost
db_user = config.dbuser
db_passwd = config.dbpasswd
db_dbname = config.dbname
db_table = config.dbtable
# db connect debug
logger.debug("dbhostname: %s, dbuser: %s, dbpasswd: %s, dbname: %s" % (db_host_master,db_user,db_passwd,db_dbname))
# actual MySQL query
db_meta_query = "DESCRIBE %s" % db_table
db_select_query = "SELECT * from %s" % db_table
logger.debug("queries: %s, %s" % (db_meta_query,db_select_query))
try:
conn = MySQLdb.connect(
host = db_host_master,
user = db_user,
passwd = db_passwd,
db = db_dbname
)
cursor = conn.cursor()
cursor.execute(db_meta_query)
metadata = cursor.fetchall()
cursor.execute(db_select_query)
data = cursor.fetchall()
# close MySQL connection
cursor.close()
conn.close()
except:
logger.error("failure in db connection")
# use this if you want to quit the daemon in case of db failure
# sys.exit()
try:
# do actual static output according to config selection
if config.outputmethod == "xml":
methods.makexml(metadata, data)
elif config.outputmethod == "php":
methods.makephp(metadata, data)
logger.debug("finished process")
except:
logger.error("failed to run method: %s" % config.outputmethod)
time.sleep(config.frequency)
logger.error("you should never see this message")
This is the only executable python file in the bunch (must be set to 755).
The Python logging module I mentioned earlier is really awesome – it allows you to easily fill your code with a bunch of debug messages, assisting the development and the troubleshooting process, and also to easily toggle the debug mode, so that you aren’t constantly bombarded by these messages.
So of course, everywhere it says logger.debug, the message will be logged only if config.loglevel is set to 1. logger.error messages will be logged always.
Everything that happens in main.py, happens inside the infinite loop while True:. This is the basis for the daemon. This basically assures us that the program will keep on running. In order to manage the process and prevent strange phenomena, we have the init script, which makes use of the pid (Process ID) file created at the beginning of the loop.
main.py itself is fairly simple – connect to the database, store the data in python lists, close the MySQL connection, according to selected method in config.py (XML or PHP), output the file. That’s it.
Here is the init script (also needs 755 and will usually reside in /etc/init.d/ or the equivalent):
static_quotesd
#
# This daemon gets data from a mysql db and outputs it to static files
#
# chkconfig: 345 97 03
#
# processname: static_quotesd
#
# pidfile: /var/run/static_quotesd.pid
#
# source function library
. /etc/rc.d/init.d/functions
RETVAL=0
start() {
echo -n $"Starting static_quotesd: "
daemon +19 '/opt/static_quotes/main.py &'
RETVAL=$?
echo
[ $RETVAL -eq 0 ] && touch /var/lock/subsys/static_quotesd
}
stop() {
echo -n $"Stopping static_quotesd: "
killproc static_quotesd
echo
[ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/static_quotesd
}
restart() {
stop
start
}
case "$1" in
start)
start
;;
stop)
stop
;;
restart|force-reload|reload)
restart
;;
condrestart|try-restart)
[ -f /var/lock/subsys/static_quotesd ] && restart
;;
status)
status static_quotesd
RETVAL=$?
;;
*)
echo $"Usage: $0 {start|stop|status|restart|reload|force-reload|condrestart}"
exit 1
esac
exit $RETVAL
This is also something created on the basis of some other init script, and it is pretty standard and straightforward. Explaining the exact syntax and meaning of every section is beyond the scope of this post, and probably beyond the scope of my knowledge as well…
However something to notice here is perhaps where it says daemon +19 '/opt/static_quotes/main.py &' which relies on the fact that all the python files were stored in the /opt/static_quotes/ directory.
Also I am guessing that you know enough about GNU/Linux and python in order to understand the various permissions required to use this program (since usually stopping and starting daemons require root permissions, as it will try to write to all kinds of places like /var/lock/subsys/ and /var/run/).
Once the config file has been edited appropriately, the python files located properly (in this case in /opt/static_quotes), and static_quotesd put in /etc/init.d/ the following command should work:
Good Luck!
Hope this helps someone. Here is another link to the source files compressed in a tgz file.
I will do my best to post a usage example soon.
2010-01-05 :: tom
