Wednesday, July 27, 2011

Yes, there IS a difference between shutdown -r and reboot

A fellow admin and I were testing a server, trying to figure out why the file systems were not unmounting during a reboot.  We looked and looked and everything appeared to be configured correctly, except certain init scripts were not getting called when rebooting.  After some time working on this, we decided that maybe "reboot" doesn't actually do what we think it does.  After some searches, we discovered that there IS a difference between the reboot command and shutdown -r.

The post that explained it best, in my opinion, was by Forbes Guthrie.  To sum it up, reboot and halt do not execute the kill scripts.  They just unmount the file systems and either reboot or stop the system.  You are best off using the shutdown command to reboot or shut down the system.

Forbes says this about what to use:

"So, should I use reboot or init 6?  – neither!  My advice is to use the shutdown command.  shutdown will do a similar job to init 6, but it has more options and a better default action.  Along with kicking off an init 6, the shutdown command will also notify all logged in users (logged in at a tty), notify all processes the system is going down and by default will pause  for a set time before rebooting (giving you the chance to cancel the reboot if you realize that you made a mistake)."

 So using shutdown is a good habit I am going to have to get into.
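
A minimal sketch of that habit follows. The wrapper name reboot_with_notice, the five minute default, and the DRY_RUN switch are made up for illustration; only the shutdown options themselves (-r, a +minutes delay, and a warning message) are standard.

```shell
#!/bin/sh
# Sketch: reboot through shutdown so the kill scripts run and users get a warning.
# reboot_with_notice and DRY_RUN are hypothetical names, not part of any package.
# By default the command is only printed; set DRY_RUN=0 (as root) to really run it.
reboot_with_notice() {
    delay_minutes="${1:-5}"
    message="${2:-Rebooting for maintenance}"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "shutdown -r +${delay_minutes} ${message}"
    else
        shutdown -r "+${delay_minutes}" "${message}"
    fi
}
```

Calling reboot_with_notice 10 "Kernel update" prints the command it would run.  If a real shutdown is pending and you change your mind, shutdown -c cancels it.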

Monday, December 20, 2010

LXC on Ubuntu 10.10 for Minecraft Server

My family has all taken to Minecraft.  We got accounts for all six of us, two “grown ups”, two teens, and two little guys.  We quickly decided that we would run three different servers.  The grown up server is where we build things.  The teen server is where they grant themselves TNT and blow up everything.  And the little kid server is a kind of a combination of both.

Originally, I put them all on the same physical server and tried different port numbers.  That worked, but wasn’t ideal.  We wanted our own IP addresses.  Less confusion, especially for the little guys.  I have enough hardware that we ran one instance on each server.  That was fine until I lost the grown up server due to hardware failure on the root partition.  I still can’t believe I didn’t use RAID.

Because of the desire for separate IP addresses for each instance, and the desire to avoid total loss again, I decided to use Linux containers (LXC).  Here are my notes for getting Minecraft servers up and running in a Linux container.

I put the Minecraft data into its own LVM partition because I’m going to use DRBD once I rebuild my other server.

(These are notes so they have very little explanation)
Get what we need installed
apt-get install lxc vlan bridge-utils python-software-properties screen libvirt-bin debootstrap
Create the partitions
lvcreate -L 2g -n minecraft1 datavg
mkfs.ext4 /dev/datavg/minecraft1

mkdir -p /data/minecraft1
vi /etc/fstab
/dev/datavg/minecraft1 /data/minecraft1 ext4 defaults 0 2
We need a root file system.  My main OS is 64bit.  Since the kernel in the LXC instances is the same as the host, I didn’t see a reason to get the 32bit root.

mkdir /data/images
cd /data/images

mount /data/minecraft1
cd /data/minecraft1
mkdir rootfs
cd rootfs
tar zxvf /data/images/ubuntu-10.10-x86_64.tar.gz .
cp /etc/resolv.conf ./etc
echo route add default gw >> ./etc/rc.local

vi ./etc/rc.local (move the route to be above the 'exit' statement)
Update sources
vi ./etc/apt/sources.list
deb maverick main restricted universe multiverse
deb maverick-security main restricted universe multiverse
deb maverick-updates main restricted universe multiverse
deb maverick partner

Chroot so we can update
chroot /data/minecraft1/rootfs /bin/bash

Get SSH up
update-rc.d ssh defaults
In the chroot, add your accounts and change any passwords you need to before the server comes up.
Install Sun JRE (to run minecraft server)
apt-get update
apt-get upgrade
apt-get install sun-java6-jre

LXC config files
cd /data/minecraft1
vi fstab
none /data/minecraft1/rootfs/dev/pts devpts defaults 0 0
none /data/minecraft1/rootfs/proc    proc   defaults 0 0
none /data/minecraft1/rootfs/sys     sysfs  defaults 0 0
none /data/minecraft1/rootfs/dev/shm tmpfs  defaults 0 0

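vi minecraft1.conf
# Container with network virtualized using a pre-configured bridge named br0 and
# veth pair virtual network devices
lxc.utsname = minecraft1
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.name = eth0
lxc.network.hwaddr = 4a:49:43:49:79:bf
lxc.network.ipv4 =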
lxc.tty = 6
lxc.mount = /data/minecraft1/fstab
lxc.rootfs = /data/minecraft1/rootfs

Configure networking.  I have two interfaces.  I left eth0 alone and added eth1 to be the bridged interface.

vi /etc/network/interfaces
auto br0
iface br0 inet static
bridge_ports eth1
bridge_stp off
bridge_fd 0
bridge_maxwait 0
post-up /usr/sbin/brctl setfd br0 0

/etc/init.d/networking restart
Start the container
lxc-create -n minecraft1 -f /data/minecraft1/minecraft1.conf
lxc-start -n minecraft1 init

The container has started.  SSH to it.
Minecraft server setup
This is how I get a vanilla multiplayer server running

mkdir -p /minecraft
cd /minecraft

rm /minecraft/minecraft_server.jar
cd /minecraft

chmod 755

java -Xmx2048M -Xms2048M -jar minecraft_server.jar nogui

chmod 755
To start the server:

cd /minecraft

Saturday, December 11, 2010

Cacti fixed. Ooops. :)

I finally tracked down what happened.  This is the problem with changing too many things at the same time.  I looked over the permissions for the cacti user in the MySQL database and decided that it didn't need the full permissions, so I removed the ability to create temporary tables and some other things.  I think normally Cacti doesn't use temporary tables, but I added the Boost plugin and it DOES need that.

The SNMP results get put in the database and written to disk later.  To do this, Boost uses temporary tables.  Since it was unable to flush the table, the database kept filling up.  It got up to 8 million rows before I figured out how to fix it.

I also learned that you can't "backfill" data into an RRD file, at least not using the poller.php.  When I was finally able to flush the data out of the boost table, I lost a lot of the data because of the feature that writes data to the RRD file as someone requests the graph.  Once I started checking all the graphs, I pretty much locked them out of receiving old updates that were stored in the table.

I really need to make some kind of alert for the poller tables in Cacti.
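
A rough sketch of what such an alert could look like, written as a Nagios-style check.  The function names, the thresholds, the "cacti" database name, and the assumption that Boost buffers its results in a table called poller_output_boost all need to be verified against the actual install before relying on this.

```shell
#!/bin/sh
# Sketch: Nagios-style check on how many rows are waiting in a Cacti poller table.
# check_poller_rows, count_poller_rows, the table name, and the database name
# are assumptions for illustration, not something shipped with Cacti.

# Classify a row count against warning and critical thresholds.
# Prints a status line and returns 0 (OK), 1 (WARNING), or 2 (CRITICAL).
check_poller_rows() {
    rows="$1"; warn="$2"; crit="$3"
    if [ "$rows" -ge "$crit" ]; then
        echo "CRITICAL: ${rows} rows"
        return 2
    elif [ "$rows" -ge "$warn" ]; then
        echo "WARNING: ${rows} rows"
        return 1
    fi
    echo "OK: ${rows} rows"
    return 0
}

# Fetch the row count; -N drops the column header and -B gives plain batch output.
# Credentials are assumed to come from ~/.my.cnf.
count_poller_rows() {
    mysql -N -B -e "SELECT COUNT(*) FROM ${1:-poller_output_boost}" "${2:-cacti}"
}
```

From cron or Nagios this could be run as check_poller_rows "$(count_poller_rows)" 500000 2000000, with thresholds picked well below the 8 million rows it took to notice the problem this time.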

Friday, December 10, 2010

MySQL on my Cacti server freaked out

I'm pretty sure this was because I was using a schema from a new version of Cacti while using the old php files to access it.  It's complicated why, but it comes down to the reason I hate RedHat/CentOS.  No real easy way to upgrade major versions.  I really wish there was something as simple as "apt-get dist-upgrade" for those systems.

Monday, November 22, 2010

OpenDNS for internet filtering

I learned that OpenDNS has an option to perform filtering, and that it's actually free to use.  I decided to log into my old OpenDNS account and try out the filtering.  Filtering is configurable and you can make it as loose or as tight as you would like.  There are general categories you can add to or remove from your preferences, like porn or file sharing sites, as well as time wasters, religion, and politics.

To get started, OpenDNS first has to know which IP address you are coming from.  I figured the best way for my site (my home) was to set up my home server to use ddclient.  On my CentOS server, I simply had to do 'yum install ddclient'.  OpenDNS provides a configuration sample for it.

Now that OpenDNS knows where I'm coming from, I have to tell my systems to use the right DNS servers. There are great guides for configuring your home router on the site.  Mine was a bit different because of Verizon, but the general idea was the same.  Once I figured that out, I was on my way.

Next I decided to sign up for the filtering.  I started with the 'low' option and then added some things to customize.  Even though there is a site you can test with, I checked with the obvious one.  Sure enough, the site was blocked.  Awesome!  I did find out that I had to unblock "Adult Theme" if I wanted to access Reddit.

I like this approach much more than having to set up a squid proxy like I tried in the past.  The biggest reason is that since it's done in DNS, I set this up right on my router and ALL devices are filtered on my network.  That means the iPad, iPhones, and the Wii are protected, as well as the desktops and laptops.  I also like that I don't need to set up a cron job that downloads a block list every day and restarts the service.  It's a very elegant solution to a long standing problem and it works great.  I think that I'm going to configure this for a couple of friends for their family networks.

Overall, I really like this method of filtering.

Friday, November 19, 2010

Day 2 at LISA10

Linux Performance tuning


I was very excited to take the Linux Performance tuning.  I wasn't sure who the teacher was, but it was an engineer from Google.  A class in performance tuning by Google, I figured this would have to be good.  So did everyone else.  It was one of the few classes at LISA that was not only sold out, but they had to add extra seats to the room.

The class began by establishing what it is we are trying to accomplish.  The goals of performance tuning are to speed up a single task, or to get graceful degradation of an application as load increases.  You don't want your web server to come crashing to a halt when load increases; you want it to degrade gracefully as it takes on more and more load.

In order to start tuning something, you have to know where it stands currently so you can measure whether or not you actually did something.  You must first establish a baseline.  Using the baseline as the starting point, you make a single change to the system, test, then measure the results against the baseline.  Then you just keep repeating the process, making only one change at a time.

Some basic tools to start with are 'free', 'top', and 'iostat'.  Using the free command, see if the swap space is being used.  If you are using swap, you should increase the memory of the system.   Run the top command, and check what's running.  Check the I/O with iostat and use the -k option to get kilobytes instead of blocks.  These are just some of the simple tips we started with.
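
The swap check is easy to script.  A minimal sketch, assuming the usual layout of free's output with a "Swap:" row whose columns are total, used, and free; the helper name swap_used_kb is made up for this example.

```shell
#!/bin/sh
# Sketch: report swap in use by reading the "Swap:" row of `free -k` output.
# swap_used_kb is a hypothetical helper; it reads free's output on stdin.
swap_used_kb() {
    # Columns on the Swap: row are: label, total, used, free.
    awk '/^Swap:/ { print $3 }'
}
```

Run it as free -k | swap_used_kb; anything well above zero on a busy server is a hint to look at memory first.  For the I/O side, iostat -k 5 3 prints three reports five seconds apart, in kilobytes.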

We got into deep details of file system optimizations from the guy who maintains the code for ext4.  There was a lot in the slides that I still have to decipher, which is why this post took so long to write.

One interesting thing I learned was something called short stroking.  Short stroking is where you partition the disk to use only the outer rim (the beginning of the disk), where you get full performance, instead of the slower interior of the disk.  The short-stroked area is only about 10% to 30% of the disk, so you are going to lose some space but gain a lot of speed.  Combine this with multiple disks and you can get near SSD speeds.

Other bits that jump out at me:

Mounting with noatime can help I/O because you no longer have to make a write for each file access.

Increasing the size of the journal can help performance a little bit.  'sudo dumpe2fs -h /dev/sda5'

ionice - “This  program  sets  or gets the io scheduling class and priority for a program.  If no arguments or just -p is given, ionice  will  query  the current io scheduling class and priority for that process.”

nttcp - (new test TCP program) - The nttcp program measures the transferrate (and other numbers) on a TCP, UDP or UDP multicast connection.

When tuning the network, you can tune for latency or for throughput.

NFS - no_subtree_check - If a subdirectory of a filesystem is exported, but the whole filesystem isn't then whenever a NFS request arrives, the server must check not only that the accessed file is in the appropriate filesystem (which is easy) but also that it is in the exported tree (which is harder). This check is called the subtree_check. This option disables subtree_check.

You can bump the number of NFS threads up from the default.  This is in /etc/sysconfig/nfs or /etc/default/nfs-kernel-server

“Use the hard option with any resource you mount read-write. Then, if a user is writing to a file when the server goes down, the write will continue when the server comes up again, and nothing will be lost.”

Remove outdated fstab options.  Just use “rw,intr”

More NFS performance info:

strace – system call tracer
ltrace – library tracing

Optimize the stuff that is used often rather than the stuff that runs once

perf – kicks ass.  Learn it.

After the class, I went to a few talks.  There was Splunk, PostgreSQL, and some cloud computing thing.  Then I attended the Minecraft get-together.  I had heard of Minecraft, but now I saw a demo on the large projection screen.  Now (a week later) I have four accounts for the family and we are playing on our server.  We're all hooked.  I can't recommend this thing enough.

Thursday, November 11, 2010

Day 1 at LISA10

Aeleen's class on “Administering Linux in a Production Environment” was pretty good.  It's more of a primer on some new things that have come up and tools to think about getting into.  For me, it seemed more like a showcase of technology than any sort of training.  If you weren't already familiar with the tools, you might be a bit lost.  If you already were familiar with them, then it was already old hat.  I did get a couple of juicy bits from the talk though.

A good place to see what's up and coming in the Linux kernel is:

I would still like to take another look at LVM snapshots.  I understand that they were hard to work with, at least as far as restoring from a snapshot, but maybe there is a scripting solution to making this easy?  If this can be done, it would make the whole patching thing a lot easier.
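
One possible shape for that script is sketched below.  It only prints the commands so they can be reviewed first.  The function names, the 2G snapshot size, and the "prepatch" suffix are assumptions, and the rollback step assumes an LVM2 release new enough to have lvconvert --merge.

```shell
#!/bin/sh
# Sketch: print the LVM commands for a snapshot-before-patching workflow.
# The function names, the 2G default size, and the "-prepatch" suffix are assumptions.
# Nothing here touches the disks; the functions only echo commands to review.
# Arguments for all three: volume group, logical volume (snapshot_cmd also takes a size).

# Command that takes a snapshot of /dev/VG/LV before patching.
snapshot_cmd() {
    echo "lvcreate -s -L ${3:-2G} -n ${2}-prepatch /dev/${1}/${2}"
}

# Command that rolls the origin volume back to the snapshot.
rollback_cmd() {
    echo "lvconvert --merge /dev/${1}/${2}-prepatch"
}

# Command that discards the snapshot once the patch is known to be good.
discard_cmd() {
    echo "lvremove -f /dev/${1}/${2}-prepatch"
}
```

For example, snapshot_cmd datavg root prints the lvcreate line for /dev/datavg/root.  Keep in mind that when the origin volume is in use, the merge does not start until the volume is next activated, so rolling back a root file system usually means a reboot.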

Saw a tweet go by talking about OpenTSDB and that it might overtake RRDTool.  It does look interesting.  I got a chance to talk to Tobias a little bit about his thoughts.  He had heard of it, but hasn't really looked into the product.  The issue for OpenTSDB is that it's built on top of HBase.  Generally this is for large installations.

On a side note, I brought up the caching problem we had with Cacti and the I/O destroying the DRBD setup we had, until we used the big installation addon to Cacti which caches the results then writes to disk in batches.  I wanted to add RRDTool to Nagios monitor results, but was afraid of running into the same I/O issue.  Tobias mentioned that in RRDTool 1.4, there is a caching daemon that will help with this.  I'll need to take this back as some homework.

Cloud computing came up in the training class, which made me think about some of the things I overheard in the hallways.  Rumblings about the possibilities of doing something “in the cloud”.  We should probably get that EC2 cluster up and running to give it a test go, as well as looking into the free Amazon Cloud.

Aeleen asked about tools we were using and Bigfix came up.  She mentioned it was a great tool and she loves it, followed by pretty much the same response from a number of people in the class.  I took a quick glance at it, and it looks like it does answer a lot of the issues we have been trying to work through, like patch management and vulnerability assessment.  I need to see if we can get a demo or something of this product.

Power and wireless issues caused major distractions in the session.  The power problem wasn't that big of a deal since we were all using laptops; we just switched to battery power.  I didn't know the wireless APs were affected too.  I spent a good amount of time trying to reconnect to the wireless, and had a hard time keeping up with the material because of that.  It did get fixed up by the break.

I'd like to look more into condensing our sudoers file into a sort of one size fits all solution by getting more mileage out of groups, and using the server group declarations.  I knew you could restrict by user and user groups, but I didn't know that you could restrict by hosts and host groups as well.  A one file solution makes a Cfengine integrated sudoers file easy to implement.
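
To make the idea concrete, here is a small sketch of a one-file sudoers layout, wrapped in a shell function so it can be written out and checked.  Every alias, host, user, group, and command in it is a made-up example.

```shell
#!/bin/sh
# Sketch: emit one sudoers fragment restricted by host group as well as user group.
# Every alias, host, user, and command below is a made-up example.
print_sudoers_sketch() {
    cat <<'EOF'
Host_Alias  WEBSERVERS = web1, web2
Host_Alias  DBSERVERS  = db1
User_Alias  WEBADMINS  = alice, bob
Cmnd_Alias  WEBCMDS    = /sbin/service httpd restart

# Members of WEBADMINS may run WEBCMDS as root, but only on hosts in WEBSERVERS.
WEBADMINS   WEBSERVERS = (root) WEBCMDS
# The Unix group dba may run any command as root, but only on hosts in DBSERVERS.
%dba        DBSERVERS  = (root) ALL
EOF
}
```

Because each rule names the hosts it applies to, the same file can be pushed to every server.  Before distributing it, print_sudoers_sketch > /tmp/sudoers.test followed by visudo -c -f /tmp/sudoers.test checks the syntax.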

Right as the SELinux section started, Rik Farrow made an appearance.  Neat! It seems that SELinux is getting pushed pretty hard this year.

One big thing that was pointed out is that if your system is dipping into swap, as evidenced by the 'free' command, you should probably add more memory to your system if possible.  Especially if this is a server and you are experiencing performance issues.  If that's the case, the memory is the first thing to look at to make the problems go away.

Up until now, I thought that the difference between a 32bit and a 64bit OS was that on the 64bit OS you can compute some crazy big numbers.  Other than that, there was little value in going to 64bit.  Especially considering support, and problems with 64bit vs. 32bit libraries.  I found out that this isn't really true.  The biggest difference is that you run into a lot of memory problems with 32 bit systems because of how much memory the kernel can address.  With 64 bit, a lot of the tricks involved with dealing with a lot of memory go away.  But how much memory is a lot?  Even so much as 3gig gets tricky.  Since most of our servers are shipping with 8gig, this alone should drive us into building 64bit systems by default.  There are also network considerations when dealing with 32bit and 64bit systems.  The message from this session is loud and clear, “Why would you run 32bit if 64bit is an option?”

Facebook Vendor Talk

I got to sit in on an open discussion with a panel of Facebook system administrators.  It was a little slow to start, but got a bit interesting.

First, they mentioned that they are still using Cfengine2 and plan to continue using it into the foreseeable future.  I found this interesting because it hinted at something undesirable in regards to Cfengine3.

They also talked about how the administrators create tools and that when building tools, open sourcing them is usually a consideration from the start.  The ones that they do release end up here:

I managed to get into something that I've always been curious about in really large server environments like Google, Yahoo!, Facebook, and so on.  I know what a sysadmin does normally, but when dealing with really large environments, there is a much greater division of labor.  What does a sysadmin end up doing in this case?  For the guys on the panel, they end up mostly developing automation tools.  They don't touch the hardware at all now, and seldom even get into configuration.  You aren't tweaking a web server anymore, you are tweaking thousands of web servers.  I understand why it's like that, but I don't think it's anything I'm interested in doing.  I guess it's the difference between working on a large garden and a huge factory farm.  There is just something more intimate about the garden.