Hacking up varnish logging on Ubuntu 12.04

Varnish is an excellent page cache / web accelerator program that is well worth a look. Quick, where does varnish log by default? Trick question – by default Varnish doesn’t log anywhere. Let’s explore this a little bit.

Using varnishncsa for live logging

The first place you might look is just looking at the live data. Varnish provides the utility varnishncsa to view logs live in NCSA common log format. To fire it up in a console, just run varnishncsa. You can change the format, log it to a file, daemonize it…check the documentation for more.

Running varnishncsa on boot

Running Ubuntu 12.04 or newer? Piece of cake. Just edit /etc/default/varnishncsa, and set VARNISHNCSA_ENABLED=1. Then run: sudo /etc/init.d/varnishncsa start

Going Deeper: Logging X-Forwarded-For and Change Log Directory

Now is when we get to the hacky part. What if we want to change the logging format or change the log directory? Alas – the only way to do this is to edit the init script. Here are a few things I want to edit:

  • Log X-Forwarded-For
  • Custom Logging
  • Change log directory

Log X-Forwarded-For

Running behind a reverse proxy? Using an Elastic Load Balancer? The IP address in the log will not be of the end user, but of the reverse proxy or ELB. The end user IP will be in an HTTP header called “X-Forwarded-For”. If you’ve looked at the varnishncsa documentation, you know you can set this in two ways: use -f to “prefer X-Forwarded-For HTTP header over client.ip in the log output”, or use -F and specify a custom log format. If doing the former, this is really easy. Edit /etc/init.d/varnishncsa, and look for this line:


DAEMON_OPTS="-a -w ${LOGFILE} -D -P ${PIDFILE}"

Change this by appending -f:


DAEMON_OPTS="-a -w ${LOGFILE} -D -P ${PIDFILE} -f"

Custom Logging

Don’t like the NCSA custom log format? Want to add a field? Create your own format. However, this is a little non-intuitive. Create the format at the top of the init script, say, after defining the USER. You can get a list of the formatting options from the varnishncsa documentation.

For instance, let’s say we want to add the time before the first byte is received from the backend. This will help track how much time it takes to get a response from the backend, which could explain latencies. I’ll also want to keep the X-Forwarded-For IP instead of client IP.

I’ll just append it to the default format. So, my format string would be:


%{X-Forwarded-For}i %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i" %{Varnish:time_firstbyte}x

Remove the -f option from DAEMON_OPTS if you added it above; we’re going to be writing our own format. At the top in the variable definition section, let’s add a log format:


NAME=varnishncsa
DESC="HTTP accelerator log deamon"
PATH=/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/bin/$NAME
PIDFILE=/var/run/$NAME/$NAME.pid
LOGFILE=/var/log/varnish/varnishncsa.log
USER=varnishlog
LOGFORMAT="%{X-Forwarded-For}i %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i" %{Varnish:time_firstbyte}x"

How do we get the format into the daemon start? Edit the DAEMON_OPTS line to add -F, but do NOT add the LOGFORMAT. It won’t work, trust me.


DAEMON_OPTS="-a -w ${LOGFILE} -D -P ${PIDFILE} -F"

Scroll down to the start_varnishncsa() code block, and after ${DAEMON_OPTS}, add ${LOGFORMAT}. E.g.:


start_varnishncsa() {
output=$(/bin/tempfile -s.varnish)
log_daemon_msg "Starting $DESC" "$NAME"
create_pid_directory
if start-stop-daemon --start --quiet --pidfile ${PIDFILE} \
--chuid $USER --exec ${DAEMON} -- ${DAEMON_OPTS} \
"${LOGFORMAT}" \
> ${output} 2>&1; then
log_end_msg 0
else
log_end_msg 1
cat $output
exit 1
fi
rm $output
}

Change log directory

The first part of this is relatively straightforward. Let’s say we have a separate partition for logging, at say, /mnt/logs. We want to log to /mnt/logs/varnish instead of the default /var/log/varnish. Edit /etc/init.d/varnishncsa and set the LOGFILE variable:


NAME=varnishncsa
DESC="HTTP accelerator log deamon"
PATH=/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/bin/$NAME
PIDFILE=/var/run/$NAME/$NAME.pid
LOGFILE=/mnt/logs/varnish/varnishncsa.log

That’s all fine and good, but what if you don’t have permissions to that directory? Let’s fix that too. Above the start_varnishncsa() block, let’s add the following:


# Fix permissions for custom logging
if ! [ "$(stat -c '%U:%G' $(dirname ${LOGFILE}))" = "varnishlog:varnishlog" ]; then
chown -R varnishlog:varnishlog $(dirname ${LOGFILE});
fi

This way, if the log directory and its contents are not owned by varnishlog, it’ll fix this. Note that placement here is after checking VARNISHNCSA_ENABLED and that the DAEMON exists, so it’ll only run if we’ve passed those checks.

Almost there! One other snag is that log rotation will not rotate this log. Edit /etc/logrotate.d/varnish and add the path:


/var/log/varnish/varnish.log /var/log/varnish/varnishncsa.log \
/mnt/logs/varnish/varnishncsa.log {
daily
rotate 31
missingok
compress
delaycompress
missingok
postrotate
for service in varnishlog varnishncsa; do
if /usr/bin/pgrep -P 1 $service >/dev/null; then
/usr/sbin/invoke-rc.d $service reload > /dev/null
fi
done
endscript
}

Conclusion

Varnish logging isn’t the most straightforward and obvious. The above works, but is kind of kludgy: we’ve dumped a lot of stuff into the init script, and it’s highly custom to this particular installation. I created a modified version of the above that stuffs the configuration information into /etc/default/varnishncsa, but that also doesn’t feel quite right. I’d ideally like to edit, say, /etc/varnish/varnishncsa.cfg or something, but this will do. Though it’s somewhat non-intuitive, varnish is worth the effort.

Tagged with: , ,
Posted in How To

How to kill a lot of screen sessions at once

I use screen pretty much every day, but sometimes I end up orphaning a lot of screen sessions. (For instance, why is it Mac disables your Internet if you lock your screen? STOP THAT.)

Anyway, at one point I had 18 screen sessions running detached. How do you destroy them? You could re-attach each session, then issue a Cntrl-x \ , but that would quickly grow repetitive.

The way to kill a screen session is to run: screen -X -S PID quit

Copy/pasting all that sounds tedious too. Fine, let’s do it the sysadmin-y way:

  • Get the list of screens
  • grep for Detached
  • cut for the secondĀ field
  • apply command.

Piece of cake!


for x in $(screen -ls | grep Detached | cut -f2); do screen -X -S $x quit; done

Ta da! Power screen kill! I find this example interesting in the sense of the right way to approach problems: don’t necessarily just do work by hand; give a little thought to how to do things better.

The irony that the effort I saved is lost on creating this blog post is not lost on me. However, it emphasizes a good secondary skill: document what you do. Though it may take even more time, it’ll pay off in the future. It might not for the particular thing you’re documenting, but as a habit, it’s invaluable.

Tagged with:
Posted in How To

How to Drop a MongoDB Collection With An Illegal Name

For whatever reason, you may have gotten collections in MongoDB that you cannot access or delete, due to their name. Maybe you led the collection name with a number, instead of a letter. Your collection name may be illegal or inaccessible for any of a variety of reasons. In my case, I had collections with names like the following: tmp.mr.mapreduce_1368742878_ec2-123-456-789-012.compute-1.awesomecloud.host_20534

In this case, it’s likely the dashes are causing issues. That’s a mouthful, so I’ll just refer to that string as ILLEGALLYNAMEDCOLLECTION. However, what’s more interesting is how you access that data. You can’t count it like so:


> db.ILLEGALLYNAMEDCOLLECTION.count()
Wed May 22 12:58:45.932 JavaScript execution failed: SyntaxError: Unexpected token ILLEGAL

You also can’t drop() it that way. How do you access it? Use getCollection.


> db.getCollection("ILLEGALLYNAMEDCOLLECTION").count()
39
> db.getCollection("ILLEGALLYNAMEDCOLLECTION").drop()
true

So long, illegally named collection!

Tagged with:
Posted in How To

Rehash For Bash

Changed the path to an executable? If the old path is in your path, you need to rebuild the hash of paths. On tcsh, you did this by running ‘rehash’. On bash, run:

hash -r

What if you want to turn off hashing completely? Run the following, or add to your .bashrc if you want it permanent:

set +h
Tagged with: ,
Posted in How To

Repack a Debian Package

Got a debian package that is so old, it’s no longer found in repositories? Need a specific version from a server, but don’t have the original deb file? Repack it with dpkg-repack.


sudo apt-get install dpkg-repack
sudo dpkg-repack YOUROLDPACKAGENAME

Voila – instantly recreated package. Now fix your code so you can run current software again.

Tagged with: ,
Posted in How To

Ping Slow By Hostname, Not By IP

I recently had this experience: when I pinged a hostname, it took a long time for ten pings to go through — like 45 seconds:


$ ping -c 10 my.example.com
PING my.example.com (10.9.8.7) 56(84) bytes of data.
64 bytes from 10.9.8.7: icmp_req=1 ttl=50 time=52.7 ms
64 bytes from 10.9.8.7: icmp_req=2 ttl=50 time=54.4 ms
64 bytes from 10.9.8.7: icmp_req=3 ttl=50 time=52.8 ms
64 bytes from 10.9.8.7: icmp_req=4 ttl=50 time=58.9 ms
64 bytes from 10.9.8.7: icmp_req=5 ttl=50 time=57.0 ms
64 bytes from 10.9.8.7: icmp_req=6 ttl=50 time=59.2 ms
64 bytes from 10.9.8.7: icmp_req=7 ttl=50 time=52.8 ms
64 bytes from 10.9.8.7: icmp_req=8 ttl=50 time=59.1 ms
64 bytes from 10.9.8.7: icmp_req=9 ttl=50 time=64.1 ms
64 bytes from 10.9.8.7: icmp_req=10 ttl=50 time=53.2 ms

--- my.example.com ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 45570ms
rtt min/avg/max/mdev = 52.746/56.482/64.160/3.691 ms

But, if I pinged by IP address, it went through with normal time:


$ ping -c 10 10.9.8.7
PING 10.9.8.7 (10.9.8.7) 56(84) bytes of data.
64 bytes from 10.9.8.7: icmp_req=1 ttl=50 time=52.5 ms
64 bytes from 10.9.8.7: icmp_req=2 ttl=50 time=51.1 ms
64 bytes from 10.9.8.7: icmp_req=3 ttl=50 time=53.2 ms
64 bytes from 10.9.8.7: icmp_req=4 ttl=50 time=60.2 ms
64 bytes from 10.9.8.7: icmp_req=5 ttl=50 time=54.0 ms
64 bytes from 10.9.8.7: icmp_req=6 ttl=50 time=59.1 ms
64 bytes from 10.9.8.7: icmp_req=7 ttl=50 time=59.1 ms
64 bytes from 10.9.8.7: icmp_req=8 ttl=50 time=58.9 ms
64 bytes from 10.9.8.7: icmp_req=9 ttl=50 time=54.8 ms
64 bytes from 10.9.8.7: icmp_req=10 ttl=50 time=54.3 ms

--- 10.9.8.7 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9011ms
rtt min/avg/max/mdev = 51.161/55.751/60.235/3.128 ms

The ping summary stats themselves were the same, just the overall time was longer. What’s going on here? I little net research, and I had my answer: reverse DNS. I hadn’t mapped up the reverse DNS for this yet, so when I pinged by hostname, it tried to reverse map the IP in the results for each ping. You can disable this with the -n flag, and results become normal again:


$ ping -c 10 my.example.com -n
PING my.example.com (10.9.8.7) 56(84) bytes of data.
64 bytes from 10.9.8.7: icmp_req=1 ttl=50 time=61.1 ms
64 bytes from 10.9.8.7: icmp_req=2 ttl=50 time=53.2 ms
64 bytes from 10.9.8.7: icmp_req=3 ttl=50 time=59.0 ms
64 bytes from 10.9.8.7: icmp_req=4 ttl=50 time=59.0 ms
64 bytes from 10.9.8.7: icmp_req=5 ttl=50 time=53.6 ms
64 bytes from 10.9.8.7: icmp_req=6 ttl=50 time=59.4 ms
64 bytes from 10.9.8.7: icmp_req=7 ttl=50 time=51.2 ms
64 bytes from 10.9.8.7: icmp_req=8 ttl=50 time=59.2 ms
64 bytes from 10.9.8.7: icmp_req=9 ttl=50 time=54.3 ms
64 bytes from 10.9.8.7: icmp_req=10 ttl=50 time=52.9 ms

--- my.example.com ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9010ms
rtt min/avg/max/mdev = 51.224/56.339/61.156/3.404 ms

So if you have long ping times by host name, check for reverse DNS – it’s probably not mapped.

Tagged with: ,
Posted in Article

The importance of programming language choice

Does it really matter what programming language you use? Of course. Anyone who says otherwise is being naive. There are some more obvious considerations when choosing a language outside of those parameters. These include:

  • Suitability of the language to the problem you are trying to solve
  • Availability of libraries
  • In-house expertise
  • Stability / security
  • Resource / fund availability

When you pick a language you should make sure it is the language suited for the job. However, the following considerations are decidedly less obvious:

  • Availability of talent
  • Likelihood of expertise for a buyer
  • Standardization of maintenance / ease of transfer

The first point is key: how easy is it to recruit talent in the language you pick? This is going to have a direct impact on the speed with which you can scale your development team. The remainder has to do with the end game: how attractive you can make your company for an acquisition.

I have worked at a few companies that have built applications around Perl. This is not a glamorous language. Despite the fact that this is a quite robust language with extensive library support, Perl does not get the buzz. This makes it hard for recruiting purposes: while at times it feels like you can throw a rock and hit a Ruby or PHP programmer, finding a good Perl programmer is a bit like finding a unicorn. (Yes, that’s hyperbolic, but you get my point.) The same is true on the sale side: anyone who is looking to acquire the company will have to invest in the talent to actually maintain that code base, and the same talent difficulty applies.

However, this shouldn’t necessarily force you to go out and immediately rewrite your application in a more popular language. Keep this rule in mind: build using the resources you have. Do you have existing talent who is strongest in a less popular language? Build with that. You’ll get to market sooner, and waste less time. If you are worried about the end game when you are just starting, you may need to rethink your priorities. I have been with companies that have built their product around Perl that were acquired. Perl can in fact be used in high trafficked, scalable web sites. (Just ask Amazon, Craig’s List, IMDB, or Live Journal.) Just because recruiting may be slightly more challenging and asset transfer more complicated does not mean you cannot find success.

So does language choice matter? Yes, of course, though, maybe not as much as some people might think. It may have an impact on your ability to scale your technical team, and makes for a bullet point for your acquisition pitch. However, speed to market is more important. Also, people buy products, not languages.

As with most things startup-related, you have to take into account a multitude of factors to make your decisions. Make sure language choice is not your only consideration. Focus on your product, and the rest will follow.

Posted in Article
Follow

Get every new post delivered to your Inbox.

Join 92 other followers