Grisha Trubetskoy

Notes to self.

Graceful Restart in Golang


If you have a Golang HTTP service, chances are, you will need to restart it on occasion to upgrade the binary or change some configuration. And if you (like me) have been taking graceful restart for granted because the webserver took care of it, you may find this recipe very handy because with Golang you need to roll your own.

There are actually two problems that need to be solved here. First is the UNIX side of the graceful restart, i.e. the mechanism by which a process can restart itself without closing the listening socket. The second problem is ensuring that all in-progress requests are properly completed or timed-out.

Restarting without closing the socket

  • Fork a new process which inherits the listening socket.
  • The child performs initialization and starts accepting connections on the socket.
  • Immediately after, the child sends a signal to the parent, causing the parent to stop accepting connections and terminate.

Forking a new process

There is more than one way to fork a process in the Go standard library, but for this particular case exec.Command is the way to go. This is because the Cmd struct it returns has an ExtraFiles member, which specifies open files (in addition to stdin/stdout/stderr) to be inherited by the new process.

Here is what this looks like:

file := netListener.File() // this returns a Dup()
path := "/path/to/executable"
args := []string{"-graceful"}

cmd := exec.Command(path, args...)
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
cmd.ExtraFiles = []*os.File{file}

err := cmd.Start()
if err != nil {
    log.Fatalf("gracefulRestart: Failed to launch, error: %v", err)
}

In the above code netListener is a net.Listener listening for HTTP requests. The path variable should contain the path to the new executable if you’re upgrading (which may be the same as the currently running one).

An important point in the above code is that netListener.File() returns a dup(2) of the file descriptor. The duplicated file descriptor will not have the FD_CLOEXEC flag set, which would cause the file to be closed in the child (not what we want).

You may come across examples that pass the inherited file descriptor number to the child via a command line argument, but the way ExtraFiles is implemented makes it unnecessary. The documentation states that “If non-nil, entry i becomes file descriptor 3+i.” This means that in the above code snippet, the inherited file descriptor in the child will always be 3, thus no need to explicitly pass it.

Finally, the args array contains a -graceful option: your program will need some way of informing the child that this is part of a graceful restart and that it should re-use the socket rather than try opening a new one. Another way to do this might be via an environment variable.

Child initialization

Here is part of the program startup sequence

    server := &http.Server{Addr: "0.0.0.0:8888"}

    var gracefulChild bool
    var l net.Listener
    var err error

    flag.BoolVar(&gracefulChild, "graceful", false, "listen on fd open 3 (internal use only)")
    flag.Parse()

    if gracefulChild {
        log.Print("main: Listening to existing file descriptor 3.")
        f := os.NewFile(3, "")
        l, err = net.FileListener(f)
    } else {
        log.Print("main: Listening on a new file descriptor.")
        l, err = net.Listen("tcp", server.Addr)
    }

Signal parent to stop

At this point we’re ready to accept requests, but just before we do that, we need to tell our parent to stop accepting requests and exit, which could be something like this:

if gracefulChild {
    parent := syscall.Getppid()
    log.Printf("main: Killing parent pid: %v", parent)
    syscall.Kill(parent, syscall.SIGTERM)
}

server.Serve(l)

In-progress requests completion/timeout

For this we will need to keep track of open connections with a sync.WaitGroup. We will need to increment the wait group on every accepted connection and decrement it on every connection close.

var httpWg sync.WaitGroup

At first glance, the Golang standard http package does not provide any hooks to take action on Accept() or Close(), but this is where the interface magic comes to the rescue. (Big thanks and credit to Jeff R. Allen for this post).

Here is an example of a listener which increments a wait group on every Accept(). First, we “subclass” net.Listener (you’ll see why we need stop and stopped below):

type gracefulListener struct {
    net.Listener
    stop    chan error
    stopped bool
}

Next we “override” the Accept method. (Never mind gracefulConn for now; it is defined below.)

func (gl *gracefulListener) Accept() (c net.Conn, err error) {
    c, err = gl.Listener.Accept()
    if err != nil {
        return
    }

    c = gracefulConn{Conn: c}

    httpWg.Add(1)
    return
}

We also need a “constructor”:

func newGracefulListener(l net.Listener) (gl *gracefulListener) {
    gl = &gracefulListener{Listener: l, stop: make(chan error)}
    go func() {
        _ = <-gl.stop
        gl.stopped = true
        gl.stop <- gl.Listener.Close()
    }()
    return
}

The function above starts a goroutine because this work cannot be done in our Accept() above, which blocks on gl.Listener.Accept(). The goroutine unblocks it by closing the listener’s file descriptor.

Our Close() method simply sends a nil to the stop channel for the above goroutine to do the rest of the work.

func (gl *gracefulListener) Close() error {
    if gl.stopped {
        return syscall.EINVAL
    }
    gl.stop <- nil
    return <-gl.stop
}

Finally, this little convenience method extracts the file descriptor from the net.TCPListener.

func (gl *gracefulListener) File() *os.File {
    tl := gl.Listener.(*net.TCPListener)
    fl, _ := tl.File()
    return fl
}

And, of course we also need a variant of a net.Conn which decrements the wait group on Close():

type gracefulConn struct {
    net.Conn
}

func (w gracefulConn) Close() error {
    httpWg.Done()
    return w.Conn.Close()
}

To start using the above graceful version of the Listener, all we need is to change the server.Serve(l) line to:

netListener = newGracefulListener(l)
server.Serve(netListener)

And there is one more thing. You should avoid hanging connections that the client has no intention of closing (or not this week). It is better to create your server as follows:

server := &http.Server{
        Addr:           "0.0.0.0:8888",
        ReadTimeout:    10 * time.Second,
        WriteTimeout:   10 * time.Second,
        MaxHeaderBytes: 1 << 16}

Mod_python Performance Part 2: High(er) Concurrency


Tl;dr

As is evident from the table below, mod_python 3.5 (in pre-release testing as of this writing) is currently the fastest tool when it comes to running Python in your web server, and second-fastest as a WSGI container.

Server               Version                    Req/s     % of httpd static   Notes
nxweb static file    3.2.0-dev                  512,767   347.1%              “memcache”:false (626,270 if true)
nginx static file    1.0.15                     430,135   291.1%              stock CentOS 6.3 rpm
httpd static file    2.4.4, mpm_event           147,746   100.0%
mod_python handler   3.5, Python 2.7.5          125,139   84.7%
uWSGI                1.9.18.2                   119,175   80.7%               -p 16 --threads 1
mod_python wsgi      3.5, Python 2.7.5          87,304    59.1%
mod_wsgi             3.4                        76,251    51.6%               embedded mode
nxweb wsgi           3.2.0-dev, Python 2.7.5    15,141    10.2%               possibly misconfigured?

The point of this test

I wanted to see how mod_python compares to other tools of similar purpose on high-end hardware and with relatively high concurrency. As I’ve written before, you’d be foolish to base your platform decision on these numbers, because speed in this case matters very little. So the point of this is just to make sure that mod_python is in the ballpark with the rest and that there isn’t anything seriously wrong with it. And, surprisingly, mod_python is actually pretty fast; fastest, even, though in its own category (a raw mod_python handler).

Test rig

The server is a 24-core Intel Xeon 3GHz with 64GB RAM, running Linux 2.6.32 (CentOS 6.3).

The testing was done with httpress, which was chosen after having tried ab, httperf and weighttp. The exact command was:

httpress -n 5000000 -c 120 -t 8 -k http://127.0.0.1/

Concurrency of 120 was chosen as the highest number I could run across all setups without getting strange errors. “Strange errors” could be disconnects, delays and stuck connections, all tunable by anything from Linux kernel configuration to specific tool configs. I very much wanted concurrency to be at least a few times higher but it quickly became apparent that getting to that level would require very significant system tweaking for which I just didn’t have the time. 120 concurrent requests is nothing to sneeze at though: if you sustained this rate for a day of python handler serving, you’d have processed 10,812,009,600 requests (on a single server!).

I should also note that in my tweaking of various configurations I couldn’t get the requests/s numbers significantly higher than what you see above. Increasing concurrency and the number of workers mostly increased errors rather than r/s, which is also interesting because it matters how gracefully each of these tools fails, but failure mode is a whole different subject.

The tests were done via the loopback (127.0.0.1) because having tried hitting the server from outside it became apparent that the network was the bottleneck.

Keepalives were in use (-k), which means that all of the 5 million requests were processed over only about fifty thousand TCP connections. Without keepalives this would be more of a Linux kernel test, because the bulk of the work of establishing and tearing down a connection happens in the kernel.

Before running the 5 million requests I ran 100,000 as a “warm up”.

This post does not include the actual code for the WSGI app and mod_python handlers because it was the same as in my last post on mod_python performance testing.

Why httpress

ab simply can’t run more than about 150K requests per second, so it couldn’t adequately test nxweb and nginx static file serving.

httperf looked promising at first, but as is noted here its requests per second cannot be trusted because it gradually increases the load.

weighttp seemed good, but somehow got stuck on idle but not yet closed connections which affected the request/s negatively.

httpress claimed that it “promptly timeouts stucked connections, forces all hanging connections to close after the main run, does not allow hanging or interrupted connections to affect the measurement”, which is just what I needed. And it worked really great too.

The choice of contenders

mod_python and mod_wsgi are the obvious choices, and the uWSGI/Nginx combo is known as a low-resource and fast alternative. I came across nxweb while looking at httpress (it’s written by the same person, Yaroslav Stavnichiy); it looks to be the fastest (open source) web server currently out there, faster even than (closed source) G-WAN.

Specific tool notes

The code used for testing and the configs were essentially the same as what I used in my previous post on mod_python performance testing. The key differences are listed below.

Apache

The key config on Apache was:

ThreadsPerChild 25    # default
StartServers 16
MinSpareThreads 400

MinSpareThreads ensures that Apache starts all possible processes and threads on startup (25 * 16 = 400) so that there is no ramp up period and it’s tsunami-ready right away.

uWSGI

The comparison with uWSGI isn’t entirely appropriate because it was listening on a unix domain socket behind Nginx. The -p 16 --threads 1 (16 worker processes with a single thread each) was chosen as the best performing option after some experimentation. Upping -p to 32 reduced r/s to 86,233, and to 64, to 47,296. Upping --threads to 2 (with 16 workers) reduced r/s to 55,925 (by half, which is weird; mod_python has no problems with 25 threads). --single-interpreter didn’t seem to have any significant impact.

The actual uWSGI command was:

uwsgi -s logs/uwsgi.sock --pp htdocs  -M -p 16 --threads 1 -w mp_wsgi -z 30 -l 120 -L

A note on the uWSGI performance. Initially it seemed to be outperforming the mod_python handler by nearly a factor of two. Then, after all kinds of puzzled head-scratching, I decided to verify that every hit ran my Python code - I did this by writing a dot to a file and making sure that the file size matched the number of hits in the end. It turned out that about one third of the requests from Nginx to uWSGI were erroring out, but httpress didn’t see them as errors. So if you’re going to attempt to replicate this, watch out for this condition. EDIT: Thanks to uWSGI author Roberto De Ioris’ help, it turned out that this was a result of misconfiguration on my part - the -l parameter should be set higher than 120. (This explains how I arrived at 120 as the concurrency chosen for the test too). The request/s number and uWSGI’s position in my table is still correct.

Nginx

The relevant parts of the nginx config were (note: for brevity, this is not the complete config):

worker_processes 24;
...
events {
  worker_connections 1024;
}
...
http {
  server_tokens off;
  keepalive_timeout 65;
  sendfile on;
  tcp_nopush on;
  tcp_nodelay on;

  access_log /dev/null main;
...
  upstream uwsgi {
     ip_hash;
     server unix:logs/uwsgi.sock;
  }
...

Conclusion

Mod_python is plenty fast. Considering that, unlike the other contenders, large parts of its code are written in Python and are thus readable and debuggable by not just C programmers, it’s quite a feat.

I was surprised by Apache’s slow static file serving compared to Nginx and Nxweb (the latter, although still young and in development, seems like a very cool web server).

On the other hand, I am not all that convinced that the Nginx/uWSGI setup is as cool as it is touted everywhere. Unquestionably Nginx is a super solid server, and Apache has some catching up to do when it comes to acting as a static file server or a reverse proxy. But when it comes to serving Python-generated content, my money would be on Apache rather than uWSGI. The “low” 120 concurrency level for this test was largely chosen because of uWSGI (Apache started going haywire on me at about 400+ concurrent connections). EDIT: Thanks to Roberto’s comment, this turned out to be an error on my part (see comments). uWSGI can handle higher concurrencies if -l is set higher.

It’s also interesting that on my laptop a mod_python handler outperformed the Apache static file, but it wasn’t the case on the big server.

I didn’t do Python 3 testing, it would be interesting to see how much difference it makes as well.

I realize this post may be missing key config data - I had to leave out a lot because of time constraints (and my laziness) - so if you see any obvious gaps, please comment and I will try to address them.

P.S. Did I mention mod_python 3.5 supports Python 3? Please help me test it!

Separate Request and Response or a Single Request Object?


Are you in favor of a single request object, or two separate objects: request and response? Could it be that the two options are not contradictory or even mutually exclusive?

I thought I always was in favor of a single request object, which I expressed on the Web-SIG mailing list thread dating back to October 2003 (ten years ago!). But it is only now that I’ve come to realize that the proponents of both a single object and two separate objects were correct; they were just talking about different things.

The confusion lies in the distinction between what I am going to term a web application and a request handler.

A request handler exists in the realm of an HTTP server, which (naturally) serves HTTP requests. An HTTP request consists of a request (a method, such as “GET”, along with some HTTP headers and possibly a body) and a response (a status line, some HTTP headers and possibly a body) sent over the TCP connection. There is a one-to-one correspondence between a request and a response established by the commonality of the connection. An HTTP request is incomplete if the response is missing, and a response cannot exist without a request. (Yes, the term “request” is used to denote both the request and response, as well as just the request part of the request, and that’s confusing).

A web application is a layer on top of the HTTP request handler. A web application operates in requests and responses as well, but those should not be confused with the HTTP request/response pairs.

Making the conceptual distinction between a web application request and an HTTP request is difficult because both web applications and request handlers use HTTP headers and status to accomplish their objectives. The difference is that strictly speaking a web application does not have to use HTTP and ideally should be protocol-agnostic, though it is very common for a web application to rely on HTTP-specific features these days. Not every HTTP request exists as part of a web application. But because it is difficult to imagine a web application without HTTP, we tend to lump the two concepts together. It is also exacerbated by the fact that HTTP headers carry both application-specific and HTTP-specific information.

A good example of the delineation between a web application response and an HTTP response is handling of an error condition. A web application error is typically not an HTTP error. Imagine an “invalid login” page. It is a web application error, but not an HTTP error. An “invalid login” page should send a “200 OK” status line and a body explaining that the credentials supplied were not valid. But then HTTP provides its own authentication mechanism, and an HTTP “401 Unauthorized” (which ought not be used by web applications) is often misunderstood as something that web applications should incorporate into how they do things.

Another example of a place where the line gets blurry is a redirect. A redirect is quite common in a web application, and it is typically accomplished by way of an HTTP redirect (3XX status code), yet the two are not the same. An HTTP redirect, for example, may happen unbeknownst to the web application for purely infrastructural reasons, and a web application redirect does not always cause an HTTP redirect.

Yet another example: consider a website serving static content where the same URI responds with different content according to the Accept-Language header in the request. Is this a “web application”? Hardly. Could you have some Python (or whatever your favorite language is) help along with this process? Certainly. Wouldn’t this code be part of a “web application”? Good question. It is not uncommon for a web application to consider the Accept-Language header in its response. You could also accomplish this entirely in an HTTP server by configuring it correctly. Sometimes whether something is a web application just depends on how you’re looking at it, but you do have to decide for yourself which it is.

Getting back to the original problem, the answer to the question of whether to use separate request/response objects depends very much on which realm you’re operating in. A request handler only needs one request object representing the HTTP request, because it is conceptually similar to a file - you don’t typically open a file twice, once for reading and once for writing. Whereas a web application, which may choose between different responses depending on what’s in the request, is possibly best served with two separate objects.

I think that misunderstanding of what a “web application” is happens to be the cause of a lot of bad decisions plaguing the world of web development these days. It is not uncommon to see people get stuck on low-level HTTP subtleties while referring to web application issues, and vice versa. We’d all get along better if we took some time to think about the distinction between web applications and HTTP request handlers.

P.S. This will get even more complicated when HTTP 2.0 comes around where responses may exist without a request. And I haven’t even mentioned SSL/TLS.

My Thoughts on WSGI


I’m not very fond of it. Here is why.

CGI Origins

WSGI is based on CGI, as the “GI” (Gateway Interface) suggests right there in the name.

CGI solved a very important problem using the very limited tools at hand available at the time. Though CGI wasn’t a standard, it was ubiquitous in the early days of the WWW, despite its inherent slowness and other limitations. It became popular because it worked with any language, was easy to turn on and provided such a thick wall of isolation that admins could turn it on for their users without too much concern for problems caused by user-generated CGI scripts.

There is now an RFC (RFC3875) describing CGI, but I hazard that Ken Coar wrote the RFC not because he thought CGI was great, but rather out of discontent with the present state of affairs - everyone was using CGI, yet there never was a formal document describing it.

So if I were to attempt to unite all Python web applications under the same standard, CGI wouldn’t be the first thing I would consider. There are other efforts at solving the same problem in more elegant ways which could be used as a model, e.g. (dare I mention?) Java Servlets.

Headers

CGI dictated that HTTP headers be passed to the CGI script by way of environment variables - the same environment that contains your $PATH and $TERM. (Note this also explains the origin of the term environment in WSGI - in HTTP there is no request environment, there is simply a request). So as not to clash with any other environment variables, CGI would prepend HTTP_ to every header name. It also swapped dashes with underscores, because dashes are not allowed in shell variable names, and capitalized the names, as is the convention for environment variables. Thus "content-type" would become "HTTP_CONTENT_TYPE".
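The mangling is easy to state precisely; here is the rule from the paragraph above as a one-liner (shown in Go purely for illustration - the function name is mine, the mapping is CGI's):

```go
package main

import "strings"

// cgiHeaderName applies the CGI transformation described above:
// prefix with HTTP_, replace dashes with underscores, upper-case.
func cgiHeaderName(header string) string {
	return "HTTP_" + strings.ToUpper(strings.ReplaceAll(header, "-", "_"))
}
```

So cgiHeaderName("content-type") yields "HTTP_CONTENT_TYPE", exactly the example above, and every server hosting CGI-descended applications repeats this string munging for every header of every request.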

And how much sense does applying the same transformation make in the realm in which WSGI operates? The headers are typically read by the webserver and stored in some kind of structure, which ought to be directly accessible so the application can get headers in the original, unmodified format. For example, in Apache this would be the req.headers_in table. What is the benefit of combing through that structure converting every key to some capitalized HTTP_ string at every request? Why are WSGI developers forced to use env['HTTP_CONTENT_LENGTH'] rather than env['Content-length']?

Another thing about the environment is that the WSGI standard states that it must be a real Python dictionary, thereby dictating that a memory allocation happen to satisfy this requirement, at every request.

start_response()

In order to be able to write anything to the client, a WSGI application must invoke the start_response() function passed to it, which returns a write() method.

Ten points for cuteness here, but the practicality of this solution eludes me. This is certainly a clever way to underscore the fact that the start of a response is an irreversible action in HTTP, because the headers are sent first - but seriously, do programmers who code at this level not know it? Why can’t the header sending part happen implicitly at the first write(), and why can’t an application write without sending any headers?

There is also another problem here - function calls are relatively expensive in Python. The requirement that the app must beg for the write object every time introduces a completely unnecessary function call.

The request object with a write() method should simply be passed in. This is how it has always worked in mod_python (cited in PEP3333 a number of times!).

Error handling

First, I must confess that after re-reading the section of PEP3333 describing the exc_info argument several times, I still can’t say I grok what it’s saying. Looking at some implementations out there, I am relieved to know I am not the only one.

But the gist of it is that an exception can be supplied along with some headers. It seems to me there is confusion between HTTP errors and Python errors here; the two are not related. What is the expected outcome of passing a Python exception to an HTTP server? The server would probably convert it to a 500 Internal Server Error (it only has so many possibilities to choose from), and what’s the point of that?

Wouldn’t the outcome be same if the application simply raised an exception?

If the spec wanted to provide means for an application’s Python errors to somehow map to HTTP errors, why not define a special exception class which could be used to send HTTP errors? What was wrong with mod_python’s:

raise apache.SERVER_RETURN, apache.HTTP_INTERNAL_SERVER_ERROR

I think it’s simple and self-explanatory.

Other things

What is wsgi.run_once, why does it matter and why should the web server provide it? What would be a good use case for such a thing?

There is a long section describing “middleware”. Middleware is a wrapper, an example of the pipeline design pattern and there doesn’t seem to be anything special with this concept that the WSGI spec should even mention it. (I also object to the term “middleware” - my intuition suggests it’s a layer between “hardware” and “software”, not a wrapper.)

SCRIPT_NAME and PATH_INFO

Perhaps the most annoying part of CGI was these two misunderstood variables, and sadly WSGI uses them too.

Remember that in CGI we always had a script. A typical CGI script resided somewhere on the filesystem to which the request URI maps. As part of serving the request the server traversed the URI mapping each element to an element of the filesystem path to locate the script. Once the script was found, the portion of the URI used thus far was assigned to the SCRIPT_NAME variable, while the remainder of the URI got assigned to PATH_INFO.

But where is the script in WSGI? Is my Python module the script? What relationship exists between the request URI and the (non-existent) script?

Bottom line

I am not convinced that there should be a universal standard for Python web applications to begin with. I think that what we refer to as “web applications” is still not very well understood by us programmers.

But if we are to have one, I think the WSGI approach is not the right one. It brings the world of Python web development down to the lowest common denominator, CGI, and introduces some problems of its own on top of it.

Other notes

What is the Gateway in CGI

I did some digging into the etymology of “Common Gateway Interface”, because I wanted to know what the original author (Rob McCool) meant by it when he came up with it. From reading this it’s apparent that he saw it as the Web daemon’s gateway to an outside program:

“For example, let’s say that you wanted to “hook up” your Unix database to the World Wide Web, to allow people from all over the world to query it. Basically, you need to create a CGI program that the Web daemon will execute to transmit information to the database engine, and receive the results back again and display them to the client. This is an example of a gateway, and this is where CGI, currently version 1.1, got its origins.”

I always perceived it the other way around, I thought the “gateway” was a gateway to the web server. I think that when Phillip J. Eby first proposed the name WSGI he was under the same misperception as I.

Mod_python: The Long Story


This story started back in 1996. I was in my early twenties, working as a programmer at a small company specializing in on-line reporting of certain pharmaceutical data.

There was a web-based application (which was extremely cool considering how long ago this was), but unfortunately it was written in Visual Basic by a contractor and I was determined to do something about it. As was very fashionable at the time, I was very pro Open Source, had Linux running on my home 386 and had recently heard Guido’s talk at the DC Linux user group presenting his new language he called Python. Python seemed like a perfect alternative to the VB monstrosity.

I spent a few weeks quietly in my cubicle learning Python and rewriting the whole app in it. (Back in those days this is how programmers worked, there was no “agile” and “daily stand ups”, everyone understood that things take time. I miss those days very much). Python was fantastic, and soon the app was completely re-written.

Then I realized that explaining what I’ve been working on to my bosses might be a bit of a challenge. You see, for a while there nobody knew that the web app they’ve been using had been re-written in Python, but sooner or later I would have to reveal the truth and, more importantly, justify my decision. I needed a good reason, and stuff about object-oriented programming, clean code, open source, etc would have fallen on deaf ears.

Just around that time the Internet Programming with Python book came out, and in it there was a chapter on how to embed the Python interpreter in the Netscape Enterprise web server. The idea seemed very intriguing to me and it might have contained exactly the justification I was looking for - it would make the app faster. (“Faster” is nearly as good as “cheaper” when it comes to selling to the management). I can’t say that I knew much C back then, but with enough tinkering around I was able to make something work, and lo and behold it was quite noticeably faster.

And so a few days later I held a presentation in the big conference room regarding this new tool we’ve started using called Python which can crunch yall’s numbers an order of magnitude faster than the Microsoft product we’ve been using. And oh, by the way, I quickly hacked something together last night - let’s do a live demo, look how fast this is! They were delighted.

Little did they know, the app had been running in Python for months, and the reason for the speed up had little to do with the language itself. It was all because I was able to embed the interpreter within the web server. Then I thought that to make it all complete I would make my little tool open source and put it on my website free for everyone to use. I called it NSAPy as a combination of the Netscape Server API and Python.

But I didn’t stop there, and soon I was able to replicate this on an Apache web server, which was taking the Internet by storm back then. The name mod_python came naturally since there already was a mod_perl.

Things were going very well back then. These were the late nineties, the dawn of e-commerce on the World Wide Web. I started working for a tiny ISP which soon transformed into a humongous Web Hosting company, we ran millions of sites, built new data centers with thousands of servers pushing gigabits of traffic and (in short) were taking over the world (or so it seemed). With the rise of our company’s stock price, me and my colleagues were on our way to becoming millionaires. Mod_python was doing very well too. It had a busy website, a large and very active mailing list and an ever growing number of devoted users. I went to various Open Source conferences to present about it (and couldn’t really believe that without exception everyone knew what mod_python was).

Then came 2001. We had just bought a house and our second son was not even a year old when, one beautiful sunny summer day, I was summoned to a mandatory meeting. In that meeting about two thirds of our office was let go. Even though we all felt it was coming, it was still a shock. I remember coming home that morning and having to explain to my wife that I’d just been fired. This after constant all-nighters, neglect for family life under the excuse of having the most important job doing the most important thing and changing the world, and rants about how we’d be all set financially in just a year or two. In my personal opinion the 2007 financial crash was nothing compared to the dot-com bust. Everyone was getting laid off everywhere, the Internet became a dirty word, software development was being outsourced to India.

For the next couple of years I made a living doing contracting work here and there. Needless to say, mod_python wasn’t exactly at the top of my priority list. But it was getting ever more popular, the mailing list busier, though it didn’t make any money (for me at least). I tried my best to keep everything running in whatever spare time I had, answering emails and releasing new versions periodically. Finding time for it was increasingly difficult, especially given that most of the work I was doing had nothing to do with mod_python or Python.

One day I had the thought that donating mod_python to the Apache Software Foundation would ensure its survival even if I could no longer contribute. And so it was done. Initially things went very well - the donation affiliated mod_python with the solid reputation of Apache, and that was great. Mod_python gained a multitude more users and, most importantly, contributors.

At the same time my life was becoming ever more stressful. Free time for mod_python hacking was getting more and more scarce until there was none. I also think I was experiencing burnout. Answering questions on the mailing list became an annoyance. I had to read through enormous threads with proposals for various features or how things ought to work and respond to them, and it was just never ending. It wasn’t fun anymore.

I also felt that people didn’t understand what mod_python was and that I wasn’t able to explain it very well. (For what it’s worth, I still feel this way.) In my mind it was primarily an interface to the Apache internals, but since making every structure and API accessible from within Python was impractical, only selected pieces were exposed. Secondly, mod_python provided the means to perform certain things that were best done in Apache, e.g. global locking and caching. Lastly, it provided implementations of certain common tasks done in Apache-specific ways (using Apache pools, APR, etc.) for maximal performance; things like cookies and sessions fell into that category. Publisher and PSP didn’t strictly belong in mod_python, but were there for the sake of battery-includedness - you could build a rudimentary app without any additional tools.

The rest of the world saw it as a web-development framework. It wasn’t a particularly good one, especially when it came to development, because it required root privileges to run. It also didn’t do a very good job of reloading changed modules, which complicated development further. One of the contributors put very considerable effort into addressing that particular issue of module loading and caching; I never thought it important because to me restarting Apache seemed like the answer - I didn’t think that people without root access would ever use mod_python.

As I grew more disinterested in mod_python, it got to a point where I just let it be. I would skim through emails from people I trusted and respond affirmatively to whatever they proposed without giving it much thought. I didn’t see any point in keeping and defending my vision for mod_python. I think that by about 2006 or so I was so disconnected I no longer had a good grasp of which mod_python features were being worked on. Not sure if it was my lack of interest or that other contributors felt burned out as well, but new commits slowed to a trickle and eventually stopped, and my quarterly reports to the ASF Board became a cut-and-paste email of “no new activity”.

This is where the negative aspect of the ASF patronage began to surface. Sadly, the ASF rules are that projects and their communities must be active, and soon the project got moved to the attic. And even though I kept telling myself that I couldn’t care less, I must admit it hurt. The attic is like a one-way trash can - once there, a project cannot go back, other than through the incubation process.

Fast forward to 2013. Why get back to hacking on it? First of all, I got tired of “mod_python is dead” plastered all over the web. Every time I see some kid who wasn’t old enough to speak back when I first released it tweet that it is this or that, I can’t help but take it a little personally. It’s an open source project, people - it’s only dead if you do not contribute to it.

For the skeptics in the crowd I most certainly disagree that mod_python as a concept is dead, I’d even argue that its time hasn’t come yet. The vision has not changed. Mod_python is still an interface to Apache which lets you take advantage of its versatile architecture to do some very powerful things. It’s not quite a web development framework, and it’s not even a tool for running your favorite web development framework in production (though it can certainly do that quite nicely).

These days there is more demand than ever for high volume servers that do not have a user interface and thus do not need a WSGI framework to power them - I think this is one of the areas where mod_python could be most useful. There are also all kinds of possibilities for using Apache and mod_python for distributed computing and big data stuff taking advantage of the fact that Apache is an excellent job supervisor - anyone up for writing a map/reduce framework in mod_python?

I must also note that hacking on it in the past weeks has been fun once again. I wanted to get up to speed with the latest on Python 3 and Apache internals, especially the event/epoll stuff and this has been a great way to do just that. I also very much enjoy that I can once again do whatever I want without any scrutiny.

If there is one thing I’ve learned it’s that few open source projects can exist without their founders’ continuous involvement. The Little Prince once said - “You become responsible forever for what you have tamed”. It seems like mod_python is my rose and if I don’t water it, no one will.

P.S. Did I mention mod_python now supports Python 3? Please help me test it!

Mod_python Performance and Why It Matters Not.

| Comments

TL;DR: mod_python is faster than you think.

Tonight I thought I’d spend some time looking into how the new mod_python fares against other frameworks of similar purpose. In this article I am going to show the results of my findings, and then I will explain why it really does not matter.

I am particularly interested in the following:

  • a pure mod_python handler, because this is as fast as mod_python gets.
  • a mod_python wsgi app, because WSGI is so popular these days.
  • mod_wsgi, because it too runs under Apache and is written entirely in C.
  • uWSGI, because it claims to be super fast.
  • Apache serving a static file (as a point of reference).

The Test

I am testing this on a CentOS instance running inside VirtualBox on an early 2011 MacBook Pro. The VirtualBox has 2 CPU’s and 6GB of RAM allocated to it. Granted this configuration can’t possibly be very performant [if there is such a word], but it should be enough to compare.

Real-life performance is very much affected by issues related to concurrency and load. I don’t have the resources or tools to comprehensively test such scenarios, and so I’m just using concurrency of 1 and seeing how fast each of the afore-listed set ups can process small requests.

I’m using mod_python 3.4.1 (pre-release), revision 35f35dc, compiled against Apache 2.4.4 and Python 2.7.5. The mod_wsgi version is 3.4; for uWSGI I use 1.9.17.1.

The Apache configuration is pretty minimal (it could probably be trimmed even more, but this is good enough):

LoadModule unixd_module /home/grisha/mp_test/modules/mod_unixd.so
LoadModule authn_core_module /home/grisha/mp_test/modules/mod_authn_core.so
LoadModule authz_core_module /home/grisha/mp_test/modules/mod_authz_core.so
LoadModule authn_file_module /home/grisha/mp_test/modules/mod_authn_file.so
LoadModule authz_user_module /home/grisha/mp_test/modules/mod_authz_user.so
LoadModule auth_basic_module /home/grisha/mp_test/modules/mod_auth_basic.so
LoadModule python_module /home/grisha/src/mod_python/src/mod_python/src/mod_python.so

ServerRoot /home/grisha/mp_test
PidFile logs/httpd.pid
ServerName 127.0.0.1
Listen 8888
MaxRequestsPerChild 1000000

<Location />
      SetHandler mod_python
      PythonHandler mp
      PythonPath "sys.path+['/home/grisha/mp_test/htdocs']"
</Location>

I should note that <Location /> is there for a purpose - the latest mod_python forgoes the map_to_storage phase when inside a <Location> section, so this makes it a little bit faster.

And the mp.py file referred to by the PythonHandler in the config above looks like this:

from mod_python import apache

def handler(req):

    req.content_type = 'text/plain'
    req.write('Hello World!')

    return apache.OK

As the benchmark tool, I’m using the good old ab, as follows:

$ ab -n 500000 http://localhost:8888/

For each test in this article I run 500 requests first as a “warm up”, then another 500K for the actual measurement.
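For repeatability, the warm-up/measure cycle can also be driven from a short script that scrapes the req/s figure out of ab’s report. A sketch under the assumption that ab is on the PATH (`parse_rps` and `run_ab` are my own helpers, not part of the setup above):

```python
import re
import subprocess

def parse_rps(ab_output):
    # Pull the "Requests per second" figure out of ab's text report.
    m = re.search(r"Requests per second:\s+([\d.]+)", ab_output)
    return float(m.group(1)) if m else None

def run_ab(url, n=500000, concurrency=1):
    # Assumes the ApacheBench binary ("ab") is on the PATH.
    out = subprocess.run(["ab", "-n", str(n), "-c", str(concurrency), url],
                         capture_output=True, text=True).stdout
    return parse_rps(out)

# ab prints a summary line like this one:
sample = "Requests per second:    2332.41 [#/sec] (mean)"
print(parse_rps(sample))  # 2332.41
```

Calling `run_ab(url)` once for the warm-up and once for the measurement then gives you the two numbers directly.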

For the mod_python WSGI handler test I use the following config (relevant section):

<Location />
    PythonHandler mod_python.wsgi
    PythonPath "sys.path+['/home/grisha/mp_test/htdocs']"
    PythonOption mod_python.wsgi.application mp_wsgi
</Location>

And the mp_wsgi.py file looks like this:

def application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'

    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    return [output]
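Since a WSGI application is just a callable, a handler like this can be sanity-checked with no server at all by invoking it directly with an environ and a stub start_response. A quick sketch (the application is reproduced here so the snippet is self-contained):

```python
def application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'

    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    return [output]

# A stub start_response that just records what it was given.
captured = {}
def start_response(status, headers):
    captured['status'] = status
    captured['headers'] = dict(headers)

body = ''.join(application({}, start_response))
print(captured['status'], body)  # 200 OK Hello World!
```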

For the mod_wsgi test I use the exact same file, and the following config:

LoadModule wsgi_module /home/grisha/mp_test/modules/mod_wsgi.so
WSGIScriptAlias / /home/grisha/mp_test/htdocs/mp_wsgi.py

For uWSGI (I am not an expert), I first used the following command:

/home/grisha/src/mp_test/bin/uwsgi \
   --http 0.0.0.0:8888 \
   -M -p 1 -w mysite.wsgi -z 30 -l 120 -L

That yielded a pretty dismal result, so I tried using a unix socket (-s /home/grisha/mp_test/uwsgi.sock) with nginx as the front end, as described here, which did make uWSGI come out on top (even if proxied uWSGI is an orange among the apples).

The results, requests per second, fastest at the top:

| uWSGI/nginx         | 2391 |
| mod_python handler  | 2332 |
| static file         | 2312 |
| mod_wsgi            | 2143 |
| mod_python wsgi     | 1937 |
| uWSGI --http        | 1779 |

What’s interesting and unexpected at first is that uWSGI and the mod_python handler perform better than serving a static file, which I expected to be the fastest. On second thought, though, it does make sense once you consider that no (on average pretty expensive) filesystem operations are performed to serve the request.

Mod_wsgi performs better than the mod_python WSGI handler, and that is expected, because the mod_python version is mostly Python, vs mod_wsgi’s C version.

I think that with a little work mod_python wsgi handler could perform on par with uWSGI, though I’m not sure the effort would be worth it. Because as we all know, premature optimization is the root of all evil.

Why It Doesn’t Really Matter

Having seen the above you may be tempted to jump on the uWSGI wagon, because after all, what matters more than speed?

But let’s imagine a more real world scenario, because it’s not likely that all your application does is send "Hello World!".

To illustrate the point a little better I created a very simple Django app, which too sends "Hello World!", only it does it using a template:

from django.http import HttpResponse
from django.template import Context
from django.template.loader import get_template

def hello(request):
    t = get_template("hello.txt")
    c = Context({'name':'World'})
    return HttpResponse(t.render(c))

Using the mod_python wsgi handler (the slowest), we can process 455 req/s; using uWSGI (the fastest), 474. This means that by moving this “application” from mod_python to uWSGI we would improve performance by a measly 4%.

Now let’s add some database action to our so-called “application”. For every request I’m going to pull my user record from the Django auth_user table:

from django.contrib.auth.models import User

def hello(request):
    grisha = User.objects.get(username='grisha')
    t = get_template("hello.txt")
    c = Context({'name':str(grisha)[0:5]}) # world was 5 characters
    return HttpResponse(t.render(c))

Now we are down to 237 req/s for the mod_python WSGI handler and 245 req/s for uWSGI, and the difference between the two has shrunk to just over 3%.

Mind you, our “application” still has less than 10 lines of code. In a real-world situation the difference in performance is more likely to amount to less than a tenth of a percent.
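The shrinking gap is what a simple fixed-overhead model would predict: each stack adds a roughly constant per-request cost on top of the application’s own work, and that constant gets dwarfed as the work grows. Here is a back-of-the-envelope sketch (a model, not a measurement) that uses the hello-world results above to estimate each stack’s overhead:

```python
# Per-request time = server overhead + application work.
# Overheads estimated from the "Hello World!" results above.
o_slow = 1.0 / 1937   # mod_python wsgi handler, seconds per request
o_fast = 1.0 / 2391   # uWSGI/nginx, seconds per request

for app_ms in (0.0, 1.7, 10.0):   # application work per request, in ms
    t = app_ms / 1000.0
    rps_slow = 1.0 / (o_slow + t)
    rps_fast = 1.0 / (o_fast + t)
    gain = (rps_fast - rps_slow) / rps_slow * 100
    print("app work %4.1f ms: %4.0f vs %4.0f req/s, uWSGI ahead by %.1f%%"
          % (app_ms, rps_slow, rps_fast, gain))
```

At about 1.7 ms of application work per request the model lands close to the Django template numbers (roughly 451 vs 472 req/s), and at 10 ms the gap drops below one percent.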

Bottom line: it’s foolish to pick your web server based on speed alone. Factors such as your comfort level with using it, features, documentation, security, etc., are far more important than how fast it can crank out “Hello world!”.

Last, but not least, mod_python 3.4.1 (used in this article) is ready for pre-release testing, please help me test it!

Running a WSGI App on Apache Should Not Be This Hard

| Comments

If I have a Django app in /home/grisha/mysite, then all I should need to do to run it under Apache is:

$ mod_python create /home/grisha/mysite_httpd \
    --listen=8888 \
    --pythonpath=/home/grisha/mysite \
    --pythonhandler=mod_python.wsgi \
    --pythonoption="mod_python.wsgi.application mysite.wsgi::application"

$ mod_python start /home/grisha/mysite_httpd/conf/httpd.conf

That’s all. There should be no need to become root, tweak various configurations, place files in the right place, check permissions, none of that.

Well… With mod_python 3.4.0 (alpha) that’s exactly how it is…

Please help me test it.

The Next Smallest Step Problem

| Comments

“A journey of a thousand miles begins with a single step”

Most of my journeys never begin, or cannot continue because of that one single step, be it first or next. Because it is hard, at times excruciatingly so.

Here I speak of software, but this applies to many other aspects of my life just as well.

I reckon it’s because I do not think in steps. I think of a destination. I imagine the end-result. I can picture it with clarity and in great detail. I know where I need to be. But what is the next step to get there? And it doesn’t help that where I travel, there are no signs.

The problem of deciding what to do next is so common for me that I even have a name for it. I call it “The Next Smallest Step” problem. Whenever I find myself idling, clicking over to online time-wasters, I pause and ask myself “What is the Next Smallest Step? What’s the next smallest thing I can do, right now?”

It doesn’t matter how much further this small step moves me. A nanometer is better than standing still. It has to be something that is simple enough that I can just do. Right now.

I always plan to do big things that take days, weeks or months. But of all that, can I pick that one small simple and quick thing that I can do now?

Sometimes focusing on the next smallest step is so difficult that I pencil this question on a piece of paper, and sometimes I just type it on the command line or in the source code. My short version is:

$ WHAT NEXT?
bash: WHAT: command not found

(that’s right, in CAPS)

This simple question has taken me on some of the most fascinating and challenging journeys ever. In retrospect, I think I would not have been able to travel any of them without repeatedly asking it of myself, over and over again.

It has resulted in my most productive and gratifying days of work. Some of my greatest projects began with this question. In many instances it established what I had to do for months ahead (years, even?). All beginning with this one small question.

Conversely, not asking it often enough, if at all, led to time going by without any results to show for it, and many a great opportunity lost.

I must also note that some of my next smallest steps took days of thinking to figure out. Nothing wrong with that.

And so I thought I’d share this with you, just in case you might find it helpful. Whenever you find yourself at an impasse and progress has stopped, ask yourself:

“What is the Next Smallest Step?”

Hacking on Mod_python (Again)

| Comments

Nearly eight years after my last commit to Mod_python I’ve decided to spend some time hacking on it again.

Five years after active development stopped and thirteen since its first release, it still seems to me an entirely useful and viable tool. The code is exceptionally clean, the documentation is amazing, and the test suite is awesome - a real testament to the noble efforts of all the people who contributed to its development.

We live in this new c10k world now where Apache Httpd no longer has the market dominance it once enjoyed, while the latest Python web frameworks run without requiring or recommending Mod_python. My hunch, however, is that given a thorough dusting it could be quite useful (perhaps more than ever) and applied in very interesting ways to solve new problems. I also think the Python language is at a very important inflection point. Python 3 is now mature, and is slowly but steadily becoming the preferred language of many interesting communities, such as data science.

The current status of Mod_python as an Apache project is that it’s in the attic. This means that the ASF isn’t providing much in the way of infrastructure support any longer, nor will you see an “official” ASF release any time soon. (If ever - Mod_python would have to re-enter as an incubator project and at this point it is entirely premature to even consider such an option).

For now the main goal is to re-establish the community, and as part of that I will have to sort out how to do issue tracking, discussion groups, etc. At this point the only thing I’ve decided is that the main repository will live on github.

The latest code is in the 4.0.x branch. My initial development goal is to bring it up to compatibility with Python 2.7 and Apache Httpd 2.4 (I’m nearly there already), then potentially move on to Python 3 support. I have rolled back a few commits (most notably the new importer) because I did not understand them. There are still a few changes in Apache 2.4 that need to be addressed, but they seem relatively minor at this point. Authentication has changed significantly in 2.4, though mod_python never had much coverage in that area.

Let’s see where this takes us. And if you like this, feel free to star and fork Mod_python on github and follow it on Twitter.

Json2avro

| Comments

As you embark on converting vast quantities of JSON to Avro, you soon discover that things are not as simple as they seem. Here is how it might happen.

A quick Google search eventually leads you to the avro-tools jar, and you find yourself attempting to convert some JSON, such as:

{"first":"John", "middle":"X", "last":"Doe"}
{"first":"Jane", "last":"Doe"}

Having read Avro documentation and being the clever being that you are, you start out with:

java -jar ~/src/avro/java/avro-tools-1.7.4.jar fromjson input.json --schema \
 '{"type":"record","name":"whatever",
   "fields":[{"name":"first", "type":"string"},
             {"name":"middle","type":"string"},
             {"name":"last","type":"string"}]}' > output.avro
Exception in thread "main" org.apache.avro.AvroTypeException: Expected field name not found: middle
        at org.apache.avro.io.JsonDecoder.doAction(JsonDecoder.java:477)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        ...

A brief moment of disappointment is followed by the bliss of enlightenment: Duh, the “middle” element needs a default! And so you try again, this time having tacked on a default to the definition of “middle”, so it looks like {"name":"middle","type":"string","default":""}:

java -jar ~/src/avro/java/avro-tools-1.7.4.jar fromjson input.json --schema \
 '{"type":"record","name":"whatever",
   "fields":[{"name":"first", "type":"string"},
             {"name":"middle","type":"string","default":""},
             {"name":"last","type":"string"}]}' > output.avro
Exception in thread "main" org.apache.avro.AvroTypeException: Expected field name not found: middle
        at org.apache.avro.io.JsonDecoder.doAction(JsonDecoder.java:477)
        ...

Why doesn’t this work? Well… You don’t understand Avro, as it turns out. You see, JSON is not Avro, and therefore the wonderful Schema Resolution thing you’ve been reading about does not apply.

But do not despair. I wrote a tool just for you: json2avro. It does exactly what you want:

json2avro input.json output.avro -s \
 '{"type":"record","name":"whatever",
   "fields":[{"name":"first", "type":"string"},
             {"name":"middle","type":"string","default":""},
             {"name":"last","type":"string"}]}'

No errors, and we have an output.avro file, let’s see what’s in it by using the aforementioned avro-tools:

java -jar ~/src/avro/java/avro-tools-1.7.4.jar tojson output.avro
{"first":"John","middle":"X","last":"Doe"}
{"first":"Jane","middle":"","last":"Doe"}

Let me also mention that json2avro is written in C and is fast; it supports the Snappy, Deflate and LZMA compression codecs, lets you pick a custom block size, and is smart enough to (optionally) skip over lines it cannot parse.
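Conceptually, the trick json2avro performs with defaults can be pictured as pre-filling each JSON record from the schema before the Avro encoding happens. A rough Python illustration of the idea (json2avro itself is written in C, so this is just a sketch of the concept, not its actual code):

```python
import json

schema = {"type": "record", "name": "whatever",
          "fields": [{"name": "first", "type": "string"},
                     {"name": "middle", "type": "string", "default": ""},
                     {"name": "last", "type": "string"}]}

def fill_defaults(record, schema):
    # Insert a field's schema default wherever the JSON record omits it.
    for field in schema["fields"]:
        if field["name"] not in record and "default" in field:
            record[field["name"]] = field["default"]
    return record

lines = ['{"first":"John", "middle":"X", "last":"Doe"}',
         '{"first":"Jane", "last":"Doe"}']
for line in lines:
    print(json.dumps(fill_defaults(json.loads(line), schema), sort_keys=True))
# {"first": "John", "last": "Doe", "middle": "X"}
# {"first": "Jane", "last": "Doe", "middle": ""}
```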

Enjoy!