Running Gunicorn webserver from Python with a state-preserving object

This article demonstrates how to configure and run Gunicorn directly from Python 3 code without using configuration files or invoking gunicorn from the command line.

It also demonstrates how to set up a stateful backend with Gunicorn, where a single Python object is kept alive between requests so that it can manage the state of the backend directly in Python code. Graceful startup and shutdown of the backend can be done in callbacks registered with Gunicorn.

Note that running a webserver with just one worker isn't generally recommended as it will not allow scaling to higher workloads. However, it may be a convenient and simple solution for cases where the server will not face high demand or where the server implements custom functionality that's expensive to initialize.

The Notes'n'Todos application uses this approach.

Processes and threads

Gunicorn is configured to run only one worker process, but to use multiple threads (although, because of Python's global interpreter lock, they will not run Python code truly in parallel). Having multiple threads is necessary for Gunicorn to provide persistent HTTP connections, which is important when multiple requests are made in sequence.

With multiple threads, a lock or some other mechanism must be used to synchronize access to single-threaded backend logic.
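As a minimal sketch of that synchronization, the snippet below guards a single-threaded object with a threading.Lock while several threads call it concurrently. The Counter class, the run_requests helper and the thread counts are hypothetical and only stand in for real backend logic; they are not part of Gunicorn.

```python
import threading

class Counter:
    """Hypothetical single-threaded backend object (not part of Gunicorn)."""
    def __init__(self):
        self.value = 0

    def increment(self):
        # Read-modify-write: not safe to run from several threads at once
        self.value += 1
        return self.value

def run_requests(counter, lock, threads=4, calls_per_thread=1000):
    """Simulate request-handling threads that all share one backend object."""
    def handle_requests():
        for _ in range(calls_per_thread):
            # Only one thread at a time may touch the backend
            with lock:
                counter.increment()

    workers = [threading.Thread(target=handle_requests) for _ in range(threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return counter.value

if __name__ == "__main__":
    # 4 threads x 1000 calls each, all serialized by the lock
    print(run_requests(Counter(), threading.Lock()))
```

The lock is created once and shared by all threads, which mirrors how a single lock would be shared by all request handlers inside the one worker process.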

Worker preforking and the need for create and exit callbacks

If the backend needs to do any processing at startup or shutdown, it must be done in callbacks registered with the custom Gunicorn application. It cannot simply be done before and after invoking the Gunicorn app's .run(), for some interesting reasons:

First, realize that Gunicorn uses a prefork process creation model. It means that the state when Gunicorn was first started will be cloned whenever a worker is created. Now, also realize that Gunicorn WILL eventually restart the worker process, even if it was configured to live forever. The result may be a server that mysteriously resets to its original state every couple of days, if initialization is not done in a create callback. As you may have realized by now, I learned this the hard way...
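The cloning described above can be illustrated without Gunicorn at all. The following is a simplified sketch of plain fork semantics on a Unix-like system, not Gunicorn's actual worker management; the state dictionary and the run_worker helper are hypothetical names chosen for the illustration. State set up before the fork is copied into each new child, so a "restarted" worker starts again from the original value.

```python
import os

# State initialized in the parent *before* any worker is forked,
# analogous to doing setup before calling the Gunicorn app's .run()
state = {"value": 0}

def run_worker(requests):
    """Fork a child 'worker', let it handle requests, return its final value."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts with a copy of the parent's state
        os.close(read_fd)
        for _ in range(requests):
            state["value"] += 1
        os.write(write_fd, str(state["value"]).encode())
        os.close(write_fd)
        os._exit(0)
    # Parent: its own copy of the state is never touched by the child
    os.close(write_fd)
    with os.fdopen(read_fd) as f:
        result = int(f.read())
    os.waitpid(pid, 0)
    return result

if __name__ == "__main__":
    print(run_worker(3))   # first worker counts up from the parent's value
    print(run_worker(2))   # a "restarted" worker starts from the same value again
    print(state["value"])  # the parent's copy is unchanged
```

Because each child only modifies its own copy, anything that must survive a worker restart has to be loaded when the worker is created and saved when it exits.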

The bottom line is that it's important that Gunicorn can restart the worker process correctly. In the example below, the restart can be tested by uncommenting self.cfg.set("max_requests", 10), which forces a restart of the worker after 10 requests.

Example code

The following example script demonstrates how to do all this with a custom Gunicorn application class. Bottle is used as the WSGI framework.

The script serves a demo page with a button that shows a value increasing for each click. The backend persists the value through restarts by saving and loading to a file.

Run the script with python (Python 3), with gunicorn and bottle installed via pip. Don't run it with gunicorn from the command line.

import gunicorn.app.base

class CustomUnicornApp(gunicorn.app.base.BaseApplication):
    """
    This gunicorn app class provides create and exit callbacks for workers,
    and runs gunicorn with a single worker and multiple gthreads
    """
    def __init__(self, create_app_callback, exit_app_callback, host_port):
        self._configBind = host_port
        self._createAppCallback = create_app_callback
        self._exitAppCallback = exit_app_callback
        # Call the base class last: its constructor invokes load_config(),
        # which needs the attributes set above
        super().__init__()

    @staticmethod
    def exitWorker(arbiter, worker):
        # worker.app provides us with a reference to "self", and we can call the
        # exit callback with the object created by the createAppCallback:
        self = worker.app
        self._exitAppCallback(self._createdApp)

    def load_config(self):
        self.cfg.set("bind", self._configBind)
        self.cfg.set("worker_class", "gthread")
        self.cfg.set("workers", 1)
        self.cfg.set("threads", 4)
        self.cfg.set("worker_exit", CustomUnicornApp.exitWorker)
        # Try to uncomment and make 10 requests, to test correct restart of worker:
        # self.cfg.set("max_requests", 10) 

    def load(self):
        # This function is invoked when a worker is booted
        self._createdApp = self._createAppCallback()
        return self._createdApp

# --- Stateful service example ---

import threading
from bottle import Bottle

class StatefulService():
    def __init__(self, value):
        print("Service starts, state: %d" % value)
        self.value = value

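    @staticmethod
    def load(filename):
        # Fall back to 0 if the state file is missing or unreadable
        try:
            with open(filename, "r") as f:
                value = int(f.read())
        except (OSError, ValueError):
            value = 0
        return StatefulService(value)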

    def save(self, filename):
        print("Saving state: %d" % self.value)
        with open(filename, "w") as f:
            f.write("%d" % self.value)

    def getValue(self):
        self.value += 1
        return self.value

# --- HTML and server

# The demo page has a button with inline javascript that fetches /getvalue endpoint
# and uses promises to update DOM:
demoHtml = """
<button onclick="fetch('/getvalue')
    .then(response => response.json())
    .then(response => this.innerHTML = 'Value: ' + response.value)">Click me</button>
"""

def startServer():
    def create():
        app = Bottle()
        lock = threading.Lock()
        service = StatefulService.load("state.txt")

        @app.route("/")
        def getIndex():
            # Serve static content, no lock protection necessary:
            return demoHtml

        @app.route("/getvalue")
        def getValue():
            # Get value from the service, protected by a lock
            # as multiple threads may call this:
            with lock:
                value = service.getValue()
            return '{"value":%d}' % value

        # Store reference to the service in the bottle app object
        app.serviceObj = service
        return app

    def exit(app):
        # Get the service through the app object and save state
        app.serviceObj.save("state.txt")

    CustomUnicornApp(create, exit, "localhost:8001").run()

if __name__ == "__main__":
    startServer()
