Creating Threadsafe Ruby Sidekiq Workers
Thread safety in general can be difficult to wrap your head around at times, but at the simplest level if you have data shared between threads then things can get messy. As always, concrete examples are easier than theory!
Our Legacy Rails 3.2 Application
In this legacy application, we were running the web server using Passenger, which used worker processes, not threads, for concurrency. This means that each process had its own memory, and writing threadsafe code wasn't a concern. Similarly, the background queue processor was Resque, again using processes rather than threads for concurrency.
In our upgrade to newer hardware, we decided to keep Passenger as the web server, but wanted to make the switch to Sidekiq for a myriad of reasons. Sidekiq runs jobs on multiple threads within one process, and our code turned out not to be threadsafe, which led to a few problems.
Class Variables Are Not Threadsafe
Avoid class variables in Ruby: any mutable state held at the class level is shared by every thread in the process. In our case the code was storing whether the current temperature unit should be Celsius or Fahrenheit, scoped to a block of code. Other examples might include a multi-tenant application that scopes to the current organization. Our class kept the unit in class-level state, meaning that if two threads attempted to modify the temperature unit, the later running thread "won". This caused temperature units to be mixed up between workers, primarily when creating PDF reports to send via email.
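To make the race concrete, here is a minimal sketch. The `TemperatureUnit` class and its `with` method are hypothetical stand-ins for our service, not the original code, and the two queues exist only to force the unlucky interleaving every time.

```ruby
# Hypothetical class that keeps the "current" unit in class-level state.
class TemperatureUnit
  class << self
    attr_accessor :current
  end

  # Sets the unit for the duration of the block, then restores it.
  def self.with(unit)
    previous = current
    self.current = unit
    yield
  ensure
    self.current = previous
  end
end

# The queues are used only to force a deterministic interleaving.
a_has_set = Queue.new
b_has_set = Queue.new
seen_by_a = nil

thread_a = Thread.new do
  TemperatureUnit.with(:celsius) do
    a_has_set << true   # tell B that A has set its unit
    b_has_set.pop       # wait until B has overwritten it
    seen_by_a = TemperatureUnit.current
  end
end

thread_b = Thread.new do
  a_has_set.pop
  TemperatureUnit.with(:fahrenheit) do
    b_has_set << true
    thread_a.join       # keep B's block open until A has read
  end
end

[thread_a, thread_b].each(&:join)
p seen_by_a # prints :fahrenheit, although thread A asked for :celsius
```

Thread A asked for Celsius but reads Fahrenheit, because both threads write to the same class-level slot.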
Thread.current allows you to store information that is scoped to only the running thread, which is great for keeping data unique across a multithreaded environment such as Sidekiq (or app servers such as puma). The problem, however, is that Thread.current will retain this information until the thread is killed, or the data is manually cleared.
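The retention problem can be sketched with one long-lived thread that handles several jobs in a row. The `:temperature_unit` key and the job lambdas are illustrative assumptions; the loop only imitates how a thread pool reuses a thread.

```ruby
# Simulates one long-lived worker thread handling jobs in sequence.
jobs = Queue.new
seen = []

worker = Thread.new do
  while (job = jobs.pop) != :stop
    job.call
  end
end

# Job 1 sets a unit and never clears it.
jobs << -> { Thread.current[:temperature_unit] = :fahrenheit }
# Job 2 sets nothing, yet still sees the value left by job 1.
jobs << -> { seen << Thread.current[:temperature_unit] }
# Job 3 clears the key manually, so job 4 sees nil.
jobs << -> { Thread.current[:temperature_unit] = nil }
jobs << -> { seen << Thread.current[:temperature_unit] }
jobs << :stop
worker.join

p seen # prints [:fahrenheit, nil]
```

The second job never set a unit, yet it inherits the one from the first job because both ran on the same thread.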
In the web server (again, say Puma), each request is served by a single thread. That thread can persist to serve the next incoming request, and bam, your data has leaked across requests. Fortunately, there is a solution.
Using Request Store
Request Store is a library that wraps Thread.current and injects some Rack middleware into your web application so that the store is cleared out after each request is served. This is a clean, abstract way to ensure that your data is scoped per request, per thread. Awesome, right? This is great for the web server, but doesn't actually ensure we aren't leaking data in our background jobs with Sidekiq, because there is no request in a background worker.
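The idea can be sketched without the gem. `MiniStore` and `ClearStoreMiddleware` below are simplified, hypothetical stand-ins written for illustration; they are not Request Store's source, and the toy app reads a made-up `"unit"` key from the Rack env.

```ruby
# Simplified stand-in for a per-request store (not the gem's source).
module MiniStore
  def self.store
    Thread.current[:mini_store] ||= {}
  end

  def self.clear!
    Thread.current[:mini_store] = {}
  end
end

# Rack-style middleware: clears the store once the app has responded.
class ClearStoreMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  ensure
    MiniStore.clear!
  end
end

# A toy Rack app that sets a unit from the request and echoes it back.
app = lambda do |env|
  MiniStore.store[:unit] = env["unit"] if env["unit"]
  [200, {}, [MiniStore.store[:unit].to_s]]
end

stack  = ClearStoreMiddleware.new(app)
first  = stack.call("unit" => :fahrenheit)
second = stack.call({}) # same thread, but no unit set on this request

p first[2]  # prints ["fahrenheit"]
p second[2] # prints [""]
```

Both calls run on the same thread, yet the second request starts with an empty store because the middleware cleared it.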
Using Request Store and Sidekiq
This was such a common theme in our work that we decided to release a gem that integrates Request Store and Sidekiq in an elegant way, piggybacking on the work already completed in Request Store. In our gem, we create a piece of Sidekiq server middleware that is run every time a job is pulled from the Sidekiq queue and processing begins. The middleware ensures that the Request Store store is cleared after the job completes, just like Request Store does automatically with the Rack middleware it includes by default.
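A Sidekiq server middleware is any object that responds to `call(worker, job, queue)` and yields to run the job. The sketch below shows the shape of such a middleware; the class name and `STORE_KEY` are assumptions made for illustration and this is not the gem's actual source. It clears a plain thread-local hash so that it runs without Sidekiq or Request Store installed.

```ruby
# Hypothetical middleware; the class name and STORE_KEY are assumptions.
class ClearStoreAfterJob
  STORE_KEY = :job_store

  # Sidekiq passes the worker instance, the job payload hash, and the
  # queue name, and expects the middleware to yield to run the job.
  def call(_worker, _job, _queue)
    yield
  ensure
    Thread.current[STORE_KEY] = {}
  end
end

# With Sidekiq loaded, registration would look like this:
#
#   Sidekiq.configure_server do |config|
#     config.server_middleware do |chain|
#       chain.add ClearStoreAfterJob
#     end
#   end

# Simulate two jobs running back to back on one thread.
middleware = ClearStoreAfterJob.new
store = -> { Thread.current[ClearStoreAfterJob::STORE_KEY] ||= {} }

middleware.call(nil, {}, "default") { store.call[:unit] = :fahrenheit }
leaked = :unset
middleware.call(nil, {}, "default") { leaked = store.call[:unit] }

p leaked # prints nil
```

Clearing in `ensure` means the store is reset even when the job raises, so a failed job cannot leak state into the next one on that thread.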
Now that we have a way to safely store information per thread, scoped to either a web request or a Sidekiq job, our code is safe to run in a multithreaded environment. Below is an example of what our service was refactored to.
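As a rough illustration of that shape (a sketch with hypothetical names, not the original service code), the unit can live in a per-thread store rather than on the class. The `store` method here is a thread-local hash standing in for `RequestStore.store` so that the sketch runs without the gem, and the queues again only force the interleaving.

```ruby
# Hypothetical service; names are illustrative, not the original code.
class TemperatureUnit
  DEFAULT = :celsius

  class << self
    # Stand-in for RequestStore.store so the sketch runs without the gem.
    def store
      Thread.current[:temperature_store] ||= {}
    end

    def current
      store[:temperature_unit] || DEFAULT
    end

    # Sets the unit for the duration of the block, then restores it.
    def with(unit)
      previous = store[:temperature_unit]
      store[:temperature_unit] = unit
      yield
    ensure
      store[:temperature_unit] = previous
    end
  end
end

# Force two threads to overlap while each holds a different unit.
a_has_set = Queue.new
b_has_set = Queue.new
seen = {}

thread_a = Thread.new do
  TemperatureUnit.with(:celsius) do
    a_has_set << true
    b_has_set.pop
    seen[:a] = TemperatureUnit.current
  end
end

thread_b = Thread.new do
  a_has_set.pop
  TemperatureUnit.with(:fahrenheit) do
    b_has_set << true
    thread_a.join
    seen[:b] = TemperatureUnit.current
  end
end

[thread_a, thread_b].each(&:join)
p seen # prints {:a=>:celsius, :b=>:fahrenheit} (or {a: :celsius, b: :fahrenheit} on newer Rubies)
```

Each thread now reads back the unit it set, even though the two blocks overlap in time.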