Building High Throughput APIs with Ruby—Part 1

Adtile Technologies

Welcome to a series of blog posts on building high throughput APIs in Ruby. These are all based on lessons learned while building Adtile and, more specifically, the ad serving component of it.

Ad servers need to be fast. Really fast. So fast that users don’t notice something external is loading. Response times need to be low, even when handling thousands of ad requests and event tracking operations every second. It takes a mix of fast responses and well-designed SDKs to provide a seamless loading experience. For now, we’ll focus on what we do to get our responses out as fast as possible.

Why Ruby?

We know there are many languages out there with better raw performance. We use Ruby because that’s what we like and are good at. As simple as that. It lets us experiment and adapt quickly.

Don’t just build things in a language because it runs fast. Do it in a language where you can build things fast instead. Create small components that work together to create a larger system. That way you’ll be able to replace the slow parts later, if really needed. Raw performance only takes you so far, while a good architecture will take you much further.

Our stack

Our ad server is a Rails::API application that reads from and writes to PostgreSQL, Redis, DynamoDB and Memcached. We use Sidekiq Pro for background processing.

This specific part of Adtile runs on JRuby instead of MRI. JRuby has allowed us to squeeze a lot of performance out of single instances, since we can run code truly concurrently and have the JVM optimize our code as it runs.

Finding bottlenecks

During the initial phases of development, we tried not to focus too much on performance while we played around with different ideas. On the other hand, we also tried to make good decisions that wouldn’t lead to a slow application. Once we were done with our features, we finally took a good look at performance.

Our code wasn’t particularly slow, but it also wasn’t as fast as I had hoped. To get a feel for how things would work in production, we launched enough instances and did some load testing. We’ve been using and have been happy with it. Benchmarking your application locally can only tell you so much. In production you’re going to be dealing with load balancers, databases running on separate machines and the latency that comes with them. If you’re not doing any load testing against your staging or production environment, you definitely should. You may be surprised.

Simply throwing a lot of traffic at our application gave us a pretty good idea of what needed improvement right away. When our application couldn’t handle it anymore and started throwing exceptions instead of successful responses, it became clear from looking at the logs what our problems were.

Even with a big enough connection pool size for our databases, we were still seeing timeouts waiting for connections. It wasn’t because any of our databases were slow, but mostly because of other factors such as connection limits and, perhaps even more importantly, latency.

For example, in our AWS region, the latency between availability zones is around 2ms. It doesn’t seem like much, but if you’re doing multiple operations that are dependent on each other, you’ll be multiplying that latency a few times and it does add up pretty quickly.
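
To put rough numbers on that, here is a small back-of-the-envelope sketch. It treats the 2ms as the cost of each dependent operation and assumes a connection stays checked out for the whole wait; the method names and the operation counts are made up for illustration.

```ruby
# Only the 2 ms figure comes from the text; everything else is assumed.
CROSS_AZ_LATENCY_MS = 2.0

# Total time spent waiting on the network when each operation can only
# start after the previous one has finished.
def sequential_wait_ms(operations, latency_ms = CROSS_AZ_LATENCY_MS)
  operations * latency_ms
end

# Rough ceiling on requests per second that a single connection can serve
# if it is held for the entire wait.
def max_requests_per_connection(operations, latency_ms = CROSS_AZ_LATENCY_MS)
  (1000.0 / sequential_wait_ms(operations, latency_ms)).floor
end

[1, 3, 5].each do |n|
  puts "#{n} dependent operations: #{sequential_wait_ms(n)} ms waiting, " \
       "at most #{max_requests_per_connection(n)} requests/s per connection"
end
```

Under these assumptions, five dependent operations cost 10ms of pure waiting per request, which caps a single pooled connection at about 100 requests per second before any real work is counted.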

Going easy on the database

One of the biggest improvements we saw was when we managed to stop writing to the database on every request. We still need all that data from each request, but we realized we were doing very simple operations, such as incrementing values, so we came up with a way to do those less often. Even simple ideas like this go a long way in improving overall system performance. And not only that, but you’ll also be saving money. We’ve all heard that hitting the database is expensive, and with DynamoDB, for example, that can be said literally, as you pay based on provisioned database throughput.

We built a small Ruby library to do just this, which you can find on GitHub. It lets you define aggregator tasks that run on a separate thread as often as you want them to run.

An example aggregator that would keep track of pageviews could look something like:

class PageviewAggregator < Aggregator
  def process(collection, item)
    collection ||= {}
    collection[item] = collection.fetch(item, 0) + 1
    # Return the collection so the counts carry over to the next item
    collection
  end

  def finish(collection)
    # Update the database based on your aggregated data:
    # { "/" => 471, "/about" => 127, ... }
  end
end

Then, instead of doing database calls for every single page you serve, you could simply push an item to the aggregator.

You can find full instructions on how to use it on the project README.
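
To make the idea concrete without relying on the library, here is a minimal sketch of the same pattern using only Ruby's built-in Queue and Thread. The class name SketchAggregator, its push and shutdown methods and the interval: option are all hypothetical names chosen for this example; they are not the library's API.

```ruby
# Hypothetical stand-in for illustration; not the library's actual API.
class SketchAggregator
  def initialize(interval: 1.0, &on_flush)
    @interval = interval
    @on_flush = on_flush
    @queue = Queue.new               # thread-safe queue from Ruby core
    @worker = Thread.new { run }
  end

  # Called from request threads: cheap, no database work.
  def push(item)
    @queue << item
  end

  # Stop accepting items, then wait for the worker to flush what is left.
  def shutdown
    @queue.close
    @worker.join
  end

  private

  # Wake up every @interval seconds and flush until the queue is
  # closed and fully drained.
  def run
    until @queue.closed? && @queue.empty?
      sleep @interval
      flush
    end
  end

  # Collapse everything queued so far into one count per item and hand
  # it over in a single call. Only this thread pops, so the non-blocking
  # pop cannot race with another consumer.
  def flush
    counts = Hash.new(0)
    counts[@queue.pop(true)] += 1 until @queue.empty?
    @on_flush.call(counts) unless counts.empty?
  end
end

# Usage: four pushes are handed to the block as per-path counts.
writes = []
aggregator = SketchAggregator.new(interval: 0.1) { |counts| writes << counts }
%w[/ / /about /].each { |path| aggregator.push(path) }
aggregator.shutdown
p writes
```

The block passed to the constructor is where a real application would issue its database update, once per interval instead of once per request.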

Since we use JRuby, aggregations can run at the same time as we serve requests. However, even if we were running on MRI, we’d still benefit from it.
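
One likely reason the MRI case still holds: MRI releases its global lock while a thread is sleeping or blocked on I/O, so a background thread that mostly waits on the database doesn't hold up other threads. The sketch below uses sleep as a stand-in for a slow database write; the method names and the 0.1 second duration are assumptions made for illustration.

```ruby
require "benchmark"

# sleep stands in for a blocking database write (assumed duration).
def fake_database_write
  sleep 0.1
end

# Perform the writes one after another on the current thread.
def run_sequentially(count)
  Benchmark.realtime { count.times { fake_database_write } }
end

# Perform the same writes on separate threads and wait for all of them.
def run_in_threads(count)
  Benchmark.realtime do
    Array.new(count) { Thread.new { fake_database_write } }.each(&:join)
  end
end

puts format("sequential: %.2fs", run_sequentially(4))
puts format("threaded:   %.2fs", run_in_threads(4))
```

On both MRI and JRuby the threaded run should finish in roughly the time of a single write, because the threads spend their time waiting rather than executing Ruby code.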

If you’re gathering analytics data, chances are good that you can combine multiple writes into a single one, or at least into fewer. And you can do that while still saving data often enough to present users with up-to-date information.

Coming up next…

This post is just an introduction to our stack and a solution to a bottleneck we found. In future posts we’ll go through things such as gotchas in Rails, tuning for performance, improving Rails caching performance with Memcached, abusing Redis scripting for fun and profit and many other things. Stay tuned!

About the Author

Adtile is a pioneer and developer of motion-sensing technology for smartphones and tablets. We are working with leading technology companies and Fortune 500 brands.
