Data Structures

#58 2016-12-28 13 min
Fast Topic Matching
A common problem in messaging middleware is that of efficiently matching message topics with interested subscribers. For example, assume we have a set of subscribers, numbered 1 to 3:

Subscriber   Match Request
1            forex.usd
2            forex.*
3            stock.nasdaq.msft

And we have a stream of messages, numbered 1 to N:

Message   Topic
1         forex.gbp
2         stock.nyse.ibm
3         stock.nyse.ge
4         forex.eur
5         forex.usd
…         …
N         stock.nasdaq.msft

We are then tasked with routing messages whose topics match the respective subscriber requests, where a “*” wildcard matches any word. This is frequently a bottleneck for message-oriented middleware like ZeroMQ, RabbitMQ, ActiveMQ, TIBCO EMS, et al. Because of this, there are a number of well-known solutions to the problem. In this post, I’ll describe some of these solutions, as well as a novel one, and attempt to quantify them through benchmarking. As usual, the code is available on GitHub.
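To make the matching rule concrete, here is a minimal sketch of the naive approach: check every subscription against every incoming topic. It is not one of the solutions from the post, and the names `matches`, `route`, and `SUBSCRIPTIONS` are hypothetical. It also assumes that topics are dot-separated and that “*” stands for exactly one word.

```python
# Naive topic matching: a linear scan over all subscriptions.
# Assumptions (not taken from the post): topics are dot-separated words,
# and "*" matches exactly one word at its position.

SUBSCRIPTIONS = {
    1: "forex.usd",
    2: "forex.*",
    3: "stock.nasdaq.msft",
}


def matches(pattern: str, topic: str) -> bool:
    """Return True if every word of the topic matches the pattern."""
    pattern_words = pattern.split(".")
    topic_words = topic.split(".")
    if len(pattern_words) != len(topic_words):
        return False
    return all(p == "*" or p == t for p, t in zip(pattern_words, topic_words))


def route(topic: str, subscriptions=SUBSCRIPTIONS) -> list:
    """Return the ids of all subscribers whose request matches the topic."""
    return [sid for sid, pattern in subscriptions.items() if matches(pattern, topic)]


if __name__ == "__main__":
    for topic in ["forex.gbp", "stock.nyse.ibm", "forex.usd", "stock.nasdaq.msft"]:
        print(topic, "->", route(topic))
```

Each lookup costs time proportional to the number of subscriptions, which is the kind of overhead that turns topic matching into a bottleneck as the subscriber set grows.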
#52 2016-02-24 22 min
So You Wanna Go Fast?
I originally proposed this as a GopherCon talk on writing “high-performance Go”, which is why it may seem rambling, incoherent, and—at times—not at all related to Go. The talk was rejected (probably because of the rambling and incoherence), but I still think it’s a subject worth exploring. The good news is, since it was rejected, I can take this where I want. The remainder of this piece is mostly the outline of that talk with some parts filled in, some meandering stories which may or may not pertain to the topic, and some lessons learned along the way. I think it might make a good talk one day, but this will have to do for now.
#49 2015-12-27 20 min
Breaking and Entering: Lose the Lock While Embracing Concurrency
This article originally appeared on Workiva’s engineering blog as a two-part series. Providing robust message routing was a priority for us at Workiva when building our distributed messaging infrastructure. This encompassed directed messaging, which allows us to route messages to specific endpoints based on service or client identifiers, but also topic fan-out with support for wildcards and pattern matching. Existing message-oriented middleware, such as RabbitMQ, provides varying levels of support for these but doesn’t offer the rich features needed to power Wdesk. These include transport fallback with graceful degradation, tunable qualities of service, support for client-side messaging, and pluggable authentication middleware. As such, we set out to build a new system, not by reinventing the wheel, but by repurposing it.
#46 2015-12-06 1 min
Probabilistic algorithms for fun and pseudorandom profit
Probabilistic algorithms for fun and pseudorandom profit from Tyler Treat
#34 2015-02-13 19 min
Stream Processing and Probabilistic Methods: Data at Scale
Stream processing and related abstractions have become all the rage following the rise of systems like Apache Kafka, Samza, and the Lambda architecture. Applying the idea of immutable, append-only event sourcing means we’re storing more data than ever before. However, as the cost of storage continues to decline, it’s becoming more feasible to store more data for longer periods of time. With immutability, how the data lives isn’t interesting anymore. It’s all about how it moves.
