Instaparse Powered Slackbot

Bots are all the rage now. While building a conversational AI bot is a huge undertaking, building your own helpful Slackbot isn’t. These bots are great for performing simple tasks that don’t warrant a dedicated interface. This is exactly what I needed at my job. Creating a customer, listing customers, and refreshing customer data were all repetitive tasks and great candidates for automating with a bot. I began by writing my own parsing code, and this worked fine until things began to get more complicated.

» Continue Reading (about 600 words)

Making Sense of Clojure's Overlooked Agents

Working extensively with Clojure in the last year, I’ve been exploring the many concurrency techniques favored by the language. Certain strategies like atoms and immutability have become second nature. The Clojure books explain atoms, refs, and agents and how they work. While I understand refs and atoms, I never found agents well covered. There are examples of using agents to protect resources like files, but that’s it.

The Textbook on Agents

It’s natural to assume agents are like actors.

» Continue Reading (about 800 words)

Getting Started with Apache Kafka for the Baffled, Part 2

In part 1, we got a feel for topics, producers, and consumers in Apache Kafka. In this part, we will learn about partitions, keyed messages, and the two types of topics. Kafka is built around a simple log architecture. This simplicity makes Kafka robust and fast.

Partitions

A topic can be divided into partitions which may be distributed. Partitions enable the following:

- Distribute the data across brokers (think sharding)
- Simplify parallelization
- Ensure sequencing of related messages

We will touch on each of these.
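To make the link between keyed messages and ordering concrete, here is a minimal sketch of key-based partitioning. It is not Kafka's actual partitioner: the CRC32 hash, the partition count, and the names `partition_for` and `route` are all assumptions chosen for illustration. The idea it shows is that hashing a key to a partition index sends every message with the same key to the same append-only log, so related messages keep their relative order.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count, chosen for the illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition index.

    CRC32 is a toy stand-in for whatever hash a real Kafka client uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages):
    """Append each (key, value) message to the log of its key's partition."""
    partitions = defaultdict(list)
    for key, value in messages:
        partitions[partition_for(key)].append((key, value))
    return partitions


messages = [
    ("alice", "login"),
    ("bob", "login"),
    ("alice", "purchase"),
    ("alice", "logout"),
]
logs = route(messages)

# Every "alice" event lands in one partition, in the order it was sent.
alice_partition = partition_for("alice")
alice_events = [v for k, v in logs[alice_partition] if k == "alice"]
print(alice_events)
```

Ordering here is only guaranteed within a partition; messages with different keys may land in different partitions and carry no ordering relative to each other.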

» Continue Reading (about 1600 words)

Reservoir Sampling in Clojure

Lately I’ve been moving our data backend to use Apache Kafka to store our many data sources. I think it’s a great way to deal with the problem of data collection: Kafka gives us great flexibility in how we consume the data and how we query it. One of our data scientists asked if we could randomly sample a stream. This is a common activity in machine learning and statistics, and it’s trivial on a known dataset.
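For readers who have not met the technique in the title, here is a minimal sketch of the classic reservoir sampling scheme (often called Algorithm R), written in Python rather than Clojure. It is not necessarily the implementation the full post arrives at, and the function name `reservoir_sample` and its parameters are assumptions for illustration. It keeps a uniform random sample of `k` items from a stream whose length is not known in advance.

```python
import random


def reservoir_sample(stream, k, rng=random):
    """Return up to k items drawn uniformly at random from an iterable.

    Only one pass over the stream is needed, and only k items are held
    in memory at any time.
    """
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces an existing one
            # only if the slot falls inside the reservoir (probability k/(i+1)).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


# Sample 5 items from a stream of 10,000 without materializing it.
sample = reservoir_sample(iter(range(10_000)), 5, random.Random(42))
print(sample)
```

If the stream has fewer than `k` items, the function simply returns all of them in arrival order.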

» Continue Reading (about 600 words)

Getting Started with Apache Kafka for the Baffled, Part 1

This post isn’t about installing Kafka, or configuring your cluster, or anything like that. My introduction to Kafka was rough, and I hit a lot of gotchas along the way. I want to help others avoid that pain if I can. If you aren’t familiar with why you might want to use Kafka, plenty of great articles make the case:

- The Log: What every software engineer should know about real-time data’s unifying abstraction
- Turning The Database Inside Out With Apache Samza

In this introduction I will assume that you have gone through the Kafka quickstart with version 0.8.2.

» Continue Reading (about 1300 words)