If you are running Docker as part of your infrastructure, you are probably also hosting a private Docker registry for storing private Docker images. A vanilla installation is pretty good: you just put Docker Distribution in a private VPC and you are good to go. But imagine a scenario where you want to build a public registry with custom access control to the images, something similar to Docker Hub. How would you do that? The good news is that I built exactly that when I was building Layerstore, and in this article I'm going to show you how you can do it yourself.
Before we go into the nitty-gritty details, let me give you some background on Layerstore. Layerstore was a Docker marketplace where anyone could sell Docker images, either as individual images or as image bundles. The entire life cycle of a sale might look something like this:
- Seller reserves image identifier. This identifier will be used to push and pull images from the registry.
- Seller receives read and write permissions to the reserved image identifier.
- Seller uploads the image with the docker push command, configures the product page and sets the price.
- Purchaser buys the product and receives read access to the image.
- Purchaser downloads image onto his servers with
We are going to explore these steps in detail in a moment. Of course, I am going to skip the irrelevant product parts and concentrate mostly on the Docker registry and the services surrounding it.
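The permission flow in the life cycle above can be sketched in a few lines of Python. This is only an illustration of the idea, not Layerstore's actual implementation: the `ImageAccess` class, its method names, and the `pull`/`push` action names are all assumptions made for this example, and the grants live in memory rather than in a real database.

```python
from dataclasses import dataclass, field


@dataclass
class ImageAccess:
    """In-memory permission store (hypothetical, not Layerstore's schema)."""

    # Maps image identifier -> {user: set of allowed actions}
    grants: dict = field(default_factory=dict)

    def reserve(self, seller: str, image: str) -> None:
        """Seller reserves an identifier and receives read and write access."""
        if image in self.grants:
            raise ValueError(f"{image} is already reserved")
        self.grants[image] = {seller: {"pull", "push"}}

    def purchase(self, buyer: str, image: str) -> None:
        """A purchase grants the buyer read-only access."""
        if image not in self.grants:
            raise KeyError(f"{image} has not been reserved")
        self.grants[image].setdefault(buyer, set()).add("pull")

    def allowed(self, user: str, image: str, action: str) -> bool:
        """Return True if the user may perform the action on the image."""
        return action in self.grants.get(image, {}).get(user, set())


acl = ImageAccess()
acl.reserve("seller", "seller/app")
acl.purchase("buyer", "seller/app")

print(acl.allowed("seller", "seller/app", "push"))
print(acl.allowed("buyer", "seller/app", "pull"))
print(acl.allowed("buyer", "seller/app", "push"))
```

An access check like `allowed` is the piece a registry's authorization layer would consult on every push or pull request.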
Eye catching chart showing memory utilization before and after the fix.
Some time ago I wrote a worker that periodically polls a third-party service for data. We started noticing that the worker process was getting killed by the kernel for reaching its memory limit. The container for the worker was given 512MB, which should be more than enough for the job it was doing. The amount of data it fetches can range anywhere from 25MB to 100MB, and it uses this data to sync some internal state of our systems with the data provided by the third party. I was able to find weird memory consumption patterns and refactor the code to take memory usage from ~50% to ~13%, which stopped the worker process from getting OOM killed. This post is about the tools I used to find memory problems in a Python application.
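This excerpt does not name the tools, so as one possible starting point here is a minimal sketch using `tracemalloc` from the Python standard library. It takes a snapshot before and after a suspect step and ranks source lines by how much memory they allocated in between. The `build_payload` function is a hypothetical stand-in for the worker's fetch step, not code from the post.

```python
import tracemalloc


def build_payload(n: int) -> list:
    """Hypothetical stand-in for the fetch step: allocates n small dicts."""
    return [{"id": i, "value": str(i)} for i in range(n)]


tracemalloc.start()
before = tracemalloc.take_snapshot()

payload = build_payload(50_000)

after = tracemalloc.take_snapshot()
# Must be read before stop(), which clears the traced counters.
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Rank source lines by the memory they allocated between the two snapshots.
top = after.compare_to(before, "lineno")
for stat in top[:3]:
    print(stat)
print(f"current={current / 1e6:.1f} MB, peak={peak / 1e6:.1f} MB")
```

A large gap between the peak and current values usually points at short-lived intermediate copies, which is a common source of the kind of spike that gets a container OOM killed.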
Docker has gained a lot of popularity in recent years. Thanks to the movement towards microservices, we are able to get Docker infrastructure from all major web service providers like AWS or Google Cloud.
This post is more of a tutorial-style post in which I'm planning to walk you through how we can bootstrap a fully operational Docker infrastructure from scratch in AWS using Terraform. Managing infrastructure by hand is terrible. I'm not going to go into the details of why that is, but I believe Terraform is going to be one of those tools that will stick with us for a while, especially once it gets more mature. We're going to use Terraform to build our infrastructure.
Locks are very important in distributed systems. Sometimes we want to make sure that only one job runs at a time, while still keeping the system highly available (e.g. a highly available cron server). Of course, there are many other use cases for distributed locks, which I'm not going to talk about. In this post I am going to show you an example of how to implement distributed locks on DynamoDB.
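The core idea behind a lock like this is a conditional write: create the lock item only if nobody holds it, or if the previous holder's lease has expired. DynamoDB supports this through condition expressions on writes. The sketch below is not the post's implementation and makes no AWS calls; `FakeTable` is a hypothetical in-memory stand-in that mimics the conditional put and delete so the logic can be run locally, and the `acquire`/`release` names and the lease-based expiry are assumptions made for this example.

```python
import time


class FakeTable:
    """In-memory stand-in for a DynamoDB table (hypothetical; no AWS calls)."""

    def __init__(self):
        self.items = {}

    def put_if_free(self, key, owner, expires_at, now):
        """Mimic a conditional put: succeed only if no live lock item exists."""
        item = self.items.get(key)
        if item is not None and item["expires_at"] > now:
            return False
        self.items[key] = {"owner": owner, "expires_at": expires_at}
        return True

    def delete_if_owner(self, key, owner):
        """Mimic a conditional delete: only the current owner may release."""
        item = self.items.get(key)
        if item is None or item["owner"] != owner:
            return False
        del self.items[key]
        return True


def acquire(table, key, owner, ttl=30.0, now=None):
    """Try to take the lock for ttl seconds; return True on success."""
    now = time.time() if now is None else now
    return table.put_if_free(key, owner, now + ttl, now)


def release(table, key, owner):
    """Release the lock if this owner still holds it."""
    return table.delete_if_owner(key, owner)


table = FakeTable()
first = acquire(table, "nightly-job", "worker-a", ttl=30, now=100.0)
second = acquire(table, "nightly-job", "worker-b", ttl=30, now=110.0)  # still held
third = acquire(table, "nightly-job", "worker-b", ttl=30, now=131.0)   # lease expired
print(first, second, third)
```

The lease (an expiry timestamp stored with the item) is what keeps the system available when a lock holder crashes: another worker can take over once the lease runs out instead of waiting forever.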
I had an opportunity to work on an interesting infrastructure challenge. It goes something like this: we need to be able to persist an incoming data stream of approximately 200 thousand messages/second, and we also need to guarantee data availability and redundancy. This is the typical scale of data I used to deal with at Chartbeat on a daily basis. When working with such high traffic, you're most likely going to run into questions to which you might not know the answers right away.
- How many servers do we need to handle such traffic?
- Do we need to store the data and how can we do that?
- If we must store the data, for how long are we going to need access to it?
- How much is the new infrastructure going to cost us?
These are just a few questions that you will have to answer in order to pick the right tools for the job. In this post I will try to provide the answers to some of these questions and also show you a sample infrastructure setup that can be used to handle large amounts of traffic while abiding by our requirements.
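A quick back-of-the-envelope calculation helps frame questions like these. Only the 200 thousand messages/second figure comes from the post; the average message size, replication factor, and retention period below are placeholder assumptions that you would replace with your own numbers.

```python
MESSAGES_PER_SECOND = 200_000   # from the post
AVG_MESSAGE_BYTES = 500         # assumption, not from the post
REPLICATION_FACTOR = 3          # assumption, not from the post
RETENTION_DAYS = 7              # assumption, not from the post

SECONDS_PER_DAY = 86_400

# Sustained ingest rate the cluster must absorb.
ingest_bytes_per_second = MESSAGES_PER_SECOND * AVG_MESSAGE_BYTES

# Raw (unreplicated) volume written per day.
raw_bytes_per_day = ingest_bytes_per_second * SECONDS_PER_DAY

# Total disk needed to keep every replica for the full retention window.
stored_bytes = raw_bytes_per_day * RETENTION_DAYS * REPLICATION_FACTOR

print(f"ingest: {ingest_bytes_per_second / 1e6:.0f} MB/s")
print(f"raw per day: {raw_bytes_per_day / 1e12:.2f} TB")
print(f"stored: {stored_bytes / 1e12:.2f} TB")
```

Dividing the stored volume by the usable disk per server, and the ingest rate by the write throughput a single server can sustain, gives two lower bounds on server count; the larger of the two is the one that matters.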
At Chartbeat we are thinking about adding probabilistic counters to our infrastructure, HyperLogLog (HLL) in particular. One of the challenges with something like this is making it redundant while keeping reasonably good performance. Since HyperLogLog is a relatively new approach to cardinality approximation, there are not many off-the-shelf solutions, so why not try to implement HLL in Cassandra?
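For readers new to HLL, here is a minimal pure-Python sketch of the basic algorithm. It is not the Cassandra implementation the post goes on to describe: the class name, the choice of SHA-1 as the hash, and the default of 12 index bits are assumptions made for this example. The `merge` method shows why HLL suits redundant, distributed setups: two sketches combine by taking the per-register maximum.

```python
import hashlib
import math


class HyperLogLog:
    """Basic HyperLogLog with 2**p registers (sketch; assumes p >= 7)."""

    def __init__(self, p: int = 12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # Use the first 64 bits of SHA-1 as the hash value.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)               # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other: "HyperLogLog") -> None:
        """Union with another sketch of the same size."""
        if other.m != self.m:
            raise ValueError("sketches must have the same number of registers")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
for i in range(50_000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")  # duplicates do not change the registers
estimate = hll.count()
print(f"estimated distinct items: {estimate:.0f}")
```

With 4096 registers the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%, while the whole sketch fits in a few kilobytes regardless of how many items it has seen.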
This is my first post on machine learning, and hopefully not the last one. The main goal of these posts is to serve as a quick reference for simple machine learning problems and their solutions, while also allowing me to get a better understanding of the field itself. That said, don't take anything for granted.
For one of the fixes I was working on at work I had to implement row level
locking in Django. The current stable version of Django, 1.3, does not have a
built-in capability for row level locking on InnoDB tables. The good news is
that the development version already has an update to the QuerySet API that will
let you use the select_for_update method to acquire a write lock on rows matching
your query. If you can use the development version for your project you may stop
reading and go upgrade Django, otherwise I will see you at the bottom of the