Sharding: L7

This post describes one possible sharding approach that keeps the risk of design errors low.

Since the main requirement is a data migration to a sharded database with a chosen sharding key, we already know that we have to prepare n new empty shards anyway.

However, there is no need to refactor the database interface implementation of the service we want to upgrade.

At first, we can simply launch n shards of the application as well (one api-shard per db-shard) and pass the sharding key in a header to a reverse proxy, which decides which api shard to use:

+---------------+                                +---------------+
|     Auth      |                                |     Auth      |
+---------------+                                +---------------+
        |            auth & reverse proxy                |
        |           --------------------->               | inject "X-SHARDING-KEY" header
        |            reconfiguration                     |
+-------v-------+                        +---------------v---------------+
|    Reverse    |                        |            Reverse            |
|     Proxy     |                        |             Proxy             |
+---------------+                        +-------------------------------+
        |                                |               |               |
+-------v-------+                 +------v------+ +------v------+ +------v------+
|      API      |                 | api-shard-1 | | api-shard-2 | | api-shard-3 |
+---------------+                 +-------------+ +-------------+ +-------------+
        |                                |               |               |
+-------v-------+                 +------v------+ +------v------+ +------v------+
|   Database    |                 | db-shard-1  | | db-shard-2  | | db-shard-3  |
+---------------+                 +-------------+ +-------------+ +-------------+

Later, we’ll be able to find out which api-shard/db-shard pair carries more load than the rest and apply more complex sharding logic.
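The post does not prescribe how the Auth layer derives the header value from a request. As a minimal sketch, assuming a hypothetical `shardFor` helper and a simple FNV-1a hash taken modulo the number of shards (both are assumptions made for illustration, not part of the original design), the value injected as "X-SHARDING-KEY" could be computed like this:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardNames lists the header values the reverse proxy is expected to understand.
// The names mirror the ones used in the proxy configs later in this post.
var shardNames = []string{"first", "second", "third"}

// shardFor is a hypothetical helper: it maps an arbitrary key (for example a
// user ID) to one of the shard names using FNV-1a and a modulo.
func shardFor(key string) string {
	h := fnv.New32a()
	_, _ = h.Write([]byte(key)) // hash.Hash.Write never returns an error
	return shardNames[h.Sum32()%uint32(len(shardNames))]
}

func main() {
	for _, id := range []string{"user-1", "user-2", "user-42"} {
		fmt.Printf("%s -> X-SHARDING-KEY: %s\n", id, shardFor(id))
	}
}
```

A plain modulo keeps the sketch short, but it remaps most keys whenever the number of shards changes, which is one reason to move to more complex sharding logic later.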

Development environment

Of course, it is more convenient to work with a custom reverse proxy.
But for clarity, the approach will be shown with well-known open-source proxy solutions.
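For comparison, a custom reverse proxy could look roughly like the sketch below. It is only an illustration built on the standard library's `httputil.ReverseProxy`; the `newShardRouter` name, the 400 response for unknown keys, and the listen address are assumptions, while the backend host names are the ones used in the docker-compose file in this post.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

// newShardRouter returns a handler that forwards each request to the backend
// selected by the X-SHARDING-KEY header. Requests with a missing or unknown
// key are rejected with 400 (an assumption; the post does not define this case).
func newShardRouter(backends map[string]string) (http.Handler, error) {
	proxies := make(map[string]*httputil.ReverseProxy, len(backends))
	for key, raw := range backends {
		target, err := url.Parse(raw)
		if err != nil {
			return nil, err
		}
		proxies[key] = httputil.NewSingleHostReverseProxy(target)
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		proxy, ok := proxies[r.Header.Get("X-SHARDING-KEY")]
		if !ok {
			http.Error(w, "unknown sharding key", http.StatusBadRequest)
			return
		}
		proxy.ServeHTTP(w, r)
	}), nil
}

func main() {
	router, err := newShardRouter(map[string]string{
		"first":  "http://api-shard-first:8080",
		"second": "http://api-shard-second:8080",
		"third":  "http://api-shard-third:8080",
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Fatal(http.ListenAndServe(":80", router))
}
```

Keeping the routing table in a plain map makes it easy to replace the lookup with more complex sharding logic without touching the api shards.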

API configuration

Let’s first prepare a simple api that reads a special environment variable (MESSAGE in our case) and returns its value in the response, so that one shard can be told apart from another:

package main

import (
  "errors"
  "log"
  "net/http"
  "os"
)

func main() {
  message := os.Getenv("MESSAGE")

  mux := http.NewServeMux()
  mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
    w.WriteHeader(http.StatusOK)
    _, _ = w.Write([]byte(message))
  })

  err := http.ListenAndServe(":8080", mux)
  if !errors.Is(err, http.ErrServerClosed) {
    log.Println(err)
    os.Exit(1)
  }
}

Next, we’re going to prepare docker-compose.yml to launch our development environment:

version: "3"

networks:
  sharded-service:
    driver: bridge

Then launch three api shards inside the same network, each replying with its own message:

services:
  api-shard-first:
    image: golang:1.21-alpine
    command: ["go", "run", "/go/src/service/main.go"]
    volumes:
      - ./main.go:/go/src/service/main.go
    environment:
      MESSAGE: "i am a first shard!" # <- mark api shard as a first one
    networks:
      - sharded-service
    ports:
      - "8081:8080"
  api-shard-second:
    image: golang:1.21-alpine
    command: ["go", "run", "/go/src/service/main.go"]
    volumes:
      - ./main.go:/go/src/service/main.go
    environment:
      MESSAGE: "i am a second shard!" # <- mark api shard as a second one
    networks:
      - sharded-service
    ports:
      - "8082:8080"
  api-shard-third:
    image: golang:1.21-alpine
    command: ["go", "run", "/go/src/service/main.go"]
    volumes:
      - ./main.go:/go/src/service/main.go
    environment:
      MESSAGE: "i am a third shard!" # <- mark api shard as a third one
    networks:
      - sharded-service
    ports:
      - "8083:8080"

Reverse Proxy configuration

Nginx

To work with Nginx, add the following lines to the services block:

services:
  nginx:
    image: nginx:1.25-alpine
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
    networks:
      - sharded-service
    ports:
      - "80:80"

To match the header value with the corresponding shard, use the map module:

upstream first_api_shard {
  server api-shard-first:8080;
}

upstream second_api_shard {
  server api-shard-second:8080;
}

upstream third_api_shard {
  server api-shard-third:8080;
}

map $http_x_sharding_key $api_shard {
  "first"  first_api_shard;
  "second" second_api_shard;
  "third"  third_api_shard;
}

server {
  listen 80;
  location / {
    proxy_pass                 http://$api_shard;
    proxy_set_header           Host $http_host;
    proxy_pass_request_headers on;
  }
}

HAProxy

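To work with HAProxy instead, add the following lines to the services block (it publishes the same host port 80, so run it in place of the nginx service, not alongside it):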

services:
  haproxy:
    image: haproxy:2.2-alpine
    ports:
      - "80:80"
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
    networks:
      - sharded-service

To match the header value with the corresponding shard, use the acl feature:

frontend sharded-service
  bind *:80
  mode http
  # "eq" is an integer operator in HAProxy ACLs; use an exact string match instead
  acl first_shard hdr(X-SHARDING-KEY) -m str first
  acl second_shard hdr(X-SHARDING-KEY) -m str second
  acl third_shard hdr(X-SHARDING-KEY) -m str third
  use_backend first_api_shard if first_shard
  use_backend second_api_shard if second_shard
  use_backend third_api_shard if third_shard

backend first_api_shard
  mode http
  server first_api_shard api-shard-first:8080

backend second_api_shard
  mode http
  server second_api_shard api-shard-second:8080

backend third_api_shard
  mode http
  server third_api_shard api-shard-third:8080

Quick start

docker compose up -d # start in the background; wait until the api shards finish compiling
curl "http://localhost:80" -H "X-SHARDING-KEY: first"
# "i am a first shard!"
curl "http://localhost:80" -H "X-SHARDING-KEY: second"
# "i am a second shard!"
curl "http://localhost:80" -H "X-SHARDING-KEY: third"
# "i am a third shard!"
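The same check can be scripted in Go. This is a minimal sketch, assuming the environment above is already running and the proxy is reachable at http://localhost:80; `newShardRequest` is a hypothetical helper introduced only for this example.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

// newShardRequest builds a GET request that carries the sharding key header.
func newShardRequest(baseURL, key string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("X-SHARDING-KEY", key)
	return req, nil
}

func main() {
	for _, key := range []string{"first", "second", "third"} {
		req, err := newShardRequest("http://localhost:80", key)
		if err != nil {
			log.Fatal(err)
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			log.Fatal(err)
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			log.Fatal(err)
		}
		// Print which shard answered for each key.
		fmt.Printf("%s -> %s\n", key, body)
	}
}
```

Each line of output should show the message of the shard that the proxy selected for the given key.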