Sharding: L7
This post describes one possible sharding approach that minimizes the risk of design errors.
Since the main requirement is a data migration to a sharded database with a chosen sharding key, we already know that we have to prepare n new empty shards anyway.
But it isn’t necessary to refactor the database interface implementation of the service we want to upgrade.
At first, we can simply launch n shards of the application too (one api-shard per db-shard) and pass the sharding key as a header to a reverse proxy, which decides which api shard to use:
+---------------+                  +---------------+
|     Auth      |                  |     Auth      |
+---------------+                  +---------------+
        |       auth & reverse proxy       |
        |      --------------------->      | inject "X-SHARDING-KEY" header
        |         reconfiguration          |
+-------v-------+          +---------------v---------------+
|    Reverse    |          |            Reverse            |
|     Proxy     |          |             Proxy             |
+---------------+          +-------------------------------+
        |                  |               |               |
+-------v-------+   +------v------+ +------v------+ +------v------+
|      API      |   | api-shard-1 | | api-shard-2 | | api-shard-3 |
+---------------+   +-------------+ +-------------+ +-------------+
        |                  |               |               |
+-------v-------+   +------v------+ +------v------+ +------v------+
|   Database    |   | db-shard-1  | | db-shard-2  | | db-shard-3  |
+---------------+   +-------------+ +-------------+ +-------------+
Later, we’ll be able to find out which api-shard:db-shard pair carries more load than the rest and apply more complex sharding logic.
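The post does not say how the auth layer picks the value it injects into X-SHARDING-KEY. One possible scheme, shown here only as a sketch, is to hash a stable identifier and take it modulo the number of shards. The function name shardKeyFor, the use of a user ID as input, and the FNV hash are all assumptions made for illustration; the key values match the ones the proxy configurations in this post route on.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardKeys lists the header values that the proxy configs in this post match.
var shardKeys = []string{"first", "second", "third"}

// shardKeyFor maps a stable identifier (a hypothetical user ID) to one shard key.
// The same ID always yields the same key as long as keys stays unchanged.
func shardKeyFor(userID string, keys []string) string {
	h := fnv.New32a()
	_, _ = h.Write([]byte(userID)) // Write on a hash.Hash never returns an error
	return keys[h.Sum32()%uint32(len(keys))]
}

func main() {
	for _, id := range []string{"alice", "bob", "carol"} {
		fmt.Printf("%s -> X-SHARDING-KEY: %s\n", id, shardKeyFor(id, shardKeys))
	}
}
```

A plain modulo ties the mapping to the shard count, so changing n moves most identifiers to a different shard. That is acceptable for a first iteration, and it is one of the things the "more complex sharding logic" mentioned above could replace.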
Development environment
Of course, it is more convenient to work with a custom reverse proxy.
But for clarity, the approach is shown here with well-known open-source proxy solutions.
API configuration
Let’s first prepare a simple API that reads a special environment variable (MESSAGE in our case) and renders its value to distinguish one shard from another:
package main

import (
	"errors"
	"log"
	"net/http"
	"os"
)

func main() {
	// MESSAGE identifies this shard in every response.
	message := os.Getenv("MESSAGE")

	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		_, _ = w.Write([]byte(message))
	})

	// ListenAndServe always returns a non-nil error;
	// only http.ErrServerClosed means a clean shutdown.
	err := http.ListenAndServe(":8080", mux)
	if !errors.Is(err, http.ErrServerClosed) {
		log.Println(err)
		os.Exit(1)
	}
}
Next, we’re going to prepare docker-compose.yml to launch our development environment:
version: "3"

networks:
  sharded-service:
    driver: bridge
Then launch 3 API shards inside the same network, each replying with its own message:
services:
  api-shard-first:
    image: golang:1.21-alpine
    command: ["go", "run", "/go/src/service/main.go"]
    volumes:
      - ./main.go:/go/src/service/main.go
    environment:
      MESSAGE: "i am a first shard!" # <- mark api shard as a first one
    networks:
      - sharded-service
    ports:
      - "8081:8080"

  api-shard-second:
    image: golang:1.21-alpine
    command: ["go", "run", "/go/src/service/main.go"]
    volumes:
      - ./main.go:/go/src/service/main.go
    environment:
      MESSAGE: "i am a second shard!" # <- mark api shard as a second one
    networks:
      - sharded-service
    ports:
      - "8082:8080"

  api-shard-third:
    image: golang:1.21-alpine
    command: ["go", "run", "/go/src/service/main.go"]
    volumes:
      - ./main.go:/go/src/service/main.go
    environment:
      MESSAGE: "i am a third shard!" # <- mark api shard as a third one
    networks:
      - sharded-service
    ports:
      - "8083:8080"
Reverse Proxy configuration
Nginx
To work with Nginx, add the following lines to the services block:
services:
  nginx:
    image: nginx:1.25-alpine
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
    networks:
      - sharded-service
    ports:
      - "80:80"
To match the header value with the corresponding shard, use the map module:
upstream first_api_shard {
    server api-shard-first:8080;
}

upstream second_api_shard {
    server api-shard-second:8080;
}

upstream third_api_shard {
    server api-shard-third:8080;
}

map $http_x_sharding_key $api_shard {
    "first"  first_api_shard;
    "second" second_api_shard;
    "third"  third_api_shard;
}

server {
    listen 80;

    location / {
        proxy_pass http://$api_shard;
        proxy_set_header Host $http_host;
        proxy_pass_request_headers on;
    }
}
HAProxy
To work with HAProxy instead of Nginx (both services publish port 80, so run only one of them at a time), add the following lines to the services block:
services:
  haproxy:
    image: haproxy:2.2-alpine
    ports:
      - "80:80"
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
    networks:
      - sharded-service
To match the header value with the corresponding shard, use the acl feature:
frontend sharded-service
    bind *:80
    mode http

    # hdr() returns a string, so match it as a string;
    # "eq" is an integer operator and would be read as one more pattern here
    acl first_shard  hdr(X-SHARDING-KEY) -m str first
    acl second_shard hdr(X-SHARDING-KEY) -m str second
    acl third_shard  hdr(X-SHARDING-KEY) -m str third

    use_backend first_api_shard  if first_shard
    use_backend second_api_shard if second_shard
    use_backend third_api_shard  if third_shard

backend first_api_shard
    mode http
    server first_api_shard api-shard-first:8080

backend second_api_shard
    mode http
    server second_api_shard api-shard-second:8080

backend third_api_shard
    mode http
    server third_api_shard api-shard-third:8080
Quick start
docker compose up
curl "http://localhost:80" -H "X-SHARDING-KEY: first"
# "i am a first shard!"
curl "http://localhost:80" -H "X-SHARDING-KEY: second"
# "i am a second shard!"
curl "http://localhost:80" -H "X-SHARDING-KEY: third"
# "i am a third shard!"