Introduction#
fly.io is a modern application delivery platform that provides load balancing and automatic scaling to keep applications reliable and performant.
According to the fly.io load balancing documentation, fly.io supports automatic scaling based on either request volume or TCP connection count. My friend's API was attacked recently and the server crashed. Because his APIs are consumed outside of browsers, for example by applications and scripts, the usual defense strategies are difficult to implement. I therefore recommended that he migrate to fly.io.
Configuration#
First, package the program into a Docker image and deploy it on fly.io. I used the 2H512M configuration (2 vCPUs, 512 MB RAM) and created 11 instances.
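Then, configure the maximum connection count for each instance using the [services.concurrency] section of fly.toml. Note that fly.io's configuration reference names this table [http_service.concurrency] when the service is declared with [http_service], as it is here. Here is an example configuration: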
# fly.toml app configuration file generated for cheat-show-backend on 2023-09-15T18:53:53-05:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#

app = "XXXXXXXXXXXX"
primary_region = "lax"
swap_size_mb = 1024

[http_service]
internal_port = 8080
force_https = false
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 0
processes = ["app"]

[services.concurrency]
type = "connections"
hard_limit = 10000
soft_limit = 1200

[build]
dockerfile = "Dockerfile"
soft_limit is the soft limit. fly.io tracks the connection count of each instance against this value. When a single instance exceeds it, additional instances are started, which is what enables automatic scaling. Testing with wrk showed that our deployed service can handle up to 2k concurrent connections per instance. To leave enough time for scaling, I set the limit to 1200. While the connection count stays below this value, fly.io starts only one machine. Once it exceeds this value, more machines are started based on the connection count.
hard_limit is the hard limit. If a single instance exceeds this value, a 503 error is returned.
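The relationship between the two limits and the instance count can be sketched with a little arithmetic. This is a simplified model, not fly.io's actual proxy algorithm: it assumes connections are spread evenly across instances, and the function names `instances_needed` and `rejects_connections` are hypothetical. The constants come from the configuration and the wrk measurement above.

```python
import math

# Values taken from the fly.toml and the wrk test described above.
SOFT_LIMIT = 1200          # connections per instance before scaling out
HARD_LIMIT = 10000         # connections per instance before 503 errors
MAX_INSTANCES = 11         # instances created for the app
MEASURED_CAPACITY = 2000   # per-instance capacity measured with wrk


def instances_needed(total_connections: int) -> int:
    """Simplified model: enough instances to keep each at or below SOFT_LIMIT."""
    if total_connections <= 0:
        # With min_machines_running = 0, idle machines may all be stopped.
        return 0
    return min(MAX_INSTANCES, math.ceil(total_connections / SOFT_LIMIT))


def rejects_connections(total_connections: int) -> bool:
    """True when even MAX_INSTANCES, evenly loaded, would exceed HARD_LIMIT."""
    return total_connections > MAX_INSTANCES * HARD_LIMIT


if __name__ == "__main__":
    for load in (500, 1200, 1201, 5000, 20000):
        print(load, "connections ->", instances_needed(load), "instance(s)")
    # Gap between the soft limit and the measured capacity, i.e. the
    # connections an instance can still absorb while new ones start.
    print("headroom per instance:", MEASURED_CAPACITY - SOFT_LIMIT)
```

Under this model, the gap between the 1200 soft limit and the measured 2k capacity is the buffer that absorbs traffic while a new machine boots.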
Load Testing#
Default State#
Increased Connection Count (wrk 5000 threads)#
Under normal circumstances, only one instance runs. Billing is per minute, and one instance costs $3.8 per month. When the connection count drops, the app automatically scales back down to one instance. Stopped instances do not incur charges.
About Some Pitfalls#
fly.io's load balancing routes traffic by proximity to the client. Suppose I have one instance in New York and one in Los Angeles, and I want the Los Angeles instance running by default with New York as a backup. With direct access, fly.io starts instances based on proximity, so a request from a user in New York starts the New York instance even if the connection count is only 1. To avoid this, a server in Los Angeles has to reverse proxy the fly.io app. I therefore deployed 3 nginx reverse proxy instances in fly.io's Los Angeles region, each on the 1h256m configuration (1 vCPU, 256 MB RAM), for load balancing. After optimization, each nginx instance can handle over 10k concurrent connections.
Update: It seems that the issue with automatic scaling in the same region has been fixed. Previously, in the same region, all machines would be started by default for load balancing, and scaling only worked for cross-region scenarios. This pitfall no longer exists, and it currently works for the same region as well.
Optimized Dockerfile
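FROM nginx
# Pin the worker count instead of relying on CPU auto-detection
RUN sed -i 's/worker_processes auto;/worker_processes 8;/' /etc/nginx/nginx.conf
# Raise the per-worker connection limit (the effective cap is still bounded by the open-file limit)
RUN sed -i 's/worker_connections [0-9]*;/worker_connections 9999999;/' /etc/nginx/nginx.conf
# Add the reverse proxy server block shown below
COPY nginx.conf /etc/nginx/conf.d/nginx.conf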
nginx.conf
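server {
    # Listen on port 8080 over IPv4 and IPv6
    listen 8080 default_server;
    listen [::]:8080 default_server;
    server_name api.test.com;

    # Keep client connections open for reuse
    keepalive_timeout 75s;
    keepalive_requests 100;

    location / {
        # "ip" is a placeholder for the backend address
        proxy_pass http://ip:80;
        proxy_set_header Host $host;
        # For WebSocket traffic, nginx also needs a Connection header,
        # e.g. proxy_set_header Connection "upgrade";
        proxy_set_header Upgrade $http_upgrade;
        proxy_http_version 1.1;
    }
}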
This allows for low-cost automatic scaling of services.
Cost Issue#
The 3 instances used to deploy nginx are free. The backend usually runs as a single instance, which costs $3.8 per month. Instances that are not started do not incur charges.
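To get a feel for what a traffic spike adds to the bill, the per-minute billing can be worked through as below. This is a rough sketch under stated assumptions: a 30-day month, a price that scales linearly per minute from the $3.8 monthly figure, and a hypothetical spike of 4 extra instances for 2 hours. The function name `running_cost` is made up for illustration, and fly.io's actual pricing may differ.

```python
MONTHLY_PRICE_USD = 3.8            # price of one backend instance, from the post
MINUTES_PER_MONTH = 30 * 24 * 60   # assumption: a 30-day month


def running_cost(minutes_running: float, instances: int = 1) -> float:
    """Cost in USD for `instances` machines running `minutes_running` minutes each."""
    per_minute = MONTHLY_PRICE_USD / MINUTES_PER_MONTH
    return per_minute * minutes_running * instances


if __name__ == "__main__":
    # One instance that never stops during the month.
    baseline = running_cost(MINUTES_PER_MONTH)
    # Hypothetical spike: 4 extra instances running for 2 hours.
    burst = running_cost(120, instances=4)
    print(f"baseline: ${baseline:.2f} per month, burst: ${burst:.4f}")
```

Under these assumptions, short bursts of extra instances cost only a small fraction of the always-on baseline, since stopped instances are not billed.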