Receive [Errno 111] Connection refused on load test for TCP server


I'm trying to set up a load test for a TCP server. The idea is for each 'client' to connect to the server, send a message, and close the connection (without waiting for a response).
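The per-client behavior described above can be sketched in a few lines of Python. This is only an illustration, not the script used in the test: the helper names `send_once` and `run_demo` and the throwaway local listener are assumptions added so the snippet runs on its own.

```python
import socket
import threading


def send_once(host: str, port: int, payload: bytes = b"test") -> None:
    """Connect, send one message, and close without reading a reply."""
    # create_connection resolves the host and performs the TCP handshake.
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


def run_demo() -> bytes:
    """Start a throwaway local listener and send a single message to it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen()
    port = listener.getsockname()[1]
    received = bytearray()

    def accept_one() -> None:
        conn, _ = listener.accept()
        with conn:
            # Read until the client closes its side (recv returns b"").
            while chunk := conn.recv(1024):
                received.extend(chunk)

    server = threading.Thread(target=accept_one)
    server.start()
    send_once("127.0.0.1", port)
    server.join(timeout=5)
    listener.close()
    return bytes(received)


if __name__ == "__main__":
    print(run_demo())
```

Against the real server, `send_once` would be called with the actual host and port in place of the local listener.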

This works fine for a single client. When I scale up with Locust, starting at around 1000 clients, I begin to receive [Errno 111] Connection refused, following the pattern in the graph below.

enter image description here
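For reference, `[Errno 111]` is `ECONNREFUSED` on Linux, which Python raises as `ConnectionRefusedError`. The sketch below shows one possible way to separate refused connections from other socket errors when counting failures. `classify_connect` and `free_closed_port` are hypothetical helpers, and the closed-port trick is there only to produce a refusal for demonstration.

```python
import socket


def classify_connect(host: str, port: int, timeout: float = 2.0) -> str:
    """Try to connect once and report what happened as a short label."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        # The peer answered the SYN with a reset (ECONNREFUSED).
        return "refused"
    except socket.timeout:
        # No answer arrived within the timeout.
        return "timeout"
    except OSError:
        # Anything else: unreachable network, name resolution failure, etc.
        return "other"


def free_closed_port() -> int:
    """Return a local port that was just released, so nothing listens on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]


if __name__ == "__main__":
    print(classify_connect("127.0.0.1", free_closed_port()))
```

Recording these labels separately in the load test would show whether every failure is a refusal or whether timeouts are mixed in.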

For anyone who wants to reproduce this, here is the Rust TCP server used for testing:

use chrono::Local;
use tokio::io::AsyncReadExt;
use tokio::net::TcpListener;

use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let addr = "0.0.0.0:5102".to_string();

    let listener = TcpListener::bind(&addr).await?;
    let now = Local::now();
    println!("{} -> Server started ...", now.format("%Y-%m-%d %T"));
    println!("Listening on: {}", addr);

    loop {
        let (mut socket, _) = listener.accept().await?;

        tokio::spawn(async move {
            let mut buf = vec![0; 1024];

            loop {
                let n = socket
                    .read(&mut buf)
                    .await
                    .expect("failed to read data from socket");

                if n == 0 {
                    return;
                }

                let now = Local::now();
                print!("{} -> ", now.format("%Y-%m-%d %T"));
                for c in buf[0..n].iter() {
                    print!("{:X}", c);
                }
                println!();
            }
        });
    }
}

The Cargo.toml file:

[package]
name = "tcp_server_rust"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
tokio = { version = "1", features = ["full"] }
chrono = "0.4.27"

The Locust script:

import time
import socket
from locust import User, task, between, events

ENV = "host.docker.internal:5102"

def success(rq_type, name, r_size, r_time):
    events.request.fire(
            request_type=rq_type,
            name=name,
            response_time=r_time,
            response_length=r_size,
            exception=None,
    )

def failure(rq_type, name, r_size, r_time, exception):
    events.request.fire(
            request_type=rq_type,
            name=name,
            response_time=r_time,
            response_length=r_size,
            exception=exception
    )

class TCP_locust():
    request_type = "TCP"
    def send_message(self) -> None:
        """Locust: sends only a login"""
        name = "Send Message"
        start_time = time.time()
        sock = socket.socket()
        err = ""
        r_size = 0
        [host, port] = ENV.split(":")

        try:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.connect((host, int(port)))
        except socket.error as exc:
            err = f"Create, con: {exc}"

        if err == "":
            try:
                sock.sendall(b"test")
            except socket.error as exc:
                err = f"Send: {exc}"

        sock.close()
        response_time = (time.time() - start_time) * 1000

        if not err:
            success(self.request_type, name, r_size, response_time)
        else:
            failure(self.request_type, name, r_size, response_time, err)

class TCPClient(User):
    abstract = True

    def __init__(self, environment):
        super().__init__(environment)
        self.client = TCP_locust()

class TCP_User(TCPClient):
    wait_time = between(1, 5)

    @task
    def micodus_send_position(self):
        self.client.send_message()

A Docker Compose file to bring up Locust with 10 workers (place it in the same folder as the Locust script):

version: '3'

services:
  master:
    image: locustio/locust
    ports:
      - "8089:8089"
      - "5557:5557"
    volumes:
      - ./:/mnt/locust
    command: -f /mnt/locust/locustfile.py --master --users 3000 --spawn-rate 100 --host "host.docker.internal:5102"

  worker:
    image: locustio/locust
    volumes:
      - ./:/mnt/locust
    command: -f /mnt/locust/locustfile.py --worker --master-host host.docker.internal
    network_mode: "host"

Use the following command to start it:

docker compose up -d --build --scale worker=10

I've already tested this against other TCP servers, written in C#, Rust, and Python, and saw the same behavior each time.

Initially I ran the tests with plain Python threads, without Locust, and the same thing happened. I switched to Locust to collect more information and have better control over the tests.
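A thread-based version of the test, without Locust, might look roughly like the sketch below. It is not the original script: `send_once`, `run_threads`, and the local `start_sink_server` helper are assumed names, and the sink server exists only so the example can be run in isolation.

```python
import socket
import threading
from collections import Counter


def send_once(host: str, port: int, payload: bytes = b"test") -> str:
    """Connect, send, close; return a short label describing the outcome."""
    try:
        with socket.create_connection((host, port), timeout=5) as sock:
            sock.sendall(payload)
        return "ok"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        return type(exc).__name__


def run_threads(host: str, port: int, clients: int) -> Counter:
    """Run one thread per simulated client and tally the outcomes."""
    results: Counter = Counter()
    lock = threading.Lock()

    def worker() -> None:
        outcome = send_once(host, port)
        with lock:
            results[outcome] += 1

    threads = [threading.Thread(target=worker) for _ in range(clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def start_sink_server() -> int:
    """Hypothetical local server that accepts connections and discards data."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(128)

    def serve() -> None:
        while True:
            conn, _ = listener.accept()
            try:
                with conn:
                    # Drain until the client closes its side.
                    while conn.recv(1024):
                        pass
            except OSError:
                pass

    threading.Thread(target=serve, daemon=True).start()
    return listener.getsockname()[1]


if __name__ == "__main__":
    port = start_sink_server()
    print(run_threads("127.0.0.1", port, clients=20))
```

Pointing `run_threads` at the real server and raising `clients` gives a Locust-free way to count how many connections are refused at each load level.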

PS: This server and script are not the ones I use in the real test; I isolated the problem to make it easier to analyze.
