Docker (compose) networking


I have a setup which, without any configuration change, sometimes works and sometimes does not. I would welcome any help understanding why (and getting it to work 100% of the time).

Setup

Platform:

  • Windows 10
  • WSL2, ubuntu 21.04
  • docker compose 1.29.2
  • docker engine v20.10.7
  • docker desktop (WSL2 backend) 3.4.1 (65384)

A somewhat older combination of docker and docker desktop versions showed the same behavior.

Cluster setup:

  • docker-compose runs under WSL2 and talks to the docker engine on Windows
  • my docker compose file starts 5 services:
    • kudu master (an Apache distributed database, this is the master)
    • 3 kudu tservers (the workers of the database)
    • 1 Apache impala (sql interface to the database)
  • an app (outside docker, spark/scala in my case) first talks to the master, receives the IPs of all the workers, and then speaks to them directly
  • the components of the cluster also need to speak to each other, each from within its own docker container

Run/docker-compose.yaml

Each service will advertise its IP as being the host IP (ifconfig | grep "inet " | grep -Fv 127.0.0.1 | awk '{print $2}' | tail -1). They all listen on different ports.
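For reference, here is a minimal sketch of how that pipeline can feed the KUDU_QUICKSTART_IP variable used in the compose file below. The function name pick_host_ip is only an illustrative helper, and the sketch assumes the newer ifconfig output format ("inet 1.2.3.4", not "inet addr:1.2.3.4"). Because it keeps the last non-loopback address, the result depends on which interfaces exist when it runs.

```shell
# Hypothetical helper: reads ifconfig-style output on stdin and prints the
# last non-loopback IPv4 address (same pipeline as in the question).
pick_host_ip() {
  grep "inet " | grep -Fv 127.0.0.1 | awk '{print $2}' | tail -1
}

# Export the variable consumed by docker-compose.yaml, if ifconfig is available.
if command -v ifconfig >/dev/null 2>&1; then
  KUDU_QUICKSTART_IP="$(ifconfig | pick_host_ip)"
  export KUDU_QUICKSTART_IP
fi
```

Printing the value (echo "$KUDU_QUICKSTART_IP") before each docker-compose up would show whether the advertised address changes between a working and a failing run.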

If I give host.docker.internal as the advertised address, it does not work either (the address is eventually used outside docker, where that name is not valid).

I do not set up any specific networking configuration.

A subset of my docker-compose.yaml, trimmed to keep the question short:

services:
  master:
    image: apache/kudu:latest
    ports:
      - "7051:7051" # RPC interface
      - "8051:8051" # Web interface
    command: ["master"]
  tserver-1:
    image: apache/kudu:latest
    depends_on:
      - master
    ports:
      - "7050:7050" # RPC interface
      - "8050:8050" # Web interface
    command: ["tserver"]
    environment:
      - KUDU_MASTERS=${KUDU_QUICKSTART_IP}:7051
      - >
        TSERVER_ARGS=
        --rpc_bind_addresses=0.0.0.0:7050
        --rpc_advertised_addresses=${KUDU_QUICKSTART_IP}:7050

It is heavily based on the example from kudu itself: https://github.com/apache/kudu/blob/master/docker/quickstart.yml

Problem

  • sometimes, it works completely.
  • sometimes, I can access kudu via impala, which I believe only talks to the master (the master itself then talks to the workers), but I cannot access it via my app, which tries to speak to the workers directly.
  • sometimes, even impala will not connect.

All without changing docker-compose.yaml! I make sure to destroy all volumes every time as well, just in case.

If I change all rpc_advertised_addresses to the service name (which is thus a DNS name as well), I end up with the same error as in the other cases, just a bit further down the line.

Logs

When something goes wrong, the logs inside the containers show an error along the lines of:

Timed out: Client connection negotiation failed: client connection to 172.17.188.205:7150: Timeout exceeded waiting to connect

This means that the containers themselves cannot talk to each other.

Here 172.17.188.205 is my host IP. If I use service names as rpc_advertised_addresses, I see the service IP instead.
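To narrow down where the timeout comes from, one option is to probe the advertised address and port directly. The following is a rough sketch, not something from the kudu tooling: can_connect is a hypothetical helper, and it assumes bash and the coreutils timeout command are available wherever it runs (I have not verified that for the apache/kudu image).

```shell
# Hypothetical helper: succeeds (exit 0) if a TCP connection to host:port
# can be opened within 3 seconds, fails otherwise.
can_connect() {
  local host="$1" port="$2"
  # bash's /dev/tcp pseudo-device opens a TCP socket; timeout bounds the wait.
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example use (not run here): from WSL2, or pasted into a shell opened with
#   docker-compose exec master bash
# then compare the results from both places:
#   can_connect 172.17.188.205 7150 && echo reachable || echo unreachable
```

If the probe succeeds from WSL2 but fails from inside a container (or the other way around), that would show which side cannot route to the advertised address.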

The problem

I'm not the only one with this issue: a teammate sees the same behavior. It feels like I'm not understanding the networking correctly.
