We have the following MQTT setup in production:
- 5 EMQX brokers (version 3.x)
- AWS load balancer to distribute load across the MQTT brokers (and HAProxy in some environments)
- Paho MQTT Python client (version 1.1)
We are noticing an issue where messages are frequently dropped (around 1 or 2 in every 100 messages).
The MQTT connect configuration is as follows:
client_id = "<random_int_from_1_to_100>_<current_hostname>"
clean_session = False
keep alive timeout = 60
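For reference, the configuration above corresponds roughly to the sketch below. It assumes the Paho 1.x constructor signature; the helper names `build_client_id` and `connect`, the `broker_host`/`broker_port` parameters, and the use of `loop_start()` are illustrative assumptions, not our exact production code.

```python
import random
import socket


def build_client_id():
    """Client id in the format described above: <random int 1..100>_<hostname>."""
    return f"{random.randint(1, 100)}_{socket.gethostname()}"


def connect(broker_host, broker_port=1883):
    """Hypothetical helper: connect with the settings listed above (Paho 1.x API)."""
    # Imported lazily so build_client_id() can be used without paho installed.
    import paho.mqtt.client as mqtt

    client = mqtt.Client(client_id=build_client_id(), clean_session=False)
    client.connect(broker_host, broker_port, keepalive=60)
    # Assumption: a background network loop is running; without one,
    # QoS handshakes and keepalive pings are never processed.
    client.loop_start()
    return client
```

Note that two processes started on the same host can draw the same random integer, in which case they would share a client id.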
How are the messages published?
We have X Celery workers publishing to the same topic in parallel, with a message rate of at most 10/s. The client id is unique for each Celery worker because it includes the hostname.
For the messages that are dropped or missed, the Paho MQTT library returns 0 on publish, indicating that the message was published successfully.
Sample code for publish:
(res, mid) = self.conn.publish(topic=topic, payload=payload, qos=qos)
if res == 0:  # 0 is MQTT_ERR_SUCCESS
    log.debug(f"Successfully published message::{res} with id {mid} for payload::{payload}",
              client_id=self.client_id)
else:
    log.info(f"Error publishing message::{res} with id {mid} for payload::{payload}",
             client_id=self.client_id)
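A return code of 0 from `publish()` only says that the client accepted the message for sending, so one way to narrow down where messages go missing could be to track which message ids are later confirmed through Paho's `on_publish` callback (for QoS 1 and 2 it fires after the handshake completes; for QoS 0 it fires once the message has left the client). The sketch below is a minimal illustration, assuming the Paho 1.x callback signature `on_publish(client, userdata, mid)`; the `PublishTracker` class is hypothetical and not part of Paho.

```python
import threading


class PublishTracker:
    """Hypothetical helper: remembers message ids until on_publish confirms them."""

    def __init__(self, client):
        # RLock because Paho may invoke on_publish from inside publish()
        # on the same thread in some situations.
        self._lock = threading.RLock()
        self._pending = {}    # mid -> payload, not yet confirmed
        self._early = set()   # mids confirmed before publish() returned
        self._client = client
        client.on_publish = self._on_publish

    def _on_publish(self, client, userdata, mid):
        with self._lock:
            if mid in self._pending:
                del self._pending[mid]
            else:
                self._early.add(mid)

    def publish(self, topic, payload, qos=1):
        with self._lock:
            res, mid = self._client.publish(topic=topic, payload=payload, qos=qos)
            if res == 0 and mid not in self._early:
                self._pending[mid] = payload
            self._early.discard(mid)
        return res, mid

    def unconfirmed(self):
        """Return a copy of the messages that have not been confirmed yet."""
        with self._lock:
            return dict(self._pending)
```

Logging `unconfirmed()` periodically (for example, entries older than a few seconds) would show whether the missing messages ever completed their handshake on the client side, which helps separate a client-side problem from a load balancer or broker one.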
However, there are no logs in EMQX (even with debug logging enabled) for the messages that were dropped. This happens only in production, where multiple clients publish to the same topic; with a single client we have not noticed the issue.
Is there a problem with the configuration above, or would upgrading to a newer version of the library help fix the issue? Or could this be something specific to the EMQX broker?
- Python version: 3.6.9
- Library version: 1.1
- Operating system (including version): Linux
- MQTT server (name, version, configuration, hosting details): EMQX