I'm seeing a scalability issue when writing from multiple threads to one or more channels in Netty 4.0.28. I could not find anything on SO about this; the questions I found are all of the "multithreading improves throughput" kind, and my problem is the opposite.
In my tests, writing to a single channel is faster than writing to multiple channels, and I have no idea why, especially when waiting for each write to complete.
The results I get (median writeAndFlush throughput, after warm-up, over multiple test runs):
- Single connection: 6.6M ops/sec without waiting, about 60K ops/sec with await
- 4 connections: 5.1M ops/sec without waiting, about 47K ops/sec with await
- With Oio, throughput increased with more connections; with Nio, it decreased
 
My setup:
- Nio sockets with a shared NioEventLoopGroup
- Between 2 and 4 writing threads (event loop group threads = 2 * writing threads)
- Small messages, flushed after every write
- For the test: no reads, and connections are never closed
 
My expectation:
- Writing on multiple connections is faster than writing on a single connection (not necessarily linear, but should be still better than on one connection)
 
Bootstrap:
Bootstrap b = new Bootstrap();
b.group(eventLoopGroup);
b.channel(NioSocketChannel.class); 
b.handler(new ChannelInitializer<NioSocketChannel>() {
    @Override
    public void initChannel(NioSocketChannel ch) throws Exception {
        ch.pipeline().addLast(new StringEncoder());
    }
});
// Start the client.
ChannelFuture f = b.connect(socketAddress).await();
channel = f.channel();
Write:
// non-blocking
channel.writeAndFlush(message, channel.voidPromise());
or
// blocking
channel.writeAndFlush(message).await();
The full code is available at https://gist.github.com/mp911de/7a3d7206931c8c4dc581
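
For readers who do not want to open the gist, the measurement has roughly the shape below. This is a simplified, stdlib-only sketch and not the gist code: `ThroughputSketch`, `WriteOp`, `measure` and `opsPerSecond` are hypothetical names, and the write is a stand-in for `channel.writeAndFlush(...)` (with or without `.await()`), so the numbers it prints say nothing about Netty itself.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

public class ThroughputSketch {

    /** Hypothetical stand-in for a write such as channel.writeAndFlush(message). */
    interface WriteOp {
        void write(int connectionIndex) throws Exception;
    }

    /** Converts a number of completed operations and an elapsed time into ops/sec. */
    static double opsPerSecond(long ops, long elapsedNanos) {
        return ops * 1_000_000_000.0 / Math.max(1L, elapsedNanos);
    }

    /**
     * Starts `threads` writer threads that each perform `opsPerThread` writes.
     * Threads are assigned to connections round-robin, so connections = 1 means
     * every thread writes to the same target.
     */
    static double measure(int threads, int connections, int opsPerThread, WriteOp op) {
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        AtomicLong completed = new AtomicLong();

        for (int t = 0; t < threads; t++) {
            final int connection = t % connections;
            new Thread(() -> {
                long local = 0; // counted per thread to avoid contention on the shared counter
                try {
                    start.await(); // release all writers at the same moment
                    for (int i = 0; i < opsPerThread; i++) {
                        op.write(connection);
                        local++;
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                } finally {
                    completed.addAndGet(local);
                    done.countDown();
                }
            }).start();
        }

        long begin = System.nanoTime();
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return opsPerSecond(completed.get(), System.nanoTime() - begin);
    }

    public static void main(String[] args) {
        // Fake "connections": one counter each, incremented instead of a network write.
        AtomicLong[] targets = {
            new AtomicLong(), new AtomicLong(), new AtomicLong(), new AtomicLong()
        };
        WriteOp fakeWrite = idx -> targets[idx].incrementAndGet();

        System.out.printf("1 connection:  %.0f ops/sec%n", measure(4, 1, 100_000, fakeWrite));
        System.out.printf("4 connections: %.0f ops/sec%n", measure(4, 4, 100_000, fakeWrite));
    }
}
```

In the real test the lambda body would be the non-blocking or the blocking write shown above, and the run would be repeated after a warm-up phase so that the median can be taken.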