I am trying to use a stream pipeline in Node, but I'm running into a problem where I can't accurately get the number of chunks in a stream when I include a gzip transform.
// Imports and setup (omitted from my snippet): I promisify stream.pipeline as pipelineAsync.
const fs = require('node:fs');
const path = require('node:path');
const zlib = require('node:zlib');
const { PassThrough, pipeline } = require('node:stream');
const { promisify } = require('node:util');
const pipelineAsync = promisify(pipeline);

const readStream = fs.createReadStream(group.selectFile, { highWaterMark: group.BackpressureBufferSize });
const writeStream = fs.createWriteStream(path.join(path.dirname(group.selectFile), 'encrypted_' + path.basename(group.selectFile)), { highWaterMark: 64 * 1024 });
const gzip = zlib.createGzip();
// Used only because I can't get a 'data' event on writeStream, so I add a
// PassThrough which doesn't manipulate the stream, but captures each chunk and its length.
const progress = new PassThrough();

const totalSize = fs.statSync(group.selectFile).size; // total file size in bytes
const totalChunks = Math.ceil(totalSize / group.BackpressureBufferSize);
let totalBytes = 0;
let currentChunk = 0;
bar.start(totalSize, 0); // `bar` is my progress bar instance, created elsewhere
progress.on('data', (chunk) => {
  totalBytes += chunk.length;
  currentChunk++;
  bar.update(totalBytes, { currentChunk, totalChunks });
});

// `cipher` is the encryption transform I create elsewhere.
const streams = [readStream, gzip, cipher, progress, writeStream];
await pipelineAsync(...streams);
Everything works fine if I don't include the compression stream, since the number of chunks is known by dividing the total size of the file by the backpressure buffer size. However, when I include the gzip transform, the chunk size varies between 16k and 8k. I thought I could set it the way I did with the read and write streams, so that each stage of the pipe has the same buffer size.
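This is roughly what I had in mind, though I'm not sure whether createGzip actually honors these options (which is part of my question below; chunkSize is documented as defaulting to 16 * 1024, and highWaterMark would only help if zlib forwards it to the underlying Transform):

const gzip = zlib.createGzip({
  chunkSize: group.BackpressureBufferSize,     // zlib's internal output buffer size
  highWaterMark: group.BackpressureBufferSize, // assuming this is passed through to the Transform
});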
My questions are:
- Can I control the highWaterMark buffer size for gzip? It seems to default to chunks of 16k and 8k.
- If I am piping a read stream with a much higher water mark of 64k, does the next stream (the gzip transform) slow down due to backpressure because it only accepts 16k at a time?
- Is there any way to get the total number of chunks when compression is included?
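One workaround I've been considering is to count chunks on the uncompressed side instead, by moving the PassThrough in front of gzip; there the chunk count should still match Math.ceil(totalSize / group.BackpressureBufferSize), since the PassThrough re-emits the read stream's chunks unchanged:

const preGzipProgress = new PassThrough({ highWaterMark: group.BackpressureBufferSize });
preGzipProgress.on('data', (chunk) => {
  totalBytes += chunk.length; // counts uncompressed bytes, not bytes written to disk
  currentChunk++;
  bar.update(totalBytes, { currentChunk, totalChunks });
});
await pipelineAsync(readStream, preGzipProgress, gzip, cipher, writeStream);

But I'd rather report progress on what actually gets written to disk, hence the questions above.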