I'm trying to write a streaming zip compression tool: it gets data from a third-party service, transforms it, compresses it, appends the resulting bytes to a zip archive byte stream, and sends that to another service.
The CRC-32 is updated after every chunk.
To emulate the third-party service, I read a file chunk by chunk.
This version runs, but after extracting the archive I get an empty file. The entry is not actually empty: I can see the data in a hex editor. I suspect something is wrong with the CRC-32.
If I compress the whole file at once, it works just fine.
Here is my question:
Is it possible to compress a large amount of data in chunks with DeflateStream?
I need to be able to extract this data later with regular zip tools.
public async Task<byte[]> Compress(string fileName, IAsyncEnumerable<byte[]> data)
{
    var crc32Helper = new System.IO.Hashing.Crc32();
    var lfh = ZipTools.GetLocalFileHeaderEntry(fileName);
    testResult.AddRange(lfh);
    var bytearray = new List<byte>();
    await foreach (var chunk in data)
    {
        _originalSize += (ulong) chunk.Length;
        var compressedData = Compress(chunk);
        _compressedSize += (ulong) compressedData.Length;
        crc32Helper.Append(chunk);
        testResult.AddRange(compressedData);
    }
    _originalSize += (ulong) bytearray.Count;
    _crc32 = crc32Helper.GetCurrentHashAsUInt32();
    var cd = ZipTools.GetCentralDirectoryEntry(
        fileName,
        _crc32,
        (ulong) lfh.Length + _compressedSize,
        _compressedSize,
        _originalSize);
    testResult.AddRange(cd);
    return testResult.ToArray();
}

public byte[] Compress(byte[] data)
{
    using var input = new MemoryStream(data);
    using var resultStream = new MemoryStream();
    using (DeflateStream compressionStream = new DeflateStream(resultStream, CompressionMode.Compress))
    {
        input.CopyTo(compressionStream);
    }
    return resultStream.ToArray();
}
You can try to adapt the C code in zipflow to use in your C# application. zipflow takes in chunks of data and streams out a zip file, without ever having to keep either the entire input of any entry or the output zip file in memory or the file system.
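I don't know what "ZipTools" is, nor can I find it. However, writing multiple deflate streams to a single entry will not work. Each call to Compress(chunk) creates and closes its own DeflateStream, so the entry ends up holding several independent deflate streams back to back, and a decompressor stops at the end of the first one. The compressed data of an entry needs to be a single deflate stream. Also, there is no evidence of writing an end-of-central-directory record, which must follow the central directory.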
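To illustrate the single-deflate-stream point, here is a minimal sketch, not a drop-in fix. It assumes the data still arrives as an IAsyncEnumerable<byte[]> as in the question; the class and method names (SingleDeflate, CompressChunksAsync) are hypothetical and made up for this example. One DeflateStream stays open for the whole entry, every chunk is written into it, and the stream is closed only after the last chunk. The incremental Crc32.Append(chunk) call from the question can stay as it is and is left out here to keep the sketch dependency-free.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

public static class SingleDeflate
{
    // Hypothetical helper (not part of any library): feeds every chunk into
    // ONE DeflateStream, so the result is a single raw deflate stream.
    // Returns the compressed bytes and the total uncompressed size.
    public static async Task<(byte[] Compressed, ulong OriginalSize)> CompressChunksAsync(
        IAsyncEnumerable<byte[]> chunks)
    {
        ulong originalSize = 0;
        using var output = new MemoryStream();

        // leaveOpen: true keeps the MemoryStream usable after the
        // DeflateStream has been disposed.
        using (var deflate = new DeflateStream(output, CompressionMode.Compress, leaveOpen: true))
        {
            await foreach (var chunk in chunks)
            {
                originalSize += (ulong)chunk.Length;
                await deflate.WriteAsync(chunk, 0, chunk.Length);
            }
        } // Disposing here flushes the remaining data and ends the deflate stream.

        return (output.ToArray(), originalSize);
    }
}
```

The compressed size for the headers is then simply Compressed.Length. If the output must not be buffered in memory, the same pattern should work with the DeflateStream wrapped around the outgoing stream, but the sizes and CRC-32 are then only known after the data has been sent, so they would have to go in a data descriptor after the entry rather than in the local file header.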