How to create an improperly closed gzip file using python?

Question

How to create an improperly closed gzip file using python?

47 views Asked by notacorn At 29 March 2024 at 05:42

I have an application that occasionally needs to be able to read improperly closed gzip files. The files behave like this:

>>> import gzip
>>> f = gzip.open("path/to/file.gz", 'rb')
>>> f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/gzip.py", line 292, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.8/gzip.py", line 498, in read
    raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached

I wrote a function to handle this by reading the file line by line and catching the EOFError, and now I want to test it.

The input to my test should be a gz file that behaves in the same way as demonstrated. How do I make this happen in a controlled testing environment?

I really strongly prefer not making a copy of the improperly closed files that I get in production.

Original Q&A

There are 2 answers

**Amadan** · Answer 1 · 2024-03-29T06:00:01+00:00

Very simple: do the compression, then snip the result.

import gzip
plain = b"Stuff"
compressed = gzip.compress(plain)
bad_compressed = compressed[:-1]

gzip.decompress(bad_compressed)     # EOFError

Even easier, just two bytes is enough for the gzip module to recognise the gzip format, but is obviously not a complete compressed file.

bad_compressed = b'\x1f\x8b'
gzip.decompress(bad_compressed)     # EOFError

This is in-memory for the simplicity of demonstration; it would work the same if you manipulated the file instead of the string. For example:

echo Stuff | gzip | head -c 2 >file.gz

**notacorn** · Answer 2 · 2024-03-29T06:16:40+00:00

I really don't want to knock the answer as it technically answers my question as stated in the best possible way.

What I did also manage to do was create a mock of a gzip.GzipFile that behaves exactly as I expect.

% python3 -m pytest test.py
============================================================= test session starts =============================================================
platform darwin -- Python 3.12.2, pytest-8.1.1, pluggy-1.4.0
rootdir: /private/tmp
collected 1 item                                                                                                                              

test.py .                                                                                                                               [100%]

============================================================== 1 passed in 0.03s ==============================================================
% cat test.py
from unittest import mock
import gzip
import pytest

def test_foo():
    f = mock.Mock(gzip.GzipFile)
    f.readline = mock.Mock(side_effect=["hello", EOFError()])
    assert f.readline() == "hello"
    with pytest.raises(EOFError):
        f.readline()

I think for unit testing purposes this might be the cleaner solution as opposed to actually creating the file and reading it, as I can just mock the open function to return my mocked file.

TechQA.

How to create an improperly closed gzip file using python?

There are 2 answers

Related Questions in PYTHON

Related Questions in PYTHON-3.X

Related Questions in UNIT-TESTING

Related Questions in PYTEST

Related Questions in GZIP

Popular Questions

Trending Questions