zlib.error: Error -3 while decompressing data: incorrect data check #422

We have to deal with a huge number of broken PDF files. The creator is "jsPDF 1.x-master". These files are not totally corrupted, so it would be nice to get at the readable content.
I found a solution on [Stack Overflow](https://stackoverflow.com/questions/26794514/how-to-extract-data-from-corrupted-gzip-files-in-python) and it works fine for our needs.


**PyPDF2/filters.py**
```python
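import zlib
from io import BytesIO


def decompress(data):
    try:
        return zlib.decompress(data)
    except zlib.error:
        # Fall back to a lenient decoder that keeps whatever can be read.
        return decompress_corrupted(data)


def decompress_corrupted(data):
    # wbits = MAX_WBITS | 32 auto-detects a zlib or gzip header.
    d = zlib.decompressobj(zlib.MAX_WBITS | 32)
    f = BytesIO(data)  # BytesIO rather than StringIO: the stream is bytes
    result_str = b''
    buffer = f.read(1)
    try:
        # Feed one byte at a time so output produced before the error is kept.
        while buffer:
            result_str += d.decompress(buffer)
            buffer = f.read(1)
    except zlib.error:
        pass
    return result_str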
```
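
To illustrate why the fallback helps, here is a minimal, self-contained sketch. It does not use one of our real jsPDF files: the content stream `original` is made up, and the corruption is simulated by flipping the last byte of the Adler-32 trailer, which I assume is close to what happens in our files since the error is "incorrect data check". The names `original`, `broken`, `strict_failed` and `recovered` are only for this illustration.

```python
import io
import zlib


def decompress_corrupted(data):
    # Same idea as the patch above: decode byte by byte, keep what was decoded.
    d = zlib.decompressobj(zlib.MAX_WBITS | 32)
    f = io.BytesIO(data)
    result = b""
    buffer = f.read(1)
    try:
        while buffer:
            result += d.decompress(buffer)
            buffer = f.read(1)
    except zlib.error:
        pass
    return result


# Made-up PDF-like content stream, compressed the way a Flate filter would.
original = b"BT /F1 12 Tf (Hello, PDF) Tj ET\n" * 20
stream = bytearray(zlib.compress(original))

# Simulate the damage: flip the last byte of the 4-byte Adler-32 checksum.
stream[-1] ^= 0xFF
broken = bytes(stream)

# The strict one-shot call rejects the whole stream.
try:
    zlib.decompress(broken)
    strict_failed = False
except zlib.error:
    strict_failed = True

# The lenient decoder returns the data decoded before the checksum failed.
recovered = decompress_corrupted(broken)
print(strict_failed, recovered == original)
```

In this simulated case only the checksum is wrong, so the whole payload is recovered. If the compressed data itself is damaged, the result is only the part decoded before the error, and it may be truncated.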
