JPEG conversion in analyze_document significantly impacts table predictions #341

@Belval

Description

When obtaining predictions through analyze_document, the image is converted to JPEG (see https://github.com/aws-samples/amazon-textract-textractor/blob/master/textractor/textractor.py#L845). The lossy compression is enough to degrade the table predictions.

We should check the format of the input image and keep it whenever Textract supports it, to avoid discrepancies between calling Textract through Textractor and calling Textract directly with boto3.
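
As a rough illustration of the "check and keep the format" idea, here is a minimal sketch that inspects the leading bytes of a file and decides whether it could be sent as-is, without re-encoding. The helper names (`detect_format`, `can_send_unchanged`) are hypothetical and not part of Textractor, and the set of supported formats is an assumption that should be verified against the Textract documentation.

```python
from typing import Optional

# Assumption: formats accepted by Textract for analyze_document.
# Verify this set against the AWS documentation before relying on it.
SUPPORTED_FORMATS = {"PNG", "JPEG", "TIFF", "PDF"}

# Well-known file signatures (magic bytes) for each format.
_MAGIC_PREFIXES = (
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),  # little-endian TIFF
    (b"MM\x00*", "TIFF"),  # big-endian TIFF
    (b"%PDF-", "PDF"),
)


def detect_format(data: bytes) -> Optional[str]:
    """Return the format name guessed from the file signature, or None."""
    for prefix, name in _MAGIC_PREFIXES:
        if data.startswith(prefix):
            return name
    return None


def can_send_unchanged(data: bytes) -> bool:
    """True if the bytes are already in a format assumed to be supported,
    so they can be passed through without a lossy JPEG conversion."""
    return detect_format(data) in SUPPORTED_FORMATS
```

With a check like this, only inputs in an unsupported format would need conversion, and a lossless target such as PNG would avoid introducing compression artifacts.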

Labels: bug