Hello, I get an error when trying to OCR this PDF document: https://www.dropbox.com/s/ko76kalp5p59hwc/contrato%20de%20fianza%20prueba%2010.pdf?dl=0
The code that fails is:
Docsplit.extract_text(attachment.path, :output => output_dir, :language => 'spa')
I have tried using:
- Docsplit.extract_text(attachment.path, :output => output_dir, :language => 'spa', :no_clean => true)
- Docsplit.extract_text(attachment.path, :output => output_dir, :language => 'spa', :no_clean => false)
- Docsplit.extract_text(attachment.path, :output => output_dir, :no_clean => true)
- Docsplit.extract_text(attachment.path, :output => output_dir, :no_clean => false)
but none of the above helps; the extraction still fails. Many other PDF documents work fine.
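
To make the failure easier to report, the variants above could be run in a loop that records the exception class and message for each option set. This is only a minimal sketch: `try_option_sets` is a hypothetical helper (not part of Docsplit), and it rescues any `StandardError` rather than assuming a specific exception type.

```ruby
# Hypothetical helper, not part of Docsplit: runs the given block once per
# option set and records whether it raised, so the exact error text can be
# attached to a bug report.
def try_option_sets(path, option_sets)
  option_sets.map do |opts|
    begin
      yield(path, opts)
      { :options => opts, :ok => true, :error => nil }
    rescue StandardError => e
      # Keep the class name as well as the message; both help diagnosis.
      { :options => opts, :ok => false, :error => "#{e.class}: #{e.message}" }
    end
  end
end

# Usage with Docsplit (assumes the docsplit gem and its dependencies are
# installed; attachment and output_dir come from the surrounding app):
#
#   require 'docsplit'
#   variants = [
#     { :language => 'spa' },
#     { :language => 'spa', :no_clean => true },
#     { :no_clean => true }
#   ]
#   results = try_option_sets(attachment.path, variants) do |path, opts|
#     Docsplit.extract_text(path, opts.merge(:output => output_dir))
#   end
#   results.each { |r| puts r.inspect }
```

Passing the extraction call as a block keeps the helper independent of Docsplit, so it can also be pointed at a direct tesseract invocation for comparison.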
My environment:
Rails 4.2
Ruby 2.2
Docsplit 0.7.6
tesseract-ocr 3.03
tesseract-ocr-spa 3.02
Any help please?