Contributor

bertsky commented Sep 19, 2019

  • add another constructor for LSTMRecognizer
    which takes the language_data_path_prefix configured/selected
    at runtime and passes it to the internal CCUtil
  • use this in Tesseract::init_tesseract_lang_data when LSTMs
    are available

(this was missing from 297d7d8)
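The two points above can be sketched roughly as follows. This is a minimal, self-contained illustration using simplified stand-ins, not the actual Tesseract classes: `CCUtil` and `LSTMRecognizer` here are reduced to the one member that matters, and `MakeLstmRecognizer` and `prefix()` are hypothetical names standing in for the relevant part of `Tesseract::init_tesseract_lang_data`.

```cpp
#include <memory>
#include <string>

// Simplified stand-in for Tesseract's CCUtil (assumption: only the
// member relevant to this change is modelled).
struct CCUtil {
  std::string language_data_path_prefix;
};

// Simplified stand-in for LSTMRecognizer, which owns its own CCUtil.
class LSTMRecognizer {
 public:
  LSTMRecognizer() = default;

  // The additional constructor: takes the prefix configured/selected at
  // runtime and passes it to the internal CCUtil.
  explicit LSTMRecognizer(const std::string& language_data_path_prefix) {
    ccutil_.language_data_path_prefix = language_data_path_prefix;
  }

  // Hypothetical accessor, added only so the sketch can be checked.
  const std::string& prefix() const {
    return ccutil_.language_data_path_prefix;
  }

 private:
  CCUtil ccutil_;
};

// Hypothetical helper mirroring the second bullet: when LSTMs are
// available, build the recognizer with the outer object's prefix.
std::unique_ptr<LSTMRecognizer> MakeLstmRecognizer(const CCUtil& outer,
                                                   bool lstm_available) {
  if (!lstm_available) return nullptr;
  return std::unique_ptr<LSTMRecognizer>(
      new LSTMRecognizer(outer.language_data_path_prefix));
}
```

Without the extra constructor, the recognizer's internal `CCUtil` in this sketch would keep an empty prefix regardless of what the caller configured.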

Contributor Author

bertsky commented Sep 19, 2019

Fixes #2584

Contributor Author

bertsky commented Sep 19, 2019

But maybe someone has a better idea of how to make this work with LSTMs?

(The story is the same as for the configuration variables in #2328 – we have to get the prefix into the shallow CCUtil / Dict inside LSTMRecognizer. For the config variables I decided to pass the existing ParamsVectors via LSTMRecognizer::LoadDictionary(), resetting the defaults only for the specific variables to be shared, via Param::ResetFrom(). But here we are dealing with the combined runtime settings for tessdata prefix and language. Alternatively, we could use TessdataManager for this, maybe by giving it an additional member language_data_path_prefix_ to be set in the constructor. I even thought about using its existing member data_file_name_ and just truncating the .traineddata suffix, but I guess this would be bound to fail under certain conditions that I am unable to identify.)
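For illustration, the suffix-truncation idea mentioned last could look something like the sketch below. `PrefixFromDataFileName` is a hypothetical helper, not part of Tesseract, and the sketch assumes the file name is an ordinary path ending in `.traineddata`. It only guards against the most obvious mismatch (a name without that suffix) and does not identify the failure conditions alluded to above; whether the real prefix keeps a trailing dot is likewise left open here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: derive a prefix by cutting ".traineddata" off a
// data file name. Returns false (leaving *prefix untouched) when the
// name does not end in the suffix or consists of the suffix alone.
bool PrefixFromDataFileName(const std::string& data_file_name,
                            std::string* prefix) {
  static const std::string kSuffix = ".traineddata";
  if (data_file_name.size() <= kSuffix.size()) return false;
  const std::size_t stem_len = data_file_name.size() - kSuffix.size();
  if (data_file_name.compare(stem_len, kSuffix.size(), kSuffix) != 0) {
    return false;
  }
  *prefix = data_file_name.substr(0, stem_len);
  return true;
}
```

A caller would have to decide what to do when the helper returns false, which is one reason an explicit `language_data_path_prefix_` member may be the more robust option.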

zdenop merged commit 39a63c2 into tesseract-ocr:master on Sep 20, 2019
Contributor

zdenop commented Sep 20, 2019

thanks.
