Closed
Labels
bug: Something isn't working
Description
Describe the bug
Words flagged as misspelled are replaced with a token containing hashtags (for example, "nots" becomes "##ts").
To Reproduce
#Steps to reproduce the behavior:
>>> import spacy
>>> nlp = spacy.load('en_core_web_lg', disable=['tagger'])
>>> from contextualSpellCheck import ContextualSpellCheck
2020-10-14 10:24:16.775668: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
>>> merge_ents = nlp.create_pipe("merge_entities")
>>> nlp.add_pipe(merge_ents)
>>> spell_checker = ContextualSpellCheck(max_edit_dist=3)
>>> nlp.add_pipe(spell_checker)
>>> sent = 'Everyone has to help to fix the problems of society. There has to be more training, more opportunity to bridge the gap between the haves and the have nots.'
>>> doc = nlp(sent)
>>> correct = doc._.outcome_spellCheck
>>> correct
'Everyone has to help to fix the problems of society. There has to be more training, more opportunity to bridge the gap between the have and the have ##ts.'
Expected behavior
'Everyone has to help to fix the problems of society. There has to be more training, more opportunity to bridge the gap between the have and the have nots.'
or
'Everyone has to help to fix the problems of society. There has to be more training, more opportunity to bridge the gap between the have and the have not.'
Version:
- contextualSpellCheck 0.3.0
- spaCy 2.3.2
- transformers 3.3.1
Additional information
I checked vocab.txt and found entries that contain ##. I am wondering what these are needed for.
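For context, and assuming vocab.txt is a BERT-style WordPiece vocabulary: entries prefixed with ## are continuation pieces. They only occur attached to a preceding piece and are not standalone words, which would explain why "##ts" looks wrong as a replacement. One possible workaround, sketched below, is to skip such pieces when choosing a replacement. The helper `first_whole_word` and the candidate list are hypothetical and for illustration only; they are not part of contextualSpellCheck.

```python
# Hypothetical helper, NOT part of contextualSpellCheck's API.
# Assumes candidates come from a WordPiece vocabulary, where a leading
# "##" marks a piece that continues a previous piece (not a whole word).

def first_whole_word(candidates, fallback):
    """Return the first candidate that is not a '##' continuation piece.

    candidates: suggestions ordered from most to least likely.
    fallback:   value returned when every candidate is a continuation piece,
                e.g. the original (uncorrected) token.
    """
    for cand in candidates:
        if not cand.startswith("##"):
            return cand
    return fallback


# Made-up ranking for illustration; real rankings come from the model.
ranked = ["##ts", "nots", "not"]
print(first_whole_word(ranked, fallback="nots"))
```

Falling back to the original token keeps the sentence unchanged when no whole-word suggestion is available, which seems safer than emitting a partial piece.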