[TokenizerSlow] replace_additional_special_tokens is not doing much #24276

Closed
Description

@ArthurZucker

Just flagging this because the add_special_tokens method has gotten pretty complicated: it now takes a kwarg, replace_additional_special_tokens, which is supposed to control whether the self._additional_special_tokens attribute gets replaced.
For any slow tokenizer, setting it does remove the previous tokens from that list, but the internal trie is never updated, so the flag has no visible effect at all:

>>> from transformers import XLMRobertaTokenizer
>>> tokenizer_a = XLMRobertaTokenizer.from_pretrained('xlm-roberta-base')
>>> tokenizer_a.add_special_tokens({"additional_special_tokens":["<//s>"]})
>>> tokenizer_a.additional_special_tokens
['<//s>']
>>> print(tokenizer_a.tokenize("This is a <//s>"))
['▁This', '▁is', '▁a', '<//s>']
>>> tokenizer_a.add_special_tokens({"additional_special_tokens":["<///s>"]}, replace_additional_special_tokens= True)
>>> print(tokenizer_a.tokenize("This is a <//s>"))
['▁This', '▁is', '▁a', '<//s>']
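
To make the mismatch explicit, here is a minimal sketch based on the behaviour shown above (same model and tokens): after the replacement the old token is no longer in additional_special_tokens, yet tokenize still splits it off as a single piece because the trie was never rebuilt.

>>> from transformers import XLMRobertaTokenizer
>>> tok = XLMRobertaTokenizer.from_pretrained('xlm-roberta-base')
>>> tok.add_special_tokens({"additional_special_tokens": ["<//s>"]})
>>> tok.add_special_tokens({"additional_special_tokens": ["<///s>"]}, replace_additional_special_tokens=True)
>>> "<//s>" in tok.additional_special_tokens  # the list itself was replaced
False
>>> "<//s>" in tok.tokenize("This is a <//s>")  # but the trie still matches the old token
True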

This will be addressed in #23909

Metadata
Labels

Core: Tokenization (Internals of the library; Tokenization.)
