Description
Thank you very much for the help your work has given me!
While trying to reproduce AR-Diffusion on the IWSLT14 dataset, I ran the following command:
python -m torch.distributed.run --nproc_per_node=2 --nnodes=1 ./train_utils/trainer_main.py \
    model.name='bert-base-uncased' batch_size=128 grad_accum=3 \
    total_steps=230000 exp.name=iwslt14 \
    data.name=iwslt14_tok tgt_len=90 max_pos_len=256 lr=1.6e-3 num_workers=4 use_bpe=True lr_step=80000 \
    intermediate_size=1024 num_attention_heads=4 dropout=0.2 \
    in_channels=64 out_channels=64 time_channels=64 \
    eval_interval=3000 log_interval=1000 \
    schedule_sampler='xy_uniform' time_att=True att_strategy='txl' use_AMP=True
However, I got the following error:
Traceback (most recent call last):
  File "./train_utils/trainer_main.py", line 44, in main
    tokenizer = create_tokenizer(path="./data/iwslt14_tok")
  File "/./diffusion_denovo/AR-diffusion/data_utils/tokenizer_utils.py", line 20, in create_tokenizer
    return read_byte_level(path)
  File "/./diffusion_denovo/AR-diffusion/data_utils/tokenizer_utils.py", line 45, in read_byte_level
    tokenizer = ByteLevelBPETokenizer(
  File "/root/anaconda3/envs/ard/lib/python3.8/site-packages/tokenizers/implementations/byte_level_bpe.py", line 32, in __init__
    BPE(
Exception: Error while initializing BPE: No such file or directory (os error 2)
My guess is that the error occurs because the files vocab.json and merges.txt are missing from the directory './data/iwslt14_tok'. These files should correspond to the vocab and merges parameters passed when the BPE tokenizer is initialized. How can this issue be resolved?
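To check this guess, here is a minimal sketch (not the repository's own tooling) that lists which of the two files are absent. The helper names `missing_bpe_files` and `train_byte_level_bpe`, the `vocab_size` default, and the choice of corpus files are all assumptions for illustration. The second helper uses the `tokenizers` library's `ByteLevelBPETokenizer.train` and `save_model` to write a fresh vocab.json and merges.txt, but a vocabulary trained this way may well differ from the one the authors used.

```python
from pathlib import Path

# File names that ByteLevelBPETokenizer.save_model() writes by default.
REQUIRED_FILES = ("vocab.json", "merges.txt")


def missing_bpe_files(tok_dir):
    """Return the names of the required BPE files that are absent from tok_dir."""
    root = Path(tok_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def train_byte_level_bpe(corpus_files, tok_dir, vocab_size=10000):
    """Hypothetical helper: train a byte-level BPE and save vocab.json / merges.txt.

    vocab_size is an assumed placeholder, not a value taken from the repository.
    """
    # Imported lazily so the file check above works without `tokenizers` installed.
    from tokenizers import ByteLevelBPETokenizer

    tokenizer = ByteLevelBPETokenizer()
    tokenizer.train(files=list(corpus_files), vocab_size=vocab_size)
    Path(tok_dir).mkdir(parents=True, exist_ok=True)
    tokenizer.save_model(str(tok_dir))


if __name__ == "__main__":
    # Prints the missing file names; an empty list means both files are present.
    print(missing_bpe_files("./data/iwslt14_tok"))
```

If both names are printed, the directory lacks the tokenizer files entirely, which would match the "No such file or directory" error above.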