ValueError while using --optimize_on_cpu #23

> Traceback (most recent call last): | 1/87970 [00:00<8:35:35, 2.84it/s]
File "./run_squad.py", line 990, in <module>
main()
File "./run_squad.py", line 922, in main
is_nan = set_optimizer_params_grad(param_optimizer, model.named_parameters(), test_nan=True)
File "./run_squad.py", line 691, in set_optimizer_params_grad
if test_nan and torch.isnan(param_model.grad).sum() > 0:
File "/people/sanjay/anaconda2/envs/bert_pytorch/lib/python3.5/site-packages/torch/functional.py", line 289, in isnan
raise ValueError("The argument is not a tensor", str(tensor))
ValueError: ('The argument is not a tensor', 'None')
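
Reading the traceback, `param_model.grad` appears to be `None` for at least one parameter at the moment `torch.isnan` is called. Below is a minimal sketch of a guard, assuming the cause is a parameter for which no gradient has been computed. `has_nan_grad` and `TinyModel` are hypothetical names used only for illustration; they are not part of run_squad.py, and this is not a confirmed fix.

```python
import torch


def has_nan_grad(named_params):
    """Return True if any parameter's gradient contains a NaN.

    Parameters whose .grad is None (no gradient was computed for them)
    are skipped instead of being passed to torch.isnan.
    """
    for _name, param in named_params:
        if param.grad is None:
            continue  # nothing to test for this parameter
        if torch.isnan(param.grad).any():
            return True
    return False


class TinyModel(torch.nn.Module):
    """Hypothetical toy model with one layer that never receives a gradient."""

    def __init__(self):
        super().__init__()
        self.used = torch.nn.Linear(3, 1)
        self.unused = torch.nn.Linear(3, 1)  # never called in forward

    def forward(self, x):
        return self.used(x)


model = TinyModel()
model(torch.ones(2, 3)).sum().backward()

# self.unused has grad None here, yet the check does not raise.
print(has_nan_grad(model.named_parameters()))
```

If a guard like this is applied, the later copy of `param_model.grad` to the CPU-side parameters would presumably need the same `None` check.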

Command: 
CUDA_VISIBLE_DEVICES=0 python ./run_squad.py \
--vocab_file bert_large/uncased_L-24_H-1024_A-16/vocab.txt \
--bert_config_file bert_large/uncased_L-24_H-1024_A-16/bert_config.json \
--init_checkpoint bert_large/uncased_L-24_H-1024_A-16/pytorch_model.bin \
--do_lower_case \
--do_train \
--do_predict \
--train_file squad_dir/train-v1.1.json \
--predict_file squad_dir/dev-v1.1.json \
--learning_rate 3e-5 \
--num_train_epochs 2 \
--max_seq_length 384 \
--doc_stride 128 \
--output_dir outputs \
--train_batch_size 4 \
--gradient_accumulation_steps 2 \
--optimize_on_cpu 

The error occurs only when --optimize_on_cpu is passed.
Training works fine without the argument.

GPU: a single Nvidia GTX 1080 Ti.

PS: A train_batch_size of 4 is the largest that fits in the memory of a single GPU.
