Description
Using the 16khz.yml config on a different dataset, I get the following error:
Traceback (most recent call last):
File "/mnt/archive2/dac/descript-audio-codec/scripts/train.py", line 441, in <module>
train(args, accel)
File "/home/g/miniconda3/envs/dac/lib/python3.9/site-packages/argbind/argbind.py", line 159, in cmd_func
return func(*cmd_args, **kwargs)
File "/mnt/archive2/dac/descript-audio-codec/scripts/train.py", line 416, in train
train_loop(state, batch, accel, lambdas)
File "/home/g/miniconda3/envs/dac/lib/python3.9/site-packages/audiotools/ml/decorators.py", line 375, in decorated
output = fn(*args, **kwargs)
File "/home/g/miniconda3/envs/dac/lib/python3.9/site-packages/audiotools/ml/decorators.py", line 321, in decorated
output = fn(*args, **kwargs)
File "/home/g/miniconda3/envs/dac/lib/python3.9/site-packages/audiotools/ml/decorators.py", line 107, in decorated
output = fn(*args, **kwargs)
File "/mnt/archive2/dac/descript-audio-codec/scripts/train.py", line 259, in train_loop
output["mel/loss"] = state.mel_loss(recons, signal)
File "/home/g/miniconda3/envs/dac/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/archive2/dac/descript-audio-codec/dac/nn/loss.py", line 322, in forward
loss += self.log_weight * self.loss_fn(
File "/home/g/miniconda3/envs/dac/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/g/miniconda3/envs/dac/lib/python3.9/site-packages/torch/nn/modules/loss.py", line 101, in forward
return F.l1_loss(input, target, reduction=self.reduction)
File "/home/g/miniconda3/envs/dac/lib/python3.9/site-packages/torch/nn/functional.py", line 3263, in l1_loss
expanded_input, expanded_target = torch.broadcast_tensors(input, target)
File "/home/g/miniconda3/envs/dac/lib/python3.9/site-packages/torch/functional.py", line 74, in broadcast_tensors
return _VF.broadcast_tensors(tensors) # type: ignore[attr-defined]
RuntimeError: The size of tensor a (760) must match the size of tensor b (761) at non-singleton dimension 3
The only changes I've made to the config are num_workers, seed, iters, and valid_freq. batch["signal"] shows a signal of shape [24,1,6080] before transforms. I didn't have any issues with the baseline 44kHz model. The only differences between the 16kHz and base configs are DAC.sample_rate, DAC.encoder_rates, DAC.decoder_rates, n_codebooks, DAC.quantizer_dropout, Discriminator_sample_rate, and num_iters.
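One possible reading of the 760 vs 761 mismatch, offered as a hypothesis rather than a diagnosis: a centered STFT (as in torch.stft with center=True) produces 1 + n_samples // hop_length frames, so two signals whose lengths differ by a single sample can end up one frame apart. The sketch below is plain arithmetic and does not call the dac or audiotools code. The hop length of 8 is an assumption (a 32-sample window with hop = window // 4), and the helper names are made up for illustration.

```python
def centered_stft_frames(n_samples: int, hop_length: int) -> int:
    """Frame count of a centered STFT, e.g. torch.stft with center=True."""
    return 1 + n_samples // hop_length


def lengths_giving(n_frames: int, hop_length: int) -> range:
    """All sample lengths that yield exactly n_frames centered frames."""
    return range(hop_length * (n_frames - 1), hop_length * n_frames)


SIGNAL_LEN = 6080  # last dimension of batch["signal"], from the report
HOP = 8            # ASSUMPTION: smallest mel-loss hop (window 32, hop = window // 4)

signal_frames = centered_stft_frames(SIGNAL_LEN, HOP)
shorter = lengths_giving(signal_frames - 1, HOP)

print(f"{SIGNAL_LEN} samples -> {signal_frames} frames")
print(f"{signal_frames - 1} frames <- lengths {shorter.start}..{shorter.stop - 1}")
```

Under these assumptions the 6080-sample input gives 761 frames, while any length from 6072 to 6079 gives 760, which would match the error if recons came out a few samples shorter than signal. Printing recons.shape and signal.shape just before the mel loss call at train.py line 259 should confirm or rule this out.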