Closed
Labels
bug: Something isn't working
Description
Describe the bug
I am not able to use the --push_to_hub option for TPU training. I get the following error:
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:15:59 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
This is not unique to the train_text_to_image_flax.py script; I'm just using it as an example. Basically, this line will always fail when called during training on a TPU: https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image_flax.py#L584
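A possible workaround, sketched under assumptions rather than confirmed as the fix: the logger name (huggingface_hub.repository) suggests the message comes from the git-based Repository helper waiting at exit for a push that is still running. If the linked line starts a non-blocking push (an assumption, the line is not quoted here), one option is to upload the output directory with a call that only returns once the upload is done. The helper name push_folder_blocking and its upload_fn parameter are hypothetical; create_repo and upload_folder are existing huggingface_hub functions.

```python
from pathlib import Path
from typing import Callable, Optional


def push_folder_blocking(
    folder: str,
    repo_id: str,
    commit_message: str = "End of training",
    upload_fn: Optional[Callable[..., object]] = None,
):
    """Upload `folder` to the Hub and return only after the upload call returns.

    Hypothetical helper, not part of diffusers or huggingface_hub.
    `upload_fn` exists only so the helper can be dry-run without network access.
    """
    if not Path(folder).is_dir():
        raise FileNotFoundError(f"output directory not found: {folder}")

    if upload_fn is None:
        # Imported lazily; requires `huggingface_hub` and a valid login token.
        from huggingface_hub import create_repo, upload_folder

        create_repo(repo_id, exist_ok=True)  # no-op if the repo already exists
        upload_fn = upload_folder

    # HTTP-based upload: no background git process is left running at exit.
    return upload_fn(
        repo_id=repo_id,
        folder_path=folder,
        commit_message=commit_message,
    )
```

In a multi-host TPU run this would presumably be called from a single process only, for example push_folder_blocking("sd-pokemon-model", "<user>/pokemon-lora") after the final save, where <user> stands for your Hub namespace.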
Reproduction
Run the train_text_to_image_flax.py script from here with this command:
https://github.com/huggingface/diffusers/tree/main/examples/text_to_image#training-with-flaxjax
export MODEL_NAME="duongna/stable-diffusion-v1-4-flax"
export dataset_name="lambdalabs/pokemon-blip-captions"
export OUTPUT_DIR="/pokemon"
export HUB_MODEL_ID="pokemon-lora"
python3 train_text_to_image_flax.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--dataset_name=$dataset_name \
--resolution=512 --center_crop --random_flip \
--train_batch_size=1 \
--mixed_precision="fp16" \
--max_train_steps=150 \
--learning_rate=1e-05 \
--max_grad_norm=1 \
--output_dir="sd-pokemon-model" \
--push_to_hub \
--hub_model_id=${HUB_MODEL_ID}
Logs
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:13:38 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:13:48 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:13:58 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:14:08 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:14:18 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:14:28 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:14:38 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:14:48 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:14:58 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:15:09 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:15:19 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:15:29 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:15:39 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:15:49 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
03/27/2023 23:15:59 - ERROR - huggingface_hub.repository - Waiting for the following commands to finish before shutting down: [[push command, status code: running, in progress. PID: 772274]].
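To make the pattern in these logs easier to see, here is a small stdlib-only sketch that counts the timestamped messages, collects the PIDs, and measures the gaps between them. The function name summarize and the regular expression are my own, written against the log format shown above; the sample uses three lines copied from the log.

```python
import re
from datetime import datetime

# Matches only the timestamped lines emitted by the huggingface_hub.repository logger.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - ERROR - "
    r"huggingface_hub\.repository - .*PID: (?P<pid>\d+)"
)


def summarize(log_text):
    """Return message count, distinct PIDs, and gaps (in seconds) between messages."""
    stamps, pids = [], set()
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            stamps.append(datetime.strptime(m["ts"], "%m/%d/%Y %H:%M:%S"))
            pids.add(int(m["pid"]))
    gaps = [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
    waited = int((stamps[-1] - stamps[0]).total_seconds()) if stamps else 0
    return {"messages": len(stamps), "pids": sorted(pids), "gaps_s": gaps, "waited_s": waited}


MSG = ("Waiting for the following commands to finish before shutting down: "
       "[[push command, status code: running, in progress. PID: 772274]].")
sample = "\n".join([
    f"03/27/2023 23:14:58 - ERROR - huggingface_hub.repository - {MSG}",
    f"03/27/2023 23:15:09 - ERROR - huggingface_hub.repository - {MSG}",
    f"03/27/2023 23:15:19 - ERROR - huggingface_hub.repository - {MSG}",
])
print(summarize(sample))
```

Run on the full log above, this should show a single PID (772274) with a message roughly every 10 seconds from 23:13:38 to 23:15:59, i.e. the same push command still reported as running after more than two minutes.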
System Info
tpu-v4-8