Closed
Labels
bug (Something isn't working), example, question (Further information is requested), ver: 2.0.x
Description
Bug description
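Running the official example code on an Apple Silicon Mac (MPS accelerator) raises a RuntimeError. Training completes, but the script fails afterwards when the encoder is called on a fake image batch.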
What version are you seeing the problem on?
v2.0
How to reproduce the bug
# main.py
# ! pip install torchvision
import torch, torch.nn as nn, torch.utils.data as data, torchvision as tv, torch.nn.functional as F
import lightning as L
from torch import Tensor

# --------------------------------
# Step 1: Define a LightningModule
# --------------------------------
# A LightningModule (nn.Module subclass) defines a full *system*
# (ie: an LLM, diffusion model, autoencoder, or simple image classifier).

# define any number of nn.Modules (or use your current ones)
encoder = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 3))
decoder = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 28 * 28))


class LitAutoEncoder(L.LightningModule):
    def __init__(self):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, x):
        # in lightning, forward defines the prediction/inference actions
        embedding = self.encoder(x)
        return embedding

    def training_step(self, batch, batch_idx):
        # training_step defines the train loop. It is independent of forward
        x, y = batch
        x = x.view(x.size(0), -1)
        z = self.encoder(x)
        x_hat = self.decoder(z)
        loss = F.mse_loss(x_hat, x)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        return optimizer


# -------------------
# Step 2: Define data
# -------------------
dataset = tv.datasets.MNIST(".", download=True, transform=tv.transforms.ToTensor())
train, val = data.random_split(dataset, [55000, 5000])

# -------------------
# Step 3: Train
# -------------------
autoencoder = LitAutoEncoder()
trainer = L.Trainer(limit_train_batches=100, max_epochs=1)
trainer.fit(autoencoder, data.DataLoader(train), data.DataLoader(val))

checkpoint = "./lightning_logs/version_0/checkpoints/epoch=0-step=100.ckpt"
autoencoder = LitAutoEncoder.load_from_checkpoint(checkpoint, encoder=encoder, decoder=decoder)

# choose your trained nn.Module
encoder = autoencoder.encoder
encoder.eval()

# embed 4 fake images!
fake_image_batch = Tensor(4, 28 * 28)
embeddings = encoder(fake_image_batch)
print("⚡" * 20, "\nPredictions (4 image embeddings):\n", embeddings, "\n", "⚡" * 20)
Error messages and logs
/Users/kylin/Library/Caches/pypoetry/virtualenvs/lightning-demo-01-7el2zk6k-py3.10/bin/python /Users/kylin/Documents/code/github/lightning_demo_01/main.py
GPU available: True (mps), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/Users/kylin/Library/Caches/pypoetry/virtualenvs/lightning-demo-01-7el2zk6k-py3.10/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/logger_connector/logger_connector.py:67: UserWarning: Starting from v1.9.0, `tensorboardX` has been removed as a dependency of the `lightning.pytorch` package, due to potential conflicts with other packages in the ML ecosystem. For this reason, `logger=True` will use `CSVLogger` as the default logger, unless the `tensorboard` or `tensorboardX` packages are found. Please `pip install lightning[extra]` or one of them to enable TensorBoard support by default
warning_cache.warn(
/Users/kylin/Library/Caches/pypoetry/virtualenvs/lightning-demo-01-7el2zk6k-py3.10/lib/python3.10/site-packages/lightning/pytorch/trainer/configuration_validator.py:68: UserWarning: You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.
rank_zero_warn("You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.")
  | Name    | Type       | Params
---------------------------------------
0 | encoder | Sequential | 100 K
1 | decoder | Sequential | 101 K
---------------------------------------
202 K     Trainable params
0         Non-trainable params
202 K     Total params
0.810     Total estimated model params size (MB)
/Users/kylin/Library/Caches/pypoetry/virtualenvs/lightning-demo-01-7el2zk6k-py3.10/lib/python3.10/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:432: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 12 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
Epoch 0: 100%|██████████| 100/100 [00:00<00:00, 194.97it/s, v_num=1]
`Trainer.fit` stopped: `max_epochs=1` reached.
Traceback (most recent call last):
  File "/Users/kylin/Documents/code/github/lightning_demo_01/main.py", line 65, in <module>
    embeddings = encoder(fake_image_batch)
  File "/Users/kylin/Library/Caches/pypoetry/virtualenvs/lightning-demo-01-7el2zk6k-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/kylin/Library/Caches/pypoetry/virtualenvs/lightning-demo-01-7el2zk6k-py3.10/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/Users/kylin/Library/Caches/pypoetry/virtualenvs/lightning-demo-01-7el2zk6k-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/kylin/Library/Caches/pypoetry/virtualenvs/lightning-demo-01-7el2zk6k-py3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: Placeholder storage has not been allocated on MPS device!
Process finished with exit code 1
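The traceback looks like a device mismatch. This is an assumption based on the error text and the `GPU available: True (mps), used: True` line, not something confirmed by the maintainers: the encoder weights appear to be on the MPS device after training and `load_from_checkpoint`, while `Tensor(4, 28 * 28)` is allocated on the CPU. A minimal sketch of one possible workaround is to create the fake batch on whatever device the encoder's parameters live on. The untrained `encoder` below is only a stand-in with the same layer sizes as in the script, so the snippet runs on its own.

```python
import torch
import torch.nn as nn

# Stand-in for the trained encoder from the script (same layer sizes, untrained).
encoder = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 3))
encoder.eval()

# Look up where the weights actually live (cpu, mps, cuda, ...).
device = next(encoder.parameters()).device

# Create the fake batch directly on that device instead of the default CPU.
fake_image_batch = torch.rand(4, 28 * 28, device=device)

# Inference only, so gradient tracking is not needed.
with torch.no_grad():
    embeddings = encoder(fake_image_batch)

print(embeddings.shape, embeddings.device)
```

Moving the module instead of the input, for example `encoder = autoencoder.encoder.cpu()`, should have the same effect if CPU inference is acceptable.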
Environment
Current environment
- CUDA:
    - GPU: None
    - available: False
    - version: None
- Lightning:
    - lightning: 2.0.5
    - lightning-cloud: 0.5.37
    - lightning-utilities: 0.9.0
    - pytorch-lightning: 2.0.5
    - torch: 2.0.1
    - torchmetrics: 1.0.1
    - torchvision: 0.15.2
- Packages:
    - aiohttp: 3.8.4
    - aiosignal: 1.3.1
    - anyio: 3.7.1
    - arrow: 1.2.3
    - async-timeout: 4.0.2
    - attrs: 23.1.0
    - backoff: 2.2.1
    - beautifulsoup4: 4.12.2
    - blessed: 1.20.0
    - certifi: 2023.5.7
    - charset-normalizer: 3.2.0
    - click: 8.1.5
    - croniter: 1.4.1
    - dateutils: 0.6.12
    - deepdiff: 6.3.1
    - encoder: 1.1
    - exceptiongroup: 1.1.2
    - fastapi: 0.100.0
    - filelock: 3.12.2
    - frozenlist: 1.4.0
    - fsspec: 2023.6.0
    - h11: 0.14.0
    - idna: 3.4
    - inquirer: 3.1.3
    - itsdangerous: 2.1.2
    - jinja2: 3.1.2
    - lightning: 2.0.5
    - lightning-cloud: 0.5.37
    - lightning-utilities: 0.9.0
    - markdown-it-py: 3.0.0
    - markupsafe: 2.1.3
    - mdurl: 0.1.2
    - mpmath: 1.3.0
    - multidict: 6.0.4
    - networkx: 3.1
    - numpy: 1.25.1
    - ordered-set: 4.1.0
    - packaging: 23.1
    - pillow: 10.0.0
    - pip: 23.1.2
    - psutil: 5.9.5
    - pydantic: 1.10.11
    - pygments: 2.15.1
    - pyjwt: 2.7.0
    - python-dateutil: 2.8.2
    - python-editor: 1.0.4
    - python-multipart: 0.0.6
    - pytorch-lightning: 2.0.5
    - pytz: 2023.3
    - pyyaml: 6.0
    - readchar: 4.0.5
    - requests: 2.31.0
    - rich: 13.4.2
    - setuptools: 68.0.0
    - six: 1.16.0
    - sniffio: 1.3.0
    - soupsieve: 2.4.1
    - starlette: 0.27.0
    - starsessions: 1.3.0
    - sympy: 1.12
    - torch: 2.0.1
    - torchmetrics: 1.0.1
    - torchvision: 0.15.2
    - tqdm: 4.65.0
    - traitlets: 5.9.0
    - typing-extensions: 4.7.1
    - urllib3: 2.0.3
    - uvicorn: 0.23.0
    - wcwidth: 0.2.6
    - websocket-client: 1.6.1
    - websockets: 11.0.3
    - wheel: 0.40.0
    - yarl: 1.9.2
- System:
    - OS: Darwin
    - architecture:
        - 64bit
    - processor: arm
    - python: 3.10.12
    - release: 22.5.0
    - version: Darwin Kernel Version 22.5.0: Thu Jun 8 22:22:23 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T6020
More info
No response