S3 Torch-connector loads data as a list of tensors #918
I'm setting up a model and am about to start training on a dataset stored in an S3 bucket. To load the data from S3 I'm using s3torchconnector.S3MapDataset.from_prefix, which loads the data into the SageMaker space I'm using. However, when I start training, s3torchconnector gives me a list of tensors (or a map dataset) instead of a single tensor that can be processed by samples = samples.to(device, non_blocking=True). As expected, I get the error 'list' object has no attribute 'to'. Here are some snippets of my code: . . . This code works well with a custom dataset and local files in my SageMaker space, but I can't take that approach because I need to train on millions of images, hence the S3 dataloader. I'm not sure whether s3torchconnector simply works this way, or whether it is possible to transform whatever it loads into a single tensor that the model and the sampler can process.
Replies: 1 comment
The issue was resolved by adding a conditional structure to handle the type of the `samples` object before sending it to the device. Specifically, the following logic was used in the training loop (or wherever `samples` is processed):

```python
if isinstance(samples, list):
    # The S3 loader yields a list; the tensor is the second element.
    samples = samples[1].to(device, non_blocking=True)
else:
    # A plain tensor can be moved to the device directly.
    samples = samples.to(device, non_blocking=True)
```

This approach ensures compatibility with both:

- Datasets where `samples` is already a tensor.
- `s3torchconnector.S3MapDataset`, where `samples` is returned as a list (e.g., a tuple like `(key, tensor)`).

By checking the type and selecting the appropriate element (in this case, sa…
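The type check from the reply can also be wrapped in a small helper so the training loop stays uncluttered. The following is only a sketch: the name `move_batch_to_device` and the `tensor_index` parameter are hypothetical (they appear in neither the original code nor the s3torchconnector API), and the default index of 1 assumes the batch is laid out as `(key, tensor)`, as described in the reply. The helper is duck-typed, so it works with anything that exposes a `.to(device, non_blocking=...)` method, such as a `torch.Tensor`.

```python
def move_batch_to_device(samples, device, tensor_index=1):
    """Hypothetical helper: move a batch to `device`.

    Accepts either a tensor-like object or a list/tuple that contains one.
    ASSUMPTION: when `samples` is a list or tuple, the tensor sits at
    `tensor_index` (default 1, matching a `(key, tensor)` layout).
    """
    if isinstance(samples, (list, tuple)):
        # Batches from the S3 dataset arrive as a sequence; pick out the tensor.
        samples = samples[tensor_index]
    # Anything with a tensor-style `.to` method (e.g. torch.Tensor) works here.
    return samples.to(device, non_blocking=True)
```

In the training loop this would replace the direct call, for example `samples = move_batch_to_device(samples, device)`. Accepting tuples as well as lists is a small extension of the original check; if the dataset's transform returns elements in a different order, `tensor_index` would need to change accordingly.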