HuggingFaceProcessor with ProcessingStep results in import errors (similar to issues/2656) #4802

Open
@solanki-ravi

Describe the bug
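Using HuggingFaceProcessor with ProcessingStep fails with errors similar to #2656. The container log suggests that evaluate.py is being executed as a shell script instead of being run by Python: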

2024-07-25T15:47:57.615Z | /opt/ml/processing/input/entrypoint/evaluate.py: line 2: SAGEMAKER_INPUTS_DIR: command not found
2024-07-25T15:47:57.615Z | /opt/ml/processing/input/entrypoint/evaluate.py: line 3: SAGEMAKER_OUTPUTS_DIR: command not found
2024-07-25T15:47:57.615Z | /opt/ml/processing/input/entrypoint/evaluate.py: line 5: import: command not found
2024-07-25T15:47:57.615Z | /opt/ml/processing/input/entrypoint/evaluate.py: line 6: import: command not found
2024-07-25T15:47:57.615Z | /opt/ml/processing/input/entrypoint/evaluate.py: line 7: import: command not found
2024-07-25T15:47:57.616Z | /opt/ml/processing/input/entrypoint/evaluate.py: line 8: import: command not found
...
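
For reference, messages like the ones above are what a shell prints when it is handed a Python file, which suggests the container is running evaluate.py with a shell instead of python3. Below is a minimal local sketch of that behaviour. It does not touch SageMaker, it assumes a POSIX `sh` is on the PATH, and the script contents are a hypothetical stand-in because the real evaluate.py is not included here.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for the top of evaluate.py (the real file is not shown).
# The leading blank line puts the first assignment on line 2 of the file.
SCRIPT = '''
SAGEMAKER_INPUTS_DIR = "/opt/ml/processing/input"
SAGEMAKER_OUTPUTS_DIR = "/opt/ml/processing/output"
'''


def run_with(interpreter: str, path: str) -> subprocess.CompletedProcess:
    """Run the file at `path` with the given interpreter and capture its output."""
    return subprocess.run([interpreter, path], capture_output=True, text=True)


def demo():
    """Run the same file once with sh and once with Python."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "evaluate.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        # A shell parses `NAME = value` as "run the command NAME", so it
        # reports that the command was not found (exact wording varies by shell).
        as_shell = run_with("sh", path)
        # Python treats the same lines as ordinary assignments.
        as_python = run_with(sys.executable, path)
    return as_shell, as_python


if __name__ == "__main__":
    as_shell, as_python = demo()
    print("sh exit code:", as_shell.returncode)
    print(as_shell.stderr)
    print("python exit code:", as_python.returncode)
```

If the shell run fails with "not found" messages while the Python run exits cleanly, the pattern matches the log above, which would point at how the container entrypoint invokes the script rather than at the script itself.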

To reproduce
Define the HuggingFace processor

    huggingface_processor = HuggingFaceProcessor(
        role=role,
        transformers_version='4.4',
        pytorch_version='1.6.0',
        instance_count=1,
        instance_type=f'{SAGEMAKER_GPU_INSTANCE_TYPE}',
        command=["python3"]
    )

    step_evaluate = ProcessingStep(
        name="...",
        processor=huggingface_processor,
        inputs=[
            ...
        ],
        outputs=[
            ...
        ],
        code="src/evaluate.py"
    )

    pipeline = Pipeline(
        name="...",
        steps=[step_evaluate]
    )

Execute the pipeline

    pipeline.upsert(role_arn=role)
    execution = pipeline.start()
    execution.wait()

Expected behavior
The error should not occur, and the container should run evaluate.py with the configured python3 command.

Screenshots or logs
Attached

System information
A description of your system. Please provide:

  • SageMaker Python SDK version:
    $ pip show sagemaker
    Name: sagemaker
    Version: 2.226.1
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans):
    $ pip show transformers
    Name: transformers
    Version: 4.42.4

    $ pip show torch
    Name: torch
    Version: 2.1.2
log-events-viewer-result.csv

Additional context
Possibly related to #2656.
