Description
Describe the bug
If I put a ParameterString, or any other PipelineVariable object, in the list passed as the arguments parameter of the PySparkProcessor.run method, I get a TypeError (TypeError: Object of type ParameterString is not JSON serializable).
According to the documentation, arguments can be a list of PipelineVariables, so I expected this to work. Is this not supported?
To reproduce
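from sagemaker.spark.processing import PySparkProcessor
from sagemaker.workflow.parameters import ParameterString

# role, sagemaker_session, bucket and the *_prefix_abalone variables are defined elsewhere.
spark_processor = PySparkProcessor(
    base_job_name="sagemaker-spark",
    framework_version="3.1",
    role=role,
    instance_count=2,
    instance_type="ml.m5.xlarge",
    sagemaker_session=sagemaker_session,
    max_runtime_in_seconds=1200,
)
spark_processor.run(
    submit_app="spark_processing/preprocess.py",
    arguments=[
        "--s3_input_bucket",
        # Passing a pipeline parameter here triggers the TypeError.
        ParameterString(name="s3-input-bucket", default_value=bucket),
        "--s3_input_key_prefix",
        input_prefix_abalone,
        "--s3_output_bucket",
        bucket,
        "--s3_output_key_prefix",
        input_preprocessed_prefix_abalone,
    ],
)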
Expected behavior
Expect a SageMaker ProcessingJob to be created.
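As a possible stopgap when calling run() directly (outside a pipeline), the arguments list could be flattened to plain strings before it is passed in. This is only a sketch under assumptions: FakeParameterString is a hypothetical stand-in, not the SDK class, resolve_arguments is a made-up helper, and it assumes the real parameter object exposes its default through a default_value attribute.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FakeParameterString:
    """Hypothetical stand-in; assumes the real object exposes default_value."""

    name: str
    default_value: Optional[str] = None


def resolve_arguments(arguments: List[Any]) -> List[str]:
    """Replace parameter-like objects with their default value (hypothetical helper)."""
    resolved = []
    for arg in arguments:
        if isinstance(arg, str):
            # Plain strings pass through unchanged.
            resolved.append(arg)
            continue
        value = getattr(arg, "default_value", None)
        if value is None:
            raise ValueError(f"No default value available for {arg!r}")
        resolved.append(str(value))
    return resolved


args = resolve_arguments(
    [
        "--s3_input_bucket",
        FakeParameterString(name="s3-input-bucket", default_value="my-bucket"),
    ]
)
print(args)
```

This loses the ability to override the value per pipeline execution, so it is not a substitute for real PipelineVariable support in arguments.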
Screenshots or logs
Traceback (most recent call last):
  File "/Users/[email protected]/PycharmProjects/sagemaker-sdk-test/run_pyspark_processor.py", line 63, in <module>
    run_sagemaker_spark_job(
  File "/Users/[email protected]/PycharmProjects/sagemaker-sdk-test/run_pyspark_processor.py", line 37, in run_sagemaker_spark_job
    spark_processor.run(
  File "/Users/[email protected]/PycharmProjects/sagemaker-sdk-test/venv/lib/python3.9/site-packages/sagemaker/spark/processing.py", line 902, in run
    return super().run(
  File "/Users/[email protected]/PycharmProjects/sagemaker-sdk-test/venv/lib/python3.9/site-packages/sagemaker/spark/processing.py", line 265, in run
    return super().run(
  File "/Users/[email protected]/PycharmProjects/sagemaker-sdk-test/venv/lib/python3.9/site-packages/sagemaker/workflow/pipeline_context.py", line 248, in wrapper
    return run_func(*args, **kwargs)
  File "/Users/[email protected]/PycharmProjects/sagemaker-sdk-test/venv/lib/python3.9/site-packages/sagemaker/processing.py", line 572, in run
    self.latest_job = ProcessingJob.start_new(
  File "/Users/[email protected]/PycharmProjects/sagemaker-sdk-test/venv/lib/python3.9/site-packages/sagemaker/processing.py", line 796, in start_new
    processor.sagemaker_session.process(**process_args)
  File "/Users/[email protected]/PycharmProjects/sagemaker-sdk-test/venv/lib/python3.9/site-packages/sagemaker/session.py", line 956, in process
    self._intercept_create_request(process_request, submit, self.process.__name__)
  File "/Users/[email protected]/PycharmProjects/sagemaker-sdk-test/venv/lib/python3.9/site-packages/sagemaker/session.py", line 4317, in _intercept_create_request
    return create(request)
  File "/Users/[email protected]/PycharmProjects/sagemaker-sdk-test/venv/lib/python3.9/site-packages/sagemaker/session.py", line 953, in submit
    LOGGER.debug("process request: %s", json.dumps(request, indent=4))
  File "/Users/[email protected]/opt/anaconda3/lib/python3.9/json/__init__.py", line 234, in dumps
    return cls(
  File "/Users/[email protected]/opt/anaconda3/lib/python3.9/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/Users/[email protected]/opt/anaconda3/lib/python3.9/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/Users/[email protected]/opt/anaconda3/lib/python3.9/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/Users/[email protected]/opt/anaconda3/lib/python3.9/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/Users/[email protected]/opt/anaconda3/lib/python3.9/json/encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "/Users/[email protected]/opt/anaconda3/lib/python3.9/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/Users/[email protected]/opt/anaconda3/lib/python3.9/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type ParameterString is not JSON serializable
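The last frames suggest the error is raised by the debug-logging call json.dumps(request, indent=4) in session.py, while walking a dict -> dict -> list structure, rather than by the service call itself. The same failure can be reproduced without SageMaker. This is a minimal sketch under assumptions: FakeParameterString is a hypothetical stand-in for the SDK class, try_dump is a made-up helper, and the request keys only approximate the shape of the real request.

```python
import json


class FakeParameterString:
    """Hypothetical stand-in for a pipeline parameter; not the SDK class."""

    def __init__(self, name, default_value=None):
        self.name = name
        self.default_value = default_value


def try_dump(request):
    """Return the JSON text, or the TypeError message if serialization fails."""
    try:
        return json.dumps(request, indent=4)
    except TypeError as err:
        return f"TypeError: {err}"


# Shape loosely modelled on the traceback (dict -> dict -> list); keys are assumed.
request = {
    "AppSpecification": {
        "ContainerArguments": [
            "--s3_input_bucket",
            FakeParameterString(name="s3-input-bucket", default_value="my-bucket"),
        ]
    }
}

result = try_dump(request)
print(result)
# prints: TypeError: Object of type FakeParameterString is not JSON serializable
```

Any object in the arguments list that the default JSON encoder does not know how to serialize would fail the same way at this logging step.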
System information
- SageMaker Python SDK version: 2.112.2
- Framework name (e.g. PyTorch) or algorithm (e.g. KMeans): PySpark
- Framework version: 3.1
- Python version: default (3.9, per the paths in the traceback)
- CPU or GPU: CPU
- Custom Docker image (Y/N): N
Additional context