Correct use of model_server_workers #1275

Closed
@anotinelg

Description

Describe the bug
The documentation states that when i deploy a model with model_server_workers = None,

model_server_workers (int) – Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.

However, when I deploy my model on an ml.c5.2xlarge (8 vCPUs, one physical CPU I guess), it only uses 1 worker (see logs below).

If I pass the parameter into the deploy function, it correctly sets Default workers per model to the number I specified through the model_server_workers parameter.
In conclusion, either the documentation is out of date, or the behaviour when model_server_workers = None does not work as described.

To reproduce
Deploy any model on a ml.c5.2xlarge, check the log and the entry Default workers per model
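A minimal reproduction sketch of the two deployments compared above. The S3 path, IAM role, and entry point are placeholders, and running it requires the SageMaker Python SDK and AWS credentials; I am assuming model_server_workers is passed through MXNetModel here, per the SDK's model API.

```python
# Deployment arguments used in both cases (placeholders for illustration).
deploy_kwargs = dict(
    initial_instance_count=1,
    instance_type="ml.c5.2xlarge",  # 8 vCPUs
)

if __name__ == "__main__":
    from sagemaker.mxnet import MXNetModel

    model = MXNetModel(
        model_data="s3://my-bucket/model.tar.gz",              # placeholder
        role="arn:aws:iam::111122223333:role/SageMakerRole",   # placeholder
        entry_point="inference.py",                            # placeholder
        framework_version="1.4.1",
        # Case 1: leave model_server_workers unset (None) -- per the docs the
        # server should use one worker per vCPU (8 here), but the endpoint log
        # shows "Default workers per model: 1".
        # Case 2: set model_server_workers=8 explicitly -- the log then
        # correctly shows "Default workers per model: 8".
        model_server_workers=8,
    )
    predictor = model.deploy(**deploy_kwargs)
```

After deployment, check the endpoint's CloudWatch log for the Default workers per model entry to compare the two cases.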

Expected behavior
With model_server_workers = None, the server should start one worker per vCPU, i.e. the endpoint log should show Default workers per model: 8 on an ml.c5.2xlarge.

Screenshots or logs

This is an extract of the log from the endpoint:

**Number of CPUs: 1**
Max heap size: 3739 M
Python executable: /usr/local/bin/python3.6
Config file: /etc/sagemaker-mms.properties
Inference address: http://0.0.0.0:8080
Management address: http://127.0.0.1:8081
Model Store: /.sagemaker/mms/models
Initial Models: ALL
Log dir: /logs
Metrics dir: /logs
Netty threads: 0
Netty client threads: 0
**Default workers per model: 1**

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: '1.42.1'
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): custom script on MXNet 1.4.1
  • Framework version: 1.4.1
  • Python version: 3
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): AWS docker
