I'm trying to deploy a pre-trained PyTorch model to SageMaker using the Python SDK. I have a model.tar.gz file uploaded to S3 with the following structure:
code/
code/requirements.txt
code/inference.py
code/utils.py
model.pt
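For context, code/inference.py follows the standard SageMaker PyTorch handler interface (model_fn / input_fn / predict_fn / output_fn). The version below is a trimmed-down sketch rather than my exact file; the TorchScript loading and the dummy preprocessing are simplified placeholders:

import json
import os

import torch


def model_fn(model_dir):
    # model.tar.gz is extracted into model_dir; load the weights from there.
    # (Placeholder: assumes model.pt is a TorchScript archive.)
    model = torch.jit.load(os.path.join(model_dir, "model.pt"), map_location="cpu")
    model.eval()
    return model


def input_fn(request_body, content_type):
    # Deserialize the request; I send JSON like {"images": [...S3 paths...]}.
    if content_type == "application/json":
        return json.loads(request_body)
    raise ValueError(f"Unsupported content type: {content_type}")


def predict_fn(input_data, model):
    # Placeholder preprocessing: the real handler downloads the images listed
    # in input_data["images"] and builds a batch tensor from them.
    batch = torch.zeros(1, 3, 224, 224)
    with torch.no_grad():
        return model(batch)


def output_fn(prediction, accept):
    # Serialize the prediction back to JSON for the response.
    return json.dumps({"predictions": prediction.tolist()})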
I also have the following deployment script (edited to remove the ARN, bucket name, etc., but I can confirm those values are correct):
import os
import json
import sagemaker
from sagemaker.pytorch import PyTorchModel
role = AWS_SAGEMAKER_ROLE_ARN
bucket = AWS_S3_BUCKET_NAME
session = sagemaker.Session(default_bucket=bucket)
model_data = f"s3://{bucket}/model.tar.gz"
model = PyTorchModel(
    model_data=model_data,
    role=role,
    framework_version="2.6",
    py_version="py312",
    # entry_point="inference.py",  # un-commenting this triggers the re-upload described below
    sagemaker_session=session,
    name="sagemaker-test-model",
)
predictor = model.deploy(
    instance_type="ml.m5.xlarge",
    initial_instance_count=1,
    endpoint_name="sagemaker-test-model-endpoint",
)
payload = {
    "images": [PATH_TO_IMAGES_S3]
}
response = predictor.predict(json.dumps(payload))
print(response)
The above code times out:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from primary with message "Your invocation timed out while waiting for a
response from container primary. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again."
If I un-comment entry_point in PyTorchModel, SageMaker tries to re-upload a model.tar.gz file to S3, which fails with permissions errors that I currently can't fix because of restrictions on my own permissions.
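For completeness, the un-commented variant looks like this; source_dir is my guess at how the local code directory should be wired up, and as far as I can tell it is this call that makes the SDK repack and re-upload the tarball:

# Variant with an explicit entry point. The SDK then repacks the code into a
# new model.tar.gz and uploads it to S3, which is the step that hits the
# permissions errors. "code" here is a local copy of the code/ directory
# that is already inside the tarball.
model = PyTorchModel(
    model_data=model_data,
    role=role,
    framework_version="2.6",
    py_version="py312",
    entry_point="inference.py",
    source_dir="code",
    sagemaker_session=session,
    name="sagemaker-test-model",
)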
My question is: am I getting a timeout because I need to provide entry_point, even though the code is already packaged inside the model.tar.gz file, or is my error elsewhere? Perhaps in the inference.py file?