AWS Machine Learning Blog
Capacity-aware inference: Automatic instance fallback for SageMaker AI endpoints
Today, Amazon SageMaker AI introduces capacity-aware instance pools for new and existing inference endpoints. You define a prioritized list of instance types, and SageMaker AI automatically works through that list whenever capacity is constrained, whether at endpoint creation, during scale-out, or during scale-in. Your endpoint provisions on whatever AI infrastructure is available, without manual intervention. This capability is available for Single Model Endpoints,…
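To make the fallback idea concrete, here is a minimal Python sketch of a first-available scan over a prioritized list. It is only an illustration of the selection logic described above, not the SageMaker AI API: the function `pick_instance_type`, the `has_capacity` callback, and the capacity snapshot are all hypothetical names invented for this example, and the real service may weigh additional factors.

```python
from typing import Callable, Optional, Sequence


def pick_instance_type(
    prioritized_types: Sequence[str],
    has_capacity: Callable[[str], bool],
) -> Optional[str]:
    """Return the highest-priority instance type that currently has capacity.

    Hypothetical helper for illustration; not part of any AWS SDK.
    """
    for instance_type in prioritized_types:
        # Walk the list in priority order and stop at the first match.
        if has_capacity(instance_type):
            return instance_type
    # Nothing in the list is available right now.
    return None


# Assumed priority list and capacity snapshot, for illustration only.
priority = ["ml.p5.48xlarge", "ml.p4d.24xlarge", "ml.g5.48xlarge"]
available = {"ml.p4d.24xlarge", "ml.g5.48xlarge"}

chosen = pick_instance_type(priority, lambda t: t in available)
print(chosen)  # ml.p4d.24xlarge
```

In this sketch the first choice is treated as constrained, so the scan falls through to the second entry. The same scan would apply whether the trigger is endpoint creation or a scaling event.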