Description
Describe the bug
I am trying to set up a SageMaker processing job whose input is defined with AthenaDatasetDefinition. When the job runs, it fails with the message below. The job appears to be trying to create a new database named sagemaker_processing. I have tried to make it reuse an existing database through the dataset definition parameters, and I have also set the output S3 URI parameter, but neither seems to help.
{"level":"ERROR","ts":"2025-05-13T16:18:55.242Z","msg":"[sagemaker logs] [Input: input-1] Error creating database 'sagemaker_processing' in catalog 'awsdatacatalog'."}
{"level":"ERROR","ts":"2025-05-13T16:18:55.242Z","msg":"[sagemaker logs] [Input: input-1] Error AccessDeniedException: User: arn:aws:sts::726167300549:assumed-role/99999-sagemaker-devmanaged-role/SageMaker is not authorized to perform: glue:CreateDatabase on resource: arn:aws:glue:us-west-2:726167300549:catalog because no identity-based policy allows the glue:CreateDatabase action"}
To reproduce
- Define a SageMaker processing job that uses an AthenaDatasetDefinition as its ProcessingInput.
- Execute the job.
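For reference, here is a minimal sketch of the kind of input definition involved, written as the low-level ProcessingInputs entry of the CreateProcessingJob request, which is the shape the SDK's AthenaDatasetDefinition and DatasetDefinition objects correspond to. It is not the exact code from the failing job: the helper name athena_processing_input, the query, the database name, the bucket, and the local path are all hypothetical placeholders.

```python
import json


def athena_processing_input(
    input_name,
    query,
    database,
    output_s3_uri,
    catalog="AwsDataCatalog",
    work_group="primary",
    local_path="/opt/ml/processing/input/athena",
):
    """Build one ProcessingInputs entry that reads its data from Athena.

    Hypothetical helper for illustration only; the keys follow the
    CreateProcessingJob request shape.
    """
    return {
        "InputName": input_name,
        "DatasetDefinition": {
            "LocalPath": local_path,
            "DataDistributionType": "FullyReplicated",
            "InputMode": "File",
            "AthenaDatasetDefinition": {
                "Catalog": catalog,
                # An existing database the execution role can already read.
                "Database": database,
                "QueryString": query,
                "WorkGroup": work_group,
                # Where Athena writes the query results.
                "OutputS3Uri": output_s3_uri,
                "OutputFormat": "PARQUET",
            },
        },
    }


# All values below are placeholders, not the ones from the failing job.
processing_input = athena_processing_input(
    input_name="input-1",
    query="SELECT * FROM my_table LIMIT 10",
    database="my_existing_database",
    output_s3_uri="s3://my-bucket/athena-output/",
)
print(json.dumps(processing_input, indent=2))
```

Even with Database and OutputS3Uri set as in this sketch, the job logs show an attempt to create the sagemaker_processing database.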
Expected behavior
- The job runs against the existing database without trying to create a new one.
Screenshots or logs
{"level":"INFO","ts":"2025-05-13T16:18:55.011Z","msg":"[sagemaker logs] [Input: input-1] Athena dataset definition specified. Starting athena query execution."}
{"level":"INFO","ts":"2025-05-13T16:18:55.011Z","msg":"[sagemaker logs] [Input: input-1] Creating database 'sagemaker_processing' in catalog 'awsdatacatalog' if doesn't exist already."}
{"level":"ERROR","ts":"2025-05-13T16:18:55.242Z","msg":"[sagemaker logs] [Input: input-1] Error creating database 'sagemaker_processing' in catalog 'awsdatacatalog'."}
{"level":"ERROR","ts":"2025-05-13T16:18:55.242Z","msg":"[sagemaker logs] [Input: input-1] Error AccessDeniedException: User: arn:aws:sts::726167300549:assumed-role/99999-sagemaker-devmanaged-role/SageMaker is not authorized to perform: glue:CreateDatabase on resource: arn:aws:glue:us-west-2:726167300549:catalog because no identity-based policy allows the glue:CreateDatabase action"}
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 2.227.0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): ScriptProcessor
- Framework version:
- Python version: 3.11.11
- CPU or GPU: CPU
- Custom Docker image (Y/N): N
Additional context