Autopilot preprocessing
SageMaker Autopilot first inspects your dataset, then runs a number of candidates to find the best combination of data preprocessing steps, machine learning algorithms, and hyperparameters.
At the time of writing, SageMaker Autopilot supports input data in tabular format and cleans and preprocesses it automatically.
So all you need to provide is a CSV file with headers. To make sure there are no missing headers or badly formatted rows, we recommend reading the CSV and writing it back out unchanged, as follows:
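Read data using pandas:
import pandas as pd
# Load the original CSV; the first row is treated as the header
data = pd.read_csv('file.csv')
# Write it back with the header row but without the pandas index column
data.to_csv('automl-train.csv', index=False, header=True)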
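The round trip above does not by itself tell you whether a header was missing. As a minimal sketch, the check below relies on the fact that pandas names a column "Unnamed: <position>" when its header cell is empty. The helper name find_header_problems is hypothetical and not part of pandas or SageMaker, and the whitespace check is an extra assumption about what counts as a badly formatted header.

```python
import pandas as pd


def find_header_problems(data: pd.DataFrame) -> list:
    """Return human-readable header problems found in a loaded CSV (hypothetical helper)."""
    problems = []
    for column in data.columns:
        name = str(column)
        # pandas names a column "Unnamed: <position>" when its header cell is empty.
        if name.startswith("Unnamed:"):
            problems.append(f"missing header: {name}")
        # Leading or trailing whitespace often points to a badly formatted header row.
        elif name != name.strip():
            problems.append(f"header has surrounding whitespace: {name!r}")
    return problems


# Example usage (assumes 'file.csv' exists locally):
# data = pd.read_csv('file.csv')
# for problem in find_header_problems(data):
#     print(problem)
```

If the list comes back empty, the headers at least passed these two checks. It does not validate the data rows themselves.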
If you need to select or drop any columns, please refer to this documentation.
Upload the resulting file to S3, at a location such as s3://bucket/prefix/automl-train.csv
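One way to do the upload from Python is with boto3, sketched below under a few assumptions: AWS credentials are already configured in your environment, and the bucket and prefix are the placeholders from the example location, not real resources. The helpers split_s3_uri and upload_csv are hypothetical names introduced for illustration; only boto3.client("s3") and its upload_file method come from boto3.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/prefix/file.csv' into ('bucket', 'prefix/file.csv')."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"Not a complete S3 URI: {s3_uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def upload_csv(local_path, s3_uri, s3_client=None):
    """Upload a local file to the given S3 URI and return (bucket, key)."""
    if s3_client is None:
        # Imported lazily so the URI helper also works where boto3 is not installed.
        import boto3
        s3_client = boto3.client("s3")
    bucket, key = split_s3_uri(s3_uri)
    # upload_file(Filename, Bucket, Key) is the boto3 S3 client call that does the transfer.
    s3_client.upload_file(local_path, bucket, key)
    return bucket, key


# Example usage (placeholder bucket and prefix; replace with your own):
# upload_csv("automl-train.csv", "s3://bucket/prefix/automl-train.csv")
```

Passing the client in as an argument is a design choice for this sketch: it lets you reuse an existing session or substitute a stub when testing.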