DeepAR training

Before you start, make sure you have worked through the preprocessing for DeepAR page.

At the end of the preprocessing for DeepAR page, you uploaded your JSON Lines data to S3, to locations similar to s3://bucketname/train/train-data.jsonl and s3://bucketname/test/test-data.jsonl.
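As a reminder, each line in those files is one time series in DeepAR's JSON Lines format: a "start" timestamp and a "target" array of observed values (optional "cat" and "dynamic_feat" fields also exist). A quick sanity-check sketch that writes two hourly series in this format (the file name and values here are made up for illustration):

```python
import json

# Two hourly series in DeepAR's JSON Lines format (one series per line).
# "start" is the first timestamp; "target" holds the observed values.
series = [
    {"start": "2024-01-01 00:00:00", "target": [5.0, 7.0, 6.5, 8.2]},
    {"start": "2024-01-01 00:00:00", "target": [1.1, 0.9, 1.4, 1.3]},
]

with open("train-data.jsonl", "w") as f:
    for record in series:
        f.write(json.dumps(record) + "\n")

# Once the file looks right, upload it, e.g. with
# sagemaker.Session().upload_data("train-data.jsonl", key_prefix="train").
```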

On a SageMaker notebook, initialize the estimator:

import sagemaker

session = sagemaker.Session()

region = session.boto_region_name

estimator = sagemaker.estimator.Estimator(
    image_uri=sagemaker.image_uris.retrieve("forecasting-deepar", region),
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.c5.2xlarge",  # adjust to your workload
    sagemaker_session=session,
)

Assume your timestamps are one hour apart, and you want to use the previous 10 values to predict the next value; set the hyperparameters as follows:

hyperparameters = {
    "time_freq": "1H",
    "epochs": "400",
    "early_stopping_patience": "40",
    "mini_batch_size": "64",
    "learning_rate": "5E-4",
    "context_length": "10",
    "prediction_length": "1"
}


Change "1H" to "6H" if your data points are six hours apart, or to "1D" if they are one day apart, for example. See the DeepAR hyperparameters documentation to learn more.
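One way to keep time_freq and prediction_length consistent is to pick the forecast horizon in hours first and derive the step count from the frequency. A small sketch of that arithmetic (the helper name and frequency table are mine, not part of the SDK):

```python
# Hours per step for the frequencies mentioned above (hypothetical lookup table).
HOURS_PER_STEP = {"1H": 1, "6H": 6, "1D": 24}

def steps_for_horizon(horizon_hours: int, time_freq: str) -> int:
    """How many steps (e.g. prediction_length) cover a horizon at this frequency."""
    step = HOURS_PER_STEP[time_freq]
    if horizon_hours % step != 0:
        raise ValueError(f"{horizon_hours}h is not a whole number of {time_freq} steps")
    return horizon_hours // step

# A 24-hour forecast is 24 steps at "1H", but only 4 steps at "6H".
```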

Next, train your DeepAR model using the SageMaker Python SDK:

estimator.set_hyperparameters(**hyperparameters)

data_channels = {
    "train": "s3://bucketname/train/",
    "test": "s3://bucketname/test/"
}

estimator.fit(inputs=data_channels, wait=True)

When specifying the input data paths, point to the folder rather than the .jsonl file itself. This is deliberate: the train folder, for example, may contain multiple .jsonl files, and SageMaker will use all of them.
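Because a channel points at a folder (an S3 prefix), you can split a large dataset into several shard files and drop them all under train/. A local sketch of that sharding step (the helper name, file names, and shard size are arbitrary choices for illustration):

```python
import json
import os

def write_shards(records, out_dir, shard_size=2):
    """Write records as multiple .jsonl shard files under out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i in range(0, len(records), shard_size):
        path = os.path.join(out_dir, f"train-data-{i // shard_size:03d}.jsonl")
        with open(path, "w") as f:
            for rec in records[i:i + shard_size]:
                f.write(json.dumps(rec) + "\n")
        paths.append(path)
    return paths

records = [{"start": "2024-01-01 00:00:00", "target": [float(i)]} for i in range(5)]
shards = write_shards(records, "train")
# Upload the whole folder (e.g. sagemaker.Session().upload_data("train", ...))
# and point the "train" channel at s3://bucketname/train/.
```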

Related content: