
SarimaAnomalyDetector

Versions

v1.0.0

Basic Information

Class Name: SarimaAnomalyDetector

Title: Seasonal ARIMA Anomaly Detector

Version: 1.0.0

Author: Simon Vedder

Organization: OneStream

Creation Date: 2024-05-24

Default Routine Memory Capacity: 2.0 GB

Tags

Anomaly, Time Series, Supervised, ML

Description

Short Description

Seasonal Autoregressive Integrated Moving Average (SARIMA) Anomaly Detector

Long Description

SARIMA extends the ARIMA model to handle time series with seasonal patterns by adding seasonal terms that capture periodic fluctuations. This routine fits a seasonal ARIMA model to each target and uses it to predict values in the time series, producing a confidence interval around each prediction. The width of the interval is set through the lower_confidence_interval and upper_confidence_interval hyperparameters. Each source value is then compared with its interval, and any point that falls outside the interval is marked as an anomaly. The default confidence interval is 95%; a larger value produces a wider interval and flags fewer anomalies than a smaller one. This routine performs best on highly seasonal data with shorter forecast horizons.
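The comparison step described above can be illustrated with a minimal sketch. It assumes the predictions and their interval bounds are already available as plain lists. The function name flag_anomalies, the side values "positive", "negative", and "both", and the choice to treat points exactly on a bound as normal are all assumptions made for illustration, not the routine's actual implementation.

```python
from typing import List, Sequence


def flag_anomalies(
    actual: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
    side: str = "both",
) -> List[bool]:
    """Return True for each point that falls outside its interval.

    Hypothetical helper: `side` values are assumed, not taken from the routine.
    """
    if side not in {"positive", "negative", "both"}:
        raise ValueError(f"unknown side: {side!r}")
    flags = []
    for value, low, high in zip(actual, lower, upper):
        above = value > high  # unexpectedly high
        below = value < low   # unexpectedly low
        if side == "positive":
            flags.append(above)
        elif side == "negative":
            flags.append(below)
        else:
            flags.append(above or below)
    return flags


# Made-up numbers: four observations with their interval bounds.
actual = [100.0, 130.0, 90.0, 60.0]
lower = [95.0, 98.0, 85.0, 80.0]
upper = [110.0, 115.0, 100.0, 105.0]

print(flag_anomalies(actual, lower, upper))              # [False, True, False, True]
print(flag_anomalies(actual, lower, upper, "positive"))  # [False, True, False, False]
```

Widening the interval (raising lower and lowering upper in reverse, so the band grows) can only turn True flags into False, which is why a larger confidence level flags fewer points.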

Use Cases

1. Seasonal Data Usage

Time series data from any industry that follows set seasonal patterns are ideal for this anomaly detection routine. Seasonal ARIMA models excel at identifying and following seasonal trends in their predictions. Industries with extremely seasonal data include retail, energy, agriculture, tourism, construction, education, healthcare, and more. The most common type of seasonality is yearly; however, seasonality can be seen and modeled at different time granularities. At the yearly level, retail data is seasonal due to increased demand during holidays and back-to-school periods. At the monthly level, a country club may see increased demand at the end of every month due to the monthly minimum spending requirement. At the weekly level, a restaurant or bar will likely see increased demand on Friday and Saturday nights. If seasonality is a critical aspect of the time series data, then this routine will detect datapoints that do not follow the normal observed seasonal patterns.

2. Detailed Usage Example

A national retailer monitors weekly store revenue, which shows strong holiday and back-to-school seasonality. They first collect two years of clean, timestamped sales and call the fit method on the historical series to let the Seasonal ARIMA model learn the baseline trend and weekly/annual seasonal patterns. After appending the latest week of sales and calling the predict method, they are provided with flagged anomaly datapoints. Points where sales are unexpectedly high (e.g., a promo glitch) or low (e.g., inventory outage) are flagged based on the chosen anomaly side (positive, negative, or both). Because the model was fitted on the stable seasonal history, predict highlights only those data points that deviate from the normal seasonal pattern, enabling fast investigation and alerting.
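The fit-then-predict flow in this example can be sketched with a deliberately simple stand-in. The class below is not SARIMA: it replaces the model with a per-season mean and standard deviation so that the two-step workflow is visible in a few lines. The class name SeasonalMeanDetector, the z multiplier, and the sample numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, List, Tuple


class SeasonalMeanDetector:
    """Toy stand-in for the routine (NOT SARIMA): per-season mean and spread."""

    def __init__(self, seasonal_period: int, z: float = 1.96) -> None:
        self.seasonal_period = seasonal_period
        self.z = z
        self.bounds: Dict[int, Tuple[float, float]] = {}

    def fit(self, history: List[float]) -> None:
        """Learn one (lower, upper) band per position in the seasonal cycle."""
        buckets = defaultdict(list)
        for index, value in enumerate(history):
            buckets[index % self.seasonal_period].append(value)
        for position, values in buckets.items():
            centre = mean(values)
            spread = stdev(values) if len(values) > 1 else 0.0
            self.bounds[position] = (centre - self.z * spread, centre + self.z * spread)

    def predict(self, start_index: int, new_values: List[float]) -> List[bool]:
        """Flag new points that fall outside the band learned for their season."""
        flags = []
        for offset, value in enumerate(new_values):
            low, high = self.bounds[(start_index + offset) % self.seasonal_period]
            flags.append(value < low or value > high)
        return flags


# Three clean cycles of a period-4 pattern, then one new cycle to check.
history = [10, 20, 30, 40, 11, 21, 29, 41, 9, 19, 31, 39]
detector = SeasonalMeanDetector(seasonal_period=4)
detector.fit(history)
print(detector.predict(len(history), [10.5, 35.0, 30.2, 40.0]))
# [False, True, False, False]
```

Only the second new point is flagged, because 35.0 is far from the values seen at that position of the cycle (19 to 21), even though it would look unremarkable against the series as a whole.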

Routine Methods

1. Init (Constructor)
  • Method: __init__
    • Type: Constructor

    • Memory Capacity: 2.0 GB

    • Allow In-Memory Execution: No

    • Read Only: No

    • Method Limits: There are no limits to the constructor method. This method simply saves the input parameters to be utilized in subsequent runs of the fit and predict methods.

    • Outputs Dynamic Artifacts: No

    • Short Description:

      • Seasonal ARIMA Construction.
    • Detailed Description:

      • This constructor initializes the hyperparameters and group name for the seasonal ARIMA anomaly detection routine.
    • Inputs:

      • Required Input
        • Hyperparameters: The hyperparameters of the anomaly detector.
          • Name: hyper_parameters
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of SARIMA Anomaly Detector Hyper Parameters
          • Nested Model: SARIMA Anomaly Detector Hyper Parameters
            • Required Input
              • Seasonal Period: Seasonality in the dataset. Valid inputs are 'auto' or an integer.
                • Name: seasonal_period
                • Tooltip:
                  • Detail:
                    • If set to auto, the model will determine the optimal seasonality for each target. Valid integer values are 1-52.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: int | Literal | NoneType
              • Train Intensity: Determines the size of the hyperparameter pool.
                • Name: train_intensity
                • Tooltip:
                  • Validation Constraints:
                    • The input must be greater than or equal to 1.
                    • The input must be less than or equal to 5.
                    • This input may be subject to other validation constraints at runtime.
                • Type: int
              • Lower Confidence Interval: Determines the size of the lower confidence interval.
                • Name: lower_confidence_interval
                • Tooltip:
                  • Validation Constraints:
                    • The input must be greater than or equal to 0.
                    • The input must be less than or equal to 100.
                    • This input may be subject to other validation constraints at runtime.
                • Type: float
              • Upper Confidence Interval: Determines the size of the upper confidence interval.
                • Name: upper_confidence_interval
                • Tooltip:
                  • Validation Constraints:
                    • The input must be greater than or equal to 0.
                    • The input must be less than or equal to 100.
                    • This input may be subject to other validation constraints at runtime.
                • Type: float
              • Side: Decides whether to detect anomalies on the positive, negative, or both sides of the data.
                • Name: side
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: SideType_
              • Logp1: Apply logp1 transformation to values before fitting the model.
                • Name: logp1
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: bool
        • Group Name: Group name for your anomaly detector, used as an identifier in the output artifact.
          • Name: group_name
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: str
    • Artifacts: No artifacts are returned by this method
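The documented constraints on the hyperparameters can be checked before submitting a run. The sketch below is a hypothetical pre-flight validator written against the ranges listed above; it is not part of the routine. The function name and the example values are assumptions. It does not validate side, because the allowed SideType_ values are not listed here, and it does not model the NoneType option that the seasonal_period type annotation mentions.

```python
from typing import Any, Dict


def validate_hyper_parameters(hp: Dict[str, Any]) -> Dict[str, Any]:
    """Check the documented ranges; raise ValueError on the first violation."""
    period = hp["seasonal_period"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_int = isinstance(period, int) and not isinstance(period, bool)
    if period != "auto" and not (is_int and 1 <= period <= 52):
        raise ValueError("seasonal_period must be 'auto' or an integer from 1 to 52")

    if not 1 <= hp["train_intensity"] <= 5:
        raise ValueError("train_intensity must be between 1 and 5")

    for key in ("lower_confidence_interval", "upper_confidence_interval"):
        if not 0 <= hp[key] <= 100:
            raise ValueError(f"{key} must be between 0 and 100")

    if not isinstance(hp["logp1"], bool):
        raise ValueError("logp1 must be a bool")
    return hp


# Illustrative values only; the real defaults may differ.
example = {
    "seasonal_period": "auto",
    "train_intensity": 3,
    "lower_confidence_interval": 95.0,
    "upper_confidence_interval": 95.0,
    "side": "both",  # assumed spelling of a SideType_ value
    "logp1": False,
}
validate_hyper_parameters(example)
```

The runtime may apply further checks ("other validation constraints at runtime"), so passing this sketch does not guarantee that the routine accepts the input.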

2. Fit (Method)
  • Method: fit
    • Type: Method

    • Memory Capacity: 2.0 GB

    • Allow In-Memory Execution: No

    • Read Only: No

    • Method Limits: This method is computationally expensive, and its performance is strongly influenced by the size of the dataset and the number of targets. With 10,000 targets and 1.1 million rows at 100 GB of memory, the method completes successfully in several hours. However, when the dataset grows to 15,000 targets and 7.5 million rows, the method times out after 5 hours and raises a timeout error. It is recommended to use the time range input to trim the amount of historical data the model is trained on.

    • Outputs Dynamic Artifacts: No

    • Short Description:

      • Fit the model.
    • Detailed Description:

      • The fit method will determine the optimal hyperparameters to use for each target. It will then fit a model to each target and store them for use in the predict method.
    • Inputs:

      • Required Input
        • Source Data Definition: The source data definition to use.
          • Name: source_data_definition
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of Time Series Source Data
          • Nested Model: Time Series Source Data
            • Required Input
              • Connection: The connection to the source data.
                • Name: data_connection
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: Must be an instance of Tabular Connection
                • Nested Model: Tabular Connection
                  • Required Input
                    • Connection: The connection type to use to access the source data.
                      • Name: tabular_connection
                      • Tooltip:
                        • Validation Constraints:
                          • This input may be subject to other validation constraints at runtime.
                      • Type: Must be one of the following
                        • SQL Server Connection
                          • Required Input
                            • Database Resource: The name of the database resource to connect to.
                              • Name: database_resource
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                            • Database Name: The name of the database to connect to.
                              • Name: database_name
                              • Tooltip:
                                • Detail:
                                  • Note: If you don’t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                            • Table Name: The name of the table to use.
                              • Name: table_name
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                        • MetaFileSystem Connection
                          • Required Input
                            • Connection Key: The MetaFileSystem connection key.
                              • Name: connection_key
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: MetaFileSystemConnectionKey
                            • File Path: The full file path to the file to ingest.
                              • Name: file_path
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                        • Partitioned MetaFileSystem Connection
                          • Required Input
                            • Connection Key: The MetaFileSystem connection key.
                              • Name: connection_key
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: MetaFileSystemConnectionKey
                            • File Type: The type of files to read from the directory.
                              • Name: file_type
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: FileExtensions_
                            • Directory Path: The full directory path containing partitioned tabular files.
                              • Name: directory_path
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
              • Dimension Columns: The columns to use as dimensions.
                • Name: dimension_columns
                • Tooltip:
                  • Validation Constraints:
                    • The input must have a minimum length of 1.
                    • This input may be subject to other validation constraints at runtime.
                • Type: list[str]
              • Date Column: The column to use as the date.
                • Name: date_column
                • Tooltip:
                  • Detail:
                    • The date column must be in a DateTime readable format.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
              • Value Column: The column to use as the value.
                • Name: value_column
                • Tooltip:
                  • Detail:
                    • The value column must be a numeric (int, float, double, decimal, etc.) column.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
        • Feature Data Definition: The feature data definition to use.
          • Name: feature_data_definitions
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: list[TimeSeriesTableDefinition]
      • Optional Input
        • Date Range: The date range to fit anomalies on.
          • Name: time_range
          • Tooltip:
            • Detail:
              • If None, entire dataset will be used.
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of Start and End Date
          • Nested Model: Start and End Date
            • Required Input
              • Start Date: The inclusive start of the date range (MM/DD/YYYY).
                • Name: start_date
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: datetime
              • End Date: The inclusive end of the date range (MM/DD/YYYY).
                • Name: end_date
                • Tooltip:
                  • Detail:
                    • Note, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: datetime
    • Artifacts: No artifacts are returned by this method
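The optional time range trims the history used for fitting. The sketch below shows one way such a filter could behave, following the tooltip note that this routine treats the end date as exclusive (the field description itself says inclusive, so this is an interpretation). The function name filter_time_range and the row layout are hypothetical.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Row = Tuple[datetime, float]
TimeRange = Tuple[datetime, datetime]


def filter_time_range(rows: List[Row], time_range: Optional[TimeRange] = None) -> List[Row]:
    """Keep rows with start_date <= date < end_date.

    A time_range of None keeps the entire dataset, as the tooltip describes.
    """
    if time_range is None:
        return list(rows)
    start_date, end_date = time_range
    return [row for row in rows if start_date <= row[0] < end_date]


# Four weekly observations with made-up values.
rows = [
    (datetime(2024, 1, 1), 100.0),
    (datetime(2024, 1, 8), 104.0),
    (datetime(2024, 1, 15), 98.0),
    (datetime(2024, 1, 22), 101.0),
]
trimmed = filter_time_range(rows, (datetime(2024, 1, 8), datetime(2024, 1, 22)))
print([date.strftime("%m/%d/%Y") for date, _ in trimmed])
# ['01/08/2024', '01/15/2024']
```

Under this reading, a row dated exactly on the end date is dropped, so the end date should be set one period past the last observation that is meant to be included.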

3. Predict (Method)
  • Method: predict
    • Type: Method

    • Memory Capacity: 2.0 GB

    • Allow In-Memory Execution: No

    • Read Only: No

    • Method Limits: The practical method limits for prediction are not well-defined, as this step has not shown any issues handling larger datasets and is generally not a bottleneck compared to the fit process. This step should complete in a matter of minutes or tens of minutes at most. Using 100 GB of memory, a dataset with 15,000 targets and 7.5 million rows completes in approximately 3 minutes for a year of predictions.

    • Outputs Dynamic Artifacts: No

    • Short Description:

      • Find Anomalies
    • Detailed Description:

      • This method will predict anomalies using the selected SARIMA model that was fit in the fit method.
    • Inputs:

      • Required Input
        • Source Data Definition: The source data definition to use.
          • Name: source_data_definition
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of Time Series Source Data
          • Nested Model: Time Series Source Data
            • Required Input
              • Connection: The connection to the source data.
                • Name: data_connection
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: Must be an instance of Tabular Connection
                • Nested Model: Tabular Connection
                  • Required Input
                    • Connection: The connection type to use to access the source data.
                      • Name: tabular_connection
                      • Tooltip:
                        • Validation Constraints:
                          • This input may be subject to other validation constraints at runtime.
                      • Type: Must be one of the following
                        • SQL Server Connection
                          • Required Input
                            • Database Resource: The name of the database resource to connect to.
                              • Name: database_resource
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                            • Database Name: The name of the database to connect to.
                              • Name: database_name
                              • Tooltip:
                                • Detail:
                                  • Note: If you don’t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                            • Table Name: The name of the table to use.
                              • Name: table_name
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                        • MetaFileSystem Connection
                          • Required Input
                            • Connection Key: The MetaFileSystem connection key.
                              • Name: connection_key
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: MetaFileSystemConnectionKey
                            • File Path: The full file path to the file to ingest.
                              • Name: file_path
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                        • Partitioned MetaFileSystem Connection
                          • Required Input
                            • Connection Key: The MetaFileSystem connection key.
                              • Name: connection_key
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: MetaFileSystemConnectionKey
                            • File Type: The type of files to read from the directory.
                              • Name: file_type
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: FileExtensions_
                            • Directory Path: The full directory path containing partitioned tabular files.
                              • Name: directory_path
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
              • Dimension Columns: The columns to use as dimensions.
                • Name: dimension_columns
                • Tooltip:
                  • Validation Constraints:
                    • The input must have a minimum length of 1.
                    • This input may be subject to other validation constraints at runtime.
                • Type: list[str]
              • Date Column: The column to use as the date.
                • Name: date_column
                • Tooltip:
                  • Detail:
                    • The date column must be in a DateTime readable format.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
              • Value Column: The column to use as the value.
                • Name: value_column
                • Tooltip:
                  • Detail:
                    • The value column must be a numeric (int, float, double, decimal, etc.) column.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
        • Feature Data Definition: The feature data definition to use.
          • Name: feature_data_definitions
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: list[TimeSeriesTableDefinition]
        • State Info Definition: The snapshot name and description. These will be used as identifiers in the snapshot artifact.
          • Name: state_info
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of State Info
          • Nested Model: State Info
            • Required Input
              • Name: The name of the anomaly detector instance.
                • Name: snapshot_name
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
              • Snapshot Description: The description of your anomaly detector instance.
                • Name: snapshot_description
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
      • Optional Input
        • Date Range: The date range to predict anomalies on.
          • Name: time_range
          • Tooltip:
            • Detail:
              • If None, entire dataset will be used.
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of Start and End Date
          • Nested Model: Start and End Date
            • Required Input
              • Start Date: The inclusive start of the date range (MM/DD/YYYY).
                • Name: start_date
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: datetime
              • End Date: The inclusive end of the date range (MM/DD/YYYY).
                • Name: end_date
                • Tooltip:
                  • Detail:
                    • Note, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: datetime
    • Artifacts:

      • AnomalySnapshot: Parquet file containing data about your anomaly detection run.

        • Qualified Key Annotation: anomaly_snapshot
        • Aggregate Artifact: False
        • In-Memory Json Accessible: False
        • File Annotations:
          • artifacts_/@anomaly_snapshot/data_/data_<int>.parquet
            • A partitioned set of parquet files where each file will have no more than 1000000 rows.
      • Specific Anomaly Dates: Parquet file containing data about the specific dates an anomaly was detected.

        • Qualified Key Annotation: anomaly_dates
        • Aggregate Artifact: False
        • In-Memory Json Accessible: False
        • File Annotations:
          • artifacts_/@anomaly_dates/data_/data_<int>.parquet
            • A partitioned set of parquet files where each file will have no more than 1000000 rows.
      • Specific Anomaly Instances: Parquet file containing data about the specific anomaly instances that were detected.

        • Qualified Key Annotation: anomaly_instance
        • Aggregate Artifact: False
        • In-Memory Json Accessible: False
        • File Annotations:
          • artifacts_/@anomaly_instance/data_/data_<int>.parquet
            • A partitioned set of parquet files where each file will have no more than 1000000 rows.
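Each artifact is written as a set of files named data_<int>.parquet with at most 1000000 rows per file. The sketch below shows how a consumer might list those files in numeric order and estimate the minimum number of files for a given row count. It uses only the standard library and does not read parquet contents. The helper names are hypothetical, and whether the routine fills each file to the limit is not stated, so the count is a lower bound only.

```python
import math
import re
import tempfile
from pathlib import Path
from typing import List

PART_PATTERN = re.compile(r"^data_(\d+)\.parquet$")
MAX_ROWS_PER_FILE = 1_000_000  # limit stated in the file annotations


def min_partitions(total_rows: int) -> int:
    """Smallest number of files that can hold total_rows under the per-file limit."""
    return math.ceil(total_rows / MAX_ROWS_PER_FILE) if total_rows > 0 else 0


def sorted_partition_files(data_dir: Path) -> List[Path]:
    """Return data_<int>.parquet files ordered by their integer suffix."""
    matches = []
    for path in data_dir.iterdir():
        found = PART_PATTERN.match(path.name)
        if found:
            matches.append((int(found.group(1)), path))
    return [path for _, path in sorted(matches, key=lambda item: item[0])]


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ("data_10.parquet", "data_2.parquet", "data_0.parquet", "notes.txt"):
        (folder / name).touch()
    names = [path.name for path in sorted_partition_files(folder)]

print(names)                      # ['data_0.parquet', 'data_2.parquet', 'data_10.parquet']
print(min_partitions(2_500_000))  # 3
```

Sorting by the parsed integer rather than by file name avoids placing data_10 before data_2, which a plain string sort would do.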

Interface Definitions

1. Anomaly Detection Interface

An interface class requiring fit and predict methods to be implemented.

This BaseRoutineInterface class enforces a common interface for all anomaly detection routines. The interface requires each anomaly detection routine to implement a fit method and a predict method with the same input parameters. Each concrete class has its own constructor, where hyperparameters specific to the anomaly detection algorithm may be set; this interface does not enforce any specific constructor signature.
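The contract described above resembles an abstract base class. The sketch below uses Python's abc module to show the idea. The class names are hypothetical, the parameter names follow the fit inputs listed for this interface, and predict is assumed to mirror them (the concrete predict method documented earlier additionally takes state_info). It is not the actual BaseRoutineInterface code.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Optional


class AnomalyDetectionInterface(ABC):
    """Hypothetical interface: subclasses must implement fit and predict."""

    @abstractmethod
    def fit(
        self,
        source_data_definition: Any,
        feature_data_definitions: List[Any],
        time_range: Optional[Any] = None,
    ) -> None:
        ...

    @abstractmethod
    def predict(
        self,
        source_data_definition: Any,
        feature_data_definitions: List[Any],
        time_range: Optional[Any] = None,
    ) -> Any:
        ...


class NoOpDetector(AnomalyDetectionInterface):
    """Trivial concrete class; the constructor is free-form, as the text notes."""

    def __init__(self, group_name: str) -> None:
        self.group_name = group_name
        self.fitted = False

    def fit(self, source_data_definition, feature_data_definitions, time_range=None):
        self.fitted = True

    def predict(self, source_data_definition, feature_data_definitions, time_range=None):
        if not self.fitted:
            raise RuntimeError("call fit before predict")
        return []  # a real routine would return its anomaly artifacts


detector = NoOpDetector(group_name="demo")
detector.fit(source_data_definition={}, feature_data_definitions=[])
print(detector.predict(source_data_definition={}, feature_data_definitions=[]))  # []
```

Instantiating AnomalyDetectionInterface directly raises TypeError, which is how the abstract base class enforces that every routine supplies both methods.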

Interface Methods:

1. Fit

Method Name: fit

Short Description: Abstract Fit Method

Detailed Description: This specifies the necessary input and output parameters for the fit method on all anomaly detection routines. The input parameters contain a source data definition and time range to fit an anomaly detector to.

Inputs:

| Property | Type | Required | Description |
|---|---|---|---|
| source_data_definition | #/$defs/TimeSeriesTableDefinition | Yes | The source data definition to use. |
| feature_data_definitions | array | Yes | The feature data definition to use. |
| time_range | #/$defs/StartDateEndDateDefinition or null | No | |

Input Schema (JSON):

{
"$defs": {
"FileExtensions_": {
"description": "File Extensions.",
"enum": [
".csv",
".tsv",
".psv",
".parquet",
".xlsx"
],
"title": "FileExtensions_",
"type": "string"
},
"FileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_path": {
"description": "The full file path to the file to ingest.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.filetable:FileTabularConnection.get_file_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_path",
"title": "File Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_path"
],
"title": "FileTabularConnection",
"type": "object"
},
"MetaFileSystemConnectionKey": {
"enum": [
"sql-server-routine",
"sql-server-shared"
],
"title": "MetaFileSystemConnectionKey",
"type": "string"
},
"PartitionedFileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_type": {
"$ref": "#/$defs/FileExtensions_",
"description": "The type of files to read from the directory.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "File Type",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"directory_path": {
"description": "The full directory path containing partitioned tabular files.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.partitionedfiletable:PartitionedFileTabularConnection.get_directory_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "Directory Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_type",
"directory_path"
],
"title": "PartitionedFileTabularConnection",
"type": "object"
},
"SqlTabularConnection": {
"properties": {
"database_resource": {
"description": "The name of the database resource to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_resources",
"options_callback_kwargs": null,
"state_name": "database_resource",
"title": "Database Resource",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"database_name": {
"description": "The name of the database to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_schemas",
"options_callback_kwargs": null,
"state_name": "database_name",
"title": "Database Name",
"tooltip": "Detail:\nNote: If you don\u2019t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"table_name": {
"description": "The name of the table to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_tables",
"options_callback_kwargs": null,
"state_name": "table_name",
"title": "Table Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"database_resource",
"database_name",
"table_name"
],
"title": "SqlTabularConnection",
"type": "object"
},
"StartDateEndDateDefinition": {
"properties": {
"start_date": {
"description": "The inclusive start of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "Start Date",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"end_date": {
"description": "The inclusive end of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "End Date",
"tooltip": "Detail:\nNote, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"start_date",
"end_date"
],
"title": "StartDateEndDateDefinition",
"type": "object"
},
"TabularConnection": {
"description": "A shared parameter base model dedicated to tabular connections.",
"properties": {
"tabular_connection": {
"anyOf": [
{
"$ref": "#/$defs/SqlTabularConnection"
},
{
"$ref": "#/$defs/FileTabularConnection"
},
{
"$ref": "#/$defs/PartitionedFileTabularConnection"
}
],
"description": "The connection type to use to access the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"tabular_connection"
],
"title": "TabularConnection",
"type": "object"
},
"TimeSeriesTableDefinition": {
"description": "A parameter base model dedicated to loading tabular time series data.",
"properties": {
"data_connection": {
"$ref": "#/$defs/TabularConnection",
"description": "The connection to the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "source_connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"dimension_columns": {
"description": "The columns to use as dimensions.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"type": "string"
},
"long_description": null,
"minItems": 1,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Dimension Columns",
"tooltip": "Validation Constraints:\nThe input must have a minimum length of 1.\n\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"date_column": {
"description": "The column to use as the date.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Date Column",
"tooltip": "Detail:\nThe date column must be in a DateTime readable format.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"value_column": {
"description": "The column to use as the value.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Value Column",
"tooltip": "Detail:\nThe value column must be a numeric (int, float, double, decimal, etc.) column.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"data_connection",
"dimension_columns",
"date_column",
"value_column"
],
"title": "TimeSeriesTableDefinition",
"type": "object"
}
},
"properties": {
"source_data_definition": {
"$ref": "#/$defs/TimeSeriesTableDefinition",
"description": "The source data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "SourceDefinition",
"title": "Source Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"feature_data_definitions": {
"description": "The feature data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"$ref": "#/$defs/TimeSeriesTableDefinition"
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "FeatureDefinition",
"title": "Feature Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"time_range": {
"anyOf": [
{
"$ref": "#/$defs/StartDateEndDateDefinition"
},
{
"type": "null"
}
],
"default": null,
"description": "The date range to fit anomalies on.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "DateRange",
"title": "Date Range",
"tooltip": "Detail:\nIf None, entire dataset will be used.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"source_data_definition",
"feature_data_definitions"
],
"title": "AnomalyDetectionFitParameters",
"type": "object"
}

Artifacts: No artifacts are returned by this method
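
As an illustration of how the fit inputs nest, the sketch below builds a parameters payload as a plain Python dictionary and works out which of the three connection types under `tabular_connection` it satisfies. The required-key sets are copied from the schema above. The resource, database, table, and column values, and the helper name `matching_connection_types`, are hypothetical placeholders, and runtime validation may impose further constraints.

```python
# Required keys for each connection type, copied from the "required" lists in the schema.
CONNECTION_REQUIRED_KEYS = {
    "SqlTabularConnection": {"database_resource", "database_name", "table_name"},
    "FileTabularConnection": {"connection_key", "file_path"},
    "PartitionedFileTabularConnection": {"connection_key", "file_type", "directory_path"},
}


def matching_connection_types(connection: dict) -> list:
    """Return the names of the connection types whose required keys are all present."""
    return [
        name
        for name, required in CONNECTION_REQUIRED_KEYS.items()
        if required.issubset(connection)
    ]


# Hypothetical fit parameters; every value below is a placeholder.
fit_parameters = {
    "source_data_definition": {
        "data_connection": {
            "tabular_connection": {
                "database_resource": "example_resource",
                "database_name": "example_database",
                "table_name": "daily_sales",
            }
        },
        "dimension_columns": ["store_id"],  # the schema requires at least one item
        "date_column": "sale_date",
        "value_column": "sales_amount",
    },
    "feature_data_definitions": [],  # required key; the schema sets no minimum length
    "time_range": None,  # None means the entire dataset is used
}

connection = fit_parameters["source_data_definition"]["data_connection"]["tabular_connection"]
matched_types = matching_connection_types(connection)
print(matched_types)
```

Checking required keys this way only mirrors the `anyOf` branch selection in the schema; it does not replace the validation the routine performs at runtime.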

2. Predict

Method Name: predict

Short Description: Abstract Predict Method

Detailed Description: Specifies the input and output parameters required by the predict method on all anomaly detection routines. The inputs comprise a source data definition, feature data definitions, an optional time range over which to detect anomalies, and state info that identifies the resulting snapshot.

Inputs:

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| source_data_definition | #/$defs/TimeSeriesTableDefinition | Yes | The source data definition to use. |
| feature_data_definitions | array | Yes | The feature data definition to use. |
| time_range | #/$defs/StartDateEndDateDefinition or null | No | The date range to predict anomalies on. |
| state_info | #/$defs/StateInfoDefinition | Yes | The snapshot name and description. These will be used as identifiers in the snapshot artifact. |
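
A minimal sketch of a predict payload that follows the inputs listed above, written as a Python dictionary and serialized to JSON. All connection, column, and snapshot values are hypothetical. The date strings use ISO 8601 on the assumption that the schema's `date-time` format applies; the field descriptions mention MM/DD/YYYY, so the format accepted at runtime may differ.

```python
import json
from datetime import datetime

# Hypothetical predict parameters; every value below is a placeholder.
predict_parameters = {
    "source_data_definition": {
        "data_connection": {
            "tabular_connection": {
                "connection_key": "sql-server-shared",  # one of the schema's enum values
                "file_path": "example/daily_sales.parquet",
            }
        },
        "dimension_columns": ["store_id"],
        "date_column": "sale_date",
        "value_column": "sales_amount",
    },
    "feature_data_definitions": [],
    "time_range": {
        # ISO 8601 strings are an assumption based on the "date-time" format.
        "start_date": datetime(2024, 1, 1).isoformat(),
        "end_date": datetime(2024, 4, 1).isoformat(),
    },
    "state_info": {
        "snapshot_name": "q1_sales_check",
        "snapshot_description": "Anomaly scan of Q1 daily sales.",
    },
}

# Top-level keys marked as required in the inputs listing.
REQUIRED_KEYS = ("source_data_definition", "feature_data_definitions", "state_info")
missing_keys = [key for key in REQUIRED_KEYS if key not in predict_parameters]

payload_json = json.dumps(predict_parameters, indent=2)
print(missing_keys)
print(payload_json)
```

Compared with the fit inputs, the only additional required key is `state_info`, which names the snapshot written to the artifacts.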

Input Schema (JSON):

{
"$defs": {
"FileExtensions_": {
"description": "File Extensions.",
"enum": [
".csv",
".tsv",
".psv",
".parquet",
".xlsx"
],
"title": "FileExtensions_",
"type": "string"
},
"FileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_path": {
"description": "The full file path to the file to ingest.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.filetable:FileTabularConnection.get_file_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_path",
"title": "File Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_path"
],
"title": "FileTabularConnection",
"type": "object"
},
"MetaFileSystemConnectionKey": {
"enum": [
"sql-server-routine",
"sql-server-shared"
],
"title": "MetaFileSystemConnectionKey",
"type": "string"
},
"PartitionedFileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_type": {
"$ref": "#/$defs/FileExtensions_",
"description": "The type of files to read from the directory.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "File Type",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"directory_path": {
"description": "The full directory path containing partitioned tabular files.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.partitionedfiletable:PartitionedFileTabularConnection.get_directory_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "Directory Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_type",
"directory_path"
],
"title": "PartitionedFileTabularConnection",
"type": "object"
},
"SqlTabularConnection": {
"properties": {
"database_resource": {
"description": "The name of the database resource to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_resources",
"options_callback_kwargs": null,
"state_name": "database_resource",
"title": "Database Resource",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"database_name": {
"description": "The name of the database to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_schemas",
"options_callback_kwargs": null,
"state_name": "database_name",
"title": "Database Name",
"tooltip": "Detail:\nNote: If you don\u2019t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"table_name": {
"description": "The name of the table to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_tables",
"options_callback_kwargs": null,
"state_name": "table_name",
"title": "Table Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"database_resource",
"database_name",
"table_name"
],
"title": "SqlTabularConnection",
"type": "object"
},
"StartDateEndDateDefinition": {
"properties": {
"start_date": {
"description": "The inclusive start of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "Start Date",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"end_date": {
"description": "The inclusive end of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "End Date",
"tooltip": "Detail:\nNote, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"start_date",
"end_date"
],
"title": "StartDateEndDateDefinition",
"type": "object"
},
"StateInfoDefinition": {
"properties": {
"snapshot_name": {
"description": "The name of the anomaly detector instance.",
"field_type": "input",
"input_component": {
"component_type": "textbox",
"height": null,
"multiline": false
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "snapshot_info",
"title": "Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"snapshot_description": {
"description": "The description of your anomaly detector instance.",
"field_type": "input",
"input_component": {
"component_type": "textbox",
"height": null,
"multiline": false
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "snapshot_info",
"title": "Snapshot Description",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"snapshot_name",
"snapshot_description"
],
"title": "StateInfoDefinition",
"type": "object"
},
"TabularConnection": {
"description": "A shared parameter base model dedicated to tabular connections.",
"properties": {
"tabular_connection": {
"anyOf": [
{
"$ref": "#/$defs/SqlTabularConnection"
},
{
"$ref": "#/$defs/FileTabularConnection"
},
{
"$ref": "#/$defs/PartitionedFileTabularConnection"
}
],
"description": "The connection type to use to access the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"tabular_connection"
],
"title": "TabularConnection",
"type": "object"
},
"TimeSeriesTableDefinition": {
"description": "A parameter base model dedicated to loading tabular time series data.",
"properties": {
"data_connection": {
"$ref": "#/$defs/TabularConnection",
"description": "The connection to the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "source_connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"dimension_columns": {
"description": "The columns to use as dimensions.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"type": "string"
},
"long_description": null,
"minItems": 1,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Dimension Columns",
"tooltip": "Validation Constraints:\nThe input must have a minimum length of 1.\n\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"date_column": {
"description": "The column to use as the date.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Date Column",
"tooltip": "Detail:\nThe date column must be in a DateTime readable format.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"value_column": {
"description": "The column to use as the value.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Value Column",
"tooltip": "Detail:\nThe value column must be a numeric (int, float, double, decimal, etc.) column.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"data_connection",
"dimension_columns",
"date_column",
"value_column"
],
"title": "TimeSeriesTableDefinition",
"type": "object"
}
},
"description": "Note that only the most recent fit will be utilized in predictions.",
"properties": {
"source_data_definition": {
"$ref": "#/$defs/TimeSeriesTableDefinition",
"description": "The source data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "SourceDefinition",
"title": "Source Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"feature_data_definitions": {
"description": "The feature data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"$ref": "#/$defs/TimeSeriesTableDefinition"
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "FeatureDefinition",
"title": "Feature Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"time_range": {
"anyOf": [
{
"$ref": "#/$defs/StartDateEndDateDefinition"
},
{
"type": "null"
}
],
"default": null,
"description": "The date range to predict anomalies on.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "DateRange",
"title": "Date Range",
"tooltip": "Detail:\nIf None, entire dataset will be used.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"state_info": {
"$ref": "#/$defs/StateInfoDefinition",
"description": "The snapshot name and description. These will be used as identifiers in the snapshot artifact.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "StateInfoDefinition",
"title": "State Info Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"source_data_definition",
"feature_data_definitions",
"state_info"
],
"title": "AnomalyDetectionPredictParameters",
"type": "object"
}
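
The End Date tooltip in the schema above notes that this routine treats the end date as exclusive, and the Date Range tooltip says the entire dataset is used when no range is given. Assuming the tooltip wording holds, that behavior can be pictured as a half-open interval, sketched below. The function `select_dates` is a hypothetical helper for illustration only and is not part of the routine.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def select_dates(
    dates: List[datetime],
    time_range: Optional[Tuple[datetime, datetime]] = None,
) -> List[datetime]:
    """Keep dates in [start, end); keep everything when time_range is None."""
    if time_range is None:
        return list(dates)
    start, end = time_range
    # Start is inclusive, end is exclusive.
    return [d for d in dates if start <= d < end]


days = [datetime(2024, 1, day) for day in (1, 2, 3)]
selected = select_dates(days, (datetime(2024, 1, 1), datetime(2024, 1, 3)))
print([d.date().isoformat() for d in selected])  # ['2024-01-01', '2024-01-02']
```

Under this reading, a range meant to cover all of January would set the end date to February 1 rather than January 31.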

Artifacts:

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| anomaly_snapshot | unknown | Yes | Parquet file containing data about your anomaly detection run. |
| anomaly_dates | unknown | Yes | Parquet file containing data about the specific dates an anomaly was detected. |
| anomaly_instance | unknown | Yes | Parquet file containing data about the specific anomaly instances that were detected. |

Artifact Schema (JSON):

{
"additionalProperties": true,
"properties": {
"anomaly_snapshot": {
"description": "Parquet file containing data about your anomaly detection run.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "AnomalySnapshot"
},
"anomaly_dates": {
"description": "Parquet file containing data about the specific dates an anomaly was detected.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "Specific Anomaly Dates"
},
"anomaly_instance": {
"description": "Parquet file containing data about the specific anomaly instances that were detected.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "Specific Anomaly Instances"
}
},
"required": [
"anomaly_snapshot",
"anomaly_dates",
"anomaly_instance"
],
"title": "AnomalyDetectionArtifacts",
"type": "object"
}

Developer Docs

Routine Typename: SarimaAnomalyDetector

| Method Name | Artifact Keys |
| --- | --- |
| __init__ | N/A |
| fit | N/A |
| predict | anomaly_snapshot, anomaly_dates, anomaly_instance |
