RollingStandardDeviationAnomalyDetector

Basic Information

Class Name: RollingStandardDeviationAnomalyDetector

Title: Rolling Standard Deviation Detector

Version: 1.0.0

Author: Simon Vedder

Organization: OneStream

Creation Date: 2024-02-28

Default Routine Memory Capacity: 2.0 GB

Tags

Anomaly, Time Series, Supervised, ML

Description

Short Description

Constructor for the Rolling Standard Deviation Anomaly Detection Routine.

Long Description

This anomaly detection technique detects volatility shift anomalies. The dates flagged by this routine indicate a large change in volatility on or after those dates. To detect these anomalies, the routine calculates a standard deviation difference at each point: the difference between the standard deviations of a forward rolling window and a backward rolling window. If that difference is above or below a threshold value, the point is marked as a volatility shift anomaly. The user tunes this threshold value through the threshold constant hyperparameter; a higher threshold constant makes the detector less sensitive.
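The standard deviation difference described above can be sketched in a few lines of plain Python. This is an illustration only, not the routine's implementation: the function name `std_difference`, the choice to exclude the current point from the backward window and include it in the forward window, and the use of the sample standard deviation are all assumptions.

```python
from statistics import stdev


def std_difference(values, backward_window, forward_window):
    """Return forward-minus-backward rolling standard deviation per point.

    Assumptions (not confirmed by the routine's documentation): the backward
    window covers the points strictly before index i, the forward window
    starts at index i, and points lacking a full window on either side
    are reported as None.
    """
    diffs = []
    for i in range(len(values)):
        back = values[max(0, i - backward_window):i]
        fwd = values[i:i + forward_window]
        full = len(back) == backward_window and len(fwd) == forward_window
        # stdev() needs at least two observations per window.
        if full and min(backward_window, forward_window) >= 2:
            diffs.append(stdev(fwd) - stdev(back))
        else:
            diffs.append(None)
    return diffs


# A flat series that becomes volatile at index 5.
series = [1, 1, 1, 1, 1, 10, -10, 10, -10, 10]
diffs = std_difference(series, backward_window=3, forward_window=3)
```

In this toy series the largest difference appears at index 5, where the backward window is still flat and the forward window is already volatile.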

Use Cases

1. Retail and E-commerce

In the retail and e-commerce sectors, this anomaly detector helps retroactively identify significant events that impacted business performance. For instance, a sudden spike in volatility might correlate with a viral marketing campaign, a major holiday shopping period, or external factors such as economic shifts or competitor actions. These events may be undocumented and therefore not yet included in the forecasting process. Identifying them allows the user to enrich their existing feature dataset, ideally improving their products' forecastability. Examining these events in historical data lets businesses quantify the impact of specific occurrences on sales and customer behavior, prepare for similar future events, optimize their marketing strategies, and improve decision-making. This approach helps maintain data integrity and deepens the understanding of market dynamics and consumer responses over time.

2. Manufacturing and Supply Chain

In the manufacturing and supply chain sectors, an anomaly detector that tracks volatility shifts in production metrics can indicate a period of poor supply chain performance. If this routine identifies sudden changes in the variability of production rates, quality control metrics, or supply chain performance, businesses can identify issues such as equipment failures, supply shortages, or bottlenecks. Early identification of these issues enables managers to make informed decisions quickly, whether adjusting production schedules, reallocating resources, or initiating maintenance protocols to minimize downtime. This routine also gives managers better insight into the history of their data and lets them view events that may affect their future forecasting. It can detect past supply chain failures that caused unusual behavior in the provided time series. If these failures were identified and resolved, the affected period does not reflect the normal behavior of the series and should therefore be cleaned, which can improve the data's forecastability.

Routine Methods

1. Init (Constructor)
  • Method: __init__
    • Type: Constructor

    • Memory Capacity: 2.0 GB

    • Allow In-Memory Execution: No

    • Read Only: No

    • Method Limits: There are no limits to the constructor method. This method simply saves the input parameters to be utilized in subsequent runs of the fit and predict methods.

    • Outputs Dynamic Artifacts: No

    • Short Description:

      • Constructor for the Rolling Standard Deviation Anomaly Detection Routine.
    • Detailed Description:

      • The rolling standard deviation constructor is used to set the group name, state info, and hyperparameters for this anomaly detection instance.
    • Inputs:

      • Required Input
        • Hyperparameters: The hyperparameters for double rolling median and double rolling standard deviation anomaly detectors.
          • Name: hyper_parameters
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of Rolling Median Anomaly Detection Hyper Parameters
          • Nested Model: Rolling Median Anomaly Detection Hyper Parameters
            • Required Input
              • Forward Rolling Window: The length of the forward rolling window in days, weeks, or months depending on the data frequency.
                • Name: forward_rolling_window
                • Tooltip:
                  • Validation Constraints:
                    • The input must be greater than 0.
                    • This input may be subject to other validation constraints at runtime.
                • Type: int
              • Backward Rolling Window: The length of the backward rolling window in days, weeks, or months depending on the data frequency.
                • Name: backward_rolling_window
                • Tooltip:
                  • Validation Constraints:
                    • The input must be greater than 0.
                    • This input may be subject to other validation constraints at runtime.
                • Type: int
              • Threshold Constant: Constant used for anomaly threshold. A higher threshold constant value makes it less likely to detect anomalies.
                • Name: threshold_constant
                • Tooltip:
                  • Validation Constraints:
                    • The input must be greater than 0.
                    • This input may be subject to other validation constraints at runtime.
                • Type: float
              • Detection Side: Decides whether to detect anomalies on the positive, negative, or both sides of the data.
                • Name: side
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: SideType_
        • Group Name: Group name for your anomaly detector, used as an identifier in the output artifact.
          • Name: group_name
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: str
    • Artifacts: No artifacts are returned by this method
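
The constructor inputs and their documented validation constraints can be pictured as a small validated container. The sketch below is a hypothetical stand-in written for illustration, not the platform's hyperparameter class: the class name `RollingStdHyperParameters` is invented, and because the members of `SideType_` are not listed here, the strings "positive", "negative", and "both" are assumed from the field description.

```python
from dataclasses import dataclass

# Assumed side values; the real SideType_ members are not documented above.
ASSUMED_SIDES = ("positive", "negative", "both")


@dataclass(frozen=True)
class RollingStdHyperParameters:
    """Hypothetical container mirroring the documented constructor inputs."""

    forward_rolling_window: int
    backward_rolling_window: int
    threshold_constant: float
    side: str

    def __post_init__(self):
        # Documented constraint: each of these inputs must be greater than 0.
        for name in ("forward_rolling_window", "backward_rolling_window",
                     "threshold_constant"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be greater than 0")
        if self.side not in ASSUMED_SIDES:
            raise ValueError(f"side must be one of {ASSUMED_SIDES}")


hyper_parameters = RollingStdHyperParameters(
    forward_rolling_window=7,
    backward_rolling_window=7,
    threshold_constant=1.5,
    side="both",
)
```

The platform may apply further validation at runtime, as the tooltips note; this sketch covers only the constraints stated in this section.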

2. Fit (Method)
  • Method: fit
    • Type: Method

    • Memory Capacity: 2.0 GB

    • Allow In-Memory Execution: No

    • Read Only: No

    • Method Limits: This method performs efficiently across different dataset sizes. Testing at 100 GB of memory shows that with 25,000 targets and 7.4 million rows (monthly data), the method completes in under 10 minutes. With 27,500 targets and 20.1 million rows (daily data), the method completes in under 10 minutes.

    • Outputs Dynamic Artifacts: No

    • Short Description:

      • Fit the model.
    • Detailed Description:

      • The fit method will take the data input and calculate a backward and forward rolling standard deviation of a specified window size. It then calculates the difference between the two standard deviations for each data point. The interquartile range of this difference is taken and will be used to determine whether data in the predict dataset is a volatility shift anomaly.
    • Inputs:

      • Required Input
        • Source Data Definition: The source data definition to use.
          • Name: source_data_definition
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of Time Series Source Data
          • Nested Model: Time Series Source Data
            • Required Input
              • Connection: The connection to the source data.
                • Name: data_connection
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: Must be an instance of Tabular Connection
                • Nested Model: Tabular Connection
                  • Required Input
                    • Connection: The connection type to use to access the source data.
                      • Name: tabular_connection
                      • Tooltip:
                        • Validation Constraints:
                          • This input may be subject to other validation constraints at runtime.
                      • Type: Must be one of the following
                        • SQL Server Connection
                          • Required Input
                            • Database Resource: The name of the database resource to connect to.
                              • Name: database_resource
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                            • Database Name: The name of the database to connect to.
                              • Name: database_name
                              • Tooltip:
                                • Detail:
                                  • Note: If you don’t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                            • Table Name: The name of the table to use.
                              • Name: table_name
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                        • MetaFileSystem Connection
                          • Required Input
                            • Connection Key: The MetaFileSystem connection key.
                              • Name: connection_key
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: MetaFileSystemConnectionKey
                            • File Path: The full file path to the file to ingest.
                              • Name: file_path
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                        • Partitioned MetaFileSystem Connection
                          • Required Input
                            • Connection Key: The MetaFileSystem connection key.
                              • Name: connection_key
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: MetaFileSystemConnectionKey
                            • File Type: The type of files to read from the directory.
                              • Name: file_type
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: FileExtensions_
                            • Directory Path: The full directory path containing partitioned tabular files.
                              • Name: directory_path
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
              • Dimension Columns: The columns to use as dimensions.
                • Name: dimension_columns
                • Tooltip:
                  • Validation Constraints:
                    • The input must have a minimum length of 1.
                    • This input may be subject to other validation constraints at runtime.
                • Type: list[str]
              • Date Column: The column to use as the date.
                • Name: date_column
                • Tooltip:
                  • Detail:
                    • The date column must be in a DateTime-readable format.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
              • Value Column: The column to use as the value.
                • Name: value_column
                • Tooltip:
                  • Detail:
                    • The value column must be a numeric (int, float, double, decimal, etc.) column.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
        • Feature Data Definition: The feature data definition to use.
          • Name: feature_data_definitions
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: list[TimeSeriesTableDefinition]
      • Optional Input
        • Date Range: The date range to fit anomalies on.
          • Name: time_range
          • Tooltip:
            • Detail:
              • If None, entire dataset will be used.
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of Start and End Date
          • Nested Model: Start and End Date
            • Required Input
              • Start Date: The inclusive start of the date range (MM/DD/YYYY).
                • Name: start_date
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: datetime
              • End Date: The inclusive end of the date range (MM/DD/YYYY).
                • Name: end_date
                • Tooltip:
                  • Detail:
                    • Note, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: datetime
    • Artifacts: No artifacts are returned by this method
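
The fit step described above reduces the per-point standard deviation differences to an interquartile range. A minimal sketch of that reduction follows, assuming the differences have already been computed and that points without a full window are represented as None. The function name `fit_iqr` is hypothetical, and the quartile interpolation used by `statistics.quantiles` may differ from the one the routine uses.

```python
from statistics import quantiles


def fit_iqr(std_differences):
    """Return (q1, q3) of the observed standard deviation differences.

    None entries (points without a full rolling window) are ignored.
    """
    observed = [d for d in std_differences if d is not None]
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _median, q3 = quantiles(observed, n=4)
    return q1, q3


q1, q3 = fit_iqr([None, 1, 2, 3, 4, 5, 6, 7, None])
iqr = q3 - q1
```

Storing the quartiles rather than only their difference keeps both edges of the range available when prediction data is later compared against it.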

3. Predict (Method)
  • Method: predict
    • Type: Method

    • Memory Capacity: 2.0 GB

    • Allow In-Memory Execution: No

    • Read Only: No

    • Method Limits: This method performs efficiently with various dataset sizes. Testing at 100 GB of memory shows that with 25,000 targets and 7.4 million rows (monthly data), the method completes in under 10 minutes. With 27,500 targets and 20.1 million rows (daily data), the method completes in under 10 minutes.

    • Outputs Dynamic Artifacts: No

    • Short Description:

      • Find anomalies using the fitted model.
    • Detailed Description:

      • The predict method takes the data input and calculates a backward and a forward rolling standard deviation using the specified window sizes. It then calculates the difference between the two standard deviations for each data point and compares that difference to the IQR calculated in the fit method. If the difference is above or below the IQR by a factor of the threshold constant c (set in the constructor), the point is marked as an anomaly in the returned anomaly artifacts.
    • Inputs:

      • Required Input
        • Source Data Definition: The source data definition to use.
          • Name: source_data_definition
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of Time Series Source Data
          • Nested Model: Time Series Source Data
            • Required Input
              • Connection: The connection to the source data.
                • Name: data_connection
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: Must be an instance of Tabular Connection
                • Nested Model: Tabular Connection
                  • Required Input
                    • Connection: The connection type to use to access the source data.
                      • Name: tabular_connection
                      • Tooltip:
                        • Validation Constraints:
                          • This input may be subject to other validation constraints at runtime.
                      • Type: Must be one of the following
                        • SQL Server Connection
                          • Required Input
                            • Database Resource: The name of the database resource to connect to.
                              • Name: database_resource
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                            • Database Name: The name of the database to connect to.
                              • Name: database_name
                              • Tooltip:
                                • Detail:
                                  • Note: If you don’t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                            • Table Name: The name of the table to use.
                              • Name: table_name
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                        • MetaFileSystem Connection
                          • Required Input
                            • Connection Key: The MetaFileSystem connection key.
                              • Name: connection_key
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: MetaFileSystemConnectionKey
                            • File Path: The full file path to the file to ingest.
                              • Name: file_path
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
                        • Partitioned MetaFileSystem Connection
                          • Required Input
                            • Connection Key: The MetaFileSystem connection key.
                              • Name: connection_key
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: MetaFileSystemConnectionKey
                            • File Type: The type of files to read from the directory.
                              • Name: file_type
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: FileExtensions_
                            • Directory Path: The full directory path containing partitioned tabular files.
                              • Name: directory_path
                              • Tooltip:
                                • Validation Constraints:
                                  • This input may be subject to other validation constraints at runtime.
                              • Type: str
              • Dimension Columns: The columns to use as dimensions.
                • Name: dimension_columns
                • Tooltip:
                  • Validation Constraints:
                    • The input must have a minimum length of 1.
                    • This input may be subject to other validation constraints at runtime.
                • Type: list[str]
              • Date Column: The column to use as the date.
                • Name: date_column
                • Tooltip:
                  • Detail:
                    • The date column must be in a DateTime-readable format.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
              • Value Column: The column to use as the value.
                • Name: value_column
                • Tooltip:
                  • Detail:
                    • The value column must be a numeric (int, float, double, decimal, etc.) column.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
        • Feature Data Definition: The feature data definition to use.
          • Name: feature_data_definitions
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: list[TimeSeriesTableDefinition]
        • State Info Definition: The snapshot name and description. These will be used as identifiers in the snapshot artifact.
          • Name: state_info
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of State Info
          • Nested Model: State Info
            • Required Input
              • Name: The name of the anomaly detector instance.
                • Name: snapshot_name
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
              • Snapshot Description: The description of your anomaly detector instance.
                • Name: snapshot_description
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: str
      • Optional Input
        • Date Range: The date range to predict anomalies on.
          • Name: time_range
          • Tooltip:
            • Detail:
              • If None, entire dataset will be used.
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of Start and End Date
          • Nested Model: Start and End Date
            • Required Input
              • Start Date: The inclusive start of the date range (MM/DD/YYYY).
                • Name: start_date
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: datetime
              • End Date: The inclusive end of the date range (MM/DD/YYYY).
                • Name: end_date
                • Tooltip:
                  • Detail:
                    • Note, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: datetime
    • Artifacts:

      • AnomalySnapshot: Parquet file containing data about your anomaly detection run.

        • Qualified Key Annotation: anomaly_snapshot
        • Aggregate Artifact: False
        • In-Memory Json Accessible: False
        • File Annotations:
          • artifacts_/@anomaly_snapshot/data_/data_<int>.parquet
            • A partitioned set of parquet files where each file will have no more than 1000000 rows.
      • Specific Anomaly Dates: Parquet file containing data about the specific dates an anomaly was detected.

        • Qualified Key Annotation: anomaly_dates
        • Aggregate Artifact: False
        • In-Memory Json Accessible: False
        • File Annotations:
          • artifacts_/@anomaly_dates/data_/data_<int>.parquet
            • A partitioned set of parquet files where each file will have no more than 1000000 rows.
      • Specific Anomaly Instances: Parquet file containing data about the specific anomaly instances that were detected.

        • Qualified Key Annotation: anomaly_instance
        • Aggregate Artifact: False
        • In-Memory Json Accessible: False
        • File Annotations:
          • artifacts_/@anomaly_instance/data_/data_<int>.parquet
            • A partitioned set of parquet files where each file will have no more than 1000000 rows.
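
The comparison that the predict method's detailed description outlines can be illustrated with a short sketch. The exact rule is not spelled out in this section, so the code assumes Tukey-style fences, flagging a point when its difference exceeds Q3 + c * IQR or falls below Q1 - c * IQR. The function name `flag_anomalies` and the side strings "positive", "negative", and "both" are also assumptions.

```python
def flag_anomalies(std_differences, q1, q3, threshold_constant, side="both"):
    """Return one boolean per point; True marks an assumed volatility shift.

    Assumed rule: upper fence = q3 + c * IQR, lower fence = q1 - c * IQR.
    None entries (no full rolling window) are never flagged.
    """
    iqr = q3 - q1
    upper = q3 + threshold_constant * iqr
    lower = q1 - threshold_constant * iqr
    flags = []
    for d in std_differences:
        if d is None:
            flags.append(False)
            continue
        high = side in ("positive", "both") and d > upper
        low = side in ("negative", "both") and d < lower
        flags.append(high or low)
    return flags


# With q1=2, q3=6 and c=1.5 the fences sit at -4 and 12.
flags = flag_anomalies([None, 3, 13, -5], q1=2, q3=6, threshold_constant=1.5)
```

Raising `threshold_constant` widens both fences, which matches the documented behavior that a higher constant makes the detector less sensitive.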

Interface Definitions

1. Anomaly Detection Interface

An interface class requiring fit and predict methods to be implemented.

This BaseRoutineInterface class enforces a common interface for all anomaly detection routines. The interface requires each anomaly detection routine to implement a fit method and a predict method with the same input parameters. Each concrete class has a constructor in which hyperparameters specific to its anomaly detection algorithm may be set; the interface itself does not enforce any specific constructor signature.
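
The contract can be pictured with Python's abstract base classes. The sketch below is a hypothetical stand-in, not the platform's BaseRoutineInterface: the class names are invented, and the parameter names are taken from the fit inputs documented on this page.

```python
from abc import ABC, abstractmethod


class AnomalyDetectionInterface(ABC):
    """Hypothetical interface: concrete detectors must implement fit and predict."""

    @abstractmethod
    def fit(self, source_data_definition, feature_data_definitions, time_range=None):
        """Fit the detector to the source data, optionally within a date range."""

    @abstractmethod
    def predict(self, source_data_definition, feature_data_definitions, time_range=None):
        """Detect anomalies in the source data, optionally within a date range."""


class ExampleDetector(AnomalyDetectionInterface):
    """Toy concrete class; the constructor signature is left to each routine."""

    def __init__(self, threshold_constant):
        self.threshold_constant = threshold_constant
        self.fitted = False

    def fit(self, source_data_definition, feature_data_definitions, time_range=None):
        self.fitted = True

    def predict(self, source_data_definition, feature_data_definitions, time_range=None):
        if not self.fitted:
            raise RuntimeError("fit must be called before predict")
        return []
```

A subclass that omits either method cannot be instantiated, which is how the abstract base class enforces the contract.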

Interface Methods:

1. Fit

Method Name: fit

Short Description: Abstract Fit Method

Detailed Description: This specifies the necessary input and output parameters for the fit method on all anomaly detection routines. The input parameters contain a source data definition and time range to fit an anomaly detector to.

Inputs:

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| source_data_definition | #/$defs/TimeSeriesTableDefinition | Yes | The source data definition to use. |
| feature_data_definitions | array | Yes | The feature data definition to use. |
| time_range | #/$defs/StartDateEndDateDefinition or null | No | |

Input Schema (JSON):

{
"$defs": {
"FileExtensions_": {
"description": "File Extensions.",
"enum": [
".csv",
".tsv",
".psv",
".parquet",
".xlsx"
],
"title": "FileExtensions_",
"type": "string"
},
"FileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_path": {
"description": "The full file path to the file to ingest.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.filetable:FileTabularConnection.get_file_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_path",
"title": "File Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_path"
],
"title": "FileTabularConnection",
"type": "object"
},
"MetaFileSystemConnectionKey": {
"enum": [
"sql-server-routine",
"sql-server-shared"
],
"title": "MetaFileSystemConnectionKey",
"type": "string"
},
"PartitionedFileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_type": {
"$ref": "#/$defs/FileExtensions_",
"description": "The type of files to read from the directory.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "File Type",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"directory_path": {
"description": "The full directory path containing partitioned tabular files.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.partitionedfiletable:PartitionedFileTabularConnection.get_directory_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "Directory Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_type",
"directory_path"
],
"title": "PartitionedFileTabularConnection",
"type": "object"
},
"SqlTabularConnection": {
"properties": {
"database_resource": {
"description": "The name of the database resource to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_resources",
"options_callback_kwargs": null,
"state_name": "database_resource",
"title": "Database Resource",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"database_name": {
"description": "The name of the database to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_schemas",
"options_callback_kwargs": null,
"state_name": "database_name",
"title": "Database Name",
"tooltip": "Detail:\nNote: If you don\u2019t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"table_name": {
"description": "The name of the table to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_tables",
"options_callback_kwargs": null,
"state_name": "table_name",
"title": "Table Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"database_resource",
"database_name",
"table_name"
],
"title": "SqlTabularConnection",
"type": "object"
},
"StartDateEndDateDefinition": {
"properties": {
"start_date": {
"description": "The inclusive start of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "Start Date",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"end_date": {
"description": "The inclusive end of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "End Date",
"tooltip": "Detail:\nNote, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"start_date",
"end_date"
],
"title": "StartDateEndDateDefinition",
"type": "object"
},
"TabularConnection": {
"description": "A shared parameter base model dedicated to tabular connections.",
"properties": {
"tabular_connection": {
"anyOf": [
{
"$ref": "#/$defs/SqlTabularConnection"
},
{
"$ref": "#/$defs/FileTabularConnection"
},
{
"$ref": "#/$defs/PartitionedFileTabularConnection"
}
],
"description": "The connection type to use to access the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"tabular_connection"
],
"title": "TabularConnection",
"type": "object"
},
"TimeSeriesTableDefinition": {
"description": "A parameter base model dedicated to loading tabular time series data.",
"properties": {
"data_connection": {
"$ref": "#/$defs/TabularConnection",
"description": "The connection to the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "source_connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"dimension_columns": {
"description": "The columns to use as dimensions.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"type": "string"
},
"long_description": null,
"minItems": 1,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Dimension Columns",
"tooltip": "Validation Constraints:\nThe input must have a minimum length of 1.\n\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"date_column": {
"description": "The column to use as the date.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Date Column",
"tooltip": "Detail:\nThe date column must be in a DateTime readable format.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"value_column": {
"description": "The column to use as the value.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Value Column",
"tooltip": "Detail:\nThe value column must be a numeric (int, float, double, decimal, etc.) column.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"data_connection",
"dimension_columns",
"date_column",
"value_column"
],
"title": "TimeSeriesTableDefinition",
"type": "object"
}
},
"properties": {
"source_data_definition": {
"$ref": "#/$defs/TimeSeriesTableDefinition",
"description": "The source data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "SourceDefinition",
"title": "Source Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"feature_data_definitions": {
"description": "The feature data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"$ref": "#/$defs/TimeSeriesTableDefinition"
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "FeatureDefinition",
"title": "Feature Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"time_range": {
"anyOf": [
{
"$ref": "#/$defs/StartDateEndDateDefinition"
},
{
"type": "null"
}
],
"default": null,
"description": "The date range to fit anomalies on.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "DateRange",
"title": "Date Range",
"tooltip": "Detail:\nIf None, entire dataset will be used.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"source_data_definition",
"feature_data_definitions"
],
"title": "AnomalyDetectionFitParameters",
"type": "object"
}

Artifacts: No artifacts are returned by this method.
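
As a rough illustration of how the fit input schema above nests, the sketch below builds a parameter payload as a plain Python dictionary and checks it against the `required` lists shown in the schema. Every value (resource, database, table, and column names) is a hypothetical placeholder, and the helper `missing_required` is not part of the routine's API. How the payload is actually submitted is outside the scope of this sketch.

```python
# Minimal sketch: a fit payload shaped like AnomalyDetectionFitParameters.
# Every value below is a hypothetical placeholder, not a real resource.

# Required keys copied from the "required" lists in the schema above.
FIT_REQUIRED = ("source_data_definition", "feature_data_definitions")
TABLE_REQUIRED = ("data_connection", "dimension_columns", "date_column", "value_column")


def missing_required(payload: dict, required: tuple) -> list:
    """Return the required keys that are absent from payload."""
    return [key for key in required if key not in payload]


source_definition = {
    "data_connection": {
        # SqlTabularConnection is one of the three allowed connection types.
        "tabular_connection": {
            "database_resource": "example_resource",  # hypothetical
            "database_name": "example_db",            # hypothetical
            "table_name": "daily_sales",              # hypothetical
        }
    },
    "dimension_columns": ["store_id"],  # the schema requires at least one item
    "date_column": "sale_date",         # hypothetical column name
    "value_column": "units_sold",       # hypothetical column name
}

fit_parameters = {
    "source_data_definition": source_definition,
    # Required key; passing an empty list is an assumption about runtime behavior.
    "feature_data_definitions": [],
    # Per the tooltip, None means the entire dataset is used.
    "time_range": None,
}

print(missing_required(fit_parameters, FIT_REQUIRED))       # top-level check
print(missing_required(source_definition, TABLE_REQUIRED))  # nested check
```

The check only covers key presence. The tooltips note that inputs may be subject to further validation at runtime, which a local check like this cannot reproduce.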

2. Predict

Method Name: predict

Short Description: Abstract Predict Method

Detailed Description: Specifies the input and output parameters of the predict method shared by all anomaly detection routines. The inputs include a source data definition and a time range over which to detect anomalies.

Inputs:

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| source_data_definition | `#/$defs/TimeSeriesTableDefinition` | Yes | The source data definition to use. |
| feature_data_definitions | array | Yes | The feature data definition to use. |
| time_range | `#/$defs/StartDateEndDateDefinition` or `null` | No | The date range to predict anomalies on. |
| state_info | `#/$defs/StateInfoDefinition` | Yes | The snapshot name and description. These will be used as identifiers in the snapshot artifact. |

Input Schema (JSON):

{
"$defs": {
"FileExtensions_": {
"description": "File Extensions.",
"enum": [
".csv",
".tsv",
".psv",
".parquet",
".xlsx"
],
"title": "FileExtensions_",
"type": "string"
},
"FileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_path": {
"description": "The full file path to the file to ingest.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.filetable:FileTabularConnection.get_file_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_path",
"title": "File Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_path"
],
"title": "FileTabularConnection",
"type": "object"
},
"MetaFileSystemConnectionKey": {
"enum": [
"sql-server-routine",
"sql-server-shared"
],
"title": "MetaFileSystemConnectionKey",
"type": "string"
},
"PartitionedFileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_type": {
"$ref": "#/$defs/FileExtensions_",
"description": "The type of files to read from the directory.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "File Type",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"directory_path": {
"description": "The full directory path containing partitioned tabular files.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.partitionedfiletable:PartitionedFileTabularConnection.get_directory_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "Directory Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_type",
"directory_path"
],
"title": "PartitionedFileTabularConnection",
"type": "object"
},
"SqlTabularConnection": {
"properties": {
"database_resource": {
"description": "The name of the database resource to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_resources",
"options_callback_kwargs": null,
"state_name": "database_resource",
"title": "Database Resource",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"database_name": {
"description": "The name of the database to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_schemas",
"options_callback_kwargs": null,
"state_name": "database_name",
"title": "Database Name",
"tooltip": "Detail:\nNote: If you don\u2019t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"table_name": {
"description": "The name of the table to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_tables",
"options_callback_kwargs": null,
"state_name": "table_name",
"title": "Table Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"database_resource",
"database_name",
"table_name"
],
"title": "SqlTabularConnection",
"type": "object"
},
"StartDateEndDateDefinition": {
"properties": {
"start_date": {
"description": "The inclusive start of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "Start Date",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"end_date": {
"description": "The inclusive end of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "End Date",
"tooltip": "Detail:\nNote, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"start_date",
"end_date"
],
"title": "StartDateEndDateDefinition",
"type": "object"
},
"StateInfoDefinition": {
"properties": {
"snapshot_name": {
"description": "The name of the anomaly detector instance.",
"field_type": "input",
"input_component": {
"component_type": "textbox",
"height": null,
"multiline": false
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "snapshot_info",
"title": "Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"snapshot_description": {
"description": "The description of your anomaly detector instance.",
"field_type": "input",
"input_component": {
"component_type": "textbox",
"height": null,
"multiline": false
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "snapshot_info",
"title": "Snapshot Description",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"snapshot_name",
"snapshot_description"
],
"title": "StateInfoDefinition",
"type": "object"
},
"TabularConnection": {
"description": "A shared parameter base model dedicated to tabular connections.",
"properties": {
"tabular_connection": {
"anyOf": [
{
"$ref": "#/$defs/SqlTabularConnection"
},
{
"$ref": "#/$defs/FileTabularConnection"
},
{
"$ref": "#/$defs/PartitionedFileTabularConnection"
}
],
"description": "The connection type to use to access the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"tabular_connection"
],
"title": "TabularConnection",
"type": "object"
},
"TimeSeriesTableDefinition": {
"description": "A parameter base model dedicated to loading tabular time series data.",
"properties": {
"data_connection": {
"$ref": "#/$defs/TabularConnection",
"description": "The connection to the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "source_connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"dimension_columns": {
"description": "The columns to use as dimensions.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"type": "string"
},
"long_description": null,
"minItems": 1,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Dimension Columns",
"tooltip": "Validation Constraints:\nThe input must have a minimum length of 1.\n\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"date_column": {
"description": "The column to use as the date.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Date Column",
"tooltip": "Detail:\nThe date column must be in a DateTime readable format.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"value_column": {
"description": "The column to use as the value.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Value Column",
"tooltip": "Detail:\nThe value column must be a numeric (int, float, double, decimal, etc.) column.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"data_connection",
"dimension_columns",
"date_column",
"value_column"
],
"title": "TimeSeriesTableDefinition",
"type": "object"
}
},
"description": "\"Note that only most recent fit will be utilized in predictions.\"",
"properties": {
"source_data_definition": {
"$ref": "#/$defs/TimeSeriesTableDefinition",
"description": "The source data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "SourceDefinition",
"title": "Source Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"feature_data_definitions": {
"description": "The feature data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"$ref": "#/$defs/TimeSeriesTableDefinition"
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "FeatureDefinition",
"title": "Feature Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"time_range": {
"anyOf": [
{
"$ref": "#/$defs/StartDateEndDateDefinition"
},
{
"type": "null"
}
],
"default": null,
"description": "The date range to predict anomalies on.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "DateRange",
"title": "Date Range",
"tooltip": "Detail:\nIf None, entire dataset will be used.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"state_info": {
"$ref": "#/$defs/StateInfoDefinition",
"description": "The snapshot name and description. These will be used as identifiers in the snapshot artifact.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "StateInfoDefinition",
"title": "State Info Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"source_data_definition",
"feature_data_definitions",
"state_info"
],
"title": "AnomalyDetectionPredictParameters",
"type": "object"
}
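
Compared with fit, the predict schema adds a required `state_info` object, and its optional `time_range` selects the dates to predict on. The sketch below shows one way to assemble those two pieces and sanity-check the dates before submitting. It assumes dates are written as MM/DD/YYYY, as the field descriptions state. The schema also declares `"format": "date-time"`, so the format accepted at runtime may differ. The helper names and all values are hypothetical, and the start/end ordering check is an added safeguard, not a documented constraint.

```python
from datetime import datetime
from typing import Optional

DATE_FORMAT = "%m/%d/%Y"  # assumption taken from the "(MM/DD/YYYY)" descriptions


def build_time_range(start_date: str, end_date: str) -> dict:
    """Build a StartDateEndDateDefinition-shaped dict after basic checks."""
    start = datetime.strptime(start_date, DATE_FORMAT)  # ValueError if malformed
    end = datetime.strptime(end_date, DATE_FORMAT)
    # Ordering check is an assumption; the schema itself does not state it.
    if end < start:
        raise ValueError("end_date must not be earlier than start_date")
    return {"start_date": start_date, "end_date": end_date}


def build_predict_extras(name: str, description: str,
                         time_range: Optional[dict] = None) -> dict:
    """Return the time_range and state_info fields of a predict payload."""
    return {
        "time_range": time_range,  # None means the entire dataset is used
        "state_info": {
            "snapshot_name": name,
            "snapshot_description": description,
        },
    }


extras = build_predict_extras(
    name="weekly-volatility-check",             # hypothetical
    description="Volatility scan of Q1 sales",  # hypothetical
    time_range=build_time_range("01/01/2024", "03/31/2024"),
)
print(extras)
```

A full predict payload would merge these fields with `source_data_definition` and `feature_data_definitions`, which are required here just as they are for fit.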

Artifacts:

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| anomaly_snapshot | unknown | Yes | Parquet file containing data about your anomaly detection run. |
| anomaly_dates | unknown | Yes | Parquet file containing data about the specific dates an anomaly was detected. |
| anomaly_instance | unknown | Yes | Parquet file containing data about the specific anomaly instances that were detected. |

Artifact Schema (JSON):

{
"additionalProperties": true,
"properties": {
"anomaly_snapshot": {
"description": "Parquet file containing data about your anomaly detection run.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "AnomalySnapshot"
},
"anomaly_dates": {
"description": "Parquet file containing data about the specific dates an anomaly was detected.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "Specific Anomaly Dates"
},
"anomaly_instance": {
"description": "Parquet file containing data about the specific anomaly instances that were detected.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "Specific Anomaly Instances"
}
},
"required": [
"anomaly_snapshot",
"anomaly_dates",
"anomaly_instance"
],
"title": "AnomalyDetectionArtifacts",
"type": "object"
}

Developer Docs

Routine Typename: RollingStandardDeviationAnomalyDetector

| Method Name | Artifact Keys |
| --- | --- |
| __init__ | N/A |
| fit | N/A |
| predict | anomaly_snapshot, anomaly_dates, anomaly_instance |
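
The method-to-artifact mapping above can be turned into a small lookup for checking that a run produced the artifacts it should. This is only a sketch: the `produced` argument stands in for whatever collection of artifact keys your environment reports, which is an assumption and not a documented interface.

```python
# Artifact keys per method, copied from the Developer Docs table above.
ARTIFACT_KEYS = {
    "__init__": (),
    "fit": (),
    "predict": ("anomaly_snapshot", "anomaly_dates", "anomaly_instance"),
}


def missing_artifacts(method_name: str, produced) -> list:
    """Return expected artifact keys for method_name that are not in produced."""
    expected = ARTIFACT_KEYS[method_name]  # raises KeyError for an unknown method
    produced_keys = set(produced)
    return [key for key in expected if key not in produced_keys]


# Hypothetical predict run that reported only two of the three artifacts.
print(missing_artifacts("predict", ["anomaly_snapshot", "anomaly_dates"]))
```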
