RollingMedianAnomalyDetector
Versions
v1.0.0
Basic Information
Class Name: RollingMedianAnomalyDetector
Title: Rolling Median Anomaly Detector
Version: 1.0.0
Author: Simon Vedder
Organization: OneStream
Creation Date: 2024-02-28
Default Routine Memory Capacity: 2.0 GB
Tags
Anomaly, Time Series, Supervised, ML, Level Shift Anomaly
Description
Short Description
Constructor for the Rolling Median Anomaly Detection Routine
Long Description
This anomaly detection technique detects level shift anomalies. The dates marked by this routine indicate that there was a large change in overall value on or after the detected dates. To detect these anomalies, the median difference is calculated at each point as the difference between the medians of a forward and a backward rolling window. If the median difference at a point is greater than or less than a threshold value, the point is marked as a level shift anomaly. The user tunes this threshold value by changing the threshold constant hyperparameter: a higher threshold constant makes the anomaly detector less sensitive.
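The median difference described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the routine's implementation: the function name `median_differences`, the choice to include the current point in the forward window, and the use of `None` for points without a full window on both sides are all assumptions.

```python
from statistics import median


def median_differences(values, backward_window, forward_window):
    """Return (forward median - backward median) for each index.

    Assumption: the backward window ends just before the point and the
    forward window starts at the point. Indexes that lack a full window
    on either side get None.
    """
    diffs = []
    for i in range(len(values)):
        back = values[max(0, i - backward_window):i]
        fwd = values[i:i + forward_window]
        if len(back) < backward_window or len(fwd) < forward_window:
            diffs.append(None)
        else:
            diffs.append(median(fwd) - median(back))
    return diffs


# A made-up series whose level jumps from about 10 to about 50 at index 6.
series = [10, 11, 9, 10, 12, 10, 50, 51, 49, 50, 52, 50]
diffs = median_differences(series, backward_window=3, forward_window=3)
```

In this sketch the difference stays near zero while both windows sit on the same level and grows large only when the two windows straddle the jump, which is the signal a level shift detector looks for.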
Use Cases
1. Store Fire
In retail settings, unexpected events like store fires can drastically affect sales of specific items, potentially distorting models used for forecasting. When a store is only partially operational due to such incidents, key product sales might plummet, necessitating data adjustments for accurate trend analysis. Our anomaly detection tool, which utilizes the median difference between forward and backward rolling windows, is designed to identify such level shift anomalies efficiently. By pinpointing the onset of a significant sales dip, marked as a 'level shift down', and the recovery phase, indicated by a 'level shift up', the system effectively delineates the impacted period in the dataset. Cleaning the data within this defined range removes the distortion caused by the anomaly, thus enhancing the accuracy of future forecasts. Furthermore, this method provides a historical benchmark, allowing us to estimate the duration and impact of similar events in other locations. For example, if a fire at Store A leads to an 80% reduction in sales of products X, Y, and Z over a 35-day period, we can project a comparable impact under similar circumstances at Store B. This predictive capability is invaluable for planning and mitigating risks associated with unforeseen disruptions in retail environments.
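As a rough illustration of how a 'level shift down' followed by a 'level shift up' can delineate an impacted period, the sketch below pairs the two kinds of flags. The function name `impacted_periods`, the tuple layout, and the direction labels "down" and "up" are hypothetical; the routine's actual output artifacts may be shaped differently.

```python
from datetime import date


def impacted_periods(flags):
    """Pair each 'down' flag with the next 'up' flag.

    flags: list of (date, direction) tuples sorted by date, where
    direction is "down" or "up" (hypothetical labels).
    Returns a list of (start_date, end_date) tuples.
    """
    periods = []
    start = None
    for day, direction in flags:
        if direction == "down" and start is None:
            start = day  # first dip opens the period
        elif direction == "up" and start is not None:
            periods.append((start, day))  # recovery closes it
            start = None
    return periods


# Made-up flags for one product at one store.
flags = [
    (date(2024, 3, 1), "down"),
    (date(2024, 3, 2), "down"),
    (date(2024, 4, 5), "up"),
]
periods = impacted_periods(flags)
duration_days = (periods[0][1] - periods[0][0]).days
```

Rows whose dates fall inside a returned period are the candidates for cleaning before the next forecasting run.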
2. Sales Promotion Period
During a promotional period, such as a buy-one-get-one-free sale on chip brand A, we generally expect a surge in sales for the promoted product. However, it's also crucial to consider the impact of such promotions on competing brands. Our anomaly detection tool can play a pivotal role here by identifying sales dynamics among competing products during and after the promotion. When brand A is promoted, if we observe a 'level shift down' anomaly in sales data for competing chip brands B and C, it indicates a decrease in their sales due to the promotion. Furthermore, if these brands exhibit a 'level shift up' anomaly at the end of the promotional period, it suggests a rebound effect, which can be critical for timing marketing strategies and managing inventory for all brands involved. Identifying these relationships can help the user make decisions about what features to apply to their next round of forecasting. Originally, the sales promotion of chip brand A was only supplied to chip brand A's forecast, but after identifying the relationship of this promotion to chip brands B and C, the decision is made to apply the sales promotion feature to their forecasts as well. This approach not only allows us to quantify the direct effects of promotions on the promoted brand, but on competing brands as well.
Routine Methods
1. Init (Constructor)
- Method: __init__
- Type: Constructor
- Memory Capacity: 2.0 GB
- Allow In-Memory Execution: No
- Read Only: No
- Method Limits: There are no limits to the constructor method. This method simply saves the input parameters to be utilized in subsequent runs of the fit and predict methods.
- Outputs Dynamic Artifacts: No
- Short Description: Constructor for the Rolling Median Anomaly Detection Routine.
- Detailed Description: The rolling median constructor is used to set the group name, state info, and hyperparameters for this anomaly detection instance.
- Inputs:
  - Required Input
    - Hyperparameters: The hyperparameters for double rolling median and double rolling standard deviation anomaly detectors.
      - Name: hyper_parameters
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Rolling Median Anomaly Detection Hyper Parameters
      - Nested Model: Rolling Median Anomaly Detection Hyper Parameters
        - Required Input
          - Forward Rolling Window: The length of the forward rolling window in days, weeks, or months depending on the data frequency.
            - Name: forward_rolling_window
            - Tooltip:
              - Validation Constraints:
                - The input must be greater than 0.
                - This input may be subject to other validation constraints at runtime.
            - Type: int
          - Backward Rolling Window: The length of the backward rolling window in days, weeks, or months depending on the data frequency.
            - Name: backward_rolling_window
            - Tooltip:
              - Validation Constraints:
                - The input must be greater than 0.
                - This input may be subject to other validation constraints at runtime.
            - Type: int
          - Threshold Constant: Constant used for anomaly threshold. A higher threshold constant value makes it less likely to detect anomalies.
            - Name: threshold_constant
            - Tooltip:
              - Validation Constraints:
                - The input must be greater than 0.
                - This input may be subject to other validation constraints at runtime.
            - Type: float
          - Detection Side: Decides whether to detect anomalies on the positive, negative, or both sides of the data.
            - Name: side
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: SideType_
    - Group Name: Group name for your anomaly detector, used as an identifier in the output artifact.
      - Name: group_name
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: str
- Artifacts: No artifacts are returned by this method.
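The validation constraints listed for the constructor inputs can be mirrored in a small stand-in model. The sketch below is not the routine's own class: `RollingMedianHyperParameters` and `Side` are hypothetical names, and the enum members are guessed from the phrase "positive, negative, or both" because the members of SideType_ are not listed in this document.

```python
from dataclasses import dataclass
from enum import Enum


class Side(Enum):
    # Hypothetical members; the real SideType_ values are not documented here.
    POSITIVE = "positive"
    NEGATIVE = "negative"
    BOTH = "both"


@dataclass(frozen=True)
class RollingMedianHyperParameters:
    forward_rolling_window: int
    backward_rolling_window: int
    threshold_constant: float
    side: Side

    def __post_init__(self):
        # Mirrors the documented constraint "The input must be greater than 0".
        for name in (
            "forward_rolling_window",
            "backward_rolling_window",
            "threshold_constant",
        ):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be greater than 0")


params = RollingMedianHyperParameters(
    forward_rolling_window=7,
    backward_rolling_window=7,
    threshold_constant=1.5,
    side=Side.BOTH,
)
```

Rejecting non-positive values at construction time matches the documented behavior that the constructor only stores parameters for later use by fit and predict.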
2. Fit (Method)
- Method: fit
- Type: Method
- Memory Capacity: 2.0 GB
- Allow In-Memory Execution: No
- Read Only: No
- Method Limits: This method performs efficiently across different dataset sizes. Testing at 100 GB of memory shows that with 25,000 targets and 7.4 million rows (monthly data), the method completes in under 10 minutes. With 27,500 targets and 20.1 million rows (daily data), the method completes in under 10 minutes.
- Outputs Dynamic Artifacts: No
- Short Description: Fit the rolling median anomaly detection model.
- Detailed Description: The fit method will take the data input and calculate a backward and forward rolling median of a specified window size. It then calculates the difference between the two medians for each data point. The interquartile range of this difference is taken and will be used to determine whether data in the predict dataset is a level shift anomaly.
- Inputs:
  - Required Input
    - Source Data Definition: The source data definition to use.
      - Name: source_data_definition
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Time Series Source Data
      - Nested Model: Time Series Source Data
        - Required Input
          - Connection: The connection to the source data.
            - Name: data_connection
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: Must be an instance of Tabular Connection
            - Nested Model: Tabular Connection
              - Required Input
                - Connection: The connection type to use to access the source data.
                  - Name: tabular_connection
                  - Tooltip:
                    - Validation Constraints:
                      - This input may be subject to other validation constraints at runtime.
                  - Type: Must be one of the following
                    - SQL Server Connection
                      - Required Input
                        - Database Resource: The name of the database resource to connect to.
                          - Name: database_resource
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                        - Database Name: The name of the database to connect to.
                          - Name: database_name
                          - Tooltip:
                            - Detail:
                              - Note: If you don’t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                        - Table Name: The name of the table to use.
                          - Name: table_name
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                    - MetaFileSystem Connection
                      - Required Input
                        - Connection Key: The MetaFileSystem connection key.
                          - Name: connection_key
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: MetaFileSystemConnectionKey
                        - File Path: The full file path to the file to ingest.
                          - Name: file_path
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                    - Partitioned MetaFileSystem Connection
                      - Required Input
                        - Connection Key: The MetaFileSystem connection key.
                          - Name: connection_key
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: MetaFileSystemConnectionKey
                        - File Type: The type of files to read from the directory.
                          - Name: file_type
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: FileExtensions_
                        - Directory Path: The full directory path containing partitioned tabular files.
                          - Name: directory_path
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
          - Dimension Columns: The columns to use as dimensions.
            - Name: dimension_columns
            - Tooltip:
              - Validation Constraints:
                - The input must have a minimum length of 1.
                - This input may be subject to other validation constraints at runtime.
            - Type: list[str]
          - Date Column: The column to use as the date.
            - Name: date_column
            - Tooltip:
              - Detail:
                - The date column must be in a DateTime readable format.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
          - Value Column: The column to use as the value.
            - Name: value_column
            - Tooltip:
              - Detail:
                - The value column must be a numeric (int, float, double, decimal, etc.) column.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
    - Feature Data Definition: The feature data definition to use.
      - Name: feature_data_definitions
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: list[TimeSeriesTableDefinition]
  - Optional Input
    - Date Range: The date range to fit anomalies on.
      - Name: time_range
      - Tooltip:
        - Detail:
          - If None, entire dataset will be used.
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Start and End Date
      - Nested Model: Start and End Date
        - Required Input
          - Start Date: The inclusive start of the date range (MM/DD/YYYY).
            - Name: start_date
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: datetime
          - End Date: The inclusive end of the date range (MM/DD/YYYY).
            - Name: end_date
            - Tooltip:
              - Detail:
                - Note, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: datetime
- Artifacts: No artifacts are returned by this method.
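The fit step described in this section (rolling median differences followed by an interquartile range) might be sketched as follows. The function name `fit_iqr`, the window alignment, and the "inclusive" quartile method are assumptions; the document does not say how the routine computes its quartiles or handles the edges of the series.

```python
from statistics import median, quantiles


def fit_iqr(values, backward_window, forward_window):
    """Return the interquartile range of the rolling median differences.

    Only indexes with a full backward and forward window are used
    (an assumption about edge handling).
    """
    diffs = []
    for i in range(backward_window, len(values) - forward_window + 1):
        back = values[i - backward_window:i]
        fwd = values[i:i + forward_window]
        diffs.append(median(fwd) - median(back))
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(diffs, n=4, method="inclusive")
    return q3 - q1


# Made-up training history.
history = [1, 2, 4, 7, 11, 16, 22]
iqr = fit_iqr(history, backward_window=2, forward_window=2)
```

The single number returned here plays the role of the fitted state: the predict step only needs the IQR and the hyperparameters to score new data.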
3. Predict (Method)
- Method: predict
- Type: Method
- Memory Capacity: 2.0 GB
- Allow In-Memory Execution: No
- Read Only: No
- Method Limits: This method performs efficiently with various dataset sizes. Testing at 100 GB of memory shows that with 25,000 targets and 7.4 million rows (monthly data), the method completes in under 10 minutes. With 27,500 targets and 20.1 million rows (daily data), the method completes in under 10 minutes.
- Outputs Dynamic Artifacts: No
- Short Description: Detect anomalies using the fitted model.
- Detailed Description: The predict method takes the data input and calculates a backward and a forward rolling median of the specified window sizes. It then calculates the difference between the two medians for each data point and compares that difference to the IQR calculated in the fit method. If the difference is above or below the IQR by a factor of c (the threshold constant set in the constructor), the point is marked as an anomaly in the returned anomaly artifacts.
- Inputs:
  - Required Input
    - Source Data Definition: The source data definition to use.
      - Name: source_data_definition
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Time Series Source Data
      - Nested Model: Time Series Source Data
        - Required Input
          - Connection: The connection to the source data.
            - Name: data_connection
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: Must be an instance of Tabular Connection
            - Nested Model: Tabular Connection
              - Required Input
                - Connection: The connection type to use to access the source data.
                  - Name: tabular_connection
                  - Tooltip:
                    - Validation Constraints:
                      - This input may be subject to other validation constraints at runtime.
                  - Type: Must be one of the following
                    - SQL Server Connection
                      - Required Input
                        - Database Resource: The name of the database resource to connect to.
                          - Name: database_resource
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                        - Database Name: The name of the database to connect to.
                          - Name: database_name
                          - Tooltip:
                            - Detail:
                              - Note: If you don’t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                        - Table Name: The name of the table to use.
                          - Name: table_name
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                    - MetaFileSystem Connection
                      - Required Input
                        - Connection Key: The MetaFileSystem connection key.
                          - Name: connection_key
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: MetaFileSystemConnectionKey
                        - File Path: The full file path to the file to ingest.
                          - Name: file_path
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                    - Partitioned MetaFileSystem Connection
                      - Required Input
                        - Connection Key: The MetaFileSystem connection key.
                          - Name: connection_key
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: MetaFileSystemConnectionKey
                        - File Type: The type of files to read from the directory.
                          - Name: file_type
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: FileExtensions_
                        - Directory Path: The full directory path containing partitioned tabular files.
                          - Name: directory_path
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
          - Dimension Columns: The columns to use as dimensions.
            - Name: dimension_columns
            - Tooltip:
              - Validation Constraints:
                - The input must have a minimum length of 1.
                - This input may be subject to other validation constraints at runtime.
            - Type: list[str]
          - Date Column: The column to use as the date.
            - Name: date_column
            - Tooltip:
              - Detail:
                - The date column must be in a DateTime readable format.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
          - Value Column: The column to use as the value.
            - Name: value_column
            - Tooltip:
              - Detail:
                - The value column must be a numeric (int, float, double, decimal, etc.) column.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
    - Feature Data Definition: The feature data definition to use.
      - Name: feature_data_definitions
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: list[TimeSeriesTableDefinition]
    - State Info Definition: The snapshot name and description. These will be used as identifiers in the snapshot artifact.
      - Name: state_info
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of State Info
      - Nested Model: State Info
        - Required Input
          - Name: The name of the anomaly detector instance.
            - Name: snapshot_name
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
          - Snapshot Description: The description of your anomaly detector instance.
            - Name: snapshot_description
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
  - Optional Input
    - Date Range: The date range to predict anomalies on.
      - Name: time_range
      - Tooltip:
        - Detail:
          - If None, entire dataset will be used.
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Start and End Date
      - Nested Model: Start and End Date
        - Required Input
          - Start Date: The inclusive start of the date range (MM/DD/YYYY).
            - Name: start_date
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: datetime
          - End Date: The inclusive end of the date range (MM/DD/YYYY).
            - Name: end_date
            - Tooltip:
              - Detail:
                - Note, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: datetime
- Artifacts:
  - AnomalySnapshot: Parquet file containing data about your anomaly detection run.
    - Qualified Key Annotation: anomaly_snapshot
    - Aggregate Artifact: False
    - In-Memory Json Accessible: False
    - File Annotations:
      - artifacts_/@anomaly_snapshot/data_/data_<int>.parquet
        - A partitioned set of parquet files where each file will have no more than 1000000 rows.
  - Specific Anomaly Dates: Parquet file containing data about the specific dates an anomaly was detected.
    - Qualified Key Annotation: anomaly_dates
    - Aggregate Artifact: False
    - In-Memory Json Accessible: False
    - File Annotations:
      - artifacts_/@anomaly_dates/data_/data_<int>.parquet
        - A partitioned set of parquet files where each file will have no more than 1000000 rows.
  - Specific Anomaly Instances: Parquet file containing data about the specific anomaly instances that were detected.
    - Qualified Key Annotation: anomaly_instance
    - Aggregate Artifact: False
    - In-Memory Json Accessible: False
    - File Annotations:
      - artifacts_/@anomaly_instance/data_/data_<int>.parquet
        - A partitioned set of parquet files where each file will have no more than 1000000 rows.
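The comparison made by the predict method can be sketched under one reading of "above or below the IQR by a factor of c": a point is flagged when its median difference is greater than c times the IQR or less than minus c times the IQR. That reading, the function name `predict_level_shifts`, the label strings, and the string values accepted for `side` are assumptions made for illustration only.

```python
from statistics import median


def predict_level_shifts(values, backward_window, forward_window, iqr, c, side="both"):
    """Label each index as a level shift up, a level shift down, or None.

    Assumption: the threshold is c * iqr, applied symmetrically.
    Indexes without a full window on both sides are left as None.
    """
    threshold = c * iqr
    labels = []
    for i in range(len(values)):
        if i < backward_window or i + forward_window > len(values):
            labels.append(None)
            continue
        back = values[i - backward_window:i]
        fwd = values[i:i + forward_window]
        diff = median(fwd) - median(back)
        if diff > threshold and side in ("both", "positive"):
            labels.append("level shift up")
        elif diff < -threshold and side in ("both", "negative"):
            labels.append("level shift down")
        else:
            labels.append(None)
    return labels


# Made-up sales that drop at index 4 and recover at index 8.
sales = [100, 102, 98, 101, 20, 22, 19, 21, 99, 101, 100, 102]
labels = predict_level_shifts(sales, 2, 2, iqr=3.0, c=3.0)
```

In this sketch the points next to a shift can also exceed the threshold, because their windows still straddle the change, so a raw flag list may contain short runs rather than single dates. Raising c widens the threshold and flags fewer points, which matches the documented effect of a higher threshold constant.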
Interface Definitions
1. Anomaly Detection Interface
An interface class requiring fit and predict methods to be implemented.
This BaseRoutineInterface class enforces a common interface for all anomaly detection routines. The interface requires each anomaly detection routine to implement a fit method and a predict method with the same input parameters. Each concrete class has a constructor method where hyperparameters specific to the anomaly detection algorithm may be set; however, this interface does not enforce any specific constructor method.
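The contract described above, with required fit and predict methods and an unconstrained constructor, could be expressed with Python's `abc` module roughly as follows. The class names `AnomalyDetectionInterface` and `AlwaysQuietDetector` are hypothetical stand-ins, and the parameter names are borrowed from the fit inputs listed in this document; this is not the platform's actual base class.

```python
from abc import ABC, abstractmethod


class AnomalyDetectionInterface(ABC):
    """Hypothetical stand-in for the interface described above."""

    @abstractmethod
    def fit(self, source_data_definition, feature_data_definitions, time_range=None):
        """Fit the detector to the given data."""

    @abstractmethod
    def predict(self, source_data_definition, feature_data_definitions, time_range=None):
        """Return detected anomalies for the given data."""


class AlwaysQuietDetector(AnomalyDetectionInterface):
    """Toy concrete class; its constructor signature is its own choice."""

    def __init__(self, threshold_constant):
        self.threshold_constant = threshold_constant
        self.is_fitted = False

    def fit(self, source_data_definition, feature_data_definitions, time_range=None):
        self.is_fitted = True

    def predict(self, source_data_definition, feature_data_definitions, time_range=None):
        if not self.is_fitted:
            raise RuntimeError("fit must be called before predict")
        return []  # this toy detector never reports anomalies


detector = AlwaysQuietDetector(threshold_constant=1.5)
detector.fit({}, [])
anomalies = detector.predict({}, [])
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated, which is how the sketch enforces the shared interface while leaving the constructor free.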
Interface Methods:
1. Fit
Method Name: fit
Short Description: Abstract Fit Method
Detailed Description: This specifies the necessary input and output parameters for the fit method on all anomaly detection routines. The input parameters contain a source data definition and time range to fit an anomaly detector to.
Inputs:
| Property | Type | Required | Description |
|---|---|---|---|
| source_data_definition | #/$defs/TimeSeriesTableDefinition | Yes | The source data definition to use. |
| feature_data_definitions | array | Yes | The feature data definition to use. |
| time_range | `#/$defs/StartDateEndDateDefinition \| null` | No | |
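To make the input table concrete, the sketch below builds one possible fit payload as a plain Python dictionary and checks it against the required property lists that appear in the input schema. The file path, column names, and the empty feature list are made-up illustration values, and `missing_required` is a hypothetical helper, not part of the platform.

```python
# Required keys copied from the "required" arrays in the input schema.
TOP_LEVEL_REQUIRED = ("source_data_definition", "feature_data_definitions")
TABLE_REQUIRED = ("data_connection", "dimension_columns", "date_column", "value_column")
FILE_CONNECTION_REQUIRED = ("connection_key", "file_path")

fit_inputs = {
    "source_data_definition": {
        "data_connection": {
            "tabular_connection": {
                # One of the MetaFileSystemConnectionKey enum values.
                "connection_key": "sql-server-shared",
                "file_path": "sales/history.csv",  # made-up path
            }
        },
        "dimension_columns": ["store", "product"],  # made-up column names
        "date_column": "date",
        "value_column": "units_sold",
    },
    "feature_data_definitions": [],  # assumption: an empty list is accepted
    "time_range": None,  # None means the entire dataset is used
}


def missing_required(payload, required):
    """Return the required keys that are absent from payload."""
    return [key for key in required if key not in payload]


table = fit_inputs["source_data_definition"]
connection = table["data_connection"]["tabular_connection"]
problems = (
    missing_required(fit_inputs, TOP_LEVEL_REQUIRED)
    + missing_required(table, TABLE_REQUIRED)
    + missing_required(connection, FILE_CONNECTION_REQUIRED)
)
```

This check covers only key presence. The schema also constrains types and requires at least one dimension column, and the document notes that further validation may apply at runtime.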
Input Schema (JSON):
{
"$defs": {
"FileExtensions_": {
"description": "File Extensions.",
"enum": [
".csv",
".tsv",
".psv",
".parquet",
".xlsx"
],
"title": "FileExtensions_",
"type": "string"
},
"FileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_path": {
"description": "The full file path to the file to ingest.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.filetable:FileTabularConnection.get_file_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_path",
"title": "File Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_path"
],
"title": "FileTabularConnection",
"type": "object"
},
"MetaFileSystemConnectionKey": {
"enum": [
"sql-server-routine",
"sql-server-shared"
],
"title": "MetaFileSystemConnectionKey",
"type": "string"
},
"PartitionedFileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_type": {
"$ref": "#/$defs/FileExtensions_",
"description": "The type of files to read from the directory.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "File Type",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"directory_path": {
"description": "The full directory path containing partitioned tabular files.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.partitionedfiletable:PartitionedFileTabularConnection.get_directory_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "Directory Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_type",
"directory_path"
],
"title": "PartitionedFileTabularConnection",
"type": "object"
},
"SqlTabularConnection": {
"properties": {
"database_resource": {
"description": "The name of the database resource to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_resources",
"options_callback_kwargs": null,
"state_name": "database_resource",
"title": "Database Resource",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"database_name": {
"description": "The name of the database to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_schemas",
"options_callback_kwargs": null,
"state_name": "database_name",
"title": "Database Name",
"tooltip": "Detail:\nNote: If you don\u2019t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"table_name": {
"description": "The name of the table to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_tables",
"options_callback_kwargs": null,
"state_name": "table_name",
"title": "Table Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"database_resource",
"database_name",
"table_name"
],
"title": "SqlTabularConnection",
"type": "object"
},
"StartDateEndDateDefinition": {
"properties": {
"start_date": {
"description": "The inclusive start of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "Start Date",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"end_date": {
"description": "The inclusive end of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "End Date",
"tooltip": "Detail:\nNote, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"start_date",
"end_date"
],
"title": "StartDateEndDateDefinition",
"type": "object"
},
"TabularConnection": {
"description": "A shared parameter base model dedication to tabular connections.",
"properties": {
"tabular_connection": {
"anyOf": [
{
"$ref": "#/$defs/SqlTabularConnection"
},
{
"$ref": "#/$defs/FileTabularConnection"
},
{
"$ref": "#/$defs/PartitionedFileTabularConnection"
}
],
"description": "The connection type to use to access the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"tabular_connection"
],
"title": "TabularConnection",
"type": "object"
},
"TimeSeriesTableDefinition": {
"description": "A parameter base model dedicated to loading tabular time series data.",
"properties": {
"data_connection": {
"$ref": "#/$defs/TabularConnection",
"description": "The connection to the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "source_connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"dimension_columns": {
"description": "The columns to use as dimensions.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"type": "string"
},
"long_description": null,
"minItems": 1,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Dimension Columns",
"tooltip": "Validation Constraints:\nThe input must have a minimum length of 1.\n\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"date_column": {
"description": "The column to use as the date.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Date Column",
"tooltip": "Detail:\nThe date column must be in a DateTime-readable format.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"value_column": {
"description": "The column to use as the value.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Value Column",
"tooltip": "Detail:\nThe value column must be a numeric (int, float, double, decimal, etc.) column.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"data_connection",
"dimension_columns",
"date_column",
"value_column"
],
"title": "TimeSeriesTableDefinition",
"type": "object"
}
},
"properties": {
"source_data_definition": {
"$ref": "#/$defs/TimeSeriesTableDefinition",
"description": "The source data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "SourceDefinition",
"title": "Source Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"feature_data_definitions": {
"description": "The feature data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"$ref": "#/$defs/TimeSeriesTableDefinition"
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "FeatureDefinition",
"title": "Feature Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"time_range": {
"anyOf": [
{
"$ref": "#/$defs/StartDateEndDateDefinition"
},
{
"type": "null"
}
],
"default": null,
"description": "The date range to fit anomalies on.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "DateRange",
"title": "Date Range",
"tooltip": "Detail:\nIf None, entire dataset will be used.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"source_data_definition",
"feature_data_definitions"
],
"title": "AnomalyDetectionFitParameters",
"type": "object"
}
Artifacts: No artifacts are returned by this method.
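To make the nesting of the fit input schema easier to follow, here is a minimal sketch of a parameter payload built as a Python dictionary. Every value (resource, database, table, and column names) is a hypothetical placeholder, and the `missing_required` helper is an illustration rather than part of the routine. It only checks the top-level `required` keys; the schema notes that inputs may be subject to further validation at runtime, so an empty `feature_data_definitions` list may or may not be accepted in practice.

```python
import json

# Hypothetical values throughout: the resource, database, table, and column
# names below are placeholders, not names from a real environment.
source_data_definition = {
    "data_connection": {
        # "tabular_connection" accepts one of three shapes (anyOf);
        # this one follows SqlTabularConnection.
        "tabular_connection": {
            "database_resource": "example_resource",
            "database_name": "example_db",
            "table_name": "daily_sales",
        }
    },
    "dimension_columns": ["store_id"],  # the schema sets minItems to 1
    "date_column": "sale_date",
    "value_column": "units_sold",
}

fit_parameters = {
    "source_data_definition": source_data_definition,
    "feature_data_definitions": [],  # required; the schema sets no minimum length
    "time_range": None,  # optional; None means the entire dataset is used
}

# Top-level keys listed under "required" in AnomalyDetectionFitParameters.
REQUIRED_FIT_KEYS = ("source_data_definition", "feature_data_definitions")


def missing_required(params, required):
    """Return the required top-level keys that are absent from params."""
    return [key for key in required if key not in params]


fit_missing = missing_required(fit_parameters, REQUIRED_FIT_KEYS)
print(json.dumps(fit_parameters, indent=2))
```

When serialized with `json.dumps`, `None` becomes `null`, which matches the `time_range` default in the schema.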
2. Predict
Method Name: predict
Short Description: Abstract Predict Method
Detailed Description: Specifies the input and output parameters of the predict method shared by all anomaly detection routines. The inputs are a source data definition, feature data definitions, an optional time range over which to detect anomalies, and state info that names and describes the resulting snapshot.
Inputs:
| Property | Type | Required | Description |
|---|---|---|---|
| source_data_definition | #/$defs/TimeSeriesTableDefinition | Yes | The source data definition to use. |
| feature_data_definitions | array | Yes | The feature data definition to use. |
| time_range | #/$defs/StartDateEndDateDefinition or null | No | The date range to predict anomalies on. |
| state_info | #/$defs/StateInfoDefinition | Yes | The snapshot name and description. These will be used as identifiers in the snapshot artifact. |
Input Schema (JSON):
{
"$defs": {
"FileExtensions_": {
"description": "File Extensions.",
"enum": [
".csv",
".tsv",
".psv",
".parquet",
".xlsx"
],
"title": "FileExtensions_",
"type": "string"
},
"FileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_path": {
"description": "The full file path to the file to ingest.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.filetable:FileTabularConnection.get_file_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_path",
"title": "File Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_path"
],
"title": "FileTabularConnection",
"type": "object"
},
"MetaFileSystemConnectionKey": {
"enum": [
"sql-server-routine",
"sql-server-shared"
],
"title": "MetaFileSystemConnectionKey",
"type": "string"
},
"PartitionedFileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_type": {
"$ref": "#/$defs/FileExtensions_",
"description": "The type of files to read from the directory.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "File Type",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"directory_path": {
"description": "The full directory path containing partitioned tabular files.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.partitionedfiletable:PartitionedFileTabularConnection.get_directory_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "Directory Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_type",
"directory_path"
],
"title": "PartitionedFileTabularConnection",
"type": "object"
},
"SqlTabularConnection": {
"properties": {
"database_resource": {
"description": "The name of the database resource to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_resources",
"options_callback_kwargs": null,
"state_name": "database_resource",
"title": "Database Resource",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"database_name": {
"description": "The name of the database to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_schemas",
"options_callback_kwargs": null,
"state_name": "database_name",
"title": "Database Name",
"tooltip": "Detail:\nNote: If you don\u2019t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"table_name": {
"description": "The name of the table to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_tables",
"options_callback_kwargs": null,
"state_name": "table_name",
"title": "Table Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"database_resource",
"database_name",
"table_name"
],
"title": "SqlTabularConnection",
"type": "object"
},
"StartDateEndDateDefinition": {
"properties": {
"start_date": {
"description": "The inclusive start of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "Start Date",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"end_date": {
"description": "The inclusive end of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "End Date",
"tooltip": "Detail:\nNote, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"start_date",
"end_date"
],
"title": "StartDateEndDateDefinition",
"type": "object"
},
"StateInfoDefinition": {
"properties": {
"snapshot_name": {
"description": "The name of the anomaly detector instance.",
"field_type": "input",
"input_component": {
"component_type": "textbox",
"height": null,
"multiline": false
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "snapshot_info",
"title": "Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"snapshot_description": {
"description": "The description of your anomaly detector instance.",
"field_type": "input",
"input_component": {
"component_type": "textbox",
"height": null,
"multiline": false
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "snapshot_info",
"title": "Snapshot Description",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"snapshot_name",
"snapshot_description"
],
"title": "StateInfoDefinition",
"type": "object"
},
"TabularConnection": {
"description": "A shared parameter base model dedicated to tabular connections.",
"properties": {
"tabular_connection": {
"anyOf": [
{
"$ref": "#/$defs/SqlTabularConnection"
},
{
"$ref": "#/$defs/FileTabularConnection"
},
{
"$ref": "#/$defs/PartitionedFileTabularConnection"
}
],
"description": "The connection type to use to access the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"tabular_connection"
],
"title": "TabularConnection",
"type": "object"
},
"TimeSeriesTableDefinition": {
"description": "A parameter base model dedicated to loading tabular time series data.",
"properties": {
"data_connection": {
"$ref": "#/$defs/TabularConnection",
"description": "The connection to the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "source_connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"dimension_columns": {
"description": "The columns to use as dimensions.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"type": "string"
},
"long_description": null,
"minItems": 1,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Dimension Columns",
"tooltip": "Validation Constraints:\nThe input must have a minimum length of 1.\n\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"date_column": {
"description": "The column to use as the date.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Date Column",
"tooltip": "Detail:\nThe date column must be in a DateTime-readable format.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"value_column": {
"description": "The column to use as the value.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Value Column",
"tooltip": "Detail:\nThe value column must be a numeric (int, float, double, decimal, etc.) column.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"data_connection",
"dimension_columns",
"date_column",
"value_column"
],
"title": "TimeSeriesTableDefinition",
"type": "object"
}
},
"description": "Note that only the most recent fit will be used in predictions.",
"properties": {
"source_data_definition": {
"$ref": "#/$defs/TimeSeriesTableDefinition",
"description": "The source data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "SourceDefinition",
"title": "Source Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"feature_data_definitions": {
"description": "The feature data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"$ref": "#/$defs/TimeSeriesTableDefinition"
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "FeatureDefinition",
"title": "Feature Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"time_range": {
"anyOf": [
{
"$ref": "#/$defs/StartDateEndDateDefinition"
},
{
"type": "null"
}
],
"default": null,
"description": "The date range to predict anomalies on.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "DateRange",
"title": "Date Range",
"tooltip": "Detail:\nIf None, entire dataset will be used.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"state_info": {
"$ref": "#/$defs/StateInfoDefinition",
"description": "The snapshot name and description. These will be used as identifiers in the snapshot artifact.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "StateInfoDefinition",
"title": "State Info Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"source_data_definition",
"feature_data_definitions",
"state_info"
],
"title": "AnomalyDetectionPredictParameters",
"type": "object"
}
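As an illustration of how the predict inputs fit together, the sketch below assembles a payload that uses a file-based connection, an explicit date range, and the required `state_info` block. The file path, column names, and snapshot text are hypothetical, and `build_time_range` is a helper invented for this example. The schema declares both dates with `format: date-time` and describes them as inclusive, so ISO 8601 strings are assumed here; the runtime may expect a different representation.

```python
from datetime import datetime


def build_time_range(start, end):
    """Build a StartDateEndDateDefinition-shaped dict from two datetimes."""
    if start > end:
        raise ValueError("start_date must not be after end_date")
    # Assumption: ISO 8601 strings satisfy the schema's "date-time" format.
    return {"start_date": start.isoformat(), "end_date": end.isoformat()}


predict_parameters = {
    "source_data_definition": {
        "data_connection": {
            # This shape follows FileTabularConnection; the path is a placeholder.
            "tabular_connection": {
                "connection_key": "sql-server-shared",  # one of the enum values
                "file_path": "example/daily_sales.parquet",
            }
        },
        "dimension_columns": ["store_id"],
        "date_column": "sale_date",
        "value_column": "units_sold",
    },
    "feature_data_definitions": [],
    "time_range": build_time_range(datetime(2024, 1, 1), datetime(2024, 3, 31)),
    "state_info": {
        "snapshot_name": "store-a-level-shifts",
        "snapshot_description": "Level shift scan for Store A, Q1 2024.",
    },
}

# Top-level keys listed under "required" in AnomalyDetectionPredictParameters.
REQUIRED_PREDICT_KEYS = (
    "source_data_definition",
    "feature_data_definitions",
    "state_info",
)
predict_missing = [k for k in REQUIRED_PREDICT_KEYS if k not in predict_parameters]
print(predict_missing)
```

Compared with the fit inputs, the only additional required key is `state_info`; `time_range` stays optional and can be left as `None` to use the entire dataset.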
Artifacts:
| Property | Type | Required | Description |
|---|---|---|---|
| anomaly_snapshot | unknown | Yes | Parquet file containing data about your anomaly detection run. |
| anomaly_dates | unknown | Yes | Parquet file containing data about the specific dates an anomaly was detected. |
| anomaly_instance | unknown | Yes | Parquet file containing data about the specific anomaly instances that were detected. |
Artifact Schema (JSON):
{
"additionalProperties": true,
"properties": {
"anomaly_snapshot": {
"description": "Parquet file containing data about your anomaly detection run.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "AnomalySnapshot"
},
"anomaly_dates": {
"description": "Parquet file containing data about the specific dates an anomaly was detected.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "Specific Anomaly Dates"
},
"anomaly_instance": {
"description": "Parquet file containing data about the specific anomaly instances that were detected.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "Specific Anomaly Instances"
}
},
"required": [
"anomaly_snapshot",
"anomaly_dates",
"anomaly_instance"
],
"title": "AnomalyDetectionArtifacts",
"type": "object"
}
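A caller may want to confirm that a predict run produced all three required artifacts before reading any of them. The sketch below assumes, purely for illustration, that the results arrive as a mapping from artifact key to a Parquet file location; the `check_artifacts` helper and the paths are hypothetical. Because the artifact schema sets `additionalProperties` to true, extra keys are reported but not treated as errors.

```python
# Keys listed under "required" in AnomalyDetectionArtifacts.
REQUIRED_ARTIFACTS = {"anomaly_snapshot", "anomaly_dates", "anomaly_instance"}


def check_artifacts(returned):
    """Return (missing, extra) artifact keys relative to the required set."""
    keys = set(returned)
    missing = sorted(REQUIRED_ARTIFACTS - keys)
    extra = sorted(keys - REQUIRED_ARTIFACTS)
    return missing, extra


# Hypothetical result mapping; the paths are placeholders.
returned_artifacts = {
    "anomaly_snapshot": "out/anomaly_snapshot.parquet",
    "anomaly_dates": "out/anomaly_dates.parquet",
}

missing, extra = check_artifacts(returned_artifacts)
if missing:
    print("Missing artifacts:", ", ".join(missing))
```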
Developer Docs
Routine Typename: RollingMedianAnomalyDetector
| Method Name | Artifact Keys |
|---|---|
| __init__ | N/A |
| fit | N/A |
| predict | anomaly_snapshot, anomaly_dates, anomaly_instance |