MatrixProfileAnomalyDetector
Versions
v1.0.0
Basic Information
Class Name: MatrixProfileAnomalyDetector
Title: Matrix Profile Anomaly Detector
Version: 1.0.0
Author: Simon Vedder
Organization: OneStream
Creation Date: 2024-04-25
Default Routine Memory Capacity: 2.0 GB
Tags
Anomaly, Time Series, Unsupervised, ML
Description
Short Description
Constructor for the Matrix Profile Anomaly Detection Routine
Long Description
This routine calculates the matrix profile of each target in the provided dataset. Subsequences whose nearest neighbor is farthest away are identified as anomalous. The user can choose the subsequence length to focus on, as well as the sensitivity of the anomaly detector.
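To make the idea concrete, the following is a minimal, brute-force sketch of a matrix profile in plain Python. It is not the routine's implementation: the function names, the z-normalization step, and the exclusion zone (skipping windows that overlap the one being scored) are assumptions chosen for illustration, and a production implementation would use a much faster algorithm.

```python
import math


def znormalize(window):
    """Return a z-normalized copy of a window (constant windows map to zeros)."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0.0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def matrix_profile(series, subsequence_size, knn=1):
    """Distance from each subsequence to its knn-th nearest non-overlapping neighbor."""
    m = subsequence_size
    n_windows = len(series) - m + 1
    windows = [znormalize(series[i:i + m]) for i in range(n_windows)]
    profile = []
    for i in range(n_windows):
        distances = sorted(
            math.dist(windows[i], windows[j])
            for j in range(n_windows)
            if abs(i - j) >= m  # assumed exclusion zone: ignore overlapping windows
        )
        # Use infinity when fewer than knn neighbors are available.
        profile.append(distances[knn - 1] if len(distances) >= knn else math.inf)
    return profile


# Illustrative data: a repeating pattern with two corrupted points.
series = [0.0, 1.0, 2.0, 1.0] * 8
series[16:20] = [0.0, 5.0, -3.0, 1.0]
profile = matrix_profile(series, subsequence_size=4, knn=1)

# Start index of the subsequence whose nearest neighbor is farthest away.
worst_start = max(range(len(profile)), key=profile.__getitem__)
```

Windows that repeat elsewhere in the series get a distance near zero, while windows covering the corrupted points have no close match and therefore stand out.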
Use Cases
1. Financial Fraud Detection
In financial fraud detection, transactional data such as credit card usage, stock trades, or bank transfers can be modeled as time series, with each transaction occurring sequentially over time. Detecting anomalies in these sequences can help uncover fraudulent activities that deviate from normal behavioral patterns. The matrix profile is particularly well-suited for this task, as it can efficiently compare every subsequence of the time series to find anomalies without needing prior knowledge of the types of fraud that might occur. For example, a series of unusually high-value transactions made within a short time frame may signal potential fraud. By computing the matrix profile, we can detect such patterns that do not conform to a user's typical transaction history. Other fraud indicators, such as sudden changes in spending behavior or geographic location, can also manifest as anomalous time series subsequences. The matrix profile excels at identifying these deviations automatically, making it a versatile tool for detecting various forms of financial fraud in a time series context.
2. Network Security
In network security, time series data representing traffic patterns—such as packet transfer rates, login frequencies, or bandwidth usage—can be analyzed for anomalies that signal potential cyber threats. Attacks like Distributed Denial of Service (DDoS) or unauthorized access often create abnormal patterns in network time series data. Applying a matrix profile to network traffic data allows for the detection of these anomalies by highlighting subsequences that deviate from normal traffic behavior. For instance, a sudden spike in network requests or unusual packet timing could indicate a DDoS attack. By continuously monitoring the time series for such irregular patterns, matrix profiles provide a powerful, unsupervised method to detect zero-day attacks or unusual user activity that might go unnoticed by rule-based systems. The ability of the matrix profile to identify anomalies in real-time enhances network security by enabling preemptive action against emerging threats in dynamic, high-volume traffic data.
Routine Methods
1. Init (Constructor)
- Method: __init__
- Type: Constructor
- Memory Capacity: 2.0 GB
- Allow In-Memory Execution: No
- Read Only: No
- Method Limits: N/A
- Outputs Dynamic Artifacts: No
- Short Description:
  - Constructor for the Matrix Profile Anomaly Detection Routine.
- Detailed Description:
  - The matrix profile is a series of values, each representing the distance from a subsequence to its kth nearest neighbor. The higher the value returned for a given date, the more likely that date is the start of a subsequence anomaly. To identify anomalies, we therefore look for peaks in the returned matrix profile series, using prominence to find them. Prominence is a measure of how much a peak stands out due to its height and location relative to other peaks. Using prominence helps avoid identifying two peaks in extremely close proximity to each other, a common issue with the matrix profile output. A higher prominence value means a peak has to stand out more in order to be identified.
- Inputs:
  - Required Input
    - HyperParameters: The hyper parameters of the anomaly detector.
      - Name: hyper_parameters
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Matrix Profile Anomaly Detector Hyper Parameters
      - Nested Model: Matrix Profile Anomaly Detector Hyper Parameters
        - Required Input
          - Subsequence Size: Subsequence size to use. The longer the subsequence, the longer the detected anomalies will be.
            - Name: subsequence_size
            - Tooltip:
              - Validation Constraints:
                - The input must be greater than 2.
                - This input may be subject to other validation constraints at runtime.
            - Type: int
          - Peak Prominence Threshold: A higher prominence value means a peak has to stand out more in order to be identified.
            - Name: prominence
            - Tooltip:
              - Validation Constraints:
                - The input must be greater than 0.
                - This input may be subject to other validation constraints at runtime.
            - Type: float
          - Kth Nearest Neighbor: This value determines how many similar subsequences to compare against.
            - Name: knn
            - Tooltip:
              - Detail:
                - A higher value is more robust to false positives, but may miss some true positives. A lower value is more likely to detect all true anomalies, as well as some false positives.
              - Validation Constraints:
                - The input must be greater than or equal to 1.
                - This input may be subject to other validation constraints at runtime.
            - Type: int
    - Group Name: Group name for your anomaly detector, used as an identifier in the output artifact.
      - Name: group_name
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: str
- Artifacts: No artifacts are returned by this method
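The constructor's description of prominence can be illustrated with a small, self-contained sketch. It follows the usual topographic definition of prominence (the peak height minus the higher of the two lowest points found before reaching a taller value on each side). Whether the routine computes prominence in exactly this way is an assumption, and the function names are hypothetical.

```python
def peak_prominences(values):
    """Return {index: prominence} for every strict local maximum in values."""
    result = {}
    for i in range(1, len(values) - 1):
        if not (values[i - 1] < values[i] > values[i + 1]):
            continue  # not a strict local maximum (plateaus are ignored here)
        # Walk left until a taller value or the edge, tracking the lowest point.
        left_min = values[i]
        j = i - 1
        while j >= 0 and values[j] <= values[i]:
            left_min = min(left_min, values[j])
            j -= 1
        # Walk right in the same way.
        right_min = values[i]
        j = i + 1
        while j < len(values) and values[j] <= values[i]:
            right_min = min(right_min, values[j])
            j += 1
        result[i] = values[i] - max(left_min, right_min)
    return result


def find_anomaly_starts(profile, prominence):
    """Indices of peaks whose prominence meets the threshold."""
    found = peak_prominences(profile)
    return [i for i in sorted(found) if found[i] >= prominence]


# Illustrative profile: a tall peak at index 2 with a minor bump right next to it
# at index 4, plus a separate, clearly isolated peak at index 7.
example_profile = [0.1, 0.2, 3.0, 2.8, 2.9, 0.2, 0.1, 1.0, 0.1]
strict = find_anomaly_starts(example_profile, prominence=0.5)
lenient = find_anomaly_starts(example_profile, prominence=0.05)
```

With the higher threshold, the bump at index 4 is dropped because it barely rises above the dip separating it from the taller neighbor; with the lower threshold it is reported as a separate peak. This is the close-proximity problem the prominence setting is meant to control.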
2. Fit (Method)
- Method: fit
- Type: Method
- Memory Capacity: 2.0 GB
- Allow In-Memory Execution: No
- Read Only: No
- Method Limits: N/A
- Outputs Dynamic Artifacts: No
- Short Description:
  - Fit the matrix profile model.
- Detailed Description:
  - The fit method takes the input data and saves it. If the fit data precedes the predict data, the predict method uses it to find nearest neighbors. Fit data itself is not considered for anomaly detection.
- Inputs:
  - Required Input
    - Source Data Definition: The source data definition to use.
      - Name: source_data_definition
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Time Series Source Data
      - Nested Model: Time Series Source Data
        - Required Input
          - Connection: The connection to the source data.
            - Name: data_connection
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: Must be an instance of Tabular Connection
            - Nested Model: Tabular Connection
              - Required Input
                - Connection: The connection type to use to access the source data.
                  - Name: tabular_connection
                  - Tooltip:
                    - Validation Constraints:
                      - This input may be subject to other validation constraints at runtime.
                  - Type: Must be one of the following
                    - SQL Server Connection
                      - Required Input
                        - Database Resource: The name of the database resource to connect to.
                          - Name: database_resource
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                        - Database Name: The name of the database to connect to.
                          - Name: database_name
                          - Tooltip:
                            - Detail:
                              - Note: If you don’t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                        - Table Name: The name of the table to use.
                          - Name: table_name
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                    - MetaFileSystem Connection
                      - Required Input
                        - Connection Key: The MetaFileSystem connection key.
                          - Name: connection_key
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: MetaFileSystemConnectionKey
                        - File Path: The full file path to the file to ingest.
                          - Name: file_path
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                    - Partitioned MetaFileSystem Connection
                      - Required Input
                        - Connection Key: The MetaFileSystem connection key.
                          - Name: connection_key
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: MetaFileSystemConnectionKey
                        - File Type: The type of files to read from the directory.
                          - Name: file_type
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: FileExtensions_
                        - Directory Path: The full directory path containing partitioned tabular files.
                          - Name: directory_path
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
          - Dimension Columns: The columns to use as dimensions.
            - Name: dimension_columns
            - Tooltip:
              - Validation Constraints:
                - The input must have a minimum length of 1.
                - This input may be subject to other validation constraints at runtime.
            - Type: list[str]
          - Date Column: The column to use as the date.
            - Name: date_column
            - Tooltip:
              - Detail:
                - The date column must be in a DateTime readable format.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
          - Value Column: The column to use as the value.
            - Name: value_column
            - Tooltip:
              - Detail:
                - The value column must be a numeric (int, float, double, decimal, etc.) column.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
    - Feature Data Definition: The feature data definition to use.
      - Name: feature_data_definitions
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: list[TimeSeriesTableDefinition]
  - Optional Input
    - Date Range: The date range to fit anomalies on.
      - Name: time_range
      - Tooltip:
        - Detail:
          - If None, entire dataset will be used.
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Start and End Date
      - Nested Model: Start and End Date
        - Required Input
          - Start Date: The inclusive start of the date range (MM/DD/YYYY).
            - Name: start_date
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: datetime
          - End Date: The inclusive end of the date range (MM/DD/YYYY).
            - Name: end_date
            - Tooltip:
              - Detail:
                - Note, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: datetime
- Artifacts: No artifacts are returned by this method
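One way to read the fit description is that fit data acts purely as reference history: predict windows are compared against it, but fit windows are never scored themselves. The sketch below illustrates that reading only. It is an assumption about the behavior, not the routine's code; it skips normalization for brevity, compares predict windows against the fit history alone, and the function name is hypothetical.

```python
import math


def nearest_distance_to_history(fit_values, predict_values, subsequence_size):
    """For each predict window, the distance to its closest window in the fit data."""
    m = subsequence_size
    history = [fit_values[i:i + m] for i in range(len(fit_values) - m + 1)]
    scores = []
    for start in range(len(predict_values) - m + 1):
        window = predict_values[start:start + m]
        # Only predict windows receive a score; fit windows are reference only.
        scores.append(min(math.dist(window, past) for past in history))
    return scores


# Illustrative data: the fit period contains only the regular pattern,
# while the predict period contains one unusual value.
fit_values = [0.0, 1.0, 2.0, 1.0] * 5
predict_values = [0.0, 1.0, 2.0, 1.0, 0.0, 9.0, 2.0, 1.0, 0.0, 1.0, 2.0, 1.0]
scores = nearest_distance_to_history(fit_values, predict_values, subsequence_size=4)
most_unusual_start = max(range(len(scores)), key=scores.__getitem__)
```

Predict windows that repeat a pattern already seen in the fit period score zero, and the windows covering the unusual value score highest.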
3. Predict (Method)
- Method: predict
- Type: Method
- Memory Capacity: 2.0 GB
- Allow In-Memory Execution: No
- Read Only: No
- Method Limits: N/A
  - Args:
    - api (RoutineApi): The api for the routine instance.
    - parameters (AnomalyDetectionPredictParameters): The input parameters for the predict method.
      - source_data_definition: The source data definition
      - state_info: The state name and description
      - time_range: A time range used to calculate fit data on.
      - feature_data_definitions: The feature data definitions
  - Returns: AnomalyDetectionArtifacts
- Outputs Dynamic Artifacts: No
- Short Description:
  - Detect anomalies using the fitted matrix profile model.
- Detailed Description:
  - The predict method takes the input data and uses the matrix profile to detect subsequence anomalies. It uses the fit data to get a sense of what patterns the data follows.
- Inputs:
  - Required Input
    - Source Data Definition: The source data definition to use.
      - Name: source_data_definition
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Time Series Source Data
      - Nested Model: Time Series Source Data
        - Required Input
          - Connection: The connection to the source data.
            - Name: data_connection
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: Must be an instance of Tabular Connection
            - Nested Model: Tabular Connection
              - Required Input
                - Connection: The connection type to use to access the source data.
                  - Name: tabular_connection
                  - Tooltip:
                    - Validation Constraints:
                      - This input may be subject to other validation constraints at runtime.
                  - Type: Must be one of the following
                    - SQL Server Connection
                      - Required Input
                        - Database Resource: The name of the database resource to connect to.
                          - Name: database_resource
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                        - Database Name: The name of the database to connect to.
                          - Name: database_name
                          - Tooltip:
                            - Detail:
                              - Note: If you don’t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                        - Table Name: The name of the table to use.
                          - Name: table_name
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                    - MetaFileSystem Connection
                      - Required Input
                        - Connection Key: The MetaFileSystem connection key.
                          - Name: connection_key
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: MetaFileSystemConnectionKey
                        - File Path: The full file path to the file to ingest.
                          - Name: file_path
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
                    - Partitioned MetaFileSystem Connection
                      - Required Input
                        - Connection Key: The MetaFileSystem connection key.
                          - Name: connection_key
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: MetaFileSystemConnectionKey
                        - File Type: The type of files to read from the directory.
                          - Name: file_type
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: FileExtensions_
                        - Directory Path: The full directory path containing partitioned tabular files.
                          - Name: directory_path
                          - Tooltip:
                            - Validation Constraints:
                              - This input may be subject to other validation constraints at runtime.
                          - Type: str
          - Dimension Columns: The columns to use as dimensions.
            - Name: dimension_columns
            - Tooltip:
              - Validation Constraints:
                - The input must have a minimum length of 1.
                - This input may be subject to other validation constraints at runtime.
            - Type: list[str]
          - Date Column: The column to use as the date.
            - Name: date_column
            - Tooltip:
              - Detail:
                - The date column must be in a DateTime readable format.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
          - Value Column: The column to use as the value.
            - Name: value_column
            - Tooltip:
              - Detail:
                - The value column must be a numeric (int, float, double, decimal, etc.) column.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
    - Feature Data Definition: The feature data definition to use.
      - Name: feature_data_definitions
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: list[TimeSeriesTableDefinition]
    - State Info Definition: The snapshot name and description. These will be used as identifiers in the snapshot artifact.
      - Name: state_info
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of State Info
      - Nested Model: State Info
        - Required Input
          - Name: The name of the anomaly detector instance.
            - Name: snapshot_name
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
          - Snapshot Description: The description of your anomaly detector instance.
            - Name: snapshot_description
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: str
  - Optional Input
    - Date Range: The date range to predict anomalies on.
      - Name: time_range
      - Tooltip:
        - Detail:
          - If None, entire dataset will be used.
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Start and End Date
      - Nested Model: Start and End Date
        - Required Input
          - Start Date: The inclusive start of the date range (MM/DD/YYYY).
            - Name: start_date
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: datetime
          - End Date: The inclusive end of the date range (MM/DD/YYYY).
            - Name: end_date
            - Tooltip:
              - Detail:
                - Note, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: datetime
- Artifacts:
  - AnomalySnapshot: Parquet file containing data about your anomaly detection run.
    - Qualified Key Annotation: anomaly_snapshot
    - Aggregate Artifact: False
    - In-Memory Json Accessible: False
    - File Annotations: artifacts_/@anomaly_snapshot/data_/data_<int>.parquet
      - A partitioned set of parquet files where each file will have no more than 1000000 rows.
  - Specific Anomaly Dates: Parquet file containing data about the specific dates an anomaly was detected.
    - Qualified Key Annotation: anomaly_dates
    - Aggregate Artifact: False
    - In-Memory Json Accessible: False
    - File Annotations: artifacts_/@anomaly_dates/data_/data_<int>.parquet
      - A partitioned set of parquet files where each file will have no more than 1000000 rows.
  - Specific Anomaly Instances: Parquet file containing data about the specific anomaly instances that were detected.
    - Qualified Key Annotation: anomaly_instance
    - Aggregate Artifact: False
    - In-Memory Json Accessible: False
    - File Annotations: artifacts_/@anomaly_instance/data_/data_<int>.parquet
      - A partitioned set of parquet files where each file will have no more than 1000000 rows.
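The file annotations above imply a simple relationship between the number of output rows and the number of partition files. The sketch below works through that arithmetic. The path template and the 1000000-row limit come from the annotations; the starting value of `<int>` is not stated, so the default of 0 here is an assumption, and the helper name is hypothetical.

```python
import math

MAX_ROWS_PER_FILE = 1_000_000  # row limit stated in the file annotations


def partition_paths(qualified_key, total_rows, first_index=0):
    """List the partition files needed to hold total_rows rows.

    first_index is an assumption: the documentation does not say whether
    the <int> suffix starts at 0 or 1.
    """
    n_files = math.ceil(total_rows / MAX_ROWS_PER_FILE)
    return [
        f"artifacts_/@{qualified_key}/data_/data_{first_index + i}.parquet"
        for i in range(n_files)
    ]


# 2.5 million rows cannot fit in two files of at most one million rows each.
paths = partition_paths("anomaly_dates", total_rows=2_500_000)
```

The same helper applies to `anomaly_snapshot` and `anomaly_instance`, since all three artifacts share the same partitioning rule.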
Interface Definitions
1. Anomaly Detection Interface
An interface class requiring fit and predict methods to be implemented.
This BaseRoutineInterface class enforces a common interface for all anomaly detection routines. The interface requires each anomaly detection routine to implement a fit method and a predict method with the same input parameters. Each concrete class has its own constructor, where hyperparameters specific to the anomaly detection algorithm may be set; however, this interface does not enforce any specific constructor method.
Interface Methods:
1. Fit
Method Name: fit
Short Description: Abstract Fit Method
Detailed Description: This specifies the necessary input and output parameters for the fit method on all anomaly detection routines. The input parameters contain a source data definition and time range to fit an anomaly detector to.
Inputs:
| Property | Type | Required | Description |
|---|---|---|---|
| source_data_definition | #/$defs/TimeSeriesTableDefinition | Yes | The source data definition to use. |
| feature_data_definitions | array | Yes | The feature data definition to use. |
| time_range | `#/$defs/StartDateEndDateDefinition \| null` | No | The date range to fit anomalies on. |
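As an illustration of how these three properties fit together, here is a sketch of a fit input expressed as a Python dictionary, with a small check against the required fields listed in the schema. All values (resource, database, table, and column names, and the dates) are hypothetical placeholders. The example uses the SQL Server variant of the tabular connection, and `check_source_definition` is an illustrative helper, not part of the routine.

```python
fit_parameters = {
    "source_data_definition": {
        "data_connection": {
            "tabular_connection": {
                # SQL Server variant; all three names are placeholders.
                "database_resource": "example_resource",
                "database_name": "example_database",
                "table_name": "daily_sales",
            }
        },
        "dimension_columns": ["store_id"],  # at least one entry is required
        "date_column": "sale_date",
        "value_column": "sales_amount",
    },
    "feature_data_definitions": [],
    # Optional; None would mean the entire dataset is used.
    "time_range": {
        "start_date": "2024-01-01T00:00:00Z",
        "end_date": "2024-03-31T00:00:00Z",
    },
}


def check_source_definition(definition):
    """Return a list of problems found against the schema's required fields."""
    problems = []
    for key in ("data_connection", "dimension_columns", "date_column", "value_column"):
        if key not in definition:
            problems.append(f"missing {key}")
    if "dimension_columns" in definition and len(definition["dimension_columns"]) < 1:
        problems.append("dimension_columns needs at least one entry")
    connection = definition.get("data_connection", {})
    if "data_connection" in definition and "tabular_connection" not in connection:
        problems.append("missing data_connection.tabular_connection")
    return problems


problems = check_source_definition(fit_parameters["source_data_definition"])
```

A check like this only covers the structural requirements; the documentation notes that inputs may be subject to further validation at runtime.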
Input Schema (JSON):
{
"$defs": {
"FileExtensions_": {
"description": "File Extensions.",
"enum": [
".csv",
".tsv",
".psv",
".parquet",
".xlsx"
],
"title": "FileExtensions_",
"type": "string"
},
"FileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_path": {
"description": "The full file path to the file to ingest.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.filetable:FileTabularConnection.get_file_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_path",
"title": "File Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_path"
],
"title": "FileTabularConnection",
"type": "object"
},
"MetaFileSystemConnectionKey": {
"enum": [
"sql-server-routine",
"sql-server-shared"
],
"title": "MetaFileSystemConnectionKey",
"type": "string"
},
"PartitionedFileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_type": {
"$ref": "#/$defs/FileExtensions_",
"description": "The type of files to read from the directory.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "File Type",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"directory_path": {
"description": "The full directory path containing partitioned tabular files.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.partitionedfiletable:PartitionedFileTabularConnection.get_directory_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "Directory Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_type",
"directory_path"
],
"title": "PartitionedFileTabularConnection",
"type": "object"
},
"SqlTabularConnection": {
"properties": {
"database_resource": {
"description": "The name of the database resource to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_resources",
"options_callback_kwargs": null,
"state_name": "database_resource",
"title": "Database Resource",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"database_name": {
"description": "The name of the database to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_schemas",
"options_callback_kwargs": null,
"state_name": "database_name",
"title": "Database Name",
"tooltip": "Detail:\nNote: If you don\u2019t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"table_name": {
"description": "The name of the table to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_tables",
"options_callback_kwargs": null,
"state_name": "table_name",
"title": "Table Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"database_resource",
"database_name",
"table_name"
],
"title": "SqlTabularConnection",
"type": "object"
},
"StartDateEndDateDefinition": {
"properties": {
"start_date": {
"description": "The inclusive start of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "Start Date",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"end_date": {
"description": "The inclusive end of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "End Date",
"tooltip": "Detail:\nNote, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"start_date",
"end_date"
],
"title": "StartDateEndDateDefinition",
"type": "object"
},
"TabularConnection": {
"description": "A shared parameter base model dedication to tabular connections.",
"properties": {
"tabular_connection": {
"anyOf": [
{
"$ref": "#/$defs/SqlTabularConnection"
},
{
"$ref": "#/$defs/FileTabularConnection"
},
{
"$ref": "#/$defs/PartitionedFileTabularConnection"
}
],
"description": "The connection type to use to access the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"tabular_connection"
],
"title": "TabularConnection",
"type": "object"
},
"TimeSeriesTableDefinition": {
"description": "A parameter base model dedicated to loading tabular time series data.",
"properties": {
"data_connection": {
"$ref": "#/$defs/TabularConnection",
"description": "The connection to the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "source_connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"dimension_columns": {
"description": "The columns to use as dimensions.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"type": "string"
},
"long_description": null,
"minItems": 1,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Dimension Columns",
"tooltip": "Validation Constraints:\nThe input must have a minimum length of 1.\n\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"date_column": {
"description": "The column to use as the date.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Date Column",
"tooltip": "Detail:\nThe date column must in a DateTime readable format.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"value_column": {
"description": "The column to use as the value.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Value Column",
"tooltip": "Detail:\nThe value column must be a numeric (int, float, double, decimal, etc.) column.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"data_connection",
"dimension_columns",
"date_column",
"value_column"
],
"title": "TimeSeriesTableDefinition",
"type": "object"
}
},
"properties": {
"source_data_definition": {
"$ref": "#/$defs/TimeSeriesTableDefinition",
"description": "The source data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "SourceDefinition",
"title": "Source Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"feature_data_definitions": {
"description": "The feature data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"$ref": "#/$defs/TimeSeriesTableDefinition"
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "FeatureDefinition",
"title": "Feature Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"time_range": {
"anyOf": [
{
"$ref": "#/$defs/StartDateEndDateDefinition"
},
{
"type": "null"
}
],
"default": null,
"description": "The date range to fit anomalies on.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "DateRange",
"title": "Date Range",
"tooltip": "Detail:\nIf None, entire dataset will be used.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"source_data_definition",
"feature_data_definitions"
],
"title": "AnomalyDetectionFitParameters",
"type": "object"
}
Artifacts: No artifacts are returned by this method
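As a minimal sketch of a payload that would satisfy the fit schema above, the following Python builds a parameter dictionary and checks it against the `required` lists from the schema. The connection values (`my_resource`, `my_database`, `sales_history`) and the column names are placeholders, the SQL connection shape is assumed to follow `SqlTabularConnection`, and `missing_required` is a hypothetical helper rather than part of the routine. How the payload is submitted to the routine is not covered here.

```python
import json

# Required keys copied from the schema above.
FIT_REQUIRED = ["source_data_definition", "feature_data_definitions"]
TABLE_REQUIRED = ["data_connection", "dimension_columns", "date_column", "value_column"]


def missing_required(payload: dict, required: list) -> list:
    """Return the required keys absent from payload (hypothetical helper)."""
    return [key for key in required if key not in payload]


# Placeholder table definition; every value below is illustrative.
source_table = {
    "data_connection": {
        "tabular_connection": {
            "database_resource": "my_resource",
            "database_name": "my_database",
            "table_name": "sales_history",
        }
    },
    "dimension_columns": ["store_id"],  # the schema sets minItems to 1
    "date_column": "date",
    "value_column": "amount",
}

fit_params = {
    "source_data_definition": source_table,
    # Required key; the schema lists no minItems, but runtime checks may differ.
    "feature_data_definitions": [],
    # Optional; None means the entire dataset is used.
    "time_range": None,
}

assert missing_required(fit_params, FIT_REQUIRED) == []
assert missing_required(source_table, TABLE_REQUIRED) == []
print(json.dumps(fit_params, indent=2))
```

A key-presence check like this only catches omissions early; the tooltips note that further validation constraints may apply at runtime.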
2. Predict
Method Name: predict
Short Description: Abstract Predict Method
Detailed Description: Specifies the input and output parameters of the predict method shared by all anomaly detection routines. The inputs comprise a source data definition, feature data definitions, an optional time range over which to detect anomalies, and the state info that names the resulting snapshot.
Inputs:
| Property | Type | Required | Description |
|---|---|---|---|
| source_data_definition | `#/$defs/TimeSeriesTableDefinition` | Yes | The source data definition to use. |
| feature_data_definitions | array | Yes | The feature data definition to use. |
| time_range | `#/$defs/StartDateEndDateDefinition` or `null` | No | The date range to predict anomalies on. |
| state_info | `#/$defs/StateInfoDefinition` | Yes | The snapshot name and description. These will be used as identifiers in the snapshot artifact. |
Input Schema (JSON):
{
"$defs": {
"FileExtensions_": {
"description": "File Extensions.",
"enum": [
".csv",
".tsv",
".psv",
".parquet",
".xlsx"
],
"title": "FileExtensions_",
"type": "string"
},
"FileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_path": {
"description": "The full file path to the file to ingest.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.filetable:FileTabularConnection.get_file_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_path",
"title": "File Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_path"
],
"title": "FileTabularConnection",
"type": "object"
},
"MetaFileSystemConnectionKey": {
"enum": [
"sql-server-routine",
"sql-server-shared"
],
"title": "MetaFileSystemConnectionKey",
"type": "string"
},
"PartitionedFileTabularConnection": {
"properties": {
"connection_key": {
"$ref": "#/$defs/MetaFileSystemConnectionKey",
"description": "The MetaFileSystem connection key.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection_key",
"title": "Connection Key",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"file_type": {
"$ref": "#/$defs/FileExtensions_",
"description": "The type of files to read from the directory.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "File Type",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"directory_path": {
"description": "The full directory path containing partitioned tabular files.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.partitionedfiletable:PartitionedFileTabularConnection.get_directory_path_bound_options",
"options_callback_kwargs": null,
"state_name": "file_info",
"title": "Directory Path",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"connection_key",
"file_type",
"directory_path"
],
"title": "PartitionedFileTabularConnection",
"type": "object"
},
"SqlTabularConnection": {
"properties": {
"database_resource": {
"description": "The name of the database resource to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_resources",
"options_callback_kwargs": null,
"state_name": "database_resource",
"title": "Database Resource",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"database_name": {
"description": "The name of the database to connect to.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_database_schemas",
"options_callback_kwargs": null,
"state_name": "database_name",
"title": "Database Name",
"tooltip": "Detail:\nNote: If you don\u2019t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"table_name": {
"description": "The name of the table to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.conn.sqltable:SqlTabularConnection.get_tables",
"options_callback_kwargs": null,
"state_name": "table_name",
"title": "Table Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"database_resource",
"database_name",
"table_name"
],
"title": "SqlTabularConnection",
"type": "object"
},
"StartDateEndDateDefinition": {
"properties": {
"start_date": {
"description": "The inclusive start of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "Start Date",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"end_date": {
"description": "The inclusive end of the date range (MM/DD/YYYY).",
"field_type": "input",
"format": "date-time",
"input_component": {
"component_type": "dateselector",
"max_date": null,
"min_date": null
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "date_selection",
"title": "End Date",
"tooltip": "Detail:\nNote, the Seasonal ARIMA Anomaly Detector Routine treats the end date as exclusive.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"start_date",
"end_date"
],
"title": "StartDateEndDateDefinition",
"type": "object"
},
"StateInfoDefinition": {
"properties": {
"snapshot_name": {
"description": "The name of the anomaly detector instance.",
"field_type": "input",
"input_component": {
"component_type": "textbox",
"height": null,
"multiline": false
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "snapshot_info",
"title": "Name",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"snapshot_description": {
"description": "The description of your anomaly detector instance.",
"field_type": "input",
"input_component": {
"component_type": "textbox",
"height": null,
"multiline": false
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "snapshot_info",
"title": "Snapshot Description",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"snapshot_name",
"snapshot_description"
],
"title": "StateInfoDefinition",
"type": "object"
},
"TabularConnection": {
"description": "A shared parameter base model dedicated to tabular connections.",
"properties": {
"tabular_connection": {
"anyOf": [
{
"$ref": "#/$defs/SqlTabularConnection"
},
{
"$ref": "#/$defs/FileTabularConnection"
},
{
"$ref": "#/$defs/PartitionedFileTabularConnection"
}
],
"description": "The connection type to use to access the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"tabular_connection"
],
"title": "TabularConnection",
"type": "object"
},
"TimeSeriesTableDefinition": {
"description": "A parameter base model dedicated to loading tabular time series data.",
"properties": {
"data_connection": {
"$ref": "#/$defs/TabularConnection",
"description": "The connection to the source data.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "source_connection",
"title": "Connection",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"dimension_columns": {
"description": "The columns to use as dimensions.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"type": "string"
},
"long_description": null,
"minItems": 1,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Dimension Columns",
"tooltip": "Validation Constraints:\nThe input must have a minimum length of 1.\n\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"date_column": {
"description": "The column to use as the date.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Date Column",
"tooltip": "Detail:\nThe date column must be in a DateTime readable format.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
},
"value_column": {
"description": "The column to use as the value.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": "xperiflow.source.app.routines.pbm.store.tsf.tstable:TimeSeriesTableDefinition.get_table_columns",
"options_callback_kwargs": null,
"state_name": "column_selection",
"title": "Value Column",
"tooltip": "Detail:\nThe value column must be a numeric (int, float, double, decimal, etc.) column.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "string"
}
},
"required": [
"data_connection",
"dimension_columns",
"date_column",
"value_column"
],
"title": "TimeSeriesTableDefinition",
"type": "object"
}
},
"description": "Note that only the most recent fit will be used in predictions.",
"properties": {
"source_data_definition": {
"$ref": "#/$defs/TimeSeriesTableDefinition",
"description": "The source data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "SourceDefinition",
"title": "Source Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"feature_data_definitions": {
"description": "The feature data definition to use.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"items": {
"$ref": "#/$defs/TimeSeriesTableDefinition"
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "FeatureDefinition",
"title": "Feature Data Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime.",
"type": "array"
},
"time_range": {
"anyOf": [
{
"$ref": "#/$defs/StartDateEndDateDefinition"
},
{
"type": "null"
}
],
"default": null,
"description": "The date range to predict anomalies on.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "DateRange",
"title": "Date Range",
"tooltip": "Detail:\nIf None, entire dataset will be used.\n\nValidation Constraints:\nThis input may be subject to other validation constraints at runtime."
},
"state_info": {
"$ref": "#/$defs/StateInfoDefinition",
"description": "The snapshot name and description. These will be used as identifiers in the snapshot artifact.",
"field_type": "input",
"input_component": {
"component_type": "combobox",
"show_search": true
},
"long_description": null,
"options_callback": null,
"options_callback_kwargs": null,
"state_name": "StateInfoDefinition",
"title": "State Info Definition",
"tooltip": "Validation Constraints:\nThis input may be subject to other validation constraints at runtime."
}
},
"required": [
"source_data_definition",
"feature_data_definitions",
"state_info"
],
"title": "AnomalyDetectionPredictParameters",
"type": "object"
}
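The predict schema above adds a required `state_info` block and an optional `time_range`. The sketch below assembles such a payload in Python. `build_predict_params` is a hypothetical helper, the file path and column names are placeholders, and the use of ISO 8601 strings is an assumption based on the schema's `"format": "date-time"` (the field descriptions mention MM/DD/YYYY, so the accepted form should be confirmed against the running system).

```python
from datetime import datetime

# Required keys copied from the predict schema above.
PREDICT_REQUIRED = ["source_data_definition", "feature_data_definitions", "state_info"]


def build_predict_params(source_table, snapshot_name, snapshot_description,
                         start=None, end=None):
    """Assemble a predict payload (hypothetical helper, not part of the routine)."""
    if (start is None) != (end is None):
        raise ValueError("start and end must be given together")
    time_range = None  # None means the entire dataset is used
    if start is not None:
        if end < start:
            raise ValueError("end must not precede start")
        # Assumption: ISO 8601 strings satisfy the schema's date-time format.
        time_range = {"start_date": start.isoformat(), "end_date": end.isoformat()}
    return {
        "source_data_definition": source_table,
        "feature_data_definitions": [],
        "time_range": time_range,
        "state_info": {
            "snapshot_name": snapshot_name,
            "snapshot_description": snapshot_description,
        },
    }


# Placeholder file-based source; the path and column names are illustrative.
source_table = {
    "data_connection": {
        "tabular_connection": {
            "connection_key": "sql-server-shared",
            "file_path": "data/transactions.parquet",
        }
    },
    "dimension_columns": ["account_id"],
    "date_column": "date",
    "value_column": "amount",
}

params = build_predict_params(
    source_table,
    snapshot_name="weekly_check",
    snapshot_description="Weekly anomaly scan",
    start=datetime(2024, 1, 1),
    end=datetime(2024, 3, 31),
)
assert all(key in params for key in PREDICT_REQUIRED)
print(params["time_range"])
```

Both dates are supplied together because `StartDateEndDateDefinition` lists `start_date` and `end_date` as required; omitting the range entirely leaves `time_range` as `None`.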
Artifacts:
| Property | Type | Required | Description |
|---|---|---|---|
| anomaly_snapshot | unknown | Yes | Parquet file containing data about your anomaly detection run. |
| anomaly_dates | unknown | Yes | Parquet file containing data about the specific dates an anomaly was detected. |
| anomaly_instance | unknown | Yes | Parquet file containing data about the specific anomaly instances that were detected. |
Artifact Schema (JSON):
{
"additionalProperties": true,
"properties": {
"anomaly_snapshot": {
"description": "Parquet file containing data about your anomaly detection run.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "AnomalySnapshot"
},
"anomaly_dates": {
"description": "Parquet file containing data about the specific dates an anomaly was detected.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "Specific Anomaly Dates"
},
"anomaly_instance": {
"description": "Parquet file containing data about the specific anomaly instances that were detected.",
"io_factory_kwargs": {},
"preview_factory_kwargs": null,
"preview_factory_type": null,
"statistic_factory_kwargs": null,
"statistic_factory_type": null,
"title": "Specific Anomaly Instances"
}
},
"required": [
"anomaly_snapshot",
"anomaly_dates",
"anomaly_instance"
],
"title": "AnomalyDetectionArtifacts",
"type": "object"
}
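The artifact schema above requires three keys and sets `additionalProperties` to true. As a rough sketch of how a caller might guard against an incomplete result, the Python below picks out the three documented artifacts and ignores any extras. The shape of `result` (a mapping from artifact key to a Parquet file location) and the helper `select_artifacts` are assumptions for illustration; the source does not describe how artifacts are handed back.

```python
# Keys listed under "required" in the artifact schema above.
ARTIFACT_KEYS = ("anomaly_snapshot", "anomaly_dates", "anomaly_instance")


def select_artifacts(artifacts: dict) -> dict:
    """Return the three documented artifacts, failing loudly if any is absent."""
    missing = [key for key in ARTIFACT_KEYS if key not in artifacts]
    if missing:
        raise KeyError(f"predict result is missing artifacts: {missing}")
    # additionalProperties is true, so undocumented extras are ignored, not rejected.
    return {key: artifacts[key] for key in ARTIFACT_KEYS}


# Hypothetical predict result mapping artifact keys to Parquet file locations.
result = {
    "anomaly_snapshot": "out/anomaly_snapshot.parquet",
    "anomaly_dates": "out/anomaly_dates.parquet",
    "anomaly_instance": "out/anomaly_instance.parquet",
    "debug_log": "out/debug.txt",
}
selected = select_artifacts(result)
print(sorted(selected))
```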
Developer Docs
Routine Typename: MatrixProfileAnomalyDetector
| Method Name | Artifact Keys |
|---|---|
| `__init__` | N/A |
| fit | N/A |
| predict | anomaly_snapshot, anomaly_dates, anomaly_instance |