SurvivalAnalysisRoutine
Versions
v0.1.0
Basic Information
Class Name: SurvivalAnalysisRoutine
Title: Survival Analysis
Version: 0.1.0
Author: Benjamin Fitzgerald, Jon Tuazon
Organization: OneStream
Creation Date: 2025-07-08
Default Routine Memory Capacity: 2.0 GB
Tags
Survival Analysis, Statistics
Description
Short Description
Perform survival analysis on a dataset.
Long Description
This routine performs survival analysis on a dataset, allowing for the examination of time-to-event data. It can be used to analyze customer survival times and the impact of various features on survival. The routine analyzes the time it takes for an entity (e.g., patient, customer) to make a payment after receiving a service or product, or the time it takes for an entity to convert from one treatment pipeline to another. It also flags risky entities based on their risk scores. The results can help identify trends in payment behavior and the factors that influence timely payments.
Use Cases
1. Time to Payment
Analyze the time it takes for an entity (e.g., patient, customer) to make a payment after receiving a service or product. The user selects the features to analyze; these features may influence time-to-payment behavior. The routine helps identify trends in payment behavior and the factors that influence timely payments, which can inform financial decision-making and improve cash flow management. This capability helps organizations optimize their billing processes and enhance customer satisfaction.
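As an illustration of how time-to-payment data might be prepared for this use case, the sketch below derives a duration and a binary event flag from invoice and payment dates. The helper `to_time_and_event`, its parameter names, and the observation cutoff are hypothetical and not part of the routine; the routine itself only expects an event column (1 if the event occurred, 0 if not) and a numerical time column, as described under the fit inputs.

```python
from datetime import date
from typing import Optional, Tuple


def to_time_and_event(invoice_date: date,
                      paid_date: Optional[date],
                      cutoff: date) -> Tuple[int, int]:
    """Return (days, event) for one invoice.

    event is 1 if the payment was observed on or before the cutoff, else 0.
    An event of 0 means the record is censored: all we know is that the
    invoice was still unpaid at the cutoff.
    """
    if paid_date is not None and paid_date <= cutoff:
        return (paid_date - invoice_date).days, 1
    return (cutoff - invoice_date).days, 0


cutoff = date(2025, 6, 30)  # hypothetical end of the observation window
rows = [
    to_time_and_event(date(2025, 6, 1), date(2025, 6, 15), cutoff),  # paid
    to_time_and_event(date(2025, 6, 10), None, cutoff),              # unpaid
]
print(rows)  # [(14, 1), (20, 0)]
```

Keeping unpaid invoices in the data with an event value of 0, rather than dropping them, is what distinguishes time-to-event data from an ordinary average of payment delays.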
2. Pipeline Conversion
Analyze the time it takes for an entity (e.g., lead, customer, employee) to transition from one business pipeline stage to another. For example, in sales operations, this could involve measuring how long it takes for a prospect to move from initial contact to closed deal. In customer onboarding, it could involve tracking the time it takes for a client to convert from signed contract to active usage of the product. In recruitment, it could examine the duration from candidate application to final hire. This analysis can uncover influential factors that affect transition speed, such as lead quality, team responsiveness, product complexity, and organizational workflows.
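To make the idea of a survival curve concrete for either use case, here is a minimal sketch of the Kaplan-Meier estimator in plain Python. This page does not state which models the routine fits, so the estimator is shown purely as a standard textbook illustration and should not be read as the routine's method; the function name and the sample data are made up.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time with at least one event."""
    event_counts = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # events and censored records both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = event_counts.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk entities that did not convert at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical days until conversion; event 0 means not yet converted (censored).
times = [5, 8, 8, 12, 20]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(f"t={t}: S(t)={s:.3f}")
```

Here S(t) is the estimated probability that an entity has not yet converted (or paid) by time t. Censored records lower the number at risk without counting as events.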
Routine Methods
1. Init (Constructor)
- Method: __init__
- Type: Constructor
- Memory Capacity: 2.0 GB
- Allow In-Memory Execution: No
- Read Only: No
- Method Limits: N/A
- Outputs Dynamic Artifacts: No
- Short Description:
  - Initialize the Survival Analysis Routine.
- Detailed Description:
  - This constructor currently performs minimal setup. Core parameters are handled in the fit method.
- Inputs:
  - No input parameters
- Artifacts: No artifacts are returned by this method
2. Fit (Method)
- Method: fit
- Type: Method
- Memory Capacity: 2.0 GB
- Allow In-Memory Execution: No
- Read Only: No
- Method Limits: This method was tested with a dataset of 10K rows with 6 covariates (3 numerical and 3 categorical) and completed in 2 hours and 55 minutes with 100 GB of memory.
- Outputs Dynamic Artifacts: No
- Short Description:
  - Generate a report to help the user better understand the dataset.
- Detailed Description:
  - This routine performs survival analysis on a dataset, allowing for the examination of time-to-event data. It can be used to analyze use cases such as time to payment or pipeline conversion. It analyzes the time it takes for an entity (e.g., patient, customer) to make a payment after receiving a service or product, or the time it takes for a patient to convert from one treatment pipeline to another. It also flags risky entities based on their risk scores. Using survival analysis, this routine can help identify trends in payment behavior and the factors that influence timely payments.
- Inputs:
  - Required Input
    - Source Data Definition: Select the data source connection for training the survival analysis routine.
      - Name: data_connection
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Tabular Connection
      - Nested Model: Tabular Connection
        - Required Input
          - Connection: The connection type to use to access the source data.
            - Name: tabular_connection
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: Must be one of the following
              - SQL Server Connection
                - Required Input
                  - Database Resource: The name of the database resource to connect to.
                    - Name: database_resource
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: str
                  - Database Name: The name of the database to connect to.
                    - Name: database_name
                    - Tooltip:
                      - Detail:
                        - Note: If the database name you are looking for is not in this list, first move the data into a database that is available in this list.
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: str
                  - Table Name: The name of the table to use.
                    - Name: table_name
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: str
              - MetaFileSystem Connection
                - Required Input
                  - Connection Key: The MetaFileSystem connection key.
                    - Name: connection_key
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: MetaFileSystemConnectionKey
                  - File Path: The full file path to the file to ingest.
                    - Name: file_path
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: str
              - Partitioned MetaFileSystem Connection
                - Required Input
                  - Connection Key: The MetaFileSystem connection key.
                    - Name: connection_key
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: MetaFileSystemConnectionKey
                  - File Type: The type of files to read from the directory.
                    - Name: file_type
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: FileExtensions_
                  - Directory Path: The full directory path containing partitioned tabular files.
                    - Name: directory_path
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: str
    - Event Column: Select the column that indicates the event occurrence (e.g., payment, conversion). The event column should contain binary values indicating whether the event occurred (1) or not (0).
      - Name: event_column
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: str
    - Time-to-Event Column: Select the column that indicates the time-to-event (e.g., time until payment, time until conversion). The time column should contain numerical values representing the time until the event occurred.
      - Name: time_column
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: str
    - Numerical Features: Select numerical features for the survival model. WARNING: The prediction data must contain every feature used in training. It may include extra columns, but none of the training features may be missing.
      - Name: numerical_features
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: list[str]
    - Categorical Features: Select categorical features for the survival model. WARNING: The prediction data must contain every feature used in training. It may include extra columns, but none of the training features may be missing.
      - Name: categorical_features
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: list[str]
    - Unit Settings: Set the time unit (days, weeks, months, or years) used in the dataset.
      - Name: unit_settings
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: str
- Artifacts:
  - Survival Analysis Web Dashboard: A Web Dashboard with the results of the Survival Analysis routine.
    - Qualified Key Annotation: web_app
    - Aggregate Artifact: False
    - In-Memory Json Accessible: False
    - File Annotations:
      - artifacts_/@web_app/data_/data.appref
        - json file of data relating to web app
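The constraints stated for the fit inputs (a binary event column, a numerical time column, and prediction data that contains every training feature) can be checked before a run with a sketch like the one below. The helper names and the rows-as-dictionaries layout are assumptions made for illustration; the routine applies its own validation at runtime, which this page does not document.

```python
from typing import Dict, Iterable, List


def check_training_rows(rows: Iterable[Dict[str, object]],
                        event_column: str,
                        time_column: str) -> List[str]:
    """Collect readable problems; an empty list means the rows look usable."""
    problems = []
    for i, row in enumerate(rows):
        event = row.get(event_column)
        duration = row.get(time_column)
        if event not in (0, 1):
            problems.append(f"row {i}: {event_column} must be 0 or 1, got {event!r}")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(duration, bool) or not isinstance(duration, (int, float)):
            problems.append(f"row {i}: {time_column} must be numeric, got {duration!r}")
    return problems


def missing_features(training_features: Iterable[str],
                     prediction_columns: Iterable[str]) -> List[str]:
    """Training features absent from the prediction data (extras are allowed)."""
    available = set(prediction_columns)
    return [name for name in training_features if name not in available]


# Hypothetical column names and values.
sample = [{"paid": 1, "days": 14}, {"paid": 2, "days": "n/a"}]
for problem in check_training_rows(sample, "paid", "days"):
    print(problem)
print(missing_features(["age", "region"], ["age", "notes"]))  # ['region']
```

Running the feature check against the prediction table before calling predict surfaces a missing column early, instead of at runtime.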
3. Predict (Method)
- Method: predict
- Type: Method
- Memory Capacity: 2.0 GB
- Allow In-Memory Execution: No
- Read Only: No
- Method Limits: This method was tested with a dataset of 1M rows with 6 covariates (3 numerical and 3 categorical) and completed in roughly 4 minutes with 100 GB of memory. At the moment, using larger datasets may lead to memory issues.
- Outputs Dynamic Artifacts: No
- Short Description:
  - Generate a web app artifact for survival analysis predictions.
- Detailed Description:
  - This method uses the best model from the survival analysis routine to make predictions on the prediction data supplied by the user. It generates a summary dataframe flagging high risk scores, survival and cumulative hazard quantile curves, and a permutation feature importance plot.
- Inputs:
  - Required Input
    - Source Data Definition: Select the data source connection for the survival analysis prediction.
      - Name: data_connection
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: Must be an instance of Tabular Connection
      - Nested Model: Tabular Connection
        - Required Input
          - Connection: The connection type to use to access the source data.
            - Name: tabular_connection
            - Tooltip:
              - Validation Constraints:
                - This input may be subject to other validation constraints at runtime.
            - Type: Must be one of the following
              - SQL Server Connection
                - Required Input
                  - Database Resource: The name of the database resource to connect to.
                    - Name: database_resource
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: str
                  - Database Name: The name of the database to connect to.
                    - Name: database_name
                    - Tooltip:
                      - Detail:
                        - Note: If the database name you are looking for is not in this list, first move the data into a database that is available in this list.
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: str
                  - Table Name: The name of the table to use.
                    - Name: table_name
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: str
              - MetaFileSystem Connection
                - Required Input
                  - Connection Key: The MetaFileSystem connection key.
                    - Name: connection_key
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: MetaFileSystemConnectionKey
                  - File Path: The full file path to the file to ingest.
                    - Name: file_path
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: str
              - Partitioned MetaFileSystem Connection
                - Required Input
                  - Connection Key: The MetaFileSystem connection key.
                    - Name: connection_key
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: MetaFileSystemConnectionKey
                  - File Type: The type of files to read from the directory.
                    - Name: file_type
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: FileExtensions_
                  - Directory Path: The full directory path containing partitioned tabular files.
                    - Name: directory_path
                    - Tooltip:
                      - Validation Constraints:
                        - This input may be subject to other validation constraints at runtime.
                    - Type: str
    - Unique Identifier Column: Select the column to be used as the unique identifier for each observation. If no unique identifier is available, select Auto ID (use row index as the unique identifier).
      - Name: unique_id_column
      - Tooltip:
        - Validation Constraints:
          - This input may be subject to other validation constraints at runtime.
      - Type: str
    - Prediction Horizon: Set the prediction horizon for the survival analysis model. This defines how far into the future the model will predict survival probabilities.
      - Name: prediction_horizon
      - Tooltip:
        - Validation Constraints:
          - The input must be greater than or equal to 1.
          - This input may be subject to other validation constraints at runtime.
      - Type: int
    - Prediction Interval: Set the prediction interval for the survival analysis model. This defines the time interval for which the model will provide survival probabilities.
      - Name: prediction_interval
      - Tooltip:
        - Validation Constraints:
          - The input must be greater than or equal to 1.
          - This input may be subject to other validation constraints at runtime.
      - Type: int
- Artifacts:
  - Survival Analysis Web Dashboard: A Web Dashboard with the results of the Survival Analysis routine.
    - Qualified Key Annotation: web_app
    - Aggregate Artifact: False
    - In-Memory Json Accessible: False
    - File Annotations:
      - artifacts_/@web_app/data_/data.appref
        - json file of data relating to web app
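One plausible reading of how the predict inputs fit together is sketched below: the prediction horizon and prediction interval define an evenly spaced grid of time points (in the unit chosen during fitting), and Auto ID falls back to the row index. This page does not spell out how the routine combines the two values, so the grid is an assumption; the helper names and the use of `None` to stand for Auto ID are likewise hypothetical. Only the "greater than or equal to 1" constraint and the row-index fallback come from the input descriptions.

```python
from typing import Dict, List, Optional


def prediction_times(prediction_horizon: int, prediction_interval: int) -> List[int]:
    """Evenly spaced time points up to the horizon (an assumed interpretation)."""
    for name, value in (("prediction_horizon", prediction_horizon),
                        ("prediction_interval", prediction_interval)):
        # Both inputs are documented as int and >= 1; bool is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be an integer >= 1, got {value!r}")
    return list(range(prediction_interval, prediction_horizon + 1, prediction_interval))


def row_ids(rows: List[Dict[str, object]],
            unique_id_column: Optional[str]) -> List[object]:
    """Use the named column as the identifier, or the row index for Auto ID."""
    if unique_id_column is None:  # None stands in for the Auto ID choice here
        return list(range(len(rows)))
    return [row[unique_id_column] for row in rows]


print(prediction_times(12, 3))  # [3, 6, 9, 12]
print(row_ids([{"invoice": "A-1"}, {"invoice": "A-2"}], None))  # [0, 1]
```

Under this reading, a horizon of 12 with an interval of 3 and a unit of months would yield survival probabilities at 3, 6, 9, and 12 months.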
Interface Definitions
No interface definitions found for this routine
Developer Docs
Routine Typename: SurvivalAnalysisRoutine
| Method Name | Artifact Keys |
|---|---|
| __init__ | N/A |
| fit | web_app |
| predict | web_app |