NaturalLanguageSql
Versions
v1.0.0
Basic Information
Class Name: NaturalLanguageSql
Title: Natural Language to SQL
Version: 1.0.0
Author: Drew Shea
Organization: OneStream
Creation Date: 2025-01-01
Default Routine Memory Capacity: 2.0 GB
Tags
Data Analysis, Data Transformation, LLM
Description
Short Description
Generate SQL queries from natural language for tabular data.
Long Description
This Routine enables users to seamlessly translate natural language descriptions into SQL queries, targeting both file system tabular data and SQL table data. It simplifies the process of querying structured data by allowing users to bypass complex SQL syntax, empowering them to retrieve insights quickly and efficiently. Additionally, it incorporates advanced debugging capabilities to identify and resolve errors in existing SQL queries, enhancing productivity and reducing the time spent troubleshooting.
The routine leverages generative AI models to interpret user queries and map them to corresponding SQL syntax, ensuring accuracy and adaptability across different data sources. Whether accessing data from flat files, delimited files, or database tables, users can generate precise queries without prior SQL expertise.
For debugging, the routine provides meaningful error diagnostics and suggests fixes for common SQL issues, such as syntax errors, missing clauses, or misaligned joins. This dual functionality makes it an invaluable tool for both novice users and experienced analysts, streamlining workflows and minimizing the learning curve associated with SQL.
This Routine works by inspecting the natural language query, extracting relevant table names, retrieving table schemas, and generating SQL queries based on the specified SQL dialect.
Use Cases
1. Simplifying Data Retrieval for Non-SQL Users
In many organizations, non-technical team members often struggle to retrieve data from databases or file systems due to a lack of SQL knowledge. This use case demonstrates how the routine empowers such users by allowing them to phrase their queries in natural language. For instance, a marketing professional might need to extract sales data by region and product category but lacks the SQL expertise to craft the query. Using this routine, they can simply ask, 'Show me the sales by region and product category for the last quarter,' and receive the desired data without writing a single line of SQL. This capability bridges the gap between technical and non-technical users, fostering collaboration and reducing dependency on technical teams.
2. Enhancing SQL Debugging Efficiency
SQL errors can be challenging to debug, especially for users who are not deeply familiar with database systems. This use case highlights the routine’s ability to identify and resolve SQL query errors efficiently. Imagine a scenario where a data analyst encounters an error in a query that joins multiple tables with complex conditions. Debugging this manually could be time-consuming and frustrating. The routine steps in to analyze the query, pinpoint the source of the error, and suggest fixes, such as adding a missing ON clause in a JOIN statement or correcting syntax. By automating this debugging process, the routine saves valuable time and ensures that errors are resolved with clarity and precision, allowing users to focus on deriving insights rather than troubleshooting.
Routine Methods
1. Natural Language To Sql Query (Method)
- Method:
natural_language_to_sql_query-
Type: Method
-
Memory Capacity: 2.0 GB
-
Allow In-Memory Execution: Yes
-
Read Only: No
-
Method Limits: There are no scale implications with this routine method. This method identifies the schema of the provided table(s) and generates a query based on the user's natural language input. This method should complete for all input connections in a matter of seconds.
-
Outputs Dynamic Artifacts: No
-
Short Description:
- Convert natural language to a SQL query in a specific SQL dialect.
-
Detailed Description:
- Converts a natural language query into a SQL query using a specific SQL dialect. The natural language query will be parsed and transformed into a SQL query that can be executed on a tabular data source. The SQL query will be generated based on the tabular connection type and the SQL dialect specified in the parameters.
-
Inputs:
- Required Input
- Data Definition: The connection type to use to access the source data.
- Name:
data_connection - Tooltip:
- Validation Constraints:
- This input may be subject to other validation constraints at runtime.
- Validation Constraints:
- Type: Must be an instance of Natural Language Tabular Connection
- Nested Model: Natural Language Tabular Connection
- Required Input
- Connection: The connection type to use to access the source data.
- Name:
tabular_connection - Tooltip:
- Validation Constraints:
- This input may be subject to other validation constraints at runtime.
- Validation Constraints:
- Type: Must be one of the following
- SQL Server Connection
- Required Input
- Database Resource: The name of the database resource to connect to.
- Name:
database_resource - Tooltip:
- Validation Constraints:
- This input may be subject to other validation constraints at runtime.
- Validation Constraints:
- Type: str
- Name:
- Database Name: The name of the database to connect to.
- Name:
database_name - Tooltip:
- Detail:
- Note: If you don’t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.
- Validation Constraints:
- This input may be subject to other validation constraints at runtime.
- Detail:
- Type: str
- Name:
- Table Names: The names of the tables to use.
- Name:
table_names - Tooltip:
- Validation Constraints:
- This input may be subject to other validation constraints at runtime.
- Validation Constraints:
- Type: list[str]
- Name:
- Database Resource: The name of the database resource to connect to.
- Required Input
- MetaFileSystem Connection
- Required Input
- Connection Key: The MetaFileSystem connection key.
- Name:
connection_key - Tooltip:
- Validation Constraints:
- This input may be subject to other validation constraints at runtime.
- Validation Constraints:
- Type: MetaFileSystemConnectionKey
- Name:
- Root Directory: The root directory to search for files within.
- Name:
root_dir - Tooltip:
- Validation Constraints:
- This input may be subject to other validation constraints at runtime.
- Validation Constraints:
- Type: str
- Name:
- File Paths: The full file path to the file to ingest.
- Name:
file_paths - Tooltip:
- Validation Constraints:
- This input may be subject to other validation constraints at runtime.
- Validation Constraints:
- Type: list[str]
- Name:
- Connection Key: The MetaFileSystem connection key.
- Required Input
- SQL Server Connection
- Name:
- Connection: The connection type to use to access the source data.
- Required Input
- Name:
- Sql Dialect: The SQL dialect to use when generating the SQL query.
- Name:
dialect - Tooltip:
- Validation Constraints:
- This input may be subject to other validation constraints at runtime.
- Validation Constraints:
- Type: SqlDialect_
- Name:
- Natural Language Query: The natural language query to execute on the data.
- Name:
query - Tooltip:
- Validation Constraints:
- The input must have a minimum length of 1.
- The input must have a maximum length of 2000.
- This input may be subject to other validation constraints at runtime.
- Validation Constraints:
- Type: str
- Name:
- Data Definition: The connection type to use to access the source data.
- Required Input
-
Artifacts:
- Generated Query: Contains the generated SQL query and additional metadata.
- Qualified Key Annotation:
response - Aggregate Artifact:
False - In-Memory Json Accessible:
True - File Annotations:
artifacts_/@response/data_/data.json- Stored json data.
artifacts_/@response/data_/schema.json- The json schema of the json object stored in the 'data.json' file
- Qualified Key Annotation:
- Generated Query: Contains the generated SQL query and additional metadata.
-
Interface Definitions
No interface definitions found for this routine
Developer Docs
Routine Typename: NaturalLanguageSql
| Method Name | Artifact Keys |
|---|---|
natural_language_to_sql_query | response |