Skip to main content

NaturalLanguageSql

Versions

v1.0.0

Basic Information

Class Name: NaturalLanguageSql

Title: Natural Language to SQL

Version: 1.0.0

Author: Drew Shea

Organization: OneStream

Creation Date: 2025-01-01

Default Routine Memory Capacity: 2.0 GB

Tags

Data Analysis, Data Transformation, LLM

Description

Short Description

Generate SQL queries from natural language for tabular data.

Long Description

This Routine enables users to seamlessly translate natural language descriptions into SQL queries, targeting both file system tabular data and SQL table data. It simplifies the process of querying structured data by allowing users to bypass complex SQL syntax, empowering them to retrieve insights quickly and efficiently. Additionally, it incorporates advanced debugging capabilities to identify and resolve errors in existing SQL queries, enhancing productivity and reducing the time spent troubleshooting.

The routine leverages generative AI models to interpret user queries and map them to corresponding SQL syntax, ensuring accuracy and adaptability across different data sources. Whether accessing data from flat files, delimited files, or database tables, users can generate precise queries without prior SQL expertise.

For debugging, the routine provides meaningful error diagnostics and suggests fixes for common SQL issues, such as syntax errors, missing clauses, or misaligned joins. This dual functionality makes it an invaluable tool for both novice users and experienced analysts, streamlining workflows and minimizing the learning curve associated with SQL.

This Routine works by inspecting the natural language query, extracting relevant table names, retrieving table schemas, and generating SQL queries based on the specified SQL dialect.

Use Cases

1. Simplifying Data Retrieval for Non-SQL Users

In many organizations, non-technical team members often struggle to retrieve data from databases or file systems due to a lack of SQL knowledge. This use case demonstrates how the routine empowers such users by allowing them to phrase their queries in natural language. For instance, a marketing professional might need to extract sales data by region and product category but lacks the SQL expertise to craft the query. Using this routine, they can simply ask, 'Show me the sales by region and product category for the last quarter,' and receive the desired data without writing a single line of SQL. This capability bridges the gap between technical and non-technical users, fostering collaboration and reducing dependency on technical teams.

2. Enhancing SQL Debugging Efficiency

SQL errors can be challenging to debug, especially for users who are not deeply familiar with database systems. This use case highlights the routine’s ability to identify and resolve SQL query errors efficiently. Imagine a scenario where a data analyst encounters an error in a query that joins multiple tables with complex conditions. Debugging this manually could be time-consuming and frustrating. The routine steps in to analyze the query, pinpoint the source of the error, and suggest fixes, such as adding a missing ON clause in a JOIN statement or correcting syntax. By automating this debugging process, the routine saves valuable time and ensures that errors are resolved with clarity and precision, allowing users to focus on deriving insights rather than troubleshooting.

Routine Methods

1. Natural Language To Sql Query (Method)
  • Method: natural_language_to_sql_query
    • Type: Method

    • Memory Capacity: 2.0 GB

    • Allow In-Memory Execution: Yes

    • Read Only: No

    • Method Limits: There are no scale implications with this routine method. This method identifies the schema of the provided table(s) and generates a query based on the user's natural language input. This method should complete for all input connections in a matter of seconds.

    • Outputs Dynamic Artifacts: No

    • Short Description:

      • Convert natural language to a SQL query in a specific SQL dialect.
    • Detailed Description:

      • Converts a natural language query into a SQL query using a specific SQL dialect. The natural language query will be parsed and transformed into a SQL query that can be executed on a tabular data source. The SQL query will be generated based on the tabular connection type and the SQL dialect specified in the parameters.
    • Inputs:

      • Required Input
        • Data Definition: The connection type to use to access the source data.
          • Name: data_connection
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: Must be an instance of Natural Language Tabular Connection
          • Nested Model: Natural Language Tabular Connection
            • Required Input
              • Connection: The connection type to use to access the source data.
                • Name: tabular_connection
                • Tooltip:
                  • Validation Constraints:
                    • This input may be subject to other validation constraints at runtime.
                • Type: Must be one of the following
                  • SQL Server Connection
                    • Required Input
                      • Database Resource: The name of the database resource to connect to.
                        • Name: database_resource
                        • Tooltip:
                          • Validation Constraints:
                            • This input may be subject to other validation constraints at runtime.
                        • Type: str
                      • Database Name: The name of the database to connect to.
                        • Name: database_name
                        • Tooltip:
                          • Detail:
                            • Note: If you don’t see the database name that you are looking for in this list, it is recommended that you first move the data to be used within a database that is available within this list.
                          • Validation Constraints:
                            • This input may be subject to other validation constraints at runtime.
                        • Type: str
                      • Table Names: The names of the tables to use.
                        • Name: table_names
                        • Tooltip:
                          • Validation Constraints:
                            • This input may be subject to other validation constraints at runtime.
                        • Type: list[str]
                  • MetaFileSystem Connection
                    • Required Input
                      • Connection Key: The MetaFileSystem connection key.
                        • Name: connection_key
                        • Tooltip:
                          • Validation Constraints:
                            • This input may be subject to other validation constraints at runtime.
                        • Type: MetaFileSystemConnectionKey
                      • Root Directory: The root directory to search for files within.
                        • Name: root_dir
                        • Tooltip:
                          • Validation Constraints:
                            • This input may be subject to other validation constraints at runtime.
                        • Type: str
                      • File Paths: The full file path to the file to ingest.
                        • Name: file_paths
                        • Tooltip:
                          • Validation Constraints:
                            • This input may be subject to other validation constraints at runtime.
                        • Type: list[str]
        • Sql Dialect: The SQL dialect to use when generating the SQL query.
          • Name: dialect
          • Tooltip:
            • Validation Constraints:
              • This input may be subject to other validation constraints at runtime.
          • Type: SqlDialect_
        • Natural Language Query: The natural language query to execute on the data.
          • Name: query
          • Tooltip:
            • Validation Constraints:
              • The input must have a minimum length of 1.
              • The input must have a maximum length of 2000.
              • This input may be subject to other validation constraints at runtime.
          • Type: str
    • Artifacts:

      • Generated Query: Contains the generated SQL query and additional metadata.
        • Qualified Key Annotation: response
        • Aggregate Artifact: False
        • In-Memory Json Accessible: True
        • File Annotations:
          • artifacts_/@response/data_/data.json
            • Stored json data.
          • artifacts_/@response/data_/schema.json
            • The json schema of the json object stored in the 'data.json' file

Interface Definitions

No interface definitions found for this routine

Developer Docs

Routine Typename: NaturalLanguageSql

Method NameArtifact Keys
natural_language_to_sql_queryresponse

Was this page helpful?