Skip to main content

How to Resolve Common Feature Upload Errors

Author: Ansley Wunderlich, Created: 2024-09-18

When uploading features and events, it's essential to navigate potential errors effectively to keep your projects on track. This article covers some of the most common errors you may encounter, along with practical solutions to resolve them.

Understanding Common Errors

1. Duplicate Feature Names / Values

What It Is:

If you see an error indicating that a feature name is duplicated across different dates, it means that the same feature name has been used for multiple entries without differentiation. This happens when a user uses the same generic identifier for multiple different features. This error will occur for both feature name duplicates and feature value duplicates.

How to Resolve:

  • Ensure Uniqueness: Each feature name must be unique. For instance, if you have a feature named “Price” for two different product categories, rename one to “Shirt_Price” for one feature set and the other to “Pants_Price.”
  • Avoid using generic identifiers such as “FeatureName” or “FeatureValue”.
note

The local file being uploaded to OneStream must have a Feature Name and Feature Value for every date listed. There cannot be any missing rows or cells, or it will prompt an error.

Example Error Message:

Inputted DataForm intersection columns for data source: [Feature Table 1] matches the dataform intersection columns of [Feature Table 2] datasource in the database.

This error indicates that a feature name or value appears in both the FX Rate feature and Inflation feature tables without proper differentiation.

image-20240926-143647.png

2. Frequency Detection Issues

What It Is:

SensibleAI Forecast (FOR) requires a minimum of three data points to infer the frequency of a feature. If your dataset contains fewer points or if they are too spread out, you will encounter an error message.

How to Resolve:

  • Consecutive Data Points: Ensure that your dataset has at least three consecutive data points within a specific time frame. For instance, if you are tracking daily sales, you must have data for at least three consecutive days to allow FOR to detect the frequency.

Example Error Message:

Could not infer frequency for intersections: ['[Feature Name]Feature Column Name']

This error means that FOR could not infer frequency for that series. Ensure your data includes enough consecutive points. For the below error example, the target source data is daily.

image-20240926-161939.png

note

See the table below to understand how FOR matches actuals and feature frequencies.

Best Practices to Avoid Common Errors

To minimize these common errors, here are some best practices:

  • Preprocess Your Data: Before uploading, aggregate or break out features as necessary. For example, if you have hourly data but need daily aggregates, do this before uploading to FOR.
  • Consistent Naming Conventions: Stick to a consistent naming convention for your features, which will help avoid duplication and mapping issues.
  • Quality Data Collection: Ensure your data collection methods yield consistent and timely data points, particularly for features requiring frequency detection.

Feature Impact Table

SensibleAI Forecast handles features differently depending on their nature. Here’s a quick overview:

Features That Don’t Require Additional Action

  • Features with consistent data points, such as daily sales figures, which FOR can process automatically.
  • Boolean features indicating yes/no conditions, as they are straightforward and easy for FOR to interpret.
warning

Beware of using a binary system of 0 and 1 to denote Boolean features. Running a project with a feature target dimension that was a Boolean value has allowed projects to run a full pipeline and deploy but cause errors for predictions.

image-20240926-175008.png

Features That May Need Preprocessing

  • Aggregated data: If your dataset includes daily data but you require weekly summaries, aggregate the data before uploading.
  • Features with sparse data points: Ensure that there are sufficient data points (preferably consecutive) to allow FOR to detect frequency.

FAQ

Can you apply higher granularity feature data (daily) to lower granularity target data (monthly)? Can you also apply lower granularity feature sets (monthly) to higher granularity target sets (weekly)? How are feature values distributed in such cases?

It is possible to apply numerical feature data across different granularities, both from higher to lower (daily to monthly) and from lower to higher (monthly to weekly). The chart below outlines how features are applied to actuals based on the frequency of the datasets.

Actuals Frequency

Feature Frequency

Matching Algorithm

Daily

Daily

Direct match.

Weekly

Daily

Average the daily values.

Monthly

Daily

Average the daily values.

Daily

Weekly

Use weekly value for every day in that week.

Weekly

Weekly

Direct match.

Monthly

Weekly

Average the week values and apply to the month. For weeks that span two months, use a weighted average.

Daily

Monthly

Use monthly value for every day in that month.

Weekly

Monthly

Use monthly value for every week in that month. For weeks that span two months, use a weighted average.

Monthly

Monthly

Direct match.


Conclusion

Understanding these common errors and their resolutions is crucial for successful interaction with FOR. By following best practices and preprocessing your data correctly, you can minimize errors and streamline your project workflow.

Was this page helpful?