Datasets

Datasets are the foundation of your forecasting pipeline. This guide covers importing, configuring, and managing your data.

Supported Formats

SalesForecaster accepts data in several formats:

CSV — Comma-separated values with a header row
Excel — .xlsx workbooks (first sheet is used by default)
JSON — Array of flat objects
API — Push data programmatically via our REST API
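If your source records are nested, you will likely need to flatten them before submitting JSON, since the format expects an array of flat objects. The sketch below shows one way to do that. The field names and the underscore-joined key convention are illustrative assumptions, not something the platform prescribes.

```python
import json

def flatten(record, parent_key="", sep="_"):
    """Collapse nested dicts into one level, joining key names with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat

# Hypothetical nested export, e.g. from a CRM
nested = [
    {"date": "2024-01-01", "revenue": 1200.0, "deal": {"region": "EMEA", "count": 4}},
    {"date": "2024-01-08", "revenue": 1350.5, "deal": {"region": "EMEA", "count": 5}},
]

flat_records = [flatten(r) for r in nested]
payload = json.dumps(flat_records, indent=2)  # an array of flat objects
```

Each entry in flat_records now has only scalar values, for example deal_region and deal_count in place of the nested deal object.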

Uploading Data

Via the Dashboard

Navigate to Data > Upload
Drag and drop your file or click Browse
Preview the detected schema and adjust column types
Click Save Dataset

Via the API

# -F makes curl send multipart/form-data and set the Content-Type header (including its boundary) itself
curl -X POST https://api.ltprophecy.com/v1/datasets \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@sales_data.csv" \
  -F "name=Q1 Sales Data"
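If you prefer to upload from a script, the same request can be sketched in Python using only the standard library. The endpoint and the file and name form fields come from the curl example above. The helper name build_upload_request and the text/csv part type are assumptions for illustration. The request is built but not sent, so you can inspect it first.

```python
import uuid
import urllib.request

API_URL = "https://api.ltprophecy.com/v1/datasets"  # endpoint from the curl example

def build_upload_request(api_key, filename, file_bytes, dataset_name):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex  # random separator between form parts
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: text/csv\r\n\r\n",  # assumed part type for a CSV file
        file_bytes,
        b"\r\n",
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="name"\r\n\r\n',
        dataset_name.encode(),
        b"\r\n",
        f"--{boundary}--\r\n".encode(),  # closing boundary
    ])
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Unlike curl -F, urllib needs the boundary declared by hand
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )

request = build_upload_request(
    "YOUR_API_KEY",
    "sales_data.csv",
    b"date,revenue\n2024-01-01,1200\n",
    "Q1 Sales Data",
)
# To send it: urllib.request.urlopen(request)  (needs a valid key and network access)
```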

Data Requirements

Minimum Requirements

At least one date column and one numeric target column
A recommended minimum of 52 data points for weekly data (12 for monthly)
Consistent date frequency (no large gaps)
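You can check a file against these requirements before uploading it. The following is a minimal sketch, assuming ISO-formatted dates (YYYY-MM-DD) and a CSV with a header row. The function name check_dataset and the max_gap_days threshold are assumptions, since this guide does not define how large a "large gap" is.

```python
import csv
import io
from datetime import date

def check_dataset(csv_text, date_col, target_col, min_points, max_gap_days):
    """Return a list of problems found; an empty list means the checks passed."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return ["no data rows"]
    if date_col not in rows[0] or target_col not in rows[0]:
        return [f"missing required column: {date_col} or {target_col}"]
    try:
        dates = sorted(date.fromisoformat(r[date_col]) for r in rows)
    except (TypeError, ValueError):
        return [f"{date_col} contains values that are not ISO dates"]

    problems = []
    for r in rows:
        try:
            float(r[target_col])
        except (TypeError, ValueError):
            problems.append(f"non-numeric target value: {r[target_col]!r}")
            break  # one example is enough to report
    if len(rows) < min_points:
        problems.append(f"only {len(rows)} data points, {min_points} recommended")
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    if gaps and max(gaps) > max_gap_days:
        problems.append(f"largest gap is {max(gaps)} days")
    return problems

# Weekly sample with too few rows and a three-week hole
sample = "week,revenue\n2024-01-01,1200\n2024-01-08,1350\n2024-01-29,1100\n"
issues = check_dataset(sample, "week", "revenue", min_points=52, max_gap_days=7)
# issues == ["only 3 data points, 52 recommended", "largest gap is 21 days"]
```

For monthly data you would pass min_points=12 and a max_gap_days of around 31.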

Recommended Structure

For best results, include:

Date column — Timestamps for each observation
Target variable — The metric to forecast (revenue, units, etc.)
Categorical dimensions — Region, product line, customer segment
Numeric features — Deal count, marketing spend, headcount
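As an illustration, a file following this structure might have one row per date and dimension combination. The sketch below writes such a file with Python's csv module. All column names and values are made up for the example and are not required by the platform.

```python
import csv
import io

# Hypothetical column names: date, target, two dimensions, two numeric features
FIELDS = ["week", "revenue", "region", "product_line", "deal_count", "marketing_spend"]

rows = [
    {"week": "2024-01-01", "revenue": 1200.0, "region": "EMEA",
     "product_line": "Core", "deal_count": 4, "marketing_spend": 300.0},
    {"week": "2024-01-01", "revenue": 950.0, "region": "APAC",
     "product_line": "Core", "deal_count": 3, "marketing_spend": 250.0},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
writer.writeheader()   # header row, as the CSV format expects
writer.writerows(rows)
csv_text = buffer.getvalue()
```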

Data Pipeline

The data pipeline automates recurring data imports:

Scheduled pulls — Connect to your data warehouse or CRM
Transformation rules — Apply cleaning and normalization
Validation checks — Automatic data quality monitoring
Incremental updates — Append new data without re-uploading

Configure pipelines under Data > Pipeline.
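To make the idea of incremental updates concrete, here is a small sketch of one common approach: keep a date watermark and append only rows newer than it. This is an assumption about how such a step could work, not a description of the platform's internal logic. The function name append_new_rows and the date key are hypothetical.

```python
from datetime import date

def append_new_rows(existing, incoming, date_key="date"):
    """Append only the incoming rows dated after the latest stored row."""
    if existing:
        latest = max(date.fromisoformat(r[date_key]) for r in existing)
        fresh = [r for r in incoming if date.fromisoformat(r[date_key]) > latest]
    else:
        fresh = list(incoming)
    # ISO date strings sort chronologically, so a plain string sort is enough
    return existing + sorted(fresh, key=lambda r: r[date_key])

existing = [
    {"date": "2024-01-01", "revenue": 1200.0},
    {"date": "2024-01-08", "revenue": 1350.0},
]
incoming = [
    {"date": "2024-01-08", "revenue": 1350.0},  # already stored, skipped
    {"date": "2024-01-15", "revenue": 1410.0},  # new, appended
]
merged = append_new_rows(existing, incoming)
```

A watermark like this ignores late corrections to older rows, so a real pipeline may also need a way to reprocess a recent window.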

Feature Engineering

SalesForecaster automatically generates features from your raw data:

Temporal features — Day of week, month, quarter, holiday indicators
Lag features — Previous period values and rolling averages
Growth rates — Period-over-period and year-over-year changes
Interaction features — Cross-dimensional aggregations

You can also define custom features under Data > Features.
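The feature families listed above can be sketched in a few lines of plain Python. This is only an illustration of what temporal, lag, rolling-average, and growth-rate features mean. The function names are hypothetical and the platform's own implementation may differ.

```python
from datetime import date

def temporal_features(iso_day):
    """Calendar features derived from a single date."""
    d = date.fromisoformat(iso_day)
    return {"day_of_week": d.weekday(), "month": d.month, "quarter": (d.month - 1) // 3 + 1}

def lag(values, periods=1):
    """Shift the series forward; positions without history become None."""
    return ([None] * periods + list(values))[: len(values)]

def rolling_mean(values, window):
    """Mean of the last `window` values; None until enough history exists."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out

def growth_rate(values, periods=1):
    """Relative change versus the value `periods` steps earlier."""
    out = []
    for i, current in enumerate(values):
        previous = values[i - periods] if i >= periods else None
        # None when there is no history or the base value is zero
        out.append(None if not previous else (current - previous) / previous)
    return out

revenue = [100.0, 120.0, 150.0, 120.0]
calendar = temporal_features("2024-01-01")  # {'day_of_week': 0, 'month': 1, 'quarter': 1}
lagged = lag(revenue)                       # [None, 100.0, 120.0, 150.0]
smoothed = rolling_mean(revenue, window=2)  # [None, 110.0, 135.0, 135.0]
growth = growth_rate(revenue)               # [None, 0.2, 0.25, -0.2]
```

For year-over-year growth on weekly data you would call growth_rate with periods=52.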

Data Quality

The platform continuously monitors data quality:

Missing value detection — Identifies and flags gaps
Outlier detection — Statistical and ML-based anomaly detection
Schema drift — Alerts when data structure changes
Freshness monitoring — Warns when data hasn't been updated
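To show what these checks look for, the sketch below implements a simple version of each. The outlier rule uses the median absolute deviation, a basic statistical method chosen here for illustration; it is not necessarily what the platform uses. The function names and thresholds are assumptions.

```python
import statistics
from datetime import date

def missing_indices(values):
    """Positions where an observation is absent."""
    return [i for i, v in enumerate(values) if v is None]

def mad_outliers(values, k=5.0):
    """Positions further than k median absolute deviations from the median."""
    present = [v for v in values if v is not None]
    if not present:
        return []
    center = statistics.median(present)
    mad = statistics.median([abs(v - center) for v in present])
    if mad == 0:
        return []  # no spread, so nothing can be ranked as unusual
    return [i for i, v in enumerate(values)
            if v is not None and abs(v - center) > k * mad]

def schema_drift(expected_columns, actual_columns):
    """Columns that disappeared or appeared compared with the expected schema."""
    expected, actual = set(expected_columns), set(actual_columns)
    return {"missing": sorted(expected - actual), "unexpected": sorted(actual - expected)}

def is_stale(last_updated, today, max_age_days):
    """True when the most recent update is older than the allowed age."""
    return (today - last_updated).days > max_age_days

units = [100, 102, None, 98, 101, 500, 99]
gaps = missing_indices(units)                                      # [2]
anomalies = mad_outliers(units)                                    # [5]
drift = schema_drift(["date", "units"], ["date", "units_sold"])
stale = is_stale(date(2024, 1, 1), date(2024, 1, 20), max_age_days=7)  # True
```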