Model Training

LTprophecy provides a managed ML training pipeline supporting XGBoost, LightGBM, CatBoost, Prophet, and ensemble methods with automated hyperparameter optimization via Optuna.

Available Algorithms

Algorithm     | Best For                                  | Plan
--------------|-------------------------------------------|-----------
Prophet       | Seasonal time series with holiday effects | All
XGBoost       | Tabular data with many features           | All
LightGBM      | Large datasets, faster training           | Growth+
CatBoost      | High-cardinality categoricals             | Growth+
Ensemble      | Maximum accuracy via model stacking       | Enterprise
Custom (BYOM) | Your own scikit-learn pipeline            | Enterprise
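To make the Ensemble row concrete: stacking blends the predictions of several base models through a meta-learner fitted on held-out data. A minimal stdlib-only sketch for two base forecasts, using least-squares blend weights (illustrative only; the platform's ensemble implementation is more elaborate):

```python
# Learn blend weights for two base-model forecasts by least squares
# (normal equations, no intercept). Illustrative sketch only.

def stack_weights(p1, p2, y):
    """Return (w1, w2) minimizing sum((w1*p1 + w2*p2 - y)^2)."""
    a = sum(x * x for x in p1)               # p1 . p1
    b = sum(x * z for x, z in zip(p1, p2))   # p1 . p2
    c = sum(z * z for z in p2)               # p2 . p2
    d = sum(x * t for x, t in zip(p1, y))    # p1 . y
    e = sum(z * t for z, t in zip(p2, y))    # p2 . y
    det = a * c - b * b
    w1 = (d * c - b * e) / det
    w2 = (a * e - b * d) / det
    return w1, w2

def blend(p1, p2, w1, w2):
    """Combine two forecasts with the learned weights."""
    return [w1 * x + w2 * z for x, z in zip(p1, p2)]
```

If one base model already matches the target perfectly, the least-squares solution puts all weight on it, which is a useful sanity check.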

Training Configuration

Basic Options

  • Algorithm — select from the table above
  • Dataset & version — pin to a specific dataset version
  • Target column — the numeric column to forecast
  • Feature columns — additional inputs to the model
  • Validation split — train/val/test ratio (default 70/15/15)
  • Forecast horizon — number of periods to forecast ahead
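For time-series data the validation split is chronological rather than shuffled: the oldest 70% trains the model, the next 15% tunes it, and the newest 15% is held out for testing. A sketch of that split logic (assumed behavior, shown only to make the ratios concrete):

```python
def chronological_split(rows, train=0.70, val=0.15):
    """Split time-ordered rows into train/val/test without shuffling.
    Whatever remains after train and val (15% by default) is the test set."""
    n = len(rows)
    i = int(n * train)      # end of the training window
    j = i + int(n * val)    # end of the validation window
    return rows[:i], rows[i:j], rows[j:]
```

Shuffled splits would leak future observations into training, which is why the windows stay in time order.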

Hyperparameter Optimization

Enable Auto-tune (Optuna) to let the platform search for optimal hyperparameters. Configure:

  • n_trials — number of Optuna trials (default: 50)
  • timeout_minutes — maximum search time
  • Metric — RMSE, MAE, MAPE, or sMAPE
  • CV folds — time-series cross-validation folds (default: 5)
  • Pruning — early stopping of unpromising trials
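The four objective metrics differ mainly in how they scale errors: RMSE penalizes large misses quadratically, MAE linearly, and the percentage metrics normalize by the actual values. Reference formulas as a stdlib-only sketch (percentage metrics returned on a 0–100 scale):

```python
import math

def rmse(y, yhat):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def mae(y, yhat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

def mape(y, yhat):
    """Mean absolute percentage error; undefined when an actual is zero."""
    return 100 * sum(abs((a - b) / a) for a, b in zip(y, yhat)) / len(y)

def smape(y, yhat):
    """Symmetric MAPE; bounded above by 200."""
    return 100 * sum(2 * abs(b - a) / (abs(a) + abs(b))
                     for a, b in zip(y, yhat)) / len(y)
```

sMAPE is the safer default when actuals can be near zero, since MAPE's denominator blows up there.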

Training Jobs & GPU Queue

Training jobs are queued through Celery workers. CPU workers handle most models; GPU workers (if provisioned) accelerate deep learning and large LightGBM runs. You can monitor job status in real time via Models → Training Runs.

Typical training times:

  • Prophet (≤ 100k rows): < 2 min
  • XGBoost with Optuna (100k rows, 50 trials): 5–15 min
  • Ensemble stack: 20–45 min

Experiment Tracking (MLflow)

Every training run is automatically logged to MLflow with:

  • All hyperparameters
  • Evaluation metrics (RMSE, MAE, R², etc.)
  • Feature importance plots
  • Confusion matrices and residual charts
  • Serialized model artifact (stored in MinIO)

Access the MLflow UI via Models → MLflow Dashboard (admins only in production).

Model Registry & Promotion

After training, models pass through lifecycle stages:

  1. Staging — trained, under evaluation
  2. Production — promoted by an admin, used for forecasts
  3. Archived — retired, artifacts retained

Only Production models can be selected when creating new forecasts. Promoting a model requires the org:models:manage permission.
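The lifecycle above amounts to a small state machine. An illustrative sketch (not the platform's code; we additionally assume a Staging model can be archived without ever being promoted):

```python
# Allowed lifecycle transitions. Promotion to Production also requires
# the org:models:manage permission, which is enforced server-side.
TRANSITIONS = {
    "Staging": {"Production", "Archived"},
    "Production": {"Archived"},
    "Archived": set(),   # terminal: archived models are not revived
}

def can_transition(current, target):
    """Return True if the stage change is allowed."""
    return target in TRANSITIONS.get(current, set())
```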

Model Evaluation

The evaluation panel shows:

  • Hold-out test set metrics
  • Backtested forecasts vs actuals chart
  • Shapley value feature importance (XGBoost/LightGBM)
  • Drift detection against training data distribution
  • Calibration curves for probabilistic forecasters
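Drift detection compares the live feature distribution against the one seen at training time. One common approach to this (shown here as a sketch; the platform's exact method is not specified) is the Population Stability Index over shared histogram bins:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between two numeric samples.
    Bins span the expected (training) sample's range; out-of-range
    values are clamped into the edge bins.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 major."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def freqs(sample):
        counts = [0] * bins
        for x in sample:
            i = min(int((x - lo) / width), bins - 1)
            counts[max(i, 0)] += 1
        # Additive smoothing avoids log(0) and division by zero.
        return [(c + 0.5) / (len(sample) + 0.5 * bins) for c in counts]

    e, a = freqs(expected), freqs(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

An identical distribution scores 0, while a shifted one quickly crosses the 0.25 "major drift" threshold, which is typically what triggers a retraining alert.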