Training Models with AutoTrain

HuggingFace AutoTrain lets you train custom ML models without writing training code. XGENIA exposes AutoTrain in two ways: through AI chat tools and through visual Pro nodes on the canvas.

What you will learn in this guide

  • How to create an AutoTrain project
  • How to upload training data
  • How to start and monitor training
  • How to scaffold a complete ML pipeline on the canvas

Supported Tasks

| Task | Description | Example Use Case |
|------|-------------|------------------|
| text-classification | Classify text into categories | Sentiment analysis, spam detection |
| tabular-classification | Classify tabular data rows | Churn prediction, fraud detection |
| tabular-regression | Predict numeric values from tabular data | Price forecasting, scoring |
| image-classification | Classify images into categories | Product categorization |
| llm-finetuning | Fine-tune a large language model | Custom chatbots, domain-specific AI |
| dreambooth | Fine-tune image generation models | Custom image generation |
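
If you pass task identifiers around in your own scripts, it can help to pin them down as a type. The following is a minimal sketch built only from the task names in the table above; the names `SUPPORTED_TASKS`, `AutoTrainTask`, and `isSupportedTask` are illustrative and not part of XGENIA or AutoTrain.

```typescript
// Task identifiers copied from the Supported Tasks table.
const SUPPORTED_TASKS = [
  "text-classification",
  "tabular-classification",
  "tabular-regression",
  "image-classification",
  "llm-finetuning",
  "dreambooth",
] as const;

// Union of the six identifiers above (hypothetical helper type).
type AutoTrainTask = (typeof SUPPORTED_TASKS)[number];

// Narrow an arbitrary string to a known task identifier.
function isSupportedTask(value: string): value is AutoTrainTask {
  return (SUPPORTED_TASKS as readonly string[]).indexOf(value) !== -1;
}
```

A check like `isSupportedTask(userInput)` catches a mistyped task name before it reaches a training request.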

Training via AI Chat

Step 1: Create a project

Create an AutoTrain project called "sentiment-model" for text classification

The AI calls autotrain_create_project and returns a project ID.

Step 2: Upload your data

Upload training data to the sentiment-model project from https://example.com/train.csv

The AI calls autotrain_upload_data with the project ID and data URL.

Step 3: Start training

Start training the sentiment-model project

The AI calls autotrain_start_training. Training runs on HuggingFace infrastructure.

Step 4: Check status

What's the status of my sentiment-model training?

The AI calls autotrain_status and reports progress, including any trained model repositories.
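
If you run this sequence for several projects, the four prompts can be generated from a template. The sketch below reuses the prompt wording and tool names from the steps above; `buildPrompts` and `ChatStep` are hypothetical names, and the assumption that the AI maps each prompt to the listed tool for other project names is based only on the examples in this guide.

```typescript
// One chat prompt and the tool the guide says the AI calls for it.
type ChatStep = { prompt: string; tool: string };

// Build the four prompts from this guide for a given project.
// `task` is the plain-language task name, e.g. "text classification".
function buildPrompts(project: string, task: string, dataUrl: string): ChatStep[] {
  return [
    {
      prompt: `Create an AutoTrain project called "${project}" for ${task}`,
      tool: "autotrain_create_project",
    },
    {
      prompt: `Upload training data to the ${project} project from ${dataUrl}`,
      tool: "autotrain_upload_data",
    },
    {
      prompt: `Start training the ${project} project`,
      tool: "autotrain_start_training",
    },
    {
      prompt: `What's the status of my ${project} training?`,
      tool: "autotrain_status",
    },
  ];
}
```

Send the prompts one at a time and in order, since the upload step depends on the project created in the first step.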

Scaffolding a Full Pipeline

Ask the AI to set up a complete ML pipeline on your canvas:

Create an AutoTrain workflow for text classification

This calls create_autotrain_workflow and places the following nodes on your canvas:

  1. HF Token variable — stores your API key
  2. Project Name variable — the AutoTrain project name
  3. Dataset URL variable — where your training data lives
  4. Create Project — JS function node that creates the project
  5. Upload Data — JS function node that uploads the dataset
  6. Start Training — JS function node that triggers training

All nodes are pre-wired using @label references. Set your token and dataset URL, then trigger the pipeline.
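
To picture what "pre-wired" means, the six nodes can be modelled as a small dependency graph. This is a sketch under assumptions: the guide does not show the exact @label syntax or which node references which, so the `refs` lists below are a plausible guess, and `PipelineNode` and `executionOrder` are hypothetical names.

```typescript
type PipelineNode = {
  label: string;
  kind: "variable" | "function";
  refs: string[]; // labels this node reads through @label references (assumed wiring)
};

const pipeline: PipelineNode[] = [
  { label: "HF Token", kind: "variable", refs: [] },
  { label: "Project Name", kind: "variable", refs: [] },
  { label: "Dataset URL", kind: "variable", refs: [] },
  { label: "Create Project", kind: "function", refs: ["HF Token", "Project Name"] },
  { label: "Upload Data", kind: "function", refs: ["HF Token", "Dataset URL", "Create Project"] },
  { label: "Start Training", kind: "function", refs: ["HF Token", "Upload Data"] },
];

// Depth-first topological sort: every node appears after the nodes it references.
function executionOrder(nodes: PipelineNode[]): string[] {
  const byLabel: Record<string, PipelineNode> = {};
  for (const node of nodes) byLabel[node.label] = node;

  const state: Record<string, "visiting" | "done"> = {};
  const order: string[] = [];

  function visit(label: string): void {
    if (state[label] === "done") return;
    if (state[label] === "visiting") throw new Error("cycle through @" + label);
    const node = byLabel[label];
    if (node === undefined) throw new Error("unknown reference @" + label);
    state[label] = "visiting";
    for (const ref of node.refs) visit(ref);
    state[label] = "done";
    order.push(label);
  }

  for (const node of nodes) visit(node.label);
  return order;
}
```

Under this assumed wiring, the three variables resolve first and the function nodes run as Create Project, Upload Data, then Start Training, which matches the order of the numbered list above.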

Using the Auto ML Trainer Pro Node

For more control, use the Auto ML Trainer visual node.

Inputs

| Port | Type | Description |
|------|------|-------------|
| data | Array | Training data (array of records) |
| targetColumn | String | Column to predict |
| taskType | Enum | text-classification, tabular-classification, etc. |
| baseModel | String | Base model from HuggingFace Hub (optional) |
| hyperparameters | Object | { learning_rate, num_epochs, batch_size } |
| hfToken | String | HuggingFace API token |
| mlServerUrl | String | ML Coordinator URL |
| Train | Signal | Start training |
| checkStatus | Signal | Poll for training progress |
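
As an illustration of the value inputs, the sketch below writes them out as a typed object and checks that every record contains the target column. The interface mirrors the table above, but the sample records, hyperparameter values, token, and URL are placeholders, and `TrainerInputs` and `checkInputs` are hypothetical names rather than XGENIA APIs. Whether all three hyperparameters are required is not stated in this guide, so they are marked optional here.

```typescript
type TrainingRecord = Record<string, string | number | boolean | null>;

// Value ports from the Inputs table (signal ports are omitted).
interface TrainerInputs {
  data: TrainingRecord[];
  targetColumn: string;
  taskType: string;
  baseModel?: string; // optional per the table
  hyperparameters: { learning_rate?: number; num_epochs?: number; batch_size?: number };
  hfToken: string;
  mlServerUrl: string;
}

// Return a list of problems; an empty list means the basic checks passed.
function checkInputs(inputs: TrainerInputs): string[] {
  const problems: string[] = [];
  if (inputs.data.length === 0) problems.push("data is empty");
  inputs.data.forEach((row, i) => {
    if (!(inputs.targetColumn in row)) {
      problems.push("row " + i + " has no column '" + inputs.targetColumn + "'");
    }
  });
  return problems;
}

// Placeholder values for illustration only.
const exampleInputs: TrainerInputs = {
  data: [
    { review: "Great product", sentiment: "positive" },
    { review: "Arrived broken", sentiment: "negative" },
  ],
  targetColumn: "sentiment",
  taskType: "text-classification",
  hyperparameters: { learning_rate: 0.00002, num_epochs: 3, batch_size: 8 },
  hfToken: "hf_placeholder",
  mlServerUrl: "https://ml-coordinator.example.com",
};
```

Running a check like this before firing Train surfaces a misspelled `targetColumn` locally instead of as a failed training job.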

Outputs

| Port | Type | Description |
|------|------|-------------|
| status | String | Current training status |
| hfProjectId | String | AutoTrain project ID |
| hfModelRepo | String | Trained model repository on HuggingFace |
| trainingMetrics | Object | Accuracy, loss, and other metrics |
| error | String | Error message if training fails |
| statusChanged | Signal | Fires when status changes |
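
A handler attached to statusChanged might read the outputs like this. The sketch assumes that `error` and `hfModelRepo` are empty strings until they have a value and that metrics are numeric; neither is confirmed by this guide. `TrainerOutputs` and `describeOutputs` are hypothetical names.

```typescript
// Value ports from the Outputs table.
interface TrainerOutputs {
  status: string;
  hfProjectId: string;
  hfModelRepo: string;
  trainingMetrics: Record<string, number>; // assumed numeric, e.g. accuracy and loss
  error: string;
}

// Turn the current outputs into a one-line summary.
function describeOutputs(out: TrainerOutputs): string {
  if (out.error !== "") return "Training failed: " + out.error;
  if (out.hfModelRepo !== "") {
    const metrics = Object.keys(out.trainingMetrics)
      .map((name) => name + "=" + out.trainingMetrics[name])
      .join(", ");
    return "Model ready at " + out.hfModelRepo + (metrics ? " (" + metrics + ")" : "");
  }
  return "Training in progress: " + out.status;
}
```

Checking `error` first means a failed run is never reported as ready, even if a repository name was set earlier.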

Typical Workflow

  1. Wire your analyzed data and target column from the Auto ML Analyzer into data and targetColumn
  2. Set taskType to the hfTaskType value reported by the analyzer
  3. Trigger Train to start training
  4. Poll with checkStatus or listen for statusChanged
  5. When training completes, pass hfModelRepo to the Auto ML Predictor node
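
The polling step of the workflow above can be sketched as a loop with a fixed interval and an attempt limit. This is an illustration, not the node's implementation: `checkStatus` here is a caller-supplied function standing in for whatever triggers the checkStatus signal and reads the outputs back, and the completion rule (a non-empty `error` or `hfModelRepo`) is an assumption, since this guide does not list the possible status values.

```typescript
type TrainerSnapshot = { status: string; hfModelRepo: string; error: string };

// Assumed completion rule: training is over once an error or a model repo appears.
function isFinished(snapshot: TrainerSnapshot): boolean {
  return snapshot.error !== "" || snapshot.hfModelRepo !== "";
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call checkStatus until training finishes or maxAttempts is reached.
async function pollUntilFinished(
  checkStatus: () => Promise<TrainerSnapshot>, // hypothetical status reader
  intervalMs: number,
  maxAttempts: number,
): Promise<TrainerSnapshot> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snapshot = await checkStatus();
    if (isFinished(snapshot)) return snapshot;
    // Wait before the next check, but not after the final attempt.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error("training still running after " + maxAttempts + " status checks");
}
```

Listening for statusChanged avoids the loop entirely. Polling is mainly useful when you want a hard upper bound on how long to wait.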