Analyzing Data for Machine Learning
Before training a model, XGENIA can analyze your dataset to detect the problem type, suggest a likely target column, identify data quality issues, and recommend preprocessing steps.
What you will learn in this guide
- How to use the AI chat to analyze data
- What insights the analysis provides
- How to use the Auto ML Analyzer Pro node
Using the AI Chat
The simplest way to analyze data is through the AI assistant. Ask it to analyze your data:
Analyze this customer data for churn prediction:
[paste your data or describe it]
The AI calls `hf_analyze_data` under the hood, which sends your data to the ML Coordinator for analysis.
Analysis Output
The tool returns structured insights:
| Field | Description |
|---|---|
| `problemType` | Classification, regression, or clustering |
| `suggestedTarget` | The column most likely to be the prediction target |
| `suggestedFeatures` | Columns that are good predictors |
| `hfTaskType` | The corresponding HuggingFace task (e.g. `tabular-classification`) |
| `qualityIssues` | Missing values, outliers, class imbalance warnings |
| `preprocessingSteps` | Recommended data preparation steps |
| `columns` | Per-column statistics (type, uniqueness, missing rate) |
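To make the `columns` field more concrete, here is a minimal Python sketch of how per-column statistics of this kind (type, uniqueness, missing rate) could be computed locally. It is not XGENIA's implementation: the function name `profile_columns`, the output keys, and the choice to treat `None` and empty strings as missing are all assumptions made for illustration.

```python
from typing import Any, Dict, List


def profile_columns(records: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Compute simple per-column statistics for a list of records.

    Hypothetical helper: the key names below are illustrative and are not
    the actual shape of the `columns` field returned by the analysis.
    Assumes scalar (hashable) cell values.
    """
    # Collect every column name that appears in at least one record,
    # preserving first-seen order.
    column_names: List[str] = []
    for record in records:
        for name in record:
            if name not in column_names:
                column_names.append(name)

    total = len(records)
    stats: Dict[str, Dict[str, Any]] = {}
    for name in column_names:
        values = [record.get(name) for record in records]
        # Assumption: None and empty strings count as missing.
        present = [v for v in values if v is not None and v != ""]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_numeric = bool(present) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in present
        )
        stats[name] = {
            "type": "numeric" if is_numeric else "categorical",
            "uniqueCount": len(set(present)),
            "missingRate": (total - len(present)) / total if total else 0.0,
        }
    return stats


sample = [
    {"age": 34, "plan": "pro", "churned": "no"},
    {"age": None, "plan": "free", "churned": "yes"},
    {"age": 51, "plan": "pro", "churned": "no"},
    {"age": 29, "plan": "", "churned": "yes"},
]
column_stats = profile_columns(sample)
```

A high missing rate in a column is the kind of finding that would be reported under `qualityIssues`.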
Using the Auto ML Analyzer Pro Node
For visual workflows, drag the Auto ML Analyzer node onto your canvas.
Inputs
| Port | Type | Description |
|---|---|---|
| `data` | Array/Object | Your dataset (array of records) |
| `mlServerUrl` | String | ML server URL (default: `http://localhost:3001`) |
| `context` | String | What you want to predict (e.g. "customer churn") |
| `hfToken` | String | HuggingFace API token |
| `Analyze` | Signal | Trigger the analysis |
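The inputs above can be thought of as a small configuration object. The Python sketch below assembles and validates one before it would be handed to the node. The helper name `build_analyzer_inputs` and the `HF_TOKEN` environment variable are assumptions for illustration, not part of XGENIA; only the port names and the default URL come from the table. It covers only the array-of-records form of `data`.

```python
import os
from typing import Any, Dict, List, Optional

# Default taken from the Inputs table.
DEFAULT_ML_SERVER_URL = "http://localhost:3001"


def build_analyzer_inputs(
    data: List[Dict[str, Any]],
    context: str,
    ml_server_url: str = DEFAULT_ML_SERVER_URL,
    hf_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a dict keyed by the Analyzer's input port names.

    Hypothetical helper: it only validates and packages values; it does
    not call the ML server.
    """
    if not isinstance(data, list) or not data:
        raise ValueError("data must be a non-empty array of records")
    if not all(isinstance(record, dict) for record in data):
        raise ValueError("every record in data must be an object")
    if not context.strip():
        raise ValueError("context should describe what you want to predict")

    # Assumption: read the token from an environment variable instead of
    # hard-coding it in the workflow. The variable name is illustrative.
    token = hf_token if hf_token is not None else os.environ.get("HF_TOKEN", "")

    return {
        "data": data,
        "mlServerUrl": ml_server_url,
        "context": context.strip(),
        "hfToken": token,
    }


inputs = build_analyzer_inputs(
    [{"tenure": 3, "plan": "free", "churned": "yes"}],
    context="customer churn",
)
```

Keeping the token out of the workflow definition is a design choice of this sketch, not a documented requirement.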
Outputs
| Port | Type | Description |
|---|---|---|
| `insights` | Object | Full analysis results |
| `problemType` | String | Detected problem type |
| `targetColumn` | String | Suggested target column |
| `suggestedFeatures` | Array | Recommended feature columns |
| `dataQualityIssues` | Array | Data quality warnings |
| `preprocessingSteps` | Array | Recommended preprocessing |
| `hfTaskType` | String | HuggingFace task type |
| `error` | String | Error message if analysis fails |
Example Workflow
- Connect a Query Records node or REST node to provide data
- Wire the data output to the Analyzer's `data` input
- Set the `context` to describe your prediction goal
- Trigger the `Analyze` signal
- Use the outputs to configure training (target column, features, task type)
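The last step of the workflow, using the outputs to configure training, might look like the following Python sketch. It assumes the Analyzer's outputs are available as a dict keyed by the port names from the Outputs table. The function name `prepare_training_inputs`, the sample values, and the decision to drop rows without a target value are illustrative assumptions. The sketch covers supervised problems only, since clustering has no target column.

```python
from typing import Any, Dict, List


def prepare_training_inputs(
    records: List[Dict[str, Any]], outputs: Dict[str, Any]
) -> Dict[str, Any]:
    """Turn Analyzer outputs into a feature matrix, labels, and task type."""
    # The `error` port carries a message when analysis fails; stop early.
    if outputs.get("error"):
        raise RuntimeError(f"Analysis failed: {outputs['error']}")

    target = outputs["targetColumn"]
    # Make sure the target never leaks into the feature list.
    features = [c for c in outputs["suggestedFeatures"] if c != target]

    # Assumption: rows without a target value are unusable for supervised
    # training and are dropped.
    usable = [r for r in records if r.get(target) is not None]
    return {
        "taskType": outputs["hfTaskType"],
        "target": target,
        "features": features,
        "X": [[r.get(c) for c in features] for r in usable],
        "y": [r[target] for r in usable],
    }


records = [
    {"tenure": 3, "plan": "free", "churned": "yes"},
    {"tenure": 24, "plan": "pro", "churned": "no"},
    {"tenure": 7, "plan": "free", "churned": None},
]
# Hypothetical output values, keyed by the port names from the Outputs table.
outputs = {
    "problemType": "classification",
    "targetColumn": "churned",
    "suggestedFeatures": ["tenure", "plan", "churned"],
    "hfTaskType": "tabular-classification",
    "error": "",
}
training = prepare_training_inputs(records, outputs)
```

Checking `error` first keeps a failed analysis from silently producing an empty or misconfigured training set.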
Retention Analysis
For customer retention specifically, use the Client Retention Analyzer node. It provides specialized outputs:
- Churn rate — percentage of churned customers
- Risk factors — columns correlated with churn, ranked by significance
- Recommendations — actionable suggestions to reduce churn
Wire it to a Retention Action Engine node to automatically generate action plans based on the risk factors.
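As a rough illustration of the first two outputs, the Python sketch below computes a churn rate and ranks numeric columns by the absolute value of their Pearson correlation with churn. This is a simple stand-in chosen for the example: how the Client Retention Analyzer actually measures significance is not documented here, and the function names, column names, and sample data are all hypothetical.

```python
import math
from typing import Any, Dict, List, Tuple


def churn_rate(records: List[Dict[str, Any]], churn_column: str) -> float:
    """Return the percentage of records whose churn flag is truthy."""
    if not records:
        return 0.0
    churned = sum(1 for r in records if r[churn_column])
    return 100.0 * churned / len(records)


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_risk_factors(
    records: List[Dict[str, Any]],
    churn_column: str,
    candidate_columns: List[str],
) -> List[Tuple[str, float]]:
    """Rank numeric columns by |correlation| with the churn flag."""
    labels = [1.0 if r[churn_column] else 0.0 for r in records]
    scored = []
    for column in candidate_columns:
        values = [float(r[column]) for r in records]
        scored.append((column, pearson(values, labels)))
    # Strongest association first, regardless of sign.
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical sample data.
customers = [
    {"support_tickets": 5, "monthly_spend": 20.0, "churned": True},
    {"support_tickets": 4, "monthly_spend": 25.0, "churned": True},
    {"support_tickets": 1, "monthly_spend": 30.0, "churned": False},
    {"support_tickets": 0, "monthly_spend": 22.0, "churned": False},
]
rate = churn_rate(customers, "churned")
ranked = rank_risk_factors(
    customers, "churned", ["monthly_spend", "support_tickets"]
)
```

In this sample, `support_tickets` ranks above `monthly_spend` because its correlation with churn is larger in magnitude. Correlation only indicates association, so a ranking like this is a starting point for investigation rather than proof of cause.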