Training Loop Analytics & Cohorts

The Training Loop system provides powerful analytics to help you understand, compare, and improve your AI assistant's performance. Use cohorts to group and compare different AI versions or experiments.

Overview

Analytics in the Training Loop system help you:

  • Compare AI versions: See how different models or playbook versions perform
  • Track outcomes: Monitor customer replies, escalations, and other outcomes
  • Analyze feedback: Understand what users think about AI decisions
  • Identify trends: Spot patterns in performance over time
  • Make data-driven decisions: Use real data to guide improvements

Cohorts

What are Cohorts?

Cohorts are groups of training records that share common characteristics. Records are automatically tagged with cohorts based on:

  • Playbook version: Which version of the playbook was used
  • Prompt version: Which version of prompts was used
  • AI model: Which AI model was used (GPT-4o, GPT-4o Mini, etc.)
  • Code version: Which version of the system code was running

Automatic Tagging

Training records are automatically tagged with cohorts when they're created. The cohort tag includes version information, making it easy to:

  • Group by version: See all decisions from a specific AI version
  • Compare experiments: Compare different versions side-by-side
  • Track changes: See how performance changes with new versions
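
As a rough sketch of the idea, a cohort tag can be derived from the version fields recorded with each decision. The example below is illustrative Python; the field names (playbook_version, prompt_version, model, code_version) and the tag format are assumptions, not the system's actual schema.

```python
from dataclasses import dataclass

@dataclass
class TrainingRecord:
    # Hypothetical version fields; the real record schema may differ.
    playbook_version: str
    prompt_version: str
    model: str
    code_version: str

def cohort_tag(record: TrainingRecord) -> str:
    """Compose a cohort tag from the version fields on a record."""
    return (
        f"playbook_{record.playbook_version}"
        f"_prompt_{record.prompt_version}"
        f"_{record.model}"
        f"_code_{record.code_version}"
    )

print(cohort_tag(TrainingRecord("v3.2", "v1.4", "gpt4o-mini", "2024.06.01")))
# -> playbook_v3.2_prompt_v1.4_gpt4o-mini_code_2024.06.01
```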

Cohort Examples

Example cohort tags:

  • playbook_v3.2_control: Control group for playbook version 3.2
  • playbook_v3.2_experiment: Experiment group for playbook version 3.2
  • gpt4o_vs_gpt4o_mini: Comparison cohort for GPT-4o versus GPT-4o Mini

Analytics Metrics

Record Counts

See how many training records are in each cohort:

  • Total records: Total number of AI decisions recorded
  • By action type: Breakdown by message generation, tool calls, playbook decisions
  • By time period: Records created in specific date ranges
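
For illustration, the sketch below computes these counts over an in-memory list of records. The records, the action_type values, and the created_at field are assumptions for the example; in practice the dashboard does this aggregation for you.

```python
from collections import Counter
from datetime import datetime

# Illustrative records; real counts come from the Training Loop data store.
records = [
    {"action_type": "message_generation", "created_at": datetime(2024, 6, 3)},
    {"action_type": "message_generation", "created_at": datetime(2024, 6, 4)},
    {"action_type": "tool_call",          "created_at": datetime(2024, 6, 9)},
    {"action_type": "playbook_decision",  "created_at": datetime(2024, 5, 20)},
]

# Restrict to a date range, then count totals and the breakdown by action type.
start, end = datetime(2024, 6, 1), datetime(2024, 6, 30)
in_range = [r for r in records if start <= r["created_at"] <= end]

print("Total records:", len(in_range))                                 # -> 3
print("By action type:", Counter(r["action_type"] for r in in_range))
```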

Outcome Signals

Track automatic outcome detection:

  • Customer replied: Percentage of decisions that led to customer replies
  • No further inbound: Percentage where conversations ended
  • Escalation triggered: Percentage that led to escalations
  • Response latency: Average time until customer reply
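
The sketch below shows how these signals reduce to simple aggregates over a batch of records. The outcome and reply_latency_seconds fields are illustrative assumptions rather than the actual schema.

```python
# Illustrative records with an automatically detected outcome per decision.
records = [
    {"outcome": "customer_replied",     "reply_latency_seconds": 95},
    {"outcome": "customer_replied",     "reply_latency_seconds": 140},
    {"outcome": "escalation_triggered", "reply_latency_seconds": None},
    {"outcome": "no_further_inbound",   "reply_latency_seconds": None},
]

def rate(outcome: str) -> float:
    """Share of decisions that ended in the given outcome."""
    return sum(r["outcome"] == outcome for r in records) / len(records)

latencies = [r["reply_latency_seconds"] for r in records
             if r["reply_latency_seconds"] is not None]

print(f"Customer replied:     {rate('customer_replied'):.0%}")      # 50%
print(f"No further inbound:   {rate('no_further_inbound'):.0%}")    # 25%
print(f"Escalation triggered: {rate('escalation_triggered'):.0%}")  # 25%
print(f"Avg reply latency:    {sum(latencies) / len(latencies):.0f}s")
```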

Feedback Statistics

Analyze human feedback:

  • Thumbs up/down: Overall feedback ratio
  • Feedback reasons: Breakdown by reason (helpful, unhelpful, incorrect, etc.)
  • Feedback trends: How feedback changes over time
  • Feedback by cohort: Compare feedback across different versions
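
As a sketch, the thumbs up/down ratio and the breakdown by reason are simple counts over feedback entries; the value and reason fields below are assumed for the example.

```python
from collections import Counter

# Illustrative feedback entries attached to training records.
feedback = [
    {"value": "up",   "reason": "helpful"},
    {"value": "up",   "reason": "helpful"},
    {"value": "down", "reason": "incorrect"},
    {"value": "down", "reason": "unhelpful"},
    {"value": "up",   "reason": "helpful"},
]

thumbs_up = sum(f["value"] == "up" for f in feedback)
print(f"Thumbs-up ratio: {thumbs_up / len(feedback):.0%}")     # -> 60%
print("By reason:", Counter(f["reason"] for f in feedback))
```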

Performance Metrics

Compare AI versions:

  • Agreement rates: How often different versions make the same decision on the same conversation
  • Safety deltas: Changes in risk levels between versions
  • Intent shifts: How intent classification changes
  • Quality scores: Overall performance metrics
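
A minimal sketch of the agreement-rate idea, assuming both cohorts produced one decision label for each conversation in a shared evaluation set (the conversation IDs and labels below are made up):

```python
# Decision labels per conversation for two cohorts over the same conversations.
control    = {"conv-1": "reply", "conv-2": "escalate", "conv-3": "reply"}
experiment = {"conv-1": "reply", "conv-2": "reply",    "conv-3": "reply"}

shared = control.keys() & experiment.keys()
agreement = sum(control[c] == experiment[c] for c in shared) / len(shared)
print(f"Agreement rate: {agreement:.0%}")  # -> 67%
```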

Using Analytics

Viewing Cohort Statistics

  1. Select a cohort: Choose which cohort to analyze
  2. View metrics: See record counts, outcomes, and feedback
  3. Compare cohorts: Select multiple cohorts to compare
  4. Filter by date: Analyze specific time periods

Comparing Cohorts

When comparing cohorts, you can see:

  • Side-by-side metrics: Compare key metrics across cohorts
  • Differences: See where versions differ
  • Trends: Track how each cohort's metrics move over time to see which version performs better
  • Recommendations: Get suggestions based on data

Filtering Data

Filter analytics by:

  • Date range: Specific time periods
  • Action type: Message generation, tool calls, playbook decisions
  • Outcome type: Customer replied, escalation, etc.
  • Feedback value: Positive, negative, or all feedback
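
The sketch below illustrates filtering as a predicate applied to each record; the matches helper and the field names are hypothetical, shown only to make the combination of filters concrete.

```python
from datetime import datetime

def matches(record, start=None, end=None, action_type=None, outcome=None, feedback=None):
    """Return True if the record passes every filter that was supplied."""
    if start and record["created_at"] < start:
        return False
    if end and record["created_at"] > end:
        return False
    if action_type and record["action_type"] != action_type:
        return False
    if outcome and record["outcome"] != outcome:
        return False
    if feedback and record.get("feedback") != feedback:
        return False
    return True

records = [
    {"created_at": datetime(2024, 6, 5), "action_type": "tool_call",
     "outcome": "customer_replied", "feedback": "up"},
    {"created_at": datetime(2024, 6, 6), "action_type": "message_generation",
     "outcome": "escalation_triggered", "feedback": "down"},
]

positive_june = [r for r in records
                 if matches(r, start=datetime(2024, 6, 1), feedback="up")]
print(len(positive_june))  # -> 1
```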

Use Cases

A/B Testing

Compare two AI versions:

  1. Create cohorts: Tag records with different version identifiers
  2. Run experiment: Let both versions handle conversations
  3. Compare results: Use analytics to see which performs better
  4. Make decision: Choose the better-performing version
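
When the metric being compared is a rate (for example, the customer-reply rate), a standard two-proportion z-test is one way to judge whether the difference between cohorts is larger than random noise. This is a generic statistical sketch with made-up numbers, not necessarily the method the dashboard uses.

```python
from math import sqrt, erf

def two_proportion_z(successes_a: int, n_a: int, successes_b: int, n_b: int):
    """Two-proportion z-test; returns (rate difference, two-sided p-value)."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal tail
    return p_b - p_a, p_value

# Illustrative reply counts: control vs. experiment cohort, 600 decisions each.
diff, p = two_proportion_z(successes_a=412, n_a=600, successes_b=455, n_b=600)
print(f"Reply-rate lift: {diff:+.1%}, p = {p:.3f}")
```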

Version Rollout

Monitor new versions:

  1. Tag new version: New records are automatically tagged with the new version's cohort
  2. Monitor metrics: Watch outcomes and feedback
  3. Compare to previous: See if new version is better
  4. Rollback if needed: Revert if performance degrades
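
One simple guardrail during a rollout is to compare the new cohort's metrics against the previous cohort's baseline and flag anything that degrades beyond a tolerance. The metric names, numbers, and thresholds below are illustrative assumptions.

```python
# Baseline metrics from the previous cohort vs. the new cohort (illustrative).
BASELINE = {"reply_rate": 0.71, "escalation_rate": 0.08, "thumbs_up_ratio": 0.86}
NEW      = {"reply_rate": 0.69, "escalation_rate": 0.12, "thumbs_up_ratio": 0.84}

# Allowed movement per metric: negative limits guard drops, positive guard rises.
TOLERANCE = {"reply_rate": -0.03, "escalation_rate": +0.02, "thumbs_up_ratio": -0.03}

def degraded(metric: str) -> bool:
    delta = NEW[metric] - BASELINE[metric]
    limit = TOLERANCE[metric]
    return delta < limit if limit < 0 else delta > limit

alerts = [m for m in BASELINE if degraded(m)]
print("Rollback candidates:", alerts)  # -> ['escalation_rate']
```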

Performance Tracking

Track improvements over time:

  1. Baseline: Establish baseline metrics
  2. Track changes: Monitor metrics as you make improvements
  3. Identify trends: See if performance is improving
  4. Validate changes: Confirm improvements are working
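
As a small illustration, step 3 can be as simple as comparing each week's aggregate against the baseline from step 1; the reply-rate figures below are made up.

```python
baseline_reply_rate = 0.70                    # step 1: the established baseline
weekly_reply_rate = [0.69, 0.71, 0.73, 0.74]  # illustrative weekly aggregates

for week, rate in enumerate(weekly_reply_rate, start=1):
    delta = rate - baseline_reply_rate
    print(f"Week {week}: {rate:.0%} ({delta:+.1%} vs. baseline)")
```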

Analytics Dashboard

Key Metrics

The analytics dashboard shows:

  • Total records: Number of AI decisions recorded
  • Outcome rates: Percentage of decisions that reached each outcome
  • Feedback ratio: Positive vs negative feedback
  • Performance trends: How metrics change over time

Visualizations

Charts and graphs show:

  • Outcome distribution: Pie charts of outcome types
  • Feedback trends: Line graphs of feedback over time
  • Cohort comparison: Bar charts comparing cohorts
  • Performance metrics: Various visualizations of key metrics

Best Practices

Regular Monitoring

  • Check weekly: Review analytics regularly to catch issues early
  • Track trends: Watch for changes in metrics over time
  • Compare versions: Always compare new versions to previous ones
  • Set alerts: Get notified when metrics change significantly

Effective Cohort Management

  • Clear naming: Use descriptive cohort names
  • Consistent tagging: Tag records consistently
  • Document experiments: Note what each cohort represents
  • Archive old cohorts: Remove or archive outdated cohorts

Data-Driven Decisions

  • Use multiple metrics: Don't rely on a single metric
  • Consider context: Understand the context behind the data
  • Validate findings: Confirm findings with additional analysis
  • Act on insights: Use analytics to guide improvements

Privacy & Security

  • Tenant isolation: Analytics are scoped to your tenant and never visible to other tenants
  • No PII: Analytics don't include customer information
  • Secure access: Only authorized users can view analytics
  • Audit logging: All analytics access is logged

Limitations

What Analytics Can't Do

  • Predict the future: Analytics describe past performance; they don't forecast future results
  • Explain everything: Some patterns may not have clear explanations
  • Replace judgment: Use analytics to inform, not replace, human judgment
  • Guarantee results: Better metrics don't guarantee better outcomes

Understanding Metrics

  • Context matters: Metrics need context to be meaningful
  • Sample size: Small samples may not be representative
  • Correlation vs causation: Correlation doesn't mean causation
  • Multiple factors: Many factors affect outcomes, not just AI version
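
To make the sample-size point concrete, the sketch below computes a 95% Wilson confidence interval for a feedback rate: the same 80% thumbs-up rate is far less certain from 10 records than from 1,000. This is a generic statistical illustration, not a dashboard feature.

```python
from math import sqrt

def wilson_interval(successes: int, n: int, z: float = 1.96):
    """95% Wilson score interval for a rate; wide intervals flag small samples."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    margin = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - margin, centre + margin

print(wilson_interval(8, 10))      # roughly (0.49, 0.94)
print(wilson_interval(800, 1000))  # roughly (0.77, 0.82)
```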

Next Steps

Return to the autoch.at Documentation for more on the Training Loop system.