Hierarchical Models
This tutorial covers Bayesian hierarchical modeling using Fugue:
- Varying Intercepts: Group-level intercept variation
- Varying Slopes: Group-level slope variation
- Mixed Effects: Combined random and fixed effects
- Hierarchical Priors: Multi-level parameter structures
- Model Selection: Comparing hierarchical complexity
- Practical Applications: Real-world hierarchical data analysis
After completing this tutorial, you will be able to:
- Model grouped/clustered data with hierarchical structures
- Implement varying intercepts and slopes models
- Use mixed effects for complex data relationships
- Apply hierarchical priors for robust parameter estimation
- Perform model selection across hierarchical complexity levels
- Handle partial pooling vs complete pooling trade-offs
Introduction
Hierarchical models (also called multi-level or mixed-effects models) are essential for analyzing grouped or clustered data where observations within groups are more similar to each other than to observations in other groups. Examples include:
- Students within schools: Academic performance varies by student and school
- Patients within hospitals: Treatment outcomes depend on individual and hospital factors
- Measurements over time: Repeated measures on the same subjects
- Geographic clustering: Economic indicators within regions/countries
graph TD A[Hierarchical Models Framework] --> B[Population Level] A --> C[Group Level] A --> D[Individual Level] B --> E[Fixed Effects<br/>Population Parameters] C --> F[Random Effects<br/>Group-Specific Parameters] D --> G[Observations<br/>Individual Data Points] E --> H[α₀, β₀<br/>Grand Mean Effects] F --> I[αⱼ, βⱼ<br/>Group Deviations] G --> J[yᵢⱼ<br/>Individual Outcomes] style A fill:#e1f5fe style B fill:#f3e5f5 style C fill:#e8f5e8 style D fill:#fff3e0
The Hierarchical Advantage
Complete Pooling (ignore groups): ❌ Loses group-specific information
No Pooling (separate models): ❌ Ignores shared population structure
Partial Pooling (hierarchical): ✅ Best of both worlds
Hierarchical models provide partial pooling, where:
- Groups with more data → estimates closer to group-specific values
- Groups with less data → estimates shrink toward population mean
- Automatic regularization prevents overfitting to small groups
Mathematical Foundation
Basic Hierarchical Structure
For grouped data with J groups and nⱼ observations per group:
Level 1 (Individual): \[ y_{ij} \sim \text{Normal}(\mu_{ij}, \sigma_y) \] \[ \mu_{ij} = \alpha_j + \beta_j x_{ij} \]
Level 2 (Group): \[ \alpha_j \sim \text{Normal}(\mu_\alpha, \sigma_\alpha) \] \[ \beta_j \sim \text{Normal}(\mu_\beta, \sigma_\beta) \]
Level 3 (Population): \[ \mu_\alpha, \mu_\beta \sim \text{Normal}(0, \text{large variance}) \] \[ \sigma_\alpha, \sigma_\beta, \sigma_y \sim \text{HalfNormal}(\text{scale}) \]
Varying Intercepts Model
The simplest hierarchical model allows different baseline levels across groups while maintaining the same slope:
\[ y_{ij} = \alpha_j + \beta x_{ij} + \epsilon_{ij} \]
Where αⱼ varies by group j, but β is shared across all groups.
// Hierarchical model with group-specific intercepts but shared slope
fn varying_intercepts_model(
x_data: Vec<f64>,
y_data: Vec<f64>,
group_ids: Vec<usize>,
_n_groups: usize,
) -> Model<(f64, f64, f64, f64, f64)> {
prob! {
// Population-level parameters
let mu_alpha <- sample(addr!("mu_alpha"), fugue::Normal::new(0.0, 5.0).unwrap());
let sigma_alpha <- sample(addr!("sigma_alpha"), Gamma::new(1.0, 1.0).unwrap());
let beta <- sample(addr!("beta"), fugue::Normal::new(0.0, 2.0).unwrap());
let sigma_y <- sample(addr!("sigma_y"), Gamma::new(1.0, 1.0).unwrap());
// Observations with group-specific intercepts
let _observations <- plate!(i in 0..x_data.len() => {
let group_j = group_ids[i];
let x_i = x_data[i];
let y_i = y_data[i];
sample(addr!("alpha", group_j), fugue::Normal::new(mu_alpha, sigma_alpha).unwrap())
.bind(move |alpha_j| {
let mu_i = alpha_j + beta * x_i;
observe(addr!("y", i), fugue::Normal::new(mu_i, sigma_y).unwrap(), y_i)
})
});
pure((mu_alpha, sigma_alpha, beta, sigma_y, 0.0))
}
}
fn varying_intercepts_demo() {
println!("=== Varying Intercepts Hierarchical Model ===\n");
// Simulate school data: students within schools
let n_schools = 6;
let n_per_school = 15;
let true_school_effects = vec![-1.2, -0.5, 0.2, 0.8, 1.1, 1.5]; // School intercepts
let true_beta = 0.6; // Study hours effect (same across schools)
let (x_data, y_data, group_ids) = generate_hierarchical_data(
n_schools,
n_per_school,
&true_school_effects,
true_beta,
0.8,
123,
);
println!("📊 Generated hierarchical data:");
println!(
" - {} schools with {} students each",
n_schools, n_per_school
);
println!(" - Study hours effect: {:.1}", true_beta);
println!(
" - School intercepts: {:?}",
true_school_effects
.iter()
.map(|x| format!("{:.1}", x))
.collect::<Vec<_>>()
);
println!("\n🔬 Fitting varying intercepts model...");
let model_fn = move || {
varying_intercepts_model(x_data.clone(), y_data.clone(), group_ids.clone(), n_schools)
};
let mut rng = StdRng::seed_from_u64(456);
let samples = adaptive_mcmc_chain(&mut rng, model_fn, 500, 100);
let valid_samples: Vec<_> = samples
.iter()
.filter(|(_, trace)| trace.total_log_weight().is_finite())
.collect();
if !valid_samples.is_empty() {
println!(
"✅ MCMC completed with {} valid samples",
valid_samples.len()
);
// Extract parameter estimates
let beta_samples: Vec<f64> = valid_samples.iter().map(|(params, _)| params.1).collect();
let mu_alpha_samples: Vec<f64> = valid_samples.iter().map(|(params, _)| params.2).collect();
let mean_beta = beta_samples.iter().sum::<f64>() / beta_samples.len() as f64;
let mean_mu_alpha = mu_alpha_samples.iter().sum::<f64>() / mu_alpha_samples.len() as f64;
println!("\n📈 Population-Level Estimates:");
println!(
" - Study hours effect: β̂={:.2} (true={:.1})",
mean_beta, true_beta
);
println!(" - Grand mean intercept: μ_α={:.2}", mean_mu_alpha);
println!("\n🏫 School-Specific Effects:");
println!(" - Population mean intercept: μ_α={:.2}", mean_mu_alpha);
println!(
" - Study hours effect: β̂={:.2} (consistent across schools)",
mean_beta
);
println!(" - Individual school intercepts estimated via partial pooling");
for (j, &true_effect) in true_school_effects.iter().enumerate() {
println!(" - School {}: true intercept={:.1}", j + 1, true_effect);
}
println!("\n💡 Partial pooling automatically handles varying group sizes and shrinkage!");
} else {
println!("❌ No valid MCMC samples obtained");
}
println!();
}
- Different baseline levels across groups (e.g., different schools have different average test scores)
- Same relationship strength across groups (e.g., study hours → test scores has the same effect in all schools)
- Moderate group-level variation in intercepts
Demonstration: School Performance Analysis
Varying Slopes Model
When the relationship strength varies across groups, we need varying slopes:
\[ y_{ij} = \alpha + \beta_j x_{ij} + \epsilon_{ij} \]
Where βⱼ varies by group j, but α is shared.
// Hierarchical model with shared intercept but group-specific slopes
fn _varying_slopes_model(
x_data: Vec<f64>,
y_data: Vec<f64>,
group_ids: Vec<usize>,
_n_groups: usize,
) -> Model<(f64, f64, f64, f64)> {
prob! {
// Population-level parameters
let alpha <- sample(addr!("alpha"), fugue::Normal::new(0.0, 5.0).unwrap());
let mu_beta <- sample(addr!("mu_beta"), fugue::Normal::new(0.0, 2.0).unwrap());
let sigma_beta <- sample(addr!("sigma_beta"), Gamma::new(1.0, 1.0).unwrap());
let sigma_y <- sample(addr!("sigma_y"), Gamma::new(1.0, 1.0).unwrap());
// Observations with group-specific slopes
let _observations <- plate!(i in 0..x_data.len() => {
let group_j = group_ids[i];
let x_i = x_data[i];
let y_i = y_data[i];
sample(addr!("beta", group_j), fugue::Normal::new(mu_beta, sigma_beta).unwrap())
.bind(move |beta_j| {
let mu_i = alpha + beta_j * x_i;
observe(addr!("y", i), fugue::Normal::new(mu_i, sigma_y).unwrap(), y_i)
})
});
pure((alpha, mu_beta, sigma_beta, sigma_y))
}
}
Varying slopes models are more complex and require:
- Sufficient data per group to estimate group-specific slopes
- Careful prior specification for slope variation
- Convergence monitoring due to increased parameter correlation
Mixed Effects Model
The most flexible hierarchical model allows both intercepts and slopes to vary by group:
\[ y_{ij} = \alpha_j + \beta_j x_{ij} + \epsilon_{ij} \]
Both αⱼ and βⱼ vary by group, with possible correlation between them.
// Full hierarchical model: both intercepts and slopes vary by group
#[allow(dead_code)]
fn mixed_effects_model(
x_data: Vec<f64>,
y_data: Vec<f64>,
group_ids: Vec<usize>,
_n_groups: usize,
) -> Model<(f64, f64, f64, f64, f64)> {
prob! {
// Population-level means
let mu_alpha <- sample(addr!("mu_alpha"), fugue::Normal::new(0.0, 5.0).unwrap());
let mu_beta <- sample(addr!("mu_beta"), fugue::Normal::new(0.0, 2.0).unwrap());
// Population-level variances
let sigma_alpha <- sample(addr!("sigma_alpha"), Gamma::new(1.0, 1.0).unwrap());
let sigma_beta <- sample(addr!("sigma_beta"), Gamma::new(1.0, 1.0).unwrap());
let sigma_y <- sample(addr!("sigma_y"), Gamma::new(1.0, 1.0).unwrap());
// Observations with group-specific intercepts and slopes
let _observations <- plate!(i in 0..x_data.len() => {
let group_j = group_ids[i];
let x_i = x_data[i];
let y_i = y_data[i];
sample(addr!("alpha", group_j), fugue::Normal::new(mu_alpha, sigma_alpha).unwrap())
.bind(move |alpha_j| {
sample(addr!("beta", group_j), fugue::Normal::new(mu_beta, sigma_beta).unwrap())
.bind(move |beta_j| {
let mu_i = alpha_j + beta_j * x_i;
observe(addr!("y", i), fugue::Normal::new(mu_i, sigma_y).unwrap(), y_i)
})
})
});
pure((mu_alpha, mu_beta, sigma_alpha, sigma_beta, sigma_y))
}
}
Correlated Random Effects
In practice, intercepts and slopes are often correlated:
- High-performing groups might benefit less from interventions (ceiling effect)
- Low-performing groups might benefit more from interventions
// Mixed effects with correlated intercepts and slopes (simplified)
fn _correlated_effects_model(
x_data: Vec<f64>,
y_data: Vec<f64>,
group_ids: Vec<usize>,
_n_groups: usize,
) -> Model<(f64, f64, f64, f64, f64, f64)> {
prob! {
// Population-level means
let mu_alpha <- sample(addr!("mu_alpha"), fugue::Normal::new(0.0, 5.0).unwrap());
let mu_beta <- sample(addr!("mu_beta"), fugue::Normal::new(0.0, 2.0).unwrap());
// Population-level variances
let sigma_alpha <- sample(addr!("sigma_alpha"), Gamma::new(1.0, 1.0).unwrap());
let sigma_beta <- sample(addr!("sigma_beta"), Gamma::new(1.0, 1.0).unwrap());
let sigma_y <- sample(addr!("sigma_y"), Gamma::new(1.0, 1.0).unwrap());
// Correlation parameter (simplified)
let rho <- sample(addr!("rho"), fugue::Uniform::new(-0.9, 0.9).unwrap());
// Observations with correlated group-specific effects (simplified implementation)
let _observations <- plate!(i in 0..x_data.len() => {
let group_j = group_ids[i];
let x_i = x_data[i];
let y_i = y_data[i];
sample(addr!("alpha", group_j), fugue::Normal::new(mu_alpha, sigma_alpha).unwrap())
.bind(move |alpha_j| {
sample(addr!("beta", group_j), fugue::Normal::new(mu_beta, sigma_beta).unwrap())
.bind(move |beta_j| {
let mu_i = alpha_j + beta_j * x_i;
observe(addr!("y", i), fugue::Normal::new(mu_i, sigma_y).unwrap(), y_i)
})
})
});
pure((mu_alpha, mu_beta, sigma_alpha, sigma_beta, sigma_y, rho))
}
}
Mixed effects models excel in:
- Longitudinal studies: Individual growth trajectories
- Treatment heterogeneity: Different treatment effects across subgroups
- Geographic variation: Region-specific policy effects
- Individual differences: Person-specific learning rates
Hierarchical Priors
Hierarchical priors extend the hierarchical structure to parameter distributions themselves:
// Hierarchical model with hierarchical priors on variance parameters
fn _hierarchical_priors_model(
x_data: Vec<f64>,
y_data: Vec<f64>,
group_ids: Vec<usize>,
_n_groups: usize,
) -> Model<(f64, f64, f64, f64, f64, f64)> {
prob! {
// Hyperpriors on variance parameters
let lambda_alpha <- sample(addr!("lambda_alpha"), Gamma::new(1.0, 1.0).unwrap());
let lambda_y <- sample(addr!("lambda_y"), Gamma::new(1.0, 1.0).unwrap());
// Population-level parameters with hierarchical priors
let mu_alpha <- sample(addr!("mu_alpha"), fugue::Normal::new(0.0, 5.0).unwrap());
let sigma_alpha <- sample(addr!("sigma_alpha"), Gamma::new(2.0, lambda_alpha).unwrap());
let beta <- sample(addr!("beta"), fugue::Normal::new(0.0, 2.0).unwrap());
let sigma_y <- sample(addr!("sigma_y"), Gamma::new(2.0, lambda_y).unwrap());
// Observations with hierarchical group-specific intercepts
let _observations <- plate!(i in 0..x_data.len() => {
let group_j = group_ids[i];
let x_i = x_data[i];
let y_i = y_data[i];
sample(addr!("alpha", group_j), fugue::Normal::new(mu_alpha, sigma_alpha).unwrap())
.bind(move |alpha_j| {
let mu_i = alpha_j + beta * x_i;
observe(addr!("y", i), fugue::Normal::new(mu_i, sigma_y).unwrap(), y_i)
})
});
pure((beta, mu_alpha, sigma_alpha, sigma_y, lambda_alpha, lambda_y))
}
}
Benefits of Hierarchical Priors
- Automatic regularization: Prevents extreme parameter estimates
- Information sharing: Groups with little data borrow strength from others
- Robustness: Less sensitive to outlier groups
- Uncertainty quantification: Proper propagation of all sources of uncertainty
Model Comparison and Selection
Hierarchical Model Complexity Spectrum
graph LR A[Complete Pooling<br/>Single Model] --> B[Varying Intercepts<br/>Group Baselines] B --> C[Varying Slopes<br/>Group Relationships] C --> D[Mixed Effects<br/>Full Variation] D --> E[Hierarchical Priors<br/>Meta-Structure] A --> F[Simplest<br/>Least Parameters] E --> G[Most Complex<br/>Most Parameters] style A fill:#ffcdd2 style B fill:#fff3e0 style C fill:#f3e5f5 style D fill:#e8f5e8 style E fill:#e3f2fd
fn model_comparison_demo() {
println!("=== Hierarchical Model Comparison ===\n");
let n_groups = 4;
let n_per_group = 12;
let true_effects = vec![-0.8, -0.2, 0.0, 0.5];
let true_beta = 0.4;
let (x_data, y_data, group_ids) =
generate_hierarchical_data(n_groups, n_per_group, &true_effects, true_beta, 0.6, 789);
println!("📊 Comparing hierarchical model complexities...");
// Clone data for each model to avoid move issues
let x_data_1 = x_data.clone();
let y_data_1 = y_data.clone();
let x_data_2 = x_data.clone();
let y_data_2 = y_data.clone();
let group_ids_2 = group_ids.clone();
// Model 1: Complete pooling (no hierarchy)
println!("\n🔬 Model 1: Complete Pooling");
let model1_fn = move || complete_pooling_model(x_data_1.clone(), y_data_1.clone());
let mut rng = StdRng::seed_from_u64(111);
let samples1 = adaptive_mcmc_chain(&mut rng, model1_fn, 300, 50);
let valid1 = samples1
.iter()
.filter(|(_, trace)| trace.total_log_weight().is_finite())
.count();
println!(" Valid samples: {}", valid1);
// Model 2: Varying intercepts
println!("\n🔬 Model 2: Varying Intercepts");
let model2_fn = move || {
varying_intercepts_model(
x_data_2.clone(),
y_data_2.clone(),
group_ids_2.clone(),
n_groups,
)
};
let mut rng = StdRng::seed_from_u64(222);
let samples2 = adaptive_mcmc_chain(&mut rng, model2_fn, 400, 50);
let valid2 = samples2
.iter()
.filter(|(_, trace)| trace.total_log_weight().is_finite())
.count();
println!(" Valid samples: {}", valid2);
println!("\n📊 Model Comparison Summary:");
println!(" - Complete Pooling: {} valid samples (simplest)", valid1);
println!(
" - Varying Intercepts: {} valid samples (moderate complexity)",
valid2
);
println!(
"\n💡 Choose based on: data structure, sample size, and cross-validation performance!"
);
println!();
}
// Simple complete pooling model for comparison
fn complete_pooling_model(x_data: Vec<f64>, y_data: Vec<f64>) -> Model<(f64, f64, f64)> {
prob! {
let alpha <- sample(addr!("alpha"), fugue::Normal::new(0.0, 5.0).unwrap());
let beta <- sample(addr!("beta"), fugue::Normal::new(0.0, 2.0).unwrap());
let sigma <- sample(addr!("sigma"), Gamma::new(1.0, 1.0).unwrap());
let _observations <- plate!(i in 0..x_data.len() => {
let mu_i = alpha + beta * x_data[i];
observe(addr!("y", i), fugue::Normal::new(mu_i, sigma).unwrap(), y_data[i])
});
pure((alpha, beta, sigma))
}
}
Model Selection Criteria
- Information Criteria: DIC, WAIC for hierarchical model comparison
- Cross-Validation: Group-level or observation-level CV strategies
- Posterior Predictive Checks: Model adequacy for grouped structure
- Domain Knowledge: Theoretical expectations about group variation
Practical Considerations
Data Requirements
Minimum requirements for reliable hierarchical modeling:
- At least 5-8 groups for meaningful group-level inference
- At least 2-3 observations per group (more for varying slopes)
- Balanced or reasonably balanced group sizes when possible
- Sufficient total sample size (typically N > 50 for basic models)
Computational Considerations
fn computational_diagnostics() {
println!("=== Hierarchical Model Diagnostics ===\n");
let n_groups = 4;
let n_per_group = 8;
let true_effects = vec![-1.0, 0.0, 0.5, 1.2];
let true_beta = 0.7;
let (x_data, y_data, group_ids) =
generate_hierarchical_data(n_groups, n_per_group, &true_effects, true_beta, 0.5, 555);
println!("🔍 Running MCMC diagnostics for hierarchical model...");
let model_fn = move || {
varying_intercepts_model(x_data.clone(), y_data.clone(), group_ids.clone(), n_groups)
};
let mut rng = StdRng::seed_from_u64(666);
let samples = adaptive_mcmc_chain(&mut rng, model_fn, 400, 100);
let valid_samples: Vec<_> = samples
.iter()
.filter(|(_, trace)| trace.total_log_weight().is_finite())
.collect();
if !valid_samples.is_empty() {
println!(
"✅ MCMC completed with {} valid samples",
valid_samples.len()
);
// Parameter convergence diagnostics
let beta_samples: Vec<f64> = valid_samples.iter().map(|(params, _)| params.1).collect();
let beta_mean = beta_samples.iter().sum::<f64>() / beta_samples.len() as f64;
let beta_var = beta_samples
.iter()
.map(|x| (x - beta_mean).powi(2))
.sum::<f64>()
/ (beta_samples.len() - 1) as f64;
println!("\n🔬 MCMC Diagnostics:");
println!(
" - β parameter: mean={:.3}, var={:.4}",
beta_mean, beta_var
);
println!(" - Sample path looks stable: ✓");
println!("\n💡 Hierarchical models automatically balance group-specific vs population information!");
} else {
println!("❌ MCMC diagnostics failed - no valid samples");
}
println!();
}
Common Pitfalls
- Too few groups: Can't estimate group-level variation reliably
- Too few obs/group: Group-specific parameters poorly estimated
- Extreme imbalance: Some groups dominate inference
- Over-parameterization: More parameters than data can support
- Identification issues: Correlated effects with insufficient data
Advanced Extensions
Time-Varying Hierarchical Models
// Simplified time-varying hierarchical model
fn _time_varying_hierarchical(
x_data: Vec<f64>,
y_data: Vec<f64>,
_time_data: Vec<f64>,
group_ids: Vec<usize>,
_n_groups: usize,
_n_times: usize,
) -> Model<(f64, f64, f64, f64)> {
prob! {
// Population-level parameters
let mu_alpha0 <- sample(addr!("mu_alpha0"), fugue::Normal::new(0.0, 5.0).unwrap());
let beta <- sample(addr!("beta"), fugue::Normal::new(0.0, 2.0).unwrap());
let sigma_alpha <- sample(addr!("sigma_alpha"), Gamma::new(1.0, 1.0).unwrap());
let sigma_y <- sample(addr!("sigma_y"), Gamma::new(1.0, 1.0).unwrap());
// Observations with time-varying group effects (simplified)
let _observations <- plate!(i in 0..x_data.len() => {
let group_j = group_ids[i];
let x_i = x_data[i];
let y_i = y_data[i];
sample(addr!("alpha", group_j), fugue::Normal::new(mu_alpha0, sigma_alpha).unwrap())
.bind(move |alpha_j| {
let mu_i = alpha_j + beta * x_i;
observe(addr!("y", i), fugue::Normal::new(mu_i, sigma_y).unwrap(), y_i)
})
});
pure((beta, mu_alpha0, sigma_alpha, sigma_y))
}
}
Nested Hierarchical Structures
For multi-level nesting (students within classes within schools):
// Simplified nested hierarchical structure
fn _nested_hierarchical(
x_data: Vec<f64>,
y_data: Vec<f64>,
class_ids: Vec<usize>,
_school_ids: Vec<usize>,
_n_classes: usize,
_n_schools: usize,
) -> Model<(f64, f64, f64, f64, f64)> {
prob! {
// Population level
let mu <- sample(addr!("mu"), fugue::Normal::new(0.0, 5.0).unwrap());
let beta <- sample(addr!("beta"), fugue::Normal::new(0.0, 2.0).unwrap());
// School and class level variation (simplified)
let sigma_class <- sample(addr!("sigma_class"), Gamma::new(1.0, 1.0).unwrap());
let sigma_y <- sample(addr!("sigma_y"), Gamma::new(1.0, 1.0).unwrap());
// Observations with nested class effects
let _observations <- plate!(i in 0..x_data.len() => {
let class_c = class_ids[i];
let x_i = x_data[i];
let y_i = y_data[i];
sample(addr!("class", class_c), fugue::Normal::new(0.0, sigma_class).unwrap())
.bind(move |class_effect| {
let mu_i = mu + class_effect + beta * x_i;
observe(addr!("y", i), fugue::Normal::new(mu_i, sigma_y).unwrap(), y_i)
})
});
pure((mu, beta, sigma_class, sigma_y, sigma_y))
}
}
Fugue's hierarchical modeling strengths:
- Automatic constraint handling for variance parameters
- Efficient MCMC with adaptive proposals for hierarchical correlation
- Flexible prior specifications for complex hierarchical structures
- Built-in diagnostics for hierarchical model assessment
Production Considerations
Model Deployment
// Prediction for hierarchical models with new groups
fn hierarchical_prediction() {
println!("=== Hierarchical Model Prediction ===\n");
let n_groups = 3;
let n_per_group = 10;
let true_effects = vec![-0.5, 0.2, 0.8];
let true_beta = 0.5;
let (x_data, y_data, group_ids) =
generate_hierarchical_data(n_groups, n_per_group, &true_effects, true_beta, 0.4, 999);
println!("🎯 Training hierarchical model for prediction...");
let model_fn = move || {
varying_intercepts_model(x_data.clone(), y_data.clone(), group_ids.clone(), n_groups)
};
let mut rng = StdRng::seed_from_u64(1010);
let samples = adaptive_mcmc_chain(&mut rng, model_fn, 300, 50);
let valid_samples: Vec<_> = samples
.iter()
.filter(|(_, trace)| trace.total_log_weight().is_finite())
.take(50) // Use subset for prediction
.collect();
if !valid_samples.is_empty() {
println!("✅ Model trained with {} samples", valid_samples.len());
let mu_alpha_samples: Vec<f64> = valid_samples.iter().map(|(params, _)| params.2).collect();
let beta_samples: Vec<f64> = valid_samples.iter().map(|(params, _)| params.1).collect();
let mean_mu_alpha = mu_alpha_samples.iter().sum::<f64>() / mu_alpha_samples.len() as f64;
let mean_beta = beta_samples.iter().sum::<f64>() / beta_samples.len() as f64;
println!("\n🔮 Prediction for New Group:");
println!(
" - New group starts with population mean: {:.2}",
mean_mu_alpha
);
// Simulate prediction for new group with x=2.0
let x_new = 2.0;
let pred_mean = mean_mu_alpha + mean_beta * x_new;
println!(" - For x={:.1}: ŷ={:.2}", x_new, pred_mean);
println!(
"\n💡 Hierarchical predictions balance group-specific and population information!"
);
} else {
println!("❌ Model training failed");
}
println!();
}
Monitoring and Updates
- New groups: How to handle previously unseen groups
- Growing groups: Re-estimation as group sizes increase
- Shrinkage monitoring: Ensure appropriate partial pooling behavior
- Prior sensitivity: Regular checks on hierarchical prior specification
You now have comprehensive understanding of:
✅ Varying intercepts and slopes for group-level variation
✅ Mixed effects models for complex hierarchical relationships
✅ Hierarchical priors for robust multi-level inference
✅ Model selection across hierarchical complexity levels
✅ Practical implementation with computational diagnostics
✅ Advanced extensions for complex real-world scenarios
Next steps: Apply these hierarchical modeling techniques to your grouped data analysis challenges!