# Building Complex Models

- Do-Notation with `prob!`
- Vectorized Operations with `plate!`
- Hierarchical Address Management
- Sequential Dependencies
- Composable Model Functions
- Advanced Address Patterns
- Mixing Styles for Flexibility
- Real-World Applications
- Multi-Level Hierarchies
- Configurable Model Factories
- Testing Complex Models
- Common Pitfalls
- Performance Considerations
- Next Steps
Fugue's compositional architecture is grounded in category theory and monadic structures, enabling the systematic construction of sophisticated probabilistic models through principled composition operators. This guide explores the mathematical foundations and practical applications of Fugue's macro system for building complex probabilistic programs.
Fugue models form a monad:

- **Unit**: `pure()` lifts a deterministic value into a model
- **Bind**: the `prob!` macro chains dependent probabilistic computations
- **Laws**: composition satisfies the monad associativity and unit laws

This categorical structure ensures that model composition is mathematically sound and computationally tractable.
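To make the unit/bind structure concrete, here is a minimal plain-Rust sketch of a weighted-computation monad. This is an illustration of the pattern only, not Fugue's actual `Model` type; the `Toy` name and the log-weight representation are assumptions for the example.

```rust
// A toy "model": a deferred computation producing a value plus an
// accumulated log-weight. Illustrative only -- not Fugue's Model type.
struct Toy<T>(Box<dyn FnOnce() -> (T, f64)>);

impl<T: 'static> Toy<T> {
    // Unit: lift a pure value with zero log-weight.
    fn pure(x: T) -> Toy<T> {
        Toy(Box::new(move || (x, 0.0)))
    }

    // Bind: run this computation, feed its value to `k`, sum log-weights.
    fn bind<U: 'static>(self, k: impl FnOnce(T) -> Toy<U> + 'static) -> Toy<U> {
        Toy(Box::new(move || {
            let (x, w1) = (self.0)();
            let (y, w2) = (k(x).0)();
            (y, w1 + w2)
        }))
    }

    fn run(self) -> (T, f64) {
        (self.0)()
    }
}
```

Because `bind` simply composes closures and adds weights, associativity and the unit laws hold by construction, which is exactly what makes do-notation like `prob!` safe to desugar mechanically.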
## Do-Notation with `prob!`

The `prob!` macro implements monadic do-notation for probabilistic computations, providing a natural syntax for sequential dependence. Formally, it translates `let x <- M₁; let y <- M₂(x); pure(f(x, y))` into the monadic composition `M₁ >>= (λx. M₂(x) >>= (λy. η(f(x, y))))`:

```mermaid
graph LR
    A["M₁"] -->|bind| B["λx. M₂(x)"]
    B -->|bind| C["λy. η(f(x,y))"]
    C --> D["Result"]
```
```rust
// Simple do-notation style probabilistic program
let _simple_model = prob!(
    let x <- sample(addr!("x"), Normal::new(0.0, 1.0).unwrap());
    let y <- sample(addr!("y"), Normal::new(x, 0.5).unwrap());
    let sum = x + y; // Regular variable assignment
    pure(sum)
);
println!("✅ Created simple model with prob! macro");
```
**Key Features:**

- `<-` for probabilistic binding (monadic bind)
- `=` for regular variable assignment
- `pure()` to lift deterministic values
- Natural control flow without callback nesting

Use `prob!` when you need to chain multiple probabilistic operations. It's especially powerful for dependent sampling, where later variables depend on earlier ones.
## Vectorized Operations with `plate!`

The `plate!` macro implements plate notation from graphical models, representing conditionally independent replications. Given N independent observations, plate notation expresses the factorization

p(x₁, …, x_N | θ) = ∏ᵢ₌₁ᴺ p(xᵢ | θ)

The computational graph shows the independence structure:

```mermaid
graph TB
    subgraph "Plate: i ∈ {1..N}"
        A[θ] --> B1[x₁]
        A --> B2[x₂]
        A --> B3[...]
        A --> BN[xₙ]
    end
```

Plate notation encodes the conditional independence assumption xᵢ ⊥ xⱼ | θ for i ≠ j. This factorization enables efficient likelihood computation and parallel processing.
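The factorization is what makes plate likelihoods cheap: independent terms contribute additively on the log scale. A plain-Rust sketch of this, assuming a unit-variance normal likelihood for illustration (the helper names are not Fugue API):

```rust
use std::f64::consts::PI;

// Log-density of Normal(mu, sigma) at x.
fn normal_logpdf(x: f64, mu: f64, sigma: f64) -> f64 {
    let z = (x - mu) / sigma;
    -0.5 * z * z - sigma.ln() - 0.5 * (2.0 * PI).ln()
}

// Plate factorization on the log scale: conditionally independent
// observations contribute a sum of per-point log-likelihood terms.
fn plate_log_likelihood(theta: f64, xs: &[f64]) -> f64 {
    xs.iter().map(|&x| normal_logpdf(x, theta, 1.0)).sum()
}
```

Additivity is also what allows the per-observation terms to be computed in parallel or in chunks without changing the result.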
```rust
// Independent samples using plate notation
let _vector_model = plate!(i in 0..5 => {
    sample(addr!("sample", i), Normal::new(0.0, 1.0).unwrap())
});
println!("✅ Created vectorized model with {} samples", 5);

// Plate with observations
let observations = [1.2, -0.5, 2.1, 0.8, -1.0];
let n_obs = observations.len();
let _observed_model = plate!(i in 0..n_obs => {
    observe(addr!("obs", i), Normal::new(0.0, 1.0).unwrap(), observations[i])
});
println!("✅ Created observation model for {} data points", n_obs);
```
**Benefits:**

- Automatic address indexing prevents conflicts
- Natural iteration over data structures
- Vectorized likelihood computations
- Clear intent for independent operations

The `plate!` macro automatically appends indices to addresses, so `addr!("sample", i)` becomes unique for each iteration without manual address management.
## Hierarchical Address Management

Complex models require systematic parameter organization over a tree-structured address space. The address hierarchy forms a prefix tree in which each node represents a scope. This structure prevents address collisions and enables efficient parameter lookup.

The hierarchical address space forms a partially ordered set under the prefix relation: a ≤ b if a is a prefix of b. This ordering ensures unique identification of parameters while maintaining compositional semantics.
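The prefix ordering can be checked mechanically. A small sketch, assuming an illustrative convention of `/`-separated address paths (Fugue's `scoped_addr!` has its own formatting, so this helper is hypothetical):

```rust
// Treat hierarchical addresses as '/'-separated paths; `scope` precedes
// `addr` in the prefix order iff `addr` lives inside that scope.
// Comparing against "scope/" avoids false matches like "group/01".
fn is_scope_prefix(scope: &str, addr: &str) -> bool {
    addr == scope || addr.starts_with(&format!("{}/", scope))
}
```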
```rust
// Hierarchical model using scoped addresses
let _hierarchical_model = prob!(
    let global_mu <- sample(addr!("global_mu"), Normal::new(0.0, 10.0).unwrap());
    let group_mu <- sample(scoped_addr!("group", "mu", "{}", 0),
                           Normal::new(global_mu, 1.0).unwrap());
    pure((global_mu, group_mu))
);
println!("✅ Created hierarchical model with scoped addresses");
```
**Address Strategy:**

- `scoped_addr!` prevents parameter name collisions
- Hierarchical structure mirrors model dependencies
- Systematic naming aids debugging and introspection
- Indices enable parameter arrays
## Sequential Dependencies

Sequential models exhibit temporal dependence: the state at time t depends on previous states, creating a Markov chain structure

p(x₁, …, x_T) = p(x₁) ∏ₜ₌₂ᵀ p(xₜ | xₜ₋₁)

The computational challenge lies in maintaining state consistency while enabling efficient inference:
```rust
// Sequential model with dependencies
let _sequential_model = prob! {
    let states <- plate!(t in 0..3 => {
        sample(addr!("x", t), Normal::new(0.0, 1.0).unwrap())
            .bind(move |x_t| {
                observe(addr!("y", t), Normal::new(x_t, 0.5).unwrap(), 1.0 + t as f64)
                    .map(move |_| x_t)
            })
    });
    pure(states)
};
println!("✅ Created sequential model with observations");
```
**Patterns:**

- Explicit state threading through computations
- Observation conditioning at each time step
- Autoregressive dependencies
- Mixed probabilistic and deterministic updates
Sequential models can create large traces. Consider using memory-efficient handlers for long sequences.
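Stripped of noise, the state threading above is just a fold over time. A deterministic plain-Rust sketch of the same recursion (the 0.9 transition coefficient and identity emission are assumptions for illustration, not part of the model above):

```rust
// Deterministic skeleton of an x_t -> y_t state-space recursion:
// states thread forward; each state yields an observation mean.
fn transition(x_prev: f64) -> f64 {
    0.9 * x_prev // assumed AR(1) coefficient for the sketch
}

fn emission_mean(x_t: f64) -> f64 {
    x_t // identity emission for the sketch
}

// Roll the chain forward, collecting (state, observation mean) pairs.
fn rollout(x0: f64, steps: usize) -> Vec<(f64, f64)> {
    let mut x = x0;
    let mut out = Vec::with_capacity(steps);
    for _ in 0..steps {
        x = transition(x);
        out.push((x, emission_mean(x)));
    }
    out
}
```

In the probabilistic version, `transition` and `emission_mean` become `sample` and `observe` sites, and the trace grows by one entry per site per step, which is why long sequences benefit from memory-efficient handlers.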
## Composable Model Functions
Build reusable model components:
```rust
// Helper function to create a component model
fn create_normal_component(name: &str, mean: f64, std: f64) -> Model<f64> {
    sample(addr!(name), Normal::new(mean, std).unwrap())
}

// Compose multiple components
let _composition_model = prob! {
    let param1 <- create_normal_component("param1", 0.0, 1.0);
    let param2 <- create_normal_component("param2", 2.0, 0.5);
    let combined = param1 * param2;
    pure(combined)
};
println!("✅ Created composed model with reusable components");
```
**Design Principles:**

- Functions return `Model<T>` for composability
- Pattern matching enables model selection
- Pure functions for deterministic transformations
- Higher-order functions for model templates
## Advanced Address Patterns
For large-scale models like neural networks:
```rust
// Complex addressing for large models
let _neural_layer_model = plate!(layer in 0..3 => {
    let layer_size = match layer {
        0 => 4,
        1 => 8,
        2 => 1,
        _ => 1,
    };
    plate!(i in 0..layer_size => {
        sample(
            scoped_addr!("layer", "weight", "{}_{}", layer, i),
            Normal::new(0.0, 0.1).unwrap()
        )
    })
});
println!("✅ Created neural network parameter structure");
```
**Scaling Strategies:**
- Systematic parameter naming conventions
- Multi-level scoping for complex architectures
- Consistent indexing schemes
- Hierarchical parameter organization
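A systematic naming scheme is easy to audit if the address generator is a plain function. A sketch mirroring the `scoped_addr!("layer", "weight", "{}_{}", layer, i)` pattern above — the `::` separator and helper names here are an assumed convention, not Fugue's actual address format:

```rust
// Assumed address convention for the sketch: scope segments joined by "::".
fn weight_addr(layer: usize, unit: usize) -> String {
    format!("layer::weight::{}_{}", layer, unit)
}

// The layer widths used in the example above.
fn layer_sizes() -> [usize; 3] {
    [4, 8, 1]
}

// Enumerate every weight address; uniqueness can then be checked directly.
fn all_weight_addrs() -> Vec<String> {
    layer_sizes()
        .into_iter()
        .enumerate()
        .flat_map(|(l, n)| (0..n).map(move |i| weight_addr(l, i)))
        .collect()
}
```

Enumerating the full address set like this is a cheap way to verify that an indexing scheme produces no collisions before handing it to inference.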
## Mixing Styles for Flexibility
Combine macros with traditional function composition:
```rust
// Mixture model with component selection
let _mixture_model = prob! {
    let component <- sample(addr!("component"), Bernoulli::new(0.3).unwrap());
    let mu = if component { -2.0 } else { 2.0 };
    let x <- sample(addr!("x"), Normal::new(mu, 1.0).unwrap());
    pure((component, x))
};
println!("✅ Created mixture model with 2 components");
```
**Best Practices:**
- Use functions for reusable components
- Use macros for readable composition
- Separate concerns (priors, likelihood, observations)
- Document parameter dependencies
## Real-World Applications

### Bayesian Linear Regression

Bayesian linear regression models the relationship yᵢ = α + β·xᵢ + εᵢ with εᵢ ~ Normal(0, σ²), quantifying uncertainty over the parameters (α, β, σ):
```rust
// Complete Bayesian linear regression
let x_data = [1.0, 2.0, 3.0, 4.0, 5.0];
let y_data = [2.1, 3.9, 6.2, 8.1, 9.8];
let n = x_data.len();

let _regression_model = prob! {
    let intercept <- sample(addr!("intercept"), Normal::new(0.0, 10.0).unwrap());
    let slope <- sample(addr!("slope"), Normal::new(0.0, 10.0).unwrap());
    let precision <- sample(addr!("precision"), Gamma::new(1.0, 1.0).unwrap());
    let sigma = (1.0 / precision).sqrt();
    let _likelihood <- plate!(i in 0..n => {
        let predicted = intercept + slope * x_data[i];
        observe(addr!("y", i), Normal::new(predicted, sigma).unwrap(), y_data[i])
    });
    pure((intercept, slope, sigma))
};
println!("✅ Created Bayesian linear regression model");
```
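A cheap sanity check for a regression model like this is the ordinary least-squares fit, which the posterior mean should approach under the weak priors used above. A plain-Rust OLS helper for the same data (a validation sketch, not part of the Fugue model):

```rust
// Ordinary least squares for y = intercept + slope * x.
fn ols(x: &[f64], y: &[f64]) -> (f64, f64) {
    let n = x.len() as f64;
    let mx = x.iter().sum::<f64>() / n;
    let my = y.iter().sum::<f64>() / n;
    // Centered cross-moment and second moment.
    let sxy: f64 = x.iter().zip(y).map(|(a, b)| (a - mx) * (b - my)).sum();
    let sxx: f64 = x.iter().map(|a| (a - mx) * (a - mx)).sum();
    let slope = sxy / sxx;
    (my - slope * mx, slope)
}
```

For the data above this gives a slope near 2 and a small intercept, so posterior estimates far from those values would indicate a bug in the model or the inference setup.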
### Hierarchical Clustering
Hierarchical models implement partial pooling through multi-level parameter structures. The hierarchy enables information sharing across groups while maintaining group-specific effects:
```rust
// Simplified hierarchy to avoid nested macro issues
let _multilevel_model = prob!(
    let pop_mean <- sample(addr!("pop_mean"), Normal::new(0.0, 10.0).unwrap());
    let _pop_precision <- sample(addr!("pop_precision"), Gamma::new(2.0, 0.5).unwrap());
    let group_mean <- sample(scoped_addr!("group", "mean", "{}", 0),
                             Normal::new(pop_mean, 1.0).unwrap());
    pure((pop_mean, group_mean))
);
println!("✅ Created hierarchical model structure");
```
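The partial-pooling behavior such hierarchies induce can be seen in closed form for the conjugate normal-normal case. A sketch assuming known variances (`sigma2` within-group, `tau2` between-group) — an illustration of the shrinkage mechanics, not Fugue output:

```rust
// Closed-form partial pooling for the normal-normal model with known
// variances: the group estimate is a precision-weighted average of the
// group sample mean and the population mean.
fn pooled_mean(group_mean: f64, n: f64, sigma2: f64, pop_mean: f64, tau2: f64) -> f64 {
    let data_precision = n / sigma2;
    let prior_precision = 1.0 / tau2;
    let w = data_precision / (data_precision + prior_precision);
    w * group_mean + (1.0 - w) * pop_mean
}
```

Groups with little data are pulled strongly toward the population mean, while data-rich groups keep estimates close to their own sample mean — the natural shrinkage property listed below.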
### State Space Models
Sequential latent variable models:
```rust
// Sequential model with dependencies
let _sequential_model = prob! {
    let states <- plate!(t in 0..3 => {
        sample(addr!("x", t), Normal::new(0.0, 1.0).unwrap())
            .bind(move |x_t| {
                observe(addr!("y", t), Normal::new(x_t, 0.5).unwrap(), 1.0 + t as f64)
                    .map(move |_| x_t)
            })
    });
    pure(states)
};
println!("✅ Created sequential model with observations");
```
## Multi-Level Hierarchies
Population → Groups → Individuals structure:
```rust
// Simplified hierarchy to avoid nested macro issues
let _multilevel_model = prob!(
    let pop_mean <- sample(addr!("pop_mean"), Normal::new(0.0, 10.0).unwrap());
    let _pop_precision <- sample(addr!("pop_precision"), Gamma::new(2.0, 0.5).unwrap());
    let group_mean <- sample(scoped_addr!("group", "mean", "{}", 0),
                             Normal::new(pop_mean, 1.0).unwrap());
    pure((pop_mean, group_mean))
);
println!("✅ Created hierarchical model structure");
```
**Key Features:**
- Partial pooling across hierarchy levels
- Systematic parameter organization
- Natural shrinkage properties
- Scalable to large group structures
## Configurable Model Factories
Dynamic model construction:
```rust
// Helper function to create a component model
fn create_normal_component(name: &str, mean: f64, std: f64) -> Model<f64> {
    sample(addr!(name), Normal::new(mean, std).unwrap())
}

// Compose multiple components
let _composition_model = prob! {
    let param1 <- create_normal_component("param1", 0.0, 1.0);
    let param2 <- create_normal_component("param2", 2.0, 0.5);
    let combined = param1 * param2;
    pure(combined)
};
println!("✅ Created composed model with reusable components");
```
**Flexibility Benefits:**
- Runtime model configuration
- Conditional model components
- A/B testing different model structures
- Experiment management
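The factory idea generalizes to returning closures, so runtime configuration selects the component that gets built. A hypothetical plain-Rust sketch (`make_prior` and its log-density stand-in are assumptions, not Fugue API):

```rust
// Hypothetical factory: runtime configuration (the scale) decides which
// prior component is produced. The returned closure stands in for a
// configurable model component; here it computes the Normal(0, scale)
// log-density.
fn make_prior(scale: f64) -> Box<dyn Fn(f64) -> f64> {
    Box::new(move |x| {
        let z = x / scale;
        -0.5 * z * z - scale.ln() - 0.5 * (2.0 * std::f64::consts::PI).ln()
    })
}
```

In Fugue, the same pattern would have the factory return a `Model<T>`-building function, letting an experiment harness swap priors or components without touching model code.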
## Testing Complex Models
Model validation requires systematic testing across multiple dimensions: syntactic correctness, semantic validity, and statistical consistency:
```mermaid
graph TD
    A[Model M] --> B[Syntactic Tests]
    A --> C[Semantic Tests]
    A --> D[Statistical Tests]
    B --> E[Type Checking]
    B --> F[Address Uniqueness]
    C --> G[Trace Validity]
    C --> H[Parameter Bounds]
    D --> I[Prior Predictive]
    D --> J[Posterior Consistency]
    E --> K{All Pass?}
    F --> K
    G --> K
    H --> K
    I --> K
    J --> K
    K -->|Yes| L[Model Validated]
    K -->|No| M[Refinement Required]
```
**Testing Hierarchy:**
- Unit Tests: Individual model components
- Integration Tests: Model composition correctness
- Statistical Tests: Distributional properties
- Performance Tests: Scalability and efficiency
```rust
#[test]
fn test_model_composition() {
    // Test that models construct without errors
    let _simple = prob! {
        let x <- sample(addr!("test_x"), Normal::new(0.0, 1.0).unwrap());
        pure(x)
    };

    // Test plate notation
    let _plate_model = plate!(i in 0..3 => {
        sample(addr!("plate_test", i), Normal::new(0.0, 1.0).unwrap())
    });

    // Test scoped addresses
    let addr1 = scoped_addr!("test", "param");
    let addr2 = scoped_addr!("test", "param", "{}", 42);

    // Addresses should be different
    assert_ne!(addr1.0, addr2.0);
    assert!(addr2.0.contains("42"));

    // Test hierarchical model construction
    let _hierarchical = prob! {
        let global <- sample(addr!("global"), Normal::new(0.0, 1.0).unwrap());
        let locals <- plate!(i in 0..2 => {
            sample(scoped_addr!("local", "param", "{}", i),
                   Normal::new(global, 0.1).unwrap())
        });
        pure((global, locals))
    };

    // All models should construct successfully
    // (Actual execution would require handlers)
}
```
## Common Pitfalls

- **Address Conflicts**: Use `scoped_addr!` for complex models
- **Memory Usage**: Large plate operations can create big traces
- **Sequential Dependencies**: Explicit state management is required
- **Type Inference**: Explicit type annotations are sometimes needed
## Performance Considerations
- Plate Size: Very large plates may exceed memory limits
- Nesting Depth: Deep hierarchies increase trace size
- Address Complexity: Simple addresses are more efficient
- Function Composition: Pure functions are optimized away
## Next Steps
- Optimization: See Optimizing Performance for efficiency techniques
- Debugging: Check Debugging Models for troubleshooting complex models
- Production: Learn Production Deployment for scaling
Building complex models successfully combines mathematical rigor with practical implementation:
- Categorical Foundations: Monadic structure ensures compositionality
- Systematic Organization: Hierarchical addressing prevents conflicts
- Efficient Computation: Plate notation enables vectorization
- Validation Framework: Multi-level testing ensures correctness
These patterns transform complex probabilistic modeling from ad-hoc construction into principled composition.
Complex models become tractable and maintainable through systematic composition, principled addressing, and mathematical abstraction. Fugue's macro system provides elegant syntactic sugar while preserving the underlying categorical structure that enables powerful inference algorithms and compositional reasoning about probabilistic programs.