Working with Distributions
Fugue's type-safe distribution system takes a principled approach to probabilistic programming: it eliminates entire classes of runtime errors through the type system while preserving the full expressiveness of statistical modeling. This guide covers the foundations and practical use of Fugue's distribution architecture.
Fugue's distribution system is built on typed supports: each distribution is parameterized not just by its parameters θ ∈ Θ, but by its support type T, so a distribution over T yields samples of type T. This eliminates the need for runtime type checking or unsafe casting operations.
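As a sketch of the idea (the `Dist` trait and `Bern` type below are illustrative, not Fugue's actual definitions), a support-typed distribution interface might look like:

```rust
// Illustrative sketch only: a distribution parameterized by its support type T.
// Fugue's real trait has more methods; this shows only the typing idea.
trait Dist<T> {
    /// Log-probability (mass or density) of a value of the natural type T.
    fn log_prob(&self, value: &T) -> f64;
}

struct Bern {
    p: f64,
}

impl Dist<bool> for Bern {
    fn log_prob(&self, value: &bool) -> f64 {
        if *value { self.p.ln() } else { (1.0 - self.p).ln() }
    }
}

fn main() {
    let coin = Bern { p: 0.25 };
    // The value is a real bool, not an f64 that happens to be 0.0 or 1.0.
    println!("log P(heads) = {:.4}", coin.log_prob(&true));
    println!("log P(tails) = {:.4}", coin.log_prob(&false));
}
```

Because the support type is part of the distribution's type, passing a `bool` sample where an `f64` is expected is a compile error rather than a silent bug.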
Type Safety in Practice
Traditional probabilistic programming libraries return `f64` for everything, leading to casting overhead and runtime errors. Fugue distributions return their natural types:
```rust
// Demonstrate natural return types
let coin = Bernoulli::new(0.5).unwrap();
let flip: bool = coin.sample(&mut rng); // Natural boolean

if flip {
    println!("🪙 Coin flip result: Heads!");
} else {
    println!("🪙 Coin flip result: Tails!");
}
```
No casting, no comparisons with floating-point values—just natural boolean logic.
Continuous Distributions
Continuous distributions in Fugue model phenomena over uncountable domains Ω ⊆ ℝ. The probability density function satisfies f(x) ≥ 0 and the normalization condition ∫ f(x) dx = 1 over Ω.
For computational stability, Fugue operates in log-space by default, computing log f(x) rather than f(x) to avoid numerical underflow:
Working directly with densities can cause severe numerical issues when f(x) ≈ 0. Fugue's `log_prob()` method computes log f(x), which remains numerically stable even for extreme tail probabilities.
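The effect is easy to demonstrate with the standard normal density itself; this standalone snippet computes the log-density by hand rather than through Fugue:

```rust
fn main() {
    // Standard normal log-density: log f(x) = -x²/2 - (1/2)·log(2π)
    let x: f64 = 40.0;
    let log_density = -0.5 * x * x - 0.5 * (2.0 * std::f64::consts::PI).ln();

    // The log-density is a perfectly ordinary finite number...
    println!("log f(40) = {:.3}", log_density); // ≈ -800.919

    // ...but the density itself underflows to exactly 0.0 in f64, since
    // f64 cannot represent positive values below ~5e-324 (≈ e^-745).
    println!("f(40) = {:e}", log_density.exp());
}
```

All information about how unlikely the observation is survives in log-space but is destroyed by the direct computation.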
```rust
// Working with continuous distributions
let standard_normal = Normal::new(0.0, 1.0).unwrap();
let sample: f64 = standard_normal.sample(&mut rng);
println!("📊 Standard normal sample: {:.3}", sample);

// Compute log-probability density
let log_density = standard_normal.log_prob(&0.0); // Peak of standard normal
println!(
    "📈 Log-density at x=0: {:.3} (peak of standard normal)",
    log_density
);

// Custom parameters
let measurement_model = Normal::new(10.0, 0.5).unwrap();
let measurement = measurement_model.sample(&mut rng);
println!("🔬 Sensor measurement (μ=10.0, σ=0.5): {:.3}", measurement);
```
Key Points:
- `sample()` returns `f64` for direct arithmetic
- `log_prob()` computes log-density (avoids numerical underflow)
- Parameter validation happens at construction time
Always work with log-probabilities for numerical stability. Only convert to regular probabilities when necessary for interpretation.
Discrete Distributions
Discrete distributions operate over countable support sets such as the non-negative integers {0, 1, 2, …} or finite sets {0, …, K−1}. The probability mass function satisfies p(k) ≥ 0 for every k in the support and ∑ p(k) = 1:
Fugue enforces this constraint at construction time and leverages natural integer types to eliminate precision loss from floating-point representation:
Unlike floating-point representations that can introduce rounding errors, Fugue's native `u64` and `usize` types preserve exact integer values. This is crucial for count data, where every count must remain precisely representable.
```rust
// Working with discrete distributions

// Count data
let event_rate = Poisson::new(3.0).unwrap();
let count: u64 = event_rate.sample(&mut rng);
println!("📅 Event count (λ=3.0): {} events", count);

// Log-probability mass
let prob_3_events = event_rate.log_prob(&3);
println!(
    "🎯 Log-probability of exactly 3 events: {:.3}",
    prob_3_events
);

// Use counts directly in calculations
let total_cost = count * 50; // Direct arithmetic with u64
println!("💰 Total cost ({} events × $50): ${}", count, total_cost);
```
Benefits:
- `u64` counts support direct arithmetic without casting
- No precision loss from floating-point representations
- Natural integration with Rust's type system
Safe Categorical Sampling
Categorical distributions return `usize` for safe array indexing:
```rust
// Safe categorical sampling
let choices = vec![0.3, 0.5, 0.2]; // Three categories
let categorical = Categorical::new(choices).unwrap();
let selected: usize = categorical.sample(&mut rng);

// Safe array indexing (no bounds checking needed)
let options = ["Option A", "Option B", "Option C"];
println!(
    "🎲 Categorical choice (weights: 0.3, 0.5, 0.2): {}",
    options[selected]
);

// Uniform categorical
let uniform_choice = Categorical::uniform(5).unwrap();
let idx: usize = uniform_choice.sample(&mut rng);
println!("🎯 Uniform random index (0-4): {}", idx);
```
The `usize` return type eliminates bounds-checking errors: the sampled index is guaranteed to be valid for the probability vector length.
Parameter Validation
Fugue enforces mathematical constraints through compile-time and runtime validation. Each distribution family has a parameter space defining valid configurations:
```mermaid
graph TD
    A[Parameter Input θ] --> B{θ ∈ Θ?}
    B -->|Yes| C[Distribution Construction]
    B -->|No| D[ValidationError]
    C --> E[Type-Safe Sampling]
    D --> F[Early Failure Detection]
```
Constraint Examples:
- Normal Distribution: requires σ > 0
- Beta Distribution: requires α > 0 and β > 0
- Categorical Distribution: requires ∑ pᵢ = 1 and pᵢ ≥ 0
```rust
// Distribution parameter validation

// This will return an error
match Normal::new(0.0, -1.0) {
    Ok(_) => println!("✅ Normal(μ=0.0, σ=-1.0) created successfully"),
    Err(e) => println!("❌ Normal(μ=0.0, σ=-1.0) failed: {:?}", e),
}

// Beta distribution parameters must be positive
match Beta::new(0.0, 1.0) {
    Ok(_) => println!("✅ Beta(α=0.0, β=1.0) created successfully"),
    Err(e) => println!("❌ Beta(α=0.0, β=1.0) failed: {:?}", e),
}

// Poisson rate must be non-negative
match Poisson::new(-1.0) {
    Ok(_) => println!("✅ Poisson(λ=-1.0) created successfully"),
    Err(e) => println!("❌ Poisson(λ=-1.0) failed: {:?}", e),
}
```
Validation Rules:
- Normal: σ > 0
- Beta: α > 0, β > 0
- Poisson: λ ≥ 0
- Categorical: probabilities sum to 1, all non-negative
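A minimal sketch of construction-time validation (the `MyNormal` and `InvalidParams` names are illustrative; Fugue's actual error type differs):

```rust
#[derive(Debug)]
struct InvalidParams(&'static str);

#[allow(dead_code)]
struct MyNormal {
    mu: f64,
    sigma: f64,
}

impl MyNormal {
    /// Reject invalid parameters once, at construction time, so that
    /// sampling and log_prob never have to re-check them.
    fn new(mu: f64, sigma: f64) -> Result<Self, InvalidParams> {
        if mu.is_finite() && sigma.is_finite() && sigma > 0.0 {
            Ok(MyNormal { mu, sigma })
        } else {
            Err(InvalidParams("Normal requires finite μ and σ > 0"))
        }
    }
}

fn main() {
    assert!(MyNormal::new(0.0, 1.0).is_ok());
    assert!(MyNormal::new(0.0, -1.0).is_err());
    assert!(MyNormal::new(f64::NAN, 1.0).is_err());
    println!("validation behaves as expected");
}
```

Pushing validation into `new()` means every constructed value is a proof that its parameters lie in Θ, which is why the sampling code above never branches on parameter validity.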
Storing Mixed Distributions
Use trait objects for collections of distributions with the same return type:
```rust
// Storing different distributions together
let continuous_dists: Vec<Box<dyn Distribution<f64>>> = vec![
    Normal::new(0.0, 1.0).unwrap().clone_box(),
    Beta::new(2.0, 5.0).unwrap().clone_box(),
    Uniform::new(-1.0, 1.0).unwrap().clone_box(),
];

// Sample from each
for (i, dist) in continuous_dists.iter().enumerate() {
    let sample = dist.sample(&mut rng);
    let dist_name = match i {
        0 => "Normal(0,1)",
        1 => "Beta(2,5)",
        2 => "Uniform(-1,1)",
        _ => "Unknown",
    };
    println!("📦 {} sample: {:.3}", dist_name, sample);
}
```
This enables dynamic distribution selection and model composition patterns.
Practical Modeling Patterns
Common modeling scenarios demonstrate natural type usage:
```rust
// Practical modeling examples

// Model a sensor with noise
let true_temperature = 20.5; // True value
let sensor_noise = Normal::new(0.0, 0.2).unwrap(); // Measurement error
let measured_temp = true_temperature + sensor_noise.sample(&mut rng);
println!(
    "🌡️ True temperature: {:.2}°C → Measured: {:.2}°C",
    true_temperature, measured_temp
);

// Count model for arrivals
let arrival_rate = Poisson::new(2.5).unwrap(); // 2.5 arrivals per hour
let hourly_arrivals = arrival_rate.sample(&mut rng);
println!(
    "🚪 Expected arrivals: 2.5/hour → Actual: {} arrivals",
    hourly_arrivals
);

// Decision model
let decision_prob = 0.7;
let decision = Bernoulli::new(decision_prob).unwrap();
let will_buy = decision.sample(&mut rng);
if will_buy {
    println!("🛒 Customer decision (p=0.7): Will make a purchase");
} else {
    println!("🚶 Customer decision (p=0.7): Will not purchase");
}
```
Each distribution serves its natural domain without artificial conversions.
Working with Log-Probabilities
Logarithmic probability computation is essential for numerical stability in probabilistic programming. Consider the log-sum-exp operation for computing log ∑ᵢ exp(xᵢ). The stable formulation is

log ∑ᵢ exp(xᵢ) = m + log ∑ᵢ exp(xᵢ − m)

where m = maxᵢ xᵢ. Subtracting the maximum first prevents overflow when the xᵢ are large:
Direct computation of a product ∏ᵢ pᵢ, where each pᵢ ≪ 1, will underflow to machine zero after only a moderate number of factors. Log-space computation of ∑ᵢ log pᵢ remains stable for arbitrarily small probabilities, preserving the 15-17 significant digits of IEEE 754 double precision.
```rust
// Working with log-probabilities
let normal = Normal::new(100.0, 15.0).unwrap();

// Multiple observations
let observations = vec![98.5, 102.1, 99.8, 101.5, 97.2];
let mut total_log_prob = 0.0;

println!(
    "📋 Evaluating {} observations under Normal(μ=100.0, σ=15.0):",
    observations.len()
);
for obs in &observations {
    let log_p = normal.log_prob(obs);
    total_log_prob += log_p;
    println!("  x={}: log P(x) = {:.3}", obs, log_p);
}
println!("🔢 Joint log-probability: {:.3}", total_log_prob);

// Convert back to probability (be careful with underflow!)
if total_log_prob > -700.0 {
    // Avoid underflow
    let probability = total_log_prob.exp();
    println!("📊 Joint probability: {:.2e}", probability);
} else {
    println!("⚠️ Joint probability too small to represent as f64");
}
```
Converting large negative log-probabilities back to regular probabilities can underflow to zero. Keep computations in log-space when possible.
Advanced Patterns
For complex modeling scenarios, see these patterns:
Hierarchical Models
```rust
// Hierarchical prior structure
let global_mean = Normal::new(0.0, 10.0).unwrap();
let mu = global_mean.sample(&mut rng);

let group_precision = Gamma::new(2.0, 0.5).unwrap();
let tau = group_precision.sample(&mut rng);
let sigma = (1.0 / tau).sqrt(); // Convert precision to std dev

// Individual observations from hierarchical model
let individual = Normal::new(mu, sigma).unwrap();
let observation = individual.sample(&mut rng);

println!("🌐 Global mean: {:.3}", mu);
println!("📊 Group std dev: {:.3}", sigma);
println!("👤 Individual observation: {:.3}", observation);
```
Mixture Components
```rust
// Mixture model components
let mixture_weights = vec![0.6, 0.3, 0.1];
let component_selector = Categorical::new(mixture_weights).unwrap();
let selected_component: usize = component_selector.sample(&mut rng);

// Different components
let components = [
    Normal::new(-2.0, 0.5).unwrap(),
    Normal::new(0.0, 1.0).unwrap(),
    Normal::new(3.0, 0.8).unwrap(),
];

let sample = components[selected_component].sample(&mut rng);
println!(
    "🎯 Selected component {}: sample = {:.3}",
    selected_component, sample
);
```
Conjugate Priors
```rust
// Beta-Bernoulli conjugacy
let prior_alpha = 2.0;
let prior_beta = 8.0;
let prior = Beta::new(prior_alpha, prior_beta).unwrap();
let p: f64 = prior.sample(&mut rng);

// Simulate some trials
let trials = 20;
let mut successes = 0;
let bernoulli = Bernoulli::new(p).unwrap();
for _ in 0..trials {
    if bernoulli.sample(&mut rng) {
        successes += 1;
    }
}

// Posterior parameters (conjugate update)
let posterior_alpha = prior_alpha + successes as f64;
let posterior_beta = prior_beta + (trials - successes) as f64;
let posterior = Beta::new(posterior_alpha, posterior_beta).unwrap();
let updated_p = posterior.sample(&mut rng);

println!("🎲 Prior p: {:.3}", p);
println!("📈 Observed: {}/{} successes", successes, trials);
println!("🔄 Posterior p: {:.3}", updated_p);
```
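The Beta-Bernoulli update itself is pure arithmetic, so it can be verified without any sampling. For instance, a Beta(2, 8) prior combined with 14 successes in 20 trials (a hypothetical outcome) gives a Beta(16, 14) posterior with mean 16/30:

```rust
fn main() {
    // Beta(α, β) has mean α / (α + β).
    let (prior_alpha, prior_beta) = (2.0_f64, 8.0_f64);
    let (trials, successes) = (20u32, 14u32);

    // Conjugate update: add successes to α, failures to β.
    let post_alpha = prior_alpha + successes as f64;
    let post_beta = prior_beta + (trials - successes) as f64;
    let post_mean = post_alpha / (post_alpha + post_beta);

    println!("posterior: Beta({}, {})", post_alpha, post_beta); // Beta(16, 14)
    println!("posterior mean: {:.4}", post_mean); // 0.5333
}
```

Note how the data pull the mean from the prior's 2/10 = 0.2 toward the observed frequency 14/20 = 0.7, with the prior acting like 10 pseudo-observations.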
Testing Your Distributions
Always test distribution properties and parameter validation:
```rust
#[test]
fn test_distribution_properties() {
    let mut rng = thread_rng();

    // Test type safety
    let coin = Bernoulli::new(0.5).unwrap();
    let flip: bool = coin.sample(&mut rng);
    assert!(flip == true || flip == false); // Must be boolean

    // Test parameter validation
    assert!(Normal::new(0.0, -1.0).is_err());
    assert!(Beta::new(0.0, 1.0).is_err());
    assert!(Poisson::new(-1.0).is_err());

    // Test valid distributions
    let normal = Normal::new(0.0, 1.0).unwrap();
    let sample = normal.sample(&mut rng);
    let log_prob = normal.log_prob(&sample);
    assert!(log_prob.is_finite());
}
```
Common Pitfalls
- Underflow in probability space: use log-probabilities for accumulation
- Parameter validation: check constructor errors, don't assume success
- Precision with counts: use `u64` return types directly, avoid `f64` conversion
- Categorical indexing: trust the `usize` return; it's guaranteed valid
Next Steps
- Complex Models: See Building Complex Models for compositional patterns
- Debugging: Check out Debugging Models for troubleshooting
- Custom Logic: Learn Custom Handlers for specialized inference
The type-safe distribution system eliminates entire classes of runtime errors while making statistical code more readable and maintainable.