ContinuosCovariatesModuleCache Class Reference

Module for covariate-related computations within clustering processes.

#include <continuos_covariate_module_cache.hpp>

Inheritance diagram for ContinuosCovariatesModuleCache:
Module

Public Member Functions

 ContinuosCovariatesModuleCache (const Data &data_, const ContinuosCache &continuos_cache_, bool fixed_v_, double m_=0, double B_=1.0, double v_=1.0, double nu_=1.0, double S0_=1.0, const Eigen::VectorXi *old_alloc_provider={}, const std::unordered_map< int, std::vector< int > > *old_cluster_members_provider_={})
 Constructor for ContinuosCovariatesModuleCache.
Similarity Computation Methods
double compute_similarity_cls (int cls_idx, bool old_allo=false) const
 Compute covariate similarity contribution for a cluster.
double compute_similarity_obs (int obs_idx, int cls_idx) const __attribute__((hot))
 Compute covariate similarity for a single observation in a cluster.
Eigen::VectorXd compute_similarity_obs (int obs_idx) const
 Compute covariate similarity contributions for all existing clusters.
Public Member Functions inherited from Module
 Module (const Eigen::VectorXi *old_allocations_provider_=nullptr, const std::unordered_map< int, std::vector< int > > *old_cluster_members_provider_=nullptr)
void set_old_allocations_provider (const Eigen::VectorXi *provider)
void set_old_cluster_members_provider (const std::unordered_map< int, std::vector< int > > *provider)
virtual ~Module ()=default

Protected Member Functions

Helper Methods
ContinuosCache::ClusterStats compute_cluster_statistics (const Eigen::Ref< const Eigen::VectorXi > obs) const
 Compute cluster statistics for covariate similarity.
double compute_log_marginal_likelihood_NN (const ContinuosCache::ClusterStats &stats) const __attribute__((hot))
 Compute log marginal likelihood for cluster given covariates.
double compute_log_marginal_likelihood_NNIG (const ContinuosCache::ClusterStats &stats) const __attribute__((hot))
 Compute log marginal likelihood for cluster given covariates.
double compute_predictive_NN (const ContinuosCache::ClusterStats &stats, double covariate_val) const
 Compute log predictive density for a new observation (Normal-Normal model).
double compute_predictive_NNIG (const ContinuosCache::ClusterStats &stats, double covariate_val) const
 Compute log predictive density for a new observation (NNIG model).
double compute_log_marginal_likelihood (const ContinuosCache::ClusterStats &stats) const __attribute__((hot, always_inline))
 Compute log marginal likelihood based on model type.

Protected Attributes

const double Bv
 Product of prior variance and observation variance.
const double log_B
 Log of prior variance.
const double log_v
 Log of observation variance.
const double const_term
 Constant term in log likelihood.
const double lgamma_nu
 log Gamma(ν) for NNIG model (v ~ IG(ν, S₀))
const double nu_logS0
 ν log(S₀) for NNIG model (v ~ IG(ν, S₀))
std::vector< double > log_v_plus_nB
 Cache for log(v_plus_nB) for NN.
std::vector< double > lgamma_nu_n
 Cache for lgamma(nu_n) for NNIG.
Module References
const Data & data
 Reference to data object with cluster assignments.
const ContinuosCache & continuos_cache
 Reference to covariate cache for precomputed stats.
Data used
const bool fixed_v
 Whether observation variance is fixed (NN) or random (NNIG).
const double m
 Prior mean for covariate.
const double B
 Prior variance for covariate.
const double v
 Observation variance for covariate.
const double nu
 Prior shape parameter for variance (NNIG).
const double S0
 Prior scale parameter for variance (NNIG).
Protected Attributes inherited from Module
const Eigen::VectorXi * old_allocations_provider
 Pointer used to access the old allocation state.
const std::unordered_map< int, std::vector< int > > * old_cluster_members_provider
 Pointer used to access the old cluster members map.

Detailed Description

Module for covariate-related computations within clustering processes.

This class implements the product partition model with regression on covariates as described in Müller et al. (2011). It computes similarity measures based on how well observations within a cluster can be explained by a common covariate distribution (Normal conjugate prior).

Reference: Müller, P., Quintana, F., Rosner, G. L. (2011) "A Product Partition Model With Regression on Covariates", Journal of Computational and Graphical Statistics, 20(1).

Constructor & Destructor Documentation

◆ ContinuosCovariatesModuleCache()

ContinuosCovariatesModuleCache::ContinuosCovariatesModuleCache ( const Data & data_,
const ContinuosCache & continuos_cache_,
bool fixed_v_,
double m_ = 0,
double B_ = 1.0,
double v_ = 1.0,
double nu_ = 1.0,
double S0_ = 1.0,
const Eigen::VectorXi * old_alloc_provider = {},
const std::unordered_map< int, std::vector< int > > * old_cluster_members_provider_ = {} )
inline

Constructor for ContinuosCovariatesModuleCache.

Parameters
  data_                          Reference to Data object with cluster assignments
  continuos_cache_               Reference to ContinuosCache for precomputed stats
  fixed_v_                       Whether observation variance is fixed (NN) or random (NNIG)
  m_                             Prior mean for covariate
  B_                             Prior variance for covariate
  v_                             Observation variance for covariate
  nu_                            Prior shape parameter for variance (NNIG)
  S0_                            Prior scale parameter for variance (NNIG)
  old_alloc_provider             Pointer used to access old allocations
  old_cluster_members_provider_  Pointer used to access old cluster members

Member Function Documentation

◆ compute_cluster_statistics()

ContinuosCache::ClusterStats ContinuosCovariatesModuleCache::compute_cluster_statistics ( const Eigen::Ref< const Eigen::VectorXi > obs) const
protected

Compute cluster statistics for covariate similarity.

Parameters
  obs  Vector of observation indices in the cluster
Returns
Sufficient statistics (n, sum, sum of squares)
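As an illustration of the statistics named above, the following is a minimal sketch that accumulates (n, sum, sum of squares) over a list of observation indices. `StatsSketch` and `cluster_stats` are hypothetical names, and a plain `std::vector` stands in for the Eigen types; the actual layout of `ContinuosCache::ClusterStats` is not shown in this page and may differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ContinuosCache::ClusterStats (field names are assumptions).
struct StatsSketch {
    int n;         // number of observations in the cluster
    double sum;    // sum of covariate values
    double sumsq;  // sum of squared covariate values
};

// Accumulate sufficient statistics for the observations listed in `obs`.
StatsSketch cluster_stats(const std::vector<double> &covariate,
                          const std::vector<int> &obs) {
    StatsSketch s{0, 0.0, 0.0};
    for (int idx : obs) {
        const double x = covariate[static_cast<std::size_t>(idx)];
        ++s.n;
        s.sum += x;
        s.sumsq += x * x;
    }
    return s;
}
```

The sample mean and the centered sum of squares used by the likelihood formulas follow from these three numbers as sum / n and sumsq - sum² / n.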

◆ compute_log_marginal_likelihood()

double ContinuosCovariatesModuleCache::compute_log_marginal_likelihood ( const ContinuosCache::ClusterStats & stats) const
inline protected

Compute log marginal likelihood based on model type.

Chooses between the NN and NNIG models based on fixed_v.

Parameters
  stats  Sufficient statistics for the cluster
Returns
Log marginal likelihood value

◆ compute_log_marginal_likelihood_NN()

double ContinuosCovariatesModuleCache::compute_log_marginal_likelihood_NN ( const ContinuosCache::ClusterStats & stats) const
protected

Compute log marginal likelihood for cluster given covariates.

Implements the Normal-Normal conjugate prior model:

  • x_i ~ N(μ_j, v) for i ∈ S_j
  • Prior on mean μ_j: N(m, B)
  • Observation variance v is known and fixed

The marginal likelihood integrates out the cluster-specific mean μ_j. With sufficient statistics:

  • n_j = |S_j| (cluster size)
  • x̄_j = (1/n_j) Σ_{i ∈ S_j} x_i (sample mean)
  • SS = Σ_{i ∈ S_j} (x_i - x̄_j)² (centered sum of squares)

The posterior distribution of μ_j is N(m̂_j, τ_j) where:

  • τ_j = Bv / (v + n_j B) (posterior variance)
  • m̂_j = τ_j (n_j x̄_j / v + m / B) (posterior mean)

The log marginal likelihood is:

log q(x_j) = -(n_j/2) log(2π) - (n_j/2) log(v) - (1/2) log(B) + (1/2) log(τ_j)
             - SS/(2v) - n_j (x̄_j - m)² / (2(v + n_j B))

where log(τ_j) = log(B) + log(v) - log(v + n_j B)

Parameters
  stats  Sufficient statistics for the cluster
Returns
Log marginal likelihood value
Note
This is marked as __attribute__((hot)) for performance optimization as it is called frequently in the MCMC sampling loop.
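The Normal-Normal formula above can be evaluated directly from the sufficient statistics. The sketch below is illustrative only: `StatsSketch` and `log_marginal_nn` are hypothetical names, the hyperparameters are passed as arguments rather than read from the class members, and no caching is attempted.

```cpp
#include <cmath>

// Hypothetical stand-in for ContinuosCache::ClusterStats (field names are assumptions).
struct StatsSketch {
    int n;         // cluster size n_j
    double sum;    // sum of x_i
    double sumsq;  // sum of x_i^2
};

// Log marginal likelihood for x_i ~ N(mu, v), mu ~ N(m, B), with v known.
double log_marginal_nn(const StatsSketch &s, double m, double B, double v) {
    if (s.n == 0) return 0.0;  // an empty cluster contributes nothing
    const double pi = 3.14159265358979323846;
    const double n = static_cast<double>(s.n);
    const double xbar = s.sum / n;
    const double ss = s.sumsq - n * xbar * xbar;  // centered sum of squares
    const double log_tau = std::log(B) + std::log(v) - std::log(v + n * B);
    return -0.5 * n * std::log(2.0 * pi) - 0.5 * n * std::log(v)
           - 0.5 * std::log(B) + 0.5 * log_tau
           - ss / (2.0 * v)
           - n * (xbar - m) * (xbar - m) / (2.0 * (v + n * B));
}
```

For a single observation x = 0 with m = 0 and B = v = 1, the marginal is N(0, 2), so the function returns -0.5 log(4π).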

◆ compute_log_marginal_likelihood_NNIG()

double ContinuosCovariatesModuleCache::compute_log_marginal_likelihood_NNIG ( const ContinuosCache::ClusterStats & stats) const
protected

Compute log marginal likelihood for cluster given covariates.

Implements the Normal-InverseGamma conjugate prior model:

  • x ~ N(μ, v_j)
  • Prior on mean μ: N(m, B*v_j)
  • Prior on variance v_j: IG(nu, S0)

Parameters
  stats  Sufficient statistics (n, sum, sum of squares)
Returns
Log marginal likelihood contribution

The marginal likelihood for the NNIG model is:

log g(S) = log Γ(ν + n/2) - log Γ(ν) - (n/2) log(2π)
           - (1/2) log(1 + nB) + ν log(S₀)
           - (ν + n/2) log(S₀ + SS/2 + n/(2(1+nB)) (x̄-m)²)
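A minimal sketch of the NNIG marginal likelihood above, written term by term from the formula. `StatsSketch` and `log_marginal_nnig` are hypothetical names; the real method presumably reads the hyperparameters and cached constants from the class members instead of taking arguments.

```cpp
#include <cmath>

// Hypothetical stand-in for ContinuosCache::ClusterStats (field names are assumptions).
struct StatsSketch {
    int n;         // cluster size
    double sum;    // sum of x_i
    double sumsq;  // sum of x_i^2
};

// Log marginal likelihood for x_i ~ N(mu, v), mu ~ N(m, B*v), v ~ IG(nu, S0).
double log_marginal_nnig(const StatsSketch &s, double m, double B,
                         double nu, double S0) {
    if (s.n == 0) return 0.0;  // an empty cluster contributes nothing
    const double pi = 3.14159265358979323846;
    const double n = static_cast<double>(s.n);
    const double xbar = s.sum / n;
    const double ss = s.sumsq - n * xbar * xbar;  // centered sum of squares
    const double nu_n = nu + 0.5 * n;             // posterior shape
    const double S_n = S0 + 0.5 * ss              // posterior scale
                       + 0.5 * n / (1.0 + n * B) * (xbar - m) * (xbar - m);
    return std::lgamma(nu_n) - std::lgamma(nu) - 0.5 * n * std::log(2.0 * pi)
           - 0.5 * std::log(1.0 + n * B) + nu * std::log(S0)
           - nu_n * std::log(S_n);
}
```

With one observation x = 0 and m = 0, B = ν = S₀ = 1, the marginal is a Student-t density with 2 degrees of freedom and squared scale 2 evaluated at its mode, which is 1/4, so the function returns -log(4).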

◆ compute_predictive_NN()

double ContinuosCovariatesModuleCache::compute_predictive_NN ( const ContinuosCache::ClusterStats & stats,
double covariate_val ) const
protected

Compute log predictive density for a new observation (Normal-Normal model).

Computes the log predictive density of covariate_val given the current cluster statistics, assuming the Normal-Normal conjugate prior (fixed variance).

Parameters
  stats          Sufficient statistics of the cluster (n, sum, sum of squares)
  covariate_val  Value of the covariate to predict
Returns
Log predictive density log p(x_new | x_cluster)

The predictive distribution for the NN model is a Normal distribution: x_new | x_cluster ~ N(μ_n, σ²_pred)

Where:

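  • Posterior mean (for the N(m, B) prior of the Normal-Normal model): μ_n = (v m + nB x̄) / (v + nB), which equals (m + nB x̄) / (1 + nB) when v = 1
  • Predictive variance: σ²_pred = v (v + (n+1)B) / (v + nB), which equals (1 + (n+1)B) / (1 + nB) when v = 1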
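The following sketch evaluates a Normal predictive density from the cluster statistics. It is written for the N(m, B) prior of the Normal-Normal model with observation variance v kept explicit, so at v = 1 it coincides with the expressions above. `StatsSketch` and `log_predictive_nn` are hypothetical names, not the library's API.

```cpp
#include <cmath>

// Hypothetical stand-in for ContinuosCache::ClusterStats (field names are assumptions).
struct StatsSketch {
    int n;         // cluster size
    double sum;    // sum of x_i
    double sumsq;  // sum of x_i^2 (unused here, kept for parity with the other sketches)
};

// Log predictive density of x_new given a cluster, for x_i ~ N(mu, v), mu ~ N(m, B).
double log_predictive_nn(const StatsSketch &s, double x_new,
                         double m, double B, double v) {
    const double pi = 3.14159265358979323846;
    const double n = static_cast<double>(s.n);
    const double denom = v + n * B;
    // n * B * xbar equals B * sum, which also covers the empty cluster (n == 0).
    const double mu_n = (v * m + B * s.sum) / denom;
    const double var_pred = v * (v + (n + 1.0) * B) / denom;
    const double d = x_new - mu_n;
    return -0.5 * std::log(2.0 * pi * var_pred) - 0.5 * d * d / var_pred;
}
```

For an empty cluster the result reduces to the prior predictive N(m, v + B), which is a convenient sanity check.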

◆ compute_predictive_NNIG()

double ContinuosCovariatesModuleCache::compute_predictive_NNIG ( const ContinuosCache::ClusterStats & stats,
double covariate_val ) const
protected

Compute log predictive density for a new observation (NNIG model).

Computes the log predictive density of covariate_val given the current cluster statistics, assuming the Normal-Normal-Inverse-Gamma conjugate prior.

Parameters
  stats          Sufficient statistics of the cluster (n, sum, sum of squares)
  covariate_val  Value of the covariate to predict
Returns
Log predictive density log p(x_new | x_cluster)

The predictive distribution for the NNIG model is a non-standardized Student-t distribution: x_new | x_cluster ~ t(df=2ν_n, loc=μ_n, scale=S_n * ratio)

Where:

  • Degrees of freedom: 2ν_n = 2ν + n
  • Location: μ_n = (m + nB x̄) / (1 + nB)
  • Scale: derived from the posterior scale S_n and the variance inflation factor (1 + (n+1)B) / (1 + nB).
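One way to make the Student-t predictive concrete is sketched below. It assumes the standard conjugate result for this model: posterior shape ν_n = ν + n/2, posterior scale S_n = S₀ + SS/2 + n(x̄ - m)² / (2(1 + nB)), and squared t-scale (S_n / ν_n)(1 + (n+1)B) / (1 + nB). How the actual implementation factors these terms (the "ratio" mentioned above) is not shown in this page, and `StatsSketch` and `log_predictive_nnig` are hypothetical names.

```cpp
#include <cmath>

// Hypothetical stand-in for ContinuosCache::ClusterStats (field names are assumptions).
struct StatsSketch {
    int n;         // cluster size
    double sum;    // sum of x_i
    double sumsq;  // sum of x_i^2
};

// Log predictive density of x_new for x_i ~ N(mu, v), mu ~ N(m, B*v), v ~ IG(nu, S0).
double log_predictive_nnig(const StatsSketch &s, double x_new,
                           double m, double B, double nu, double S0) {
    const double pi = 3.14159265358979323846;
    const double n = static_cast<double>(s.n);
    const double one_nB = 1.0 + n * B;
    const double mu_n = (m + B * s.sum) / one_nB;  // location; n*B*xbar == B*sum
    double S_n = S0;                               // posterior scale
    if (s.n > 0) {
        const double xbar = s.sum / n;
        const double ss = s.sumsq - n * xbar * xbar;
        S_n += 0.5 * ss + 0.5 * n / one_nB * (xbar - m) * (xbar - m);
    }
    const double nu_n = nu + 0.5 * n;              // posterior shape
    const double df = 2.0 * nu_n;                  // degrees of freedom
    const double scale2 = S_n / nu_n * (1.0 + (n + 1.0) * B) / one_nB;
    const double z2 = (x_new - mu_n) * (x_new - mu_n) / scale2;
    // Log density of a location-scale Student-t distribution.
    return std::lgamma(0.5 * (df + 1.0)) - std::lgamma(0.5 * df)
           - 0.5 * std::log(df * pi * scale2)
           - 0.5 * (df + 1.0) * std::log1p(z2 / df);
}
```

For an empty cluster this equals the NNIG marginal likelihood of a single observation, so both give -log(4) at x = 0 with m = 0 and B = ν = S₀ = 1.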

◆ compute_similarity_cls()

double ContinuosCovariatesModuleCache::compute_similarity_cls ( int cls_idx,
bool old_allo = false ) const
virtual

Compute covariate similarity contribution for a cluster.

Computes the log marginal likelihood of the covariates within a cluster under the Normal conjugate model. Higher values indicate that observations in the cluster have similar covariate values.

Parameters
  cls_idx   Index of the cluster (0 to K-1)
  old_allo  If true, uses old allocations from old_allocations_provider; if false, uses current allocations from data (default: false)
Returns
Log marginal likelihood contribution (similarity score)

The computation follows Müller et al. (2011):

  1. Compute sufficient statistics (n, sum, sum of squares)
  2. Update hyperparameters using conjugate update rules
  3. Compute log marginal likelihood using updated parameters

This value is added to the clustering prior in split-merge moves to encourage clusters with homogeneous covariate values.

Implements Module.

◆ compute_similarity_obs() [1/2]

Eigen::VectorXd ContinuosCovariatesModuleCache::compute_similarity_obs ( int obs_idx) const
virtual

Compute covariate similarity contributions for all existing clusters.

Computes the predictive contributions for adding observation obs_idx to each existing cluster, considering covariate values.

Parameters
  obs_idx  Index of the observation
Returns
Vector of log predictive density contributions for each cluster

Implements Module.

◆ compute_similarity_obs() [2/2]

double ContinuosCovariatesModuleCache::compute_similarity_obs ( int obs_idx,
int cls_idx ) const
virtual

Compute covariate similarity for a single observation in a cluster.

Computes the predictive contribution when adding observation obs_idx to cluster cls_idx, considering the covariate values.

Parameters
  obs_idx  Index of the observation
  cls_idx  Index of the cluster
Returns
Log predictive density contribution

Used in Gibbs sampling to compute the probability of assigning an observation to a cluster based on covariate similarity.

Implements Module.
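In a Gibbs step, per-cluster log contributions such as those returned by compute_similarity_obs() are typically combined with the other log weights, normalized, and sampled from. The sketch below shows only that last part, using the log-sum-exp trick for numerical stability. `normalize_log_weights` and `sample_cluster` are hypothetical helpers, and `std::vector` replaces `Eigen::VectorXd` to keep the example self-contained.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Turn unnormalized log weights into probabilities (input must be non-empty).
std::vector<double> normalize_log_weights(const std::vector<double> &log_w) {
    // Subtract the maximum before exponentiating so the largest term is exp(0) = 1.
    const double mx = *std::max_element(log_w.begin(), log_w.end());
    std::vector<double> p(log_w.size());
    double total = 0.0;
    for (std::size_t k = 0; k < log_w.size(); ++k) {
        p[k] = std::exp(log_w[k] - mx);
        total += p[k];
    }
    for (double &pk : p) pk /= total;
    return p;
}

// Draw a cluster index with probability proportional to exp(log_w[k]).
int sample_cluster(const std::vector<double> &log_w, std::mt19937 &rng) {
    const std::vector<double> p = normalize_log_weights(log_w);
    std::discrete_distribution<int> dist(p.begin(), p.end());
    return dist(rng);
}
```

Working in log space until the final normalization avoids underflow when the log predictive densities are large and negative.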

Member Data Documentation

◆ Bv

const double ContinuosCovariatesModuleCache::Bv
protected

Product of prior variance and observation variance.

Definition continuos_covariate_module_cache.hpp:164, following the inline dispatch helper:

inline double compute_log_predictive_likelihood(const ContinuosCache::ClusterStats &stats, double covariate_val) const
__attribute__((hot, always_inline)) {
    if (fixed_v) {
        return compute_predictive_NN(stats, covariate_val);
    } else {
        return compute_predictive_NNIG(stats, covariate_val);
    }
}

◆ B

const double ContinuosCovariatesModuleCache::B
protected

Prior variance for covariate.

◆ const_term

const double ContinuosCovariatesModuleCache::const_term
protected

Constant term in log likelihood.

◆ continuos_cache

const ContinuosCache& ContinuosCovariatesModuleCache::continuos_cache
protected

Reference to covariate cache for precomputed stats.

◆ data

const Data& ContinuosCovariatesModuleCache::data
protected

Reference to data object with cluster assignments.

◆ fixed_v

const bool ContinuosCovariatesModuleCache::fixed_v
protected

Whether observation variance is fixed (NN) or random (NNIG).

◆ lgamma_nu

const double ContinuosCovariatesModuleCache::lgamma_nu
protected

log Gamma(ν) for NNIG model (v ~ IG(ν, S₀))

◆ lgamma_nu_n

std::vector<double> ContinuosCovariatesModuleCache::lgamma_nu_n
protected

Cache for lgamma(nu_n) for NNIG.

◆ log_B

const double ContinuosCovariatesModuleCache::log_B
protected

Log of prior variance.

◆ log_v

const double ContinuosCovariatesModuleCache::log_v
protected

Log of observation variance.

◆ log_v_plus_nB

std::vector<double> ContinuosCovariatesModuleCache::log_v_plus_nB
protected

Cache for log(v_plus_nB) for NN.
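A lookup table of this kind can be filled once per possible cluster size, so the hot likelihood path replaces a `std::log` or `std::lgamma` call with an array read. The sketch below assumes the tables are indexed by cluster size n from 0 to a maximum and that ν_n = ν + n/2; both are assumptions, as are the function names, since the page does not show how the caches are built.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// table[n] = log(v + n * B) for n = 0..max_n (Normal-Normal model).
std::vector<double> build_log_v_plus_nB(int max_n, double v, double B) {
    std::vector<double> table(static_cast<std::size_t>(max_n) + 1);
    for (int n = 0; n <= max_n; ++n) {
        table[static_cast<std::size_t>(n)] = std::log(v + n * B);
    }
    return table;
}

// table[n] = lgamma(nu + n / 2) for n = 0..max_n (NNIG model).
std::vector<double> build_lgamma_nu_n(int max_n, double nu) {
    std::vector<double> table(static_cast<std::size_t>(max_n) + 1);
    for (int n = 0; n <= max_n; ++n) {
        table[static_cast<std::size_t>(n)] = std::lgamma(nu + 0.5 * n);
    }
    return table;
}
```

Because cluster sizes are bounded by the number of observations, each table needs at most one entry per observation plus one.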

◆ m

const double ContinuosCovariatesModuleCache::m
protected

Prior mean for covariate.

◆ nu

const double ContinuosCovariatesModuleCache::nu
protected

Prior shape parameter for variance (NNIG).

◆ nu_logS0

const double ContinuosCovariatesModuleCache::nu_logS0
protected

ν log(S₀) for NNIG model (v ~ IG(ν, S₀))

◆ S0

const double ContinuosCovariatesModuleCache::S0
protected

Prior scale parameter for variance (NNIG).

◆ v

const double ContinuosCovariatesModuleCache::v
protected

Observation variance for covariate.


The documentation for this class was generated from the following files: