Docs
NGGP Class Reference

Normalized Generalized Gamma Process class for Bayesian nonparametric clustering.

#include <NGGP.hpp>

Inheritance diagram for NGGP:
Process → NGGP → NGGPx (base class to derived class)

Public Member Functions

 NGGP (Data &d, Params &p, U_sampler &mh)
 Constructor for the Normalized Generalized Gamma Process.
Gibbs Sampling Methods
double gibbs_prior_existing_cluster (int cls_idx, int obs_idx=0) const override
 Computes the log prior probability of assigning a data point to an existing cluster.
Eigen::VectorXd gibbs_prior_existing_clusters (int obs_idx) const override
 Computes the log prior probabilities of assigning a data point to each existing cluster, returned as a vector for use in Gibbs sampling over existing clusters.
double gibbs_prior_new_cluster () const override
 Computes the log prior probability of assigning a data point to a new cluster.
Split-Merge Algorithm Methods
double prior_ratio_split (int ci, int cj) const override
 Computes the log prior ratio for a split operation in an NGGP-based split-merge MCMC algorithm.
double prior_ratio_merge (int size_old_ci, int size_old_cj) const override
 Computes the log prior ratio for a merge operation in an NGGP-based split-merge MCMC algorithm.
double prior_ratio_shuffle (int size_old_ci, int size_old_cj, int ci, int cj) const override
 Computes the log prior ratio for a shuffle operation in an NGGP-based split-merge MCMC algorithm.
Parameter Update Methods
void update_params () override
 Updates the NGGP parameters by updating the latent variable U.
Public Member Functions inherited from Process
 Process (Data &d, const Params &p)
 Constructor initializing process with data and parameters.
virtual double gibbs_prior_new_cluster_obs (int obs_idx) const
 Compute prior probability for creating a new cluster for a specific observation.
void set_old_allocations (const Eigen::VectorXi &new_allocations)
 Store current allocations for potential rollback.
void set_old_cluster_members (const std::unordered_map< int, std::vector< int > > &new_cluster_members)
 Store current cluster members for potential rollback.
void set_old_K (int new_K)
 Store current number of clusters for potential rollback.
void restore_state ()
 Restores the cached state into the data object.
const Eigen::VectorXi & old_allocations_view () const
 Provides read-only access to the stored previous allocations.
void set_idx_i (int i)
 Set index of first observation in split-merge pair.
void set_idx_j (int j)
 Set index of second observation in split-merge pair.
virtual ~Process ()
 Virtual destructor for proper cleanup of derived classes.

Protected Attributes

Private Member Variables
U_sampler & U_sampler_method
 Reference to the U_sampler instance for updating the latent variable U.
Random Number Generation
std::random_device rd
 Random device for seeding.
std::mt19937 gen
 Mersenne Twister random number generator.
Protected Attributes inherited from Process
Data & data
 Reference to the data object containing observations and allocations.
const Params & params
 Reference to the parameters object containing process hyperparameters.
Eigen::VectorXi old_allocations
 Storage for previous allocations to enable rollback in case of rejection or for computation requiring it.
std::unordered_map< int, std::vector< int > > old_cluster_members
 Storage for previous cluster members to enable rollback in case of rejection or for computation requiring it.
int old_K = 0
 Number of clusters associated with the stored previous state.
int idx_i
 Index of first observation involved in split-merge move.
int idx_j
 Index of second observation involved in split-merge move.
const double log_a = log(params.a)
 Precomputed logarithm of total mass parameter for efficiency.

Additional Inherited Members

Protected Member Functions inherited from Process
const std::unordered_map< int, std::vector< int > > & old_cluster_members_view () const
 Provides read-only access to the stored previous cluster members.

Detailed Description

Normalized Generalized Gamma Process class for Bayesian nonparametric clustering.

This class implements a Normalized Generalized Gamma Process (NGGP), which extends the Dirichlet Process with the parameters (a, sigma, tau) and a latent variable U. The discount parameter sigma gives more flexibility in modeling cluster sizes. U is resampled by MCMC given the current partition, so the prior weight of opening a new cluster adapts as sampling proceeds.

Constructor & Destructor Documentation

◆ NGGP()

NGGP::NGGP ( Data & d,
Params & p,
U_sampler & mh )
inline

Constructor for the Normalized Generalized Gamma Process.

Parameters
d   Reference to the data object containing observations and cluster assignments.
p   Reference to the parameters object containing NGGP parameters (a, sigma, tau).
mh  Reference to a U_sampler instance (e.g., RWMH or MALA) for updating the latent variable U via MCMC.

Member Function Documentation

◆ gibbs_prior_existing_cluster()

double NGGP::gibbs_prior_existing_cluster ( int cls_idx,
int obs_idx = 0 ) const
nodiscard override virtual

Computes the log prior probability of assigning a data point to an existing cluster.

For NGGP, this incorporates the discount parameter sigma, giving probability proportional to (n_k - sigma) where n_k is the cluster size.

Parameters
cls_idx  The index of the cluster.
obs_idx  The index of the observation (default: 0, unused in this implementation).
Returns
The log prior probability of assigning the data point to the existing cluster.

Implements Process.

Reimplemented in NGGPx.
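
As a minimal sketch of the rule described for gibbs_prior_existing_cluster(), the unnormalized log prior weight of an existing cluster can be computed from its size and sigma alone. The free function `log_prior_existing` and its arguments are hypothetical names used for illustration. They are not the NGGP member itself, which presumably reads cluster sizes from the Data object.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical helper (not part of NGGP): unnormalized log prior weight
// of joining an existing cluster of size n_k, i.e. log(n_k - sigma).
// Assumes n_k >= 1 and 0 <= sigma < 1, so the argument of log is positive.
double log_prior_existing(int n_k, double sigma) {
    if (n_k < 1 || sigma < 0.0 || sigma >= 1.0) {
        throw std::invalid_argument("need n_k >= 1 and 0 <= sigma < 1");
    }
    return std::log(static_cast<double>(n_k) - sigma);
}
```

With sigma = 0 the weight reduces to log(n_k), the usual Dirichlet Process rule. A larger sigma reduces the weight of every cluster by the same absolute amount, which penalizes small clusters relatively more.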

◆ gibbs_prior_existing_clusters()

Eigen::VectorXd NGGP::gibbs_prior_existing_clusters ( int obs_idx) const
nodiscard override virtual

Computes the log prior probabilities of assigning a data point to each existing cluster, returned as a vector for use in Gibbs sampling over existing clusters.

This method incorporates spatial information by considering the number of neighbors in each target cluster when computing the prior probabilities.

Parameters
obs_idx  The index of the observation to assign.
Returns
A vector of log prior probabilities for assigning the data point to each existing cluster.

Implements Process.

Reimplemented in NGGPx.
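
The vector returned by gibbs_prior_existing_clusters() holds log weights, so a caller that needs sampling probabilities has to exponentiate and normalize them. The sketch below shows one common way to do that stably, the log-sum-exp trick. The helper `normalize_log_weights` is a hypothetical name, and it uses `std::vector<double>` instead of `Eigen::VectorXd` only to stay dependency-free. It is not taken from the library.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: converts unnormalized log weights into probabilities.
// Subtracting the maximum log weight before exponentiating avoids overflow
// and keeps at least one term equal to exp(0) = 1.
std::vector<double> normalize_log_weights(const std::vector<double>& log_w) {
    std::vector<double> probs(log_w.size());
    if (log_w.empty()) {
        return probs;
    }
    const double max_lw = *std::max_element(log_w.begin(), log_w.end());
    double total = 0.0;
    for (std::size_t k = 0; k < log_w.size(); ++k) {
        probs[k] = std::exp(log_w[k] - max_lw);
        total += probs[k];
    }
    for (double& p : probs) {
        p /= total;
    }
    return probs;
}
```

In a full Gibbs step the log likelihood of the observation under each cluster would typically be added to the log prior before normalizing. That step is omitted here.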

◆ gibbs_prior_new_cluster()

double NGGP::gibbs_prior_new_cluster ( ) const
nodiscard override virtual

Computes the log prior probability of assigning a data point to a new cluster.

For NGGP, this depends on the latent variable U and is proportional to a * sigma * (tau + U)^sigma, where a is the total mass parameter.

Returns
The log prior probability of assigning the data point to a new cluster.

Implements Process.

Reimplemented in NGGPx.
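
The new-cluster weight described for gibbs_prior_new_cluster() can be evaluated directly in log space, which avoids computing the power explicitly. The sketch below is a hypothetical free function, not the NGGP member. It assumes the total mass parameter is the `a` field of Params, and that sigma lies strictly between 0 and 1 so that log(sigma) is defined.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical helper (not part of NGGP): unnormalized log prior weight of
// opening a new cluster, log(a) + log(sigma) + sigma * log(tau + U).
// Assumes a > 0, 0 < sigma < 1, tau >= 0 and U > 0.
double log_prior_new(double a, double sigma, double tau, double U) {
    if (a <= 0.0 || sigma <= 0.0 || sigma >= 1.0 || tau < 0.0 || U <= 0.0) {
        throw std::invalid_argument("invalid NGGP parameters");
    }
    return std::log(a) + std::log(sigma) + sigma * std::log(tau + U);
}
```

Because log(a) does not change between calls, it can be computed once and reused, which is the role the inherited `log_a` attribute appears to play.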

◆ prior_ratio_merge()

double NGGP::prior_ratio_merge ( int size_old_ci,
int size_old_cj ) const
nodiscard override virtual

Computes the log prior ratio for a merge operation in an NGGP-based split-merge MCMC algorithm.

This method accounts for the generalized gamma process prior when computing the acceptance ratio for merging clusters.

Parameters
size_old_ci  The size of the first cluster before the merge.
size_old_cj  The size of the second cluster before the merge.
Returns
The log prior ratio for the merge operation.

Implements Process.

Reimplemented in NGGPx.

◆ prior_ratio_shuffle()

double NGGP::prior_ratio_shuffle ( int size_old_ci,
int size_old_cj,
int ci,
int cj ) const
nodiscard override virtual

Computes the log prior ratio for a shuffle operation in an NGGP-based split-merge MCMC algorithm.

This method accounts for the generalized gamma process prior when computing the acceptance ratio for shuffling observations between clusters.

Parameters
size_old_ci  The size of the first cluster before the shuffle.
size_old_cj  The size of the second cluster before the shuffle.
ci           The first cluster index involved in the shuffle.
cj           The second cluster index involved in the shuffle.
Returns
The log prior ratio for the shuffle operation.

Implements Process.

Reimplemented in NGGPx.

◆ prior_ratio_split()

double NGGP::prior_ratio_split ( int ci,
int cj ) const
nodiscard override virtual

Computes the log prior ratio for a split operation in an NGGP-based split-merge MCMC algorithm.

This method accounts for the generalized gamma process prior when computing the acceptance ratio for splitting clusters.

Parameters
ci  The first cluster index involved in the split.
cj  The second cluster index involved in the split.
Returns
The log prior ratio for the split operation.

Implements Process.

Reimplemented in NGGPx.

◆ update_params()

void NGGP::update_params ( )
inline override virtual

Updates the NGGP parameters by updating the latent variable U.

This method delegates the update to the U_sampler instance, which uses an MCMC algorithm (RWMH or MALA) to sample U from its conditional distribution given the current partition.

See also
U_sampler::update_U(), RWMH::update_U(), MALA::update_U()

Implements Process.

Reimplemented in NGGPx.
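
To illustrate the kind of update that update_params() delegates to, the sketch below performs one generic random-walk Metropolis-Hastings step for a positive variable such as U. It is not the library's RWMH class: the function name, the `log_target` callable (an unnormalized log density of U supplied by the caller), and the choice to walk on log(U) are all assumptions made for illustration.

```cpp
#include <cmath>
#include <functional>
#include <random>

// Hypothetical sketch of one random-walk Metropolis-Hastings step for a
// positive latent variable U. The walk is done on V = log(U) so that every
// proposal stays positive. log_target is the unnormalized log density of U.
double rwmh_step_log_scale(double U,
                           const std::function<double(double)>& log_target,
                           double step_sd,
                           std::mt19937& gen) {
    std::normal_distribution<double> proposal(0.0, step_sd);
    std::uniform_real_distribution<double> unif(0.0, 1.0);

    const double V = std::log(U);
    const double V_new = V + proposal(gen);
    const double U_new = std::exp(V_new);

    // Log acceptance ratio. The term (V_new - V) is the log-Jacobian
    // correction for the change of variables U = exp(V).
    const double log_ratio =
        log_target(U_new) - log_target(U) + (V_new - V);

    // Accept the proposal with probability min(1, exp(log_ratio)).
    return (std::log(unif(gen)) < log_ratio) ? U_new : U;
}
```

Calling this function repeatedly with the conditional log density of U given the current partition would yield a chain of U values. The step size `step_sd` controls the acceptance rate and would need tuning in practice.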

Member Data Documentation

◆ gen

std::mt19937 NGGP::gen
mutable protected

Mersenne Twister random number generator.

◆ rd

std::random_device NGGP::rd
protected

Random device for seeding.

◆ U_sampler_method

U_sampler& NGGP::U_sampler_method
protected

Reference to the U_sampler instance for updating the latent variable U.

This can be any derived class of U_sampler (e.g., RWMH or MALA) that implements the MCMC algorithm for sampling U from its conditional distribution.

