kmeanstf.kmeanstf.TunnelKMeansTF

class kmeanstf.kmeanstf.TunnelKMeansTF(n_clusters=8, init='random', n_init=1, max_iter=300, tol=0.0001, verbose=0, random_state=None, max_tunnel_iter=300, max_tunnel_moves_per_iter=100, criterion=1.0, local_trials=1, collect_history=False)

Implements tunnel k-means.

For a full description of the methods, see the base class BaseKMeansTF.

Parameters:

n_clusters (int) – The number of clusters to form as well as the number of centroids to generate.
init ('random', 'k-means++' or array) – method of initialization
n_init (int) – number of runs of the initial k-means phase with different initializations (default 1). Only one tunnel phase is performed even if n_init is larger than 1.
max_iter (int) – Maximum number of Lloyd iterations for a single run of the k-means algorithm.
tol (float) – Relative tolerance with regards to inertia to declare convergence.
verbose (int) – Verbosity mode.
random_state (int or None) – None, or an integer used to seed the random number generators of Python, NumPy and TensorFlow
max_tunnel_iter (int) – maximum number of tunnel iterations to perform
max_tunnel_moves_per_iter (int) – maximum number of centroids to move in one tunnel iteration
criterion (float) – initial required ratio error/utility (is increased adaptively)
local_trials (int) – how many times each tunnel move should be repeated with a different random offset vector (1 or larger)
collect_history (bool) – collect historic information on inertia, criterion, tunnel moves, codebooks
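The criterion parameter refers to a ratio of error to utility. The NumPy sketch below illustrates one common reading of these two quantities and is not taken from the kmeanstf source: the error of a centroid is assumed to be the summed squared distance of the points it serves, and its utility is assumed to be the increase in total error if that centroid were removed. The helper name errors_and_utilities and the comparison of the largest error with the smallest utility are assumptions made for illustration only.

```python
import numpy as np

def errors_and_utilities(X, centroids):
    """Per-centroid error and utility under the assumed definitions above."""
    # Squared Euclidean distances, shape (n_samples, n_clusters).
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(d2, axis=1)
    nearest, second = order[:, 0], order[:, 1]
    rows = np.arange(len(X))
    d_near = d2[rows, nearest]
    d_second = d2[rows, second]
    k = len(centroids)
    # Error: summed squared distance of the points a centroid serves.
    errors = np.bincount(nearest, weights=d_near, minlength=k)
    # Utility: growth in total error if the centroid were removed, i.e. its
    # points would fall back to their second-nearest centroid.
    utils = np.bincount(nearest, weights=d_second - d_near, minlength=k)
    return errors, utils

# Two centroids crowd a tight pair of points; one centroid serves a wide pair.
X = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 0.0], [14.0, 0.0]])
centroids = np.array([[0.0, 0.0], [0.1, 0.0], [12.0, 0.0]])
errors, utils = errors_and_utilities(X, centroids)

criterion = 1.0  # documented default
# Assumed check: is the worst error large relative to the least useful centroid?
ratio = errors.max() / utils.min()
move_is_candidate = ratio > criterion
print(errors, utils, move_is_candidate)
```

In this toy data the centroid at (12, 0) carries all of the error while the two crowded centroids have almost no utility, so under the assumed check one of them would be a candidate for being moved.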

Variables:

cluster_centers_ (array, [n_clusters, n_features]) – Coordinates of cluster centers. If the algorithm stops before fully converging (see tol and max_iter), these will not be consistent with labels_.
labels_ (array, [n_samples]) – Label of each point, i.e. the index of its closest centroid
inertia_ (float) – Sum of squared distances of samples to their closest cluster center.
n_iter_ (int) – Number of iterations run.
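The relationship between the labels and the inertia described above can be checked by hand. The following NumPy sketch is independent of kmeanstf; the helper name assign_and_inertia is hypothetical and only mirrors the definitions given in the list.

```python
import numpy as np

def assign_and_inertia(X, centers):
    """Return the closest-center index per sample and the summed squared distance."""
    # Squared Euclidean distances, shape (n_samples, n_clusters).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    inertia = d2[np.arange(len(X)), labels].sum()
    return labels, inertia

X = np.array([[0.0, 0.0], [0.0, 2.0], [5.0, 0.0]])
centers = np.array([[0.0, 1.0], [5.0, 0.0]])
labels, inertia = assign_and_inertia(X, centers)
print(labels, inertia)  # [0 0 1] 2.0
```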

__init__(n_clusters=8, init='random', n_init=1, max_iter=300, tol=0.0001, verbose=0, random_state=None, max_tunnel_iter=300, max_tunnel_moves_per_iter=100, criterion=1.0, local_trials=1, collect_history=False): Initialize self. See help(type(self)) for accurate signature.

The Methods

`__init__`([n_clusters, init, n_init, …])	Initialize self.
`fit`(X)	Compute k-means clustering.
`fit_predict`(X)	Compute cluster centers and predict cluster index for each sample.
`get_errs_and_utils`(X[, centroids])	Get error and utility values wrt.
`get_gaussian_mixture`([n, d, g, sigma])	generate test data from Gaussian mixture distribution
`get_history`()	Get collected history data of performed run of fit().
`get_log`([abbr])	Get statistics of performed run of fit()
`get_params`()	Get params used to define class
`get_system_status`([do_print])	print tensorflow version and availability of GPUs.
`predict`(X)	Predict the closest cluster each sample in X belongs to.
`self_test`([X, n_clusters, n_init, n, d, g, …])	self-testing routine
`set_random_seed`(seed)	setting random seed for tensorflow, python and numpy
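A typical workflow combines the constructor with fit and predict from the table above. The sketch below is a minimal example under stated assumptions: the import path follows the qualified class name on this page, the helper fit_tunnel_kmeans and the TUNNEL_PARAMS dictionary are hypothetical names, and passing float32 NumPy data is an assumption rather than a documented requirement. The import is deferred into the helper so the block can be loaded without kmeanstf and TensorFlow installed.

```python
import numpy as np

# Keyword arguments taken from the documented constructor signature.
TUNNEL_PARAMS = dict(
    n_clusters=8,
    init="k-means++",
    n_init=1,
    max_iter=300,
    tol=0.0001,
    random_state=0,
    max_tunnel_iter=300,
    max_tunnel_moves_per_iter=100,
    criterion=1.0,
    local_trials=1,
    collect_history=False,
)

def fit_tunnel_kmeans(X, **overrides):
    """Build a TunnelKMeansTF with the parameters above and fit it to X."""
    # Deferred import: requires the kmeanstf package (and TensorFlow).
    from kmeanstf.kmeanstf import TunnelKMeansTF
    params = {**TUNNEL_PARAMS, **overrides}
    model = TunnelKMeansTF(**params)
    model.fit(X)
    return model

# Synthetic 2-D data; float32 is an assumption, not a documented requirement.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2)).astype(np.float32)

# With kmeanstf installed, one would then call for example:
#   model = fit_tunnel_kmeans(X, n_clusters=5)
#   assignments = model.predict(X)
```

Keeping the parameters in a plain dictionary makes it easy to vary a single tunnel setting, such as max_tunnel_iter or local_trials, between runs while leaving the rest at the documented defaults.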