====================================================
 HG-means Clustering: Mixture of Gaussian instances
====================================================

---- WHAT
Description of the file format used for the mixture of spherical Gaussians datasets. Each dataset consists of a data file, containing the coordinates of the samples, and an accompanying class file, containing the label (class) of each sample.


---- WHO
Daniel Gribel (dgribel@inf.puc-rio.br) and Thibaut Vidal (vidalt@inf.puc-rio.br)


---- NAME OF DATA FILES

The name of a Mixture of Gaussians data file follows the structure below:

-------------
 Gau-M-D.txt 
-------------

where M is the number of clusters (Gaussian distributions) and D is the dimensionality of the data. For example, Gau-50-10.txt is the data file with 50 clusters in 10 dimensions.
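
As a minimal sketch of the naming convention above, the Python helpers below build and parse a data file name. The names `data_file_name` and `parse_data_file_name` are hypothetical, introduced here for illustration only, and the pattern assumes M and D are written as plain decimal integers.

```python
import re
from typing import Tuple

# Hypothetical helpers illustrating the Gau-M-D.txt naming convention.
# Assumption: M and D appear as plain decimal integers.
_DATA_NAME = re.compile(r"^Gau-(\d+)-(\d+)\.txt$")


def data_file_name(m: int, d: int) -> str:
    """Return the data file name for m clusters and d dimensions."""
    return f"Gau-{m}-{d}.txt"


def parse_data_file_name(name: str) -> Tuple[int, int]:
    """Return (m, d) extracted from a name such as 'Gau-50-10.txt'."""
    match = _DATA_NAME.match(name)
    if match is None:
        raise ValueError(f"not a Mixture of Gaussians data file name: {name!r}")
    return int(match.group(1)), int(match.group(2))
```

For instance, `parse_data_file_name("Gau-50-10.txt")` yields 50 clusters and 10 dimensions.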


---- CONTENT OF DATA FILES

The first line of a data file contains the number of data points (n) and the dimensionality of the data (d), separated by a single space. Each remaining line contains the d feature values of one sample, separated by single spaces, where x_ij is the j-th feature of the i-th sample, as depicted in the scheme below:

===========================
| n d                     |
---------------------------
| x_11 x_12 x_13 ... x_1d |
---------------------------
| x_21 x_22 x_23 ... x_2d |
---------------------------
| ...  ...  ...  ... ...  |
---------------------------
| x_n1 x_n2 x_n3 ... x_nd |
===========================
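
The layout above can be read with a few lines of Python. This is only a sketch using the standard library: the function name `read_data_file` is hypothetical, and the consistency checks against n and d are an added safeguard rather than part of the format description.

```python
from typing import List, Tuple


def read_data_file(path: str) -> Tuple[int, int, List[List[float]]]:
    """Read a data file: an 'n d' header followed by n rows of d values."""
    with open(path) as handle:
        # First line: number of data points and dimensionality.
        header = handle.readline().split()
        n, d = int(header[0]), int(header[1])
        # Remaining lines: one sample per line, values separated by spaces.
        rows = [[float(v) for v in line.split()] for line in handle if line.strip()]
    # Added safeguard (not part of the format): check the header against the body.
    if len(rows) != n:
        raise ValueError(f"expected {n} samples, found {len(rows)}")
    for i, row in enumerate(rows, start=1):
        if len(row) != d:
            raise ValueError(f"sample {i} has {len(row)} values, expected {d}")
    return n, d, rows
```

With this sketch, `rows[i - 1][j - 1]` holds x_ij, since Python lists are indexed from zero.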

All Mixture of Gaussians instances use means and variances selected uniformly in the ranges [0, 5] and [1, 10], respectively.


---- NAME OF CLASS FILES

The name of the files containing the classes (labels) of each data point follows the structure below:

---------------
 Y-Gau-M-D.txt 
---------------

where M is the number of clusters (Gaussian distributions) and D is the dimensionality of the data. For example, Y-Gau-50-10.txt is the class file that accompanies Gau-50-10.txt.
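
Because the class file name is the data file name with a "Y-" prefix, one can be derived from the other. The sketch below uses a hypothetical helper, `class_file_path`, and assumes that the class file is stored in the same directory as its data file, which the naming convention itself does not state.

```python
import os


def class_file_path(data_path: str) -> str:
    """Return the path of the class file that accompanies a data file.

    Assumption: both files live in the same directory.
    """
    directory, name = os.path.split(data_path)
    # Prefix only the file name, leaving the directory part untouched.
    return os.path.join(directory, "Y-" + name)
```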


---- CONTENT OF CLASS FILES

The class file lists the class of each sample of the dataset, i.e., the Gaussian distribution that generated the sample, one label per line, where y_i corresponds to the class (label) of the i-th sample:

=====
 y_1
-----
 y_2
-----
 ...
-----
 y_n
=====
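
A class file in the layout above can be loaded as follows. This is a minimal sketch: `read_class_file` and `class_sizes` are hypothetical names, and labels are kept as strings because the encoding of the labels is not specified here.

```python
from collections import Counter
from typing import Dict, List


def read_class_file(path: str) -> List[str]:
    """Read one label per line; the i-th entry is the class of the i-th sample."""
    with open(path) as handle:
        # Labels are kept as strings (assumption: their encoding is unspecified).
        return [line.strip() for line in handle if line.strip()]


def class_sizes(labels: List[str]) -> Dict[str, int]:
    """Count how many samples were generated by each Gaussian."""
    return dict(Counter(labels))
```

The number of labels read should equal n, the number of data points declared in the first line of the matching data file.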