I am using the GMM to cluster my dataset to K Groups, my model is running well, but there is no way to get raw data from each cluster, Can you guys suggest me some idea to solve this problem. Thank you so much.
You can do it like this (look at d0, d1, & d2).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas import DataFrame
from sklearn import datasets
from sklearn.mixture import GaussianMixture
# load the iris dataset
iris = datasets.load_iris()
# select first two columns
X = iris.data[:, 0:2]
# turn it into a dataframe
d = pd.DataFrame(X)
# plot the data
plt.scatter(d[0], d[1])
gmm = GaussianMixture(n_components = 3)
# Fit the GMM model for the dataset
# which expresses the dataset as a
# mixture of 3 Gaussian Distribution
gmm.fit(d)
# Assign a label to each sample
labels = gmm.predict(d)
d['labels']= labels
d0 = d[d['labels']== 0]
d1 = d[d['labels']== 1]
d2 = d[d['labels']== 2]
# here is a possible solution for you:
d0
d1
d2
# plot three clusters in same plot
plt.scatter(d0[0], d0[1], c ='r')
plt.scatter(d1[0], d1[1], c ='yellow')
plt.scatter(d2[0], d2[1], c ='g')
# print the converged log-likelihood value
print(gmm.lower_bound_)
# print the number of iterations needed
# for the log-likelihood value to converge
print(gmm.n_iter_)
# it needed 8 iterations for the log-likelihood to converge.