machine-learning cluster-analysis mutual-information

Is Mutual Information (MI) an important factor to consider in unsupervised learning (clustering)?

I have a supervised learning problem. The final step in the solving process is segmentation. Do features with the lowest MI affect the clustering process?

My problem is churn customer segmentation. I found that some features have no MI at all. Should I drop these features?

Solution

You should run a feature importance experiment, like this one.

https://github.com/ash-wicus-ml/Notebooks/blob/master/XG%20Boost%20-%20Feature%20Importance.ipynb
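As a rough sketch of what such an experiment might look like (this is not the code from the linked notebook): the example below assumes scikit-learn is installed, uses a synthetic dataset in place of real churn data, and swaps in scikit-learn's `GradientBoostingClassifier` for XGBoost so it runs without extra dependencies. The feature names `f0` to `f5` are made up for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import mutual_info_classif

# Synthetic stand-in for a churn table (assumption: not real data).
# With shuffle=False the first 3 columns are informative, the last 3 are noise.
X, y = make_classification(
    n_samples=1000,
    n_features=6,
    n_informative=3,
    n_redundant=0,
    n_repeated=0,
    shuffle=False,
    random_state=0,
)
feature_names = [f"f{i}" for i in range(X.shape[1])]  # hypothetical names

# Mutual information between each feature and the label (estimates are >= 0).
mi = mutual_info_classif(X, y, random_state=0)

# Tree-based importance from a gradient boosting model (stand-in for XGBoost).
model = GradientBoostingClassifier(random_state=0).fit(X, y)
importance = model.feature_importances_

# Print both scores side by side so low-MI features can be inspected.
for name, m, imp in zip(feature_names, mi, importance):
    print(f"{name}: MI={m:.3f}  importance={imp:.3f}")
```

Both scores measure how much a feature tells you about the label, so a feature with near-zero MI is a reasonable candidate to drop for the supervised part. Neither score says anything directly about how the feature shapes the clusters.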

Once you know which X variables (features) you want to keep, you can run some clustering experiments.

https://github.com/ash-wicus-ml/Notebooks/blob/master/Clustering%20Algorithms%20Compared.ipynb
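To get a feel for whether extra features affect the clustering at all, one option is to cluster with and without them and compare the results. The sketch below is an illustration under stated assumptions, not the method from the linked notebook: it uses scikit-learn's `KMeans` on synthetic blobs, the cluster centers are chosen by hand, and the noise columns are deliberately put on a larger scale than the signal to exaggerate the effect.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)

# Three well-separated groups in two "useful" features (hand-picked centers).
X_signal, true_groups = make_blobs(
    n_samples=600,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=1.0,
    random_state=0,
)

# Five features unrelated to the groups, on a deliberately large scale.
X_noise = rng.normal(loc=0.0, scale=20.0, size=(X_signal.shape[0], 5))
X_all = np.hstack([X_signal, X_noise])


def cluster(X):
    """Run k-means with 3 clusters and return the cluster labels."""
    return KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)


# Adjusted Rand index: 1.0 means the clusters match the known groups exactly.
ari_clean = adjusted_rand_score(true_groups, cluster(X_signal))
ari_noisy = adjusted_rand_score(true_groups, cluster(X_all))

print(f"ARI without noise features: {ari_clean:.2f}")
print(f"ARI with noise features:    {ari_noisy:.2f}")
```

In this toy setup the unrelated features dominate the distance computation and the groups are lost. On real data there are no known groups to compare against, so you would look at something like the silhouette score or at how stable the segments are when a feature is removed. Keep in mind that a feature with no MI against the churn label can still carry structure that matters for segmentation.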
