eeddaann/data-mining-project

Data Mining Project

This notebook is the final project for a data-mining course. In this project we applied data-mining techniques using Python's scikit-learn library. The project consists of:

  1. Data Exploration:
    • Statistics about the features
    • Scatter plots for the features
    • Correlation matrix
    • Violin plot
  2. Feature Engineering:
    • LDA
    • PCA
    • Modification for PCA
    • Feature generation
  3. KNN:
    • Hyperparameter optimization
    • Applied with different preprocessing
  4. Gaussian Naive Bayes:
    • Applied with different preprocessing
  5. Multilayer Perceptron:
    • Applied with different preprocessing
  6. Boosting:
    • Based on decision trees
    • Based on Gaussian Naive Bayes
  7. Evaluation:
    • Confusion matrix
    • Receiver operating characteristic (ROC)
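
The outline above can be illustrated with a minimal scikit-learn sketch. It is not the notebook's actual code: the dataset (`load_breast_cancer`), the reading of LDA as `LinearDiscriminantAnalysis`, the number of PCA components, the grid of `n_neighbors` values, and the helper `build_pipeline` are all assumptions chosen only for illustration. It compares KNN (with a small hyperparameter search) and Gaussian Naive Bayes under different preprocessing, then reports a confusion matrix and ROC AUC for each combination.

```python
from sklearn.base import clone
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in dataset (assumption): the project's own data is not named in this README.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical preprocessing variants to compare.
preprocessors = {
    "scaled_only": None,
    "pca": PCA(n_components=5),
    "lda": LinearDiscriminantAnalysis(n_components=1),
}


def build_pipeline(reducer, classifier):
    """Hypothetical helper: scale, optionally reduce dimensions, then classify."""
    steps = [("scale", StandardScaler())]
    if reducer is not None:
        steps.append(("reduce", clone(reducer)))
    steps.append(("clf", classifier))
    return Pipeline(steps)


results = {}
for prep_name, reducer in preprocessors.items():
    # KNN: 5-fold cross-validated search over the number of neighbors.
    knn = GridSearchCV(
        build_pipeline(reducer, KNeighborsClassifier()),
        param_grid={"clf__n_neighbors": [3, 5, 7, 9, 11]},
        cv=5,
    )
    # Gaussian Naive Bayes has no hyperparameters tuned in this sketch.
    gnb = build_pipeline(reducer, GaussianNB())

    for model_name, model in (("knn", knn), ("gnb", gnb)):
        model.fit(X_train, y_train)
        positive_scores = model.predict_proba(X_test)[:, 1]
        results[(prep_name, model_name)] = {
            "confusion_matrix": confusion_matrix(y_test, model.predict(X_test)),
            "roc_auc": roc_auc_score(y_test, positive_scores),
        }

for key, res in results.items():
    print(key, f"ROC AUC = {res['roc_auc']:.3f}")
    print(res["confusion_matrix"])
```

Wrapping each variant in a `Pipeline` keeps the scaler and the dimensionality reduction fitted on the training folds only, so the cross-validated search and the test-set evaluation are not contaminated by test data. The multilayer perceptron and boosting steps could be added to the same loop in the same way.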