  • Course Intro
    Machine Learning Theory and Practice 2022. 5. 4. 13:43

    In this course we will learn the following topics and the mathematical principles behind them.

     

    1. Supervised Learning
      - Linear Regression
      - Logistic Regression
      - Decision Tree
      - Ensemble methods
      - Support Vector Machines
    2. Unsupervised Learning
      - Clustering: Hierarchical Clustering, DBSCAN, K-Means, GMM
      - Dimensionality Reduction: PCA

     

    Definition of AI, ML, DL

    AI: a "smart" or "intelligent" computer / Rule-based approaches vs Learning-based approaches

    ML: "a machine learns" / Supervised learning vs Unsupervised learning vs Reinforcement learning

    Supervised Learning: Regression problem vs Classification problem

     

    Purpose:

    To find the best relationship between the IVs (independent variables) and the DV (dependent variable) =

    finding the optimal values of the parameters =

    minimizing the errors of the function =

    minimizing the value of the cost function (MSE for regression, cross-entropy for classification)

    Two ways to minimize MSE: 1) the normal equation (closed-form solution) 2) gradient descent (assumption: the cost function is convex)
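
    The two ways of minimizing MSE can be compared on a toy problem. The following is a minimal sketch, assuming a linear model with one feature and synthetic data; the variable names, the learning rate, and the number of iterations are illustrative choices, not values from the course. Because the MSE of a linear model is convex, both approaches should arrive at essentially the same parameters.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2 + 3x + noise
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=200)

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones_like(x), x])


def mse(theta):
    """Mean squared error of the linear model X @ theta."""
    residual = X @ theta - y
    return float(np.mean(residual ** 2))


# 1) Normal equation: solve (X^T X) theta = X^T y in one step
theta_ne = np.linalg.solve(X.T @ X, X.T @ y)

# 2) Gradient descent: repeatedly step against the gradient of the MSE
theta_gd = np.zeros(2)
learning_rate = 0.1  # hypothetical value that works for this toy problem
for _ in range(2000):
    gradient = (2.0 / len(y)) * X.T @ (X @ theta_gd - y)
    theta_gd -= learning_rate * gradient

print("normal equation :", theta_ne, "MSE:", mse(theta_ne))
print("gradient descent:", theta_gd, "MSE:", mse(theta_gd))
```

    The normal equation needs no learning rate but requires solving a linear system in the number of features, while gradient descent only needs the gradient and so also applies to cost functions without a closed-form solution.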

     

    Overall Procedure:

    1. Data preparation: load the data as a NumPy array or a pandas DataFrame

    * to use categorical (string) variables, encode them with "OrdinalEncoder" or "pd.get_dummies" here

    * to handle an imbalanced classification problem, use a "SMOTE"-related oversampling method (resample only the training portion, never the test data)

    2. Splitting data: split the data into training data and test data

    3. Normalization: apply feature scaling if necessary

    4. Learning: fit the model to obtain the optimal parameter values

    * to reduce overfitting, use "Lasso" (L1) or "Ridge" (L2) regularization here

    * to handle an imbalanced classification problem, use a class "weight" related method here

    * for hyperparameter tuning, use "cross validation" or "grid search" here

    5. Evaluating: check the performance of the model on the test data

    6. Refining the model: if the performance is not good enough, revise the model (e.g. its features, hyperparameters, or algorithm) and repeat the steps above
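
    The procedure above can be sketched end to end with pandas and scikit-learn. This is only an illustration under stated assumptions: the data set is synthetic, the column names ("age", "income", "city") are hypothetical, and logistic regression is used as an example model. SMOTE lives in the separate imbalanced-learn package and is left out here; the sketch uses the class-weight approach instead.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 1. Data preparation: synthetic, imbalanced data (hypothetical columns)
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.normal(40, 10, n),
    "income": rng.normal(50, 15, n),
    "city": rng.choice(["A", "B", "C"], n),
})
score = 0.08 * (df["age"] - 40) + 0.05 * (df["income"] - 50) + rng.normal(0, 1, n)
y = (score > 1.2).astype(int)  # positives are the minority class

# One-hot encode the string column
X = pd.get_dummies(df, columns=["city"], dtype=float)

# 2. Splitting data: stratify so both parts keep the class ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 3. + 4. Normalization and learning in one pipeline, so the scaler is
# fitted on training folds only. LogisticRegression applies L2
# (Ridge-style) regularization by default; C is the inverse strength.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(model, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# 5. Evaluating: performance on data the model has never seen
y_pred = search.predict(X_test)
test_f1 = f1_score(y_test, y_pred)
print("best params:", search.best_params_)
print("test F1    :", test_f1)
```

    If the test score is unsatisfactory, step 6 amounts to changing something in this script (the features, the parameter grid, or the model itself) and running it again.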

     
