1️⃣ What is Machine Learning?
Machine Learning (ML) is a field of AI where computers learn patterns from data instead of being explicitly programmed.
Instead of writing:
if price > 100 → expensive
We give the machine many examples, and it learns the rule itself.
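For instance, here is a minimal, purely illustrative sketch of "learning the rule from examples": instead of hard-coding `price > 100`, we search for the threshold that best matches labeled examples (the data and the `learn_threshold` helper are hypothetical, chosen for illustration):

```python
# Labeled examples: price -> is it expensive?
prices = [20, 50, 90, 110, 150, 300]
labels = [False, False, False, True, True, True]

def learn_threshold(prices, labels):
    """Try a threshold between each pair of neighboring prices
    and keep the one that classifies the most examples correctly."""
    best_t, best_correct = None, -1
    candidates = [(a + b) / 2 for a, b in zip(prices, prices[1:])]
    for t in candidates:
        correct = sum((p > t) == y for p, y in zip(prices, labels))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

threshold = learn_threshold(prices, labels)
print(threshold)  # 100.0 for this data
```

The machine recovered the "price > 100" rule from examples alone, which is the core idea behind everything that follows.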
2️⃣ What is Supervised Learning?
Supervised Learning is a type of Machine Learning where:
The model learns from labeled data.
That means:
Each input has a correct output.
Example:
| Input (X) | Output (Y) |
|-----------|------------|
| House size | House price |
| Email text | Spam / Not Spam |
| Student hours | Exam score |
The goal is to learn a function:
f(X) = Y
There are two main types:
3️⃣ Regression
Regression is used when the output is continuous (a number).
Examples:
- Predicting house prices
- Forecasting tomorrow's temperature
- Estimating a student's exam score
🔹 Common Regression Algorithms
| Algorithm | Description |
|-----------|-------------|
| Linear Regression | Fits a straight line to the data |
| Polynomial Regression | Fits a curved line |
| Decision Tree Regressor | Tree-based model |
| Random Forest Regressor | Many decision trees combined |
| Support Vector Regression (SVR) | Margin-based regression |
🔹 Linear Regression (Most Basic)
Equation of a line:
y = mx + b
In ML form:
y = wX + b
Where:
- `w` is the weight (the slope of the line)
- `b` is the bias (the intercept)

The model learns `w` and `b` by minimizing the error between its predictions and the true values.
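To make "learning w and b" concrete, here is a hand-rolled sketch of the ordinary least-squares solution in NumPy (an illustration of the math; scikit-learn's `LinearRegression` solves this for you):

```python
import numpy as np

# Toy data generated from y = 2x + 1, so we know the true w and b
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * X + 1.0

# Closed-form least squares for one feature:
# w = cov(X, y) / var(X),  b = mean(y) - w * mean(X)
w = np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)
b = y.mean() - w * X.mean()

print(w, b)  # recovers w = 2.0, b = 1.0
```

On noise-free data the fit recovers the generating line exactly; with real, noisy data it finds the line that minimizes the mean squared error.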
4️⃣ Classification
Classification is used when output is categorical.
Examples:
- Spam vs. not spam
- Disease present vs. absent
- Recognizing handwritten digits
🔹 Types of Classification
| Type | Example |
|------|---------|
| Binary | Yes / No |
| Multi-class | Cat / Dog / Horse |
| Multi-label | Multiple tags at once |
🔹 Common Classification Algorithms
| Algorithm | Description |
|-----------|-------------|
| Logistic Regression | Linear classifier |
| K-Nearest Neighbors (KNN) | Distance-based |
| Decision Tree | Tree-based model |
| Random Forest | Ensemble of trees |
| Support Vector Machine (SVM) | Margin classifier |
| Naive Bayes | Probabilistic classifier |
5️⃣ How Supervised Learning Works
Basic steps:
| Step | Description |
|------|-------------|
| 1 | Collect data |
| 2 | Split into training and testing sets |
| 3 | Train the model on the training data |
| 4 | Evaluate on the test data |
| 5 | Improve the model |
6️⃣ Evaluation Metrics (Very Important)
Evaluation tells us how well the model performs on data it has not seen during training.
🔹 Regression Metrics
| Metric | Meaning | Formula Idea |
|--------|---------|--------------|
| MAE | Mean Absolute Error | Average absolute difference |
| MSE | Mean Squared Error | Average squared difference |
| RMSE | Root MSE | Square root of MSE |
| R² Score | Goodness of fit | Variance explained |
Explanation:
- MAE treats every error equally, so it is less sensitive to outliers.
- MSE and RMSE square the errors, so large mistakes are penalized more heavily.
- RMSE is in the same units as the target, which makes it easier to interpret than MSE.
- R² ranges up to 1; the closer to 1, the more of the target's variance the model explains.
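All four regression metrics are easy to compute by hand. A quick NumPy sketch, with small made-up values chosen so the arithmetic is easy to follow:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.5])

errors = y_pred - y_true
mae = np.mean(np.abs(errors))        # average absolute difference
mse = np.mean(errors ** 2)           # average squared difference
rmse = np.sqrt(mse)                  # back in the target's units

ss_res = np.sum(errors ** 2)                       # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
r2 = 1 - ss_res / ss_tot                           # variance explained

print(mae, mse, rmse, r2)  # 0.5  0.375  ~0.612  0.925
```

These match `mean_absolute_error`, `mean_squared_error`, and `r2_score` from scikit-learn, which we use below.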
🔹 Classification Metrics
| Metric | Meaning |
|--------|---------|
| Accuracy | Correct predictions / total |
| Precision | Correct positives / predicted positives |
| Recall | Correct positives / actual positives |
| F1 Score | Harmonic mean of precision & recall |
| Confusion Matrix | Detailed prediction table |
🔹 Confusion Matrix
| | Predicted Positive | Predicted Negative |
|-----------------|---------------------|---------------------|
| Actual Positive | True Positive (TP) | False Negative (FN) |
| Actual Negative | False Positive (FP) | True Negative (TN) |
From this we compute:
Accuracy = (TP + TN) / Total
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 * (Precision * Recall) / (Precision + Recall)
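A quick numeric check of these formulas, using hypothetical counts (TP = 40, FP = 10, FN = 5, TN = 45 are made up for illustration):

```python
TP, FP, FN, TN = 40, 10, 5, 45
total = TP + FP + FN + TN  # 100 predictions in total

accuracy = (TP + TN) / total            # 85 / 100 = 0.85
precision = TP / (TP + FP)              # 40 / 50 = 0.8
recall = TP / (TP + FN)                 # 40 / 45 ~ 0.889
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, round(f1, 3))
```

Notice that precision and recall answer different questions: precision asks "when the model says positive, how often is it right?", while recall asks "of all the real positives, how many did it catch?"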
7️⃣ Python Implementation
We’ll use scikit-learn.
🔹 Install Libraries
```
pip install numpy pandas scikit-learn matplotlib
```
🔹 Example 1: Regression (Linear Regression)
```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Generate sample data: 10 points so the 20% test split
# contains more than one sample (R2 is undefined for a single point)
X = np.arange(1, 11).reshape(-1, 1)
y = 2 * np.arange(1, 11)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on the unseen test set
predictions = model.predict(X_test)

# Evaluate
print("MAE:", mean_absolute_error(y_test, predictions))
print("MSE:", mean_squared_error(y_test, predictions))
print("R2:", r2_score(y_test, predictions))
```
🔹 Example 2: Classification (Logistic Regression)
```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load dataset
data = load_iris()
X = data.data
y = data.target

# Turn it into binary classification (class 0 vs. the rest)
y = (y == 0)

# Train the model
model = LogisticRegression()
model.fit(X, y)

# Predict on the training data, for brevity --
# in practice, evaluate on a held-out test set
predictions = model.predict(X)

# Evaluate
print("Accuracy:", accuracy_score(y, predictions))
print("Confusion Matrix:\n", confusion_matrix(y, predictions))
print("Classification Report:\n", classification_report(y, predictions))
```
8️⃣ What You Should Understand After This Topic
You should now understand:
- The difference between regression (continuous output) and classification (categorical output)
- The common algorithms for each task
- How to evaluate models with MAE, MSE, RMSE, and R² (regression) or accuracy, precision, recall, and F1 (classification)
- How to read a confusion matrix
- How to train and evaluate a model with scikit-learn
FULL COMPILATION OF ALL CODE
```python
# Install:
# pip install numpy pandas scikit-learn matplotlib

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# -----------------------------
# REGRESSION EXAMPLE
# -----------------------------
# 10 points so the 20% test split contains more than one sample
X = np.arange(1, 11).reshape(-1, 1)
y = 2 * np.arange(1, 11)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

reg_model = LinearRegression()
reg_model.fit(X_train, y_train)
reg_predictions = reg_model.predict(X_test)

print("Regression Results")
print("MAE:", mean_absolute_error(y_test, reg_predictions))
print("MSE:", mean_squared_error(y_test, reg_predictions))
print("R2:", r2_score(y_test, reg_predictions))

# -----------------------------
# CLASSIFICATION EXAMPLE
# -----------------------------
data = load_iris()
X = data.data
y = data.target

# Binary classification (class 0 vs. the rest)
y = (y == 0)

clf_model = LogisticRegression()
clf_model.fit(X, y)

# Predictions on the training data, for brevity
clf_predictions = clf_model.predict(X)

print("\nClassification Results")
print("Accuracy:", accuracy_score(y, clf_predictions))
print("Confusion Matrix:\n", confusion_matrix(y, clf_predictions))
print("Classification Report:\n", classification_report(y, clf_predictions))
```