{ "cells": [ { "cell_type": "markdown", "id": "d7d8810c", "metadata": {}, "source": [ "# Python and NumPy Basics for Machine Learning\n", "\n", "This notebook covers the essential Python and NumPy concepts needed for machine learning, extracted from our course slides with additional practical examples." ] }, { "cell_type": "markdown", "id": "9554e58f", "metadata": {}, "source": [ "## 1. Python Basics Review\n", "\n", "Let's start with fundamental Python concepts that we'll use throughout the course." ] }, { "cell_type": "code", "execution_count": null, "id": "75fb2de5", "metadata": {}, "outputs": [], "source": [ "# Variables and Data Types\n", "# Numbers\n", "x = 42 # integer\n", "y = 3.14 # float\n", "z = 2 + 3j # complex number\n", "\n", "print(f\"Integer: {x}, type: {type(x)}\")\n", "print(f\"Float: {y}, type: {type(y)}\")\n", "print(f\"Complex: {z}, type: {type(z)}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "46544990", "metadata": {}, "outputs": [], "source": [ "# Collections - fundamental data structures\n", "my_list = [1, 2, 3, 4] # mutable (can be changed)\n", "my_tuple = (1, 2, 3, 4) # immutable (cannot be changed)\n", "my_dict = {'a': 1, 'b': 2} # key-value pairs\n", "\n", "print(f\"List: {my_list}\")\n", "print(f\"Tuple: {my_tuple}\")\n", "print(f\"Dictionary: {my_dict}\")\n", "\n", "# Demonstrate mutability\n", "my_list[0] = 10 # This works\n", "print(f\"Modified list: {my_list}\")\n", "\n", "# my_tuple[0] = 10 # This would cause an error!" 
] }, { "cell_type": "code", "execution_count": null, "id": "823bb277", "metadata": {}, "outputs": [], "source": [ "# Control Flow - loops and conditionals\n", "print(\"Checking the numbers 0 to 4:\")\n", "for i in range(5):\n", "    if i % 2 == 0:\n", "        print(f\"{i} is even\")\n", "    else:\n", "        print(f\"{i} is odd\")" ] }, { "cell_type": "code", "execution_count": null, "id": "0e936848", "metadata": {}, "outputs": [], "source": [ "# List comprehensions - a Pythonic way to create lists\n", "numbers = [1, 2, 3, 4, 5]\n", "squared = [x**2 for x in numbers]\n", "\n", "print(f\"Original: {numbers}\")\n", "print(f\"Squared: {squared}\")\n", "\n", "# More complex example: filter even numbers and square them\n", "even_squared = [x**2 for x in numbers if x % 2 == 0]\n", "print(f\"Even numbers squared: {even_squared}\")" ] }, { "cell_type": "markdown", "id": "fd85176f", "metadata": {}, "source": [ "## 2. Introduction to NumPy\n", "\n", "NumPy is the foundation of scientific computing in Python. It provides efficient operations on arrays of numbers." ] }, { "cell_type": "code", "execution_count": null, "id": "3d69b80f", "metadata": {}, "outputs": [], "source": [ "# Import NumPy (standard convention)\n", "import numpy as np\n", "\n", "# Check NumPy version\n", "print(f\"NumPy version: {np.__version__}\")\n", "\n", "# Why NumPy? 
Compare a Python list with the equivalent array\n", "python_list = [1, 2, 3, 4]\n", "numpy_array = np.array([1, 2, 3, 4])\n", "\n", "print(f\"Python list: {python_list}\")\n", "print(f\"NumPy array: {numpy_array}\")\n", "print(f\"Array type: {type(numpy_array)}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "9e456ac0", "metadata": {}, "outputs": [], "source": [ "# Vectorized operations - the power of NumPy\n", "python_list = [1, 2, 3, 4]\n", "numpy_array = np.array([1, 2, 3, 4])\n", "\n", "# With Python lists, you need a loop\n", "python_result = []\n", "for x in python_list:\n", "    python_result.append(x * 2)\n", "print(f\"Python way: {python_result}\")\n", "\n", "# With NumPy, apply the operation to the entire array at once\n", "numpy_result = numpy_array * 2\n", "print(f\"NumPy way: {numpy_result}\")\n", "\n", "# The NumPy version runs in compiled code, so it is typically much faster for large arrays" ] }, { "cell_type": "markdown", "id": "9e61deb9", "metadata": {}, "source": [ "## 3. Creating NumPy Arrays\n", "\n", "There are many ways to create NumPy arrays depending on your needs." 
] }, { "cell_type": "code", "execution_count": null, "id": "e60f6310", "metadata": {}, "outputs": [], "source": [ "# Different ways to create arrays\n", "a = np.array([1, 2, 3, 4]) # From a list\n", "b = np.zeros(5) # Array of zeros\n", "c = np.ones((2, 3)) # 2x3 array of ones\n", "d = np.arange(0, 10, 2) # [0, 2, 4, 6, 8] - start, stop, step\n", "e = np.linspace(0, 1, 5) # 5 evenly spaced points from 0 to 1\n", "f = np.random.random((3, 3)) # Random 3x3 matrix\n", "\n", "print(\"From list:\", a)\n", "print(\"Zeros:\", b)\n", "print(\"Ones (2x3):\")\n", "print(c)\n", "print(\"Range with step:\", d)\n", "print(\"Linspace:\", e)\n", "print(\"Random 3x3:\")\n", "print(f)" ] }, { "cell_type": "code", "execution_count": null, "id": "24229810", "metadata": {}, "outputs": [], "source": [ "# Array properties - important to understand your data\n", "array_2d = np.array([[1, 2, 3], [4, 5, 6]])\n", "\n", "print(f\"Array:\")\n", "print(array_2d)\n", "print(f\"Shape: {array_2d.shape}\") # Dimensions: (rows, columns)\n", "print(f\"Data type: {array_2d.dtype}\") # Type of elements\n", "print(f\"Number of dimensions: {array_2d.ndim}\") # 1D, 2D, 3D, etc.\n", "print(f\"Total elements: {array_2d.size}\") # Total number of elements\n", "print(f\"Memory usage: {array_2d.nbytes} bytes\") # Memory consumption" ] }, { "cell_type": "markdown", "id": "40ae27de", "metadata": {}, "source": [ "## 4. Basic Array Operations\n", "\n", "NumPy allows element-wise operations and mathematical functions on entire arrays." 
] }, { "cell_type": "code", "execution_count": null, "id": "c60ff920", "metadata": {}, "outputs": [], "source": [ "# Basic arithmetic operations\n", "a = np.array([1, 2, 3, 4])\n", "b = np.array([5, 6, 7, 8])\n", "\n", "print(f\"a = {a}\")\n", "print(f\"b = {b}\")\n", "print(f\"a + b = {a + b}\") # Element-wise addition\n", "print(f\"a - b = {a - b}\") # Element-wise subtraction\n", "print(f\"a * b = {a * b}\") # Element-wise multiplication (NOT matrix multiplication)\n", "print(f\"a / b = {a / b}\") # Element-wise division\n", "print(f\"a ** 2 = {a ** 2}\") # Element-wise power" ] }, { "cell_type": "code", "execution_count": null, "id": "1e1e55c5", "metadata": {}, "outputs": [], "source": [ "# Mathematical functions\n", "a = np.array([1, 4, 9, 16])\n", "\n", "print(f\"Original array: {a}\")\n", "print(f\"Square root: {np.sqrt(a)}\")\n", "print(f\"Exponential: {np.exp(a)}\")\n", "print(f\"Natural log: {np.log(a)}\")\n", "print(f\"Sine: {np.sin(a)}\")\n", "print(f\"Cosine: {np.cos(a)}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "d5233378", "metadata": {}, "outputs": [], "source": [ "# Statistical operations\n", "data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])\n", "\n", "print(f\"Data: {data}\")\n", "print(f\"Sum: {np.sum(data)}\")\n", "print(f\"Mean: {np.mean(data)}\")\n", "print(f\"Standard deviation: {np.std(data)}\")\n", "print(f\"Minimum: {np.min(data)}\")\n", "print(f\"Maximum: {np.max(data)}\")\n", "print(f\"Median: {np.median(data)}\")" ] }, { "cell_type": "markdown", "id": "9fc0af85", "metadata": {}, "source": [ "## 5. Array Indexing and Slicing\n", "\n", "Accessing and modifying array elements is crucial for data manipulation." 
] }, { "cell_type": "code", "execution_count": null, "id": "6dea63f6", "metadata": {}, "outputs": [], "source": [ "# 1D array indexing\n", "arr = np.array([10, 20, 30, 40, 50])\n", "\n", "print(f\"Array: {arr}\")\n", "print(f\"First element (index 0): {arr[0]}\")\n", "print(f\"Last element (index -1): {arr[-1]}\")\n", "print(f\"Second to fourth (index 1:4): {arr[1:4]}\")\n", "print(f\"Every other element: {arr[::2]}\")\n", "print(f\"Reverse array: {arr[::-1]}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "7f17b163", "metadata": {}, "outputs": [], "source": [ "# 2D array indexing\n", "matrix = np.array([[1, 2, 3], \n", " [4, 5, 6], \n", " [7, 8, 9]])\n", "\n", "print(\"Matrix:\")\n", "print(matrix)\n", "print(f\"Element at row 0, column 1: {matrix[0, 1]}\")\n", "print(f\"First row: {matrix[0, :]}\")\n", "print(f\"First column: {matrix[:, 0]}\")\n", "print(f\"2x2 submatrix (top-left):\")\n", "print(matrix[:2, :2])" ] }, { "cell_type": "code", "execution_count": null, "id": "6c032d42", "metadata": {}, "outputs": [], "source": [ "# Boolean indexing - very powerful for data filtering\n", "data = np.array([1, 5, 3, 8, 2, 9, 4])\n", "\n", "print(f\"Original data: {data}\")\n", "\n", "# Create boolean mask\n", "mask = data > 5\n", "print(f\"Mask (elements > 5): {mask}\")\n", "\n", "# Apply mask to get elements\n", "large_values = data[mask]\n", "print(f\"Values > 5: {large_values}\")\n", "\n", "# Can do it in one line\n", "small_values = data[data <= 3]\n", "print(f\"Values <= 3: {small_values}\")" ] }, { "cell_type": "markdown", "id": "7409d5ed", "metadata": {}, "source": [ "## 6. Array Reshaping and Broadcasting\n", "\n", "Understanding shapes and how arrays interact is crucial for machine learning." 
] }, { "cell_type": "code", "execution_count": null, "id": "0dfb45d5", "metadata": {}, "outputs": [], "source": [ "# Reshaping arrays\n", "original = np.arange(12) # [0, 1, 2, ..., 11]\n", "print(f\"Original array: {original}\")\n", "print(f\"Shape: {original.shape}\")\n", "\n", "# Reshape to 2D\n", "matrix_3x4 = original.reshape(3, 4)\n", "print(f\"\\nReshaped to 3x4:\")\n", "print(matrix_3x4)\n", "\n", "# Reshape to different dimensions\n", "matrix_2x6 = original.reshape(2, 6)\n", "print(f\"\\nReshaped to 2x6:\")\n", "print(matrix_2x6)\n", "\n", "# Flatten back to 1D\n", "flattened = matrix_3x4.flatten()\n", "print(f\"\\nFlattened: {flattened}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "9c59423f", "metadata": {}, "outputs": [], "source": [ "# Broadcasting - performing operations on arrays of different shapes\n", "\n", "# Scalar with array\n", "arr = np.array([1, 2, 3, 4])\n", "result = arr + 10 # Adds 10 to each element\n", "print(f\"Array: {arr}\")\n", "print(f\"Array + 10: {result}\")\n", "\n", "# Array with smaller array\n", "matrix = np.array([[1, 2, 3], \n", " [4, 5, 6]])\n", "vector = np.array([10, 20, 30])\n", "\n", "print(f\"\\nMatrix shape: {matrix.shape}\")\n", "print(matrix)\n", "print(f\"\\nVector shape: {vector.shape}\")\n", "print(vector)\n", "\n", "# Broadcasting: vector is added to each row of matrix\n", "broadcast_result = matrix + vector\n", "print(f\"\\nResult of matrix + vector:\")\n", "print(broadcast_result)" ] }, { "cell_type": "markdown", "id": "2209244b", "metadata": {}, "source": [ "## 7. Working with Multi-dimensional Arrays\n", "\n", "Real-world data often comes in higher dimensions (images, time series, etc.)." 
] }, { "cell_type": "code", "execution_count": null, "id": "0c408064", "metadata": {}, "outputs": [], "source": [ "# Creating and working with 3D arrays\n", "# Think of this as a stack of 2D matrices\n", "array_3d = np.random.randint(0, 10, size=(2, 3, 4)) # 2 matrices of 3x4\n", "\n", "print(f\"3D array shape: {array_3d.shape}\")\n", "print(f\"3D array:\")\n", "print(array_3d)\n", "\n", "# Access different parts\n", "print(f\"\\nFirst matrix (index 0):\")\n", "print(array_3d[0])\n", "\n", "print(f\"\\nElement at position [1, 2, 3]: {array_3d[1, 2, 3]}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "23bcc3b1", "metadata": {}, "outputs": [], "source": [ "# Operations along specific axes\n", "matrix = np.array([[1, 2, 3], \n", " [4, 5, 6], \n", " [7, 8, 9]])\n", "\n", "print(\"Matrix:\")\n", "print(matrix)\n", "\n", "# Sum along different axes\n", "print(f\"\\nSum of all elements: {np.sum(matrix)}\")\n", "print(f\"Sum along axis 0 (columns): {np.sum(matrix, axis=0)}\")\n", "print(f\"Sum along axis 1 (rows): {np.sum(matrix, axis=1)}\")\n", "\n", "# Mean along axes\n", "print(f\"\\nMean along axis 0 (columns): {np.mean(matrix, axis=0)}\")\n", "print(f\"Mean along axis 1 (rows): {np.mean(matrix, axis=1)}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "2f2dd5dd", "metadata": {}, "outputs": [], "source": [ "# Array concatenation and splitting\n", "arr1 = np.array([1, 2, 3])\n", "arr2 = np.array([4, 5, 6])\n", "\n", "# Concatenate along different axes\n", "concat_horizontal = np.concatenate([arr1, arr2])\n", "print(f\"Horizontal concatenation: {concat_horizontal}\")\n", "\n", "# For 2D arrays\n", "mat1 = np.array([[1, 2], [3, 4]])\n", "mat2 = np.array([[5, 6], [7, 8]])\n", "\n", "# Stack vertically (along rows)\n", "vertical_stack = np.vstack([mat1, mat2])\n", "print(f\"\\nVertical stack:\")\n", "print(vertical_stack)\n", "\n", "# Stack horizontally (along columns)\n", "horizontal_stack = np.hstack([mat1, mat2])\n", "print(f\"\\nHorizontal 
stack:\")\n", "print(horizontal_stack)" ] }, { "cell_type": "markdown", "id": "d30feefb", "metadata": {}, "source": [ "## 8. Linear Algebra with NumPy\n", "\n", "Essential operations for machine learning algorithms." ] }, { "cell_type": "code", "execution_count": null, "id": "2a1db220", "metadata": {}, "outputs": [], "source": [ "# Vector operations\n", "a = np.array([2, 4, 6])\n", "b = np.array([1, 3, 5])\n", "\n", "print(f\"Vector a: {a}\")\n", "print(f\"Vector b: {b}\")\n", "\n", "# Dot product (very important in ML)\n", "dot_product = np.dot(a, b)\n", "print(f\"Dot product: {dot_product}\")\n", "\n", "# Vector magnitude (length)\n", "magnitude_a = np.linalg.norm(a)\n", "magnitude_b = np.linalg.norm(b)\n", "print(f\"Magnitude of a: {magnitude_a:.2f}\")\n", "print(f\"Magnitude of b: {magnitude_b:.2f}\")\n", "\n", "# Unit vector (normalized)\n", "unit_vector_a = a / magnitude_a\n", "print(f\"Unit vector a: {unit_vector_a}\")\n", "print(f\"Magnitude of unit vector: {np.linalg.norm(unit_vector_a):.2f}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "ae094ffb", "metadata": {}, "outputs": [], "source": [ "# Matrix operations\n", "A = np.array([[1, 2], \n", " [3, 4], \n", " [5, 6]])\n", "B = np.array([[7, 8], \n", " [9, 10]])\n", "\n", "print(\"Matrix A (3x2):\")\n", "print(A)\n", "print(\"\\nMatrix B (2x2):\")\n", "print(B)\n", "\n", "# Matrix multiplication (different from element-wise multiplication)\n", "matrix_mult = np.dot(A, B) # or A @ B\n", "print(f\"\\nMatrix multiplication A @ B:\")\n", "print(matrix_mult)\n", "\n", "# Transpose\n", "A_transpose = A.T\n", "print(f\"\\nTranspose of A:\")\n", "print(A_transpose)\n", "print(f\"Shape changed from {A.shape} to {A_transpose.shape}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "37aefbf1", "metadata": {}, "outputs": [], "source": [ "# Example: Simple linear regression setup\n", "# This demonstrates how linear algebra is used in ML\n", "\n", "# Generate sample data\n", 
"np.random.seed(42) # For reproducible results\n", "n_samples, n_features = 100, 3\n", "\n", "# Feature matrix X (each row is a data point)\n", "X = np.random.randn(n_samples, n_features)\n", "\n", "# True weights (what we want to learn)\n", "true_weights = np.array([1.5, -2.0, 0.5])\n", "\n", "# Generate target values with some noise\n", "noise = np.random.randn(n_samples) * 0.1\n", "y = X @ true_weights + noise # @ is matrix multiplication\n", "\n", "print(f\"Data shape: {X.shape}\")\n", "print(f\"Target shape: {y.shape}\")\n", "print(f\"True weights: {true_weights}\")\n", "\n", "# Add bias term (intercept)\n", "X_with_bias = np.column_stack([np.ones(n_samples), X])\n", "print(f\"X with bias shape: {X_with_bias.shape}\")\n", "\n", "# Analytical solution: w = (X^T X)^(-1) X^T y\n", "XTX_inv = np.linalg.inv(X_with_bias.T @ X_with_bias)\n", "estimated_weights = XTX_inv @ X_with_bias.T @ y\n", "\n", "print(f\"\\nEstimated weights (with bias): {estimated_weights}\")\n", "print(f\"True weights (with bias=0): [0, {true_weights[0]}, {true_weights[1]}, {true_weights[2]}]\")\n", "print(f\"Error: {np.abs(estimated_weights[1:] - true_weights)}\")" ] }, { "cell_type": "markdown", "id": "801f8b05", "metadata": {}, "source": [ "## 9. Practical Examples for Machine Learning\n", "\n", "Common data preprocessing tasks using NumPy." 
] }, { "cell_type": "code", "execution_count": null, "id": "7b6483f6", "metadata": {}, "outputs": [], "source": [ "# Data normalization - important preprocessing step\n", "# Generate sample dataset\n", "np.random.seed(42)\n", "data = np.random.randn(50, 3) * [10, 100, 0.1] + [50, 500, 5]\n", "\n", "print(\"Original data statistics:\")\n", "print(f\"Mean: {data.mean(axis=0)}\")\n", "print(f\"Std: {data.std(axis=0)}\")\n", "print(f\"Min: {data.min(axis=0)}\")\n", "print(f\"Max: {data.max(axis=0)}\")\n", "\n", "# Z-score normalization (zero mean, unit variance)\n", "data_zscore = (data - data.mean(axis=0)) / data.std(axis=0)\n", "\n", "print(\"\\nAfter Z-score normalization:\")\n", "print(f\"Mean: {data_zscore.mean(axis=0)}\")\n", "print(f\"Std: {data_zscore.std(axis=0)}\")\n", "\n", "# Min-Max normalization (scale to [0, 1])\n", "data_minmax = (data - data.min(axis=0)) / (data.max(axis=0) - data.min(axis=0))\n", "\n", "print(\"\\nAfter Min-Max normalization:\")\n", "print(f\"Min: {data_minmax.min(axis=0)}\")\n", "print(f\"Max: {data_minmax.max(axis=0)}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "0a5c481e", "metadata": {}, "outputs": [], "source": [ "# Train-test split implementation\n", "def train_test_split_numpy(X, y, test_size=0.2, random_state=None):\n", "    \"\"\"Simple train-test split using NumPy\"\"\"\n", "    # Compare against None so that random_state=0 is honored as a valid seed\n", "    if random_state is not None:\n", "        np.random.seed(random_state)\n", "\n", "    n_samples = len(X)\n", "    n_test = int(n_samples * test_size)\n", "\n", "    # Random permutation of indices\n", "    indices = np.random.permutation(n_samples)\n", "\n", "    # Split indices\n", "    test_idx = indices[:n_test]\n", "    train_idx = indices[n_test:]\n", "\n", "    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]\n", "\n", "# Example usage\n", "X = np.random.randn(100, 3)\n", "y = np.random.randint(0, 2, 100)\n", "\n", "X_train, X_test, y_train, y_test = train_test_split_numpy(X, y, test_size=0.2, random_state=42)\n", "\n", "print(f\"Original data: {len(X)} 
samples\")\n", "print(f\"Training set: {len(X_train)} samples\")\n", "print(f\"Test set: {len(X_test)} samples\")\n", "print(f\"Test ratio: {len(X_test) / len(X):.1%}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "327aee79", "metadata": {}, "outputs": [], "source": [ "# Computing basic statistics for analysis\n", "# Generate sample dataset\n", "np.random.seed(42)\n", "dataset = np.random.randn(200, 4)\n", "\n", "print(\"Dataset shape:\", dataset.shape)\n", "print(\"\\nBasic statistics:\")\n", "print(f\"Mean of each feature: {dataset.mean(axis=0)}\")\n", "print(f\"Standard deviation: {dataset.std(axis=0)}\")\n", "print(f\"Variance: {dataset.var(axis=0)}\")\n", "print(f\"Minimum values: {dataset.min(axis=0)}\")\n", "print(f\"Maximum values: {dataset.max(axis=0)}\")\n", "\n", "# Correlation matrix between features\n", "correlation_matrix = np.corrcoef(dataset.T) # Transpose so each row is a feature (np.corrcoef treats rows as variables)\n", "print(f\"\\nCorrelation matrix:\")\n", "print(correlation_matrix)\n", "\n", "# Find highly correlated feature pairs (|correlation| > 0.5)\n", "# Keep only the upper triangle (k=1) to skip the diagonal and avoid listing each pair twice\n", "high_corr_mask = np.triu(np.abs(correlation_matrix) > 0.5, k=1)\n", "\n", "high_corr_pairs = np.where(high_corr_mask)\n", "if len(high_corr_pairs[0]) > 0:\n", "    print(f\"\\nHighly correlated feature pairs:\")\n", "    for i, j in zip(high_corr_pairs[0], high_corr_pairs[1]):\n", "        print(f\"Features {i} and {j}: correlation = {correlation_matrix[i, j]:.3f}\")\n", "else:\n", "    print(\"\\nNo highly correlated features found.\")" ] }, { "cell_type": "markdown", "id": "b77f26e1", "metadata": {}, "source": [ "## 10. Summary and Next Steps\n", "\n", "You now have the essential Python and NumPy skills needed for machine learning!" ] }, { "cell_type": "code", "execution_count": null, "id": "5c432506", "metadata": {}, "outputs": [], "source": [ "# Quick review: What we've covered\n", "print(\"Python and NumPy Basics - Summary:\")\n", "print(\"\\n1. 
Python fundamentals:\")\n", "print(\" - Variables and data types\")\n", "print(\" - Lists, tuples, dictionaries\")\n", "print(\" - Control flow and list comprehensions\")\n", "\n", "print(\"\\n2. NumPy essentials:\")\n", "print(\" - Array creation and properties\")\n", "print(\" - Vectorized operations\")\n", "print(\" - Indexing and slicing\")\n", "print(\" - Broadcasting\")\n", "print(\" - Linear algebra operations\")\n", "\n", "print(\"\\n3. ML preprocessing:\")\n", "print(\" - Data normalization\")\n", "print(\" - Train-test splitting\")\n", "print(\" - Statistical analysis\")\n", "\n", "print(\"\\nYou're ready for machine learning algorithms!\")" ] }, { "cell_type": "code", "execution_count": null, "id": "e80be967", "metadata": {}, "outputs": [], "source": [ "# Test your understanding - try these exercises:\n", "\n", "# Exercise 1: Create a 5x5 matrix of random numbers and find:\n", "# - The sum of each row\n", "# - The maximum value in each column\n", "# - All values greater than 0.5\n", "\n", "print(\"Exercise 1:\")\n", "matrix = np.random.random((5, 5))\n", "print(\"Random 5x5 matrix:\")\n", "print(matrix)\n", "print(f\"Sum of each row: {matrix.sum(axis=1)}\")\n", "print(f\"Max of each column: {matrix.max(axis=0)}\")\n", "print(f\"Values > 0.5: {matrix[matrix > 0.5]}\")\n", "print(f\"Number of values > 0.5: {np.sum(matrix > 0.5)}\")\n", "\n", "# Exercise 2: Normalize a dataset and verify the result (mean close to 0, std close to 1)\n", "print(\"\\nExercise 2:\")\n", "data = np.random.randn(100, 3) * [5, 10, 2] + [10, 50, 5]\n", "normalized = (data - data.mean(axis=0)) / data.std(axis=0)\n", "print(f\"Original mean: {data.mean(axis=0)}\")\n", "print(f\"Normalized mean: {normalized.mean(axis=0)}\")\n", "print(f\"Normalized std: {normalized.std(axis=0)}\")" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 5 }