Sure 😃, here is a systematic, detailed walkthrough of **`torch.nn`**: it is essentially the core module for building neural networks in PyTorch.

---

# 🔹 1. What is `torch.nn`?

* `torch.nn` stands for **neural network**.
* It provides the building blocks for networks: **layers** (modules), **containers**, **loss functions**, and utility functions.
* Core idea: treat a network as a composition of **modules (`Module`)**; each module can hold parameters (weights, biases) and can nest other modules.

---
# 🔹 2. The core class: `nn.Module`

The base class for all neural networks; almost everything (layers, models, loss functions) inherits from it.

### Typical usage

```python
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)  # 10 input features -> 20 output features
        self.fc2 = nn.Linear(20, 1)   # 20 input features -> 1 output feature

    def forward(self, x):  # defines the forward pass
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = MyModel()
```

👉 Every model needs to define `__init__` and `forward`.
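Once defined, the model is used by calling it directly; a minimal sketch (the batch size 4 is an arbitrary assumption):

```python
x = torch.randn(4, 10)  # a batch of 4 samples, 10 features each (assumed sizes)
out = model(x)          # calling the module invokes forward() via __call__
print(out.shape)        # torch.Size([4, 1])
```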
---

# 🔹 3. Common layers

`torch.nn` ships many common layers. The usual categories:
### (1) Fully connected layer (Linear)

```python
nn.Linear(in_features, out_features)
```

* Just a matrix multiplication plus a bias (`y = x @ W.T + b`)
* The workhorse of MLPs
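A minimal shape sketch (all sizes here are arbitrary assumptions):

```python
import torch
import torch.nn as nn

fc = nn.Linear(10, 20)   # weight: (20, 10), bias: (20,)
x = torch.randn(5, 10)   # batch of 5 samples
y = fc(x)
print(y.shape)           # torch.Size([5, 20])
```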
---

### (2) Convolutional layers (CNN)

```python
nn.Conv1d, nn.Conv2d, nn.Conv3d
```

* The core of convolutional neural networks
* Used to extract spatial / temporal features
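A minimal sketch of how `nn.Conv2d` transforms a batch of images (channel counts and image size are assumptions):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
x = torch.randn(8, 3, 32, 32)  # (batch, channels, height, width), assumed sizes
y = conv(x)
print(y.shape)                 # torch.Size([8, 16, 32, 32]); padding=1 keeps H and W
```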
---

### (3) Recurrent networks (RNN / LSTM / GRU)

```python
nn.RNN, nn.LSTM, nn.GRU
```

* Used for sequence data (text, time series)
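A minimal sketch with assumed sizes; note that `nn.LSTM` returns `(output, (h_n, c_n))`:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=32, batch_first=True)
x = torch.randn(4, 15, 10)   # (batch, seq_len, features), assumed sizes
output, (h_n, c_n) = lstm(x)
print(output.shape)          # torch.Size([4, 15, 32]), hidden state at every step
print(h_n.shape)             # torch.Size([1, 4, 32]), final hidden state
```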
---

### (4) Normalization layers

```python
nn.BatchNorm1d, nn.BatchNorm2d, nn.LayerNorm
```

* Make training more stable and speed up convergence
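A minimal sketch of the two most common variants (sizes are assumptions); roughly speaking, BatchNorm normalizes each channel across the batch, while LayerNorm normalizes the features of each sample independently:

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(16)           # one mean/var per channel, computed over the batch
ln = nn.LayerNorm(64)             # normalizes the last dimension of each sample

img = torch.randn(8, 16, 32, 32)  # (batch, channels, H, W), assumed sizes
seq = torch.randn(8, 10, 64)      # (batch, seq_len, features), assumed sizes

print(bn(img).shape)              # torch.Size([8, 16, 32, 32])
print(ln(seq).shape)              # torch.Size([8, 10, 64])
```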
---

### (5) Regularization layers

```python
nn.Dropout(p=0.5)
```

* Randomly "drops" (zeroes) activations to prevent overfitting
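One gotcha worth a sketch: dropout is only active in training mode, so remember to switch with `train()` / `eval()`:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 8)

drop.train()
print(drop(x))  # roughly half the entries zeroed, survivors scaled by 1/(1-p) = 2.0

drop.eval()
print(drop(x))  # identity: all ones, dropout disabled
```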
---

# 🔹 4. Common containers

Containers combine multiple layers into a single module.
### (1) Sequential

```python
model = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 1)
)
```

* Stacks layers in order; good for simple feed-forward models.
### (2) ModuleList / ModuleDict

```python
self.layers = nn.ModuleList([nn.Linear(10, 20), nn.Linear(20, 30)])
self.dict = nn.ModuleDict({
    "fc1": nn.Linear(10, 20),
    "fc2": nn.Linear(20, 1)
})
```

* More flexible: modules can be composed dynamically, and unlike a plain Python list they properly register their parameters with the parent model.
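The usual pattern is to iterate a `ModuleList` inside `forward`; a minimal sketch (layer count and sizes are assumptions):

```python
import torch
import torch.nn as nn

class DeepNet(nn.Module):
    def __init__(self):
        super().__init__()
        # a plain Python list here would NOT register the parameters
        self.layers = nn.ModuleList([nn.Linear(10, 10) for _ in range(3)])

    def forward(self, x):
        for layer in self.layers:
            x = torch.relu(layer(x))
        return x

net = DeepNet()
print(net(torch.randn(2, 10)).shape)  # torch.Size([2, 10])
```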
---

# 🔹 5. Common loss functions

`torch.nn` provides many common losses:
* **Regression**

```python
nn.MSELoss()  # mean squared error
nn.L1Loss()   # mean absolute error
```

* **Classification**

```python
nn.CrossEntropyLoss()  # multi-class cross entropy
nn.BCELoss()           # binary cross entropy
nn.NLLLoss()           # negative log likelihood
```

* **Other**

```python
nn.SmoothL1Loss()  # Huber loss
```
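The shape conventions trip people up, so here is a minimal sketch for `nn.CrossEntropyLoss` (sizes are assumptions): it expects raw logits and integer class indices, not softmax outputs or one-hot vectors:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(4, 3)            # (batch, num_classes), raw scores: no softmax
targets = torch.tensor([0, 2, 1, 2])  # (batch,), integer class indices
loss = criterion(logits, targets)
print(loss.item())                    # a scalar
```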
---

# 🔹 6. Activation functions

Activation functions also live in `torch.nn`, in two flavors: **class** (module) and **functional**:
* Class form (modules)

```python
nn.ReLU(), nn.Sigmoid(), nn.Tanh()
```

* Function form (functional interface)

```python
torch.relu(x), torch.sigmoid(x), torch.tanh(x)
```

👉 Rule of thumb: use the class form inside `nn.Sequential` and the function form inside `forward`.
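Both styles compute exactly the same thing; a minimal side-by-side sketch:

```python
import torch
import torch.nn as nn

x = torch.randn(2, 5)

act = nn.ReLU()      # class form: a module you can place in a container
y1 = act(x)

y2 = torch.relu(x)   # function form: a plain call, handy inside forward()

print(torch.equal(y1, y2))  # True
```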
---

# 🔹 7. The standard workflow

A typical training step:
```python
import torch
import torch.nn as nn
import torch.optim as optim

# 1. Define the model
model = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 1)
)

# 2. Define the loss function
criterion = nn.MSELoss()

# 3. Define the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01)

# 4. Forward pass + backward pass + update
x = torch.randn(5, 10)  # input (batch=5, features=10)
y = torch.randn(5, 1)   # targets

output = model(x)            # forward pass
loss = criterion(output, y)  # compute the loss
optimizer.zero_grad()        # clear old gradients
loss.backward()              # backpropagate
optimizer.step()             # update the parameters
```
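In practice that step sits inside a loop, and inference happens with gradients disabled; a minimal sketch (the epoch count and reusing the same `x`, `y` batch are assumptions, real code would iterate over a DataLoader):

```python
for epoch in range(100):  # assumed epoch count
    output = model(x)
    loss = criterion(output, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# inference: switch to eval mode and disable gradient tracking
model.eval()
with torch.no_grad():
    pred = model(torch.randn(1, 10))
```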
---

# 🔹 8. Quick memory aids

* **Every layer, loss function, and network is a subclass of `nn.Module`**
* **Training is three steps: forward → loss → backward**
* **When composing modules, work out the input/output shapes first**

---

Want me to put together a **torch.nn mind map (layers → containers → loss functions → activation functions → workflow)** so you can remember it all from a single picture?