e2hang a5fdeaf70e Pytorch Added

2025-09-09 15:10:57 +08:00

Sure 😃, let me walk you through torch.nn systematically and in detail. It is essentially the core module for building neural networks in PyTorch.

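🔹 1. What is `torch.nn`?

The `nn` in torch.nn is short for neural network.
Purpose: it provides the modules (layers), containers, loss functions, and utilities needed to build neural networks.
Core idea: a network is a composition of modules (Module). Each module can hold parameters (weights, biases) and can also nest other modules.

🔹 2. Core class: `nn.Module`

The base class for all neural networks. Almost everything (layers, models, loss functions) inherits from it.

Typical usage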

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)  # 10 input features -> 20 output features
        self.fc2 = nn.Linear(20, 1)   # 20 input features -> 1 output feature

    def forward(self, x):  # define the forward pass
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = MyModel()

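👉 A custom model normally defines __init__ (which must call super().__init__() before assigning any submodule) and forward. Run the model as model(x) rather than calling forward directly, so that registered hooks still fire.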
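To make the "modules hold parameters and nest other modules" idea concrete, here is a minimal sketch. The class name `TinyNet` is made up for illustration and simply mirrors `MyModel` above; it lists the parameters that get registered automatically when layers are assigned as attributes.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical name, same structure as MyModel
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 1)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

net = TinyNet()
out = net(torch.randn(4, 10))  # call the module itself, not net.forward(...)

# Layers assigned as attributes are registered as submodules,
# so their weights and biases show up here without extra work.
param_shapes = {name: tuple(p.shape) for name, p in net.named_parameters()}
n_params = sum(p.numel() for p in net.parameters())

print(out.shape)      # torch.Size([4, 1])
print(param_shapes)   # fc1.weight, fc1.bias, fc2.weight, fc2.bias
print(n_params)       # 10*20 + 20 + 20*1 + 1 = 241
```

Note that `nn.Linear` stores its weight as (out_features, in_features), which is why `fc1.weight` has shape (20, 10).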

🔹 3. Common layers

torch.nn ships many commonly used layers. The main categories:

(1) Fully connected layer (Linear)

nn.Linear(in_features, out_features)

Computes y = x @ W.T + b, i.e. a matrix multiplication plus a bias
Commonly used in MLPs
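A quick way to convince yourself that `nn.Linear` really is "matrix multiplication + bias" is to redo the computation by hand. This is only a small check with arbitrary sizes (3 inputs, 2 outputs, batch of 5):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(3, 2)
x = torch.randn(5, 3)  # (batch, in_features)

# layer.weight has shape (out_features, in_features), hence the transpose.
manual = x @ layer.weight.T + layer.bias
same = torch.allclose(layer(x), manual, atol=1e-6)

print(layer.weight.shape)  # torch.Size([2, 3])
print(same)                # True
```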

(2) Convolutional layers (CNN)

nn.Conv1d, nn.Conv2d, nn.Conv3d

The core building block of convolutional neural networks
Used to extract spatial or temporal features
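Most confusion with convolution layers comes from tensor shapes, so here is a small shape check. The channel counts and input sizes are arbitrary choices for illustration; the layout is (batch, channels, height, width) for `Conv2d` and (batch, channels, length) for `Conv1d`.

```python
import torch
import torch.nn as nn

# 2D convolution over image-like input.
conv2d = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, stride=1, padding=1)
images = torch.randn(4, 3, 32, 32)
features = conv2d(images)   # padding=1 with a 3x3 kernel keeps 32x32

# 1D convolution over a sequence or signal.
conv1d = nn.Conv1d(in_channels=16, out_channels=32, kernel_size=5)
signal = torch.randn(4, 16, 100)
out1d = conv1d(signal)      # no padding: length 100 - 5 + 1 = 96

print(features.shape)  # torch.Size([4, 8, 32, 32])
print(out1d.shape)     # torch.Size([4, 32, 96])
```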

(3) Recurrent layers (RNN / LSTM / GRU)

nn.RNN, nn.LSTM, nn.GRU

Used for sequence data (text, time series)
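As a rough sketch of how a recurrent layer is called, the snippet below runs an `nn.LSTM` on a random batch. The sizes are arbitrary, and `batch_first=True` is a choice made here so that the input is laid out as (batch, time, features); the default layout is (time, batch, features).

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=1, batch_first=True)
seq = torch.randn(4, 12, 8)        # 4 sequences, 12 time steps, 8 features each

output, (h_n, c_n) = lstm(seq)

print(output.shape)  # torch.Size([4, 12, 16])  hidden state at every time step
print(h_n.shape)     # torch.Size([1, 4, 16])   final hidden state per layer
print(c_n.shape)     # torch.Size([1, 4, 16])   final cell state per layer

# For a single-direction LSTM, the last time step of `output`
# is the final hidden state of the top layer.
last_matches = torch.allclose(output[:, -1, :], h_n[-1])
```

`nn.RNN` and `nn.GRU` are called the same way, except that they return a single hidden state instead of the `(h_n, c_n)` pair.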

(4) Normalization layers

nn.BatchNorm1d, nn.BatchNorm2d, nn.LayerNorm

Stabilize training and speed up convergence
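The practical difference between the two families is the axis they normalize over. The sketch below, on made-up data, shows that `BatchNorm1d` (in training mode) normalizes each feature across the batch, while `LayerNorm` normalizes each sample across its features.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(32, 6) * 3 + 5     # (batch, features), deliberately off-center

bn = nn.BatchNorm1d(6)
ln = nn.LayerNorm(6)

bn.train()                          # use batch statistics, not running averages
bn_out = bn(x)
ln_out = ln(x)

# BatchNorm: every feature column has roughly zero mean over the batch.
bn_col_means = bn_out.mean(dim=0)
# LayerNorm: every sample row has roughly zero mean over its features.
ln_row_means = ln_out.mean(dim=1)

print(bn_col_means.abs().max())
print(ln_row_means.abs().max())
```

In eval mode `BatchNorm1d` switches to its running statistics, so its output then depends on what it saw during training; `LayerNorm` behaves the same in both modes.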

(5) Regularization layers

nn.Dropout(p=0.5)

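Randomly zeroes activations with probability p during training to reduce overfitting; in eval mode (model.eval()) it does nothing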
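Dropout is one of the layers that behaves differently in training and evaluation, which is easy to forget. A small demonstration on a vector of ones:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
train_out = drop(x)   # each element is either dropped (0) or scaled by 1/(1-p) = 2

drop.eval()
eval_out = drop(x)    # identity: nothing is dropped at inference time

print(train_out.unique())        # only 0.0 and 2.0 can appear
print(torch.equal(eval_out, x))  # True
```

The scaling by 1/(1-p) during training keeps the expected activation unchanged, which is why no rescaling is needed at inference time.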

🔹 4. Common containers

Used to combine multiple layers.

(1) Sequential

model = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 1)
)

Stacks layers in order; a good fit for simple models.

(2) ModuleList / ModuleDict

# Inside the __init__ of an nn.Module subclass:
self.layers = nn.ModuleList([nn.Linear(10, 20), nn.Linear(20, 30)])
self.dict = nn.ModuleDict({
    "fc1": nn.Linear(10, 20),
    "fc2": nn.Linear(20, 1)
})

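More flexible: modules can be combined dynamically, and unlike a plain Python list or dict, their parameters are registered with the parent module.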
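Here is one way these containers might be used inside a model. The class names `Stack` and `PlainListStack`, the head names, and the layer sizes are all invented for this sketch; the point is that layers kept in a plain Python list are invisible to `parameters()`, so an optimizer would never update them.

```python
import torch
import torch.nn as nn

class Stack(nn.Module):  # hypothetical example class
    def __init__(self, sizes):
        super().__init__()
        # One Linear per consecutive pair of sizes, e.g. [10, 20, 30].
        self.layers = nn.ModuleList(
            [nn.Linear(a, b) for a, b in zip(sizes[:-1], sizes[1:])]
        )
        # Hypothetical named output heads, chosen by key at call time.
        self.heads = nn.ModuleDict({
            "value": nn.Linear(sizes[-1], 1),
            "pair": nn.Linear(sizes[-1], 2),
        })

    def forward(self, x, head="value"):
        for layer in self.layers:
            x = torch.relu(layer(x))
        return self.heads[head](x)

class PlainListStack(nn.Module):  # what NOT to do
    def __init__(self):
        super().__init__()
        self.layers = [nn.Linear(10, 20), nn.Linear(20, 1)]  # not registered

stack = Stack([10, 20, 30])
out_value = stack(torch.randn(5, 10))
out_pair = stack(torch.randn(5, 10), head="pair")

n_registered = len(list(stack.parameters()))        # 2 layers + 2 heads, weight and bias each
n_plain = len(list(PlainListStack().parameters()))  # nothing was registered

print(out_value.shape, out_pair.shape)
print(n_registered, n_plain)  # 8 0
```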

🔹 5. Common loss functions

torch.nn provides many commonly used losses:

Regression

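nn.MSELoss()       # mean squared error
nn.L1Loss()        # mean absolute error

Classification

nn.CrossEntropyLoss()   # multi-class cross-entropy; expects raw logits
nn.BCELoss()            # binary cross-entropy; expects probabilities in [0, 1]
nn.NLLLoss()            # negative log-likelihood; expects log-probabilities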

Other
```
nn.SmoothL1Loss()  # smooth L1; equals Huber loss (nn.HuberLoss) at the default beta=1.0 / delta=1.0
```
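A frequent source of bugs with the classification losses is feeding them the wrong kind of input. The sketch below uses random logits and made-up targets to show how the losses relate: `CrossEntropyLoss` takes raw logits and matches `LogSoftmax` followed by `NLLLoss`, while `BCELoss` takes probabilities and matches `BCEWithLogitsLoss` applied to the logits directly.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Multi-class: 4 samples, 3 classes, integer class labels.
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 2])

ce = nn.CrossEntropyLoss()(logits, targets)                 # raw logits in
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)   # log-probabilities in

# Binary: float targets of 0.0 or 1.0.
bin_logits = torch.randn(4)
bin_targets = torch.tensor([1.0, 0.0, 1.0, 0.0])

bce = nn.BCELoss()(torch.sigmoid(bin_logits), bin_targets)      # probabilities in
bce_logits = nn.BCEWithLogitsLoss()(bin_logits, bin_targets)    # raw logits in

print(torch.allclose(ce, nll))                     # True
print(torch.allclose(bce, bce_logits, atol=1e-6))  # True
```

In practice `BCEWithLogitsLoss` is usually the safer choice for binary problems, since it fuses the sigmoid into the loss in a numerically stable way.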

🔹 6. Activation functions

Activation functions also live in torch.nn and come in two forms, classes and functions:

Class form (modules)
```
nn.ReLU(), nn.Sigmoid(), nn.Tanh()
```

Functional form (plain functions)

torch.relu(x), torch.sigmoid(x), torch.tanh(x)

👉 As a rule of thumb, use the class form inside nn.Sequential and the functional form inside forward.
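The two forms compute exactly the same thing; the difference is only in how they are used. A short check, which also includes `torch.nn.functional.relu` (the functional namespace inside torch.nn):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([-1.0, 0.0, 2.0])

relu_module = nn.ReLU()        # an object: can be placed inside nn.Sequential
from_module = relu_module(x)
from_torch = torch.relu(x)     # a plain function: handy inside forward()
from_functional = F.relu(x)

print(from_module)             # tensor([0., 0., 2.])
print(torch.equal(from_module, from_torch))       # True
print(torch.equal(from_module, from_functional))  # True
```

`nn.Sequential` only accepts modules, which is why the class form is needed there.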

🔹 7. Usage workflow (the standard recipe)

A typical training flow:

import torch
import torch.nn as nn
import torch.optim as optim

# 1. Define the model
model = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 1)
)

# 2. Define the loss function
criterion = nn.MSELoss()

# 3. Define the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01)

# 4. Forward pass + backward pass + update
x = torch.randn(5, 10)   # input (batch=5, features=10)
y = torch.randn(5, 1)    # target

output = model(x)        # forward pass
loss = criterion(output, y)   # compute the loss
optimizer.zero_grad()    # clear old gradients
loss.backward()          # backward pass
optimizer.step()         # update the parameters
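The snippet above performs a single update. In real training the same steps are repeated in a loop, so here is a minimal sketch of that, assuming a synthetic regression task (the data, the 200 iterations, and full-batch updates are all choices made for illustration, not a recommendation):

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Synthetic data: targets are a fixed linear function of the inputs.
x = torch.randn(64, 10)
true_w = torch.randn(10, 1)
y = x @ true_w

model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 1))
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

losses = []
model.train()                      # training mode (matters for Dropout/BatchNorm)
for step in range(200):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = criterion(model(x), y)  # forward pass + loss
    loss.backward()                # backward pass
    optimizer.step()               # parameter update
    losses.append(loss.item())

model.eval()                       # evaluation mode
with torch.no_grad():              # no gradient tracking needed for inference
    final_loss = criterion(model(x), y).item()

print(losses[0], losses[-1], final_loss)
```

With real data you would normally iterate over mini-batches from a `torch.utils.data.DataLoader` instead of feeding the whole dataset at once.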

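🔹 8. Memory tips

Layers, loss functions, and whole networks are all subclasses of nn.Module
Each training step: forward → loss → backward → optimizer.step() (with zero_grad() before backward)
When combining modules, work out the input and output shapes first

Would you like me to put together a torch.nn mind map (layers → containers → loss functions → activation functions → usage recipe), so that you can remember it all from a single picture?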
