# Training parameter recommendations
**Hardware:** Dual Intel Xeon E5-2678 v3 (24 physical cores / 48 threads) + NVIDIA GTX 1050 Ti (4 GB VRAM)

**Purpose:** recommended, ready-to-apply parameter sets for the repository's two training flows:

- Card Model (`card_model/train_card_model.py`) — see `card_model/config.py`
- MCCFR Trainer (`mccfr_trainer.py`)

This document does not modify code automatically; it lists the variables and suggested values in two profiles (Quick/Dev and Balanced/Production). Edit the constants in the referenced files when you are ready.

Files to adjust (examples):

- Card Model config: [card_model/config.py](card_model/config.py#L65-L72)
- MCCFR trainer: [mccfr_trainer.py](mccfr_trainer.py#L76-L97)

---
## Summary recommendation for your machine (short)

- If you want fast iterations: use the **Quick / Dev** profile below.
- If you want longer runs for better final performance and have the time: use the **Balanced / Production** profile.

---
## Card Model (histogram + equity) — variables in `card_model/config.py`

Two profiles: Quick / Dev (iterate fast) and Balanced / Production.

### Quick / Dev (recommended for iteration)

- `NUM_TRAIN_SAMPLES` = 200_000
- `NUM_VAL_SAMPLES` = 10_000
- `NUM_ROLLOUTS` = 200
- `BATCH_SIZE` = 1024
- `NUM_EPOCHS` = 32
- `LEARNING_RATE` = 1e-3
- `WEIGHT_DECAY` = 1e-4
- `LAMBDA_MSE` = 0.1
- `NUM_WORKERS` = 20 # used for dataset generation and the DataLoader in this codebase; 20 is a good balance on 24 cores
Notes:

- `NUM_ROLLOUTS=200` reduces data-generation cost (fewer Monte Carlo rollouts), so samples are cheaper to produce. Increase to 1000 for higher-quality labels if you have the time.
- `BATCH_SIZE=1024` is safe on a GTX 1050 Ti (4 GB VRAM). If you hit OOM during CardModel training, reduce it to 512.
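Written out as it would appear in `card_model/config.py`, the Quick / Dev profile looks like this (constant names are taken from this document; verify them against the actual file before editing):

```python
# Quick / Dev profile for card_model/config.py (names assumed from this document).
NUM_TRAIN_SAMPLES = 200_000
NUM_VAL_SAMPLES = 10_000
NUM_ROLLOUTS = 200    # cheaper MC labels; raise to 1000 for higher quality
BATCH_SIZE = 1024     # safe on a 4 GB GTX 1050 Ti; halve to 512 on OOM
NUM_EPOCHS = 32
LEARNING_RATE = 1e-3
WEIGHT_DECAY = 1e-4
LAMBDA_MSE = 0.1
NUM_WORKERS = 20      # dataset generation + DataLoader workers on 24 physical cores
```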
### Balanced / Production (longer training, better final quality)

- `NUM_TRAIN_SAMPLES` = 2_000_000
- `NUM_VAL_SAMPLES` = 100_000
- `NUM_ROLLOUTS` = 1000
- `BATCH_SIZE` = 4096
- `NUM_EPOCHS` = 64
- `LEARNING_RATE` = 5e-4
- `WEIGHT_DECAY` = 1e-4
- `LAMBDA_MSE` = 0.1
- `NUM_WORKERS` = 22

Notes:

- The Production profile expects long wall-clock time and sustained CPU usage. With `NUM_WORKERS=22` you still leave 2 physical cores for OS/driver tasks.
- If training CardModel on the GPU causes OOM, fall back to CPU (`device=torch.device('cpu')`) or reduce `BATCH_SIZE`.
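The reduce-on-OOM fallback can be automated. A framework-agnostic sketch (with PyTorch you would catch `torch.cuda.OutOfMemoryError` instead of `MemoryError`; `try_step` stands in for one training step):

```python
def find_workable_batch_size(try_step, start=4096, floor=128):
    """Halve the batch size until one training step succeeds.

    try_step: callable taking a batch size; raises on out-of-memory.
    """
    batch = start
    while batch >= floor:
        try:
            try_step(batch)  # attempt one step at this batch size
            return batch
        except MemoryError:  # with PyTorch, catch torch.cuda.OutOfMemoryError
            batch //= 2
    raise RuntimeError("no workable batch size found above the floor")
```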
---

## MCCFR Trainer (`mccfr_trainer.py`) — main self-play + network training

Two profiles: Quick / Dev and Balanced / Production.
### Quick / Dev (safe to test)

- `NUM_ITERATIONS` = 1_000
- `GAMES_PER_ITER` = 200
- `NUM_WORKERS` = 20 # worker processes for self-play traversals (use physical cores minus a few)
- `BUFFER_MAX_SIZE` = 500_000
- `MIN_BUFFER_SIZE_FOR_TRAIN` = 10_000
- `TRAIN_BATCH_SIZE` = 4_096
- `TRAIN_STEPS_PER_ITER` = 20
- `LEARNING_RATE` = 1e-3
- `WEIGHT_DECAY` = 1e-4
- `CLIP_GRAD_NORM` = 1.0
- `CARD_MODEL_CHECKPOINT` = `card_model/data/best_card_model.pt` (use the existing checkpoint if available)

Why these values?

- `NUM_WORKERS=20` uses most physical cores while leaving a few for the main process and OS.
- `TRAIN_BATCH_SIZE=4096` is a conservative batch that should fit in 4 GB VRAM for the small CFR network and still train efficiently.
- Reduce `MIN_BUFFER_SIZE_FOR_TRAIN` for faster first training iterations during experiments.
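The buffer constants above imply the following semantics; this is a minimal sketch, not the actual implementation in `mccfr_trainer.py`:

```python
import random
from collections import deque

class ReplayBuffer:
    """Bounded FIFO sample buffer: BUFFER_MAX_SIZE caps it, and network
    training only begins once MIN_BUFFER_SIZE_FOR_TRAIN samples exist."""

    def __init__(self, max_size=500_000, min_size=10_000):
        self.data = deque(maxlen=max_size)  # oldest samples drop automatically
        self.min_size = min_size

    def add(self, sample):
        self.data.append(sample)

    def ready(self):
        # Gate training until enough self-play samples have accumulated.
        return len(self.data) >= self.min_size

    def sample_batch(self, batch_size):
        return random.sample(self.data, min(batch_size, len(self.data)))
```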
### Balanced / Production (long-run)

- `NUM_ITERATIONS` = 50_000
- `GAMES_PER_ITER` = 500
- `NUM_WORKERS` = 20
- `BUFFER_MAX_SIZE` = 2_000_000
- `MIN_BUFFER_SIZE_FOR_TRAIN` = 100_000
- `TRAIN_BATCH_SIZE` = 8_192
- `TRAIN_STEPS_PER_ITER` = 50
- `LEARNING_RATE` = 5e-4
- `WEIGHT_DECAY` = 1e-4
- `CLIP_GRAD_NORM` = 1.0

Notes:

- The CFR network is compact; even with 4 GB VRAM you can try `TRAIN_BATCH_SIZE` up to 8k-16k depending on other GPU activity. Start with 8k and monitor GPU memory with `nvidia-smi`.
- `NUM_WORKERS=20` is still recommended; avoid setting `NUM_WORKERS` >= the number of physical cores, to reduce scheduling/oversubscription overhead.
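Instead of hardcoding 20, the worker count can be derived portably. A sketch assuming 2 hardware threads per core (`psutil.cpu_count(logical=False)` gives the exact physical count if psutil is available):

```python
import os

def pick_num_workers(reserve=4):
    """Physical cores minus a reserve for the main process, OS, and GPU driver."""
    logical = os.cpu_count() or 8     # logical CPUs (48 on this machine)
    physical = max(1, logical // 2)   # rough estimate: 2 threads per core
    return max(1, physical - reserve)
```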
---

## Suggested practical workflow (apply these before long runs)

1. For a first end-to-end test, use the **Quick / Dev** profile for both the Card Model and the MCCFR Trainer.
2. Generate CardModel training data once:
   - Run `python train_card_model.py` (it will generate or load `card_model/data/train_data.npz`).
   - If generation is too slow, reduce `NUM_TRAIN_SAMPLES` or `NUM_ROLLOUTS` in the Quick profile.
3. Train CardModel to obtain `card_model/data/best_card_model.pt`.
4. Use that checkpoint with `mccfr_trainer.py` (set `CARD_MODEL_CHECKPOINT` if you want it loaded) and start MCCFR with the Quick/Dev profile.
5. If both steps succeed and you want to scale up, switch to the Balanced/Production profile.
Example commands:

- Generate & train CardModel (from the repo root):

  ```
  python train_card_model.py
  ```

- Start the MCCFR trainer (from the repo root):

  ```
  python mccfr_trainer.py
  ```

Monitor GPU memory while training with `nvidia-smi -l 2`, and reduce `BATCH_SIZE` / `TRAIN_BATCH_SIZE` if you see OOM.
---

## Notes & cautions

- The repository hardcodes some constants in `card_model/config.py` and `mccfr_trainer.py`. This document lists the variables and recommended values — you must edit the constants in those files or override them in a wrapper script before running.
- For multi-process data generation and MCCFR traversal, the code uses the `spawn` start method to avoid CUDA forking issues. Keep that unchanged.
- If you plan to fully utilize all 24 cores for data generation, avoid launching heavy background tasks. Disk I/O during parallel generation can be significant; make sure you have enough temporary disk space for intermediate `.npz` files.
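Overriding the hardcoded constants from a wrapper can look like the sketch below. Module and constant names are assumed from this document, and the training entrypoint may be named differently in the actual repo:

```python
# run_quickdev.py -- hypothetical wrapper, placed in the repo root.
import importlib

def apply_overrides(module, overrides):
    """Set module-level constants before the training code reads them."""
    for name, value in overrides.items():
        if not hasattr(module, name):
            raise AttributeError(f"{module.__name__} has no constant {name!r}")
        setattr(module, name, value)

# Usage sketch (uncomment inside the actual repo):
# cfg = importlib.import_module("card_model.config")
# apply_overrides(cfg, {"NUM_TRAIN_SAMPLES": 200_000, "NUM_ROLLOUTS": 200})
# importlib.import_module("train_card_model")  # then call its entrypoint
```

Note that this only works if the training code reads the constants after the wrapper sets them (i.e. not at import time of the wrapper's own dependencies).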
---

## Quick reference: exact variables to set

- `card_model/config.py`:
  - `NUM_TRAIN_SAMPLES`, `NUM_VAL_SAMPLES`, `NUM_ROLLOUTS`, `BATCH_SIZE`, `NUM_EPOCHS`, `LEARNING_RATE`, `WEIGHT_DECAY`, `LAMBDA_MSE`, `NUM_WORKERS`.
- `mccfr_trainer.py`:
  - `NUM_ITERATIONS`, `GAMES_PER_ITER`, `NUM_WORKERS`, `BUFFER_MAX_SIZE`, `MIN_BUFFER_SIZE_FOR_TRAIN`, `TRAIN_BATCH_SIZE`, `TRAIN_STEPS_PER_ITER`, `LEARNING_RATE`, `WEIGHT_DECAY`, `CLIP_GRAD_NORM`, `CARD_MODEL_CHECKPOINT`.
---

If you want, I can write a small wrapper script that launches CardModel data generation and training, then launches MCCFR with the chosen profile (no code changes to core files; the wrapper would set values at runtime). Reply if you want that wrapper created.