examples/speculative_decoding/README.md (9 additions, 4 deletions)
@@ -30,16 +30,21 @@ This example focuses on training with Hugging Face. To train with Megatron‑LM,
### Docker
-Please use the PyTorch docker image (e.g., `nvcr.io/nvidia/pytorch:25.06-py3`) or visit our [installation docs](https://nvidia.github.io/Model-Optimizer/getting_started/2_installation.html) for more information.
+Please use the PyTorch docker image (e.g., `nvcr.io/nvidia/pytorch:25.12-py3`) or visit our [installation docs](https://nvidia.github.io/Model-Optimizer/getting_started/2_installation.html) for more information.
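For illustration, the container could be started along the following lines; the image tag is the one named above, while the mount path and working directory are assumptions, not prescribed by the docs:

```bash
# Illustrative sketch only: the image tag comes from this README; the volume mount
# and working directory are assumptions for a typical local checkout.
docker run --gpus all -it --rm \
    -v "$(pwd)":/workspace/Model-Optimizer \
    -w /workspace/Model-Optimizer/examples/speculative_decoding \
    nvcr.io/nvidia/pytorch:25.12-py3
```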
Also follow the installation steps below to upgrade to the latest version of Model Optimizer and install dataset and example-specific dependencies.
### Local Installation
-Install Modelopt with `hf` dependencies and other requirements for this example:
+To set up the environment locally, first install the latest ModelOpt with Hugging Face support:
+
+```bash
+pip install -e "../..[hf]"
+```
+
+Next, install any additional dependencies required for this example:
```bash
-pip install -U nvidia-modelopt[hf]
pip install -r requirements.txt
```
@@ -78,7 +83,7 @@ For small base models that fit in GPU memory, we can collocate them with draft m
--eagle_config eagle_config.json
```
-This command will launch `main.py` with `accelerate`. See [section: interact with modelopt.torch.speculative](#interact-with-modelopttorchspeculative) for more details.
+FSDP2 is used by default. To enable context parallelism for long-context training, specify `--cp_size n`.
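As a rough sketch, a long-context run might be launched along these lines; only `main.py`, `--eagle_config`, and `--cp_size` appear in this diff, so the launcher invocation and the remaining flags below are assumptions:

```bash
# Sketch only: --eagle_config and --cp_size are documented above; the launcher and
# the other arguments are placeholders for whatever the example's training command uses.
accelerate launch main.py \
    --model meta-llama/Llama-3.2-1B-Instruct \
    --eagle_config eagle_config.json \
    --cp_size 2   # shard long sequences across 2 ranks via context parallelism
```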
The saved modelopt checkpoint is similar in architecture to HF models. It can be further optimized through **ModelOpt**, e.g., PTQ and QAT.
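As a minimal sketch of that follow-on step, the snippet below loads a saved checkpoint with Hugging Face and runs ModelOpt PTQ; the checkpoint path, calibration prompts, and the choice of `mtq.FP8_DEFAULT_CFG` are placeholder assumptions.

```python
# Sketch: PTQ on a saved ModelOpt/HF checkpoint. Paths and calibration data are placeholders.
import torch
import modelopt.torch.opt as mto
import modelopt.torch.quantization as mtq
from transformers import AutoModelForCausalLM, AutoTokenizer

mto.enable_huggingface_checkpointing()  # let from_pretrained restore ModelOpt state

ckpt = "ckpts/llama-eagle-draft"  # placeholder path to the saved checkpoint
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype="auto", device_map="cuda")
tokenizer = AutoTokenizer.from_pretrained(ckpt)

def forward_loop(m):
    # Run a handful of calibration samples to collect activation statistics.
    prompts = [
        "Speculative decoding drafts tokens ahead of the base model.",
        "Quantization reduces memory use and speeds up inference.",
    ]
    with torch.no_grad():
        for p in prompts:
            m(**tokenizer(p, return_tensors="pt").to(m.device))

# FP8 PTQ; other configs (e.g. mtq.INT8_SMOOTHQUANT_CFG) can be swapped in.
model = mtq.quantize(model, mtq.FP8_DEFAULT_CFG, forward_loop)
```

QAT would follow the same pattern, with the quantized model then fine-tuned as usual.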