
Commit 25e8bba

update readme
Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>
1 parent 8d5f2a5 commit 25e8bba

File tree

1 file changed (+9, -4 lines)


examples/speculative_decoding/README.md

Lines changed: 9 additions & 4 deletions
@@ -30,16 +30,21 @@ This example focuses on training with Hugging Face. To train with Megatron‑LM,
 
 ### Docker
 
-Please use the PyTorch docker image (e.g., `nvcr.io/nvidia/pytorch:25.06-py3`) or visit our [installation docs](https://nvidia.github.io/Model-Optimizer/getting_started/2_installation.html) for more information.
+Please use the PyTorch docker image (e.g., `nvcr.io/nvidia/pytorch:25.12-py3`) or visit our [installation docs](https://nvidia.github.io/Model-Optimizer/getting_started/2_installation.html) for more information.
 
 Also follow the installation steps below to upgrade to the latest version of Model Optimizer and install dataset and example-specific dependencies.
 
 ### Local Installation
 
-Install Modelopt with `hf` dependencies and other requirements for this example:
+To set up the environment locally, first install the latest ModelOpt with Hugging Face support:
+
+```bash
+pip install -e "../..[hf]"
+```
+
+Next, install any additional dependencies required for this example:
 
 ```bash
-pip install -U nvidia-modelopt[hf]
 pip install -r requirements.txt
 ```
 
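For reference, a minimal sketch of how the recommended container might be started before running the install steps above; the bind mount, working directory, and GPU flags are assumptions about a typical local checkout and are not part of this commit:

```bash
# Illustrative only: start the NGC PyTorch container named in the README.
# The mount path and working directory are placeholders for your local checkout.
docker run --gpus all -it --rm \
  -v "$(pwd)":/workspace \
  -w /workspace/examples/speculative_decoding \
  nvcr.io/nvidia/pytorch:25.12-py3
```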
@@ -78,7 +83,7 @@ For small base models that fit in GPU memory, we can collocate them with draft m
   --eagle_config eagle_config.json
 ```
 
-This command will launch `main.py` with `accelerate`. See [section: interact with modelopt.torch.speculative](#interact-with-modelopttorchspeculative) for more details.
+FSDP2 is used by default. To enable context parallelism for long-context training, specify `--cp_size n`.
 The saved modelopt checkpoint is similar in architecture to HF models. It can be further optimized through **ModelOpt**, e.g., PTQ and QAT.
 
 ## Training Draft Model with Offline Base Model
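For context, a minimal sketch of how the new `--cp_size` flag might be passed to the launch command referenced above; the remaining training arguments (model, data, output paths) are omitted here, and the value `2` is an arbitrary choice:

```bash
# Illustrative only: enable context parallelism for long-context training.
# Other required training arguments from the README's command are omitted.
accelerate launch main.py \
  --eagle_config eagle_config.json \
  --cp_size 2
```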
