examples/speculative_decoding/README.md (9 additions, 4 deletions)
@@ -30,16 +30,21 @@ This example focuses on training with Hugging Face. To train with Megatron‑LM,
### Docker
-Please use the PyTorch docker image (e.g., `nvcr.io/nvidia/pytorch:25.06-py3`) or visit our [installation docs](https://nvidia.github.io/Model-Optimizer/getting_started/2_installation.html) for more information.
+Please use the PyTorch docker image (e.g., `nvcr.io/nvidia/pytorch:25.12-py3`) or visit our [installation docs](https://nvidia.github.io/Model-Optimizer/getting_started/2_installation.html) for more information.
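For illustration, the container could be started along the following lines; the image tag is the one named above, while the mount path and working directory are assumptions, not prescribed by the docs:

```bash
# Illustrative sketch only: the image tag comes from this README; the volume mount
# and working directory are assumptions for a typical local checkout.
docker run --gpus all -it --rm \
    -v "$(pwd)":/workspace/Model-Optimizer \
    -w /workspace/Model-Optimizer/examples/speculative_decoding \
    nvcr.io/nvidia/pytorch:25.12-py3
```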
Also follow the installation steps below to upgrade to the latest version of Model Optimizer and install dataset and example-specific dependencies.
### Local Installation
-Install Modelopt with `hf` dependencies and other requirements for this example:
+To set up the environment locally, first install the latest ModelOpt with Hugging Face support:
+
+```bash
+pip install -e "../..[hf]"
+```
+
+Next, install any additional dependencies required for this example:
```bash
-pip install -U nvidia-modelopt[hf]
pip install -r requirements.txt
```
@@ -78,7 +83,7 @@ For small base models that fit in GPU memory, we can collocate them with draft m
--eagle_config eagle_config.json
```
-This command will launch `main.py` with `accelerate`. See [section: interact with modelopt.torch.speculative](#interact-with-modelopttorchspeculative) for more details.
+FSDP2 is used by default. To enable context parallelism for long-context training, specify `--cp_size n`.
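As a rough sketch, a long-context run might be launched along these lines; only `main.py`, `--eagle_config`, and `--cp_size` appear in this diff, so the launcher invocation and the remaining flags below are assumptions:

```bash
# Sketch only: --eagle_config and --cp_size are documented above; the launcher and
# the other arguments are placeholders for whatever the example's training command uses.
accelerate launch main.py \
    --model meta-llama/Llama-3.2-1B-Instruct \
    --eagle_config eagle_config.json \
    --cp_size 2   # shard long sequences across 2 ranks via context parallelism
```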
The saved modelopt checkpoint is similar in architecture to HF models. It can be further optimized through **ModelOpt**, e.g., PTQ and QAT.
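As a minimal sketch of that follow-on step, the snippet below loads a saved checkpoint with Hugging Face and runs ModelOpt PTQ; the checkpoint path, calibration prompts, and the choice of `mtq.FP8_DEFAULT_CFG` are placeholder assumptions.

```python
# Sketch: PTQ on a saved ModelOpt/HF checkpoint. Paths and calibration data are placeholders.
import torch
import modelopt.torch.opt as mto
import modelopt.torch.quantization as mtq
from transformers import AutoModelForCausalLM, AutoTokenizer

mto.enable_huggingface_checkpointing()  # let from_pretrained restore ModelOpt state

ckpt = "ckpts/llama-eagle-draft"  # placeholder path to the saved checkpoint
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype="auto", device_map="cuda")
tokenizer = AutoTokenizer.from_pretrained(ckpt)

def forward_loop(m):
    # Run a handful of calibration samples to collect activation statistics.
    prompts = [
        "Speculative decoding drafts tokens ahead of the base model.",
        "Quantization reduces memory use and speeds up inference.",
    ]
    with torch.no_grad():
        for p in prompts:
            m(**tokenizer(p, return_tensors="pt").to(m.device))

# FP8 PTQ; other configs (e.g. mtq.INT8_SMOOTHQUANT_CFG) can be swapped in.
model = mtq.quantize(model, mtq.FP8_DEFAULT_CFG, forward_loop)
```

QAT would follow the same pattern, with the quantized model then fine-tuned as usual.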