4 files changed: +8 -3 lines changed
@@ -271,6 +271,9 @@ def get_default_kwargs(self):
             "sharding_source": ['factory', 'heuristic'],
             "sharding_dims": ['ep', 'bmm'],
         },
+        "fuse_fp8_moe": {
+            "allow_different_input_scales": True,
+        },
     }
 }
 
@@ -175,4 +175,5 @@ l0_b200:
   tests:
   - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype[False-1]
   - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_fp8
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[1]
   - unittest/_torch/auto_deploy/unit/singlegpu
@@ -217,6 +217,9 @@ l0_dgx_b200:
   - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype[False-4]
   - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_bf16
   - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[4]
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[1]
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[2]
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[4]
 - condition:
     ranges:
       system_gpu_count:
@@ -233,6 +236,3 @@ l0_dgx_b200:
     orchestrator: mpi
   tests:
   - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[8]
-  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[1]
-  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[2]
-  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[4]
@@ -441,3 +441,4 @@ l0_h100:
   - examples/test_ad_speculative_decoding.py::test_autodeploy_spec_dec_output[draft_target]
   - examples/test_ad_speculative_decoding.py::test_autodeploy_spec_dec_output[eagle3]
   - examples/test_ad_speculative_decoding.py::test_autodeploy_eagle3_acceptance_rate
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[1]
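
Note on the test entries: each one is a pytest node ID of the form `path::Class::test[param]`, where the bracketed suffix is the parametrization ID that pytest appends. The diff does not say what the values `1`, `2`, `4`, and `8` stand for, so the sketch below only splits an ID into its parts. The `NodeId` type and `parse_node_id` helper are hypothetical names for illustration and are not part of this change.

```python
from typing import NamedTuple, Optional


class NodeId(NamedTuple):
    """Parts of a pytest node ID (hypothetical helper type, not from the PR)."""
    path: str
    cls: Optional[str]
    name: str
    param: Optional[str]


def parse_node_id(node_id: str) -> NodeId:
    """Split 'path::Class::test[param]' into its components."""
    path, *rest = node_id.split("::")
    if not rest:
        # Entries such as a bare directory are not individual test node IDs.
        raise ValueError(f"not a test node ID: {node_id!r}")
    name = rest[-1]
    cls = rest[0] if len(rest) > 1 else None
    param = None
    if name.endswith("]") and "[" in name:
        # Everything between the first '[' and the final ']' is the param ID.
        name, _, tail = name.partition("[")
        param = tail[:-1]
    return NodeId(path, cls, name, param)


if __name__ == "__main__":
    entry = "accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[1]"
    print(parse_node_id(entry))
```

A directory entry such as `unittest/_torch/auto_deploy/unit/singlegpu` has no `::` separator, so the helper rejects it instead of guessing at a test name.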