
Commit 009e418

fixes
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
1 parent 35fa199 commit 009e418


4 files changed: +8 -3 lines changed


tests/integration/defs/accuracy/test_llm_api_autodeploy.py

Lines changed: 3 additions & 0 deletions
@@ -271,6 +271,9 @@ def get_default_kwargs(self):
             "sharding_source": ['factory', 'heuristic'],
             "sharding_dims": ['ep', 'bmm'],
         },
+        "fuse_fp8_moe": {
+            "allow_different_input_scales": True,
+        },
     }
 }
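The hunk above adds a nested default for "fuse_fp8_moe". The sketch below shows one way a nested default like this could be combined with a per-test override, assuming a plain recursive dictionary merge. The helper deep_merge and the override value are hypothetical illustrations; only the "fuse_fp8_moe" and "allow_different_input_scales" keys come from the diff, and the repository's actual merge logic may differ.

```python
import copy


def deep_merge(base, override):
    """Return a new dict in which `override` is merged recursively into `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge their contents key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Otherwise the override value replaces the default.
            merged[key] = copy.deepcopy(value)
    return merged


# Keys taken from the diff; the surrounding dictionary is simplified.
defaults = {
    "fuse_fp8_moe": {
        "allow_different_input_scales": True,
    },
}

# Hypothetical per-test override, for illustration only.
override = {"fuse_fp8_moe": {"allow_different_input_scales": False}}

merged = deep_merge(defaults, override)
```

A recursive merge keeps sibling keys of the default dictionary intact, whereas a shallow dict.update would replace the whole "fuse_fp8_moe" entry.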

tests/integration/test_lists/test-db/l0_b200.yml

Lines changed: 1 addition & 0 deletions
@@ -175,4 +175,5 @@ l0_b200:
   tests:
   - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype[False-1]
   - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_fp8
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[1]
   - unittest/_torch/auto_deploy/unit/singlegpu

tests/integration/test_lists/test-db/l0_dgx_b200.yml

Lines changed: 3 additions & 3 deletions
@@ -217,6 +217,9 @@ l0_dgx_b200:
   - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype[False-4]
   - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_bf16
   - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[4]
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[1]
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[2]
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[4]
 - condition:
     ranges:
       system_gpu_count:
@@ -233,6 +236,3 @@ l0_dgx_b200:
     orchestrator: mpi
   tests:
   - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[8]
-  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[1]
-  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[2]
-  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[4]
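The two hunks in l0_dgx_b200.yml add and remove the same three test_nvfp4 entries, so the change reads as a move between test blocks rather than a change in coverage. A minimal sketch of how that could be checked follows. The test IDs are copied from the diff, while the helper names split_node_id and is_pure_move are hypothetical and not part of the repository.

```python
# Test IDs copied from the added (+) and removed (-) lines of l0_dgx_b200.yml.
PREFIX = "accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4"
ADDED = [f"{PREFIX}[1]", f"{PREFIX}[2]", f"{PREFIX}[4]"]
REMOVED = [f"{PREFIX}[1]", f"{PREFIX}[2]", f"{PREFIX}[4]"]


def split_node_id(node_id):
    """Split a pytest node ID into (file, class, test name, parameter)."""
    path, cls, test = node_id.split("::")
    name, _, param = test.partition("[")
    # Drop the closing bracket of the parameter, if there is one.
    return path, cls, name, param[:-1] if param.endswith("]") else param


def is_pure_move(added, removed):
    """True when the same entries were added in one place and removed in another."""
    return sorted(added) == sorted(removed)


moved = is_pure_move(ADDED, REMOVED)
params = [split_node_id(node_id)[3] for node_id in ADDED]
```

Here moved is True, and params lists the three parametrizations that changed blocks.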

tests/integration/test_lists/test-db/l0_h100.yml

Lines changed: 1 addition & 0 deletions
@@ -441,3 +441,4 @@ l0_h100:
   - examples/test_ad_speculative_decoding.py::test_autodeploy_spec_dec_output[draft_target]
   - examples/test_ad_speculative_decoding.py::test_autodeploy_spec_dec_output[eagle3]
   - examples/test_ad_speculative_decoding.py::test_autodeploy_eagle3_acceptance_rate
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[1]
