fixes

galagam · galagam · commit 009e41811203 · 2026-01-26T03:35:49.000-08:00
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
diff --git a/tests/integration/defs/accuracy/test_llm_api_autodeploy.py b/tests/integration/defs/accuracy/test_llm_api_autodeploy.py
@@ -271,6 +271,9 @@ def get_default_kwargs(self):
                     "sharding_source": ['factory', 'heuristic'],
                     "sharding_dims": ['ep', 'bmm'],
                 },
+                "fuse_fp8_moe": {
+                    "allow_different_input_scales": True,
+                },
             }
         }
 
diff --git a/tests/integration/test_lists/test-db/l0_b200.yml b/tests/integration/test_lists/test-db/l0_b200.yml
@@ -175,4 +175,5 @@ l0_b200:
   tests:
   - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype[False-1]
   - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_fp8
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[1]
   - unittest/_torch/auto_deploy/unit/singlegpu
diff --git a/tests/integration/test_lists/test-db/l0_dgx_b200.yml b/tests/integration/test_lists/test-db/l0_dgx_b200.yml
@@ -217,6 +217,9 @@ l0_dgx_b200:
   - accuracy/test_llm_api_autodeploy.py::TestLlama3_1_8B::test_auto_dtype[False-4]
   - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_bf16
   - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[4]
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[1]
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[2]
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[4]
 - condition:
     ranges:
       system_gpu_count:
@@ -233,6 +236,3 @@ l0_dgx_b200:
       orchestrator: mpi
   tests:
   - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[8]
-  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[1]
-  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[2]
-  - accuracy/test_llm_api_autodeploy.py::TestNemotronMOE::test_nvfp4[4]
diff --git a/tests/integration/test_lists/test-db/l0_h100.yml b/tests/integration/test_lists/test-db/l0_h100.yml
@@ -441,3 +441,4 @@ l0_h100:
   - examples/test_ad_speculative_decoding.py::test_autodeploy_spec_dec_output[draft_target]
   - examples/test_ad_speculative_decoding.py::test_autodeploy_spec_dec_output[eagle3]
   - examples/test_ad_speculative_decoding.py::test_autodeploy_eagle3_acceptance_rate
+  - accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_fp8[1]
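
Note on the first hunk: it adds a `fuse_fp8_moe` block with `allow_different_input_scales: True` as a sibling of the existing sharding options. The sketch below shows only the resulting shape of such a nested config, assuming a plain recursive dict merge. The `merge_overrides` helper and the enclosing key names `transforms` and `sharding_transform` are hypothetical placeholders, not names taken from the test file; only `fuse_fp8_moe`, `allow_different_input_scales`, `sharding_source` and `sharding_dims` come from the diff.

```python
from copy import deepcopy


def merge_overrides(base, overrides):
    """Recursively merge `overrides` into a copy of `base` (hypothetical helper)."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key instead of replacing.
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Stand-in for the dict built in get_default_kwargs(); the enclosing
# key names "transforms" and "sharding_transform" are placeholders.
default_kwargs = {
    "transforms": {
        "sharding_transform": {
            "sharding_source": ["factory", "heuristic"],
            "sharding_dims": ["ep", "bmm"],
        },
    },
}

# The option block added by this commit, expressed as an override.
fp8_moe_override = {
    "transforms": {
        "fuse_fp8_moe": {"allow_different_input_scales": True},
    },
}

merged_kwargs = merge_overrides(default_kwargs, fp8_moe_override)
```

The merge leaves the existing sharding options untouched and does not mutate `default_kwargs`, which mirrors how the hunk only appends a new sibling key.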
