adjust unit tests for `test_save_load_float16` by kaixuanliu · Pull Request #12500 · huggingface/diffusers

kaixuanliu · 2025-10-17T02:14:06Z

When we run a unit test such as pytest -rA tests/pipelines/wan/test_wan_22.py::Wan22PipelineFastTests::test_save_load_float16, the pipeline runs entirely in fp16, but after save and reload some parts of the text encoder in pipe_loaded use fp32, even though we set torch_dtype to fp16 explicitly. Further investigation showed that the root cause is here: L783. This PR adjusts the test case to manually apply component = component.to(torch_device).half() so that pipe_loaded aligns exactly with the behavior of pipe.

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

kaixuanliu · 2025-10-17T02:15:51Z

@a-r-r-o-w @DN6 pls help review, thx!

regisss · 2025-10-22T16:45:51Z

Not sure I understand the issue here. This specific T5 module is kept in fp32 on purpose, so why force an fp16 cast in the test?

kaixuanliu · 2025-10-23T02:46:26Z

@regisss Hi, the purpose of this test case is to compare the output of a pipeline running in fp16 (pipe) with the output of the same pipeline after it has been saved and reloaded (pipe_loaded); the two should be the same. However, all components of pipe are cast to fp16 in L1424~L1426, while in pipe_loaded some parts are kept in fp32, so its computation does not exactly match the forward pass of pipe.

sayakpaul · 2025-10-27T13:02:43Z

+                if hasattr(component, "half"):
+                    # Although all components of pipe_loaded should be float16 now, some submodules still use fp32, as in https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/models/t5/modeling_t5.py#L783, so we need to do the conversion again manually to align exactly with the datatype we use in pipe
+                    component = component.to(torch_device).half()


This doesn't seem right at all. torch_dtype should be able to take care of it. I just ran it on my GPU for SD and it worked fine.

Hi @sayakpaul, I tested on an A100. When I print pipe_loaded.text_encoder.encoder.block[0].layer[1].DenseReluDense.wo.weight.dtype at L1455, it returns torch.float32, not torch.float16, and the max_diff at L1456 is np.float16(0.0004883). With this PR applied, so that pipe_loaded aligns exactly with the behavior of pipe, the max_diff is 0. I think it's better to adjust the test case so that the output comparison between pipe and pipe_loaded is apples to apples. WDYT?
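For reference, the reported max_diff of 0.0004883 is consistent with 2**-11, the spacing between adjacent float16 values in [0.5, 1). The sketch below checks that spacing using only the standard library's struct module, whose "e" format code is IEEE 754 half precision. The helper name to_fp16 and the sample weight are made up for illustration; nothing here is taken from the diffusers test suite, and it does not prove where the difference in the test actually comes from.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Adjacent float16 values in [0.5, 1) are 2**-11 apart.
FP16_STEP = 2.0 ** -11

sample_weight = 0.7001  # arbitrary illustrative value, not from the model
rounded = to_fp16(sample_weight)

print(f"float16 step in [0.5, 1): {FP16_STEP:.7f}")
print(f"rounding error for {sample_weight}: {abs(rounded - sample_weight):.7f}")
```

A weight stored in fp32 and the same weight rounded to fp16 can therefore differ by up to half of that step, which is enough to move an output by one float16 step.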

My point is torch_dtype in from_pretrained() should be enough for the model to be in fp16. Setting it with half() after loading the model in the FP16 torch_dtype seems erroneous to me.

I also ran the test on an A100, and it wasn't a problem. So, I am not sure if this test fix is correct at all.

I printed pipe_loaded.text_encoder.encoder.block[0].layer[1].DenseReluDense.wo.weight.dtype after pipe_loaded = self.pipeline_class.from_pretrained(tmpdir, torch_dtype=torch.float16), and it returns torch.float32. The root cause is L783, so I manually added .half() to pipe_loaded, although it looks a bit weird... On A100 the difference is within the tolerance, but I think that, fundamentally, the output of a pipeline loaded from a previous save should be exactly the same, i.e. the max_diff should be 0, right?
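The single-parameter dtype check described above can be generalized into a small diagnostic that lists every parameter not in the expected dtype. This is only a sketch: params_not_in_dtype and FakeModule are hypothetical names, the parameter names are illustrative, and the stand-in module exists only so the example runs without torch. The helper relies on nothing but named_parameters() and a dtype attribute, which torch.nn.Module parameters also provide.

```python
from types import SimpleNamespace


def params_not_in_dtype(module, expected_dtype):
    """Return names of parameters whose dtype differs from expected_dtype.

    Works on anything exposing named_parameters(), e.g. a torch.nn.Module.
    """
    return [
        name
        for name, param in module.named_parameters()
        if param.dtype != expected_dtype
    ]


class FakeModule:
    """Stand-in for a loaded text encoder so the sketch runs without torch."""

    def __init__(self, dtypes):
        self._dtypes = dtypes

    def named_parameters(self):
        for name, dtype in self._dtypes.items():
            yield name, SimpleNamespace(dtype=dtype)


# Illustrative parameter names and dtypes, not dumped from a real model.
fake_text_encoder = FakeModule(
    {
        "encoder.block.0.layer.0.SelfAttention.q.weight": "float16",
        "encoder.block.0.layer.1.DenseReluDense.wi.weight": "float16",
        "encoder.block.0.layer.1.DenseReluDense.wo.weight": "float32",
    }
)

mismatched = params_not_in_dtype(fake_text_encoder, "float16")
print(mismatched)
```

With a real pipeline, the equivalent call would be params_not_in_dtype(pipe_loaded.text_encoder, torch.float16), assuming torch is imported.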

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

kaixuanliu · 2025-10-30T07:29:57Z

@sayakpaul Hi, I adjusted the test code to pass dtype to get_dummy_components instead of adding .half() to every component. Do you think it's OK now?

sayakpaul · 2025-10-30T07:32:30Z

    supports_dduf = False

-    def get_dummy_components(self):
+    def get_dummy_components(self, dtype=torch.float32):


Why do we need this?

Please refer to L246-L256 (sorry, I only found the Chinese version of this explanation). The torch.Tensor.to method converts all weights, while the torch_dtype parameter of from_pretrained preserves the layers listed in _keep_in_fp32_modules. For Wan models, all components of pipe will be fp16, which is not the case for pipe_loaded. Here I override the test_save_load_float16 function separately for Wan models.

sayakpaul

I am honestly not sure about the changes introduced in this PR. We have gone over multiple comments and so far, I haven't been able to manually verify myself the failures this PR tries to solve.

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

…into wan-pipeline

sayakpaul · 2025-10-30T08:57:03Z

        pass

+    @unittest.skipIf(torch_device not in ["cuda", "xpu"], reason="float16 requires CUDA or XPU")
+    def test_save_load_float16(self, expected_max_diff=1e-2):


I still don't know then how on my end the tests are passing.

I think it is related to the input. When I set all the seeds in get_dummy_components to 1, the max_diff on A100 is np.float16(0.2366), and when I set the seed to 42, the output is all NaN. After this PR, the max_diff is 0 for every seed.

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

kaixuanliu · 2025-10-30T11:17:17Z

+        # Use from_pretrained with a tiny model to ensure proper dtype handling
+        # This ensures _keep_in_fp32_modules and _skip_layerwise_casting_patterns are respected
+        transformer = WanTransformer3DModel.from_pretrained(
+            "Kaixuanliu/tiny-random-wan-transformer", 


Pls replace my model space. We have to use from_pretrained here so that the dtype of every submodule is loaded correctly.

kaixuanliu · 2025-10-30T11:17:36Z

-            qk_norm="rms_norm_across_heads",
-            rope_max_seq_len=32,
+        transformer_2 = WanTransformer3DModel.from_pretrained(
+            "Kaixuanliu/tiny-random-wan-transformer",


Same as above

kaixuanliu · 2025-10-30T11:19:25Z

CC @yao-matrix

DN6 · 2025-11-04T13:08:02Z

I think it would be best to modify the test_save_load_float16 test to account for keep-in-FP32 modules, e.g.:

def test_save_load_float16(self, expected_max_diff=1e-2):
    components = self.get_dummy_components()
    for name, module in components.items():
        # Account for components with _keep_in_fp32_modules (may be missing or None)
        fp32_modules = getattr(module, "_keep_in_fp32_modules", None)
        if fp32_modules:
            module = module.to(torch_device)
            # Use a separate loop variable so the component name is not shadowed
            for param_name, param in module.named_parameters():
                if any(m in param_name.split(".") for m in fp32_modules):
                    param.data = param.data.to(torch.float32)
                else:
                    param.data = param.data.to(torch.float16)

        elif hasattr(module, "half"):
            components[name] = module.to(torch_device).half()
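The snippet above matches entries of _keep_in_fp32_modules against whole dotted components of the parameter name rather than against substrings. A minimal sketch of why that matters, in plain Python with dtypes represented as strings: target_dtype is a hypothetical helper, ["wo"] is assumed as the keep list for T5, and "word_embeddings.weight" is an invented parameter name used only to show the difference.

```python
def target_dtype(param_name, keep_in_fp32_modules):
    """Pick the dtype for one parameter, matching on whole dotted components."""
    parts = param_name.split(".")
    if any(module_name in parts for module_name in keep_in_fp32_modules):
        return "float32"
    return "float16"


keep = ["wo"]  # assumed keep list for the T5 text encoder

names = [
    "encoder.block.0.layer.1.DenseReluDense.wo.weight",
    "encoder.block.0.layer.0.SelfAttention.q.weight",
    "word_embeddings.weight",  # invented name: contains "wo" only as a substring
]

plan = {name: target_dtype(name, keep) for name in names}
for name, dtype in plan.items():
    print(f"{dtype:8s} {name}")
```

A plain substring test ("wo" in name) would also select the invented word_embeddings parameter, so matching on split components keeps only the intended submodule in fp32.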

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

kaixuanliu · 2025-11-05T02:55:45Z

@DN6 Hi, I think this is good advice; it looks much better now. I have updated the code following your suggestion, thx!

kaixuanliu · 2025-11-11T09:18:29Z

@sayakpaul @DN6 Hi, can this PR be merged now?

sayakpaul

Thanks for your patience!

kaixuanliu · 2025-11-12T02:24:50Z

@sayakpaul Hi, the failed CI cases should have nothing to do with this PR, can you help merge?

sayakpaul · 2025-11-12T03:02:32Z

Yeah will merge shortly. Thanks for your contributions!

adjust unit tests for wan pipeline

b938d30

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

sayakpaul reviewed Oct 27, 2025

update code

2244e23

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

Merge branch 'main' into wan-pipeline

906adf7

sayakpaul reviewed Oct 30, 2025

kaixuanliu marked this pull request as draft October 30, 2025 08:05

kaixuanliu added 2 commits October 30, 2025 08:19

avoid adjusting common get_dummy_components API

5305169

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

Merge branch 'wan-pipeline' of https://github.com/kaixuanliu/diffusers …

a1d659c

…into wan-pipeline

kaixuanliu marked this pull request as ready for review October 30, 2025 08:47

sayakpaul reviewed Oct 30, 2025

kaixuanliu marked this pull request as draft October 30, 2025 10:10

use form_pretrained to transformer and transformer_2

ecd4c8b

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

kaixuanliu commented Oct 30, 2025

kaixuanliu marked this pull request as ready for review October 30, 2025 11:18

kaixuanliu added 2 commits November 5, 2025 02:48

update code

62f3428

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

update

6b697ed

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

sayakpaul added 2 commits November 5, 2025 09:08

Merge branch 'main' into wan-pipeline

91bdabf

Merge branch 'main' into wan-pipeline

3f2ab46

sayakpaul mentioned this pull request Nov 9, 2025

add ChronoEdit #12593

Merged

DN6 and others added 2 commits November 10, 2025 13:27

Merge branch 'main' into wan-pipeline

4bdfa35

Merge branch 'main' into wan-pipeline

b6e5a28

sayakpaul approved these changes Nov 11, 2025

Merge branch 'main' into wan-pipeline

6ec93a7

sayakpaul added 3 commits November 12, 2025 12:35

Merge branch 'main' into wan-pipeline

1baf156

Merge branch 'main' into wan-pipeline

0ee299a

Merge branch 'main' into wan-pipeline

1a8dd43

sayakpaul merged commit 7a001c3 into huggingface:main Nov 13, 2025
21 of 24 checks passed

4 participants