SigLIP2 discrepancy between HF implementation and original JAX implementation #43493

@nmilosev

System Info

Google Colab

  • transformers version: 4.57.6
  • Platform: Linux-6.6.105+-x86_64-with-glibc2.35
  • Python version: 3.12.12
  • Huggingface_hub version: 0.36.0
  • Safetensors version: 0.7.0
  • Accelerate version: 1.12.0
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (accelerator?): 2.9.0+cu126 (CUDA)
  • Tensorflow version (GPU?): 2.19.1 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.11.2 (gpu)
  • Jax version: 0.7.2
  • JaxLib version: 0.7.2
  • Using distributed or parallel set-up in script?:
  • Using GPU in script?:
  • GPU type: Tesla T4

Who can help?

@yonigozlan @molbap @zucchini-nlp

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Steps to reproduce:

  1. The issue is apparent when comparing the official JAX implementation from @mitscha (link to example) with the official SigLIP2 example from the HF docs.
  2. When running inference with the same model config, the learned temperature and bias are different, leading to suboptimal performance in zero-shot image classification. Note that the model has only recently become "operational": in previous versions of HF transformers, it always gave 0% probability to all candidate labels.

| Implementation         | Temperature (logit scale) | Bias     |
|------------------------|---------------------------|----------|
| JAX Official           | 109.9                     | -15.9    |
| HF Transformers 4.57.6 | 4.6994                    | -15.9324 |
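
One thing worth checking, offered as an observation rather than a confirmed diagnosis: the two temperature values in the table appear to be related by an exponential. If the HF checkpoint stores the logit scale in log space and the model applies `exp()` at inference time (an assumption here, to be verified against the modeling code), the two numbers would describe the same temperature. A quick arithmetic check:

```python
import math

# Values copied from the table above.
jax_temperature = 109.9   # JAX official
hf_logit_scale = 4.6994   # HF Transformers 4.57.6

# Assumption: the HF value is log(temperature), so exp() should recover it.
hf_temperature_if_log = math.exp(hf_logit_scale)

print(f"exp({hf_logit_scale}) = {hf_temperature_if_log:.2f}")
print(f"log({jax_temperature}) = {math.log(jax_temperature):.4f}")
```

If the two printed values line up with the table, the stored parameters match and any remaining difference would have to come from how the scale is applied during inference.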

Expected behavior

I expected the learned temperature and bias to be similar in both implementations.

When the temperature is set to 100+ in the HF implementation, model performance improves.
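
To illustrate why the temperature matters so much here, the following is a minimal sketch of the SigLIP-style sigmoid scoring, `sigmoid(t * sim + b)`, using the two temperature values reported above. The helper `siglip_probability` and the cosine similarity of 0.15 are hypothetical, chosen only for illustration; they are not taken from either implementation.

```python
import math

def siglip_probability(cosine_similarity, temperature, bias):
    """Sigmoid of the scaled and shifted similarity: sigmoid(t * sim + b)."""
    logit = temperature * cosine_similarity + bias
    return 1.0 / (1.0 + math.exp(-logit))

bias = -15.9          # bias reported for both implementations (rounded)
similarity = 0.15     # hypothetical image-text cosine similarity

# Using 4.6994 directly as the temperature (no exponentiation).
p_raw = siglip_probability(similarity, 4.6994, bias)

# Using the temperature reported for the JAX implementation.
p_scaled = siglip_probability(similarity, 109.9, bias)

# Upper bound with the small temperature: cosine similarity cannot exceed 1.
p_raw_max = siglip_probability(1.0, 4.6994, bias)

print(f"t=4.6994: p={p_raw:.2e} (upper bound {p_raw_max:.2e})")
print(f"t=109.9:  p={p_scaled:.3f}")
```

With a temperature near 4.7 the bias of about -15.9 dominates, so every label scores close to zero regardless of similarity, whereas a temperature near 110 lets a modest similarity produce a usable probability.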

Please see attached notebook for a full example to replicate this issue.

SigLIP2_temperature_bias.ipynb
