HUST Vision Lab

119 repositories

InfiniteVL
Public
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
Python
•
Apache License 2.0
•4•84•1•0•Updated Feb 2, 2026
VAD
Public
[ICCV 2023 & ICLR 2026] VAD: Vectorized Scene Representation for Efficient Autonomous Driving
end-to-end autonomous-driving
Python
•
Apache License 2.0
•141•1.2k•76•1•Updated Jan 31, 2026
VGT
Public
Visual Generation Tuning
Python
•
MIT License
•0•98•1•0•Updated Jan 27, 2026
MobileI2V
Public
[ArXiv 2025] MobileI2V: Fast and High-Resolution Image-to-Video on Mobile Devices
video-generation image-to-video diffusion-models video-diffusion-transformers
Python
•2•68•1•0•Updated Jan 5, 2026
GaussTR
Public
[CVPR 2025] GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding
Python
•
MIT License
•11•207•1•0•Updated Jan 5, 2026
DiffusionDriveV2
Public
DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving
Python
•
MIT License
•20•233•9•2•Updated Dec 29, 2025
SuperCLIP
Public
Python
•
Apache License 2.0
•6•121•5•0•Updated Dec 26, 2025
DiffusionVL
Public
[ArXiv 2025] DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models
Python
•
Apache License 2.0
•5•131•3•0•Updated Dec 25, 2025
TBCM
Public
Image-Free Timestep Distillation via Continuous-Time Consistency with Trajectory-Sampled Pairs
Python
•0•21•1•0•Updated Dec 16, 2025
LightningDiT
Public
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
Python
•
MIT License
•52•1.4k•18•1•Updated Dec 16, 2025
4DLangVGGT
Public
Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer”
Python
•
MIT License
•2•79•3•0•Updated Dec 10, 2025
DiffusionDrive
Public
[CVPR 2025 Highlight] Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving
Python
•
MIT License
•121•1.3k•24•1•Updated Dec 8, 2025
EVA-X
Public
[npjDigitalMed (Nature Portfolio)] EVA-X: A foundation model for general chest X-ray analysis with self-supervised learning
Python
•14•94•6•0•Updated Dec 6, 2025
MolSight
Public
[AAAI 2026] MolSight: Optical Chemical Structure Recognition with SMILES Pretraining, Multi-Granularity Learning and Reinforcement Learning
Python
•
Apache License 2.0
•0•14•0•0•Updated Dec 5, 2025
LENS
Public
[AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning
Python
•
Apache License 2.0
•6•106•15•0•Updated Dec 3, 2025
Turbo-VAED
Public
[AAAI 2026] Turbo-VAED: Fast and Stable Transfer of Video-VAEs to Mobile Devices
Python
•1•93•12•0•Updated Nov 30, 2025
RAD
Public
[NeurIPS 2025] RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
Python
•
MIT License
•3•183•6•0•Updated Nov 7, 2025
MaskAdapter
Public
[CVPR 2025] Official repository of the paper "Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation"
segmentation clip zero-shot open-vocabulary open-vocabulary-semantic-segmentation vision-language-model segment-anything open-vocabulary-segmentation zero-shot-segmentation
Python
•
Apache License 2.0
•4•123•2•0•Updated Oct 23, 2025
hustvl.github.io
Public
HTML
•
BSD 3-Clause "New" or "Revised" License
•2•1•0•0•Updated Oct 11, 2025
TOGS
Public
[IEEE JBHI] The official code of "TOGS: Gaussian Splatting with Temporal Opacity Offset for Real-Time 4D DSA Rendering"
Python
•1•31•1•0•Updated Sep 10, 2025
simpleseg
Public
Python
•0•8•3•0•Updated Sep 9, 2025
Snap-Snap
Public
The repository of "Snap-Snap: Taking Two Images to Reconstruct 3D Human Gaussians in Milliseconds"
Python
•2•40•5•0•Updated Sep 1, 2025
recogdrive
Public
ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving
Python
•
Apache License 2.0
•48•7•0•0•Updated Aug 21, 2025
ViTMatte
Public
[Information Fusion (Vol.103, Mar. '24)] Boosting Image Matting with Pretrained Plain Vision Transformers
Python
•
MIT License
•45•506•19•3•Updated Aug 13, 2025
Dynamic-2DGS
Public
[ACM MM 2025] Dynamic 2D Gaussians: Geometrically Accurate Radiance Fields for Dynamic Objects
3d-vision 4d-reconstruction dynamic-reconstruction gaussian-splatting 3d-gaussian-splatting 4d-gaussian-splatting
Python
•
Apache License 2.0
•6•170•3•0•Updated Aug 6, 2025
.github
Public
•0•0•0•0•Updated Jul 4, 2025
GroundingSuite
Public
[ICCV 2025] GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding
Python
•1•73•2•0•Updated Jun 26, 2025
PersonViT
Public
PersonViT: Large-scale Self-supervised Vision Transformer for Person Re-Identification
Python
•
Apache License 2.0
•7•41•2•0•Updated Jun 11, 2025
MIM4D
Public
[IJCV 2025] MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning
Python
•
Apache License 2.0
•1•76•2•0•Updated May 30, 2025
PixelHacker
Public
PixelHacker: Image Inpainting with Structural and Semantic Consistency
Python
•
Apache License 2.0
•20•473•14•1•Updated May 20, 2025