- Wide Residual Networks
  Paper • 1605.07146 • Published • 2
- Characterizing signal propagation to close the performance gap in unnormalized ResNets
  Paper • 2101.08692 • Published • 2
- Pareto-Optimal Quantized ResNet Is Mostly 4-bit
  Paper • 2105.03536 • Published • 2
- When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations
  Paper • 2106.01548 • Published • 2

Collections
Collections including paper arxiv:2302.05543
- SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
  Paper • 2108.01073 • Published • 8
- MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation
  Paper • 2209.09002 • Published
- An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
  Paper • 2208.01618 • Published • 2
- Adding Conditional Control to Text-to-Image Diffusion Models
  Paper • 2302.05543 • Published • 57

- Random Field Augmentations for Self-Supervised Representation Learning
  Paper • 2311.03629 • Published • 10
- TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models
  Paper • 2311.04589 • Published • 23
- GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs
  Paper • 2311.04901 • Published • 11
- Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models
  Paper • 2311.06783 • Published • 28

- Attention Is All You Need
  Paper • 1706.03762 • Published • 105
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  Paper • 2005.11401 • Published • 14
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 54
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
  Paper • 2205.14135 • Published • 15

- AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
  Paper • 2309.16414 • Published • 19
- Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
  Paper • 2309.13018 • Published • 9
- Robust Speech Recognition via Large-Scale Weak Supervision
  Paper • 2212.04356 • Published • 43
- Language models in molecular discovery
  Paper • 2309.16235 • Published • 10

- X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model
  Paper • 2312.02238 • Published • 28
- IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
  Paper • 2308.06721 • Published • 33
- T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
  Paper • 2302.08453 • Published • 12
- ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs
  Paper • 2311.13600 • Published • 47

- Adding Conditional Control to Text-to-Image Diffusion Models
  Paper • 2302.05543 • Published • 57
- ControlNet V1.1
  📉 1.16k • Transform images using various artistic effects

- Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
  Paper • 2303.13439 • Published • 5
- Text2Video-Zero
  🚀 379