March 18, 2024
MELON: Reconstructing 3D objects from images with unknown poses
March 14, 2024
Cappy: Outperforming and boosting large multi-task language models with a small scorer
March 8, 2024
Health-specific embedding tools for dermatology and pathology
February 22, 2024
VideoPrism: A foundational visual encoder for video understanding
January 31, 2024
MobileDiffusion: Rapid text-to-image generation on-device
December 19, 2023
VideoPoet: A large language model for zero-shot video generation
December 15, 2023
StyleDrop: Text-to-image generation in any style
November 21, 2023
Open sourcing Project Guideline: A platform for computer vision accessibility technology
November 14, 2023
Scaling multimodal understanding to long videos
October 9, 2023
SANPO: A Scene understanding, Accessibility, Navigation, Pathfinding, & Obstacle avoidance dataset
September 28, 2023
DynIBaR: Space-time view synthesis from videos of dynamic scenes
September 26, 2023
Google Research embarks on effort to map a mouse brain