Solving fundamental computational problems that deliver meaningful impact for Google’s products, society, and scientific progress.
Athena is an international team of research scientists and engineers who tackle product-inspired problems with novel solutions to assist, complement, empower, and inspire people — from the everyday to the imaginative. Our work spans algorithms, artificial intelligence (AI), language understanding, and many other fields, and yields state-of-the-art breakthroughs in areas like efficiency, privacy, and user engagement.
We collaborate closely with partners across Google to take discoveries from publication to implementation for the company's largest and most trusted products. Beyond Google's portfolio of products and services, our contributions to AI, computer science, and machine learning power scientific advances in climate science, journalism, microeconomics, and other data-driven disciplines.
We recognize that AI is a foundational and transformational technology, and we are proud to contribute to a long history of responsible innovation. Our commitment to Responsible AI principles ensures that we develop and use technologies in ways that are socially beneficial, avoid bias, are built and tested for safety, are accountable to people, and are aligned with our values.
We extend machine learning approaches to better model the relationships contained in information networks. These models (e.g., semi-supervised similarity ranking & clustering, neural graph embedding, and graph convolutional approaches) are useful in a wide range of machine learning applications.
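As a toy illustration of how graph structure can carry label information, the sketch below runs a minimal semi-supervised label-propagation loop over a small adjacency matrix (a much simpler relative of the embedding and convolutional approaches mentioned above); the graph, seed labels, and hyperparameters are all invented for illustration.

```python
import numpy as np

# Toy undirected graph over 5 nodes (path-like with a triangle at one end).
# Nodes 0 and 4 are labeled (classes 0 and 1); the rest are unlabeled.
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

def label_propagation(A, seeds, num_classes, alpha=0.9, iters=50):
    """Propagate seed labels over the graph's row-normalized adjacency."""
    n = A.shape[0]
    D_inv = np.diag(1.0 / A.sum(axis=1))
    P = D_inv @ A                                 # row-stochastic transitions
    Y = np.zeros((n, num_classes))                # one-hot seed labels
    for node, cls in seeds.items():
        Y[node, cls] = 1.0
    F = Y.copy()
    for _ in range(iters):
        F = alpha * P @ F + (1 - alpha) * Y       # smooth, then re-inject seeds
    return F.argmax(axis=1)

labels = label_propagation(A, seeds={0: 0, 4: 1}, num_classes=2)
```

Unlabeled nodes inherit the class of the labeled node they are most strongly connected to, which is the core intuition behind semi-supervised similarity ranking and clustering on graphs.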
We apply auction theory, mechanism design, and advanced algorithms to improve Ads and other market-based products.
We apply integer programming, linear programming, constraint programming, and graph algorithms to solve problems at scale for transportation, search, natural language understanding, computer vision, robotics, and more.
We advance the state of the art in natural language technologies and build systems that learn to understand and generate language in context.
We focus on large-scale machine learning, including supervised learning (e.g., deep learning and kernel-based learning) and semi-/unsupervised learning (e.g., streaming clustering and efficient similarity search). The research areas include distributed optimization, personalization and privacy-preserving learning, on-device learning and inference, recommendation systems, data-dependent hashing, and learning-based vision. We develop principled approaches and apply them to Google’s products. Our team regularly publishes in top-tier learning conferences and journals. Our team’s work has been applied across Google, powering Search and Display Ads, YouTube, Android, Play, Gmail, Assistant, and Google Shopping.
We provide fast clustering of datasets that scales to billions of datapoints, with streaming throughput of hundreds of thousands of points per second. The goal is scalable nonparametric clustering that avoids simplistic generative assumptions, such as convexity of clusters, which rarely hold in practice. The team develops techniques that can handle drift in data distributions over time. These techniques are used in a large number of applications, including dynamic spam detection in multiple products and semantic expansion in NLP.
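To show the flavor of single-pass streaming clustering, the sketch below implements an online k-means update (a far simpler stand-in for the nonparametric methods described above); the data, cluster count, and seeding scheme are illustrative.

```python
import numpy as np

def streaming_kmeans(points, k):
    """One-pass online k-means: O(k) work per point, running-mean updates."""
    centroids = np.array(points[:k], dtype=float)  # seed with the first k points
    counts = np.ones(k)
    for p in points[k:]:
        j = int(np.argmin(np.linalg.norm(centroids - p, axis=1)))
        counts[j] += 1
        centroids[j] += (p - centroids[j]) / counts[j]  # incremental mean
    return centroids

rng = np.random.default_rng(0)
# Two well-separated toy clusters, streamed in interleaved order.
a = rng.normal(loc=[0, 0], scale=0.1, size=(500, 2))
b = rng.normal(loc=[5, 5], scale=0.1, size=(500, 2))
stream = np.empty((1000, 2))
stream[0::2] = a
stream[1::2] = b
centroids = streaming_kmeans(stream, k=2)
```

Each point is touched exactly once and only k centroids are kept in memory, which is what makes this style of update attractive at a throughput of hundreds of thousands of points per second.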
We sift through data to discover, understand, and model implicit signals in user behavior. We partner with Product Areas such as Ads, YouTube, Android, and more to add machine learning functionality to products across Google. Due to the open-ended nature of data mining, ongoing projects vary and currently include smart notifications on Android, Ads pricing optimizations, differential privacy work, and more.
The goals of the Structured Data group are to: 1) work closely with various product teams, leveraging our expertise in structured data to solve challenging technical problems and initiate new product features; 2) provide scientific expertise in computational journalism across Google in the fight against digital misinformation; and 3) drive a long-term agenda that advances state-of-the-art research in structured data with real-world impact.
We develop techniques for large-scale similarity search in massive databases with arbitrary data types (sparse or dense high-dimensional data) and similarity measures (metric or non-metric, potentially learned from data). The focus has been on developing data-dependent, ML-based hashing techniques and tree-hash hybrids that drive a multitude of applications at Google. This team also develops techniques for fast inference in machine learning models, including neural networks, often improving speed by more than 50x while maintaining near-exact accuracy.
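The hashes described above are learned from data; as a simpler stand-in, the sketch below uses random-hyperplane (SimHash) codes to illustrate the underlying idea of hashing vectors to short binary codes and ranking candidates by Hamming distance. The database, query, and bit width are all invented.

```python
import numpy as np

rng = np.random.default_rng(42)
db = rng.normal(size=(1000, 64))                 # toy database of vectors
query = db[123] + 0.05 * rng.normal(size=64)     # near-duplicate of item 123

n_bits = 32
planes = rng.normal(size=(64, n_bits))           # shared random hyperplanes

def simhash(X):
    """Binary codes via random hyperplanes: vectors with small angular
    distance tend to agree on most bits."""
    return (X @ planes > 0).astype(np.uint8)

codes = simhash(db)                              # (1000, 32) database codes
qcode = simhash(query[None, :])[0]
# Rank database items by Hamming distance of their code to the query's code.
ranking = np.argsort((codes != qcode).sum(axis=1))
```

Comparing 32-bit codes is far cheaper than comparing 64-dimensional floats, which is why hash codes (learned ones in practice) serve as the first, coarse stage of large-scale similarity search.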
Our mission is to accurately and efficiently represent, combine, optimize, and search models of speech and text. In particular, we devise automata, grammars, neural and other models that represent word histories, context-dependent lexicons for speech and keyboard, written-to-spoken transductions and extractions of dates, times, currency, measures, etc., and transliteration and contextual models of language. These can be combined and optimized to give high-accuracy, efficient speech recognition and synthesis, text normalization, and more. We provide efficient decoding algorithms to search these models. This work is used extensively in Google's speech and text processing infrastructure.
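As a toy illustration of the decoding step, the sketch below finds the best path through a tiny acyclic word lattice in the tropical (min, +) semiring, where arc costs play the role of negative log-probabilities. The lattice, labels, and costs are invented for illustration.

```python
# Toy weighted word lattice (acyclic): arcs are (src, dst, label, cost),
# with lower total cost meaning a more likely hypothesis.
arcs = [
    (0, 1, "recognize", 1.2), (0, 1, "wreck a nice", 2.3),
    (1, 2, "speech", 0.5),    (1, 2, "beach", 1.9),
]
start, final = 0, 2

def best_path(arcs, start, final):
    """Single-source shortest path in the tropical (min, +) semiring.
    One relaxation pass suffices because the lattice is acyclic and
    states are numbered in topological order."""
    dist = {start: (0.0, [])}
    for src, dst, label, cost in sorted(arcs):
        if src in dist:
            d, path = dist[src]
            if dst not in dist or d + cost < dist[dst][0]:
                dist[dst] = (d + cost, path + [label])
    return dist[final]

cost, words = best_path(arcs, start, final)
```

Production decoders operate on the same principle, but over composed, optimized automata with millions of states rather than a four-arc toy.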
Our mission is to create a comprehensive set of classifiers for detecting offensive, inappropriate, and controversial content in images and video. We accomplish this using a variety of techniques, including ensembles of ML models that are trained on images and text from the web. We also apply transfer learning on deep vision models for domain-specific classifier creation.
Semi-supervised learning is increasingly critical to solving many real-world product problems where data is sparse, sparsely labeled, or noisy. We develop semi-supervised and unsupervised machine learning systems that operate at Google scale. We apply our research to a broad range of problems, including query understanding, conversation understanding, and media understanding.
We develop systems for transforming cloud-resident ML models to highly efficient models that run on resource-constrained mobile devices.
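One common step in shrinking cloud-resident models for resource-constrained devices is post-training quantization. The sketch below shows symmetric int8 quantization of a weight matrix, an illustrative fragment rather than any specific pipeline; the matrix and shapes are invented.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric post-training quantization: store int8 weights plus one
    float scale; dequantize as q * scale (4x smaller than float32)."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(1)
w = rng.normal(size=(64, 64)).astype(np.float32)   # toy weight matrix
q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale               # dequantized weights
err = float(np.abs(w - w_hat).max())               # worst-case per-weight error
```

Because rounding to the nearest grid point costs at most half a step, the worst-case reconstruction error is bounded by scale / 2, which is typically small relative to the weight magnitudes.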
We enrich electronic conversations by understanding media using multi-modal signals from images, video, text, and the web. We accomplish this by marrying machine vision models with ML-enabled natural language understanding and generation systems.
Many fundamental learning problems we solve at Google have non-trivial combinatorial structure that prevents the application of general purpose ML algorithms. They exhibit complex and discontinuous loss functions (e.g., in pricing) or combinatorial explosions (such as contextual bandits, feature selection, or integer programming) and may require solutions that are robust against strategic behavior. Our team pushes the boundaries in these areas through research that blends techniques from learning theory, game theory, and discrete/continuous optimization.
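As a minimal example from this space, the sketch below implements an epsilon-greedy policy for a (non-contextual) multi-armed bandit, the simplest setting behind the contextual-bandit problems mentioned above; the arm means, reward noise, and hyperparameters are invented.

```python
import numpy as np

def epsilon_greedy(true_means, steps=5000, eps=0.1, seed=0):
    """Minimal multi-armed bandit: explore a random arm with probability
    eps, otherwise exploit the arm with the best empirical mean so far."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    counts = np.zeros(k)
    values = np.zeros(k)                      # running empirical mean per arm
    for _ in range(steps):
        if rng.random() < eps:
            arm = int(rng.integers(k))        # explore
        else:
            arm = int(np.argmax(values))      # exploit
        reward = rng.normal(true_means[arm], 1.0)
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values

counts, values = epsilon_greedy([0.1, 0.5, 0.9])
```

The discontinuous, exploration-dependent objective here is a small instance of the non-smooth loss landscapes that rule out off-the-shelf gradient-based ML in these problems.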
Glassbox Learning conducts R&D on making ML more controllable and interpretable without sacrificing accuracy. An important line of research is how to translate policy goals about metrics and fairness into machine learning training. For interpretability, Glassbox provides end-to-end guarantees on the relationship of inputs to outputs, such as monotonicity and other shape constraints. To achieve these goals, Glassbox researches and applies new algorithms for constrained optimization.
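A self-contained illustration of a shape constraint: pool-adjacent-violators (PAV) computes the least-squares non-decreasing fit to a sequence, i.e., monotonicity guaranteed by construction. This is a much simpler relative of the constrained-optimization methods described above; the input data is invented.

```python
import numpy as np

def isotonic_fit(y):
    """Pool-adjacent-violators: least-squares fit of a non-decreasing
    sequence to y. Blocks of (mean, count) are merged whenever the
    monotonicity constraint would be violated."""
    stack = []  # blocks of [mean, count], with means non-decreasing
    for v in y:
        block = [float(v), 1.0]
        while stack and stack[-1][0] > block[0]:
            m, w = stack.pop()
            block[0] = (m * w + block[0] * block[1]) / (w + block[1])
            block[1] += w
        stack.append(block)
    # Expand each block back to its original length.
    return np.array([m for m, w in stack for _ in range(int(w))])

y = np.array([1.0, 3.0, 2.0, 4.0, 3.5, 5.0])
fit = isotonic_fit(y)
```

Each violating pair (3.0, 2.0) and (4.0, 3.5) is pooled into its average, yielding a curve that can only go up, exactly the kind of input-output guarantee a monotonic shape constraint provides.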
Dataset Search, also known as Science Search, is a project to index all datasets on the web and to make the metadata (and, where possible, the data itself) searchable and useful. Datasets and related data tend to be spread across multiple data repositories on the web. In most cases, the data is neither linked nor indexed, which makes searching tedious or, in some cases, impossible.
Su Wang
Emmanuel Guere
Ross Michael Anderson
You (Will) Wu
Brandon Asher Mayer
Tianjian Lu
Jae Hun Ro
Harikrishna Narasimhan
Alexander Ku
Cibu C Johny
Mihai Amarandei-Stavila
Sumit Kumar Sanghai
Frederic Didier
Rasmus Munk Larsen
Donald Metzler
Kshipra Bhawalkar
Yun-hsuan Sung
Felix Yu
Vinh Q. Tran
Serena Lutong Wang
Ashish V. Thapliyal
Santiago R. Balseiro
Moustafa Alzantot
Vincent Pierre Cohen-addad
Fotios Iliopoulos
Shankar Kumar
Neha Arora
Yang Li
Hao Zhang
Amr Ahmed
Mingyang Zhang
Harsh Mehta
Cheng Li
Gang Li
Tao Chen
Krishna Srinivasan
Xin Wang
Deepak Ramachandran
Qing Wang
Li Yang
Di Wang
Yichen Zhou
Zach Fisher
Jing Lu
Zhe Feng
Kate Lin
Xi Chen
Chao Zhang
Lin Chen
Jennifer Brennan
Vahab S. Mirrokni
Corinna Cortes
Michael Riley
Tania Bedrax-Weiss
Gagan Aggarwal
Sanjiv Kumar
Jon Orwant
Aranyak Mehta
Ameesh Makadia
Vincent Furnon
Sarvjeet Singh
Andrew Tomkins
Silvio Lattanzi
Sergei Vassilvitskii
Vidhya Navalpakkam
Michael Bendersky
Radu Soricut
Marc Najork
Craig Boutilier
Rich Washington
Katrina Sostek
Anna Katanova
Fabien Viger
Cyril Allauzen
Kevin Aydin
Laurent Perron
Mario Guajardo-Céspedes
Jeongwoo Ko
Bruno De Backer
Sameer Agarwal
Tom Bagby
Chih-wei Hsu
Alex Fabrikant
Alexander Gutkin
Allan Heydon
Xavi Gonzalvo
Zoya Svitkina
Thomas Furness
Kai Kohlhoff
Afshin Rostamizadeh
Umar Syed
Mohammad Mahdian
Kevin Canini
Kishore Papineni
Clement Courbet
Tamas Sarlos
Bo Pang
Burcu Karagol Ayan
Ondrej Sykora
David Rybach
Sandeep Tata
Sebastian Goodman
Alejandra Estanislao
Guillaume Chatelet
Brian Roark
Nan Ding
Dustin Zelle
Pawel Lichocki
Erik Vee
Jai Gupta
Jing Xie
Aaron Archer
Jonathan Halcrow
Renato Paes Leme
Morteza Zadimoghaddam
Austin Waters
Simon Baumgartner
Natasha Noy
Flip Korn
Zhenhai Zhu
Ruiqi Guo
Da-Cheng Juan
Chris Welty
Saeed Alaei
Anne-Claire Haury
Rina Panigrahy
Dmitry Storcheus
Sreenivas Gollapudi
Kyle Gorman
Edith Cohen
Xuanhui Wang
Joonseok Lee
Ying Sheng
Andres Munoz Medina
Mandy Guo
Karthik Raman
Ehsan Variani
Balasubramanian Sivan
Ji Ma
Seungyeon Kim
Kostas Kollias
Yin-Wen Chang
Isin Demirsahin
Dana Alon
Bertrand Le Cun
Nachiappan Valliappan
Alessandro Epasto
Ananda Theertha Suresh
Spurthi Amba Hombaiah
Sherol Chen
Vincent Perot
Jialu Liu
Bryan Perozzi
Felix Chern
Manish Purohit
Weize Kong
Junfeng He
David Applegate
Stéphane Soppera
Kareem Amin
Walid Krichene
Si Si
Grady Simon
Filip Radlinski
Zhen Qin
CJ Carey
Sen Zhao
Tomer Levinboim
Jason Baldridge
Guolong Su
Gustavo Hernandez Abrego
Yinlam Chow
Sashank Reddi
Michal Lukasik
John Palowitch
Giulia DeSalvo
Dimitris Paparas
Sarah Mohajeri
Dara Bahri
Badih Ghazi
Hossein Esfandiari
Chun-Ta Lu
Arjun Gopalan
Sara Ahmadian
Christian Tjandraatmadja
Song Zuo
Tianqi Liu
Andreas Veit
Ankit Singh Rawat
Srinadh Bhojanapalli
Masrour Zoghi
Jean Pouget-Abadie
Daniel Glasner
Ameya Velingker
Sasan Tavakkol
Honglei Zhuang
Fei Sha
Srikumar Ramalingam
Jieming Mao
Pasin Manurangsi
Phil Sun
Zonglin Li
Jeremiah Liu
Felix Stahlberg
Nick Doudchenko
Marialena Kyriakidi
Le Yan
Sadeep Jayasumana
Matthew Fahrbach
Wittawat Jitkrittum
Matthew Joseph
Juan Pablo Vielma
Wennan Zhu
Yuan Deng
Carolina Osorio
Rolf Jagerman
Carlos Esteves
Rajat Sen
Ayan Chakrabarti
Ceslee Montgomery
Otilia Stretcu
Mehran Kazemi
Luke Vilnis
Anton Tsitsulin
Kai Hui
Rajesh Jayaram
Parker Riley
Tal Schuster
Joey Huchette
Yichao Zhou
Krisztian Balog
Hussein Hazimeh
Kazuma Hashimoto
Thibaut Cuvelier
Jiaming Shen
Yifeng Teng
Sjoerd van Steenkiste
Cenk Baykal
Sami Abu-El-Haija
Yaqing Wang
Najoung Kim
Beliz Gunel