In an interview on the sidelines of the Google I/O Connect held in Bengaluru on Wednesday, Ajjarapu reasoned that with its largest mobile-first population, micro-payment and digital payment models, a booming startup and developer ecosystem, and diverse language landscape, “India is uniquely positioned to drive the next generation of AI innovation.”
In India, Google is working with the Ministry of Electronics and Information Technology’s Startup Hub to train 10,000 startups in AI, expanding access to its AI models such as Gemini and Gemma (a family of open models built on Gemini technology), and introducing new language tools from Google DeepMind India, according to Ajjarapu.
It supports “eligible AI startups” with up to $350,000 in Google Cloud credits “to invest in the cloud infrastructure and computational power essential for AI development and deployment.”
Karya, an AI data startup that empowers low-income communities, is “using Gemini (also Microsoft products) to design a no-code chatbot,” while “Cropin (in which Google is an investor) is using Gemini to power its new real-time generative AI, agri-intelligent platform.”
Manu Chopra, co-founder and CEO of Karya, said he uses Gemini “to take Karya Platform global and enable low-income communities everywhere to build truly ethical and inclusive AI.”
Gemini has helped Cropin “build a more sustainable, food-secure future for the planet,” according to Krishna Kumar, the startup’s co-founder and CEO.
Robotics startup Miko.ai “is using Google LLM as a part of its quality control mechanisms,” says Ajjarapu.
According to Sneh Vaswani, co-founder and CEO of Miko.ai, Gemini is the “key” to helping it “provide safe, reliable, and culturally appropriate interactions for children worldwide.”
Helping farmers
With an eye on harnessing the power of AI for social good, Google plans to soon launch the Agricultural Landscape Understanding (ALU) Research API, an application programming interface to help farmers leverage AI and remote sensing to map farm fields across India, according to Ajjarapu.
The solution is built on Google Cloud and on partnerships with the Anthro Krishi team and India’s digital AgriStack. It is piloted by Ninjacart, Skymet, Team-Up, IIT Bombay, and the Government of India, he pointed out.
“This is the first such model for India that will show you all field boundaries based on usage patterns, and show you other things like sources of water,” he added.
On local language datasets, Ajjarapu underscored that Project Vaani, a collaboration with the Indian Institute of Science (IISc), has completed Phase 1, collecting over 14,000 hours of speech data across 58 languages from 80,000 speakers in 80 districts. In Phase 2, the project plans to expand its coverage to all states of India, for a total of 160 districts.
Project Vaani introduced IndicGenBench, a benchmarking tool tailored for Indian languages, which covers 29 languages. Additionally, Project Vaani is open-sourcing its CALM (Composition of Language Models) framework for developers to integrate specialised language models with Gemma models. For example, integrating a Kannada specialist model into an English coding assistant may help in offering coding assistance in Kannada as well.
Google, which offers Gemini Nano for mobile devices, has introduced the MatFormer framework, developed by the Google DeepMind team in India. According to Manish Gupta, director, Google, it allows developers to mix different sizes of Gemini models within a single platform.
This approach optimises performance and resource efficiency, ensuring smoother, faster, and more accurate AI experiences directly on user devices.
India-born Ajjarapu was part of Google’s corporate development team, which handles mergers and acquisitions, when Google acquired UK-based AI company DeepMind in 2014. As a result, he got the opportunity to conduct the due diligence and lead the integration of DeepMind with Google.
Research, products and services
Ajjarapu, though, was not a researcher and was unsure whether he could contribute meaningfully to DeepMind’s mission, which “at that time, was to solve intelligence.” This prompted him to quit Google in 2017 after 11 years and launch Lyft’s self-driving division. Two years later, Ajjarapu rejoined Google DeepMind as senior director, engineering and product.
Last year, Alphabet merged the Brain team from Google Research and DeepMind into a single unit called Google DeepMind, and made Demis Hassabis its CEO. Jeff Dean, who reports to Sundar Pichai, CEO of Google and Alphabet, serves as chief scientist to both Google Research and Google DeepMind.
While the latter unit focuses on research to power the next generation of products and services, Google Research deals with fundamental advances in computer science across areas such as algorithms and theory, privacy and security, quantum computing, health, climate and sustainability and responsible AI.
Has this merger led to a more product-focused approach at the cost of research, as critics point out? Ajjarapu counters that Google was still training its Gemini foundation models when the units were merged in April 2023, after which it launched the Gemini models in December, followed by Gemini 1.5 Pro, “which has technical breakthroughs like a long context window (2 million tokens that covers about 1 hour of video, or 11 hours of audio, or 30,000 lines of code).”
A context window is the amount of text, measured in tokens (words or pieces of words), that a language model can take as input when generating responses.
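To make the idea concrete, here is a minimal sketch of how a fixed context window limits what a model can see. It assumes whitespace splitting as a stand-in for a real tokenizer (production models, including Gemini, use subword tokenizers), and the function names `tokenize` and `fit_to_context` are hypothetical, chosen only for illustration.

```python
def tokenize(text: str) -> list[str]:
    # Assumption: whitespace splitting stands in for a real subword tokenizer.
    return text.split()


def fit_to_context(tokens: list[str], window: int) -> list[str]:
    # Keep only the most recent `window` tokens; anything earlier is dropped
    # and is simply invisible to the model.
    if window <= 0:
        return []
    return tokens[-window:]


prompt = "India is uniquely positioned to drive the next generation of AI innovation"
tokens = tokenize(prompt)
visible = fit_to_context(tokens, window=5)
print(len(tokens), visible)
```

With a window of five tokens, only the tail of the sentence survives. A larger window lets the model take in longer documents, recordings or codebases in a single request.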
“Today, more than 1.5 million developers globally use Gemini models across our tools. The fastest way to build with Gemini is through Google AI Studio, and India has one of the largest developer bases on Google AI Studio,” he notes.
Google Brain and DeepMind, according to Ajjarapu, were also collaborating “for many years before the merger”.
“We believe we built an AI super unit at Google DeepMind. We now have a foundational research unit, which Manish is a part of. Our team is part of that foundation research unit. We also have a GenAI research unit, focused on pushing generative models regardless of the technique — be it large language models (LLMs) or diffusion models that gradually add noise (disturbances) to data (like an image) and then learn to reverse this process to generate new data,” said Ajjarapu, who is part of the product unit and whose job is to “take the research and put it in Google products.”
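The noising-and-reversing idea behind diffusion models can be sketched in a few lines. This is an illustration only and does not describe how Google’s models are implemented: it assumes a single noise level `alpha_bar`, uses a short list of numbers in place of an image, and reverses the process with the exact noise that was added, where a real diffusion model would use a trained network to estimate that noise. The names `add_noise` and `remove_noise` are hypothetical.

```python
import math
import random


def add_noise(x0, alpha_bar, rng):
    # Forward step: blend the clean data with Gaussian noise.
    # alpha_bar near 1 keeps most of the signal; near 0 it is mostly noise.
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
        for x, e in zip(x0, eps)
    ]
    return xt, eps


def remove_noise(xt, eps, alpha_bar):
    # Reverse step: subtract the noise and rescale.
    # Assumption: `eps` is the true noise; a trained model would predict it.
    return [
        (x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
        for x, e in zip(xt, eps)
    ]


rng = random.Random(0)
pixels = [0.1, 0.5, 0.9]  # stand-in for image data
noisy, noise = add_noise(pixels, alpha_bar=0.5, rng=rng)
recovered = remove_noise(noisy, noise, alpha_bar=0.5)
```

Because the sketch reuses the true noise, `recovered` matches `pixels` up to floating-point error. The hard part in practice is learning to estimate that noise from the noisy data alone.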
Google also has a science team, which is primarily responsible for things like protein folding and discovering new materials. Protein folding refers to the problem of determining the structure of a protein from its sequence of amino acids alone.
“There are many paradigms to go after AI development, and we feel like we’re pretty well covered in all of them,” he says. “We’re now fully in our Gemini era, bringing the power of multimodality to everyone.”
Match, incubate and launch
And how does Google decide which research products and product ideas to prioritise and invest in? According to Ajjarapu, the company uses an approach called “match, incubate, and launch.”
Is there a problem that’s ready to be solved with a technology that’s readily available? That’s the matching part. For instance, for graph neural nets, the map is a graph. So there is a match. However, even if there’s a match, performance is not guaranteed when it comes to generative AI.
“You have to iterate it,” he says.
The next step involves de-risking an existing technology or research breakthrough for the real world since not all of them are ready to be made into products. This phase is called incubation. The final stage is the launch.
“That’s the methodical approach that we follow. But given the changing nature of the world, and changing priorities, we try to be nimble,” says Ajjarapu.
Gupta, on his part, asks his research team to identify research problems that will have “some kind of a transformative impact on the world, which makes it worthy of being pursued, even if the problem is very hard or the chances of failure are very high.”
And how is Google DeepMind addressing ethical concerns around AI, especially biases and privacy? According to Gupta, the company has developed a framework to evaluate the societal impact of technology, created red teaming techniques, data sets and benchmarks, and shared them with the research community.
He adds that his team contributed the SeeGULL dataset (benchmark to detect and mitigate social stereotypes about groups of people in language models) to uncover biases in language models based on aspects such as nationality and religion.
“We work to understand and mitigate these biases and aim for cultural inclusivity too in our models,” says Gupta.
Ajjarapu adds that the company’s focus is on “responsible governance, responsible research, and responsible impact.”
He cited the example of Google’s SynthID, an embedded watermark and metadata labelling solution that flags images (deepfakes) generated using Google’s text-to-image generator, Imagen.