Google is buying Play Store developers' code for AI training
A confidential pilot turns Android app codebases into paid data assets, with terms and payout sizes still undisclosed.
By Ryan Merket
Why it matters
Google's pilot could turn private app code into a new form of non-dilutive funding for founders, but it also creates fresh diligence questions around IP, data rights, and defensibility.

Google is offering to pay some Android app developers for access to the code behind Play Store apps, according to a 404 Media report based on an email sent to a developer whose app has millions of downloads.
The pitch, as reported by 404 Media's Jason Koebler, invites a "select group of Google Play app developers" into a "confidential content offer pilot." Google frames the program as a way for developers to "generate additional revenue" from production codebases, archived projects, prototypes, and side projects. The email says developers would keep 100% of their intellectual property and grant Google a non-exclusive license.
That matters for founders because the offer treats a startup's codebase not just as infrastructure, but as an asset that can be licensed separately from the product. For small Android teams, especially bootstrapped developers, that could look like non-dilutive capital. For venture-backed companies, it raises a more complicated question: whether selling access to proprietary code for AI training creates future diligence issues around customer commitments, open-source obligations, security posture, and the company's own technical moat.
What Google is actually asking for
The email quoted by 404 Media says Google is looking for "high-quality, real-world codebases to help improve Google's developer tools and products." It says the code can help with "understanding complex logic" and "developing coding evals and benchmarks."
The email itself does not mention artificial intelligence, according to 404 Media. But a link inside the message reportedly leads to a Google page about "partnerships to improve our AI products," where Google says it is seeking to "pay for the delivery of non-public content in a range of media formats." That distinction is important: developers are not just being asked to share public repositories. Google is seeking access to private code that, in normal circumstances, would never be scraped from the open web.
The terms described in the email are developer-friendly on the surface: non-exclusive access, retained IP, and the ability to monetize the same data elsewhere. But the key business variables are not in the public record. 404 Media does not report how much Google is offering, how developers are selected, whether Google can use the code for model training specifically, what security review is required, or whether developers can exclude secrets, customer-specific logic, third-party licensed components, or commercially sensitive modules.
Why now
Google has spent the past year publicly reorganizing much of its product story around AI, including Gemini-branded models and agents highlighted on its own site. Coding assistants, meanwhile, have become one of the clearest places where AI is already changing developer behavior, and the 404 Media report names Anthropic's Claude Code and Microsoft's Copilot as competitive pressure points.
The pilot suggests Google wants more than the public code that AI companies have historically scraped or licensed. Real production code can be more valuable than polished examples because it includes messy architecture, legacy decisions, app-specific business logic, and the patterns developers actually ship under constraints. It can also expose risks that generic benchmarks miss.
For founders, that is the strategic subtext. The codebase sitting inside an Android app may now have a market value independent of app revenue, ads, subscriptions, or acquisition interest. But the price of that value is disclosure. A founder who licenses code to Google may gain cash without selling equity, while also handing a major platform operator a deeper look at how the product is built.
The funding implication for startups
If Google expands the pilot, investors may start asking portfolio companies whether training-data licensing is an acceptable revenue line or a red flag. For pre-seed and seed founders, selling code access could extend runway. For companies with defensible software, it could also undercut the argument that the codebase is proprietary enough to support venture-scale returns.
The answer will depend on the contract, not the pitch email. A non-exclusive license can still be broad. A retained-IP clause does not answer whether outputs, embeddings, evals, or future developer products can reflect patterns learned from the code. And a one-time check may look less attractive if it complicates an enterprise sale, an acquisition process, or a later claim that the company's technical work is hard to replicate.
Google's offer is also a signal to the rest of the AI market: private developer code is becoming paid training inventory. The developers who built the apps now have leverage, but only if they know what rights they are selling.