Sovereign AI: a stack, not a label

Sovereign AI is becoming infrastructure. But what do governments and organizations actually control? A layer-by-layer guide to what the claims miss

Samuel van Hoogstraten, Trompe-l'oeil Still Life, 1664

A government can buy a system sold as "sovereign AI" and still depend on foreign chips, foreign cloud technology, foreign model weights, foreign training data, foreign software frameworks, foreign licences, and foreign legal regimes.

That may be a reasonable compromise. It may even be the only realistic compromise for now. But it should not pass as sovereignty without a footnote.

This is the tension I regularly hear in conversations with people working on public AI infrastructure, public-interest technology, and government deployment. The question is rarely whether sovereignty matters; most people in those rooms agree that it does. The harder question is much more practical: sovereignty, but where, at what layer, for which use case, and at what cost?

I have recently had the chance to discuss these questions with peers working close to public AI infrastructure in France. France is a useful case because it is not only issuing speeches. It is trying to move real public-sector workloads, structure demand, and build shared state infrastructure. That is real work. It is also exactly where the vocabulary starts to become dangerous.

A system can be hosted in France, procured through a French public framework, legally protected under a European certification scheme, and still depend on foreign chips, foreign model families, foreign software tooling, or foreign training data. The point is not to dismiss the effort but to name the remaining dependencies clearly enough that they can be governed.

CNAS's Sovereign AI Index makes the point clearly: most sovereign AI projects still depend heavily on foreign, often US, technology across the stack, and for most countries the realistic task is managing dependencies, not eliminating them. Brookings makes a similar argument: AI rests on global foundations from which no country can fully separate, so sovereignty requires understanding and managing interdependencies rather than pretending they disappear.

What exactly is being claimed? What is actually controlled? What dependency remains? And which dependencies are acceptable for ordinary public services, versus unacceptable for critical state functions?

That is where sovereign AI needs a truth label.

Sovereignty is not one thing

The phrase "sovereign AI" is now used for way too many things.

Sometimes it means national cloud infrastructure. Sometimes data residency. Sometimes a domestic model. Sometimes open weights. Sometimes local language support. Sometimes industrial strategy. Sometimes it means "not OpenAI," which is not a strategy.

Although these things are related, they are not interchangeable.

A country can have sovereign infrastructure and foreign models. It can have domestic models trained on foreign compute. It can have open weights with unknown training data. It can have excellent local language performance on benchmarks and still rely on English-dominant pre-training. It can have strong data residency and weak operational resilience. It can have a legal wrapper around a technology stack it does not control.

None of this automatically makes the effort useless. The more useful frame is a stack, not a label.

The useful question is not "is it sovereign?" The useful question is: sovereign enough for what?

A chatbot answering general questions on a city website does not need the same sovereignty threshold as a system supporting emergency services, tax enforcement, social benefits, migration decisions, healthcare triage, judicial administration, or defence procurement.

A low-risk public information tool can probably tolerate more dependency. A high-stakes administrative system cannot. An internal drafting assistant has one risk profile. An agent that modifies records, sends notifications, allocates resources, or triggers follow-up actions has another.

If a ministry cannot explain where the model runs, who can update it, what law reaches the provider, what data shaped the model, whether the model can be replaced, and what happens if the vendor changes terms, then it does not have a sovereignty strategy. It has a supplier relationship. Maybe a good one. But still a supplier relationship.

The sovereignty stack

A practical sovereignty assessment should go layer by layer. Not because every country needs to control every layer (most cannot), but because decision-makers need to know which layers they control, which ones they borrow, and which ones they are choosing not to solve.

Hardware

Hardware is where the conversation gets uncomfortable.

Frontier AI depends on chips, advanced packaging, fabs, lithography, energy, cooling, data centres, and logistics. No country fully controls this chain alone.

The scale gap is not abstract. Europe operates 14 supercomputers and 19 AI Factories under EuroHPC JU, backed by roughly €10 billion of combined funding over 2021–2027. The entire EuroHPC compute fleet totals roughly 57,000 accelerators. Meta's H100 deployment target alone is 350,000 GPUs, more than six times the aggregated EuroHPC fleet. Google, Amazon, Meta, and Microsoft combined are spending $725 billion on AI infrastructure in 2026. JUPITER, Europe's first exascale system, delivers 1 exaFLOP. A single Google Ironwood superpod delivers 42.5 exaFLOPS. Europe accounts for under 5% of global AI compute performance.

The AI Factories are real infrastructure, genuinely useful for startups, researchers, and universities. In the same period, Microsoft committed €4 billion to French data centres, bringing 25,000 advanced GPUs. Those numbers are not in the same conversation.
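To make the scale gap concrete, here is a minimal sketch of the arithmetic using only the figures quoted above. It treats every accelerator as an interchangeable unit and every exaFLOPS figure as comparable, which is a simplification: vendors may quote performance at different numeric precisions, so the ratios are indicative rather than exact. The constant names are mine, not taken from any source.

```python
# Figures as quoted in the text above; none are independently verified here.
EUROHPC_ACCELERATORS = 57_000    # "roughly 57,000 accelerators"
META_H100_TARGET = 350_000       # Meta's H100 deployment target
MICROSOFT_FRANCE_GPUS = 25_000   # GPUs tied to Microsoft's French commitment
JUPITER_EXAFLOPS = 1.0           # JUPITER, as quoted
IRONWOOD_POD_EXAFLOPS = 42.5     # one Ironwood superpod, as quoted


def ratio(a: float, b: float) -> float:
    """Return a / b rounded to two decimals."""
    return round(a / b, 2)


meta_vs_eurohpc = ratio(META_H100_TARGET, EUROHPC_ACCELERATORS)
microsoft_fr_vs_eurohpc = ratio(MICROSOFT_FRANCE_GPUS, EUROHPC_ACCELERATORS)
ironwood_vs_jupiter = ratio(IRONWOOD_POD_EXAFLOPS, JUPITER_EXAFLOPS)

print(f"Meta H100 target / EuroHPC fleet:        {meta_vs_eurohpc}x")
print(f"Microsoft France GPUs / EuroHPC fleet:   {microsoft_fr_vs_eurohpc}x")
print(f"Ironwood superpod / JUPITER (as quoted): {ironwood_vs_jupiter}x")
```

On these numbers, one company's deployment target is about six times the whole EuroHPC fleet, and a single private commitment in France is a little under half of it.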

For most governments and organizations, hardware sovereignty is not a short-term procurement objective. It is industrial policy. If a compute strategy depends on a provider exposed to foreign export controls, sanctions, or supply constraints, that dependency should appear high in the risk register. Not in the appendix.

The decision is not "do we have hardware sovereignty?" For most countries, the answer is no. The decision is: which workloads require guaranteed compute access, under whose jurisdiction, with what fallback if access tightens?

Cloud and infrastructure jurisdiction

Cloud sovereignty is about where data and workloads physically run, which law applies, who operates the infrastructure, and whether foreign legal regimes can reach the provider.

France's SecNumCloud framework is one of the more advanced attempts to turn this into enforceable criteria. The provider must be legally domiciled in the EU, no non-European entity can hold more than 24% of capital and voting rights, and the service must be structured so that extraterritorial laws such as the US CLOUD Act and FISA cannot compel data disclosure. That 24% capital rule is why AWS, Microsoft, and Google Cloud cannot obtain SecNumCloud certification directly, even with French data centres. Bleu (Capgemini-Orange, running on Microsoft Azure) and S3NS (Thales, recently certified with Google Cloud technology) are French legal entities that satisfy the criteria, but the underlying technology stack is still American.

Albert API is presented by DINUM as a way for administrations to access generative AI in a secure, sovereign, SecNumCloud-certified environment, currently hosted on certified infrastructure at Outscale. All of that is real protection. It is just not the whole answer.

What certification does not do is guarantee operational independence if the underlying technology stack is still foreign-controlled, or resilience if the licensor changes terms. That is the gap worth naming, not as a criticism of the framework, but as the next problem to solve.

Models

Access is not ownership. Running a model on sovereign infrastructure is not the same as controlling what it represents. What model sovereignty actually requires is more demanding than most procurement frameworks acknowledge: training data that is auditable and free of foreign legal claims; training compute under national or allied control; weights owned outright; the ability to retrain, audit, or modify without foreign permission; and inference tooling that does not depend on foreign IP. By that standard, almost no current sovereign AI initiative achieves model sovereignty.

Open weights help with portability, and that matters. They reduce dependence on a live API. But weights are the output of training, not the training itself. They do not tell you what was in the dataset, what was filtered out, which human preferences shaped the model, or whether it can be reproduced. They do not remove the compute dependency.

CNAS found that among sovereign AI projects building on open-weight models, Meta's Llama family appears most often, followed by Mistral, Gemma, and Qwen. Even at the model layer, many sovereign AI ambitions still rely on US foundations. Qwen, now the most widely downloaded open-weight family globally, is trained on data reflecting Chinese content moderation decisions and has built-in restrictions on certain topics. A government building sovereign infrastructure on Qwen is inheriting decisions made in Hangzhou.

The exception worth knowing is AI2's OLMo family, where training data, training code, and evaluation pipeline are fully public and reproducible. That is a meaningfully different sovereignty profile. We return to fully sovereign model initiatives later in this piece.

A government or organization using Llama, Mistral, Qwen, Gemma, or another open-weight family should say exactly what it is getting: portability, local deployment options, lower API dependency, maybe better inspectability at runtime. Open weights can give you operational portability. They do not give you sovereignty.

Data and language

Language is usually framed as inclusion: more languages, more citizens served. That is right but incomplete. Language is also a sovereignty layer.

A model trained mostly on English data can produce fluent French, Spanish, Portuguese, Japanese, Hindi, or Arabic. That does not mean it has internalized the same assumptions about evidence, authority, uncertainty, time, politeness, institutional responsibility, or administrative judgment. Fine-tuning on a local language does not fix representational gaps installed at pre-training. It makes the outputs look right. Whether the reasoning underneath is right is a different question.

For many use cases this may not matter much. For public administration, law, medicine, education, justice, welfare, policing, migration, and civic communication, it matters considerably more. A system can sound local and still reason with imported defaults.

Most evaluations do not test this. They test task performance, not institutional fit. If an administrative assistant is used to draft letters to citizens, classify requests, or support eligibility decisions, the model's implicit calibration matters. What counts as a reasonable justification? What does "fair" sound like in that legal and cultural context?

This is why India is interesting. The work around Bhashini and national language infrastructure is not only about interface translation. India's broader AI mission includes datasets, compute, indigenous model work, and language technologies. The direction is notable: language capacity built into the infrastructure, not only into the front end.

Latin America offers a different angle. Latam-GPT is not trying to beat OpenAI on frontier performance. Developed with more than 30 institutions across eight countries and trained on regional data, it starts from a realistic premise: most countries cannot build credible AI infrastructure alone, but they might together.

Tooling and runtime

Models do not run in isolation. They run through frameworks, serving stacks, orchestration layers, monitoring tools, guardrails, evaluation pipelines, vector databases, identity systems, APIs, and developer workflows. If all of those are foreign-controlled, closed, or tightly coupled to one provider, the model layer alone does not save you.

A government or organization may run an open-weight model on certified infrastructure and still be dependent on a proprietary orchestration layer, a non-portable vector database, or a vendor-specific observability stack.

Most strategies budget for the vision, not the operations. A strategy document does not operate a platform. This is where many deployments discover that the sovereignty was shallower than the contract suggested.
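One way to keep the runtime layer from hardening around a single provider is to make application code depend on a narrow interface rather than on a vendor SDK. The sketch below is illustrative only: `TextModel`, `LocalOpenWeightModel`, `HostedApiModel`, and `draft_letter` are hypothetical names, and both backends are stand-ins that make no network calls.

```python
from typing import Protocol


class TextModel(Protocol):
    """Hypothetical minimal interface the application depends on."""

    def generate(self, prompt: str) -> str: ...


class LocalOpenWeightModel:
    """Stand-in for a model served on infrastructure the institution operates."""

    def generate(self, prompt: str) -> str:
        return f"[local] {prompt}"


class HostedApiModel:
    """Stand-in for a vendor-hosted API; no real network call is made."""

    def generate(self, prompt: str) -> str:
        return f"[hosted] {prompt}"


def draft_letter(model: TextModel, subject: str) -> str:
    """Application code sees only the TextModel interface."""
    return model.generate(f"Draft a letter about {subject}")


# Swapping providers becomes a configuration choice, not a rewrite.
BACKENDS: dict[str, TextModel] = {
    "local": LocalOpenWeightModel(),
    "hosted": HostedApiModel(),
}

print(draft_letter(BACKENDS["local"], "parking permits"))
print(draft_letter(BACKENDS["hosted"], "parking permits"))
```

An interface like this does not make a model replaceable on its own. Prompts, evaluations, and guardrails tend to be tuned to one model family, so an exit plan still has to be tested, not just designed.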

Agentic authority

This is the layer that is becoming unavoidable fastest.

Most current AI governance still assumes a system that produces outputs and a human who decides what to do with them. That model is already outdated.

Agentic systems do not only produce text. They can call APIs, send messages, modify records, trigger workflows, book appointments, escalate cases, spend money, or interact with other systems. Once an AI system acts, the sovereignty question changes. It is no longer only: where is the data? It becomes: who authorized this action, on whose behalf, under which law, with what audit trace, with what human checkpoint, and who is liable if the action causes harm?

Singapore's IMDA published a Model AI Governance Framework for Agentic AI in January 2026 and updated it in May 2026. The framework emphasizes bounding agent powers, meaningful human accountability, lifecycle controls, transparency, and risks such as third-party agents and automation bias.

On May 8, 2026, Chinese authorities issued implementation guidelines for AI agents, defining them as systems capable of autonomous perception, memory, decision-making, interaction, and execution. The framing is characteristic: acknowledge the risks, push development and governance forward simultaneously, do not let caution slow the investment.

Both treat agents as a distinct governance problem. For governments and organizations, the question is not simply "are we compliant?" It is: how much authority are we giving this system, and where does that authority stop? Human oversight checkpoints help. Least-privilege access helps. Sandboxing and logging help. For now, any government or organization deploying agents in public services should assume the legal framework is behind the technical capability.
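The three controls named above (least privilege, human checkpoints, logging) can be sketched as a single authorization gate that every agent action must pass through. This is a minimal illustration under stated assumptions, not an implementation of either framework: the action names, the `ALLOWED_ACTIONS` and `NEEDS_HUMAN_APPROVAL` sets, and the `authorize` function are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Hypothetical policy: what this agent may do at all (least privilege),
# and which of those actions need a human decision first.
ALLOWED_ACTIONS = {"read_record", "send_notification", "modify_record"}
NEEDS_HUMAN_APPROVAL = {"send_notification", "modify_record"}


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, action: str, on_behalf_of: str,
               approved_by: Optional[str], outcome: str) -> None:
        """Append one decision, allowed or not, to the audit trail."""
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "on_behalf_of": on_behalf_of,
            "approved_by": approved_by,
            "outcome": outcome,
        })


def authorize(action: str, on_behalf_of: str,
              approved_by: Optional[str], log: AuditLog) -> bool:
    """Decide whether the agent may act, and log the decision either way."""
    if action not in ALLOWED_ACTIONS:
        outcome = "denied: outside the agent's granted powers"
    elif action in NEEDS_HUMAN_APPROVAL and approved_by is None:
        outcome = "held: waiting for human approval"
    else:
        outcome = "allowed"
    log.record(action, on_behalf_of, approved_by, outcome)
    return outcome == "allowed"


log = AuditLog()
print(authorize("read_record", "case-officer-1", None, log))
print(authorize("modify_record", "case-officer-1", None, log))
print(authorize("modify_record", "case-officer-1", "supervisor-7", log))
print(authorize("spend_money", "case-officer-1", "supervisor-7", log))
```

The design choice worth noticing is that refusals and holds are logged as well as approvals. The audit trail then answers "what did the agent try to do?" and not only "what did it do?"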

What different countries are really testing

This is not a ranking but my attempt at a pattern map.

France is testing whether legal and certification frameworks can drive real public-sector adoption. SecNumCloud, Albert API, Cloud Pi, Nubo, and the cloud doctrine are all part of that landscape. The Cour des Comptes said it plainly in 2025: the problem is no longer the absence of a strategy. It is closing the gap between strategy and adoption.

India is testing language sovereignty and public digital infrastructure. Bhashini, AIKosha, the IndiaAI compute program, and indigenous model efforts show a strategy built around scale, inclusion, and linguistic diversity. The challenge is whether that infrastructure can move from access and language coverage toward deeper model and data sovereignty.

Japan is testing managed interdependence rather than autarky. Rakuten AI 3.0, released in March 2026 under the GENIAC project promoted by METI and NEDO, is a large Japanese language model with strong benchmark performance. Japan is also the first foreign participant in the US Genesis AI mission.

Singapore is testing governance discipline. It does not frame everything as sovereignty. Its agentic AI framework shows a practical approach to deployment risk, organizational accountability, and controlled adoption.

China is testing full-stack ambition under state coordination. It has the data scale, industrial policy, platform ecosystem, and regulatory capacity to build a non-Western AI ecosystem at meaningful scale. The tradeoff is real: control comes with political constraints, content filtering, and governance assumptions that democracies cannot and should not import.

Latin America is testing the coalition model. Latam-GPT starts from a realistic premise: many countries cannot build credible AI infrastructure alone, but they may be able to build shared regional capacity around language, culture, and public-interest use cases.

The European Commission is testing sovereignty as a procurement benchmark, as its 2025 sovereign cloud tender shows. But procurement criteria do not create compute capacity by themselves.

Each of these strategies is partial. Nobody has the whole answer.

The decision map governments and organizations actually need

Instead of asking whether a system is sovereign, institutions should classify dependencies.

Acceptable dependencies can be tolerated for low-risk use cases: public information chatbots, internal drafting assistants, translation drafts, general summarization, search over non-sensitive documents. Foreign models or hosted APIs may be acceptable if data is not sensitive, outputs are reviewed, and the use case is reversible. Overclassifying everything as critical leads to paralysis.

Managed dependencies are acceptable only with explicit controls: clear data boundaries, logging, audit rights, model replaceability, contractual guarantees, and a realistic exit plan. This covers internal administrative documents, citizen communications, and regulated-sector workflows.

Critical dependencies should not be accepted for high-stakes public functions without serious mitigation: benefits eligibility, immigration decisions, policing, judicial administration, emergency services, healthcare triage, child protection, defence, or autonomous action affecting citizens' rights. Public authority requires accountability, and accountability requires more than a vendor promise.

Strategic dependencies cannot be solved at project level but must be named nationally or regionally: chips, fabs, frontier compute, foundational training data, dominant software frameworks, talent pipelines. If they are not named, they become invisible. Invisible dependencies become policy surprises.
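The four tiers above can be sketched as a small classification rule. The attribute names and the order of the checks are my assumptions, chosen to mirror the descriptions in the text. A real institution would set its own criteria and would not reduce the judgment to four booleans.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    ACCEPTABLE = "acceptable"
    MANAGED = "managed"
    CRITICAL = "critical"
    STRATEGIC = "strategic"


@dataclass(frozen=True)
class Dependency:
    """Hypothetical description of one dependency in one use case."""
    name: str
    national_scale: bool = False   # chips, fabs, frontier compute, etc.
    high_stakes: bool = False      # benefits, policing, triage, defence, etc.
    sensitive_data: bool = False
    outputs_reviewed: bool = True  # a human checks outputs before use
    reversible: bool = True        # mistakes can be undone


def classify(dep: Dependency) -> Tier:
    """Check from the broadest scope down; the first match wins."""
    if dep.national_scale:
        return Tier.STRATEGIC
    if dep.high_stakes:
        return Tier.CRITICAL
    if dep.sensitive_data or not dep.outputs_reviewed or not dep.reversible:
        return Tier.MANAGED
    return Tier.ACCEPTABLE


examples = [
    Dependency("hosted API behind a city website chatbot"),
    Dependency("model drafting letters to citizens", sensitive_data=True),
    Dependency("model supporting benefits eligibility",
               high_stakes=True, sensitive_data=True),
    Dependency("GPU supply chain", national_scale=True),
]
for dep in examples:
    print(f"{dep.name}: {classify(dep).value}")
```

The value of writing the rule down is less the code than the forced decision. Someone has to state which attribute moves a dependency from acceptable to managed, and that statement can then be reviewed.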

The truth label

Before a system is called sovereign AI, it should answer a few plain questions.

Where does it run? Who owns and operates the infrastructure? Which legal jurisdiction applies? Which foreign laws may reach the provider? Which model is used? Who owns the weights? What licence governs them? Can the model be replaced without rebuilding the whole system? Is the training data known? Can the system be audited? Can logs be retained under public control? Can the workload move? What happens if the provider changes terms or withdraws access? What happens if the system acts incorrectly? Who is accountable? Which use cases are excluded because the dependency is too risky?

Answering them is the minimum level of honesty required before public institutions deploy AI into public life.
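As a rough sketch, the questions above could be kept as a structured record attached to each deployment, so that unanswered questions stay visible instead of being skipped. The field names are my own shorthand for the sixteen questions, and the sample values are placeholders, not a description of any real system.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class TruthLabel:
    """One field per question; None means 'not answered yet'."""
    hosting_location: Optional[str] = None
    infrastructure_operator: Optional[str] = None
    legal_jurisdiction: Optional[str] = None
    foreign_laws_reaching_provider: Optional[str] = None
    model_used: Optional[str] = None
    weights_owner: Optional[str] = None
    weights_licence: Optional[str] = None
    model_replaceable: Optional[str] = None
    training_data_known: Optional[str] = None
    auditable: Optional[str] = None
    logs_under_public_control: Optional[str] = None
    workload_portable: Optional[str] = None
    plan_if_provider_withdraws: Optional[str] = None
    plan_if_system_acts_incorrectly: Optional[str] = None
    accountable_party: Optional[str] = None
    excluded_use_cases: Optional[str] = None


def unanswered(label: TruthLabel) -> list:
    """Return the names of questions that still have no answer."""
    return [f.name for f in fields(label) if getattr(label, f.name) is None]


# Placeholder answers for illustration only.
label = TruthLabel(
    hosting_location="(placeholder) certified data centre in the EU",
    model_used="(placeholder) an open-weight model family",
)
missing = unanswered(label)
print(f"{len(missing)} of {len(fields(label))} questions unanswered")
for name in missing:
    print(" -", name)
```

A label with gaps is still useful, because it shows exactly which part of the sovereignty claim has not been backed up yet.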

Sovereignty is not purity

Most countries and organizations will not build the whole stack alone. Most should not try. Open source will still matter. Shared infrastructure will still matter. Regional alliances will still matter. Private providers will still matter.

The more serious position is not "build everything yourself." It is: know what you depend on, decide which dependencies are acceptable, mitigate the ones you can, avoid the ones you cannot justify, and stop calling the result sovereign without saying sovereign at which layer.

Full-stack sovereignty is out of reach for most countries today. That is not the scandal. The scandal is pretending otherwise.

If the system depends on Nvidia chips, say so. If it runs on certified European infrastructure but uses foreign model families, say so. If the model is open weight but not reproducible, say so. If the training data is unknown, say so. If the language performance is benchmarked but not tested for administrative judgment, say so. If agents can act before humans review the action, say so.

The better question is not: how do we achieve sovereign AI? It is: which parts of the AI stack must we control for this specific public function, which dependencies can we accept, and which ones would make the claim of sovereignty dishonest?

Less satisfying to announce. Much more useful to govern.