Enterprise chat UIs are moving fast from “one model answers everything” to multi-agent systems: expense policy bots, PM helpers, document Q&A, and general knowledge, often split across different GCP projects and even different Google accounts. That’s where things get brittle.
In “Mind the Boundary: Stabilizing Gemini Enterprise A2A via a Cloud Run Hub Across Projects and Accounts” (Takao Morita, 2026), the author shows that making Gemini Enterprise’s Agent-to-Agent (A2A) integration work reliably is less about protocol theory and more about two practical realities:
- Gemini Enterprise UI constraints (what the UI accepts/throws errors on)
- Boundary-dependent authentication (what changes when you cross projects/accounts)
The paper’s solution: build a Cloud Run A2A Hub that sits between the Gemini UI and your backend agents/tools, acting as a strict “compatibility + routing + containment” layer.
The Core Problem: “Spec-Compliant” Can Still Break the UI
A2A uses agent discovery (agent card) + JSON-RPC messaging. In principle, you can return structured JSON results, metadata, citations, etc.
In practice, the paper observes two Gemini Enterprise UI behaviors that matter a lot:
- User input arrives in params.message.parts[].text (not reliably in params.text)
- UI requests include acceptedOutputModes=[] (empty)
That second detail is crucial: when acceptedOutputModes is empty, mixing structured output into JSON-RPC responses can trigger UI failures. So you can be “correct” by protocol… and still get the dreaded UI “answer failed.”
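A minimal sketch of how a Hub might read those two fields defensively. The function names are hypothetical, and checking both params and params.configuration for acceptedOutputModes is an assumption, since the summary does not say where the field sits in the request:

```python
from typing import Any


def extract_user_text(request: dict[str, Any]) -> str:
    """Pull the user's text out of a JSON-RPC request body."""
    params = request.get("params") or {}
    message = params.get("message") or {}
    parts = message.get("parts") or []
    # Primary location observed for the Gemini Enterprise UI.
    texts = [
        part["text"]
        for part in parts
        if isinstance(part, dict) and isinstance(part.get("text"), str)
    ]
    if texts:
        return "\n".join(texts).strip()
    # Fallback only: params.text is not reliably populated.
    fallback = params.get("text")
    return fallback.strip() if isinstance(fallback, str) else ""


def must_reply_text_only(request: dict[str, Any]) -> bool:
    """True when acceptedOutputModes is missing or empty."""
    params = request.get("params") or {}
    config = params.get("configuration") or {}
    # Assumption: the field may sit on params or on params.configuration.
    modes = params.get("acceptedOutputModes") or config.get("acceptedOutputModes")
    return not modes
```

Treating a missing field the same as an empty list is the conservative choice here: when in doubt, the Hub answers in plain text.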
The Hub Design: Text-Only JSON-RPC + Separate Tool API
The Hub runs on Cloud Run and enforces a rule:
- JSON-RPC endpoint (POST /) returns text-only, always. One message part, plain text, no structured payload.
Everything “rich” goes elsewhere:
- REST tool API (POST /tools/query) returns JSON, including:
  - route decision
  - downstream agent used
  - structured outputs
  - debugging signals
  - citations/metadata
This separation gives you the best of both worlds:
- UI stability (Gemini UI doesn’t choke)
- Developer observability (you still get structured results and debug data)
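The split can be sketched as two small builders: one produces the rich record that POST /tools/query would return, the other flattens it into the single text part sent back on POST /. All field names and the shape of the result envelope are illustrative assumptions, not the paper's schema; a real Hub should follow the A2A version it targets:

```python
from typing import Any, Optional


def build_tool_result(
    route: str,
    agent: str,
    answer: str,
    structured: Optional[dict[str, Any]] = None,
    citations: Optional[list[str]] = None,
    debug: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Rich record for the REST tool API (hypothetical field names)."""
    return {
        "route": route,
        "agent": agent,
        "answer": answer,
        "structured": structured or {},
        "citations": citations or [],
        "debug": debug or {},
    }


def to_ui_response(request_id: Any, tool_result: dict[str, Any]) -> dict[str, Any]:
    """JSON-RPC 2.0 response carrying exactly one plain-text part."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "role": "agent",
            # One part, text only: nothing structured leaks into the UI path.
            "parts": [{"text": str(tool_result["answer"])}],
        },
    }
```

Because the UI response is derived from the tool result rather than built separately, the two endpoints cannot drift apart in what they claim the answer was.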
Deterministic Routing Instead of “LLM Decides”
The Hub intentionally avoids LLM-based “tool selection.” Instead it uses keyword/regex deterministic rules, because enterprise ops cares about:
- reproducibility
- predictable behavior
- easier debugging
Four routes are implemented:
- Expense agent (public A2A agent in another project)
- PM support agent (Cloud Run IAM-protected agent in another account)
- DocQA / RAG route (Discovery Engine / Vertex AI Search + optional GCS source retrieval)
- General QA (Vertex AI)
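A minimal sketch of keyword/regex routing over those four routes. The patterns and route names below are invented for illustration and are not the paper's actual rules; the point is only that the first matching rule wins and anything unmatched falls through to general QA:

```python
import re

# Ordered rules: the first pattern that matches decides the route.
# Patterns are hypothetical examples, not the rules used in the paper.
ROUTES = [
    ("expense", re.compile(r"\b(expense|reimburse\w*|receipt)\b", re.IGNORECASE)),
    ("pm", re.compile(r"\b(wbs|milestone|project plan)\b", re.IGNORECASE)),
    ("docqa", re.compile(r"\b(incident|runbook|internal doc\w*)\b", re.IGNORECASE)),
]
DEFAULT_ROUTE = "general"


def route(text: str) -> str:
    """Return the name of the first route whose pattern matches the text."""
    for name, pattern in ROUTES:
        if pattern.search(text):
            return name
    return DEFAULT_ROUTE
```

The same input always yields the same route, so a misrouted query can be reproduced and fixed by editing one pattern or reordering the table.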
The Hidden Boss Fight: Authentication Changes by Boundary
A2A doesn’t define auth, so Cloud Run + IAM boundaries decide your fate.
The paper maps boundary types to auth mechanisms:
- Same project: Application Default Credentials (ADC) “just works”
- Cross-project (public agent): unauthenticated HTTPS works (but has obvious security tradeoffs)
- Cross-account (protected Cloud Run): you must use OIDC ID tokens
  - correct audience required
  - Hub’s service account needs Cloud Run Invoker permission
  - wrong audience → 401
  - missing Invoker grant → 403
The takeaway: auth is discontinuous across boundaries even if your A2A calls look identical.
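The mapping above can be sketched as a small dispatcher. The enum, function names, and the injected token fetcher are hypothetical; in a real deployment the fetcher might wrap google-auth's `google.oauth2.id_token.fetch_id_token(request, audience)`, with the target Cloud Run service URL as the usual audience. It is injected here so the sketch stays runnable without credentials:

```python
from enum import Enum
from typing import Callable, Optional


class Boundary(Enum):
    SAME_PROJECT = "same_project"
    CROSS_PROJECT_PUBLIC = "cross_project_public"
    CROSS_ACCOUNT_PROTECTED = "cross_account_protected"


def build_auth_headers(
    boundary: Boundary,
    audience: Optional[str] = None,
    fetch_id_token: Optional[Callable[[str], str]] = None,
) -> dict[str, str]:
    """Return the extra HTTP headers needed for a call across this boundary."""
    if boundary is Boundary.CROSS_ACCOUNT_PROTECTED:
        if not audience or fetch_id_token is None:
            raise ValueError("protected Cloud Run calls need an audience and a token fetcher")
        # OIDC ID token minted for exactly this audience.
        return {"Authorization": f"Bearer {fetch_id_token(audience)}"}
    # Public agent: plain HTTPS. Same project: ADC is picked up by the
    # Google client libraries, so no manual header is built in this sketch.
    return {}


def diagnose(status: int) -> str:
    """Map the two boundary failures named above to a likely cause."""
    if status == 401:
        return "401: check that the ID token audience matches the target service"
    if status == 403:
        return "403: check that the Hub service account has the Cloud Run Invoker role"
    return f"{status}: not an auth-boundary failure covered by this sketch"
```

Keeping the boundary type explicit in configuration makes the discontinuity visible: the call site is identical for all three cases, and only this function changes behavior.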
RAG Gotcha: Search ≠ Evidence Access
For DocQA, the Hub uses Discovery Engine / Vertex AI Search. But to extract precise details (like deadlines), it sometimes needs to fetch the original source text from GCS.
That introduces a classic enterprise failure mode:
- Search returns results
- But GCS read fails with 403 because the Cloud Run service account lacks storage.objects.get
- Answer becomes incomplete or vague, even though retrieval “worked”
Once storage.objects.get is granted, the system can do evidence-backed extraction, e.g., correctly pulling an incident-response deadline like “within 15 minutes.”
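One way to make that failure mode visible instead of silent is to flag answers that could not be checked against the source. This is a sketch under stated assumptions: the source reader is injected, `PermissionError` stands in for whatever 403 exception a real GCS client raises, and the deadline regex is a toy pattern rather than the paper's extraction logic:

```python
import re
from typing import Any, Callable

# Toy pattern for phrases such as "within 15 minutes".
DEADLINE = re.compile(r"within\s+\d+\s+(?:minutes?|hours?|days?)", re.IGNORECASE)


def answer_with_evidence(
    snippet: str,
    source_uri: str,
    read_source: Callable[[str], str],
) -> dict[str, Any]:
    """Prefer an answer extracted from the source text; fall back to the snippet."""
    try:
        full_text = read_source(source_uri)
    except PermissionError as exc:
        # Search succeeded but the source is unreadable: say so explicitly.
        return {
            "answer": snippet,
            "evidence_backed": False,
            "warning": f"could not read {source_uri}: {exc}",
        }
    match = DEADLINE.search(full_text)
    if match is None:
        return {
            "answer": snippet,
            "evidence_backed": False,
            "warning": "no deadline phrase found in source",
        }
    return {"answer": match.group(0), "evidence_backed": True, "warning": None}
```

Surfacing `evidence_backed` and `warning` through the REST tool API would let an operator tell a vague answer caused by missing IAM apart from one caused by a weak document.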
Evaluation: Four Queries, Stable UI, Reproducible Routing
The author tests four representative queries:
- expense policy deadline
- PM/WBS task assistance
- general knowledge (Mount Fuji height)
- incident-response deadline extraction from internal docs
Results:
- routing is deterministic and correct
- Gemini UI stays stable because responses are always text-only
- structured inspection remains available via REST
- RAG becomes reliably “evidence-backed” only after fixing GCS IAM
Why This Matters
If you’re building enterprise agent systems, this paper delivers a practical rulebook:
- Treat UI constraints as a first-class engineering requirement.
- Use a Hub to absorb failures and normalize requests/responses.
- Expect IAM/auth to change sharply across project/account boundaries.
- Plan evidence access explicitly in RAG (search alone isn’t enough).
source: https://arxiv.org/pdf/2602.17675