
Architecting Solenya API V1 for the Agentic Era

Marcus Gawronsky and Jean Durand

"Good APIs are boring. An API that's interesting is a bad API. For the developers who use them, APIs are tools that they use in order to accomplish some other goal. Any time they spend thinking about the API instead of about that goal is time wasted."

  • Sean Goedecke, Everything I Know About Good API Design (Goedecke, 2025)

The Architectural Problem

Solenya was founded on a simple observation: discovery systems were fragmented. Search lived in one stack, recommendations in another, and behaviour in a third. Internally, we described that fragmentation as separate jars. The problem was not aesthetic; it limited what could actually be built across search, listings, recommendations, and agents.

We solved that at the model layer by unifying those systems into one model that can power every surface. The same question then reappeared at the API layer. An API design dictates what downstream products can and cannot build. If the interface is fragmented, the product above it inherits that fragmentation.

That matters more now because the consumer is no longer only a human developer. Modern production systems are increasingly written, tested, and maintained by agentic harnesses - autonomous coding loops that generate integration code, run static analysis, and deploy. When an agent writes code against a traditional REST API, it hallucinates JSON structures, misreads loosely typed Swagger specs, and produces integration code that passes no quality gate. For these clients, a typed API is not a preference. It is the only architecture that survives contact with reality.

That is the context for Solenya API V1. V1 is a unified interface for human developers and autonomous agents, with typed execution, runtime capability discovery, scoped delegation, and browser-safe tracking. This blog walks through the design decisions behind V1 and explains how each layer supports that goal.

V1 at a Glance

Before we unpack the trade-offs, here is the shape of Solenya API V1. This is the architecture we set out to build.

The Protocol Map

The V1 architecture uses each protocol where it belongs:

  • Vocabulary: Google Product Feed - a 20-year-old standard embedded in model weights
  • Execution: GraphQL - typed, relational, self-documenting, introspectable
  • Capability discovery: MCP - standardised, progressively disclosed tool listing
  • Trust: OAuth 2.1 / OIDC - delegatable, lifecycle-managed, revocable
  • Analytics: REST + sendBeacon - browser physics demands fire-and-forget

The Principles Behind V1

The resulting architecture is not a random jar of protocols. It follows a small set of design principles that keep the interface unified even when different layers solve different problems.

One model, one schema, every surface. The carousel, the listing, the search bar, and the agent all sit on top of the same model. The API reflects that decision: one GraphQL schema, one unified items namespace, regardless of whether the caller is a React component, a Python script, or an autonomous shopping agent.

Use the industry's native vocabulary. We do not ask merchants to translate their catalogues into a bespoke Solenya schema before they can benefit from the system. V1 starts from the Google Product Feed specification because it is already widely deployed and already legible to both developers and models.

Progressive disclosure is a protocol requirement. The same system that adapts to users over time should reveal its own capabilities in layers. Introspection is not just a debugging tool. It is how agents discover what the API can do without loading the entire surface area into context at once.

The documentation should be executable. Our documentation at solenya.ai/docs embeds live GraphiQL and Swagger explorers directly, with every query example providing an "open-in-IDE" link. The documentation is not separate from the interface. It is part of how the interface is learned.

The Design Space: Choosing the Right Protocol for Each Layer

To design V1 deliberately, we had to evaluate the tools available. We looked at what each protocol is good at, where it breaks down for humans and agents, and how the pieces fit together.

Progressive Disclosure: The Organising Principle

The first design constraint follows directly from that new consumer. The API must be understandable in layers. Agents, like humans, reason better when the interface reveals the next relevant level of detail instead of demanding full-system comprehension up front.

The term comes from interface design. Don't show the user everything at once. Sequence information to reduce cognitive load. Reveal depth on demand. It is why a well-designed IDE shows autocomplete suggestions rather than dumping the entire standard library into a modal.

For agents, progressive disclosure is not a UX nicety. It is a hard engineering constraint.

A large language model has a finite context window. Flooding that window with a complete API specification does not make the agent smarter. It degrades reasoning quality and spikes inference cost. Anthropic's engineering team has documented that 50+ tools consume approximately 72,000 tokens before work even begins (Anthropic, 2025). Manus, the autonomous agent platform, calls KV-cache hit rate "the single most important metric for a production-stage AI agent" (Manus, 2025). Every time you dynamically swap tools in and out of context, you risk invalidating that cache at a 10x cost penalty.

The consequence for API design is structural. An API built for agents must be introspectable, self-describing, and paginable at the capability level, not just the data level. The agent should be able to ask "what can I do here?" and receive a bounded, coherent answer. Then ask "tell me more about this specific capability" and receive the next layer. Discovery of tools must itself be progressively disclosed.

This principle governs every design decision in V1. It is the thread connecting the vocabulary we chose, the query language we adopted, the authentication model we enforce, and even the separate REST endpoint we kept for analytics. Each layer reveals exactly what is needed at that layer and no more.

REST: Familiar, but Fragmented

REST is the lingua franca of the web for good reason. It is simple, stateless, and maps cleanly to HTTP semantics. GET /products/123 is so familiar that most developers will know how to use it before reading documentation - and that familiarity is genuinely valuable (Goedecke, 2025).

But REST scales poorly as relational complexity grows. E-commerce is intensely relational: products belong to categories, have variants grouped by item_group_id, carry inventory across locations, and relate to users through orders, views, and preferences. Expressing these relationships in REST demands either endpoint explosion - /products/:id/similar, /users/:id/recommendations, /products/:id/reviews - or deeply nested responses that return everything whether the client needs it or not. The first is fragmentation; the second is overfetching. Both recreate the same structural separation V1 is designed to avoid.

OpenAPI attempts to address discoverability, but the schema and the implementation are maintained separately. They drift. The spec says the field is a string; the server returns an integer. The documentation says the endpoint exists; the server returns 404. Agents writing code against an OpenAPI spec that has drifted from reality produce code that compiles against the spec and fails against the server. That combination of endpoint sprawl, overfetching, and schema drift is why REST is not the core execution layer for V1.

RPC (gRPC, Thrift): Precise, but Poor for Public Clients

RPC frameworks are excellent for tightly coupled internal services where performance is paramount and schema drift is unacceptable. Binary protocols, generated stubs, and strict contracts eliminate ambiguity. But the binary wire format imposes a readability tax on both humans and agents, and gRPC's reliance on HTTP/2 framing makes it a poor fit for browser clients. RPC is the right tool for service meshes. It is not the public interface we wanted for V1.

GraphQL: Typed, Introspectable Execution

GraphQL offers a single endpoint where the client declares exactly what it needs. The schema is the contract, the documentation, and the type system - simultaneously. No drift is possible because they are the same artifact. Selection sets let the client request exactly the fields relevant to the current task. Fragments let it compose reusable field selections. Introspection lets it query the schema itself to discover what types exist, what fields are available, and what arguments each field accepts - at runtime, from the API itself.

GraphQL has some objectively strange design decisions. It returns HTTP 200 even when errors occur - a deliberate choice, not an oversight. The rationale is that a partial response (some resolvers succeeded, one failed) is a legitimate, structured outcome. All errors are typed in the response envelope - inspectable, catchable, contractual. From the perspective of traditional HTTP semantics, this is counterintuitive. From the perspective of an agent parsing a response, a typed error envelope is vastly more reliable than inferring failure from a status code.
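The difference is easy to see in client code. The sketch below is illustrative, not from the live API: the `message`/`path`/`extensions` shape follows the GraphQL specification's error format, and `extensions.code` is a common server convention rather than a spec requirement.

```javascript
// Hypothetical partial response: HTTP status was 200, but one resolver failed.
const envelope = {
  data: { items: { merchant_feed: null } },
  errors: [
    {
      message: "rank_with: multimodal_query.text must be non-empty",
      path: ["items", "merchant_feed"],
      extensions: { code: "BAD_USER_INPUT" }
    }
  ]
};

// The client reads the typed envelope, never the HTTP status code.
function readEnvelope(env) {
  return {
    ok: !env.errors || env.errors.length === 0,
    data: env.data ?? null,
    errors: (env.errors ?? []).map(e => ({
      code: e.extensions?.code ?? "UNKNOWN",
      at: (e.path ?? []).join("."),
      message: e.message
    }))
  };
}
```

An agent branching on `readEnvelope(envelope).errors[0].code` is working with a contractual value, not guessing at the semantics of a 4xx.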

For production systems increasingly authored by agentic coding harnesses, the quality gates GraphQL provides are not nice-to-haves. Automatic client codegen from the schema means the agent's generated code either type-checks or it doesn't. Errors surface at compile time, not in production. IDE autocomplete works. Static analysis works. Automated tests can be generated from the schema contract. These are exactly the properties that make integration code survive a CI pipeline. That combination of codegen, introspection, and typed errors is why GraphQL became the execution layer for V1.

MCP: Capability Discovery for Agents

MCP is not a replacement for GraphQL - it is orthogonal. Where GraphQL is the typed, relational execution interface, MCP is the discovery and invocation interface for generalist agents. It solves the problem of how an LLM that has never seen your API before can find out what tools are available, understand their parameters, and call them - without being pre-programmed.

The MCP specification recently added native pagination over tool listings - a direct endorsement of progressive disclosure as a first-class protocol concern. An agent connecting to an MCP server no longer receives every tool definition at once; it can page through capabilities, reducing context window pressure and improving cache efficiency.
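In client terms, paging through a tool listing is a simple cursor loop. This is a sketch against the MCP `tools/list` method; `callServer` stands in for a real JSON-RPC transport (stdio or HTTP), which is an assumption of this example rather than a Solenya-provided helper.

```javascript
// Page through an MCP server's tools one bounded chunk at a time,
// instead of loading every tool definition into context up front.
async function listAllTools(callServer) {
  const tools = [];
  let cursor;
  do {
    const result = await callServer({
      jsonrpc: "2.0",
      id: tools.length + 1,
      method: "tools/list",
      params: cursor ? { cursor } : {}
    });
    tools.push(...result.tools);
    cursor = result.nextCursor; // undefined on the final page
  } while (cursor);
  return tools;
}
```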

Critically, MCP and GraphQL compose. Our search_catalog tool is an MCP-discoverable capability backed by a GraphQL query. The agent discovers it via MCP; the execution is GraphQL. The protocols are complementary layers, not competing alternatives. That is why MCP sits alongside GraphQL in V1 rather than replacing it.

Skills: Operational Guidance for Agents

Mintlify, Vercel, and the broader ecosystem have converged on a parallel pattern: skills - markdown files that encode tribal knowledge, best practices, and decision trees in a format agents can load into context (Mintlify, 2026). Skills are documentation organised and written for machines. They handle the "how to use this tool well" problem that MCP doesn't address.

Skills and MCP solve different problems. MCP packages authenticated, scoped access. Skills package knowledge. An agent with MCP access to our API but no skill file will call search_catalog - but it might structure queries suboptimally. An agent with the skill file will know to use distinct_on: { selector: item_group_id } to deduplicate variants and filter_term to scope results to in-stock inventory. Both layers matter. Neither replaces the other. That is why skills complement V1 rather than define it.
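To make that concrete, here is what a skill file for the search tool might look like. This is a hypothetical sketch, not a published Solenya artifact; only the `distinct_on` and `filter_term` guidance comes from the behaviour described above.

```markdown
# Skill: searching the Solenya catalogue well

## Deduplicate variants
Products with multiple sizes or colours share an `item_group_id`. Unless
the user explicitly asked for variants, pass
`distinct_on: { selector: item_group_id }` so results show one card per
product, not one per SKU.

## Scope to purchasable inventory
When the intent is to buy rather than browse, add
`filter_term: { where: { availability: { eq: { value: "in_stock" } } } }`
so out-of-stock items never reach the user.
```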

Merchant Feed First

At Solenya, we use the Google Product Feed specification as our native integration layer.

That choice is not nostalgic. It is a reliability decision for both humans and agents.

The Google Product Feed specification traces its roots to Google Base, launched in 2005 - a dedicated platform for users to upload structured product data. It has had twenty years to mature, be scrutinised, and become the lingua franca of catalogue ingestion across every major ad network: Meta (Facebook & Instagram), Microsoft Advertising (Bing Shopping), Pinterest, Snapchat, and TikTok all accept or directly map from the Google specification.

This ubiquity has a second-order consequence that matters enormously for the agentic era: the specification is overwhelmingly present in the pre-training data of every major foundational model. Fields like item_group_id, sale_price, product_type, gtin, and availability are not novel vocabulary that an agent must learn from documentation - they are patterns the model has seen millions of times across web crawls, merchant documentation, and integration guides. The taxonomy is embedded in the weights.

Invent a novel, proprietary e-commerce schema, and an agent mapping to it will frequently hallucinate fields, misalign types, and silently fail. The newer and more bespoke the schema, the higher the failure rate - not because the agent is incapable, but because the vocabulary breaks patterns the model has internalised during pre-training. Adopt a schema the model has seen ten million times, and reliability improves structurally. Not through prompt engineering. Through vocabulary familiarity.

The practical consequence is concrete: the same feed URL you already maintain for Google Merchant Center, the same XML your ad campaigns depend on, is the feed Solenya ingests directly. No ETL pipeline. No field mapping. No bespoke schema adaptation. The item_group_id that groups your variant SKUs in Google Shopping is the same item_group_id you query through our distinct_on operator. A familiar vocabulary makes discovery more reliable before prompt engineering even begins.
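For illustration, here is a minimal feed entry in the specification's RSS 2.0 dialect. The values are hypothetical; the field names (`g:` is the standard feed namespace) are exactly the vocabulary discussed above.

```xml
<item>
  <g:id>SKU-001</g:id>
  <g:item_group_id>SHOE-BLUE-RUN</g:item_group_id>
  <g:title>Blue Running Shoe - Size 10</g:title>
  <g:brand>Acme</g:brand>
  <g:price>89.99 USD</g:price>
  <g:sale_price>69.99 USD</g:sale_price>
  <g:availability>in_stock</g:availability>
  <g:product_type>Shoes &gt; Running</g:product_type>
  <g:gtin>00012345678905</g:gtin>
  <g:link>https://example.com/products/sku-001</g:link>
  <g:image_link>https://example.com/images/sku-001.jpg</g:image_link>
</item>
```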

GraphQL as a Typed, Self-Documenting V1 API

With the feed vocabulary fixed, the next question is how V1 should expose data and functionality. GraphQL became the centre of the architecture because it keeps the schema, the documentation, and the execution contract in the same place.

Introspection as Progressive Disclosure

GraphQL's Introspection query is progressive disclosure implemented at the protocol level. An agent - or a human developer in GraphiQL - can ask the schema what types exist:

Open in IDE ↗
{
  __schema {
    types {
      name
      description
    }
  }
}

Then drill into a specific type to discover its fields:

Open in IDE ↗
{
  __type(name: "MerchantFeedType") {
    fields {
      name
      description
      type { name kind }
    }
  }
}

Then inspect a specific field's arguments to understand how to filter or rank results. Each query is a deliberate step deeper - never dumping the entire schema at once, always revealing exactly the next layer of capability.
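That third step uses standard GraphQL introspection. Taking the MerchantFeedType from the example above, this query reveals every field's arguments, their types, and their defaults:

```graphql
{
  __type(name: "MerchantFeedType") {
    fields {
      name
      args {
        name
        description
        defaultValue
        type { name kind ofType { name kind } }
      }
    }
  }
}
```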

Every field in our schema carries a description. This is not a code comment that drifts from the implementation. It is part of the schema contract, introspectable at runtime. The documentation and the API are the same artifact - they cannot diverge.

The Namespaced V1 Architecture

Our shift from flat, ambiguous resolvers to a namespaced architecture enforces constraints at the schema level. In V1, the items query is a namespace - a gateway that returns feed-specific resolvers rather than data directly:

Open in IDE ↗
query {
  items {
    merchant_feed(
      rank_with: { multimodal_query: { text: "blue running shoes" } }
      distinct_on: { selector: item_group_id }
      filter_term: { where: { availability: { eq: { value: "in_stock" } } } }
    ) {
      page(page_size: 10, page_number: 1) {
        rows {
          record { title brand price image_link }
          metadata { score }
        }
        page_info { has_next_page }
      }
      facets {
        brand(top_n: 10) { value count }
        price(bucket_size: 25.0) { range count }
      }
    }
  }
}

The field name (merchant_feed) explicitly defines the return type and available arguments. Feed-specific distinct_on enums prevent invalid column queries - the error surfaces at schema validation, not at runtime. Pagination arguments live on the page field resolver, not on the root query, enabling lazy loading: you can request facets without fetching any rows, or page_info without materialising the result set. Each sub-field is an independent, progressively disclosed layer.
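Lazy loading in practice: against the same schema as the example above, this query materialises facet counts without fetching a single row.

```graphql
query {
  items {
    merchant_feed(
      rank_with: { multimodal_query: { text: "blue running shoes" } }
    ) {
      facets {
        brand(top_n: 10) { value count }
      }
    }
  }
}
```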

Type Safety for Agentic Harnesses

When an agentic coding harness generates integration code against our V1 schema, the process is deterministic:

  1. Codegen reads the schema and produces typed client stubs - every query, every mutation, every input type, every response shape.
  2. Static analysis validates that the generated code references only fields that exist with the correct types.
  3. Automated tests can be scaffolded from the schema contract itself.

If the generated code type-checks, it will work against the API. If it doesn't type-check, it fails at build time - not in production, not at 3am, not in a customer's checkout flow. This is the quality gate that REST with loosely-typed JSON cannot provide. For organisations whose integration code is increasingly written by machines, the type system is the entire safety net.
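One common way to wire step 1 is GraphQL Code Generator. The config below is an illustrative sketch under that assumption, not Solenya-maintained tooling; the schema URL and output path are examples.

```yaml
# codegen.yml - generate typed TypeScript stubs from the live V1 schema
schema: https://api.solenya.ai/v1/graphql
documents: "src/**/*.graphql"
generates:
  src/generated/solenya.ts:
    plugins:
      - typescript
      - typescript-operations
```

Every query the harness writes in `src/**/*.graphql` is then validated against the introspected schema at build time.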

OIDC as the Trust Layer for Delegation

Static, omnipotent API keys fail the exact delegation model V1 is meant to support. They do not expire, they cannot be scoped precisely, and they cannot be revoked without breaking everything downstream. For a human developer running a quick script, a long-lived key is convenient (Goedecke, 2025). For an autonomous agent acting on behalf of thousands of users, it is an architectural liability.

The MCP specification builds on OAuth 2.1 and OpenID Connect (OIDC) for precisely this reason. OIDC provides the mechanics that agentic architectures demand:

  • Lifecycle management. Our access tokens are short-lived JWTs with a 2-hour expiry. The agent cannot accumulate privilege over time. Refresh tokens rotate on each use and are individually revocable.
  • Scoped delegation. Hierarchical scopes (index:read:*, events:write:*, account:admin:*) mean an agent can act on behalf of a user with precisely the permissions required for its task - and no more. A read-only agent cannot mutate indexes. An analytics agent cannot access account administration.
  • Auditability. Every token carries claims that identify the issuing client, the delegating user, and the granted scopes. Access patterns are loggable and auditable without additional instrumentation.

As shopping agents evolve from recommending products to executing checkouts, the trust model matters enormously. A user delegating purchase authority to an agent should be able to revoke that delegation precisely - without revoking the agent's access to search. OIDC's scope model makes this granularity possible. Static API keys do not.
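A minimal sketch of what scope checking looks like on the consuming side. The token below is illustrative (placeholder header and signature, no real verification); the wildcard convention mirrors the hierarchical scopes described above and is an assumption of this example.

```javascript
// Decode a JWT payload (no signature verification - illustration only).
function decodeJwtPayload(jwt) {
  const payloadB64 = jwt.split(".")[1];
  return JSON.parse(Buffer.from(payloadB64, "base64url").toString("utf8"));
}

// "index:read:*" grants any scope of the form "index:read:<suffix>".
function hasScope(granted, required) {
  return granted.some(s =>
    s === required ||
    (s.endsWith(":*") && required.startsWith(s.slice(0, -1)))
  );
}

// Illustrative delegated token: header and signature are placeholders.
const payload = { sub: "user-42", scope: "index:read:* events:write:*", exp: 1735689600 };
const token = [
  "e30", // "{}" - placeholder header
  Buffer.from(JSON.stringify(payload)).toString("base64url"),
  "sig"
].join(".");
```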

Why Event Tracking Uses a Separate REST Endpoint

Up to this point, the architecture argues for one typed, unified surface. Event tracking is the deliberate exception. The reason is browser behaviour, not a change in philosophy.

The Browser Constraint

Event tracking for e-commerce analytics - view_item, purchase_item, view_experiment - must fire from the shopper's browser, across diverse customer storefronts that Solenya does not control. These events must survive page navigation. They must never block the main thread. And they must work reliably at scale across every browser, device, and network condition.

A GraphQL POST request from a third-party storefront to api.solenya.ai would trigger a CORS preflight - an OPTIONS request that the browser sends before the actual request to verify cross-origin permissions. During a page unload (the shopper clicking "Buy Now" and navigating to the checkout), that preflight round-trip is abandoned by the browser before it completes. The event is silently dropped at exactly the moment it matters most.

The sendBeacon Solution

navigator.sendBeacon() is explicitly designed for this scenario. It fires a small payload asynchronously, survives page unload, and never blocks the UI thread. But sendBeacon - and fetch() with keepalive: true - both require a simple request to avoid triggering CORS preflight. Per the Fetch Standard, a simple request demands POST with a Content-Type of application/x-www-form-urlencoded, multipart/form-data, or text/plain. No custom headers. No Authorization header. No application/json.

This constraint dictated our tracking API design completely:

  • Form-urlencoded payloads. The access_token is a form field (RFC 6750 §2.2), not a header. The events array is JSON-encoded inside a form field. This avoids preflight entirely.
  • Fire-and-forget semantics. The endpoint returns no response body. sendBeacon doesn't give you one anyway.
  • Minimal payload. Events carry only structural data - item selectors, transaction IDs, experiment variants. No PII. No device fingerprinting. The schema follows the action-object pattern (view_item, purchase_item) aligned with GA4's recommended events, reducing migration friction for teams already tracking with Google Analytics.
const form = new URLSearchParams();
form.append("access_token", token);
form.append("solenya_user_uuid", userUuid);
form.append("events", JSON.stringify([
  { type: "view_item", item: { selector: "id", eq: "SKU-001" } }
]));
navigator.sendBeacon("https://api.solenya.ai/v1/track/event", form);

sendBeacon uses a lower-priority queue - the browser batches and sends when convenient - making it more bandwidth-efficient than fetch with keepalive, which sends immediately. Both survive navigation. But sendBeacon has the broader browser support (97%+ vs 95%+) and the more predictable behaviour under load.
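A defensive client can prefer sendBeacon and fall back to keepalive where it is unavailable or refuses the payload (its queue has a size limit). This is a sketch, not the Solenya SDK; the `nav` parameter exists only to make the branch testable.

```javascript
// Prefer sendBeacon (queued, survives unload); fall back to fetch keepalive.
function trackEvent(url, form, nav = globalThis.navigator) {
  if (nav && typeof nav.sendBeacon === "function" && nav.sendBeacon(url, form)) {
    return "beacon";
  }
  // keepalive lets the request outlive the page, but sends immediately.
  fetch(url, { method: "POST", body: form, keepalive: true });
  return "fetch";
}
```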

This is the only place where V1 deliberately steps outside the typed GraphQL surface. Tracking uses a separate REST endpoint because the browser constraints demand it, not because the architectural principles changed.

Conclusion: One Deliberately Layered Architecture

Every piece of Solenya API V1 flows from the same kernel:

  • One vocabulary the agents already speak. The Google Product Feed - two decades old, present in every model's pre-training data, maintained for every ad network, and ingested by Solenya without transformation.
  • One typed, introspectable execution layer. GraphQL - self-documenting, progressively disclosed via Introspection, generating deterministic client code that either type-checks or fails at build time.
  • One standardised capability discovery surface. MCP - paginated tool listings, composable with GraphQL, discoverable at runtime by any compliant agent.
  • One delegatable, lifecycle-managed trust layer. OIDC - scoped tokens, short-lived JWTs, individually revocable refresh tokens, auditable claims.
  • One browser-native analytics island. REST + sendBeacon - form-urlencoded, fire-and-forget, no preflight, no blocked main thread.

The API documentation is live at solenya.ai/docs. The GraphiQL explorer is live at api.solenya.ai/v1/graphql. Every query example in the docs opens directly in the IDE. The schema will explain itself.

The scaling-law era promised that a single, ever-larger model would solve everything. The cognitive core era (Karpathy, 2024) proved that the model is commodity and the tools are the product. Solenya API V1 is built around that reality: a familiar vocabulary, a typed execution layer, explicit capability discovery, scoped delegation, and a separate browser-safe tracking path where physics requires it.

Anthropic. (2025). Building Effective Agents. https://www.anthropic.com/research/building-effective-agents
Goedecke, S. (2025). Everything I Know About Good API Design. https://www.seangoedecke.com/good-api-design/
Karpathy, A. (2024). Andrej Karpathy: Software in the Era of AI. https://www.dwarkeshpatel.com/p/andrej-karpathy
Manus. (2025). Context Engineering for AI Agents. https://manus.im/blog/Context-Engineering-for-Agents
Mintlify. (2026). Skills: AI Documentation for Agents. https://mintlify.com/docs/agent-skills