This article analyzes how e-commerce architecture is shifting toward systems designed for intelligent agents, which requires moving away from structures built exclusively for the frontend. You'll learn how to create "AI-ready" product data and JSON schemas that minimize the risk of model hallucinations, and how to optimize APIs for RAG systems. You'll also learn about the role of Semly's middleware layer, which standardizes data and lets you deploy AI features quickly while controlling costs and security.
Why change the architecture of an online store for generative AI?
Generative AI is forcing a shift in thinking about the e-commerce backend from an "API for the front end" to an "API for intelligent agents" approach.
New types of API consumers
Your API will be consumed not only by your storefront or mobile app, but also by:
- product chatbots (RAG, AI agents),
- recommendation and personalization layer (LLM as orchestrator),
- content generation pipelines (asynchronous AI jobs),
- analytical tools with a language layer.
These new components expect data to be more semantic, taxonomically consistent and event-driven (sequences of events instead of aggregates).
Semly's role in this change
Semly acts as an intermediate layer between your store and generative models. It standardizes product and event data and manages prompts, caching and model costs, allowing developers to focus on the store's domain logic instead of the details of integrating with LLMs.
What requirements does generative AI place on e-commerce architecture?
Key AI use cases vs. data needs
- Product chatbot: needs full product data, availability, pricing and user context.
- Semantic search engine: requires rich descriptions and a search API that allows filtering and sorting.
- LLM-based recommendations: need structured behavioral events (view, add_to_cart, purchase).
Types of data necessary for quality AI
- Product data (ID, texts, technical attributes, marketing, SEO, multimedia, relationships).
- Event data (GA4 standard: view_item, add_to_cart, begin_checkout, purchase).
- Contextual data (entry channel, location, business constraints).
Store API design for generative AI
REST vs GraphQL in the context of AI
"AI-ready" architectures often combine both approaches:
- REST: Ideal for catalog export and batching (ETL to vector index).
- GraphQL: Lets you fetch exactly the fields you need for an on-demand prompt.
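As a rough illustration of the GraphQL side, the sketch below builds a request body that asks only for the fields a prompt needs. The `product(id:)` query, the operation name and the field names are assumptions made for this example; a real store schema will differ.

```python
import json
from typing import List


def build_product_query(fields: List[str]) -> str:
    """Build a GraphQL query that selects only the given product fields.

    The `product(id:)` root field is hypothetical, not a real store schema.
    """
    selection = "\n    ".join(fields)
    return (
        "query ProductForPrompt($id: ID!) {\n"
        "  product(id: $id) {\n"
        f"    {selection}\n"
        "  }\n"
        "}"
    )


def build_request_body(product_id: str, fields: List[str]) -> str:
    """Serialize the query and its variables as a standard GraphQL POST body."""
    return json.dumps(
        {"query": build_product_query(fields), "variables": {"id": product_id}}
    )


# Only what the chatbot needs to answer an availability question.
body = build_request_body("product-uuid", ["title", "availability"])
```

Keeping the field list as a parameter means each prompt type can declare its own minimal selection instead of sharing one large query.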
Sample GET /api/products/{id} response with AI in mind:
{
  "id": "product-uuid",
  "sku": "TRAIL-001",
  "slug": "trail-running-shoes",
  "title": "Trail Running Shoes",
  "brand": "Acme Sports",
  "attributes": {
    "terrain": "trail",
    "cushioning": "high",
    "weight_g": 280
  },
  "price": {"amount": 92.76, "currency": "EUR"},
  "availability": "in_stock"
}
Product data for generative AI
Normalization and taxonomies
For AI to make meaningful inferences, the data must be consistent. It is worth taking inspiration from the schema.org/Product standards and the Google Merchant Center specification.
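A minimal sketch of what such normalization might look like in practice, assuming a small controlled vocabulary for a `terrain` attribute and weights that arrive as free text. The synonym table and helper names are hypothetical and only illustrate the idea.

```python
import re
from typing import Optional, Union

# Hypothetical controlled vocabulary: raw catalog values -> canonical terms.
TERRAIN_SYNONYMS = {
    "trail": "trail",
    "off-road": "trail",
    "offroad": "trail",
    "road": "road",
    "asphalt": "road",
}

_WEIGHT_RE = re.compile(r"^\s*(\d+(?:[.,]\d+)?)\s*(kg|g)?\s*$", re.IGNORECASE)


def normalize_terrain(raw: str) -> Optional[str]:
    """Map a raw terrain label to its canonical term, or None if unknown."""
    return TERRAIN_SYNONYMS.get(raw.strip().lower())


def normalize_weight_g(raw: Union[str, int, float]) -> Optional[int]:
    """Convert values such as '0.28 kg' or '280 g' to whole grams."""
    if isinstance(raw, (int, float)):
        return round(raw)
    match = _WEIGHT_RE.match(raw)
    if match is None:
        return None  # leave unparseable values empty instead of guessing
    value = float(match.group(1).replace(",", "."))
    unit = (match.group(2) or "g").lower()
    return round(value * 1000) if unit == "kg" else round(value)
```

Returning `None` for unknown values keeps gaps visible, so they can be fixed at the source rather than silently passed to a model.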
An example of a model in the spirit of schema.org:
{
  "@context": "https://schema.org/",
  "@type": "Product",
  "name": "Running Shoes Blue",
  "brand": {"@type": "Brand", "name": "Acme Sports"},
  "offers": {
    "@type": "Offer",
    "priceCurrency": "EUR",
    "price": "69.51"
  }
}
JSON structures for data exchange with AI models
Shopping cart and user session JSON
The shopping cart provides a key context for the chatbot:
{
  "cart_id": "cart-123",
  "items": [{
    "product_id": "TRAIL-001",
    "quantity": 1,
    "unit_price": 92.76
  }],
  "total": 92.76,
  "currency": "EUR"
}
User events JSON
Following the GA4 model, adopt a common format:
{
  "event_type": "view_item",
  "occurred_at": "2026-01-12T10:05:00Z",
  "ecommerce": {
    "items": [{"item_id": "TRAIL-001", "price": 92.76}],
    "currency": "EUR"
  }
}
Event layer and user behavior history
If you collect events through GA4, Segment or Snowplow, you already have a foundation. AI uses these events to personalize responses and detect user intent.
"Make events a first-class architectural citizen: save them in an event store or in data warehouses like BigQuery or Snowflake."
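One simple way to turn an event history into an intent signal is to find how far each product has moved through the purchase funnel. The sketch below assumes events shaped like the user event JSON shown earlier; the funnel ordering and the function name are illustrative assumptions, not a GA4 feature.

```python
from typing import Dict, List

# GA4-style event names ordered from weakest to strongest purchase intent.
FUNNEL_ORDER = ["view_item", "add_to_cart", "begin_checkout", "purchase"]


def furthest_stage_per_item(events: List[dict]) -> Dict[str, str]:
    """Return, for each item_id, the furthest funnel stage seen in the events."""
    rank = {name: index for index, name in enumerate(FUNNEL_ORDER)}
    result: Dict[str, str] = {}
    for event in events:
        stage = event.get("event_type")
        if stage not in rank:
            continue  # ignore events outside the funnel
        for item in event.get("ecommerce", {}).get("items", []):
            item_id = item["item_id"]
            current = result.get(item_id)
            if current is None or rank[stage] > rank[current]:
                result[item_id] = stage
    return result
```

A chatbot could pass this compact summary into the prompt instead of the raw event log, which keeps the context short.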
Integration with generative AI in practice
Architectural patterns
- AI microservice: Responsible for integrating with the LLM and preparing prompts.
- Middleware / BFF: The frontend communicates with the BFF, which combines data from the store's API and AI.
- Event-driven AI workers: Asynchronous generation of descriptions after a "ProductCreated" event.
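The event-driven worker pattern can be sketched with an in-memory queue standing in for a real message broker. The event shape (`type`, `product`) and the injected `generate` callable, which stands in for an LLM call, are assumptions made for this example.

```python
import queue
from typing import Callable, Dict


def run_worker(events: "queue.Queue", generate: Callable[[str], str]) -> Dict[str, str]:
    """Drain the queue and generate a description for each ProductCreated event.

    `generate` stands in for an LLM call; it receives a prompt, returns text.
    """
    descriptions: Dict[str, str] = {}
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            break  # nothing left to process
        if event.get("type") != "ProductCreated":
            continue  # this worker only reacts to new products
        product = event["product"]
        prompt = (
            f"Write a short product description for {product['title']} "
            f"by {product['brand']}."
        )
        descriptions[product["sku"]] = generate(prompt)
    return descriptions
```

Because the model call is injected, the worker can be tested with a stub and the storefront never waits on generation.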
Security and costs
Mask personal information in prompts and use aggressive input filtering to reduce token costs.
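A minimal sketch of both ideas, assuming regex-based masking is acceptable as a first line of defense. The patterns are deliberately rough and the placeholder tokens and length limit are arbitrary choices for illustration; production systems usually need a dedicated PII detection step.

```python
import re

# Rough heuristics only: the phone pattern also masks other long digit runs,
# such as dates or order numbers, which errs on the side of over-masking.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def prepare_user_input(text: str, max_chars: int = 500) -> str:
    """Mask e-mail addresses and phone-like numbers, then cap the length."""
    masked = PHONE_RE.sub("[PHONE]", EMAIL_RE.sub("[EMAIL]", text))
    collapsed = " ".join(masked.split())  # collapse repeated whitespace
    return collapsed[:max_chars]  # bound the tokens sent to the model
```

Capping by characters is only a proxy for token count, but it gives a predictable upper bound on prompt size.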
How does Semly support developers?
Semly addresses the challenges of integration by providing:
- Data standardization: Mapping structures (Shopify, Magento) to an "AI-ready" model.
- A ready-made API layer: Endpoints for chatbot and recommendations.
- Quality control: Query caching and monitoring mechanisms.
FAQ for developers
How to start implementing on an existing SaaS platform (e.g. Shopify, Shopware)?
Use the platform's existing APIs (REST or GraphQL) to export the product catalog and stream events efficiently. A key step is to identify gaps in product data, such as poor descriptions or missing technical attributes, and plan to fill them. Instead of connecting the frontend directly to LLMs, it is recommended to add an intermediate layer such as Semly.
What if the data is incomplete or inconsistent?
AI can "fill in the gaps" in natural language, but it must not be relied upon for facts such as technical parameters or compatibility. The safest strategy is to use AI only to enrich descriptions based on already verified technical data. In the prompts themselves, the model must be explicitly forbidden to "guess": it must openly state that information is missing if it does not find it in the source. At the same time, invest in data quality at the source, such as in PIM systems.
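The "no guessing" rule can be encoded directly in the prompt template. The wording of the instructions and the function name below are assumptions for illustration; the exact phrasing that works best depends on the model.

```python
def build_grounded_prompt(question: str, facts: dict) -> str:
    """Build a prompt that restricts the model to verified product facts."""
    fact_lines = "\n".join(f"- {key}: {value}" for key, value in facts.items())
    return (
        "Answer using only the verified product facts below.\n"
        "If the facts do not contain the answer, reply exactly: "
        "\"I don't have that information.\"\n"
        "Do not guess technical parameters or compatibility.\n\n"
        f"Verified facts:\n{fact_lines}\n\n"
        f"Question: {question}"
    )


prompt = build_grounded_prompt(
    "Are these shoes waterproof?",
    {"terrain": "trail", "cushioning": "high", "weight_g": 280},
)
```

Here the facts say nothing about waterproofing, so a model following the instructions should report the missing information rather than invent an answer.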
Is a separate data warehouse and feature store necessary to get started?
At the very start, this is not necessary - you can start with a simple export of catalog and events directly to Semly or the AI service of your choice. However, the data warehouse and feature store become crucial at the scaling stage of the solution, when there is a need to combine data from multiple sources, build advanced hybrid recommendations or serve multiple brands and markets simultaneously.
How to approach the migration of product data to the new JSON structure?
It is recommended to create a mapping layer between the existing data model and the target standardized "AI-ready" schema. This process can take place gradually - the mapping can be partial at first, and the data can be successively enriched through daily merchandising processes or automated AI processes that generate missing descriptions based on available attributes.
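A partial mapping layer might look like the sketch below. The source field names (`title`, `vendor`, `handle`) are illustrative of a typical platform export and the target keys follow the sample product response from this article; both are assumptions to adapt to your own models.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping: source platform field -> "AI-ready" schema field.
FIELD_MAP = {"title": "title", "vendor": "brand", "handle": "slug"}


def map_to_ai_ready(source: dict) -> Tuple[Dict[str, object], List[str]]:
    """Map known fields and report which target fields are still missing."""
    target: Dict[str, object] = {}
    missing: List[str] = []
    for src_key, dst_key in FIELD_MAP.items():
        value = source.get(src_key)
        if value in (None, ""):
            missing.append(dst_key)  # a gap to fill later, not an error
        else:
            target[dst_key] = value
    return target, missing


mapped, gaps = map_to_ai_ready({"title": "Trail Running Shoes", "vendor": "Acme Sports"})
```

Returning the list of gaps alongside the mapped record lets the migration proceed gradually while merchandising or enrichment jobs fill in what is missing.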
Summary
Successfully implementing generative artificial intelligence in an online store is a process that goes beyond simple integration with a chatbot. It requires a fundamental remodeling of how the system "talks" to the algorithms, shifting the focus from visual presentation to precise data structure.
Here are the key pillars of modern e-commerce architecture:
- Semantic APIs (REST & GraphQL): The foundation is to move away from interfaces designed solely for the frontend. The architecture must offer endpoints that provide LLMs with full business context without unnecessary noise. GraphQL becomes a key tool here, allowing precise sets of fields (e.g., just technical attributes and availability) to be pulled directly into the prompt.
- Rich and standardized product data: AI models work best on structured data that conforms to standards such as schema.org or Google Merchant Center. A full product model must include not only marketing descriptions, but more importantly typed technical attributes (e.g., weight, power, compatibility) and a list of specific benefits and uses.
- Structured events: User behavior data (view, add_to_cart, purchase) ceases to be just raw logs for analytics and becomes fuel for personalization. These events, combined with session history, allow AI to accurately detect customer purchase intent.
Sources:
- commercetools HTTP API - Products
- Shopify Storefront API - Product object
- Google Merchant Center - Product data specification
- Google Analytics 4 - Ecommerce measurement
- Vertex AI Search for Commerce - User events
- GA4 - Recommended events for retail/ecommerce
- Snowplow - GA migration guide (event diagrams)