Concept - Product Content Agent

Document Properties

Kbid	4829G4
Added to KB	10-Jun-2026
Status	online
Product	Intershop Commerce Platform
Last Modified	10-Jun-2026
Public Access	everyone
Doc Type	Concepts
Product	Product Content Agent
Document Link	https://knowledge.intershop.com/kb/4829G4

Introduction

The Intershop Product Content Agent uses AI and Retrieval-Augmented Generation (RAG) to aggregate, validate, and optimize product information from multiple sources. It helps commerce teams create structured, consistent, and discoverable product content based on trusted product data such as manufacturer PDFs, supplier information, internal PIM data, and other allowed sources.

The agent is designed for business users such as e-commerce managers and product content editors, and for partner developers who integrate product content enrichment into customer-specific product information management (PIM), enterprise resource planning (ERP), or commerce data workflows.

In the Intershop Commerce Platform, the Product Content Agent is part of the Agentic Commerce Package and can be made available through Copilot for Merchants. Copilot for Merchants acts as the conversational control layer for merchant operations and allows users to interact with connected agents through natural language, including voice input where enabled.

Business Context

Product content quality directly affects product discoverability, conversion, operational efficiency, and compliance readiness. Many commerce teams maintain product data from multiple external and internal sources. This often leads to incomplete attributes, inconsistent terminology, missing SEO metadata, manual copywriting effort, and slow onboarding of new assortments.

The Product Content Agent addresses this by combining source retrieval, structured extraction, conflict detection, content generation, and quality indicators in one repeatable workflow. The goal is not to replace product data ownership, but to accelerate product content work and make enrichment results easier to review, approve, and import into operational systems.

Feature Overview

Feature	Description
AI-Driven Product Data Enrichment	Generates template-based product descriptions and structured product specifications by incorporating trusted sources. The output is intended to be precise, complete, standardized, and fact-based.
Data Aggregation and Validation	Combines product information from multiple sources, such as manufacturer PDFs, supplier databases, and internal PIM systems. The agent detects incomplete or conflicting information and enriches it with reliable data where available.
Product Search Optimization	Extracts relevant keywords, creates search-oriented content, and improves product discoverability for search, product lists, and product detail pages. Depending on the configured prompt, this can include content for generative engine optimization (GEO), such as product FAQs, question-and-answer blocks, how-to guidance, use cases, and concise answer snippets for AI-assisted search experiences.
Prompt-Based Content Generation	Allows customers to control structure, content focus, tone of voice, and style guidance through configurable prompts.

Target Users

User Group	Typical Goals
E-Commerce Manager	Improve catalog completeness, accelerate assortment onboarding, increase product findability, and reduce manual coordination effort.
Product Content Editor	Generate consistent product descriptions, enrich technical attributes, review quality scores, and publish approved content to downstream channels.
Partner Developer	Connect the agent to a custom PIM, map customer-specific attributes, automate enrichment jobs, and return enriched data to the customer's master data process.
Solution Architect	Define integration boundaries, source governance, security controls, and approval workflows for production use.

Conceptual Architecture

The Product Content Agent is organized as an enrichment pipeline. The pipeline starts with product identification, retrieves and evaluates available source material, extracts structured facts, resolves conflicts, generates product content, and returns a validated output object for review or import.

Product input
  SKU, brand, optional product name, optional product type, optional product URL
        |
        v
Source retrieval
  Multi-step web and deep search, manufacturer PDFs, supplier sources, internal PIM data when integrated
        |
        v
Extraction and validation
  PDF/OCR extraction, category detection, attribute discovery, source classification
        |
        v
Data consolidation
  Conflict resolution, missing-field detection, source and quality indicators
        |
        v
Content generation
  Product descriptions, SEO metadata, value drivers, structured attributes
        |
        v
Review and integration
  Copilot for Merchants interaction, PIM import, approval workflow, publication
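
The pipeline structure above can be pictured as a chain of stages that pass a shared state object along. The following is a minimal sketch, not the reference implementation: the class names, the `make_stage` helper, and the stage identifiers are hypothetical, and each stage is only a placeholder that records that it ran.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ProductInput:
    sku: str
    brand: str
    product_name: Optional[str] = None
    product_type: Optional[str] = None
    product_url: Optional[str] = None

    def validate(self) -> None:
        # SKU and brand are the only non-optional inputs in the diagram.
        if not self.sku.strip() or not self.brand.strip():
            raise ValueError("sku and brand are required")


@dataclass
class PipelineState:
    product: ProductInput
    completed_stages: list[str] = field(default_factory=list)
    data: dict = field(default_factory=dict)


Stage = Callable[[PipelineState], PipelineState]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(state: PipelineState) -> PipelineState:
        state.completed_stages.append(name)
        return state
    return stage


# Stage names mirror the boxes of the diagram (identifiers are assumptions).
STAGES: list[Stage] = [make_stage(name) for name in (
    "source_retrieval", "extraction_and_validation", "data_consolidation",
    "content_generation", "review_and_integration")]


def run_pipeline(product: ProductInput, stages: list[Stage] = STAGES) -> PipelineState:
    """Validate the input, then run every stage in order."""
    product.validate()
    state = PipelineState(product=product)
    for stage in stages:
        state = stage(state)
    return state
```

Keeping each stage behind the same function signature would allow a project to replace or skip individual stages without touching the others.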

Copilot for Merchants Integration

Copilot for Merchants provides a chat-based interface for daily commerce operations and acts as the central framework for invoking and coordinating domain-specific agents. When the Product Content Agent is exposed through Copilot for Merchants, business users can trigger enrichment workflows with natural-language prompts instead of navigating a technical interface.

Typical prompts include:

Enrich the product content for this SKU using manufacturer sources.
Create a technical German product description and SEO metadata for the selected product.
Check whether the product data has conflicting technical attributes.
Prepare enriched content for review before updating the PIM.

If voice input is enabled in Copilot for Merchants, users can dictate such prompts. Voice input is a Copilot capability; the agent still applies the configured source, prompt, approval, and integration rules.

Workflow

Step	Description	Main Output
1. Initialize Input	Validate product input and identify whether cached source documents already exist for the SKU.	Validated product request
2. Determine Category	Identify the product category from product input and source material. Category information is used to guide relevant attribute discovery.	Product category and confidence
3. Retrieve Sources	Search for product pages, datasheets, manuals, and other configured source types using built-in multi-step web and deep search. Manufacturer/OEM sources are treated as high-trust sources.	Source list and downloadable documents
4. Extract PDFs and Web Data	Extract text and tables from documents. The reference implementation uses a fallback chain for PDF processing and can use OCR for complex layouts.	Extracted source text and structured facts
5. Discover Attributes	Identify category-specific attributes, synonyms, values, missing critical attributes, and possible value drivers.	Attribute candidates and missing-data list
6. Resolve Conflicts	Compare values across sources and resolve conflicts by consensus, source reliability, or tier priority, depending on the data situation.	Consolidated product facts with conflict metadata
7. Generate Content	Create structured descriptions, SEO metadata, and search-oriented product content according to the configured prompt and brand guidance.	Reviewable product content
8. Validate and Score	Map the result to the target schema and calculate quality indicators such as completeness, confidence, source count, and missing fields.	PIM-ready JSON output

Web and Deep Search

The Product Content Agent includes a multi-step web and deep search capability to discover relevant product information beyond the data already available in the PIM. This search capability helps find official manufacturer pages, datasheets, manuals, certificates, product pages, supplier information, and other configured source types.

Deep search improves source coverage for products where the initial product record is incomplete or where important technical attributes are missing. The agent can search with SKU, brand, product name, product category, and discovered attribute gaps. This allows the enrichment process to move from broad discovery to targeted follow-up research for missing or conflicting data.
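
The move from broad discovery to targeted follow-up research can be illustrated with a simple query builder. This is a sketch under assumptions: the function name, the query wording, and the choice of "datasheet" and "manual" as broad document types are illustrative and do not describe the agent's actual search logic.

```python
from typing import Iterable, Optional


def build_queries(sku: str, brand: str, product_name: Optional[str] = None,
                  missing_attributes: Iterable[str] = ()) -> list[str]:
    """Return broad discovery queries first, then one query per attribute gap."""
    base = f'{brand} "{sku}"'
    if product_name:
        base = f"{base} {product_name}"
    # Broad discovery: look for the typical high-value document types.
    queries = [f"{base} datasheet", f"{base} manual"]
    # Targeted follow-up: one query per missing or conflicting attribute.
    queries += [f"{base} {attribute}" for attribute in missing_attributes]
    return queries
```

For example, `build_queries("A-100", "Acme", missing_attributes=["weight"])` yields two broad queries followed by one query that targets the missing weight.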

The search process is governed by customer configuration such as allowed sources, blocked sources, source priority, search depth, document limits, and review rules.
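
A source governance rule set of this kind could be applied to search results roughly as follows. The sketch assumes that allowed and blocked sources are expressed as domain names, that the position in the allow list doubles as source priority, and that the document limit is a simple cut-off; the function and parameter names are hypothetical.

```python
from urllib.parse import urlparse


def _matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def filter_sources(urls, allowed_domains, blocked_domains, max_documents):
    """Keep allowed, non-blocked URLs, ordered by allow-list priority."""
    ranked = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if any(_matches(host, domain) for domain in blocked_domains):
            continue  # blocked sources always lose
        for priority, domain in enumerate(allowed_domains):
            if _matches(host, domain):
                ranked.append((priority, url))
                break
    # The sort is stable, so URLs with equal priority keep their input order.
    ranked.sort(key=lambda pair: pair[0])
    return [url for _, url in ranked[:max_documents]]
```

Matching on the full host name or a dot-separated suffix avoids accepting look-alike domains that merely end with the same characters as an allowed domain.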

Output Model

The enrichment result is intended to be imported into a PIM or reviewed before import. A typical result contains the following sections:

Product identifiers such as SKU, MPN, EAN, GTIN, HS code, and country of origin where available.
Naming data such as manufacturer name, short name, display name, series, product line, and variant.
Structured attributes with value, unit, label, group, source tier, and value-driver markers.
Product descriptions and SEO metadata.
Optional GEO-oriented content such as FAQs, question-and-answer blocks, how-to sections, use cases, front-loaded answer snippets, and structured specification blocks when configured for the customer scenario.
Digital assets such as datasheets or manuals when available.
Compliance information such as certifications, standards, directives, and environmental indicators where available in sources.
Quality indicators such as confidence score, completeness score, data quality level, missing fields, and warnings.
Source references with URL, source tier, extracted fields, and access timestamp.
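
To make the sections above more tangible, the following sketch shows one possible shape of a result object and a small completeness check on its sections. The top-level section names follow the agent output keys used in the mapping table of this document; the nested keys, the helper names, and all sample values are illustrative assumptions, not a published schema.

```python
import json

# Sections treated as mandatory here; this selection is an assumption.
REQUIRED_SECTIONS = ("identifiers", "naming", "attributes", "content",
                     "quality_indicators", "sources")

# Made-up sample record; all values are placeholders for illustration.
example_result = {
    "identifiers": {"sku": "A-100", "mpn": "AC-A100"},
    "naming": {"display_name": "Acme Cordless Drill A-100",
               "manufacturer_name": "Acme"},
    "attributes": [{
        "id": "voltage", "label": "Voltage", "value": 18, "unit": "V",
        "group": "Electrical", "source_tier": "tier_1", "value_driver": True,
    }],
    "content": {
        "description": {"short": "18 V cordless drill.", "long": "", "technical": ""},
        "seo": {"meta_title": "", "meta_description": "", "keywords": []},
    },
    "digital_assets": [],
    "compliance": {},
    "quality_indicators": {"confidence_score": 0.8, "completeness_score": 0.5,
                           "missing_fields": ["content.description.long"],
                           "warnings": []},
    "sources": [{"url": "https://www.acme.example/a-100.pdf", "tier": "tier_1",
                 "extracted_fields": ["voltage"],
                 "accessed_at": "2026-06-10T00:00:00Z"}],
}


def missing_sections(result: dict) -> list[str]:
    """Return required sections that are absent or empty."""
    return [name for name in REQUIRED_SECTIONS if not result.get(name)]


def to_json(result: dict) -> str:
    """Serialize the result for review or import."""
    return json.dumps(result, indent=2, ensure_ascii=False)
```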

Quality Measurement

The Product Content Agent should make content quality measurable before and after enrichment. This helps business users understand the value of an enrichment run and helps operators identify products that still need review.

Metric	Description	Typical Use
Completeness score	Measures whether important output fields are filled, for example names, descriptions, and attributes.	Identify products with missing content before publication.
Confidence score	Indicates how reliable the generated data is based on source quality and consistency.	Separate import-ready records from records requiring manual review.
Source count and source tiers	Shows how many sources were used and whether high-trust sources were available.	Explain why a value was generated and detect products with weak source coverage.
Quality delta	Compares the product record before and after enrichment.	Provide a business-facing ROI signal for catalog enrichment initiatives.
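
As a worked example, a completeness score and a quality delta could be computed as shown below. The formula (share of filled required fields) is an assumption chosen for illustration; the document does not specify how the agent calculates these metrics.

```python
EMPTY_VALUES = (None, "", [], {})


def completeness_score(record: dict, required_fields: list[str]) -> float:
    """Share of required fields that carry a non-empty value (0.0 to 1.0)."""
    if not required_fields:
        return 0.0
    filled = sum(1 for name in required_fields
                 if record.get(name) not in EMPTY_VALUES)
    return filled / len(required_fields)


def quality_delta(before: dict, after: dict, required_fields: list[str]) -> float:
    """Completeness after enrichment minus completeness before enrichment."""
    return (completeness_score(after, required_fields)
            - completeness_score(before, required_fields))
```

With four required fields, a record that goes from one filled field to three filled fields moves from 0.25 to 0.75, which gives a quality delta of 0.5.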

ICM PIM Fields and Write-Back

When the Intershop Commerce Management (ICM) PIM is the target system, the Product Content Agent writes enrichment results into the product data model through a project-specific field mapping. The agent output is structured so that standard product fields, product attributes, SEO fields, and review metadata can be mapped separately.

Agent Output	Typical ICM PIM Target	Notes
naming.display_name, naming.manufacturer_name, naming.series, naming.product_line	Product name, display name, manufacturer name, series, product line, or corresponding custom product attributes	The exact write target depends on the customer's product model and naming conventions.
content.description.short, content.description.long, content.description.technical	Short description, long description, and technical description fields or localized content attributes	Existing content can be preserved unless the user or workflow explicitly approves replacement.
content.seo.meta_title, content.seo.meta_description, content.seo.keywords	SEO title, meta description, keywords, search terms, or channel-specific SEO attributes	Used for search optimization and discoverability where the storefront or search index consumes these fields.
attributes	Product attributes, classification attributes, technical specification attributes, or custom attributes	Each generated attribute is mapped by ID, label, value, unit, group, and source. Customer taxonomies such as ETIM, BMEcat, GS1, or internal field IDs can be used during mapping.
identifiers	MPN, EAN, GTIN, UPC, HS code, TARIC, country of origin, or corresponding identifier attributes	Identifier writes should respect uniqueness, formatting, and validation rules in the customer PIM.
digital_assets	Datasheet links, manual links, certificate links, image metadata, or asset references	Asset publication requires customer-specific rights and asset-management rules.
compliance	Compliance attributes such as certifications, standards, directives, RoHS, REACH, WEEE, and regional approvals	Compliance values are source-derived and should be reviewed according to the customer's compliance process.
quality_indicators and sources	Internal enrichment status, quality score, confidence score, source provenance, review notes, or audit attributes	These fields help reviewers understand why a value was generated and whether it is ready for import or publication.
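
A project-specific field mapping can be sketched as a lookup from agent output paths to target fields. In the example below, the output paths come from the table above, while the target field names (`displayName`, `shortDescription`, `seoTitle`) and both helper functions are hypothetical and do not represent actual ICM field names. The sketch also shows how existing content could be preserved unless replacement is explicitly approved.

```python
def get_path(data: dict, path: str):
    """Resolve a dotted path such as 'content.seo.meta_title'; None if absent."""
    current = data
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Agent output path -> target field (target names are placeholders).
FIELD_MAPPING = {
    "naming.display_name": "displayName",
    "content.description.short": "shortDescription",
    "content.seo.meta_title": "seoTitle",
}


def map_to_target(result: dict, mapping: dict, existing=None, overwrite=False) -> dict:
    """Build the update for the target system from an enrichment result."""
    existing = existing or {}
    update = {}
    for source_path, target_field in mapping.items():
        value = get_path(result, source_path)
        if value is None or value == "":
            continue  # nothing generated for this field
        if existing.get(target_field) and not overwrite:
            continue  # preserve existing content unless replacement is approved
        update[target_field] = value
    return update
```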

Default ICM REST API Integration

In the default Intershop scenario, product information is read from and written back to Intershop Commerce Management through the ICM REST API. The Product Content Agent can read the current product context by SKU or product reference, enrich the data with external and internal sources, and write approved values back to the mapped ICM PIM fields.

The write-back should be controlled by the customer workflow. Typical configurations distinguish between enrichment suggestions, review-stage updates, and approved writes. For partner integrations with a custom PIM, the same enrichment output can be delivered through a registered adapter instead of being written through the ICM REST API.

Adapter and Enrichment Context

The Product Content Agent can be connected to different source and target systems through an adapter layer. In the default Intershop scenario, this adapter uses the ICM REST API. For customer-specific PIM systems, a partner can implement and register an adapter with a callback base URL. The enrichment service then calls the adapter to list catalogs, read content items, and deliver enriched content back to the connected system.

For write-back, this adapter pattern works like a webhook: the Product Content Agent does not need to know the internal API of the third-party PIM. The adapter translates between the generic enrichment protocol and the customer's PIM data model.

Adapter Endpoint	Purpose
Register adapter	The external adapter registers its ID, display name, callback base URL, optional authorization token, capabilities, and enrichment context.
List collections	The enrichment service asks the adapter which catalogs, collections, or product groups can be processed.
Read items	The enrichment service retrieves product candidates or individual content items from the adapter.
Write enriched item	The enrichment service posts enriched content back to the adapter, which then updates the third-party PIM or stores the result for review.
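
The adapter contract in the table above could be modeled as an abstract interface. The following sketch is an assumption about how such a contract might look in code: the class, method, and payload field names are hypothetical, and the in-memory adapter only stands in for a real third-party PIM.

```python
from abc import ABC, abstractmethod
from typing import Optional


class PimAdapter(ABC):
    """Hypothetical adapter contract mirroring the endpoints in the table."""

    @abstractmethod
    def list_collections(self) -> list[str]:
        """Return the catalogs or product groups that can be processed."""

    @abstractmethod
    def read_items(self, collection: str) -> list[dict]:
        """Return the content items of one collection."""

    @abstractmethod
    def write_enriched_item(self, item_id: str, enriched: dict) -> None:
        """Accept enriched content for one item."""


class InMemoryAdapter(PimAdapter):
    """Stand-in for a third-party PIM; stores results for review."""

    def __init__(self, catalogs: dict[str, list[dict]]):
        self._catalogs = catalogs
        self.pending_review: dict[str, dict] = {}

    def list_collections(self) -> list[str]:
        return sorted(self._catalogs)

    def read_items(self, collection: str) -> list[dict]:
        return list(self._catalogs.get(collection, []))

    def write_enriched_item(self, item_id: str, enriched: dict) -> None:
        self.pending_review[item_id] = enriched


def registration_payload(adapter_id: str, display_name: str, callback_base_url: str,
                         capabilities: list[str], auth_token: Optional[str] = None,
                         enrichment_context: Optional[dict] = None) -> dict:
    """Assemble the registration data listed in the table (field names assumed)."""
    payload = {
        "id": adapter_id,
        "display_name": display_name,
        "callback_base_url": callback_base_url.rstrip("/"),
        "capabilities": capabilities,
        "enrichment_context": enrichment_context or {},
    }
    if auth_token:
        payload["authorization_token"] = auth_token
    return payload
```

In this sketch the adapter stores enriched items for review rather than publishing them, which corresponds to the "stores the result for review" option of the write endpoint.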

An enrichment context can be used to control domain-specific behavior without changing the core pipeline. Typical context settings include content type, target language, search strategy, source hints, prompt extensions, required attribute groups, and whether GEO-oriented content such as FAQ or how-to sections should be generated.

Context Setting	Purpose
Content type	Defines whether the pipeline enriches a product, article, legal document, or another structured content type.
Target language	Defines the language of generated descriptions, FAQs, how-to sections, and SEO/GEO content.
Prompt extension	Adds customer- or category-specific instructions, such as required technical attributes, terminology, tone, or forbidden generic phrases.
Search strategy	Defines whether the agent should use web search, document extraction, both, or only customer-provided source material.
Source hint	Provides a search-optimized hint for products that are hard to find, for example a combination of manufacturer name, SKU, product family, and document type.
GEO content options	Controls whether FAQ, how-to, answer snippets, and front-loaded summaries should be generated.
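
The context settings in the table could be captured in a small, immutable configuration object. This is a sketch only: the class name, field names, default values, and the enumeration values for the search strategy are assumptions derived from the table, not a documented configuration format.

```python
from dataclasses import dataclass
from enum import Enum


class SearchStrategy(str, Enum):
    WEB_SEARCH = "web_search"
    DOCUMENT_EXTRACTION = "document_extraction"
    BOTH = "both"
    CUSTOMER_SOURCES_ONLY = "customer_sources_only"


@dataclass(frozen=True)
class EnrichmentContext:
    content_type: str = "product"
    target_language: str = "en"
    search_strategy: SearchStrategy = SearchStrategy.BOTH
    prompt_extension: str = ""
    source_hint: str = ""
    geo_content: tuple[str, ...] = ()  # for example ("faq", "how_to")

    def allows_web_search(self) -> bool:
        """True if the pipeline may query the web for this context."""
        return self.search_strategy in (SearchStrategy.WEB_SEARCH,
                                        SearchStrategy.BOTH)

    def wants(self, geo_block: str) -> bool:
        """True if the given GEO content block should be generated."""
        return geo_block in self.geo_content
```

A context such as `EnrichmentContext(target_language="de", search_strategy=SearchStrategy.CUSTOMER_SOURCES_ONLY, geo_content=("faq",))` would then request German output with FAQ content and without any web search.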

Source Trust and Explainability

The Product Content Agent uses source references and quality indicators to support review and explainability. The reference implementation distinguishes at least two source tiers:

Tier	Meaning	Example
tier_1	Manufacturer or OEM source, typically identified by a matching brand domain or validated manufacturer document.	Manufacturer product page, official datasheet, official manual
tier_2	Other public or customer-approved source.	Distributor product page, public catalog page, third-party data source

When several sources contain different values for the same attribute, the agent can use weighted source reliability and consensus voting. The final result should still be reviewed according to the customer's approval process before publication.
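
Weighted source reliability combined with consensus voting can be illustrated as follows. The tier weights, the fallback weight for unknown tiers, and the function name are assumptions for this sketch; the document does not state the weights used by the reference implementation.

```python
from collections import defaultdict

# Assumed weights: a manufacturer source counts twice as much as another source.
TIER_WEIGHTS = {"tier_1": 2.0, "tier_2": 1.0}
UNKNOWN_TIER_WEIGHT = 0.5


def resolve_conflict(candidates):
    """Pick the value with the highest summed source weight.

    candidates is a list of (value, source_tier) pairs for one attribute.
    On a tie, the value that was seen first wins.
    """
    if not candidates:
        raise ValueError("at least one candidate value is required")
    votes = defaultdict(float)
    for value, tier in candidates:
        votes[value] += TIER_WEIGHTS.get(tier, UNKNOWN_TIER_WEIGHT)
    winner = max(votes, key=votes.get)
    return {
        "value": winner,
        "support": votes[winner] / sum(votes.values()),  # share of total weight
        "conflict": len(votes) > 1,                      # metadata for reviewers
    }
```

Under these assumed weights, three tier_2 sources that agree on "18 V" outvote a single tier_1 source that states "20 V", while a single tier_1 source outvotes a single tier_2 source. Whether consensus should be able to override a manufacturer source is a governance decision for the customer scenario.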

Source Configuration and Legal Responsibility

The Product Content Agent operates on public or customer-provided sources configured for the customer scenario. The customer is responsible for defining, managing, and maintaining allowed and disallowed sources and for ensuring that the use of third-party content complies with applicable copyright, data protection, and other legal requirements.

The agent is designed to generate original, structured product information based on contextual understanding of available source material. It should not be configured to copy third-party content verbatim into the product catalog.

Integration Scenarios

Scenario	Description	Typical Integration Point
Business-user enrichment in Copilot for Merchants	A merchant triggers enrichment for one or more products through a conversational prompt and reviews the generated result.	Copilot for Merchants, Intershop Commerce Management, approval workflow
Batch enrichment for catalog onboarding	A product content team enriches a list of SKUs before introducing a new assortment or supplier catalog.	CSV or Excel export/import, PIM staging area, scheduled enrichment job
Custom PIM integration	A partner connects a customer-specific PIM through a registered adapter, maps the result schema to the customer's product model, and receives enriched data through the adapter write-back endpoint.	Registered adapter, callback/webhook-style write-back endpoint, PIM API, message queue, custom middleware
Product search optimization	The agent creates keywords, search-friendly product names, and metadata to improve discoverability in product search and product lists.	Search index feed, product detail page content, SEO metadata fields

Batch Handling, Performance, and Parallelization

Batch enrichment is designed for catalog onboarding, supplier data clean-up, and larger product-content improvement initiatives. In customer and partner scenarios, batches are typically provided through a file upload in Copilot for Merchants or through a customer-specific Azure Storage Account upload location. The processing service picks up the file, validates the input, processes the contained products, and returns a result file or writes approved data back through the configured API.

For performance, products can be processed in parallel with a configurable worker count. Parallelization is useful because many enrichment steps are I/O-bound, for example source search, PDF download, OCR, and LLM calls. The reference implementation supports parallel batch processing with configurable workers and also performs PDF handling in parallel with rate limiting.

Production batch settings should balance throughput, cost, and reliability. The worker count, maximum number of PDFs per product, maximum pages per PDF, retry behavior, and rate-limit handling should be configured according to the customer's source landscape, Azure OpenAI capacity, OCR setup, and ICM REST API limits. Large batches should be processed asynchronously with status tracking so users can continue working in Copilot for Merchants while enrichment runs in the background.

Batch processing should persist progress per product item. This allows failed items to be retried separately, interrupted batches to resume, and write-back problems to be handled independently from enrichment results.
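
Parallel processing with per-item progress can be sketched with a thread pool, which suits I/O-bound steps such as search, download, and LLM calls. The function name, the status values, and the `progress` dictionary are assumptions; in a real setup the progress would be written to durable storage rather than kept in memory.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_batch(skus, enrich, progress, max_workers=4):
    """Enrich all SKUs that are not done yet and record a status per item.

    enrich is a callable that takes a SKU and returns the enrichment result.
    progress maps SKU -> status record and is updated in place.
    """
    # Skip items that a previous run already completed (resume support).
    pending = [sku for sku in skus
               if progress.get(sku, {}).get("status") != "done"]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(enrich, sku): sku for sku in pending}
        for future in as_completed(futures):
            sku = futures[future]
            try:
                progress[sku] = {"status": "done", "result": future.result()}
            except Exception as exc:  # one failed item must not stop the batch
                progress[sku] = {"status": "failed", "error": str(exc)}
    return progress
```

Calling `run_batch` again with the same `progress` object processes only the items that failed or were never started, which corresponds to the retry and resume behavior described above.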

Operational Considerations

Source governance should be defined before production use. This includes allowed domains, blocked domains, source priority, and audit requirements.
Generated content should be routed through a customer-defined review and approval workflow before publication.
Prompt templates should reflect the customer's brand voice, product taxonomy, required attribute groups, and channel-specific content rules.
Quality thresholds should define when content can be imported automatically, when it requires review, and when enrichment should be rejected.
Batch processing should consider API rate limits, source availability, PDF size, OCR cost, ICM REST API throughput, and retry behavior.
Partner integrations should treat the agent output as structured enrichment data with provenance, not as the sole system of record.
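
The quality thresholds mentioned above could be expressed as a simple routing rule. The threshold values, the decision labels, and the choice to use the weaker of the two scores are assumptions for illustration; actual thresholds should come from the customer's review policy.

```python
def route_result(confidence: float, completeness: float,
                 auto_import_min: float = 0.9, review_min: float = 0.6) -> str:
    """Return 'import', 'review', or 'reject' for one enrichment result."""
    if not 0.0 <= review_min <= auto_import_min <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= review_min <= auto_import_min <= 1")
    # The weaker of the two scores decides, so one strong score cannot
    # compensate for a weak one.
    score = min(confidence, completeness)
    if score >= auto_import_min:
        return "import"
    if score >= review_min:
        return "review"
    return "reject"
```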

Limitations

The agent can only extract and generate content from sources it can access and that are allowed for the customer scenario.
Completeness and confidence depend on source quality. Missing or contradictory source data may require manual review.
AI-generated product content may require customer approval before commercial use.
Compliance information should be treated as source-derived content and verified according to the customer's compliance process.
Custom PIM integration requires field mapping, authentication, error handling, and customer-specific workflow design.

Disclaimer

The information provided in the Knowledge Base may not be applicable to all systems and situations. Intershop Communications will not be liable to any party for any direct or indirect damages resulting from the use of the Customer Support section of the Intershop Corporate Website, including, without limitation, any lost profits, business interruption, loss of programs or other data on your information handling system.