The Dual Frontier: A Retailer’s Framework for Agentic Commerce


Introduction

Most retailers’ first instinct for onsite AI is an LLM-powered chat widget.
It’s understandable, but it centers on the technology, not the shopper.
Shoppers are unlikely to abandon twenty years of search-and-browse behavior
just because a chat bubble appeared in the corner. The harder, more valuable
work is starting with shoppers and determining how and when AI can benefit
their shopping experience. That’s a design problem specific to each retailer’s
categories, customers, and friction points, and it is solved through iterative
exploration, not by adding a chat widget and calling it a day.
Offsite, the infrastructure for agent-driven transactions is forming more slowly than
last September’s announcements implied. According to The Information, Away,
one of the brands Shopify showcased when OpenAI launched its in-app
checkout, still isn’t purchasable inside ChatGPT. A DTC brand with roughly $300
million in annual revenue applied for the commerce beta and was told the
rollout was proceeding slowly. The bottleneck isn’t payments, which work; it’s
that a merchant’s catalog requires hands-on data optimization before agents can
recommend inventory reliably.
These struggles share a root, and the patterns we see in client conversations
cluster around four tendencies:
01 BOTH / Treating catalog as a checkbox
instead of strategy.
Sparse or stale catalog data used to mean lower
conversion onsite. In the agentic world it also gives
agents less to work with, which risks exclusion from
consideration entirely.
02 ONSITE / Leaving first-party
data unused.
Returns, support tickets, and post-purchase failure
patterns are where category advantage lives.
LLMs will keep improving at general reasoning;
they’re unlikely to independently learn which of your
SKUs actually work for which use cases.
03 ONSITE / Chasing agent traffic
instead of earning direct habits.
Direct sessions protect margin, generate
proprietary data, and don’t depend on a platform’s
ranking algorithm. The goal is building onsite
experiences good enough that shoppers skip agents
entirely; when they don’t, the retailer cedes the
transaction to the LLM provider.
04 BOTH / Waiting for maturity instead of
learning while the market forms.
By the time agent traffic reaches 10-15% of
sessions, retailers who started early will have run
dozens of learning cycles on what predicts
conversion, what earns trust, and what to protect.
The defaults agents form by then will be expensive
to change.

1 https://www.theinformation.com/articles/openais-shopping-ambitions-hit-messy-data-reality
The best shopping assistance has always been human:
the ex-plumber in the plumbing aisle who asked about your project and level of expertise
before recommending parts, the audio specialist who learned about your room before
suggesting speakers. That expertise was expensive to staff, so it got cut.
The technology to restore it at scale without the headcount now exists.
This guide provides a framework for winning in agentic commerce across both frontiers.
Part 1 defines the onsite/offsite split and why they’re connected.
Part 2 covers building onsite advantage, using ambient intelligence and outcome data to
supplement what external agents can offer.
Part 3 addresses how to get discovered by agentic platforms (GEO) and what to protect once
you are.
Part 4 reframes retail media economics as agent-driven shopping compresses the funnel.
Part 5 diagnoses where the work typically stalls and how to sequence it.
Appendix B has a 15-minute diagnostic.
It’ll tell you whether to start now or fix foundations first.

Part 1:
The Expectation Gap
and Dual Frontier

A shopper asks ChatGPT for standing desk
recommendations, gets clear on weight capacity,
cable management, and height range for their setup,
then lands on a PDP with a spec sheet and 47
reviews to parse. That gap between the conversation
they just had and the experience offered is where
they’re lost.
This is the onsite problem: shoppers are forming
expectations about what intelligent shopping feels
like, and most retail sites are not meeting them.
Not because the technology is missing, but because
it’s implemented wrong: a chat widget that waits to
be discovered instead of surfacing when behavior
signals confusion.
The offsite problem is different but related.
A growing share of shopping journeys will start in
agent surfaces rather than search queries.
ChatGPT, Gemini, Perplexity, and whatever comes
next will shape consideration sets before shoppers
reach any retailer’s site, and increasingly close the
transaction without them reaching it at all.
Models can reason over unstructured content, but
they won’t guess at claims on your behalf when a
wrong guess could break users’ trust in them. If your
catalog requires guessing on allergens, compatibility,
or fit, agents will recommend someone else.
These problems share a root:
frontier models are getting good enough at product
reasoning that shoppers will use them.
Some retailers will benefit (at least initially); others
will get routed around. This splits agentic commerce
across two battlegrounds:
Onsite (your properties):
Where you control the experience and hold
outcome data no one else has. Where you can
build habits that make external agents less
necessary. This is the offense: earning direct
relationships through intelligence that frontier
agents can’t replicate.
Offsite (agent ecosystems):
Where consideration sets form and where you
need to be discoverable without giving away your
differentiation. Open protocols are arriving fast:
Google’s UCP is already live, and OpenAI’s commerce
integration, ACP, is onboarding merchants now.
These will standardize transactions but not whether
agents recommend you in the first place. What you
can control (catalog depth, outcome signals, and
what you share versus protect) is covered in Parts 2
and 3. The goal is staying in the game without
becoming a commodity.
Retailers who treat onsite and offsite as competing
priorities will fail at both. Retailers who see them as
two expressions of the same capability will
compound advantages in both directions.

Part 2:
The Onsite
Advantage

GPT-6 will be smarter than GPT-5. It still won’t
know that your customers return the slim-fit oxfords
from Brand A at three times the rate of the classic-fit.
Why Retailers Should Win Onsite
Frontier models won’t have all the retailers’
outcome data. They won’t know which of 47 mattress
SKUs actually works for side sleepers with back pain
based on return rates. They can’t tell shoppers that
customers who buy the mid-tier air purifier for rooms
over 400 square feet return it at twice the rate of the
premium, because that pattern lives in your returns
data, not in any spec sheet. They won’t discover that
furniture shoppers care about assembly time while
ignoring weight capacity because they’ve never seen
how those customers actually behave.
Better chat interfaces won’t save anyone;
faster responses won’t either. Public reviews and
ratings give frontier models a partial picture, but the
fuller one lives in data only retailers hold:
return reasons by SKU attribute, support ticket
patterns that surface defects before they’re public,
behavioral signals showing which product
comparisons actually lead to kept purchases. That
accumulated knowledge of what matters in specific
categories is what no outside model can synthesize.
Onsite agent interactions add a new layer of signal:
when shoppers share budget, timeline, and use-case
context directly, that’s intent data, richer than any
clickstream, and it’s now yours.
Agent platforms will eventually know more about
individual shoppers across their full purchase history;
retailers will know more about what actually works in
their categories. Depth beats breadth when the
question is specific. And as protocols mature, the
onsite agent that resolves a shopper’s uncertainty
and the offsite agent that sent them will share
context, making the outcome data you’re capturing
now more valuable, not less.
Ambient Intelligence, Not Just Chatbots
Most retailers default to chat widgets because
that’s what “AI assistance” looks like. That is the
wrong model.
Shoppers don’t abandon twenty years of
search-and-browse muscle memory just because a
chat bubble appeared in a corner. When assistance requires the shopper to seek it out, most won’t, and
teams might conclude the demand isn’t there. This
is the wrong lesson from the right observation.
The better model is ambient intelligence:
assistance that surfaces at moments of friction and
disappears when not needed. The interface is still
conversational; what changes is when it appears.
Silent by default, helpful when behavior
signals uncertainty.
Google figured this out with AI Overviews.
They embedded intelligence directly into search
results. Users didn’t have to change behavior; the
experience just got smarter. That’s ambient. It’s
already generating billions of monthly chats.
A good store clerk doesn’t wait for you to ask.
They see you standing in the aisle, holding two
boxes, eyes flicking between spec sheets, and they
walk over with exactly the context you need.
Ambient intelligence works the same way.
A shopper lands on a product detail page for a
monitor arm, scrolls down, scrolls back up, toggles
between variants, pauses on the specs. The system
recognizes the pattern and surfaces a small decision
window: “This arm works with desks 0.5 to 2 inches
thick. Monitors up to 27 inches are supported.
Most customers install in 10 minutes.” They can go
deeper if they want, or add to cart and move on.
The trigger signals are readable: dwell time on
specs, toggling between variants, pogo-sticking
between product pages, scroll patterns that
suggest confusion. These are signals only the
retailer sees; no offsite agent has access to how a
shopper is behaving on your site. The triggers differ
by category, but the principle is the same: surface
help that matches how they’re already thinking,
not help that interrupts it. Albertsons found that
when assistance is contextually right, even in
grocery where shoppers resist anything that slows
a routine trip, their Ask AI capability delivers a 10%
increase in basket size. The triggering logic is
something you test and learn over time.
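As a rough illustration of how such trigger signals could be combined, the sketch below scores a session with a weighted sum and surfaces help once the score crosses a threshold. The signal names, weights, and threshold are hypothetical placeholders that a team would tune through testing; they are not values from any retailer.

```python
from dataclasses import dataclass

# Hypothetical weights per signal; real values would come from experiments.
WEIGHTS = {
    "spec_dwell_seconds": 0.02,  # per second spent on the spec section
    "variant_toggles": 0.15,     # per switch between variants
    "pdp_pogo_sticks": 0.25,     # per bounce between product pages
    "scroll_reversals": 0.10,    # per scroll down followed by a scroll back up
}
THRESHOLD = 1.0  # assumed cut-off above which assistance is surfaced


@dataclass
class SessionSignals:
    spec_dwell_seconds: float = 0.0
    variant_toggles: int = 0
    pdp_pogo_sticks: int = 0
    scroll_reversals: int = 0


def uncertainty_score(signals: SessionSignals) -> float:
    """Weighted sum of the behavioral signals observed in this session."""
    return sum(weight * getattr(signals, name) for name, weight in WEIGHTS.items())


def should_surface_help(signals: SessionSignals, threshold: float = THRESHOLD) -> bool:
    """Stay silent by default; surface help only when the score is high enough."""
    return uncertainty_score(signals) >= threshold
```

A session with 30 seconds on specs, three variant toggles, and two scroll reversals would cross this assumed threshold, while a five-second glance at the specs would not. The weights and the threshold are exactly what the experiments would adjust per category.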
But the harder problem isn’t when to surface; it’s
what to say when you do. Outcome data tells you
‘runs large, size down.’ User context tells you ‘you’ve
bought medium in similar brands; this one runs
small.’ Both resolve uncertainty a sizing chart can’t.
Four Moments Where Intelligence Creates Value
Not every moment in a shopping journey benefits from intervention.
Knowing where to surface help and where to stay silent is itself a form of specialization.
A shopper asks ChatGPT for a monitor arm
recommendation. If your catalog isn’t structured for
that query, you’re not in the answer. If it is structured
but your prices are stale, you’re in the answer once,
then skipped next time.
The goal of onsite intelligence is earning direct
relationships, but not every shopper will form that
habit. Some will start with ChatGPT because they’re
early in research, some will use Perplexity to
compare across retailers, some will ask Gemini
because it’s integrated into their phone.
Invisibility to these agents means exclusion from
the consideration set. Visibility with an inaccurate
catalog means one recommendation, one
disappointment, then deprioritization. Visibility and
accuracy but with full differentiation exposed
means becoming one of five equivalent options
competing on price.
Protocols Are Plumbing
The agent commerce stack is converging on
standards: ways for assistants to discover what you
sell, verify what’s true, and complete transactions.
Google’s UCP, OpenAI’s ACP commerce
integrations, and whatever follows will reduce
integration friction. They also create a trap:
teams mistake “connected” for “competitive.”
Protocols tell an agent how to ask and how you
respond. They don’t improve the truthfulness or
usability of what you return. If your catalog is sparse,
inconsistent, or stale, protocols transmit that
weakness at machine speed. And agents are
optimized for user trust: when confidence is low
(missing attributes, messy variants, unreliable
availability), the agent doesn’t try anyway; it moves
on. In the old world, poor data meant lower
conversion. In the agentic world, poor data means
you may never enter consideration.
Optimize for Reasoning Engines
SEO earned you a ranking.
GEO earns you a recommendation.
When a shopper asks Perplexity for a standing
desk that fits a small apartment with good cable
management, the engine isn’t matching keywords.
It’s reasoning over structured attributes:
dimensions, cable routing features, weight capacity,
assembly complexity. If those attributes exist in your
catalog as machine-readable data rather than buried
in marketing copy, you’re more likely to be a
candidate. If the engine also finds third-party reviews,
consistent pricing across sources, and recent
availability updates, it gains the confidence to cite you.
If not, it recommends someone else.
This is the discipline emerging as generative
engine optimization: structuring content and
product data so reasoning engines can understand,
trust, and surface your products in AI-generated
answers. The tactics differ from SEO: schema markup
over keyword density, explicit use-case attributes
over inferred relevance, conversational FAQ content
over landing-page prose. The principle is the same:
if you’re not optimized for how the system decides,
you’re invisible to it.
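To make "machine-readable data rather than marketing copy" concrete, here is a minimal sketch that turns a catalog record into schema.org Product markup (JSON-LD). The record layout (`name`, `sku`, `price`, `currency`, `in_stock`, `attributes`) is a hypothetical internal format assumed for illustration; the schema.org types used (Product, Offer, PropertyValue) are standard vocabulary, but which attributes any given engine actually weighs is not something this sketch can promise.

```python
import json


def product_jsonld(record: dict) -> str:
    """Render a catalog record (hypothetical layout) as schema.org JSON-LD."""
    availability = "InStock" if record["in_stock"] else "OutOfStock"
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": record["name"],
        "sku": record["sku"],
        "offers": {
            "@type": "Offer",
            "price": str(record["price"]),
            "priceCurrency": record["currency"],
            "availability": f"https://schema.org/{availability}",
        },
        # Explicit use-case attributes become structured name/value pairs
        # instead of being buried in description text.
        "additionalProperty": [
            {"@type": "PropertyValue", "name": name, "value": value}
            for name, value in record.get("attributes", {}).items()
        ],
    }
    return json.dumps(data, indent=2)


desk = {
    "name": "Compact Standing Desk",
    "sku": "DESK-001",
    "price": 349.0,
    "currency": "USD",
    "in_stock": True,
    "attributes": {"width_in": 40, "cable_tray": "yes", "weight_capacity_lb": 150},
}
markup = product_jsonld(desk)
```

The resulting string would typically be embedded in the product page inside a `script` tag of type `application/ld+json`.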
The catalog work described in Part 2 serves
double duty here. Structured attributes that power
onsite intelligence also make your products legible
to external agents. Outcome-derived insights such as
“runs large, size down” or “doesn’t work with curved
screens” become the contextual signals that give
reasoning engines confidence to recommend you
over competitors with sparser data. The investment
compounds across both frontiers.
What This Looks Like by Category
Fashion – Share freely: sizes, materials,
care instructions, variant images.
Share selectively: “slim-fit styles from Brand A
run a full size small” (the insight your return
data revealed). Protect the return rates by
style and brand that taught you which fits
actually work.
Electronics – Share freely: specs,
compatibility lists, warranty terms.
Share selectively: “known connectivity issues
with Ring Gen 2 doorbells” (the insight your
support tickets revealed). Protect the ticket
volume and resolution patterns that surface
these issues before they’re public.
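The share freely / share selectively / protect split can be expressed as a field-level policy applied before any data leaves for an agent feed. The sketch below is one possible shape, assuming a hypothetical fashion record; the field names and tier assignments are illustrative only.

```python
# Hypothetical field-to-tier mapping for a fashion catalog.
SHARING_POLICY = {
    "sizes": "share",
    "materials": "share",
    "care_instructions": "share",
    "fit_insight": "selective",         # derived insight, e.g. "runs a full size small"
    "return_rate_by_style": "protect",  # raw outcome data stays internal
}


def build_agent_feed(record: dict, include_selective: bool = True) -> dict:
    """Return only the fields the policy allows to be sent offsite."""
    allowed = {"share"}
    if include_selective:
        allowed.add("selective")
    # Fields missing from the policy are treated as protected by default.
    return {
        field: value
        for field, value in record.items()
        if SHARING_POLICY.get(field, "protect") in allowed
    }


oxford = {
    "sizes": ["S", "M", "L"],
    "materials": "cotton",
    "fit_insight": "runs a full size small",
    "return_rate_by_style": 0.31,
    "internal_margin": 0.42,
}
feed = build_agent_feed(oxford)
```

Defaulting unknown fields to "protect" is a deliberate choice in this sketch: a new internal column never leaks into a feed just because nobody classified it yet.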
Traditional e-commerce economics reward
engagement: more page views, more time on site,
more ad impressions during browsing.
Revenue scales with attention captured.
Agent commerce inverts this. Shoppers converting
in three touches instead of twelve is success;
specialization resolved their uncertainty faster.
But the dashboard looks worse: page views down,
time on site down, display impressions down.
The inversion is already underway. Retailers who
adapt their economics and ad servers will capture it;
retailers who protect old dashboards will blame
the market.
Why the Math Still Works
The concern is obvious: fewer page views means
fewer ad impressions means less RMN revenue.
But this assumes the only monetizable moment is
passive browsing. Amazon’s Rufus offers early
evidence. Customers using the assistant are 60%
more likely to convert; the 250 million users who
engaged in 2025 generated an incremental $10 billion
in GMV. Fewer browsing sessions, more decisive
ones. And decisive sessions should command higher
prices: brands will pay more per outcome when the
intent is already there.
The advertiser used to pay for 10 clicks at $1 each
to get one purchase. Now they pay $10 for one
conversion with much higher efficiency.
Their economics stay the same; the path just
compresses. The opportunity is selling outcomes
against moments that didn’t exist before, provided
you can prove the outcomes are incremental.
What This Requires
Measurement becomes non-negotiable.
Without holdouts, there’s no way to distinguish
efficiency gains from disengagement. If sessions-
to-convert drops from 12 to 3, is that because the
agent surface helped or because those shoppers
were going to buy anyway? Retailers who can answer
this question will justify continued investment;
retailers who can’t will cut the wrong things.
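A holdout comparison is one way to answer that question. The sketch below compares conversion for shoppers who saw the agent surface against a randomly held-out group who did not. The function names and the example counts are hypothetical, and the sketch omits significance testing, which a real readout would need before any budget decision.

```python
def conversion_rate(conversions: int, shoppers: int) -> float:
    if shoppers <= 0:
        raise ValueError("group must contain at least one shopper")
    return conversions / shoppers


def incremental_lift(treated_conv: int, treated_n: int,
                     holdout_conv: int, holdout_n: int) -> dict:
    """Compare the exposed group against the holdout group."""
    treated = conversion_rate(treated_conv, treated_n)
    holdout = conversion_rate(holdout_conv, holdout_n)
    absolute = treated - holdout
    return {
        "treated_rate": treated,
        "holdout_rate": holdout,
        "absolute_lift": absolute,
        # Relative lift is undefined when the holdout never converts.
        "relative_lift": absolute / holdout if holdout > 0 else None,
        # Conversions that would not have happened without the surface.
        "incremental_conversions": absolute * treated_n,
    }


# Made-up counts purely for illustration.
result = incremental_lift(treated_conv=600, treated_n=10_000,
                          holdout_conv=50, holdout_n=1_000)
```

With these made-up counts the exposed group converts at 6% against 5% for the holdout, so only about 100 of the 600 conversions are incremental; the other 500 would likely have happened anyway.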
Pricing models follow. CPM assumes volume.
Agent surfaces have lower volume but higher intent.
CPA or hybrid models that reward conversion
become more attractive to brands who want
efficiency, and more defensible for retailers who can
prove incrementality.
Catalog teams own product data but don’t see
support tickets. UX teams build surfaces but don’t
see return rates. RMN teams optimize for impressions
while agent success requires fewer of them.
Data science builds models but waits months for
engineering to deploy them.
This is the real reason most retailers will fail at
agentic commerce: not missing technology but
missing coordination.
Learning velocity requires these teams working off
the same feedback loop with compatible incentives.
A support ticket about curved-monitor compatibility
should reach the catalog within days, not quarters.
A/B test results should change what surfaces in
weeks, not fiscal years. The retailers who fix this will
compound advantages while competitors debate
ownership. A partner like Moloco Commerce Media
already operating your retail media can serve as
connective tissue: the catalog normalization is done,
the experimentation primitives exist, the outcome
data is already flowing. Agent commerce becomes
an extension, not a second build.
The Learning Window
Agent-referred traffic is small today, which is
precisely why now is the time to start.
Learning in agent commerce can’t be compressed
with budget or headcount. Each cycle teaches you
something: which catalog attributes actually predict
conversion, which onsite interventions earn trust
versus annoy, which signals to share offsite and
which to protect. Target 4-6 experiments a quarter.
This knowledge accumulates through iteration.
The question that predicts success: how fast can
you close the loop? A shopper struggles with
monitor arm compatibility. That shows up in support
tickets Tuesday. How quickly does “doesn’t work with
curved screens over 32 inches” appear in the catalog
and agent surfaces? Days? Weeks? Quarters?
That cycle time, from signal to encoded
insight to live intervention, is measurable.
Retailers running 30 cycles per year will
learn things retailers running 3 cycles
never discover.
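Measuring that cycle time needs little more than two timestamps per loop: when the signal was first seen and when the resulting change went live. A minimal sketch, assuming a hypothetical log of such pairs (the dates below are invented):

```python
from datetime import datetime
from statistics import median


def cycle_days(signal_seen: datetime, intervention_live: datetime) -> float:
    """Days from first seeing a signal to the fix being live for shoppers."""
    return (intervention_live - signal_seen).total_seconds() / 86400


def median_cycle_days(loops: list[tuple[datetime, datetime]]) -> float:
    """Median cycle time across a set of (signal_seen, intervention_live) pairs."""
    return median(cycle_days(seen, live) for seen, live in loops)


# Invented example loops: support ticket date -> catalog/agent update date.
loops = [
    (datetime(2025, 1, 7), datetime(2025, 1, 10)),
    (datetime(2025, 1, 7), datetime(2025, 1, 21)),
    (datetime(2025, 2, 1), datetime(2025, 2, 6)),
]
typical = median_cycle_days(loops)
```

The median is used here rather than the mean so that one loop stuck for a quarter does not hide the fact that most loops close in days.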
The Margin Trap to Avoid
Barnes & Noble had the distribution and data
to dominate online books but protected
brick-and-mortar retail instead, because its margins
looked so much better than the cost of figuring out
shipping logistics.
Agent commerce makes shopping more efficient.
Some retailers will see declining engagement
metrics and conclude it’s destructive.
They’ll throttle investment to protect page views and
ad impressions while competitors who adapt
measurement grow margin per transaction,
capturing shoppers who simply want buying
to be easier.
Protecting old metrics is how you guarantee the
disruption happens to you instead of through you.
PART 2: THE ONSITE ADVANTAGE
01 Explore (vague intent,
needs structure):
A shopper arrives with “running shoes for trails”
but needs help narrowing. The choice is between
surfacing meaningful starting points based on what
actually differentiates options or dumping 200 results
sorted by popularity. The intervention here is a
guided entry: “Rocky terrain or groomed paths?
Under or over 5 miles?” Two questions that cut the
consideration set by 80%.
02 Evaluate (clear options,
needs comparison clarity):
They’ve identified 3-4 candidates. The choice is
between making tradeoffs explicit or forcing them to
open four tabs and decode spec sheets.
The intervention is comparison synthesis:
“Option A has better grip but runs warm.
Option B drains faster but fits narrow. Option C is the
budget pick with no major tradeoffs.” This is where
outcome data shines. The retailer knows Option A
triggers “runs hot” returns; a frontier agent doesn’t.
04 Complete (in cart,
checking for gaps):
Uncertain about completeness. Did they forget
something? Will they regret not adding the
accessory? The intervention is domain-informed
completion: “Most customers add the wall anchors
for this shelf” beats “frequently bought together”
because it’s based on what actually gets returned
with “fell off wall” in the notes.
The Goal Is Habit Formation
Efficiency is measurable in months. Habit formation takes longer but matters more.
This is the bet: when a site resolves uncertainty faster than competitors, shoppers will notice.
Not consciously at first, but after three or four sessions where questions were anticipated,
the behavior shifts: they will start coming directly. Direct session rates climb, branded search
referrals grow, and agent-referred traffic becomes a smaller share of conversions. They skip the
Google search, skip the ChatGPT query, trust that they’ll find what they need.
This is the real prize: not capturing the session but earning the default. Retailers who build
this don’t need to worry as much about offsite agents because their shoppers still see a reason
to come direct.
01 Discovery:
The shopper is building a consideration set.
An agent surfaces your product as a recommendation.
That’s an outcome opportunity, a conversion you can
drive and attribute, that didn’t exist when discovery
meant banner ads against browse traffic.
02 Comparison:
The shopper is qualifying options. An agent
synthesizes tradeoffs and highlights a winner.
That’s an outcome opportunity, influencing the
decision at the moment it’s made, that didn’t exist
when comparison meant hoping they clicked your
sponsored listing.
03 Decision:
The shopper is ready to buy but hesitating.
An agent resolves uncertainty or completes the
basket. That’s an outcome opportunity, closing the
sale at the point of friction, that didn’t exist when
decision-stage meant retargeting ads over days.
Two windows are closing simultaneously.
Onsite, shoppers are forming habits about what
intelligent shopping feels like; a site that feels static
while a competitor’s anticipates needs is training
shoppers to start somewhere else next time.
Offsite, agents are building defaults about which
retailers to trust, which catalogs to rely on,
which signals to weight. By the time agent traffic
reaches 10-15% of sessions, those defaults will be
expensive to change.
The Cold-Start Problem
Behavioral learning requires behavior.
New categories, new SKUs, and long-tail products
don’t have outcome data yet. This is where catalog
structure earns its keep. Rich attributes, explicit
compatibility relationships, and structured tradeoffs
let agent surfaces perform reasonably before
behavioral signals accumulate. The catalog does the
work until the data catches up.
Retailers who treat catalog structure as
“checkbox” and behavioral learning as “the real
intelligence” will have dead zones across half their
assortment. The two systems bootstrap each other.
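One simple way to let the catalog "do the work until the data catches up" is to blend a catalog-derived prior with the observed behavioral estimate, weighting the observation by how much data exists. The sketch below uses a standard shrinkage formula; the function name, the example scores, and the prior strength of 20 are hypothetical choices for illustration.

```python
def blended_score(catalog_prior: float, observed: float,
                  n_observations: int, prior_strength: float = 20.0) -> float:
    """Shrink the behavioral estimate toward the catalog prior.

    With no observations the catalog prior is returned unchanged; as
    observations accumulate, the behavioral estimate dominates.
    prior_strength is an assumed tuning knob: the number of observations
    at which prior and behavior carry equal weight.
    """
    if n_observations < 0:
        raise ValueError("n_observations must be non-negative")
    total = n_observations + prior_strength
    return (n_observations * observed + prior_strength * catalog_prior) / total


# New SKU: attributes suggest a good fit (0.7), but there is no behavior yet.
new_sku = blended_score(catalog_prior=0.7, observed=0.0, n_observations=0)
# Established SKU: 2,000 sessions show the real fit is closer to 0.2.
old_sku = blended_score(catalog_prior=0.7, observed=0.2, n_observations=2000)
```

A long-tail product with a handful of sessions stays close to what its attributes imply, which avoids the dead zones described above, while a high-volume product is driven almost entirely by outcomes.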
That proof requires your ad server and agent surface
sharing the same infrastructure. Separate systems
mean high-intent moments you can’t price and
conversions you can’t attribute.
Agent commerce creates new outcome
opportunities across the shopping journey:
Grocery – Share freely: ingredients,
allergens, nutrition, pricing. Share selectively:
“store-brand pasta is an accepted substitute;
store-brand peanut butter is not” (the insight
your fulfillment data revealed). Protect the
behavioral data showing which substitutions
get accepted, rejected, or trigger churn.
Furniture – Share freely: dimensions,
materials, weight capacity. Share selectively:
“assembly takes 45 minutes for most
customers” (the insight your reviews and
returns revealed). Protect the analytical
method that identified that assembly time
matters more than weight capacity to
your shoppers.
03 Commit (ready to buy,
last hesitations):
Near Add to Cart with unresolved questions.
Will this work with their existing setup? Is this the
right size? The intervention is uncertainty resolution:
“Fits curved monitors up to 32 inches” or “Runs
large, most customers size down.” They either
convert or leave to validate elsewhere.
But proving incrementality requires
connected infrastructure. If your agent
surfaces and ad server operate as separate
systems, you create high-intent moments
you can’t price and conversions you can’t
attribute. The economic model only works
when agent intelligence and monetization
share the same foundation.

06
GPT-6 will be smarter than GPT-5. It still won’t
know that your customers return the slim-fit oxfords
from Brand A at three times the rate of the classic-fit.
Why Retailers Should Win Onsite
Frontier models won’t have all the retailers
outcome data. They won’t know which of 47 mattress
SKUs actually works for side sleepers with back pain
based on return rates. They can’t tell shoppers that
customers who buy the mid-tier air purifier for rooms
over 400 square feet return it at twice the rate of the
premium, because that pattern lives in your returns
data, not in any spec sheet. They won’t discover that
furniture shoppers care about assembly time while
ignoring weight capacity because they’ve never seen
how those customers actually behave.
Better chat interfaces won’t save anyone;
faster responses won’t either. Public reviews and
ratings give frontier models a partial picture, but the
fuller one lives in data only retailers hold:
return reasons by SKU attribute, support ticket
patterns that surface defects before they’re public,
behavioral signals showing which product
comparisons actually lead to kept purchases. That
accumulated knowledge of what matters in specific
categories is what no outside model can synthesize.
Onsite agent interactions add a new layer of signal:
when shoppers share budget, timeline, and use-case
context directly, that’s intent, data richer than any
clickstream and it’s now yours.
Agent platforms will eventually know more about
individual shoppers across their full purchase history;
retailers will know more about what actually works in
their categories. Depth beats breadth when the
question is specific. And as protocols mature, the
onsite agent that resolves a shopper’s uncertainty
and the offsite agent that sent them will share
context, making the outcome data you’re capturing
now more valuable, not less.
Ambient Intelligence, Not Just Chatbots
Most retailers default to chat widgets because
that’s what “AI assistance” looks like. That is the
wrong model.
Shoppers don’t abandon twenty years of
search-and-browse muscle memory just because a
chat bubble appeared in a corner. When assistance requires the shopper to seek it out, most won’t, and
teams might conclude the demand isn’t there. This
is the wrong lesson from the right observation.
The better model is ambient intelligence:
assistance that surfaces at moments of friction and
disappears when not needed. The interface is still
conversational; what changes is when it appears.
Silent by default, helpful when behavior
signals uncertainty.
Google figured this out with AI Overviews.
They embedded intelligence directly into search
results. Users didn’t have to change behavior, the
experience just got smarter. That’s ambient. It’s
already generating billions of monthly chats.
A good store clerk doesn’t wait for you to ask.
They see you standing in the aisle, holding two
boxes, eyes flicking between spec sheets and they
walk over with exactly the context you need.
Ambient intelligence works the same way.
A shopper lands on a product detail page for a
monitor arm, scrolls down, scrolls back up, toggles
between variants, pauses on the specs. The system
recognizes the pattern and surfaces a small decision
window: “This arm works with desks 0.5 to 2 inches
thick. Your monitor up to 27 inches is supported.
Most customers install in 10 minutes.” They can go
deeper if they want, or add to cart and move on.
The trigger signals are readable: dwell time on
specs, toggling between variants, pogo-sticking
between product pages, scroll patterns that
suggest confusion. These are signals only the
retailer sees; no offsite agent has access to how a
shopper is behaving on your site. The triggers differ
by category, but the principle is the same: surface
help that matches how they’re already thinking,
not help that interrupts it. Albertsons found that
when assistance is contextually right, even in
grocery where shoppers resist anything that slows
a routine trip, their Ask AI capability delivers a 10%
increase in basket size. The triggering logic is
something you test and learn over time.
But the harder problem isn’t when to surface, it’s
what to say when you do. Outcome data tells you
‘runs large, size down.’ User context tells you ‘you’ve
bought medium in similar brands; this one runs
small.’ Both resolve uncertainty a sizing chart can’t.
Four Moments Where Intelligence Creates Value
Not every moment in a shopping journey benefits from intervention.
Knowing where to surface help and where to stay silent is itself a form of specialization.
03 Commit (ready to buy,
last hesitations):
A shopper asks ChatGPT for a monitor arm
recommendation. If your catalog isn’t structured for
that query, you’re not in the answer. If it is structured
but your prices are stale, you’re in the answer once,
then skipped next time.
The goal of onsite intelligence is earning direct
relationships, but not every shopper will form that
habit. Some will start with ChatGPT because they’re
early in research, some will use Perplexity to
compare across retailers, some will ask Gemini
because it’s integrated into their phone.
Invisibility to these agents means exclusion from
the consideration set. Visibility with an inaccurate
catalog means one recommendation, one
disappointment, then deprioritization. Visibility and
accuracy but with full differentiation exposed
means becoming one of five equivalent options
competing on price.
Protocols Are Plumbing
The agent commerce stack is converging on
standards: ways for assistants to discover what you
sell, verif y what’s true, and complete transactions.
Google’s UCP, OpenAI’s ACP commerce
integrations, and whatever follows will reduce
integration friction. They also create a trap:
teams mistake “connected” for “competitive.”
Protocols tell an agent how to ask and how you
respond. They don’t improve the truthfulness or
usability of what you return. If your catalog is sparse,
inconsistent, or stale, protocols transmit that
weakness at machine speed. And agents are
optimized for user trust: when confidence is low,
missing attributes, messy variants, unreliable
availability, the agent doesn’t try anyway. In the old
world, poor data meant lower conversion. In the
agentic world, poor data means you may never
enter consideration.
Optimize for Reasoning Engines
SEO earned you a ranking.
GEO earns you a recommendation.
When a shopper asks Perplexity for a standing
desk that fits a small apartment with good cable
management, the engine isn’t matching keywords.
It’s reasoning over structured attributes:
dimensions, cable routing features, weight capacity,
assembly complexity. If those attributes exist in your
catalog as machine-readable data rather than buried
in marketing copy, you’re more likely to be a
candidate. If the engine also finds third-party r eviews,
consistent pricing across sources, and recent
availability updates, it gains the confidence to cite you .
If not, it recommends someone else.
This is the discipline emerging as generative
engine optimization: structuring content and
product data so reasoning engines can understand,
trust, and surface your products in AI-generated
answers. The tactics differ from SEO: schema markup
over keyword density, explicit use-case attributes
over inferred relevance, conversational FAQ content
over landing-page prose. The principle is the same:
if you’re not optimized for how the system decides,
you’re invisible to it.
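One way to picture "machine-readable data rather than marketing copy" is schema.org Product markup serialized as JSON-LD. The sketch below is illustrative only: the desk, its SKU, the attribute names, and the `to_json_ld` helper are hypothetical, and which attributes any given reasoning engine actually reads is an assumption.

```python
import json

def to_json_ld(name, sku, price, currency, attributes):
    """Build a schema.org Product dict with explicit, typed attributes."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "sku": sku,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": "https://schema.org/InStock",
        },
        # Use-case attributes exposed as data instead of buried in prose.
        "additionalProperty": [
            {"@type": "PropertyValue", "name": key, "value": value}
            for key, value in attributes.items()
        ],
    }

# Hypothetical product and attribute values, for illustration only.
desk = to_json_ld(
    name="Compact Standing Desk",
    sku="DESK-001",
    price=329.0,
    currency="USD",
    attributes={
        "widthInches": 40,
        "cableRouting": "under-desk tray",
        "weightCapacityLbs": 150,
        "assemblyComplexity": "low",
    },
)

print(json.dumps(desk, indent=2))
```

The design point is that "fits a small apartment" and "good cable management" become answerable from fields such as width and cable routing, without the engine having to infer them from copy.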
The catalog work described in Part 2 serves
double duty here. Structured attributes that power
onsite intelligence also make your products legible
to external agents. Outcome-derived insights such as
“runs large, size down” or “doesn’t work with curved
screens” become the contextual signals that give
reasoning engines confidence to recommend you
over competitors with sparser data. The investment
compounds across both frontiers.
What This Looks Like by Category
Fashion – Share freely: sizes, materials,
care instructions, variant images.
Share selectively: “slim-fit styles from Brand A
run a full size small” (the insight your return
data revealed). Protect the return rates by
style and brand that taught you which fits
actually work.
Electronics – Share freely: specs,
compatibility lists, warranty terms.
Share selectively: “known connectivity issues
with Ring Gen 2 doorbells” (the insight your
support tickets revealed). Protect the ticket
volume and resolution patterns that surface
these issues before they’re public.
Traditional e-commerce economics reward
engagement: more page views, more time on site,
more ad impressions during browsing.
Revenue scales with attention captured.
Agent commerce inverts this. Shoppers converting
in three touches instead of twelve is success;
specialization resolved their uncertainty faster.
But the dashboard looks worse: page views down,
time on site down, display impressions down.
The inversion is already underway. Retailers who
adapt their economics and ad servers will capture it;
retailers who protect old dashboards will blame
the market.
Why the Math Still Works
The concern is obvious: fewer page views means
fewer ad impressions means less RMN revenue.
But this assumes the only monetizable moment is
passive browsing. Amazon’s Rufus offers early
evidence. Customers using the assistant are 60%
more likely to convert; the 250 million users who
engaged in 2025 generated an incremental $10 billion
in GMV. Fewer browsing sessions, more decisive
ones. And decisive sessions should command higher
prices: brands will pay more per outcome when the
intent is already there.
The advertiser used to pay for 10 clicks at $1 each
to get one purchase. Now they pay $10 for one
conversion with much higher efficiency.
Their economics stay the same; the path just
compresses. The opportunity is selling outcomes
against moments that didn’t exist before, provided
you can prove the outcomes are incremental.
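The arithmetic behind that comparison can be written out in a few lines. This is a worked example using only the figures in the paragraph above; the function name is hypothetical.

```python
def cost_per_purchase(clicks: int, cost_per_click: float, purchases: int) -> float:
    """Total click spend divided by the purchases it produced."""
    return clicks * cost_per_click / purchases

# Old path: 10 clicks at $1 each to get one purchase.
click_model_cost = cost_per_purchase(clicks=10, cost_per_click=1.00, purchases=1)

# New path: one outcome priced at $10.
outcome_model_cost = 10.00

print(f"click model:   ${click_model_cost:.2f} per purchase")
print(f"outcome model: ${outcome_model_cost:.2f} per purchase")
```

The advertiser's cost per purchase is identical in both cases; what changes is that the retailer earns it from one high-intent moment instead of ten low-intent ones.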
What This Requires
Measurement becomes non-negotiable.
Without holdouts, there’s no way to distinguish
efficiency gains from disengagement. If sessions-
to-convert drops from 12 to 3, is that because the
agent surface helped or because those shoppers
were going to buy anyway? Retailers who can answer
this question will justify continued investment;
retailers who can’t will cut the wrong things.
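A minimal sketch of the holdout comparison, assuming shoppers are randomly split into a group that sees the agent surface and a holdout that does not. The shopper and purchase counts are hypothetical, and a real analysis would also need a significance test, which is omitted here.

```python
from dataclasses import dataclass

@dataclass
class Arm:
    shoppers: int
    purchases: int

    @property
    def conversion_rate(self) -> float:
        return self.purchases / self.shoppers

def incremental_lift(treated: Arm, holdout: Arm) -> tuple[float, float]:
    """Return (absolute lift, relative lift) of treated over holdout."""
    absolute = treated.conversion_rate - holdout.conversion_rate
    relative = absolute / holdout.conversion_rate
    return absolute, relative

def incremental_purchases(treated: Arm, holdout: Arm) -> float:
    """Purchases beyond what the holdout rate predicts for the treated group."""
    return treated.purchases - holdout.conversion_rate * treated.shoppers

# Hypothetical counts, for illustration only.
treated = Arm(shoppers=10_000, purchases=450)   # saw the agent surface
holdout = Arm(shoppers=10_000, purchases=400)   # did not

absolute, relative = incremental_lift(treated, holdout)
extra = incremental_purchases(treated, holdout)
print(f"absolute lift: {absolute:.3%}, relative lift: {relative:.1%}, extra purchases: {extra:.0f}")
```

If the holdout converts at the same rate as the treated group, the drop in sessions-to-convert reflects shoppers who were going to buy anyway, not help from the agent surface.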
Pricing models follow. CPM assumes volume.
Agent surfaces have lower volume but higher intent.
CPA or hybrid models that reward conversion
become more attractive to brands who want
efficiency, and more defensible for retailers who can
prove incrementality.
Catalog teams own product data but don’t see
support tickets. UX teams build surfaces but don’t
see return rates. RMN teams optimize for impressions
while agent success requires fewer of them.
Data science builds models but waits months for
engineering to deploy them.
This is the real reason most retailers will fail at
agentic commerce: not missing technology but
missing coordination.
Learning velocity requires these teams working off
the same feedback loop with compatible incentives.
A support ticket about curved-monitor compatibility
should reach the catalog within days, not quarters.
A/B test results should change what surfaces in
weeks, not fiscal years. The retailers who fix this will
compound advantages while competitors debate
ownership. A partner like Moloco Commerce Media
already operating your retail media can serve as
connective tissue: the catalog normalization is done,
the experimentation primitives exist, the outcome
data is already flowing. Agent commerce becomes
an extension, not a second build.
The Learning Window
Agent-referred traffic is small today, which is
precisely why now is the time to start.
Learning in agent commerce can’t be compressed
with budget or headcount. Each cycle teaches you
something: which catalog attributes actually predict
conversion, which onsite interventions earn trust
versus annoy, which signals to share offsite and
which to protect. Target 4-6 experiments a quarter.
This knowledge accumulates through iteration.
The question that predicts success: how fast can
you close the loop? A shopper struggles with
monitor arm compatibility. That shows up in support
tickets Tuesday. How quickly does “doesn’t work with
curved screens over 32 inches” appear in the catalog
and agent surfaces? Days? Weeks? Quarters?
That cycle time, from signal to encoded
insight to live intervention, is measurable.
Retailers running 30 cycles per year will
learn things retailers running 3 cycles
never discover.
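Cycle time can be tracked with very little machinery. The sketch below assumes a simple log of when an insight first appeared as a signal and when it went live; the dates and the log structure are hypothetical, and the cycles-per-year figure assumes cycles run back to back.

```python
from datetime import date
from statistics import median

# Hypothetical log: (insight, first seen in support tickets, live in catalog).
loops = [
    ("doesn't work with curved screens over 32 inches", date(2025, 3, 4), date(2025, 3, 13)),
    ("runs large, size down", date(2025, 3, 10), date(2025, 4, 21)),
    ("wall anchors needed for this shelf", date(2025, 4, 1), date(2025, 4, 8)),
]

def cycle_days(seen: date, live: date) -> int:
    """Days from first signal to live intervention."""
    return (live - seen).days

durations = [cycle_days(seen, live) for _, seen, live in loops]
median_cycle = median(durations)

# Rough ceiling on sequential cycles per year at the median pace.
cycles_per_year = 365 / median_cycle

print(f"cycle times (days): {durations}")
print(f"median: {median_cycle} days, about {cycles_per_year:.0f} cycles per year")
```

Tracking the median rather than the mean keeps one slow loop from hiding an otherwise fast process, though the slow loops are usually the ones worth investigating.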
The Margin Trap to Avoid
Barnes & Noble had the distribution and data
to dominate online books but protected
brick-and-mortar retail instead because the margins
looked so much better than having to figure out
shipping logistics.
Agent commerce makes shopping more efficient.
Some retailers will see declining engagement
metrics and conclude it’s destructive.
They’ll throttle investment to protect page views and
ad impressions while competitors who adapt
measurement grow margin per transaction,
capturing shoppers who simply want buying
to be easier.
Protecting old metrics is how you guarantee the
disruption happens to you instead of through you.
01 Explore (vague intent,
needs structure):
A shopper arrives with “running shoes for trails”
but needs help narrowing. The choice is between
surfacing meaningful starting points based on what
actually differentiates options or dumping 200 results
sorted by popularity. The intervention here is a
guided entry: “Rocky terrain or groomed paths?
Under or over 5 miles?” Two questions that cut the
consideration set by 80%.
02 Evaluate (clear options,
needs comparison clarity):
They’ve identified 3-4 candidates. The choice is
between making tradeoffs explicit or forcing them to
open four tabs and decode spec sheets.
The intervention is comparison synthesis:
“Option A has better grip but runs warm.
Option B drains faster but fits narrow. Option C is the
budget pick with no major tradeoffs.” This is where
outcome data shines. The retailer knows Option A
triggers “runs hot” returns; a frontier agent doesn’t.
04 Complete (in cart,
checking for gaps):
Uncertain about completeness. Did they forget
something? Will they regret not adding the
accessory? The intervention is domain-informed
completion: “Most customers add the wall anchors
for this shelf” beats “frequently bought together”
because it’s based on what actually gets returned
with “fell off wall” in the notes.
The Goal Is Habit Formation
Efficiency is measurable in months. Habit formation takes longer but matters more.
This is the bet: when a site resolves uncertainty faster than competitors, shoppers will notice.
Not consciously at first, but after three or four sessions where questions were anticipated,
the behavior shifts: they will start coming directly. Direct session rates climb, branded search
referrals grow, and agent-referred traffic becomes a smaller share of conversions. They skip the
Google search, skip the ChatGPT query, trust that they’ll find what they need.
This is the real prize: not capturing the session but earning the default. Retailers who build
this don’t need to worry as much about offsite agents because their shoppers still see a reason
to come direct.
PART 2: THE ONSITE ADVANTAGE
01 Discovery:
The shopper is building a consideration set.
An agent surfaces your product as a recommendation.
That’s an outcome opportunity, a conversion you can
drive and attribute, that didn’t exist when discovery
meant banner ads against browse traffic.
02 Comparison:
The shopper is qualifying options. An agent
synthesizes tradeoffs and highlights a winner.
That’s an outcome opportunity, influencing the
decision at the moment it’s made, that didn’t exist
when comparison meant hoping they clicked your
sponsored listing.
03 Decision:
The shopper is ready to buy but hesitating.
An agent resolves uncertainty or completes the
basket. That’s an outcome opportunity, closing the
sale at the point of friction, that didn’t exist when
decision-stage meant retargeting ads over days.
Two windows are closing simultaneously.
Onsite, shoppers are forming habits about what
intelligent shopping feels like; a site that feels static
while a competitor’s anticipates needs is training
shoppers to start somewhere else next time.
Offsite, agents are building defaults about which
retailers to trust, which catalogs to rely on,
which signals to weight. By the time agent traffic
reaches 10-15% of sessions, those defaults will be
expensive to change.
The Cold-Start Problem
Behavioral learning requires behavior.
New categories, new SKUs, and long-tail products
don’t have outcome data yet. This is where catalog
structure earns its keep. Rich attributes, explicit
compatibility relationships, and structured tradeoffs
let agent surfaces perform reasonably before
behavioral signals accumulate. The catalog does the
work until the data catches up.
Retailers who treat catalog structure as
“checkbox” and behavioral learning as “the real
intelligence” will have dead zones across half their
assortment. The two systems bootstrap each other.
That proof requires your ad server and agent surface
sharing the same infrastructure. Separate systems
mean high-intent moments you can’t price and
conversions you can’t attribute.
Agent commerce creates new outcome
opportunities across the shopping journey:
Grocery – Share freely: ingredients,
allergens, nutrition, pricing. Share selectively:
“store-brand pasta is an accepted substitute;
store-brand peanut butter is not” (the insight
your fulfillment data revealed). Protect the
behavioral data showing which substitutions
get accepted, rejected, or trigger churn.
Furniture – Share freely: dimensions,
materials, weight capacity. Share selectively:
“assembly takes 45 minutes for most
customers” (the insight your reviews and
returns revealed). Protect the analytical
method that identifies that assembly time
matters more than weight capacity to
your shoppers.
Near Add to Cart with unresolved questions.
Will this work with their existing setup? Is this the
right size? The intervention is uncertainty resolution:
“Fits curved monitors up to 32 inches” or “Runs
large, most customers size down.” They either
convert or leave to validate elsewhere.
But proving incrementality requires
connected infrastructure. If your agent
surfaces and ad server operate as separate
systems, you create high-intent moments
you can’t price and conversions you can’t
attribute. The economic model only works
when agent intelligence and monetization
share the same foundation.

Part 3:
Offsite Visibility Without
Being Vulnerable

GPT-6 will be smarter than GPT-5. It still won’t
know that your customers return the slim-fit oxfords
from Brand A at three times the rate of the classic-fit.
Why Retailers Should Win Onsite
Frontier models won’t have all of a retailer’s
outcome data. They won’t know which of 47 mattress
SKUs actually works for side sleepers with back pain
based on return rates. They can’t tell shoppers that
customers who buy the mid-tier air purifier for rooms
over 400 square feet return it at twice the rate of the
premium, because that pattern lives in your returns
data, not in any spec sheet. They won’t discover that
furniture shoppers care about assembly time while
ignoring weight capacity because they’ve never seen
how those customers actually behave.
Better chat interfaces won’t save anyone;
faster responses won’t either. Public reviews and
ratings give frontier models a partial picture, but the
fuller one lives in data only retailers hold:
return reasons by SKU attribute, support ticket
patterns that surface defects before they’re public,
behavioral signals showing which product
comparisons actually lead to kept purchases. That
accumulated knowledge of what matters in specific
categories is what no outside model can synthesize.
Onsite agent interactions add a new layer of signal:
when shoppers share budget, timeline, and use-case
context directly, that’s intent data, richer than any
clickstream, and it’s now yours.
Agent platforms will eventually know more about
individual shoppers across their full purchase history;
retailers will know more about what actually works in
their categories. Depth beats breadth when the
question is specific. And as protocols mature, the
onsite agent that resolves a shopper’s uncertainty
and the offsite agent that sent them will share
context, making the outcome data you’re capturing
now more valuable, not less.
Ambient Intelligence, Not Just Chatbots
Most retailers default to chat widgets because
that’s what “AI assistance” looks like. That is the
wrong model.
Shoppers don’t abandon twenty years of
search-and-browse muscle memory just because a
chat bubble appeared in a corner. When assistance requires the shopper to seek it out, most won’t, and
teams might conclude the demand isn’t there. This
is the wrong lesson from the right observation.
The better model is ambient intelligence:
assistance that surfaces at moments of friction and
disappears when not needed. The interface is still
conversational; what changes is when it appears.
Silent by default, helpful when behavior
signals uncertainty.
Google figured this out with AI Overviews.
They embedded intelligence directly into search
results. Users didn’t have to change behavior; the
experience just got smarter. That’s ambient. It’s
already generating billions of monthly chats.
A good store clerk doesn’t wait for you to ask.
They see you standing in the aisle, holding two
boxes, eyes flicking between spec sheets, and they
walk over with exactly the context you need.
Ambient intelligence works the same way.
A shopper lands on a product detail page for a
monitor arm, scrolls down, scrolls back up, toggles
between variants, pauses on the specs. The system
recognizes the pattern and surfaces a small decision
window: “This arm works with desks 0.5 to 2 inches
thick. Your monitor up to 27 inches is supported.
Most customers install in 10 minutes.” They can go
deeper if they want, or add to cart and move on.
The trigger signals are readable: dwell time on
specs, toggling between variants, pogo-sticking
between product pages, scroll patterns that
suggest confusion. These are signals only the
retailer sees; no offsite agent has access to how a
shopper is behaving on your site. The triggers differ
by category, but the principle is the same: surface
help that matches how they’re already thinking,
not help that interrupts it. Albertsons found that
when assistance is contextually right, even in
grocery where shoppers resist anything that slows
a routine trip, their Ask AI capability delivers a 10%
increase in basket size. The triggering logic is
something you test and learn over time.
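As a starting point for that testing, the trigger signals listed above can be expressed as a simple rule-based score. Everything in this sketch is an assumption made for illustration: the field names, thresholds, weights, and trigger score are hypothetical and would be tuned per category through experimentation.

```python
from dataclasses import dataclass

@dataclass
class SessionSignals:
    spec_dwell_seconds: float   # time spent on the specs section
    variant_toggles: int        # switches between product variants
    pdp_bounces: int            # pogo-sticking between product pages
    scroll_reversals: int       # scroll down, then back up

# (field, threshold, weight): hypothetical values, not recommendations.
RULES = [
    ("spec_dwell_seconds", 20, 1.0),
    ("variant_toggles", 3, 1.0),
    ("pdp_bounces", 2, 1.5),
    ("scroll_reversals", 2, 0.5),
]
TRIGGER_SCORE = 2.0

def uncertainty_score(signals: SessionSignals) -> float:
    """Sum the weights of every rule whose threshold is met."""
    return sum(
        weight
        for field, threshold, weight in RULES
        if getattr(signals, field) >= threshold
    )

def should_surface_help(signals: SessionSignals) -> bool:
    """Stay silent by default; surface help only when signals stack up."""
    return uncertainty_score(signals) >= TRIGGER_SCORE

hesitant = SessionSignals(spec_dwell_seconds=35, variant_toggles=4, pdp_bounces=0, scroll_reversals=3)
decisive = SessionSignals(spec_dwell_seconds=5, variant_toggles=1, pdp_bounces=0, scroll_reversals=0)

print(should_surface_help(hesitant), should_surface_help(decisive))
```

Requiring more than one signal before surfacing help is one way to keep the experience silent by default; a learned model could replace these rules once enough outcome data exists.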
But the harder problem isn’t when to surface; it’s
what to say when you do. Outcome data tells you
“runs large, size down.” User context tells you “you’ve
bought medium in similar brands; this one runs
small.” Both resolve uncertainty a sizing chart can’t.
Four Moments Where Intelligence Creates Value
Not every moment in a shopping journey benefits from intervention.
Knowing where to surface help and where to stay silent is itself a form of specialization.
03 Commit (ready to buy,
last hesitations):