Amazon Rufus and COSMO: The Complete Seller Guide to Amazon's AI Discovery Stack (2026)

Daisy Director of Content & Digital Strategy
Apr 24, 2026

What Is Amazon Rufus?

Amazon Rufus is Amazon's generative AI shopping assistant. It sits inside the Amazon Shopping app and on Amazon.com, and it answers product questions, compares options, and helps shoppers make decisions using natural conversational language instead of typed keywords.

Rufus is powered by a custom large language model that Amazon built specifically for shopping - not a general-purpose LLM adapted for e-commerce. Per Amazon's own engineering team, the model was trained from the start on the Amazon catalog, customer reviews, and community Q&A content, and it uses retrieval-augmented generation (RAG) to pull in fresh, reliable information at query time.

Where Rufus Shows Up

Rufus is no longer a single chat interface tucked in the corner of the app. Amazon has been rolling it out as a distributed layer across the shopping journey:

  • Rufus chat button in the Amazon app and desktop navigation

  • Search bar prompts that appear before the shopper even hits the SERP

  • "Researched by AI" - a SERP module that functions like an AI Overview, condensing the results page into a generative decision guide

  • "Customers Ask" - a SERP module that surfaces frequently-clicked Rufus prompts next to a matching product carousel

  • Product detail page questions that shoppers can ask about a specific ASIN

The practical implication: Rufus is not a feature shoppers have to seek out. Amazon is placing it directly in the path of discovery.

Why It Matters for Sellers

Rufus-attributed sessions behave differently from traditional search sessions. Agency data from early 2026 suggests Rufus now mediates a meaningful share of mobile queries: roughly 15-20%, according to one Q1 2026 audit across dozens of brands. Shoppers who engage with Rufus also reportedly convert at materially higher rates than shoppers who don't, because Rufus pre-qualifies intent before they ever land on a product detail page (PDP).

The downside: if your listing doesn't give Rufus the structured information it needs to recommend you with confidence, you don't make the shortlist. You are not "ranked low" - you are invisible.


What Is Amazon COSMO?

Amazon COSMO - short for Common Sense Knowledge Generation and Serving System - is the knowledge graph that sits underneath Amazon's search and recommendation infrastructure. It is a separate system from Rufus, though they solve related problems.

COSMO was described in a peer-reviewed paper in the Companion proceedings of SIGMOD 2024 (the ACM International Conference on Management of Data), authored by Amazon's applied science team. The paper describes an industry-scale knowledge graph that mines customer behavior - query-purchase pairs and co-purchase pairs - to extract commonsense relationships between products and the intentions behind buying them.

What "Commonsense Knowledge" Means in Practice

COSMO's knowledge graph is built from relationship triples that look like this:

  • <slip-resistant shoes, used_for_audience, pregnant women>
  • <camera case + screen protector, capable_of, protecting camera>
  • <furniture for small apartments, implies_need, multi-functional design>

These are the inferences a knowledgeable store associate would make automatically - that someone asking for "furniture for small apartments" probably wants a sofa bed or storage ottoman, even if they never said the word "sofa." COSMO's job is to make that inference at scale, across Amazon's catalog.
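To make the triple format concrete, here is a minimal Python sketch that stores the three example triples and indexes them by their first element. The `Triple` class and `build_index` helper are names invented for this illustration; the COSMO paper does not publish a schema in this form, so read it as a mental model rather than Amazon's data structure.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    head: str      # product or query concept
    relation: str  # relation label, e.g. "used_for_audience"
    tail: str      # audience, capability, or implied need


# The three example triples quoted above, stored verbatim.
TRIPLES = [
    Triple("slip-resistant shoes", "used_for_audience", "pregnant women"),
    Triple("camera case + screen protector", "capable_of", "protecting camera"),
    Triple("furniture for small apartments", "implies_need", "multi-functional design"),
]


def build_index(triples):
    """Index triples by head so a concept's relationships can be looked up."""
    index = defaultdict(list)
    for triple in triples:
        index[triple.head].append((triple.relation, triple.tail))
    return index


index = build_index(TRIPLES)
print(index["furniture for small apartments"])
```

Looking up "furniture for small apartments" returns the implied need rather than a product name, which is the point: the graph connects a query to the intent behind it, and the catalog match happens downstream.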

The Numbers Amazon Has Published

From the SIGMOD 2024 paper and Amazon Science blog:

  • 6.3 million nodes and 29 million knowledge edges across the graph

  • 18 major product categories covered

  • ~60% macro F1 improvement in search relevance (frozen encoders, ESCI dataset)

  • ~28% macro F1 improvement even when baselines are fine-tuned

  • 0.7% relative sales lift in live online A/B tests on ~10% of U.S. traffic

  • 8% increase in navigation engagement rate in the same traffic segment

Amazon framed the 0.7% figure as "hundreds of millions of dollars in annual revenue" in the paper itself.


Rufus vs. COSMO: Are They the Same Thing?

No. This is one of the most common misconceptions in the Amazon seller world right now, and it matters because the two systems have different inputs, different surfaces, and different optimization levers.

 

|  | Rufus | COSMO |
| --- | --- | --- |
| What it is | Generative AI shopping assistant (front-end) | Commonsense knowledge graph (back-end) |
| Where it surfaces | Chat UI, search bar, "Researched by AI," PDP Q&A | Search relevance, session recommendations, search navigation |
| Primary input | Shopper's natural-language question | Customer behavior signals (query→purchase, co-purchase) |
| Published at | Amazon Science blog, IEEE Spectrum | SIGMOD 2024 (peer-reviewed) |
| Launched broadly | 2024 (U.S. mobile), expanded 2025-2026 | Deployed in search navigation; ~10% of U.S. traffic per the paper |
| Replaces A9/A10? | No - it sits alongside it | No - it augments it with a semantic layer |

The simplest way to think about it: COSMO is the knowledge. Rufus is the conversation. Rufus draws on Amazon's broader shopping data - which includes COSMO-derived relationships - to answer questions. But Rufus is not "a wrapper around COSMO," and COSMO is not "the Rufus algorithm." They are complementary systems within the same AI discovery stack.

When you optimize for one, you're largely optimizing for the other - because both reward the same underlying thing: listings that clearly communicate who the product is for, what problem it solves, and in what context.


How Rufus Actually Recommends Products

Under the hood, Rufus follows a workflow that sellers should internalize, because each step is an optimization opportunity.

Step 1: Intent Parsing

Rufus interprets the shopper's question, extracts the intent, and identifies the constraints. "What's a good running shoe for flat feet under $100?" is parsed into product category + use case + physical constraint + budget.
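As a rough picture of what "parsed into product category + use case + physical constraint + budget" means, the sketch below holds those four slots in a small Python structure. Rufus does this with a language model, not a regular expression; `ParsedIntent` and `parse_budget` are hypothetical names, and only the budget is extracted in code while the other three fields are filled in by hand.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedIntent:
    category: str
    use_case: Optional[str]
    constraint: Optional[str]
    max_price: Optional[float]


def parse_budget(question: str) -> Optional[float]:
    """Pull a price ceiling such as 'under $100' out of a question."""
    match = re.search(r"under \$(\d+(?:\.\d+)?)", question, flags=re.IGNORECASE)
    return float(match.group(1)) if match else None


question = "What's a good running shoe for flat feet under $100?"
intent = ParsedIntent(
    category="shoe",          # filled in by hand for illustration
    use_case="running",       # filled in by hand for illustration
    constraint="flat feet",   # filled in by hand for illustration
    max_price=parse_budget(question),
)
print(intent)
```

The useful exercise for a seller is to run this decomposition mentally on real shopper questions and check that the listing addresses every slot, not just the category.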

Step 2: Retrieval

Rufus pulls from sources Amazon has classified as reliable: the product catalog, customer reviews, community Q&A, and relevant Stores APIs. This is the RAG layer.

Step 3: Buying Criteria Extraction

Rufus identifies the decision criteria that matter for this specific query - arch support, cushioning, weight, price ceiling.

Step 4: Contextual Decision Guide

Rufus condenses the SERP into a generative summary that compares products against those criteria, not against the raw keyword match.

Step 5: Reinforcement Learning Feedback Loop

Amazon has publicly stated that Rufus improves through customer feedback (thumbs up / thumbs down) and reinforcement learning. Over time, the responses shoppers find useful become more likely, and the responses shoppers reject become less likely.

The critical insight: Rufus is not searching your listing for keywords. It is searching your listing (and your reviews, and your Q&A) for answers to specific constraint-based questions. If your listing cannot answer "will this fit in a small car trunk?" or "is this safe for sensitive skin?" - you don't get recommended, regardless of how well you rank on the keyword version of that query.


How COSMO Changes Amazon Search

A9 asks: does this listing contain the words the shopper typed? COSMO asks: does this product solve the problem the shopper described?

That shift - from lexical matching to intent matching - is the single biggest change in Amazon organic discovery since A9 was introduced.

The COSMO Pipeline (Simplified)

  1. Behavior mining. COSMO starts with two data types: query-purchase pairs (what did the shopper buy after this search?) and co-purchase pairs (what else was bought in the same session?).

  2. LLM hypothesis generation. A large language model generates candidate explanations for why those pairings exist. Most candidates are junk - the model sometimes generates empty rationales like "customers bought them together because they like them." Those get filtered.

  3. Human-in-the-loop annotation. Surviving candidates are reviewed by human annotators against two criteria: plausibility (is the relationship reasonable?) and typicality (is this a common association?).

  4. COSMO-LM instruction tuning. The filtered, annotated knowledge is used to fine-tune an in-house language model (COSMO-LM) that can then expand the knowledge graph across all 18 categories.

The result is a structured, trustworthy graph of intent relationships that feeds into Amazon's search relevance, session-based recommendations, and navigation systems.
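The first three pipeline steps can be sketched as a toy Python filter. Everything here is a stand-in: the rationales are hard-coded instead of LLM-generated, the human labels are invented, `GENERIC_PHRASES` is a hypothetical filter list, and keeping only candidates judged both plausible and typical is an assumption about how the annotations are used. Step 4 (instruction tuning COSMO-LM) is not modeled.

```python
from dataclasses import dataclass

# Hypothetical phrases that mark an empty rationale.
GENERIC_PHRASES = ("because they like them", "because they are good")


@dataclass
class Candidate:
    pair: tuple          # behavior pair from step 1 (query or product, product)
    rationale: str       # stand-in for an LLM-generated explanation
    plausible: bool = False
    typical: bool = False


def is_generic(candidate: Candidate) -> bool:
    """Step 2 filter: drop rationales that explain nothing."""
    text = candidate.rationale.lower()
    return any(phrase in text for phrase in GENERIC_PHRASES)


def annotate(candidate: Candidate, labels: dict) -> Candidate:
    """Step 3: attach human plausibility and typicality judgments."""
    plausible, typical = labels.get(candidate.rationale, (False, False))
    candidate.plausible = plausible
    candidate.typical = typical
    return candidate


# Step 1: behavior pairs with invented rationales.
candidates = [
    Candidate(("camera case", "screen protector"),
              "both are capable of protecting a camera"),
    Candidate(("camera case", "screen protector"),
              "customers bought them together because they like them"),
    Candidate(("winter coat", "dog jacket"),
              "the dog jacket is used for walking a dog in winter"),
]

# Stand-in for annotator output: rationale -> (plausible, typical).
human_labels = {
    "both are capable of protecting a camera": (True, True),
    "the dog jacket is used for walking a dog in winter": (True, False),
}

filtered = [c for c in candidates if not is_generic(c)]
annotated = [annotate(c, human_labels) for c in filtered]
training_set = [c for c in annotated if c.plausible and c.typical]

print(len(candidates), len(filtered), len(training_set))
```

Each stage shrinks the candidate pool, which mirrors the emphasis in the description above: most generated explanations are discarded, and what survives is the small, trustworthy subset used to train the model that scales the graph.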

Where COSMO Has the Most Impact

Per the paper and Amazon's own commentary, COSMO's semantic inference adds the most value when queries are broad, ambiguous, or intent-driven - categories like Home & Kitchen, Clothing, Sports & Outdoors, Baby Products, and Toys & Games. Categories dominated by brand- or model-specific searches (Electronics, Video Games) still benefit, but the lift is smaller except for long-tail and gifting queries.


What This Means for Your Listings

If your listings were written for the keyword era - bullet-stuffed with synonyms, optimized for search term volume, structured around what A9 could index - they are increasingly being evaluated by a system that asks a fundamentally different question.

Here is what is actually changing in seller behavior in 2026.

1. Titles Are Becoming Descriptive, Not Keyword-Dense

Titles still matter for A9 matching, but they now have a second job: signaling to Rufus and COSMO who the product is for and what it does. The keyword-stuffed title is losing ground to titles that lead with the most important buying criterion.

2. Bullets Are Shifting From Features to Jobs-to-Be-Done

Rufus rewards bullets that answer "can I…?" and "will this work if…?" questions in simple, declarative sentences. Feature-only bullets ("premium stainless steel construction") communicate almost nothing about fit or use case.

3. Reviews Are Now a Discovery Signal, Not Just a Conversion Signal

This is the one most sellers aren't ready for. Rufus reads reviews to answer intent-specific questions. If a shopper asks "is this good for cold weather?" and four of your top reviews discuss cold-weather performance, Rufus surfaces you. If none of them do, you don't exist for that query - even if your product is objectively excellent for cold weather.

4. Images Matter More Than Sellers Realize

COSMO is multimodal. Both COSMO and Rufus can parse structured visual information (zone-aware gallery layouts, A+ Content, From the Brand carousels) for use-case and audience signals. Pretty-but-empty hero images are a missed opportunity.

5. A+ Content Is a Rufus RAG Source

A+ Content is indexed and retrievable. The text you put in A+ modules is eligible to surface in Rufus answers. If you treat A+ as pure brand storytelling instead of structured product knowledge, you're leaving Rufus visibility on the table.


The Rufus + COSMO Optimization Framework

This is a practical framework that works for both systems simultaneously, because both reward the same underlying content quality. Use it as a checklist.

Layer 1: Audience Signals

Answer who this product is for in the listing itself.

  • Demographic fit (age range, skill level, body type, skin type)

  • Use-case audience (runners with flat feet, pregnant women, dog owners with small breeds)

  • Experience level (beginner, intermediate, professional)

Layer 2: Use-Case and Context Signals

Answer when and where this product is used.

  • Environments (indoor, outdoor, cold weather, travel)

  • Occasions (daily use, weekly, special events)

  • Adjacent activities (running, cooking, studying)

Layer 3: Constraint Answers

Answer the "can I…?" and "will this…?" questions explicitly.

  • Compatibility (fits X, works with Y)

  • Safety and suitability (safe for Z, not recommended for W)

  • Physical constraints (size, weight, dimensions in real-world terms)

Layer 4: Proof and Specificity

Replace vague claims with specific, quotable facts.

  • "Lasts all day" → "12-hour battery life in standard use"

  • "Premium quality" → specific material grade, origin, certifications

  • "Easy to clean" → "dishwasher-safe on top rack"
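One way to apply this layer across a catalog is a small script that flags vague phrases in bullets. The sketch below is illustrative only: `VAGUE_PHRASES` is a hypothetical watchlist seeded with the three phrases above, and a real list would be built per category.

```python
# Hypothetical watchlist, seeded with the three vague phrases above.
VAGUE_PHRASES = {
    "lasts all day": "state battery life in hours under named conditions",
    "premium quality": "name the material grade, origin, or certification",
    "easy to clean": "state the cleaning method, e.g. dishwasher-safe on top rack",
}


def flag_vague_claims(bullets):
    """Return (bullet, phrase, suggestion) for each vague phrase found."""
    findings = []
    for bullet in bullets:
        lowered = bullet.lower()
        for phrase, suggestion in VAGUE_PHRASES.items():
            if phrase in lowered:
                findings.append((bullet, phrase, suggestion))
    return findings


# Invented sample bullets for illustration.
bullets = [
    "Premium quality construction that lasts all day",
    "12-hour battery life in standard use",
]
for bullet, phrase, suggestion in flag_vague_claims(bullets):
    print(f"{phrase!r} in {bullet!r}: {suggestion}")
```

The first bullet is flagged twice and the second passes untouched. A substring match like this only finds the phrases you already know to look for, so treat it as a first pass before a manual read.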

Layer 5: Review Gardening

Actively shape the review corpus by inviting specific use-case feedback.

  • In post-purchase follow-ups, ask questions tied to the use cases you want to rank for.

  • Don't manipulate star ratings. Do shape what reviewers talk about.

Layer 6: FAQ and Q&A Density

Populate the community Q&A and any FAQ modules with the actual questions shoppers are asking Rufus about your category.

  • Use simple, declarative answers Rufus can lift directly.


What Stops Working in 2026

A few tactics that were effective in the A9 era are actively counterproductive now.

  • Keyword stuffing in titles and bullets. Still parsed by A9, increasingly ignored (or penalized via semantic dilution) by COSMO.

  • Synonym-spray bullets. Listing every variant of a keyword instead of answering a use-case question.

  • Generic AI-written listings. ChatGPT-style listings that sound polished but communicate no structured knowledge are the new keyword stuffing - different failure mode, same outcome.

  • Ignoring review content. Treating reviews as a star-rating lever instead of a discovery corpus.

  • Over-relying on exact-match Sponsored Products. Industry observers have reported that exact-match behavior has already shifted under semantic matching, with negative keywords no longer available at the product-targeting level in some campaigns - a signal that Amazon's ad system is moving in the same direction as its organic system.


Your 30-Day Action Plan

If you're reading this and feeling the pressure to catch up, start here. This is deliberately small - pick one product, not your whole catalog.

Week 1: Diagnose

  1. Ask Rufus 10 questions a real shopper might ask about your best-selling ASIN.

  2. Write down every wrong, missing, or "I don't know" answer. That is your optimization backlog.

  3. Search your top 3 non-branded keywords and check whether you appear in the "Researched by AI" or "Customers Ask" modules.

Week 2: Rewrite

  1. Rewrite your title to lead with the most important buying criterion, not the most-searched keyword.

  2. Rewrite 3 bullets from feature statements into constraint answers.

  3. Add 5 FAQs to your A+ Content or description, written as direct answers in simple declarative sentences.

Week 3: Review Gardening

  1. Review your last 50 reviews. Tag the use cases they mention.

  2. Identify the 2-3 use cases you want to rank for that are underrepresented in reviews.

  3. Update your post-purchase follow-up to invite feedback on those specific use cases.
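The tagging step in Week 3 is easy to do in a spreadsheet, but a short script makes it repeatable. The sketch below assumes you have exported review text yourself; `USE_CASE_TERMS` is a hypothetical mapping you would tailor per product, and the sample reviews are invented.

```python
from collections import Counter

# Hypothetical use cases and the terms that signal them; adjust per product.
USE_CASE_TERMS = {
    "cold weather": ("cold", "winter", "snow"),
    "travel": ("travel", "carry-on", "airport"),
    "daily use": ("every day", "daily"),
}


def tag_review(text):
    """Return the set of use cases a single review mentions."""
    lowered = text.lower()
    return {
        use_case
        for use_case, terms in USE_CASE_TERMS.items()
        if any(term in lowered for term in terms)
    }


def coverage(reviews):
    """Count how many reviews mention each use case (zero counts included)."""
    counts = Counter({use_case: 0 for use_case in USE_CASE_TERMS})
    for review in reviews:
        counts.update(tag_review(review))
    return counts


# Invented sample reviews for illustration.
reviews = [
    "Kept me warm on a snowy winter hike.",
    "I wear it daily and it still looks new.",
    "Great in the cold, a bit bulky otherwise.",
]
counts = coverage(reviews)
underrepresented = [use_case for use_case, n in counts.items() if n == 0]
print(counts)
print(underrepresented)
```

Use cases with a zero count are the candidates for the post-purchase follow-up in step 3. Term matching is crude and will miss paraphrases, so spot-check the tags by hand before acting on them.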

Week 4: Measure and Iterate

  1. Re-ask Rufus the same 10 questions from Week 1. Compare.

  2. Check whether the "I don't know" answers have improved.

  3. Set a 30-day calendar reminder to do it again.

Then, and only then, scale to the rest of the catalog.
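To keep the Week 1 versus Week 4 comparison honest, it helps to log each Rufus answer with a hand-assigned status and diff the two runs. This is a minimal sketch with invented audit data; the `Answer` class, the status labels, and the `compare` helper are assumptions made for this example, since Amazon does not expose Rufus answers through a seller API that this relies on.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    question: str
    status: str  # "correct", "wrong", or "unknown" (hand-labeled)


def backlog(answers):
    """Questions whose answer was wrong or missing."""
    return {a.question for a in answers if a.status != "correct"}


def compare(week1, week4):
    """Split the Week 1 backlog into fixed and still-open questions."""
    before, after = backlog(week1), backlog(week4)
    return {
        "fixed": sorted(before - after),
        "still_open": sorted(before & after),
        "regressed": sorted(after - before),
    }


# Invented audit data for a single ASIN.
week1 = [
    Answer("Will this fit in a small car trunk?", "unknown"),
    Answer("Is this safe for sensitive skin?", "wrong"),
    Answer("Is it dishwasher-safe?", "correct"),
]
week4 = [
    Answer("Will this fit in a small car trunk?", "correct"),
    Answer("Is this safe for sensitive skin?", "wrong"),
    Answer("Is it dishwasher-safe?", "correct"),
]
report = compare(week1, week4)
print(report)
```

The "still_open" list is the backlog for the next 30-day cycle, and a non-empty "regressed" list is a prompt to check whether a listing change removed information Rufus had been using.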


Final Word

The Amazon discovery stack now has two AI layers on top of A9-era keyword search: COSMO underneath, Rufus on top. Both reward the same thing - listings that clearly communicate who the product is for, what problem it solves, and in what context.

That is not a keyword problem. It is a customer-understanding problem. The sellers who treat it that way will win the next 12-18 months. The ones who keep trying to stuff their way to visibility will watch their share of Rufus-mediated sessions go to competitors they've never heard of.

Start with one product. Ask Rufus 10 questions. Fix what it gets wrong.

Then do it again in 30 days.

Frequently Asked Questions

Are Rufus and COSMO the same thing?

No. Rufus is Amazon's generative AI shopping assistant (front-end). COSMO is a commonsense knowledge graph (back-end) published at SIGMOD 2024. They are complementary systems within Amazon's AI discovery stack, but they have different inputs, different deployment surfaces, and different documentation. Optimizing for one tends to help the other, because both reward clear audience, use-case, and constraint signals in your listing.

Does COSMO replace A9?

No. COSMO is an additional semantic intelligence layer that augments Amazon's existing search infrastructure. Keyword matching, conversion-based ranking, and sales velocity signals remain active. COSMO is particularly influential for broad, ambiguous, and intent-driven queries where pure keyword matching falls short.

What share of shopper queries does Rufus mediate?

Amazon has not published an official figure. One agency audit from Q1 2026 across multiple brands estimated 15-20% of shopper queries on mobile were Rufus-mediated. That number should be treated as directional, not canonical, and it is likely still climbing.

Are keywords dead?

No, but their job has changed. Keywords still matter for A9 matching, Sponsored Products targeting, and indexing. What has changed is that keyword presence alone is no longer sufficient for discovery in AI-mediated sessions. Listings need keywords plus structured answers to the questions shoppers are actually asking.

Which categories see the largest COSMO effect?

Per the SIGMOD 2024 paper and subsequent Amazon commentary, categories with broad or intent-driven queries - Home & Kitchen, Clothing, Sports & Outdoors, Baby Products, Patio & Garden, Toys & Games - see the largest COSMO effect. Categories dominated by brand- and model-specific searches (Electronics, Video Games) see a smaller effect, with the exception of long-tail and gifting queries.

Does Rufus read A+ Content?

Yes. A+ Content is a retrievable source for Rufus. The structured text you place in A+ modules is eligible to surface in Rufus responses. This is why treating A+ as structured product knowledge (not pure brand storytelling) has become an optimization priority.

How can I measure my Rufus visibility?

Amazon does not yet offer a clean, first-party Rufus visibility metric inside Seller Central. Third-party platforms are beginning to offer AI-citation tracking across assistants, and Amazon's Sponsored Prompts reporting gives a partial view. For most sellers, the pragmatic test is still: ask Rufus questions a real shopper would ask, and see whether you show up.

I have a large catalog. Where should I start?

Don't rewrite everything at once. Pick your 10 highest-revenue ASINs and run the 30-day plan on each. High-revenue ASINs that are already getting Rufus impressions usually need only light semantic tuning. ASINs that are missing from Rufus answers entirely usually need deeper work - clearer audience, use case, and constraint answers, plus review gardening.

Is it too early to optimize for Rufus and COSMO?

No. Both systems are already live in production, already shaping discovery, and already measurable in sales data for categories where they've been deployed. The sellers who optimize first are locking in a 12-18 month share advantage before the rest of the market catches up.

Where can I read the primary sources?

(1) COSMO paper (SIGMOD 2024): Available on Amazon Science under the title "COSMO: A Large-Scale E-commerce Common Sense Knowledge Generation and Serving System at Amazon." (2) Rufus engineering explainer: Amazon Science blog and IEEE Spectrum ("Amazon Rufus: How We Built an AI-Powered Shopping Assistant," Trishul Chilimbi). (3) Rufus customer-facing help: Amazon's "About Rufus" help page.

About the author

Daisy

Director of Content & Digital Strategy

Digital marketing leader with a strong background in content strategy, SEO, and brand storytelling for e-commerce. Specializes in translating complex marketplace data into actionable insights that drive organic growth and customer engagement.
