4.3 Secondary Market Research

At a Glance
Other names Desk Research · Market Sizing Research · TAM Research · Macro Trend Research
In Brief
Secondary market research is the desk-based work that produces a defensible market-size estimate and a macro-trend map of the regulatory, demographic, technology, and channel shifts that materially change the sizing picture in the next 12–24 months. The output is a sizing-and-trend brief that tests your market hypothesis — its size, segment, competition, and direction — against published evidence, so you can make a go/no-go call before spending time and money on primary research.
Common Use Case
You have a market hypothesis but no customers yet. Before committing budget to interviews, smoke tests, or a build, you want a defensible answer to “is this market big enough, and is it moving in a direction that helps or hurts us?” — sourced from published material a skeptical investor or partner could check.
Helps Answer
- How large is the market for our specific segment, geography, and use case — not the headline industry number?
- How does our bottom-up build-up compare to the top-down published TAM, and where are the assumptions most fragile?
- Which macro trends — regulatory, demographic, technology, channel-cost — would materially change the sizing picture in the next 12–24 months?
- What pricing benchmarks and willingness-to-pay signals exist in published material for this segment?
- What are realistic customer-acquisition-cost, retention, and channel-cost benchmarks for this category?
- Which findings are strong enough to act on, and which need primary research before we commit budget?
Description
Secondary market research tests a market hypothesis you already hold — about its size, segment, competition, or trajectory — against existing published material: government statistics, analyst reports, academic research, regulatory filings, trade-association data, public-company disclosures, founder benchmarks, and competitor data. You come in with a stated belief (“this segment is large enough and moving in our favor”) and use desk research to confirm, resize, or kill your idea using evidence that already exists. The goal is a defensible TAM / SAM / SOM — total addressable market (everyone who could buy), serviceable addressable market (the slice you can reach), and serviceable obtainable market (the slice you can realistically win) — and a trend map that together validate, or refute, the hypothesis before you commit primary-research budget.
Approaches to secondary market research vary, but two principles hold throughout.
The first is triangulated and bottom-up by default. A single TAM number from a single analyst press release is a guess. A defensible TAM rests on at least two independent estimation paths — a top-down estimate from analyst or government data and a bottom-up build-up from named-segment counts multiplied by annual contract value (ACV, the revenue one customer generates per year) — that agree to within roughly 3–5×. When the two disagree by more than that, the assumptions need work, not the number. A polished $50B TAM pulled from one report is usually less credible than a $200M TAM with airtight bottom-up math, explicit segment definitions, and visible sensitivity to its assumptions. The same triangulation rule applies to macro trends: every trend claim should rest on at least two independent perspectives — analyst forecast plus government data, or trade press plus a public-company 10-K, not two restatements of the same primary source.
The second is citation-backed claims with mandatory fact-check. AI assembles secondary-research material quickly, but current models fabricate report titles, statistics, and citation URLs at rates between 18% and 94% depending on the task. The most pernicious failure mode is real URL plus fabricated claim — the citation looks defensible because the URL resolves, but the cited page does not actually contain the cited number. Every numeric claim and every named source must trace to a verifiable URL, and a fact-check pass — by a human or a verification subagent that re-fetches each URL and confirms the cited claim is on the cited page — is part of the method.
For query-level demand signal — what people are typing into Google and asking AI assistants right now — use Search Trend Analysis. This page is about how big a market is and where it is heading according to published material; that one is about what the audience is actively searching for. The two methods are complementary and often run in parallel during the same week.
How to
Prep
-
Frame the decision question, not the topic. “Is this market big enough?” is too vague to research. “How many US mid-market fintechs (50–500 employees) would pay $30K+/yr for compliance automation today, and how do the EU AI Act + state privacy laws shift that number over the next two years?” is researchable and decision-anchored. Limit yourself to 2–3 such questions per research session — anything more and the work expands indefinitely.
-
Define the segment slice you can actually reach. Not “businesses” but “B2B SaaS companies with 10–50 employees in the US.” Not “consumers” but “homeowners aged 30–50 with household income above $100K.” Every later number — TAM, SAM, SOM, willingness-to-pay benchmarks, channel-cost estimates — is anchored to this slice. Skip this step and you produce a headline industry number that looks finished but tells you nothing about the market you can actually sell to.
-
Identify candidate sources across multiple buckets. A single source is a guess; several sources from different buckets make a triangulation. Cross-reference at least three of these source-pool buckets before locking your reading list:
- Government data — Census Business Builder, Bureau of Labor Statistics, SBA.gov, EU Eurostat, country-level statistics offices. Free, large samples, slow to update.
- Library-accessible analyst databases — Most US public libraries and university libraries carry at least one major consumer- or industry-research database; ask the reference desk before paying.
- Free analyst summaries — McKinsey Global Institute, Bain Insights, BCG Insights, HBR, World Economic Forum, OECD. Full reports are paywalled; the free-tier summaries often carry the headline number you need.
- Founder benchmarks — annual benchmark reports published by SaaS lenders, VC firms, and payment platforms. Free, recent, founder-relevant.
- Regulatory filings — SEC 10-Ks for public competitors and adjacent players, FDA / EMA pipelines for regulated categories, EU AI Act compliance filings, state-level privacy filings. Underused; often the only place segment-level numbers are disclosed.
- Trade associations — IAB for ad tech, ICMA for capital markets, AHIP for health insurance, RIAA for music — often publish unit-economics and channel-cost data not available elsewhere.
- Trade press and beat reporting — sector-specific publications and dedicated beat coverage. Useful for recency, but always trace the cited primary source.
If a number shows up in only one bucket, treat it as directional. If it shows up in two or more independent buckets, it earns a slot in your sizing.
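The one-bucket versus two-bucket rule can be sketched as a small tally. This is an illustration only: the bucket labels and claim names are hypothetical, and the code counts distinct buckets without judging whether two buckets are truly independent (two sources restating the same primary source still need a human call).

```python
from collections import defaultdict


def classify_numbers(observations):
    """Label each claim by how many distinct source buckets report it."""
    buckets_by_claim = defaultdict(set)
    for claim, bucket in observations:
        buckets_by_claim[claim].add(bucket)
    return {
        claim: "sizing" if len(buckets) >= 2 else "directional"
        for claim, buckets in buckets_by_claim.items()
    }


# Hypothetical observations: (claim, bucket it was found in)
observations = [
    ("segment_member_count", "government_data"),
    ("segment_member_count", "founder_benchmarks"),
    ("median_acv", "trade_press"),
    ("median_acv", "trade_press"),  # same bucket twice is not corroboration
]
labels = classify_numbers(observations)
print(labels)
```

A set per claim is used so that repeated hits from the same bucket never count twice.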
-
Sketch your sizing approach and trend dimensions before pulling data. Decide your TAM / SAM / SOM build-up plan in advance:
- Top-down — start from a published industry total, multiply by the share that matches your segment slice (geography × company size × use case).
- Bottom-up — start from a named count of segment members (Census counts, funding databases, professional networks) × ACV × time period.
- Benchmark-anchored — anchor an unknown number to a comparable category’s known benchmark (e.g., scheduling-software ACV ≃ project-management-software ACV at the same segment size).
Plan to run at least two paths and triangulate. For macro trends, pick the dimensions that matter to your specific segment from the PESTEL frame — Political, Economic, Social, Technological, Environmental, Legal — and drop the dimensions that do not. For most B2B software the dimensions that matter most are Legal (compliance regimes), Technological (adoption curves of upstream platforms), and Economic (segment-level capital-spending shifts); Environmental rarely applies. Stub the trend map with the dimensions you will fill, before collecting data.
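As a rough sketch of the first two build-up paths, the arithmetic looks like this. Every input is a placeholder rather than real market data, and the function and parameter names are assumptions made for illustration.

```python
def top_down(industry_total, shares):
    """Published industry total narrowed by the shares matching the segment slice."""
    estimate = industry_total
    for share in shares.values():
        estimate *= share
    return estimate


def bottom_up(segment_count, acv, years=1):
    """Named count of segment members x annual contract value x time period."""
    return segment_count * acv * years


# Placeholder inputs only
td = top_down(
    40_000_000_000,
    {"geography": 0.40, "company_size": 0.25, "use_case": 0.10},
)
bu = bottom_up(segment_count=8_000, acv=30_000)
print(f"top-down: ${td:,.0f}   bottom-up: ${bu:,.0f}")
```

Keeping each share as a named entry makes it easy to see which narrowing step carries the weakest source.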
-
Tag claims as decision-grade or directional. Some claims drive decisions (TAM in the 1–5× range you are committing to, willingness-to-pay benchmarks for the segment you are pricing against, regulatory pipeline that affects your go-to-market). Others are background context (rough industry growth rate, generic technology-adoption color). Tag each candidate claim decision-grade (must be verified against a primary source and re-fetched by a fact-check pass) or directional (a roundup or AI summary is fine). Without this triage, fact-checking either takes forever or doesn’t happen.
Execution
The verified reading list from Prep is now the input to structured collection. The goal here is not to write essays on each source; it is to fill a fixed dual schema — sizing facts and trend facts — so the synthesis stage has clean inputs.
-
Use a fixed dual schema, not free-form notes. Every source produces two structured records:
Sizing record: segment definition (geography × company size or demographic × use case), headline number, methodology used by the source to derive it, sample or sample size, year of underlying data, who funded the report (if applicable), and any caveats the source itself flagged. Inline [1], [2] citations on every numeric claim.
Trend record: PESTEL dimension(s) the source addresses, the specific shift being claimed, the time horizon, the methodology behind the projection, the named affected segment, and any countervailing forces the source acknowledged. Inline citations on every projection.
A Sources list at the end of each record resolves every marker to a working URL with the date the source was retrieved.
-
Collect each field from its canonical source.
- Segment counts and demographics → government statistical data, funding databases, and professional-network counts → bottom-up sizing.
- Top-down industry totals → analyst reports and library-accessible databases → top-down sanity check.
- ACV, willingness-to-pay, retention, CAC benchmarks → founder benchmark reports, public-company 10-Ks for adjacent players, trade-association unit-economics reports → bottom-up multipliers.
- Regulatory pipeline → primary regulatory filings (SEC EDGAR for public companies, FDA / EMA pipelines, EU AI Act timeline documents, state-privacy law trackers like IAPP) → trend map (Legal).
- Technology-adoption curves and platform shifts → analyst adoption forecasts, public-company 10-Ks of upstream platforms, World Economic Forum and OECD reports → trend map (Technological).
- Demographic and economic trends → Census projections, BLS labor projections, IMF / OECD forecasts → trend map (Social, Economic).
-
Build top-down and bottom-up in parallel. Run both estimation paths from the start, not sequentially. The point is not to pick one — the point is to see whether they triangulate. If your bottom-up is $200M and your top-down is $50B, one of the two is reading the wrong segment and you need to find out which before you write anything.
-
AI does the first pass; humans verify decision-grade claims. A capable LLM with web access can fill 70–80% of the dual schema in hours. The remaining 20–30% — and any claim you tagged as decision-grade in Prep — must be confirmed by visiting the cited source yourself or routing through a fact-check subagent. Pricing pages change weekly; analyst-report headline numbers shift between editions; “X% growth” claims often hide a different segment definition than yours.
-
Flag everything gated as MANUAL. Anything behind a login, paywall, library-only access, or paid analyst report gets flagged as MANUAL — [what would resolve it] in the record. AI cannot fetch behind authentication, and a confidently-stated number from a source the agent could not actually retrieve is the most common failure mode of AI-generated market sizing.
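One way to hold the dual schema and the MANUAL flags is a pair of plain data classes. This is a sketch under assumptions: the field names paraphrase the record descriptions above, the flag string format and the helpers `unresolved_markers` and `flag_manual` are hypothetical, and the example values are placeholders.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SizingRecord:
    segment: str              # geography x company size or demographic x use case
    headline_number: str      # claim text carrying inline [n] markers
    methodology: str
    sample: str
    data_year: int
    funder: Optional[str] = None
    caveats: List[str] = field(default_factory=list)
    sources: Dict[int, str] = field(default_factory=dict)  # marker -> URL and retrieval date
    manual: List[str] = field(default_factory=list)        # gated items still to resolve


@dataclass
class TrendRecord:
    pestel: List[str]
    shift: str                # claim text carrying inline [n] markers
    horizon: str
    methodology: str
    affected_segment: str
    countervailing: List[str] = field(default_factory=list)
    sources: Dict[int, str] = field(default_factory=dict)
    manual: List[str] = field(default_factory=list)


def unresolved_markers(text: str, sources: Dict[int, str]) -> List[int]:
    """Return inline [n] markers that have no entry in the Sources list."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", text)}
    return sorted(cited - set(sources))


def flag_manual(record, what_would_resolve: str) -> None:
    """Note a gated item rather than letting an unfetched number pass as fact."""
    record.manual.append(f"MANUAL: [{what_would_resolve}]")


record = SizingRecord(
    segment="US x 50-500 employees x compliance automation",
    headline_number="Placeholder count of 8,000 firms [1], placeholder ACV $30K [2]",
    methodology="placeholder",
    sample="placeholder",
    data_year=2024,
    sources={1: "https://example.com/counts (retrieved 2025-01-15)"},
)
flag_manual(record, "library login needed to open the report behind marker 2")
missing = unresolved_markers(record.headline_number, record.sources)
print(missing, record.manual)
```

Checking markers against the Sources list before synthesis catches numeric claims that arrived without a resolvable citation.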
Analysis
Analysis turns the collected records into a sizing-and-trend brief that supports a decision. Each artifact below should end in a sentence that names a choice you are now better equipped to make.
-
Triangulate sizing — top-down vs. bottom-up. Lay your top-down number, your bottom-up number, and any benchmark-anchored numbers side by side. If they agree to within roughly 3–5×, your sizing is defensible — pick the bottom-up as your primary number (it is more defensible to investors and partners) and use the top-down as the sanity-check ceiling. If they disagree by more than that, your assumptions are wrong somewhere and you cannot proceed without finding the gap. Most often the gap is segment definition: the top-down counts a wider universe than the bottom-up reaches.
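The comparison can be reduced to a ratio check. The sketch below assumes the upper end of the rough 3–5× band as the cutoff; the function name, the `max_ratio` parameter, and the returned keys are illustrative choices, not a fixed rule.

```python
def triangulate(top_down, bottom_up, max_ratio=5.0):
    """Compare two sizing paths by the ratio of the larger to the smaller."""
    if top_down <= 0 or bottom_up <= 0:
        raise ValueError("both estimates must be positive")
    ratio = max(top_down, bottom_up) / min(top_down, bottom_up)
    if ratio <= max_ratio:
        # Bottom-up becomes the primary number, top-down the sanity-check ceiling.
        return {"ratio": ratio, "defensible": True,
                "primary": bottom_up, "ceiling": top_down}
    # Too far apart: find the gap (often segment definition) before proceeding.
    return {"ratio": ratio, "defensible": False, "primary": None, "ceiling": None}


close = triangulate(top_down=400_000_000, bottom_up=240_000_000)
gap = triangulate(top_down=50_000_000_000, bottom_up=200_000_000)
print(close["defensible"], round(close["ratio"], 2))
print(gap["defensible"], gap["ratio"])
```

The second call mirrors the $200M versus $50B mismatch described earlier, where one path is reading the wrong segment.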
-
Stress-test every load-bearing assumption. For each number you plan to act on, ask:
- Does the underlying data match the geography, segment, and time period I am sizing? (Specificity drift is the most common failure.)
- Is the data still recent enough to act on? (Anything older than 24 months in a fast-moving category is directional, not decision-grade.)
- Is the source’s definition of the category the same as mine? (Definition drift — “B2B SaaS” in one report includes services revenue, in another it doesn’t.)
- What conversion rate from “addressable” to “actually reached” am I using, and is it sourced or fabricated? (Whimsical conversion rates produce optimistic SOMs that fall apart on first contact with primary research.)
- If the source is vendor-funded, does the vendor benefit from a larger headline number? (Almost always yes.)
-
Map macro trends to the sizing picture. For each PESTEL dimension you collected, ask: does this trend materially change the SAM or SOM in the next 12–24 months, and in which direction? Tag each trend expanding (increases the addressable segment), contracting (shrinks it), or shifting (changes the segment definition rather than the size — for example, EU AI Act compliance becomes a buying criterion, raising willingness-to-pay among compliance-mature buyers and disqualifying compliance-immature ones). Drop trends that do not materially change the picture; a trend map of fifteen items is unusable.
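A trend map can be kept as a short tagged list. In this sketch the three tags come from the step above, while the class names, the `material` flag (your own judgment call, not something the code can decide), and the example entries are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Effect(Enum):
    EXPANDING = "expanding"      # increases the addressable segment
    CONTRACTING = "contracting"  # shrinks it
    SHIFTING = "shifting"        # changes the segment definition, not the size


@dataclass
class Trend:
    name: str
    pestel: str
    effect: Effect
    material: bool  # judged to change SAM or SOM within 12-24 months


def trend_map(trends: List[Trend]) -> List[Trend]:
    """Drop non-material trends and group the rest by effect."""
    kept = [t for t in trends if t.material]
    return sorted(kept, key=lambda t: t.effect.value)


trends = [
    Trend("EU AI Act compliance becomes a buying criterion", "Legal", Effect.SHIFTING, True),
    Trend("Generic technology-adoption color", "Technological", Effect.EXPANDING, False),
]
kept = trend_map(trends)
for t in kept:
    print(f"{t.effect.value:<12} {t.pestel:<14} {t.name}")
```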
-
Run a sensitivity analysis on the SOM. Pick the two or three load-bearing assumptions in your bottom-up build (segment count, conversion rate, ACV, retention), and compute the SOM at +30% / -30% on each. If the SOM swings more than 3× under reasonable assumption ranges, your number is fragile — say so in the brief, and identify the single primary-research move that would tighten the most fragile assumption.
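A minimal sensitivity sketch, assuming a purely multiplicative SOM (segment count × conversion rate × ACV) with placeholder inputs. The function names and the choice to report both one-at-a-time and joint moves are assumptions; the method above does not prescribe how to combine the ±30% moves.

```python
from math import prod


def som(assumptions):
    """Hypothetical multiplicative build: the product of all assumptions."""
    return prod(assumptions.values())


def one_at_a_time(base, pct=0.30):
    """SOM when each assumption alone moves by -pct and +pct."""
    table = {}
    for name, value in base.items():
        low = som({**base, name: value * (1 - pct)})
        high = som({**base, name: value * (1 + pct)})
        table[name] = (low, high)
    return table


def joint_swing(base, pct=0.30):
    """Ratio of best-case to worst-case SOM when every assumption moves together."""
    low = som({k: v * (1 - pct) for k, v in base.items()})
    high = som({k: v * (1 + pct) for k, v in base.items()})
    return high / low


# Placeholder inputs only
base = {"segment_count": 8_000, "conversion_rate": 0.02, "acv": 30_000}
table = one_at_a_time(base)
swing = joint_swing(base)
for name, (low, high) in table.items():
    print(f"{name:<16} {low:>12,.0f} {high:>12,.0f}")
print(f"joint swing: {swing:.2f}x  fragile: {swing > 3.0}")
```

In a purely multiplicative build, one assumption at ±30% moves the SOM by about 1.86× (1.3 / 0.7), while three assumptions moving together give about 6.4×. Which combination counts as a reasonable range is therefore a judgment the brief should state explicitly.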
-
Write the strategic insight. A 200–400 word brief answering five questions in order:
- What is the realistic SOM for our segment slice in the next 12 months? Bottom-up build, named segment, sourced ACV, sourced conversion rate.
- What is the top-down ceiling, and how far is the bottom-up from it? A short sentence on triangulation; if the gap is large, name where it lives.
- Which 2–3 macro trends materially change the SOM in 12–24 months, and in which direction? Each trend must rest on at least two independent sources.
- Which 2–3 assumptions are most fragile, and what primary research would tighten them? Names the next research move.
- Go / no-go / not-yet? A one-sentence recommendation grounded in the four answers above.
This is the artifact you actually act on. The schemas, citations, and sensitivity tables exist to make this paragraph defensible.
- AI-fabricated sources (hallucination): When AI assembles the raw material, it invents plausible report titles, statistics, and citation URLs. The most pernicious version is a real, resolving URL paired with a number that is not actually on the page, so the citation looks defensible while the claim is invented. Never treat an AI-assembled claim as verified. Run a secondary fact-check agent as its own pass: hand it the assembled brief and have it re-fetch every cited URL, confirm the specific cited number or quote actually appears on that page, and label each claim VERIFIED, FAILED VERIFICATION (the URL loads but does not support the claim; this is the dangerous case), or UNREACHABLE (404, paywalled, or blocked). Keep every decision-grade claim unverified until that pass clears it. The fact-check subagent built into the Prep, Execution, and Analysis prompts above is written for exactly this.
- Aspirational thinking disguised as analysis: Founders pick TAM numbers that support the story they want to tell, then back-fit the methodology. Force yourself to commit to your sizing path and segment definition in Prep, before you have seen any of the headline numbers; if a single source flips your conclusion, you were fitting the method to a predetermined answer.
- Vendor-report bias: Industry analysts and trend-report publishers benefit from a larger headline number; their commercial interest is often aligned with optimism. When you cite a vendor-funded report, weight it accordingly and require an independent corroboration before treating its number as decision-grade.
- Confirmation bias in conversion rates: The most common way founders inflate SOM is by importing a whimsical conversion rate from “addressable” to “actually reached.” Use sourced conversion rates from comparable categories (founder benchmarks are useful here) and run sensitivity at +30% / -30%; if the SOM only stands up at the optimistic end, say so in the brief.
- Willingness-to-use vs. willingness-to-pay: Search volume, expressed interest, and “would you use this” survey results indicate willingness-to-use, not willingness-to-pay. Anchor pricing benchmarks to actual paid revenue from comparable products, not to interest signals.
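The three fact-check labels from the first item above can be sketched as a classifier over already-fetched pages. This is deliberately crude and rests on assumptions: the function name is hypothetical, fetching is left out entirely (pass `None` for a page that could not be retrieved), and a literal substring match will miss paraphrases and can match by accident, so it supports a human or model reviewer rather than replacing one.

```python
from typing import Optional


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks do not hide a match."""
    return " ".join(text.lower().split())


def check_claim(cited_text: str, page_text: Optional[str]) -> str:
    """Label one claim against the text of the page it cites."""
    if page_text is None:
        return "UNREACHABLE"  # 404, paywalled, or blocked
    if _normalize(cited_text) in _normalize(page_text):
        return "VERIFIED"
    return "FAILED VERIFICATION"  # page loads but does not contain the claim


# Placeholder claims and page texts
results = [
    check_claim("median growth of 28%", "Report: median   growth of 28% this year."),
    check_claim("median growth of 28%", "This page discusses pricing only."),
    check_claim("median growth of 28%", None),
]
print(results)
```

Treating the unreachable case separately keeps a gated source from being silently counted as either confirmed or refuted.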
Learn more
Case Studies
Spotify: Reading the music-industry collapse
Daniel Ek framed Spotify around the observation that piracy could not be legislated away, so a paid service had to beat free. The bet was grounded in published industry-revenue decline and rising smartphone adoption rather than primary research.
Uber TAM debate: Damodaran vs. Gurley (2014)
Aswath Damodaran sized Uber’s global TAM at about $100B by anchoring to taxi-and-limousine revenue; Bill Gurley argued the relevant TAM was several hundred billion once ride-hailing reshaped private-car ownership. The exchange is the canonical illustration of how segment definition drives sizing.
Lighter Capital: 2025 SaaS benchmarks
Lighter Capital’s free 2025 benchmark report puts median B2B SaaS revenue growth at 28% (down from 47% in 2024) and upper-quartile growth at 65% (down from 88%), giving founders a public baseline for their own growth-rate assumptions.
Further reading
- Pearson catalog
- Aswath Damodaran — A Disruptive Cab Ride to Riches: The Uber Payoff (2014)
- Bill Gurley — How to Miss By a Mile: An Alternative Look at Uber’s Potential Market Size (2014)
- SBA.gov: Market research and competitive analysis
- Cornell University Johnson Library FAQ: How can I find market research reports and data?
- Anthropic: How we built our multi-agent research system (2025)
- Hallucination to Truth: A Review of Fact-Checking and Factuality Evaluation in Large Language Models (arXiv 2508.03860, 2025)
- AI Hallucination Rate Benchmarks 2026
- HBR: The AI Tools That Are Transforming Market Research (2025)
- Bain & Company — Macro Trends Research
- PESTLE Analysis (SI Labs)
- Census.gov — Economic Census