Most AI integration projects fail at L1. They skip it. They start by connecting to the biggest system they have and end up with a one-system answer to a 6-system question. L1 is the work you do before writing any ingest code: identify which sources hold the data needed to answer the questions the business actually has.
Start with questions, not systems. A cross-system question such as "which of our highest-value customers are we failing to serve, and which of them have already complained?" needs data from 4–8 sources joined together. Map every source, then write down its freshness SLA and its owner. Do this on a whiteboard before any code ships.
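Once the whiteboard pass is done, the inventory can be captured in a small structure so gaps are visible. This is a minimal sketch, not a prescribed format: the `Source` class, the `incomplete_entries` helper, and every owner and SLA value below are hypothetical and only illustrate recording a source, its owner, and its freshness SLA.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class Source:
    name: str
    owner: Optional[str] = None                # person or team accountable for the data
    freshness_sla: Optional[timedelta] = None  # maximum acceptable staleness


def incomplete_entries(inventory):
    """Return names of sources still missing an owner or a freshness SLA."""
    return [s.name for s in inventory if s.owner is None or s.freshness_sla is None]


# Illustrative values only; these owners and SLAs are made up.
inventory = [
    Source("CRM", owner="sales-ops", freshness_sla=timedelta(hours=1)),
    Source("Stripe", owner="finance", freshness_sla=timedelta(hours=24)),
    Source("Intercom", owner="support"),  # SLA not yet agreed
    Source("Mixpanel"),                   # nobody has claimed it yet
]

print(incomplete_entries(inventory))
```

Any name this prints is a source the audit has listed but not finished: it still needs an owner, an SLA, or both before ingest work starts.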
Most teams' first move on an AI integration is to connect to the system they know best, usually the CRM, and try to make AI features work from there. Six weeks later, the AI is still giving incomplete answers because half the relevant data lives somewhere else: billing in Stripe, support in Intercom, product usage in Mixpanel, ad spend in Meta Ads. The team didn't skip the data work; they skipped scoping it.
L1 is the audit. List every source. List every question the business needs answered. Map each question to the sources that hold the answer. A total of around 25 sources scares people, but it's typical for any operating business at scale: DTC, SaaS, services, healthcare, and financial services all have the same shape.
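The question-to-source mapping can be sketched as a plain dictionary, then inverted to show how many questions depend on each source. The three questions and their source assignments below are hypothetical examples, and `questions_per_source` is an illustrative helper rather than part of any tool.

```python
from collections import defaultdict

# Hypothetical audit output: each business question mapped to the sources
# that hold its answer. Questions and assignments are illustrative only.
question_sources = {
    "Which high-value customers have open complaints?": {"CRM", "Stripe", "Intercom"},
    "Which paid channels bring customers who stay?": {"Meta Ads", "Stripe", "Mixpanel"},
    "Which accounts stopped using the product?": {"CRM", "Mixpanel"},
}


def questions_per_source(mapping):
    """Invert question -> sources into source -> questions that depend on it."""
    inverted = defaultdict(list)
    for question, sources in mapping.items():
        for source in sorted(sources):
            inverted[source].append(question)
    return dict(inverted)


by_source = questions_per_source(question_sources)
for source, questions in sorted(by_source.items()):
    print(f"{source}: needed by {len(questions)} question(s)")
```

The inverted view is useful for prioritization: a source that several questions depend on is an obvious early build, while a source needed by only one question is the kind that tends to get skipped unless the audit writes it down.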
The single-customer demo we trace through every layer of this playbook is Sample Customer 4287: a returning customer with ~$2K LTV, a recently stalled order, an active subscription, recent email engagement, and an email she just sent to support. Question: should we proactively reach out before she churns? The answer requires 6 sources: Shopify (order), warehouse (fulfillment), carrier API (tracking), Gorgias (support ticket), Recharge (subscription state), and Klaviyo (engagement). No single SaaS tool can answer this. The lake makes it answerable. L1 is where you map the 6 sources.
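For this one question, the L1 output is small enough to write down directly. The sketch below records the six sources named above and reports what is missing for a given set of connected systems; the `missing_sources` helper and the "only Shopify is connected" starting point are assumptions for illustration.

```python
# Sources required for the Customer 4287 question, as listed in the text,
# each with the piece of the answer it supplies.
REQUIRED = {
    "Shopify": "order",
    "warehouse": "fulfillment",
    "carrier API": "tracking",
    "Gorgias": "support ticket",
    "Recharge": "subscription state",
    "Klaviyo": "engagement",
}


def missing_sources(required, connected):
    """Return the required sources (and what they supply) not yet connected."""
    return {name: role for name, role in required.items() if name not in connected}


# Assumption for illustration: only the storefront is connected so far.
gap = missing_sources(REQUIRED, connected={"Shopify"})
print(f"{len(gap)} of {len(REQUIRED)} sources missing")
for name, role in gap.items():
    print(f"  {name}: {role}")
```

With a single connected system, five of the six inputs to the outreach decision are absent, which is the one-system answer the playbook warns about.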
The multi-source pattern isn't DTC-specific. SaaS: high-MRR account showing churn signals → CRM + product telemetry + support + billing + email = 5 sources. Services: top-decile client with an overdue invoice and a missed status update → CRM + PSA + invoicing + email + project tracker = 5 sources. Healthcare: high-utilization patient with missed follow-ups → EHR + scheduling + billing + clinical notes = 4 sources. Each question touches only a handful of sources, but the 25-source rule for the business as a whole applies everywhere.
A good answer from a vendor is a structured discovery: list every business question that matters, map each to the sources required, and flag the long-tail sources that get skipped at most shops. A bad answer is "tell us what systems you have and we'll connect to those." If a vendor pitches you a 5-source build for a 25-source business, they're either inexperienced or optimizing for a fast close. Either way, that's the conversation to have before signing.