The honest answer on AI in CDD
AI does not replace commercial due diligence. AI replaces the slowest, lowest-value parts of the synthesis around CDD, which is valuable but not the same thing.
The defensible value in CDD comes from two activities. First, primary research the seller cannot replicate: structured customer calls, lost customer calls, channel calls, expert calls, surveys. Second, senior judgment about what the evidence actually means for the thesis. Neither activity is well served by handing it to a model. Customers do not give honest answers to chatbots, and judgment about a thesis-specific question is not a retrieval problem.
What AI does well is everything in between: the transcription, the first-pass coding, the desktop research, the competitive scraping, the model building. At 2nd St Strategy, our internal stack has been built across 200+ engagements specifically around those leverage points.
Where the leverage is
1. Transcription and theme extraction
A CDD engagement with 30 customer and expert calls used to spend a week on transcription and coding. Our pipeline does the transcription in hours, runs first-pass theme extraction against a structured grid, and surfaces patterns for a senior reviewer to confirm or adjust. The engagement gets to synthesis days earlier, which means more time on judgment and verdict shaping.
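To make "first-pass theme extraction against a structured grid" concrete, here is a minimal sketch. It is not our production pipeline. It uses plain keyword matching as a stand-in for whatever model does the coding, and the grid, theme names, and function names are all hypothetical. The point it illustrates is the shape of the output: quotes grouped by theme, with call-level coverage, ready for a senior reviewer to confirm or adjust.

```python
from collections import defaultdict

# Hypothetical coding grid: theme -> trigger phrases. Illustrative only;
# a real grid is written per engagement around the investment thesis.
THEME_GRID = {
    "pricing": ["price", "expensive", "discount"],
    "switching_risk": ["switch", "alternative", "competitor"],
    "support": ["support", "response time", "onboarding"],
}


def code_transcripts(transcripts, grid=THEME_GRID):
    """Return {theme: [(call_id, sentence), ...]} as a first pass for review.

    Keyword matching is an assumption made for this sketch, standing in
    for a model-based coder.
    """
    coded = defaultdict(list)
    for call_id, text in transcripts.items():
        for sentence in text.split("."):
            sentence = sentence.strip()
            if not sentence:
                continue
            lowered = sentence.lower()
            for theme, phrases in grid.items():
                if any(phrase in lowered for phrase in phrases):
                    coded[theme].append((call_id, sentence))
    return dict(coded)


def theme_coverage(coded, n_calls):
    """Share of calls that mention each theme at least once."""
    return {
        theme: len({call_id for call_id, _ in hits}) / n_calls
        for theme, hits in coded.items()
    }


# Made-up transcripts, for illustration only.
transcripts = {
    "call_01": "The price went up twice. Support was slow to respond",
    "call_02": "We looked at a competitor last year. The price is fine",
}
coded = code_transcripts(transcripts)
coverage = theme_coverage(coded, n_calls=len(transcripts))
```

The reviewer works from the coded quotes, not from the coverage numbers alone. A theme that shows up in every call can still be irrelevant to the thesis, and that call stays with the senior person.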
2. Competitive scraping and benchmarking
Web-scale competitive data is genuinely easier to collect with modern tooling: pricing pages, job postings, product release cadence, customer review distributions, executive movement, partnership announcements. We assemble a competitive picture from public sources that would have required a small army a few years ago.
3. Dynamic TAM modeling
We build top-down and bottom-up models that update interactively, with the assumptions exposed so the deal team can flex inputs on the call. The model becomes a live artifact during diligence rather than a static deck slide.
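A minimal sketch of what "exposed assumptions" can look like in practice. Every number below is a hypothetical placeholder, not a figure from any engagement, and the field and function names are invented for illustration. The idea is that each input is a named assumption, both sizing approaches read from the same object, and flexing one input produces a new scenario without touching the base case.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Assumptions:
    # All defaults are hypothetical placeholders for illustration.
    total_market_spend: float = 2_000_000_000.0   # top-down: category spend, USD/yr
    addressable_share: float = 0.25               # top-down: share the target can serve
    n_target_customers: int = 40_000              # bottom-up: customers in scope
    adoption_rate: float = 0.5                    # bottom-up: share likely to buy
    annual_spend_per_customer: float = 30_000.0   # bottom-up: USD per customer per year


def top_down_tam(a: Assumptions) -> float:
    """Total category spend scaled down to the addressable slice."""
    return a.total_market_spend * a.addressable_share


def bottom_up_tam(a: Assumptions) -> float:
    """Customer count times adoption times spend per customer."""
    return a.n_target_customers * a.adoption_rate * a.annual_spend_per_customer


def reconciliation_gap(a: Assumptions) -> float:
    """Relative gap between the two estimates, measured against top-down."""
    return (bottom_up_tam(a) - top_down_tam(a)) / top_down_tam(a)


base = Assumptions()
# Flex a single input, as a deal team might on a call. The base case is unchanged.
downside = replace(base, adoption_rate=0.25)

print(f"top-down:  {top_down_tam(base):,.0f}")
print(f"bottom-up: {bottom_up_tam(base):,.0f}")
print(f"downside bottom-up: {bottom_up_tam(downside):,.0f}")
```

A large gap between the top-down and bottom-up figures is itself a finding. It usually means one of the assumptions deserves a harder look before anyone quotes a market size.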
4. Pattern matching across engagements
The accumulated corpus of anonymized prior engagements is itself a competitive asset. When a sub-thesis on the current deal looks familiar, we can pull the pattern from a prior engagement in seconds rather than relying on individual memory.
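As a rough illustration of the retrieval step, the sketch below ranks prior engagement summaries by word overlap with the current sub-thesis. Jaccard similarity on tokens is a deliberately simple stand-in chosen for this example, not a description of how our corpus is actually searched. The engagement IDs and summaries are made up.

```python
import re


def tokens(text):
    """Lowercased alphanumeric word set for a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection over size of the union; 0.0 if both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_patterns(query, corpus, top_k=3):
    """Return up to top_k (engagement_id, score) pairs with nonzero overlap.

    corpus maps a hypothetical anonymized engagement ID to a short summary
    of the sub-thesis pattern observed on that engagement.
    """
    query_tokens = tokens(query)
    scored = [
        (jaccard(query_tokens, tokens(summary)), engagement_id)
        for engagement_id, summary in corpus.items()
    ]
    # Highest score first; ties broken by ID so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(eid, score) for score, eid in scored[:top_k] if score > 0]


# Made-up corpus entries, for illustration only.
corpus = {
    "eng_a": "customer churn driven by price increases",
    "eng_b": "route density limits margin expansion",
    "eng_c": "price increases accepted by enterprise customers",
}
matches = similar_patterns("churn after price increases", corpus)
```

Whatever the retrieval method, the output is a shortlist of candidates. Deciding whether a prior pattern really applies to the current deal is still a judgment call.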
5. MSA-level geographic intelligence
For any location-based or route-based business, the right unit of analysis is the metropolitan statistical area (MSA). We built PinpointIQ as a software platform for exactly that. It covers 900+ U.S. MSAs across 30+ verticals with TAM, competitive density, demographic drivers, and white-space mapping. Every location-based CDD engagement starts there.
Where AI does not help
- Sourcing primary research. Customer interviews come from outbound work, expert networks, and existing relationships. No model produces them.
- Conducting the interview. The conversation requires a senior person who knows what to listen for, when to probe, when to drop a planned question.
- Judging credibility. Was the customer hedging? Was the expert overconfident? Did the channel partner have an axe to grind? These are judgment calls, not retrieval.
- Verdict and IC narrative. The deliverable that actually drives the investment decision is written by a senior person who has done many of these. The model is not on the IC.
What confidentiality requires
Real CDD work involves sensitive transcripts, target financials, and proprietary research. None of it can be allowed to drift into public training pipelines.
At 2nd St we use models with enterprise data handling commitments, route nothing identifiable through public consumer products, and document the data handling architecture for every engagement. This is unglamorous and load-bearing.
The shape of the stack
A CDD engagement at 2nd St typically uses some combination of the following:
- Hardened transcription pipeline for calls and surveys.
- Theme extraction tools that code transcripts to a structured grid.
- Targeted web scraping for competitor signal.
- PinpointIQ for MSA-level market sizing and competitive landscape in location-based services.
- Dynamic TAM and SAM modeling with exposed assumptions.
- Pattern-matching across the anonymized prior engagement corpus.
- Bespoke tooling per deal: vertical-specific scrapers, custom dashboards, sometimes one-off models built for a specific sub-thesis.
The point of the stack is to get to the senior judgment faster, with better evidence, on a shorter timeline. Not to remove the judgment.
If you are wondering whether to adopt this
Most sponsors should not be building this stack in-house. The right answer is hiring CDD partners who already have it and using the accelerated synthesis to ask sharper questions. That is what 2nd St is built for.
