The shortlist that shouldn't be a shortlist
Your procurement process is running. The shortlist has five names. Each one looks credible. Each demo went well. The references all check out.
Five is too many. If you're doing the engineering evaluation properly, two of the five drop within the first 45 minutes. Sometimes three. The shortlist that goes to procurement should be the two or three names left after the engineering filter, not the five names sales presented you with.
This piece is the engineering filter. It's not procurement's checklist. It's the questions a senior engineering leader uses to compress a shortlist of five down to a defensible two before the contract conversation starts.
AI – Powered Product Development Playbook
How AI-first startups build MVPs faster, ship quicker, & impress investors without big teams.
The eight items the engineering filter actually checks
Procurement evaluates contract terms, financial stability, security posture, commercial flexibility. Necessary. The engineering filter evaluates whether the partner can actually do the engineering. Eight items. Each is concrete enough to be answered in a single working session.
Item 1: Can they whiteboard your reference architecture in 45 minutes?
Most engineering leads have walked into kickoffs where the partner brings out the reference architecture deck. The slides look good. The architecture is reasonable.
The reveal isn't in the deck. It's in the followup. Ask the partner to whiteboard the same architecture, in real time, in a single working session, with their senior engineer on the call. If they can, they actually understand it. If they can't, the deck was authored by someone who isn't going to be on your project.
This filter alone eliminates one or two of the five most of the time.
Item 2: Do their post-incident write-ups exist, and are they readable?
Real implementation partners have incidents and write them up. Ask for the last three post-incident reviews from comparable engagements (with customer-identifying details redacted). What you're checking: do they have the discipline to produce them, and is the analysis sharp or generic?
The post-incident review tells you more about a partner than the case study deck. The case study is marketing. The post-incident review is engineering. Partners who can't produce them aren't operating production systems.
Item 3: Does their senior engineer disagree with their own salesperson during the call?
Sales says scope, security, and roadmap with confidence. The senior engineer who'll actually do the work occasionally disagrees with the salesperson in front of you. Maybe on timeline, maybe on the right approach to integration, maybe on whether a specific feature is achievable in the proposed window.
This isn't a problem. It's a good sign. It means the engineer is engaged and the partner has an internal honesty discipline. Partners where the engineer enthusiastically agrees with everything the salesperson said are partners where the engineer was coached.
Item 4: What's the named team for your engagement, and what have they shipped together before?
Some partners deploy a sales team that scopes the work, a delivery team that gets staffed when the contract signs, and a third team that takes over at month four. By the time the work starts, none of the people in the original sales conversation are on the engagement.
Ask for the named team. The three to five engineers who will actually do the work. Ask what they've shipped together as a unit before. If the answer is vague, the partner is building the team after they sign you. That's a 90-day ramp on your dime before they can ship anything useful.
Item 5: Can they show you a recent code review from a comparable engagement?
This sounds aggressive. It's not. Code review style is the highest-fidelity signal of engineering culture. A partner whose code reviews are detailed, constructive, and focused on architectural concerns is a partner whose engineers will hold the work to a high standard. A partner whose code reviews are perfunctory or absent is a partner whose code quality will drift.
Ask for redacted samples. Most good partners will share them. The ones who refuse are usually the ones whose internal review process is weaker than the brochure suggests.
Item 6: What's their AI-specific operating model?
General consulting firms got hit hard in 2025. McKinsey laid off 200 technology and support staff after automating non-client-facing work with AI. The category of partner that worked for traditional IT is in transition for AI.
Ask the AI-specific question: how do you operate the LLM ops layer for clients? Do you ship eval harnesses to client repos? Cost dashboards? Observability tuned for LLMs? Rollback automation?
Partners with a real AI operating model answer specifically. Partners without one answer generically about "best practices" or "industry standards." The first kind ships systems you can operate at month eighteen. The second kind ships systems that hit the cliff.
Item 7: What does the knowledge transfer at month nine actually look like?
You're not buying a partner forever. You're buying a partner who hands the operation to your team at some point. Ask what that looks like.
The right answer is specific. Documentation in your repos. Named engineers on your team who paired with their senior engineers during the build. A graduated handoff schedule. A defined endpoint to the engagement.
The wrong answer is vague. "We support you ongoing" or "knowledge transfer is included" without specifics. Partners who can't articulate the off-ramp are partners whose business model requires you to never leave.
Item 8: What would they refuse to do, even if you asked?
The most diagnostic question of the eight. Real partners have refused work. Maybe because the timeline was unrealistic, maybe because the scope was vague, maybe because the customer wasn't ready operationally.
A partner who has refused work has standards. A partner who has never refused work optimizes for billable hours. The second kind will agree to your timeline even when they know it's unrealistic, and you'll discover that fact at month four.
Ask yourself: Run your current shortlist through these eight items in a single working session, with the named senior engineer from each partner present. Two will probably exit by item three. The remaining three will compress to two by item eight. Sometimes one. That's the shortlist you should be negotiating with.
What this filter doesn't tell you
The eight items don't evaluate commercial fit. They don't catch budget mismatches, contract risks, or cultural alignment with your team. Procurement still has work to do. The point of the engineering filter isn't to replace procurement; it's to make sure the partners procurement is evaluating can actually do the engineering.
Without the engineering filter, procurement spends three weeks negotiating with a partner who turns out, at month four, to be incapable of delivering what they promised. The contract is fine. The engineering isn't. That's the failure mode the eight items eliminate.
The compound case for running this filter properly
BCG's 2024 research found 74% of companies investing in AI failed to generate tangible value. HFS Research: 65% of enterprise buyers say traditional consulting models no longer provide enough value. Industry data: 68% of AI projects exceed budget, 42% average overrun, scope creep cited in 52% of cases.
Most of that bad outcome traces back to partner mismatch detected too late. The eight-item filter catches the mismatch at the sales-call stage, when the cost of switching is zero. The same mismatch detected at month four costs a quarter of the engagement, the team's confidence in the program, and the CFO's confidence in the team.
The engineering filter is the highest-leverage hour of work in the entire partner selection process. It's almost free. It usually isn't run because procurement owns the process and engineering's calendar isn't blocked for it.
How Logiciel fits this conversation
If you're using the eight items to evaluate Logiciel, here's the honest answer to each. We can whiteboard a reference architecture in 45 minutes; ask for it and we'll do it. We have post-incident reviews and can share redacted versions. Our engineers disagree with our sales team during scoping; that's the operating norm. We name the team that does the work, and they've shipped together before. We ship eval harnesses, cost dashboards, and observability to client repos as part of every engagement. Knowledge transfer is explicit, scheduled, and ends with a defined off-ramp. We've refused work, multiple times, when timelines or scope made the engagement likely to fail.
That's the eight-item answer. If you'd rather see it in a working session than read it on a blog, the offer below is real.
100 CTOs. Real Expectations
This report shows what actually predicts delivery success and what CTOs discover too late.
Call to Action
Two specific moves before procurement closes
Run the eight-item filter against your current shortlist this week. The conversation it produces, with your engineering lead and the partner's senior engineer in the same room, compresses the shortlist faster than any other technique we know.
Use Logiciel's Evaluation Differentiator Framework as the scoring rubric. It's structured around the eight items. Your engineering lead can run it in 45 minutes.
Download the Evaluation Differentiator Framework →
Or run the working session with us included. We'll go through our own answers to the eight items in real time, and you can compare. The session doubles as both a partner evaluation and an AI engineering working session, depending on which way it goes.
Book the 45-minute partner working session →
Frequently Asked Questions
Why eight items and not the standard 12 procurement questions?
The 12 procurement questions include several that aren't engineering questions (financial stability, contract terms, references). Those still matter, just not in the engineering filter. The eight items are specifically the ones a senior engineer would test before signing off on a partner. Run both. Engineering owns the eight; procurement owns the rest.
What if no partner on our shortlist passes all eight?
That's worth knowing. Either the shortlist needs to be expanded, or one partner needs to be told what's missing from their answer and given a chance to address it. Most partners can answer eight of eight; partners who can't are usually missing one specific capability that will surface during the engagement anyway.
How do we run this when our shortlist is global and not in our office?
Video calls work. The whiteboard test transfers to a shared Miro or Figma. The senior-engineer-disagreement test transfers (sometimes more honestly, because the sales filter is weaker on video). The only item that's harder remote is the code review sample, which most partners will share digitally anyway.
Aren't we putting unfair pressure on partners with these questions?
The questions are diagnostic, not adversarial. Partners who pass them are partners who'll ship. Partners who fail them are partners whose engagements would have surprised you in eighteen months. The conversation is supposed to be substantive. The partners worth working with will appreciate that you're taking the engineering evaluation seriously.
Our procurement team won't add 45 minutes per partner for an engineering evaluation. What do we do?
Run the filter before procurement gets involved. Have your engineering lead spend an hour with each partner's senior engineer during the initial sales conversation, before the formal evaluation starts. The names that survive go to procurement. The names that don't are eliminated before procurement spends time on them. Net time saved: usually a week of procurement's calendar. --- Sources cited: - BCG: 74% of AI investments fail / HFS Research: 65% buyer dissatisfaction with traditional consulting - 68% of AI projects exceed budget, 42% average overrun - McKinsey 200-person AI layoff: consulting model in transition