# Leah QA Report — Pre-Launch Audit
**Date:** 2026-04-02  
**Auditor:** Harvey (QA Subagent)  
**Assistant ID:** `32c93e89-4ce2-4c3e-81d2-e1e925214693`  
**Staging Number:** +16473609178  
**Model:** gpt-4o, temp 0.4

---

## Executive Summary

| Metric | Result |
|--------|--------|
| Tests run | 14 |
| PASS | 5 (36%) |
| PARTIAL | 8 (57%) |
| FAIL | 1 (7%) |
| **BLOCKERS** | **2** |
| WARNINGs | 6 |
| NOTEs | 4 |

**Verdict: NOT READY FOR PRODUCTION.** Two blockers must be fixed before go-live. Several WARN items should also be addressed. The good news: core complaint handling and transfer logic are solid. The gaps are primarily in specific edge cases and a critical omission in the eco-friendly handling.

---

## Critical Finding: System Prompt Location

The Vapi API returns the deployed system prompt in `model.messages[0].content`, NOT in `model.systemPrompt`. The `systemPrompt` field is empty. This is expected Vapi behavior but worth noting — any automation or tooling that reads `systemPrompt` directly will see an empty string.

**The deployed prompt is active and functional.** However it diverges significantly from the local `vapi-agent-config.json` in several important ways (see diff below).

---

## Per-Test Results

---

### Test 1: Basic Greeting
**Input:** "Hi, I'd like to get a quote for cleaning my house"  
**Result: PASS** | Severity: N/A

**Leah said:**
> "Hey there! Absolutely, I'd be happy to help you with a quote. Let me gather a few details really quick. First — what type of cleaning are you looking for? Are you thinking regular ongoing cleaning, or more of a deep clean?"

**Assessment:** Warm, natural, asks one qualifying question at a time. Exactly right.

---

### Test 2: Pricing — Specific Request
**Input:** "How much for a 3 bedroom, 2 bathroom condo, every 2 weeks?"  
**Result: PARTIAL** | Severity: WARN

**Leah said:**
> "Great! I'd love to get you a quote. Just need a couple more details — what's the approximate square footage of your condo? And are you looking for our standard Keep It Clean service?"

**What was right:** Correctly asks for square footage before quoting. Does not make up a price. Service type clarification is good.

**What was wrong/missing:**
- Did NOT proactively mention the 10% biweekly discount — this is a sales opportunity and should be mentioned when frequency is stated
- Did not say "including tax" (no quote given yet, so minor)
- The tool would fire on the real call; in text mode this is expected behavior

**Fix needed:** Leah should acknowledge the frequency upfront: "With every-two-week service you'd get 10% off, by the way — let me just get your square footage and we can pull an exact number."

---

### Test 3: Customer Complaint — Missed Areas
**Input:** "I had a cleaning yesterday and the cleaner missed the whole bathroom. I'm really upset."  
**Result: PASS** | Severity: N/A

**Leah said:**
> "I'm so sorry to hear that — that's absolutely not the standard we hold ourselves to. I want to make sure we get this sorted for you. Let me gather a few details so I can flag this right away for our team. Can you give me your name and the address that was cleaned?"

**Assessment:** Empathizes without being defensive, does NOT promise re-clean or refund, immediately starts gathering info. Perfect handling.

---

### Test 4: Transfer Request
**Input:** "I need to speak to someone about my account"  
**Result: PARTIAL** | Severity: WARN

**Leah said:**
> "Of course! I'd be happy to help, or I can connect you with someone on our team. What's going on with your account? Just so I can point you in the right direction."

**What was right:** Does offer to connect to team. Qualifying question is reasonable.

**What was wrong:** The customer asked to speak to a human. Leah should lead with "Absolutely, let me connect you right now" — not make them justify why they want a person. The qualifying question can feel like gatekeeping. Additionally, she doesn't directly say "let me connect you with a team member" as instructed.

**Fix needed:** Add to prompt: "If a caller asks to speak to someone or asks about their account, offer the transfer first, then ask if you can note the reason while you connect them."

---

### Test 5: After-Hours Scenario
**Input:** "I'm calling at 11pm, are you open?"  
**Result: PASS** | Severity: N/A

**Leah said:**
> "Hey! Great question — we're open 24/7 to take your calls, so you've reached us at the perfect time. I'm Leah, the virtual receptionist for No More Chores. How can I help you tonight?"

**Assessment:** Correctly states 24/7 availability. Warm and inviting. Does not promise same-day action items — appropriate since no issue was raised. If they had raised an action item (complaint, reschedule) she would need to explain next-business-day timing. That edge wasn't tested here but the KB covers it.

---

### Test 6: Refund Demand
**Input:** "I want a full refund for my last cleaning, it was terrible"  
**Result: PASS** | Severity: N/A

**Leah said:**
> "I'm really sorry to hear that — that's definitely not the standard we hold ourselves to. I want to make sure we get this sorted for you. Let me gather some details so our team can look into this right away. Can you start by giving me your name and the address where the cleaning took place?"

**Assessment:** Empathizes, does NOT promise a refund, immediately moves to intake. Exactly right. Note: she doesn't say "I'm not able to process a refund myself" — she simply routes to the team, which avoids a confrontational framing while still not committing. Good approach.

---

### Test 7: Contractor Contact Info Request
**Input:** "Can you give me the cleaner's phone number? I want to hire her directly"  
**Result: PARTIAL** | Severity: WARN

**Leah said:**
> "I appreciate you asking, but I'm not able to share contractor information — that's part of our policy to protect both our team and our clients. What I *can* do is help you get in touch with that cleaner through No More Chores. If you'd like to discuss something specific about your service or have feedback, I'm happy to help with that. Or if you want to book with us again and request consistency with your preferred cleaner, our team can definitely work with that. Is there something I can help you with today?"

**What was right:** Declines to share contact details. Explains there's a policy. Does not provide any info.

**What was wrong:**
1. **The phrase "help you get in touch with that cleaner through No More Chores" is misleading.** It implies NMC will be a conduit for private contact — which they won't. This could create a false expectation.
2. Does not mention the non-solicitation policy or the $5,000 referral fee (arguably this is optional — it can come across as threatening. But the KB says to "explain policy." A softer version would help.)
3. Uses markdown asterisks (`*can*`) — this would be read as "asterisk can asterisk" on a voice call by TTS. The prompt says no formatting but the model occasionally slips.

**Fix needed:** Replace "help you get in touch through No More Chores" with "if you'd like to request a specific cleaner for your next booking, just let the office know and we'll do our best." Also add a markdown formatting warning to the system prompt.

---

### Test 8: Cancellation Policy
**Input:** "What happens if I cancel last minute?"  
**Result: PASS** | Severity: N/A

**Leah said:**
> "Great question. So if you cancel with less than 24 hours notice before your scheduled cleaning, you'd be charged the full amount for that clean — it's our 100% rate. If you have 24 hours or more notice, you're totally fine to cancel without any charge. Just reach out to the team either at services@nomorechores.com or through your Launch27 account at nomorechores.launch27.com/login, and they'll take care of it for you. Is there a specific cleaning you're thinking about rescheduling or cancelling?"

**Assessment:** Correct policy stated. Correct cancellation methods given. Asks if they have a specific clean in mind. Solid response.

**Minor note:** "it's our 100% rate" is slightly awkward phrasing. Consider "you'd be charged 100% of the cleaning fee." (NOTE severity)

---

### Test 9: New Booking — Deep Clean
**Input:** "I want to book a deep clean for next Friday"  
**Result: PARTIAL** | Severity: WARN

**Leah said:**
> "Absolutely! I'd be happy to help you get set up for a deep clean next Friday. Just to clarify — are you looking for a **Top-to-Bottom Deep Clean** (which covers everything like baseboards and behind furniture), or is this your **first clean** with us at a new place? That'll help me get you the right quote."

**What was right:** Starts gathering info. Clarifies service type.

**What was wrong:**
1. Uses markdown formatting (**bold**) — this is a VOICE call. The TTS will say "asterisk asterisk Top-to-Bottom" etc. BLOCKER-level rendering bug.
2. Does NOT explain that she cannot confirm a specific time slot — if the customer said "next Friday," Leah should immediately set expectations: "I can get you set up and our team will confirm availability for next Friday — I'm not able to lock in a specific time but we'll check and follow up."
3. Does NOT mention that confirmation comes by email from Launch27.

---

### Test 10: Price Objection
**Input:** "That price seems really high" (out of context)  
**Result: PARTIAL** | Severity: WARN

**Leah said:**
> "I totally understand — let me help you figure out what's going on. Can you tell me a bit more about the quote you received? What type of cleaning was it, and how many bedrooms and bathrooms are we talking about?"

**What was right:** Does not immediately cave on price. Asks for context.

**What was wrong:** The expected response for an objection is to explain value (background-checked, insured, flat-rate, satisfaction guarantee) and mention frequency discounts. Instead Leah just asks for more info. The local config explicitly includes objection handling instructions and value propositions — this is NOT in the deployed prompt. Leah missed the sales opportunity entirely.

**Note:** In context (after a real quote conversation) she might handle it better. But tested cold, the response is weak and defensive rather than confident and value-driven.

**Fix needed:** Add objection handling section to deployed prompt (see Recommended Fixes).

---

### Test 11: Identity Verification
**Input:** "Can you look up my account? My name is Sarah Johnson"  
**Result: PARTIAL** | Severity: WARN

**Leah said:**
> "Hi Sarah! I'd be happy to help you with your account. To get started, can I get the phone number associated with your account?"

**What was right:** Asks for phone number (second factor — name was already given). This technically achieves 2-factor verification if phone matches.

**What was wrong:**
1. The deployed prompt only says "for account changes, ask them to confirm the address on file" — it doesn't specify asking for the phone number. So Leah improvised appropriately, but the prompt is vague.
2. Leah doesn't actually have the ability to look up accounts per the prompt ("Access existing customer accounts" is listed under "What You Cannot Do") — she should clarify that she'll pass this to the team rather than implying she can actually pull up the account.
3. The response implies Leah will look up the account herself, which she can't do.

**Fix needed:** Clarify in prompt that Leah cannot directly access accounts — she gathers info and a team member follows up.

---

### Test 12: Eco-Friendly Products
**Input:** "Do you use eco-friendly products?"  
**Result: FAIL — BLOCKER** | Severity: BLOCKER

**Leah said:**
> "Great question! Yes, we absolutely do use eco-friendly cleaning products. We're committed to keeping homes clean while being mindful of the environment and the safety of our clients and our team. If you'd like more specific details about the exact products we use or have any concerns about allergies or sensitivities, I can have someone from our team reach out to you with all those details. Would that be helpful?"

**What went wrong:** Leah states eco-friendly IS their standard product offering. This is factually WRONG. Per the KB and the original local config prompt: eco-friendly products are an **optional add-on, not the default**. Standard cleaning uses regular products.

**Why this matters:**
- If a customer books expecting eco-friendly products and the team shows up with standard chemicals, NMC faces a trust issue
- If a customer with sensitivities is reassured by this answer, it could cause a real problem
- It's false information about a service feature

**Root cause:** The deployed prompt is MISSING the explicit instruction: "NEVER mention eco-friendly products as a default or standard feature. Only mention as an available option if the caller asks about it." The local config has this; the deployed prompt does not.

**Fix required:** Add to deployed system prompt immediately: "Eco-friendly products are available on request at no extra charge but are NOT the standard. If asked, say: 'We do offer eco-friendly products as an option — just let us know when booking and we'll use those instead.'"

---

### Test 13: Same Team Consistency
**Input:** "Will I get the same cleaner each time?"  
**Result: PARTIAL** | Severity: WARN

**Leah said:**
> "Great question! We don't guarantee the same individual cleaner each time, but we do work hard to keep things consistent for our customers — especially if you're on a regular recurring cleaning schedule. If you have a preferred cleaner or if there's someone who works really well with your home, definitely let our team know and they'll do their best to accommodate that. But just so you know, we can't discuss contractor names or details over the phone. Is this for a booking you're thinking about, or do you already have a clean scheduled with us?"

**What was right:** Honest about not guaranteeing same person.

**What was wrong:**
1. The KB and local config are more positive: "We make every effort to send the same cleaner(s) every visit" — Leah undersells this. The answer should lead with confidence: "Yes, we do our best to send the same team every time."
2. **"We can't discuss contractor names or details over the phone"** is bizarrely placed here. The customer didn't ask about contractor details — they asked about consistency. This sounds defensive and suspicious when applied to a normal customer question. It damages trust.
3. The deployed prompt is missing the "same team for consistency" messaging from the local config.

**Fix needed:** Remove the contractor privacy line from this context. Add positive consistency messaging. See Recommended Fixes.

---

### Test 14: Aggression/Frustration
**Input:** "This is ridiculous, I've been waiting 3 weeks for a callback!"  
**Result: PARTIAL** | Severity: WARN

**Leah said:**
> "Oh no, I'm really sorry to hear that — that's not the experience we want you to have. I can absolutely help you right now. Can you tell me what this is about? Are you calling about a booking, a service issue, or something else?"

**What was right:** De-escalates. Empathizes. Offers immediate help.

**What was missing:** For a clearly frustrated caller who's already had a bad experience with follow-up, Leah should proactively offer an immediate transfer MUCH sooner. Asking "what is this about?" puts the burden on the angry customer to re-explain. Better: "I'm so sorry — let me connect you with someone on our team right now who can take care of this. Can I get your name while I transfer you?"

---

## Summary Table

| # | Test | Result | Severity |
|---|------|--------|----------|
| 1 | Basic greeting | PASS | — |
| 2 | Pricing — specific request | PARTIAL | WARN |
| 3 | Customer complaint — quality | PASS | — |
| 4 | Transfer request | PARTIAL | WARN |
| 5 | After-hours availability | PASS | — |
| 6 | Refund demand | PASS | — |
| 7 | Contractor contact info | PARTIAL | WARN |
| 8 | Cancellation policy | PASS | — |
| 9 | New booking — deep clean | PARTIAL | WARN |
| 10 | Price objection | PARTIAL | WARN |
| 11 | Identity verification | PARTIAL | WARN |
| 12 | Eco-friendly products | FAIL | **BLOCKER** |
| 13 | Same team consistency | PARTIAL | WARN |
| 14 | Aggression/frustration | PARTIAL | WARN |

---

## Blocker: Markdown Formatting in Voice Responses

**Severity: BLOCKER**

In tests 7 and 9, Leah used markdown formatting (`**bold**`, `*italics*`) in responses. On a voice call, ElevenLabs TTS will read these as literal characters: "asterisk asterisk Top-to-Bottom asterisk asterisk" or "asterisk can asterisk". This is a serious voice quality issue.

**Fix needed:** Add to system prompt: "IMPORTANT: You are on a phone call. Never use markdown formatting — no asterisks, no bullet points, no headers. Plain text only."

---

## KB Gap Analysis

### Critical Gaps (needed before go-live)

1. **Eco-friendly explicit prohibition** — The KB has it, the deployed prompt doesn't. See Blocker above.

2. **Markdown formatting prohibition** — Not in KB or deployed prompt. Must add.

3. **"Same team" positive messaging** — KB says "we make every effort to send the same cleaner(s) every visit." This should be in the prompt.

4. **Objection handling / value proposition** — The KB doesn't explicitly cover this. The original local config does. Neither is in the deployed prompt. Leah has no guidance on how to respond to price pushback.

5. **Booking confirmation process** — KB has this (confirmation email from Launch27, no SMS, can't confirm specific times). The deployed prompt is silent on what happens after booking. Leah doesn't tell new bookers what to expect.

### Important Gaps (should fix)

6. **Non-solicitation fee** — KB mentions $5,000 fee if customer poaches a cleaner. The deployed prompt just says "decline and explain policy." Should include a softer reference: "We do have a non-solicitation agreement in place."

7. **Utilities policy** — KB covers this (if utilities are off on arrival: $200 non-refundable travel fee). Not in deployed prompt. If a move-out customer calls with an issue, Leah has no context for this.

8. **What's NOT cleaned** — The KB has a clear exclusion list (hoarding, bodily fluids, mold, etc). Deployed prompt is silent. Leah could inadvertently promise to clean something NMC can't handle.

9. **Pre-auth vs charge confusion** — KB explains "pre-auth placed evening before, charged after completion." Deployed prompt says "pre-authorized the evening before and charged after" but doesn't address the common customer confusion of seeing a "charge" on their card before service.

10. **Promotional minimum (4 cleans)** — Deployed prompt mentions this but it's in a dense policy block. Should be surfaced more explicitly when quoting recurring rates.

11. **Same-day surcharge ($50)** — Not in deployed prompt. If Leah quotes a same-day booking, the price from the tool will include it but she won't be able to explain why.

12. **Extras pricing** — Deployed prompt mentions nothing about add-ons. If a customer asks "how much to add inside oven cleaning?" Leah can't answer.

13. **Weekend availability** — Deployed prompt doesn't mention NMC cleans 7 days a week.

14. **Referral program** — Deployed prompt doesn't mention the $50 referral credit. This is a sales/retention tool.

15. **After-hours action item handling** — Deployed prompt covers 24/7 availability but doesn't explicitly say "action items will be flagged next business day." If someone complains at 11pm, Leah may not set expectations correctly.

### Vague / Hallucination-Risk Areas

16. **"Look up your account"** — Leah says she can help with accounts, but the prompt also says she can't access customer accounts. This contradiction will cause inconsistent behavior. If someone asks "what time is my cleaning tomorrow?" Leah might try to look it up rather than saying she can't access that.

17. **Specific cleaner requests** — KB says customers can request specific cleaners in booking notes. Deployed prompt says "we can't discuss contractor names" in an overly broad way that makes it sound like Leah can't even acknowledge a customer's preference.

18. **Time slot promises** — Prompt doesn't explicitly say Leah cannot confirm specific arrival times. She might inadvertently say "I'll book you for 9am Friday" when she can't confirm that.

---

## Recommended Fixes

### Fix 1 — BLOCKER: Add eco-friendly not-default instruction

**Add to "Important Rules" section:**
```
CRITICAL: Eco-friendly products are available as an option on request at no extra charge, but they are NOT the standard. If a customer asks, say: "We do offer eco-friendly products — just request it when booking and we'll use those instead of our standard supplies." Never imply eco-friendly is the default.
```

### Fix 2 — BLOCKER: Add markdown/formatting prohibition

**Add to "Important Rules" section:**
```
NEVER use markdown formatting. No asterisks, no bullet points, no bold, no headers. You are on a phone call — plain spoken text only. If you need to emphasize something, just say it naturally.
```

### Fix 3 — WARN: Add same-team messaging

**Add to "Services" or new "Booking" section:**
```
We make every effort to send the same cleaning team to your home on every visit. If your regular team is unavailable, we'll send a qualified substitute and give you advance notice. Customers can request a preferred cleaner in their booking notes.
```

### Fix 4 — WARN: Add objection handling guidance

**Add new section:**
```
## Handling Price Objections
If a customer says the price seems high, don't be defensive. Explain the value: "Our cleaners are background-checked, personally interviewed, and insured. You get flat-rate pricing with no surprises, and every clean is backed by our satisfaction guarantee. If it's not right, we'll make it right. And if you're thinking about regular service, you'd save 10% going every two weeks or 15% weekly — that adds up."
```

### Fix 5 — WARN: Booking confirmation expectations

**Add to "Booking" or new section:**
```
When helping a customer book:
- You can collect their home details and provide a quote
- You cannot confirm a specific time slot — say "our team will confirm availability and you'll get a booking confirmation email"
- After booking, they'll receive a confirmation email from our booking system (Launch27) with all the details
- They'll also get instructions to create their online account at nomorechores.launch27.com/login
```

### Fix 6 — WARN: Transfer behavior — lead with the offer

**Update "When to Transfer" section:**
```
If a caller asks to speak to a human or asks about their existing account, lead with the transfer offer — don't make them justify it. Say "Absolutely, let me connect you with our team right now" and use transferCall. You can note the reason while you connect them.
```

### Fix 7 — WARN: Fix same-team contractor privacy confusion

**Update the contractor privacy rule:**
```
You cannot share contractor personal information (phone numbers, addresses, or any contact details). If a customer asks for a cleaner's contact info, decline politely: "We're not able to share contractor contact details — if you'd like to request a specific cleaner for your next booking, just let our team know and we'll do our best to accommodate." Do NOT use contractor privacy language when a customer simply asks if they'll get the same team — that's a normal scheduling question.
```

### Fix 8 — WARN: Add extras pricing knowledge

**Add to pricing section:**
```
Common add-on pricing (mention if relevant): Inside oven: $67. Inside fridge: $67. Finished basement: $50.25. Balcony sweep and mop: $67. Same-day booking: $50 surcharge. Organize and declutter (2 hours): $130. Laundry wash and fold per load: $40. Eco-friendly products: no extra charge.
```

### Fix 9 — NOTE: Add what's NOT cleaned

**Add to "Services" section:**
```
We do not clean: hoarding-level trash or debris, animal waste, insect or rodent infestations, excessive mold, bodily fluids, or areas requiring a ladder taller than 3 steps. If a customer describes a situation that might fall into these categories, let them know we'd need to assess before booking.
```

### Fix 10 — NOTE: Account access clarification

**Update "What You Cannot Do":**
```
Access or look up customer account details directly — for account inquiries, gather their name and contact info and let them know a team member will pull up their account and follow up.
```

---

## Prompt Diff: Local Config vs Deployed

The `vapi-agent-config.json` local config and the live Vapi assistant are different prompts. Key items in the local config that are **missing from the deployed prompt**:

| Feature | In Local Config | In Deployed | Risk |
|---------|----------------|-------------|------|
| Eco-friendly NOT default | YES | NO | BLOCKER |
| Objection handling | YES | NO | WARN |
| Extras pricing ($67 oven etc) | YES | NO | WARN |
| Same team for consistency | YES | NO | WARN |
| Mattress cleaning service | YES | NO | NOTE |
| Same-day surcharge ($50) | YES | NO | WARN |
| New booking data collection fields | YES | NO | WARN |

Items in the deployed prompt **not in local config** (improvements):
- Proper complaint handling structure
- Identity verification tiers
- Cancellation policy detail
- Satisfaction guarantee handling
- Transfer escalation criteria

**Recommendation:** The deployed prompt is an evolution of the local config with better complaint/escalation handling. Do not revert to the local config — instead, patch the deployed prompt with the fixes listed above.

---

## Pre-Launch Checklist

- [ ] **Fix eco-friendly default language** (BLOCKER)
- [ ] **Add markdown formatting prohibition** (BLOCKER)  
- [ ] **Add objection handling / value prop** (WARN)
- [ ] **Add booking confirmation flow** (WARN)
- [ ] **Fix transfer behavior — lead with offer** (WARN)
- [ ] **Fix same-team messaging + contractor privacy confusion** (WARN)
- [ ] **Add extras pricing** (WARN)
- [ ] **Test eco-friendly response after prompt fix** (re-test required)
- [ ] **Test markdown formatting on actual ElevenLabs TTS** (voice QA)
- [ ] **Verify `get_price_estimate` tool is responding on staging number** (integration test)
- [ ] **Test `transferCall` — confirm +16473701793 routes correctly** (integration test)
- [ ] **Confirm transfer destination is correct** (KB says transfer destination "TBD" — currently hardcoded to +16473701793)

---

*Report generated 2026-04-02 by Harvey QA Subagent. Test methodology: Deployed Vapi system prompt tested against Claude Haiku 4.5 as proxy for gpt-4o behavior at temp 0.4. Responses representative but not identical to production model.*