# CleanOS Deep Research Report (ChatGPT, March 2026)

CleanOS Deep Research Report for AI-Native Cleaning Business SaaS
Executive framing for CleanOS
You’re building “CleanOS” to replace a three-tool spine (Launch27 + GoHighLevel + Aircall) with a single,
AI-native operating system for cleaning companies, starting by dogfooding in No More Chores (Toronto,
contractor model, ~10 years in market, ~$713K/year). Based on where the market is in March 2026, the
highest-leverage product thesis is not “add AI to scheduling,” because incumbents are rapidly adding AI
across voice, marketing, and analytics. It is to make AI the default interface to the operational truth of a
cleaning business (availability, pricing rules, recurring schedules, exceptions, payments, and handoff)
while keeping the underlying workflows cleaning-specific (recurring residential work, teams/contractors,
add-ons, lockouts, reschedules, supplies, payouts, and quality loops). This is also the vector where
“generic FSM” products (Jobber, Housecall Pro, Workiz, ServiceTitan) tend to be overbuilt or mismatched
for maid services, while “maid-specific” products often lag behind in modern automation and AI-native UX
(based on recurring complaints about glitches, limitations, and workflow friction).
A key 2026 reality check: AI receptionist is now table stakes among mid-market field-service platforms.
Jobber has an “AI Receptionist” positioned as answering calls and texts and providing “24/7 customer
service” on call or text. Housecall Pro has an “AI Team” built into the product (no beta/waitlist per their
help center article). Workiz markets “Genius Answering” as an AI receptionist/dispatcher and states it
uses language models from OpenAI and ElevenLabs. ServiceTitan positions “Titan Intelligence” and
“Atlas” as “AI for the Trades” / an “AI sidekick.”
So the differentiation isn’t “we have a bot.” It’s: (a) the bot can safely and correctly execute booking +
quoting + scheduling for cleaning businesses, (b) it fails gracefully with human escalation, and (c) the
entire system is built around multi-tenant AI + data isolation from day one.
Competitive landscape for maid and cleaning software in March 2026
Market segmentation that matters for your roadmap
By March 2026, there are two main categories you’re competing with:
The “maid-specific scheduling stacks” (ZenMaid, Launch27/Automaid, BookingKoala, Maidily, MaidCentral)
specialize in online booking forms, recurring schedules, reminders, basic CRM, and cleaner/team
coordination. Their competitive wedge is fit-for-maids workflows and terminology. Pricing spans from
low-cost SMB plans to high-touch “ops platform” pricing (MaidCentral).
The “general field service management (FSM) platforms” (Jobber, Housecall Pro, Workiz, ServiceTitan) target
a broader trades market. They tend to have stronger ecosystems (payments, reporting, pipelines, add-ons),
but maid owners commonly report mismatch around long-running/repeating workflows, booking UX, and
cost scaling with users. 
Side-by-side snapshot of relevant incumbents
Prices below are as displayed on vendor pages and/or reputable aggregators as crawled in early March
2026; vendors frequently A/B test pricing pages and offer different monthly vs annual pricing. All prices
shown are USD unless noted.

| Product | Positioning | Entry pricing signal (USD/mo) | Pricing model signals | Notable 2026 “AI” direction | Common complaint signals from reviewed sources |
| --- | --- | --- | --- | --- | --- |
| ZenMaid | Maid-service scheduling + automation | $19/mo Starter; $39 Pro; $49 Pro Max | Starter capped at 40 appointments/month; comms templates and app features expand by tier | Not marketed as “AI receptionist” leader; focuses on automation & ops tooling | G2 “pros/cons” summary flags “slow and glitchy” as a common issue; one Software Advice review requests mileage tracking + per-contact comm preference toggles |
| Launch27 (Automaid) | Maid booking + scheduling automation | $75 Base; $150 Pro; $299 Plus | Unlimited users/bookings in Base; higher tiers remove branding + add integrations like Zapier/Twilio/QB | Not positioned as AI-native; integrations exist (Twilio, Zapier) | Review-summary “Cons” include “features you pay for…aren’t working properly” + “inflexibility” complaints |
| BookingKoala | Booking + marketing modules | Starter $27/mo; Growing $57/mo; Premium $197/mo | Pricing scales by “providers” or storage; includes marketing capabilities at Premium | Not a leading AI receptionist narrative; more “all-in-one” website/forms/automation | Review notes learning curve: building forms “you need to get used to” |
| Maidily | Cleaning-specific OS (“pay per job”) | Free $0 (≤10 jobs/mo); Start $29; Grow $49; Scale $99 | Unlimited users; pay based on jobs; dedicated SMS number is +$10/mo; “AI-powered scheduling” appears in Scale feature grid | Explicitly includes “AI setup assistant” and “AI-powered scheduling” (Scale) | Competes directly on no per-seat fees + comms; strong value positioning (but still early-market vs giants) |
| MaidCentral | High-end platform for multi-location ops | Starts at $450/mo | Scales by physical locations + completed jobs/location | Not primarily known for AI reception at SMB price points | High price creates a gap for SMB/mid-market who need sophistication without enterprise spend |
| Jobber | General FSM for home services | Core “starting at” $29/mo (annual promo); add’l users $29/user on team tiers | Clear per-user scaling; Connect includes 5 users; Grow includes 10 users; add more at $29/user | Jobber “AI Tools” and “AI Receptionist” are explicit product pillars | “Cons” summary includes integration issues and “glitches and bugs”; another review complains about long-running project workflows (resetting job forms) |
| Housecall Pro | General FSM + marketing add-ons | Basic $59/mo billed annually ($79 monthly); Essentials $149/$189; MAX $299/$329 | Plans tied to user count (Basic 1 user; Essentials up to 5) + add’l users ($35/mo each) | “AI Team” built directly into accounts (no beta/waitlist) | Review-summary “Cons” highlight workflow disruption from invoice/image sending changes + email-history concerns |
| Workiz | General FSM with integrated phone | Lite free; Kickstart $225 / $187 (annual); Standard $275 / $229; Pro $325 / $270; Ultimate “Let’s talk” | Communication/phone add-ons sold separately; AI Answering is sold separately and requires phone plan | “Genius Answering” AI receptionist and AI toolset; Workiz help doc states it uses OpenAI + ElevenLabs models | Review describes severe issues for multi-property clients/sub-address handling + poor support + inefficient card processing |
| ServiceTitan | Enterprise FSM suite | Packages are per-technician; “Request Pricing” | Per-technician pricing; heavy modular “Pro Products” ecosystem | “Titan Intelligence” and “Atlas” positioned as AI for trades and in-field sidekick | Generally too expensive/complex for typical maid SMBs; creates down-market opportunity |

What owners complain about most and where the gaps are
Cost scaling with users is a recurring structural complaint driver for cleaning companies, because maid
businesses frequently have many part-time cleaners and contractors who “need access” (schedule visibility,
work orders, checklists, payouts) but do not justify $20–$35+/seat pricing. Jobber explicitly charges $29 per
added user in team plans. Housecall Pro Essentials includes up to 5 users and lists “additional users
($35/mo each).” Workiz similarly prices tiers by included users and lists per-member costs for Standard/Pro
(with separate pricing for annual vs monthly). Maidily is explicitly positioning against this by including
“unlimited users” and charging based on jobs, plus a dedicated SMS number add-on.
Gap: a modern platform that supports contractor-heavy rosters without per-seat punishment, while still
offering robust permissions/auditing.
Reliability and UX friction (bugs, glitches, “features not working”) shows up in review summaries across
platforms. Launch27/Automaid review-summary “Cons” include “Many of the features you pay for or that
are promised aren't working properly.” Jobber’s review summary lists “glitches and bugs” and
“integration issues.” ZenMaid’s G2 pros/cons summary flags “slow and glitchy.”
Gap: an “AI-native” product that is also “ops-native,” meaning the core scheduling data model and
integrations are boringly robust, because AI becomes useless when the calendar truth is wrong.
Booking + quoting flexibility is still under-solved for cleaning compared to trades. Maidily highlights
multi-model quoting (room-based, sqft, flat rate, add-ons) on its feature pages, and includes quoting and
job automation in its platform narrative. BookingKoala reviews suggest form-building and setup can be
non-trivial for some users (“forms…you need to get used to”).
Gap: a quote engine designed around cleaning-specific pricing rules (recurring discounts, first-clean
premiums, add-on bundles, service area constraints) that is also machine-operable by your AI agent
(i.e., deterministic tools + schemas).
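A minimal sketch of what a deterministic, machine-operable quote function could look like. Every name, field, and number here is hypothetical and chosen for illustration; none comes from a vendor or from No More Chores pricing. The policy that a first clean carries a premium instead of the recurring discount is also an assumption.

```typescript
type Frequency = "one_time" | "weekly" | "biweekly" | "monthly";

// Hypothetical per-tenant pricing rules; any values used with this type are illustrative.
interface PricingRules {
  baseCents: number;
  perBedroomCents: number;
  perBathroomCents: number;
  addOnCents: Record<string, number>;
  recurringDiscountPct: Record<Frequency, number>;
  firstCleanPremiumPct: number;
  servicePostalPrefixes: string[]; // service-area constraint
}

interface QuoteRequest {
  bedrooms: number;
  bathrooms: number;
  addOns: string[];
  frequency: Frequency;
  isFirstClean: boolean;
  postalCode: string;
}

type QuoteResult =
  | { ok: true; totalCents: number; lines: { label: string; cents: number }[] }
  | { ok: false; reason: "OUT_OF_SERVICE_AREA" | "UNKNOWN_ADD_ON"; detail: string };

function getQuote(rules: PricingRules, req: QuoteRequest): QuoteResult {
  const postal = req.postalCode.toUpperCase();
  const inArea = rules.servicePostalPrefixes.some((p) => postal.indexOf(p) === 0);
  if (!inArea) return { ok: false, reason: "OUT_OF_SERVICE_AREA", detail: req.postalCode };

  const lines = [
    { label: "base", cents: rules.baseCents },
    { label: "bedrooms", cents: rules.perBedroomCents * req.bedrooms },
    { label: "bathrooms", cents: rules.perBathroomCents * req.bathrooms },
  ];
  for (const addOn of req.addOns) {
    const cents = rules.addOnCents[addOn];
    // Unknown add-ons are rejected with a reason code instead of being guessed at.
    if (cents === undefined) return { ok: false, reason: "UNKNOWN_ADD_ON", detail: addOn };
    lines.push({ label: "add_on:" + addOn, cents });
  }
  const subtotal = lines.reduce((sum, line) => sum + line.cents, 0);

  // Assumed policy: first visit gets a premium and no discount; later visits get the discount.
  if (req.isFirstClean) {
    const premium = Math.round((subtotal * rules.firstCleanPremiumPct) / 100);
    lines.push({ label: "first_clean_premium", cents: premium });
  } else {
    const pct = rules.recurringDiscountPct[req.frequency];
    if (pct > 0) lines.push({ label: "recurring_discount", cents: -Math.round((subtotal * pct) / 100) });
  }
  const totalCents = lines.reduce((sum, line) => sum + line.cents, 0);
  return { ok: true, totalCents, lines };
}
```

Returning itemized lines and explicit failure reasons, instead of a bare number, gives the agent something it can read back to a caller and gives a human something to audit.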
Communication recordkeeping is a sharp edge where “AI receptionist hype” often breaks down
operationally. Housecall Pro review-summary “Cons” flag three things: email attachment changes that
cause inefficiency, the lack of email history retrieval, and Twilio concerns. Together these show how
comms details can become operationally high-stakes.
Gap: a unified “conversation → booking → invoice → follow-up” thread that is searchable and reliable,
with one canonical customer record.
AI in home services in March 2026
Who is doing “AI booking / AI receptionist” now
Incumbent FSM vendors are shipping AI as first-class product surfaces, not experiments:
Jobber markets “Jobber AI” as learning how you run your business and suggesting automations, follow-ups,
drafting quotes, and flagging jobs. Jobber’s “AI Receptionist” is positioned as answering calls and texts
and delivering “24/7 customer service” over call or text.
Housecall Pro’s help-center article states “The AI Team is now available” with “no Beta program or waitlist,”
describing AI teammates that can answer the phone, book jobs, generate custom reports, and provide
guidance.
Workiz (with integrated phone focus) markets “Genius Answering” as an AI receptionist/dispatcher.
Their help document explicitly says Genius Answering “uses industry-leading language models from OpenAI
and ElevenLabs” to converse naturally and gather info.
ServiceTitan positions “Titan Intelligence” and “Atlas” as “AI for the Trades” / “AI sidekick,” emphasizing using
company data and external docs like manuals and spec sheets, plus roadmap items like office capabilities.
What’s working in production vs what’s still hype
What reliably works today is “front-door AI” that captures leads, reduces missed calls, and collects
structured details, because the task is bounded if you design the toolchain correctly. This is aligned with
how modern tool calling is supposed to work: the model selects tools, your app executes them, and the
model continues with the tool outputs. It’s also aligned with the growing industry move toward
structured outputs (JSON schema / strict tool outputs) to reduce “LLM drift” in operational actions.
Where hype still creeps in is fully autonomous “AI dispatch” and “AI scheduling” when the system of record
is fragmented and policies are messy. Maidily explicitly lists “AI-powered scheduling” only at its highest
“Scale” tier, suggesting it’s still a premium capability with real complexity. ServiceTitan’s Atlas page
shows “Atlas in the Office (Coming soon)” while “Atlas in the Field” exists now, reinforcing that office-wide
automation is a harder maturity step.
Voice AI state of the art relevant to CleanOS
For voice agents, the architecture options have hardened into two primary patterns:
- Speech-to-speech “realtime” (lowest latency, most natural turn-taking) via the OpenAI Realtime API,
  designed for low-latency multimodal (audio/text) interactions and commonly used for voice agents.
- Chained pipeline (ASR → text LLM → TTS), which remains reliable and flexible for connecting to tool-heavy
  workflows and existing text agents. OpenAI’s Audio & speech guide explicitly describes these two
  approaches and the tradeoffs.
In your exact context (Twilio phone numbers, live booking on calls, and deterministic scheduling actions),
Twilio Media Streams is the key “telephony substrate” if you want streaming audio control. It provides raw
audio from live calls over WebSockets for near real-time processing. Twilio’s WebSocket message
protocol (Connected/Start/Media/DTMF/Stop/etc.) is well defined and supports bidirectional streaming
patterns.
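A minimal sketch of handling those WebSocket messages on the server side. The field names (`event`, `streamSid`, `start.callSid`, `media.payload`, `dtmf.digit`) and the outbound `clear` message follow Twilio’s published message shapes as best understood here and should be verified against the current Media Streams documentation; `CallState` and both function names are hypothetical.

```typescript
// Subset of inbound Media Streams messages this sketch cares about.
type TwilioStreamMessage =
  | { event: "connected" }
  | { event: "start"; streamSid: string; start: { callSid: string } }
  | { event: "media"; streamSid: string; media: { payload: string } } // payload: base64 audio
  | { event: "dtmf"; streamSid: string; dtmf: { digit: string } }
  | { event: "stop"; streamSid: string };

// Hypothetical per-call state kept by the application.
interface CallState {
  streamSid?: string;
  callSid?: string;
  audioChunks: number;
  digits: string;
  ended: boolean;
}

function handleStreamMessage(state: CallState, raw: string): CallState {
  const msg = JSON.parse(raw) as TwilioStreamMessage;
  switch (msg.event) {
    case "connected":
      return state;
    case "start":
      return { ...state, streamSid: msg.streamSid, callSid: msg.start.callSid };
    case "media":
      // A real handler would forward msg.media.payload to the speech pipeline here.
      return { ...state, audioChunks: state.audioChunks + 1 };
    case "dtmf":
      return { ...state, digits: state.digits + msg.dtmf.digit };
    case "stop":
      return { ...state, ended: true };
    default:
      return state; // ignore message types this sketch does not model
  }
}

// On barge-in, ask Twilio to drop audio that was queued for playback on this stream.
function buildClearMessage(streamSid: string): string {
  return JSON.stringify({ event: "clear", streamSid });
}
```

Keeping the handler a pure function of state and message makes it easy to replay recorded call traffic in tests without a live phone call.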
If you prefer to avoid building and maintaining audio infrastructure yourself, Vapi is explicitly positioned as
a developer platform that handles voice-agent infrastructure (STT/LLM/TTS components, phone/web
integration, and tools). Vapi also supports multi-assistant orchestration (“Squads”) with
context-preserving transfers, which is directly relevant to your “AI receptionist → ops agent → billing agent →
human” handoff vision.
Tech stack validation for a bootstrapped solo founder
Your proposed stack is directionally right for speed, but adjust for “multi-tenant + AI” sharp edges
Your baseline direction (Node.js + Next.js + Postgres + queues + Twilio + Stripe + LLM providers) is consistent
with how modern SaaS teams ship quickly in 2026. The main improvements are about reducing integration
risk and locking in safe multi-tenant patterns early—because retrofitting tenant isolation and “agent safety”
later is painful.
Below are the most decision-relevant validations and changes.
Backend and API shape
Staying in JavaScript/TypeScript is pragmatic. Many modern agent frameworks and vendor SDKs you’ll likely
want are TypeScript-first or TypeScript-friendly: OpenAI’s Agents SDK for TypeScript is explicitly presented as
a lightweight package for building agentic apps (including voice agents), and OpenAI’s Realtime guidance
points to the Agents SDK as a recommended starting point for browser voice agents. Vapi’s platform
also emphasizes developer tooling (including a CLI and MCP integration for IDE correctness), which fits a TS
workflow.
Key recommendation: design the backend around a “tools-first” contract, not around chat transcripts.
Concretely, define internal tools like `get_quote`, `get_availability`, `create_booking`,
`reschedule_booking`, `take_deposit`, `send_invoice`, `handoff_to_human`, then require the
agent to operate via strict structured outputs. This aligns with OpenAI function calling and Structured
Outputs guidance.
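A minimal sketch of one such tool contract plus a server-side argument check. The `ToolDefinition` shape is modeled loosely on JSON-schema-style function definitions with `strict: true` and `additionalProperties: false`; it is an assumption for illustration, not any SDK’s exact type, and the parameter names are hypothetical.

```typescript
type JsonType = "string" | "number" | "boolean";

// Assumed, simplified tool-definition shape (flat parameters only).
interface ToolDefinition {
  name: string;
  description: string;
  strict: true;
  parameters: {
    type: "object";
    properties: Record<string, { type: JsonType; description: string }>;
    required: string[];
    additionalProperties: false;
  };
}

const createBookingTool: ToolDefinition = {
  name: "create_booking",
  description: "Create a booking for an existing customer at a confirmed start time.",
  strict: true,
  parameters: {
    type: "object",
    properties: {
      customer_id: { type: "string", description: "Internal customer identifier" },
      start_iso: { type: "string", description: "Start time, ISO 8601 with offset" },
      duration_minutes: { type: "number", description: "Planned job length in minutes" },
    },
    required: ["customer_id", "start_iso", "duration_minutes"],
    additionalProperties: false,
  },
};

// Re-validate on the server even if the model provider enforces the schema:
// the booking engine should never trust arguments it has not checked itself.
function validateToolArgs(tool: ToolDefinition, args: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const { properties, required } = tool.parameters;
  for (const key of required) {
    if (!(key in args)) errors.push("missing: " + key);
  }
  for (const key of Object.keys(args)) {
    const spec = properties[key];
    if (spec === undefined) errors.push("unexpected: " + key);
    else if (typeof args[key] !== spec.type) errors.push("wrong type: " + key);
  }
  return errors;
}
```

An empty error list means the call may proceed to the booking engine; anything else can be returned to the model as a tool error so it can correct itself or escalate.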
Postgres + pgvector and the “single-database” advantage
Postgres remains a strong default for your case because you need both structured scheduling data and
searchable unstructured conversation artifacts. If you keep embedding nearest-neighbor search inside
Postgres (pgvector), you reduce moving parts; OpenAI’s embedding-model documentation shows how
embedding models are invoked via their embeddings endpoint, and OpenAI cookbook examples reference
using `text-embedding-3-small` for embeddings workflows.
Where you must be careful is not “do we use pgvector,” but “how do we enforce tenant isolation at the
database layer so one missed `WHERE tenant_id = ...` can’t leak data.” Postgres Row Level Security
(RLS) exists specifically to restrict which rows are visible/modifiable per role/user.
Prisma and RLS: feasible, but plan it intentionally
Using Prisma is workable in a multi-tenant Postgres world, but you need a standard pattern for RLS context
passing. Multiple guides now exist for using Postgres RLS with Prisma. The key is to avoid an “app-only
filter” strategy that relies on every developer remembering `tenantId` conditions forever; RLS exists to
move this protection into the database.
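A minimal sketch of one common RLS context-passing pattern: a policy compares each row’s `tenant_id` to a transaction-local setting, and the application sets that value at the start of every transaction. `SqlClient` is a hypothetical stand-in interface (not Prisma’s API), and the `app.tenant_id` setting name, the `tenant_isolation` policy name, and the UUID column type are assumptions.

```typescript
// Hypothetical minimal client; Prisma or node-postgres would sit behind an adapter like this.
interface SqlClient {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// DDL for one tenant-scoped table. The name is validated because it is interpolated.
function rlsPolicySql(table: string): string[] {
  if (!/^[a-z_][a-z0-9_]*$/.test(table)) throw new Error("invalid table name: " + table);
  return [
    "ALTER TABLE " + table + " ENABLE ROW LEVEL SECURITY",
    "ALTER TABLE " + table + " FORCE ROW LEVEL SECURITY",
    "CREATE POLICY tenant_isolation ON " + table +
      " USING (tenant_id = current_setting('app.tenant_id')::uuid)" +
      " WITH CHECK (tenant_id = current_setting('app.tenant_id')::uuid)",
  ];
}

// Runs `work` inside a transaction whose tenant context is set via set_config(..., true),
// which scopes the setting to the current transaction. All statements must share one
// connection, so `db` should be a single checked-out connection, not a pool.
async function withTenant<T>(
  db: SqlClient,
  tenantId: string,
  work: (db: SqlClient) => Promise<T>
): Promise<T> {
  if (!UUID_RE.test(tenantId)) throw new Error("tenant context required");
  await db.query("BEGIN");
  try {
    await db.query("SELECT set_config('app.tenant_id', $1, true)", [tenantId]);
    const result = await work(db);
    await db.query("COMMIT");
    return result;
  } catch (err) {
    await db.query("ROLLBACK");
    throw err;
  }
}
```

Because `current_setting('app.tenant_id')` raises an error when the setting is absent, a query that runs outside `withTenant` fails closed instead of silently returning every tenant’s rows.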
Queues/jobs: BullMQ is fine, but check licensing against your SaaS ambitions
BullMQ is widely used as a Redis-backed job queue (GitHub project; active releases). However, BullMQ’s
own site states: “Standard license available for organizations with fewer than 100 employees. Larger
organizations require Enterprise license.”
That may be totally fine for Phase 1–2 (dogfooding + early SaaS), but it can become a commercial constraint
later as you scale customers or if acquirers care about licensing. For a bootstrapped path, the fastest
approach is still: use BullMQ now, but document an exit path (e.g., compatible abstractions around job
scheduling and retries).
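A minimal sketch of that kind of abstraction: application code depends on a small `JobQueue` interface, and a queue library is wired in behind it. The interface, the in-memory implementation, and the default of three attempts are all hypothetical; no BullMQ API is used or implied here.

```typescript
type JobHandler = (payload: unknown) => Promise<void>;

// The only queue surface the rest of the app is allowed to import.
interface JobQueue {
  enqueue(name: string, payload: unknown, opts?: { attempts?: number }): Promise<void>;
}

interface PendingJob {
  name: string;
  payload: unknown;
  attemptsLeft: number;
}

// In-memory implementation for tests and local development. A production adapter
// would implement the same interface on top of a real queue and add backoff delays.
class InMemoryJobQueue implements JobQueue {
  private pending: PendingJob[] = [];
  readonly failed: { name: string; error: string }[] = [];

  async enqueue(name: string, payload: unknown, opts: { attempts?: number } = {}): Promise<void> {
    this.pending.push({ name, payload, attemptsLeft: opts.attempts ?? 3 });
  }

  // Processes jobs until none remain, re-queuing failures while attempts are left.
  async drain(handlers: Record<string, JobHandler>): Promise<void> {
    let job = this.pending.shift();
    while (job !== undefined) {
      try {
        const handler = handlers[job.name];
        if (handler === undefined) throw new Error("no handler for " + job.name);
        await handler(job.payload);
      } catch (err) {
        job.attemptsLeft -= 1;
        if (job.attemptsLeft > 0) this.pending.push(job);
        else this.failed.push({ name: job.name, error: String(err) });
      }
      job = this.pending.shift();
    }
  }
}
```

If licensing or scale ever forces a move, only the adapter behind `JobQueue` changes; the code that enqueues follow-ups and reminders stays untouched.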
Communications UI: Chatwoot is a reasonable “buy vs build” acceleration
Chatwoot positions itself as an “AI-powered, open-source customer support platform” for omnichannel
conversation management, usable self-hosted or cloud. For CleanOS, Chatwoot is attractive because it
can serve as your human handoff console while you build your own cleaning-specific CRM and scheduling
UI. The trade is that you’ll need to integrate Chatwoot identity (tenant/company, customer linkage) cleanly.
CRM and marketing automation: consider “embedded lightweight CRM” first, not a full CRM transplant
Twenty is a major open-source CRM project (GitHub shows very high adoption and active releases) and is
explicitly positioning as a modern alternative to Salesforce. Its docs emphasize developer-friendly APIs
(REST/GraphQL) for extensions.
But embedding a full CRM can become a product gravity well. Maid owners want the “few CRM fields that
matter” (addresses, access notes, pets, recurring preferences, billing tokens, cleaner assignment history)—
not Salesforce complexity. A practical compromise is:
- Build a cleaning-native customer/profile model inside CleanOS.
- Expose integrations outward (QuickBooks, Zapier-like workflows).
- Use Twenty or similar only if/when you need “pipeline CRM” beyond cleaning ops.
For marketing automation, Mautic is explicitly “free and open source marketing automation software.” 
It can work, but it’s another operational surface area. For a solo founder, many lifecycle messages (quote
follow-ups, schedule confirmations, review asks, churn winbacks) can be implemented as code-driven
workflows on top of your queue.
Supabase vs “raw Postgres”
Supabase can accelerate auth, storage, and database operations, and it emphasizes RLS as the core
authorization mechanism for browser-accessible tables. Its pricing page also makes clear that each
Supabase “project” includes a dedicated Postgres instance and shows paid plans starting with compute
add-ons (e.g., compute “starts from $10/month”) and a Pro plan price point.
For CleanOS, Supabase can be a strong accelerant if you want:
- Fast auth + security primitives, plus RLS-first culture.
- Operational simplicity early (managed Postgres, dashboard tools).
But you’re building a multi-tenant B2B SaaS, not a consumer app where the browser directly queries the DB.
If your API layer remains your primary access path, you can still use “raw Postgres” (managed via your
preferred host) and adopt the same RLS discipline.
Recommendation: if you already have momentum with Postgres + Prisma and will keep DB private behind
your API, do not switch to Supabase solely for speed—unless you decide to adopt Supabase Auth + RLS as a
product pillar.
Reusing an open-source booking foundation
There is no perfect open-source “Launch27 clone” foundation that will drop into a cleaning SaaS, but some
components are worth considering:
If you want an embeddable scheduling/availability primitive, Cal.com is an “open source Calendly successor”
and is explicitly designed as scheduling infrastructure. Cal.com also has an open-source Stripe payment
app/plugin in its repo, which signals extensibility for paid booking flows.
However, cleaning businesses need more than appointment slots: they need recurring job series, team/
cleaner assignment, travel buffers, cleaner capacity, add-on duration adjustments, service-area rules, and
contractor payout logic. Cal.com can at best be a component, not the backbone.
For “open-source FSM,” Odoo is open source (huge GitHub project), and Odoo has a field service app, plus
an Odoo Community Association (OCA) ecosystem with field-service modules. But Odoo is an ERP
gravity system: adopting it pulls you into its data model, deployment, and customization world (significant
scope for a solo founder building a SaaS product).
Practical guidance: reuse open-source only where it’s a clear isolated subsystem (conversation console,
basic scheduling widget, email templates). For the core “cleaning booking + recurring scheduling +
dispatch” engine, build a minimal cleaning-specific version that your AI agent can operate deterministically.
Multi-tenant architecture patterns in 2026 with Next.js + PostgreSQL
The three tenant isolation models and what they mean for CleanOS
The multi-tenant decision is less about “what’s academically best” and more about “what stops catastrophic
mistakes while you ship fast.”
Database-per-tenant maximizes isolation and simplifies “delete customer” stories, but it increases
operational overhead (migrations across many DBs, connection pooling, monitoring). It’s usually not ideal
for a bootstrapped solo founder unless customers are very large and regulated.
Schema-per-tenant is a middle ground: good isolation and easier per-tenant customizations, but migrations
and query complexity still scale with tenant count.
Shared schema + tenant key (row-level isolation) is typically the fastest to ship and easiest to operate, but it
is also the easiest place to accidentally leak data if you rely only on application-layer filters. This is exactly
what Postgres RLS is designed to mitigate: it restricts which rows are returned or can be modified “on a per-
user basis,” adding enforcement beyond SQL GRANT privileges. 
Recommendation for CleanOS Phase 1–3
For Phase 1 (dogfooding), implement the multi-tenant model even if you have one tenant. It forces early
discipline in the data model, permissions, and agent tools.
For Phase 2–3 (multi-tenant SaaS), use:
- Shared schema with `tenant_id` (or `workspace_id`) everywhere.
- Strong indexing on tenant keys.
- Postgres RLS policies that require tenant context for all customer-facing tables.
Supabase’s docs are direct about why RLS matters: it enables “secure data access,” and Supabase guidance
states RLS should be enabled for tables in exposed schemas. Even if you’re not using Supabase, the
design ethos applies: enforce isolation as close to the data as possible.
If you need a concrete implementation pattern reference: guides exist specifically describing multi-tenant
patterns with Postgres RLS, Prisma, and Next.js. 
Multi-tenant routing and branding in Next.js
On the application side, the most common multi-tenant UX pattern is subdomains per workspace or custom
domains per company. Vercel has public guidance and examples on building multi-tenant SaaS apps with
Next.js and Vercel (notably via wildcard domains and middleware routing). 
CleanOS implications:
- Store tenant routing in a `tenants` table: `id`, `slug`, `subdomain`, `custom_domain`, `brand_config`,
  `default_timezone`, etc.
- Resolve tenant from request host → set tenant context for the entire request lifecycle (API, UI, and
  agent tool calls).
- Never allow “agent tools” to run without explicit tenant context (this is where leaks often happen).
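A minimal sketch of host-based tenant resolution under those rules. The `Tenant` fields mirror the column list above in camelCase; the function names, the `cleanos.example` root domain, and the decision to reject nested subdomains are assumptions. In a real app the lookup would hit the `tenants` table (or a cache) rather than an in-memory array.

```typescript
interface Tenant {
  id: string;
  slug: string;
  subdomain: string;
  customDomain: string | null;
  defaultTimezone: string;
}

// Maps a Host header to a tenant: custom domains first, then <subdomain>.<rootDomain>.
function resolveTenant(hostHeader: string, rootDomain: string, tenants: Tenant[]): Tenant | null {
  const host = hostHeader.toLowerCase().replace(/:\d+$/, ""); // drop any port
  for (const t of tenants) {
    if (t.customDomain === host) return t;
  }
  const suffix = "." + rootDomain;
  if (host.length <= suffix.length || host.slice(-suffix.length) !== suffix) return null;
  const sub = host.slice(0, host.length - suffix.length);
  if (sub.indexOf(".") !== -1) return null; // reject nested subdomains like a.b.root
  for (const t of tenants) {
    if (t.subdomain === sub) return t;
  }
  return null;
}

// Guard for API handlers and agent tools: no tenant, no execution.
function requireTenant(tenant: Tenant | null): Tenant {
  if (tenant === null) throw new Error("tenant context required");
  return tenant;
}
```

Calling `requireTenant` at the top of every tool handler turns a missing tenant into a loud failure instead of a cross-tenant query.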
How incumbents treat “tenant” conceptually
While most incumbents do not publicly document their exact database strategy, their product surfaces
clearly show tenant semantics:
ServiceTitan’s API FAQ explicitly distinguishes between application-level keys and tenant-specific OAuth
credentials (Client ID & Secret) “unique to each ServiceTitan tenant,” which reflects strong tenant delineation
in their platform model. 
Jobber’s developer ecosystem shows a defined platform with app templates and APIs; their engineering
content references modern stacks (React, Ruby on Rails, GraphQL) and teams including AI/ML and
Communications, reinforcing that mature FSM platforms centralize multi-tenant workflows behind APIs.
For your purposes: design CleanOS so “tenant” is a first-class security boundary in both the database and
the agent orchestration layer.
AI agent architecture patterns for booking, quoting, scheduling, and support
Core principle: AI must not “improvise” state changes
The single most important architectural point for a booking/scheduling agent is: natural language is for
conversation; state changes must be tool calls with strict schemas.
OpenAI’s function calling guide lays out the canonical flow: call the model with available tools → model
returns a tool call → your app executes → you send tool output back → model returns final response (or
more tool calls). 
Structured Outputs (via strict function schemas or JSON schema formats) exist to force reliability and
schema adherence, and OpenAI explicitly positions Structured Outputs for connecting models to tools/
functions/data safely. 
CleanOS design implication: your internal “booking engine” must be the source of truth; the agent is a UI
layer that can only act via constrained tools.
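The canonical flow described above can be sketched as a small loop. This is a provider-agnostic illustration: `Model`, `ModelStep`, `HistoryItem`, and `runAgentTurn` are hypothetical types and names, not any SDK’s API, and the step limit and fallback reply are assumptions.

```typescript
type ToolArgs = Record<string, unknown>;

// What the model can return on each step: a tool call or a final reply.
type ModelStep =
  | { kind: "tool_call"; callId: string; name: string; args: ToolArgs }
  | { kind: "final"; text: string };

type HistoryItem =
  | { role: "user"; text: string }
  | { role: "tool_call"; callId: string; name: string; args: ToolArgs }
  | { role: "tool_output"; callId: string; output: string };

type Model = (history: HistoryItem[]) => Promise<ModelStep>;
type ToolHandler = (tenantId: string, args: ToolArgs) => Promise<unknown>;

async function runAgentTurn(
  model: Model,
  tools: Record<string, ToolHandler>,
  tenantId: string,
  userText: string,
  maxSteps = 5
): Promise<{ reply: string; history: HistoryItem[] }> {
  if (tenantId === "") throw new Error("tenant context required");
  const history: HistoryItem[] = [{ role: "user", text: userText }];
  for (let step = 0; step < maxSteps; step++) {
    const next = await model(history);
    if (next.kind === "final") return { reply: next.text, history };
    // State changes happen only here, inside application code, never in model text.
    const handler = tools[next.name];
    const output = handler === undefined
      ? { error: "unknown tool: " + next.name }
      : await handler(tenantId, next.args);
    history.push({ role: "tool_call", callId: next.callId, name: next.name, args: next.args });
    history.push({ role: "tool_output", callId: next.callId, output: JSON.stringify(output) });
  }
  // Step budget exhausted: stop and escalate instead of looping indefinitely.
  return { reply: "I'm connecting you with our team to confirm.", history };
}
```

The model never touches the database; it can only request a named tool, and the handler receives the tenant ID from the application rather than from the model’s arguments.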
Voice + SMS: practical production architecture in 2026
You already have Twilio + Claude working. For CleanOS, the scalable multi-tenant variant likely needs:
Telephony ingestion layer: Twilio Voice webhooks and/or Media Streams. Twilio Media Streams provides raw
audio over WebSockets in near real-time. If you want full streaming (barge-in, interruption control),
implement Twilio’s Media Stream protocol and validate that each request actually comes from Twilio
(Twilio’s request-signature checks are part of the security model).
Voice agent runtime: either your own Realtime API integration or a platform like Vapi.
If building yourself: OpenAI’s Realtime API is explicitly designed for low-latency speech-to-speech and is a
common path for production voice agents. OpenAI also notes the two main ways to build voice agents:
speech-to-speech realtime vs chaining speech-to-text + text model + text-to-speech.
If using Vapi: Vapi describes itself as a developer platform for building voice AI agents, offering both
single-assistant (“Assistants”) and multi-agent (“Squads”) primitives; it also documents tool integrations
for webhooks and built-in actions like call transfers.
For your immediate “ship fast” constraint, Vapi is consistent with your current approach: it can reduce the
engineering surface area of realtime audio while still supporting tool calls into your booking engine. 
Conversation memory: what to store and what not to store
Store three layers:
- Operational state: customer profile, property details, recurring preferences, add-ons, schedule
  constraints (structured tables).
- Conversation artifacts: transcripts, summaries, and extracted entities needed for service delivery
  (e.g., gate code, pets, parking) stored as audit logs.
- Retrieval layer: embeddings for semantic search over policies and notes (e.g., “this customer hates
  phone calls,” “dog is aggressive,” “use side entrance”).
OpenAI embeddings guidance shows how embeddings are obtained via the embeddings endpoint;
cookbook examples show common embedding workflows.
Avoid storing raw “agent chain-of-thought” style reasoning. Focus on: what was said, what was decided,
what tools executed, and why (structured reason codes).
Human handoff that doesn’t break trust
You already have a VA scheduler. In production, handoff must be explicit and fast:
- Agent detects low confidence / policy exceptions / scheduling conflicts → calls `handoff_to_human`.
- System creates a human task in your conversation inbox (Chatwoot) with context package: summary,
  extracted fields, proposed next action, and links to booking/calendar/customer record. Chatwoot is
  explicitly built to manage conversations across channels and can be self-hosted, making it a reasonable
  console for this workflow.
- Agent stays “available” but stops making commitments. It can say: “I’m connecting you with our team to
  confirm availability,” then the human confirms.
This is the difference between “AI receptionist demo” and “AI receptionist product.”
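A minimal sketch of the handoff steps as data plus one guard. All type names, reason codes, and the list of “committing” tools are hypothetical; creating the actual task in Chatwoot (or any inbox) is left out because its API is not covered here.

```typescript
type HandoffReason = "LOW_CONFIDENCE" | "POLICY_EXCEPTION" | "SCHEDULING_CONFLICT";

interface ConversationState {
  tenantId: string;
  conversationId: string;
  handedOff: boolean;
}

// The context package a human sees when picking up the conversation.
interface HandoffTask {
  tenantId: string;
  conversationId: string;
  reason: HandoffReason;
  summary: string;
  extractedFields: Record<string, string>;
  proposedNextAction: string;
  links: string[];
}

// Tools that commit the business to something; blocked once a human owns the thread.
const COMMITTING_TOOLS = ["create_booking", "reschedule_booking", "take_deposit", "send_invoice"];

function handoffToHuman(
  state: ConversationState,
  details: Omit<HandoffTask, "tenantId" | "conversationId">
): { state: ConversationState; task: HandoffTask; customerMessage: string } {
  return {
    state: { ...state, handedOff: true },
    task: { tenantId: state.tenantId, conversationId: state.conversationId, ...details },
    customerMessage: "I’m connecting you with our team to confirm availability.",
  };
}

// The agent can still answer questions after handoff, but cannot make commitments.
function canRunTool(state: ConversationState, toolName: string): boolean {
  return !(state.handedOff && COMMITTING_TOOLS.indexOf(toolName) !== -1);
}
```

Enforcing the “no commitments after handoff” rule in the tool layer, not in the prompt, means the guarantee holds even when the model misbehaves.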
Frameworks teams are using in production right now
In March 2026, teams building agents in production commonly fall into three “stack families,” all of which
can work with your Node/TS preference:
OpenAI-native agent tooling: OpenAI’s Agents SDK for TypeScript is designed for agentic apps with tools,
handoffs, streaming results, and traces; OpenAI provides both SDK docs and a TypeScript documentation
site, plus the JS repo. This is especially relevant if you lean into OpenAI Realtime for voice and
Responses API for tool-heavy interactions.
Graph/workflow orchestration: LangGraph is positioned as a low-level orchestration framework for stateful,
long-running agent workflows (including durable execution concepts and human-in-the-loop). AWS
Prescriptive Guidance also discusses LangChain/LangGraph in the context of agentic AI frameworks,
highlighting tool integration and memory abstractions as key capabilities.
Data/RAG-first agent frameworks: LlamaIndex documents agents as systems that can break down
questions, choose tools, plan tasks, and store memory. This can be useful for your “policy and
playbook” retrieval (service policies, pricing rules, FAQs) and for building internal ops copilots.
For CleanOS: you do not need to pick one framework early. The highest ROI is establishing (1) your internal
tool contract and (2) traceability/auditing of tool actions, then choosing the orchestration approach that
best supports long-running workflows (reschedules, cancellations, failed payments, rebooks).
A reference architecture for CleanOS agents
A production-friendly architecture for your use case looks like:
- Channel adapters: SMS inbound/outbound via Twilio. Voice inbound via Twilio, either via Media Streams
  (DIY) or via Vapi/Twilio integration.
- Agent runtime: Text brain: Claude or OpenAI models (you already do this); for deterministic actions use
  function-calling + strict schemas. Voice brain: OpenAI Realtime API for speech-to-speech or Vapi for
  managed voice orchestration.
- Tool layer (your code): Quote engine (cleaning-specific). Availability engine (recurring rules, buffers,
  travel). Booking engine (atomic create/reschedule/cancel). Payments (Stripe). Accounting sync
  (QuickBooks). Notifications and follow-ups (queue-driven jobs).
- Data layer: Postgres with tenant isolation enforced via RLS once multi-tenant; pgvector for retrieval
  search.
- Human console: Chatwoot for conversation + assignment; integrates with paging/Slack later.
- Observability: Store every tool call + parameters + result + agent version + prompt hash (critical for
  debugging and customer disputes).
This structure is how you get to “Launch27 + Jobber + AI agent” without ending up with a fragile bot stitched
onto a brittle scheduler.
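A minimal sketch of the observability record described above. The record fields and function names are hypothetical. The hash is FNV-1a (32-bit), used only so the example has no imports; it is a non-cryptographic stand-in, and a production system would more likely store a SHA-256 digest of the prompt.

```typescript
interface ToolCallRecord {
  tenantId: string;
  conversationId: string;
  toolName: string;
  params: string; // JSON-encoded arguments as sent to the tool
  result: string; // JSON-encoded tool output
  agentVersion: string;
  promptHash: string;
  at: string; // ISO 8601 timestamp
}

// FNV-1a 32-bit hash, hex-encoded. Non-cryptographic; for illustration only.
function fnv1a32(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("0000000" + (hash >>> 0).toString(16)).slice(-8);
}

function recordToolCall(
  input: {
    tenantId: string;
    conversationId: string;
    toolName: string;
    params: unknown;
    result: unknown;
    agentVersion: string;
    systemPrompt: string;
  },
  now: Date
): ToolCallRecord {
  return {
    tenantId: input.tenantId,
    conversationId: input.conversationId,
    toolName: input.toolName,
    params: JSON.stringify(input.params),
    result: JSON.stringify(input.result),
    agentVersion: input.agentVersion,
    // Store a hash, not the prompt itself, so records stay small and comparable.
    promptHash: fnv1a32(input.systemPrompt),
    at: now.toISOString(),
  };
}
```

With the prompt hash and agent version on every record, a disputed booking can be traced to the exact agent configuration that produced it.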
Bottom-line conclusions for your dev team
Competitive landscape: Maid-specific tools remain strong on basic scheduling/booking, but suffer from
reliability complaints, workflow gaps, and limited AI-native interaction; general FSM tools are rapidly
integrating AI and have deeper ecosystems, but price and workflow-fit issues create a clear opening for a
cleaning-first OS, especially if you solve per-seat cost pain and contractor-heavy operations well.
AI state of the art: AI receptionist + embedded AI assistants are now standard in major FSM platforms.
What’s defensible is not the model; it’s your deterministic tool layer, cleaning-specific policies, and safe
handoff design, implemented via function calling + strict schemas and supported by modern voice stacks
(Realtime API, Twilio Media Streams, or Vapi).
Stack validation: Your stack is reasonable for speed, but treat multi-tenancy and agent safety as
foundational architecture, not “Phase 3 chores.” Pay attention to queue licensing (BullMQ) and be
intentional about whether Supabase provides enough acceleration to justify platform coupling.
https://www.softwareadvice.com/field-service/jobber-profile/
https://www.getjobber.com/features/ai-receptionist/
https://help.housecallpro.com/en/articles/9311875-ai-team-overview
https://help.workiz.com/hc/en-us/articles/34172688818321-How-Genius-Answering-interacts-with-your-customers
https://www.servicetitan.com/features/titan-intelligence
https://developers.openai.com/api/docs/guides/function-calling
https://get.zenmaid.com/pricing
https://www.g2.com/products/zenmaid-software/reviews?qs=pros-and-cons
https://www.softwareadvice.com/field-service/zenmaid-profile/
https://www.launch27.com/pricing/
https://www.softwareadvice.com/field-service/launch27-profile/reviews/
https://www.bookingkoala.com/pricing/
https://www.softwareadvice.com/field-service/bookingkoala-profile/
https://www.maidily.com/pricing/
https://maidcentral.com/pricing/
https://www.getjobber.com/pricing/
https://www.getjobber.com/features/ai/
https://www.housecallpro.com/pricing/
https://www.softwareadvice.com/construction/housecall-profile/reviews/
https://www.workiz.com/pricing-plans/
https://www.workiz.com/features/genius-answering/
https://www.softwareadvice.com/field-service/workiz-profile/
https://www.servicetitan.com/pricing
https://www.maidily.com/features/quoting/
https://developers.openai.com/api/docs/guides/structured-outputs/
https://www.servicetitan.com/features/atlas
https://developers.openai.com/api/docs/guides/realtime
https://developers.openai.com/api/docs/guides/audio
https://www.twilio.com/docs/voice/media-streams
https://www.twilio.com/docs/voice/media-streams/websocket-messages
https://docs.vapi.ai/quickstart/introduction
https://openai.github.io/openai-agents-js/
https://vapi.ai/cli
https://platform.openai.com/docs/guides/embeddings/embedding-models%3F.flac
https://www.postgresql.org/docs/current/ddl-rowsecurity.html
https://atlasgo.io/guides/orms/prisma/row-level-security
https://github.com/taskforcesh/bullmq
https://bullmq.io/
https://www.chatwoot.com/
https://github.com/twentyhq/twenty
https://docs.twenty.com/developers/extend/capabilities/apis
https://mautic.org/
https://supabase.com/docs/guides/database/postgres/row-level-security
https://supabase.com/pricing
https://github.com/calcom/cal.com
https://github.com/calcom/cal.com/blob/main/packages/app-store/stripepayment/README.md
https://github.com/odoo/odoo
https://www.pedroalonso.net/blog/postgres-multi-tenant-search/
https://www.youtube.com/watch?v=vVYlCnNjEWA
https://developer.servicetitan.io/docs/faqs-app-key-client-id-secret/
https://dev.to/jobber/building-an-app-in-jobber-platform-57aj
https://developers.openai.com/api/docs/guides/agents-sdk
https://deepwiki.com/langchain-ai/docs/2.2-langgraph-framework-documentation
https://docs.aws.amazon.com/prescriptive-guidance/latest/agentic-ai-frameworks/langchain-langgraph.html
https://developers.llamaindex.ai/python/framework/use_cases/agents/