Freshworks says Freddy AI Agent resolves up to 80% of customer queries. One of their featured case study companies, Total Expert, reports roughly 23% deflection. Another, UPayments, reports 75%+. Same product. Same marketing claim. Completely different results.
The gap isn't the AI - it's the implementation. We've spent time with Freshdesk's official docs, the Freshworks Community implementation guides, and G2 reviews from real teams running Freddy AI at scale. The difference between a deployment that hits 70%+ resolution and one that stalls at 25% comes down to about seven things, and the most important of them has nothing to do with the AI configuration itself.
Fix your knowledge base before you touch the AI settings
This isn't soft advice - it's the single most consistently documented finding across official Freshdesk documentation, the Freshworks Community, and independent reviews.
"Freddy AI learns through knowledge base articles rather than traditional retraining. Each new Q&A pair and article significantly improves response accuracy."
Teams that deploy Freddy AI with a sparse or outdated knowledge base nearly always blame the AI for poor results. The actual failure point is the content the AI was given to work with. If you're seeing 25% deflection when you expected 60%, the first diagnostic question is: how complete and current is your knowledge base?
Practically speaking, writing for AI is different from writing for humans. A few principles that practitioners consistently cite as high-leverage:
- One topic per article. Articles covering multiple features or processes confuse intent-matching. If an article answers "how do I cancel?" and also covers "how do I downgrade?", split it.
- Lead with the answer. Long preambles ("Great question! When it comes to subscription management...") delay the information the AI needs to ground its response. Put the answer first.
- Avoid bare yes/no answers. The AI performs better when answers include context. "Yes, you can downgrade mid-billing-cycle. The change takes effect at the start of the next cycle, and the pro-rated difference is credited to your account" is more useful than "Yes."
- Keep language direct and specific. Vague phrasing ("pricing varies") gives the AI nothing to work with. Exact figures and policies do.
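The principles above can be turned into a rough pre-publish check. The following is a minimal sketch, not a Freshdesk feature: the function name, the phrase lists, and the title heuristic are all assumptions for illustration, and you would tune them to your own content.

```python
import re

# Hypothetical heuristics; adjust these lists for your own knowledge base.
PREAMBLE_OPENERS = ("great question", "thanks for asking", "when it comes to")
VAGUE_PHRASES = ("pricing varies", "it depends", "contact us for details")
BARE_ANSWER = re.compile(r"^(yes|no)[.!]?$", re.IGNORECASE)


def lint_article(title: str, body: str) -> list[str]:
    """Return a list of warnings for one knowledge base article."""
    warnings = []
    text = body.strip()
    lowered = text.lower()

    # One topic per article: a crude check for titles that join two questions.
    if " and " in title.lower() or "/" in title:
        warnings.append("title may cover more than one topic")

    # Lead with the answer: flag conversational preambles.
    if lowered.startswith(PREAMBLE_OPENERS):
        warnings.append("opens with a preamble instead of the answer")

    # Avoid bare yes/no answers.
    if BARE_ANSWER.match(text):
        warnings.append("bare yes/no answer with no context")

    # Keep language direct and specific.
    for phrase in VAGUE_PHRASES:
        if phrase in lowered:
            warnings.append(f"vague phrase: '{phrase}'")

    return warnings
```

A check like this only catches surface patterns. It will not tell you whether the answer is correct or current, so it complements a human review and does not replace one.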
Freshdesk's AI Agent Studio surfaces an unanswered queries log post-launch - every query the agent couldn't respond to. Think of that log as your KB roadmap. The healthiest deployments treat it as a weekly action item.
Understand the knowledge source limits
Freshdesk's AI Agent can ingest knowledge from four source types, each with constraints worth knowing before you start:
| Source type | Per-agent limit | Per-account limit | Key constraints |
|---|---|---|---|
| URLs | 10 | 25 (total across all agents) | Public pages only; no auth, CAPTCHA, or dynamic JS |
| Files | 200 | - | PDF, DOCX, TXT; max 35 MB each; no images or video |
| Solution articles | Unlimited | - | Toggle "Learn from solution articles" |
| Custom Q&As | Unlimited | - | Best for specific, targeted scenarios |
The 25-URL-per-account limit catches teams off-guard - especially those managing multiple brands or a large content library. If you hit it, the practical fix is to convert the most important URL content into uploaded documents instead.
One operational gotcha: URL ingestion is asynchronous. You'll receive an email when a URL finishes being processed, which can take up to 30 minutes. Pages with popups, login walls, or CAPTCHA fail silently - the agent simply lacks that knowledge, with no error shown. Always check the "extracted content preview" after adding a URL to confirm it ingested correctly.
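Before adding URLs, it can help to check a planned allocation against the limits in the table. This is a planning sketch only: the limits are taken from the table above, the function and agent names are hypothetical, and it assumes every URL counts toward the account pool even if two agents use the same page.

```python
PER_AGENT_URL_LIMIT = 10    # per-agent limit from the table above
PER_ACCOUNT_URL_LIMIT = 25  # shared across all agents on the account


def check_url_plan(urls_by_agent: dict[str, list[str]]) -> list[str]:
    """Return human-readable problems with a planned URL allocation."""
    problems = []

    # Check each agent against the per-agent limit.
    for agent, urls in urls_by_agent.items():
        if len(urls) > PER_AGENT_URL_LIMIT:
            over = len(urls) - PER_AGENT_URL_LIMIT
            problems.append(f"{agent}: {over} URL(s) over the per-agent limit")

    # Check the combined total against the account-wide limit.
    total = sum(len(urls) for urls in urls_by_agent.values())
    if total > PER_ACCOUNT_URL_LIMIT:
        over = total - PER_ACCOUNT_URL_LIMIT
        problems.append(
            f"account: {over} URL(s) over the account limit; "
            "convert these to uploaded files"
        )
    return problems
```

Any URLs flagged here are candidates for conversion to PDF or DOCX uploads, which draw on the separate file limit.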
Write precise Instructions - especially what NOT to do
Instructions is the field in AI Agent Studio where you add business context: industry terminology, brand voice, restricted topics, explicit escalation triggers. Most teams leave this underused.
Generic Instructions ("be helpful and professional") do almost nothing. The agents that perform well have Instructions that tell the bot exactly what to skip and when to hand off. Examples from practitioners that produce measurable improvements:
- Escalate immediately for: legal disputes, billing reversals over [X], requests mentioning "attorney" or "lawsuit"
- Never quote specific pricing or promotional terms - direct customers to the pricing page
- Avoid jargon specific to our internal systems (list internal terms here)
- If the customer expresses frustration three times in a row, trigger human handoff regardless of confidence score
The more specific the constraint, the more predictable the agent's behavior. Generic guidance produces inconsistent output; explicit rules produce reliable behavior.
A note on LLM response variability: Freddy AI, like any large language model, may give slightly different responses to identical queries. This is expected, not a misconfiguration. When you see inconsistency, the right fix is to tighten the relevant Instructions or update the knowledge article - not to chase perfect consistency from the model itself.
Design escalation before you launch
How your agent hands off to human agents is as important as how it resolves - maybe more important. A graceful escalation with full context transferred is a great customer experience. A dead-end "I can't help with that" followed by a restart with a human is a frustrating one.
Three things to get right before launch:
1. Define explicit handover conditions. Configure specific triggers - negative sentiment detected, VIP customer segment, billing dispute, confidence score below threshold, after-hours request. Don't rely on the agent "figuring out" when to escalate.
2. Write a useful handover message. "I'm connecting you with a support specialist who can help further. I'm sharing our conversation so you won't need to repeat yourself." That's it. Set the expectation; don't over-explain.
3. Ensure context transfers completely. AI Agent Studio should write the customer's intent, account state, and conversation summary into the ticket payload before escalating. A human agent who has to ask "what's this about?" after an AI hand-off forces the customer to start over, which compounds their frustration. This "context-loss rate" - the share of escalations where the human must ask the customer to restate their issue - is worth tracking explicitly. Target: under 5%.
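Writing the handover conditions down as explicit rules, before configuring them, makes gaps and conflicts easier to spot. The sketch below is a way to reason about rule order, not how AI Agent Studio implements handoff. The `Turn` fields, the intent names, and the 0.6 confidence threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Turn:
    intent: str               # e.g. "billing_dispute", "order_status" (hypothetical labels)
    confidence: float         # 0.0 to 1.0, an assumed confidence score
    negative_sentiment: bool
    is_vip: bool = False
    after_hours: bool = False


ESCALATE_INTENTS = {"billing_dispute", "legal"}
MIN_CONFIDENCE = 0.6          # illustrative threshold, not a product default
MAX_FRUSTRATED_TURNS = 3


def escalation_reason(history: list[Turn]) -> Optional[str]:
    """Return the first matching handover reason, or None to let the AI continue."""
    if not history:
        return None
    latest = history[-1]

    if latest.intent in ESCALATE_INTENTS:
        return f"intent:{latest.intent}"
    if latest.is_vip:
        return "vip_customer"

    # Three frustrated turns in a row trigger handoff regardless of confidence.
    recent = history[-MAX_FRUSTRATED_TURNS:]
    if len(recent) == MAX_FRUSTRATED_TURNS and all(t.negative_sentiment for t in recent):
        return "repeated_frustration"

    if latest.after_hours:
        return "after_hours"
    if latest.confidence < MIN_CONFIDENCE:
        return "low_confidence"
    return None
```

Returning a named reason, not just a yes or no, gives the human agent one more piece of context and lets you report on which triggers fire most often.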
The email threading limitation. This is the non-obvious production surprise that catches teams who rely on email as a primary channel: Freddy AI's Email AI Agent only responds to the first message in a thread. Subsequent replies in the same email thread receive no AI response - they go straight to human agents. If most of your support volume is multi-reply email threads, your actual AI automation rate will be significantly lower than your session metrics suggest. Know this before you promise stakeholders a deflection number.
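To see how much the first-message-only behavior caps email automation, you can estimate the share of inbound customer emails the AI is even eligible to answer. This is a back-of-the-envelope sketch: the function name is hypothetical, and the input is simply the number of customer messages in each thread, pulled from whatever export you have.

```python
def ai_eligible_share(customer_messages_per_thread: list[int]) -> float:
    """Share of inbound customer emails the AI could respond to at all.

    Assumes only the first customer message in each thread is eligible,
    per the limitation described above.
    """
    total_messages = sum(customer_messages_per_thread)
    if total_messages == 0:
        return 0.0
    # Each non-empty thread contributes exactly one eligible message.
    eligible = sum(1 for n in customer_messages_per_thread if n > 0)
    return eligible / total_messages
```

For example, three threads with 1, 3, and 4 customer messages give 3 eligible messages out of 8, so the ceiling on email automation is 37.5% before resolution quality is even considered.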
Start with Copilot, then expand to Agent
This is the implementation sequence that practitioners recommend most consistently, and it runs counter to the instinct to deploy the flashiest thing first.
"Most teams begin with AI Copilot since it helps existing agents immediately, then add AI Agent once they're comfortable."
The logic: Freddy AI Copilot - which gives your human agents reply suggestions, conversation summaries, live translations, and sentiment alerts - delivers immediate, low-risk value. Angela Thomas, Director of Customer Care at one Freshdesk customer, found it so useful that it changed how their team writes standard replies. And critically: Copilot surfaces knowledge gaps in your KB without customers ever seeing an AI failure.
Once you've used Copilot for four to six weeks and filled the gaps it surfaces, deploying the full AI Agent is a much safer bet. You've already seen which topics the AI handles confidently and which ones need more content.
The recommended phased rollout looks like this:
| Phase | What to deploy | Goal | Success signal |
|---|---|---|---|
| 1 | AI Copilot only | Surface KB gaps; build team familiarity | Agents using reply suggestions daily |
| 2 | AI Agent on one channel (web chat) | Validate resolution quality at low risk | CSAT flat or improving vs. baseline |
| 3 | Add high-volume workflow (order status, password reset) | Increase automation depth | Escalation rate under 30% for that workflow |
| 4 | Expand to additional channels | Scale proven performance | Per-channel resolution rate meets targets |
Each phase expansion should be triggered by actual data from the prior phase, not a calendar date.
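One way to keep expansions data-driven is to encode the success signals as explicit gates. The sketch below covers the two phases whose signals in the table are numeric. The function names and the minimum sample size of 200 conversations are assumptions, not Freshworks guidance.

```python
MIN_CONVERSATIONS = 200  # assumed minimum sample before trusting a rate


def phase2_passed(csat_now: float, csat_baseline: float, conversations: int) -> bool:
    """Phase 2 gate: CSAT flat or improving vs. baseline, on enough data."""
    return conversations >= MIN_CONVERSATIONS and csat_now >= csat_baseline


def phase3_passed(escalated: int, conversations: int) -> bool:
    """Phase 3 gate: escalation rate under 30% for the workflow."""
    if conversations < MIN_CONVERSATIONS:
        return False
    return escalated / conversations < 0.30
```

The sample-size guard matters as much as the thresholds: a good CSAT score on 50 conversations is not yet evidence that the phase has succeeded.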
Measure resolution rate, not just deflection rate
Most teams track deflection rate and treat it as success. It isn't the same thing. Deflection counts every query that never reached a human - including those where the customer gave up, got a wrong answer, or simply abandoned the conversation. High deflection with poor CSAT is worse than having no AI at all; it erodes trust faster.
The measurement approach that leading Freshdesk deployments use:
| Metric | What it measures | Target |
|---|---|---|
| Resolution rate | % of AI conversations fully resolved without human help | 55–70% mature; 80%+ top performers |
| Deflection rate | % of queries that never reached a human | 41.2% industry median; 58.7% top quartile |
| CSAT delta | Change in CSAT vs. pre-AI baseline | Flat or improving |
| Reopen rate | "Resolved" tickets where customer contacts again within 48 hrs | Under 8% |
| Escalation context-loss | Escalations where human must ask customer to repeat themselves | Under 5% |
| Hallucination rate | AI responses containing fabricated information | Under 1% |
The hallucination rate deserves more attention than most teams give it. Any AI response touching refund windows, pricing, warranty terms, or SLA commitments is a compliance risk if it's wrong. Spot-check a sample of AI-resolved tickets in your first four weeks. If you find fabricated policy details, tighten the relevant Instructions or add explicit fallback language ("For exact pricing, please check our pricing page").
The benchmark gap to understand: industry median deflection in 2026 is 41.2%, with the top quartile at 58.7%. If you're at 40%, you're not underperforming - you might be exactly where a mature deployment should be. Context matters.
Establish your baseline before you launch. Pull 90 days of ticket volume by category, first response time, resolution time, and CSAT. Without a pre-deployment baseline, you can't demonstrate ROI or identify regression.
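The difference between the metrics in the table is easiest to see when they are computed side by side from the same conversations. The following is a minimal sketch using the definitions from the table. The `Conversation` fields are hypothetical and would need to be mapped from your own ticket export.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    reached_human: bool
    resolved_by_ai: bool
    reopened_within_48h: bool = False
    customer_had_to_repeat: bool = False  # only meaningful when reached_human


def _rate(numerator: int, denominator: int) -> float:
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def support_metrics(conversations: list[Conversation]) -> dict[str, float]:
    """Compute the rates from the table as fractions between 0 and 1."""
    total = len(conversations)
    resolved = [c for c in conversations if c.resolved_by_ai]
    escalated = [c for c in conversations if c.reached_human]
    return {
        # Fully resolved by the AI without human help.
        "resolution_rate": _rate(len(resolved), total),
        # Never reached a human, whether or not the customer was helped.
        "deflection_rate": _rate(total - len(escalated), total),
        # AI-resolved conversations where the customer came back within 48 hours.
        "reopen_rate": _rate(sum(c.reopened_within_48h for c in resolved), len(resolved)),
        # Escalations where the customer had to restate the issue.
        "context_loss_rate": _rate(sum(c.customer_had_to_repeat for c in escalated), len(escalated)),
    }
```

In a sample where 5 of 10 conversations are resolved by the AI, 2 are abandoned, and 3 are escalated, this reports 50% resolution but 70% deflection. The 20-point gap is the abandoned conversations that a deflection-only dashboard would count as wins.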
Know the operational limits before you're in production
A few platform limitations worth knowing before they become production incidents:
Session exhaustion. When the account's session quota hits zero, the AI agent stops responding to all queries until you purchase more sessions. This isn't graceful degradation - it's a hard stop. Enable the auto-recharge feature (it triggers when sessions drop to 50) for any deployment that customers rely on. And don't forget: preview testing consumes real sessions, so budget separately for your test phases.
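A simple runway estimate helps you see a hard stop coming. This sketch assumes you can read your remaining session balance and an average daily consumption figure from your account. The function name is hypothetical, and the 50-session threshold is the auto-recharge trigger mentioned above.

```python
import math

AUTO_RECHARGE_THRESHOLD = 50  # sessions, per the auto-recharge trigger described above


def days_until_threshold(
    sessions_left: int,
    avg_sessions_per_day: float,
    threshold: int = AUTO_RECHARGE_THRESHOLD,
) -> float:
    """Estimate days until the balance drops to the recharge threshold.

    avg_sessions_per_day should include preview testing, since test
    conversations consume real sessions too.
    """
    if avg_sessions_per_day <= 0:
        return math.inf
    return max(sessions_left - threshold, 0) / avg_sessions_per_day
```

With 1,050 sessions left and 100 consumed per day, the estimate is 10 days. If auto-recharge is off, treat that as the deadline for a manual top-up.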
URL ingestion and the silent fail. Pages with CAPTCHA, popups, login walls, or heavy JavaScript fail to ingest without a clear error message. The agent just lacks that knowledge. After adding any URL, check the extracted content preview. For any URL that fails, upload a PDF or DOCX version of the content as a fallback.
The 25-URL account limit. The limit applies across all agents on the account. If you're running multiple agents (different brands, different regions), they share this pool. Convert important URL content to uploaded files when you're near the limit.
Multi-brand setups. One agent per brand, not one agent trying to handle multiple brands. Knowledge management, testing, and performance tracking are significantly cleaner when each brand has its own agent.
AI Insights is English-only. Multilingual analytics is on the roadmap. If you're running a multilingual deployment, you'll need to track non-English agent performance through Freshdesk's standard reporting for now.
| Limitation | Impact | Workaround |
|---|---|---|
| Email AI: first message only | Multi-reply email threads not handled | Factor into email deflection estimates; consider supplemental AI layer |
| 25 URL limit per account | Restrictive for large content libraries | Convert important URLs to uploaded files |
| Session exhaustion = hard stop | AI stops responding entirely | Enable auto-recharge at 50-session threshold |
| URL ingestion silent failure | Knowledge gaps without warning | Check extracted content preview after every URL add |
| AI Insights English-only | No analytics for multilingual deployments | Use Freshdesk standard reporting for non-English volume |
Try eesel AI with Freshdesk
If you're finding that Freddy AI's native capabilities - particularly the 25-URL limit, the email threading gap, or the session-based pricing model - don't quite fit your support volume or workflow, eesel AI integrates directly with Freshdesk as a layer that runs alongside your existing setup.
Design.com runs 50,000+ tickets per month on Freshdesk using eesel AI's multi-agent setup. The model is different: you pay $0.40 per resolved ticket rather than per session, there are no seat fees, and the knowledge input isn't limited to 25 URLs - eesel ingests from Zendesk, Google Docs, Notion, Confluence, Slack, and your existing help center simultaneously. Draft mode (human reviews before sending) and autonomous mode (sends directly for high-confidence tickets) let you dial between full automation and supervised AI based on ticket category.
There's a $50 free trial - no credit card required - if you want to run it against real tickets before committing.

Article by
Stevia Putri
Stevia Putri is a marketing generalist at eesel AI, where she helps turn powerful AI tools into stories that resonate. She’s driven by curiosity, clarity, and the human side of technology.

