Building a Smart Customer Service System with RAG + Agent: Order Lookup, Refund Handling, and Human Escalation in One Place

Combine RAG knowledge retrieval with Agent tool execution to build a customer service system that actually solves problems.

This is article 26 of 30 in the series “30 Days of AI Tools in Action: From Prompts to Agents, One Tool a Day to Transform Your Work.” Earlier articles covered prompt engineering, data analysis with Code Interpreter, and related topics. Today we bring those capabilities together to build a customer service system that is genuinely usable.

Why Traditional Chatbots Are So Frustrating

The problem with traditional FAQ bots is clear: they only match keywords and return canned responses. Ask something like “my order #20240312 hasn’t arrived,” which requires a database lookup, and they are completely useless. The user ends up waiting for a human anyway, and the experience is terrible. A genuinely smart customer service system needs two core capabilities: retrieving the correct policy from a knowledge base (RAG), and calling external tools to take action (Agent). Combining the two is what lets you handle more than 80% of real support scenarios.

System Architecture: RAG Knows, Agent Acts

The system has three layers. The first is the RAG knowledge base: vectorize documents such as refund policies, FAQs, and SLA terms and store them in Pinecone or Chroma, so the AI retrieves accurate information instead of relying on memory. The second is the Agent toolset, which defines three core tools: `query_order(order_id)` checks order status, `submit_refund(order_id, reason)` files a refund request, and `escalate_to_human(ticket_id, summary)` hands off to a human agent with a conversation summary attached. The third is the LLM decision layer: GPT-4o or Claude 3.5 acts as the brain and decides, based on the user input, whether to search the knowledge base, call a tool, or do both.
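The first two layers can be sketched without any external service. The block below is a minimal illustration under stated assumptions, not the article's implementation: the policy snippets, the order table, and the return shapes are all made up, and a bag-of-words cosine similarity stands in for the real embeddings that Pinecone or Chroma would store. Only the three tool names and their parameters come from the text above.

```python
import math
import re
from collections import Counter

# Hypothetical policy snippets standing in for the vectorized knowledge base.
POLICY_DOCS = [
    "Refund policy: unopened items can be refunded within 30 days of delivery.",
    "Shipping FAQ: standard shipping takes 3 to 5 business days.",
    "SLA terms: support tickets receive a first response within 24 hours.",
]

def _bag_of_words(text):
    """Lowercased word counts; a crude stand-in for an embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def _cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, k=1):
    """Layer 1: return the k policy snippets most similar to the query."""
    q = _bag_of_words(query)
    ranked = sorted(POLICY_DOCS, key=lambda d: _cosine(q, _bag_of_words(d)), reverse=True)
    return ranked[:k]

# Mock data stores (hypothetical); real APIs would replace them later.
ORDERS = {"20240312": {"status": "in_transit", "amount": 59.0}}
REFUNDS = []
TICKETS = []

def query_order(order_id):
    """Layer 2 tool: look up an order in the mock table."""
    order = ORDERS.get(order_id)
    if order is None:
        return {"order_id": order_id, "found": False}
    return {"order_id": order_id, "found": True, **order}

def submit_refund(order_id, reason):
    """Layer 2 tool: record a refund request for a known order."""
    if order_id not in ORDERS:
        return {"accepted": False, "error": "unknown order"}
    REFUNDS.append({"order_id": order_id, "reason": reason})
    return {"accepted": True, "refund_no": len(REFUNDS)}

def escalate_to_human(ticket_id, summary):
    """Layer 2 tool: hand the conversation summary to a human queue."""
    TICKETS.append({"ticket_id": ticket_id, "summary": summary})
    return {"escalated": True, "ticket_id": ticket_id}
```

In a real build, the third layer (the LLM) would receive the descriptions of these three functions and the retrieved snippets, then decide which to use. Returning plain dictionaries keeps the tool results easy to serialize back into the model's context.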

Key Implementation Point: Teaching the Agent When to Let Go

The most easily overlooked design element is the escalation condition. In your system prompt, explicitly define when the agent must proactively trigger `escalate_to_human`: when the user is agitated (an angry tone is detected), when the issue involves an amount above a threshold, or when the agent fails to resolve the problem twice in a row. The agent should pass a summary of the whole conversation to the human agent so they do not have to start from scratch. This detail decides whether the system is genuinely useful or merely pushes problems onto human staff. In addition, use LangChain’s `ConversationBufferMemory` to retain conversation context so users do not have to repeat their order number in every message. Newer LangChain releases flag this class as deprecated, so check the documentation for the version you install.
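The article places these rules in the system prompt. As a complement, the same three conditions can also be checked in code, so escalation does not depend solely on the model following instructions. The sketch below is one possible shape, with assumptions labeled: the keyword list is a naive stand-in for real tone detection, the threshold value is invented, and `ConversationState` is a hypothetical helper that plays the role `ConversationBufferMemory` would play in LangChain.

```python
from dataclasses import dataclass, field

# All three values below are hypothetical and should be tuned per business.
ANGRY_MARKERS = ("ridiculous", "unacceptable", "furious", "worst")
AMOUNT_THRESHOLD = 200.0
MAX_CONSECUTIVE_FAILURES = 2

@dataclass
class ConversationState:
    """Keeps the dialogue turns and a counter of unresolved attempts."""
    turns: list = field(default_factory=list)  # (speaker, text) pairs
    consecutive_failures: int = 0

    def add_turn(self, speaker, text):
        self.turns.append((speaker, text))

    def record_outcome(self, resolved):
        # A successful resolution resets the failure streak.
        self.consecutive_failures = 0 if resolved else self.consecutive_failures + 1

    def summary(self):
        """Naive summary: the joined transcript. An LLM call could replace this."""
        return " | ".join(f"{speaker}: {text}" for speaker, text in self.turns)

def escalation_reason(state, user_message, amount=0.0):
    """Return the name of the first escalation rule that fires, or None."""
    text = user_message.lower()
    if any(marker in text for marker in ANGRY_MARKERS):
        return "angry_tone"
    if amount > AMOUNT_THRESHOLD:
        return "amount_over_threshold"
    if state.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
        return "repeated_failure"
    return None
```

When `escalation_reason` returns a value, the caller would invoke `escalate_to_human(ticket_id, state.summary())`. Returning the rule name rather than a bare boolean makes it easier to log why a conversation was handed off.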

Today’s Action: Run a Minimum Viable Version with LangChain + OpenAI

You don’t need to build the full system on day one. Today’s goal: use LangChain’s `initialize_agent` in `OPENAI_FUNCTIONS` mode, define the three tools above (using mock functions to simulate the database for now), and add a 500-word refund policy document for RAG testing. Note that `initialize_agent` is marked as deprecated in newer LangChain releases, so you may need a pinned older version or the replacement API described in the current documentation. If the main flow works (query order → confirm refund eligibility → submit refund), that counts as success. The whole MVP is around 150 lines of Python and can be finished in an afternoon. Once it runs, gradually replace the mock functions with real APIs and connect a real knowledge base. That is the right iteration rhythm.
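Before wiring in LangChain and an OpenAI key, it can help to pin down the main flow as plain Python so there is a known-good reference to compare the agent against. The sketch below hard-codes the sequence the LLM is expected to choose. Everything in it is an assumption for illustration: the order record, the 30-day window (which the RAG step would normally read from the policy document), and the `handle_refund_request` helper.

```python
from datetime import date

# Mock order table (hypothetical data standing in for the real database).
ORDERS = {
    "20240312": {"status": "delivered", "delivered_on": date(2024, 3, 15), "amount": 59.0},
}
REFUNDS = []
REFUND_WINDOW_DAYS = 30  # hypothetical value; would come from the policy document

def query_order(order_id):
    """Mock tool: return the order record, or found=False if it is unknown."""
    order = ORDERS.get(order_id)
    if order is None:
        return {"order_id": order_id, "found": False}
    return {"order_id": order_id, "found": True, **order}

def submit_refund(order_id, reason):
    """Mock tool: record the refund request."""
    REFUNDS.append({"order_id": order_id, "reason": reason})
    return {"accepted": True, "refund_no": len(REFUNDS)}

# Tool registry: the same name-to-function mapping an agent framework keeps.
TOOLS = {"query_order": query_order, "submit_refund": submit_refund}

def is_refund_eligible(order, today):
    """Eligible only if the order was delivered within the refund window."""
    if not order.get("found") or order["status"] != "delivered":
        return False
    return (today - order["delivered_on"]).days <= REFUND_WINDOW_DAYS

def handle_refund_request(order_id, reason, today):
    """Run the main flow: query order, confirm eligibility, submit refund."""
    trace = []  # names of the tools that were actually called, in order
    order = TOOLS["query_order"](order_id)
    trace.append("query_order")
    if not is_refund_eligible(order, today):
        return {"refunded": False, "trace": trace}
    result = TOOLS["submit_refund"](order_id, reason)
    trace.append("submit_refund")
    return {"refunded": result["accepted"], "trace": trace}
```

Once the agent version runs, its tool-call trace for the same inputs should match the `trace` list produced here. A mismatch usually points to a tool description or a prompt that needs tightening.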

Previous: AI Data Analysis in Practice: Using ChatGPT Code Interpreter So Non-Technical People Can Read Data Too

Next up (Article 27): AI-Assisted Decision Making: How to Make AI Your Strategy Advisor, Not Just an Answer Machine
