🧙‍♂️ Clean, Easy-to-Use AI Output Formats! A Hands-On OutputParser Tutorial

We've all been there, right?

You ask an AI for some facts about a country, and it comes back with a very polite but hard-to-parse blob of free-form text 👇:

Hi~ here's the information you asked for!

Country name: Japan
Capital: Tokyo
Area: 377,975 square kilometers

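A human reads this just fine, but code that expects structured data will choke on it 😵!

Today we'll look at how to use an OutputParser to give your LLM output some "structure"!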
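To make the "code will choke" part concrete, here is a minimal sketch using only the Python standard library. The reply text is a hand-typed stand-in for a model response, and the regex is just one hypothetical way someone might try to scrape it.

```python
import json
import re

# Hand-typed stand-in for a friendly, free-form model reply.
reply = """Hi! Here is the information you asked for!

Country name: Japan
Capital: Tokyo
Area: 377,975 square kilometers"""

# Attempt 1: treat the reply as JSON. It is prose, so decoding fails.
try:
    json.loads(reply)
    parsed_as_json = True
except json.JSONDecodeError:
    parsed_as_json = False

# Attempt 2: scrape "label: value" lines with a regex. This works for this
# exact wording, but breaks as soon as the model renames a label or
# changes the layout.
fields = dict(re.findall(r"^(.+?): (.+)$", reply, flags=re.MULTILINE))

print(parsed_as_json)
print(fields["Capital"])
```

The regex route works here only because the wording happens to be regular, and that is exactly the fragility an output parser is meant to remove.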

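🎯 What is an OutputParser?

Simply put, an OutputParser is the tool that turns the raw content returned by the model into structured data.
Think of it as an "interpreter" that steps in after the AI has finished talking and translates what it said into the format you want.

LangChain ships with a number of built-in parsers; today we'll look at these three: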

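Parser type	Description
`StrOutputParser`	Extracts the text content as-is 📜
`JsonOutputParser`	Asks the model for JSON and parses it into a `dict` 📦
`PydanticOutputParser`	Returns an instance of a Python data model you define 🧩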

🧪 Level 1: StrOutputParser, the simplest parser

💡 Usage is very intuitive: it pulls the content of the AI's reply out as a plain string.

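from langchain_core.output_parsers import StrOutputParser
# chat_model: an already-configured LangChain chat model (setup not shown)
str_parser = StrOutputParser()
# Prompt: "Name a country and its capital; answer in the language used in Taiwan."
response = chat_model.invoke("請提供一個國家的名稱和首都,使用台灣語言回答。")
print(str_parser.invoke(response))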

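🧪 Level 2: JsonOutputParser, JSON handling done for you

🤖 You put the parser's format instructions into the prompt so the model replies in JSON, and a single .invoke() on the parser gives you a dict ✨! Without a schema the model picks the key names itself, and a reply that is not valid JSON makes the parser raise an error.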

from langchain_core.output_parsers import JsonOutputParser

# Initialize the JSON output parser
json_parser = JsonOutputParser()

# Get the format instructions for JSON output
# These tell the model how to structure its response as JSON
format_instructions = json_parser.get_format_instructions()
print(format_instructions)  # -> Return a JSON object.

# Ask the chat model (chat_model, assumed to be configured already) for a country's name and capital
# The format instructions are included in the prompt to steer the response toward JSON
# Prompt: "Give me the name and capital of a country, <format instructions> answer in the language used in Taiwan."
response = chat_model.invoke("請提供一個國家的名稱和首都,"
                             f"{format_instructions} 使用台灣語言回答。")

# Parse the response using the JSON output parser
# The parser extracts the JSON from the model's reply and returns it as a dict
json_output = json_parser.invoke(response)
print(json_output)  # e.g. {'國家': '日本', '首都': '東京'} (the model chooses the keys)

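🎉 Level 3: PydanticOutputParser

It turns the data you get back from the model directly into a Python object, saving you a lot of post-processing pain 😎.

🧱 Step 1: Define the data model (with Pydantic)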

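from typing import List

from pydantic import BaseModel, Field

class CountryInfo(BaseModel):
    country: str = Field(description="國家名稱")  # country name
    activity: List[str] = Field(description="國家活動列表")  # list of activities in the country
    capital: str = Field(description="首都名稱")  # name of the capital
    area: str = Field(description="國家面積")  # area of the country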
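Before wiring the model into a parser, it can help to see what "validation" buys you. The sketch below uses Pydantic alone, with hand-written dicts standing in for parsed model output and English field descriptions substituted for readability.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class CountryInfo(BaseModel):
    country: str = Field(description="Country name")
    activity: List[str] = Field(description="List of activities in the country")
    capital: str = Field(description="Name of the capital")
    area: str = Field(description="Area of the country")


# A complete record validates and becomes a typed object.
complete = {
    "country": "Japan",
    "activity": ["Visit Tokyo Tower"],
    "capital": "Tokyo",
    "area": "377,975 square kilometers",
}
info = CountryInfo(**complete)
print(info.capital)

# A record with missing fields is rejected, and the error says which ones.
incomplete = {"country": "Japan", "capital": "Tokyo"}
try:
    CountryInfo(**incomplete)
    missing = []
except ValidationError as err:
    missing = sorted(e["loc"][0] for e in err.errors())
print(missing)
```

This is the check a plain `dict` cannot give you: a missing or mistyped field is reported right away instead of surfacing later as a `KeyError`.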

📦 Step 2: Ask the AI to return data that matches the format

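from pprint import pprint

from langchain_core.output_parsers import PydanticOutputParser

pydantic_parser = PydanticOutputParser(pydantic_object=CountryInfo)
format_instructions = pydantic_parser.get_format_instructions()

# Prompt: "Give me the name, capital, and area of a country, <format instructions> answer in the language used in Taiwan."
response = chat_model.invoke("請提供一個國家的名稱、首都和面積,"
                             f"{format_instructions} 使用台灣語言回答。")

pprint(pydantic_parser.invoke(response))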

📦 The result is a CountryInfo instance (laid out across lines here for readability):

CountryInfo(
    country='日本',
    activity=['參觀東京鐵塔', '品嚐拉麵', '泡溫泉'],
    capital='東京',
    area='377,975 平方公里'
)

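Super tidy, super structured! You can read values directly with .country and .area instead of writing a pile of split() calls and regexes. Life just got happier.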

⚖️ Summary: which of the three parsers should you pick?

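Parser type	When to use it	Return type
`StrOutputParser`	You only need a piece of text and don't care about format	Plain-text `str`
`JsonOutputParser`	You want a dict but don't need schema validation	`dict`
`PydanticOutputParser`	You want data that is both structured and validated	`Pydantic` object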

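🧩 Wrap-up: output can be elegant too!

With an OutputParser you can turn an LLM's "human-style answer" into a "format that programs understand". PydanticOutputParser in particular is a great partner for developers integrating LLMs 🤝.

🧠 Used well, the AI's replies can easily be converted into the API inputs, DB records, or even front-end page components you need.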
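As a rough illustration of that last point, the sketch below takes a parsed `CountryInfo` object and turns it into a JSON payload and a database row. The table layout is hypothetical, the object is built by hand rather than by a model, and `model_dump()` assumes Pydantic v2.

```python
import json
import sqlite3
from typing import List

from pydantic import BaseModel, Field


class CountryInfo(BaseModel):
    country: str = Field(description="Country name")
    activity: List[str] = Field(description="List of activities in the country")
    capital: str = Field(description="Name of the capital")
    area: str = Field(description="Area of the country")


# Built by hand here; in practice this would come from the parser.
info = CountryInfo(
    country="Japan",
    activity=["Eat ramen"],
    capital="Tokyo",
    area="377,975 square kilometers",
)

# API input: a plain dict that serializes straight to JSON.
payload = info.model_dump()
body = json.dumps(payload)

# DB record: insert selected fields into a (hypothetical) table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (country TEXT, capital TEXT, area TEXT)")
conn.execute(
    "INSERT INTO countries VALUES (?, ?, ?)",
    (info.country, info.capital, info.area),
)
row = conn.execute("SELECT country, capital, area FROM countries").fetchone()
conn.close()

print(body)
print(row)
```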