Building a RAG-Enabled AI Agent Memory System: Advanced Memory with a Vector DB and LangChain

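1 Introduction
2 Why an advanced memory system is needed
- 2.1 Limitations of conventional memory systems
- 2.2 Advantages of a RAG-based memory system
3 Memory architecture built on a vector database
- 3.1 Overall system design
- 3.2 How memories are stored
4 Recalling memories with semantic search
- 4.1 Retrieving related past memories
5 Practical usage example and demonstration
- 5.1 Integration with an AI Agent
6 Summary and next steps
7 Related articles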

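Introduction

Conventional AI Agents often did little more than store the conversation history as plain text, but enterprise environments today call for a memory system that can accurately recall past information that is semantically related to the current request. This article explains how to implement such an AI Agent memory system by combining RAG (Retrieval Augmented Generation) with a vector database.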

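Why an advanced memory system is needed

Limitations of conventional memory systems

A basic AI Agent runs into problems such as the following:

# Conventional simple memory (a problematic pattern)
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=5)  # keeps only the last 5 exchanges

With this approach:

- Information loss: older but important information is dropped automatically once it falls outside the window
- Broken context: related conversations from earlier in the history cannot be found
- No search capability: past discussions on a specific topic cannot be recalled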
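The information-loss problem can be reproduced without LangChain. The following is a minimal sketch in which `WindowMemory` is a hypothetical stand-in (not a LangChain class) that keeps only the last `k` turns, roughly as a window memory does:

```python
from collections import deque


class WindowMemory:
    """Hypothetical stand-in for a memory that keeps only the last k turns."""

    def __init__(self, k: int = 5):
        # A deque with maxlen silently discards the oldest item when full
        self.turns = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def recall(self, keyword: str) -> list:
        """Return the retained turns whose text mentions the keyword."""
        return [t for t in self.turns if keyword in t[0] or keyword in t[1]]


memory = WindowMemory(k=5)
memory.save("Our production database is PostgreSQL 15.", "Noted.")
for i in range(5):
    memory.save(f"Small talk #{i}", "OK.")

print(len(memory.turns))            # 5
print(memory.recall("PostgreSQL"))  # [] because the first turn was evicted
```

After six turns the important first statement is no longer retrievable, no matter how the remaining window is searched.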

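Advantages of a RAG-based memory system

Using RAG provides:

- Semantic search: finds past conversations that are similar in meaning, even when the wording differs
- Long-term memory: important information is stored persistently
- Continuity of context: related discussions are linked automatically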
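The difference between keyword matching and semantic search can be illustrated with a toy example. The sketch below assumes hand-made 3-dimensional vectors in place of real embeddings; the texts, vectors, and function names are all illustrative and not part of the system described in this article:

```python
import math
import re


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made "embeddings" (axes: web, database, cooking).
# A real system would obtain these from an embedding model.
memories = {
    "We chose Flask for the web app.": [0.9, 0.1, 0.0],
    "PostgreSQL stores the order data.": [0.2, 0.9, 0.0],
    "The curry recipe needs more cumin.": [0.0, 0.0, 1.0],
}


def keyword_search(query, texts):
    """Return texts that share at least one whole word with the query."""
    query_words = set(re.findall(r"\w+", query.lower()))
    return [t for t in texts if query_words & set(re.findall(r"\w+", t.lower()))]


def semantic_search(query_vector, store, k=1):
    """Return the k texts whose vectors are closest to the query vector."""
    ranked = sorted(
        store,
        key=lambda t: cosine_similarity(query_vector, store[t]),
        reverse=True,
    )
    return ranked[:k]


query = "framework picked earlier"
query_vector = [0.8, 0.2, 0.0]  # assumed embedding of the query

print(keyword_search(query, memories))         # []
print(semantic_search(query_vector, memories)) # ['We chose Flask for the web app.']
```

The query shares no word with any stored text, so keyword search returns nothing, while the vector comparison still surfaces the memory about the web framework.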

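Memory architecture built on a vector database

Overall system design

# Note: these are legacy LangChain import paths. Newer releases provide the
# same classes in langchain_openai and langchain_community.vectorstores.
# OpenAIEmbeddings and ChatOpenAI expect OPENAI_API_KEY to be set.
import os
from langchain.memory import VectorStoreRetrieverMemory
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, AIMessage
import chromadb
from datetime import datetime
import uuid

class AdvancedAgentMemory:
    """
    Advanced AI Agent memory system built on RAG

    Main features:
    - Memory recall through semantic search
    - Layered long-term and short-term memory
    - Memory prioritization based on importance
    """

    def __init__(self, collection_name="agent_memory"):
        # Initialize the OpenAI embedding model
        self.embeddings = OpenAIEmbeddings(
            model="text-embedding-ada-002"  # newer text-embedding-3 models also exist
        )

        # Initialize the Chroma database (persisted on local disk)
        self.client = chromadb.PersistentClient(path="./agent_memory_db")

        # Configure the vector store
        self.vectorstore = Chroma(
            client=self.client,
            collection_name=collection_name,
            embedding_function=self.embeddings
        )

        # Use LangChain's VectorStoreRetrieverMemory, which takes a retriever
        # rather than the vector store itself
        self.memory = VectorStoreRetrieverMemory(
            retriever=self.vectorstore.as_retriever(search_kwargs={"k": 5}),
            memory_key="chat_history",
            return_docs=True,
            input_key="input"
        )

        print("✅ Advanced memory system initialized")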
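Because the constructor above calls the OpenAI API for every embedding, it can be handy to have an offline substitute for local tests. The following is a minimal sketch of one; `HashingEmbeddings` is a hypothetical class, not part of LangChain or this article's system. It only assumes that an embeddings object needs `embed_documents` and `embed_query`, the two methods LangChain vector stores call. Using it would also require changing `AdvancedAgentMemory` to accept an embeddings object as a parameter, which the original code does not do.

```python
import hashlib
import math


class HashingEmbeddings:
    """Hypothetical offline stand-in for an embedding model.

    The vectors carry no real semantics (each token is hashed into a bucket);
    they only make tests deterministic and free of network calls.
    """

    def __init__(self, dimensions: int = 64):
        self.dimensions = dimensions

    def _embed(self, text: str) -> list:
        vector = [0.0] * self.dimensions
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % self.dimensions
            vector[index] += 1.0
        # Normalize to unit length; fall back to 1.0 for empty input
        norm = math.sqrt(sum(v * v for v in vector)) or 1.0
        return [v / norm for v in vector]

    def embed_documents(self, texts: list) -> list:
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> list:
        return self._embed(text)


embeddings = HashingEmbeddings()
a = embeddings.embed_query("flask web app")
b = embeddings.embed_query("flask web app")
print(a == b, len(a))  # True 64
```

Such a stand-in says nothing about retrieval quality; it is only meant for exercising the storage and filtering logic without an API key.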

How memories are stored

Rather than just saving each memory, the system attaches metadata that later searches can use:

    def store_conversation(self, human_input: str, ai_response: str,
                          importance_score: float = 0.5,
                          topic_tags: list = None):
        """
        Store a conversation in a form that supports semantic search

        Args:
            human_input: the user's input
            ai_response: the AI's response
            importance_score: importance (0.0-1.0)
            topic_tags: topic tags (for categorization)
        """
        # Generate a timestamp and a unique ID
        timestamp = datetime.now().isoformat()
        conversation_id = str(uuid.uuid4())
        tags = topic_tags or []

        # Build the text to embed, including the conversational context
        # (joined line by line so no stray indentation ends up in the embedding)
        conversation_context = "\n".join([
            f"Human: {human_input}",
            f"AI: {ai_response}",
            "",
            f"Topics: {', '.join(tags) if tags else 'general'}",
            f"Importance: {importance_score}",
        ])

        # Build the metadata (used later for search filtering)
        metadata = {
            "conversation_id": conversation_id,
            "timestamp": timestamp,
            "importance_score": importance_score,
            # Chroma has typically accepted only scalar metadata values,
            # so the tag list is stored as a comma-separated string
            "topic_tags": ",".join(tags),
            "human_input": human_input,
            "ai_response": ai_response,
            "conversation_type": self._classify_conversation_type(human_input)
        }

        # Save to the vector database
        self.vectorstore.add_texts(
            texts=[conversation_context],
            metadatas=[metadata],
            ids=[conversation_id]
        )

        print(f"📝 Conversation saved (ID: {conversation_id[:8]}...)")

    def _classify_conversation_type(self, input_text: str) -> str:
        """
        Classify the conversation type from the input text with simple keyword rules
        """
        text = input_text.lower()
        # Japanese keywords from the original article plus English equivalents
        if any(word in text for word in ['質問', '教えて', '？', '?', 'question']):
            return "question"
        elif any(word in text for word in ['ありがとう', 'thanks', 'thank you']):
            return "gratitude"
        elif any(word in text for word in ['問題', 'エラー', 'バグ', 'problem', 'error', 'bug']):
            return "problem_solving"
        else:
            return "general"
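Chroma has typically accepted only scalar metadata values (strings, numbers, booleans), so a list such as `topic_tags` may need to be flattened before it is stored. The following is a sketch of a general-purpose helper; `flatten_metadata` is a hypothetical name, and the comma-separated encoding is just one possible convention:

```python
import json

SCALAR_TYPES = (str, int, float, bool)


def flatten_metadata(metadata: dict) -> dict:
    """Convert non-scalar metadata values into strings.

    Lists of scalars become comma-separated strings, None values are
    dropped, and anything else is serialized as JSON.
    """
    flat = {}
    for key, value in metadata.items():
        if value is None:
            continue
        if isinstance(value, SCALAR_TYPES):
            flat[key] = value
        elif isinstance(value, (list, tuple)) and all(
            isinstance(v, SCALAR_TYPES) for v in value
        ):
            flat[key] = ",".join(str(v) for v in value)
        else:
            flat[key] = json.dumps(value, ensure_ascii=False)
    return flat


example = flatten_metadata({
    "importance_score": 0.8,
    "topic_tags": ["Python", "Web"],
    "extra": None,
})
print(example)  # {'importance_score': 0.8, 'topic_tags': 'Python,Web'}
```

A comma-separated string is easy to split again at query time, but it assumes that tags never contain commas themselves.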

Recalling memories with semantic search

Retrieving related past memories

Instead of a simple keyword search, the system recalls memories that are similar in meaning:

    def retrieve_relevant_memories(self, query: str,
                                 k: int = 5,
                                 importance_threshold: float = 0.3,
                                 topic_filter: str = None) -> list:
        """
        Retrieve past memories related to the query through semantic search

        Args:
            query: search query
            k: number of memories to return
            importance_threshold: minimum importance score
            topic_filter: restrict results to a specific topic

        Returns:
            List of related memories
        """
        # Basic semantic search
        results = self.vectorstore.similarity_search_with_score(
            query=query,
            k=k * 2  # over-fetch so enough results remain after filtering
        )

        # Filter by importance and other conditions
        filtered_results = []
        for doc, score in results:
            metadata = doc.metadata

            # Importance filter
            if metadata.get('importance_score', 0) < importance_threshold:
                continue

            # Topic filter (tags may be stored as a list or a comma-separated string)
            tags = metadata.get('topic_tags') or []
            if isinstance(tags, str):
                tags = [t for t in tags.split(',') if t]
            if topic_filter and topic_filter not in tags:
                continue

            # Chroma returns a distance here, so a lower score means a closer match
            if score < 0.8:  # threshold is tunable
                filtered_results.append({
                    'content': doc.page_content,
                    'metadata': metadata,
                    'similarity_score': 1 - score,  # rough indicator, higher is closer
                    'timestamp': metadata.get('timestamp', '')
                })

        # Order by importance first, then by similarity
        filtered_results.sort(
            key=lambda x: (x['metadata'].get('importance_score', 0), x['similarity_score']),
            reverse=True
        )

        return filtered_results[:k]

    def get_contextual_memory(self, current_input: str) -> str:
        """
        Arrange the memories related to the current input into a context string
        """
        relevant_memories = self.retrieve_relevant_memories(
            query=current_input,
            k=3,
            importance_threshold=0.4
        )

        if not relevant_memories:
            return "No related past memories were found."

        context = "=== Related past memories ===\n"
        for i, memory in enumerate(relevant_memories, 1):
            timestamp = memory['timestamp'][:16]  # date and time down to the minute
            similarity = memory['similarity_score']
            importance = memory['metadata'].get('importance_score', 0)

            context += f"""
[Memory {i}] ({timestamp}, similarity: {similarity:.2f}, importance: {importance:.1f})
{memory['content']}

---

"""

        return context
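The filtering and ordering step is easier to reason about in isolation. The sketch below re-creates it on plain `(metadata, distance)` pairs without a vector store; `rank_memories` and the sample data are illustrative, and the thresholds mirror the defaults used above:

```python
def rank_memories(results, k=3, importance_threshold=0.3, max_distance=0.8):
    """Filter and order (metadata, distance) pairs; lower distance = closer."""
    kept = []
    for metadata, distance in results:
        importance = metadata.get("importance_score", 0.0)
        # Drop memories that are unimportant or too far from the query
        if importance < importance_threshold or distance >= max_distance:
            continue
        kept.append({
            "id": metadata["id"],
            "importance": importance,
            "similarity": 1 - distance,
        })
    # Importance is the primary sort key, similarity only breaks ties
    kept.sort(key=lambda m: (m["importance"], m["similarity"]), reverse=True)
    return kept[:k]


raw = [
    ({"id": "a", "importance_score": 0.9}, 0.50),
    ({"id": "b", "importance_score": 0.5}, 0.10),
    ({"id": "c", "importance_score": 0.2}, 0.05),  # dropped: low importance
    ({"id": "d", "importance_score": 0.9}, 0.95),  # dropped: too distant
]
print([m["id"] for m in rank_memories(raw)])  # ['a', 'b']
```

Note that memory `b` is much closer to the query than `a` but still ranks second, because the tuple sort key lets importance dominate. If similarity should matter more, a weighted sum of the two scores is one alternative worth trying.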

Practical usage example and demonstration

Integration with an AI Agent

class RAGEnhancedAgent:
    """
    AI Agent with the RAG memory system integrated
    """

    def __init__(self):
        self.memory_system = AdvancedAgentMemory()
        self.llm = ChatOpenAI(
            model="gpt-4",
            temperature=0.7
        )

    def chat(self, user_input: str, importance_score: float = 0.5,
             topic_tags: list = None) -> str:
        """
        Handle one turn of conversation with memory support
        """
        # Retrieve related past memories
        relevant_context = self.memory_system.get_contextual_memory(user_input)

        # Build the prompt
        enhanced_prompt = f"""
Answer the user's question with reference to the past memories below.
Use the past context to keep the conversation consistent.

{relevant_context}

Current user input: {user_input}

Answer:"""

        # Generate the AI response
        # (invoke() returns a message object; predict() is deprecated in newer LangChain)
        response = self.llm.invoke(enhanced_prompt).content

        # Save the conversation to memory
        self.memory_system.store_conversation(
            human_input=user_input,
            ai_response=response,
            importance_score=importance_score,
            topic_tags=topic_tags
        )

        return response

# Usage example
if __name__ == "__main__":
    agent = RAGEnhancedAgent()

    # First conversation (about the project)
    response1 = agent.chat(
        "I'd like to build a web app in Python. Which framework do you recommend?",
        importance_score=0.8,
        topic_tags=["Python", "Web development", "Framework"]
    )
    print("AI:", response1)

    # Related follow-up question (past memories are used)
    response2 = agent.chat(
        "About the web framework we just discussed, how should I handle database integration?",
        importance_score=0.7,
        topic_tags=["Python", "Database"]
    )
    print("AI:", response2)