Reflexion Framework Design
Overview
The Reflexion framework implements the Actor-Evaluator-Self-Reflection pattern so that MCP ADR Analysis Server tools can learn from their mistakes through linguistic feedback and self-reflection. It preserves the server's 100% prompt-driven architecture while adding continuous learning and improvement capabilities.
Core Concept
The Reflexion framework works through five elements:
- Actor: Executes tasks and generates outputs based on observations
- Evaluator: Scores and evaluates the Actor's performance
- Self-Reflection: Generates linguistic feedback for improvement
- Memory: Stores lessons learned for future reference
- Iteration: Continuously improves through feedback loops
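The five elements above can be sketched as a single loop. This is a minimal illustration, not the server's implementation: the `Actor`, `Evaluator`, and `Reflector` function types, the `runReflexionLoop` name, and the `maxAttempts` and `passThreshold` parameters are all assumptions made for the example.

```typescript
// Hypothetical function shapes for illustration; not the server's actual API.
type Actor = (task: string, lessons: string[]) => string;
type Evaluator = (output: string) => number; // score on a 0-1 scale
type Reflector = (task: string, output: string, score: number) => string;

interface LoopResult {
  output: string;
  score: number;
  attempts: number;
  lessons: string[];
}

function runReflexionLoop(
  task: string,
  actor: Actor,
  evaluator: Evaluator,
  reflector: Reflector,
  maxAttempts = 3, // assumed default
  passThreshold = 0.8, // assumed default
): LoopResult {
  const lessons: string[] = []; // Memory: lessons carried between attempts
  let output = "";
  let score = 0;
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts += 1;
    output = actor(task, lessons); // Actor: execute with memory context
    score = evaluator(output); // Evaluator: score the attempt
    if (score >= passThreshold) break; // good enough, stop iterating
    lessons.push(reflector(task, output, score)); // Self-Reflection: verbal feedback
  }
  return { output, score, attempts, lessons };
}

// Toy usage: the actor only improves once it has at least one lesson.
const loopDemo = runReflexionLoop(
  "summarize ADR",
  (task, lessons) => (lessons.length > 0 ? `${task} (revised)` : task),
  (output) => (output.endsWith("(revised)") ? 0.9 : 0.4),
  (_task, _output, score) => `Scored ${score}; add the missing detail.`,
);
```

In the toy run, the first attempt scores below the threshold, one lesson is stored, and the second attempt passes. A real Actor would be a prompt sent to an LLM rather than a local function.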
Research Foundation
Based on Shinn et al. (2023) "Reflexion: Language Agents with Verbal Reinforcement Learning":
- Verbal Reinforcement: Converts feedback into linguistic self-reflection
- Episodic Memory: Stores experiences and lessons learned
- Iterative Improvement: Rapidly learns from prior mistakes
- No Fine-tuning Required: Uses existing LLM capabilities without model updates
Architecture Integration
Existing Components Integration
- PromptObject Interface: Reflexion generates enhanced prompts with memory context
- File System Utilities: Uses existing prompt-driven file operations for memory persistence
- Research Integration: Leverages research utilities for feedback analysis
- Cache System: Stores reflection memories and learning outcomes
Framework Components
Reflexion Framework
├── Actor (Task execution with memory context)
├── Evaluator (Performance assessment and scoring)
├── Self-Reflection (Linguistic feedback generation)
├── Memory Manager (Episodic and long-term memory)
├── Learning Tracker (Progress and improvement monitoring)
└── Integration Layer (MCP tool integration utilities)
Core Reflexion Components
1. Actor Component
Purpose: Execute tasks with memory-enhanced context
Responsibilities:
- Task Execution: Perform assigned tasks using current knowledge
- Memory Integration: Incorporate past lessons into current actions
- Context Awareness: Consider previous failures and successes
- Trajectory Generation: Create detailed execution paths for evaluation
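One way to picture memory integration is as prompt assembly: past lessons for the same task type are folded into the text the Actor receives. The sketch below assumes a hypothetical `PastLesson` shape and `buildActorPrompt` helper; the prompt wording and the `maxLessons` cap are illustrative choices, not part of the framework.

```typescript
// Hypothetical helper; names and prompt wording are illustrative assumptions.
interface PastLesson {
  taskType: string;
  lesson: string;
  succeeded: boolean;
}

function buildActorPrompt(
  taskType: string,
  taskDescription: string,
  memory: PastLesson[],
  maxLessons = 5, // assumed cap to keep the prompt short
): string {
  // Keep only lessons for this task type, preferring the most recent ones.
  const relevant = memory
    .filter((m) => m.taskType === taskType)
    .slice(-maxLessons);
  const lines = relevant.map(
    (m) => `- [${m.succeeded ? "worked" : "failed"}] ${m.lesson}`,
  );
  const memorySection =
    lines.length > 0 ? lines.join("\n") : "- (no prior lessons)";
  return [
    `Task: ${taskDescription}`,
    "Lessons from previous attempts:",
    memorySection,
    // Asking for explicit steps yields a trajectory the Evaluator can inspect.
    "Describe each step you take so the attempt can be evaluated.",
  ].join("\n");
}
```

Marking each lesson as "worked" or "failed" lets the Actor consider previous successes as well as failures, which matches the Context Awareness responsibility.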
2. Evaluator Component
Purpose: Assess Actor performance and provide feedback
Evaluation Criteria:
- Task Success: Did the Actor achieve the intended outcome?
- Quality Assessment: How well was the task executed?
- Efficiency Analysis: Was the approach optimal?
- Error Detection: What mistakes were made?
- Improvement Potential: Where can performance be enhanced?
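Per-criterion judgments like these are often combined into a single score. The sketch below shows a weighted average on a 0-1 scale; the `WeightedCriterion` type, the weights, and the choice to treat a missing criterion as 0 are assumptions for illustration, since the source does not specify how criteria are aggregated.

```typescript
// Hypothetical type; the framework's own criterion definition may differ.
interface WeightedCriterion {
  name: string;
  weight: number; // relative importance, any positive number
}

function aggregateScore(
  criteria: WeightedCriterion[],
  criteriaScores: Record<string, number>,
): number {
  let weighted = 0;
  let totalWeight = 0;
  for (const criterion of criteria) {
    // Assumption: a criterion with no recorded score counts as 0.
    const raw = criteriaScores[criterion.name] ?? 0;
    const clamped = Math.min(1, Math.max(0, raw)); // keep each score in 0-1
    weighted += clamped * criterion.weight;
    totalWeight += criterion.weight;
  }
  // Weighted average; 0 when no criteria (or no weight) were supplied.
  return totalWeight > 0 ? weighted / totalWeight : 0;
}

const overallDemo = aggregateScore(
  [
    { name: "taskSuccess", weight: 2 },
    { name: "quality", weight: 1 },
    { name: "efficiency", weight: 1 },
  ],
  { taskSuccess: 1, quality: 0.5, efficiency: 0 },
);
// (2 * 1 + 1 * 0.5 + 1 * 0) / 4 = 0.625
```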
3. Self-Reflection Component
Purpose: Generate linguistic feedback for continuous improvement
Reflection Types:
- Success Analysis: What worked well and why?
- Failure Analysis: What went wrong and how to fix it?
- Pattern Recognition: What patterns emerge from multiple attempts?
- Strategy Refinement: How can approaches be improved?
- Knowledge Gaps: What knowledge is missing or incomplete?
4. Memory Manager
Purpose: Store and retrieve lessons learned and experiences
Memory Types:
- Episodic Memory: Specific task attempts and outcomes
- Semantic Memory: General lessons and principles learned
- Procedural Memory: Improved methods and approaches
- Meta-Memory: Knowledge about what has been learned
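The four memory types can be held in one store that tags each entry by kind. The following is a rough in-memory sketch under stated assumptions: the `ReflexionMemoryStore` class, its eviction rule (drop the oldest entries beyond a maximum), and its day-based retention filter are hypothetical, and a real implementation would persist entries through the server's file system utilities or cache rather than an array.

```typescript
type MemoryKind = "episodic" | "semantic" | "procedural" | "meta";

// Hypothetical entry shape for illustration.
interface MemoryEntry {
  kind: MemoryKind;
  content: string;
  timestamp: string; // ISO 8601
}

class ReflexionMemoryStore {
  private entries: MemoryEntry[] = [];
  private readonly maxEntries: number;
  private readonly retentionDays: number;

  constructor(maxEntries: number, retentionDays: number) {
    this.maxEntries = maxEntries;
    this.retentionDays = retentionDays;
  }

  // Append an entry, evicting the oldest ones once the cap is exceeded.
  add(entry: MemoryEntry): void {
    this.entries.push(entry);
    if (this.entries.length > this.maxEntries) {
      this.entries.splice(0, this.entries.length - this.maxEntries);
    }
  }

  // Return entries of one kind that are still inside the retention window.
  recall(kind: MemoryKind, now: Date = new Date()): MemoryEntry[] {
    const cutoff = now.getTime() - this.retentionDays * 24 * 60 * 60 * 1000;
    return this.entries.filter(
      (e) => e.kind === kind && Date.parse(e.timestamp) >= cutoff,
    );
  }
}
```

Filtering by retention at read time keeps `add` cheap; a store that must also reclaim space would need a separate pruning step.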
Reflexion Framework Interfaces
Core Reflexion Types
export interface ReflexionConfig {
  memoryEnabled: boolean;
  maxMemoryEntries: number;
  reflectionDepth: 'basic' | 'detailed' | 'comprehensive';
  evaluationCriteria: EvaluationCriterion[];
  learningRate: number; // How quickly to adapt (0-1)
  memoryRetention: number; // How long to keep memories (days)
  feedbackIntegration: boolean; // Enable external feedback
}

export interface TaskAttempt {
  attemptId: string;
  taskType: string;
  context: any;
  action: string;
  outcome: TaskOutcome;
  evaluation: EvaluationResult;
  reflection: SelfReflection;
  timestamp: string;
  metadata: AttemptMetadata;
}

export interface TaskOutcome {
  success: boolean;
  result: any;
  errors: string[];
  warnings: string[];
  executionTime: number;
  resourcesUsed: ResourceUsage;
}

export interface EvaluationResult {
  overallScore: number; // 0-1 scale
  criteriaScores: Record<string, number>;
  feedback: string[];
  strengths: string[];
  weaknesses: string[];
  improvementAreas: string[];
  confidence: number; // 0-1 scale
}

export interface SelfReflection {
  reflectionText: string;
  lessonsLearned: string[];
  actionableInsights: string[];
  futureStrategies: string[];
  knowledgeGaps: string[];
  confidenceLevel: number; // 0-1 scale
  applicability: string[]; // Where these lessons apply
}
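
Several of these fields carry range constraints only in comments (for example, `learningRate` on a 0-1 scale). A small runtime check, sketched below, is one way such constraints might be enforced. The `validateReflexionConfig` function is hypothetical, `EvaluationCriterion` is stubbed as a plain string because its definition is not shown here, and the rules for `maxMemoryEntries`, `memoryRetention`, and `evaluationCriteria` are assumptions rather than documented requirements.

```typescript
// Placeholder: EvaluationCriterion is not defined in this section, so it is
// stubbed as a string purely to keep the example self-contained.
type EvaluationCriterion = string;

interface ReflexionConfig {
  memoryEnabled: boolean;
  maxMemoryEntries: number;
  reflectionDepth: "basic" | "detailed" | "comprehensive";
  evaluationCriteria: EvaluationCriterion[];
  learningRate: number; // 0-1
  memoryRetention: number; // days
  feedbackIntegration: boolean;
}

// Hypothetical validator: returns a list of problems, empty when valid.
function validateReflexionConfig(config: ReflexionConfig): string[] {
  const problems: string[] = [];
  // Assumption: the memory cap must be a positive whole number.
  if (!Number.isInteger(config.maxMemoryEntries) || config.maxMemoryEntries < 1) {
    problems.push("maxMemoryEntries must be a positive integer");
  }
  // From the interface comment: learningRate is on a 0-1 scale.
  if (!(config.learningRate >= 0 && config.learningRate <= 1)) {
    problems.push("learningRate must be between 0 and 1");
  }
  // Assumption: retention is a positive number of days.
  if (!(config.memoryRetention > 0)) {
    problems.push("memoryRetention must be a positive number of days");
  }
  // Assumption: at least one criterion is needed to evaluate anything.
  if (config.evaluationCriteria.length === 0) {
    problems.push("evaluationCriteria must not be empty");
  }
  return problems;
}

const exampleConfig: ReflexionConfig = {
  memoryEnabled: true,
  maxMemoryEntries: 100,
  reflectionDepth: "detailed",
  evaluationCriteria: ["task-success", "quality"],
  learningRate: 0.7,
  memoryRetention: 90,
  feedbackIntegration: false,
};
```

Returning a list of problems instead of throwing on the first one lets a caller report every misconfigured field at once.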