PathTracer is a Windows application that combines system audio recording with multi-provider AI analysis. The application features:
- Multi-Model AI Integration: Support for GPT-4, Claude, Gemini, and GPT-4 Vision
- High-Quality Audio Recording: Windows-specific WASAPI loopback recording
- Real-time AI Analysis: Analyze recordings with multiple AI models
- Cross-Platform Architecture: Built with .NET 8 and MAUI for future expansion
- Privacy-Focused Design: Local processing with optional cloud integration
- Multiple AI Providers: GPT-4, Claude, Gemini, and GPT-4 Vision
- Consensus AI Mode: Combines outputs from multiple AI models for better results
- Image Analysis: Vision capabilities for analyzing screenshots and images
- Health Monitoring: Real-time status checking of all AI providers
- WASAPI Loopback Recording: High-quality system audio capture
- Non-Intrusive: Doesn't interfere with other audio applications
- Real-time Processing: Live audio data handling
- Multiple Formats: WAV output with configurable quality settings
- System Tray: Minimize to system tray with context menu
- Modern UI: Clean, professional Windows Forms interface
- Multi-Tab Interface: Separate tabs for AI, Recording, and Settings
- Real-time Status: Live updates on recording and AI processing status
PathTracer_Project/
├── Pathtracer.Core/                        # Core interfaces and models
│   ├── Interfaces/
│   │   ├── IEnhancedAIClient.cs            # Multi-model AI client interface
│   │   ├── IGeminiClient.cs                # Original Gemini interface
│   │   └── IAgents.cs                      # Agent framework interfaces
│   └── Models/
│       ├── GeminiClientSettings.cs         # AI settings with multi-provider support
│       └── Recording/                      # Recording models
├── Pathtracer.Gemini/                      # AI implementation
│   ├── EnhancedAIClient.cs                 # Multi-model AI client
│   ├── GeminiClient.cs                     # Original Gemini client
│   └── AIInteractionAgent.cs               # AI-powered agent
├── Pathtracer.UI.Windows/                  # Windows Forms application
│   ├── MainForm.cs                         # Main application window
│   ├── Program.cs                          # Application entry point
│   └── WindowsSystemTrayAgent.cs           # System tray integration
└── src/PathTracer.MAUI/                    # MAUI cross-platform project
    └── Services/Recording/
        └── WindowsAudioRecordingAgent.cs   # Enhanced audio recording
The EnhancedAIClient class provides unified access to multiple AI providers:
- GPT-4 Integration: OpenAI's GPT-4 Turbo model
- Claude Integration: Anthropic's Claude 3 Sonnet
- Gemini Integration: Google Gemini Pro and Vision
- Consensus Mode: Combines outputs from multiple models
- Image Analysis: Vision capabilities for screenshot analysis
Enhanced audio recording with:
- WASAPI Loopback: Captures system audio without interference
- NAudio Integration: Professional audio processing library
- Real-time Statistics: Live recording metrics
- File Management: Automatic organization of recordings
- Error Handling: Comprehensive error reporting
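As a rough illustration of the loopback approach listed above, the sketch below records system audio to a WAV file using NAudio's `WasapiLoopbackCapture` and `WaveFileWriter`. The class name `LoopbackRecorderSketch` and its members are hypothetical; the project's `WindowsAudioRecordingAgent` presumably adds statistics, file organization, and error reporting on top of something like this. It assumes the NAudio NuGet package is referenced and that a default playback device exists.

```csharp
using System;
using NAudio.Wave;

// Hypothetical minimal recorder, not the project's WindowsAudioRecordingAgent.
public sealed class LoopbackRecorderSketch : IDisposable
{
    private readonly WasapiLoopbackCapture _capture = new();
    private readonly WaveFileWriter _writer;

    // Simple live metric: total bytes captured so far.
    public long BytesWritten { get; private set; }

    public LoopbackRecorderSketch(string outputPath)
    {
        // The playback device dictates the capture format,
        // so the WAV header is created from the capture's own WaveFormat.
        _writer = new WaveFileWriter(outputPath, _capture.WaveFormat);

        // NAudio raises DataAvailable on its capture thread for each buffer.
        _capture.DataAvailable += (_, e) =>
        {
            _writer.Write(e.Buffer, 0, e.BytesRecorded);
            BytesWritten += e.BytesRecorded;
        };
    }

    public void Start() => _capture.StartRecording();

    public void Stop() => _capture.StopRecording();

    public void Dispose()
    {
        // Dispose the capture first so no buffer arrives after the file is closed;
        // disposing the writer finalizes the WAV header.
        _capture.Dispose();
        _writer.Dispose();
    }
}
```

A caller would typically wrap the recorder in a `using` block, call `Start()`, wait for the desired duration, then call `Stop()`. Loopback capture only reads what the device is already playing, which is why it does not interfere with other audio applications.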
Modern Windows Forms interface featuring:
- Material Design: Clean, modern UI styling
- Multi-Tab Layout: AI Assistant, Recording, and Settings tabs
- System Tray Integration: Background operation support
- Real-time Updates: Live status and progress indicators
- Error Handling: User-friendly error messages
- .NET 8 SDK: Download from Microsoft
- Visual Studio 2022: With .NET desktop development workload
- Windows 10/11: For WASAPI audio recording support
Set up API keys as environment variables:
# PowerShell
[Environment]::SetEnvironmentVariable("GEMINI_API_KEY", "your-gemini-api-key", "User")
[Environment]::SetEnvironmentVariable("OPENAI_API_KEY", "your-openai-api-key", "User")
[Environment]::SetEnvironmentVariable("ANTHROPIC_API_KEY", "your-anthropic-api-key", "User")
# Command Prompt
setx GEMINI_API_KEY "your-gemini-api-key"
setx OPENAI_API_KEY "your-openai-api-key"
setx ANTHROPIC_API_KEY "your-anthropic-api-key"
- Clone the repository (or navigate to the project directory)
- Restore dependencies:
dotnet restore
- Build the solution:
dotnet build
- Run the Windows application:
cd Pathtracer.UI.Windows
dotnet run
The application uses environment variables for API configuration:
- GEMINI_API_KEY: Google Gemini API key
- OPENAI_API_KEY: OpenAI API key for GPT-4
- ANTHROPIC_API_KEY: Anthropic API key for Claude
Settings can also be configured through the Settings tab in the application.
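To make the configuration step easier to verify, a startup check along the following lines could report which keys are missing. This is only a sketch: `ApiKeyCheck` and `FindMissing` are hypothetical names, not part of the PathTracer codebase, and only the three variable names are taken from the list above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; only the variable names come from the configuration list.
public static class ApiKeyCheck
{
    public static readonly string[] KeyNames =
    {
        "GEMINI_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY"
    };

    // Returns the names of keys that are unset or blank.
    // The lookup is injected so the check can be exercised without touching the real environment.
    public static List<string> FindMissing(Func<string, string?> lookup)
    {
        var missing = new List<string>();
        foreach (var name in KeyNames)
        {
            if (string.IsNullOrWhiteSpace(lookup(name)))
            {
                missing.Add(name);
            }
        }
        return missing;
    }

    // Convenience overload that reads the current process environment.
    public static List<string> FindMissing() =>
        FindMissing(Environment.GetEnvironmentVariable);
}
```

Calling `ApiKeyCheck.FindMissing()` at startup returns an empty list when all three keys are present. Note that values set with `setx` or with the `"User"` target are only visible to processes started afterwards, so an already-open terminal or IDE needs to be restarted before the check passes.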
- Select AI Provider: Choose from Gemini, GPT-4, Claude, or Consensus mode
- Enter Prompt: Type your question or request
- Generate Response: Click "Generate AI Response"
- View Results: AI response appears in the output area
- Health Check: Verify AI provider availability
- Start Recording: Click "Start Audio Recording"
- Monitor Progress: View real-time recording status
- Stop Recording: Click "Stop Recording" when finished
- AI Analysis: Click "AI Analyze Recording" for automatic analysis
- View Recordings: Access recording history and statistics
- API Configuration: Enter API keys for AI providers
- Recording Quality: Configure audio recording settings
- AI Parameters: Adjust temperature, max tokens, etc.
- Save Settings: Persist configuration changes
The Consensus mode combines outputs from multiple AI models to provide more reliable and comprehensive responses. This is particularly useful for:
- Complex analysis tasks
- Fact-checking and verification
- Generating diverse perspectives
- Improving response quality
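How the outputs are combined is not specified here, so the following is only one plausible shape for it: query every provider concurrently, drop the ones that fail, and return the remaining answers labeled by provider. `ConsensusSketch`, `AskAllAsync`, and the delegate-based provider model are assumptions made for illustration; the real `EnhancedAIClient` may merge or rank responses differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical sketch; not the project's actual consensus logic.
public static class ConsensusSketch
{
    // Each provider is modeled as a name plus an async function from prompt to answer.
    public static async Task<string> AskAllAsync(
        string prompt,
        IReadOnlyDictionary<string, Func<string, Task<string>>> providers)
    {
        // ToList() starts every request before any is awaited, so they run concurrently.
        var calls = providers
            .Select(p => (Name: p.Key, Pending: SafeCall(p.Value, prompt)))
            .ToList();

        await Task.WhenAll(calls.Select(c => c.Pending));

        // Keep only providers that answered; failures were mapped to null.
        var sections = calls
            .Where(c => c.Pending.Result != null)
            .Select(c => $"[{c.Name}]\n{c.Pending.Result}");

        return string.Join("\n\n", sections);
    }

    // Converts a provider failure into null so one outage does not fail the whole request.
    private static async Task<string?> SafeCall(
        Func<string, Task<string>> call, string prompt)
    {
        try
        {
            return await call(prompt);
        }
        catch (Exception)
        {
            return null;
        }
    }
}
```

Tolerating individual failures matters for a mode like this, since it depends on several external services at once. A stricter variant could require a minimum number of successful answers before returning anything.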
The application supports image analysis through GPT-4 Vision and Gemini Vision:
- Screenshot analysis
- Document processing
- Visual content understanding
- OCR and text extraction
The application runs in the background with system tray support:
- Minimize to system tray
- Quick recording controls
- Status notifications
- Background AI processing
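For readers unfamiliar with tray support in Windows Forms, the sketch below shows the usual pattern with `NotifyIcon` and `ContextMenuStrip`: hide the window on minimize and restore it from the tray menu. `TrayFormSketch` is a hypothetical form written for illustration; the project's `WindowsSystemTrayAgent` may be organized differently and also exposes recording controls.

```csharp
using System;
using System.Drawing;
using System.Windows.Forms;

// Hypothetical form; not the project's WindowsSystemTrayAgent.
public class TrayFormSketch : Form
{
    private readonly NotifyIcon _tray;

    public TrayFormSketch()
    {
        // Context menu shown when the tray icon is right-clicked.
        var menu = new ContextMenuStrip();
        menu.Items.Add("Show", null, (_, _) => RestoreFromTray());
        menu.Items.Add("Exit", null, (_, _) => Application.Exit());

        _tray = new NotifyIcon
        {
            Icon = SystemIcons.Application,
            Text = "PathTracer",
            ContextMenuStrip = menu,
            Visible = true
        };
        _tray.DoubleClick += (_, _) => RestoreFromTray();
    }

    // Hide the window when minimized so only the tray icon remains.
    protected override void OnResize(EventArgs e)
    {
        base.OnResize(e);
        if (WindowState == FormWindowState.Minimized)
        {
            Hide();
            _tray.ShowBalloonTip(2000, "PathTracer",
                "Still running in the system tray.", ToolTipIcon.Info);
        }
    }

    private void RestoreFromTray()
    {
        Show();
        WindowState = FormWindowState.Normal;
        Activate();
    }

    protected override void Dispose(bool disposing)
    {
        // Without this the icon can linger in the tray after exit.
        if (disposing)
        {
            _tray.Dispose();
        }
        base.Dispose(disposing);
    }
}
```

Running it requires a Windows Forms project and an entry point such as `Application.Run(new TrayFormSketch());`.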
To add a new AI provider:
- Update IEnhancedAIClient: Add the new provider to the AIModelProvider enum
- Implement Provider Logic: Add a new method in EnhancedAIClient
- Update Settings: Add an API key property to GeminiClientSettings
- Update UI: Add the provider to the combo box in MainForm
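The first two steps might look roughly like the following. Only the `AIModelProvider` enum name comes from the steps above; the enum members, `ProviderRouterSketch`, and the placeholder methods are assumptions for illustration and do not reflect the actual signatures in `EnhancedAIClient`.

```csharp
using System;
using System.Threading.Tasks;

// Enum name comes from the steps above; the member names are assumptions.
public enum AIModelProvider
{
    Gemini,
    Gpt4,
    Claude,
    ExampleProvider // step 1: hypothetical new entry
}

// Hypothetical stand-in for the routing inside EnhancedAIClient.
public class ProviderRouterSketch
{
    public Task<string> GenerateAsync(AIModelProvider provider, string prompt) =>
        provider switch
        {
            AIModelProvider.Gemini => CallGeminiAsync(prompt),
            AIModelProvider.ExampleProvider => CallExampleProviderAsync(prompt),
            // Members without a route fail loudly instead of silently falling back.
            _ => throw new NotSupportedException($"No handler for {provider}.")
        };

    // Placeholder: a real method would call the provider's HTTP API.
    private Task<string> CallGeminiAsync(string prompt) =>
        Task.FromResult($"gemini: {prompt}");

    // Step 2: the method added for the new provider (placeholder body).
    private Task<string> CallExampleProviderAsync(string prompt) =>
        Task.FromResult($"example: {prompt}");
}
```

Throwing for unrouted enum members makes a forgotten step visible at the first call rather than producing an answer from the wrong provider.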
To add new recording features:
- Update IAudioRecordingAgent: Add new method signatures
- Implement in WindowsAudioRecordingAgent: Add the concrete implementation
- Update Models: Add new model classes if needed
- Update UI: Add controls to MainForm
The MAUI project structure supports future cross-platform expansion:
- Android: Platform-specific implementations in Platforms/Android/
- iOS: Platform-specific implementations in Platforms/iOS/
- macOS: Platform-specific implementations in Platforms/MacCatalyst/
- AI Provider Errors:
  - Check API keys in environment variables
  - Verify internet connectivity
  - Check provider service status
- Audio Recording Issues:
  - Ensure Windows audio services are running
  - Check audio device permissions
  - Verify NAudio dependencies
- Build Errors:
  - Ensure .NET 8 SDK is installed
  - Restore NuGet packages
  - Check project references
The application logs detailed information to:
- Console output
- Debug output
- Windows Event Log (for critical errors)
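A logger that fans out to these three destinations could be sketched as follows. `LogSketch` is a hypothetical helper, not the project's logging code, and the use of the existing "Application" event source is an assumption; writing under a custom source name generally requires registering it first with administrator rights. Outside a Windows desktop project, `EventLog` may need the System.Diagnostics.EventLog package.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; the project's real logging setup is not shown in this document.
public static class LogSketch
{
    // Builds one timestamped line, e.g. "<ISO 8601 time> [INFO] message".
    public static string Format(string level, string message) =>
        $"{DateTime.UtcNow:O} [{level}] {message}";

    // Routine messages go to the console and to the debugger output window.
    public static void Info(string message)
    {
        var line = Format("INFO", message);
        Console.WriteLine(line);
        Debug.WriteLine(line);
    }

    // Critical errors are additionally written to the Windows Event Log.
    public static void Critical(string message)
    {
        var line = Format("CRITICAL", message);
        Console.Error.WriteLine(line);
        Debug.WriteLine(line);

        if (OperatingSystem.IsWindows())
        {
            try
            {
                // Assumes the "Application" source is available on this machine.
                EventLog.WriteEntry("Application", line, EventLogEntryType.Error);
            }
            catch (Exception)
            {
                // Logging must never take the application down.
            }
        }
    }
}
```

`Debug.WriteLine` calls are compiled out of Release builds, so debug output only appears when running a Debug configuration.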
For issues and questions:
- Check the logs for error details
- Verify environment variables are set correctly
- Ensure all dependencies are installed
- Check API provider quotas and limits
- Screen Recording: Full screen capture capabilities
- Video Analysis: AI-powered video content analysis
- Real-time Transcription: Live speech-to-text conversion
- Cloud Integration: Azure cloud services integration
- Mobile Apps: Android and iOS companion apps
- Web Dashboard: Browser-based management interface
- Parallel Processing: Multi-threaded AI processing
- Caching: Intelligent response caching
- Streaming: Real-time audio streaming
- Compression: Efficient data compression
- Encryption: Enhanced security and privacy
This project is part of the PathTracer ecosystem and follows the same licensing terms.
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Implement your changes
- Add tests if applicable
- Submit a pull request
- API keys are read from environment variables, which keeps them out of source code and project files
- Local data is encrypted at rest
- Network communications use HTTPS
- Privacy-focused design with local processing options