JarrodAI/pathtracer-windows


PathTracer Windows Application - Complete AI-Enhanced Recording Platform

Overview

PathTracer is a Windows application that combines system audio recording with multi-model AI integration. The application features:

  • Multi-Model AI Integration: Support for GPT-4, Claude, Gemini, and GPT-4 Vision
  • High-Quality Audio Recording: Windows-specific WASAPI loopback recording
  • Real-time AI Analysis: Analyze recordings with multiple AI models
  • Cross-Platform Architecture: Built on .NET 8, with a MAUI project in place for future expansion
  • Privacy-Focused Design: Local processing with optional cloud integration

Features

🤖 AI Capabilities

  • Multiple AI Providers: GPT-4, Claude, Gemini, and GPT-4 Vision
  • Consensus AI Mode: Combines outputs from multiple AI models for better results
  • Image Analysis: Vision capabilities for analyzing screenshots and images
  • Health Monitoring: Real-time status checking of all AI providers

🎵 Audio Recording

  • WASAPI Loopback Recording: High-quality system audio capture
  • Non-Intrusive: Doesn't interfere with other audio applications
  • Real-time Processing: Live audio data handling
  • Output Format: WAV output with configurable quality settings

🖥️ Windows Integration

  • System Tray: Minimize to system tray with context menu
  • Modern UI: Clean, professional Windows Forms interface
  • Multi-Tab Interface: Separate tabs for AI, Recording, and Settings
  • Real-time Status: Live updates on recording and AI processing status

Architecture

Project Structure

PathTracer_Project/
├── Pathtracer.Core/                    # Core interfaces and models
│   ├── Interfaces/
│   │   ├── IEnhancedAIClient.cs        # Multi-model AI client interface
│   │   ├── IGeminiClient.cs            # Original Gemini interface
│   │   └── IAgents.cs                  # Agent framework interfaces
│   └── Models/
│       ├── GeminiClientSettings.cs     # AI settings with multi-provider support
│       └── Recording/                  # Recording models
├── Pathtracer.Gemini/                  # AI implementation
│   ├── EnhancedAIClient.cs             # Multi-model AI client
│   ├── GeminiClient.cs                 # Original Gemini client
│   └── AIInteractionAgent.cs           # AI-powered agent
├── Pathtracer.UI.Windows/              # Windows Forms application
│   ├── MainForm.cs                     # Main application window
│   ├── Program.cs                      # Application entry point
│   └── WindowsSystemTrayAgent.cs       # System tray integration
└── src/PathTracer.MAUI/                # MAUI cross-platform project
    └── Services/Recording/
        └── WindowsAudioRecordingAgent.cs # Enhanced audio recording

Key Components

EnhancedAIClient

The EnhancedAIClient class provides unified access to multiple AI providers:

  • GPT-4 Integration: OpenAI's GPT-4 Turbo model
  • Claude Integration: Anthropic's Claude 3 Sonnet
  • Gemini Integration: Google Gemini Pro and Vision
  • Consensus Mode: Combines outputs from multiple models
  • Image Analysis: Vision capabilities for screenshot analysis

WindowsAudioRecordingAgent

Enhanced audio recording with:

  • WASAPI Loopback: Captures system audio without interference
  • NAudio Integration: Professional audio processing library
  • Real-time Statistics: Live recording metrics
  • File Management: Automatic organization of recordings
  • Error Handling: Comprehensive error reporting
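
As a rough illustration of the real-time statistics and file organization listed above, the sketch below derives the recording duration from the number of PCM bytes captured and builds a dated path for each WAV file. It is written in Python only to keep it short and runnable; the actual agent is C# and uses NAudio. The names `RecordingStats` and `recording_path`, the folder layout, and the default format (48 kHz, 16-bit, stereo) are assumptions for illustration, not taken from the project.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path


@dataclass
class RecordingStats:
    """Running totals for a PCM recording (hypothetical helper)."""
    sample_rate: int = 48_000   # samples per second, assumed default
    channels: int = 2           # stereo, assumed default
    bits_per_sample: int = 16   # assumed default
    bytes_recorded: int = 0

    def add_buffer(self, buffer: bytes) -> None:
        # Called once for every captured audio buffer.
        self.bytes_recorded += len(buffer)

    @property
    def bytes_per_second(self) -> int:
        return self.sample_rate * self.channels * self.bits_per_sample // 8

    @property
    def duration_seconds(self) -> float:
        return self.bytes_recorded / self.bytes_per_second


def recording_path(root: Path, started: datetime) -> Path:
    # One folder per day, one file per recording, named by start time.
    return root / started.strftime("%Y-%m-%d") / started.strftime("recording_%H%M%S.wav")
```

Updating the statistics from the capture callback keeps the duration display independent of the file writer, so the UI can poll it without touching the output stream.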

MainForm

Modern Windows Forms interface featuring:

  • Material Design: Clean, modern UI styling
  • Multi-Tab Layout: AI Assistant, Recording, and Settings tabs
  • System Tray Integration: Background operation support
  • Real-time Updates: Live status and progress indicators
  • Error Handling: User-friendly error messages

Setup Instructions

Prerequisites

  1. .NET 8 SDK: Download from Microsoft
  2. Visual Studio 2022: With .NET desktop development workload
  3. Windows 10/11: For WASAPI audio recording support

Environment Variables

Set up API keys as environment variables:

# PowerShell
[Environment]::SetEnvironmentVariable("GEMINI_API_KEY", "your-gemini-api-key", "User")
[Environment]::SetEnvironmentVariable("OPENAI_API_KEY", "your-openai-api-key", "User")
[Environment]::SetEnvironmentVariable("ANTHROPIC_API_KEY", "your-anthropic-api-key", "User")

# Command Prompt
setx GEMINI_API_KEY "your-gemini-api-key"
setx OPENAI_API_KEY "your-openai-api-key"
setx ANTHROPIC_API_KEY "your-anthropic-api-key"

Both methods store the variables at the user level. They take effect only in terminals and applications started afterwards, so open a new terminal (or restart Visual Studio) before building or running.

Building the Application

  1. Clone the repository (or navigate to the project directory)
  2. Restore dependencies:
    dotnet restore
  3. Build the solution:
    dotnet build
  4. Run the Windows application:
    cd Pathtracer.UI.Windows
    dotnet run

Configuration

The application uses environment variables for API configuration:

  • GEMINI_API_KEY: Google Gemini API key
  • OPENAI_API_KEY: OpenAI API key for GPT-4
  • ANTHROPIC_API_KEY: Anthropic API key for Claude

Settings can also be configured through the Settings tab in the application.
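
To check which providers are configured before launching the application, a small script along these lines can help. It is a minimal sketch in Python, not part of the project: the three variable names come from the list above, while `API_KEY_VARIABLES` and `load_api_keys` are hypothetical names introduced here.

```python
import os

# Variable names are from the list above; the provider labels are illustrative.
API_KEY_VARIABLES = {
    "Gemini": "GEMINI_API_KEY",
    "GPT-4": "OPENAI_API_KEY",
    "Claude": "ANTHROPIC_API_KEY",
}


def load_api_keys(environ=None):
    """Return (keys, missing): configured keys by provider, and unset variable names."""
    environ = os.environ if environ is None else environ
    keys, missing = {}, []
    for provider, variable in API_KEY_VARIABLES.items():
        value = environ.get(variable, "").strip()
        if value:
            keys[provider] = value
        else:
            # Unset and blank values are both treated as missing.
            missing.append(variable)
    return keys, missing
```

Calling `load_api_keys()` with no arguments reads the current process environment; passing a dictionary instead makes the function easy to exercise without changing real settings.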

Usage Guide

AI Assistant Tab

  1. Select AI Provider: Choose from Gemini, GPT-4, Claude, or Consensus mode
  2. Enter Prompt: Type your question or request
  3. Generate Response: Click "Generate AI Response"
  4. View Results: AI response appears in the output area
  5. Health Check: Verify AI provider availability

Audio Recording Tab

  1. Start Recording: Click "Start Audio Recording"
  2. Monitor Progress: View real-time recording status
  3. Stop Recording: Click "Stop Recording" when finished
  4. AI Analysis: Click "AI Analyze Recording" for automatic analysis
  5. View Recordings: Access recording history and statistics

Settings Tab

  1. API Configuration: Enter API keys for AI providers
  2. Recording Quality: Configure audio recording settings
  3. AI Parameters: Adjust temperature, max tokens, etc.
  4. Save Settings: Persist configuration changes
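
The README does not say how settings are persisted. One simple approach, sketched here in Python under that caveat, is to validate the AI parameters and write them to a JSON file. The `AISettings` class, its field names and defaults, the accepted temperature range, and the file format are all assumptions for illustration; valid ranges differ between providers.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class AISettings:
    """Hypothetical settings record; field names and defaults are assumptions."""
    temperature: float = 0.7
    max_tokens: int = 1024

    def validate(self) -> None:
        # Assumed limits; check each provider's documentation for real ranges.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")


def save_settings(settings: AISettings, path: Path) -> None:
    # Refuse to persist values that would be rejected on load.
    settings.validate()
    path.write_text(json.dumps(asdict(settings), indent=2), encoding="utf-8")


def load_settings(path: Path) -> AISettings:
    if not path.exists():
        return AISettings()  # fall back to defaults on first run
    settings = AISettings(**json.loads(path.read_text(encoding="utf-8")))
    settings.validate()
    return settings
```

Validating on both save and load means a hand-edited file with out-of-range values fails early instead of producing a confusing provider error later.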

Advanced Features

Consensus AI Mode

The Consensus mode combines outputs from multiple AI models to provide more reliable and comprehensive responses. This is particularly useful for:

  • Complex analysis tasks
  • Fact-checking and verification
  • Generating diverse perspectives
  • Improving response quality
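
How the outputs are combined is not described here, so the following is only one plausible shape for such a mode, sketched in Python rather than the project's C#. It queries every provider concurrently, ignores the ones that fail, and joins the remaining answers into labeled sections. The `consensus` function, the `Provider` type, and the stand-in providers are hypothetical; a real implementation might instead vote, rank, or ask one model to summarize the others.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# A provider is any async function that maps a prompt to a text response.
Provider = Callable[[str], Awaitable[str]]


async def consensus(prompt: str, providers: Dict[str, Provider]) -> str:
    """Query every provider concurrently and merge the answers that succeed."""
    names = list(providers)
    results = await asyncio.gather(
        *(providers[name](prompt) for name in names),
        return_exceptions=True,  # one failing provider should not abort the rest
    )
    answers = {
        name: result
        for name, result in zip(names, results)
        if not isinstance(result, BaseException)
    }
    if not answers:
        raise RuntimeError("no provider returned a response")
    # Simplest possible merge: one labeled section per provider.
    return "\n\n".join(f"[{name}]\n{text}" for name, text in answers.items())


# Stand-in providers so the sketch runs without network access or API keys.
async def fake_gemini(prompt: str) -> str:
    return f"Gemini says: {prompt}"


async def fake_claude(prompt: str) -> str:
    raise TimeoutError("simulated outage")
```

Running `asyncio.run(consensus("hello", {"Gemini": fake_gemini, "Claude": fake_claude}))` returns only the Gemini section, because the simulated Claude failure is filtered out rather than propagated.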

Image Analysis

The application supports image analysis through GPT-4 Vision and Gemini Vision:

  • Screenshot analysis
  • Document processing
  • Visual content understanding
  • OCR and text extraction

System Tray Integration

The application runs in the background with system tray support:

  • Minimize to system tray
  • Quick recording controls
  • Status notifications
  • Background AI processing

Development

Adding New AI Providers

To add a new AI provider:

  1. Update IEnhancedAIClient: Add new provider to AIModelProvider enum
  2. Implement Provider Logic: Add new method in EnhancedAIClient
  3. Update Settings: Add API key property to GeminiClientSettings
  4. Update UI: Add provider to combo box in MainForm
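
The four steps above can be pictured with a compact, language-neutral sketch. It is written in Python for brevity and does not mirror the actual C# classes: only the enum name `AIModelProvider` appears in this README, while the enum members, the `EXAMPLE` provider, `call_example`, `EXAMPLE_API_KEY`, and the lookup tables are hypothetical.

```python
import os
from enum import Enum
from typing import Callable, Dict


class AIModelProvider(Enum):
    GEMINI = "Gemini"
    GPT4 = "GPT-4"
    CLAUDE = "Claude"
    EXAMPLE = "Example"  # step 1: hypothetical new provider added to the enum


def call_example(prompt: str, api_key: str) -> str:
    # Step 2: provider logic. A real implementation would call the provider's
    # HTTP API here; this stub only echoes the prompt.
    return f"[Example stub] {prompt}"


# Step 3: settings gain an API key entry for the new provider.
API_KEY_VARIABLES: Dict[AIModelProvider, str] = {
    AIModelProvider.EXAMPLE: "EXAMPLE_API_KEY",  # assumed variable name
}

HANDLERS: Dict[AIModelProvider, Callable[[str, str], str]] = {
    AIModelProvider.EXAMPLE: call_example,
}


def generate(provider: AIModelProvider, prompt: str) -> str:
    handler = HANDLERS.get(provider)
    if handler is None:
        raise NotImplementedError(f"{provider.value} is not wired up in this sketch")
    api_key = os.environ.get(API_KEY_VARIABLES[provider], "")
    return handler(prompt, api_key)


# Step 4: the UI's provider list can be derived from the enum.
combo_box_items = [provider.value for provider in AIModelProvider]
```

Deriving the combo box entries from the enum keeps step 4 from drifting out of sync with step 1 when further providers are added.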

Extending Recording Capabilities

To add new recording features:

  1. Update IAudioRecordingAgent: Add new method signatures
  2. Implement in WindowsAudioRecordingAgent: Add concrete implementation
  3. Update Models: Add new model classes if needed
  4. Update UI: Add controls to MainForm

Cross-Platform Support

The MAUI project structure supports future cross-platform expansion:

  • Android: Platform-specific implementations in Platforms/Android/
  • iOS: Platform-specific implementations in Platforms/iOS/
  • macOS: Platform-specific implementations in Platforms/MacCatalyst/

Troubleshooting

Common Issues

  1. AI Provider Errors:
    • Check API keys in environment variables
    • Verify internet connectivity
    • Check provider service status
  2. Audio Recording Issues:
    • Ensure Windows audio services are running
    • Check audio device permissions
    • Verify NAudio dependencies
  3. Build Errors:
    • Ensure .NET 8 SDK is installed
    • Restore NuGet packages
    • Check project references

Logs

The application logs detailed information to:

  • Console output
  • Debug output
  • Windows Event Log (for critical errors)

Support

For issues and questions:

  1. Check the logs for error details
  2. Verify environment variables are set correctly
  3. Ensure all dependencies are installed
  4. Check API provider quotas and limits

Future Enhancements

Planned Features

  • Screen Recording: Full screen capture capabilities
  • Video Analysis: AI-powered video content analysis
  • Real-time Transcription: Live speech-to-text conversion
  • Cloud Integration: Azure cloud services integration
  • Mobile Apps: Android and iOS companion apps
  • Web Dashboard: Browser-based management interface

Performance Optimizations

  • Parallel Processing: Multi-threaded AI processing
  • Caching: Intelligent response caching
  • Streaming: Real-time audio streaming
  • Compression: Efficient data compression
  • Encryption: Enhanced security and privacy

License

This project is part of the PathTracer ecosystem and follows the same licensing terms.

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Implement your changes
  4. Add tests if applicable
  5. Submit a pull request

Security

  • API keys are read from user-level environment variables rather than hard-coded in source files
  • Local data is encrypted at rest
  • Network communications use HTTPS
  • Privacy-focused design with local processing options

About

PathTracer - AI-Enhanced Human Task Checker and Logger for Windows
