A voice-activated AI recipe recommender web application that runs locally on your laptop. Uses Web Speech API for voice input/output and OpenAI for recipe generation.
- 🎤 Voice Interface: Uses your laptop's microphone and speakers via Web Speech API
- 🤖 Animated Faces: Dark UI with expressive robot faces (^^, _, @@, o-o)
- ⏱️ Visual Timer: LCD-style countdown timer for cooking steps
- 🔊 Sound Effects: Plays startup, timer completion, and recipe completion sounds
- 🍳 AI Recipe Generation: Creates recipes based on your ingredients or dish name
- 🗣️ Text-to-Speech: PantryPal speaks instructions and guides you through each step
- 📝 Step-by-Step Guidance: Voice-controlled navigation through recipe steps
- Greeting: PantryPal introduces itself and asks for your ingredients
- Input Detection: Detects if you provided ingredients or a dish name
- Recipe Generation: Uses OpenAI GPT to create a personalized recipe
- Voice Navigation: Say "okay pal, next step" to move between steps
- Timer Support: Say "start timer" when a step requires timing
- Completion: PantryPal announces when the recipe is complete
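To make the "Input Detection" step concrete, here is a minimal Python illustration of the kind of classification the app performs when deciding whether you listed ingredients or named a dish. The keyword heuristic below is an assumption for demonstration only; the app actually delegates this decision to its `/api/check-input-type` endpoint.

```python
def classify_input(transcript: str) -> str:
    """Illustrative stand-in for the app's /api/check-input-type decision.

    The real check happens server-side; this heuristic is only a sketch.
    """
    text = transcript.lower().strip()
    # Phrases like "I have eggs, cheese, bread" strongly suggest an ingredient list.
    if text.startswith("i have") or "," in text:
        return "ingredients"
    # Otherwise treat short free-form input ("scrambled eggs") as a dish name.
    return "dish_name"


print(classify_input("I have eggs, cheese, bread"))  # -> ingredients
print(classify_input("scrambled eggs"))              # -> dish_name
```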
- Python 3.12 or higher
- OpenAI API key (get one at https://platform.openai.com)
- Modern web browser (Chrome, Safari, or Edge recommended)
- Microphone and speakers (built into most laptops)
- Platform Support:
- Windows: Uses browser TTS (Windows voices like Microsoft Zira/David)
- macOS: Uses browser TTS (macOS voices) with optional server-side TTS caching
- Linux: Uses browser TTS (system voices)
- Navigate to project directory:
  ```bash
  cd PantryPalApp
  ```
- Create virtual environment:
  ```bash
  python3.12 -m venv .venv
  source .venv/bin/activate
  ```
  If `python3.12` is not found, try `python3` or install Python 3.12+ from python.org
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Create `.env` file:
  ```bash
  echo "OPENAI_API_KEY=your_api_key_here" > .env
  ```
  Or manually create a `.env` file in the project root and add:
  ```
  OPENAI_API_KEY=your_actual_openai_api_key
  ```
- Run the application:
  ```bash
  python app.py
  ```
- Open in browser:
  - Navigate to `http://localhost:5001` (will look something like: http://127.0.0.1:5001)
  - Grant microphone permissions when prompted
  - Click anywhere on the page to enable audio (required for browser autoplay policy)
- Navigate to project directory:
  ```bash
  cd PantryPalApp
  ```
- Create virtual environment:
  ```bash
  python -m venv .venv
  .venv\Scripts\activate
  ```
  If `python` is not found, try `py` or install Python 3.12+ from python.org
  - Make sure to check "Add Python to PATH" during installation
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Create `.env` file:

  Option A - Using PowerShell:
  ```powershell
  "OPENAI_API_KEY=your_api_key_here" | Out-File -FilePath .env -Encoding utf8
  ```
  Option B - Manual creation:
  - Create a new file named `.env` in the project root (you may need to enable "Show hidden files" in File Explorer)
  - Add the following line:
    ```
    OPENAI_API_KEY=your_actual_openai_api_key
    ```
- Run the application:
  ```bash
  python app.py
  ```
- Open in browser:
  - Navigate to `http://localhost:5001` (will look something like: http://127.0.0.1:5001)
  - Grant microphone permissions when prompted
  - Click anywhere on the page to enable audio (required for browser autoplay policy)
Create a `.env` file in the project root with:
```
OPENAI_API_KEY=your_openai_api_key_here
PORT=5001  # Optional: defaults to 5001
```
The app runs on port 5001 by default. To change it:
- Set the `PORT` environment variable:
  ```bash
  export PORT=8080  # On Windows: set PORT=8080
  python app.py
  ```
- Or modify `app.py` directly (line 646):
  ```python
  port = int(os.environ.get('PORT', 5001))  # Change 5001 to your desired port
  ```
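The `.env` file is read into the environment at startup. A minimal sketch of how that loading typically looks with `python-dotenv` (assumed here because the app is configured through a `.env` file; the actual code in `app.py` may differ):

```python
import os

from dotenv import load_dotenv  # python-dotenv; assumed to be available

load_dotenv()  # reads .env from the project root into os.environ

OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
PORT = int(os.environ.get("PORT", 5001))  # falls back to 5001 when PORT is unset

if not OPENAI_API_KEY:
    raise RuntimeError("OPENAI_API_KEY not found in environment variables")
```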
```bash
python app.py
```
This starts the Flask development server on http://localhost:5001.
For production use, you can use a WSGI server like gunicorn:
```bash
# Install gunicorn (already in requirements.txt)
pip install gunicorn

# Run with gunicorn
gunicorn -w 4 -b 0.0.0.0:5001 app:app
```
Note: The app is designed for local use. For production deployment, you would need to:
- Set up HTTPS (required for microphone access)
- Configure a proper web server
- Handle CORS appropriately
- Set up proper security measures
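For a quick local HTTPS test (for example, to exercise microphone access from a device other than localhost), Flask can serve a self-signed certificate. This is only a sketch, not part of the app as shipped; it assumes `pyopenssl` is installed, and a real deployment should still sit behind a proper web server.

```python
# Sketch only: run the Flask app over HTTPS with an ad-hoc self-signed certificate.
# Requires: pip install pyopenssl
from app import app  # app.py exposes the Flask instance as `app` (see the gunicorn target app:app)

if __name__ == "__main__":
    # Browsers will warn about the self-signed certificate; accept it for testing only.
    app.run(host="0.0.0.0", port=5001, ssl_context="adhoc")
```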
- "I have [ingredients]" - Provide ingredients you have (e.g., "I have eggs, cheese, bread")
- "[dish name]" - Request a specific dish (e.g., "scrambled eggs")
- "okay pal, next step" - Move to the next recipe step
- "start timer" - Begin countdown for current step (if timer is required)
- "next" - Shortcut for next step (when in recipe flow)
- Start: PantryPal greets you and asks for ingredients
- Provide Input: Tell PantryPal what ingredients you have or what dish you want
- Time Available: Specify how much time you have (e.g., "30 minutes")
- Recipe Generation: PantryPal creates a recipe and lists ingredients
- Follow Steps: PantryPal guides you through each step with voice
- Timer: When a step requires timing, say "start timer" to begin countdown
- Next Step: Say "okay pal, next step" to continue
- Completion: PantryPal announces when the recipe is complete
PantryPalApp/
├── app.py # Flask backend server
├── templates/
│ └── index.html # Frontend UI (HTML, CSS, JavaScript)
├── sounds/ # Sound files
│ ├── startup.wav # Sound played on app start
│ ├── timer_done.wav # Sound when timer completes
│ └── recipe_done.wav # Sound when recipe completes
├── static/
│ └── tts_cache/ # Cached text-to-speech audio files (auto-generated)
├── requirements.txt # Python dependencies
├── .env # Environment variables (create this file, not in git)
└── README.md # This file
- Flask Server: Serves the web app and handles API requests
- OpenAI Integration: Generates recipes using GPT-4o-mini
- TTS Caching: Caches text-to-speech audio files (macOS only, optional)
- Recipe Parsing: Extracts steps, timers, and ingredients from AI responses
- Platform Support: Server-side TTS works on macOS; Windows/Linux use browser TTS
- API Endpoints:
  - `/` - Main application page
  - `/api/generate-recipe` - Generate recipe from ingredients
  - `/api/get-ingredients-list` - Get ingredients for a dish name
  - `/api/check-input-type` - Detect if input is dish name or ingredients
  - `/api/text-to-speech` - Generate TTS audio (macOS only)
  - `/sounds/<filename>` - Serve sound files
  - `/api/health` - Health check endpoint
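As a rough sketch of how the backend ties these pieces together, the route below shows a Flask endpoint that asks GPT-4o-mini for a recipe. The request/response fields, prompt wording, and parsing hints are assumptions for illustration; the real `/api/generate-recipe` handler in `app.py` differs in its details.

```python
# Sketch: a Flask endpoint that generates a recipe with GPT-4o-mini.
# Field names and prompt wording are illustrative, not the app's actual contract.
from flask import Flask, jsonify, request
from openai import OpenAI

app = Flask(__name__)
client = OpenAI()  # reads OPENAI_API_KEY from the environment


@app.route("/api/generate-recipe", methods=["POST"])
def generate_recipe():
    data = request.get_json(force=True)
    ingredients = data.get("ingredients", "")
    time_available = data.get("time_available", "30 minutes")

    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "You are PantryPal, a cooking assistant. "
                        "Return a recipe as short numbered steps; mark timed steps "
                        "with a duration like [timer: 5 minutes]."},
            {"role": "user",
             "content": f"I have {ingredients} and {time_available}. Suggest a recipe."},
        ],
    )
    recipe_text = completion.choices[0].message.content
    # The real app also parses out steps, timers, and an ingredient list here.
    return jsonify({"recipe": recipe_text})
```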
- Web Speech API: Voice recognition and text-to-speech
- Continuous Listening: Microphone stays active throughout the session
- LCD Display: Visual display showing current step, timer, or status
- Robot Faces: Animated facial expressions based on app state
- Timer Visualization: Real-time countdown display
- Sound Playback: Plays audio files at appropriate moments
- ✅ Chrome/Edge (desktop & mobile) - Full support
- ✅ Safari (iOS 14.5+, macOS) - Full support
- ⚠️ Firefox - Limited Web Speech API support
- Web Speech API (Speech Recognition)
- Web Speech API (Speech Synthesis)
- Microphone access
- Audio playback
Problem: ModuleNotFoundError: No module named 'flask_cors'
Solution:
```bash
pip install -r requirements.txt
```
Problem: OPENAI_API_KEY not found in environment variables
Solution:
- Create a `.env` file in the project root
- Add: `OPENAI_API_KEY=your_actual_api_key`
Problem: Port already in use
Solution:
- Change the port in `.env`: `PORT=8080`
- Or kill the process using port 5001
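If you are unsure whether something is already listening on the port, a quick standard-library check like the one below can tell you (5001 is just the app's default):

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        return sock.connect_ex((host, port)) == 0


print(port_in_use(5001))  # True means the port is taken; pick another or stop that process
```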
Problem: Microphone not detected
Solutions:
- Grant microphone permissions when prompted by the browser
- Check browser settings: Chrome → Settings → Privacy → Microphone
- Ensure no other application is using the microphone
- Try a different browser (Chrome/Safari recommended)
Problem: Microphone disconnects after inactivity
Solution: The app includes a keep-alive mechanism. If issues persist:
- Refresh the page
- Check browser console for errors
- Ensure microphone permissions are still granted
Problem: API errors
Solutions:
- Verify your OpenAI API key is valid
- Check your OpenAI account has credits/billing set up
- Review browser console (F12) for error messages
- Check Flask server logs in terminal
Problem: Network errors
Solutions:
- Ensure you have internet connection
- Check if OpenAI API is accessible
- Verify firewall isn't blocking requests
Problem: No audio playback
Solutions:
- Click anywhere on the page to enable audio (browser autoplay policy)
- Check browser audio settings
- Ensure speakers/headphones are connected and working
- Check browser console for errors
Problem: Server-side TTS not working (macOS)
Solutions:
- Ensure you're on macOS (server-side TTS is macOS-only)
- Browser TTS will be used as fallback automatically
- Check that the `say` command works: `say "test"` in terminal
Note for Windows/Linux users: The app uses browser TTS automatically on these platforms. Windows will use built-in voices (Microsoft Zira, Microsoft David, etc.) and Linux will use system voices. No additional configuration needed.
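For reference, server-side TTS on macOS can be implemented by shelling out to the built-in `say` command and caching the result, roughly as sketched below. The cache layout and function name are assumptions; the actual implementation in `app.py` may differ.

```python
# Sketch: macOS-only text-to-speech with file caching via the built-in `say` command.
# Cache path and naming scheme are illustrative assumptions.
import hashlib
import platform
import subprocess
from pathlib import Path

CACHE_DIR = Path("static/tts_cache")


def synthesize(text: str) -> Path | None:
    """Return a cached audio file for `text`, generating it with `say` on macOS."""
    if platform.system() != "Darwin":
        return None  # Windows/Linux fall back to browser TTS

    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    out_file = CACHE_DIR / f"{hashlib.md5(text.encode()).hexdigest()}.aiff"
    if not out_file.exists():
        # `say -o FILE TEXT` writes the spoken audio to FILE (AIFF by default).
        subprocess.run(["say", "-o", str(out_file), text], check=True)
    return out_file
```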
Problem: Sounds don't play
Solutions:
- Verify sound files exist in the `sounds/` directory: `startup.wav`, `timer_done.wav`, `recipe_done.wav`
- Check browser console for 404 errors
- Ensure audio is enabled in browser
- Click anywhere on page to enable audio
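If the files exist but the browser still reports 404s, it helps to know how they are served: a `/sounds/<filename>` route hands the files to the browser, along the lines of the sketch below (the exact handler in `app.py` may differ).

```python
# Sketch: how a /sounds/<filename> route typically serves the wav files.
from flask import Flask, send_from_directory

app = Flask(__name__)


@app.route("/sounds/<path:filename>")
def serve_sound(filename):
    # Looks for the file in the sounds/ directory next to app.py;
    # a missing file produces the 404 you would see in the browser console.
    return send_from_directory("sounds", filename)
```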
Problem: Timer or text not showing
Solutions:
- Check browser console for JavaScript errors
- Ensure CSS is loading properly
- Try hard refresh (Ctrl+Shift+R or Cmd+Shift+R)
- The app runs locally and doesn't expose your data externally
- OpenAI API calls are made server-side (API key stays on your machine)
- Voice data is processed in the browser (not sent to external servers except OpenAI)
- No user data is stored or logged
- The `.env` file should never be committed to git (already in `.gitignore`)
- TTS Caching: Audio files are cached in `static/tts_cache/` to avoid regeneration
- Microphone Keep-Alive: The app maintains the microphone connection to prevent timeouts
- Browser Caching: Browser caches static files for faster loading
- Backend changes (`app.py`): Restart the Flask server
- Frontend changes (`templates/index.html`): Refresh browser (no restart needed)
- Browser Console: Press F12 to see JavaScript errors and logs
- Flask Logs: Check the terminal where `python app.py` is running
- Network Tab: Use browser DevTools to inspect API requests
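A quick way to confirm the backend is up while debugging is to hit the health-check endpoint from Python's standard library (the response body shape is not documented here, so the sketch simply prints whatever comes back):

```python
import urllib.request

# Assumes the default port; change 5001 if you reconfigured PORT.
with urllib.request.urlopen("http://localhost:5001/api/health", timeout=5) as resp:
    print(resp.status)           # 200 means the Flask server is reachable
    print(resp.read().decode())  # whatever payload the health endpoint returns
```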
- Test voice commands in a quiet environment
- Verify microphone permissions are granted
- Test with different recipes and ingredient combinations
- Check timer functionality with various durations
- The app is designed for local use only on your laptop
- All processing happens locally or via OpenAI API
- No external hosting or deployment is required
- The app uses your laptop's default microphone and speakers
- HTTPS is not required for local development (localhost is exempt)
- The app runs on port 5001 by default
- All voice processing happens in the browser (Web Speech API)
- Recipe generation uses OpenAI's API (requires internet connection)
- Sound files are served from the `sounds/` directory
- TTS audio files are cached in `static/tts_cache/` to avoid regeneration