An intelligent educational analytics platform that uses a multi-agent workflow and retrieval-augmented generation (RAG) to give instructors natural language querying over student performance data.
- Natural Language Queries: Ask questions in plain English
- Zero Hallucination: All responses grounded in actual database data
- Multi-Agent Workflow: Specialized agents for query understanding, schema retrieval, SQL generation, and more
- Multi-Hop RAG: Contextual retrieval across multiple vector database hops
- Automatic Visualization: Charts and graphs generated based on query type
- Secure Access Control: Role-based access with instructor-specific data filtering
The system processes natural language queries through a 7-stage pipeline (a code sketch of the wiring follows the list):
1. User Query Input - Teacher asks questions in plain English
2. Query Understanding - Intent detection, entity extraction, query classification
3. Schema Retrieval - Multi-hop RAG for finding relevant tables and join patterns
4. Validation - Permission checks, scope validation, table access control
5. SQL Generation - LLM-powered parameterized queries with context awareness
6. Data Analysis - SQL execution, statistics, and insights generation
7. Response Formatting - Natural language summaries and visualizations
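The workflow/ directory is described below as a LangGraph workflow; purely as an illustration of how these stages could be chained, here is a minimal linear graph. The state fields, node names, and stub bodies are assumptions for the sketch, not the project's actual code.

```python
# Illustrative sketch only: wiring the 7 stages as a linear LangGraph graph.
# Stage 1 (user query input) is the initial state passed to invoke();
# all state fields and node bodies are hypothetical placeholders.
from typing import Any, TypedDict

from langgraph.graph import END, StateGraph


class PipelineState(TypedDict, total=False):
    query: str            # raw natural-language question
    intent: dict          # output of query understanding
    schema_context: dict  # tables / joins / rules from multi-hop RAG
    sql: str              # generated parameterized SQL
    results: list[Any]    # rows returned by the database
    response: dict        # formatted answer + visualization spec


def understand_query(state: PipelineState) -> PipelineState:
    # Intent detection, entity extraction, query classification.
    return {"intent": {"type": "unit_completion"}}

def retrieve_schema(state: PipelineState) -> PipelineState:
    # Multi-hop RAG over the vector collections (see the 4-hop table below).
    return {"schema_context": {}}

def validate(state: PipelineState) -> PipelineState:
    # Permission, scope, and table-access checks.
    return {}

def generate_sql(state: PipelineState) -> PipelineState:
    # LLM-generated, parameterized SQL using the retrieved schema context.
    return {"sql": "SELECT 1"}

def analyze_data(state: PipelineState) -> PipelineState:
    # Execute SQL, compute statistics and insights.
    return {"results": []}

def format_response(state: PipelineState) -> PipelineState:
    # Natural-language summary plus a visualization spec.
    return {"response": {}}


graph = StateGraph(PipelineState)
for name, fn in [
    ("understand_query", understand_query),
    ("retrieve_schema", retrieve_schema),
    ("validate", validate),
    ("generate_sql", generate_sql),
    ("analyze_data", analyze_data),
    ("format_response", format_response),
]:
    graph.add_node(name, fn)

graph.set_entry_point("understand_query")
graph.add_edge("understand_query", "retrieve_schema")
graph.add_edge("retrieve_schema", "validate")
graph.add_edge("validate", "generate_sql")
graph.add_edge("generate_sql", "analyze_data")
graph.add_edge("analyze_data", "format_response")
graph.add_edge("format_response", END)

app = graph.compile()
# app.invoke({"query": "How many students completed Unit 5?"})
```

A linear chain keeps each agent isolated and independently testable; the real workflow may add branching, for example re-validation or clarification loops.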
The Schema Retrieval Agent uses a 4-hop RAG approach for accurate SQL generation:
| Hop | Purpose | Vector Collection | Output |
|---|---|---|---|
| 1 | Query Intent | `query_intents` | Intent type (e.g., "unit_completion") |
| 2 | Table Selection | `table_schemas` | Relevant tables (e.g., `fct.LearnerUnitStats`) |
| 3 | Join Patterns | `join_patterns` | Correct JOIN syntax with all keys |
| 4 | Business Rules | `business_rules` | Domain filters (e.g., `UnitCompletionPerc >= 100`) |
This multi-hop approach ensures zero hallucination by grounding all schema decisions in vector-embedded metadata.
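As a rough sketch of what a 4-hop lookup against pgvector could look like (the embedding helper, the `content`/`embedding` column names, and the way each hop conditions on the previous one are illustrative assumptions; only the collection names come from the table above):

```python
# Illustrative sketch of 4-hop retrieval over pgvector collections.
import psycopg2


def embed(text: str) -> list[float]:
    """Placeholder: return an embedding for `text` (e.g., via Ollama)."""
    raise NotImplementedError


def top_k(cur, table: str, query_vec: list[float], k: int = 3) -> list[str]:
    # `<=>` is pgvector's cosine-distance operator; smaller means more similar.
    # Table names come from the fixed set above, so interpolation is safe in this sketch.
    literal = "[" + ",".join(map(str, query_vec)) + "]"
    cur.execute(
        f"SELECT content FROM {table} ORDER BY embedding <=> %s::vector LIMIT %s",
        (literal, k),
    )
    return [row[0] for row in cur.fetchall()]


def retrieve_schema_context(question: str, dsn: str) -> dict:
    qvec = embed(question)
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        # Hop 1: classify the query intent.
        intents = top_k(cur, "query_intents", qvec)
        # Hop 2: pick candidate tables, conditioning on the detected intent.
        tables = top_k(cur, "table_schemas", embed(question + " " + intents[0]))
        # Hop 3: fetch join patterns for those tables.
        joins = top_k(cur, "join_patterns", embed(" ".join(tables)))
        # Hop 4: pull applicable business rules / domain filters.
        rules = top_k(cur, "business_rules", embed(intents[0]))
    return {"intent": intents[0], "tables": tables, "joins": joins, "rules": rules}
```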
Prerequisites:

- Docker and Docker Compose
- Python 3.10+
- PostgreSQL 14+ with pgvector extension
- Ollama with the Llama3 model
```bash
git clone <repository-url>
cd educational-analytics

# Copy environment file
cp backend/.env.example backend/.env

# Edit .env with your configuration
nano backend/.env

# Start all services
docker-compose up -d

# Wait for services to be healthy
docker-compose ps

# Pull Llama3 model in Ollama
docker exec educational_analytics_ollama ollama pull llama3

# Database schema is automatically created on first run
# Verify schema creation
docker exec educational_analytics_db psql -U eduuser -d educational_analytics -c "\dn"

# Enter API container
docker exec -it educational_analytics_api bash

# Generate all embeddings
python scripts/generate_embeddings.py

# Test embeddings
python scripts/test_embeddings.py

# Exit container
exit
```

The API is now available at http://localhost:8000.
- API Documentation: http://localhost:8000/docs
- Health Check: http://localhost:8000/api/health
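A quick way to confirm the service is up before authenticating (the exact shape of the health payload isn't documented here, so the print simply echoes whatever the endpoint returns):

```python
import requests

resp = requests.get("http://localhost:8000/api/health", timeout=5)
resp.raise_for_status()  # expect HTTP 200 once all services are healthy
print(resp.json())       # prints whatever status payload the endpoint returns
```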
```bash
# Login
curl -X POST http://localhost:8000/api/auth/login \
-H "Content-Type: application/json" \
-d '{
"email": "[email protected]",
"password": "password123"
}'

# Response
{
"access_token": "eyJ...",
"token_type": "bearer"
}
```

```bash
# Send query
curl -X POST http://localhost:8000/api/query \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <your-token>" \
-d '{
"query": "How many students completed Unit 5?",
"org_id": 1,
"instructor_id": 123,
"academic_year": "2024-25"
}'
# Response
{
"text_response": "12 students (75% of your class) completed Unit 5...",
"statistics": {
"total_students": 16,
"completed": 12,
"completion_rate": 75
},
"insights": [
"75% completion rate is above the organization average of 68%",
"4 students still need to complete the unit"
],
"visualization": {
"chart_type": "bar_chart",
"title": "Unit 5 Completion Status",
...
},
"data": [...],
"metadata": {
"result_count": 16,
"query_type": "statistics",
"processing_time": 2.34
}
}
```
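The same login-then-query flow from Python with requests, mirroring the curl examples and response fields above (a convenience sketch, not an official client):

```python
# Sketch of calling the API from Python; credentials and fields mirror the curl examples.
import requests

BASE_URL = "http://localhost:8000"

# 1) Log in and grab a bearer token.
login = requests.post(
    f"{BASE_URL}/api/auth/login",
    json={"email": "[email protected]", "password": "password123"},
    timeout=30,
)
login.raise_for_status()
token = login.json()["access_token"]

# 2) Ask a question in plain English.
resp = requests.post(
    f"{BASE_URL}/api/query",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "query": "How many students completed Unit 5?",
        "org_id": 1,
        "instructor_id": 123,
        "academic_year": "2024-25",
    },
    timeout=120,
)
resp.raise_for_status()
answer = resp.json()
print(answer["text_response"])
print(answer["statistics"])
```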
Project structure:

```text
backend/
├── agents/      # Multi-agent implementations
├── api/         # FastAPI application
├── database/    # Database connection and schemas
├── scripts/     # Utility scripts
├── utils/       # Configuration and logging
├── workflow/    # LangGraph workflow
└── tests/       # Unit and integration tests
```
```bash
# Unit tests
docker exec educational_analytics_api pytest tests/unit

# Integration tests
docker exec educational_analytics_api pytest tests/integration

# All tests
docker exec educational_analytics_api pytest
```

```bash
# Run API in development mode with hot reload
docker-compose up api
# View logs
docker-compose logs -f api
```

Key configuration in .env:
- `DB_*`: Database connection settings
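The backend's actual settings loader isn't shown here; as one possible pattern (with hypothetical field names, and defaults borrowed from the psql command in the quick-start block), a pydantic-settings class could map the `DB_*` variables like this:

```python
# Hypothetical sketch: loading DB_* settings from .env with pydantic-settings.
# Field names below are illustrative, not necessarily the project's actual variables.
from pydantic_settings import BaseSettings, SettingsConfigDict


class DatabaseSettings(BaseSettings):
    model_config = SettingsConfigDict(env_prefix="DB_", env_file=".env", extra="ignore")

    host: str = "localhost"              # read from DB_HOST
    port: int = 5432                     # read from DB_PORT
    user: str = "eduuser"                # read from DB_USER
    password: str = ""                   # read from DB_PASSWORD
    name: str = "educational_analytics"  # read from DB_NAME


settings = DatabaseSettings()
dsn = (
    f"postgresql://{settings.user}:{settings.password}"
    f"@{settings.host}:{settings.port}/{settings.name}"
)
```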

