The Architecture Behind PanunSchool: Building an AI Tutor with Gemini 2.0

In December 2025, I began building PanunSchool — an educational platform that puts an AI tutor in the hands of every student. The core idea was simple: what if a student could upload their textbook, and an AI would read it, understand it, and then become their personal tutor — answering questions, generating quizzes, and explaining concepts in ways tailored to their learning level?

Powered by Google's Gemini 2.0 Flash, PanunSchool is my most technically ambitious project to date. In this article, I'll walk through the architecture decisions, the AI integration challenges, and the real-time communication infrastructure that makes it all work.

The Vision: AI Tutoring for Everyone

The education gap in regions like Kashmir is not just about access to schools — it's about access to quality, personalized instruction. A classroom with 40 students and one teacher cannot provide the individualized attention that every student needs. Private tutoring is expensive and often unavailable in remote areas.

PanunSchool aims to fill this gap with an AI tutor that:

Understands the student's uploaded study materials (PDFs, textbooks, notes)
Answers questions with explanations grounded in the actual course material
Generates quizzes and assessments automatically
Adapts its teaching style to the student's level
Is available 24/7, even during internet outages (with cached content)

System Architecture

PanunSchool's architecture consists of five major layers:

┌─────────────────────────────────────────┐
│           Frontend (React)              │
│  Document Viewer │ Chat UI │ Quiz UI    │
├─────────────────────────────────────────┤
│        Real-Time Layer (Socket.IO)      │
│  Chat Messages │ Typing Indicators │    │
│  Quiz Events   │ Document Sync         │
├─────────────────────────────────────────┤
│         Backend API (Node.js)           │
│  Auth │ Documents │ Quizzes │ Sessions  │
├─────────────────────────────────────────┤
│         AI Layer (Gemini 2.0 Flash)     │
│  RAG Pipeline │ Quiz Generation │       │
│  Context Management │ Prompt Engine     │
├─────────────────────────────────────────┤
│         Data Layer                      │
│  PostgreSQL │ Redis │ Object Storage    │
└─────────────────────────────────────────┘

The AI Engine: Gemini 2.0 Flash Integration

At the heart of PanunSchool is the AI tutoring engine, powered by Gemini 2.0 Flash. I chose Gemini for several reasons:

Speed — Flash is optimized for low-latency responses, essential for a conversational tutoring experience
Long context window — Can process large documents (entire textbook chapters) in a single prompt
Multimodal capabilities — Can understand text, images, and diagrams in uploaded documents
Cost efficiency — More affordable per token than competitor models at comparable quality

RAG Pipeline: Retrieval-Augmented Generation

The naive approach to AI tutoring would be to dump the entire document into the LLM context and ask questions. This doesn't scale — textbooks can be hundreds of pages. Instead, I implemented a RAG (Retrieval-Augmented Generation) pipeline:

Document Ingestion — When a student uploads a PDF, the system extracts text using a PDF parser, splits it into semantically meaningful chunks (by section/paragraph), and generates embedding vectors for each chunk.
Vector Storage — Embeddings are stored in a vector database, indexed by document and user.
Query Processing — When a student asks a question, the system generates an embedding for the query, performs a similarity search against the document chunks, and retrieves the most relevant sections.
Context Assembly — The relevant chunks are assembled into a context prompt, combined with the student's question and conversation history.
Response Generation — Gemini generates a response grounded in the retrieved context, with citations pointing back to specific sections of the student's document.

This architecture keeps the AI's answers grounded in the student's actual study material rather than generic internet knowledge. When the AI references "Chapter 3, Section 2.1", the student can verify it directly in their document.
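
The query-processing step above can be sketched as a similarity search over stored chunk embeddings. This is a minimal in-memory illustration, not the production setup: a real deployment would delegate the search to the vector database, and the helper names, chunk fields, and toy 3-dimensional vectors are all assumptions made for the example.

```javascript
// Hypothetical helpers; names, chunk fields, and vectors are illustrative only.

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks of one document that are most similar to the query.
function retrieveTopK(queryEmbedding, chunks, documentId, k = 3) {
  return chunks
    .filter((chunk) => chunk.documentId === documentId)
    .map((chunk) => ({
      ...chunk,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const chunks = [
  { documentId: 'bio-101', section: '3.1', text: 'Photosynthesis...', embedding: [0.9, 0.1, 0.0] },
  { documentId: 'bio-101', section: '3.2', text: 'Respiration...', embedding: [0.1, 0.9, 0.0] },
  { documentId: 'chem-201', section: '1.1', text: 'Atoms...', embedding: [0.9, 0.1, 0.0] },
];

const top = retrieveTopK([1, 0, 0], chunks, 'bio-101', 1);
console.log(top[0].section); // prints "3.1"
```

Keeping the section label on each chunk is what makes citations possible later: whatever is retrieved already carries the pointer back into the student's document.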

Prompt Engineering: The Teaching Persona

Getting an LLM to be a good tutor is more than just throwing questions at it. I designed a multi-layered prompt structure:

System Prompt:
"You are PanunSchool AI Tutor — a patient, encouraging 
educational assistant. Your role is to help students 
understand their course material deeply.

Rules:
1. Always ground your answers in the provided context
2. If the context doesn't contain the answer, say so
3. Use analogies and examples relevant to the student
4. Break complex concepts into smaller steps
5. Ask follow-up questions to check understanding
6. Encourage the student when they're on the right track
7. Never give direct answers to quiz questions — guide 
   the student to discover the answer themselves"

The last rule is particularly important. An AI tutor that just gives answers isn't teaching — it's doing the student's homework. PanunSchool's AI is designed to use the Socratic method, asking guiding questions that lead students to understand concepts on their own.
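
To make the layering concrete, here is a rough sketch of how the system prompt, retrieved context, conversation history, and the student's question might be combined into one prompt string. The function name, the field names, and the section labels are assumptions for illustration, and the system prompt is abbreviated.

```javascript
// Hypothetical prompt assembler; names and layout are illustrative only.
// Abbreviated version of the system prompt quoted above.
const SYSTEM_PROMPT =
  'You are PanunSchool AI Tutor, a patient, encouraging educational assistant.';

function buildTutorPrompt({ contextChunks, history, question }) {
  // Label each chunk with its section so the model can cite it.
  const context = contextChunks
    .map((chunk) => `[${chunk.section}] ${chunk.text}`)
    .join('\n');
  const conversation = history
    .map((turn) => `${turn.role}: ${turn.text}`)
    .join('\n');
  return [
    SYSTEM_PROMPT,
    "Context from the student's document:",
    context || '(no relevant sections found)',
    'Conversation so far:',
    conversation || '(none)',
    `Student question: ${question}`,
  ].join('\n\n');
}

const prompt = buildTutorPrompt({
  contextChunks: [{ section: '3.1', text: 'Photosynthesis converts light into chemical energy.' }],
  history: [{ role: 'student', text: 'Hi' }],
  question: 'What is photosynthesis?',
});
console.log(prompt);
```

The explicit "(no relevant sections found)" placeholder gives the model something to react to when retrieval comes back empty, which pairs with rule 2 of the system prompt.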

Automated Quiz Generation

One of PanunSchool's most powerful features is automated quiz generation. Given a document or chapter, the AI can generate:

Multiple Choice Questions (MCQs) — With distractor options that test common misconceptions
True/False questions — Testing factual recall
Short answer questions — Requiring synthesis and explanation
Fill-in-the-blank — Testing specific terminology and definitions

The quiz generation uses a specialized prompt that instructs Gemini to create questions across Bloom's Taxonomy levels, from basic recall (Levels 1-2) through application and analysis (Levels 3-4) to evaluation and creation (Levels 5-6). This ensures assessments aren't just memory tests but genuinely measure understanding.

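// Quiz generation request.
// Assumes `gemini` is a model handle created elsewhere, for example with
// getGenerativeModel({ model: 'gemini-2.0-flash' }) from @google/generative-ai.
const generateQuiz = async (documentChunks, options) => {
  // Ask for a fixed number of questions spread across Bloom's levels.
  const prompt = `
    Based on the following study material, generate 
    ${options.count} quiz questions.
    
    Distribution:
    - 30% Remember/Recall (Bloom's Level 1-2)
    - 40% Apply/Analyze (Bloom's Level 3-4)  
    - 30% Evaluate/Create (Bloom's Level 5-6)
    
    Format: Return as JSON array with fields:
    { question, type, options[], correctAnswer, 
      explanation, bloomLevel, sourceSection }
    
    Study Material:
    ${documentChunks.join('\n\n')}
  `;
  
  // generateContent resolves to a result object; the generated text
  // (expected to be a JSON array) is read from result.response.text().
  const result = await gemini.generateContent(prompt);
  return result.response.text();
};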
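
A prompt like this returns text, not structured data, so the reply still has to be parsed and checked before it reaches the quiz UI. The sketch below shows one way to do that, along with a helper that turns the 30/40/30 split into whole-number question counts. Both helper names are hypothetical, and the rule that rounding leftovers go to the middle band is an assumption.

```javascript
// Hypothetical helpers for working with the model's quiz output.

// Split a question count into the 30/40/30 distribution from the prompt.
// Integer math avoids floating-point surprises; leftovers go to the middle band.
function bloomCounts(total) {
  const recall = Math.floor((total * 3) / 10);
  const evaluate = Math.floor((total * 3) / 10);
  return { recall, apply: total - recall - evaluate, evaluate };
}

const REQUIRED_FIELDS = [
  'question', 'type', 'correctAnswer', 'explanation', 'bloomLevel', 'sourceSection',
];

// Three backticks, built indirectly so this snippet can sit inside a fence.
const FENCE = '`'.repeat(3);

// Parse raw model text into an array of questions, dropping malformed entries.
function parseQuizResponse(rawText) {
  // Models often wrap JSON in a Markdown code fence; strip it if present.
  const cleaned = rawText
    .trim()
    .replace(new RegExp('^' + FENCE + '(?:json)?\\s*', 'i'), '')
    .replace(new RegExp('\\s*' + FENCE + '$'), '');
  let parsed;
  try {
    parsed = JSON.parse(cleaned);
  } catch (err) {
    return []; // Not valid JSON: the caller can retry the generation.
  }
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(
    (q) => q !== null && typeof q === 'object' && REQUIRED_FIELDS.every((f) => f in q)
  );
}

console.log(bloomCounts(10)); // prints { recall: 3, apply: 4, evaluate: 3 }
```

Returning an empty array on bad output keeps the failure mode simple: the caller only has to check the length and decide whether to regenerate.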

Real-Time Communication: Socket.IO

The tutoring experience needs to feel like a real conversation. Typing a question and waiting for a full response to load feels disconnected. I implemented streaming responses using Socket.IO:

The student sends a message through the Socket.IO connection
The backend runs the RAG pipeline and starts generating a response via Gemini's streaming API
As tokens are generated, they're streamed back to the client in real-time
The student sees the AI "typing" its response word by word, just like a real conversation

Socket.IO also powers other real-time features: typing indicators, document sync between devices, and live quiz participation (for classroom mode, where a teacher can host a quiz for multiple students simultaneously).
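
The streaming flow described above can be sketched as a small relay that forwards each piece of generated text to the client as it arrives. The event names ('tutor:token', 'tutor:done') and payload shapes are hypothetical, and `socket` is treated as anything with an emit(event, payload) method, which is how a Socket.IO socket behaves.

```javascript
// Hypothetical relay; event names and payload fields are illustrative only.
function createTokenRelay(socket, messageId) {
  let index = 0;
  return {
    // Forward one piece of generated text, tagged with its position.
    push(text) {
      if (!text) return; // Skip empty pieces.
      socket.emit('tutor:token', { messageId, index, text });
      index += 1;
    },
    // Tell the client that the response is complete.
    done() {
      socket.emit('tutor:done', { messageId, totalChunks: index });
    },
  };
}

// Drain an async iterable of text pieces (for example, a model's streamed
// output mapped to strings) into the relay, then signal completion.
async function streamToClient(textStream, relay) {
  for await (const piece of textStream) {
    relay.push(piece);
  }
  relay.done();
}
```

Tagging each piece with an index lets the client detect gaps or reordering, and the final event gives the UI a clear point to hide the typing indicator.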

Document Unlocking System

PanunSchool has a document unlocking mechanism that incentivizes engagement. When a student uploads a document, certain advanced chapters or sections are "locked." Students unlock them by completing quizzes on prerequisite chapters, ensuring they have the foundational knowledge before advancing.

This gamification element turned out to be surprisingly effective at keeping students engaged. The unlock notification — complete with a satisfying animation — creates a sense of progression and achievement that pure content access doesn't provide.
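
The unlocking rule itself is simple to express. The sketch below assumes each chapter lists its prerequisite chapter ids and that a chapter counts as "completed" once the student's best quiz score reaches a pass mark. The 70% threshold, the data shapes, and the function names are all assumptions, since the article does not specify them.

```javascript
// Hypothetical data shapes and pass mark; illustrative only.
const PASS_MARK = 0.7; // Assumed threshold: 70% on a prerequisite quiz.

// A chapter is unlocked when every prerequisite has a passing best score.
// Chapters with no prerequisites are unlocked from the start.
function isUnlocked(chapter, bestScores, passMark = PASS_MARK) {
  return chapter.prerequisites.every((id) => (bestScores[id] ?? 0) >= passMark);
}

// List the ids of all chapters the student can currently open.
function unlockedChapters(chapters, bestScores) {
  return chapters.filter((c) => isUnlocked(c, bestScores)).map((c) => c.id);
}

const chapters = [
  { id: 'ch1', prerequisites: [] },
  { id: 'ch2', prerequisites: ['ch1'] },
  { id: 'ch3', prerequisites: ['ch1', 'ch2'] },
];

console.log(unlockedChapters(chapters, { ch1: 0.8, ch2: 0.5 })); // prints [ 'ch1', 'ch2' ]
```

Comparing the unlocked list before and after a quiz submission is one straightforward way to decide when to fire the unlock notification.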

Performance and Cost Optimization

Running an AI-powered platform at scale requires careful optimization:

Response caching — Common questions about the same document sections are cached. If another student asks a similar question about the same material, we serve the cached response (with personalization adjustments).
Embedding caching — Document embeddings are computed once and cached, not regenerated for every query.
Streaming over batching — Gemini's streaming API reduces perceived latency. The student starts reading the response while it's still being generated.
Context window management — Conversation history is summarized after 10 turns to stay within token limits without losing context.
Rate limiting — Per-user rate limits prevent API cost spikes from individual users.
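
Two of these optimizations are easy to illustrate: folding older conversation turns into a summary, and a per-user rate limit. The sketch below is one possible shape under stated assumptions. The 10-turn threshold comes from the list above, but the number of recent turns kept verbatim, the fixed-window limiter design, and all names are hypothetical. `summarize` stands in for whatever produces the summary and is passed in as a plain synchronous function to keep the example self-contained.

```javascript
// Hypothetical sketch; names, keepRecent, and the limiter design are assumptions.
const MAX_TURNS = 10;

// Once history exceeds MAX_TURNS, fold the oldest turns into a single
// summary entry and keep the most recent turns verbatim.
function compactHistory(history, summarize, keepRecent = 4) {
  if (history.length <= MAX_TURNS) return history;
  const older = history.slice(0, history.length - keepRecent);
  const recent = history.slice(history.length - keepRecent);
  return [{ role: 'summary', text: summarize(older) }, ...recent];
}

// Fixed-window rate limiter: at most `limit` requests per user per window.
function createRateLimiter(limit, windowMs) {
  const windows = new Map(); // userId -> { start, count }
  return function allow(userId, now = Date.now()) {
    const entry = windows.get(userId);
    if (!entry || now - entry.start >= windowMs) {
      // First request, or the previous window has expired: start a new one.
      windows.set(userId, { start: now, count: 1 });
      return true;
    }
    if (entry.count < limit) {
      entry.count += 1;
      return true;
    }
    return false; // Over the limit for this window.
  };
}

const allow = createRateLimiter(2, 60000);
console.log(allow('student-1', 0), allow('student-1', 1), allow('student-1', 2));
// prints true true false
```

A fixed window is the simplest limiter to reason about; a sliding window or token bucket would smooth out bursts at window boundaries at the cost of a little more state.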

Lessons Learned

Building an AI-powered education platform taught me several things:

Prompt engineering is product design. The system prompt isn't code — it's the personality and pedagogy of your product. Iterate on it like you would iterate on UI.
RAG is essential for educational AI. Students need answers grounded in their specific materials, not generic knowledge. RAG makes this possible.
Streaming responses transform the UX. The difference between waiting 5 seconds for a complete response and seeing it appear word-by-word is the difference between a tool and a companion.
Gamification works. The document unlocking system increased average session time by 40% compared to a version without it.
AI safety in education is paramount. The AI must never present wrong information confidently. When uncertain, it should say "I'm not sure about this — let's look at the text together."

What's Next for PanunSchool

The roadmap includes:

Voice tutoring — students can speak their questions instead of typing
Diagram understanding — AI can explain charts, graphs, and figures in textbooks
Collaborative learning — students can form study groups with shared AI sessions
Teacher dashboard — educators can track student progress and identify knowledge gaps
Offline mode — cached AI responses for common questions, usable without internet

PanunSchool represents what I believe is the future of education: AI as a patient, knowledgeable, always-available tutor that meets every student exactly where they are. If you're building in EdTech or AI, let's connect on LinkedIn.

#PanunSchool #Gemini #AITutor #EdTech #RAG #SocketIO #LLM #PromptEngineering

Parveez Ahmad Lone
