
Verina System Card: Reimagining AI-Powered Search

October 20, 2025
19 min read
AI Search · LLM · AI Agent System · Context Engineering

Version: 1.0 · Date: 2025-10-20 · Author: Li Yang

Abstract

Verina is an experimental AI-powered search engine that fundamentally reimagines how users interact with information. Unlike traditional chatbot interfaces that force users to master prompting techniques, Verina offers three specialized modes tailored to different research depths: Fast/Deep Mode for quick answers, Chat Mode for interactive exploration, and Agent Mode for sustained deep research. The system leverages cutting-edge LLM capabilities, intelligent tool orchestration, and an innovative external file system architecture to deliver results that balance speed, depth, and accuracy.

Our core philosophy: Give AI agents the right tools, and they'll reason their way to better answers than relying on pre-trained knowledge alone.


Table of Contents

  1. System Overview
  2. Architecture & Tech Stack
  3. Mode Deep-Dive
    • Fast/Deep Mode
    • Chat Mode
    • Agent Mode
  4. Technical Innovations
  5. Model Selection & Rationale
  6. Challenges & Solutions
  7. Future Work

System Overview

Verina addresses three fundamental problems in current AI search products:

  1. Prompting burden: Users shouldn't need technical skills to get good results
  2. Shallow reasoning: Most AI search engines rely too heavily on cached knowledge
  3. Context limitations: Long research sessions hit token limits quickly

Our solution: Three specialized modes, each optimized for a specific use case.

| Mode | Speed | Depth | Use Case | LLM Engine |
|------|-------|-------|----------|------------|
| Fast Mode | 5s | Basic | Quick facts, breaking news | Gemini 2.5 Flash |
| Deep Mode | 15-30s | Extended | Multi-perspective analysis | Gemini 2.5 Flash |
| Chat Mode | Variable | Interactive | Follow-up questions, file reading | Claude Sonnet 4.5 |
| Agent Mode | 10-30 min | Research-grade | Deep investigation, reports | GPT-5 Codex |

Architecture & Tech Stack

Core Components

Four-Layer Architecture:

1. Frontend Layer (Next.js 15 + React 19)

  • Multi-mode UI: Fast/Deep, Chat, and Agent
  • Real-time streaming via SSE
  • Citation rendering & artifact viewer

↓ HTTP/SSE

2. Backend Layer (FastAPI + Python)

  • SearchAgent (v1 engine for Fast/Deep Mode)
  • ChatModeAgent (interactive Q&A)
  • AgentMode Agent (deep research)

↓ Unified Tool Layer (13+ tools)

3. Tool & Service Layer

  • Core Tools: web_search, execute_python, file_* operations
  • Intelligence Tools: compact_context, research_assistant
  • MCP Tools: Dynamically loaded (browser automation, etc.)

↓ External APIs

4. External Services & Storage

  • OpenRouter API (GPT-5, Claude, Gemini)
  • Exa API (neural search engine)
  • E2B Sandbox (optional code execution)
  • Local file system (workspace & cache)

Tech Stack

Frontend

  • Framework: Next.js 15 (App Router), React 19
  • Language: TypeScript
  • Streaming: Server-Sent Events (SSE)
  • UI: Custom components (no shadcn/ui), responsive design

Backend

  • Framework: FastAPI (Python 3.11+)
  • LLM Access: OpenRouter (unified API for multiple providers)
  • Search: Exa API (neural search, superior to traditional engines for LLMs)
  • Code Execution: E2B Sandbox (optional, secure Python environment)
  • Storage: Local file system (workspace-based architecture)

Infrastructure

  • Containerization: Docker (frontend, backend, Redis optional)
  • Deployment: Single-command CLI (verina package)
  • Development: Hot reload, monorepo structure

Mode Deep-Dive

1. Fast/Deep Mode: The Search Engine

Model: Gemini 2.5 Flash (google/gemini-2.5-flash-preview-09-2025)

Why Gemini 2.5 Flash?

  • Blazing fast (5s end-to-end for Fast Mode)
  • Excellent tool-calling capabilities
  • Cost-effective
  • Native multimodal support (future expansion)

Architecture: Optimized Pipeline

Fast Mode (2 LLM calls):

  1. User Query → LLM Call 1: Tool Selection
  2. fast_search(query) → Exa API
  3. → Sources returned
  4. → LLM Call 2: Answer Streaming
  5. Final Answer (5-8 seconds)
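
In code, the pipeline boils down to two model calls around one search call. The sketch below is illustrative only; the llm and exa_search interfaces are assumptions, not Verina's actual internals:

def fast_mode(query: str, llm, exa_search) -> str:
    # LLM call 1: tool selection / query refinement
    refined_query = llm.complete(
        f"Rewrite this user query as a concise web search query:\n{query}"
    )

    # Tool call: single-round Exa search returning titles + highlight snippets
    sources = exa_search(refined_query)

    # LLM call 2: stream the final answer grounded in the numbered sources
    numbered = "\n".join(f"[{i}] {s['title']}: {s['snippet']}" for i, s in enumerate(sources, 1))
    return llm.complete(
        f"Answer the question using the numbered sources. Cite with [1][2].\n"
        f"Question: {query}\nSources:\n{numbered}"
    )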

Deep Mode (3 LLM calls + Test-Time Scaling):

  1. User Query → LLM Call 1: Query Analysis + Tool
  2. → Reasoning: Multi-perspective decomposition
  3. deep_search(refined_query) → Exa API
  4. → First Batch Sources
  5. LLM Call 2: Deep Exploration (Forced)
  6. → Insight: What's missing? Alternative angles?
  7. deep_search(supplemental_query) → Exa API
  8. → Second Batch Sources (deduplicated)
  9. LLM Call 3: Answer Streaming
  10. Comprehensive Answer (15-30 seconds)

Key Innovation: Deep Mode uses mandatory two-round search as a test-time scaling trick. The first round gathers initial information, then the LLM is forced to perform a second search to fill gaps or explore alternative perspectives. This dramatically improves answer quality without training.

Tools:

  • fast_search: Single-round Exa search with auto query refinement
  • deep_search: Multi-round search with reflection and supplemental queries

Technical Details:

  • Deduplication: Second batch merges with first, preserving citation indices
  • Highlights extraction: Only relevant snippets sent to LLM (saves tokens)
  • Citation format: [1][2][3] inline references
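
The deduplication step can be sketched as a simple merge that keeps round-one citation indices stable. Keying on URL is an assumption here; the real implementation may deduplicate differently:

def merge_batches(first_batch: list[dict], second_batch: list[dict]) -> list[dict]:
    """Merge the supplemental batch into the first, preserving citation indices."""
    seen_urls = {src["url"] for src in first_batch}
    merged = list(first_batch)
    next_idx = len(first_batch) + 1
    for src in second_batch:
        if src["url"] in seen_urls:
            continue  # already cited in round one, keep its original [idx]
        merged.append({**src, "idx": next_idx})
        seen_urls.add(src["url"])
        next_idx += 1
    return merged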

2. Chat Mode: Interactive Exploration

Model: Claude Sonnet 4.5 (anthropic/claude-sonnet-4.5)

Why Claude Sonnet 4.5?

  • Best-in-class agentic reasoning (our observation: coding ability correlates with general task performance)
  • Excellent at using tools precisely
  • Strong at citation management
  • Fast enough for interactive use

Architecture: ReAct Loop with External File System

ReAct Loop Flow:

  1. User Message → MessageManager
  2. → Enter ReAct Loop
  3. Decision Point:
    • Yes, need tools → Execute Tools → Tool Results → Loop back
    • No, ready → Final Answer
  4. → (Max 200 iterations)
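
The loop itself is conceptually small. Here is a minimal sketch; the message format, llm client, and tool-dispatch details are assumptions rather than the actual ChatModeAgent code:

MAX_ITERATIONS = 200  # matches the cap described above

def react_loop(messages: list[dict], llm, tools: dict) -> str:
    """Minimal ReAct-style loop: ask the LLM, run any requested tools, repeat."""
    for _ in range(MAX_ITERATIONS):
        # The llm client is assumed to accept the tool registry and return tool calls
        response = llm.chat(messages, tools=tools)
        if not response.tool_calls:
            return response.content  # no tools requested: this is the final answer
        messages.append(response.as_message())
        for call in response.tool_calls:
            result = tools[call.name](**call.arguments)  # execute the requested tool
            messages.append({"role": "tool", "tool_call_id": call.id, "content": str(result)})
    return "Stopped: iteration limit reached."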

Key Feature: External File System

Instead of cramming everything into context, Chat Mode uses a workspace:

workspace_chat_{session_id}/
├── cache/                    # Downloaded articles (full text)
│   ├── article_001.md
│   ├── article_002.md
│   └── ...
└── analysis/                 # Python execution outputs
    ├── images/               # Matplotlib charts
    ├── data/                 # CSV, JSON results
    └── ...

Workflow:

  1. User: "Compare pricing of GPT-5 vs Claude 4"
  2. Agent: Calls web_search(query="GPT-5 pricing") → Exa returns 5 articles
  3. System: Articles saved to cache/, LLM receives only highlights (200-500 chars each)
  4. Agent: Provides initial answer using highlights
  5. User: "What about enterprise contracts?"
  6. Agent: "Let me read the full article." → Calls file_read(filename="cache/article_003.md")
  7. System: Returns full 5000-word article text
  8. Agent: Provides detailed answer about enterprise pricing

Why This Works:

  • Context efficiency: Highlights (~100 tokens) vs full articles (~3000 tokens)
  • Human-in-the-loop: User can request deep dives on specific sources
  • Flexibility: Agent decides when to read full content

Tools (8 total):

  • web_search: Search, cache articles, return highlights with citations
  • execute_python: E2B sandbox for data analysis (optional, requires E2B_API_KEY)
  • file_read: Read cached articles or analysis outputs
  • MCP Tools: Dynamically loaded from Model Context Protocol servers (extensible)

MCP Integration: Chat Mode automatically loads tools from configured MCP servers. Currently, Verina includes:

Configured MCP Server:

  • chrome-devtools: Browser automation and web interaction
    • Tools: take_snapshot, navigate_page, click, fill, take_screenshot, list_pages, evaluate_script, etc.
    • Runs in headless Chromium with Docker-optimized settings
    • Enables web browsing, form filling, screenshot capture, and JavaScript execution

The MCP architecture is extensible: additional servers (PostgreSQL, Filesystem, GitHub, etc.) can be added to the MCP_SERVERS configuration without changing agent code.


3. Agent Mode: Deep Research & Report Generation

Model: GPT-5 Codex (openai/gpt-5-codex)

Why GPT-5 Codex? This was the most surprising finding in our research. Initially, we expected GPT-5 (the chat variant) to be better for research tasks. However, testing revealed:

  • Codex's reasoning is decisive: Short, focused reasoning that gets to the point
  • GPT-5's reasoning is verbose: Often reasons about irrelevant details
  • Codex excels at tool orchestration: Likely due to its training on function calling in code
  • KV-cache efficiency: Codex seems better optimized for long-context scenarios

In 18-minute research sessions consuming ~150k tokens per round, Codex consistently outperformed GPT-5 in both speed and reasoning quality.

Architecture: Two-Stage Progression

Agent Mode uses a Human-in-the-Loop (HIL) → Research progression:

Stage 1: HIL (Quick Search + User Confirmation)

  1. User Query → web_search → Quick Results
  2. → LLM provides initial analysis
  3. → User decides: "Good enough" or "Go deeper"
  4. → If deeper → call start_research → Stage 2

Stage 2: Research (Full Toolset Unleashed)

Workspace Initialization:

  • progress.md (strategy tracker)
  • notes.md (research findings)
  • draft.md (answer composition)
  • cache/ (article storage)
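
A minimal sketch of that initialization, assuming a pathlib-based layout that mirrors the files listed above:

from pathlib import Path

def init_workspace(session_id: str, root: Path = Path(".")) -> Path:
    """Create the Agent Mode workspace with its tracking files and cache directory."""
    ws = root / f"workspace_agent_{session_id}"
    (ws / "cache").mkdir(parents=True, exist_ok=True)
    (ws / "progress.md").write_text("# Progress\n\nOverall goal:\nCurrent stage:\nNext steps:\n")
    (ws / "notes.md").write_text("# Research Notes\n")
    (ws / "draft.md").write_text("# Draft\n")
    return ws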

Research Loop (up to 200 iterations):

  1. Tool calls → Results → Update workspace files
  2. Agent maintains progress.md:
    • Overall goal
    • Current stage (searching? analyzing? writing?)
    • Next steps
  3. Agent fills notes.md:
    • Key findings from each article
    • Data points, quotes, insights
  4. Agent drafts answer in draft.md:
    • Structured argument with [1][2] citations
  5. If context > 280k tokens:
    • compact_context auto-triggers
  6. When done: call stop_answer

HTML Blog Generation Phase:

  • Load draft.md and notes.md
  • Inject 2000-word prompt template
  • Generate Notion-inspired HTML blog
  • Extract to artifact.html

Tools (12+ total):

Core Research Tools:

  • web_search: Search and cache articles
  • file_read, file_write, file_list: Workspace file management
  • execute_python: E2B sandbox for data analysis, visualization (optional)

Intelligence Amplification Tools:

  • research_assistant: Auxiliary LLM agent (GPT-5) with independent conversation threads

    • Use case: "Read cache/quantum_article.md and explain qubit stability"
    • Maintains separate conversation history (multi-turn dialogue)
    • Saves main agent's context by delegating file reading
  • compact_context: Intelligent context compression

    • Uses a mini-agent (Gemini 2.5 Pro) that can read workspace files
    • Generates structured 5-section summary:
      1. Overall goal
      2. File system state (what's created/modified)
      3. Key knowledge (facts, data, insights)
      4. Recent actions (last 5-10 tool calls with full details)
      5. Current plan (next steps)
    • Auto-triggers at 280k tokens (limit: 400k)
    • Preserves file paths and navigation hints

Meta Tools:

  • start_research: Trigger stage transition (HIL → Research)
  • stop_answer: Signal completion and trigger blog generation

MCP Tools: Same as Chat Mode (chrome-devtools for browser automation), dynamically loaded in Research stage

Workspace Structure (Example from 18-minute session):

workspace_agent_{session_id}/
├── progress.md              # Research strategy (300 lines)
├── notes.md                 # Detailed findings (2000 lines)
├── draft.md                 # Structured answer (1500 lines)
├── cache/
│   ├── article_001.md       # 5000 words from Nature
│   ├── article_002.md       # 3000 words from ArXiv
│   ├── ... (15 articles)
├── conversations/
│   ├── conv_a1b2c3/         # Research assistant dialogue #1
│   │   └── messages.json
│   ├── conv_d4e5f6/         # Research assistant dialogue #2
│   │   └── messages.json
├── analysis/
│   ├── images/
│   │   ├── trend_chart.png  # Matplotlib output
│   │   └── comparison.png
│   ├── data/
│   │   ├── processed.csv
│   │   └── stats.json
│   └── reports/
│       └── analysis.md
└── artifact.html            # Final blog (auto-generated)

HTML Blog Generation:

When the agent calls stop_answer, a special prompt (2000+ words) is injected that instructs the LLM to:

  1. Read draft.md and notes.md from workspace
  2. Generate two deliverables:
    • Brief overview (2-3 paragraphs for chat display)
    • Deep technical blog (HTML format, Notion-inspired design)

Blog Specifications:

  • Design: Minimalist, Notion-inspired (800px max width, clean typography)
  • Content: Deep technical analysis (3000-5000 words typical)
  • Structure: Title, Executive Summary, Background, Core Analysis, Deep Dives, Practical Implications, References
  • Format: Standalone HTML (inline CSS, no external dependencies)
  • Citations: All references are clickable <a> tags
  • Responsive: Mobile-friendly with media queries
  • Accessibility: Semantic HTML5, ARIA labels, proper contrast

Example Output: A query like "Analyze the scalability challenges of quantum computing" results in:

  • 18-minute research session
  • 15 articles read and analyzed
  • 5 Python data analysis scripts executed
  • 1500-line draft.md with 30+ citations
  • 4000-word HTML blog with charts and deep technical insights

Context Management Innovation:

The compact_context tool is a mini-agent itself. Here's how it works:

  1. Triggered: When context exceeds 280k tokens (or manually called)
  2. Mini-Agent Spawned: Uses Gemini 2.5 Pro with file_read tool access
  3. Review Phase: Agent reads workspace files (progress.md, notes.md, draft.md)
  4. Compression: Generates structured 5-section summary (see above)
  5. Confirmation: Main LLM reviews summary and confirms understanding
  6. Rebuild: Replace old messages with [summary + confirmation] + recent 10 user turns
  7. Resume: Agent continues work seamlessly

This approach preserves critical information (file paths, data points, strategic decisions) while reducing token count by 60-80%.


Technical Innovations

1. External File System Architecture

Problem: LLM context windows are limited, but research involves reading dozens of articles.

Solution: Workspace-based storage with selective loading.

Benefits:

  • Articles stored once, referenced many times
  • Agent decides what to read (highlights → full text only when needed)
  • Workspace persists between sessions (future: resume research)
  • Python outputs (charts, data) accessible via file paths

Implementation:

# web_search tool (simplified)
articles = exa_api.search(query)
highlights = []
for idx, article in enumerate(articles, start=1):
    # Save full text to cache
    cache_path = workspace / "cache" / f"article_{idx:03d}.md"
    cache_path.write_text(article.full_text)

    # Return only highlights to LLM
    highlights.append({
        "idx": idx,
        "title": article.title,
        "snippet": article.highlights[0][:500],  # First 500 chars
        "cache_path": str(cache_path)
    })

return {"highlights": highlights}  # LLM sees ~100 tokens per source instead of ~3000

2. Test-Time Scaling: Mandatory Two-Round Search

Problem: A single search round often misses important perspectives.

Solution: Mandatory two-round search in Deep Mode.

Implementation:

  1. Round 1: LLM analyzes query → searches → receives results
  2. Forced reflection: LLM must identify gaps and search again
  3. Round 2: Supplemental search fills gaps or explores alternatives
  4. Deduplication: Merge results with continuous citation indices

Why It Works:

  • Forces LLM to critique its own results
  • Explores alternative angles and queries the first round may have missed
  • Costs 2x search API calls but improves answer comprehensiveness significantly
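
Put together, the control flow looks roughly like the sketch below (illustrative only; llm and exa_search are assumed interfaces, and deduplication is simplified to URL matching):

def deep_search_pipeline(query: str, llm, exa_search) -> list[dict]:
    # Round 1: refine the query and search
    refined = llm.complete(f"Decompose and refine this query for web search: {query}")
    first_batch = exa_search(refined)

    # Forced reflection: the LLM must name a gap or alternative angle, then search again
    gap_query = llm.complete(
        "Given these sources, what perspective is still missing? "
        "Reply with one supplemental search query.\n"
        + "\n".join(src["title"] for src in first_batch)
    )
    second_batch = exa_search(gap_query)

    # Merge with continuous citation indices, dropping duplicates from round two
    seen = {src["url"] for src in first_batch}
    merged = list(first_batch)
    for src in second_batch:
        if src["url"] not in seen:
            merged.append({**src, "idx": len(merged) + 1})
            seen.add(src["url"])
    return merged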

3. Research Assistant: Multi-Turn Auxiliary Agent

Problem: Reading files consumes main agent's context. Asking follow-up questions requires repeating file content.

Solution: Separate auxiliary agent with independent conversation memory.

Use Cases:

  1. File Reading: "Read cache/article_005.md and summarize quantum decoherence challenges"

    • Research Assistant reads file, maintains understanding
    • Main agent receives summary (saves ~2000 tokens)
  2. Multi-Turn Analysis:

    • Main: "Compare articles 3 and 7 on qubit stability" (returns conv_id: "conv_a1b2")
    • Main: "What about temperature requirements?", conv_id="conv_a1b2"
    • Research Assistant remembers previous comparison, provides focused answer
  3. Draft Review: "Review my draft.md and suggest improvements"

    • Assistant reads draft, provides feedback
    • Multi-turn editing session without cluttering main context

Technical Implementation:

  • Each conv_id = independent conversation thread stored in workspace/conversations/{conv_id}/
  • Uses GPT-5 for strategic guidance
  • Can call file_read tool to access workspace files
  • Returns results to main agent as simple text
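
A sketch of how such a tool could persist per-conversation threads under the workspace (the llm interface and helper shape are assumptions; the real tool also exposes file_read to the auxiliary model):

import json
import os
from pathlib import Path

def research_assistant(question: str, workspace: Path, llm, conv_id: str | None = None) -> dict:
    """Answer a question in an independent conversation thread keyed by conv_id."""
    conv_id = conv_id or f"conv_{os.urandom(3).hex()}"
    conv_dir = workspace / "conversations" / conv_id
    conv_dir.mkdir(parents=True, exist_ok=True)
    msg_file = conv_dir / "messages.json"

    # Load this thread's prior turns (empty list for a new conversation)
    history = json.loads(msg_file.read_text()) if msg_file.exists() else []
    history.append({"role": "user", "content": question})

    # The auxiliary model answers with its own memory, separate from the main agent
    answer = llm.chat(history)
    history.append({"role": "assistant", "content": answer})
    msg_file.write_text(json.dumps(history, indent=2))

    # The main agent receives only the answer text plus the conv_id for follow-ups
    return {"conv_id": conv_id, "answer": answer}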

4. Intelligent Context Compression

Problem: Traditional context compression loses critical information (file paths, data points, strategic decisions).

Solution: File-aware compression agent with structured output.

Innovation: The compactor is a mini-agent that:

  • Reads workspace files to understand current state
  • Reviews old conversation messages
  • Generates structured summary (5 sections, XML format)
  • Main LLM confirms understanding before continuing

How It Works:

  1. Triggered: When context exceeds 280k tokens (or manually called)
  2. Mini-Agent Spawned: Uses Gemini 2.5 Pro with file_read tool access
  3. Review Phase: Agent reads workspace files (progress.md, notes.md, draft.md) if needed
  4. Compression: Generates structured 5-section XML summary:
    • <overall_goal>: User's ultimate objective
    • <file_system_state>: All file operations (CREATED/MODIFIED/READ) with navigation hints
    • <key_knowledge>: Hard facts, data points, URLs, constraints, strategic decisions
    • <recent_actions>: Last 5-10 tool calls with full parameters and results
    • <current_plan>: Next steps and continuation strategy
  5. Confirmation: Main LLM reviews summary and confirms understanding
  6. Rebuild: Replace old messages with [summary + confirmation] + recent 10 user turns
  7. Resume: Agent continues work seamlessly

Result: Preserves critical information (file paths, data points, strategic decisions) while reducing token count by 60-80%.
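
A minimal sketch of the trigger-and-rebuild step; the compactor object stands in for the Gemini-based mini-agent, and its interface here is an assumption:

def maybe_compact(messages: list[dict], count_tokens, compactor,
                  limit: int = 280_000, keep_recent_user_turns: int = 10) -> list[dict]:
    """If over the token limit, summarize old turns and keep recent ones verbatim."""
    if count_tokens(messages) <= limit:
        return messages  # under the threshold: nothing to do

    # Cut point: everything from the Nth-most-recent user turn onward stays intact
    user_positions = [i for i, m in enumerate(messages) if m["role"] == "user"]
    cut = user_positions[-keep_recent_user_turns] if len(user_positions) > keep_recent_user_turns else 0

    summary = compactor.summarize(messages[:cut])  # 5-section XML summary
    return [
        {"role": "user", "content": summary},
        {"role": "assistant", "content": "Understood. Continuing from this state."},  # confirmation
    ] + messages[cut:]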

5. Model Context Protocol (MCP) Integration

Problem: Tool ecosystems are closed. Adding new capabilities requires code changes.

Solution: MCP (Model Context Protocol), a standardized protocol for connecting tools to LLMs.

How It Works:

  1. MCP servers are configured in backend/src/chat/mcp_client.py in the MCP_SERVERS dictionary
  2. Verina automatically loads tools from all servers on startup
  3. Tools appear in Chat Mode and Agent Mode (Research stage) automatically
  4. Adding new MCP servers only requires editing the configuration dictionary

Current Implementation: chrome-devtools MCP server

# backend/src/chat/mcp_client.py
MCP_SERVERS = {
    "chrome-devtools": {
        "command": "chrome-devtools-mcp",
        "args": [
            "--headless",
            "--executablePath", "/usr/bin/chromium",
            "--isolated",
            "--chromeArg=--no-sandbox",
            "--chromeArg=--disable-setuid-sandbox",
            "--chromeArg=--disable-dev-shm-usage",
        ],
        "env": None
    }
}

This provides Claude/GPT-5 with browser automation capabilities: navigate websites, fill forms, take screenshots, execute JavaScript, and interact with web pages, all without modifying agent code.

Extensibility: Additional MCP servers (PostgreSQL, Filesystem, GitHub, etc.) can be added to the MCP_SERVERS dictionary with zero changes to agent logic, as in the hypothetical sketch below.
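
For example, an entry for the reference filesystem MCP server might look like this (the command and args are assumptions; check the target server's documentation before adding it):

# Hypothetical addition to backend/src/chat/mcp_client.py
MCP_SERVERS["filesystem"] = {
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"],
    "env": None,
}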


Model Selection & Rationale

Our model choices are based on extensive testing and practical observations:

Gemini 2.5 Flash (Fast/Deep Mode)

Selected for: Search pipeline (tool calling + answer generation)

Why:

  • Speed: 5-second end-to-end for Fast Mode
  • Tool calling: Excellent at using fast_search and deep_search precisely
  • Streaming: Low latency for answer generation

Trade-offs:

  • Not as strong at deep reasoning as Claude or GPT-5
  • Acceptable for search tasks where tools do heavy lifting

Claude Sonnet 4.5 (Chat Mode)

Selected for: Interactive Q&A with tool access

Why:

  • Agentic reasoning: Best-in-class for deciding when to use tools
  • Citation management: Excellent at using [1][2] format consistently
  • File reading: Great at selectively choosing when to read full articles vs using highlights
  • MCP compatibility: Strong tool-calling capabilities for dynamic tools

Observations:

  • Models good at coding tend to be good at general tool use
  • Claude's function calling is more reliable than Gemini's
  • Faster than GPT-5 for interactive use

Trade-offs:

  • More expensive than Gemini
  • Acceptable for Chat Mode where quality > speed

GPT-5 Codex (Agent Mode)

Selected for: Deep research with multi-tool orchestration

Why (most surprising finding):

  • Decisive reasoning: Codex's reasoning is short and focused, unlike GPT-5's verbosity
  • Tool orchestration: Excels at complex multi-step workflows (search → read → analyze → write)
  • Long-context efficiency: Better KV-cache design for 150k+ token contexts
  • Strategic planning: Maintains coherent research strategy over 200 iterations

Our Hypothesis: Codex's training on code (function composition, API calls) transfers to tool use better than pure chat training.

Evidence: In 18-minute research sessions, Codex:

  • Called 40+ tools with 95%+ success rate
  • Maintained coherent workspace state (progress.md, notes.md, draft.md)
  • Generated higher-quality HTML blogs than GPT-5

Trade-offs:

  • Most expensive
  • Justified for Agent Mode where depth matters most

Auxiliary Models

  • Gemini 2.5 Pro (context compression): Chosen for speed and cost-effectiveness in the compact_context mini-agent
  • GPT-5 (research assistant): Same reasoning capabilities as Codex, used for auxiliary tasks

Challenges & Solutions

Challenge 1: Context Explosion in Long Research

Problem: Long research sessions can quickly consume massive amounts of context with full article content.

Solution:

  1. External file system (articles stored in cache/)
  2. Highlight-first approach (LLM sees snippets, reads full text only when needed)
  3. Intelligent compression (file-aware compaction at 280k tokens)

Result: Enables sustained research sessions without hitting context limits prematurely.

Challenge 2: Tool Calling Reliability

Problem: LLMs sometimes call tools with wrong parameters or hallucinate tool names.

Solution:

  1. Strict tool schemas with detailed descriptions
  2. Error messages guide LLM to retry correctly
  3. Model selection (Claude/Codex have higher tool-calling accuracy than Gemini)

Example:

# Strict schema with examples
{
  "name": "file_read",
  "parameters": {
    "type": "object",
    "properties": {
      "filename": {
        "type": "string",
        "description": "Relative path to file, e.g., 'cache/article_001.md' or 'notes.md'. Do NOT include the workspace path prefix."
      }
    },
    "required": ["filename"]
  }
}

# Error handling with guidance
if not file_exists(filename):
    return {
        "error": f"File '{filename}' not found. Available files: {list_files()}. Did you mean 'cache/{filename}'?"
    }

Result: Tool call success rate improved from ~70% (early versions) to 95%+ (current).

Challenge 3: Citation Consistency

Problem: LLMs sometimes forget citation format, use wrong numbers, or duplicate citations.

Solution:

  1. System prompt emphasizes citation format
  2. Sources provided with clear [idx] markers in highlights
  3. Post-processing validation (future: detect missing citations)

Example Prompt:

When using search results, ALWAYS cite with [1][2][3] format.
The number corresponds to the source's idx field.

Example:
Quantum computers face decoherence challenges [1]. However,
recent advances in error correction show promise [2][3].

Result: ~90% citation accuracy in generated answers.
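
The planned post-processing validation could be as simple as the following sketch, which flags out-of-range citations and sources that were never cited:

import re

def find_citation_issues(answer: str, num_sources: int) -> dict:
    """Flag citation numbers with no matching source and sources that go uncited."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return {
        "out_of_range": sorted(n for n in cited if n < 1 or n > num_sources),
        "uncited_sources": sorted(set(range(1, num_sources + 1)) - cited),
    }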


Future Work

Our current focus areas:

  1. Exploring new LLM interaction paradigms: Moving beyond traditional prompting to more natural, intuitive ways of interacting with AI agents.

  2. Long-term memory optimization: Building persistent user memory and cross-session knowledge accumulation.

  3. Advanced context management: Improving context compression and enabling longer research sessions without quality degradation.

We're committed to continuously building and evolving the Verina brand as a platform for the AI-era search paradigm.


Conclusion

Verina demonstrates that giving AI agents the right tools and architecture can overcome traditional limitations:

  • External file system solves context constraints
  • Multi-round search improves answer quality via test-time scaling
  • Specialized modes serve different user needs efficiently
  • Intelligent compression enables sustained 18-minute research sessions
  • Model selection matters: Codex > GPT-5 for tool orchestration, Claude > Gemini for chat

Our core insight: LLM capability = Base reasoning × Tool quality × Architecture. While everyone focuses on base reasoning (bigger models, more training), we believe tools and architecture are equally important.

Verina is an experiment in pushing architectural boundaries. We're excited to see where this leads.


Appendix: Tool Reference

Agent Mode Tools (Research Stage)

| Tool | Purpose | Example |
|------|---------|---------|
| web_search | Search web, cache articles, return highlights | web_search(query="quantum computing challenges") |
| file_read | Read workspace files | file_read(filename="cache/article_001.md") |
| file_write | Write to workspace | file_write(filename="notes.md", content="...") |
| file_list | List workspace files | file_list(directory="cache") |
| execute_python | Run Python code in E2B sandbox | execute_python(code="import matplotlib...") |
| compact_context | Compress context intelligently | compact_context(keep_recent_user_messages=10) |
| research_assistant | Multi-turn auxiliary agent | research_assistant(question="Summarize article 5", conv_id="conv_001") |
| stop_answer | Signal completion | stop_answer() |
| MCP Tools | Browser automation (chrome-devtools) | mcp_chrome-devtools_take_snapshot(), mcp_chrome-devtools_navigate_page(url="..."), etc. |

Context Management

  • Limit: 400,000 tokens (GPT-5 Codex)
  • Auto-compact: Triggered at 280,000 tokens
  • Compaction strategy: Keep recent 10 user turns intact, summarize older messages
  • Compaction agent: Gemini 2.5 Pro with file_read access

Workspace Files

| File | Purpose | Example Content |
|------|---------|-----------------|
| progress.md | Research strategy, status | "Overall goal: ... Current stage: Analyzing data... Next: Write draft..." |
| notes.md | Detailed findings | "Article 1 (Nature): Quantum decoherence rates... Article 2 (ArXiv): Error correction codes..." |
| draft.md | Answer composition | "# Introduction\nQuantum computing faces three key challenges [1][2]..." |
| cache/*.md | Downloaded articles | Full article text from Exa API |
| conversations/*/messages.json | Research assistant dialogues | Conversation history for multi-turn consultations |
| analysis/images/*.png | Python outputs | Matplotlib charts, visualizations |
| artifact.html | Final blog | Auto-generated HTML report |

Contact

For questions, feedback, or contributions:


Verina is an experimental project. All findings, observations, and design decisions are based on practical testing and may evolve as we learn more.

Version History:

  • v1.0 (2025-10-20): Initial comprehensive system card