CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This directory contains the native (non-Docker) deployment configuration for Crumbforest, a FastAPI-based multilingual CRM and RAG-powered educational platform. It provides installation scripts, systemd service definitions, NGINX configurations, and database initialization for production deployment on Linux servers.

The main application, located in the parent repository, provides:

  • Role-based character chat with 15+ unique AI personas
  • RAG (Retrieval-Augmented Generation) for semantic search across documents, posts, and diary entries
  • Multilingual support (German, English, French) with theme customization
  • Vector database integration (Qdrant) for embedding-based search
  • MariaDB for relational data (users, posts, audit logs)

Common Commands

Development & Testing

When working with the native deployment scripts, test them in a safe environment first:

# Check script syntax without executing
bash -n native-install.sh

# Dry-run validation (if supported by script)
sudo ./native-install.sh --dry-run

Installation & Deployment

# Initial installation (creates /opt/crumbforest, systemd services, NGINX config)
sudo ./native-install.sh

# Update deployed application
sudo ./native-update.sh

# Create backup (database + code + logs + Qdrant vectors)
sudo ./native-backup.sh

Service Management

# Start/stop/restart services
sudo systemctl start crumbforest
sudo systemctl stop crumbforest
sudo systemctl restart crumbforest
sudo systemctl start crumbforest-indexing

# Check service status
sudo systemctl status crumbforest
sudo systemctl status crumbforest-indexing

# View live logs
sudo journalctl -u crumbforest -f
sudo journalctl -u crumbforest-indexing -f

# Enable auto-start on boot
sudo systemctl enable crumbforest
sudo systemctl enable crumbforest-indexing

Database Operations

# Initialize database schema and default users
sudo mysql -u root -p crumbforest < scripts/init_database.sql

# Connect to database
mysql -u crumb_prod -p crumbforest

# Backup database manually
mysqldump -u crumb_prod -p crumbforest > backup_$(date +%Y%m%d_%H%M%S).sql

NGINX Configuration

# Test NGINX configuration syntax
sudo nginx -t

# Reload NGINX (without downtime)
sudo systemctl reload nginx

# Restart NGINX
sudo systemctl restart nginx

# View NGINX error and access logs
sudo tail -f /var/log/nginx/crumbforest.error.log
sudo tail -f /var/log/nginx/crumbforest.access.log

Health Checks & Verification

# Check FastAPI health endpoint
curl http://localhost:8000/health

# Test NGINX reverse proxy
curl -I https://crumbforest.194-164-194-191.sslip.io/

# Verify Qdrant is running
curl http://localhost:6333/collections

# Check MariaDB connection
mysql -u crumb_prod -p -e "SELECT 'Connection OK' AS status;"
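The individual checks above can be combined so that every result is reported before the script exits. The following is a minimal sketch, not part of the repository: `run_check` and `check_crumbforest` are hypothetical helper names, and the MariaDB check is swapped for `systemctl is-active` so that it does not prompt for a password.

```shell
#!/bin/bash
# Sketch: run several health checks and summarize the results.
# run_check and check_crumbforest are illustrative names, not part of the repository.

failures=0

# Run a command quietly; print one status line and count failures.
run_check() {
    local name=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "[ OK ] $name"
    else
        echo "[FAIL] $name"
        failures=$((failures + 1))
    fi
}

# The endpoints below are the ones used elsewhere in this guide.
check_crumbforest() {
    failures=0
    run_check "FastAPI health endpoint" curl -fsS --max-time 5 http://localhost:8000/health
    run_check "Qdrant collections"      curl -fsS --max-time 5 http://localhost:6333/collections
    run_check "MariaDB service active"  systemctl is-active --quiet mariadb
    [ "$failures" -eq 0 ]
}
```

Calling `check_crumbforest` prints one line per check and returns non-zero if any check failed, so it could gate a later deployment step.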

Log Management

# View application logs
tail -f /var/log/crumbforest/app.log

# View systemd journal logs (last 100 lines)
sudo journalctl -u crumbforest -n 100

# View logs from specific date
sudo journalctl -u crumbforest --since "2025-12-24"

# Clear old journal entries (keeps the last 7 days)
sudo journalctl --vacuum-time=7d

Architecture & Key Concepts

Deployment Model

This is a native (non-Docker) deployment that runs directly on the Linux host:

  • Application installed to /opt/crumbforest/
  • Services managed by systemd (not Docker containers)
  • Direct localhost connections to MariaDB and Qdrant (no Docker networking)
  • NGINX acts as a reverse proxy, forwarding external traffic on ports 80/443 to FastAPI on port 8000

Service Architecture

User Request (HTTPS)
      ↓
NGINX (80/443) → Reverse Proxy
      ↓
FastAPI (localhost:8000) → systemd: crumbforest.service
      ↓
├─ MariaDB (localhost:3306) → User data, posts, metadata
└─ Qdrant (localhost:6333) → Vector embeddings for RAG

Background Service:
crumbforest-indexing.service → Auto-indexes markdown docs on startup

Key Differences from Docker Deployment

| Aspect | Docker (parent repo) | Native (this directory) |
|---|---|---|
| Service Management | docker-compose up | systemctl start crumbforest |
| Database Host | db (container name) | localhost |
| Qdrant Host | qdrant (container name) | localhost |
| Network | Docker bridge network | Direct localhost |
| Logs | docker logs crumbforest | journalctl -u crumbforest |
| Auto-start | Docker restart policy | systemctl enable |
| File Paths | Container volumes | /opt/crumbforest/ (direct) |

Important: When editing environment variables or connection strings, always use localhost instead of Docker service names (db, qdrant).
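A leftover Docker service name in a connection string is easy to miss after copying an environment file from the Docker setup. The sketch below shows one way to flag it. `find_docker_hosts` is a hypothetical helper, and the URL shapes it expects (a host that follows `@` or `//`) are an assumption about how `DATABASE_URL` and `QDRANT_URL` are written.

```shell
# Sketch: flag connection URLs in an env file that still point at Docker service names.
# find_docker_hosts is an illustrative helper; the URL shapes are assumed, not confirmed.
find_docker_hosts() {
    local env_file=$1
    # Match DATABASE_URL/QDRANT_URL values whose host part is exactly "db" or "qdrant".
    grep -nE '^(DATABASE_URL|QDRANT_URL)=.*(@|//)(db|qdrant)([:/"]|$)' "$env_file"
}

# Example use (grep succeeds when a Docker host name is found):
#   if find_docker_hosts /opt/crumbforest/.env; then
#       echo "Replace the Docker service names above with localhost" >&2
#   fi
```

The function prints each offending line with its line number and returns non-zero when the file is clean.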

Installation Scripts

  1. native-install.sh: Full installation script

    • Creates system user crumbforest:crumbforest
    • Sets up Python 3.11+ virtual environment at /opt/crumbforest/venv
    • Copies application code from parent repository
    • Generates secure secrets (APP_SECRET, SECRET_KEY)
    • Installs systemd service files
    • Configures NGINX reverse proxy
    • Sets proper file permissions (.env = 600, logs writable)
  2. native-update.sh: Update deployed application

    • Stops services
    • Creates automatic backup
    • Updates code via rsync or git pull
    • Reinstalls Python dependencies
    • Restarts services
    • Performs health check
  3. native-backup.sh: Comprehensive backup

    • Application code and config
    • MariaDB database dump
    • Qdrant vector database
    • Last 7 days of logs
    • Stored in /var/backups/crumbforest/
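The "last 7 days of logs" step can be illustrated with `find -mtime`. This is a sketch only: `collect_recent_logs` is a hypothetical name, the destination layout in the usage comment is assumed, and native-backup.sh may select logs differently.

```shell
# Sketch: copy only log files modified within the last N days into a backup directory.
# collect_recent_logs is an illustrative name; native-backup.sh may do this differently.
collect_recent_logs() {
    local src=$1 dest=$2 days=${3:-7}
    mkdir -p "$dest"
    # -mtime -N selects files modified less than N*24 hours ago.
    find "$src" -maxdepth 1 -type f -mtime "-$days" -exec cp -p {} "$dest"/ \;
}

# Example use (the timestamped subdirectory is an assumed layout):
#   collect_recent_logs /var/log/crumbforest \
#       "/var/backups/crumbforest/$(date +%Y%m%d_%H%M%S)/logs"
```

Using `cp -p` preserves modification times, which keeps the copied logs sortable by age inside the backup.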

Configuration Management

Configuration follows a multi-layer approach:

  1. Environment Variables (/opt/crumbforest/.env):

    • Database credentials (MARIADB_USER, MARIADB_PASSWORD)
    • API keys (OPENAI_API_KEY, OPENROUTER_API_KEY, ANTHROPIC_API_KEY)
    • Security secrets (APP_SECRET, SECRET_KEY)
    • Service URLs (DATABASE_URL, QDRANT_URL)
    • RAG settings (chunk size, overlap, default models)
  2. systemd Service Files (/etc/systemd/system/):

    • crumbforest.service: Main FastAPI application
    • crumbforest-indexing.service: Document indexing on startup
  3. NGINX Configuration (/etc/nginx/sites-available/):

    • crumbforest.nginx.conf: Server block (SSL, domain, proxy settings)
    • crumbforest-locations.conf: Location blocks (static files, API routes)
  4. Database Schema (scripts/init_database.sql):

    • Creates database crumbforest with utf8mb4 encoding
    • Users table with roles (admin/editor/user)
    • Default users: admin@crumb.local and demo@crumb.local

Security Considerations

When modifying deployment scripts or configurations:

  1. Secrets Management:

    • Never hardcode passwords in scripts
    • Use openssl rand -hex 32 to generate secure secrets
    • Ensure .env has mode 600 (only readable by owner)
    • Change default database passwords immediately after installation
  2. systemd Hardening:

    • Services run as non-root user (crumbforest)
    • NoNewPrivileges=true prevents privilege escalation
    • PrivateTmp=true isolates temporary files
    • ProtectSystem=strict prevents system file modification
    • Grant write access only to /opt/crumbforest/logs and /var/log/crumbforest (with ProtectSystem=strict, writable paths must be listed via ReadWritePaths=)
  3. Network Security:

    • FastAPI only listens on 127.0.0.1:8000 (not publicly accessible)
    • NGINX handles all external traffic
    • MariaDB and Qdrant should only bind to localhost
    • Configure firewall (ufw/iptables) to restrict ports
  4. File Permissions:

    • Application directory owned by crumbforest:crumbforest
    • Configuration files: mode 600
    • Scripts: mode 755 (executable)
    • Logs: writable by crumbforest user
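The permission rules above can be verified rather than only set. The sketch below compares a file's mode with the expected value. `check_mode` is a hypothetical helper, and `stat -c` is the GNU coreutils form, which is assumed to be available on the target Linux server.

```shell
# Sketch: compare a file's permission bits with the expected mode.
# check_mode is an illustrative helper; stat -c is the GNU coreutils form.
check_mode() {
    local path=$1 expected=$2 actual
    actual=$(stat -c '%a' "$path") || return 1
    if [ "$actual" = "$expected" ]; then
        echo "ok: $path is mode $actual"
    else
        echo "mismatch: $path is mode $actual, expected $expected" >&2
        return 1
    fi
}

# Example use:
#   check_mode /opt/crumbforest/.env 600
```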

RAG System (Parent Application)

The main application uses Retrieval-Augmented Generation:

Indexing Flow

Markdown Document → Chunking (1000 chars, 200 overlap)
                 ↓
            Embedding (OpenAI/OpenRouter/Claude)
                 ↓
            MD5 Hash (change detection)
                 ↓
            Store in Qdrant + metadata in MariaDB
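The window arithmetic and the hash-based change detection in this flow can be illustrated in a few lines of shell. This is a sketch under stated assumptions: the real indexing runs inside the parent application, and `chunk_offsets`, `content_hash`, and `needs_reindex` are hypothetical names that only mirror the numbers shown above (1000-character chunks, 200-character overlap, MD5).

```shell
# Sketch: the chunk-window arithmetic and hash-based change detection from the flow above.
# chunk_offsets, content_hash and needs_reindex are illustrative names only.

# Print the start offset of each chunk for a text of the given length.
chunk_offsets() {
    local length=$1 size=${2:-1000} overlap=${3:-200}
    local step=$((size - overlap)) offset=0
    if [ "$step" -le 0 ]; then
        echo "overlap must be smaller than chunk size" >&2
        return 1
    fi
    while [ "$offset" -lt "$length" ]; do
        echo "$offset"
        # Stop once the current window reaches the end of the text.
        if [ $((offset + size)) -ge "$length" ]; then
            break
        fi
        offset=$((offset + step))
    done
}

# MD5 of a file's content, used only to detect changes (not for security).
content_hash() {
    md5sum < "$1" | cut -d' ' -f1
}

# Succeed when the file's current hash differs from the stored one.
needs_reindex() {
    local file=$1 stored_hash=$2
    [ "$(content_hash "$file")" != "$stored_hash" ]
}
```

For a 2500-character document, `chunk_offsets 2500` prints 0, 800, and 1600: each window starts 800 characters after the previous one, so neighbouring chunks share 200 characters. A document whose hash matches the stored value can be skipped without calling the embedding provider again.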

Collections Structure

  • posts_{locale}: Blog posts per language (de/en/fr)
  • diary_child_{id}: Per-child diary entries
  • docs_crumbforest: Auto-indexed documentation
  • Custom collections based on configuration

AI Provider System

The application supports multiple embedding/completion providers through a factory pattern:

  • OpenAI: text-embedding-3-small, gpt-4o-mini
  • OpenRouter: Multi-model proxy (default for production)
  • Anthropic: claude-3-5-sonnet
  • Local: sentence-transformers (fallback)

Provider selection controlled via environment variables:

DEFAULT_EMBEDDING_PROVIDER=openrouter
DEFAULT_EMBEDDING_MODEL=text-embedding-3-small
DEFAULT_COMPLETION_PROVIDER=openrouter
DEFAULT_COMPLETION_MODEL=anthropic/claude-3-5-sonnet

Common Development Tasks

Modifying Installation Scripts

When editing native-install.sh, native-update.sh, or native-backup.sh:

  1. Always check for root privileges at script start
  2. Use set -e to exit on errors
  3. Add colored output functions (print_success, print_error, print_info)
  4. Validate prerequisites before making changes
  5. Create backups before destructive operations
  6. Test in a staging environment first

Example pattern:

#!/bin/bash
# Abort on the first failing command
set -e

# Refuse to run without root privileges
check_root() {
    if [ "$EUID" -ne 0 ]; then
        echo "Error: Must run as root" >&2
        exit 1
    fi
}

check_root
# ... rest of script
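The colored output functions and the prerequisite validation from the checklist can follow the same pattern. The sketch below is one possible shape. The function names `print_success`, `print_error`, and `print_info` come from the checklist, but their bodies and the `require_cmd` helper are assumptions, not the repository's actual code.

```shell
#!/bin/bash
# Sketch: colored output helpers and a prerequisite check for the deployment scripts.
# The function bodies are illustrative; the real scripts may define them differently.

RED='\033[0;31m'
GREEN='\033[0;32m'
BLUE='\033[0;34m'
RESET='\033[0m'

# printf interprets the escape sequences stored in the color variables.
print_success() { printf "${GREEN}[OK]${RESET} %s\n" "$1"; }
print_error()   { printf "${RED}[ERROR]${RESET} %s\n" "$1" >&2; }
print_info()    { printf "${BLUE}[INFO]${RESET} %s\n" "$1"; }

# Fail early if a required command is not installed.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        print_error "Missing prerequisite: $1"
        return 1
    fi
}

# Example use before making any changes:
#   require_cmd python3 && require_cmd nginx || exit 1
#   print_info "Prerequisites found, starting installation"
```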

Modifying systemd Services

When editing service files:

  1. After changes, reload systemd daemon:

    sudo systemctl daemon-reload
    
  2. Restart the affected service:

    sudo systemctl restart crumbforest
    
  3. Verify service status:

    sudo systemctl status crumbforest
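An edit to a service file can accidentally drop one of the hardening directives listed under Security Considerations. The sketch below checks for them before the daemon is reloaded. `check_hardening` is a hypothetical helper. It matches whole lines literally, so a directive written with surrounding whitespace or with `yes` instead of `true` would be reported as missing.

```shell
# Sketch: confirm a unit file still contains the hardening directives after an edit.
# check_hardening is an illustrative helper, not part of the repository.
check_hardening() {
    local unit_file=$1 missing=0 directive
    for directive in NoNewPrivileges=true PrivateTmp=true ProtectSystem=strict; do
        # -x matches the whole line, -F treats the directive as a literal string.
        if ! grep -qxF "$directive" "$unit_file"; then
            echo "missing: $directive" >&2
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example use:
#   check_hardening /etc/systemd/system/crumbforest.service && sudo systemctl daemon-reload
```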
    

Modifying NGINX Configuration

When editing NGINX configs:

  1. Always test syntax before applying:

    sudo nginx -t
    
  2. If test passes, reload (no downtime):

    sudo systemctl reload nginx
    
  3. Check error logs if issues occur:

    sudo tail -f /var/log/nginx/error.log
    

Database Schema Changes

When modifying scripts/init_database.sql:

  1. Always use IF NOT EXISTS for idempotency:

    CREATE TABLE IF NOT EXISTS new_table (...);
    
  2. Test on development database first

  3. Create migration scripts for existing deployments

  4. Document schema changes in comments

Adding New Environment Variables

When adding new configuration options:

  1. Add to env.production.template with descriptive comments
  2. Add default value in application's config.py (Pydantic settings)
  3. Document in deployment guide
  4. Update native-install.sh if auto-generation needed
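When a new variable needs an auto-generated value, the install script should avoid overwriting a value that already exists. A minimal sketch of that idea follows. `generate_secret`, `ensure_secret`, and the variable name `NEW_FEATURE_SECRET` are hypothetical, and the `/dev/urandom` fallback is an addition for hosts without `openssl`.

```shell
# Sketch: append a generated secret to an env file only if the variable is not set yet.
# generate_secret, ensure_secret and NEW_FEATURE_SECRET are illustrative names.

# 32 random bytes rendered as 64 hex characters.
generate_secret() {
    if command -v openssl >/dev/null 2>&1; then
        openssl rand -hex 32
    else
        head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
        echo
    fi
}

ensure_secret() {
    local env_file=$1 name=$2
    if grep -q "^${name}=" "$env_file" 2>/dev/null; then
        return 0   # keep the existing value
    fi
    printf '%s=%s\n' "$name" "$(generate_secret)" >> "$env_file"
    chmod 600 "$env_file"
}

# Example use:
#   ensure_secret /opt/crumbforest/.env NEW_FEATURE_SECRET
```

Because the function returns early when the variable is present, running the installer a second time leaves existing secrets untouched.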

Troubleshooting

Service Won't Start

# Check detailed error logs
sudo journalctl -u crumbforest -n 50 --no-pager

# Verify environment file exists and is readable
sudo ls -la /opt/crumbforest/.env

# Test Python environment
sudo -u crumbforest /opt/crumbforest/venv/bin/python --version

Database Connection Issues

# Test database connection
mysql -u crumb_prod -p -h localhost crumbforest

# Check if MariaDB is running
sudo systemctl status mariadb

# Verify DATABASE_URL in .env matches database credentials
sudo grep DATABASE_URL /opt/crumbforest/.env

NGINX 502 Bad Gateway

# Check if FastAPI is running
curl http://localhost:8000/health

# Verify FastAPI is listening on correct port
sudo ss -tlnp | grep 8000

# Check NGINX error logs
sudo tail -f /var/log/nginx/crumbforest.error.log
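A 502 can also mean that FastAPI is up but one of its backends is not. The sketch below probes the three local ports named in this guide without needing extra tools. `port_open` and `report_backend_ports` are hypothetical helpers, and the probe relies on bash's `/dev/tcp` redirection, which is assumed to be enabled in the installed bash build.

```shell
# Sketch: test whether a TCP port accepts connections, using bash's /dev/tcp.
# port_open and report_backend_ports are illustrative helpers.
port_open() {
    local host=$1 port=$2
    # The connection is attempted in a child bash so the descriptor closes on exit.
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Report FastAPI (8000), MariaDB (3306) and Qdrant (6333) in one pass.
report_backend_ports() {
    local port
    for port in 8000 3306 6333; do
        if port_open 127.0.0.1 "$port"; then
            echo "port $port: accepting connections"
        else
            echo "port $port: not reachable"
        fi
    done
}
```

Running `report_backend_ports` on the server narrows the problem to a specific service before reading through the logs.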

Qdrant Connection Failed

# Check if Qdrant is running
curl http://localhost:6333/collections

# Verify Qdrant service status
sudo systemctl status qdrant  # or docker ps | grep qdrant

Permission Denied Errors

# Fix ownership of application directory
sudo chown -R crumbforest:crumbforest /opt/crumbforest

# Fix log directory permissions
sudo chown -R crumbforest:crumbforest /var/log/crumbforest
sudo chmod 755 /var/log/crumbforest

# Verify .env file permissions
sudo chmod 600 /opt/crumbforest/.env

Default Credentials

Warning: Change these immediately after installation!

Database

  • User: crumb_prod
  • Password: Set during installation (check scripts/init_database.sql)
  • Database: crumbforest

Web Application

  • Admin: admin@crumb.local / admin123
  • Demo User: demo@crumb.local / demo123

API Keys

Must be configured in /opt/crumbforest/.env:

  • OPENAI_API_KEY
  • ANTHROPIC_API_KEY
  • OPENROUTER_API_KEY

At least one AI provider API key is required for RAG functionality.
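This requirement can be checked before the service starts. The following sketch assumes simple `KEY=value` lines in the environment file. `has_provider_key` is a hypothetical helper and does not validate the key against the provider.

```shell
# Sketch: verify that at least one provider API key has a non-empty value in an env file.
# has_provider_key is an illustrative helper; it assumes simple KEY=value lines.
has_provider_key() {
    local env_file=$1
    # Require at least one character after "=" that is not a quote or whitespace.
    grep -qE '^(OPENAI_API_KEY|ANTHROPIC_API_KEY|OPENROUTER_API_KEY)=["'\'']?[^"'\''[:space:]]' "$env_file"
}

# Example use:
#   has_provider_key /opt/crumbforest/.env || echo "No AI provider API key configured" >&2
```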

File Locations

Application Files

/opt/crumbforest/
├── app/                    # FastAPI application code
├── venv/                   # Python virtual environment
├── docs/                   # Documentation (auto-indexed)
├── logs/                   # Application logs
├── .env                    # Environment configuration (mode 600)
└── crumbforest_config.json # Central config (groups, roles)

System Files

/etc/systemd/system/
├── crumbforest.service           # Main FastAPI service
└── crumbforest-indexing.service  # Document indexing service

/etc/nginx/sites-available/
├── crumbforest.nginx.conf        # NGINX server block
└── crumbforest-locations.conf    # Location blocks

/var/log/crumbforest/             # Application logs
/var/backups/crumbforest/         # Automated backups

Important Notes

  1. This is a deployment directory: The actual application code lives in the parent repository. This directory only contains installation scripts and configuration for native (non-Docker) deployment.

  2. Use localhost for connections: Unlike Docker deployment, services connect via localhost, not container names. Always use localhost in DATABASE_URL and QDRANT_URL.

  3. Service dependencies: The FastAPI service depends on MariaDB and Qdrant being available. Ensure they are running before starting the crumbforest service.

  4. Backup before updates: The native-update.sh script creates automatic backups, but manual backups via native-backup.sh are recommended before major changes.

  5. Security first: The installation scripts run as root and the application runs as a system service. Always validate scripts, use secure passwords, and follow the principle of least privilege.

  6. Production domain: Default configuration uses crumbforest.194-164-194-191.sslip.io (sslip.io provides automatic DNS resolution for IP addresses). Update NGINX config for custom domains.