# 🏢 RZ Operations Handbook - Crumbcore v1

**Audience:** RZ team, operations, admins
**System:** Crumbcore (FastAPI + RAG)
**Footprint:** 605 MB, 3 containers

---

## 📊 System Overview

```
Crumbcore Stack:
├── FastAPI App (256 MB RAM)
│   ├── RAG Engine (Qdrant Client)
│   ├── 3 AI Characters (Eule, Fox, Bugsy)
│   └── Document Search & Chat
├── MariaDB 11.7 (512 MB RAM)
│   └── User Management, Sessions
└── Qdrant 1.12.5 (512 MB RAM)
    └── Vector Storage (733 Docs indexed)

Total: ~1.3 GB RAM, 605 MB Disk
```

## 🚀 Initial Deployment

### 1. Preparation

```bash
# 1. Clone the repository (or unpack the tarball)
git clone <repo-url> crumbcore
cd crumbcore

# 2. Create the env file
cp .env.example .env.rz
nano .env.rz

# 3. Generate secrets
openssl rand -hex 32  # SECRET_KEY
openssl rand -hex 16  # DB_PASSWORD
openssl rand -hex 24  # DB_ROOT_PASSWORD
```

### 2. Deployment

```bash
# Automated deployment
./rz-deploy.sh

# Or manually:
docker compose -f rz-deployment.yml --env-file .env.rz up -d
```

### 3. Verify

```bash
# Health check
curl http://localhost:8000/health

# Collections check
curl http://localhost:6333/collections

# Login test (default credentials, change them after deployment)
curl -X POST http://localhost:8000/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email":"admin@crumb.local","password":"admin123"}'
```

---

## 🔄 Standard Operations

### Show Logs

```bash
# Live logs (all services)
docker compose -f rz-deployment.yml logs -f

# Application only
docker compose -f rz-deployment.yml logs -f app

# Last 100 lines
docker compose -f rz-deployment.yml logs --tail=100 app

# With timestamps
docker compose -f rz-deployment.yml logs -f -t app

# Errors only
docker compose -f rz-deployment.yml logs app | grep ERROR
```

### Restart

```bash
# Single service
docker compose -f rz-deployment.yml restart app

# All services
docker compose -f rz-deployment.yml restart

# With rebuild (after a code update)
docker compose -f rz-deployment.yml up -d --build app
```

### Stop/Start

```bash
# Stop
docker compose -f rz-deployment.yml stop

# Start
docker compose -f rz-deployment.yml start

# Down (removes containers, volumes are kept)
docker compose -f rz-deployment.yml down

# Down (including volumes - ⚠️ DATA LOSS!)
docker compose -f rz-deployment.yml down -v
```

---

## 💾 Backup & Restore

### Create a Backup

```bash
#!/bin/bash
set -euo pipefail

# Absolute path, so the directory can be bind-mounted with `docker run -v`
BACKUP_DIR="$(pwd)/backups/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"

# Database backup (the variables are expanded inside the db container)
docker compose -f rz-deployment.yml exec -T db \
  sh -c 'mariadb-dump -u"$MARIADB_USER" -p"$MARIADB_PASSWORD" "$MARIADB_DATABASE"' \
  > "$BACKUP_DIR/database.sql"

# Qdrant backup (volume)
docker run --rm \
  -v rz-crumbcore-qdrant-data:/data \
  -v "$BACKUP_DIR":/backup \
  alpine tar czf /backup/qdrant-data.tar.gz -C /data .

# App logs backup
docker run --rm \
  -v rz-crumbcore-app-logs:/data \
  -v "$BACKUP_DIR":/backup \
  alpine tar czf /backup/app-logs.tar.gz -C /data .

echo "✅ Backup created: $BACKUP_DIR"
```

### Restore

```bash
#!/bin/bash
set -euo pipefail

BACKUP_DIR="$(pwd)/backups/20250103_120000"  # Adjust!

# Database restore (the db container must be running)
docker compose -f rz-deployment.yml exec -T db \
  sh -c 'mariadb -u"$MARIADB_USER" -p"$MARIADB_PASSWORD" "$MARIADB_DATABASE"' \
  < "$BACKUP_DIR/database.sql"

# Qdrant restore
docker run --rm \
  -v rz-crumbcore-qdrant-data:/data \
  -v "$BACKUP_DIR":/backup \
  alpine sh -c "cd /data && tar xzf /backup/qdrant-data.tar.gz"

# Restart services
docker compose -f rz-deployment.yml restart

echo "✅ Restore complete"
```

---

## 🔧 Maintenance Tasks

### 1. Update Deployment

```bash
# 1. Create a backup (see above)
./backup-crumbcore.sh

# 2. Pull the new version
docker pull crumbcore:v1.1  # or a newer version

# 3. Update the image tag in rz-deployment.yml
nano rz-deployment.yml  # image: crumbcore:v1.1

# 4. Recreate only the app container
docker compose -f rz-deployment.yml up -d --no-deps app

# 5. Health check
curl http://localhost:8000/health
```

### 2. Re-Index Documents

```bash
# Re-index all documents
docker compose -f rz-deployment.yml exec app \
  python3 -c "
from scripts.index_docs import index_documents
index_documents('docs/rz-nullfeld', 'docs_rz_nullfeld_', force=True)
index_documents('docs/crumbforest', 'docs_crumbforest_', force=True)
print('✅ Re-indexing complete')
"

# Or via the API (with an admin login)
curl -X POST http://localhost:8000/api/documents/index \
  -H "Content-Type: application/json" \
  -H "Cookie: session=..." \
  -d '{"provider": "openrouter", "force": true}'
```

### 3. Database Maintenance

```bash
# Note: ${DB_ROOT_PASSWORD} is expanded by the calling shell and must be set there

# Optimize tables
docker compose -f rz-deployment.yml exec db \
  mariadb -u root -p"${DB_ROOT_PASSWORD}" crumbforest \
  -e "OPTIMIZE TABLE users, sessions, diary_entries;"

# Check table status
docker compose -f rz-deployment.yml exec db \
  mariadb -u root -p"${DB_ROOT_PASSWORD}" crumbforest \
  -e "SHOW TABLE STATUS;"

# InnoDB only: request a slow shutdown, so the next shutdown performs
# a full purge and change buffer merge (the closest thing to a vacuum)
docker compose -f rz-deployment.yml exec db \
  mariadb -u root -p"${DB_ROOT_PASSWORD}" \
  -e "SET GLOBAL innodb_fast_shutdown=0;"
```

### 4. Log Rotation

```bash
# Delete logs older than 30 days
find /var/lib/docker/volumes/rz-crumbcore-app-logs/_data \
  -name "*.jsonl" -mtime +30 -delete

# Or via Docker
docker run --rm \
  -v rz-crumbcore-app-logs:/logs \
  alpine find /logs -name "*.jsonl" -mtime +30 -delete
```

### 5. Cleanup

```bash
# Remove old images
docker image prune -a --filter "until=720h"

# Unused volumes (⚠️ careful!)
docker volume prune

# Unused networks
docker network prune

# System-wide cleanup (⚠️ also removes unused volumes!)
docker system prune -a --volumes
```

---

## 📊 Monitoring

### Health Endpoints

```bash
# Application health
curl http://localhost:8000/health
# → {"status": "healthy", "version": "1.0.0"}

# Qdrant health (Qdrant serves /healthz, /livez and /readyz)
curl http://localhost:6333/healthz
# → healthz check passed

# Database health
docker compose -f rz-deployment.yml exec db \
  sh -c 'mariadb -u"$MARIADB_USER" -p"$MARIADB_PASSWORD" -e "SELECT 1"'
```

### Metrics

```bash
# Qdrant collections status
curl -s http://localhost:6333/collections | jq .

# Container stats
docker stats rz-crumbcore-app rz-crumbcore-db rz-crumbcore-qdrant

# Disk usage
docker system df

# Size per volume (reading /var/lib/docker usually requires root)
docker volume ls -q | xargs docker volume inspect | \
  jq -r '.[] | "\(.Name)\t\(.Mountpoint)"' | \
  while IFS=$'\t' read -r name path; do
    size=$(du -sh "$path" 2>/dev/null | cut -f1)
    echo "$name: $size"
  done
```

### Logs & Alerting

```bash
# Critical errors (last hour)
docker compose -f rz-deployment.yml logs --since 1h app | \
  grep -i "error\|critical\|exception" | \
  tail -n 20

# Failed login attempts
docker compose -f rz-deployment.yml logs --since 1h app | \
  grep -c "Login failed"

# Rate limit hits
docker compose -f rz-deployment.yml logs --since 1h app | \
  grep -c "429"

# OpenRouter API errors
docker compose -f rz-deployment.yml logs --since 1h app | \
  grep "OpenRouter" | grep -i error
```

---

## 🔥 Troubleshooting

### Problem: Container Does Not Start

```bash
# Check the logs
docker compose -f rz-deployment.yml logs app

# Common causes:
# 1. Port already in use
lsof -i :8000

# 2. Wrong env variables
docker compose -f rz-deployment.yml config

# 3. Volume permissions
docker run --rm -v rz-crumbcore-app-logs:/data alpine ls -la /data
```

### Problem: Database Connection Failed

```bash
# 1. DB reachable?
docker compose -f rz-deployment.yml exec db \
  mariadb -u crumb -p -e "SELECT 1"

# 2. Password correct?
grep DB_PASSWORD .env.rz

# 3. Network OK?
docker network inspect rz-internal

# 4. Check the DB logs
docker compose -f rz-deployment.yml logs db | tail -n 50
```

### Problem: Qdrant Errors

```bash
# 1. Qdrant reachable?
curl http://localhost:6333/healthz

# 2. Collections present?
curl http://localhost:6333/collections

# 3. Storage issues?
docker volume inspect rz-crumbcore-qdrant-data

# 4. Re-initialize (⚠️ data loss!)
docker compose -f rz-deployment.yml down
docker volume rm rz-crumbcore-qdrant-data
docker compose -f rz-deployment.yml up -d
# Then re-index!
```

### Problem: High CPU/RAM Usage

```bash
# Show stats
docker stats --no-stream

# Top processes in the container
docker compose -f rz-deployment.yml exec app top

# Qdrant metrics (Prometheus format)
curl http://localhost:6333/metrics

# Set resource limits (rz-deployment.yml):
# deploy:
#   resources:
#     limits:
#       cpus: '2.0'
#       memory: 1G
```

### Problem: Slow Response Times

```bash
# 1. Measure the response time
time curl -s http://localhost:8000/health

# 2. Qdrant request durations
curl -s http://localhost:6333/metrics | grep duration_seconds

# 3. Database slow queries (currently running statements)
docker compose -f rz-deployment.yml exec db \
  mariadb -u root -p -e "SHOW FULL PROCESSLIST;"

# 4. Check the app logs for delays
docker compose -f rz-deployment.yml logs app | grep "took.*ms"
```

---

## 🔒 Security Checklist

### Check After Deployment:

```bash
# 1. Firewall active?
sudo ufw status
# Only 8000 (internal) and 6333 (internal) should be open

# 2. Secrets rotated?
grep -i "change_me" .env.rz
# Should find nothing!

# 3. Default passwords changed?
# Admin: admin@crumb.local / admin123 → CHANGE IT!

# 4. CORS correct?
curl -I http://localhost:8000/api/chat \
  -H "Origin: https://evil-site.com"
# The response must not include an Access-Control-Allow-Origin header for this origin
# (curl never reports a CORS error itself; browsers enforce the policy)

# 5. Rate limiting active?
for i in {1..10}; do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -X POST http://localhost:8000/api/chat \
    -H "Content-Type: application/json" \
    -d '{"character_id":"eule","question":"test","lang":"de"}'
done
# Expect 429 after 5 requests

# 6. TLS (if public)?
curl -I https://docs.rz-nullfeld.de
# Expect HTTP 200 and a Strict-Transport-Security (HSTS) header
```

---

## 📞 Incident Response

### Severity Levels

**P1 - Critical (immediately):**
- System down
- Data loss
- Security breach

**P2 - High (< 2h):**
- Performance issues
- Partial outage
- API errors > 10%

**P3 - Medium (< 8h):**
- Minor bugs
- Log warnings
- Single feature broken

**P4 - Low (next sprint):**
- Feature requests
- Documentation
- Cosmetic issues

### P1 Response

```bash
# 1. Assess
curl http://localhost:8000/health
docker compose -f rz-deployment.yml ps

# 2. Quick fix (restart)
docker compose -f rz-deployment.yml restart

# 3. If still down, restore from backup
./restore-crumbcore.sh

# 4. Notify stakeholders
# 5. Write the post-mortem doc
```

---

## 📝 Change Management

### Deployment Checklist

- [ ] Backup created
- [ ] Rollback plan ready
- [ ] Stakeholders informed
- [ ] Maintenance window scheduled
- [ ] Health checks prepared
- [ ] Logs monitored (30 min after deploy)

### Rollback Procedure

```bash
# 1. Stop the new version
docker compose -f rz-deployment.yml down

# 2. Set the image tag in rz-deployment.yml back to the previous version
nano rz-deployment.yml

# 3. Start the old version
docker compose -f rz-deployment.yml up -d

# 4. Restore the backup (see above; the database restore needs the db container running)
./restore-crumbcore.sh

# 5. Verify
curl http://localhost:8000/health
```

---

## 📚 Further Resources

- **Security Audit:** `docs/security/audit_2025-12-03_chat_v1_security.md`
- **Deployment Log:** `docs/security/DEPLOYMENT_SUCCESS_2025-12-03.md`
- **Quickstart:** `QUICKSTART.md`
- **Architecture:** `CLAUDE.md`

---

**Last updated:** 2025-12-04
**Version:** 1.0
**Maintainer:** RZ-Team

🌲 **Stay safe in the Crumbforest!**
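
The update, P1 and rollback steps above each verify the result with a single `curl` against `/health`. Directly after a restart the app may still be starting, so one request can fail even though the deployment is fine. The following is a minimal sketch of a polling helper, assuming the endpoint answers with a 2xx status once the app is ready. The function name `wait_for_health` and the defaults of 30 attempts with a 2 second pause are illustrative choices, not part of Crumbcore.

```bash
#!/bin/bash
# Hypothetical helper (not part of Crumbcore): poll a health URL until it
# answers successfully or the attempts are used up.
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP status >= 400; -s silences progress output
    if curl -fs --max-time 5 -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}
```

One possible use after a deploy or rollback is `wait_for_health http://localhost:8000/health || docker compose -f rz-deployment.yml logs --tail=100 app`, which prints the last app log lines only when the service never became healthy.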