# 🗄️ Qdrant Access & Security

## 🔐 Security Status

### ✅ After the Fix (SECURE)
```yaml
ports:
  - "127.0.0.1:6333:6333"  # localhost only
```

**Access:**
- ✅ Locally (on the server): `http://localhost:6333`
- ✅ Via the Docker network: `http://qdrant:6333`
- ❌ From outside: NOT reachable (secure!)

### ⚠️ Before (INSECURE)
```yaml
ports:
  - "6333:6333"  # Published on all interfaces, publicly reachable!
```

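
The difference between the two port mappings can also be checked mechanically, for example in a pre-deploy script. The following is a minimal sketch, assuming Docker Compose short-syntax port strings; the helper name `is_loopback_only` is hypothetical and not part of the project.

```python
import ipaddress


def is_loopback_only(port_mapping: str) -> bool:
    """Return True if a Compose short-syntax port mapping binds to loopback only.

    Hypothetical helper for illustration. Handles "HOST_IP:HOST_PORT:CONTAINER_PORT"
    and "HOST_PORT:CONTAINER_PORT"; bracketed IPv6 host addresses are not handled.
    """
    mapping = port_mapping.split("/")[0]  # drop an optional "/tcp" or "/udp" suffix
    parts = mapping.split(":")
    if len(parts) < 3:
        # No host IP given: by default Docker publishes the port on all interfaces.
        return False
    host_ip = ":".join(parts[:-2])
    try:
        return ipaddress.ip_address(host_ip).is_loopback
    except ValueError:
        # Not a plain IP address: treat it as not provably loopback.
        return False


print(is_loopback_only("127.0.0.1:6333:6333"))  # the mapping after the fix
print(is_loopback_only("6333:6333"))            # the mapping before the fix
```

A check like this only inspects the configuration string; it does not prove that a firewall or another proxy is not exposing the port elsewhere.
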
## 🌐 Access Methods

### 1. Locally on the Server

```bash
# Open the dashboard (when working on the server; `open` is the macOS command, use `xdg-open` on Linux)
open http://localhost:6333/dashboard

# List collections
curl http://localhost:6333/collections | jq

# Collection details
curl http://localhost:6333/collections/docs_crumbforest_ | jq
```

### 2. Via the Docker Network (FastAPI App)

```python
# app/deps.py - already implemented
def get_qdrant_client():
    from qdrant_client import QdrantClient
    from config import get_settings

    settings = get_settings()
    # Uses the Docker network name: "qdrant"
    return QdrantClient(
        host=settings.qdrant_host,  # "qdrant"
        port=settings.qdrant_port   # 6333
    )
```

**Why does this work?**
- Containers on the same Docker network can reach each other by service name
- `qdrant` is resolved to the container's internal IP
- No external exposure needed!

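
To see the name resolution in action, a plain TCP check is enough. This is a minimal sketch using only the standard library; the function name `can_connect` is a hypothetical helper, not part of the app.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures (socket.gaierror), refused connections, and timeouts.
        return False
```

Called as `can_connect("qdrant", 6333)` inside the app container, this should succeed because Docker's internal DNS resolves the service name. On the host, the name `qdrant` is normally unknown, so the same call fails there while `can_connect("localhost", 6333)` succeeds.
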
### 3. Remote Access via SSH Tunnel

```bash
# From your local machine to the server
ssh -L 6333:localhost:6333 user@your-server.com

# Now open it locally
open http://localhost:6333/dashboard

# Or via the API
curl http://localhost:6333/collections | jq
```

**Explanation:**
- `-L 6333:localhost:6333` forwards local port 6333 to port 6333 on the server (the `localhost` in the middle is resolved on the server side)
- Traffic is encrypted by SSH
- The dashboard appears to run "locally" but shows the server's data

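
While the tunnel is open, scripts on your local machine can talk to Qdrant as if it ran locally. A minimal sketch, assuming the response shape that the `jq` filter `.result.collections[].name` in this document relies on; the function names are hypothetical.

```python
import json
import urllib.request


def collection_names(payload: dict) -> list[str]:
    """Extract collection names from a Qdrant GET /collections response body."""
    return [c["name"] for c in payload["result"]["collections"]]


def fetch_collection_names(base_url: str = "http://localhost:6333") -> list[str]:
    """Query Qdrant through the local end of the SSH tunnel."""
    with urllib.request.urlopen(f"{base_url}/collections", timeout=5) as resp:
        return collection_names(json.load(resp))
```

`fetch_collection_names()` raises `urllib.error.URLError` if the tunnel is not open, which makes it a quick way to check that the forwarding works.
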
### 4. Production Setup with Nginx

```nginx
# /etc/nginx/sites-available/qdrant
server {
    listen 443 ssl;
    server_name qdrant.crumbforest.de;

    ssl_certificate /etc/letsencrypt/live/qdrant.crumbforest.de/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/qdrant.crumbforest.de/privkey.pem;

    # Basic Auth
    auth_basic "Qdrant Admin";
    auth_basic_user_file /etc/nginx/.htpasswd;

    location / {
        proxy_pass http://localhost:6333;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

```bash
# Create the Basic Auth user (-c creates the file; omit it when adding further users)
sudo htpasswd -c /etc/nginx/.htpasswd admin

# Test the configuration, then reload Nginx
sudo nginx -t && sudo systemctl reload nginx
```

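
Once the proxy is in place, every request to Qdrant through it needs the Basic Auth credentials. A minimal sketch of building such a request with the standard library; the helper name and the password shown are placeholders, not real values.

```python
import base64
import urllib.request


def basic_auth_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build a GET request carrying an HTTP Basic Auth header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


# Placeholder credentials; use the ones created with htpasswd.
req = basic_auth_request("https://qdrant.crumbforest.de/collections", "admin", "change-me")
print(req.get_header("Authorization"))
# To actually send it: urllib.request.urlopen(req, timeout=5)
```

Basic Auth only encodes the credentials, it does not encrypt them, which is why the proxy should be reached over HTTPS only.
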
## 🔍 Searching Markdown Files

### Method 1: Via the API (Recommended)

```bash
# Search all documents
curl -X GET "http://localhost:8000/api/documents/search?q=docker&limit=10" \
  -H "Cookie: session=YOUR_SESSION_COOKIE"

# Crumbforest docs only
curl -X GET "http://localhost:8000/api/documents/search?q=python&category=crumbforest&limit=5" \
  -H "Cookie: session=YOUR_SESSION_COOKIE"

# Getting the session cookie (after login)
# In the browser: DevTools → Application → Cookies → session
```

**With Python:**
```python
import requests

# Log in (the session object keeps the cookie for later requests)
session = requests.Session()
response = session.post(
    "http://localhost:8000/de/login",
    data={
        "email": "admin@crumb.local",
        "password": "admin123",
        "csrf": "..."  # From the login form
    }
)
response.raise_for_status()  # Stop early on an HTTP error status

# Search
results = session.get(
    "http://localhost:8000/api/documents/search",
    params={"q": "docker", "limit": 10}
).json()

for result in results["results"]:
    print(f"{result['score']:.3f} - {result['title']}")
    print(f"  → {result['content'][:100]}...")
```

### Method 2: Directly in Qdrant

```bash
# Collection stats
curl http://localhost:6333/collections/docs_crumbforest_ | jq '.result | {
  status,
  points_count,
  indexed_vectors_count
}'

# Searching points (requires an embedding!)
# More complex; better done via the API
```

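
For completeness, this is roughly what a direct vector search involves. A minimal sketch against Qdrant's REST endpoint `POST /collections/{collection_name}/points/search`, assuming the collection uses a single unnamed vector and that the query vector was produced by the same embedding model used at indexing time. The helper name and the example vector are hypothetical; newer Qdrant releases may favor a different query endpoint, so check the version you run.

```python
import json
import urllib.request


def build_search_request(base_url: str, collection: str,
                         vector: list[float], limit: int = 5) -> urllib.request.Request:
    """Build (but do not send) a vector search request for one collection."""
    body = json.dumps(
        {"vector": vector, "limit": limit, "with_payload": True}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/collections/{collection}/points/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder vector: a real one must have the collection's dimensionality
# and come from the embedding provider.
req = build_search_request("http://localhost:6333", "docs_crumbforest_", [0.1, 0.2, 0.3], limit=3)
print(req.get_method(), req.full_url)
# To actually send it: json.load(urllib.request.urlopen(req, timeout=10))
```

The extra embedding step is the reason the application API from Method 1 is the more convenient route: it computes the query embedding for you.
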
### Method 3: Database Query (Metadata)

```bash
# All indexed documents
docker compose exec -T db sh -c 'mariadb -u$MARIADB_USER -p$MARIADB_PASSWORD $MARIADB_DATABASE -e "
  SELECT
    post_id,
    collection_name,
    JSON_EXTRACT(metadata, \"$.file_path\") as file_path,
    JSON_EXTRACT(metadata, \"$.category\") as category,
    chunk_count,
    indexed_at
  FROM post_vectors
  WHERE post_type=\"document\"
  ORDER BY indexed_at DESC
  LIMIT 20;
"'

# Search by file name
docker compose exec -T db sh -c 'mariadb -u$MARIADB_USER -p$MARIADB_PASSWORD $MARIADB_DATABASE -e "
  SELECT
    JSON_EXTRACT(metadata, \"$.file_path\") as file_path,
    chunk_count
  FROM post_vectors
  WHERE post_type=\"document\"
    AND JSON_EXTRACT(metadata, \"$.file_path\") LIKE \"%docker%\"
  ORDER BY indexed_at DESC;
"'
```

### Method 4: Filesystem

```bash
# Find all .md files
find docs/ -name "*.md" -type f

# Search by content
grep -r "docker" docs/ --include="*.md"

# With context (3 lines before and after each match)
grep -r -C 3 "docker compose" docs/ --include="*.md"

# Case-insensitive
grep -ri "python" docs/ --include="*.md"
```

## 📝 Registering a New Version

### Scenario: You Have Updated a .md File

#### Option 1: Automatic (Recommended)

```bash
# 1. Edit the file
nano docs/crumbforest/my_file.md

# 2. Restart the app (triggers auto-indexing)
cd compose
docker compose restart app

# 3. Check the logs
docker compose logs app | grep -A 10 "Document Indexing"

# Expected output:
# ✓ Using provider: openrouter
# 📚 Indexing documents...
#
# 📁 crumbforest:
# Files found: 283
# Indexed: 1 ← Only the changed file!
# Unchanged: 282
# Errors: 0
```

**How does this work?**
- Each file's hash is compared with the hash stored at the last indexing run
- Only changed files are re-indexed
- Saves time and API costs!

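
The idea behind hash-based change detection can be sketched in a few lines. This is an illustration only, assuming SHA-256 over the file bytes and a plain dictionary of stored hashes; the project's `DocumentIndexer` may use a different hash function and stores its state in the database. Both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path) -> str:
    """SHA-256 of a file's bytes (assumption: the real indexer may hash differently)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_reindex(docs_dir: Path, known_hashes: dict[str, str]) -> list[Path]:
    """Return .md files whose current hash differs from the stored one.

    known_hashes maps a file path (as a string) to the hash recorded at the
    last indexing run. New files are missing from it and are returned as well.
    """
    changed = []
    for path in sorted(docs_dir.rglob("*.md")):
        if known_hashes.get(str(path)) != file_hash(path):
            changed.append(path)
    return changed
```

Because only the returned files need new embeddings, a run over hundreds of unchanged files costs a few file reads instead of hundreds of embedding API calls.
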
#### Option 2: Manually via the API

```bash
# Force a re-index of all documents
curl -X POST "http://localhost:8000/api/documents/index" \
  -H "Content-Type: application/json" \
  -H "Cookie: session=..." \
  -d '{
    "provider": "openrouter",
    "force": true
  }'

# Only one category
curl -X POST "http://localhost:8000/api/documents/index" \
  -H "Content-Type: application/json" \
  -H "Cookie: session=..." \
  -d '{
    "category": "crumbforest",
    "provider": "openrouter",
    "force": true
  }'
```

#### Option 3: A Single File via Python

```python
# manual_index.py
import sys
sys.path.insert(0, 'app')  # Make the app modules importable

from pathlib import Path
from deps import get_db, get_qdrant_client
from config import get_settings
from services.provider_factory import ProviderFactory
from services.document_indexer import DocumentIndexer

# Setup
settings = get_settings()
db_conn = get_db()
qdrant = get_qdrant_client()
provider = ProviderFactory.create_provider("openrouter", settings)

# Indexer
indexer = DocumentIndexer(db_conn, qdrant, provider, "docs")

# Index a single file (force=True skips the hash comparison)
file_path = Path("docs/crumbforest/my_updated_file.md")
result = indexer.index_document(file_path, "crumbforest", force=True)

print(f"Status: {result['status']}")
print(f"Chunks: {result.get('chunks', 0)}")

db_conn.close()
```

## 🔄 Update Workflow

### Code Changes (Python)

```bash
# 1. Edit the code
nano app/routers/my_feature.py

# 2. Restart only the app (fast!)
docker compose restart app

# 3. Verify
curl http://localhost:8000/health
```

### Dependencies (requirements.txt)

```bash
# 1. Edit requirements.txt
nano app/requirements.txt

# 2. Rebuild
docker compose up --build -d

# 3. Verify
docker compose exec app pip list | grep new-package
```

### Docker Compose Changes

```bash
# 1. Edit docker-compose.yml
nano compose/docker-compose.yml

# 2. Recreate the services
docker compose up -d

# 3. Check the status
docker compose ps
```

### New .md Files

```bash
# 1. Add the file
cp new_doc.md docs/crumbforest/

# 2. Restart the app (triggers auto-indexing)
docker compose restart app

# 3. Check the logs
docker compose logs app | grep "Document Indexing"

# 4. Verify in Qdrant
curl http://localhost:6333/collections/docs_crumbforest_ | \
  jq '.result.points_count'
```

### Database Schema Changes

```bash
# 1. Create the SQL script
nano compose/init/99_my_migration.sql

# 2. Run it manually (init/ only runs when the database is first created!)
docker compose exec -T db sh -c \
  'mariadb -u$MARIADB_USER -p$MARIADB_PASSWORD $MARIADB_DATABASE' \
  < compose/init/99_my_migration.sql

# 3. Or recreate the DB (⚠️ -v removes the project's volumes, all data is lost!)
docker compose down -v
docker compose up -d
```

## 🛠️ Quick Reference

```bash
# Qdrant dashboard (local)
open http://localhost:6333/dashboard

# Qdrant via SSH tunnel
ssh -L 6333:localhost:6333 user@server

# Check collections
curl http://localhost:6333/collections | jq '.result.collections[].name'

# Search the docs (requires a session)
curl "http://localhost:8000/api/documents/search?q=docker" -H "Cookie: session=..."

# Check the status
curl http://localhost:8000/api/documents/status -H "Cookie: session=..."

# Force re-index
curl -X POST http://localhost:8000/api/documents/index \
  -H "Content-Type: application/json" \
  -H "Cookie: session=..." \
  -d '{"force": true}'

# Restart the app (auto-indexing)
docker compose restart app

# Rebuild (for code/dependency changes)
docker compose up --build -d

# Follow the logs live
docker compose logs app -f
```

## 🔒 Production Checklist

- [x] Qdrant bound to localhost only: `127.0.0.1:6333:6333`
- [ ] Nginx reverse proxy with SSL
- [ ] Basic Auth for the Qdrant dashboard
- [ ] Firewall rules (only ports 80/443 open, plus SSH for administration)
- [ ] SSH key-based auth (no passwords)
- [ ] Store environment variables securely
- [ ] Set up a backup cron job
- [ ] Monitoring (uptime, disk space)
- [ ] Log rotation
- [ ] Rate limiting for the API

## 💡 Tips

1. **Use `restart` instead of `up --build` when only code has changed** (this relies on the app code being mounted into the container; code baked into the image needs a rebuild)
   ```bash
   docker compose restart app   # Fast
   # instead of
   docker compose up --build    # Slow
   ```

2. **Use file-hash tracking**
   - Only changed files are re-indexed
   - Saves API costs!

3. **SSH tunnel for remote admin**
   - Exposes only the one forwarded port, whereas a VPN typically grants broader network access
   - No firewall changes needed beyond the SSH port that is already open

4. **Logs are your friends**
   ```bash
   # Find errors
   docker compose logs app | grep -i error

   # Indexing status
   docker compose logs app | grep "Document Indexing" -A 20
   ```

5. **Session cookie in the browser**
   - DevTools → Application → Cookies
   - Copy it for API tests

---

**Woohoo! Qdrant is now secure! 🦉**