This commit is contained in:
Ladebeze66 2025-03-27 17:59:10 +01:00
parent 3ac1fa1617
commit 7b89176a89
6360 changed files with 1431311 additions and 787 deletions

View File

@ -0,0 +1,65 @@
# SpecStory Artifacts Directory
This directory is automatically created and maintained by the SpecStory extension to preserve your Cursor composer and chat history.
## What's Here?
- `.specstory/history`: Contains markdown files of your AI coding sessions
- Each file represents a separate chat or composer session
- Files are automatically updated as you work
- `.specstory/cursor_rules_backups`: Contains backups of the `.cursor/rules/derived-cursor-rules.mdc` file
- Backups are automatically created each time the `.cursor/rules/derived-cursor-rules.mdc` file is updated
- You can enable/disable the Cursor Rules feature in the SpecStory settings; it is disabled by default
## Valuable Uses
- Capture: Keep your context window up-to-date when starting new Chat/Composer sessions via @ references
- Search: Find previous prompts and code snippets
- Learn: Meta-analyze your patterns and learn from your past experiences
- Derive: Keep Cursor on course with your past decisions by automatically deriving Cursor rules from your AI interactions
## Version Control
We recommend keeping this directory under version control to maintain a history of your AI interactions. However, if you prefer not to version these files, you can exclude them by adding this to your `.gitignore`:
```
.specstory
```
We recommend not keeping the `.specstory/cursor_rules_backups` directory under version control if you are already using git to version the `.cursor/rules` directory, and committing regularly. You can exclude it by adding this to your `.gitignore`:
```
.specstory/cursor_rules_backups
```
## Searching Your Codebase
When searching your codebase in Cursor, search results may include your previous AI coding interactions. To focus solely on your actual code files, you can exclude the AI interaction history from search results.
To exclude AI interaction history:
1. Open the "Find in Files" search in Cursor (Cmd/Ctrl + Shift + F)
2. Navigate to the "files to exclude" section
3. Add the following pattern:
```
.specstory/*
```
This will ensure your searches only return results from your working codebase files.
## Notes
- Auto-save only works when Cursor/sqlite flushes data to disk. This results in a small delay after the AI response is complete before SpecStory can save the history.
- Auto-save does not yet work on remote WSL workspaces.
## Settings
You can control auto-saving behavior in Cursor:
1. Open Cursor → Settings → VS Code Settings (Cmd/Ctrl + ,)
2. Search for "SpecStory"
3. Find "Auto Save" setting to enable/disable
Auto-save occurs when changes are detected in Cursor's sqlite database, or every 2 minutes as a safety net.

File diff suppressed because it is too large

302
README.md
View File

@ -19,6 +19,47 @@ Outil de prétraitement de documents PDF avec agents LLM modulables pour l'analy
- Tesseract OCR (for text recognition)
- Ollama (for the LLM models)
#### Checking the prerequisites (Windows)
A verification script is provided to help you identify missing components:
1. Right-click `check_prerequisites.ps1` and select "Run with PowerShell"
2. The script checks every prerequisite and tells you what is missing
3. Follow the instructions to install the missing components
### Setting up the virtual environment
Using a virtual environment to install and run this application is strongly recommended, especially on Windows:
#### With venv (recommended)
```bash
# Windows (PowerShell)
python -m venv venv
.\venv\Scripts\Activate.ps1
# Windows (CMD)
python -m venv venv
.\venv\Scripts\activate.bat
# Linux/macOS
python -m venv venv
source venv/bin/activate
```
#### With Conda (alternative)
```bash
# Create the environment
conda create -n ragflow_env python=3.9
conda activate ragflow_env
# Install the dependencies via pip
pip install -r requirements.txt
```
Once the virtual environment is activated, you can proceed with the installation.
### Method 1: Direct installation (recommended)
Clone this repository and install it with pip:
@ -39,24 +80,56 @@ Si vous préférez une installation manuelle :
pip install -r requirements.txt
```
### Method 3: Automated installation for Windows
For a simplified installation on Windows, use the provided installation scripts:
#### With the batch scripts (.bat)
1. Double-click `install_windows.bat`
2. The script creates a virtual environment and installs all the dependencies
3. Follow the on-screen instructions for the next steps
To launch the application afterwards:
1. Double-click `launch_windows.bat`
2. The script checks the Tesseract and Ollama installations before starting the application
#### With PowerShell (recommended for Windows 10/11)
If you prefer PowerShell (a friendlier interface with color-coded output):
1. Right-click `install_windows.ps1` and select "Run with PowerShell"
2. Follow the on-screen instructions to complete the installation
To launch the application afterwards:
1. Right-click `launch_windows.ps1` and select "Run with PowerShell"
2. The script runs a few checks before starting the application
These methods are recommended for Windows users who are not comfortable with the command line.
### Installing Tesseract OCR
For OCR, you must also install Tesseract:
- **Windows**: Download and install it from [https://github.com/UB-Mannheim/tesseract/wiki](https://github.com/UB-Mannheim/tesseract/wiki)
  - The application looks for Tesseract in the standard Windows paths (`C:\Program Files\Tesseract-OCR\tesseract.exe`, `C:\Program Files (x86)\Tesseract-OCR\tesseract.exe`, or `C:\Tesseract-OCR\tesseract.exe`)
  - Make sure to install the French and English language packs during installation
- **Linux**: `sudo apt install tesseract-ocr tesseract-ocr-fra tesseract-ocr-eng`
- **macOS**: `brew install tesseract`
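To confirm that Python can see the Tesseract binary and the installed language packs, you can run a quick check; this sketch assumes only `pytesseract`, which is already in the requirements:
```python
import pytesseract

# Both calls raise an error if the Tesseract binary cannot be found.
print(pytesseract.get_tesseract_version())
print(pytesseract.get_languages(config=""))
```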
### Connecting to the Ollama server
This application connects to a remote Ollama server for its LLM features:
- Server address: `217.182.105.173:11434`
- Models used:
  - `mistral:latest` (light summarization and translation)
  - `llava:34b-v1.6-fp16` (visual analysis)
  - `llama3.2-vision:90b-instruct-q8_0` (advanced visual analysis)
  - `deepseek-r1:70b-llama-distill-q8_0` (advanced summarization and translation)
  - `qwen2.5:72b-instruct-q8_0` (medium-weight translation)
Make sure your machine has network access to this server. No local Ollama installation is required.
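As a quick sanity check before launching the application, you can query the server's standard Ollama endpoints; this sketch assumes nothing beyond the `requests` library already listed in the dependencies:
```python
import requests

OLLAMA_URL = "http://217.182.105.173:11434"

# Confirm that the server answers, then list the models it currently hosts.
version = requests.get(f"{OLLAMA_URL}/api/version", timeout=5).json()
print(f"Ollama server version: {version.get('version')}")

tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5).json()
for model in tags.get("models", []):
    print(model.get("name"))
```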
## Usage
@ -89,18 +162,151 @@ python main.py
The application offers three analysis levels:
| Level       | Vision                            | Summary                       | Translation                   | Recommended use       |
| ----------- | --------------------------------- | ----------------------------- | ----------------------------- | --------------------- |
| 🔹 Light    | llava:34b-v1.6-fp16               | mistral:latest                | mistral:latest                | Debugging, prototypes |
| ⚪ Medium   | llava:34b-v1.6-fp16               | deepseek-r1:70b-llama-distill | qwen2.5:72b-instruct          | Normal use            |
| 🔸 Advanced | llama3.2-vision:90b-instruct-q8_0 | deepseek-r1:70b-llama-distill | deepseek-r1:70b-llama-distill | Critical documents    |
You can customize these configurations in the `config/llm_profiles.json` file.
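For example, you can load and inspect the profiles before editing them; this sketch assumes only that the file is valid JSON (its exact keys are project-specific):
```python
import json

# Load the LLM profiles shipped with the project.
with open("config/llm_profiles.json", encoding="utf-8") as f:
    profiles = json.load(f)

# Print the top-level structure to see which entries can be customized.
print(json.dumps(profiles, indent=2, ensure_ascii=False))
```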
## Details of the libraries used
The application uses the following libraries, each with a specific role in document processing:
### Main libraries
- **PyQt6**: Full graphical interface (v6.4.0+)
- **PyMuPDF (fitz)**: PDF document handling and rendering (v1.21.0+)
- **numpy**: Numerical processing of images and data (v1.22.0+)
- **pytesseract**: Python interface to Tesseract OCR (v0.3.9+)
- **Pillow**: Image processing (v9.3.0+)
- **opencv-python (cv2)**: Advanced image processing and content detection (v4.7.0+)
- **requests**: Communication with the Ollama API (v2.28.0+)
### Role of each library
- **PyQt6**: GUI framework that handles PDF viewing, interactive selections, agent configuration, and the overall user interface.
- **PyMuPDF (fitz)**: Converts PDF pages to images, provides access to PDF content, extracts pages, and produces high-quality renderings.
- **numpy**: Handles image data as arrays for OCR processing and content detection.
- **pytesseract**: Recognizes text in the images extracted from PDFs, with multilingual support.
- **Pillow + opencv-python**: Preprocess images before OCR to improve text recognition.
- **requests**: Sends requests to the Ollama service to use the AI models.
## Project structure and key modules
```
ragflow_pretraitement/
├── main.py                  # Main entry point
├── ui/                      # User interface
│   ├── viewer.py            # PDF viewing and selection (PyQt6, fitz)
│   └── llm_config_panel.py  # Agent configuration (PyQt6)
├── agents/                  # LLM agents
│   ├── base.py              # Base agent class
│   ├── vision.py            # Visual analysis agent (uses OllamaAPI)
│   ├── summary.py           # Summarization agent (uses OllamaAPI)
│   ├── translation.py       # Translation agent (uses OllamaAPI)
│   └── rewriter.py          # Rewriting agent (uses OllamaAPI)
├── utils/                   # Utilities
│   ├── ocr.py               # Text recognition (pytesseract, opencv)
│   ├── markdown_export.py   # Markdown export
│   └── api_ollama.py        # Communication with the Ollama API (requests)
├── config/                  # Configuration
│   └── llm_profiles.json    # Profiles of the LLM models used
└── data/                    # Data
    └── outputs/             # Generated results
```
### Details of the main modules
#### User interface (ui/)
- **viewer.py**: Implements the main `PDFViewer` class, which handles PDF viewing, zone selection, zoom, and page navigation.
- **llm_config_panel.py**: Implements the `LLMConfigPanel` class, used to configure the LLM agents and their parameters and to launch processing.
#### LLM agents (agents/)
- **base.py**: Defines the `LLMBaseAgent` base class with the methods shared by all agents.
- **vision.py**: Agent specialized in analyzing images and diagrams.
- **summary.py**: Agent for summarizing and synthesizing text.
- **translation.py**: Agent for translating content (typically from English to French).
- **rewriter.py**: Agent for rewriting and improving text.
#### Utilities (utils/)
- **ocr.py**: Contains the `OCRProcessor` class for extracting text from images with advanced preprocessing.
- **api_ollama.py**: Implements the `OllamaAPI` class for communicating with the Ollama service.
- **markdown_export.py**: Handles exporting results to Markdown format.
## Tracking the processing flows
The application now includes a complete logging system for the processing flows between the different LLM agents. This feature lets you follow and analyze in detail every step of the image and text analysis process.
### Processing flow for an image
When analyzing an image with context, the complete processing flow is:
1. **French text** → **Translation agent** → **English text**
2. **Image + English text** → **Vision agent** → **Analysis in English**
3. **Analysis in English** → **Translation agent** → **Analysis in French**
Each step is recorded with:
- The complete input data
- The complete output data
- The metadata (model used, content type, etc.)
- A precise timestamp
### Log locations
The logs are saved in the following directories:
- `data/workflows/`: Logs of the complete workflow (Markdown format)
- `data/translations/`: Detailed translation logs
- `data/images/`: Analyzed images and vision analysis results
- `data/outputs/`: Final combined results
### Disabling agents
It is now possible to disable certain LLM agents via the configuration file `config/agent_config.py`. By default, the summary agent is disabled so that the raw results of the processing flow can be observed.
To change which agents are active, edit the `ACTIVE_AGENTS` dictionary in that file (a consumption sketch follows the block below):
```python
# Configuration of enabled/disabled agents
ACTIVE_AGENTS = {
    "ocr": True,          # Optical character recognition agent
    "vision": True,       # Image analysis agent
    "translation": True,  # Translation agent
    "summary": False,     # Summarization agent (disabled by default)
    "rewriter": True      # Rewriting agent
}
```
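A minimal sketch of how these flags can be consumed; only the `ACTIVE_AGENTS` dictionary comes from the project, while `run_pipeline` and its placeholder stages are purely illustrative:
```python
from config.agent_config import ACTIVE_AGENTS

def translate(text: str) -> str:
    return text  # placeholder for the real translation stage

def summarize(text: str) -> str:
    return text  # placeholder for the real summarization stage

def run_pipeline(text: str) -> str:
    result = text
    # Skip any stage whose agent is disabled in the configuration.
    if ACTIVE_AGENTS.get("translation", False):
        result = translate(result)
    if ACTIVE_AGENTS.get("summary", False):
        result = summarize(result)
    return result
```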
## Windows adaptations
This version has been optimized for Windows with the following adaptations:
1. **Tesseract paths**: Automatic configuration of the Tesseract paths on Windows
2. **API URLs**: Configuration of the Ollama API to use `localhost` by default
3. **File paths**: Use of Windows-compatible path separators
4. **Unicode compatibility**: Support for special characters in Windows paths
## Practical features and use cases
### 1. Analyzing technical tables
- Select a complex table in a document
- Use the vision agent to recognize its structure
- Export the content as well-formatted Markdown
### 2. Translating technical documents
- Load a document in English
- Select sections of text
- Use the translation agent to get an accurate French version
- Export the result as Markdown
### 3. Summarizing long documentation
- Identify the important sections of a long document
- Apply the summary agent to each section
- Combine the results into a synthesized document
### 4. Analyzing diagrams and figures
- Select a complex diagram
- Use the vision agent to get a detailed description
- Add textual context as needed
## Troubleshooting common issues
@ -112,63 +318,35 @@ pip uninstall PyQt6 PyQt6-Qt6 PyQt6-sip
pip install PyQt6
```
### OCR issues on Windows
If OCR (Tesseract) is not working correctly:
1. Check that Tesseract is properly installed (restart the application after installing it)
2. Check that one of the standard paths exists (`C:\Program Files\Tesseract-OCR\tesseract.exe`)
3. If necessary, manually edit the path in `utils/ocr.py`
### Ollama server connectivity
If you cannot connect to the Ollama server:
1. Check your network connection and make sure you can reach `217.182.105.173:11434`
2. Make sure no firewall or proxy is blocking the connection
3. To check that the server is up, try opening `http://217.182.105.173:11434/api/version` in your browser
If you would rather use a local Ollama instance:
1. Change the URL in `utils/api_ollama.py` and `agents/base.py` to `http://localhost:11434`
2. Install Ollama locally from [https://ollama.ai/](https://ollama.ai/)
3. Download the required models with `ollama pull <model_name>`
### Missing models
If some models do not work, check that they are available on the server and, if necessary, update the model names in `config/llm_profiles.json`.
## Limitations
- OCR can be inaccurate on complex or low-quality documents
- Some models require a lot of memory (8GB+ RAM recommended)
- Complex mathematical formulas may be misinterpreted
## Contributing
Contributions are welcome! Feel free to submit pull requests or report issues.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@ -13,7 +13,7 @@ class LLMBaseAgent:
to interact with different language models.
"""
def __init__(self, model_name: str, endpoint: str = "http://217.182.105.173:11434", **config):
"""
Initialize an LLM agent

185
agents/ocr.py Normal file
View File

@ -0,0 +1,185 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Agent for optical character recognition (OCR) in images
"""
import os
import time
import uuid
import re
from typing import Dict, Optional, List, Any, Union
import pytesseract
from PIL import Image
import io
import platform
from .base import LLMBaseAgent
class OCRAgent(LLMBaseAgent):
"""
Agent for optical character recognition (OCR)
"""
def __init__(self, model_name: str = "ocr", endpoint: str = "", **config):
"""
Initialize the OCR agent
Args:
model_name (str): Model name (default "ocr" as OCR doesn't use LLM models)
endpoint (str): API endpoint (not used for OCR)
**config: Additional configuration like language, etc.
"""
# Call the parent constructor with the required parameters
super().__init__(model_name, endpoint, **config)
# Default configuration for OCR
default_config = {
"language": "fra", # Default language: French
"tesseract_config": "--psm 1 --oem 3", # Default Tesseract config
}
# Merge the defaults into the provided configuration without overriding user-supplied values
for key, value in default_config.items():
if key not in self.config:
self.config[key] = value
# Windows-specific configuration
if platform.system() == "Windows":
# Possible paths for Tesseract on Windows
possible_paths = [
r"C:\Program Files\Tesseract-OCR\tesseract.exe",
r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
r"C:\Tesseract-OCR\tesseract.exe",
r"C:\Users\PCDEV\AppData\Local\Programs\Tesseract-OCR\tesseract.exe",
r"C:\Users\PCDEV\Tesseract-OCR\tesseract.exe"
]
# Look for Tesseract in possible paths
tesseract_path = None
for path in possible_paths:
if os.path.exists(path):
tesseract_path = path
break
# Configure pytesseract with the found path
if tesseract_path:
self.config["tesseract_path"] = tesseract_path
pytesseract.pytesseract.tesseract_cmd = tesseract_path
print(f"Tesseract found at: {tesseract_path}")
else:
print("WARNING: Tesseract was not found in standard paths.")
print("Please install Tesseract OCR from: https://github.com/UB-Mannheim/tesseract/wiki")
print("Or manually specify the path with the tesseract_path parameter")
# If a path is provided in the configuration, use it anyway
if "tesseract_path" in self.config:
pytesseract.pytesseract.tesseract_cmd = self.config["tesseract_path"]
# Create directory for OCR logs
self.log_dir = os.path.join("data", "ocr_logs")
os.makedirs(self.log_dir, exist_ok=True)
def generate(self, prompt: str = "", images: Optional[List[bytes]] = None) -> str:
"""
Perform optical character recognition on provided images
Args:
prompt (str, optional): Not used for OCR
images (List[bytes], optional): List of images to process in bytes
Returns:
str: Text extracted from images
"""
if not images:
return "Error: No images provided for OCR"
results = []
image_count = len(images)
# Generate unique ID for this OCR session
ocr_id = str(uuid.uuid4())[:8]
timestamp = time.strftime("%Y%m%d-%H%M%S")
for i, img_bytes in enumerate(images):
try:
# Open image from bytes
img = Image.open(io.BytesIO(img_bytes))
# Perform OCR with Tesseract
lang = self.config.get("language", "fra")
config = self.config.get("tesseract_config", "--psm 1 --oem 3")
text = pytesseract.image_to_string(img, lang=lang, config=config)
# Basic text cleaning
text = self._clean_text(text)
if text:
results.append(text)
# Save image and OCR result
image_path = os.path.join(self.log_dir, f"{timestamp}_{ocr_id}_img{i+1}.png")
img.save(image_path, "PNG")
# Save extracted text
text_path = os.path.join(self.log_dir, f"{timestamp}_{ocr_id}_img{i+1}_ocr.txt")
with open(text_path, "w", encoding="utf-8") as f:
f.write(f"OCR Language: {lang}\n")
f.write(f"Tesseract config: {config}\n\n")
f.write(text)
print(f"OCR performed on image {i+1}/{image_count}, saved to: {text_path}")
except Exception as e:
error_msg = f"Error processing image {i+1}: {str(e)}"
print(error_msg)
# Log the error
error_path = os.path.join(self.log_dir, f"{timestamp}_{ocr_id}_img{i+1}_error.txt")
with open(error_path, "w", encoding="utf-8") as f:
f.write(f"Error processing image {i+1}:\n{str(e)}")
# Add error message to results
results.append(f"[OCR Error on image {i+1}: {str(e)}]")
# Combine all extracted texts
if not results:
return "No text could be extracted from the provided images."
combined_result = "\n\n".join(results)
# Save combined result
combined_path = os.path.join(self.log_dir, f"{timestamp}_{ocr_id}_combined.txt")
with open(combined_path, "w", encoding="utf-8") as f:
f.write(f"OCR Language: {self.config.get('language', 'fra')}\n")
f.write(f"Number of images: {image_count}\n\n")
f.write(combined_result)
return combined_result
def _clean_text(self, text: str) -> str:
"""
Clean the text extracted by OCR
Args:
text (str): Raw text to clean
Returns:
str: Cleaned text
"""
if not text:
return ""
# Remove spaces at beginning and end
text = text.strip()
# Remove multiple empty lines
text = re.sub(r'\n{3,}', '\n\n', text)
# Remove non-printable characters
text = ''.join(c for c in text if c.isprintable() or c == '\n')
return text
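A short usage sketch for this agent; the image path is a placeholder, and only `OCRAgent` and the `generate` signature above come from the file itself:
```python
from agents.ocr import OCRAgent

# "fra" is the default language; pass language="eng" for English documents.
agent = OCRAgent(language="fra")

# "page.png" stands in for any image extracted from a PDF page.
with open("page.png", "rb") as f:
    text = agent.generate(images=[f.read()])
print(text)
```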

View File

@ -5,25 +5,29 @@
LLM agent for rewriting and adapting content
"""
import os
import time
import uuid
from typing import Dict, Optional, Union, List, Any
from .base import LLMBaseAgent
from utils.api_ollama import OllamaAPI
class RewriterAgent(LLMBaseAgent):
"""
LLM agent specialized in rewriting and adapting text
"""
def __init__(self, model_name: str, endpoint: str = "http://217.182.105.173:11434", **config):
"""
Initialize the rewriting agent

Args:
model_name (str): Name of the rewriting model
endpoint (str): Ollama API URL
**config: Additional configuration parameters
"""
super().__init__(model_name, endpoint, **config)
# Define the rewriting modes and their prompts
self.modes = {
@ -83,26 +87,38 @@ class RewriterAgent(LLMBaseAgent):
}
}
# Create the directory for rewrite logs
self.log_dir = os.path.join("data", "rewrites")
os.makedirs(self.log_dir, exist_ok=True)

def generate(self, text: str, images: Optional[List[bytes]] = None, mode: str = "rag",
custom_prompt: Optional[str] = "", language: Optional[str] = None) -> str:
"""
Rewrite a text according to the specified mode

Args:
text (str): Text to rewrite
images (List[bytes], optional): Not used for rewriting
mode (str): Rewriting mode (simplifier, détailler, rag, formal, bullet)
custom_prompt (str, optional): Custom prompt for the rewrite
language (str, optional): Language of the text to rewrite

Returns:
str: The rewritten text
"""
if not text or not text.strip():
return "Erreur: Aucun texte fourni pour la reformulation"
# Determine the language to use
lang = language or self.config.get("language", "fr")

# Generate a unique ID for this rewrite
rewrite_id = str(uuid.uuid4())[:8]
timestamp = time.strftime("%Y%m%d-%H%M%S")
if custom_prompt:
prompt_template = custom_prompt
mode_name = "custom"
else:
# Check that the mode exists for the specified language
if lang not in self.modes or mode not in self.modes[lang]:
@ -111,67 +127,81 @@ class RewriterAgent(LLMBaseAgent):
mode = "rag"
prompt_template = self.modes[lang][mode]
mode_name = mode
# Format the prompt with the text
prompt = prompt_template.format(text=text)

# Create the Ollama API client for the direct call
api = OllamaAPI(base_url=self.endpoint)

# Log the full prompt
print(f"Envoi du prompt de reformulation (mode {mode_name}) au modèle {self.model_name}")
try:
# For models that support the chat format
if any(name in self.model_name.lower() for name in ["llama", "mistral", "deepseek", "qwen"]):
# Format as chat messages
system_prompt = f"Tu es un expert en reformulation de texte spécialisé dans le mode '{mode_name}'."
messages = [
{"role": "system", "content": system_prompt},
{"role": "user", "content": prompt}
]
response = api.chat(
model=self.model_name,
messages=messages,
options={
"temperature": self.config.get("temperature", 0.3),
"top_p": self.config.get("top_p", 0.95),
"top_k": self.config.get("top_k", 40),
"num_predict": self.config.get("max_tokens", 2048)
}
)
if "message" in response and "content" in response["message"]:
result = response["message"]["content"]
else:
result = response.get("response", "Erreur: Format de réponse inattendu")
else:
# Standard generation format for the other models
response = api.generate(
model=self.model_name,
prompt=prompt,
options={
"temperature": self.config.get("temperature", 0.3),
"top_p": self.config.get("top_p", 0.95),
"top_k": self.config.get("top_k", 40),
"num_predict": self.config.get("max_tokens", 2048)
}
)
result = response.get("response", "Erreur: Pas de réponse")
# Save the rewrite to a file
rewrite_path = os.path.join(self.log_dir, f"{timestamp}_{rewrite_id}.txt")
with open(rewrite_path, "w", encoding="utf-8") as f:
f.write(f"Language: {lang}\n")
f.write(f"Mode: {mode_name}\n")
f.write(f"Model: {self.model_name}\n\n")
f.write(f"Original text:\n{text}\n\n")
f.write(f"Rewritten text:\n{result}")

print(f"Reformulation enregistrée dans: {rewrite_path}")
return result

except Exception as e:
error_msg = f"Erreur lors de la reformulation: {str(e)}"
print(error_msg)

# Log the error
error_path = os.path.join(self.log_dir, f"{timestamp}_{rewrite_id}_error.txt")
with open(error_path, "w", encoding="utf-8") as f:
f.write(f"Language: {lang}\n")
f.write(f"Mode: {mode_name}\n")
f.write(f"Model: {self.model_name}\n\n")
f.write(f"Original text:\n{text}\n\n")
f.write(f"Error:\n{str(e)}")

return error_msg
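A minimal usage sketch for the agent above; the model name is one of those listed in the README, and `"rag"` is one of the built-in mode keys:
```python
from agents.rewriter import RewriterAgent

agent = RewriterAgent("mistral:latest")
print(agent.generate("Texte à reformuler pour un système RAG.", mode="rag"))
```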

View File

@ -2,182 +2,175 @@
# -*- coding: utf-8 -*-
"""
Agent for text synthesis and summarization
"""
import os
import time
import uuid
from typing import List, Optional, Dict, Any
from .base import LLMBaseAgent
from utils.api_ollama import OllamaAPI
class SummaryAgent(LLMBaseAgent):
"""
Agent for text synthesis and summarization
"""
def __init__(self, model_name: str, endpoint: str = "http://217.182.105.173:11434", **config):
"""
Initialize the summarization agent

Args:
model_name (str): Name of the model to use
endpoint (str): Ollama API URL
**config: Additional configuration
"""
# Call the parent class constructor with the required parameters
super().__init__(model_name, endpoint, **config)
# Default configuration for summaries
default_config = {
"language": "fr",
"summary_length": "medium"  # 'short', 'medium', 'long'
}
# Update the configuration with default values when they are not specified
for key, value in default_config.items():
if key not in self.config:
self.config[key] = value

# Create the directory for summary logs
self.log_dir = os.path.join("data", "summaries")
os.makedirs(self.log_dir, exist_ok=True)

def generate(self, prompt: str, images: Optional[List[bytes]] = None,
summary_length: Optional[str] = None, language: Optional[str] = None) -> str:
"""
Summarize a text

Args:
prompt (str): Text to summarize
images (List[bytes], optional): Not used for summarization
summary_length (str, optional): Summary length ('short', 'medium', 'long')
language (str, optional): Language of the summary

Returns:
str: Generated summary
"""
if not prompt or not prompt.strip():
return "Erreur: Aucun texte fourni pour le résumé"
# Use the specified parameters or those from the configuration
length = summary_length or self.config["summary_length"]
lang = language or self.config["language"]

# Generate a unique ID for this summary
summary_id = str(uuid.uuid4())[:8]
timestamp = time.strftime("%Y%m%d-%H%M%S")
# Define the length instructions according to the language
length_instructions = {
"short": {
"fr": "Fais un résumé court et concis (environ 2-3 phrases).",
"en": "Create a short and concise summary (about 2-3 sentences)."
},
"medium": {
"fr": "Fais un résumé de taille moyenne (environ 1-2 paragraphes).",
"en": "Create a medium-length summary (about 1-2 paragraphs)."
},
"long": {
"fr": "Fais un résumé détaillé (environ 3-4 paragraphes).",
"en": "Create a detailed summary (about 3-4 paragraphs)."
}
}

# Get the length instruction in the appropriate language
length_instruction = length_instructions.get(length, length_instructions["medium"]).get(lang, length_instructions["medium"]["fr"])

# Build the prompt for the model
if lang == "fr":
system_prompt = f"Tu es un expert en synthèse et résumé de textes. {length_instruction} "
system_prompt += "Conserve l'information essentielle et la structure logique du texte original."
else:
system_prompt = f"You are an expert in summarizing texts. {length_instruction} "
system_prompt += "Preserve the essential information and logical structure of the original text."

# Build the user message with the text to summarize
user_prompt = prompt

# Create the Ollama API client for the direct call
api = OllamaAPI(base_url=self.endpoint)

# Log the full prompt
full_prompt = f"System: {system_prompt}\n\nUser: {user_prompt}"
print(f"Envoi du prompt de résumé au modèle {self.model_name}:\n{system_prompt}")
try:
# For models that support the chat format
if any(name in self.model_name.lower() for name in ["llama", "mistral", "deepseek", "qwen"]):
# Format as chat messages
messages = [
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt}
]
response = api.chat(
model=self.model_name,
messages=messages,
options={
"temperature": self.config.get("temperature", 0.2),
"top_p": self.config.get("top_p", 0.95),
"top_k": self.config.get("top_k", 40),
"num_predict": self.config.get("max_tokens", 2048)
}
)
if "message" in response and "content" in response["message"]:
result = response["message"]["content"]
else:
result = response.get("response", "Erreur: Format de réponse inattendu")
else:
# Standard generation format for the other models
prompt_text = f"{system_prompt}\n\n{user_prompt}"
response = api.generate(
model=self.model_name,
prompt=prompt_text,
options={
"temperature": self.config.get("temperature", 0.2),
"top_p": self.config.get("top_p", 0.95),
"top_k": self.config.get("top_k", 40),
"num_predict": self.config.get("max_tokens", 2048)
}
)
result = response.get("response", "Erreur: Pas de réponse")
# Save the summary to a file
summary_path = os.path.join(self.log_dir, f"{timestamp}_{summary_id}.txt")
with open(summary_path, "w", encoding="utf-8") as f:
f.write(f"Language: {lang}\n")
f.write(f"Summary length: {length}\n")
f.write(f"Model: {self.model_name}\n\n")
f.write(f"Original text:\n{prompt}\n\n")
f.write(f"Summary:\n{result}")

print(f"Résumé enregistré dans: {summary_path}")
return result
except Exception as e:
error_msg = f"Erreur lors de la génération du résumé: {str(e)}"
print(error_msg)

# Log the error
error_path = os.path.join(self.log_dir, f"{timestamp}_{summary_id}_error.txt")
with open(error_path, "w", encoding="utf-8") as f:
f.write(f"Language: {lang}\n")
f.write(f"Summary length: {length}\n")
f.write(f"Model: {self.model_name}\n\n")
f.write(f"Original text:\n{prompt}\n\n")
f.write(f"Error:\n{str(e)}")

return error_msg
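A minimal usage sketch for the agent above; the model name comes from the README's model list, and omitted parameters fall back to the defaults ("medium", "fr"):
```python
from agents.summary import SummaryAgent

agent = SummaryAgent("deepseek-r1:70b-llama-distill-q8_0")
print(agent.generate("Long texte technique à résumer...", summary_length="short"))
```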

View File

@ -2,140 +2,151 @@
# -*- coding: utf-8 -*-
"""
Agent for text translation
"""
import os
import time
import uuid
from typing import List, Optional, Dict, Any
from .base import LLMBaseAgent
from utils.api_ollama import OllamaAPI
class TranslationAgent(LLMBaseAgent):
"""
Agent for text translation
"""
def __init__(self, model_name: str, endpoint: str = "http://217.182.105.173:11434", **config):
"""
Initialize the translation agent

Args:
model_name (str): Name of the model to use
endpoint (str): Ollama API URL
**config: Additional configuration
"""
super().__init__(model_name, endpoint, **config)
# Default configuration for translation
default_config = {
"source_language": "en",
"target_language": "fr"
}

# Update the configuration with default values when they are not specified
for key, value in default_config.items():
if key not in self.config:
self.config[key] = value

# Create the directory for translation logs
self.log_dir = os.path.join("data", "translations")
os.makedirs(self.log_dir, exist_ok=True)
def generate(self, prompt: str, images: Optional[List[bytes]] = None,
source_language: Optional[str] = None, target_language: Optional[str] = None) -> str:
"""
Translate a text

Args:
prompt (str): Text to translate
images (List[bytes], optional): Not used for translation
source_language (str, optional): Source language (default: from the configuration)
target_language (str, optional): Target language (default: from the configuration)

Returns:
str: Generated translation
"""
if not prompt or not prompt.strip():
return "Erreur: Aucun texte fourni pour la traduction"
# Use the specified languages or those from the configuration
src_lang = source_language or self.config["source_language"]
tgt_lang = target_language or self.config["target_language"]

# Generate a unique ID for this translation
translation_id = str(uuid.uuid4())[:8]
timestamp = time.strftime("%Y%m%d-%H%M%S")

# Build the prompt for the model
system_prompt = f"You are a professional and accurate translator. Translate the following text "
system_prompt += f"from {src_lang} to {tgt_lang}. Maintain the original formatting and structure as much as possible."

# Build the user message with the text to translate
user_prompt = prompt

# Create the Ollama API client for the direct call
api = OllamaAPI(base_url=self.endpoint)

# Log the full prompt
full_prompt = f"System: {system_prompt}\n\nUser: {user_prompt}"
print(f"Envoi du prompt de traduction au modèle {self.model_name}:\n{system_prompt}")
try:
# For models that support the chat format
if any(name in self.model_name.lower() for name in ["llama", "mistral", "deepseek", "qwen"]):
# Format as chat messages
messages = [
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt}
]
response = api.chat(
model=self.model_name,
messages=messages,
options={
"temperature": self.config.get("temperature", 0.1),  # Lower temperature for accurate translation
"top_p": self.config.get("top_p", 0.95),
"top_k": self.config.get("top_k", 40),
"num_predict": self.config.get("max_tokens", 2048)
}
)
if "message" in response and "content" in response["message"]:
result = response["message"]["content"]
else:
result = response.get("response", "Erreur: Format de réponse inattendu")
else:
# Standard generation format for the other models
prompt_text = f"{system_prompt}\n\n{user_prompt}"
response = api.generate(
model=self.model_name,
prompt=prompt_text,
options={
"temperature": self.config.get("temperature", 0.1),
"top_p": self.config.get("top_p", 0.95),
"top_k": self.config.get("top_k", 40),
"num_predict": self.config.get("max_tokens", 2048)
}
)
result = response.get("response", "Erreur: Pas de réponse")
# Save the translation to a file
translation_path = os.path.join(self.log_dir, f"{timestamp}_{translation_id}.txt")
with open(translation_path, "w", encoding="utf-8") as f:
f.write(f"Source language: {src_lang}\n")
f.write(f"Target language: {tgt_lang}\n")
f.write(f"Model: {self.model_name}\n\n")
f.write(f"Original text:\n{prompt}\n\n")
f.write(f"Translation:\n{result}")
print(f"Traduction enregistrée dans: {translation_path}")
return result
except Exception as e:
error_msg = f"Erreur lors de la traduction: {str(e)}"
print(error_msg)

# Log the error
error_path = os.path.join(self.log_dir, f"{timestamp}_{translation_id}_error.txt")
with open(error_path, "w", encoding="utf-8") as f:
f.write(f"Source language: {src_lang}\n")
f.write(f"Target language: {tgt_lang}\n")
f.write(f"Model: {self.model_name}\n\n")
f.write(f"Original text:\n{prompt}\n\n")
f.write(f"Error:\n{str(e)}")

return error_msg
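A minimal usage sketch for the agent above; by default it translates from "en" to "fr", and both languages can be overridden per call:
```python
from agents.translation import TranslationAgent

agent = TranslationAgent("qwen2.5:72b-instruct-q8_0")
print(agent.generate("The diagram shows the full processing pipeline.",
                     source_language="en", target_language="fr"))
```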

View File

@ -2,145 +2,192 @@
# -*- coding: utf-8 -*-
"""
Agent for analyzing images and diagrams
"""
import os
import time
import uuid
import io
from typing import List, Optional, Dict, Any
from PIL import Image
from .base import LLMBaseAgent
from utils.api_ollama import OllamaAPI
class VisionAgent(LLMBaseAgent):
"""
Agent for image analysis with multimodal models
"""
def __init__(self, model_name: str, endpoint: str = "http://217.182.105.173:11434", **config):
"""
Initialize the vision agent

Args:
model_name (str): Name of the model to use
endpoint (str): Ollama API URL
**config: Additional configuration
"""
super().__init__(model_name, endpoint, **config)
# Default configuration
default_config = {
"save_images": True  # Save images by default
}

# Update the configuration with default values when they are not specified
for key, value in default_config.items():
if key not in self.config:
self.config[key] = value

# Create the directory for saving analyzed images
self.image_dir = os.path.join("data", "images")
os.makedirs(self.image_dir, exist_ok=True)
def generate(self, prompt: Optional[str] = "", images: Optional[List[bytes]] = None,
selection_type: str = "autre", context: Optional[str] = "") -> str:
"""
Generate a description or an analysis of an image

Args:
prompt (str, optional): Additional prompt (not used)
images (List[bytes], optional): List of images to analyze
selection_type (str): Type of the selection (schéma, tableau, formule...)
context (str, optional): Textual context

Returns:
str: Description generated by the model
"""
if not images or len(images) == 0:
return "Erreur: Aucune image fournie pour l'analyse"
image_data = images[0]

# Save the image for future reference, only if the option is enabled
image_id = str(uuid.uuid4())[:8]
timestamp = time.strftime("%Y%m%d-%H%M%S")
image_filename = f"{timestamp}_{image_id}.png"
image_path = os.path.join(self.image_dir, image_filename)

if self.config.get("save_images", True):
try:
# Save the image
img = Image.open(io.BytesIO(image_data))
img.save(image_path)
print(f"Image sauvegardée: {image_path}")
except Exception as e:
print(f"Erreur lors de la sauvegarde de l'image: {str(e)}")
# Construction du prompt en anglais pour le modèle
system_prompt = "Analyze the following image"
# Mapper les types de sélection en français vers l'anglais
content_type_mapping = {
"schéma": "diagram",
"tableau": "table",
"formule": "formula",
"graphique": "chart",
"autre": "content"
}
# Obtenir le type en anglais ou utiliser le type original
content_type_en = content_type_mapping.get(selection_type.lower(), selection_type)
system_prompt += f" which contains {content_type_en}"
# Ajout d'instructions spécifiques selon le type de contenu
if content_type_en == "diagram":
system_prompt += ". Please describe in detail what this diagram shows, including all components, connections, and what it represents."
elif content_type_en == "table":
system_prompt += ". Please extract and format the table content, describing its structure, headers, and data. If possible, recreate the table structure."
elif content_type_en == "formula" or content_type_en == "equation":
system_prompt += ". Please transcribe this mathematical formula/equation and explain what it represents and its components."
elif content_type_en == "chart" or content_type_en == "graph":
system_prompt += ". Please describe this chart/graph in detail, including the axes, data points, trends, and what information it conveys."
else:
system_prompt += ". Please provide a detailed description of what you see in this image."
# Ajouter des instructions générales
system_prompt += "\n\nPlease be detailed and precise in your analysis."
# Préparer le prompt avec le contexte
user_prompt = ""
if context and context.strip():
# Le contexte est déjà en français, pas besoin de le traduire
# mais préciser explicitement que c'est en français pour le modèle
user_prompt = f"Here is additional context that may help with your analysis (may be in French):\n{context}"
# Créer l'API Ollama pour l'appel direct
api = OllamaAPI(base_url=self.endpoint)
# Journaliser le prompt complet
full_prompt = f"System: {system_prompt}\n\nUser: {user_prompt}"
print(f"Envoi du prompt au modèle {self.model_name}:\n{full_prompt}")
try:
# Pour chaque image, encoder en base64
base64_images = []
for image in images:
if isinstance(image, bytes):
base64_image = base64.b64encode(image).decode("utf-8")
base64_images.append(base64_image)
# Pour les modèles qui supportent le format de chat
if "llama" in self.model_name.lower() or "llava" in self.model_name.lower():
# Formater en tant que messages de chat
messages = [
{"role": "system", "content": system_prompt}
]
# Construire la payload pour l'API Ollama
payload = {
"model": self.model_name,
"prompt": prompt,
"images": base64_images,
"options": {
if user_prompt:
messages.append({"role": "user", "content": user_prompt})
response = api.chat(
model=self.model_name,
messages=messages,
images=[image_data],
options={
"temperature": self.config.get("temperature", 0.2),
"top_p": self.config.get("top_p", 0.95),
"top_k": self.config.get("top_k", 40),
"num_predict": self.config.get("max_tokens", 1024)
}
}
# Dans une implémentation réelle, envoyer la requête à l'API Ollama
# response = requests.post(f"{self.endpoint}/api/generate", json=payload)
# json_response = response.json()
# return json_response.get("response", "")
# Pour cette démonstration, retourner une réponse simulée
if selection_type == "schéma":
return "Le schéma présenté illustre un processus structuré avec plusieurs étapes interconnectées. " \
"On peut observer une organisation hiérarchique des éléments, avec des flèches indiquant " \
"le flux d'information ou la séquence d'opérations. Les différentes composantes sont " \
"clairement délimitées et semblent représenter un workflow ou un système de classification."
elif selection_type == "tableau":
return "Ce tableau contient plusieurs colonnes et rangées de données structurées. " \
"Il présente une organisation systématique d'informations, probablement des " \
"valeurs numériques ou des catégories. Les en-têtes indiquent le type de données " \
"dans chaque colonne, et l'ensemble forme une matrice cohérente d'informations liées."
elif selection_type == "formule":
return "Cette formule mathématique représente une relation complexe entre plusieurs variables. " \
"Elle utilise divers opérateurs et symboles mathématiques pour exprimer un concept ou " \
"une règle. La structure suggère qu'il s'agit d'une équation importante dans son domaine, " \
"possiblement liée à un phénomène physique ou à un modèle théorique."
)
if "message" in response and "content" in response["message"]:
result = response["message"]["content"]
else:
return "L'image montre un contenu visuel structuré qui présente des informations importantes " \
"dans le contexte du document. Les éléments visuels sont organisés de manière à faciliter " \
"la compréhension d'un concept ou d'un processus spécifique. La qualité et la disposition " \
"des éléments suggèrent qu'il s'agit d'une représentation professionnelle destinée à " \
"communiquer efficacement l'information."
result = response.get("response", "Erreur: Format de réponse inattendu")
else:
# Format de génération standard pour les autres modèles
prompt_text = system_prompt
if user_prompt:
prompt_text += f"\n\n{user_prompt}"
response = api.generate(
model=self.model_name,
prompt=prompt_text,
images=[image_data],
options={
"temperature": self.config.get("temperature", 0.2),
"top_p": self.config.get("top_p", 0.95),
"top_k": self.config.get("top_k", 40),
"num_predict": self.config.get("max_tokens", 1024)
}
)
result = response.get("response", "Erreur: Pas de réponse")
# Enregistrer la réponse dans un fichier si l'option d'enregistrement est activée
if self.config.get("save_images", True):
response_path = os.path.join(self.image_dir, f"{timestamp}_{image_id}_response.txt")
with open(response_path, "w", encoding="utf-8") as f:
f.write(f"Prompt:\n{full_prompt}\n\nResponse:\n{result}")
print(f"Réponse enregistrée dans: {response_path}")
return result
except Exception as e:
return f"Erreur lors de l'analyse de l'image: {str(e)}"
error_msg = f"Erreur lors de l'analyse de l'image: {str(e)}"
print(error_msg)
# Enregistrer l'erreur si l'option d'enregistrement est activée
if self.config.get("save_images", True):
error_path = os.path.join(self.image_dir, f"{timestamp}_{image_id}_error.txt")
with open(error_path, "w", encoding="utf-8") as f:
f.write(f"Prompt:\n{full_prompt}\n\nError:\n{str(e)}")
return error_msg

154
check_prerequisites.ps1 Normal file
View File

@ -0,0 +1,154 @@
# Prerequisites check script for Ragflow PDF Preprocessing
Write-Host "======================================================" -ForegroundColor Cyan
Write-Host " Ragflow PDF Preprocessing Prerequisites Check" -ForegroundColor Cyan
Write-Host "======================================================" -ForegroundColor Cyan
Write-Host ""
$allOk = $true
# Check Python and version
try {
$pythonVersionOutput = (python --version) 2>&1
$pythonVersion = $pythonVersionOutput -replace "Python "
if ([version]$pythonVersion -ge [version]"3.8") {
Write-Host "✅ Python $pythonVersion is installed (>= 3.8 required)" -ForegroundColor Green
} else {
Write-Host "❌ Python $pythonVersion is installed, but version 3.8 or higher is required" -ForegroundColor Red
$allOk = $false
}
} catch {
Write-Host "❌ Python is not installed or not in PATH" -ForegroundColor Red
Write-Host " Required installation: https://www.python.org/downloads/" -ForegroundColor Yellow
Write-Host " Make sure to check 'Add Python to PATH' during installation" -ForegroundColor Yellow
$allOk = $false
}
# Check Tesseract OCR
$tesseractPaths = @(
"C:\Program Files\Tesseract-OCR\tesseract.exe",
"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
"C:\Tesseract-OCR\tesseract.exe"
)
$tesseractInstalled = $false
$tesseractPath = ""
foreach ($path in $tesseractPaths) {
if (Test-Path $path) {
$tesseractInstalled = $true
$tesseractPath = $path
break
}
}
if ($tesseractInstalled) {
Write-Host "✅ Tesseract OCR is installed at: $tesseractPath" -ForegroundColor Green
# Check language packs
$tesseractDir = Split-Path -Parent $tesseractPath
$langPath = Join-Path $tesseractDir "tessdata"
$fraLangPath = Join-Path $langPath "fra.traineddata"
$engLangPath = Join-Path $langPath "eng.traineddata"
if (Test-Path $fraLangPath) {
Write-Host " ✅ French language pack (fra) installed" -ForegroundColor Green
} else {
Write-Host " ❌ French language pack (fra) missing" -ForegroundColor Yellow
Write-Host " Reinstall Tesseract selecting the language packs" -ForegroundColor Yellow
$allOk = $false
}
if (Test-Path $engLangPath) {
Write-Host " ✅ English language pack (eng) installed" -ForegroundColor Green
} else {
Write-Host " ❌ English language pack (eng) missing" -ForegroundColor Yellow
Write-Host " Reinstall Tesseract selecting the language packs" -ForegroundColor Yellow
$allOk = $false
}
} else {
Write-Host "❌ Tesseract OCR is not installed" -ForegroundColor Red
Write-Host " Required installation: https://github.com/UB-Mannheim/tesseract/wiki" -ForegroundColor Yellow
Write-Host " Make sure to select French and English language packs" -ForegroundColor Yellow
$allOk = $false
}
# Check Ollama server
try {
$response = Invoke-WebRequest -Uri "http://217.182.105.173:11434/api/version" -UseBasicParsing -ErrorAction SilentlyContinue
if ($response.StatusCode -eq 200) {
Write-Host "✅ Ollama server is accessible at 217.182.105.173:11434" -ForegroundColor Green
# Check available models
$modelsResponse = Invoke-WebRequest -Uri "http://217.182.105.173:11434/api/tags" -UseBasicParsing -ErrorAction SilentlyContinue
$models = ($modelsResponse.Content | ConvertFrom-Json).models
$modelNames = $models | ForEach-Object { $_.name }
# Required models based on the server's available models
$requiredModels = @(
"mistral:latest",
"llava:34b-v1.6-fp16",
"llama3.2-vision:90b-instruct-q8_0"
)
foreach ($model in $requiredModels) {
if ($modelNames -contains $model) {
Write-Host " ✅ Model $model is available" -ForegroundColor Green
} else {
Write-Host " ❌ Model $model not found on server" -ForegroundColor Yellow
Write-Host " This model is required for some features" -ForegroundColor Yellow
$allOk = $false
}
}
}
} catch {
Write-Host "❌ Cannot connect to Ollama server at 217.182.105.173:11434" -ForegroundColor Red
Write-Host " Make sure you have network connectivity to the Ollama server" -ForegroundColor Yellow
Write-Host " This is required for the LLM features of the application" -ForegroundColor Yellow
$allOk = $false
}
# Check Python libraries
Write-Host ""
Write-Host "Checking required Python libraries:" -ForegroundColor Cyan
$requiredLibraries = @(
"PyQt6",
"PyMuPDF",
"numpy",
"pytesseract",
"Pillow",
"opencv-python",
"requests"
)
# Some pip packages expose a different Python import name
$importNames = @{
"PyQt6" = "PyQt6"
"PyMuPDF" = "fitz"
"numpy" = "numpy"
"pytesseract" = "pytesseract"
"Pillow" = "PIL"
"opencv-python" = "cv2"
"requests" = "requests"
}
foreach ($lib in $requiredLibraries) {
$importName = $importNames[$lib]
try {
$output = python -c "import $importName; print('OK')" 2>&1
if ($output -eq "OK") {
Write-Host " ✅ $lib is installed" -ForegroundColor Green
} else {
Write-Host " ❌ $lib is not installed correctly" -ForegroundColor Red
Write-Host " Recommended installation: pip install $lib" -ForegroundColor Yellow
$allOk = $false
}
} catch {
Write-Host " ❌ $lib is not installed" -ForegroundColor Red
Write-Host " Recommended installation: pip install $lib" -ForegroundColor Yellow
$allOk = $false
}
}
# Summary
Write-Host ""
Write-Host "======================================================" -ForegroundColor Cyan
if ($allOk) {
Write-Host "✅ All prerequisites are satisfied!" -ForegroundColor Green
Write-Host " You can proceed with the installation and use of Ragflow." -ForegroundColor Green
} else {
Write-Host "⚠️ Some prerequisites are not satisfied." -ForegroundColor Yellow
Write-Host " Please install the missing components before using Ragflow." -ForegroundColor Yellow
}
Write-Host "======================================================" -ForegroundColor Cyan
Read-Host "Press ENTER to exit"

8
config/__init__.py Normal file
View File

@ -0,0 +1,8 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Package de configuration de l'application
"""
# Ce fichier permet d'importer le package config

Binary file not shown.

Binary file not shown.

54
config/agent_config.py Normal file
View File

@ -0,0 +1,54 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Configuration des agents dans l'application
"""
# Configuration des agents activés/désactivés
ACTIVE_AGENTS = {
"ocr": True, # Agent de reconnaissance optique de caractères
"vision": True, # Agent d'analyse d'images
"translation": True, # Agent de traduction
"summary": False, # Agent de résumé (désactivé)
"rewriter": True # Agent de reformulation
}
# Configuration des modèles par défaut pour chaque agent
DEFAULT_MODELS = {
"ocr": None, # OCR utilise Tesseract, pas de modèle LLM
"vision": "llava:34b-v1.6-fp16", # Modèle par défaut pour l'analyse d'images
"translation": "mistral:latest", # Modèle par défaut pour la traduction
"summary": "mistral:latest", # Modèle par défaut pour le résumé
"rewriter": "mistral:latest" # Modèle par défaut pour la reformulation
}
# Configuration de l'endpoint Ollama par défaut
DEFAULT_ENDPOINT = "http://217.182.105.173:11434"
# Configuration de journalisation détaillée
VERBOSE_LOGGING = True # Enregistre tous les détails des entrées/sorties
def is_agent_enabled(agent_name: str) -> bool:
"""
Vérifie si un agent est activé dans la configuration
Args:
agent_name (str): Nom de l'agent à vérifier
Returns:
bool: True si l'agent est activé, False sinon
"""
return ACTIVE_AGENTS.get(agent_name, False)
def get_default_model(agent_name: str) -> str:
"""
Renvoie le modèle par défaut pour un agent donné
Args:
agent_name (str): Nom de l'agent
Returns:
str: Nom du modèle par défaut
"""
return DEFAULT_MODELS.get(agent_name, "mistral")

View File

@ -1,7 +1,7 @@
{
"léger": {
"vision": {
"model": "llava:34b",
"model": "llava:34b-v1.6-fp16",
"language": "en",
"temperature": 0.2,
"top_p": 0.95,
@ -9,7 +9,7 @@
"max_tokens": 1024
},
"translation": {
"model": "mistral",
"model": "mistral:latest",
"language": "fr",
"temperature": 0.1,
"top_p": 0.95,
@ -17,7 +17,7 @@
"max_tokens": 1024
},
"summary": {
"model": "mistral",
"model": "mistral:latest",
"language": "fr",
"temperature": 0.2,
"top_p": 0.95,
@ -25,7 +25,7 @@
"max_tokens": 1024
},
"rewriter": {
"model": "mistral",
"model": "mistral:latest",
"language": "fr",
"temperature": 0.3,
"top_p": 0.95,
@ -35,7 +35,7 @@
},
"moyen": {
"vision": {
"model": "llava",
"model": "llava:34b-v1.6-fp16",
"language": "en",
"temperature": 0.2,
"top_p": 0.95,
@ -43,7 +43,7 @@
"max_tokens": 1024
},
"translation": {
"model": "qwen2.5",
"model": "qwen2.5:72b-instruct-q8_0",
"language": "fr",
"temperature": 0.1,
"top_p": 0.95,
@ -51,7 +51,7 @@
"max_tokens": 1024
},
"summary": {
"model": "deepseek-r1",
"model": "deepseek-r1:70b-llama-distill-q8_0",
"language": "fr",
"temperature": 0.2,
"top_p": 0.95,
@ -59,7 +59,7 @@
"max_tokens": 1024
},
"rewriter": {
"model": "mistral",
"model": "mistral:latest",
"language": "fr",
"temperature": 0.3,
"top_p": 0.95,
@ -69,7 +69,7 @@
},
"avancé": {
"vision": {
"model": "llama3.2-vision",
"model": "llama3.2-vision:90b-instruct-q8_0",
"language": "en",
"temperature": 0.2,
"top_p": 0.95,
@ -77,7 +77,7 @@
"max_tokens": 2048
},
"translation": {
"model": "deepseek",
"model": "deepseek-r1:70b-llama-distill-q8_0",
"language": "fr",
"temperature": 0.1,
"top_p": 0.95,
@ -85,7 +85,7 @@
"max_tokens": 2048
},
"summary": {
"model": "deepseek-r1",
"model": "deepseek-r1:70b-llama-distill-q8_0",
"language": "fr",
"temperature": 0.2,
"top_p": 0.95,
@ -93,7 +93,7 @@
"max_tokens": 2048
},
"rewriter": {
"model": "deepseek",
"model": "deepseek-r1:70b-llama-distill-q8_0",
"language": "fr",
"temperature": 0.3,
"top_p": 0.95,

1
data/images/.gitkeep Normal file
View File

@ -0,0 +1 @@
# Ce fichier permet de conserver le répertoire vide dans Git

Binary file not shown.


View File

@ -0,0 +1,10 @@
Prompt:
System: Analyze the following image which contains diagram. Please describe in detail what this diagram shows, including all components, connections, and what it represents.
Please be detailed and precise in your analysis.
User: Here is additional context that may help with your analysis (may be in French):
L = plus grande longueur - D = plus grand diamètre (permettant de passer une maille de tamis £ D). - 1 = dimension perpendiculaire au plan LD et inférieure à D Figure 1 : définition de la Lmax
Response:
Error: model 'llava:34b' not found, try pulling it first

1
data/ocr_logs/.gitkeep Normal file
View File

@ -0,0 +1 @@
# Ce fichier permet de conserver le répertoire vide dans Git

1
data/outputs/.gitkeep Normal file
View File

@ -0,0 +1 @@
# Ce fichier permet de conserver le répertoire vide dans Git

View File

@ -0,0 +1,9 @@
# Analyse de document
## Schéma - Page 8
**Contexte** :
L = plus grande longueur - D = plus grand diamètre (permettant de passer une maille de tamis £ D). - 1 = dimension perpendiculaire au plan LD et inférieure à D Figure 1 : définition de la Lmax
---

1
data/rewrites/.gitkeep Normal file
View File

@ -0,0 +1 @@
# Ce fichier permet de conserver le répertoire vide dans Git

1
data/summaries/.gitkeep Normal file
View File

@ -0,0 +1 @@
# Ce fichier permet de conserver le répertoire vide dans Git

1
data/temp/.gitkeep Normal file
View File

@ -0,0 +1 @@
# Ce fichier permet de conserver le répertoire vide dans Git

1
data/templates/.gitkeep Normal file
View File

@ -0,0 +1 @@
# Ce fichier permet de conserver le répertoire vide dans Git

View File

@ -0,0 +1 @@
# Ce fichier permet de conserver le répertoire vide dans Git

View File

@ -0,0 +1,14 @@
Source language: en
Target language: fr
Model: mistral
Original text:
Voici le contexte de l'image: L = plus grande longueur - D = plus grand diamètre (permettant de passer une maille de tamis £ D). - 1 = dimension perpendiculaire au plan LD et inférieure à D Figure 1 : définition de la Lmax
Translation:
Voici le contexte de l'image :
L = longueur maximale ; D = diamètre maximal (permettant de passer une maille de tamis £ D) ;
1 = dimension perpendiculaire au plan LD et inférieure à D
Figure 1 : définition de la Lmax

1
data/uploads/.gitkeep Normal file
View File

@ -0,0 +1 @@
# Ce fichier permet de conserver le répertoire vide dans Git

1
data/workflows/.gitkeep Normal file
View File

@ -0,0 +1 @@
# Ce fichier permet de conserver le répertoire vide dans Git

View File

@ -0,0 +1,62 @@
# Flux de traitement - Session session_1743094415
Date et heure: 27/03/2025 17:53:35
## Étapes du traitement
### Étape: vision (17:53:35)
#### Métadonnées
- **model**: llava:34b
#### Entrées
- **content_type**:
```
schéma
```
- **context**:
```
L = plus grande longueur - D = plus grand diamètre (permettant de passer une maille de tamis £ D). - 1 = dimension perpendiculaire au plan LD et inférieure à D Figure 1 : définition de la Lmax
```
#### Sorties
```
Error: model 'llava:34b' not found, try pulling it first
```
---
### Étape: translation (17:53:39)
#### Métadonnées
- **model**: mistral
#### Entrées
- **text**:
```
Voici le contexte de l'image: L = plus grande longueur - D = plus grand diamètre (permettant de passer une maille de tamis £ D). - 1 = dimension perpendiculaire au plan LD et inférieure à D Figure 1 : définition de la Lmax
```
#### Sorties
```
Voici le contexte de l'image :
L = longueur maximale ; D = diamètre maximal (permettant de passer une maille de tamis £ D) ;
1 = dimension perpendiculaire au plan LD et inférieure à D
Figure 1 : définition de la Lmax
```
---
## Résumé du traitement
Traitement terminé avec succès.
Modèles utilisés: vision=llava:34b, translation=mistral
Agents exécutés: vision, translation
*Fin du flux de traitement - 27/03/2025 17:53:40*

53
debug_setup.bat Normal file
View File

@ -0,0 +1,53 @@
@echo off
:: Debug and setup script for RagFlow Preprocess
echo ===================================================
echo RagFlow Preprocess Setup and Debug
echo ===================================================
echo.
echo Installing Python dependencies...
pip install requests PyQt6 PyPDF2 pytesseract Pillow PyMuPDF
echo.
echo Checking Tesseract OCR installation...
:: Check if Tesseract is installed
where tesseract >nul 2>nul
if %ERRORLEVEL% neq 0 (
echo WARNING: Tesseract is not installed or not in PATH.
echo Please download and install Tesseract OCR from:
echo https://github.com/UB-Mannheim/tesseract/wiki
echo.
echo IMPORTANT: During installation, check the option to add Tesseract to PATH.
echo And make sure to install French and English languages.
echo.
echo Press any key to continue...
pause >nul
)
echo Testing communication with Ollama server...
python test_components.py
echo.
echo Updating configurations...
echo Ollama Server: http://217.182.105.173:11434
echo.
echo ===================================================
echo FINAL INSTRUCTIONS
echo ===================================================
echo.
echo 1. If Tesseract OCR is not installed yet:
echo - Download it from: https://github.com/UB-Mannheim/tesseract/wiki
echo - Install it with the "Add to PATH" option checked
echo - Make sure to install French (fra) and English (eng) languages
echo.
echo 2. To install French languages for Tesseract:
echo - Launch PowerShell as administrator
echo - Run the script: .\install_languages.ps1
echo.
echo 3. To run the application:
echo - Use: python main.py
echo.
echo Press any key to exit...
pause >nul

View File

@ -1,49 +0,0 @@
import os
import json
from datetime import datetime
# === Config ===
CURSOR_CHAT_DIR = os.path.expanduser("~/.cursor/chat/")
OUTPUT_FILE = "cursor_history.md"
# === Initialisation du contenu ===
md_output = ""
# === Chargement des discussions Cursor ===
for filename in sorted(os.listdir(CURSOR_CHAT_DIR)):
if not filename.endswith(".json"):
continue
filepath = os.path.join(CURSOR_CHAT_DIR, filename)
with open(filepath, "r", encoding="utf-8") as f:
try:
chat_data = json.load(f)
except json.JSONDecodeError:
continue # Fichier corrompu ou non lisible
created_at_raw = chat_data.get("createdAt", "")
try:
created_at = datetime.fromisoformat(created_at_raw.replace("Z", ""))
except ValueError:
created_at = datetime.now()
formatted_time = created_at.strftime("%Y-%m-%d %H:%M:%S")
md_output += f"\n---\n\n## Session du {formatted_time}\n\n"
for msg in chat_data.get("messages", []):
role = msg.get("role", "")
content = msg.get("content", "").strip()
if not content:
continue
if role == "user":
md_output += f"** Utilisateur :**\n{content}\n\n"
elif role == "assistant":
md_output += f"** Assistant :**\n{content}\n\n"
# === Écriture / ajout dans le fichier final ===
with open(OUTPUT_FILE, "a", encoding="utf-8") as output_file:
output_file.write(md_output)
print(f" Export terminé ! Discussions ajoutées à : {OUTPUT_FILE}")

86
install_languages.ps1 Normal file
View File

@ -0,0 +1,86 @@
#!/usr/bin/env pwsh
# Script to install additional languages for Tesseract OCR
# Check if the script is running as administrator
if (-NOT ([Security.Principal.WindowsPrincipal][Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole] "Administrator")) {
Write-Warning "Please run this script as administrator!"
Write-Host "The script will close in 5 seconds..."
Start-Sleep -Seconds 5
exit
}
# Function to find Tesseract installation path
function Find-TesseractPath {
$possiblePaths = @(
"C:\Program Files\Tesseract-OCR",
"C:\Program Files (x86)\Tesseract-OCR",
"C:\Tesseract-OCR"
)
foreach ($path in $possiblePaths) {
if (Test-Path "$path\tesseract.exe") {
return $path
}
}
return $null
}
# Find Tesseract path
$tesseractPath = Find-TesseractPath
if ($null -eq $tesseractPath) {
Write-Host "Tesseract OCR is not installed or was not found in standard locations." -ForegroundColor Red
Write-Host "Please install Tesseract OCR first from: https://github.com/UB-Mannheim/tesseract/wiki" -ForegroundColor Yellow
Write-Host "The script will close in 5 seconds..."
Start-Sleep -Seconds 5
exit
}
Write-Host "Tesseract OCR found at: $tesseractPath" -ForegroundColor Green
# Check tessdata folder
$tessDataPath = Join-Path -Path $tesseractPath -ChildPath "tessdata"
if (-not (Test-Path $tessDataPath)) {
Write-Host "The tessdata folder doesn't exist. Creating..." -ForegroundColor Yellow
New-Item -Path $tessDataPath -ItemType Directory | Out-Null
}
# URLs of languages to download
$languageUrls = @{
"fra" = "https://github.com/tesseract-ocr/tessdata/raw/4.00/fra.traineddata"
"fra_vert" = "https://github.com/tesseract-ocr/tessdata/raw/4.00/fra_vert.traineddata"
"frk" = "https://github.com/tesseract-ocr/tessdata/raw/4.00/frk.traineddata"
"frm" = "https://github.com/tesseract-ocr/tessdata/raw/4.00/frm.traineddata"
}
# Download languages
foreach ($lang in $languageUrls.Keys) {
$url = $languageUrls[$lang]
$outputFile = Join-Path -Path $tessDataPath -ChildPath "$lang.traineddata"
if (Test-Path $outputFile) {
Write-Host "Language file $lang already exists. Removing..." -ForegroundColor Yellow
Remove-Item -Path $outputFile -Force
}
Write-Host "Downloading $lang.traineddata..." -ForegroundColor Cyan
try {
Invoke-WebRequest -Uri $url -OutFile $outputFile
Write-Host "Language $lang successfully installed." -ForegroundColor Green
}
catch {
Write-Host "Error downloading $lang.traineddata: $_" -ForegroundColor Red
}
}
# Check installed languages
Write-Host "`nVerifying installed languages:" -ForegroundColor Cyan
$installedLanguages = Get-ChildItem -Path $tessDataPath -Filter "*.traineddata" | ForEach-Object { $_.Name.Replace(".traineddata", "") }
Write-Host "Installed languages: $($installedLanguages -join ', ')" -ForegroundColor Green
Write-Host "`nLanguage installation completed." -ForegroundColor Green
Write-Host "Press any key to close..."
$null = $Host.UI.RawUI.ReadKey("NoEcho,IncludeKeyDown")

61
install_windows.bat Normal file
View File

@ -0,0 +1,61 @@
@echo off
echo ======================================================
echo Ragflow PDF Preprocessing Installation
echo ======================================================
echo.
:: Check if Python is installed
python --version > nul 2>&1
if %ERRORLEVEL% NEQ 0 (
echo Error: Python is not installed or not in PATH.
echo Please install Python 3.8 or higher from https://www.python.org/downloads/
echo Make sure to check "Add Python to PATH" during installation.
pause
exit /b 1
)
:: Create virtual environment
echo Creating virtual environment...
python -m venv venv
if %ERRORLEVEL% NEQ 0 (
echo Error creating virtual environment.
pause
exit /b 1
)
:: Activate virtual environment
echo Activating virtual environment...
call venv\Scripts\activate.bat
:: Update pip
echo Updating pip...
python -m pip install --upgrade pip
:: Install dependencies
echo Installing dependencies...
pip install -e .
if %ERRORLEVEL% NEQ 0 (
echo Error installing dependencies.
pause
exit /b 1
)
echo.
echo ======================================================
echo Installation completed successfully!
echo.
echo Next steps:
echo 1. Make sure Tesseract OCR is installed
echo (https://github.com/UB-Mannheim/tesseract/wiki)
echo 2. Make sure Ollama is installed and running
echo (https://ollama.ai/)
echo 3. To launch the application, run:
echo.
echo call venv\Scripts\activate.bat
echo python main.py
echo.
echo Or use the launch_windows.bat script
echo ======================================================
echo.
pause

107
install_windows.ps1 Normal file
View File

@ -0,0 +1,107 @@
# PowerShell installation script for Ragflow PDF Preprocessing
Write-Host "======================================================" -ForegroundColor Cyan
Write-Host " Ragflow PDF Preprocessing Installation" -ForegroundColor Cyan
Write-Host "======================================================" -ForegroundColor Cyan
Write-Host ""
# Check if PowerShell is running as administrator
$isAdmin = ([Security.Principal.WindowsPrincipal] [Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole] "Administrator")
if (-not $isAdmin) {
Write-Host "Note: For installing system dependencies, it is recommended to run this script as administrator." -ForegroundColor Yellow
Write-Host "You can continue without administrator privileges, but some features might not be installed correctly." -ForegroundColor Yellow
$continue = Read-Host "Do you want to continue? (Y/N)"
if ($continue -ne "Y" -and $continue -ne "y") {
Write-Host "Installation cancelled." -ForegroundColor Red
exit
}
}
# Check if Python is installed
try {
$pythonVersion = python --version
Write-Host "Python detected: $pythonVersion" -ForegroundColor Green
} catch {
Write-Host "Error: Python is not installed or not in PATH." -ForegroundColor Red
Write-Host "Please install Python 3.8 or higher from https://www.python.org/downloads/" -ForegroundColor Red
Write-Host "Make sure to check 'Add Python to PATH' during installation." -ForegroundColor Red
Read-Host "Press ENTER to exit"
exit
}
# Create virtual environment
Write-Host "Creating virtual environment..." -ForegroundColor Cyan
python -m venv venv
if (-not $?) {
Write-Host "Error creating virtual environment." -ForegroundColor Red
Read-Host "Press ENTER to exit"
exit
}
# Activate virtual environment
Write-Host "Activating virtual environment..." -ForegroundColor Cyan
& .\venv\Scripts\Activate.ps1
# Update pip
Write-Host "Updating pip..." -ForegroundColor Cyan
python -m pip install --upgrade pip
# Install dependencies
Write-Host "Installing dependencies..." -ForegroundColor Cyan
pip install -e .
if (-not $?) {
Write-Host "Error installing dependencies." -ForegroundColor Red
Read-Host "Press ENTER to exit"
exit
}
# Check if Tesseract is installed
$tesseractPaths = @(
"C:\Program Files\Tesseract-OCR\tesseract.exe",
"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
"C:\Tesseract-OCR\tesseract.exe"
)
$tesseractInstalled = $false
foreach ($path in $tesseractPaths) {
if (Test-Path $path) {
$tesseractInstalled = $true
Write-Host "Tesseract OCR detected at: $path" -ForegroundColor Green
break
}
}
if (-not $tesseractInstalled) {
Write-Host "Tesseract OCR was not detected." -ForegroundColor Yellow
Write-Host "The application requires Tesseract OCR for text recognition." -ForegroundColor Yellow
Write-Host "Please install it from: https://github.com/UB-Mannheim/tesseract/wiki" -ForegroundColor Yellow
}
# Check if Ollama server is accessible
try {
$response = Invoke-WebRequest -Uri "http://217.182.105.173:11434/api/version" -UseBasicParsing -ErrorAction SilentlyContinue
if ($response.StatusCode -eq 200) {
Write-Host "Ollama server is accessible at 217.182.105.173:11434." -ForegroundColor Green
}
} catch {
Write-Host "Warning: Cannot connect to Ollama server at 217.182.105.173:11434." -ForegroundColor Yellow
Write-Host "Make sure you have network connectivity to the Ollama server." -ForegroundColor Yellow
Write-Host "This is required for the LLM features of the application." -ForegroundColor Yellow
}
Write-Host ""
Write-Host "======================================================" -ForegroundColor Cyan
Write-Host "Installation completed successfully!" -ForegroundColor Green
Write-Host ""
Write-Host "Next steps:" -ForegroundColor Cyan
Write-Host "1. Make sure Tesseract OCR is installed" -ForegroundColor White
Write-Host " (https://github.com/UB-Mannheim/tesseract/wiki)" -ForegroundColor White
Write-Host "2. Make sure you have network connectivity to the Ollama server at 217.182.105.173:11434" -ForegroundColor White
Write-Host "3. To launch the application, run:" -ForegroundColor White
Write-Host "" -ForegroundColor White
Write-Host " .\launch_windows.ps1" -ForegroundColor White
Write-Host "" -ForegroundColor White
Write-Host "Or use the launch_windows.bat script" -ForegroundColor White
Write-Host "======================================================" -ForegroundColor Cyan
Write-Host ""
Read-Host "Press ENTER to exit"

50
launch_windows.bat Normal file
View File

@ -0,0 +1,50 @@
@echo off
:: Script de lancement pour l'application de prétraitement PDF
:: Ceci est un wrapper pour le script PowerShell qui lance l'application sous Windows
echo ===================================================
echo Lancement de l'application de prétraitement PDF
echo ===================================================
echo.
:: Vérifier si PowerShell est disponible
where powershell >nul 2>nul
if %ERRORLEVEL% neq 0 (
echo ERREUR: PowerShell n'est pas disponible sur ce système.
echo Veuillez installer PowerShell pour exécuter cette application.
echo.
pause
exit /b 1
)
:: Récupérer le chemin du script
set "SCRIPT_DIR=%~dp0"
cd /d "%SCRIPT_DIR%"
:: Vérifier si l'environnement virtuel existe
if not exist "%SCRIPT_DIR%venv" (
echo L'environnement virtuel n'existe pas. Veuillez d'abord exécuter install_windows.bat
echo.
pause
exit /b 1
)
:: Définir l'exécution du script PowerShell pour permettre l'exécution non signée
powershell -Command "Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass"
:: Lancer le script PowerShell
echo Lancement du script d'initialisation et de démarrage...
powershell -ExecutionPolicy Bypass -File "%SCRIPT_DIR%launch_windows.ps1"
:: Vérifier si le script s'est terminé correctement
if %ERRORLEVEL% neq 0 (
echo Une erreur s'est produite lors du lancement de l'application.
echo Code d'erreur: %ERRORLEVEL%
echo.
echo Pour plus d'informations, veuillez vérifier les messages ci-dessus.
pause
exit /b %ERRORLEVEL%
)
:: Fin du script
exit /b 0

84
launch_windows.ps1 Normal file
View File

@ -0,0 +1,84 @@
# PowerShell launch script for Ragflow PDF Preprocessing
Write-Host "======================================================" -ForegroundColor Cyan
Write-Host " Ragflow PDF Preprocessing Launch" -ForegroundColor Cyan
Write-Host "======================================================" -ForegroundColor Cyan
Write-Host ""
# Répertoire actuel
$scriptPath = Split-Path -Parent $MyInvocation.MyCommand.Path
Set-Location $scriptPath
# Vérifier si le répertoire venv existe
if (-not (Test-Path "$scriptPath\venv")) {
Write-Host "L'environnement virtuel n'existe pas. Veuillez d'abord exécuter install_windows.ps1"
Write-Host "Vous pouvez double-cliquer sur 'install_windows.bat' pour lancer l'installation"
Read-Host "Appuyez sur Entrée pour quitter"
exit
}
# Activer l'environnement virtuel
Write-Host "Activation de l'environnement virtuel..."
& "$scriptPath\venv\Scripts\Activate.ps1"
# Vérifier l'installation des dépendances
Write-Host "Vérification des dépendances Python..."
& python -c "import PyPDF2, PyQt6, pytesseract, Pillow" 2>$null
if ($LASTEXITCODE -ne 0) {
Write-Host "Certaines dépendances ne sont pas installées. Installation en cours..."
& pip install -r requirements.txt
}
# Créer la structure de répertoires
Write-Host "Initialisation de la structure de données..."
& python -c "from utils.data_structure import initialize_data_directories; initialize_data_directories()"
# Check if Tesseract is accessible
$tesseractPaths = @(
"C:\Program Files\Tesseract-OCR\tesseract.exe",
"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
"C:\Tesseract-OCR\tesseract.exe"
)
$tesseractInstalled = $false
foreach ($path in $tesseractPaths) {
if (Test-Path $path) {
$tesseractInstalled = $true
Write-Host "Tesseract OCR detected at: $path" -ForegroundColor Green
break
}
}
if (-not $tesseractInstalled) {
Write-Host "Information: Tesseract OCR was not detected." -ForegroundColor Yellow
Write-Host "The application will try to find Tesseract in standard locations." -ForegroundColor Yellow
Write-Host "If OCR doesn't work, please install Tesseract from: https://github.com/UB-Mannheim/tesseract/wiki" -ForegroundColor Yellow
}
# Check if Ollama server is accessible
try {
$response = Invoke-WebRequest -Uri "http://217.182.105.173:11434/api/version" -UseBasicParsing -ErrorAction SilentlyContinue
if ($response.StatusCode -eq 200) {
Write-Host "Ollama server is accessible at 217.182.105.173:11434." -ForegroundColor Green
}
} catch {
Write-Host "Warning: Cannot connect to Ollama server at 217.182.105.173:11434." -ForegroundColor Yellow
Write-Host "Make sure you have network connectivity to the Ollama server." -ForegroundColor Yellow
$continue = Read-Host "Do you want to continue anyway? (Y/N)"
if ($continue -ne "Y" -and $continue -ne "y") {
exit
}
}
# Lancer l'application
Write-Host "Lancement de l'application de prétraitement PDF..."
& python main.py
# End
Write-Host ""
if ($LASTEXITCODE -ne 0) {
Write-Host "The application has terminated with errors (code $LASTEXITCODE)." -ForegroundColor Red
} else {
Write-Host "The application has terminated normally." -ForegroundColor Green
}
Read-Host "Press ENTER to exit"

View File

@ -10,9 +10,13 @@ import os
from PyQt6.QtWidgets import QApplication
from ui.viewer import PDFViewer
from utils.data_structure import initialize_data_directories
def main():
"""Point d'entrée principal de l'application"""
# Initialiser la structure de données
initialize_data_directories()
app = QApplication(sys.argv)
app.setApplicationName("Prétraitement PDF pour Ragflow")

View File

@ -0,0 +1,333 @@
Metadata-Version: 2.4
Name: ragflow_pretraitement
Version: 1.0.0
Summary: Outil de prétraitement PDF avec agents LLM modulables pour Ragflow
Author: Ragflow Team
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: PyQt6>=6.4.0
Requires-Dist: PyMuPDF>=1.21.0
Requires-Dist: numpy>=1.22.0
Requires-Dist: pytesseract>=0.3.9
Requires-Dist: Pillow>=9.3.0
Requires-Dist: opencv-python>=4.7.0
Requires-Dist: requests>=2.28.0
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary
# Prétraitement PDF pour Ragflow
Outil de prétraitement de documents PDF avec agents LLM modulables pour l'analyse, la traduction et le résumé.
## Fonctionnalités
- **Sélection visuelle** de zones dans les documents PDF (tableaux, schémas, formules, texte)
- **Analyse automatique** avec différents agents LLM configurables
- **Niveaux d'analyse** adaptables selon les besoins (léger, moyen, avancé)
- **Export Markdown** pour intégration dans une base Ragflow
## Installation
### Prérequis
- Python 3.8 ou supérieur
- PyQt6
- PyMuPDF (fitz)
- Tesseract OCR (pour la reconnaissance de texte)
- Ollama (pour les modèles LLM)
#### Vérification des prérequis (Windows)
Un script de vérification est fourni pour vous aider à identifier les composants manquants:
1. Faites un clic droit sur `check_prerequisites.ps1` et sélectionnez "Exécuter avec PowerShell"
2. Le script vérifiera tous les prérequis et vous indiquera ce qu'il manque
3. Suivez les instructions pour installer les composants manquants
### Configuration de l'environnement virtuel
Il est fortement recommandé d'utiliser un environnement virtuel pour installer et exécuter cette application, notamment sur Windows:
#### Avec venv (recommandé)
```bash
# Windows (PowerShell)
python -m venv venv
.\venv\Scripts\Activate.ps1
# Windows (CMD)
python -m venv venv
.\venv\Scripts\activate.bat
# Linux/macOS
python -m venv venv
source venv/bin/activate
```
#### Avec Conda (alternative)
```bash
# Création de l'environnement
conda create -n ragflow_env python=3.9
conda activate ragflow_env
# Installation des dépendances via pip
pip install -r requirements.txt
```
Une fois l'environnement virtuel activé, vous pouvez procéder à l'installation.
### Méthode 1 : Installation directe (recommandée)
Clonez ce dépôt et installez avec pip :
```bash
git clone https://github.com/votre-utilisateur/ragflow-pretraitement.git
cd ragflow-pretraitement
pip install -e .
```
Cette méthode installera automatiquement toutes les dépendances Python requises.
### Méthode 2 : Installation manuelle
Si vous préférez une installation manuelle :
```bash
pip install -r requirements.txt
```
### Méthode 3 : Installation automatisée pour Windows
Pour une installation simplifiée sous Windows, utilisez les scripts d'installation fournis :
#### Avec les scripts batch (.bat)
1. Double-cliquez sur `install_windows.bat`
2. Le script créera un environnement virtuel et installera toutes les dépendances
3. Suivez les instructions à l'écran pour les étapes suivantes
Pour lancer l'application par la suite :
1. Double-cliquez sur `launch_windows.bat`
2. Le script vérifiera l'installation de Tesseract et Ollama avant de démarrer l'application
#### Avec PowerShell (recommandé pour Windows 10/11)
Si vous préférez utiliser PowerShell (interface plus conviviale avec code couleur) :
1. Faites un clic droit sur `install_windows.ps1` et sélectionnez "Exécuter avec PowerShell"
2. Suivez les instructions à l'écran pour terminer l'installation
Pour lancer l'application par la suite :
1. Faites un clic droit sur `launch_windows.ps1` et sélectionnez "Exécuter avec PowerShell"
2. Le script effectuera des vérifications avant de démarrer l'application
Ces méthodes sont recommandées pour les utilisateurs Windows qui ne sont pas familiers avec les lignes de commande.
### Installation de Tesseract OCR
Pour l'OCR, vous devez également installer Tesseract :
- **Windows** : Téléchargez et installez depuis [https://github.com/UB-Mannheim/tesseract/wiki](https://github.com/UB-Mannheim/tesseract/wiki)
- L'application recherchera Tesseract dans les chemins standards sur Windows (`C:\Program Files\Tesseract-OCR\tesseract.exe`, `C:\Program Files (x86)\Tesseract-OCR\tesseract.exe` ou `C:\Tesseract-OCR\tesseract.exe`)
- Assurez-vous d'installer les langues françaises et anglaises pendant l'installation
- **Linux** : `sudo apt install tesseract-ocr tesseract-ocr-fra tesseract-ocr-eng`
- **macOS** : `brew install tesseract`
### Connexion au serveur Ollama
Cette application se connecte à un serveur Ollama distant pour les fonctionnalités LLM:
- Adresse du serveur: `217.182.105.173:11434`
- Modèles utilisés:
- `mistral:latest` (pour le résumé et la traduction légère)
- `llava:34b-v1.6-fp16` (pour l'analyse visuelle)
- `llama3.2-vision:90b-instruct-q8_0` (pour l'analyse visuelle avancée)
- `deepseek-r1:70b-llama-distill-q8_0` (pour le résumé et la traduction avancés)
- `qwen2.5:72b-instruct-q8_0` (pour la traduction moyenne)
Assurez-vous que votre machine dispose d'une connexion réseau vers ce serveur. Aucune installation locale d'Ollama n'est nécessaire.
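À titre d'illustration, voici une vérification minimale de la connexion au serveur. C'est une esquisse qui réutilise les endpoints `/api/version` et `/api/tags` déjà employés par `test_components.py`, avec l'adresse indiquée ci-dessus :

```python
import requests

OLLAMA_URL = "http://217.182.105.173:11434"

# Vérifier que le serveur répond et afficher sa version
version = requests.get(f"{OLLAMA_URL}/api/version", timeout=10).json()
print(f"Version Ollama : {version.get('version', 'inconnue')}")

# Lister les modèles disponibles sur le serveur
tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10).json()
for model in tags.get("models", []):
    print(f"- {model['name']}")
```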
## Utilisation
### Lancement de l'application
Si vous avez utilisé l'installation avec le script setup.py :
```bash
ragflow-pretraitement
```
Ou manuellement :
```bash
cd ragflow_pretraitement
python main.py
```
### Processus typique
1. Charger un PDF avec le bouton "Charger PDF"
2. Naviguer entre les pages avec les boutons ou le sélecteur
3. Sélectionner une zone d'intérêt avec la souris
4. Choisir le type de contenu (schéma, tableau, formule...)
5. Ajouter un contexte textuel si nécessaire
6. Configurer les agents LLM dans l'onglet "Agents LLM"
7. Appliquer l'agent sur la sélection
8. Exporter le résultat en Markdown
## Configuration des agents LLM
L'application propose trois niveaux d'analyse :
| Niveau | Vision | Résumé | Traduction | Usage recommandé |
| --------- | ------------------------------------- | ----------------------------- | ---------------------------- | --------------------------- |
| 🔹 Léger | llava:34b-v1.6-fp16 | mistral:latest | mistral:latest | Débogage, prototypes |
| ⚪ Moyen | llava:34b-v1.6-fp16 | deepseek-r1:70b-llama-distill | qwen2.5:72b-instruct | Usage normal |
| 🔸 Avancé | llama3.2-vision:90b-instruct-q8_0 | deepseek-r1:70b-llama-distill | deepseek-r1:70b-llama-distill| Documents critiques |
Vous pouvez personnaliser ces configurations dans le fichier `config/llm_profiles.json`.
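À titre indicatif, un profil peut être lu ainsi (esquisse minimale ; elle suppose la structure `niveau → agent → paramètres` visible dans ce fichier) :

```python
import json

# Charger les profils et récupérer la configuration "vision" du niveau "moyen"
with open("config/llm_profiles.json", "r", encoding="utf-8") as f:
    profiles = json.load(f)

vision_config = profiles["moyen"]["vision"]
print(vision_config["model"])        # ex. "llava:34b-v1.6-fp16"
print(vision_config["temperature"])  # ex. 0.2
```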
## Détail des bibliothèques utilisées
L'application utilise les bibliothèques suivantes, chacune avec un rôle spécifique dans le traitement des documents:
### Bibliothèques principales
- **PyQt6** : Interface graphique complète (v6.4.0+)
- **PyMuPDF (fitz)** : Manipulation et rendu des documents PDF (v1.21.0+)
- **numpy** : Traitement numérique des images et des données (v1.22.0+)
- **pytesseract** : Interface Python pour Tesseract OCR (v0.3.9+)
- **Pillow** : Traitement d'images (v9.3.0+)
- **opencv-python (cv2)** : Traitement d'images avancé et détection de contenu (v4.7.0+)
- **requests** : Communication avec l'API Ollama (v2.28.0+)
### Rôle de chaque bibliothèque
- **PyQt6**: Framework d'interface graphique qui gère la visualisation PDF, les sélections interactives, la configuration des agents et l'interface utilisateur globale.
- **PyMuPDF (fitz)**: Convertit les pages PDF en images, permet d'accéder au contenu des PDF, extraire les pages et obtenir des rendus haute qualité.
- **numpy**: Manipule les données d'images sous forme de tableaux pour le traitement OCR et la détection de contenu.
- **pytesseract**: Reconnaît le texte dans les images extraites des PDF, avec support multilingue.
- **Pillow + opencv-python**: Prétraitement des images avant OCR pour améliorer la reconnaissance du texte (voir l'esquisse après cette liste).
- **requests**: Envoie des requêtes au serveur Ollama pour utiliser les modèles d'IA.
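L'esquisse suivante illustre l'enchaînement typique de ces bibliothèques hors interface graphique : rendu d'une page avec PyMuPDF, conversion en image Pillow, puis OCR avec pytesseract. Le chemin `document.pdf` est un exemple fictif, indépendant du dépôt :

```python
import io

import fitz  # PyMuPDF
import pytesseract
from PIL import Image

# Rendre la première page d'un PDF en image haute résolution
doc = fitz.open("document.pdf")  # chemin d'exemple
pix = doc[0].get_pixmap(dpi=200)

# Convertir le rendu en image Pillow, puis extraire le texte par OCR
img = Image.open(io.BytesIO(pix.tobytes("png")))
texte = pytesseract.image_to_string(img, lang="fra+eng")
print(texte)
```

C'est, dans les grandes lignes, le même enchaînement que celui réalisé par `utils/ocr.py` avec un prétraitement d'image supplémentaire.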
## Structure du projet et modules clés
```
ragflow_pretraitement/
├── main.py # Point d'entrée principal
├── ui/ # Interface utilisateur
│ ├── viewer.py # Visualisation PDF et sélection (PyQt6, fitz)
│ └── llm_config_panel.py # Configuration des agents (PyQt6)
├── agents/ # Agents LLM
│ ├── base.py # Classe de base des agents
│ ├── vision.py # Agent d'analyse visuelle (utilise OllamaAPI)
│ ├── summary.py # Agent de résumé (utilise OllamaAPI)
│ ├── translation.py # Agent de traduction (utilise OllamaAPI)
│ └── rewriter.py # Agent de reformulation (utilise OllamaAPI)
├── utils/ # Utilitaires
│ ├── ocr.py # Reconnaissance de texte (pytesseract, opencv)
│ ├── markdown_export.py # Export en Markdown
│ └── api_ollama.py # Communication avec l'API Ollama (requests)
├── config/ # Configuration
│ └── llm_profiles.json # Profils des modèles LLM utilisés
└── data/ # Données
└── outputs/ # Résultats générés
```
### Détail des modules principaux
#### Interface utilisateur (ui/)
- **viewer.py**: Implémente la classe principale `PDFViewer` qui gère la visualisation des PDF, la sélection des zones, le zoom et la navigation entre les pages.
- **llm_config_panel.py**: Implémente la classe `LLMConfigPanel` qui permet de configurer les agents LLM, leurs paramètres et de lancer les traitements.
#### Agents LLM (agents/)
- **base.py**: Définit la classe de base `LLMBaseAgent` avec les méthodes communes à tous les agents.
- **vision.py**: Agent spécialisé dans l'analyse d'images et de schémas.
- **summary.py**: Agent pour résumer et synthétiser du texte.
- **translation.py**: Agent pour traduire le contenu (généralement de l'anglais vers le français).
- **rewriter.py**: Agent pour reformuler et améliorer du texte.
#### Utilitaires (utils/)
- **ocr.py**: Contient la classe `OCRProcessor` pour extraire du texte des images avec prétraitement avancé.
- **api_ollama.py**: Implémente la classe `OllamaAPI` pour communiquer avec le service Ollama (voir l'esquisse d'appel après cette liste).
- **markdown_export.py**: Gère l'export des résultats au format Markdown.
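À titre d'exemple, un appel direct au service via `OllamaAPI` ressemble à ceci. C'est une esquisse déduite des appels visibles dans `agents/vision.py` (constructeur `base_url=`, méthode `generate` renvoyant un dictionnaire avec une clé `response`) ; la signature exacte fait foi dans `utils/api_ollama.py` :

```python
from utils.api_ollama import OllamaAPI

# Appel de génération simple, comme le font les agents
api = OllamaAPI(base_url="http://217.182.105.173:11434")
response = api.generate(
    model="mistral:latest",
    prompt="Résume ce paragraphe : ...",
    options={"temperature": 0.2, "num_predict": 1024}
)
print(response.get("response", ""))
```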
## Adaptations pour Windows
Cette version a été optimisée pour Windows avec les adaptations suivantes:
1. **Chemins Tesseract**: Configuration automatique des chemins Tesseract pour Windows (voir l'esquisse après cette liste)
2. **URLs API**: Configuration de l'API Ollama pour pointer par défaut vers le serveur distant `217.182.105.173:11434`
3. **Chemins de fichiers**: Utilisation de la séparation de chemins compatible Windows
4. **Compatibilité Unicode**: Support des caractères spéciaux dans les chemins Windows
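Concrètement, la détection des chemins Tesseract (adaptation n° 1) revient à quelque chose comme ceci — esquisse reprenant les chemins standards cités plus haut et le mécanisme utilisé par `test_components.py` :

```python
import os

import pytesseract

# Chemins d'installation standards de Tesseract sous Windows
CHEMINS_TESSERACT = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
    r"C:\Tesseract-OCR\tesseract.exe",
]

for chemin in CHEMINS_TESSERACT:
    if os.path.exists(chemin):
        # Indiquer explicitement l'exécutable à pytesseract
        pytesseract.pytesseract.tesseract_cmd = chemin
        break
```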
## Fonctionnalités pratiques et cas d'utilisation
### 1. Analyse de tableaux techniques
- Sélectionnez un tableau complexe dans un document
- Utilisez l'agent de vision pour reconnaître la structure
- Exportez le contenu sous forme de markdown bien formaté
### 2. Traduction de documents techniques
- Chargez un document en anglais
- Sélectionnez des sections de texte
- Utilisez l'agent de traduction pour obtenir une version française précise
- Exportez le résultat en markdown
### 3. Résumé de documentation longue
- Identifiez les sections importantes d'un long document
- Appliquez l'agent de résumé à chaque section
- Combinez les résultats en un document synthétique
### 4. Analyse de schémas et figures
- Sélectionnez un schéma complexe
- Utilisez l'agent de vision pour obtenir une description détaillée
- Ajoutez du contexte textuel au besoin
## Résolution des problèmes courants
### Erreurs d'importation de PyQt6
Si vous rencontrez des erreurs avec PyQt6, essayez de le réinstaller :
```bash
pip uninstall PyQt6 PyQt6-Qt6 PyQt6-sip
pip install PyQt6
```
### Problèmes avec l'OCR sur Windows
Si l'OCR (Tesseract) ne fonctionne pas correctement :
1. Vérifiez que Tesseract est correctement installé (redémarrez l'application après installation)
2. Vérifiez l'existence d'un des chemins standards (`C:\Program Files\Tesseract-OCR\tesseract.exe`)
3. Si nécessaire, modifiez manuellement le chemin dans `utils/ocr.py`
### Connectivité au serveur Ollama
Si vous ne pouvez pas vous connecter au serveur Ollama:
1. Vérifiez votre connexion réseau et assurez-vous que vous pouvez accéder à `217.182.105.173:11434`
2. Vérifiez qu'aucun pare-feu ou proxy ne bloque la connexion
3. Pour vérifier la disponibilité du serveur, essayez d'accéder à `http://217.182.105.173:11434/api/version` dans votre navigateur
Si vous souhaitez utiliser une instance Ollama locale à la place:
1. Modifiez l'URL dans `utils/api_ollama.py` et `agents/base.py` pour utiliser `http://localhost:11434`
2. Installez Ollama localement depuis [https://ollama.ai/](https://ollama.ai/)
3. Téléchargez les modèles requis avec `ollama pull <nom_du_modèle>`
### Modèles manquants
Si certains modèles ne fonctionnent pas, vérifiez leur disponibilité sur le serveur et modifiez si nécessaire les noms dans `config/llm_profiles.json`.
## Limitations
- OCR parfois imprécis sur les documents complexes ou de mauvaise qualité
- Certains modèles nécessitent beaucoup de mémoire (8GB+ de RAM recommandé)
- Les formules mathématiques complexes peuvent être mal interprétées
## Contribution
Les contributions sont les bienvenues ! N'hésitez pas à soumettre des pull requests ou à signaler des problèmes.

View File

@ -0,0 +1,21 @@
README.md
setup.py
agents/__init__.py
agents/base.py
agents/rewriter.py
agents/summary.py
agents/translation.py
agents/vision.py
ragflow_pretraitement.egg-info/PKG-INFO
ragflow_pretraitement.egg-info/SOURCES.txt
ragflow_pretraitement.egg-info/dependency_links.txt
ragflow_pretraitement.egg-info/entry_points.txt
ragflow_pretraitement.egg-info/requires.txt
ragflow_pretraitement.egg-info/top_level.txt
ui/__init__.py
ui/llm_config_panel.py
ui/viewer.py
utils/__init__.py
utils/api_ollama.py
utils/markdown_export.py
utils/ocr.py

View File

@ -0,0 +1 @@

View File

@ -0,0 +1,2 @@
[console_scripts]
ragflow-pretraitement = ragflow_pretraitement.main:main

View File

@ -0,0 +1,7 @@
PyQt6>=6.4.0
PyMuPDF>=1.21.0
numpy>=1.22.0
pytesseract>=0.3.9
Pillow>=9.3.0
opencv-python>=4.7.0
requests>=2.28.0

View File

@ -0,0 +1,3 @@
agents
ui
utils

234
test_components.py Normal file
View File

@ -0,0 +1,234 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Test script to verify critical components
"""
import os
import sys
import platform
import requests
import time
import subprocess
import json
from typing import List, Dict, Any
# Check Tesseract OCR installation
def check_tesseract():
print("\n=== Checking Tesseract OCR ===")
try:
import pytesseract
from PIL import Image
# Possible paths for Tesseract on Windows
possible_paths = [
r"C:\Program Files\Tesseract-OCR\tesseract.exe",
r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
r"C:\Tesseract-OCR\tesseract.exe",
r"C:\Users\PCDEV\AppData\Local\Programs\Tesseract-OCR\tesseract.exe",
r"C:\Users\PCDEV\Tesseract-OCR\tesseract.exe"
]
# Check if Tesseract is in PATH
tesseract_in_path = False
try:
if platform.system() == "Windows":
result = subprocess.run(["where", "tesseract"], capture_output=True, text=True)
if result.returncode == 0:
tesseract_in_path = True
tesseract_path = result.stdout.strip().split("\n")[0]
print(f"Tesseract found in PATH: {tesseract_path}")
else:
result = subprocess.run(["which", "tesseract"], capture_output=True, text=True)
if result.returncode == 0:
tesseract_in_path = True
tesseract_path = result.stdout.strip()
print(f"Tesseract found in PATH: {tesseract_path}")
except Exception as e:
print(f"Error checking for Tesseract in PATH: {e}")
if not tesseract_in_path and platform.system() == "Windows":
print("Tesseract is not in PATH. Searching in standard locations...")
# Check standard paths
for path in possible_paths:
if os.path.exists(path):
pytesseract.pytesseract.tesseract_cmd = path
print(f"Tesseract found at: {path}")
break
# Test Tesseract with a version command
try:
if platform.system() == "Windows" and not tesseract_in_path:
# Use explicit path
for path in possible_paths:
if os.path.exists(path):
result = subprocess.run([path, "--version"], capture_output=True, text=True)
if result.returncode == 0:
print(f"Tesseract version: {result.stdout.strip().split()[0]}")
break
else:
# Tesseract is in PATH
result = subprocess.run(["tesseract", "--version"], capture_output=True, text=True)
if result.returncode == 0:
print(f"Tesseract version: {result.stdout.strip().split()[0]}")
except Exception as e:
print(f"Error checking Tesseract version: {e}")
# Check installed languages
try:
if platform.system() == "Windows" and not tesseract_in_path:
# Use explicit path
for path in possible_paths:
if os.path.exists(path):
tesseract_folder = os.path.dirname(path)
tessdata_folder = os.path.join(tesseract_folder, "tessdata")
if os.path.exists(tessdata_folder):
langs = [f for f in os.listdir(tessdata_folder) if f.endswith(".traineddata")]
print(f"Installed languages: {', '.join([lang.split('.')[0] for lang in langs])}")
break
else:
# Tesseract is in PATH
result = subprocess.run(["tesseract", "--list-langs"], capture_output=True, text=True)
if result.returncode == 0:
langs = result.stdout.strip().split("\n")[1:] # Skip the first line
print(f"Installed languages: {', '.join(langs)}")
except Exception as e:
print(f"Error checking Tesseract languages: {e}")
print("\nINSTRUCTIONS FOR TESSERACT OCR:")
print("1. If Tesseract is not installed, download it from:")
print(" https://github.com/UB-Mannheim/tesseract/wiki")
print("2. Make sure to install French (fra) and English (eng) languages")
print("3. Check the 'Add to PATH' option during installation")
except ImportError as e:
print(f"Error: {e}")
print("Tesseract OCR or its Python dependencies are not properly installed")
print("Install them with: pip install pytesseract Pillow")
# Check connection to Ollama
def check_ollama(endpoint="http://217.182.105.173:11434"):
print("\n=== Checking Ollama connection ===")
print(f"Endpoint: {endpoint}")
# Test basic connection
try:
response = requests.get(f"{endpoint}/api/version", timeout=10)
if response.status_code == 200:
version_info = response.json()
print(f"✓ Connection to Ollama successful - Version: {version_info.get('version', 'unknown')}")
# List available models
try:
response = requests.get(f"{endpoint}/api/tags", timeout=10)
if response.status_code == 200:
models = response.json().get("models", [])
if models:
print(f"✓ Available models ({len(models)}):")
for model in models:
print(f" - {model.get('name', 'Unknown')} ({model.get('size', 'Unknown size')})")
else:
print("No models found on Ollama server")
else:
print(f"✗ Error retrieving models: status {response.status_code}")
except requests.exceptions.RequestException as e:
print(f"✗ Error retrieving models: {str(e)}")
# Test a simple model
try:
print("\nTesting a simple model (mistral)...")
payload = {
"model": "mistral",
"prompt": "Say hello in English",
"options": {
"temperature": 0.1
}
}
start_time = time.time()
response = requests.post(f"{endpoint}/api/generate", json=payload, timeout=30)
elapsed_time = time.time() - start_time
if response.status_code == 200:
try:
result = response.json()
print(f"✓ Test successful in {elapsed_time:.2f} seconds")
print(f" Response: {result.get('response', 'No response')[:100]}...")
except json.JSONDecodeError as e:
print(f"✗ JSON parsing error: {str(e)}")
print(" Trying to process first line only...")
lines = response.text.strip().split("\n")
if lines:
try:
result = json.loads(lines[0])
print(f"✓ Test successful with first line parsing in {elapsed_time:.2f} seconds")
print(f" Response: {result.get('response', 'No response')[:100]}...")
except json.JSONDecodeError:
print("✗ Failed to parse first line as JSON")
print(f" Raw response (first 200 chars): {response.text[:200]}")
else:
print(f"✗ Error testing model: status {response.status_code}")
print(f" Body: {response.text[:200]}")
except requests.exceptions.RequestException as e:
print(f"✗ Error testing model: {str(e)}")
else:
print(f"✗ Error connecting to Ollama: status {response.status_code}")
print(f" Body: {response.text[:200]}")
except requests.exceptions.RequestException as e:
print(f"✗ Unable to connect to Ollama: {str(e)}")
print("\nINSTRUCTIONS FOR OLLAMA:")
print("1. Verify that the Ollama server is running at the specified address")
print("2. Verify that port 11434 is open and accessible")
print("3. Check Ollama server logs for potential issues")
# Check Python environment
def check_python_env():
print("\n=== Checking Python environment ===")
print(f"Python {sys.version}")
print(f"Platform: {platform.platform()}")
# Check installed packages
required_packages = ["PyQt6", "PyPDF2", "pytesseract", "requests", "fitz"]  # "fitz" is provided by PyMuPDF
print("\nChecking required packages:")
for pkg in required_packages:
try:
__import__(pkg)
print(f"{pkg} is installed")
except ImportError:
print(f"{pkg} is NOT installed")
# Check Pillow separately (package name is Pillow but import name is PIL)
try:
import PIL
print(f"✓ PIL (Pillow) is installed")
except ImportError:
print(f"✗ PIL (Pillow) is NOT installed")
print("\nINSTRUCTIONS FOR PYTHON ENVIRONMENT:")
print("1. Make sure you're using the virtual environment if configured")
print("2. Install missing packages with: pip install -r requirements.txt")
# Main function
def main():
print("=== Testing critical components ===")
# Check Python environment
check_python_env()
# Check Tesseract OCR
check_tesseract()
# Check connection to Ollama
check_ollama()
print("\n=== Checks completed ===")
print("If issues were detected, follow the displayed instructions")
print("After fixing issues, run this script again to verify")
if __name__ == "__main__":
main()
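The checks can also be reused individually, for example to probe a different Ollama endpoint. A minimal sketch, assuming the script is importable as `check_components` (the module name is an assumption):

```python
# Hypothetical reuse: run only the Ollama check against a local server
from check_components import check_ollama  # module name is an assumption

check_ollama(endpoint="http://localhost:11434")
```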

View File

@@ -10,7 +10,8 @@ import json
from PyQt6.QtWidgets import (QWidget, QVBoxLayout, QHBoxLayout, QLabel,
QPushButton, QComboBox, QTextEdit, QGroupBox,
QListWidget, QSplitter, QTabWidget, QSpinBox,
QDoubleSpinBox, QFormLayout, QMessageBox, QCheckBox,
QApplication)
from PyQt6.QtCore import Qt, QSize
class LLMConfigPanel(QWidget):
@@ -121,6 +122,11 @@ class LLMConfigPanel(QWidget):
self.summary_model_combo.addItems(["mistral", "deepseek-r1"])
agent_type_layout.addRow("Modèle:", self.summary_model_combo)
# Case à cocher pour activer/désactiver l'agent de résumé
self.summary_enabled_checkbox = QCheckBox("Activer l'agent de résumé")
self.summary_enabled_checkbox.setChecked(False) # Désactivé par défaut
agent_type_layout.addRow("", self.summary_enabled_checkbox)
# Agent de traduction
agent_type_layout.addRow(QLabel("<b>Agent Traduction</b>"))
@@ -196,6 +202,11 @@ class LLMConfigPanel(QWidget):
self.include_images_check.setChecked(True)
export_group_layout.addWidget(self.include_images_check)
# Option pour enregistrer les images capturées
self.save_captured_images_check = QCheckBox("Enregistrer les images capturées")
self.save_captured_images_check.setChecked(True)
export_group_layout.addWidget(self.save_captured_images_check)
export_layout.addWidget(export_group)
export_btn = QPushButton("Exporter")
@@ -338,51 +349,122 @@ class LLMConfigPanel(QWidget):
self.translation_model_combo.setCurrentText(translation["model"])
def run_agent(self):
"""Execute l'agent LLM sur la sélection actuelle"""
"""Exécute l'agent LLM sur la sélection actuelle"""
if not self.current_selection:
QMessageBox.warning(self, "Aucune sélection",
"Veuillez sélectionner une région à analyser.")
self.result_text.setText("Veuillez sélectionner une région avant d'exécuter l'agent.")
return
try:
# Récupération des paramètres
selection_type = self.current_selection.get("type", "autre")
context = self.current_selection.get("context", "")
# Paramètres de génération
gen_params = {
"temperature": self.temp_spin.value(),
"top_p": self.top_p_spin.value(),
"max_tokens": self.token_spin.value(),
"save_images": self.save_captured_images_check.isChecked() # Ajout du paramètre d'enregistrement
}
self.result_text.setText("Analyse en cours...\nCette opération peut prendre plusieurs minutes.")
self.parent.status_bar.showMessage("Traitement de l'image en cours...")
# Forcer la mise à jour de l'interface
QApplication.processEvents()
# Afficher les informations sur la sélection actuelle (débogage)
rect = self.current_selection.get("rect")
if rect:
rect_info = f"Sélection: x={rect.x()}, y={rect.y()}, w={rect.width()}, h={rect.height()}, page={self.current_selection.get('page', 0)+1}"
print(f"INFO SÉLECTION: {rect_info}")
# Récupérer l'image de la sélection
selection_image = self.parent.get_selection_image(self.current_selection)
if selection_image:
# Vérifier la taille des données d'image pour le débogage
print(f"Taille des données image: {len(selection_image)} octets")
# Récupération des paramètres depuis l'interface
vision_model = self.vision_model_combo.currentText()
vision_lang = self.vision_lang_combo.currentText()
summary_model = self.summary_model_combo.currentText()
translation_model = self.translation_model_combo.currentText()
# Configuration du pipeline de traitement
from utils.workflow_manager import WorkflowManager
from config.agent_config import ACTIVE_AGENTS
# Mise à jour dynamique de l'activation des agents
active_agents = ACTIVE_AGENTS.copy()
active_agents["summary"] = self.summary_enabled_checkbox.isChecked()
# Afficher les agents actifs pour le débogage
print(f"Agents actifs: {active_agents}")
print(f"Modèles utilisés: vision={vision_model}, translation={translation_model}, summary={summary_model}")
# Créer le gestionnaire de workflow
workflow = WorkflowManager(
vision_model=vision_model,
summary_model=summary_model,
translation_model=translation_model,
active_agents=active_agents,
config=gen_params
)
# Exécution du workflow
results = workflow.process_image(
image_data=selection_image,
content_type=selection_type,
context=context,
target_lang=vision_lang
)
# Affichage des résultats
if results:
result_text = ""
# Vision (original)
if "vision" in results and results["vision"]:
result_text += "🔍 ANALYSE VISUELLE (en):\n"
result_text += results["vision"]
result_text += "\n\n"
# Traduction
if "translation" in results and results["translation"]:
result_text += "🌐 TRADUCTION (fr):\n"
result_text += results["translation"]
result_text += "\n\n"
# Résumé (si activé)
if "summary" in results and results["summary"]:
result_text += "📝 RÉSUMÉ (fr):\n"
result_text += results["summary"]
# Message d'erreur
if "error" in results:
result_text += "\n\n❌ ERREUR:\n"
result_text += results["error"]
# Enregistrer l'analyse dans l'historique
self.analysis_results[self.selection_list.currentRow()] = results
self.result_text.setText(result_text)
self.parent.status_bar.showMessage("Analyse terminée.")
else:
self.result_text.setText("Erreur: Impossible d'obtenir une analyse de l'image.")
self.parent.status_bar.showMessage("Analyse échouée.")
else:
self.result_text.setText("Erreur: Impossible d'extraire l'image de la sélection.")
self.parent.status_bar.showMessage("Erreur: Extraction d'image impossible.")
except Exception as e:
import traceback
error_details = traceback.format_exc()
print(f"Erreur lors de l'analyse: {error_details}")
self.result_text.setText(f"Erreur lors de l'analyse: {str(e)}\n\nDétails de l'erreur:\n{error_details}")
self.parent.status_bar.showMessage("Erreur: Analyse échouée.")
def export_results(self):
"""Exporte les résultats au format Markdown"""

View File

@@ -164,7 +164,12 @@ class PDFViewer(QMainWindow):
zoom_matrix = fitz.Matrix(2 * self.zoom_factor, 2 * self.zoom_factor)
# Rendu de la page en un pixmap
try:
# Essayer avec la nouvelle API (PyMuPDF récent)
pixmap = page.get_pixmap(matrix=zoom_matrix, alpha=False)
except AttributeError:
# Fallback pour les anciennes versions
pixmap = page.render_pixmap(matrix=zoom_matrix, alpha=False)
# Conversion en QImage puis QPixmap
img = QImage(pixmap.samples, pixmap.width, pixmap.height,
@@ -224,14 +229,31 @@
"""
page = self.current_page if page_num is None else page_num
# Ajuster les coordonnées selon le zoom actuel
# DEBUG: Afficher les coordonnées brutes et le facteur de zoom
print(f"DEBUG: Coordonnées brutes de sélection: x={rect.x()}, y={rect.y()}, w={rect.width()}, h={rect.height()}")
print(f"DEBUG: Facteur de zoom: {self.zoom_factor}")
# Calculer le décalage du viewport
viewport_pos = self.scroll_area.horizontalScrollBar().value(), self.scroll_area.verticalScrollBar().value()
print(f"DEBUG: Décalage du viewport: x={viewport_pos[0]}, y={viewport_pos[1]}")
# Obtenir les dimensions actuelles de l'image affichée
if self.pdf_label.pixmap():
img_width = self.pdf_label.pixmap().width()
img_height = self.pdf_label.pixmap().height()
print(f"DEBUG: Dimensions de l'image affichée: {img_width}x{img_height}")
# Ajuster les coordonnées en fonction du zoom et du décalage
# Note: pour simplifier, nous ignorons le décalage du viewport pour l'instant
adjusted_rect = QRectF(
rect.x() / self.zoom_factor / 2, # Division par 2 car la matrice de rendu utilise 2x
rect.y() / self.zoom_factor / 2,
rect.width() / self.zoom_factor / 2,
rect.height() / self.zoom_factor / 2
)
print(f"DEBUG: Coordonnées ajustées: x={adjusted_rect.x()}, y={adjusted_rect.y()}, w={adjusted_rect.width()}, h={adjusted_rect.height()}")
selection = {
"page": page,
"rect": adjusted_rect,
@@ -245,6 +267,63 @@
# Informer l'utilisateur
self.status_bar.showMessage(f"Sélection ajoutée à la page {page + 1}")
def get_selection_image(self, selection):
"""
Extrait l'image de la sélection
Args:
selection (dict): Dictionnaire contenant les informations de la sélection
Returns:
bytes: Données de l'image en bytes, ou None en cas d'erreur
"""
if not selection or not self.pdf_document:
return None
try:
# Récupérer la page
page_num = selection.get("page", 0)
page = self.pdf_document[page_num]
# Récupérer les coordonnées de la sélection
rect = selection.get("rect")
if not rect:
return None
# Convertir les coordonnées en rectangle PyMuPDF
# Les coordonnées stockées sont déjà ajustées par rapport au zoom (voir méthode add_selection)
mupdf_rect = fitz.Rect(
rect.x(),
rect.y(),
rect.x() + rect.width(),
rect.y() + rect.height()
)
# Debug: Afficher les coordonnées pour vérification
print(f"Coordonnées de la sélection (ajustées): {mupdf_rect}")
# Matrice de transformation pour la qualité (sans zoom supplémentaire)
matrix = fitz.Matrix(2, 2) # Facteur de qualité fixe à 2
# Capturer l'image de la zone sélectionnée
try:
# Essayer avec la nouvelle API (PyMuPDF récent)
pix = page.get_pixmap(matrix=matrix, clip=mupdf_rect)
except AttributeError:
# Fallback pour les anciennes versions
pix = page.render_pixmap(matrix=matrix, clip=mupdf_rect)
# Debug: Afficher les dimensions du pixmap obtenu
print(f"Dimensions de l'image capturée: {pix.width}x{pix.height}")
# Convertir en format PNG
img_data = pix.tobytes("png")
return img_data
except Exception as e:
print(f"Erreur lors de l'extraction de l'image: {str(e)}")
return None
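Because pages are rendered with `fitz.Matrix(2 * zoom_factor, 2 * zoom_factor)`, a widget pixel maps back to PDF points by dividing by `2 * zoom_factor`. A standalone sketch of that inverse transform (viewport scrolling deliberately ignored, as in the code above):

```python
def widget_to_pdf(x: float, y: float, zoom_factor: float) -> tuple[float, float]:
    """Map widget-pixel coordinates back to PDF points.

    The page is rendered with fitz.Matrix(2 * zoom_factor, 2 * zoom_factor),
    so the inverse transform simply divides by 2 * zoom_factor.
    """
    scale = 2 * zoom_factor
    return x / scale, y / scale
```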
class PDFLabel(QLabel):
"""Étiquette personnalisée pour afficher le PDF et gérer les sélections"""

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@@ -2,86 +2,254 @@
# -*- coding: utf-8 -*-
"""
API Interface for Ollama
"""
import requests
import json
import base64
import time
import os
import threading
from typing import List, Dict, Any, Optional, Union, Callable
# Verrouillage global pour les appels Ollama
_ollama_lock = threading.Lock()
_model_in_use = None
_last_call_time = 0.0 # Float pour le temps (secondes)
_min_delay_between_calls = 3.0 # Délai minimum en secondes entre les appels à Ollama
class OllamaAPI:
"""
Simplified interface for Ollama API
"""
def __init__(self, base_url: str = "http://217.182.105.173:11434"):
"""
Initialize the API with the server's base URL
Args:
base_url (str): Base URL of the Ollama server
"""
self.base_url = base_url.rstrip("/")
self.generate_endpoint = f"{self.base_url}/api/generate"
self.chat_endpoint = f"{self.base_url}/api/chat"
self.models_endpoint = f"{self.base_url}/api/tags"
self.timeout = 120 # Increase timeout to 2 minutes
self.max_retries = 2 # Nombre maximum de tentatives pour les requêtes
self.retry_delay = 2 # Délai entre les tentatives en secondes
# Check connection on startup
self._check_connection()
@staticmethod
def wait_for_ollama(model_name: str, timeout: int = 120) -> bool:
"""
Waits until the Ollama server is available for the specified model
Args:
model_name (str): Name of the model to wait for
timeout (int): Maximum wait time in seconds
Returns:
bool: True if the server is available, False on timeout
"""
global _ollama_lock, _model_in_use, _last_call_time, _min_delay_between_calls
# Calculer le temps à attendre depuis le dernier appel
time_since_last_call = time.time() - _last_call_time
if time_since_last_call < _min_delay_between_calls:
delay = _min_delay_between_calls - time_since_last_call
print(f"Attente de {delay:.1f}s pour respecter le délai minimal entre appels...")
time.sleep(delay)
start_time = time.time()
while True:
with _ollama_lock:
# Si aucun modèle n'est actuellement en cours d'utilisation
if _model_in_use is None:
_model_in_use = model_name
_last_call_time = time.time()
return True
# Si le temps d'attente est dépassé
if time.time() - start_time > timeout:
print(f"Timeout en attendant Ollama pour le modèle {model_name}")
return False
# Attendre et réessayer
wait_time = min(5, (timeout - (time.time() - start_time)))
if wait_time <= 0:
return False
print(f"En attente d'Ollama ({_model_in_use} est en cours d'utilisation)... Nouvel essai dans {wait_time:.1f}s")
time.sleep(wait_time)
@staticmethod
def release_ollama():
"""Libère le verrouillage sur Ollama"""
global _ollama_lock, _model_in_use, _last_call_time
with _ollama_lock:
_model_in_use = None
_last_call_time = time.time()
print("Ollama libéré et disponible pour de nouveaux appels")
def _check_connection(self) -> bool:
"""
Checks if the Ollama server is accessible
Returns:
bool: True if server is accessible, False otherwise
"""
try:
response = requests.get(f"{self.base_url}/api/version", timeout=10)
if response.status_code == 200:
version_info = response.json()
print(f"Connection to Ollama established. Version: {version_info.get('version', 'unknown')}")
return True
else:
print(f"Error connecting to Ollama: status {response.status_code}")
return False
except requests.exceptions.RequestException as e:
print(f"Unable to connect to Ollama server: {str(e)}")
print(f"URL: {self.base_url}")
print("Check that the server is running and accessible.")
return False
def list_models(self) -> List[Dict[str, Any]]:
"""
Lists available models on Ollama server
Returns:
List[Dict[str, Any]]: List of available models
"""
try:
response = requests.get(self.models_endpoint, timeout=self.timeout)
if response.status_code == 200:
return response.json().get("models", [])
else:
print(f"Error retrieving models: status {response.status_code}")
return []
except requests.exceptions.RequestException as e:
print(f"Connection error while retrieving models: {str(e)}")
return []
def _is_model_available(self, model_name: str) -> bool:
"""
Vérifie si un modèle spécifique est disponible sur le serveur
Args:
model_name (str): Nom du modèle à vérifier
Returns:
bool: True si le modèle est disponible, False sinon
"""
models = self.list_models()
available_models = [model["name"] for model in models]
# Vérification exacte
if model_name in available_models:
return True
# Vérification partielle (pour gérer les versions)
for available_model in available_models:
# Si le modèle demandé est une partie d'un modèle disponible
if model_name in available_model or available_model in model_name:
print(f"Note: Le modèle '{model_name}' correspond partiellement à '{available_model}'")
return True
return False
def _make_request_with_retry(self, method: str, url: str, json_data: Dict[str, Any],
timeout: Optional[int] = None) -> requests.Response:
"""
Effectue une requête HTTP avec mécanisme de réessai
Args:
method (str): Méthode HTTP (POST, GET, etc.)
url (str): URL de la requête
json_data (Dict): Données JSON à envoyer
timeout (int, optional): Timeout en secondes
Returns:
requests.Response: Réponse HTTP
Raises:
requests.exceptions.RequestException: Si toutes les tentatives échouent
"""
# Utiliser la valeur par défaut de l'instance si aucun timeout n'est spécifié
request_timeout = self.timeout if timeout is None else timeout
attempt = 0
last_error = None
while attempt < self.max_retries:
try:
if method.upper() == "POST":
return requests.post(url, json=json_data, timeout=request_timeout)
elif method.upper() == "GET":
return requests.get(url, json=json_data, timeout=request_timeout)
else:
raise ValueError(f"Méthode HTTP non supportée: {method}")
except requests.exceptions.RequestException as e:
last_error = e
attempt += 1
if attempt < self.max_retries:
print(f"Tentative {attempt} échouée. Nouvelle tentative dans {self.retry_delay}s...")
time.sleep(self.retry_delay)
# Si on arrive ici, c'est que toutes les tentatives ont échoué
raise last_error or requests.exceptions.RequestException("Toutes les tentatives ont échoué")
def generate(self, model: str, prompt: str, images: Optional[List[bytes]] = None,
options: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
"""
Generates a response from an Ollama model
Args:
model (str): Model name to use
prompt (str): Prompt text
images (List[bytes], optional): Images to send to model (for multimodal models)
options (Dict, optional): Generation options
Returns:
Dict[str, Any]: Model response
"""
# Default response in case of errors
result: Dict[str, Any] = {"error": "Unknown error", "response": "Error during generation"}
# Input validation
if not model:
return {"error": "Model parameter is required", "response": "Error: no model specified"}
if not prompt and not images:
return {"error": "Either prompt or images must be provided", "response": "Error: no content to generate from"}
if options is None:
options = {}
# Vérifier si le modèle est disponible
if not self._is_model_available(model):
model_error = f"Le modèle '{model}' n'est pas disponible sur le serveur Ollama. Utilisez la commande: ollama pull {model}"
print(model_error)
return {"error": model_error, "response": f"Error: model '{model}' not found, try pulling it first"}
# Attendre que le serveur Ollama soit disponible
if not self.wait_for_ollama(model, timeout=180):
return {"error": "Timeout waiting for Ollama", "response": "Timeout waiting for Ollama server to be available"}
try:
# Prepare payload
payload = {
"model": model,
"prompt": prompt,
"options": default_options
"options": options,
"stream": False # Important: disable streaming to avoid JSON parsing errors
}
# Add images if provided (for multimodal models)
if images:
base64_images = []
for img in images:
@@ -91,56 +259,114 @@ class OllamaAPI:
payload["images"] = base64_images
# Make request
print(f"Sending request to {self.generate_endpoint} for model {model}...")
start_time = time.time()
try:
response = self._make_request_with_retry("POST", self.generate_endpoint, payload)
except requests.exceptions.RequestException as e:
self.release_ollama() # Libérer Ollama en cas d'erreur
return {"error": f"Connection error: {str(e)}", "response": "Error connecting to model server"}
elapsed_time = time.time() - start_time
# Handle response
if response.status_code == 200:
print(f"Response received in {elapsed_time:.2f} seconds")
try:
result = response.json()
except Exception as e:
# In case of JSON parsing error, try to process line by line
print(f"JSON parsing error: {e}")
print("Trying to process line by line...")
# If the response contains multiple JSON lines, take the first valid line
lines = response.text.strip().split("\n")
if len(lines) > 0:
try:
result = json.loads(lines[0])
except:
# If that still doesn't work, return the raw text
result = {"response": response.text[:1000], "model": model}
elif response.status_code == 404:
# Modèle spécifiquement non trouvé
error_msg = f"Model '{model}' not found on the server. Try running: ollama pull {model}"
print(error_msg)
result = {"error": error_msg, "response": f"Error: model '{model}' not found, try pulling it first"}
else:
error_msg = f"Error during generation: status {response.status_code}"
try:
error_json = response.json()
if "error" in error_json:
error_msg += f", message: {error_json['error']}"
except:
error_msg += f", body: {response.text[:100]}"
print(error_msg)
result = {"error": error_msg, "response": "Error communicating with model"}
except Exception as e:
# Catch any other unexpected errors
error_msg = f"Unexpected error: {str(e)}"
print(error_msg)
result = {"error": error_msg, "response": "An unexpected error occurred"}
finally:
# Toujours libérer Ollama à la fin
self.release_ollama()
# Ensure we always return a dictionary
return result
def chat(self, model: str, messages: List[Dict[str, Any]],
images: Optional[List[bytes]] = None,
options: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
"""
Generates a response from a chat history
Args:
model (str): Model name to use
messages (List[Dict]): List of chat messages (format [{"role": "user", "content": "..."}])
images (List[bytes], optional): Images to send to model (for multimodal models)
options (Dict, optional): Generation options
Returns:
Dict[str, Any]: Model response
"""
# Default response in case of errors
result: Dict[str, Any] = {"error": "Unknown error", "response": "Error during chat generation"}
# Input validation
if not model:
return {"error": "Model parameter is required", "response": "Error: no model specified"}
if not messages:
return {"error": "Messages parameter is required", "response": "Error: no chat messages provided"}
if options is None:
options = {}
# Vérifier si le modèle est disponible
if not self._is_model_available(model):
model_error = f"Le modèle '{model}' n'est pas disponible sur le serveur Ollama. Utilisez la commande: ollama pull {model}"
print(model_error)
return {"error": model_error, "response": f"Error: model '{model}' not found, try pulling it first"}
# Attendre que le serveur Ollama soit disponible
if not self.wait_for_ollama(model, timeout=180):
return {"error": "Timeout waiting for Ollama", "response": "Timeout waiting for Ollama server to be available"}
try:
# Prepare payload
payload = {
"model": model,
"messages": messages,
"options": default_options
"options": options,
"stream": False # Important: disable streaming to avoid JSON parsing errors
}
# Add images to the last user message if provided
if images and messages and messages[-1]["role"] == "user":
base64_images = []
for img in images:
@@ -148,97 +374,161 @@ class OllamaAPI:
base64_img = base64.b64encode(img).decode("utf-8")
base64_images.append(base64_img)
# Modify the last message to include images
last_message = messages[-1].copy()
# Ollama's API expects images in a dedicated "images" array,
# not embedded as a standard text field
if "images" not in last_message:
last_message["images"] = base64_images
# Replace the last message
payload["messages"] = messages[:-1] + [last_message]
# Make request
print(f"Sending chat request to {self.chat_endpoint} for model {model}...")
start_time = time.time()
try:
response = self._make_request_with_retry("POST", self.chat_endpoint, payload)
except requests.exceptions.RequestException as e:
self.release_ollama() # Libérer Ollama en cas d'erreur
return {"error": f"Connection error: {str(e)}", "response": "Error connecting to model server"}
elapsed_time = time.time() - start_time
# Handle response
if response.status_code == 200:
print(f"Chat response received in {elapsed_time:.2f} seconds")
try:
result = response.json()
except Exception as e:
# In case of JSON parsing error, try to process line by line
print(f"JSON parsing error: {e}")
lines = response.text.strip().split("\n")
if len(lines) > 0:
try:
result = json.loads(lines[0])
except:
result = {"message": {"content": response.text[:1000]}, "model": model}
elif response.status_code == 404:
# Modèle spécifiquement non trouvé
error_msg = f"Model '{model}' not found on the server. Try running: ollama pull {model}"
print(error_msg)
result = {"error": error_msg, "response": f"Error: model '{model}' not found, try pulling it first"}
else:
error_msg = f"Error during chat generation: status {response.status_code}"
try:
error_json = response.json()
if "error" in error_json:
error_msg += f", message: {error_json['error']}"
except:
error_msg += f", body: {response.text[:100]}"
print(error_msg)
result = {"error": error_msg, "response": "Error communicating with model"}
except Exception as e:
# Catch any other unexpected errors
error_msg = f"Unexpected error: {str(e)}"
print(error_msg)
result = {"error": error_msg, "response": "An unexpected error occurred"}
finally:
# Toujours libérer Ollama à la fin
self.release_ollama()
# Ensure we always return a dictionary
return result
def stream_generate(self, model: str, prompt: str,
callback: Callable[[str], None],
options: Optional[Dict[str, Any]] = None) -> None:
images: Optional[List[bytes]] = None,
options: Optional[Dict[str, Any]] = None) -> str:
"""
Generate a response in streaming mode with a callback function
Args:
model (str): Model name
prompt (str): Prompt to send
callback (Callable): Function called for each received chunk
images (List[bytes], optional): Images to send
options (Dict, optional): Generation options
Returns:
str: Complete generated text
"""
if options is None:
options = {}
payload = {
"model": model,
"prompt": prompt,
"options": default_options
"options": options,
"stream": True # Enable streaming
}
# Add images if provided
if images:
base64_images = []
for img in images:
if isinstance(img, bytes):
base64_img = base64.b64encode(img).decode("utf-8")
base64_images.append(base64_img)
payload["images"] = base64_images
full_response = ""
try:
with requests.post(
self.generate_endpoint,
json=payload,
stream=True,
timeout=self.timeout
) as response:
if response.status_code != 200:
error_msg = f"Error during streaming: status {response.status_code}"
callback(error_msg)
return error_msg
# Traiter chaque ligne de la réponse
for line in response.iter_lines():
if line:
try:
chunk = json.loads(line)
if "response" in chunk:
text_chunk = chunk["response"]
full_response += text_chunk
callback(text_chunk)
except json.JSONDecodeError:
# Ignore lines that are not valid JSON
pass
return full_response
except Exception as e:
error_msg = f"Error during streaming: {str(e)}"
callback(error_msg)
return error_msg
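A minimal usage sketch for the streaming mode, printing chunks as they arrive (assumes the model has already been pulled on the server):

```python
api = OllamaAPI()
full_text = api.stream_generate(
    model="mistral",
    prompt="Explain OCR in one sentence.",
    callback=lambda chunk: print(chunk, end="", flush=True),
)
```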
# Test the API if executed directly
if __name__ == "__main__":
api = OllamaAPI()
print("Testing connection to Ollama...")
if api._check_connection():
print("Connection successful!")
print("\nList of available models:")
models = api.list_models()
for model in models:
print(f"- {model.get('name', 'Unknown')} ({model.get('size', 'Unknown size')})")
print("\nTesting a model (if available):")
if models and "name" in models[0]:
model_name = models[0]["name"]
print(f"Testing model {model_name} with a simple prompt...")
response = api.generate(model_name, "Say hello in English")
if "response" in response:
print(f"Response: {response['response']}")
else:
print(f"Error: {response.get('error', 'Unknown error')}")
else:
print("Failed to connect to Ollama.")
print("Check that the server is running at the specified address.")

utils/data_structure.py (Normal file, 65 lines)
View File

@@ -0,0 +1,65 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Utilitaire pour initialiser la structure de données du projet
"""
import os
import platform
def initialize_data_directories():
"""
Crée la structure de répertoires nécessaire pour le stockage des données
"""
# Répertoire principal de données
data_dir = "data"
# Sous-répertoires pour les différents types de données
subdirs = [
"outputs", # Sorties finales
"images", # Images analysées
"translations", # Journaux de traduction
"ocr_logs", # Journaux OCR
"summaries", # Journaux de résumés
"rewrites", # Journaux de reformulation
"uploads", # Fichiers téléchargés temporaires
"templates", # Modèles de documents
"temp", # Fichiers temporaires
"workflows" # Journaux de flux de travail complets
]
# Créer le répertoire principal s'il n'existe pas
if not os.path.exists(data_dir):
os.makedirs(data_dir)
print(f"Création du répertoire principal: {data_dir}")
# Créer les sous-répertoires
for subdir in subdirs:
full_path = os.path.join(data_dir, subdir)
if not os.path.exists(full_path):
os.makedirs(full_path)
print(f"Création du sous-répertoire: {full_path}")
# Créer le fichier .gitkeep dans chaque répertoire vide pour le contrôle de version
for subdir in subdirs:
full_path = os.path.join(data_dir, subdir)
gitkeep_file = os.path.join(full_path, ".gitkeep")
if not os.path.exists(gitkeep_file):
with open(gitkeep_file, "w") as f:
f.write("# Ce fichier permet de conserver le répertoire vide dans Git")
print("Structure de répertoires initialisée avec succès.")
# Ajustements spécifiques à Windows
if platform.system() == "Windows":
# Vérifier si le chemin du fichier temporaire est trop long
temp_dir = os.path.join(data_dir, "temp")
temp_path = os.path.abspath(temp_dir)
if len(temp_path) > 240: # Windows a une limite de 260 caractères pour les chemins
print(f"AVERTISSEMENT: Le chemin du répertoire temporaire est trop long: {len(temp_path)} caractères")
print("Windows a une limite de 260 caractères pour les chemins complets.")
print("Considérez déplacer le projet dans un répertoire avec un chemin plus court.")
if __name__ == "__main__":
initialize_data_directories()
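The same initialization can also be performed at application startup; a sketch (where exactly it is wired in is an assumption):

```python
from utils.data_structure import initialize_data_directories

# Safe to call repeatedly: directories are only created when missing
initialize_data_directories()
```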

View File

@@ -12,11 +12,23 @@ from typing import Union, Dict, Tuple, Optional
from PIL import Image
import io
import copy
import os
# Configuration du chemin de Tesseract OCR
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe' # Pour Windows
# Pour Linux et macOS, Tesseract doit être installé et disponible dans le PATH
# Vérification si le chemin de Tesseract existe, sinon essayer d'autres chemins courants
if not os.path.exists(pytesseract.pytesseract.tesseract_cmd):
alternative_paths = [
r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe',
r'C:\Tesseract-OCR\tesseract.exe'
]
for path in alternative_paths:
if os.path.exists(path):
pytesseract.pytesseract.tesseract_cmd = path
break
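Hardcoding the Windows path breaks Linux/macOS, where Tesseract is expected on the PATH. A more portable resolution, as a sketch, would try `shutil.which` before the Windows fallbacks:

```python
import shutil

import pytesseract

# Prefer whatever "tesseract" resolves to on PATH; keep Windows paths as fallback
found = shutil.which("tesseract")
if found:
    pytesseract.pytesseract.tesseract_cmd = found
```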
class OCRProcessor:
"""
Classe pour traiter les images et extraire le texte

utils/pdf_processor.py (Normal file, 138 lines)
View File

@@ -0,0 +1,138 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Traitement des PDF avec le gestionnaire de flux de travail
"""
import os
import io
import time
from typing import Dict, Any, Optional, List, Tuple, Union
from PIL import Image
from utils.workflow_manager import WorkflowManager
class PDFProcessor:
"""
Processeur de PDF intégré avec le gestionnaire de flux de travail
"""
def __init__(self):
"""
Initialise le processeur de PDF
"""
# Initialiser le gestionnaire de flux
self.workflow_manager = WorkflowManager()
# Dossier pour sauvegarder les résultats
self.output_dir = os.path.join("data", "outputs")
os.makedirs(self.output_dir, exist_ok=True)
# Timestamp pour cette session
self.timestamp = time.strftime("%Y%m%d-%H%M%S")
print(f"Processeur PDF initialisé")
def process_image_selection(self, image_data: bytes, selection_type: str,
context: str, page_number: int) -> Dict[str, Any]:
"""
Traite une sélection d'image dans un PDF
Args:
image_data (bytes): Données de l'image sélectionnée
selection_type (str): Type de sélection (schéma, tableau, etc.)
context (str): Contexte textuel associé à l'image
page_number (int): Numéro de la page où se trouve l'image
Returns:
Dict: Résultats du traitement
"""
print(f"Traitement d'une image de type '{selection_type}' (page {page_number})")
# Traiter l'image avec le gestionnaire de flux
results = self.workflow_manager.process_image(
image_data, selection_type, context
)
# Ajouter des informations supplémentaires aux résultats
results["page_number"] = page_number
results["timestamp"] = time.strftime("%Y-%m-%d %H:%M:%S")
# Sauvegarder les résultats dans un fichier texte
self._save_results(results, selection_type, page_number)
return results
def _save_results(self, results: Dict[str, Any], selection_type: str, page_number: int) -> str:
"""
Sauvegarde les résultats du traitement dans un fichier
Args:
results (Dict): Résultats du traitement
selection_type (str): Type de sélection
page_number (int): Numéro de page
Returns:
str: Chemin du fichier de résultats
"""
# Créer un nom de fichier unique
filename = f"{self.timestamp}_page{page_number}_{selection_type}.txt"
output_path = os.path.join(self.output_dir, filename)
# Sauvegarder l'image si disponible
if "original" in results and "image" in results["original"]:
try:
image_data = results["original"]["image"]
img = Image.open(io.BytesIO(image_data))
image_filename = f"{self.timestamp}_page{page_number}_{selection_type}.png"
image_path = os.path.join(self.output_dir, image_filename)
img.save(image_path)
print(f"Image sauvegardée dans: {image_path}")
except Exception as e:
print(f"Erreur lors de la sauvegarde de l'image: {str(e)}")
# Écrire les résultats dans un fichier texte
with open(output_path, "w", encoding="utf-8") as f:
# En-tête
f.write(f"# Analyse de {selection_type} - Page {page_number}\n")
f.write(f"Date et heure: {results['timestamp']}\n\n")
# Analyse de vision
if "vision" in results:
f.write("## Analyse de l'image (anglais)\n\n")
f.write(f"{results['vision']}\n\n")
# Analyse traduite
if "translation" in results:
f.write("## Analyse traduite (français)\n\n")
f.write(f"{results['translation']}\n\n")
# Résumé
if "summary" in results:
f.write("## Résumé (français)\n\n")
f.write(f"{results['summary']}\n\n")
# Erreurs éventuelles
if "error" in results:
f.write("## ERREUR\n\n")
f.write(f"{results['error']}\n\n")
print(f"Résultats sauvegardés dans: {output_path}")
return output_path
def get_workflow_log_path(self) -> str:
"""
Renvoie le chemin du fichier de journal du flux de travail
Returns:
str: Chemin du fichier de journal
"""
return self.workflow_manager.logger.get_workflow_path()
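A usage sketch for the processor, feeding it a captured selection (the PNG path is hypothetical):

```python
from utils.pdf_processor import PDFProcessor

processor = PDFProcessor()
with open("data/images/selection.png", "rb") as f:  # hypothetical capture
    results = processor.process_image_selection(
        image_data=f.read(),
        selection_type="tableau",
        context="Texte entourant la sélection",
        page_number=3,
    )
print(processor.get_workflow_log_path())
```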

utils/workflow_logger.py (Normal file, 147 lines)
View File

@@ -0,0 +1,147 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Utilitaire pour enregistrer le flux de traitement entre les agents LLM
"""
import os
import time
import uuid
import json
from typing import Dict, Any, Optional, List
class WorkflowLogger:
"""
Enregistreur de flux de travail pour suivre le traitement entre les agents
"""
def __init__(self, session_id: Optional[str] = None):
"""
Initialise l'enregistreur de flux de travail
Args:
session_id (str, optional): Identifiant de session (généré si non fourni)
"""
# Utiliser l'ID fourni ou en générer un nouveau
self.session_id = session_id or str(uuid.uuid4())[:8]
self.timestamp = time.strftime("%Y%m%d-%H%M%S")
# Répertoire pour les journaux
self.log_dir = os.path.join("data", "workflows")
os.makedirs(self.log_dir, exist_ok=True)
# Fichier principal pour ce flux de travail
self.workflow_file = os.path.join(self.log_dir, f"{self.timestamp}_{self.session_id}_workflow.md")
# Initialiser le fichier avec un en-tête
with open(self.workflow_file, "w", encoding="utf-8") as f:
f.write(f"# Flux de traitement - Session {self.session_id}\n\n")
f.write(f"Date et heure: {time.strftime('%d/%m/%Y %H:%M:%S')}\n\n")
f.write("## Étapes du traitement\n\n")
def log_step(self, agent_name: str, input_data: Dict[str, Any], output_data: Any,
metadata: Optional[Dict[str, Any]] = None) -> None:
"""
Enregistre une étape du flux de traitement
Args:
agent_name (str): Nom de l'agent utilisé
input_data (Dict): Données d'entrée de l'agent
output_data (Any): Données de sortie de l'agent
metadata (Dict, optional): Métadonnées supplémentaires
"""
step_time = time.strftime("%H:%M:%S")
with open(self.workflow_file, "a", encoding="utf-8") as f:
# En-tête de l'étape
f.write(f"### Étape: {agent_name} ({step_time})\n\n")
# Métadonnées si fournies
if metadata:
f.write("#### Métadonnées\n\n")
for key, value in metadata.items():
f.write(f"- **{key}**: {value}\n")
f.write("\n")
# Entrées
f.write("#### Entrées\n\n")
for key, value in input_data.items():
if key == "images" and value:
f.write(f"- **{key}**: {len(value)} image(s) fournie(s)\n")
else:
# Limiter la taille des entrées affichées
if isinstance(value, str) and len(value) > 500:
preview = value[:497] + "..."
f.write(f"- **{key}**:\n```\n{preview}\n```\n")
else:
f.write(f"- **{key}**:\n```\n{value}\n```\n")
# Sorties
f.write("#### Sorties\n\n")
if isinstance(output_data, str) and len(output_data) > 1000:
preview = output_data[:997] + "..."
f.write(f"```\n{preview}\n```\n")
else:
f.write(f"```\n{output_data}\n```\n")
# Séparateur
f.write("\n---\n\n")
def log_error(self, agent_name: str, error_message: str,
input_data: Optional[Dict[str, Any]] = None) -> None:
"""
Enregistre une erreur dans le flux de traitement
Args:
agent_name (str): Nom de l'agent qui a généré l'erreur
error_message (str): Message d'erreur
input_data (Dict, optional): Données d'entrée qui ont causé l'erreur
"""
step_time = time.strftime("%H:%M:%S")
with open(self.workflow_file, "a", encoding="utf-8") as f:
# En-tête de l'erreur
f.write(f"### ERREUR dans {agent_name} ({step_time})\n\n")
# Message d'erreur
f.write("#### Message d'erreur\n\n")
f.write(f"```\n{error_message}\n```\n\n")
# Entrées si fournies
if input_data:
f.write("#### Entrées ayant causé l'erreur\n\n")
for key, value in input_data.items():
if key == "images" and value:
f.write(f"- **{key}**: {len(value)} image(s) fournie(s)\n")
else:
# Limiter la taille des entrées affichées
if isinstance(value, str) and len(value) > 500:
preview = value[:497] + "..."
f.write(f"- **{key}**:\n```\n{preview}\n```\n")
else:
f.write(f"- **{key}**:\n```\n{value}\n```\n")
# Séparateur
f.write("\n---\n\n")
def log_summary(self, summary_text: str) -> None:
"""
Ajoute un résumé final au flux de traitement
Args:
summary_text (str): Texte du résumé
"""
with open(self.workflow_file, "a", encoding="utf-8") as f:
f.write("## Résumé du traitement\n\n")
f.write(f"{summary_text}\n\n")
f.write(f"*Fin du flux de traitement - {time.strftime('%d/%m/%Y %H:%M:%S')}*\n")
def get_workflow_path(self) -> str:
"""
Renvoie le chemin du fichier de flux de travail
Returns:
str: Chemin du fichier
"""
return self.workflow_file
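The logger can also be driven manually, outside the workflow manager; a sketch (model name and payloads are illustrative):

```python
from utils.workflow_logger import WorkflowLogger

logger = WorkflowLogger()
logger.log_step(
    agent_name="vision",
    input_data={"context": "schéma hydraulique", "images": [b"\x89PNG..."]},
    output_data="Description of the diagram...",
    metadata={"model": "llava"},
)
logger.log_summary("1 étape exécutée sans erreur")
print(logger.get_workflow_path())
```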

utils/workflow_manager.py (Normal file, 320 lines)
View File

@@ -0,0 +1,320 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Gestionnaire de flux de travail pour coordonner les agents LLM
"""
import os
import time
from typing import Dict, Any, Optional, List, Tuple, Callable, Type, Union
from agents.ocr import OCRAgent
from agents.vision import VisionAgent
from agents.translation import TranslationAgent
from agents.summary import SummaryAgent
from agents.rewriter import RewriterAgent
from agents.base import LLMBaseAgent
from config.agent_config import (
is_agent_enabled,
get_default_model,
DEFAULT_ENDPOINT,
VERBOSE_LOGGING,
ACTIVE_AGENTS
)
from utils.workflow_logger import WorkflowLogger
class WorkflowManager:
"""
Gestionnaire de flux de travail pour coordonner les agents
"""
def __init__(self, vision_model: Optional[str] = None, summary_model: Optional[str] = None,
translation_model: Optional[str] = None, active_agents: Optional[Dict[str, bool]] = None,
config: Optional[Dict[str, Any]] = None):
"""
Initialise le gestionnaire de flux de travail
Args:
vision_model (str, optional): Modèle à utiliser pour l'agent vision
summary_model (str, optional): Modèle à utiliser pour l'agent résumé
translation_model (str, optional): Modèle à utiliser pour l'agent traduction
active_agents (Dict[str, bool], optional): Dictionnaire des agents actifs
config (Dict[str, Any], optional): Configuration supplémentaire
"""
# Créer un ID unique pour cette session
self.session_id = f"session_{int(time.time())}"
# Initialiser le logger de flux
self.logger = WorkflowLogger(self.session_id)
# Stocker les modèles spécifiés
self.model_overrides = {
"vision": vision_model,
"summary": summary_model,
"translation": translation_model
}
# Stocker la configuration
self.config = config or {}
# Utiliser les agents actifs spécifiés ou la configuration par défaut
self.active_agents = active_agents if active_agents is not None else ACTIVE_AGENTS.copy()
# Initialiser les agents actifs selon la configuration
self.agents = {}
self._init_agents()
print(f"Gestionnaire de flux initialisé avec session ID: {self.session_id}")
print(f"Agents actifs: {', '.join(self.agents.keys())}")
def _init_agents(self) -> None:
"""
Initialise les agents selon la configuration
"""
# OCR Agent
if self.active_agents.get("ocr", True):
self.agents["ocr"] = OCRAgent(model_name="ocr", endpoint=DEFAULT_ENDPOINT)
# Vision Agent (analyse d'images)
if self.active_agents.get("vision", True):
model = self.model_overrides["vision"] or get_default_model("vision")
self.agents["vision"] = VisionAgent(model, DEFAULT_ENDPOINT)
# Translation Agent
if self.active_agents.get("translation", True):
model = self.model_overrides["translation"] or get_default_model("translation")
self.agents["translation"] = TranslationAgent(model, DEFAULT_ENDPOINT)
# Summary Agent (désactivé selon la configuration)
if self.active_agents.get("summary", False):
model = self.model_overrides["summary"] or get_default_model("summary")
self.agents["summary"] = SummaryAgent(model, DEFAULT_ENDPOINT)
# Rewriter Agent
if self.active_agents.get("rewriter", True):
model = get_default_model("rewriter")
self.agents["rewriter"] = RewriterAgent(model, DEFAULT_ENDPOINT)
def process_image(self, image_data: bytes, content_type: str,
context: str = "", target_lang: str = "fr") -> Dict[str, Any]:
"""
Traite une image avec contexte en utilisant les agents configurés
Args:
image_data (bytes): Données de l'image
content_type (str): Type de contenu (schéma, tableau, etc.)
context (str): Contexte textuel associé à l'image
target_lang (str): Langue cible pour les résultats (fr par défaut)
Returns:
Dict: Résultats du traitement
"""
results = {}
# Vérifier si l'option d'enregistrement des images est désactivée
save_images = self.config.get("save_images", True)
if not save_images and "vision" in self.agents:
# Appliquer la configuration au modèle de vision
self.agents["vision"].config["save_images"] = False
try:
# Étape 1: Analyser l'image avec le contexte
if "vision" in self.agents:
print(f"Étape 1/3: Analyse de l'image avec le modèle {self.agents['vision'].model_name}...")
# Si le contexte est en français et qu'on utilise un modèle en anglais,
# on pourrait le traduire ici, mais pour simplifier on l'utilise tel quel
vision_result = self.agents["vision"].generate(
images=[image_data],
selection_type=content_type,
context=context
)
results["vision"] = vision_result
# Journalisation
self.logger.log_step(
agent_name="vision",
input_data={"content_type": content_type, "context": context},
output_data=vision_result,
metadata={"model": self.agents["vision"].model_name}
)
# Si nous avons une erreur ou un message vide, ne pas continuer
if "Error" in vision_result or "Erreur" in vision_result or not vision_result.strip():
print(f"Avertissement: L'analyse d'image a échoué ou retourné un résultat vide.")
if not "translation" in self.agents:
return results
# Attendre un peu avant le prochain appel pour éviter de surcharger le serveur
print("Délai de sécurité entre les appels d'agents (2s)...")
time.sleep(2)
# Étape 2: Traduire l'analyse en français si nécessaire
if "translation" in self.agents and target_lang == "fr":
text_to_translate = results.get("vision", "")
# Si nous n'avons pas de résultat de vision mais avons un contexte, utiliser le contexte
if (not text_to_translate or "Error" in text_to_translate or "Erreur" in text_to_translate) and context:
text_to_translate = f"Voici le contexte de l'image: {context}"
elif not text_to_translate:
text_to_translate = "Aucune analyse d'image disponible."
print(f"Étape 2/3: Traduction du texte avec le modèle {self.agents['translation'].model_name}...")
translation_result = self.agents["translation"].generate(
prompt=text_to_translate,
source_language="en",
target_language="fr"
)
results["translation"] = translation_result
# Journalisation
self.logger.log_step(
agent_name="translation",
input_data={"text": text_to_translate},
output_data=translation_result,
metadata={"model": self.agents["translation"].model_name}
)
# Attendre un peu avant le prochain appel
print("Délai de sécurité entre les appels d'agents (2s)...")
time.sleep(2)
# Étape 3: Générer un résumé si l'agent est activé
if "summary" in self.agents:
# Déterminer le texte à résumer
text_to_summarize = ""
if "translation" in results:
text_to_summarize = results["translation"]
elif "vision" in results:
text_to_summarize = results["vision"]
else:
text_to_summarize = context or "Aucun texte disponible pour le résumé."
if text_to_summarize.strip():
print(f"Étape 3/3: Génération du résumé avec le modèle {self.agents['summary'].model_name}...")
# Utiliser le texte traduit pour le résumé
summary_result = self.agents["summary"].generate(
prompt=text_to_summarize,
language="fr" if "translation" in results else "en"
)
results["summary"] = summary_result
# Journalisation
self.logger.log_step(
agent_name="summary",
input_data={"text": text_to_summarize},
output_data=summary_result,
metadata={"model": self.agents["summary"].model_name}
)
# Générer un résumé du flux de travail
models_used = {
agent_name: agent.model_name
for agent_name, agent in self.agents.items()
if agent_name in ["vision", "translation", "summary"]
}
summary = f"Traitement terminé avec succès.\n"
summary += f"Modèles utilisés: {', '.join([f'{k}={v}' for k, v in models_used.items()])}\n"
summary += f"Agents exécutés: {', '.join(results.keys())}"
self.logger.log_summary(summary)
print(f"Traitement terminé avec succès.")
return results
except Exception as e:
error_msg = f"Erreur lors du traitement de l'image: {str(e)}"
print(error_msg)
self.logger.log_error("workflow_manager", error_msg)
results["error"] = error_msg
return results
def _translate_text(self, text: str, source_lang: str, target_lang: str) -> str:
"""
Traduit un texte d'une langue à une autre
Args:
text (str): Texte à traduire
source_lang (str): Langue source
target_lang (str): Langue cible
Returns:
str: Texte traduit
"""
if not text.strip():
return ""
# Préparer les entrées pour le logger
input_data = {
"prompt": text,
"source_language": source_lang,
"target_language": target_lang
}
# Traduire le texte
translation_agent = self.agents["translation"]
translated_text = translation_agent.generate(
prompt=text,
source_language=source_lang,
target_language=target_lang
)
# Enregistrer l'étape
self.logger.log_step(
agent_name="translation",
input_data=input_data,
output_data=translated_text,
metadata={
"source_language": source_lang,
"target_language": target_lang,
"model": translation_agent.model_name
}
)
return translated_text
def _analyze_image(self, image_data: bytes, selection_type: str, context: str) -> str:
"""
Analyse une image avec contexte à l'aide de l'agent de vision
Args:
image_data (bytes): Données de l'image
selection_type (str): Type de sélection
context (str): Contexte en anglais
Returns:
str: Analyse de l'image
"""
# Préparer les entrées pour le logger
input_data = {
"images": [image_data],
"selection_type": selection_type,
"context": context
}
# Analyser l'image
vision_agent = self.agents["vision"]
analysis = vision_agent.generate(
images=[image_data],
selection_type=selection_type,
context=context
)
# Enregistrer l'étape
self.logger.log_step(
agent_name="vision",
input_data=input_data,
output_data=analysis,
metadata={
"selection_type": selection_type,
"model": vision_agent.model_name
}
)
return analysis
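End-to-end, the manager is used as below; a sketch with assumed model names and a hypothetical image path:

```python
from utils.workflow_manager import WorkflowManager

workflow = WorkflowManager(
    vision_model="llava",            # model names are assumptions
    translation_model="mistral",
    active_agents={"vision": True, "translation": True, "summary": False},
)
with open("data/images/selection.png", "rb") as f:
    results = workflow.process_image(
        image_data=f.read(),
        content_type="schéma",
        context="Texte extrait autour de la figure",
        target_lang="fr",
    )
print(results.get("translation") or results.get("vision"))
```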

View File

@@ -0,0 +1,133 @@
#
# The Python Imaging Library
# $Id$
#
# bitmap distribution font (bdf) file parser
#
# history:
# 1996-05-16 fl created (as bdf2pil)
# 1997-08-25 fl converted to FontFile driver
# 2001-05-25 fl removed bogus __init__ call
# 2002-11-20 fl robustification (from Kevin Cazabon, Dmitry Vasiliev)
# 2003-04-22 fl more robustification (from Graham Dumpleton)
#
# Copyright (c) 1997-2003 by Secret Labs AB.
# Copyright (c) 1997-2003 by Fredrik Lundh.
#
# See the README file for information on usage and redistribution.
#
"""
Parse X Bitmap Distribution Format (BDF)
"""
from __future__ import annotations
from typing import BinaryIO
from . import FontFile, Image
bdf_slant = {
"R": "Roman",
"I": "Italic",
"O": "Oblique",
"RI": "Reverse Italic",
"RO": "Reverse Oblique",
"OT": "Other",
}
bdf_spacing = {"P": "Proportional", "M": "Monospaced", "C": "Cell"}
def bdf_char(
f: BinaryIO,
) -> (
tuple[
str,
int,
tuple[tuple[int, int], tuple[int, int, int, int], tuple[int, int, int, int]],
Image.Image,
]
| None
):
# skip to STARTCHAR
while True:
s = f.readline()
if not s:
return None
if s[:9] == b"STARTCHAR":
break
id = s[9:].strip().decode("ascii")
# load symbol properties
props = {}
while True:
s = f.readline()
if not s or s[:6] == b"BITMAP":
break
i = s.find(b" ")
props[s[:i].decode("ascii")] = s[i + 1 : -1].decode("ascii")
# load bitmap
bitmap = bytearray()
while True:
s = f.readline()
if not s or s[:7] == b"ENDCHAR":
break
bitmap += s[:-1]
# The word BBX
# followed by the width in x (BBw), height in y (BBh),
# and x and y displacement (BBxoff0, BByoff0)
# of the lower left corner from the origin of the character.
width, height, x_disp, y_disp = (int(p) for p in props["BBX"].split())
# The word DWIDTH
# followed by the width in x and y of the character in device pixels.
dwx, dwy = (int(p) for p in props["DWIDTH"].split())
bbox = (
(dwx, dwy),
(x_disp, -y_disp - height, width + x_disp, -y_disp),
(0, 0, width, height),
)
try:
im = Image.frombytes("1", (width, height), bitmap, "hex", "1")
except ValueError:
# deal with zero-width characters
im = Image.new("1", (width, height))
return id, int(props["ENCODING"]), bbox, im
class BdfFontFile(FontFile.FontFile):
"""Font file plugin for the X11 BDF format."""
def __init__(self, fp: BinaryIO) -> None:
super().__init__()
s = fp.readline()
if s[:13] != b"STARTFONT 2.1":
msg = "not a valid BDF file"
raise SyntaxError(msg)
props = {}
comments = []
while True:
s = fp.readline()
if not s or s[:13] == b"ENDPROPERTIES":
break
i = s.find(b" ")
props[s[:i].decode("ascii")] = s[i + 1 : -1].decode("ascii")
if s[:i] in [b"COMMENT", b"COPYRIGHT"]:
if s.find(b"LogicalFontDescription") < 0:
comments.append(s[i + 1 : -1].decode("ascii"))
while True:
c = bdf_char(fp)
if not c:
break
id, ch, (xy, dst, src), im = c
if 0 <= ch < len(self.glyph):
self.glyph[ch] = xy, dst, src, im
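In Pillow this parser is typically used to compile a BDF bitmap font into PIL's own font format; a sketch (font path hypothetical):

```python
from PIL import BdfFontFile

with open("courier.bdf", "rb") as fp:  # hypothetical BDF font
    font = BdfFontFile.BdfFontFile(fp)
font.save("courier")  # writes courier.pil plus the glyph bitmap
```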

View File

@@ -0,0 +1,501 @@
"""
Blizzard Mipmap Format (.blp)
Jerome Leclanche <jerome@leclan.ch>
The contents of this file are hereby released in the public domain (CC0)
Full text of the CC0 license:
https://creativecommons.org/publicdomain/zero/1.0/
BLP1 files, used mostly in Warcraft III, are not fully supported.
All types of BLP2 files used in World of Warcraft are supported.
The BLP file structure consists of a header, up to 16 mipmaps of the
texture
Texture sizes must be powers of two, though the two dimensions do
not have to be equal; 512x256 is valid, but 512x200 is not.
The first mipmap (mipmap #0) is the full size image; each subsequent
mipmap halves both dimensions. The final mipmap should be 1x1.
BLP files come in many different flavours:
* JPEG-compressed (type == 0) - only supported for BLP1.
* RAW images (type == 1, encoding == 1). Each mipmap is stored as an
array of 8-bit values, one per pixel, left to right, top to bottom.
Each value is an index to the palette.
* DXT-compressed (type == 1, encoding == 2):
- DXT1 compression is used if alpha_encoding == 0.
- An additional alpha bit is used if alpha_depth == 1.
- DXT3 compression is used if alpha_encoding == 1.
- DXT5 compression is used if alpha_encoding == 7.
"""
from __future__ import annotations
import abc
import os
import struct
from enum import IntEnum
from io import BytesIO
from typing import IO
from . import Image, ImageFile
class Format(IntEnum):
JPEG = 0
class Encoding(IntEnum):
UNCOMPRESSED = 1
DXT = 2
UNCOMPRESSED_RAW_BGRA = 3
class AlphaEncoding(IntEnum):
DXT1 = 0
DXT3 = 1
DXT5 = 7
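# Expand a packed RGB565 value to 8 bits per channel by shifting each
# field into the high bits; e.g. 0xF800 (all red bits set) -> (248, 0, 0).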
def unpack_565(i: int) -> tuple[int, int, int]:
return ((i >> 11) & 0x1F) << 3, ((i >> 5) & 0x3F) << 2, (i & 0x1F) << 3
def decode_dxt1(
data: bytes, alpha: bool = False
) -> tuple[bytearray, bytearray, bytearray, bytearray]:
"""
input: one "row" of data (i.e. will produce 4*width pixels)
"""
blocks = len(data) // 8 # number of blocks in row
ret = (bytearray(), bytearray(), bytearray(), bytearray())
for block_index in range(blocks):
# Decode next 8-byte block.
idx = block_index * 8
color0, color1, bits = struct.unpack_from("<HHI", data, idx)
r0, g0, b0 = unpack_565(color0)
r1, g1, b1 = unpack_565(color1)
# Decode this block into 4x4 pixels
# Accumulate the results onto our 4 row accumulators
for j in range(4):
for i in range(4):
# get next control op and generate a pixel
control = bits & 3
bits = bits >> 2
a = 0xFF
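                # BC1 lookup table: 0 -> color0, 1 -> color1; codes 2 and 3 are
                # 2/3-1/3 blends when color0 > color1, otherwise an average and
                # transparent black respectively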
if control == 0:
r, g, b = r0, g0, b0
elif control == 1:
r, g, b = r1, g1, b1
elif control == 2:
if color0 > color1:
r = (2 * r0 + r1) // 3
g = (2 * g0 + g1) // 3
b = (2 * b0 + b1) // 3
else:
r = (r0 + r1) // 2
g = (g0 + g1) // 2
b = (b0 + b1) // 2
elif control == 3:
if color0 > color1:
r = (2 * r1 + r0) // 3
g = (2 * g1 + g0) // 3
b = (2 * b1 + b0) // 3
else:
r, g, b, a = 0, 0, 0, 0
if alpha:
ret[j].extend([r, g, b, a])
else:
ret[j].extend([r, g, b])
return ret
def decode_dxt3(data: bytes) -> tuple[bytearray, bytearray, bytearray, bytearray]:
"""
input: one "row" of data (i.e. will produce 4*width pixels)
"""
blocks = len(data) // 16 # number of blocks in row
ret = (bytearray(), bytearray(), bytearray(), bytearray())
for block_index in range(blocks):
idx = block_index * 16
block = data[idx : idx + 16]
# Decode next 16-byte block.
bits = struct.unpack_from("<8B", block)
color0, color1 = struct.unpack_from("<HH", block, 8)
(code,) = struct.unpack_from("<I", block, 12)
r0, g0, b0 = unpack_565(color0)
r1, g1, b1 = unpack_565(color1)
for j in range(4):
high = False # Do we want the higher bits?
for i in range(4):
alphacode_index = (4 * j + i) // 2
a = bits[alphacode_index]
if high:
high = False
a >>= 4
else:
high = True
a &= 0xF
                a *= 17  # scale the 4-bit value (0-15) up to the 0-255 range
color_code = (code >> 2 * (4 * j + i)) & 0x03
if color_code == 0:
r, g, b = r0, g0, b0
elif color_code == 1:
r, g, b = r1, g1, b1
elif color_code == 2:
r = (2 * r0 + r1) // 3
g = (2 * g0 + g1) // 3
b = (2 * b0 + b1) // 3
elif color_code == 3:
r = (2 * r1 + r0) // 3
g = (2 * g1 + g0) // 3
b = (2 * b1 + b0) // 3
ret[j].extend([r, g, b, a])
return ret
def decode_dxt5(data: bytes) -> tuple[bytearray, bytearray, bytearray, bytearray]:
"""
input: one "row" of data (i.e. will produce 4 * width pixels)
"""
blocks = len(data) // 16 # number of blocks in row
ret = (bytearray(), bytearray(), bytearray(), bytearray())
for block_index in range(blocks):
idx = block_index * 16
block = data[idx : idx + 16]
# Decode next 16-byte block.
a0, a1 = struct.unpack_from("<BB", block)
bits = struct.unpack_from("<6B", block, 2)
alphacode1 = bits[2] | (bits[3] << 8) | (bits[4] << 16) | (bits[5] << 24)
alphacode2 = bits[0] | (bits[1] << 8)
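        # 16 pixels x 3-bit alpha codes = 48 bits, split across a 16-bit word
        # (alphacode2) and a 32-bit word (alphacode1); pixel 5 straddles both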
color0, color1 = struct.unpack_from("<HH", block, 8)
(code,) = struct.unpack_from("<I", block, 12)
r0, g0, b0 = unpack_565(color0)
r1, g1, b1 = unpack_565(color1)
for j in range(4):
for i in range(4):
# get next control op and generate a pixel
alphacode_index = 3 * (4 * j + i)
if alphacode_index <= 12:
alphacode = (alphacode2 >> alphacode_index) & 0x07
elif alphacode_index == 15:
alphacode = (alphacode2 >> 15) | ((alphacode1 << 1) & 0x06)
else: # alphacode_index >= 18 and alphacode_index <= 45
alphacode = (alphacode1 >> (alphacode_index - 16)) & 0x07
if alphacode == 0:
a = a0
elif alphacode == 1:
a = a1
elif a0 > a1:
a = ((8 - alphacode) * a0 + (alphacode - 1) * a1) // 7
elif alphacode == 6:
a = 0
elif alphacode == 7:
a = 255
else:
a = ((6 - alphacode) * a0 + (alphacode - 1) * a1) // 5
color_code = (code >> 2 * (4 * j + i)) & 0x03
if color_code == 0:
r, g, b = r0, g0, b0
elif color_code == 1:
r, g, b = r1, g1, b1
elif color_code == 2:
r = (2 * r0 + r1) // 3
g = (2 * g0 + g1) // 3
b = (2 * b0 + b1) // 3
elif color_code == 3:
r = (2 * r1 + r0) // 3
g = (2 * g1 + g0) // 3
b = (2 * b1 + b0) // 3
ret[j].extend([r, g, b, a])
return ret
class BLPFormatError(NotImplementedError):
pass
def _accept(prefix: bytes) -> bool:
return prefix[:4] in (b"BLP1", b"BLP2")
class BlpImageFile(ImageFile.ImageFile):
"""
Blizzard Mipmap Format
"""
format = "BLP"
format_description = "Blizzard Mipmap Format"
def _open(self) -> None:
self.magic = self.fp.read(4)
if not _accept(self.magic):
msg = f"Bad BLP magic {repr(self.magic)}"
raise BLPFormatError(msg)
compression = struct.unpack("<i", self.fp.read(4))[0]
if self.magic == b"BLP1":
alpha = struct.unpack("<I", self.fp.read(4))[0] != 0
else:
encoding = struct.unpack("<b", self.fp.read(1))[0]
alpha = struct.unpack("<b", self.fp.read(1))[0] != 0
alpha_encoding = struct.unpack("<b", self.fp.read(1))[0]
self.fp.seek(1, os.SEEK_CUR) # mips
self._size = struct.unpack("<II", self.fp.read(8))
args: tuple[int, int, bool] | tuple[int, int, bool, int]
if self.magic == b"BLP1":
encoding = struct.unpack("<i", self.fp.read(4))[0]
self.fp.seek(4, os.SEEK_CUR) # subtype
args = (compression, encoding, alpha)
offset = 28
else:
args = (compression, encoding, alpha, alpha_encoding)
offset = 20
decoder = self.magic.decode()
self._mode = "RGBA" if alpha else "RGB"
self.tile = [ImageFile._Tile(decoder, (0, 0) + self.size, offset, args)]
class _BLPBaseDecoder(ImageFile.PyDecoder):
_pulls_fd = True
def decode(self, buffer: bytes | Image.SupportsArrayInterface) -> tuple[int, int]:
try:
self._read_header()
self._load()
except struct.error as e:
msg = "Truncated BLP file"
raise OSError(msg) from e
return -1, 0
@abc.abstractmethod
def _load(self) -> None:
pass
def _read_header(self) -> None:
self._offsets = struct.unpack("<16I", self._safe_read(16 * 4))
self._lengths = struct.unpack("<16I", self._safe_read(16 * 4))
def _safe_read(self, length: int) -> bytes:
assert self.fd is not None
return ImageFile._safe_read(self.fd, length)
def _read_palette(self) -> list[tuple[int, int, int, int]]:
ret = []
for i in range(256):
try:
b, g, r, a = struct.unpack("<4B", self._safe_read(4))
except struct.error:
break
ret.append((b, g, r, a))
return ret
def _read_bgra(
self, palette: list[tuple[int, int, int, int]], alpha: bool
) -> bytearray:
data = bytearray()
_data = BytesIO(self._safe_read(self._lengths[0]))
while True:
try:
(offset,) = struct.unpack("<B", _data.read(1))
except struct.error:
break
b, g, r, a = palette[offset]
d: tuple[int, ...] = (r, g, b)
if alpha:
d += (a,)
data.extend(d)
return data
class BLP1Decoder(_BLPBaseDecoder):
def _load(self) -> None:
self._compression, self._encoding, alpha = self.args
if self._compression == Format.JPEG:
self._decode_jpeg_stream()
elif self._compression == 1:
if self._encoding in (4, 5):
palette = self._read_palette()
data = self._read_bgra(palette, alpha)
self.set_as_raw(data)
else:
msg = f"Unsupported BLP encoding {repr(self._encoding)}"
raise BLPFormatError(msg)
else:
msg = f"Unsupported BLP compression {repr(self._encoding)}"
raise BLPFormatError(msg)
def _decode_jpeg_stream(self) -> None:
from .JpegImagePlugin import JpegImageFile
(jpeg_header_size,) = struct.unpack("<I", self._safe_read(4))
jpeg_header = self._safe_read(jpeg_header_size)
assert self.fd is not None
self._safe_read(self._offsets[0] - self.fd.tell()) # What IS this?
data = self._safe_read(self._lengths[0])
data = jpeg_header + data
image = JpegImageFile(BytesIO(data))
Image._decompression_bomb_check(image.size)
if image.mode == "CMYK":
decoder_name, extents, offset, args = image.tile[0]
assert isinstance(args, tuple)
image.tile = [
ImageFile._Tile(decoder_name, extents, offset, (args[0], "CMYK"))
]
r, g, b = image.convert("RGB").split()
reversed_image = Image.merge("RGB", (b, g, r))
self.set_as_raw(reversed_image.tobytes())
class BLP2Decoder(_BLPBaseDecoder):
def _load(self) -> None:
self._compression, self._encoding, alpha, self._alpha_encoding = self.args
palette = self._read_palette()
assert self.fd is not None
self.fd.seek(self._offsets[0])
if self._compression == 1:
# Uncompressed or DirectX compression
if self._encoding == Encoding.UNCOMPRESSED:
data = self._read_bgra(palette, alpha)
elif self._encoding == Encoding.DXT:
data = bytearray()
if self._alpha_encoding == AlphaEncoding.DXT1:
linesize = (self.state.xsize + 3) // 4 * 8
for yb in range((self.state.ysize + 3) // 4):
for d in decode_dxt1(self._safe_read(linesize), alpha):
data += d
elif self._alpha_encoding == AlphaEncoding.DXT3:
linesize = (self.state.xsize + 3) // 4 * 16
for yb in range((self.state.ysize + 3) // 4):
for d in decode_dxt3(self._safe_read(linesize)):
data += d
elif self._alpha_encoding == AlphaEncoding.DXT5:
linesize = (self.state.xsize + 3) // 4 * 16
for yb in range((self.state.ysize + 3) // 4):
for d in decode_dxt5(self._safe_read(linesize)):
data += d
else:
msg = f"Unsupported alpha encoding {repr(self._alpha_encoding)}"
raise BLPFormatError(msg)
else:
msg = f"Unknown BLP encoding {repr(self._encoding)}"
raise BLPFormatError(msg)
else:
msg = f"Unknown BLP compression {repr(self._compression)}"
raise BLPFormatError(msg)
self.set_as_raw(data)
class BLPEncoder(ImageFile.PyEncoder):
_pushes_fd = True
def _write_palette(self) -> bytes:
data = b""
assert self.im is not None
palette = self.im.getpalette("RGBA", "RGBA")
for i in range(len(palette) // 4):
r, g, b, a = palette[i * 4 : (i + 1) * 4]
data += struct.pack("<4B", b, g, r, a)
while len(data) < 256 * 4:
data += b"\x00" * 4
return data
def encode(self, bufsize: int) -> tuple[int, int, bytes]:
palette_data = self._write_palette()
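        # Data layout: 20-byte BLP2 header, 16 mipmap offsets and 16 lengths
        # (4 bytes each), then the 256-entry BGRA palette; mipmap #0 follows.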
offset = 20 + 16 * 4 * 2 + len(palette_data)
data = struct.pack("<16I", offset, *((0,) * 15))
assert self.im is not None
w, h = self.im.size
data += struct.pack("<16I", w * h, *((0,) * 15))
data += palette_data
for y in range(h):
for x in range(w):
data += struct.pack("<B", self.im.getpixel((x, y)))
return len(data), 0, data
def _save(im: Image.Image, fp: IO[bytes], filename: str | bytes) -> None:
if im.mode != "P":
msg = "Unsupported BLP image mode"
raise ValueError(msg)
magic = b"BLP1" if im.encoderinfo.get("blp_version") == "BLP1" else b"BLP2"
fp.write(magic)
assert im.palette is not None
fp.write(struct.pack("<i", 1)) # Uncompressed or DirectX compression
alpha_depth = 1 if im.palette.mode == "RGBA" else 0
if magic == b"BLP1":
fp.write(struct.pack("<L", alpha_depth))
else:
fp.write(struct.pack("<b", Encoding.UNCOMPRESSED))
fp.write(struct.pack("<b", alpha_depth))
fp.write(struct.pack("<b", 0)) # alpha encoding
fp.write(struct.pack("<b", 0)) # mips
fp.write(struct.pack("<II", *im.size))
if magic == b"BLP1":
fp.write(struct.pack("<i", 5))
fp.write(struct.pack("<i", 0))
ImageFile._save(im, fp, [ImageFile._Tile("BLP", (0, 0) + im.size, 0, im.mode)])
Image.register_open(BlpImageFile.format, BlpImageFile, _accept)
Image.register_extension(BlpImageFile.format, ".blp")
Image.register_decoder("BLP1", BLP1Decoder)
Image.register_decoder("BLP2", BLP2Decoder)
Image.register_save(BlpImageFile.format, _save)
Image.register_encoder("BLP", BLPEncoder)


@@ -0,0 +1,511 @@
#
# The Python Imaging Library.
# $Id$
#
# BMP file handler
#
# Windows (and OS/2) native bitmap storage format.
#
# history:
# 1995-09-01 fl Created
# 1996-04-30 fl Added save
# 1997-08-27 fl Fixed save of 1-bit images
# 1998-03-06 fl Load P images as L where possible
# 1998-07-03 fl Load P images as 1 where possible
# 1998-12-29 fl Handle small palettes
# 2002-12-30 fl Fixed load of 1-bit palette images
# 2003-04-21 fl Fixed load of 1-bit monochrome images
# 2003-04-23 fl Added limited support for BI_BITFIELDS compression
#
# Copyright (c) 1997-2003 by Secret Labs AB
# Copyright (c) 1995-2003 by Fredrik Lundh
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
import os
from typing import IO, Any
from . import Image, ImageFile, ImagePalette
from ._binary import i16le as i16
from ._binary import i32le as i32
from ._binary import o8
from ._binary import o16le as o16
from ._binary import o32le as o32
#
# --------------------------------------------------------------------
# Read BMP file
BIT2MODE = {
# bits => mode, rawmode
1: ("P", "P;1"),
4: ("P", "P;4"),
8: ("P", "P"),
16: ("RGB", "BGR;15"),
24: ("RGB", "BGR"),
32: ("RGB", "BGRX"),
}
def _accept(prefix: bytes) -> bool:
return prefix[:2] == b"BM"
def _dib_accept(prefix: bytes) -> bool:
return i32(prefix) in [12, 40, 52, 56, 64, 108, 124]
# =============================================================================
# Image plugin for the Windows BMP format.
# =============================================================================
class BmpImageFile(ImageFile.ImageFile):
"""Image plugin for the Windows Bitmap format (BMP)"""
# ------------------------------------------------------------- Description
format_description = "Windows Bitmap"
format = "BMP"
# -------------------------------------------------- BMP Compression values
COMPRESSIONS = {"RAW": 0, "RLE8": 1, "RLE4": 2, "BITFIELDS": 3, "JPEG": 4, "PNG": 5}
for k, v in COMPRESSIONS.items():
vars()[k] = v
def _bitmap(self, header: int = 0, offset: int = 0) -> None:
"""Read relevant info about the BMP"""
read, seek = self.fp.read, self.fp.seek
if header:
seek(header)
# read bmp header size @offset 14 (this is part of the header size)
file_info: dict[str, bool | int | tuple[int, ...]] = {
"header_size": i32(read(4)),
"direction": -1,
}
# -------------------- If requested, read header at a specific position
# read the rest of the bmp header, without its size
assert isinstance(file_info["header_size"], int)
header_data = ImageFile._safe_read(self.fp, file_info["header_size"] - 4)
# ------------------------------- Windows Bitmap v2, IBM OS/2 Bitmap v1
# ----- This format has different offsets because of width/height types
# 12: BITMAPCOREHEADER/OS21XBITMAPHEADER
if file_info["header_size"] == 12:
file_info["width"] = i16(header_data, 0)
file_info["height"] = i16(header_data, 2)
file_info["planes"] = i16(header_data, 4)
file_info["bits"] = i16(header_data, 6)
file_info["compression"] = self.COMPRESSIONS["RAW"]
file_info["palette_padding"] = 3
# --------------------------------------------- Windows Bitmap v3 to v5
# 40: BITMAPINFOHEADER
# 52: BITMAPV2HEADER
# 56: BITMAPV3HEADER
# 64: BITMAPCOREHEADER2/OS22XBITMAPHEADER
# 108: BITMAPV4HEADER
# 124: BITMAPV5HEADER
elif file_info["header_size"] in (40, 52, 56, 64, 108, 124):
file_info["y_flip"] = header_data[7] == 0xFF
file_info["direction"] = 1 if file_info["y_flip"] else -1
file_info["width"] = i32(header_data, 0)
file_info["height"] = (
i32(header_data, 4)
if not file_info["y_flip"]
else 2**32 - i32(header_data, 4)
)
file_info["planes"] = i16(header_data, 8)
file_info["bits"] = i16(header_data, 10)
file_info["compression"] = i32(header_data, 12)
# byte size of pixel data
file_info["data_size"] = i32(header_data, 16)
file_info["pixels_per_meter"] = (
i32(header_data, 20),
i32(header_data, 24),
)
file_info["colors"] = i32(header_data, 28)
file_info["palette_padding"] = 4
assert isinstance(file_info["pixels_per_meter"], tuple)
self.info["dpi"] = tuple(x / 39.3701 for x in file_info["pixels_per_meter"])
if file_info["compression"] == self.COMPRESSIONS["BITFIELDS"]:
masks = ["r_mask", "g_mask", "b_mask"]
if len(header_data) >= 48:
if len(header_data) >= 52:
masks.append("a_mask")
else:
file_info["a_mask"] = 0x0
for idx, mask in enumerate(masks):
file_info[mask] = i32(header_data, 36 + idx * 4)
else:
# 40 byte headers only have the three components in the
# bitfields masks, ref:
# https://msdn.microsoft.com/en-us/library/windows/desktop/dd183376(v=vs.85).aspx
# See also
# https://github.com/python-pillow/Pillow/issues/1293
# There is a 4th component in the RGBQuad, in the alpha
# location, but it is listed as a reserved component,
# and it is not generally an alpha channel
file_info["a_mask"] = 0x0
for mask in masks:
file_info[mask] = i32(read(4))
assert isinstance(file_info["r_mask"], int)
assert isinstance(file_info["g_mask"], int)
assert isinstance(file_info["b_mask"], int)
assert isinstance(file_info["a_mask"], int)
file_info["rgb_mask"] = (
file_info["r_mask"],
file_info["g_mask"],
file_info["b_mask"],
)
file_info["rgba_mask"] = (
file_info["r_mask"],
file_info["g_mask"],
file_info["b_mask"],
file_info["a_mask"],
)
else:
msg = f"Unsupported BMP header type ({file_info['header_size']})"
raise OSError(msg)
# ------------------ Special case : header is reported 40, which
# ---------------------- is shorter than real size for bpp >= 16
assert isinstance(file_info["width"], int)
assert isinstance(file_info["height"], int)
self._size = file_info["width"], file_info["height"]
# ------- If color count was not found in the header, compute from bits
assert isinstance(file_info["bits"], int)
file_info["colors"] = (
file_info["colors"]
if file_info.get("colors", 0)
else (1 << file_info["bits"])
)
assert isinstance(file_info["colors"], int)
if offset == 14 + file_info["header_size"] and file_info["bits"] <= 8:
offset += 4 * file_info["colors"]
# ---------------------- Check bit depth for unusual unsupported values
self._mode, raw_mode = BIT2MODE.get(file_info["bits"], ("", ""))
if not self.mode:
msg = f"Unsupported BMP pixel depth ({file_info['bits']})"
raise OSError(msg)
# ---------------- Process BMP with Bitfields compression (not palette)
decoder_name = "raw"
if file_info["compression"] == self.COMPRESSIONS["BITFIELDS"]:
SUPPORTED: dict[int, list[tuple[int, ...]]] = {
32: [
(0xFF0000, 0xFF00, 0xFF, 0x0),
(0xFF000000, 0xFF0000, 0xFF00, 0x0),
(0xFF000000, 0xFF00, 0xFF, 0x0),
(0xFF000000, 0xFF0000, 0xFF00, 0xFF),
(0xFF, 0xFF00, 0xFF0000, 0xFF000000),
(0xFF0000, 0xFF00, 0xFF, 0xFF000000),
(0xFF000000, 0xFF00, 0xFF, 0xFF0000),
(0x0, 0x0, 0x0, 0x0),
],
24: [(0xFF0000, 0xFF00, 0xFF)],
16: [(0xF800, 0x7E0, 0x1F), (0x7C00, 0x3E0, 0x1F)],
}
MASK_MODES = {
(32, (0xFF0000, 0xFF00, 0xFF, 0x0)): "BGRX",
(32, (0xFF000000, 0xFF0000, 0xFF00, 0x0)): "XBGR",
(32, (0xFF000000, 0xFF00, 0xFF, 0x0)): "BGXR",
(32, (0xFF000000, 0xFF0000, 0xFF00, 0xFF)): "ABGR",
(32, (0xFF, 0xFF00, 0xFF0000, 0xFF000000)): "RGBA",
(32, (0xFF0000, 0xFF00, 0xFF, 0xFF000000)): "BGRA",
(32, (0xFF000000, 0xFF00, 0xFF, 0xFF0000)): "BGAR",
(32, (0x0, 0x0, 0x0, 0x0)): "BGRA",
(24, (0xFF0000, 0xFF00, 0xFF)): "BGR",
(16, (0xF800, 0x7E0, 0x1F)): "BGR;16",
(16, (0x7C00, 0x3E0, 0x1F)): "BGR;15",
}
if file_info["bits"] in SUPPORTED:
if (
file_info["bits"] == 32
and file_info["rgba_mask"] in SUPPORTED[file_info["bits"]]
):
assert isinstance(file_info["rgba_mask"], tuple)
raw_mode = MASK_MODES[(file_info["bits"], file_info["rgba_mask"])]
self._mode = "RGBA" if "A" in raw_mode else self.mode
elif (
file_info["bits"] in (24, 16)
and file_info["rgb_mask"] in SUPPORTED[file_info["bits"]]
):
assert isinstance(file_info["rgb_mask"], tuple)
raw_mode = MASK_MODES[(file_info["bits"], file_info["rgb_mask"])]
else:
msg = "Unsupported BMP bitfields layout"
raise OSError(msg)
else:
msg = "Unsupported BMP bitfields layout"
raise OSError(msg)
elif file_info["compression"] == self.COMPRESSIONS["RAW"]:
if file_info["bits"] == 32 and header == 22: # 32-bit .cur offset
raw_mode, self._mode = "BGRA", "RGBA"
elif file_info["compression"] in (
self.COMPRESSIONS["RLE8"],
self.COMPRESSIONS["RLE4"],
):
decoder_name = "bmp_rle"
else:
msg = f"Unsupported BMP compression ({file_info['compression']})"
raise OSError(msg)
# --------------- Once the header is processed, process the palette/LUT
if self.mode == "P": # Paletted for 1, 4 and 8 bit images
# ---------------------------------------------------- 1-bit images
if not (0 < file_info["colors"] <= 65536):
msg = f"Unsupported BMP Palette size ({file_info['colors']})"
raise OSError(msg)
else:
assert isinstance(file_info["palette_padding"], int)
padding = file_info["palette_padding"]
palette = read(padding * file_info["colors"])
grayscale = True
indices = (
(0, 255)
if file_info["colors"] == 2
else list(range(file_info["colors"]))
)
# ----------------- Check if grayscale and ignore palette if so
for ind, val in enumerate(indices):
rgb = palette[ind * padding : ind * padding + 3]
if rgb != o8(val) * 3:
grayscale = False
# ------- If all colors are gray, white or black, ditch palette
if grayscale:
self._mode = "1" if file_info["colors"] == 2 else "L"
raw_mode = self.mode
else:
self._mode = "P"
self.palette = ImagePalette.raw(
"BGRX" if padding == 4 else "BGR", palette
)
# ---------------------------- Finally set the tile data for the plugin
self.info["compression"] = file_info["compression"]
args: list[Any] = [raw_mode]
if decoder_name == "bmp_rle":
args.append(file_info["compression"] == self.COMPRESSIONS["RLE4"])
else:
assert isinstance(file_info["width"], int)
args.append(((file_info["width"] * file_info["bits"] + 31) >> 3) & (~3))
args.append(file_info["direction"])
self.tile = [
ImageFile._Tile(
decoder_name,
(0, 0, file_info["width"], file_info["height"]),
offset or self.fp.tell(),
tuple(args),
)
]
def _open(self) -> None:
"""Open file, check magic number and read header"""
# read 14 bytes: magic number, filesize, reserved, header final offset
head_data = self.fp.read(14)
# choke if the file does not have the required magic bytes
if not _accept(head_data):
msg = "Not a BMP file"
raise SyntaxError(msg)
# read the start position of the BMP image data (u32)
offset = i32(head_data, 10)
# load bitmap information (offset=raster info)
self._bitmap(offset=offset)
class BmpRleDecoder(ImageFile.PyDecoder):
_pulls_fd = True
def decode(self, buffer: bytes | Image.SupportsArrayInterface) -> tuple[int, int]:
assert self.fd is not None
rle4 = self.args[1]
data = bytearray()
x = 0
dest_length = self.state.xsize * self.state.ysize
while len(data) < dest_length:
pixels = self.fd.read(1)
byte = self.fd.read(1)
if not pixels or not byte:
break
num_pixels = pixels[0]
if num_pixels:
# encoded mode
if x + num_pixels > self.state.xsize:
# Too much data for row
num_pixels = max(0, self.state.xsize - x)
if rle4:
first_pixel = o8(byte[0] >> 4)
second_pixel = o8(byte[0] & 0x0F)
for index in range(num_pixels):
if index % 2 == 0:
data += first_pixel
else:
data += second_pixel
else:
data += byte * num_pixels
x += num_pixels
else:
if byte[0] == 0:
# end of line
while len(data) % self.state.xsize != 0:
data += b"\x00"
x = 0
elif byte[0] == 1:
# end of bitmap
break
elif byte[0] == 2:
# delta
bytes_read = self.fd.read(2)
if len(bytes_read) < 2:
break
                    right, up = bytes_read
data += b"\x00" * (right + up * self.state.xsize)
x = len(data) % self.state.xsize
else:
# absolute mode
if rle4:
# 2 pixels per byte
byte_count = byte[0] // 2
bytes_read = self.fd.read(byte_count)
for byte_read in bytes_read:
data += o8(byte_read >> 4)
data += o8(byte_read & 0x0F)
else:
byte_count = byte[0]
bytes_read = self.fd.read(byte_count)
data += bytes_read
if len(bytes_read) < byte_count:
break
x += byte[0]
# align to 16-bit word boundary
if self.fd.tell() % 2 != 0:
self.fd.seek(1, os.SEEK_CUR)
rawmode = "L" if self.mode == "L" else "P"
self.set_as_raw(bytes(data), rawmode, (0, self.args[-1]))
return -1, 0
# =============================================================================
# Image plugin for the DIB format (BMP alias)
# =============================================================================
class DibImageFile(BmpImageFile):
format = "DIB"
format_description = "Windows Bitmap"
def _open(self) -> None:
self._bitmap()
#
# --------------------------------------------------------------------
# Write BMP file
SAVE = {
"1": ("1", 1, 2),
"L": ("L", 8, 256),
"P": ("P", 8, 256),
"RGB": ("BGR", 24, 0),
"RGBA": ("BGRA", 32, 0),
}
def _dib_save(im: Image.Image, fp: IO[bytes], filename: str | bytes) -> None:
_save(im, fp, filename, False)
def _save(
im: Image.Image, fp: IO[bytes], filename: str | bytes, bitmap_header: bool = True
) -> None:
try:
rawmode, bits, colors = SAVE[im.mode]
except KeyError as e:
msg = f"cannot write mode {im.mode} as BMP"
raise OSError(msg) from e
info = im.encoderinfo
dpi = info.get("dpi", (96, 96))
# 1 meter == 39.3701 inches
ppm = tuple(int(x * 39.3701 + 0.5) for x in dpi)
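    # pad each row up to a multiple of 4 bytes,
    # e.g. a 3-pixel-wide RGB image: 9 bytes of pixel data -> 12-byte stride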
stride = ((im.size[0] * bits + 7) // 8 + 3) & (~3)
header = 40 # or 64 for OS/2 version 2
image = stride * im.size[1]
if im.mode == "1":
palette = b"".join(o8(i) * 4 for i in (0, 255))
elif im.mode == "L":
palette = b"".join(o8(i) * 4 for i in range(256))
elif im.mode == "P":
palette = im.im.getpalette("RGB", "BGRX")
colors = len(palette) // 4
else:
palette = None
# bitmap header
if bitmap_header:
offset = 14 + header + colors * 4
file_size = offset + image
if file_size > 2**32 - 1:
msg = "File size is too large for the BMP format"
raise ValueError(msg)
fp.write(
b"BM" # file type (magic)
+ o32(file_size) # file size
+ o32(0) # reserved
+ o32(offset) # image data offset
)
# bitmap info header
fp.write(
o32(header) # info header size
+ o32(im.size[0]) # width
+ o32(im.size[1]) # height
+ o16(1) # planes
+ o16(bits) # depth
+ o32(0) # compression (0=uncompressed)
+ o32(image) # size of bitmap
+ o32(ppm[0]) # resolution
+ o32(ppm[1]) # resolution
+ o32(colors) # colors used
+ o32(colors) # colors important
)
fp.write(b"\0" * (header - 40)) # padding (for OS/2 format)
if palette:
fp.write(palette)
ImageFile._save(
im, fp, [ImageFile._Tile("raw", (0, 0) + im.size, 0, (rawmode, stride, -1))]
)
#
# --------------------------------------------------------------------
# Registry
Image.register_open(BmpImageFile.format, BmpImageFile, _accept)
Image.register_save(BmpImageFile.format, _save)
Image.register_extension(BmpImageFile.format, ".bmp")
Image.register_mime(BmpImageFile.format, "image/bmp")
Image.register_decoder("bmp_rle", BmpRleDecoder)
Image.register_open(DibImageFile.format, DibImageFile, _dib_accept)
Image.register_save(DibImageFile.format, _dib_save)
Image.register_extension(DibImageFile.format, ".dib")
Image.register_mime(DibImageFile.format, "image/bmp")


@@ -0,0 +1,76 @@
#
# The Python Imaging Library
# $Id$
#
# BUFR stub adapter
#
# Copyright (c) 1996-2003 by Fredrik Lundh
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
from typing import IO
from . import Image, ImageFile
_handler = None
def register_handler(handler: ImageFile.StubHandler | None) -> None:
"""
Install application-specific BUFR image handler.
:param handler: Handler object.
"""
global _handler
_handler = handler
# --------------------------------------------------------------------
# Image adapter
def _accept(prefix: bytes) -> bool:
return prefix[:4] == b"BUFR" or prefix[:4] == b"ZCZC"
class BufrStubImageFile(ImageFile.StubImageFile):
format = "BUFR"
format_description = "BUFR"
def _open(self) -> None:
offset = self.fp.tell()
if not _accept(self.fp.read(4)):
msg = "Not a BUFR file"
raise SyntaxError(msg)
self.fp.seek(offset)
# make something up
self._mode = "F"
self._size = 1, 1
loader = self._load()
if loader:
loader.open(self)
def _load(self) -> ImageFile.StubHandler | None:
return _handler
def _save(im: Image.Image, fp: IO[bytes], filename: str | bytes) -> None:
if _handler is None or not hasattr(_handler, "save"):
msg = "BUFR save handler not installed"
raise OSError(msg)
_handler.save(im, fp, filename)
# --------------------------------------------------------------------
# Registry
Image.register_open(BufrStubImageFile.format, BufrStubImageFile, _accept)
Image.register_save(BufrStubImageFile.format, _save)
Image.register_extension(BufrStubImageFile.format, ".bufr")


@@ -0,0 +1,173 @@
#
# The Python Imaging Library.
# $Id$
#
# a class to read from a container file
#
# History:
# 1995-06-18 fl Created
# 1995-09-07 fl Added readline(), readlines()
#
# Copyright (c) 1997-2001 by Secret Labs AB
# Copyright (c) 1995 by Fredrik Lundh
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
import io
from collections.abc import Iterable
from typing import IO, AnyStr, NoReturn
class ContainerIO(IO[AnyStr]):
"""
A file object that provides read access to a part of an existing
file (for example a TAR file).
"""
def __init__(self, file: IO[AnyStr], offset: int, length: int) -> None:
"""
Create file object.
:param file: Existing file.
:param offset: Start of region, in bytes.
:param length: Size of region, in bytes.
"""
self.fh: IO[AnyStr] = file
self.pos = 0
self.offset = offset
self.length = length
self.fh.seek(offset)
##
# Always false.
def isatty(self) -> bool:
return False
def seekable(self) -> bool:
return True
def seek(self, offset: int, mode: int = io.SEEK_SET) -> int:
"""
Move file pointer.
:param offset: Offset in bytes.
:param mode: Starting position. Use 0 for beginning of region, 1
for current offset, and 2 for end of region. You cannot move
the pointer outside the defined region.
:returns: Offset from start of region, in bytes.
"""
if mode == 1:
self.pos = self.pos + offset
elif mode == 2:
self.pos = self.length + offset
else:
self.pos = offset
# clamp
self.pos = max(0, min(self.pos, self.length))
self.fh.seek(self.offset + self.pos)
return self.pos
def tell(self) -> int:
"""
Get current file pointer.
:returns: Offset from start of region, in bytes.
"""
return self.pos
def readable(self) -> bool:
return True
def read(self, n: int = -1) -> AnyStr:
"""
Read data.
:param n: Number of bytes to read. If omitted, zero or negative,
read until end of region.
:returns: An 8-bit string.
"""
if n > 0:
n = min(n, self.length - self.pos)
else:
n = self.length - self.pos
if n <= 0: # EOF
return b"" if "b" in self.fh.mode else "" # type: ignore[return-value]
self.pos = self.pos + n
return self.fh.read(n)
def readline(self, n: int = -1) -> AnyStr:
"""
Read a line of text.
:param n: Number of bytes to read. If omitted, zero or negative,
read until end of line.
:returns: An 8-bit string.
"""
s: AnyStr = b"" if "b" in self.fh.mode else "" # type: ignore[assignment]
newline_character = b"\n" if "b" in self.fh.mode else "\n"
while True:
c = self.read(1)
if not c:
break
s = s + c
if c == newline_character or len(s) == n:
break
return s
def readlines(self, n: int | None = -1) -> list[AnyStr]:
"""
Read multiple lines of text.
:param n: Number of lines to read. If omitted, zero, negative or None,
read until end of region.
:returns: A list of 8-bit strings.
"""
lines = []
while True:
s = self.readline()
if not s:
break
lines.append(s)
if len(lines) == n:
break
return lines
def writable(self) -> bool:
return False
def write(self, b: AnyStr) -> NoReturn:
raise NotImplementedError()
def writelines(self, lines: Iterable[AnyStr]) -> NoReturn:
raise NotImplementedError()
def truncate(self, size: int | None = None) -> int:
raise NotImplementedError()
def __enter__(self) -> ContainerIO[AnyStr]:
return self
def __exit__(self, *args: object) -> None:
self.close()
def __iter__(self) -> ContainerIO[AnyStr]:
return self
def __next__(self) -> AnyStr:
line = self.readline()
if not line:
msg = "end of region"
raise StopIteration(msg)
return line
def fileno(self) -> int:
return self.fh.fileno()
def flush(self) -> None:
self.fh.flush()
def close(self) -> None:
self.fh.close()


@@ -0,0 +1,75 @@
#
# The Python Imaging Library.
# $Id$
#
# Windows Cursor support for PIL
#
# notes:
# uses BmpImagePlugin.py to read the bitmap data.
#
# history:
# 96-05-27 fl Created
#
# Copyright (c) Secret Labs AB 1997.
# Copyright (c) Fredrik Lundh 1996.
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
from . import BmpImagePlugin, Image, ImageFile
from ._binary import i16le as i16
from ._binary import i32le as i32
#
# --------------------------------------------------------------------
def _accept(prefix: bytes) -> bool:
return prefix[:4] == b"\0\0\2\0"
##
# Image plugin for Windows Cursor files.
class CurImageFile(BmpImagePlugin.BmpImageFile):
format = "CUR"
format_description = "Windows Cursor"
def _open(self) -> None:
offset = self.fp.tell()
# check magic
s = self.fp.read(6)
if not _accept(s):
msg = "not a CUR file"
raise SyntaxError(msg)
# pick the largest cursor in the file
m = b""
for i in range(i16(s, 4)):
s = self.fp.read(16)
if not m:
m = s
elif s[0] > m[0] and s[1] > m[1]:
m = s
if not m:
msg = "No cursors were found"
raise TypeError(msg)
# load as bitmap
self._bitmap(i32(m, 12) + offset)
# patch up the bitmap height
self._size = self.size[0], self.size[1] // 2
d, e, o, a = self.tile[0]
self.tile[0] = ImageFile._Tile(d, (0, 0) + self.size, o, a)
#
# --------------------------------------------------------------------
Image.register_open(CurImageFile.format, CurImageFile, _accept)
Image.register_extension(CurImageFile.format, ".cur")


@@ -0,0 +1,80 @@
#
# The Python Imaging Library.
# $Id$
#
# DCX file handling
#
# DCX is a container file format defined by Intel, commonly used
# for fax applications. Each DCX file consists of a directory
# (a list of file offsets) followed by a set of (usually 1-bit)
# PCX files.
#
# History:
# 1995-09-09 fl Created
# 1996-03-20 fl Properly derived from PcxImageFile.
# 1998-07-15 fl Renamed offset attribute to avoid name clash
# 2002-07-30 fl Fixed file handling
#
# Copyright (c) 1997-98 by Secret Labs AB.
# Copyright (c) 1995-96 by Fredrik Lundh.
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
from . import Image
from ._binary import i32le as i32
from .PcxImagePlugin import PcxImageFile
MAGIC = 0x3ADE68B1 # QUIZ: what's this value, then?
def _accept(prefix: bytes) -> bool:
return len(prefix) >= 4 and i32(prefix) == MAGIC
##
# Image plugin for the Intel DCX format.
class DcxImageFile(PcxImageFile):
format = "DCX"
format_description = "Intel DCX"
_close_exclusive_fp_after_loading = False
def _open(self) -> None:
# Header
s = self.fp.read(4)
if not _accept(s):
msg = "not a DCX file"
raise SyntaxError(msg)
# Component directory
self._offset = []
for i in range(1024):
offset = i32(self.fp.read(4))
if not offset:
break
self._offset.append(offset)
self._fp = self.fp
self.frame = -1
self.n_frames = len(self._offset)
self.is_animated = self.n_frames > 1
self.seek(0)
def seek(self, frame: int) -> None:
if not self._seek_check(frame):
return
self.frame = frame
self.fp = self._fp
self.fp.seek(self._offset[frame])
PcxImageFile._open(self)
def tell(self) -> int:
return self.frame
Image.register_open(DcxImageFile.format, DcxImageFile, _accept)
Image.register_extension(DcxImageFile.format, ".dcx")


@@ -0,0 +1,573 @@
"""
A Pillow loader for .dds files (S3TC-compressed aka DXTC)
Jerome Leclanche <jerome@leclan.ch>
Documentation:
https://web.archive.org/web/20170802060935/http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compression_s3tc.txt
The contents of this file are hereby released in the public domain (CC0)
Full text of the CC0 license:
https://creativecommons.org/publicdomain/zero/1.0/
"""
from __future__ import annotations
import io
import struct
import sys
from enum import IntEnum, IntFlag
from typing import IO
from . import Image, ImageFile, ImagePalette
from ._binary import i32le as i32
from ._binary import o8
from ._binary import o32le as o32
# Magic ("DDS ")
DDS_MAGIC = 0x20534444
# DDS flags
class DDSD(IntFlag):
CAPS = 0x1
HEIGHT = 0x2
WIDTH = 0x4
PITCH = 0x8
PIXELFORMAT = 0x1000
MIPMAPCOUNT = 0x20000
LINEARSIZE = 0x80000
DEPTH = 0x800000
# DDS caps
class DDSCAPS(IntFlag):
COMPLEX = 0x8
TEXTURE = 0x1000
MIPMAP = 0x400000
class DDSCAPS2(IntFlag):
CUBEMAP = 0x200
CUBEMAP_POSITIVEX = 0x400
CUBEMAP_NEGATIVEX = 0x800
CUBEMAP_POSITIVEY = 0x1000
CUBEMAP_NEGATIVEY = 0x2000
CUBEMAP_POSITIVEZ = 0x4000
CUBEMAP_NEGATIVEZ = 0x8000
VOLUME = 0x200000
# Pixel Format
class DDPF(IntFlag):
ALPHAPIXELS = 0x1
ALPHA = 0x2
FOURCC = 0x4
PALETTEINDEXED8 = 0x20
RGB = 0x40
LUMINANCE = 0x20000
# dxgiformat.h
class DXGI_FORMAT(IntEnum):
UNKNOWN = 0
R32G32B32A32_TYPELESS = 1
R32G32B32A32_FLOAT = 2
R32G32B32A32_UINT = 3
R32G32B32A32_SINT = 4
R32G32B32_TYPELESS = 5
R32G32B32_FLOAT = 6
R32G32B32_UINT = 7
R32G32B32_SINT = 8
R16G16B16A16_TYPELESS = 9
R16G16B16A16_FLOAT = 10
R16G16B16A16_UNORM = 11
R16G16B16A16_UINT = 12
R16G16B16A16_SNORM = 13
R16G16B16A16_SINT = 14
R32G32_TYPELESS = 15
R32G32_FLOAT = 16
R32G32_UINT = 17
R32G32_SINT = 18
R32G8X24_TYPELESS = 19
D32_FLOAT_S8X24_UINT = 20
R32_FLOAT_X8X24_TYPELESS = 21
X32_TYPELESS_G8X24_UINT = 22
R10G10B10A2_TYPELESS = 23
R10G10B10A2_UNORM = 24
R10G10B10A2_UINT = 25
R11G11B10_FLOAT = 26
R8G8B8A8_TYPELESS = 27
R8G8B8A8_UNORM = 28
R8G8B8A8_UNORM_SRGB = 29
R8G8B8A8_UINT = 30
R8G8B8A8_SNORM = 31
R8G8B8A8_SINT = 32
R16G16_TYPELESS = 33
R16G16_FLOAT = 34
R16G16_UNORM = 35
R16G16_UINT = 36
R16G16_SNORM = 37
R16G16_SINT = 38
R32_TYPELESS = 39
D32_FLOAT = 40
R32_FLOAT = 41
R32_UINT = 42
R32_SINT = 43
R24G8_TYPELESS = 44
D24_UNORM_S8_UINT = 45
R24_UNORM_X8_TYPELESS = 46
X24_TYPELESS_G8_UINT = 47
R8G8_TYPELESS = 48
R8G8_UNORM = 49
R8G8_UINT = 50
R8G8_SNORM = 51
R8G8_SINT = 52
R16_TYPELESS = 53
R16_FLOAT = 54
D16_UNORM = 55
R16_UNORM = 56
R16_UINT = 57
R16_SNORM = 58
R16_SINT = 59
R8_TYPELESS = 60
R8_UNORM = 61
R8_UINT = 62
R8_SNORM = 63
R8_SINT = 64
A8_UNORM = 65
R1_UNORM = 66
R9G9B9E5_SHAREDEXP = 67
R8G8_B8G8_UNORM = 68
G8R8_G8B8_UNORM = 69
BC1_TYPELESS = 70
BC1_UNORM = 71
BC1_UNORM_SRGB = 72
BC2_TYPELESS = 73
BC2_UNORM = 74
BC2_UNORM_SRGB = 75
BC3_TYPELESS = 76
BC3_UNORM = 77
BC3_UNORM_SRGB = 78
BC4_TYPELESS = 79
BC4_UNORM = 80
BC4_SNORM = 81
BC5_TYPELESS = 82
BC5_UNORM = 83
BC5_SNORM = 84
B5G6R5_UNORM = 85
B5G5R5A1_UNORM = 86
B8G8R8A8_UNORM = 87
B8G8R8X8_UNORM = 88
R10G10B10_XR_BIAS_A2_UNORM = 89
B8G8R8A8_TYPELESS = 90
B8G8R8A8_UNORM_SRGB = 91
B8G8R8X8_TYPELESS = 92
B8G8R8X8_UNORM_SRGB = 93
BC6H_TYPELESS = 94
BC6H_UF16 = 95
BC6H_SF16 = 96
BC7_TYPELESS = 97
BC7_UNORM = 98
BC7_UNORM_SRGB = 99
AYUV = 100
Y410 = 101
Y416 = 102
NV12 = 103
P010 = 104
P016 = 105
OPAQUE_420 = 106
YUY2 = 107
Y210 = 108
Y216 = 109
NV11 = 110
AI44 = 111
IA44 = 112
P8 = 113
A8P8 = 114
B4G4R4A4_UNORM = 115
P208 = 130
V208 = 131
V408 = 132
SAMPLER_FEEDBACK_MIN_MIP_OPAQUE = 189
SAMPLER_FEEDBACK_MIP_REGION_USED_OPAQUE = 190
class D3DFMT(IntEnum):
UNKNOWN = 0
R8G8B8 = 20
A8R8G8B8 = 21
X8R8G8B8 = 22
R5G6B5 = 23
X1R5G5B5 = 24
A1R5G5B5 = 25
A4R4G4B4 = 26
R3G3B2 = 27
A8 = 28
A8R3G3B2 = 29
X4R4G4B4 = 30
A2B10G10R10 = 31
A8B8G8R8 = 32
X8B8G8R8 = 33
G16R16 = 34
A2R10G10B10 = 35
A16B16G16R16 = 36
A8P8 = 40
P8 = 41
L8 = 50
A8L8 = 51
A4L4 = 52
V8U8 = 60
L6V5U5 = 61
X8L8V8U8 = 62
Q8W8V8U8 = 63
V16U16 = 64
A2W10V10U10 = 67
D16_LOCKABLE = 70
D32 = 71
D15S1 = 73
D24S8 = 75
D24X8 = 77
D24X4S4 = 79
D16 = 80
D32F_LOCKABLE = 82
D24FS8 = 83
D32_LOCKABLE = 84
S8_LOCKABLE = 85
L16 = 81
VERTEXDATA = 100
INDEX16 = 101
INDEX32 = 102
Q16W16V16U16 = 110
R16F = 111
G16R16F = 112
A16B16G16R16F = 113
R32F = 114
G32R32F = 115
A32B32G32R32F = 116
CxV8U8 = 117
A1 = 118
A2B10G10R10_XR_BIAS = 119
BINARYBUFFER = 199
UYVY = i32(b"UYVY")
R8G8_B8G8 = i32(b"RGBG")
YUY2 = i32(b"YUY2")
G8R8_G8B8 = i32(b"GRGB")
DXT1 = i32(b"DXT1")
DXT2 = i32(b"DXT2")
DXT3 = i32(b"DXT3")
DXT4 = i32(b"DXT4")
DXT5 = i32(b"DXT5")
DX10 = i32(b"DX10")
BC4S = i32(b"BC4S")
BC4U = i32(b"BC4U")
BC5S = i32(b"BC5S")
BC5U = i32(b"BC5U")
ATI1 = i32(b"ATI1")
ATI2 = i32(b"ATI2")
MULTI2_ARGB8 = i32(b"MET1")
# Backward compatibility layer
module = sys.modules[__name__]
for item in DDSD:
assert item.name is not None
setattr(module, f"DDSD_{item.name}", item.value)
for item1 in DDSCAPS:
assert item1.name is not None
setattr(module, f"DDSCAPS_{item1.name}", item1.value)
for item2 in DDSCAPS2:
assert item2.name is not None
setattr(module, f"DDSCAPS2_{item2.name}", item2.value)
for item3 in DDPF:
assert item3.name is not None
setattr(module, f"DDPF_{item3.name}", item3.value)
DDS_FOURCC = DDPF.FOURCC
DDS_RGB = DDPF.RGB
DDS_RGBA = DDPF.RGB | DDPF.ALPHAPIXELS
DDS_LUMINANCE = DDPF.LUMINANCE
DDS_LUMINANCEA = DDPF.LUMINANCE | DDPF.ALPHAPIXELS
DDS_ALPHA = DDPF.ALPHA
DDS_PAL8 = DDPF.PALETTEINDEXED8
DDS_HEADER_FLAGS_TEXTURE = DDSD.CAPS | DDSD.HEIGHT | DDSD.WIDTH | DDSD.PIXELFORMAT
DDS_HEADER_FLAGS_MIPMAP = DDSD.MIPMAPCOUNT
DDS_HEADER_FLAGS_VOLUME = DDSD.DEPTH
DDS_HEADER_FLAGS_PITCH = DDSD.PITCH
DDS_HEADER_FLAGS_LINEARSIZE = DDSD.LINEARSIZE
DDS_HEIGHT = DDSD.HEIGHT
DDS_WIDTH = DDSD.WIDTH
DDS_SURFACE_FLAGS_TEXTURE = DDSCAPS.TEXTURE
DDS_SURFACE_FLAGS_MIPMAP = DDSCAPS.COMPLEX | DDSCAPS.MIPMAP
DDS_SURFACE_FLAGS_CUBEMAP = DDSCAPS.COMPLEX
DDS_CUBEMAP_POSITIVEX = DDSCAPS2.CUBEMAP | DDSCAPS2.CUBEMAP_POSITIVEX
DDS_CUBEMAP_NEGATIVEX = DDSCAPS2.CUBEMAP | DDSCAPS2.CUBEMAP_NEGATIVEX
DDS_CUBEMAP_POSITIVEY = DDSCAPS2.CUBEMAP | DDSCAPS2.CUBEMAP_POSITIVEY
DDS_CUBEMAP_NEGATIVEY = DDSCAPS2.CUBEMAP | DDSCAPS2.CUBEMAP_NEGATIVEY
DDS_CUBEMAP_POSITIVEZ = DDSCAPS2.CUBEMAP | DDSCAPS2.CUBEMAP_POSITIVEZ
DDS_CUBEMAP_NEGATIVEZ = DDSCAPS2.CUBEMAP | DDSCAPS2.CUBEMAP_NEGATIVEZ
DXT1_FOURCC = D3DFMT.DXT1
DXT3_FOURCC = D3DFMT.DXT3
DXT5_FOURCC = D3DFMT.DXT5
DXGI_FORMAT_R8G8B8A8_TYPELESS = DXGI_FORMAT.R8G8B8A8_TYPELESS
DXGI_FORMAT_R8G8B8A8_UNORM = DXGI_FORMAT.R8G8B8A8_UNORM
DXGI_FORMAT_R8G8B8A8_UNORM_SRGB = DXGI_FORMAT.R8G8B8A8_UNORM_SRGB
DXGI_FORMAT_BC5_TYPELESS = DXGI_FORMAT.BC5_TYPELESS
DXGI_FORMAT_BC5_UNORM = DXGI_FORMAT.BC5_UNORM
DXGI_FORMAT_BC5_SNORM = DXGI_FORMAT.BC5_SNORM
DXGI_FORMAT_BC6H_UF16 = DXGI_FORMAT.BC6H_UF16
DXGI_FORMAT_BC6H_SF16 = DXGI_FORMAT.BC6H_SF16
DXGI_FORMAT_BC7_TYPELESS = DXGI_FORMAT.BC7_TYPELESS
DXGI_FORMAT_BC7_UNORM = DXGI_FORMAT.BC7_UNORM
DXGI_FORMAT_BC7_UNORM_SRGB = DXGI_FORMAT.BC7_UNORM_SRGB
class DdsImageFile(ImageFile.ImageFile):
format = "DDS"
format_description = "DirectDraw Surface"
def _open(self) -> None:
if not _accept(self.fp.read(4)):
msg = "not a DDS file"
raise SyntaxError(msg)
(header_size,) = struct.unpack("<I", self.fp.read(4))
if header_size != 124:
msg = f"Unsupported header size {repr(header_size)}"
raise OSError(msg)
header_bytes = self.fp.read(header_size - 4)
if len(header_bytes) != 120:
msg = f"Incomplete header: {len(header_bytes)} bytes"
raise OSError(msg)
header = io.BytesIO(header_bytes)
flags, height, width = struct.unpack("<3I", header.read(12))
self._size = (width, height)
extents = (0, 0) + self.size
pitch, depth, mipmaps = struct.unpack("<3I", header.read(12))
struct.unpack("<11I", header.read(44)) # reserved
# pixel format
pfsize, pfflags, fourcc, bitcount = struct.unpack("<4I", header.read(16))
n = 0
rawmode = None
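        # n selects the BCn block-compression variant handed to the "bcn"
        # decoder below; 0 means the pixel data is not block-compressed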
if pfflags & DDPF.RGB:
# Texture contains uncompressed RGB data
if pfflags & DDPF.ALPHAPIXELS:
self._mode = "RGBA"
mask_count = 4
else:
self._mode = "RGB"
mask_count = 3
masks = struct.unpack(f"<{mask_count}I", header.read(mask_count * 4))
self.tile = [ImageFile._Tile("dds_rgb", extents, 0, (bitcount, masks))]
return
elif pfflags & DDPF.LUMINANCE:
if bitcount == 8:
self._mode = "L"
elif bitcount == 16 and pfflags & DDPF.ALPHAPIXELS:
self._mode = "LA"
else:
msg = f"Unsupported bitcount {bitcount} for {pfflags}"
raise OSError(msg)
elif pfflags & DDPF.PALETTEINDEXED8:
self._mode = "P"
self.palette = ImagePalette.raw("RGBA", self.fp.read(1024))
self.palette.mode = "RGBA"
elif pfflags & DDPF.FOURCC:
offset = header_size + 4
if fourcc == D3DFMT.DXT1:
self._mode = "RGBA"
self.pixel_format = "DXT1"
n = 1
elif fourcc == D3DFMT.DXT3:
self._mode = "RGBA"
self.pixel_format = "DXT3"
n = 2
elif fourcc == D3DFMT.DXT5:
self._mode = "RGBA"
self.pixel_format = "DXT5"
n = 3
elif fourcc in (D3DFMT.BC4U, D3DFMT.ATI1):
self._mode = "L"
self.pixel_format = "BC4"
n = 4
elif fourcc == D3DFMT.BC5S:
self._mode = "RGB"
self.pixel_format = "BC5S"
n = 5
elif fourcc in (D3DFMT.BC5U, D3DFMT.ATI2):
self._mode = "RGB"
self.pixel_format = "BC5"
n = 5
elif fourcc == D3DFMT.DX10:
offset += 20
# ignoring flags which pertain to volume textures and cubemaps
(dxgi_format,) = struct.unpack("<I", self.fp.read(4))
self.fp.read(16)
if dxgi_format in (
DXGI_FORMAT.BC1_UNORM,
DXGI_FORMAT.BC1_TYPELESS,
):
self._mode = "RGBA"
self.pixel_format = "BC1"
n = 1
elif dxgi_format in (DXGI_FORMAT.BC4_TYPELESS, DXGI_FORMAT.BC4_UNORM):
self._mode = "L"
self.pixel_format = "BC4"
n = 4
elif dxgi_format in (DXGI_FORMAT.BC5_TYPELESS, DXGI_FORMAT.BC5_UNORM):
self._mode = "RGB"
self.pixel_format = "BC5"
n = 5
elif dxgi_format == DXGI_FORMAT.BC5_SNORM:
self._mode = "RGB"
self.pixel_format = "BC5S"
n = 5
elif dxgi_format == DXGI_FORMAT.BC6H_UF16:
self._mode = "RGB"
self.pixel_format = "BC6H"
n = 6
elif dxgi_format == DXGI_FORMAT.BC6H_SF16:
self._mode = "RGB"
self.pixel_format = "BC6HS"
n = 6
elif dxgi_format in (
DXGI_FORMAT.BC7_TYPELESS,
DXGI_FORMAT.BC7_UNORM,
DXGI_FORMAT.BC7_UNORM_SRGB,
):
self._mode = "RGBA"
self.pixel_format = "BC7"
n = 7
if dxgi_format == DXGI_FORMAT.BC7_UNORM_SRGB:
self.info["gamma"] = 1 / 2.2
elif dxgi_format in (
DXGI_FORMAT.R8G8B8A8_TYPELESS,
DXGI_FORMAT.R8G8B8A8_UNORM,
DXGI_FORMAT.R8G8B8A8_UNORM_SRGB,
):
self._mode = "RGBA"
if dxgi_format == DXGI_FORMAT.R8G8B8A8_UNORM_SRGB:
self.info["gamma"] = 1 / 2.2
else:
msg = f"Unimplemented DXGI format {dxgi_format}"
raise NotImplementedError(msg)
else:
msg = f"Unimplemented pixel format {repr(fourcc)}"
raise NotImplementedError(msg)
else:
msg = f"Unknown pixel format flags {pfflags}"
raise NotImplementedError(msg)
if n:
self.tile = [
ImageFile._Tile("bcn", extents, offset, (n, self.pixel_format))
]
else:
self.tile = [ImageFile._Tile("raw", extents, 0, rawmode or self.mode)]
def load_seek(self, pos: int) -> None:
pass
class DdsRgbDecoder(ImageFile.PyDecoder):
_pulls_fd = True
def decode(self, buffer: bytes | Image.SupportsArrayInterface) -> tuple[int, int]:
assert self.fd is not None
bitcount, masks = self.args
# Some masks will be padded with zeros, e.g. R 0b11 G 0b1100
# Calculate how many zeros each mask is padded with
mask_offsets = []
# And the maximum value of each channel without the padding
mask_totals = []
for mask in masks:
offset = 0
if mask != 0:
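                # count trailing zero bits: shifting the low `offset + 1` bits
                # out and back in leaves the mask unchanged while they are zero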
while mask >> (offset + 1) << (offset + 1) == mask:
offset += 1
mask_offsets.append(offset)
mask_totals.append(mask >> offset)
data = bytearray()
bytecount = bitcount // 8
dest_length = self.state.xsize * self.state.ysize * len(masks)
while len(data) < dest_length:
value = int.from_bytes(self.fd.read(bytecount), "little")
for i, mask in enumerate(masks):
masked_value = value & mask
# Remove the zero padding, and scale it to 8 bits
data += o8(
int(((masked_value >> mask_offsets[i]) / mask_totals[i]) * 255)
)
self.set_as_raw(data)
return -1, 0
def _save(im: Image.Image, fp: IO[bytes], filename: str | bytes) -> None:
if im.mode not in ("RGB", "RGBA", "L", "LA"):
msg = f"cannot write mode {im.mode} as DDS"
raise OSError(msg)
alpha = im.mode[-1] == "A"
if im.mode[0] == "L":
pixel_flags = DDPF.LUMINANCE
rawmode = im.mode
if alpha:
rgba_mask = [0x000000FF, 0x000000FF, 0x000000FF]
else:
rgba_mask = [0xFF000000, 0xFF000000, 0xFF000000]
else:
pixel_flags = DDPF.RGB
rawmode = im.mode[::-1]
rgba_mask = [0x00FF0000, 0x0000FF00, 0x000000FF]
if alpha:
r, g, b, a = im.split()
im = Image.merge("RGBA", (a, r, g, b))
if alpha:
pixel_flags |= DDPF.ALPHAPIXELS
rgba_mask.append(0xFF000000 if alpha else 0)
flags = DDSD.CAPS | DDSD.HEIGHT | DDSD.WIDTH | DDSD.PITCH | DDSD.PIXELFORMAT
bitcount = len(im.getbands()) * 8
pitch = (im.width * bitcount + 7) // 8
fp.write(
o32(DDS_MAGIC)
+ struct.pack(
"<7I",
124, # header size
flags, # flags
im.height,
im.width,
pitch,
0, # depth
0, # mipmaps
)
+ struct.pack("11I", *((0,) * 11)) # reserved
# pfsize, pfflags, fourcc, bitcount
+ struct.pack("<4I", 32, pixel_flags, 0, bitcount)
+ struct.pack("<4I", *rgba_mask) # dwRGBABitMask
+ struct.pack("<5I", DDSCAPS.TEXTURE, 0, 0, 0, 0)
)
ImageFile._save(im, fp, [ImageFile._Tile("raw", (0, 0) + im.size, 0, rawmode)])
def _accept(prefix: bytes) -> bool:
return prefix[:4] == b"DDS "
Image.register_open(DdsImageFile.format, DdsImageFile, _accept)
Image.register_decoder("dds_rgb", DdsRgbDecoder)
Image.register_save(DdsImageFile.format, _save)
Image.register_extension(DdsImageFile.format, ".dds")


@@ -0,0 +1,474 @@
#
# The Python Imaging Library.
# $Id$
#
# EPS file handling
#
# History:
# 1995-09-01 fl Created (0.1)
# 1996-05-18 fl Don't choke on "atend" fields, Ghostscript interface (0.2)
# 1996-08-22 fl Don't choke on floating point BoundingBox values
# 1996-08-23 fl Handle files from Macintosh (0.3)
# 2001-02-17 fl Use 're' instead of 'regex' (Python 2.1) (0.4)
# 2003-09-07 fl Check gs.close status (from Federico Di Gregorio) (0.5)
# 2014-05-07 e Handling of EPS with binary preview and fixed resolution
# resizing
#
# Copyright (c) 1997-2003 by Secret Labs AB.
# Copyright (c) 1995-2003 by Fredrik Lundh
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
import io
import os
import re
import subprocess
import sys
import tempfile
from typing import IO
from . import Image, ImageFile
from ._binary import i32le as i32
# --------------------------------------------------------------------
split = re.compile(r"^%%([^:]*):[ \t]*(.*)[ \t]*$")
field = re.compile(r"^%[%!\w]([^:]*)[ \t]*$")
gs_binary: str | bool | None = None
gs_windows_binary = None
def has_ghostscript() -> bool:
global gs_binary, gs_windows_binary
if gs_binary is None:
if sys.platform.startswith("win"):
if gs_windows_binary is None:
import shutil
for binary in ("gswin32c", "gswin64c", "gs"):
if shutil.which(binary) is not None:
gs_windows_binary = binary
break
else:
gs_windows_binary = False
gs_binary = gs_windows_binary
else:
try:
subprocess.check_call(["gs", "--version"], stdout=subprocess.DEVNULL)
gs_binary = "gs"
except OSError:
gs_binary = False
return gs_binary is not False
def Ghostscript(
tile: list[ImageFile._Tile],
size: tuple[int, int],
fp: IO[bytes],
scale: int = 1,
transparency: bool = False,
) -> Image.core.ImagingCore:
"""Render an image using Ghostscript"""
global gs_binary
if not has_ghostscript():
msg = "Unable to locate Ghostscript on paths"
raise OSError(msg)
assert isinstance(gs_binary, str)
# Unpack decoder tile
args = tile[0].args
assert isinstance(args, tuple)
length, bbox = args
# Hack to support hi-res rendering
scale = int(scale) or 1
width = size[0] * scale
height = size[1] * scale
# resolution is dependent on bbox and size
res_x = 72.0 * width / (bbox[2] - bbox[0])
res_y = 72.0 * height / (bbox[3] - bbox[1])
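    # bbox values are PostScript points (1/72 inch), so 72 * pixels / points
    # is the DPI Ghostscript needs to produce the requested pixel size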
out_fd, outfile = tempfile.mkstemp()
os.close(out_fd)
infile_temp = None
if hasattr(fp, "name") and os.path.exists(fp.name):
infile = fp.name
else:
in_fd, infile_temp = tempfile.mkstemp()
os.close(in_fd)
infile = infile_temp
# Ignore length and offset!
# Ghostscript can read it
# Copy whole file to read in Ghostscript
with open(infile_temp, "wb") as f:
# fetch length of fp
fp.seek(0, io.SEEK_END)
fsize = fp.tell()
# ensure start position
# go back
fp.seek(0)
lengthfile = fsize
while lengthfile > 0:
s = fp.read(min(lengthfile, 100 * 1024))
if not s:
break
lengthfile -= len(s)
f.write(s)
if transparency:
# "RGBA"
device = "pngalpha"
else:
# "pnmraw" automatically chooses between
# PBM ("1"), PGM ("L"), and PPM ("RGB").
device = "pnmraw"
# Build Ghostscript command
command = [
gs_binary,
"-q", # quiet mode
f"-g{width:d}x{height:d}", # set output geometry (pixels)
f"-r{res_x:f}x{res_y:f}", # set input DPI (dots per inch)
"-dBATCH", # exit after processing
"-dNOPAUSE", # don't pause between pages
"-dSAFER", # safe mode
f"-sDEVICE={device}",
f"-sOutputFile={outfile}", # output file
# adjust for image origin
"-c",
f"{-bbox[0]} {-bbox[1]} translate",
"-f",
infile, # input file
# showpage (see https://bugs.ghostscript.com/show_bug.cgi?id=698272)
"-c",
"showpage",
]
# push data through Ghostscript
try:
startupinfo = None
if sys.platform.startswith("win"):
startupinfo = subprocess.STARTUPINFO()
startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
subprocess.check_call(command, startupinfo=startupinfo)
with Image.open(outfile) as out_im:
out_im.load()
return out_im.im.copy()
finally:
try:
os.unlink(outfile)
if infile_temp:
os.unlink(infile_temp)
except OSError:
pass
def _accept(prefix: bytes) -> bool:
return prefix[:4] == b"%!PS" or (len(prefix) >= 4 and i32(prefix) == 0xC6D3D0C5)
##
# Image plugin for Encapsulated PostScript. This plugin supports only
# a few variants of this format.
class EpsImageFile(ImageFile.ImageFile):
"""EPS File Parser for the Python Imaging Library"""
format = "EPS"
format_description = "Encapsulated Postscript"
mode_map = {1: "L", 2: "LAB", 3: "RGB", 4: "CMYK"}
def _open(self) -> None:
(length, offset) = self._find_offset(self.fp)
# go to offset - start of "%!PS"
self.fp.seek(offset)
self._mode = "RGB"
# When reading header comments, the first comment is used.
# When reading trailer comments, the last comment is used.
bounding_box: list[int] | None = None
imagedata_size: tuple[int, int] | None = None
byte_arr = bytearray(255)
bytes_mv = memoryview(byte_arr)
bytes_read = 0
reading_header_comments = True
reading_trailer_comments = False
trailer_reached = False
def check_required_header_comments() -> None:
"""
The EPS specification requires that some headers exist.
This should be checked when the header comments formally end,
when image data starts, or when the file ends, whichever comes first.
"""
if "PS-Adobe" not in self.info:
msg = 'EPS header missing "%!PS-Adobe" comment'
raise SyntaxError(msg)
if "BoundingBox" not in self.info:
msg = 'EPS header missing "%%BoundingBox" comment'
raise SyntaxError(msg)
def read_comment(s: str) -> bool:
nonlocal bounding_box, reading_trailer_comments
try:
m = split.match(s)
except re.error as e:
msg = "not an EPS file"
raise SyntaxError(msg) from e
if not m:
return False
k, v = m.group(1, 2)
self.info[k] = v
if k == "BoundingBox":
if v == "(atend)":
reading_trailer_comments = True
elif not bounding_box or (trailer_reached and reading_trailer_comments):
try:
# Note: The DSC spec says that BoundingBox
# fields should be integers, but some drivers
# put floating point values there anyway.
bounding_box = [int(float(i)) for i in v.split()]
except Exception:
pass
return True
while True:
byte = self.fp.read(1)
if byte == b"":
# if we didn't read a byte we must be at the end of the file
if bytes_read == 0:
if reading_header_comments:
check_required_header_comments()
break
elif byte in b"\r\n":
# if we read a line ending character, ignore it and parse what
# we have already read. if we haven't read any other characters,
# continue reading
if bytes_read == 0:
continue
else:
# ASCII/hexadecimal lines in an EPS file must not exceed
# 255 characters, not including line ending characters
if bytes_read >= 255:
# only enforce this for lines starting with a "%",
# otherwise assume it's binary data
if byte_arr[0] == ord("%"):
msg = "not an EPS file"
raise SyntaxError(msg)
else:
if reading_header_comments:
check_required_header_comments()
reading_header_comments = False
# reset bytes_read so we can keep reading
# data until the end of the line
bytes_read = 0
byte_arr[bytes_read] = byte[0]
bytes_read += 1
continue
if reading_header_comments:
# Load EPS header
# if this line doesn't start with a "%",
# or does start with "%%EndComments",
# then we've reached the end of the header/comments
if byte_arr[0] != ord("%") or bytes_mv[:13] == b"%%EndComments":
check_required_header_comments()
reading_header_comments = False
continue
s = str(bytes_mv[:bytes_read], "latin-1")
if not read_comment(s):
m = field.match(s)
if m:
k = m.group(1)
if k[:8] == "PS-Adobe":
self.info["PS-Adobe"] = k[9:]
else:
self.info[k] = ""
elif s[0] == "%":
# handle non-DSC PostScript comments that some
# tools mistakenly put in the Comments section
pass
else:
msg = "bad EPS header"
raise OSError(msg)
elif bytes_mv[:11] == b"%ImageData:":
# Check for an "ImageData" descriptor
# https://www.adobe.com/devnet-apps/photoshop/fileformatashtml/#50577413_pgfId-1035096
# If we've already read an "ImageData" descriptor,
# don't read another one.
if imagedata_size:
bytes_read = 0
continue
# Values:
# columns
# rows
# bit depth (1 or 8)
# mode (1: L, 2: LAB, 3: RGB, 4: CMYK)
# number of padding channels
# block size (number of bytes per row per channel)
# binary/ascii (1: binary, 2: ascii)
# data start identifier (the image data follows after a single line
# consisting only of this quoted value)
image_data_values = byte_arr[11:bytes_read].split(None, 7)
columns, rows, bit_depth, mode_id = (
int(value) for value in image_data_values[:4]
)
if bit_depth == 1:
self._mode = "1"
elif bit_depth == 8:
try:
self._mode = self.mode_map[mode_id]
                    except KeyError:
break
else:
break
# Parse the columns and rows after checking the bit depth and mode
# in case the bit depth and/or mode are invalid.
imagedata_size = columns, rows
elif bytes_mv[:5] == b"%%EOF":
break
elif trailer_reached and reading_trailer_comments:
# Load EPS trailer
s = str(bytes_mv[:bytes_read], "latin-1")
read_comment(s)
elif bytes_mv[:9] == b"%%Trailer":
trailer_reached = True
bytes_read = 0
# A "BoundingBox" is always required,
# even if an "ImageData" descriptor size exists.
if not bounding_box:
msg = "cannot determine EPS bounding box"
raise OSError(msg)
# An "ImageData" size takes precedence over the "BoundingBox".
self._size = imagedata_size or (
bounding_box[2] - bounding_box[0],
bounding_box[3] - bounding_box[1],
)
self.tile = [
ImageFile._Tile("eps", (0, 0) + self.size, offset, (length, bounding_box))
]
def _find_offset(self, fp: IO[bytes]) -> tuple[int, int]:
s = fp.read(4)
if s == b"%!PS":
# for HEAD without binary preview
fp.seek(0, io.SEEK_END)
length = fp.tell()
offset = 0
elif i32(s) == 0xC6D3D0C5:
# FIX for: Some EPS file not handled correctly / issue #302
# EPS can contain binary data
# or start directly with latin coding
# more info see:
# https://web.archive.org/web/20160528181353/http://partners.adobe.com/public/developer/en/ps/5002.EPSF_Spec.pdf
s = fp.read(8)
offset = i32(s)
length = i32(s, 4)
else:
msg = "not an EPS file"
raise SyntaxError(msg)
return length, offset
def load(
self, scale: int = 1, transparency: bool = False
) -> Image.core.PixelAccess | None:
# Load EPS via Ghostscript
if self.tile:
self.im = Ghostscript(self.tile, self.size, self.fp, scale, transparency)
self._mode = self.im.mode
self._size = self.im.size
self.tile = []
return Image.Image.load(self)
def load_seek(self, pos: int) -> None:
# we can't incrementally load, so force ImageFile.parser to
# use our custom load method by defining this method.
pass
# --------------------------------------------------------------------
def _save(im: Image.Image, fp: IO[bytes], filename: str | bytes, eps: int = 1) -> None:
"""EPS Writer for the Python Imaging Library."""
# make sure image data is available
im.load()
# determine PostScript image mode
if im.mode == "L":
operator = (8, 1, b"image")
elif im.mode == "RGB":
operator = (8, 3, b"false 3 colorimage")
elif im.mode == "CMYK":
operator = (8, 4, b"false 4 colorimage")
else:
msg = "image mode is not supported"
raise ValueError(msg)
if eps:
# write EPS header
fp.write(b"%!PS-Adobe-3.0 EPSF-3.0\n")
fp.write(b"%%Creator: PIL 0.1 EpsEncode\n")
# fp.write("%%CreationDate: %s"...)
fp.write(b"%%%%BoundingBox: 0 0 %d %d\n" % im.size)
fp.write(b"%%Pages: 1\n")
fp.write(b"%%EndComments\n")
fp.write(b"%%Page: 1 1\n")
fp.write(b"%%ImageData: %d %d " % im.size)
fp.write(b'%d %d 0 1 1 "%s"\n' % operator)
# image header
fp.write(b"gsave\n")
fp.write(b"10 dict begin\n")
fp.write(b"/buf %d string def\n" % (im.size[0] * operator[1]))
fp.write(b"%d %d scale\n" % im.size)
fp.write(b"%d %d 8\n" % im.size) # <= bits
fp.write(b"[%d 0 0 -%d 0 %d]\n" % (im.size[0], im.size[1], im.size[1]))
fp.write(b"{ currentfile buf readhexstring pop } bind\n")
fp.write(operator[2] + b"\n")
if hasattr(fp, "flush"):
fp.flush()
ImageFile._save(im, fp, [ImageFile._Tile("eps", (0, 0) + im.size)])
fp.write(b"\n%%%%EndBinary\n")
fp.write(b"grestore end\n")
if hasattr(fp, "flush"):
fp.flush()
# --------------------------------------------------------------------
Image.register_open(EpsImageFile.format, EpsImageFile, _accept)
Image.register_save(EpsImageFile.format, _save)
Image.register_extensions(EpsImageFile.format, [".ps", ".eps"])
Image.register_mime(EpsImageFile.format, "application/postscript")
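# A minimal usage sketch, not part of the original plugin: assuming Pillow is
# installed and a Ghostscript executable is on PATH (required by load() above),
# an EPS file can be rasterized at twice its BoundingBox resolution. The
# filename "figure.eps" is a placeholder.
if __name__ == "__main__":
    with Image.open("figure.eps") as im:
        im.load(scale=2)  # render via Ghostscript at 2x resolution
        im.save("figure.png")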


@ -0,0 +1,382 @@
#
# The Python Imaging Library.
# $Id$
#
# EXIF tags
#
# Copyright (c) 2003 by Secret Labs AB
#
# See the README file for information on usage and redistribution.
#
"""
This module provides constants and clear-text names for various
well-known EXIF tags.
"""
from __future__ import annotations
from enum import IntEnum
class Base(IntEnum):
# possibly incomplete
InteropIndex = 0x0001
ProcessingSoftware = 0x000B
NewSubfileType = 0x00FE
SubfileType = 0x00FF
ImageWidth = 0x0100
ImageLength = 0x0101
BitsPerSample = 0x0102
Compression = 0x0103
PhotometricInterpretation = 0x0106
Thresholding = 0x0107
CellWidth = 0x0108
CellLength = 0x0109
FillOrder = 0x010A
DocumentName = 0x010D
ImageDescription = 0x010E
Make = 0x010F
Model = 0x0110
StripOffsets = 0x0111
Orientation = 0x0112
SamplesPerPixel = 0x0115
RowsPerStrip = 0x0116
StripByteCounts = 0x0117
MinSampleValue = 0x0118
MaxSampleValue = 0x0119
XResolution = 0x011A
YResolution = 0x011B
PlanarConfiguration = 0x011C
PageName = 0x011D
FreeOffsets = 0x0120
FreeByteCounts = 0x0121
GrayResponseUnit = 0x0122
GrayResponseCurve = 0x0123
T4Options = 0x0124
T6Options = 0x0125
ResolutionUnit = 0x0128
PageNumber = 0x0129
TransferFunction = 0x012D
Software = 0x0131
DateTime = 0x0132
Artist = 0x013B
HostComputer = 0x013C
Predictor = 0x013D
WhitePoint = 0x013E
PrimaryChromaticities = 0x013F
ColorMap = 0x0140
HalftoneHints = 0x0141
TileWidth = 0x0142
TileLength = 0x0143
TileOffsets = 0x0144
TileByteCounts = 0x0145
SubIFDs = 0x014A
InkSet = 0x014C
InkNames = 0x014D
NumberOfInks = 0x014E
DotRange = 0x0150
TargetPrinter = 0x0151
ExtraSamples = 0x0152
SampleFormat = 0x0153
SMinSampleValue = 0x0154
SMaxSampleValue = 0x0155
TransferRange = 0x0156
ClipPath = 0x0157
XClipPathUnits = 0x0158
YClipPathUnits = 0x0159
Indexed = 0x015A
JPEGTables = 0x015B
OPIProxy = 0x015F
JPEGProc = 0x0200
JpegIFOffset = 0x0201
JpegIFByteCount = 0x0202
JpegRestartInterval = 0x0203
JpegLosslessPredictors = 0x0205
JpegPointTransforms = 0x0206
JpegQTables = 0x0207
JpegDCTables = 0x0208
JpegACTables = 0x0209
YCbCrCoefficients = 0x0211
YCbCrSubSampling = 0x0212
YCbCrPositioning = 0x0213
ReferenceBlackWhite = 0x0214
XMLPacket = 0x02BC
RelatedImageFileFormat = 0x1000
RelatedImageWidth = 0x1001
RelatedImageLength = 0x1002
Rating = 0x4746
RatingPercent = 0x4749
ImageID = 0x800D
CFARepeatPatternDim = 0x828D
BatteryLevel = 0x828F
Copyright = 0x8298
ExposureTime = 0x829A
FNumber = 0x829D
IPTCNAA = 0x83BB
ImageResources = 0x8649
ExifOffset = 0x8769
InterColorProfile = 0x8773
ExposureProgram = 0x8822
SpectralSensitivity = 0x8824
GPSInfo = 0x8825
ISOSpeedRatings = 0x8827
OECF = 0x8828
Interlace = 0x8829
TimeZoneOffset = 0x882A
SelfTimerMode = 0x882B
SensitivityType = 0x8830
StandardOutputSensitivity = 0x8831
RecommendedExposureIndex = 0x8832
ISOSpeed = 0x8833
ISOSpeedLatitudeyyy = 0x8834
ISOSpeedLatitudezzz = 0x8835
ExifVersion = 0x9000
DateTimeOriginal = 0x9003
DateTimeDigitized = 0x9004
OffsetTime = 0x9010
OffsetTimeOriginal = 0x9011
OffsetTimeDigitized = 0x9012
ComponentsConfiguration = 0x9101
CompressedBitsPerPixel = 0x9102
ShutterSpeedValue = 0x9201
ApertureValue = 0x9202
BrightnessValue = 0x9203
ExposureBiasValue = 0x9204
MaxApertureValue = 0x9205
SubjectDistance = 0x9206
MeteringMode = 0x9207
LightSource = 0x9208
Flash = 0x9209
FocalLength = 0x920A
Noise = 0x920D
ImageNumber = 0x9211
SecurityClassification = 0x9212
ImageHistory = 0x9213
TIFFEPStandardID = 0x9216
MakerNote = 0x927C
UserComment = 0x9286
SubsecTime = 0x9290
SubsecTimeOriginal = 0x9291
SubsecTimeDigitized = 0x9292
AmbientTemperature = 0x9400
Humidity = 0x9401
Pressure = 0x9402
WaterDepth = 0x9403
Acceleration = 0x9404
CameraElevationAngle = 0x9405
XPTitle = 0x9C9B
XPComment = 0x9C9C
XPAuthor = 0x9C9D
XPKeywords = 0x9C9E
XPSubject = 0x9C9F
FlashPixVersion = 0xA000
ColorSpace = 0xA001
ExifImageWidth = 0xA002
ExifImageHeight = 0xA003
RelatedSoundFile = 0xA004
ExifInteroperabilityOffset = 0xA005
FlashEnergy = 0xA20B
SpatialFrequencyResponse = 0xA20C
FocalPlaneXResolution = 0xA20E
FocalPlaneYResolution = 0xA20F
FocalPlaneResolutionUnit = 0xA210
SubjectLocation = 0xA214
ExposureIndex = 0xA215
SensingMethod = 0xA217
FileSource = 0xA300
SceneType = 0xA301
CFAPattern = 0xA302
CustomRendered = 0xA401
ExposureMode = 0xA402
WhiteBalance = 0xA403
DigitalZoomRatio = 0xA404
FocalLengthIn35mmFilm = 0xA405
SceneCaptureType = 0xA406
GainControl = 0xA407
Contrast = 0xA408
Saturation = 0xA409
Sharpness = 0xA40A
DeviceSettingDescription = 0xA40B
SubjectDistanceRange = 0xA40C
ImageUniqueID = 0xA420
CameraOwnerName = 0xA430
BodySerialNumber = 0xA431
LensSpecification = 0xA432
LensMake = 0xA433
LensModel = 0xA434
LensSerialNumber = 0xA435
CompositeImage = 0xA460
CompositeImageCount = 0xA461
CompositeImageExposureTimes = 0xA462
Gamma = 0xA500
PrintImageMatching = 0xC4A5
DNGVersion = 0xC612
DNGBackwardVersion = 0xC613
UniqueCameraModel = 0xC614
LocalizedCameraModel = 0xC615
CFAPlaneColor = 0xC616
CFALayout = 0xC617
LinearizationTable = 0xC618
BlackLevelRepeatDim = 0xC619
BlackLevel = 0xC61A
BlackLevelDeltaH = 0xC61B
BlackLevelDeltaV = 0xC61C
WhiteLevel = 0xC61D
DefaultScale = 0xC61E
DefaultCropOrigin = 0xC61F
DefaultCropSize = 0xC620
ColorMatrix1 = 0xC621
ColorMatrix2 = 0xC622
CameraCalibration1 = 0xC623
CameraCalibration2 = 0xC624
ReductionMatrix1 = 0xC625
ReductionMatrix2 = 0xC626
AnalogBalance = 0xC627
AsShotNeutral = 0xC628
AsShotWhiteXY = 0xC629
BaselineExposure = 0xC62A
BaselineNoise = 0xC62B
BaselineSharpness = 0xC62C
BayerGreenSplit = 0xC62D
LinearResponseLimit = 0xC62E
CameraSerialNumber = 0xC62F
LensInfo = 0xC630
ChromaBlurRadius = 0xC631
AntiAliasStrength = 0xC632
ShadowScale = 0xC633
DNGPrivateData = 0xC634
MakerNoteSafety = 0xC635
CalibrationIlluminant1 = 0xC65A
CalibrationIlluminant2 = 0xC65B
BestQualityScale = 0xC65C
RawDataUniqueID = 0xC65D
OriginalRawFileName = 0xC68B
OriginalRawFileData = 0xC68C
ActiveArea = 0xC68D
MaskedAreas = 0xC68E
AsShotICCProfile = 0xC68F
AsShotPreProfileMatrix = 0xC690
CurrentICCProfile = 0xC691
CurrentPreProfileMatrix = 0xC692
ColorimetricReference = 0xC6BF
CameraCalibrationSignature = 0xC6F3
ProfileCalibrationSignature = 0xC6F4
AsShotProfileName = 0xC6F6
NoiseReductionApplied = 0xC6F7
ProfileName = 0xC6F8
ProfileHueSatMapDims = 0xC6F9
ProfileHueSatMapData1 = 0xC6FA
ProfileHueSatMapData2 = 0xC6FB
ProfileToneCurve = 0xC6FC
ProfileEmbedPolicy = 0xC6FD
ProfileCopyright = 0xC6FE
ForwardMatrix1 = 0xC714
ForwardMatrix2 = 0xC715
PreviewApplicationName = 0xC716
PreviewApplicationVersion = 0xC717
PreviewSettingsName = 0xC718
PreviewSettingsDigest = 0xC719
PreviewColorSpace = 0xC71A
PreviewDateTime = 0xC71B
RawImageDigest = 0xC71C
OriginalRawFileDigest = 0xC71D
SubTileBlockSize = 0xC71E
RowInterleaveFactor = 0xC71F
ProfileLookTableDims = 0xC725
ProfileLookTableData = 0xC726
OpcodeList1 = 0xC740
OpcodeList2 = 0xC741
OpcodeList3 = 0xC74E
NoiseProfile = 0xC761
"""Maps EXIF tags to tag names."""
TAGS = {
**{i.value: i.name for i in Base},
0x920C: "SpatialFrequencyResponse",
0x9214: "SubjectLocation",
0x9215: "ExposureIndex",
0x828E: "CFAPattern",
0x920B: "FlashEnergy",
0x9216: "TIFF/EPStandardID",
}
class GPS(IntEnum):
GPSVersionID = 0x00
GPSLatitudeRef = 0x01
GPSLatitude = 0x02
GPSLongitudeRef = 0x03
GPSLongitude = 0x04
GPSAltitudeRef = 0x05
GPSAltitude = 0x06
GPSTimeStamp = 0x07
GPSSatellites = 0x08
GPSStatus = 0x09
GPSMeasureMode = 0x0A
GPSDOP = 0x0B
GPSSpeedRef = 0x0C
GPSSpeed = 0x0D
GPSTrackRef = 0x0E
GPSTrack = 0x0F
GPSImgDirectionRef = 0x10
GPSImgDirection = 0x11
GPSMapDatum = 0x12
GPSDestLatitudeRef = 0x13
GPSDestLatitude = 0x14
GPSDestLongitudeRef = 0x15
GPSDestLongitude = 0x16
GPSDestBearingRef = 0x17
GPSDestBearing = 0x18
GPSDestDistanceRef = 0x19
GPSDestDistance = 0x1A
GPSProcessingMethod = 0x1B
GPSAreaInformation = 0x1C
GPSDateStamp = 0x1D
GPSDifferential = 0x1E
GPSHPositioningError = 0x1F
"""Maps EXIF GPS tags to tag names."""
GPSTAGS = {i.value: i.name for i in GPS}
class Interop(IntEnum):
InteropIndex = 0x0001
InteropVersion = 0x0002
RelatedImageFileFormat = 0x1000
RelatedImageWidth = 0x1001
RelatedImageHeight = 0x1002
class IFD(IntEnum):
Exif = 0x8769
GPSInfo = 0x8825
MakerNote = 0x927C
Makernote = 0x927C # Deprecated
Interop = 0xA005
IFD1 = -1
class LightSource(IntEnum):
Unknown = 0x00
Daylight = 0x01
Fluorescent = 0x02
Tungsten = 0x03
Flash = 0x04
Fine = 0x09
Cloudy = 0x0A
Shade = 0x0B
DaylightFluorescent = 0x0C
DayWhiteFluorescent = 0x0D
CoolWhiteFluorescent = 0x0E
WhiteFluorescent = 0x0F
StandardLightA = 0x11
StandardLightB = 0x12
StandardLightC = 0x13
D55 = 0x14
D65 = 0x15
D75 = 0x16
D50 = 0x17
ISO = 0x18
Other = 0xFF
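# A minimal usage sketch, not part of the original module: assuming Pillow is
# installed, the numeric tag ids returned by Image.getexif() can be translated
# to readable names with TAGS, and GPS sub-IFD tags with GPSTAGS. The filename
# "photo.jpg" is a placeholder.
if __name__ == "__main__":
    from PIL import Image

    with Image.open("photo.jpg") as im:
        exif = im.getexif()
        for tag_id, value in exif.items():
            print(TAGS.get(tag_id, hex(tag_id)), value)
        for tag_id, value in exif.get_ifd(IFD.GPSInfo).items():
            print(GPSTAGS.get(tag_id, hex(tag_id)), value)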


@ -0,0 +1,152 @@
#
# The Python Imaging Library
# $Id$
#
# FITS file handling
#
# Copyright (c) 1998-2003 by Fredrik Lundh
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
import gzip
import math
from . import Image, ImageFile
def _accept(prefix: bytes) -> bool:
return prefix[:6] == b"SIMPLE"
class FitsImageFile(ImageFile.ImageFile):
format = "FITS"
format_description = "FITS"
def _open(self) -> None:
assert self.fp is not None
headers: dict[bytes, bytes] = {}
header_in_progress = False
decoder_name = ""
while True:
header = self.fp.read(80)
if not header:
msg = "Truncated FITS file"
raise OSError(msg)
keyword = header[:8].strip()
if keyword in (b"SIMPLE", b"XTENSION"):
header_in_progress = True
elif headers and not header_in_progress:
# This is now a data unit
break
elif keyword == b"END":
# Seek to the end of the header unit
self.fp.seek(math.ceil(self.fp.tell() / 2880) * 2880)
if not decoder_name:
decoder_name, offset, args = self._parse_headers(headers)
header_in_progress = False
continue
if decoder_name:
# Keep going to read past the headers
continue
value = header[8:].split(b"/")[0].strip()
if value.startswith(b"="):
value = value[1:].strip()
if not headers and (not _accept(keyword) or value != b"T"):
msg = "Not a FITS file"
raise SyntaxError(msg)
headers[keyword] = value
if not decoder_name:
msg = "No image data"
raise ValueError(msg)
offset += self.fp.tell() - 80
self.tile = [ImageFile._Tile(decoder_name, (0, 0) + self.size, offset, args)]
def _get_size(
self, headers: dict[bytes, bytes], prefix: bytes
) -> tuple[int, int] | None:
naxis = int(headers[prefix + b"NAXIS"])
if naxis == 0:
return None
if naxis == 1:
return 1, int(headers[prefix + b"NAXIS1"])
else:
return int(headers[prefix + b"NAXIS1"]), int(headers[prefix + b"NAXIS2"])
def _parse_headers(
self, headers: dict[bytes, bytes]
) -> tuple[str, int, tuple[str | int, ...]]:
prefix = b""
decoder_name = "raw"
offset = 0
if (
headers.get(b"XTENSION") == b"'BINTABLE'"
and headers.get(b"ZIMAGE") == b"T"
and headers[b"ZCMPTYPE"] == b"'GZIP_1 '"
):
no_prefix_size = self._get_size(headers, prefix) or (0, 0)
number_of_bits = int(headers[b"BITPIX"])
offset = no_prefix_size[0] * no_prefix_size[1] * (number_of_bits // 8)
prefix = b"Z"
decoder_name = "fits_gzip"
size = self._get_size(headers, prefix)
if not size:
return "", 0, ()
self._size = size
number_of_bits = int(headers[prefix + b"BITPIX"])
if number_of_bits == 8:
self._mode = "L"
elif number_of_bits == 16:
self._mode = "I;16"
elif number_of_bits == 32:
self._mode = "I"
elif number_of_bits in (-32, -64):
self._mode = "F"
args: tuple[str | int, ...]
if decoder_name == "raw":
args = (self.mode, 0, -1)
else:
args = (number_of_bits,)
return decoder_name, offset, args
class FitsGzipDecoder(ImageFile.PyDecoder):
_pulls_fd = True
def decode(self, buffer: bytes | Image.SupportsArrayInterface) -> tuple[int, int]:
assert self.fd is not None
value = gzip.decompress(self.fd.read())
rows = []
offset = 0
number_of_bits = min(self.args[0] // 8, 4)
for y in range(self.state.ysize):
row = bytearray()
for x in range(self.state.xsize):
row += value[offset + (4 - number_of_bits) : offset + 4]
offset += 4
rows.append(row)
self.set_as_raw(bytes([pixel for row in rows[::-1] for pixel in row]))
return -1, 0
# --------------------------------------------------------------------
# Registry
Image.register_open(FitsImageFile.format, FitsImageFile, _accept)
Image.register_decoder("fits_gzip", FitsGzipDecoder)
Image.register_extensions(FitsImageFile.format, [".fit", ".fits"])
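# A self-contained sketch, not part of the original plugin: it builds a tiny
# 2x2, 8-bit grayscale FITS file in memory (80-byte header cards padded to a
# 2880-byte block, exactly what _open() above parses) and opens it via Pillow.
if __name__ == "__main__":
    from io import BytesIO

    def card(key: str, value: str = "") -> bytes:
        # A header card is 80 bytes: an 8-char keyword, "= ", then the value
        line = f"{key:<8}= {value}" if value else key
        return line.ljust(80).encode("ascii")

    header = b"".join(
        card(*c)
        for c in [
            ("SIMPLE", "T"),
            ("BITPIX", "8"),
            ("NAXIS", "2"),
            ("NAXIS1", "2"),
            ("NAXIS2", "2"),
            ("END",),
        ]
    ).ljust(2880, b" ")
    data = bytes([0, 64, 128, 255]).ljust(2880, b"\x00")
    with Image.open(BytesIO(header + data)) as im:
        print(im.mode, im.size)  # L (2, 2)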


@ -0,0 +1,175 @@
#
# The Python Imaging Library.
# $Id$
#
# FLI/FLC file handling.
#
# History:
# 95-09-01 fl Created
# 97-01-03 fl Fixed parser, setup decoder tile
# 98-07-15 fl Renamed offset attribute to avoid name clash
#
# Copyright (c) Secret Labs AB 1997-98.
# Copyright (c) Fredrik Lundh 1995-97.
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
import os
from . import Image, ImageFile, ImagePalette
from ._binary import i16le as i16
from ._binary import i32le as i32
from ._binary import o8
#
# decoder
def _accept(prefix: bytes) -> bool:
return (
len(prefix) >= 6
and i16(prefix, 4) in [0xAF11, 0xAF12]
and i16(prefix, 14) in [0, 3] # flags
)
##
# Image plugin for the FLI/FLC animation format. Use the <b>seek</b>
# method to load individual frames.
class FliImageFile(ImageFile.ImageFile):
format = "FLI"
format_description = "Autodesk FLI/FLC Animation"
_close_exclusive_fp_after_loading = False
def _open(self) -> None:
# HEAD
s = self.fp.read(128)
if not (_accept(s) and s[20:22] == b"\x00\x00"):
msg = "not an FLI/FLC file"
raise SyntaxError(msg)
# frames
self.n_frames = i16(s, 6)
self.is_animated = self.n_frames > 1
# image characteristics
self._mode = "P"
self._size = i16(s, 8), i16(s, 10)
# animation speed
duration = i32(s, 16)
magic = i16(s, 4)
if magic == 0xAF11:
duration = (duration * 1000) // 70
self.info["duration"] = duration
# look for palette
palette = [(a, a, a) for a in range(256)]
s = self.fp.read(16)
self.__offset = 128
if i16(s, 4) == 0xF100:
# prefix chunk; ignore it
self.__offset = self.__offset + i32(s)
self.fp.seek(self.__offset)
s = self.fp.read(16)
if i16(s, 4) == 0xF1FA:
# look for palette chunk
number_of_subchunks = i16(s, 6)
chunk_size: int | None = None
for _ in range(number_of_subchunks):
if chunk_size is not None:
self.fp.seek(chunk_size - 6, os.SEEK_CUR)
s = self.fp.read(6)
chunk_type = i16(s, 4)
if chunk_type in (4, 11):
self._palette(palette, 2 if chunk_type == 11 else 0)
break
chunk_size = i32(s)
if not chunk_size:
break
self.palette = ImagePalette.raw(
"RGB", b"".join(o8(r) + o8(g) + o8(b) for (r, g, b) in palette)
)
# set things up to decode first frame
self.__frame = -1
self._fp = self.fp
self.__rewind = self.fp.tell()
self.seek(0)
def _palette(self, palette: list[tuple[int, int, int]], shift: int) -> None:
# load palette
i = 0
for e in range(i16(self.fp.read(2))):
s = self.fp.read(2)
i = i + s[0]
n = s[1]
if n == 0:
n = 256
s = self.fp.read(n * 3)
for n in range(0, len(s), 3):
r = s[n] << shift
g = s[n + 1] << shift
b = s[n + 2] << shift
palette[i] = (r, g, b)
i += 1
def seek(self, frame: int) -> None:
if not self._seek_check(frame):
return
if frame < self.__frame:
self._seek(0)
for f in range(self.__frame + 1, frame + 1):
self._seek(f)
def _seek(self, frame: int) -> None:
if frame == 0:
self.__frame = -1
self._fp.seek(self.__rewind)
self.__offset = 128
else:
# ensure that the previous frame was loaded
self.load()
if frame != self.__frame + 1:
msg = f"cannot seek to frame {frame}"
raise ValueError(msg)
self.__frame = frame
# move to next frame
self.fp = self._fp
self.fp.seek(self.__offset)
s = self.fp.read(4)
if not s:
msg = "missing frame size"
raise EOFError(msg)
framesize = i32(s)
self.decodermaxblock = framesize
self.tile = [ImageFile._Tile("fli", (0, 0) + self.size, self.__offset)]
self.__offset += framesize
def tell(self) -> int:
return self.__frame
#
# registry
Image.register_open(FliImageFile.format, FliImageFile, _accept)
Image.register_extensions(FliImageFile.format, [".fli", ".flc"])
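# A minimal usage sketch, not part of the original plugin: assuming Pillow is
# installed, the frames of an FLI/FLC animation can be walked with
# ImageSequence (which drives the seek()/tell() methods above). The filename
# "anim.fli" is a placeholder.
if __name__ == "__main__":
    from PIL import ImageSequence

    with Image.open("anim.fli") as im:
        print(im.n_frames, "frames,", im.info["duration"], "ms per frame")
        for frame in ImageSequence.Iterator(im):
            frame.convert("RGB").save(f"frame-{im.tell():03d}.png")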


@ -0,0 +1,134 @@
#
# The Python Imaging Library
# $Id$
#
# base class for raster font file parsers
#
# history:
# 1997-06-05 fl created
# 1997-08-19 fl restrict image width
#
# Copyright (c) 1997-1998 by Secret Labs AB
# Copyright (c) 1997-1998 by Fredrik Lundh
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
import os
from typing import BinaryIO
from . import Image, _binary
WIDTH = 800
def puti16(
fp: BinaryIO, values: tuple[int, int, int, int, int, int, int, int, int, int]
) -> None:
"""Write network order (big-endian) 16-bit sequence"""
for v in values:
if v < 0:
v += 65536
fp.write(_binary.o16be(v))
class FontFile:
"""Base class for raster font file handlers."""
bitmap: Image.Image | None = None
def __init__(self) -> None:
self.info: dict[bytes, bytes | int] = {}
self.glyph: list[
tuple[
tuple[int, int],
tuple[int, int, int, int],
tuple[int, int, int, int],
Image.Image,
]
| None
] = [None] * 256
def __getitem__(self, ix: int) -> (
tuple[
tuple[int, int],
tuple[int, int, int, int],
tuple[int, int, int, int],
Image.Image,
]
| None
):
return self.glyph[ix]
def compile(self) -> None:
"""Create metrics and bitmap"""
if self.bitmap:
return
# create bitmap large enough to hold all data
h = w = maxwidth = 0
lines = 1
for glyph in self.glyph:
if glyph:
d, dst, src, im = glyph
h = max(h, src[3] - src[1])
w = w + (src[2] - src[0])
if w > WIDTH:
lines += 1
w = src[2] - src[0]
maxwidth = max(maxwidth, w)
xsize = maxwidth
ysize = lines * h
if xsize == 0 and ysize == 0:
return
self.ysize = h
# paste glyphs into bitmap
self.bitmap = Image.new("1", (xsize, ysize))
self.metrics: list[
tuple[tuple[int, int], tuple[int, int, int, int], tuple[int, int, int, int]]
| None
] = [None] * 256
x = y = 0
for i in range(256):
glyph = self[i]
if glyph:
d, dst, src, im = glyph
xx = src[2] - src[0]
x0, y0 = x, y
x = x + xx
if x > WIDTH:
x, y = 0, y + h
x0, y0 = x, y
x = xx
s = src[0] + x0, src[1] + y0, src[2] + x0, src[3] + y0
self.bitmap.paste(im.crop(src), s)
self.metrics[i] = d, dst, s
def save(self, filename: str) -> None:
"""Save font"""
self.compile()
# font data
if not self.bitmap:
msg = "No bitmap created"
raise ValueError(msg)
self.bitmap.save(os.path.splitext(filename)[0] + ".pbm", "PNG")
# font metrics
with open(os.path.splitext(filename)[0] + ".pil", "wb") as fp:
fp.write(b"PILfont\n")
fp.write(f";;;;;;{self.ysize};\n".encode("ascii")) # HACK!!!
fp.write(b"DATA\n")
for id in range(256):
m = self.metrics[id]
if not m:
puti16(fp, (0,) * 10)
else:
puti16(fp, m[0] + m[1] + m[2])
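# An equivalence sketch, not part of the original module: adding 65536 to a
# negative value before writing, as puti16() does, matches packing the value
# as an unsigned big-endian 16-bit integer with two's-complement wrapping.
if __name__ == "__main__":
    import io
    import struct

    values = (-1, 2, -3, 4, -5, 6, -7, 8, -9, 10)
    buf = io.BytesIO()
    puti16(buf, values)
    assert buf.getvalue() == struct.pack(">10H", *(v & 0xFFFF for v in values))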


@ -0,0 +1,257 @@
#
# THIS IS WORK IN PROGRESS
#
# The Python Imaging Library.
# $Id$
#
# FlashPix support for PIL
#
# History:
# 97-01-25 fl Created (reads uncompressed RGB images only)
#
# Copyright (c) Secret Labs AB 1997.
# Copyright (c) Fredrik Lundh 1997.
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
import olefile
from . import Image, ImageFile
from ._binary import i32le as i32
# we map from colour field tuples to (mode, rawmode) descriptors
MODES = {
# opacity
(0x00007FFE,): ("A", "L"),
# monochrome
(0x00010000,): ("L", "L"),
(0x00018000, 0x00017FFE): ("RGBA", "LA"),
# photo YCC
(0x00020000, 0x00020001, 0x00020002): ("RGB", "YCC;P"),
(0x00028000, 0x00028001, 0x00028002, 0x00027FFE): ("RGBA", "YCCA;P"),
# standard RGB (NIFRGB)
(0x00030000, 0x00030001, 0x00030002): ("RGB", "RGB"),
(0x00038000, 0x00038001, 0x00038002, 0x00037FFE): ("RGBA", "RGBA"),
}
#
# --------------------------------------------------------------------
def _accept(prefix: bytes) -> bool:
return prefix[:8] == olefile.MAGIC
##
# Image plugin for the FlashPix images.
class FpxImageFile(ImageFile.ImageFile):
format = "FPX"
format_description = "FlashPix"
def _open(self) -> None:
#
# read the OLE directory and see if this is a likely
# to be a FlashPix file
try:
self.ole = olefile.OleFileIO(self.fp)
except OSError as e:
msg = "not an FPX file; invalid OLE file"
raise SyntaxError(msg) from e
root = self.ole.root
if not root or root.clsid != "56616700-C154-11CE-8553-00AA00A1F95B":
msg = "not an FPX file; bad root CLSID"
raise SyntaxError(msg)
self._open_index(1)
def _open_index(self, index: int = 1) -> None:
#
# get the Image Contents Property Set
prop = self.ole.getproperties(
[f"Data Object Store {index:06d}", "\005Image Contents"]
)
# size (highest resolution)
assert isinstance(prop[0x1000002], int)
assert isinstance(prop[0x1000003], int)
self._size = prop[0x1000002], prop[0x1000003]
size = max(self.size)
i = 1
while size > 64:
size = size // 2
i += 1
self.maxid = i - 1
# mode. instead of using a single field for this, flashpix
# requires you to specify the mode for each channel in each
# resolution subimage, and leaves it to the decoder to make
# sure that they all match. for now, we'll cheat and assume
# that this is always the case.
id = self.maxid << 16
s = prop[0x2000002 | id]
if not isinstance(s, bytes) or (bands := i32(s, 4)) > 4:
msg = "Invalid number of bands"
raise OSError(msg)
# note: for now, we ignore the "uncalibrated" flag
colors = tuple(i32(s, 8 + i * 4) & 0x7FFFFFFF for i in range(bands))
self._mode, self.rawmode = MODES[colors]
# load JPEG tables, if any
self.jpeg = {}
for i in range(256):
id = 0x3000001 | (i << 16)
if id in prop:
self.jpeg[i] = prop[id]
self._open_subimage(1, self.maxid)
def _open_subimage(self, index: int = 1, subimage: int = 0) -> None:
#
# setup tile descriptors for a given subimage
stream = [
f"Data Object Store {index:06d}",
f"Resolution {subimage:04d}",
"Subimage 0000 Header",
]
fp = self.ole.openstream(stream)
# skip prefix
fp.read(28)
# header stream
s = fp.read(36)
size = i32(s, 4), i32(s, 8)
# tilecount = i32(s, 12)
tilesize = i32(s, 16), i32(s, 20)
# channels = i32(s, 24)
offset = i32(s, 28)
length = i32(s, 32)
if size != self.size:
msg = "subimage mismatch"
raise OSError(msg)
# get tile descriptors
fp.seek(28 + offset)
s = fp.read(i32(s, 12) * length)
x = y = 0
xsize, ysize = size
xtile, ytile = tilesize
self.tile = []
for i in range(0, len(s), length):
x1 = min(xsize, x + xtile)
y1 = min(ysize, y + ytile)
compression = i32(s, i + 8)
if compression == 0:
self.tile.append(
ImageFile._Tile(
"raw",
(x, y, x1, y1),
i32(s, i) + 28,
self.rawmode,
)
)
elif compression == 1:
# FIXME: the fill decoder is not implemented
self.tile.append(
ImageFile._Tile(
"fill",
(x, y, x1, y1),
i32(s, i) + 28,
(self.rawmode, s[12:16]),
)
)
elif compression == 2:
internal_color_conversion = s[14]
jpeg_tables = s[15]
rawmode = self.rawmode
if internal_color_conversion:
# The image is stored as usual (usually YCbCr).
if rawmode == "RGBA":
# For "RGBA", data is stored as YCbCrA based on
# negative RGB. The following trick works around
# this problem:
jpegmode, rawmode = "YCbCrK", "CMYK"
else:
jpegmode = None # let the decoder decide
else:
# The image is stored as defined by rawmode
jpegmode = rawmode
self.tile.append(
ImageFile._Tile(
"jpeg",
(x, y, x1, y1),
i32(s, i) + 28,
(rawmode, jpegmode),
)
)
# FIXME: jpeg tables are tile dependent; the prefix
# data must be placed in the tile descriptor itself!
if jpeg_tables:
self.tile_prefix = self.jpeg[jpeg_tables]
else:
msg = "unknown/invalid compression"
raise OSError(msg)
x = x + xtile
if x >= xsize:
x, y = 0, y + ytile
if y >= ysize:
break # isn't really required
self.stream = stream
self._fp = self.fp
self.fp = None
def load(self) -> Image.core.PixelAccess | None:
if not self.fp:
self.fp = self.ole.openstream(self.stream[:2] + ["Subimage 0000 Data"])
return ImageFile.ImageFile.load(self)
def close(self) -> None:
self.ole.close()
super().close()
def __exit__(self, *args: object) -> None:
self.ole.close()
super().__exit__()
#
# --------------------------------------------------------------------
Image.register_open(FpxImageFile.format, FpxImageFile, _accept)
Image.register_extension(FpxImageFile.format, ".fpx")


@ -0,0 +1,115 @@
"""
A Pillow loader for .ftc and .ftu files (FTEX)
Jerome Leclanche <jerome@leclan.ch>
The contents of this file are hereby released in the public domain (CC0)
Full text of the CC0 license:
https://creativecommons.org/publicdomain/zero/1.0/
Independence War 2: Edge Of Chaos - Texture File Format - 16 October 2001
The textures used for 3D objects in Independence War 2: Edge Of Chaos are in a
packed custom format called FTEX. This file format uses file extensions FTC
and FTU.
* FTC files are compressed textures (using standard texture compression).
* FTU files are not compressed.
Texture File Format
The FTC and FTU texture files both use the same format. This
has the following structure:
{header}
{format_directory}
{data}
Where:
{header} = {
u32:magic,
u32:version,
u32:width,
u32:height,
u32:mipmap_count,
u32:format_count
}
* The "magic" number is "FTEX".
* "width" and "height" are the dimensions of the texture.
* "mipmap_count" is the number of mipmaps in the texture.
* "format_count" is the number of texture formats (different versions of the
same texture) in this file.
{format_directory} = format_count * { u32:format, u32:where }
The format value is 0 for DXT1 compressed textures and 1 for 24-bit RGB
uncompressed textures.
The texture data for a format starts at the position "where" in the file.
Each set of texture data in the file has the following structure:
{data} = format_count * { u32:mipmap_size, mipmap_size * { u8 } }
* "mipmap_size" is the number of bytes in that mip level. For compressed
textures this is the size of the texture data compressed with DXT1. For 24 bit
uncompressed textures, this is 3 * width * height. Following this are the image
bytes for that mipmap level.
Note: All data is stored in little-endian (Intel) byte order.
"""
from __future__ import annotations
import struct
from enum import IntEnum
from io import BytesIO
from . import Image, ImageFile
MAGIC = b"FTEX"
class Format(IntEnum):
DXT1 = 0
UNCOMPRESSED = 1
class FtexImageFile(ImageFile.ImageFile):
format = "FTEX"
format_description = "Texture File Format (IW2:EOC)"
def _open(self) -> None:
if not _accept(self.fp.read(4)):
msg = "not an FTEX file"
raise SyntaxError(msg)
struct.unpack("<i", self.fp.read(4)) # version
self._size = struct.unpack("<2i", self.fp.read(8))
mipmap_count, format_count = struct.unpack("<2i", self.fp.read(8))
self._mode = "RGB"
# Only support single-format files.
# I don't know of any multi-format file.
assert format_count == 1
format, where = struct.unpack("<2i", self.fp.read(8))
self.fp.seek(where)
(mipmap_size,) = struct.unpack("<i", self.fp.read(4))
data = self.fp.read(mipmap_size)
if format == Format.DXT1:
self._mode = "RGBA"
self.tile = [ImageFile._Tile("bcn", (0, 0) + self.size, 0, (1,))]
elif format == Format.UNCOMPRESSED:
self.tile = [ImageFile._Tile("raw", (0, 0) + self.size, 0, "RGB")]
else:
msg = f"Invalid texture compression format: {repr(format)}"
raise ValueError(msg)
self.fp.close()
self.fp = BytesIO(data)
def load_seek(self, pos: int) -> None:
pass
def _accept(prefix: bytes) -> bool:
return prefix[:4] == MAGIC
Image.register_open(FtexImageFile.format, FtexImageFile, _accept)
Image.register_extensions(FtexImageFile.format, [".ftc", ".ftu"])
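# A parsing sketch, not part of the original plugin: the header layout from
# the module docstring can be unpacked directly with struct. The bytes below
# describe a hypothetical 256x128 texture with one mipmap and a single DXT1
# format record starting at offset 32.
if __name__ == "__main__":
    raw = MAGIC + struct.pack("<5i", 1, 256, 128, 1, 1) + struct.pack("<2i", 0, 32)
    magic, version, width, height, mipmap_count, format_count = struct.unpack(
        "<4s5i", raw[:24]
    )
    print(magic, version, (width, height), mipmap_count, format_count)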


@ -0,0 +1,103 @@
#
# The Python Imaging Library
#
# load a GIMP brush file
#
# History:
# 96-03-14 fl Created
# 16-01-08 es Version 2
#
# Copyright (c) Secret Labs AB 1997.
# Copyright (c) Fredrik Lundh 1996.
# Copyright (c) Eric Soroos 2016.
#
# See the README file for information on usage and redistribution.
#
#
# See https://github.com/GNOME/gimp/blob/mainline/devel-docs/gbr.txt for
# format documentation.
#
# This code Interprets version 1 and 2 .gbr files.
# Version 1 files are obsolete, and should not be used for new
# brushes.
# Version 2 files are saved by GIMP v2.8 (at least)
# Version 3 files have a format specifier of 18 for 16bit floats in
# the color depth field. This is currently unsupported by Pillow.
from __future__ import annotations
from . import Image, ImageFile
from ._binary import i32be as i32
def _accept(prefix: bytes) -> bool:
return len(prefix) >= 8 and i32(prefix, 0) >= 20 and i32(prefix, 4) in (1, 2)
##
# Image plugin for the GIMP brush format.
class GbrImageFile(ImageFile.ImageFile):
format = "GBR"
format_description = "GIMP brush file"
def _open(self) -> None:
header_size = i32(self.fp.read(4))
if header_size < 20:
msg = "not a GIMP brush"
raise SyntaxError(msg)
version = i32(self.fp.read(4))
if version not in (1, 2):
msg = f"Unsupported GIMP brush version: {version}"
raise SyntaxError(msg)
width = i32(self.fp.read(4))
height = i32(self.fp.read(4))
color_depth = i32(self.fp.read(4))
if width <= 0 or height <= 0:
msg = "not a GIMP brush"
raise SyntaxError(msg)
if color_depth not in (1, 4):
msg = f"Unsupported GIMP brush color depth: {color_depth}"
raise SyntaxError(msg)
if version == 1:
comment_length = header_size - 20
else:
comment_length = header_size - 28
magic_number = self.fp.read(4)
if magic_number != b"GIMP":
msg = "not a GIMP brush, bad magic number"
raise SyntaxError(msg)
self.info["spacing"] = i32(self.fp.read(4))
comment = self.fp.read(comment_length)[:-1]
if color_depth == 1:
self._mode = "L"
else:
self._mode = "RGBA"
self._size = width, height
self.info["comment"] = comment
# Image might not be small
Image._decompression_bomb_check(self.size)
# Data is an uncompressed block of w * h * bytes/pixel
self._data_size = width * height * color_depth
def load(self) -> Image.core.PixelAccess | None:
if self._im is None:
self.im = Image.core.new(self.mode, self.size)
self.frombytes(self.fp.read(self._data_size))
return Image.Image.load(self)
#
# registry
Image.register_open(GbrImageFile.format, GbrImageFile, _accept)
Image.register_extension(GbrImageFile.format, ".gbr")


@ -0,0 +1,102 @@
#
# The Python Imaging Library.
# $Id$
#
# GD file handling
#
# History:
# 1996-04-12 fl Created
#
# Copyright (c) 1997 by Secret Labs AB.
# Copyright (c) 1996 by Fredrik Lundh.
#
# See the README file for information on usage and redistribution.
#
"""
.. note::
This format cannot be automatically recognized, so the
class is not registered for use with :py:func:`PIL.Image.open()`. To open a
gd file, use the :py:func:`PIL.GdImageFile.open()` function instead.
.. warning::
THE GD FORMAT IS NOT DESIGNED FOR DATA INTERCHANGE. This
implementation is provided for convenience and demonstrational
purposes only.
"""
from __future__ import annotations
from typing import IO
from . import ImageFile, ImagePalette, UnidentifiedImageError
from ._binary import i16be as i16
from ._binary import i32be as i32
from ._typing import StrOrBytesPath
class GdImageFile(ImageFile.ImageFile):
"""
Image plugin for the GD uncompressed format. Note that this format
is not supported by the standard :py:func:`PIL.Image.open()` function. To use
this plugin, you have to import the :py:mod:`PIL.GdImageFile` module and
use the :py:func:`PIL.GdImageFile.open()` function.
"""
format = "GD"
format_description = "GD uncompressed images"
def _open(self) -> None:
# Header
assert self.fp is not None
s = self.fp.read(1037)
if i16(s) not in [65534, 65535]:
msg = "Not a valid GD 2.x .gd file"
raise SyntaxError(msg)
self._mode = "L" # FIXME: "P"
self._size = i16(s, 2), i16(s, 4)
true_color = s[6]
true_color_offset = 2 if true_color else 0
# transparency index
tindex = i32(s, 7 + true_color_offset)
if tindex < 256:
self.info["transparency"] = tindex
self.palette = ImagePalette.raw(
"XBGR", s[7 + true_color_offset + 4 : 7 + true_color_offset + 4 + 256 * 4]
)
self.tile = [
ImageFile._Tile(
"raw",
(0, 0) + self.size,
7 + true_color_offset + 4 + 256 * 4,
"L",
)
]
def open(fp: StrOrBytesPath | IO[bytes], mode: str = "r") -> GdImageFile:
"""
Load texture from a GD image file.
:param fp: GD file name, or an opened file handle.
:param mode: Optional mode. In this version, if the mode argument
is given, it must be "r".
:returns: An image instance.
:raises OSError: If the image could not be read.
"""
if mode != "r":
msg = "bad mode"
raise ValueError(msg)
try:
return GdImageFile(fp)
except SyntaxError as e:
msg = "cannot identify this image file"
raise UnidentifiedImageError(msg) from e
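# A minimal usage sketch, not part of the original module: because the GD
# format has no magic number and is not registered with PIL.Image.open(), a
# .gd file must be opened through this module's open() function, as the
# docstrings above explain. The filename "tile.gd" is a placeholder.
if __name__ == "__main__":
    from PIL import GdImageFile

    with GdImageFile.open("tile.gd") as im:
        print(im.size, im.info.get("transparency"))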

File diff suppressed because it is too large


@ -0,0 +1,149 @@
#
# Python Imaging Library
# $Id$
#
# stuff to read (and render) GIMP gradient files
#
# History:
# 97-08-23 fl Created
#
# Copyright (c) Secret Labs AB 1997.
# Copyright (c) Fredrik Lundh 1997.
#
# See the README file for information on usage and redistribution.
#
"""
Stuff to translate curve segments to palette values (derived from
the corresponding code in GIMP, written by Federico Mena Quintero.
See the GIMP distribution for more information.)
"""
from __future__ import annotations
from math import log, pi, sin, sqrt
from typing import IO, Callable
from ._binary import o8
EPSILON = 1e-10
"""""" # Enable auto-doc for data member
def linear(middle: float, pos: float) -> float:
if pos <= middle:
if middle < EPSILON:
return 0.0
else:
return 0.5 * pos / middle
else:
pos = pos - middle
middle = 1.0 - middle
if middle < EPSILON:
return 1.0
else:
return 0.5 + 0.5 * pos / middle
def curved(middle: float, pos: float) -> float:
return pos ** (log(0.5) / log(max(middle, EPSILON)))
def sine(middle: float, pos: float) -> float:
return (sin((-pi / 2.0) + pi * linear(middle, pos)) + 1.0) / 2.0
def sphere_increasing(middle: float, pos: float) -> float:
return sqrt(1.0 - (linear(middle, pos) - 1.0) ** 2)
def sphere_decreasing(middle: float, pos: float) -> float:
return 1.0 - sqrt(1.0 - linear(middle, pos) ** 2)
SEGMENTS = [linear, curved, sine, sphere_increasing, sphere_decreasing]
"""""" # Enable auto-doc for data member
class GradientFile:
gradient: (
list[
tuple[
float,
float,
float,
list[float],
list[float],
Callable[[float, float], float],
]
]
| None
) = None
def getpalette(self, entries: int = 256) -> tuple[bytes, str]:
assert self.gradient is not None
palette = []
ix = 0
x0, x1, xm, rgb0, rgb1, segment = self.gradient[ix]
for i in range(entries):
x = i / (entries - 1)
while x1 < x:
ix += 1
x0, x1, xm, rgb0, rgb1, segment = self.gradient[ix]
w = x1 - x0
if w < EPSILON:
scale = segment(0.5, 0.5)
else:
scale = segment((xm - x0) / w, (x - x0) / w)
# expand to RGBA
r = o8(int(255 * ((rgb1[0] - rgb0[0]) * scale + rgb0[0]) + 0.5))
g = o8(int(255 * ((rgb1[1] - rgb0[1]) * scale + rgb0[1]) + 0.5))
b = o8(int(255 * ((rgb1[2] - rgb0[2]) * scale + rgb0[2]) + 0.5))
a = o8(int(255 * ((rgb1[3] - rgb0[3]) * scale + rgb0[3]) + 0.5))
# add to palette
palette.append(r + g + b + a)
return b"".join(palette), "RGBA"
class GimpGradientFile(GradientFile):
"""File handler for GIMP's gradient format."""
def __init__(self, fp: IO[bytes]) -> None:
if fp.readline()[:13] != b"GIMP Gradient":
msg = "not a GIMP gradient file"
raise SyntaxError(msg)
line = fp.readline()
# GIMP 1.2 gradient files don't contain a name, but GIMP 1.3 files do
if line.startswith(b"Name: "):
line = fp.readline().strip()
count = int(line)
self.gradient = []
for i in range(count):
s = fp.readline().split()
w = [float(x) for x in s[:11]]
x0, x1 = w[0], w[2]
xm = w[1]
rgb0 = w[3:7]
rgb1 = w[7:11]
segment = SEGMENTS[int(s[11])]
cspace = int(s[12])
if cspace != 0:
msg = "cannot handle HSV colour space"
raise OSError(msg)
self.gradient.append((x0, x1, xm, rgb0, rgb1, segment))
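# A minimal usage sketch, not part of the original module: a .ggr gradient can
# be previewed by attaching the RGBA palette computed by getpalette() to a
# 256x32 "P" image whose pixel values index the palette left to right. The
# filename "sky.ggr" is a placeholder.
if __name__ == "__main__":
    from PIL import Image

    with open("sky.ggr", "rb") as fp:
        data, rawmode = GimpGradientFile(fp).getpalette()
    im = Image.new("P", (256, 32))
    im.putdata(list(range(256)) * 32)
    im.putpalette(data, rawmode)
    im.convert("RGBA").save("sky.png")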


@ -0,0 +1,58 @@
#
# Python Imaging Library
# $Id$
#
# stuff to read GIMP palette files
#
# History:
# 1997-08-23 fl Created
# 2004-09-07 fl Support GIMP 2.0 palette files.
#
# Copyright (c) Secret Labs AB 1997-2004. All rights reserved.
# Copyright (c) Fredrik Lundh 1997-2004.
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
import re
from typing import IO
from ._binary import o8
class GimpPaletteFile:
"""File handler for GIMP's palette format."""
rawmode = "RGB"
def __init__(self, fp: IO[bytes]) -> None:
palette = [o8(i) * 3 for i in range(256)]
if fp.readline()[:12] != b"GIMP Palette":
msg = "not a GIMP palette file"
raise SyntaxError(msg)
for i in range(256):
s = fp.readline()
if not s:
break
# skip fields and comment lines
if re.match(rb"\w+:|#", s):
continue
if len(s) > 100:
msg = "bad palette file"
raise SyntaxError(msg)
v = tuple(map(int, s.split()[:3]))
if len(v) != 3:
msg = "bad palette entry"
raise ValueError(msg)
palette[i] = o8(v[0]) + o8(v[1]) + o8(v[2])
self.palette = b"".join(palette)
def getpalette(self) -> tuple[bytes, str]:
return self.palette, self.rawmode


@ -0,0 +1,76 @@
#
# The Python Imaging Library
# $Id$
#
# GRIB stub adapter
#
# Copyright (c) 1996-2003 by Fredrik Lundh
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
from typing import IO
from . import Image, ImageFile
_handler = None
def register_handler(handler: ImageFile.StubHandler | None) -> None:
"""
Install application-specific GRIB image handler.
:param handler: Handler object.
"""
global _handler
_handler = handler
# --------------------------------------------------------------------
# Image adapter
def _accept(prefix: bytes) -> bool:
return prefix[:4] == b"GRIB" and prefix[7] == 1
class GribStubImageFile(ImageFile.StubImageFile):
format = "GRIB"
format_description = "GRIB"
def _open(self) -> None:
offset = self.fp.tell()
if not _accept(self.fp.read(8)):
msg = "Not a GRIB file"
raise SyntaxError(msg)
self.fp.seek(offset)
# make something up
self._mode = "F"
self._size = 1, 1
loader = self._load()
if loader:
loader.open(self)
def _load(self) -> ImageFile.StubHandler | None:
return _handler
def _save(im: Image.Image, fp: IO[bytes], filename: str | bytes) -> None:
if _handler is None or not hasattr(_handler, "save"):
msg = "GRIB save handler not installed"
raise OSError(msg)
_handler.save(im, fp, filename)
# --------------------------------------------------------------------
# Registry
Image.register_open(GribStubImageFile.format, GribStubImageFile, _accept)
Image.register_save(GribStubImageFile.format, _save)
Image.register_extension(GribStubImageFile.format, ".grib")
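# A handler sketch, not part of the original adapter: an application installs
# a StubHandler subclass whose open()/load() perform the real decoding. The
# stand-in below only returns an empty "F" image where a real handler would
# decode the GRIB data from im.fp.
if __name__ == "__main__":

    class EchoHandler(ImageFile.StubHandler):
        def open(self, im: ImageFile.StubImageFile) -> None:
            print("deferred open:", im.format)

        def load(self, im: ImageFile.StubImageFile) -> Image.Image:
            return Image.new("F", im.size)  # real decoding would happen here

    register_handler(EchoHandler())
    # After this, Image.open() on a .grib file would delegate to EchoHandler.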


@ -0,0 +1,76 @@
#
# The Python Imaging Library
# $Id$
#
# HDF5 stub adapter
#
# Copyright (c) 2000-2003 by Fredrik Lundh
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
from typing import IO
from . import Image, ImageFile
_handler = None
def register_handler(handler: ImageFile.StubHandler | None) -> None:
"""
Install application-specific HDF5 image handler.
:param handler: Handler object.
"""
global _handler
_handler = handler
# --------------------------------------------------------------------
# Image adapter
def _accept(prefix: bytes) -> bool:
return prefix[:8] == b"\x89HDF\r\n\x1a\n"
class HDF5StubImageFile(ImageFile.StubImageFile):
format = "HDF5"
format_description = "HDF5"
def _open(self) -> None:
offset = self.fp.tell()
if not _accept(self.fp.read(8)):
msg = "Not an HDF file"
raise SyntaxError(msg)
self.fp.seek(offset)
# make something up
self._mode = "F"
self._size = 1, 1
loader = self._load()
if loader:
loader.open(self)
def _load(self) -> ImageFile.StubHandler | None:
return _handler
def _save(im: Image.Image, fp: IO[bytes], filename: str | bytes) -> None:
if _handler is None or not hasattr(_handler, "save"):
msg = "HDF5 save handler not installed"
raise OSError(msg)
_handler.save(im, fp, filename)
# --------------------------------------------------------------------
# Registry
Image.register_open(HDF5StubImageFile.format, HDF5StubImageFile, _accept)
Image.register_save(HDF5StubImageFile.format, _save)
Image.register_extensions(HDF5StubImageFile.format, [".h5", ".hdf"])


@ -0,0 +1,412 @@
#
# The Python Imaging Library.
# $Id$
#
# macOS icns file decoder, based on icns.py by Bob Ippolito.
#
# history:
# 2004-10-09 fl Turned into a PIL plugin; removed 2.3 dependencies.
# 2020-04-04 Allow saving on all operating systems.
#
# Copyright (c) 2004 by Bob Ippolito.
# Copyright (c) 2004 by Secret Labs.
# Copyright (c) 2004 by Fredrik Lundh.
# Copyright (c) 2014 by Alastair Houghton.
# Copyright (c) 2020 by Pan Jing.
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
import io
import os
import struct
import sys
from typing import IO
from . import Image, ImageFile, PngImagePlugin, features
from ._deprecate import deprecate
enable_jpeg2k = features.check_codec("jpg_2000")
if enable_jpeg2k:
from . import Jpeg2KImagePlugin
MAGIC = b"icns"
HEADERSIZE = 8
def nextheader(fobj: IO[bytes]) -> tuple[bytes, int]:
return struct.unpack(">4sI", fobj.read(HEADERSIZE))
def read_32t(
fobj: IO[bytes], start_length: tuple[int, int], size: tuple[int, int, int]
) -> dict[str, Image.Image]:
# The 128x128 icon seems to have an extra header for some reason.
(start, length) = start_length
fobj.seek(start)
sig = fobj.read(4)
if sig != b"\x00\x00\x00\x00":
msg = "Unknown signature, expecting 0x00000000"
raise SyntaxError(msg)
return read_32(fobj, (start + 4, length - 4), size)
def read_32(
fobj: IO[bytes], start_length: tuple[int, int], size: tuple[int, int, int]
) -> dict[str, Image.Image]:
"""
Read a 32bit RGB icon resource. Seems to be either uncompressed or
an RLE packbits-like scheme.
"""
(start, length) = start_length
fobj.seek(start)
pixel_size = (size[0] * size[2], size[1] * size[2])
sizesq = pixel_size[0] * pixel_size[1]
if length == sizesq * 3:
# uncompressed ("RGBRGBGB")
indata = fobj.read(length)
im = Image.frombuffer("RGB", pixel_size, indata, "raw", "RGB", 0, 1)
else:
# decode image
im = Image.new("RGB", pixel_size, None)
for band_ix in range(3):
data = []
bytesleft = sizesq
while bytesleft > 0:
byte = fobj.read(1)
if not byte:
break
byte_int = byte[0]
if byte_int & 0x80:
blocksize = byte_int - 125
byte = fobj.read(1)
for i in range(blocksize):
data.append(byte)
else:
blocksize = byte_int + 1
data.append(fobj.read(blocksize))
bytesleft -= blocksize
if bytesleft <= 0:
break
if bytesleft != 0:
msg = f"Error reading channel [{repr(bytesleft)} left]"
raise SyntaxError(msg)
band = Image.frombuffer("L", pixel_size, b"".join(data), "raw", "L", 0, 1)
im.im.putband(band.im, band_ix)
return {"RGB": im}
def read_mk(
fobj: IO[bytes], start_length: tuple[int, int], size: tuple[int, int, int]
) -> dict[str, Image.Image]:
# Alpha masks seem to be uncompressed
start = start_length[0]
fobj.seek(start)
pixel_size = (size[0] * size[2], size[1] * size[2])
sizesq = pixel_size[0] * pixel_size[1]
band = Image.frombuffer("L", pixel_size, fobj.read(sizesq), "raw", "L", 0, 1)
return {"A": band}
def read_png_or_jpeg2000(
fobj: IO[bytes], start_length: tuple[int, int], size: tuple[int, int, int]
) -> dict[str, Image.Image]:
(start, length) = start_length
fobj.seek(start)
sig = fobj.read(12)
im: Image.Image
if sig[:8] == b"\x89PNG\x0d\x0a\x1a\x0a":
fobj.seek(start)
im = PngImagePlugin.PngImageFile(fobj)
Image._decompression_bomb_check(im.size)
return {"RGBA": im}
elif (
sig[:4] == b"\xff\x4f\xff\x51"
or sig[:4] == b"\x0d\x0a\x87\x0a"
or sig == b"\x00\x00\x00\x0cjP \x0d\x0a\x87\x0a"
):
if not enable_jpeg2k:
msg = (
"Unsupported icon subimage format (rebuild PIL "
"with JPEG 2000 support to fix this)"
)
raise ValueError(msg)
# j2k, jpc or j2c
fobj.seek(start)
jp2kstream = fobj.read(length)
f = io.BytesIO(jp2kstream)
im = Jpeg2KImagePlugin.Jpeg2KImageFile(f)
Image._decompression_bomb_check(im.size)
if im.mode != "RGBA":
im = im.convert("RGBA")
return {"RGBA": im}
else:
msg = "Unsupported icon subimage format"
raise ValueError(msg)
class IcnsFile:
SIZES = {
(512, 512, 2): [(b"ic10", read_png_or_jpeg2000)],
(512, 512, 1): [(b"ic09", read_png_or_jpeg2000)],
(256, 256, 2): [(b"ic14", read_png_or_jpeg2000)],
(256, 256, 1): [(b"ic08", read_png_or_jpeg2000)],
(128, 128, 2): [(b"ic13", read_png_or_jpeg2000)],
(128, 128, 1): [
(b"ic07", read_png_or_jpeg2000),
(b"it32", read_32t),
(b"t8mk", read_mk),
],
(64, 64, 1): [(b"icp6", read_png_or_jpeg2000)],
(32, 32, 2): [(b"ic12", read_png_or_jpeg2000)],
(48, 48, 1): [(b"ih32", read_32), (b"h8mk", read_mk)],
(32, 32, 1): [
(b"icp5", read_png_or_jpeg2000),
(b"il32", read_32),
(b"l8mk", read_mk),
],
(16, 16, 2): [(b"ic11", read_png_or_jpeg2000)],
(16, 16, 1): [
(b"icp4", read_png_or_jpeg2000),
(b"is32", read_32),
(b"s8mk", read_mk),
],
}
def __init__(self, fobj: IO[bytes]) -> None:
"""
fobj is a file-like object as an icns resource
"""
# signature : (start, length)
self.dct = {}
self.fobj = fobj
sig, filesize = nextheader(fobj)
if not _accept(sig):
msg = "not an icns file"
raise SyntaxError(msg)
i = HEADERSIZE
while i < filesize:
sig, blocksize = nextheader(fobj)
if blocksize <= 0:
msg = "invalid block header"
raise SyntaxError(msg)
i += HEADERSIZE
blocksize -= HEADERSIZE
self.dct[sig] = (i, blocksize)
fobj.seek(blocksize, io.SEEK_CUR)
i += blocksize
def itersizes(self) -> list[tuple[int, int, int]]:
sizes = []
for size, fmts in self.SIZES.items():
for fmt, reader in fmts:
if fmt in self.dct:
sizes.append(size)
break
return sizes
def bestsize(self) -> tuple[int, int, int]:
sizes = self.itersizes()
if not sizes:
msg = "No 32bit icon resources found"
raise SyntaxError(msg)
return max(sizes)
def dataforsize(self, size: tuple[int, int, int]) -> dict[str, Image.Image]:
"""
Get an icon resource as {channel: array}. Note that
the arrays are bottom-up like windows bitmaps and will likely
need to be flipped or transposed in some way.
"""
dct = {}
for code, reader in self.SIZES[size]:
desc = self.dct.get(code)
if desc is not None:
dct.update(reader(self.fobj, desc, size))
return dct
def getimage(
self, size: tuple[int, int] | tuple[int, int, int] | None = None
) -> Image.Image:
if size is None:
size = self.bestsize()
elif len(size) == 2:
size = (size[0], size[1], 1)
channels = self.dataforsize(size)
im = channels.get("RGBA")
if im:
return im
im = channels["RGB"].copy()
try:
im.putalpha(channels["A"])
except KeyError:
pass
return im
##
# Image plugin for Mac OS icons.
class IcnsImageFile(ImageFile.ImageFile):
"""
PIL image support for Mac OS .icns files.
Chooses the best resolution, but will possibly load
a different size image if you mutate the size attribute
before calling 'load'.
The info dictionary has a key 'sizes' that is a list
of sizes that the icns file has.
"""
format = "ICNS"
format_description = "Mac OS icns resource"
def _open(self) -> None:
self.icns = IcnsFile(self.fp)
self._mode = "RGBA"
self.info["sizes"] = self.icns.itersizes()
self.best_size = self.icns.bestsize()
self.size = (
self.best_size[0] * self.best_size[2],
self.best_size[1] * self.best_size[2],
)
@property # type: ignore[override]
def size(self) -> tuple[int, int] | tuple[int, int, int]:
return self._size
@size.setter
def size(self, value: tuple[int, int] | tuple[int, int, int]) -> None:
if len(value) == 3:
deprecate("Setting size to (width, height, scale)", 12, "load(scale)")
if value in self.info["sizes"]:
self._size = value # type: ignore[assignment]
return
else:
# Check that a matching size exists,
# or that there is a scale that would create a size that matches
for size in self.info["sizes"]:
simple_size = size[0] * size[2], size[1] * size[2]
scale = simple_size[0] // value[0]
if simple_size[1] / value[1] == scale:
self._size = value
return
msg = "This is not one of the allowed sizes of this image"
raise ValueError(msg)
def load(self, scale: int | None = None) -> Image.core.PixelAccess | None:
if scale is not None or len(self.size) == 3:
if scale is None and len(self.size) == 3:
scale = self.size[2]
assert scale is not None
width, height = self.size[:2]
self.size = width * scale, height * scale
self.best_size = width, height, scale
px = Image.Image.load(self)
if self._im is not None and self.im.size == self.size:
# Already loaded
return px
self.load_prepare()
# This is likely NOT the best way to do it, but whatever.
im = self.icns.getimage(self.best_size)
# If this is a PNG or JPEG 2000, it won't be loaded yet
px = im.load()
self.im = im.im
self._mode = im.mode
self.size = im.size
return px
def _save(im: Image.Image, fp: IO[bytes], filename: str | bytes) -> None:
"""
Saves the image as a series of PNG files,
that are then combined into a .icns file.
"""
if hasattr(fp, "flush"):
fp.flush()
sizes = {
b"ic07": 128,
b"ic08": 256,
b"ic09": 512,
b"ic10": 1024,
b"ic11": 32,
b"ic12": 64,
b"ic13": 256,
b"ic14": 512,
}
provided_images = {im.width: im for im in im.encoderinfo.get("append_images", [])}
size_streams = {}
for size in set(sizes.values()):
image = (
provided_images[size]
if size in provided_images
else im.resize((size, size))
)
temp = io.BytesIO()
image.save(temp, "png")
size_streams[size] = temp.getvalue()
entries = []
for type, size in sizes.items():
stream = size_streams[size]
entries.append((type, HEADERSIZE + len(stream), stream))
# Header
fp.write(MAGIC)
file_length = HEADERSIZE # Header
file_length += HEADERSIZE + 8 * len(entries) # TOC
file_length += sum(entry[1] for entry in entries)
fp.write(struct.pack(">i", file_length))
# TOC
fp.write(b"TOC ")
fp.write(struct.pack(">i", HEADERSIZE + len(entries) * HEADERSIZE))
for entry in entries:
fp.write(entry[0])
fp.write(struct.pack(">i", entry[1]))
# Data
for entry in entries:
fp.write(entry[0])
fp.write(struct.pack(">i", entry[1]))
fp.write(entry[2])
if hasattr(fp, "flush"):
fp.flush()
def _accept(prefix: bytes) -> bool:
return prefix[:4] == MAGIC
Image.register_open(IcnsImageFile.format, IcnsImageFile, _accept)
Image.register_extension(IcnsImageFile.format, ".icns")
Image.register_save(IcnsImageFile.format, _save)
Image.register_mime(IcnsImageFile.format, "image/icns")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Syntax: python3 IcnsImagePlugin.py [file]")
sys.exit()
with open(sys.argv[1], "rb") as fp:
imf = IcnsImageFile(fp)
for size in imf.info["sizes"]:
width, height, scale = imf.size = size
imf.save(f"out-{width}-{height}-{scale}.png")
with Image.open(sys.argv[1]) as im:
im.save("out.png")
if sys.platform == "windows":
os.startfile("out.png")


@ -0,0 +1,381 @@
#
# The Python Imaging Library.
# $Id$
#
# Windows Icon support for PIL
#
# History:
# 96-05-27 fl Created
#
# Copyright (c) Secret Labs AB 1997.
# Copyright (c) Fredrik Lundh 1996.
#
# See the README file for information on usage and redistribution.
#
# This plugin is a refactored version of Win32IconImagePlugin by Bryan Davis
# <casadebender@gmail.com>.
# https://code.google.com/archive/p/casadebender/wikis/Win32IconImagePlugin.wiki
#
# Icon format references:
# * https://en.wikipedia.org/wiki/ICO_(file_format)
# * https://msdn.microsoft.com/en-us/library/ms997538.aspx
from __future__ import annotations
import warnings
from io import BytesIO
from math import ceil, log
from typing import IO, NamedTuple
from . import BmpImagePlugin, Image, ImageFile, PngImagePlugin
from ._binary import i16le as i16
from ._binary import i32le as i32
from ._binary import o8
from ._binary import o16le as o16
from ._binary import o32le as o32
#
# --------------------------------------------------------------------
_MAGIC = b"\0\0\1\0"
def _save(im: Image.Image, fp: IO[bytes], filename: str | bytes) -> None:
fp.write(_MAGIC) # (2+2)
bmp = im.encoderinfo.get("bitmap_format") == "bmp"
sizes = im.encoderinfo.get(
"sizes",
[(16, 16), (24, 24), (32, 32), (48, 48), (64, 64), (128, 128), (256, 256)],
)
frames = []
provided_ims = [im] + im.encoderinfo.get("append_images", [])
width, height = im.size
for size in sorted(set(sizes)):
if size[0] > width or size[1] > height or size[0] > 256 or size[1] > 256:
continue
for provided_im in provided_ims:
if provided_im.size != size:
continue
frames.append(provided_im)
if bmp:
bits = BmpImagePlugin.SAVE[provided_im.mode][1]
bits_used = [bits]
for other_im in provided_ims:
if other_im.size != size:
continue
bits = BmpImagePlugin.SAVE[other_im.mode][1]
if bits not in bits_used:
# Another image has been supplied for this size
# with a different bit depth
frames.append(other_im)
bits_used.append(bits)
break
else:
# TODO: invent a more convenient method for proportional scalings
frame = provided_im.copy()
frame.thumbnail(size, Image.Resampling.LANCZOS, reducing_gap=None)
frames.append(frame)
fp.write(o16(len(frames))) # idCount(2)
offset = fp.tell() + len(frames) * 16
for frame in frames:
width, height = frame.size
# 0 means 256
fp.write(o8(width if width < 256 else 0)) # bWidth(1)
fp.write(o8(height if height < 256 else 0)) # bHeight(1)
bits, colors = BmpImagePlugin.SAVE[frame.mode][1:] if bmp else (32, 0)
fp.write(o8(colors)) # bColorCount(1)
fp.write(b"\0") # bReserved(1)
fp.write(b"\0\0") # wPlanes(2)
fp.write(o16(bits)) # wBitCount(2)
image_io = BytesIO()
if bmp:
frame.save(image_io, "dib")
if bits != 32:
and_mask = Image.new("1", size)
ImageFile._save(
and_mask,
image_io,
[ImageFile._Tile("raw", (0, 0) + size, 0, ("1", 0, -1))],
)
else:
frame.save(image_io, "png")
image_io.seek(0)
image_bytes = image_io.read()
if bmp:
image_bytes = image_bytes[:8] + o32(height * 2) + image_bytes[12:]
bytes_len = len(image_bytes)
fp.write(o32(bytes_len)) # dwBytesInRes(4)
fp.write(o32(offset)) # dwImageOffset(4)
current = fp.tell()
fp.seek(offset)
fp.write(image_bytes)
offset = offset + bytes_len
fp.seek(current)
def _accept(prefix: bytes) -> bool:
return prefix[:4] == _MAGIC
class IconHeader(NamedTuple):
width: int
height: int
nb_color: int
reserved: int
planes: int
bpp: int
size: int
offset: int
dim: tuple[int, int]
square: int
color_depth: int
class IcoFile:
def __init__(self, buf: IO[bytes]) -> None:
"""
Parse image from file-like object containing ico file data
"""
# check magic
s = buf.read(6)
if not _accept(s):
msg = "not an ICO file"
raise SyntaxError(msg)
self.buf = buf
self.entry = []
# Number of items in file
self.nb_items = i16(s, 4)
# Get headers for each item
for i in range(self.nb_items):
s = buf.read(16)
# See Wikipedia
width = s[0] or 256
height = s[1] or 256
# No. of colors in image (0 if >=8bpp)
nb_color = s[2]
bpp = i16(s, 6)
icon_header = IconHeader(
width=width,
height=height,
nb_color=nb_color,
reserved=s[3],
planes=i16(s, 4),
bpp=i16(s, 6),
size=i32(s, 8),
offset=i32(s, 12),
dim=(width, height),
square=width * height,
# See Wikipedia notes about color depth.
                # We need this just to distinguish images with equal sizes
color_depth=bpp or (nb_color != 0 and ceil(log(nb_color, 2))) or 256,
)
self.entry.append(icon_header)
self.entry = sorted(self.entry, key=lambda x: x.color_depth)
# ICO images are usually squares
self.entry = sorted(self.entry, key=lambda x: x.square, reverse=True)
def sizes(self) -> set[tuple[int, int]]:
"""
Get a set of all available icon sizes and color depths.
"""
return {(h.width, h.height) for h in self.entry}
def getentryindex(self, size: tuple[int, int], bpp: int | bool = False) -> int:
for i, h in enumerate(self.entry):
if size == h.dim and (bpp is False or bpp == h.color_depth):
return i
return 0
def getimage(self, size: tuple[int, int], bpp: int | bool = False) -> Image.Image:
"""
Get an image from the icon
"""
return self.frame(self.getentryindex(size, bpp))
def frame(self, idx: int) -> Image.Image:
"""
Get an image from frame idx
"""
header = self.entry[idx]
self.buf.seek(header.offset)
data = self.buf.read(8)
self.buf.seek(header.offset)
im: Image.Image
if data[:8] == PngImagePlugin._MAGIC:
# png frame
im = PngImagePlugin.PngImageFile(self.buf)
Image._decompression_bomb_check(im.size)
else:
# XOR + AND mask bmp frame
im = BmpImagePlugin.DibImageFile(self.buf)
Image._decompression_bomb_check(im.size)
# change tile dimension to only encompass XOR image
im._size = (im.size[0], int(im.size[1] / 2))
d, e, o, a = im.tile[0]
im.tile[0] = ImageFile._Tile(d, (0, 0) + im.size, o, a)
# figure out where AND mask image starts
if header.bpp == 32:
# 32-bit color depth icon image allows semitransparent areas
# PIL's DIB format ignores transparency bits, recover them.
# The DIB is packed in BGRX byte order where X is the alpha
# channel.
# Back up to start of bmp data
self.buf.seek(o)
            # extract every 4th byte (e.g. 3, 7, 11, 15, ...)
alpha_bytes = self.buf.read(im.size[0] * im.size[1] * 4)[3::4]
# convert to an 8bpp grayscale image
try:
mask = Image.frombuffer(
"L", # 8bpp
im.size, # (w, h)
alpha_bytes, # source chars
"raw", # raw decoder
("L", 0, -1), # 8bpp inverted, unpadded, reversed
)
except ValueError:
if ImageFile.LOAD_TRUNCATED_IMAGES:
mask = None
else:
raise
else:
# get AND image from end of bitmap
w = im.size[0]
if (w % 32) > 0:
# bitmap row data is aligned to word boundaries
w += 32 - (im.size[0] % 32)
# the total mask data is
# padded row size * height / bits per char
total_bytes = int((w * im.size[1]) / 8)
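            # e.g. a 24 px wide frame pads each row to 32 bits:
            # w becomes 32, so total_bytes = 32 * height // 8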
and_mask_offset = header.offset + header.size - total_bytes
self.buf.seek(and_mask_offset)
mask_data = self.buf.read(total_bytes)
# convert raw data to image
try:
mask = Image.frombuffer(
"1", # 1 bpp
im.size, # (w, h)
mask_data, # source chars
"raw", # raw decoder
("1;I", int(w / 8), -1), # 1bpp inverted, padded, reversed
)
except ValueError:
if ImageFile.LOAD_TRUNCATED_IMAGES:
mask = None
else:
raise
# now we have two images, im is XOR image and mask is AND image
# apply mask image as alpha channel
if mask:
im = im.convert("RGBA")
im.putalpha(mask)
return im
##
# Image plugin for Windows Icon files.
class IcoImageFile(ImageFile.ImageFile):
"""
PIL read-only image support for Microsoft Windows .ico files.
By default the largest resolution image in the file will be loaded. This
can be changed by altering the 'size' attribute before calling 'load'.
The info dictionary has a key 'sizes' that is a list of the sizes available
in the icon file.
Handles classic, XP and Vista icon formats.
When saving, PNG compression is used. Support for this was only added in
Windows Vista. If you are unable to view the icon in Windows, convert the
image to "RGBA" mode before saving.
This plugin is a refactored version of Win32IconImagePlugin by Bryan Davis
<casadebender@gmail.com>.
https://code.google.com/archive/p/casadebender/wikis/Win32IconImagePlugin.wiki
"""
format = "ICO"
format_description = "Windows Icon"
def _open(self) -> None:
self.ico = IcoFile(self.fp)
self.info["sizes"] = self.ico.sizes()
self.size = self.ico.entry[0].dim
self.load()
@property
def size(self) -> tuple[int, int]:
return self._size
@size.setter
def size(self, value: tuple[int, int]) -> None:
if value not in self.info["sizes"]:
msg = "This is not one of the allowed sizes of this image"
raise ValueError(msg)
self._size = value
def load(self) -> Image.core.PixelAccess | None:
if self._im is not None and self.im.size == self.size:
# Already loaded
return Image.Image.load(self)
im = self.ico.getimage(self.size)
# if tile is PNG, it won't really be loaded yet
im.load()
self.im = im.im
self._mode = im.mode
if im.palette:
self.palette = im.palette
if im.size != self.size:
warnings.warn("Image was not the expected size")
index = self.ico.getentryindex(self.size)
sizes = list(self.info["sizes"])
sizes[index] = im.size
self.info["sizes"] = set(sizes)
self.size = im.size
return None
def load_seek(self, pos: int) -> None:
# Flag the ImageFile.Parser so that it
# just does all the decode at the end.
pass
#
# --------------------------------------------------------------------
Image.register_open(IcoImageFile.format, IcoImageFile, _accept)
Image.register_save(IcoImageFile.format, _save)
Image.register_extension(IcoImageFile.format, ".ico")
Image.register_mime(IcoImageFile.format, "image/x-icon")
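#
# --------------------------------------------------------------------
# Illustrative usage sketch (not part of the plugin): round-trips a
# multi-size icon in memory. The sizes and the fill color are arbitrary
# choices.
if __name__ == "__main__":
    img = Image.new("RGBA", (64, 64), "#3478f6")
    buffer = BytesIO()
    img.save(buffer, format="ICO", sizes=[(16, 16), (32, 32), (64, 64)])
    buffer.seek(0)
    ico = Image.open(buffer)
    print(ico.info["sizes"])  # {(16, 16), (32, 32), (64, 64)}
    ico.size = (16, 16)  # must be one of info["sizes"]
    ico.load()
    print(ico.size, ico.mode)  # (16, 16) RGBA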

@@ -0,0 +1,386 @@
#
# The Python Imaging Library.
# $Id$
#
# IFUNC IM file handling for PIL
#
# history:
# 1995-09-01 fl Created.
# 1997-01-03 fl Save palette images
# 1997-01-08 fl Added sequence support
# 1997-01-23 fl Added P and RGB save support
# 1997-05-31 fl Read floating point images
# 1997-06-22 fl Save floating point images
# 1997-08-27 fl Read and save 1-bit images
# 1998-06-25 fl Added support for RGB+LUT images
# 1998-07-02 fl Added support for YCC images
# 1998-07-15 fl Renamed offset attribute to avoid name clash
# 1998-12-29 fl Added I;16 support
# 2001-02-17 fl Use 're' instead of 'regex' (Python 2.1) (0.7)
# 2003-09-26 fl Added LA/PA support
#
# Copyright (c) 1997-2003 by Secret Labs AB.
# Copyright (c) 1995-2001 by Fredrik Lundh.
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
import os
import re
from typing import IO, Any
from . import Image, ImageFile, ImagePalette
# --------------------------------------------------------------------
# Standard tags
COMMENT = "Comment"
DATE = "Date"
EQUIPMENT = "Digitalization equipment"
FRAMES = "File size (no of images)"
LUT = "Lut"
NAME = "Name"
SCALE = "Scale (x,y)"
SIZE = "Image size (x*y)"
MODE = "Image type"
TAGS = {
COMMENT: 0,
DATE: 0,
EQUIPMENT: 0,
FRAMES: 0,
LUT: 0,
NAME: 0,
SCALE: 0,
SIZE: 0,
MODE: 0,
}
OPEN = {
# ifunc93/p3cfunc formats
"0 1 image": ("1", "1"),
"L 1 image": ("1", "1"),
"Greyscale image": ("L", "L"),
"Grayscale image": ("L", "L"),
"RGB image": ("RGB", "RGB;L"),
"RLB image": ("RGB", "RLB"),
"RYB image": ("RGB", "RLB"),
"B1 image": ("1", "1"),
"B2 image": ("P", "P;2"),
"B4 image": ("P", "P;4"),
"X 24 image": ("RGB", "RGB"),
"L 32 S image": ("I", "I;32"),
"L 32 F image": ("F", "F;32"),
# old p3cfunc formats
"RGB3 image": ("RGB", "RGB;T"),
"RYB3 image": ("RGB", "RYB;T"),
# extensions
"LA image": ("LA", "LA;L"),
"PA image": ("LA", "PA;L"),
"RGBA image": ("RGBA", "RGBA;L"),
"RGBX image": ("RGB", "RGBX;L"),
"CMYK image": ("CMYK", "CMYK;L"),
"YCC image": ("YCbCr", "YCbCr;L"),
}
# ifunc95 extensions
for i in ["8", "8S", "16", "16S", "32", "32F"]:
OPEN[f"L {i} image"] = ("F", f"F;{i}")
OPEN[f"L*{i} image"] = ("F", f"F;{i}")
for i in ["16", "16L", "16B"]:
OPEN[f"L {i} image"] = (f"I;{i}", f"I;{i}")
OPEN[f"L*{i} image"] = (f"I;{i}", f"I;{i}")
for i in ["32S"]:
OPEN[f"L {i} image"] = ("I", f"I;{i}")
OPEN[f"L*{i} image"] = ("I", f"I;{i}")
for j in range(2, 33):
OPEN[f"L*{j} image"] = ("F", f"F;{j}")
# --------------------------------------------------------------------
# Read IM directory
split = re.compile(rb"^([A-Za-z][^:]*):[ \t]*(.*)[ \t]*$")
def number(s: Any) -> float:
try:
return int(s)
except ValueError:
return float(s)
##
# Image plugin for the IFUNC IM file format.
class ImImageFile(ImageFile.ImageFile):
format = "IM"
format_description = "IFUNC Image Memory"
_close_exclusive_fp_after_loading = False
def _open(self) -> None:
# Quick rejection: if there's not an LF among the first
# 100 bytes, this is (probably) not a text header.
if b"\n" not in self.fp.read(100):
msg = "not an IM file"
raise SyntaxError(msg)
self.fp.seek(0)
n = 0
# Default values
self.info[MODE] = "L"
self.info[SIZE] = (512, 512)
self.info[FRAMES] = 1
self.rawmode = "L"
while True:
s = self.fp.read(1)
            # Some versions of IFUNC use \n\r instead of \r\n...
if s == b"\r":
continue
if not s or s == b"\0" or s == b"\x1A":
break
# FIXME: this may read whole file if not a text file
s = s + self.fp.readline()
if len(s) > 100:
msg = "not an IM file"
raise SyntaxError(msg)
if s[-2:] == b"\r\n":
s = s[:-2]
elif s[-1:] == b"\n":
s = s[:-1]
try:
m = split.match(s)
except re.error as e:
msg = "not an IM file"
raise SyntaxError(msg) from e
if m:
k, v = m.group(1, 2)
# Don't know if this is the correct encoding,
# but a decent guess (I guess)
k = k.decode("latin-1", "replace")
v = v.decode("latin-1", "replace")
# Convert value as appropriate
if k in [FRAMES, SCALE, SIZE]:
v = v.replace("*", ",")
v = tuple(map(number, v.split(",")))
if len(v) == 1:
v = v[0]
elif k == MODE and v in OPEN:
v, self.rawmode = OPEN[v]
# Add to dictionary. Note that COMMENT tags are
# combined into a list of strings.
if k == COMMENT:
if k in self.info:
self.info[k].append(v)
else:
self.info[k] = [v]
else:
self.info[k] = v
if k in TAGS:
n += 1
else:
msg = f"Syntax error in IM header: {s.decode('ascii', 'replace')}"
raise SyntaxError(msg)
if not n:
msg = "Not an IM file"
raise SyntaxError(msg)
# Basic attributes
self._size = self.info[SIZE]
self._mode = self.info[MODE]
# Skip forward to start of image data
while s and s[:1] != b"\x1A":
s = self.fp.read(1)
if not s:
msg = "File truncated"
raise SyntaxError(msg)
if LUT in self.info:
# convert lookup table to palette or lut attribute
palette = self.fp.read(768)
greyscale = 1 # greyscale palette
linear = 1 # linear greyscale palette
for i in range(256):
if palette[i] == palette[i + 256] == palette[i + 512]:
if palette[i] != i:
linear = 0
else:
greyscale = 0
if self.mode in ["L", "LA", "P", "PA"]:
if greyscale:
if not linear:
self.lut = list(palette[:256])
else:
if self.mode in ["L", "P"]:
self._mode = self.rawmode = "P"
elif self.mode in ["LA", "PA"]:
self._mode = "PA"
self.rawmode = "PA;L"
self.palette = ImagePalette.raw("RGB;L", palette)
elif self.mode == "RGB":
if not greyscale or not linear:
self.lut = list(palette)
self.frame = 0
self.__offset = offs = self.fp.tell()
self._fp = self.fp # FIXME: hack
if self.rawmode[:2] == "F;":
# ifunc95 formats
try:
# use bit decoder (if necessary)
bits = int(self.rawmode[2:])
if bits not in [8, 16, 32]:
self.tile = [
ImageFile._Tile(
"bit", (0, 0) + self.size, offs, (bits, 8, 3, 0, -1)
)
]
return
except ValueError:
pass
if self.rawmode in ["RGB;T", "RYB;T"]:
# Old LabEye/3PC files. Would be very surprised if anyone
# ever stumbled upon such a file ;-)
size = self.size[0] * self.size[1]
self.tile = [
ImageFile._Tile("raw", (0, 0) + self.size, offs, ("G", 0, -1)),
ImageFile._Tile("raw", (0, 0) + self.size, offs + size, ("R", 0, -1)),
ImageFile._Tile(
"raw", (0, 0) + self.size, offs + 2 * size, ("B", 0, -1)
),
]
else:
# LabEye/IFUNC files
self.tile = [
ImageFile._Tile("raw", (0, 0) + self.size, offs, (self.rawmode, 0, -1))
]
@property
def n_frames(self) -> int:
return self.info[FRAMES]
@property
def is_animated(self) -> bool:
return self.info[FRAMES] > 1
def seek(self, frame: int) -> None:
if not self._seek_check(frame):
return
self.frame = frame
if self.mode == "1":
bits = 1
else:
bits = 8 * len(self.mode)
size = ((self.size[0] * bits + 7) // 8) * self.size[1]
offs = self.__offset + frame * size
self.fp = self._fp
self.tile = [
ImageFile._Tile("raw", (0, 0) + self.size, offs, (self.rawmode, 0, -1))
]
def tell(self) -> int:
return self.frame
#
# --------------------------------------------------------------------
# Save IM files
SAVE = {
# mode: (im type, raw mode)
"1": ("0 1", "1"),
"L": ("Greyscale", "L"),
"LA": ("LA", "LA;L"),
"P": ("Greyscale", "P"),
"PA": ("LA", "PA;L"),
"I": ("L 32S", "I;32S"),
"I;16": ("L 16", "I;16"),
"I;16L": ("L 16L", "I;16L"),
"I;16B": ("L 16B", "I;16B"),
"F": ("L 32F", "F;32F"),
"RGB": ("RGB", "RGB;L"),
"RGBA": ("RGBA", "RGBA;L"),
"RGBX": ("RGBX", "RGBX;L"),
"CMYK": ("CMYK", "CMYK;L"),
"YCbCr": ("YCC", "YCbCr;L"),
}
def _save(im: Image.Image, fp: IO[bytes], filename: str | bytes) -> None:
try:
image_type, rawmode = SAVE[im.mode]
except KeyError as e:
msg = f"Cannot save {im.mode} images as IM"
raise ValueError(msg) from e
frames = im.encoderinfo.get("frames", 1)
fp.write(f"Image type: {image_type} image\r\n".encode("ascii"))
if filename:
# Each line must be 100 characters or less,
# or: SyntaxError("not an IM file")
# 8 characters are used for "Name: " and "\r\n"
# Keep just the filename, ditch the potentially overlong path
if isinstance(filename, bytes):
filename = filename.decode("ascii")
name, ext = os.path.splitext(os.path.basename(filename))
name = "".join([name[: 92 - len(ext)], ext])
fp.write(f"Name: {name}\r\n".encode("ascii"))
fp.write(f"Image size (x*y): {im.size[0]}*{im.size[1]}\r\n".encode("ascii"))
fp.write(f"File size (no of images): {frames}\r\n".encode("ascii"))
if im.mode in ["P", "PA"]:
fp.write(b"Lut: 1\r\n")
fp.write(b"\000" * (511 - fp.tell()) + b"\032")
if im.mode in ["P", "PA"]:
im_palette = im.im.getpalette("RGB", "RGB;L")
colors = len(im_palette) // 3
palette = b""
for i in range(3):
palette += im_palette[colors * i : colors * (i + 1)]
palette += b"\x00" * (256 - colors)
fp.write(palette) # 768 bytes
ImageFile._save(
im, fp, [ImageFile._Tile("raw", (0, 0) + im.size, 0, (rawmode, 0, -1))]
)
#
# --------------------------------------------------------------------
# Registry
Image.register_open(ImImageFile.format, ImImageFile)
Image.register_save(ImImageFile.format, _save)
Image.register_extension(ImImageFile.format, ".im")
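#
# --------------------------------------------------------------------
# Illustrative usage sketch (not part of the plugin): round-trips an
# image through the IM format in memory. Multi-frame files additionally
# expose n_frames, seek() and tell().
if __name__ == "__main__":
    from io import BytesIO

    img = Image.new("RGB", (32, 32), (255, 0, 0))
    buffer = BytesIO()
    img.save(buffer, format="IM")
    buffer.seek(0)
    reloaded = Image.open(buffer)
    print(reloaded.format, reloaded.mode, reloaded.size)  # IM RGB (32, 32)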

File diff suppressed because it is too large

@@ -0,0 +1,311 @@
#
# The Python Imaging Library.
# $Id$
#
# standard channel operations
#
# History:
# 1996-03-24 fl Created
# 1996-08-13 fl Added logical operations (for "1" images)
# 2000-10-12 fl Added offset method (from Image.py)
#
# Copyright (c) 1997-2000 by Secret Labs AB
# Copyright (c) 1996-2000 by Fredrik Lundh
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
from . import Image
def constant(image: Image.Image, value: int) -> Image.Image:
"""Fill a channel with a given gray level.
:rtype: :py:class:`~PIL.Image.Image`
"""
return Image.new("L", image.size, value)
def duplicate(image: Image.Image) -> Image.Image:
"""Copy a channel. Alias for :py:meth:`PIL.Image.Image.copy`.
:rtype: :py:class:`~PIL.Image.Image`
"""
return image.copy()
def invert(image: Image.Image) -> Image.Image:
"""
Invert an image (channel). ::
out = MAX - image
:rtype: :py:class:`~PIL.Image.Image`
"""
image.load()
return image._new(image.im.chop_invert())
def lighter(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""
Compares the two images, pixel by pixel, and returns a new image containing
the lighter values. ::
out = max(image1, image2)
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_lighter(image2.im))
def darker(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""
Compares the two images, pixel by pixel, and returns a new image containing
the darker values. ::
out = min(image1, image2)
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_darker(image2.im))
def difference(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""
Returns the absolute value of the pixel-by-pixel difference between the two
images. ::
out = abs(image1 - image2)
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_difference(image2.im))
def multiply(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""
Superimposes two images on top of each other.
If you multiply an image with a solid black image, the result is black. If
you multiply with a solid white image, the image is unaffected. ::
out = image1 * image2 / MAX
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_multiply(image2.im))
def screen(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""
Superimposes two inverted images on top of each other. ::
out = MAX - ((MAX - image1) * (MAX - image2) / MAX)
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_screen(image2.im))
def soft_light(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""
Superimposes two images on top of each other using the Soft Light algorithm
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_soft_light(image2.im))
def hard_light(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""
Superimposes two images on top of each other using the Hard Light algorithm
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_hard_light(image2.im))
def overlay(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""
Superimposes two images on top of each other using the Overlay algorithm
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_overlay(image2.im))
def add(
image1: Image.Image, image2: Image.Image, scale: float = 1.0, offset: float = 0
) -> Image.Image:
"""
Adds two images, dividing the result by scale and adding the
offset. If omitted, scale defaults to 1.0, and offset to 0.0. ::
out = ((image1 + image2) / scale + offset)
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_add(image2.im, scale, offset))
def subtract(
image1: Image.Image, image2: Image.Image, scale: float = 1.0, offset: float = 0
) -> Image.Image:
"""
Subtracts two images, dividing the result by scale and adding the offset.
If omitted, scale defaults to 1.0, and offset to 0.0. ::
out = ((image1 - image2) / scale + offset)
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_subtract(image2.im, scale, offset))
def add_modulo(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""Add two images, without clipping the result. ::
out = ((image1 + image2) % MAX)
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_add_modulo(image2.im))
def subtract_modulo(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""Subtract two images, without clipping the result. ::
out = ((image1 - image2) % MAX)
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_subtract_modulo(image2.im))
def logical_and(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""Logical AND between two images.
Both of the images must have mode "1". If you would like to perform a
logical AND on an image with a mode other than "1", try
:py:meth:`~PIL.ImageChops.multiply` instead, using a black-and-white mask
as the second image. ::
out = ((image1 and image2) % MAX)
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_and(image2.im))
def logical_or(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""Logical OR between two images.
Both of the images must have mode "1". ::
out = ((image1 or image2) % MAX)
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_or(image2.im))
def logical_xor(image1: Image.Image, image2: Image.Image) -> Image.Image:
"""Logical XOR between two images.
Both of the images must have mode "1". ::
out = ((bool(image1) != bool(image2)) % MAX)
:rtype: :py:class:`~PIL.Image.Image`
"""
image1.load()
image2.load()
return image1._new(image1.im.chop_xor(image2.im))
def blend(image1: Image.Image, image2: Image.Image, alpha: float) -> Image.Image:
"""Blend images using constant transparency weight. Alias for
:py:func:`PIL.Image.blend`.
:rtype: :py:class:`~PIL.Image.Image`
"""
return Image.blend(image1, image2, alpha)
def composite(
image1: Image.Image, image2: Image.Image, mask: Image.Image
) -> Image.Image:
"""Create composite using transparency mask. Alias for
:py:func:`PIL.Image.composite`.
:rtype: :py:class:`~PIL.Image.Image`
"""
return Image.composite(image1, image2, mask)
def offset(image: Image.Image, xoffset: int, yoffset: int | None = None) -> Image.Image:
"""Returns a copy of the image where data has been offset by the given
distances. Data wraps around the edges. If ``yoffset`` is omitted, it
is assumed to be equal to ``xoffset``.
:param image: Input image.
:param xoffset: The horizontal distance.
:param yoffset: The vertical distance. If omitted, both
distances are set to the same value.
:rtype: :py:class:`~PIL.Image.Image`
"""
if yoffset is None:
yoffset = xoffset
image.load()
return image._new(image.im.offset(xoffset, yoffset))
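#
# --------------------------------------------------------------------
# Illustrative usage sketch (not part of the module): a few channel
# operations combined. difference() is a common building block for
# change detection between two frames.
if __name__ == "__main__":
    a = Image.new("L", (8, 8), 200)
    b = Image.new("L", (8, 8), 50)
    print(difference(a, b).getpixel((0, 0)))  # abs(200 - 50) == 150
    print(lighter(a, b).getpixel((0, 0)))  # max(200, 50) == 200
    shifted = offset(a, 2, 3)  # wraps pixels around the edges
    print(shifted.getpixel((0, 0)))  # still 200; a constant image is unchanged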

File diff suppressed because it is too large

@@ -0,0 +1,320 @@
#
# The Python Imaging Library
# $Id$
#
# map CSS3-style colour description strings to RGB
#
# History:
# 2002-10-24 fl Added support for CSS-style color strings
# 2002-12-15 fl Added RGBA support
# 2004-03-27 fl Fixed remaining int() problems for Python 1.5.2
# 2004-07-19 fl Fixed gray/grey spelling issues
# 2009-03-05 fl Fixed rounding error in grayscale calculation
#
# Copyright (c) 2002-2004 by Secret Labs AB
# Copyright (c) 2002-2004 by Fredrik Lundh
#
# See the README file for information on usage and redistribution.
#
from __future__ import annotations
import re
from functools import lru_cache
from . import Image
@lru_cache
def getrgb(color: str) -> tuple[int, int, int] | tuple[int, int, int, int]:
"""
Convert a color string to an RGB or RGBA tuple. If the string cannot be
parsed, this function raises a :py:exc:`ValueError` exception.
.. versionadded:: 1.1.4
:param color: A color string
:return: ``(red, green, blue[, alpha])``
"""
if len(color) > 100:
msg = "color specifier is too long"
raise ValueError(msg)
color = color.lower()
rgb = colormap.get(color, None)
if rgb:
if isinstance(rgb, tuple):
return rgb
rgb_tuple = getrgb(rgb)
assert len(rgb_tuple) == 3
colormap[color] = rgb_tuple
return rgb_tuple
# check for known string formats
if re.match("#[a-f0-9]{3}$", color):
return int(color[1] * 2, 16), int(color[2] * 2, 16), int(color[3] * 2, 16)
if re.match("#[a-f0-9]{4}$", color):
return (
int(color[1] * 2, 16),
int(color[2] * 2, 16),
int(color[3] * 2, 16),
int(color[4] * 2, 16),
)
if re.match("#[a-f0-9]{6}$", color):
return int(color[1:3], 16), int(color[3:5], 16), int(color[5:7], 16)
if re.match("#[a-f0-9]{8}$", color):
return (
int(color[1:3], 16),
int(color[3:5], 16),
int(color[5:7], 16),
int(color[7:9], 16),
)
m = re.match(r"rgb\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)$", color)
if m:
return int(m.group(1)), int(m.group(2)), int(m.group(3))
m = re.match(r"rgb\(\s*(\d+)%\s*,\s*(\d+)%\s*,\s*(\d+)%\s*\)$", color)
if m:
return (
int((int(m.group(1)) * 255) / 100.0 + 0.5),
int((int(m.group(2)) * 255) / 100.0 + 0.5),
int((int(m.group(3)) * 255) / 100.0 + 0.5),
)
m = re.match(
r"hsl\(\s*(\d+\.?\d*)\s*,\s*(\d+\.?\d*)%\s*,\s*(\d+\.?\d*)%\s*\)$", color
)
if m:
from colorsys import hls_to_rgb
rgb_floats = hls_to_rgb(
float(m.group(1)) / 360.0,
float(m.group(3)) / 100.0,
float(m.group(2)) / 100.0,
)
return (
int(rgb_floats[0] * 255 + 0.5),
int(rgb_floats[1] * 255 + 0.5),
int(rgb_floats[2] * 255 + 0.5),
)
m = re.match(
r"hs[bv]\(\s*(\d+\.?\d*)\s*,\s*(\d+\.?\d*)%\s*,\s*(\d+\.?\d*)%\s*\)$", color
)
if m:
from colorsys import hsv_to_rgb
rgb_floats = hsv_to_rgb(
float(m.group(1)) / 360.0,
float(m.group(2)) / 100.0,
float(m.group(3)) / 100.0,
)
return (
int(rgb_floats[0] * 255 + 0.5),
int(rgb_floats[1] * 255 + 0.5),
int(rgb_floats[2] * 255 + 0.5),
)
m = re.match(r"rgba\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)$", color)
if m:
return int(m.group(1)), int(m.group(2)), int(m.group(3)), int(m.group(4))
msg = f"unknown color specifier: {repr(color)}"
raise ValueError(msg)
@lru_cache
def getcolor(color: str, mode: str) -> int | tuple[int, ...]:
"""
    Same as :py:func:`~PIL.ImageColor.getrgb` for most modes. However, if
    ``mode`` is HSV, the RGB value is converted to an HSV value, and if
    ``mode`` is neither a color nor a palette mode, the RGB value is
    converted to a grayscale value.
If the string cannot be parsed, this function raises a :py:exc:`ValueError`
exception.
.. versionadded:: 1.1.4
:param color: A color string
:param mode: Convert result to this mode
:return: ``graylevel, (graylevel, alpha) or (red, green, blue[, alpha])``
"""
# same as getrgb, but converts the result to the given mode
rgb, alpha = getrgb(color), 255
if len(rgb) == 4:
alpha = rgb[3]
rgb = rgb[:3]
if mode == "HSV":
from colorsys import rgb_to_hsv
r, g, b = rgb
h, s, v = rgb_to_hsv(r / 255, g / 255, b / 255)
return int(h * 255), int(s * 255), int(v * 255)
elif Image.getmodebase(mode) == "L":
r, g, b = rgb
# ITU-R Recommendation 601-2 for nonlinear RGB
# scaled to 24 bits to match the convert's implementation.
graylevel = (r * 19595 + g * 38470 + b * 7471 + 0x8000) >> 16
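        # 19595/65536 ~ 0.299, 38470/65536 ~ 0.587, 7471/65536 ~ 0.114
        # (the Rec. 601 luma weights); adding 0x8000 rounds to nearest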
if mode[-1] == "A":
return graylevel, alpha
return graylevel
elif mode[-1] == "A":
return rgb + (alpha,)
return rgb
colormap: dict[str, str | tuple[int, int, int]] = {
# X11 colour table from https://drafts.csswg.org/css-color-4/, with
# gray/grey spelling issues fixed. This is a superset of HTML 4.0
# colour names used in CSS 1.
"aliceblue": "#f0f8ff",
"antiquewhite": "#faebd7",
"aqua": "#00ffff",
"aquamarine": "#7fffd4",
"azure": "#f0ffff",
"beige": "#f5f5dc",
"bisque": "#ffe4c4",
"black": "#000000",
"blanchedalmond": "#ffebcd",
"blue": "#0000ff",
"blueviolet": "#8a2be2",
"brown": "#a52a2a",
"burlywood": "#deb887",
"cadetblue": "#5f9ea0",
"chartreuse": "#7fff00",
"chocolate": "#d2691e",
"coral": "#ff7f50",
"cornflowerblue": "#6495ed",
"cornsilk": "#fff8dc",
"crimson": "#dc143c",
"cyan": "#00ffff",
"darkblue": "#00008b",
"darkcyan": "#008b8b",
"darkgoldenrod": "#b8860b",
"darkgray": "#a9a9a9",
"darkgrey": "#a9a9a9",
"darkgreen": "#006400",
"darkkhaki": "#bdb76b",
"darkmagenta": "#8b008b",
"darkolivegreen": "#556b2f",
"darkorange": "#ff8c00",
"darkorchid": "#9932cc",
"darkred": "#8b0000",
"darksalmon": "#e9967a",
"darkseagreen": "#8fbc8f",
"darkslateblue": "#483d8b",
"darkslategray": "#2f4f4f",
"darkslategrey": "#2f4f4f",
"darkturquoise": "#00ced1",
"darkviolet": "#9400d3",
"deeppink": "#ff1493",
"deepskyblue": "#00bfff",
"dimgray": "#696969",
"dimgrey": "#696969",
"dodgerblue": "#1e90ff",
"firebrick": "#b22222",
"floralwhite": "#fffaf0",
"forestgreen": "#228b22",
"fuchsia": "#ff00ff",
"gainsboro": "#dcdcdc",
"ghostwhite": "#f8f8ff",
"gold": "#ffd700",
"goldenrod": "#daa520",
"gray": "#808080",
"grey": "#808080",
"green": "#008000",
"greenyellow": "#adff2f",
"honeydew": "#f0fff0",
"hotpink": "#ff69b4",
"indianred": "#cd5c5c",
"indigo": "#4b0082",
"ivory": "#fffff0",
"khaki": "#f0e68c",
"lavender": "#e6e6fa",
"lavenderblush": "#fff0f5",
"lawngreen": "#7cfc00",
"lemonchiffon": "#fffacd",
"lightblue": "#add8e6",
"lightcoral": "#f08080",
"lightcyan": "#e0ffff",
"lightgoldenrodyellow": "#fafad2",
"lightgreen": "#90ee90",
"lightgray": "#d3d3d3",
"lightgrey": "#d3d3d3",
"lightpink": "#ffb6c1",
"lightsalmon": "#ffa07a",
"lightseagreen": "#20b2aa",
"lightskyblue": "#87cefa",
"lightslategray": "#778899",
"lightslategrey": "#778899",
"lightsteelblue": "#b0c4de",
"lightyellow": "#ffffe0",
"lime": "#00ff00",
"limegreen": "#32cd32",
"linen": "#faf0e6",
"magenta": "#ff00ff",
"maroon": "#800000",
"mediumaquamarine": "#66cdaa",
"mediumblue": "#0000cd",
"mediumorchid": "#ba55d3",
"mediumpurple": "#9370db",
"mediumseagreen": "#3cb371",
"mediumslateblue": "#7b68ee",
"mediumspringgreen": "#00fa9a",
"mediumturquoise": "#48d1cc",
"mediumvioletred": "#c71585",
"midnightblue": "#191970",
"mintcream": "#f5fffa",
"mistyrose": "#ffe4e1",
"moccasin": "#ffe4b5",
"navajowhite": "#ffdead",
"navy": "#000080",
"oldlace": "#fdf5e6",
"olive": "#808000",
"olivedrab": "#6b8e23",
"orange": "#ffa500",
"orangered": "#ff4500",
"orchid": "#da70d6",
"palegoldenrod": "#eee8aa",
"palegreen": "#98fb98",
"paleturquoise": "#afeeee",
"palevioletred": "#db7093",
"papayawhip": "#ffefd5",
"peachpuff": "#ffdab9",
"peru": "#cd853f",
"pink": "#ffc0cb",
"plum": "#dda0dd",
"powderblue": "#b0e0e6",
"purple": "#800080",
"rebeccapurple": "#663399",
"red": "#ff0000",
"rosybrown": "#bc8f8f",
"royalblue": "#4169e1",
"saddlebrown": "#8b4513",
"salmon": "#fa8072",
"sandybrown": "#f4a460",
"seagreen": "#2e8b57",
"seashell": "#fff5ee",
"sienna": "#a0522d",
"silver": "#c0c0c0",
"skyblue": "#87ceeb",
"slateblue": "#6a5acd",
"slategray": "#708090",
"slategrey": "#708090",
"snow": "#fffafa",
"springgreen": "#00ff7f",
"steelblue": "#4682b4",
"tan": "#d2b48c",
"teal": "#008080",
"thistle": "#d8bfd8",
"tomato": "#ff6347",
"turquoise": "#40e0d0",
"violet": "#ee82ee",
"wheat": "#f5deb3",
"white": "#ffffff",
"whitesmoke": "#f5f5f5",
"yellow": "#ffff00",
"yellowgreen": "#9acd32",
}
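#
# --------------------------------------------------------------------
# Illustrative usage sketch (not part of the module): the string
# formats accepted by getrgb()/getcolor().
if __name__ == "__main__":
    print(getrgb("#f00"))  # (255, 0, 0)
    print(getrgb("rgb(0, 255, 0)"))  # (0, 255, 0)
    print(getrgb("hsl(120, 100%, 50%)"))  # (0, 255, 0)
    print(getrgb("rebeccapurple"))  # (102, 51, 153)
    print(getcolor("red", "L"))  # 76, the Rec. 601 gray level for pure red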

File diff suppressed because it is too large

@@ -0,0 +1,243 @@
#
# The Python Imaging Library
# $Id$
#
# WCK-style drawing interface operations
#
# History:
# 2003-12-07 fl created
# 2005-05-15 fl updated; added to PIL as ImageDraw2
# 2005-05-15 fl added text support
# 2005-05-20 fl added arc/chord/pieslice support
#
# Copyright (c) 2003-2005 by Secret Labs AB
# Copyright (c) 2003-2005 by Fredrik Lundh
#
# See the README file for information on usage and redistribution.
#
"""
(Experimental) WCK-style drawing interface operations
.. seealso:: :py:mod:`PIL.ImageDraw`
"""
from __future__ import annotations
from typing import Any, AnyStr, BinaryIO
from . import Image, ImageColor, ImageDraw, ImageFont, ImagePath
from ._typing import Coords, StrOrBytesPath
class Pen:
"""Stores an outline color and width."""
def __init__(self, color: str, width: int = 1, opacity: int = 255) -> None:
self.color = ImageColor.getrgb(color)
self.width = width
class Brush:
"""Stores a fill color"""
def __init__(self, color: str, opacity: int = 255) -> None:
self.color = ImageColor.getrgb(color)
class Font:
"""Stores a TrueType font and color"""
def __init__(
self, color: str, file: StrOrBytesPath | BinaryIO, size: float = 12
) -> None:
# FIXME: add support for bitmap fonts
self.color = ImageColor.getrgb(color)
self.font = ImageFont.truetype(file, size)
class Draw:
"""
(Experimental) WCK-style drawing interface
"""
def __init__(
self,
image: Image.Image | str,
size: tuple[int, int] | list[int] | None = None,
color: float | tuple[float, ...] | str | None = None,
) -> None:
if isinstance(image, str):
if size is None:
msg = "If image argument is mode string, size must be a list or tuple"
raise ValueError(msg)
image = Image.new(image, size, color)
self.draw = ImageDraw.Draw(image)
self.image = image
self.transform: tuple[float, float, float, float, float, float] | None = None
def flush(self) -> Image.Image:
return self.image
def render(
self,
op: str,
xy: Coords,
pen: Pen | Brush | None,
brush: Brush | Pen | None = None,
**kwargs: Any,
) -> None:
# handle color arguments
outline = fill = None
width = 1
if isinstance(pen, Pen):
outline = pen.color
width = pen.width
elif isinstance(brush, Pen):
outline = brush.color
width = brush.width
if isinstance(brush, Brush):
fill = brush.color
elif isinstance(pen, Brush):
fill = pen.color
# handle transformation
if self.transform:
path = ImagePath.Path(xy)
path.transform(self.transform)
xy = path
# render the item
if op in ("arc", "line"):
kwargs.setdefault("fill", outline)
else:
kwargs.setdefault("fill", fill)
kwargs.setdefault("outline", outline)
if op == "line":
kwargs.setdefault("width", width)
getattr(self.draw, op)(xy, **kwargs)
def settransform(self, offset: tuple[float, float]) -> None:
"""Sets a transformation offset."""
(xoffset, yoffset) = offset
self.transform = (1, 0, xoffset, 0, 1, yoffset)
def arc(
self,
xy: Coords,
pen: Pen | Brush | None,
start: float,
end: float,
*options: Any,
) -> None:
"""
Draws an arc (a portion of a circle outline) between the start and end
angles, inside the given bounding box.
.. seealso:: :py:meth:`PIL.ImageDraw.ImageDraw.arc`
"""
self.render("arc", xy, pen, *options, start=start, end=end)
def chord(
self,
xy: Coords,
pen: Pen | Brush | None,
start: float,
end: float,
*options: Any,
) -> None:
"""
Same as :py:meth:`~PIL.ImageDraw2.Draw.arc`, but connects the end points
with a straight line.
.. seealso:: :py:meth:`PIL.ImageDraw.ImageDraw.chord`
"""
self.render("chord", xy, pen, *options, start=start, end=end)
def ellipse(self, xy: Coords, pen: Pen | Brush | None, *options: Any) -> None:
"""
Draws an ellipse inside the given bounding box.
.. seealso:: :py:meth:`PIL.ImageDraw.ImageDraw.ellipse`
"""
self.render("ellipse", xy, pen, *options)
def line(self, xy: Coords, pen: Pen | Brush | None, *options: Any) -> None:
"""
Draws a line between the coordinates in the ``xy`` list.
.. seealso:: :py:meth:`PIL.ImageDraw.ImageDraw.line`
"""
self.render("line", xy, pen, *options)
def pieslice(
self,
xy: Coords,
pen: Pen | Brush | None,
start: float,
end: float,
*options: Any,
) -> None:
"""
Same as arc, but also draws straight lines between the end points and the
center of the bounding box.
.. seealso:: :py:meth:`PIL.ImageDraw.ImageDraw.pieslice`
"""
self.render("pieslice", xy, pen, *options, start=start, end=end)
def polygon(self, xy: Coords, pen: Pen | Brush | None, *options: Any) -> None:
"""
Draws a polygon.
The polygon outline consists of straight lines between the given
coordinates, plus a straight line between the last and the first
coordinate.
.. seealso:: :py:meth:`PIL.ImageDraw.ImageDraw.polygon`
"""
self.render("polygon", xy, pen, *options)
def rectangle(self, xy: Coords, pen: Pen | Brush | None, *options: Any) -> None:
"""
Draws a rectangle.
.. seealso:: :py:meth:`PIL.ImageDraw.ImageDraw.rectangle`
"""
self.render("rectangle", xy, pen, *options)
def text(self, xy: tuple[float, float], text: AnyStr, font: Font) -> None:
"""
Draws the string at the given position.
.. seealso:: :py:meth:`PIL.ImageDraw.ImageDraw.text`
"""
if self.transform:
path = ImagePath.Path(xy)
path.transform(self.transform)
xy = path
self.draw.text(xy, text, font=font.font, fill=font.color)
def textbbox(
self, xy: tuple[float, float], text: AnyStr, font: Font
) -> tuple[float, float, float, float]:
"""
Returns bounding box (in pixels) of given text.
:return: ``(left, top, right, bottom)`` bounding box
.. seealso:: :py:meth:`PIL.ImageDraw.ImageDraw.textbbox`
"""
if self.transform:
path = ImagePath.Path(xy)
path.transform(self.transform)
xy = path
return self.draw.textbbox(xy, text, font=font.font)
def textlength(self, text: AnyStr, font: Font) -> float:
"""
Returns length (in pixels) of given text.
This is the amount by which following text should be offset.
.. seealso:: :py:meth:`PIL.ImageDraw.ImageDraw.textlength`
"""
return self.draw.textlength(text, font=font.font)
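#
# --------------------------------------------------------------------
# Illustrative usage sketch (not part of the module): Pen draws
# outlines and Brush fills shapes. Font is omitted here because it
# needs a TrueType file on disk.
if __name__ == "__main__":
    draw = Draw("RGB", (120, 80), "white")
    pen = Pen("navy", width=2)
    brush = Brush("gold")
    draw.rectangle([10, 10, 110, 70], pen, brush)
    draw.line([10, 40, 110, 40], pen)
    out = draw.flush()  # returns the underlying PIL.Image.Image
    print(out.size, out.mode)  # (120, 80) RGB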

Some files were not shown because too many files have changed in this diff