0605-17:09

This commit is contained in:
Ladebeze66 2025-05-06 17:09:05 +02:00
parent dc7d208480
commit 1b8776552c
7 changed files with 811 additions and 0 deletions

5
CSV/T11123/T11123.csv Normal file
View File

@ -0,0 +1,5 @@
ÉMETTEUR,TYPE,DATE,CONTENU,ÉLÉMENTS VISUELS
CLIENT,question,28/03/2025 15:00,Les parties douvrage napparaissent plus.,"L'image montre une interface de gestion de chantier fonctionnelle, mais aucune mention explicite des ""parties douvrage"" n'est visible."
SUPPORT,réponse,28/03/2025 15:59,"Nous venons d'appliquer un correctif sur votre site, les parties de chantier sont de nouveau accessibles.",L'image confirme que l'interface est accessible avec des champs modifiables et des boutons actifs.
CLIENT,question,28/03/2025 16:02,Je ne peux plus accéder à CBAO.
SUPPORT,réponse,28/03/2025 16:06,Votre site est bien accessible à l'adresse suivante : https://nob.brg-lab.com/.,"L'image confirme que le client ""NORD OUEST BETON"" est bien visible dans la liste des laboratoires."
Can't render this file because it has a wrong number of fields in line 4.

3
CSV/T11142/T11142.csv Normal file
View File

@ -0,0 +1,3 @@
ÉMETTEUR,TYPE,DATE,CONTENU,ÉLÉMENTS VISUELS
CLIENT,question,02/04/2025 15:52,Il est impossible de mettre des décimales sur la valeur de CO2/Tonne dans les caractéristiques des granulats de carrière.,Aucune image disponible
SUPPORT,réponse,03/04/2025 07:44,Nous constatons bien ce dysfonctionnement. Un ticket a été ouvert auprès de notre équipe de développement. Vous serez automatiquement informé de sa résolution.,Aucune image disponible
1 ÉMETTEUR TYPE DATE CONTENU ÉLÉMENTS VISUELS
2 CLIENT question 02/04/2025 15:52 Il est impossible de mettre des décimales sur la valeur de CO2/Tonne dans les caractéristiques des granulats de carrière. Aucune image disponible
3 SUPPORT réponse 03/04/2025 07:44 Nous constatons bien ce dysfonctionnement. Un ticket a été ouvert auprès de notre équipe de développement. Vous serez automatiquement informé de sa résolution. Aucune image disponible

4
CSV/T11143/T11143.csv Normal file
View File

@ -0,0 +1,4 @@
ÉMETTEUR,TYPE,DATE,CONTENU,ÉLÉMENTS VISUELS
CLIENT,question,03/04/2025 08:34,"Bonjour, Je ne parviens pas à accéder au lessai au bleu. Merci par avance pour votre. Cordialement",Essai au bleu de méthylène de méthylène (MB) - NF EN 933-9 (02-2022)
SUPPORT,réponse,03/04/2025 12:17,"Bonjour, Pouvez-vous vérifier si vous avez bien accès à la page suivante en l'ouvrant dans votre navigateur : https://zk1.brg-lab.com/ Voici ce que vous devriez voir affiché : Si ce n'est pas le cas, pouvez-vous me faire une capture d'écran de ce qui est affiché? Je reste à votre entière disposition pour toute information complémentaire. Cordialement,","Page d'accueil par défaut d'Apache Tomcat avec le message ""It works!"""
CLIENT,information,03/04/2025 12:21,"Bonjour, Le problème sest résolu seul par la suite. Je vous remercie pour votre retour. Bonne journée PS : ladresse fonctionne",Essai au bleu de méthylène de méthylène (MB) - NF EN 933-9 (02-2022)
1 ÉMETTEUR TYPE DATE CONTENU ÉLÉMENTS VISUELS
2 CLIENT question 03/04/2025 08:34 Bonjour, Je ne parviens pas à accéder au l’essai au bleu. Merci par avance pour votre. Cordialement Essai au bleu de méthylène de méthylène (MB) - NF EN 933-9 (02-2022)
3 SUPPORT réponse 03/04/2025 12:17 Bonjour, Pouvez-vous vérifier si vous avez bien accès à la page suivante en l'ouvrant dans votre navigateur : https://zk1.brg-lab.com/ Voici ce que vous devriez voir affiché : Si ce n'est pas le cas, pouvez-vous me faire une capture d'écran de ce qui est affiché? Je reste à votre entière disposition pour toute information complémentaire. Cordialement, Page d'accueil par défaut d'Apache Tomcat avec le message "It works!"
4 CLIENT information 03/04/2025 12:21 Bonjour, Le problème s’est résolu seul par la suite. Je vous remercie pour votre retour. Bonne journée PS : l’adresse fonctionne Essai au bleu de méthylène de méthylène (MB) - NF EN 933-9 (02-2022)

View File

@ -0,0 +1,39 @@
# agents/RAG/agent_rag_responder.py
from ..base_agent import BaseAgent
class AgentRagResponder(BaseAgent):
def __init__(self, llm, ragflow):
"""
Initialise l'agent avec un LLM et un client Ragflow.
"""
super().__init__("AgentRagResponder", llm)
self.ragflow = ragflow
def executer(self, question: str, top_k: int = 5) -> str:
"""
Effectue une recherche dans Ragflow et interroge le LLM avec le contexte.
"""
# 1. Recherche les documents similaires
documents = self.ragflow.rechercher(question, top_k=top_k)
if not documents:
return "Aucun document pertinent trouvé dans la base de connaissance."
# 2. Construit le contexte à envoyer au LLM
contexte = "\n---\n".join([doc.get("content", "") for doc in documents])
# 3. Construit le prompt final
prompt = (
f"Voici des extraits de documents liés à la question suivante :\n\n"
f"{contexte}\n\n"
f"Réponds de manière précise et complète à la question suivante :\n{question}"
)
# 4. Interroge le LLM
reponse = self.llm.interroger(prompt)
# 5. Ajoute à l'historique (facultatif)
self.ajouter_historique(prompt, reponse)
return reponse

View File

@ -0,0 +1,223 @@
import os
import json
import logging
from datetime import datetime
from typing import Optional
from ..base_agent import BaseAgent
from ..utils.pipeline_logger import sauvegarder_donnees
from utils.ocr_cleaner import clean_text_with_profiles # AJOUT
from utils.ocr_utils import extraire_texte_fr
logger = logging.getLogger("AgentVisionOCR")
class AgentVisionOCR(BaseAgent):
"""
Agent LlamaVision qui extrait du texte (OCR avancé) depuis une image.
"""
def __init__(self, llm):
super().__init__("AgentVisionOCR", llm)
# Configuration des paramètres du modèle
self.params = {
"stream": False,
"seed": 0,
#"stop_sequence": [],
"temperature": 1.3,
#"reasoning_effort": 0.5,
#"logit_bias": {},
"mirostat": 0,
"mirostat_eta": 0.1,
"mirostat_tau": 5.0,
"top_k": 35,
"top_p": 0.85,
"min_p": 0.06,
"frequency_penalty": 0.15,
"presence_penalty": 0.1,
"repeat_penalty": 1.15,
"repeat_last_n": 128,
"tfs_z": 1.0,
"num_keep": 0,
"num_predict": 2048,
"num_ctx": 16384,
#"repeat_penalty": 1.1,
"num_batch": 2048,
#"mmap": True,
#"mlock": False,
#"num_thread": 4,
#"num_gpu": 1
}
# Prompt OCR optimisé
self.system_prompt = ("""You are tasked with performing an exhaustive OCR extraction on a technical or administrative web interface screenshot.
GOAL: Extract **every legible piece of text**, even partially visible, faded, or cropped. Structure your output for clarity. Do not guess, but always report what is visible.
FORMAT USING THESE CATEGORIES:
1. PAGE STRUCTURE
- Page titles
- Interface headers or section labels
- Navigation bars or visible URLs
2. IDENTIFIERS & DATA
- Operator or user names
- Sample IDs, test references
- Materials, dates, batch numbers
3. INTERFACE ELEMENTS (MANDATORY SCAN)
- Button labels (e.g., RAZ, SAVE)
- Tabs (e.g., MATERIAL, OBSERVATIONS)
- Sidebars, form field labels
4. SYSTEM MESSAGES
- Connection or server errors
- Domains, IP addresses, server notices
5. METADATA
- Standard references (e.g., "NF EN ####-#")
- Version numbers, document codes, timestamps
6. UNCLEAR / CROPPED TEXT
- Logos, partial lines (use “[...]” for truncated)
- Background/faded elements, labels not fully legible
RULES:
- Preserve punctuation, case, accents exactly.
- Include duplicates if text appears more than once.
- Never skip faint or partial text; use “[...]” if incomplete.
- Even if cropped, report as much as possible from any UI region.
This prompt is designed to generalize across all web portals, technical forms, or reports. Prioritize completeness over certainty. Do not ignore UI components or system messages.
""")
self._configurer_llm()
self.resultats = []
self.images_traitees = set()
logger.info("AgentVisionOCR initialisé avec prompt amélioré.")
def _configurer_llm(self):
if hasattr(self.llm, "prompt_system"):
self.llm.prompt_system = self.system_prompt
if hasattr(self.llm, "configurer"):
self.llm.configurer(**self.params)
def _extraire_ticket_id(self, image_path):
if not image_path:
return "UNKNOWN"
segments = image_path.replace('\\', '/').split('/')
for segment in segments:
if segment.startswith('T') and segment[1:].isdigit():
return segment
if segment.startswith('ticket_T') and segment[8:].isdigit():
return 'T' + segment[8:]
return "UNKNOWN"
def executer(self, image_path: str, ocr_baseline: str = "", ticket_id: Optional[str] = None) -> dict:
image_path_abs = os.path.abspath(image_path)
image_name = os.path.basename(image_path)
if image_path_abs in self.images_traitees:
logger.warning(f"[OCR-LLM] Image déjà traitée, ignorée: {image_name}")
print(f" AgentVisionOCR: Image déjà traitée, ignorée: {image_name}")
return {
"extracted_text": "DUPLICATE - Already processed",
"image_name": image_name,
"image_path": image_path_abs,
"ticket_id": ticket_id or self._extraire_ticket_id(image_path),
"timestamp": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"source_agent": self.nom,
"is_duplicate": True
}
self.images_traitees.add(image_path_abs)
logger.info(f"[OCR-LLM] Extraction OCR sur {image_name}")
print(f" AgentVisionOCR: Extraction OCR sur {image_name}")
ticket_id = ticket_id or self._extraire_ticket_id(image_path)
try:
if not os.path.exists(image_path):
raise FileNotFoundError(f"Image introuvable: {image_path}")
if not hasattr(self.llm, "interroger_avec_image"):
raise RuntimeError("Le modèle ne supporte pas l'analyse d'images.")
# Etape 1: OCR br
# Interroger le modèle
response = self.llm.interroger_avec_image(image_path, self.system_prompt)
if not response or "i cannot" in response.lower():
raise ValueError("Réponse vide ou invalide du modèle")
cleaned_text = clean_text_with_profiles(response.strip(), active_profiles=("ocr",))
model_name = getattr(self.llm, "pipeline_normalized_name",
getattr(self.llm, "modele", "llama3-vision-90b-instruct"))
model_name = model_name.replace(".", "-").replace(":", "-").replace("_", "-")
result = {
"extracted_text": cleaned_text,
"image_name": image_name,
"image_path": image_path_abs,
"ticket_id": ticket_id,
"timestamp": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"source_agent": self.nom,
"model_info": {
"model": model_name,
**self.params
}
}
self.resultats.append(result)
logger.info(f"[OCR-LLM] OCR réussi ({len(cleaned_text)} caractères) pour {image_name}")
return result
except Exception as e:
error_result = {
"extracted_text": "",
"image_name": image_name,
"image_path": image_path_abs,
"ticket_id": ticket_id or "UNKNOWN",
"timestamp": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"source_agent": self.nom,
"error": str(e),
"model_info": {
"model": getattr(self.llm, "pipeline_normalized_name", "llama3-vision-90b-instruct"),
**self.params
}
}
self.resultats.append(error_result)
logger.error(f"[OCR-LLM] Erreur OCR pour {image_name}: {e}")
return error_result
def sauvegarder_resultats(self, ticket_id: str = "T11143") -> None:
if not self.resultats:
logger.warning("[OCR-LLM] Aucun résultat à sauvegarder")
return
resultats_dedupliques = {}
for resultat in self.resultats:
image_path = resultat.get("image_path")
if not image_path:
continue
if image_path not in resultats_dedupliques or \
resultat.get("timestamp", "") > resultats_dedupliques[image_path].get("timestamp", ""):
resultats_dedupliques[image_path] = resultat
resultats_finaux = list(resultats_dedupliques.values())
try:
logger.info(f"[OCR-LLM] Sauvegarde de {len(resultats_finaux)} résultats")
sauvegarder_donnees(
ticket_id=ticket_id,
step_name="ocr_llm",
data=resultats_finaux,
base_dir=None,
is_resultat=True
)
self.resultats = []
self.images_traitees = set()
except Exception as e:
logger.error(f"[OCR-LLM] Erreur sauvegarde résultats: {e}")

View File

@ -0,0 +1,231 @@
Prompt de base:
You are tasked with performing a high-precision OCR extraction on a partially cropped screenshot of a technical web interface.
GOAL: Extract **all visible and partially visible text** from the image, no matter how small, faint, or cropped. Structure the output clearly to maximize usability.
FORMAT THE OUTPUT AS FOLLOWS:
---
1. PAGE STRUCTURE
* Page title or headers (e.g., test names, document references)
* Section labels or grouping titles
* URL or navigation path (if visible)
2. LABORATORY DATA
* Sample references or IDs (e.g., 25-00075)
* Material descriptions (e.g., Sable 0/2 C)
* Dates, operators (e.g., 02/04/2025 BOLLÉE Victor)
3. UI ELEMENTS
* Sidebar text, tabs, and form labels
* Buttons (e.g., RAZ, IMPRIMER, ENREGISTRER)
* Navigation or functional labels (e.g., ESSAI, SMQ)
4. SYSTEM MESSAGES & ERRORS
* Any status or error message (e.g., "Impossible de trouver l'adresse IP du serveur de zkt1.brg-lab.com")
* Domain names, IPs, server paths
5. METADATA
* Document standards (e.g., NF EN 933-9)
* Version/date (e.g., 02-2022)
* Any hidden identifiers, codes, or repeated numbers
6. PERIPHERAL / UNCERTAIN TEXT
* Logos (e.g., BRG-LAB), watermarks, faded/blurred text
* Partially cropped lines: use “[...]” to indicate incomplete text
RULES:
- Do NOT interpret or translate.
- Preserve all punctuation, accents, and casing.
- Include duplicated text if present multiple times.
- Mark unclear/partial lines explicitly.
- Report even faint or background text.
This is a high-precision extraction. Output clean bullet points grouped by section. Stay fully factual.
Résultats:
🔵 Résultat pour l'image output/ticket_T11143/T11143_20250422_084617/attachments/image_145435.png:
Texte extrait:
### 1\. Page Structure
\- **Page title**: Essai au bleu de méthylène \(MB\) - NF EN 933-9 \((02\)-2022)
\- **Header/Section labels**:
\* BRG-LAB logo (top left)
\* "Fournisseur Labo:" label with dropdown menu (right side)
\- URL/path: zkt1.brg-lab.com (from the error message at the bottom) [](https://zk1.brg-lab.com)
### 2\. Laboratory Data
\- Sample reference: 25-00075
\- Material description: Sable 0/2 C
\- Date and operator: 02/04/2025 - BOLLÉE Victor
### 3\. UI Elements
\- Sidebar text/tabs/form labels:
\+ **Tab bar**:
\# Essai | Matériel | Observations | SMQ
\* Form fields:
\+ "ESSAI"
- N° de fiche \(=\) ... (partially visible)
\+ "MATERIEL" / "Matériau": Sable 0/2 C
\+ "OBSERVATIONS" with empty field
\+ "SMQ" with an empty dropdown menu
\* Buttons: RAZ, IMPRIMER, ENREGISTRER
### 4\. System Messages & Errors
\- Status/error message at the bottom of the page:
Impossible de trouver l'adresse IP du serveur zk1.brg-lab.com.
### 5\. Metadata
\- Document standard/version/date: NF EN 933-9 \(02\)-2022
### 6. Peripheral/Uncertain Text
\- BRG-LAB logo (top left)
\- Faded/blurred text near buttons (right side) - partially readable words include "[...]tél", "dossier", and parts of what seems to be a form submission button or similar ("[...]" )
Prompt corrigé sans éléments spécifiques:
"""You are tasked with performing a high-precision OCR extraction on a partially cropped screenshot of a technical web interface.
GOAL: Extract **all visible and partially visible text** from the image, no matter how small, faint, or cropped. Structure the output clearly to maximize usability.
FORMAT THE OUTPUT AS FOLLOWS:
---
1. PAGE STRUCTURE
* Page title or headers (e.g., test names, document references)
* Section labels or grouping titles
* URL or navigation path (if visible)
2. LABORATORY DATA
* Sample references or IDs
* Material descriptions
* Dates, operators
3. UI ELEMENTS
* Sidebar text, tabs, and form labels
* Buttons
* Navigation or functional labels
4. SYSTEM MESSAGES & ERRORS
* Any status or error message
* Domain names, IPs, server paths
5. METADATA
* Document standards
* Version/date )
* Any hidden identifiers, codes, or repeated numbers
6. PERIPHERAL / UNCERTAIN TEXT
* Logos (e.g., BRG-LAB), watermarks, faded/blurred text
* Partially cropped lines: use “[...]” to indicate incomplete text
RULES:
- Do NOT interpret or translate.
- Preserve all punctuation, accents, and casing.
- Include duplicated text if present multiple times.
- Mark unclear/partial lines explicitly.
- Report even faint or background text.
This is a high-precision extraction. Output clean bullet points grouped by section. Stay fully factual.
"""
Résultats:
🔵 Résultat pour l'image output/ticket_T11143/T11143_20250422_084617/attachments/image_145435.png:
Texte extrait:
### **PAGE STRUCTURE**
#### *Page title or headers*
* Essai au bleu de méthylène (MB) - NF EN 933-9 (02-2022)
* BOLLEE Victor
* PRELEVEUR: 25/06/75
* LABORATOIRE DE CONTROLE ET D'ANALYSE DES MATERIAUX
#### *Section labels or grouping titles*: None visible.
### **LABORATORY DATA**
* Sample references or IDs:
* N/A, no samples were referenced in the image provided.
* Material descriptions:
* Echantillon n° 25-00073 réceptionné le 04/05/2025 par BOLLÉE Victo
(sample number partially obscured due to cropping.)
* Matériau Sable C7C - CARRETERE ADCEG
* Dates and operators:
Date not legible; operator unclear.
* Prélevée la 02/06/2025 p
### **UI ELEMENTS**
#### \_Sidebar text, tabs, and form labels:
* ESSAI
* MATÉRIEL
* PORTFOLIO
#### Buttons: None visible.
### **SYSTEM MESSAGES & ERRORS**
None apparent from the image provided. The partial nature of the screenshot precludes any conclusive assessment in this regard.
### PERIPHERAL / UNCERTAIN TEXT:
Logo or watermark: BRG-LAB (cropped due to formatting)
Note: This extraction strictly adheres to visible content without interpretation. Some fields are left blank where data is partially obscured by cropping or not discernible from the given snapshot.

View File

@ -0,0 +1,306 @@
You are tasked with performing a high-precision OCR extraction on a screenshot of a technical or administrative web interface.
GOAL: Extract all visible and partially visible text — no matter how small, faint, or cropped. Remain strictly factual. Do not interpret, guess, or reword.
📄 FORMAT THE OUTPUT USING THESE CATEGORIES:
---
1. PAGE STRUCTURE
* Page title(s)
* Section or interface headers
* Visible URLs, tabs, or menu paths
2. DATA & IDENTIFIERS
* Sample codes, test references, user/operator names
* Material or item descriptions
* Dates, times, unique identifiers
3. INTERFACE ELEMENTS
* Button labels
* Tab names
* Sidebar/menu content
* Field or dropdown labels
4. SYSTEM MESSAGES & ERRORS
* Status messages, warnings, or connection errors
* Domain names, IPs, server notices
5. METADATA
* Version numbers, standard references, document codes
* Any duplicated text or footer content
6. UNCLEAR OR CROPPED TEXT
* Logos, watermarks, truncated words or symbols
* Use “[...]” to mark incomplete or partially cropped text
---
RULES:
- Do not translate or paraphrase.
- Preserve original casing, spelling, punctuation.
- Include repeated elements as they appear.
- Report faint or background text if legible.
- Leave blank sections if no relevant text is found.
This prompt is designed to work across a wide range of web interfaces, dashboards, and structured forms. Output clearly grouped bullet points per section.
Résultats:
🔵 Paramètres actifs LLM:
{
"temperature": 1.3,
"top_p": 0.85,
"presence_penalty": 0.1,
"frequency_penalty": 0.15,
"stop": [],
"stream": false,
"n": 1,
"seed": 0,
"mirostat": 0,
"mirostat_eta": 0.1,
"mirostat_tau": 5.0,
"top_k": 35,
"min_p": 0.06,
"repeat_penalty": 1.15,
"repeat_last_n": 128,
"tfs_z": 1.0,
"num_keep": 0,
"num_predict": 2048,
"num_ctx": 16384,
"num_batch": 2048
}
AgentVisionOCR: Extraction OCR sur image_145435.png
🔵 Résultat pour l'image output/ticket_T11143/T11143_20250422_084617/attachments/image_145435.png:
Texte extrait:
### 1) PAGE STRUCTURE
* **Page title:** Essai au bleu de méthylène (MB)
* **Section/Interface headers:**
* BRG-LAB
* **Visible URLs, tabs, menu paths:**
* No visible URLs are displayed in the image.
### 2) DATA & IDENTIFIERS
* **Sample codes/test references/user/operator names:**
* BOLLEE Victor
* Echantillon n° 25-00075 réceptionné le 02/04/2025 par BOLLEE Victor.
* **Material/item descriptions:**
* Blue Methylene MB Testing Report
**Note:** There is no mention of dates or times within the given page snippet. Additionally, there seems to be no unique identifiers on this web page interface screenshot provided.
### 3) INTERFACE ELEMENTS
This section cannot be determined based on the information provided in the image.
### 4) SYSTEM MESSAGES & ERRORS
There are no system messages or errors present in this snapshot as it only provides a view of what appears to be an itemized report rather than any kind of error message.
### 5) METADATA
No metadata elements can be identified from the image alone such as version numbers standard references document codes etcetera without having access full context surrounding these items listed under "BRG-LAB". As far we know all text shown here pertains directly related subject matter topic discussed above thus falling outside scope defined categories listed above according guidelines outlined prompt.
Nouveau prompt:
You are tasked with performing a high-precision OCR extraction on a screenshot of a technical or administrative web interface.
GOAL: Extract all visible and partially visible text — no matter how small, faint, or cropped. Remain strictly factual. Do not interpret, guess, or reword.
📄 FORMAT THE OUTPUT USING THESE CATEGORIES:
---
1. PAGE STRUCTURE
* Page titles (e.g., test names or document references)
* Interface headers, menus or section titles
* Navigation breadcrumbs, web paths, or visible URLs
2. DATA & IDENTIFIERS
* Sample numbers, user names, dates
* Material types or item descriptions
* Unique codes or test references
3. INTERFACE ELEMENTS
* Button labels (e.g., RESET, PRINT, SAVE)
* Tabs or navigation elements (e.g., RESULTS, MATERIAL, OBSERVATIONS)
* Form field labels or sidebar sections
4. SYSTEM MESSAGES & ERRORS
* Status or connection errors (e.g., “unable to connect”, IP/domain errors)
* Visible domain names, system paths
5. METADATA
* Document codes, version numbers, dates (e.g., standard formats like “NF EN ####-#”)
* Any repeated or footer elements
6. UNCLEAR / CROPPED TEXT
* Logos, watermarks, or faded UI elements
* Truncated words or text: use “[...]” to indicate partial content
---
RULES:
- Do not interpret or translate.
- Preserve all accents, casing, punctuation.
- Include all duplicates or repeated labels.
- Capture system messages even if incomplete or faded.
- Leave a section empty only if no corresponding text is found.
Output a clean, bullet-pointed list per section. Be exhaustive and literal.
Résultats:
🔵 Paramètres actifs LLM:
{
"temperature": 1.3,
"top_p": 0.85,
"presence_penalty": 0.1,
"frequency_penalty": 0.15,
"stop": [],
"stream": false,
"n": 1,
"seed": 0,
"mirostat": 0,
"mirostat_eta": 0.1,
"mirostat_tau": 5.0,
"top_k": 35,
"min_p": 0.06,
"repeat_penalty": 1.15,
"repeat_last_n": 128,
"tfs_z": 1.0,
"num_keep": 0,
"num_predict": 2048,
"num_ctx": 16384,
"num_batch": 2048
}
AgentVisionOCR: Extraction OCR sur image_145435.png
🔵 Résultat pour l'image output/ticket_T11143/T11143_20250422_084617/attachments/image_145435.png:
Texte extrait:
### Page Structure
Page title: Essai au bleu de méthylène (MB) - NF EN 933-9 (02-2022)
Navigation breadcrumbs/path: none visible
### Data & Identifiers
Sample numbers/usernames/dates/material types/item descriptions/unique codes/test references:
none are clearly readable in the provided image snippet; however, it appears there might be data within tables or sections not fully captured in this view.
### Interface Elements
Button labels/tabs/navigation elements/form field labels/sidebar sections: None explicitly identifiable from the given portion of the interface screenshot.
### System Messages and Errors
Status/connection errors/domain names/system paths: None evident from the visible part of the page.
### Metadata
Document codes/version numbers/dates/repeated/footer elements:
NF EN 933-9 is mentioned which suggests a European standard reference but lacks specifics like date without further context. No version number, repeated text, or footer content is visible due to cropping.
### Unclear/Cropped Text
Logos/watermarks/faded UI elements/truncated words/text with “[...]” indicating partial content:
Due to significant portions being cropped out by the edges of this image snippet, several pieces of information are truncated including what seems to be a document title/standard reference and possibly interface labels or data entries. The full text for these items cannot be determined from the provided visual alone.
Autre prompt:
You are tasked with performing an exhaustive OCR extraction on a technical or administrative web interface screenshot.
GOAL: Extract **every legible piece of text**, even partially visible, faded, or cropped. Structure your output for clarity. Do not guess, but always report what is visible.
📄 FORMAT USING THESE CATEGORIES:
---
1. PAGE STRUCTURE
- Page titles
- Interface headers or section labels
- Navigation bars or visible URLs
2. IDENTIFIERS & DATA
- Operator or user names
- Sample IDs, test references
- Materials, dates, batch numbers
3. INTERFACE ELEMENTS (MANDATORY SCAN)
- Button labels (e.g., RAZ, SAVE)
- Tabs (e.g., MATERIAL, OBSERVATIONS)
- Sidebars, form field labels
4. SYSTEM MESSAGES
- Connection or server errors
- Domains, IP addresses, server notices
5. METADATA
- Standard references (e.g., "NF EN ####-#")
- Version numbers, document codes, timestamps
6. UNCLEAR / CROPPED TEXT
- Logos, partial lines (use “[...]” for truncated)
- Background/faded elements, labels not fully legible
---
RULES:
- Preserve punctuation, case, accents exactly.
- Include duplicates if text appears more than once.
- Never skip faint or partial text; use “[...]” if incomplete.
- Even if cropped, report as much as possible from any UI region.
This prompt is designed to generalize across all web portals, technical forms, or reports. Prioritize completeness over certainty. Do not ignore UI components or system messages.
Résultats:
🔵 Paramètres actifs LLM:
{
"temperature": 1.3,
"top_p": 0.85,
"presence_penalty": 0.1,
"frequency_penalty": 0.15,
"stop": [],
"stream": false,
"n": 1,
"seed": 0,
"mirostat": 0,
"mirostat_eta": 0.1,
"mirostat_tau": 5.0,
"top_k": 35,
"min_p": 0.06,
"repeat_penalty": 1.15,
"repeat_last_n": 128,
"tfs_z": 1.0,
"num_keep": 0,
"num_predict": 2048,
"num_ctx": 16384,
"num_batch": 2048
}
AgentVisionOCR: Extraction OCR sur image_145435.png
🔵 Résultat pour l'image output/ticket_T11143/T11143_20250422_084617/attachments/image_145435.png:
Texte extrait:
### **Page Structure:**
* Page title: "Essai au bleu de méthylène (MB) - NF EN 933-9 (02-2022)"
* Interface header: "RG-LAB"
* Navigation bar/visible URL: Not visible
* Sidebars/form field labels:
* "MATERIEL"
* "OBSERVATIONS"
### **Identifiers & Data:**
No legible identifiers/data present in the image.
### **Interface Elements:**
* Button labels: None fully visible. One partially cropped button appears to start with an ellipsis "...".
### **System Messages**
None are apparent from the interface elements shown, although partial text could suggest a server message or error code ("[...]", "[...]").
### **Metadata**
* Standard references: NF EN 933-9 (02-2022)
### **Unclear/Cropped Text**:
The lower section contains a faded URL and some metadata fields that appear not to be filled out or have been intentionally hidden for privacy/security reasons ("[...]"). The top left corner shows part of what might be another standard reference or version number ("RG-LAB") but is too cropped to interpret clearly.
Nouveau prompt: