llm_ticket3/prompts/test_prompt_ocr4.txt
2025-05-06 17:09:05 +02:00

Base prompt:
You are tasked with performing a high-precision OCR extraction on a partially cropped screenshot of a technical web interface.
GOAL: Extract **all visible and partially visible text** from the image, no matter how small, faint, or cropped. Structure the output clearly to maximize usability.
FORMAT THE OUTPUT AS FOLLOWS:
---
1. PAGE STRUCTURE
* Page title or headers (e.g., test names, document references)
* Section labels or grouping titles
* URL or navigation path (if visible)
2. LABORATORY DATA
* Sample references or IDs (e.g., 25-00075)
* Material descriptions (e.g., Sable 0/2 C)
* Dates, operators (e.g., 02/04/2025 BOLLÉE Victor)
3. UI ELEMENTS
* Sidebar text, tabs, and form labels
* Buttons (e.g., RAZ, IMPRIMER, ENREGISTRER)
* Navigation or functional labels (e.g., ESSAI, SMQ)
4. SYSTEM MESSAGES & ERRORS
* Any status or error message (e.g., "Impossible de trouver l'adresse IP du serveur de zkt1.brg-lab.com")
* Domain names, IPs, server paths
5. METADATA
* Document standards (e.g., NF EN 933-9)
* Version/date (e.g., 02-2022)
* Any hidden identifiers, codes, or repeated numbers
6. PERIPHERAL / UNCERTAIN TEXT
* Logos (e.g., BRG-LAB), watermarks, faded/blurred text
* Partially cropped lines: use “[...]” to indicate incomplete text
RULES:
- Do NOT interpret or translate.
- Preserve all punctuation, accents, and casing.
- Include duplicated text if present multiple times.
- Mark unclear/partial lines explicitly.
- Report even faint or background text.
This is a high-precision extraction. Output clean bullet points grouped by section. Stay fully factual.
Results:
🔵 Result for image output/ticket_T11143/T11143_20250422_084617/attachments/image_145435.png:
Extracted text:
### 1\. Page Structure
\- **Page title**: Essai au bleu de méthylène \(MB\) - NF EN 933-9 \((02\)-2022)
\- **Header/Section labels**:
\* BRG-LAB logo (top left)
\* "Fournisseur Labo:" label with dropdown menu (right side)
\- URL/path: zkt1.brg-lab.com (from the error message at the bottom) [](https://zk1.brg-lab.com)
### 2\. Laboratory Data
\- Sample reference: 25-00075
\- Material description: Sable 0/2 C
\- Date and operator: 02/04/2025 - BOLLÉE Victor
### 3\. UI Elements
\- Sidebar text/tabs/form labels:
\+ **Tab bar**:
\# Essai | Matériel | Observations | SMQ
\* Form fields:
\+ "ESSAI"
- N° de fiche \(=\) ... (partially visible)
\+ "MATERIEL" / "Matériau": Sable 0/2 C
\+ "OBSERVATIONS" with empty field
\+ "SMQ" with an empty dropdown menu
\* Buttons: RAZ, IMPRIMER, ENREGISTRER
### 4\. System Messages & Errors
\- Status/error message at the bottom of the page:
Impossible de trouver l'adresse IP du serveur zk1.brg-lab.com.
### 5\. Metadata
\- Document standard/version/date: NF EN 933-9 \(02\)-2022
### 6. Peripheral/Uncertain Text
\- BRG-LAB logo (top left)
\- Faded/blurred text near buttons (right side) - partially readable words include "[...]tél", "dossier", and parts of what seems to be a form submission button or similar ("[...]" )
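The prompt asks for six fixed sections, so a recorded output can be checked mechanically for missing ones. The following is a minimal sketch of such a check, assuming the model answers with Markdown `###` headings as it does here; the names `EXPECTED_SECTIONS`, `normalise_heading`, `split_sections` and `missing_sections` are hypothetical and not part of the original tooling.

```python
import re

# Section titles taken from the prompt's "FORMAT THE OUTPUT AS FOLLOWS" list.
EXPECTED_SECTIONS = [
    "PAGE STRUCTURE",
    "LABORATORY DATA",
    "UI ELEMENTS",
    "SYSTEM MESSAGES & ERRORS",
    "METADATA",
    "PERIPHERAL / UNCERTAIN TEXT",
]

def normalise_heading(line: str) -> str:
    """Reduce a Markdown heading line to a comparable upper-case key."""
    text = line.lstrip("#").strip()
    text = re.sub(r"[\\*_]", "", text)       # drop backslash escapes and emphasis markers
    text = re.sub(r"^\d+\.\s*", "", text)    # drop a leading "1. " style number
    text = text.rstrip(":").strip()
    text = re.sub(r"\s*/\s*", " / ", text)   # "Peripheral/Uncertain" -> "Peripheral / Uncertain"
    return text.upper()

def split_sections(output: str) -> dict[str, list[str]]:
    """Group non-empty lines under the level-3 heading that precedes them."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in output.splitlines():
        if re.match(r"^###(?!#)", line):     # level-3 headings only, not "####"
            current = normalise_heading(line)
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections

def missing_sections(output: str) -> list[str]:
    """Return the expected section titles that have no heading in the output."""
    found = split_sections(output)
    return [name for name in EXPECTED_SECTIONS if name not in found]

sample = "### 1\\. Page Structure\n\\- Page title: x\n### **LABORATORY DATA**\n* Sample references or IDs:\n"
print(missing_sections(sample))
```

Applied to the two outputs recorded in this file, a check of this kind would report that the second one has no METADATA heading.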
Corrected prompt without specific elements:
"""You are tasked with performing a high-precision OCR extraction on a partially cropped screenshot of a technical web interface.
GOAL: Extract **all visible and partially visible text** from the image, no matter how small, faint, or cropped. Structure the output clearly to maximize usability.
FORMAT THE OUTPUT AS FOLLOWS:
---
1. PAGE STRUCTURE
* Page title or headers (e.g., test names, document references)
* Section labels or grouping titles
* URL or navigation path (if visible)
2. LABORATORY DATA
* Sample references or IDs
* Material descriptions
* Dates, operators
3. UI ELEMENTS
* Sidebar text, tabs, and form labels
* Buttons
* Navigation or functional labels
4. SYSTEM MESSAGES & ERRORS
* Any status or error message
* Domain names, IPs, server paths
5. METADATA
* Document standards
* Version/date
* Any hidden identifiers, codes, or repeated numbers
6. PERIPHERAL / UNCERTAIN TEXT
* Logos (e.g., BRG-LAB), watermarks, faded/blurred text
* Partially cropped lines: use “[...]” to indicate incomplete text
RULES:
- Do NOT interpret or translate.
- Preserve all punctuation, accents, and casing.
- Include duplicated text if present multiple times.
- Mark unclear/partial lines explicitly.
- Report even faint or background text.
This is a high-precision extraction. Output clean bullet points grouped by section. Stay fully factual.
"""
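The generic variant above differs from the base prompt mainly in that the parenthesised "(e.g., ...)" examples were removed. A minimal sketch of that step, assuming a simple regular expression is sufficient for this prompt; the function name `strip_examples` is hypothetical and is not how the corrected prompt was necessarily produced.

```python
import re

# Optional leading spaces followed by a parenthesised "(e.g., ...)" example
# that contains no nested parentheses.
EXAMPLE_RE = re.compile(r"[ \t]*\(e\.g\.,[^()]*\)")

def strip_examples(prompt: str) -> str:
    """Remove "(e.g., ...)" parentheticals and any trailing whitespace on each line."""
    lines = [EXAMPLE_RE.sub("", line).rstrip() for line in prompt.splitlines()]
    return "\n".join(lines)

base_excerpt = (
    "* Sample references or IDs (e.g., 25-00075)\n"
    "* Version/date (e.g., 02-2022)\n"
    "* Logos (e.g., BRG-LAB), watermarks, faded/blurred text"
)
print(strip_examples(base_excerpt))
```

A blanket strip like this goes further than the prompt recorded here, which still keeps two examples (the page title line and the logos line).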
Results:
🔵 Result for image output/ticket_T11143/T11143_20250422_084617/attachments/image_145435.png:
Extracted text:
### **PAGE STRUCTURE**
#### *Page title or headers*
* Essai au bleu de méthylène (MB) - NF EN 933-9 (02-2022)
* BOLLEE Victor
* PRELEVEUR: 25/06/75
* LABORATOIRE DE CONTROLE ET D'ANALYSE DES MATERIAUX
#### *Section labels or grouping titles*: None visible.
### **LABORATORY DATA**
* Sample references or IDs:
* N/A, no samples were referenced in the image provided.
* Material descriptions:
* Echantillon n° 25-00073 réceptionné le 04/05/2025 par BOLLÉE Victo
(sample number partially obscured due to cropping.)
* Matériau Sable C7C - CARRETERE ADCEG
* Dates and operators:
Date not legible; operator unclear.
* Prélevée la 02/06/2025 p
### **UI ELEMENTS**
#### \_Sidebar text, tabs, and form labels:
* ESSAI
* MATÉRIEL
* PORTFOLIO
#### Buttons: None visible.
### **SYSTEM MESSAGES & ERRORS**
None apparent from the image provided. The partial nature of the screenshot precludes any conclusive assessment in this regard.
### PERIPHERAL / UNCERTAIN TEXT:
Logo or watermark: BRG-LAB (cropped due to formatting)
Note: This extraction strictly adheres to visible content without interpretation. Some fields are left blank where data is partially obscured by cropping or not discernible from the given snapshot.
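To compare the two runs, each output can be checked against a short list of strings expected in the screenshot. The sketch below assumes that the examples cited in the base prompt (25-00075, Sable 0/2 C, BOLLÉE Victor, NF EN 933-9, ENREGISTRER) really appear in the image, which cannot be verified from this file alone; `REFERENCE_STRINGS` and `check_references` are hypothetical names. Matching is exact, in line with the prompt's rule to preserve punctuation, accents, and casing.

```python
# Assumed ground truth: strings the base prompt cites as examples from the screenshot.
REFERENCE_STRINGS = [
    "25-00075",
    "Sable 0/2 C",
    "BOLLÉE Victor",
    "NF EN 933-9",
    "ENREGISTRER",
]

def check_references(output: str, references=REFERENCE_STRINGS) -> dict[str, bool]:
    """Report, for each reference string, whether it occurs verbatim in the output."""
    cleaned = output.replace("\\", "")  # ignore Markdown backslash escapes
    return {ref: ref in cleaned for ref in references}

# Excerpt of the second output recorded above.
second_excerpt = (
    "* Essai au bleu de méthylène (MB) - NF EN 933-9 (02-2022)\n"
    "* BOLLEE Victor\n"
    "* Echantillon n° 25-00073 réceptionné le 04/05/2025 par BOLLÉE Victo\n"
)
for ref, found in check_references(second_excerpt).items():
    print(f"{ref}: {'found' in str(found) or found}")
```

On this excerpt only the standard reference is found verbatim; the sample number, material, and operator name come back as missing, which makes the gap between the two runs easy to quantify.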