# llm_ticket3/prompts_ocr.py

"""
Collection of prompts optimized for OCR with Llama Vision.
Each prompt is designed to maximize text extraction using a different strategy.
"""

# 1. Detailed baseline prompt
PROMPT_DETAILED = """
Your task is to perform ultra-detailed OCR on this image. Extract EVERY single text element:

Rules:
- Extract ALL text, no matter how small, faint, or partially visible
- Include UI elements, watermarks, and background text
- Preserve exact formatting, symbols, and special characters
- Report numbers with their exact format (decimals, units)
- Include text from logos, stamps, or signatures
- Capture handwritten text if present

Format the output as:
MAIN TEXT:
* [exact text as shown]

INTERFACE ELEMENTS:
* [buttons, labels, headers]

METADATA:
* [dates, references, IDs]

PERIPHERAL TEXT:
* [watermarks, footnotes, margins]

HANDWRITTEN/STAMPS:
* [any manual annotations]

Important:
- Do not interpret or modify the text
- Keep original case and punctuation
- Report partial text with [...] for truncated parts
- Include repeated text if shown multiple times
"""

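# Illustrative sketch (not part of the original module): one way to group a
# response that follows the "Format the output as" layout of PROMPT_DETAILED.
# The helper name `group_bullets_by_section` is hypothetical. It assumes that
# headings are lines ending in ":" (optionally wrapped in "**") and that items
# are "* " bullets; prose lines that happen to end in ":" would be mistaken
# for headings, so real model output may need a stricter rule.
def group_bullets_by_section(response: str) -> dict[str, list[str]]:
    """Map each upper-cased heading to the bullet items listed under it."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw_line in response.splitlines():
        line = raw_line.strip()
        if line.startswith("* "):
            if current is not None:
                sections[current].append(line[2:].strip())
            continue
        heading = line.strip("*").strip()
        if heading.endswith(":") and len(heading) > 1:
            current = heading[:-1].strip().upper()
            sections.setdefault(current, [])
    return sections
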
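# Sample output, apparently obtained with PROMPT_DETAILED (kept verbatim).
# Named separately so that the later sample does not overwrite it.
RESULTATS_DETAILED = """🔵 Résultat pour l'image output/ticket_T11143/T11143_20250422_084617/attachments/image_145435.png: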
Texte extrait: **Analysis of Image Elements**

Upon examining the image, it is evident that the majority of its content remains illegible due to truncation or being cropped out. This significantly hampers the ability to extract detailed information.

**Main Text:**
The only discernible main text in the provided view is the URL at the top of the page:

`giraud.brg-lab.com/BRG-LAB/PAGE_programmeEssay/2f4AAbYNGQAA`

**Interface Elements:**
No specific buttons, labels, headers, etc., are visible within this part of the screenshot.

**Metadata:**
There's no clear metadata (dates, references, IDs) visible in this truncated section.

**Peripheral Text/Watermarks/Footnotes/Margins:**
No additional peripheral texts or elements such as watermarks are observable in the given portion of the image.

**Handwritten/Stamps:**
Given the nature of the image, which appears to be a digital representation, there is no indication of handwritten annotations or stamps present.

In conclusion, due to the extensive truncation and cropping of the content, detailed analysis beyond the visible URL at the top is not feasible. The image does not offer sufficient information for comprehensive extraction under the specified rules."""

# 2. Prompt with spatial analysis
PROMPT_SPATIAL = """
Perform a comprehensive text extraction by scanning the image in a precise grid pattern:

SCAN PATTERN:
1. Top-left to top-right
2. Upper-middle area
3. Center-left to center-right
4. Lower-middle area
5. Bottom-left to bottom-right
6. Margins and borders

For each detected text element, specify:
POSITION: [zone in image]
TEXT: [exact content]
TYPE: [printed/handwritten/stamp/watermark]
STYLE: [bold/italic/underlined/color if relevant]

Extraction rules:
- Include ALL text regardless of size or clarity
- Report text in its exact original format
- Note any partially visible or truncated text
- Include numbers, symbols, and special characters
- Capture overlapping or layered text

Do not:
- Skip any text, no matter how minor
- Modify or interpret the content
- Translate or paraphrase
- Omit repeated elements
"""

# Sample output, apparently obtained with PROMPT_SPATIAL (kept verbatim).
RESULTATS_SPATIAL = """
Résultat pour l'image output/ticket_T11143/T11143_20250422_084617/attachments/image_145435.png:
Texte extrait:
The provided screenshot displays a French-language webpage with a white background, featuring a blue column on the left side and a navigation bar at the top.

**Blue Column:**
On the far-left, a blue column is divided into two sections. The upper section contains the company name "BRG-LAB" in blue letters accompanied by an orange rectangle and illegible text. Below it is the logo of a gear or cog, possibly representing engineering work.

**Top Navigation Bar:**

The top navigation bar consists of three buttons with illegible labels:

• First button: Unlabeled; contains a magnifying glass icon.
• Second button: Labeled with an unreadable phrase.
• Third button: Contains a box-like symbol, similar to a QR code reader app, labeled as "Masque" (French for 'mask').

**Main Content Area:**
A search field titled "Chercher" (French for 'search') occupies a significant portion of the page. However, the majority of its content remains invisible in the screenshot due to being outside the cropped area.

**Central Text Box:**
In the center of the screen lies a text box with a blue background, outlined in dark blue. The visible top line contains partially illegible letters. Below it is another row consisting of two columns, which seem to display some data.

**Additional Information:**

* Two small buttons, marked as "Imprimer" and "Fermer", are situated below the search bar. They enable users to print or close their work.
* A navigation button on the left side provides options such as "Accueil" (home), "Mentions légales" (terms and conditions), and "Contact".
* At the bottom right corner of the webpage is an option for 'S'ign-in' labeled "Connectez-vous".

**Summary:**
This website serves various purposes, including displaying a list of items within the main content area, offering tools like searching functionality through text boxes and dropdown menus. It also provides information related to BRG-LAB's services, contact details, and terms of use.

Given that much of the image remains unseen outside the cropped section provided here - where does one find out more about what this site has to offer?"""

# 3. Technical laboratory prompt
PROMPT_LABORATORY = """
Extract all text from this technical document with laboratory-grade precision:

DOCUMENT STRUCTURE:
1. HEADER
   * Title/Document name
   * Reference numbers
   * Date/Time stamps
   * Laboratory identifiers

2. MAIN CONTENT
   * Test names/methods
   * Technical parameters
   * Measurement values
   * Units and scales
   * Standard references

3. METADATA
   * Protocol numbers
   * Batch/Sample IDs
   * Equipment references
   * Operator information

4. SUPPLEMENTARY
   * Notes/Remarks
   * Warning messages
   * System notifications
   * Status indicators

Rules:
- Extract EVERY number, symbol, and abbreviation
- Maintain exact formatting of technical values
- Include all reference codes and standards
- Report partial or truncated information
- Capture system messages and alerts
- Note any calibration or verification data

Format: Use bullet points (*) for each text element, grouped by section
"""

# 4. System errors prompt
PROMPT_SYSTEM_ERRORS = """
Perform a detailed text extraction focusing on ALL visible elements:

PRIMARY FOCUS:
* Error messages (complete text)
* System notifications
* Status updates
* Warning banners
* Alert boxes
* Connection status
* Server messages
* Debug information

TECHNICAL DETAILS:
* IP addresses
* Server names
* Domain information
* Protocol indicators
* Status codes
* Timestamps
* Version numbers

USER INTERFACE:
* Menu items
* Button text
* Tab labels
* Field names
* Dialog content
* Tooltips
* Status bar text

FORMAT:
Category: [type of element]
Location: [where in image]
Content: [exact text]
Context: [if part of larger message]

RULES:
- Capture ALL text verbatim
- Include partial/truncated messages
- Report exact error codes
- Note any system paths or URLs
- Include technical parameters
- Preserve original formatting
"""

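# Illustrative sketch (not part of the original module): one way to turn a
# response that follows the FORMAT section of PROMPT_SYSTEM_ERRORS into
# dictionaries. The helper name `parse_error_records` is hypothetical. It
# assumes the model emits one "Key: value" pair per line and starts each new
# element with a "Category:" line; real model output may deviate from this.
def parse_error_records(response: str) -> list[dict[str, str]]:
    """Group Category/Location/Content/Context lines into one dict per element."""
    known_keys = {"category", "location", "content", "context"}
    records: list[dict[str, str]] = []
    current: dict[str, str] = {}
    for raw_line in response.splitlines():
        # Split on the first colon only, so values such as URLs stay intact.
        key, sep, value = raw_line.partition(":")
        key = key.strip("*- \t").lower()
        if not sep or key not in known_keys:
            continue  # skip blank lines and free-form commentary
        if key == "category" and current:
            records.append(current)  # a new element starts here
            current = {}
        current[key] = value.strip()
    if current:
        records.append(current)
    return records
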
# 5. Peripheral details prompt
PROMPT_PERIPHERAL = """
Execute a thorough OCR scan capturing ALL text elements including peripheral and subtle details:

SCAN LEVELS:

1. PRIMARY TEXT
- Main content
- Headers
- Titles
- Labels

2. SECONDARY ELEMENTS
- Footnotes
- References
- Timestamps
- IDs/Codes

3. INTERFACE TEXT
- Navigation elements
- Buttons
- Menu items
- Status indicators

4. BACKGROUND ELEMENTS
- Watermarks
- Background text
- Faint prints
- Overlays

5. TECHNICAL DETAILS
- Version numbers
- System messages
- Protocol references
- Error codes

6. METADATA
- Document properties
- Page information
- System status
- Environmental data

EXTRACTION RULES:
- Report ALL text regardless of visibility level
- Include partial or cut-off text
- Note repeated elements
- Preserve special characters
- Maintain original formatting
- Capture alphanumeric codes

FORMAT:
Use hierarchical bullet points (*) with clear section separation
Mark unclear or partially visible text with [...]
"""

# 6. Minimal prompt (for quick tests)
PROMPT_MINIMAL = """
Extract ALL visible text from the image:
- Include everything, no matter how small or faint
- Keep exact formatting and punctuation
- List each text element with a bullet point (*)
- Do not interpret or modify anything
"""

# 7. Scientific analysis prompt
PROMPT_SCIENTIFIC = """
Perform precise scientific document text extraction:

CAPTURE CATEGORIES:

1. NUMERICAL DATA
* All measurements and values
* Units and scales
* Statistical information
* Calibration data
* Error margins
* Reference values

2. METHODOLOGICAL INFORMATION
* Protocol references
* Standard methods
* Test conditions
* Equipment specifications
* Environmental parameters

3. IDENTIFICATION
* Sample IDs
* Batch numbers
* Test references
* Operator codes
* Laboratory stamps

4. TEMPORAL DATA
* Test dates/times
* Incubation periods
* Measurement intervals
* Timestamp formats

5. QUALITY INDICATORS
* Control values
* Validation status
* Compliance markers
* Certification references

FORMAT:
* Use exact notation as shown
* Preserve all decimal places
* Maintain scientific notation
* Include all ± symbols
* Keep unit formatting

RULES:
- Extract ALL technical notation
- Preserve mathematical symbols
- Include partial measurements
- Note any quality stamps
- Capture calibration notes
"""

# 8. Prompt optimized for administrative documents
PROMPT_ADMINISTRATIVE = """
Extract all text from this administrative document with high attention to detail:

DOCUMENT SECTIONS:

1. HEADER INFORMATION
* Organization name/logo text
* Document title
* Reference numbers
* Date stamps
* Page numbers

2. IDENTIFICATION DATA
* File numbers
* Case references
* Client/Subject IDs
* Department codes
* Process numbers

3. STATUS INFORMATION
* Current state
* Processing stage
* Validation marks
* Approval stamps
* Priority indicators

4. CONTACT DETAILS
* Names and titles
* Service identifiers
* Department references
* Location codes
* Contact numbers

5. PROCESSING MARKS
* Reception stamps
* Validation marks
* Processing dates
* Routing information
* Priority codes

6. FOOTER DATA
* Document references
* Version information
* System identifiers
* Page information
* Classification marks

EXTRACTION RULES:
- Capture ALL administrative marks
- Include partial stamps
- Note all reference numbers
- Preserve date formats
- Include classification codes
- Report status indicators

FORMAT:
* Use exact text as shown
* Maintain original formatting
* Include all administrative symbols
* Preserve stamp text layout
"""

# Dictionary of prompts, to make testing easier
PROMPTS = {
    "detailed": PROMPT_DETAILED,
    "spatial": PROMPT_SPATIAL,
    "laboratory": PROMPT_LABORATORY,
    "system_errors": PROMPT_SYSTEM_ERRORS,
    "peripheral": PROMPT_PERIPHERAL,
    "minimal": PROMPT_MINIMAL,
    "scientific": PROMPT_SCIENTIFIC,
    "administrative": PROMPT_ADMINISTRATIVE
}

# Recommended parameters for each prompt
RECOMMENDED_PARAMS = {
    "detailed": {"temperature": 1.5, "top_p": 0.85},
    "spatial": {"temperature": 1.8, "top_p": 0.9},
    "laboratory": {"temperature": 1.2, "top_p": 0.8},
    "system_errors": {"temperature": 1.4, "top_p": 0.85},
    "peripheral": {"temperature": 1.6, "top_p": 0.87},
    "minimal": {"temperature": 1.0, "top_p": 0.7},
    "scientific": {"temperature": 1.3, "top_p": 0.82},
    "administrative": {"temperature": 1.4, "top_p": 0.83}
}
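
# Illustrative sketch (not part of the original module): PROMPTS and
# RECOMMENDED_PARAMS are meant to share the same keys, so a small check can
# catch a prompt that was added to one dictionary but not the other. The
# helper name `find_key_mismatches` is hypothetical.
def find_key_mismatches(prompts: dict, params: dict) -> dict:
    """Return the keys that appear in only one of the two dictionaries."""
    return {
        "missing_params": sorted(set(prompts) - set(params)),
        "missing_prompts": sorted(set(params) - set(prompts)),
    }

# Possible use: find_key_mismatches(PROMPTS, RECOMMENDED_PARAMS)
# Both lists are empty when the two dictionaries have identical keys.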

def get_prompt(prompt_type: str) -> str:
    """
    Return a specific prompt by name.

    Args:
        prompt_type: The type of prompt to retrieve

    Returns:
        The matching prompt, or the detailed prompt by default
    """
    return PROMPTS.get(prompt_type, PROMPT_DETAILED)

def get_recommended_params(prompt_type: str) -> dict:
    """
    Return the recommended parameters for a prompt type.

    Args:
        prompt_type: The type of prompt

    Returns:
        Dictionary of recommended parameters (those of the detailed prompt
        if the type is unknown)
    """
    return RECOMMENDED_PARAMS.get(prompt_type, RECOMMENDED_PARAMS["detailed"])
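

# Illustrative sketch (not part of the original module): how a prompt and its
# recommended parameters might be combined into a request body. It ASSUMES the
# vision model is served through Ollama's /api/generate endpoint, which the
# source does not state. The helper name `build_generate_payload` and the
# model name in the usage comment are hypothetical.
import base64


def build_generate_payload(model: str, prompt: str, image_bytes: bytes, params: dict) -> dict:
    """Build a JSON-serializable body with the image encoded as base64."""
    return {
        "model": model,
        "prompt": prompt.strip(),  # the prompts above start and end with a newline
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
        "options": dict(params),  # copy, so the shared defaults are not mutated
    }

# Possible use, assuming `image_bytes` holds the raw bytes of a screenshot:
# payload = build_generate_payload(
#     "llama3.2-vision", get_prompt("minimal"), image_bytes,
#     get_recommended_params("minimal"),
# )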