# llm-in-text/backend/prompt.py

from typing import Tuple

MAX_PREFIX_CHARS = 12000
MAX_SUFFIX_CHARS = 4000


def _sanitize_language_id(language_id: str) -> str:
    """Return a safe language id for the prompt, falling back to "markdown".

    Keeps only alphanumerics and "-_+.", and caps the result at 32 characters.
    """
    if not language_id:
        return "markdown"
    value = "".join(
        ch for ch in language_id.strip() if ch.isalnum() or ch in "-_+."
    )[:32]
    return value or "markdown"


def _prepare_context(prefix: str, suffix: str) -> Tuple[str, str]:
    """
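    Prepare prefix/suffix for model completion context.

    Keeps the historical one-char lookahead: the first character of the suffix
    is moved to the end of the prefix (("ab", "cd") becomes ("abc", "d")) to
    reduce boundary drift. The prefix is then cut to its last MAX_PREFIX_CHARS
    characters and the suffix to its first MAX_SUFFIX_CHARS characters.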
    """
    if suffix:
        prefix = prefix + suffix[0]
        suffix = suffix[1:]
    return prefix[-MAX_PREFIX_CHARS:], suffix[:MAX_SUFFIX_CHARS]


def build_prompt(prefix: str, suffix: str, language_id: str = "markdown") -> str:
    """Build the inline-completion prompt for the text around the cursor."""
    safe_language_id = _sanitize_language_id(language_id)
    recent_prefix, recent_suffix = _prepare_context(prefix, suffix)

    prompt = f"""You are an inline completion engine for a {safe_language_id} editor with ghost-text suggestions.

Your job:
- Return ONLY the text that should be inserted at the cursor between PREFIX and SUFFIX.
- Prefer a meaningful, non-empty insertion with moderate length.
- Avoid overly short outputs with little information value.

Important context:
- PREFIX may contain OCR metadata inline after images, e.g. ![alt](url) <OCR:description>.
- Each <OCR:...> tag is hidden context describing the image content.
- Never copy, rewrite, or emit OCR tags in output.
- Never output the literal "<OCR:" opener or the ">" that closes an OCR tag.

Hard rules:
1. Seamless join:
   PREFIX + OUTPUT + SUFFIX must read naturally as one continuous document.
2. No suffix repetition:
   Do NOT repeat text that already appears at the start of SUFFIX.
3. Balanced length:
   Prefer concise but meaningful continuation, not ultra-short fragments.
   Default target is 20-120 characters and 1-3 lines.
   You may go shorter only when syntax requires it.
4. Avoid trivial output:
   Do not output only punctuation or filler such as ".", ",", ";", ":".
   Do not output just one token unless it is structurally necessary.
5. Preserve local style:
   Match nearby language, tone, punctuation, spacing, and indentation.
6. Markdown awareness:
   Continue active list/checkbox/ordered-list patterns when applicable.
   Preserve indentation in nested list/code contexts.
   Close obvious unclosed inline markdown markers only when needed to bridge.
7. Strict output format:
   Output insertion text only.
   No explanations, labels, quotes, or code fences.

Decision policy:
- If PREFIX already connects naturally to SUFFIX, add a brief but useful continuation when possible.
- If uncertain, prefer a complete short phrase or sentence with clear meaning.

Examples:
<PREFIX>The quick brown fox </PREFIX>
<SUFFIX>jumps over the lazy dog.</SUFFIX>
Output: "moves quietly and then "

<PREFIX>## TODO\\n- [ ] Buy milk\\n- [ ] </PREFIX>
<SUFFIX></SUFFIX>
Output: "Write release notes and share draft with team"

Now produce the insertion.

<PREFIX>
{recent_prefix}
</PREFIX>

<SUFFIX>
{recent_suffix}
</SUFFIX>

Output:"""

    return prompt.strip()
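

# --- Illustrative sketch, not part of the original module -------------------
# Hard rule 2 in the prompt asks the model not to repeat the start of SUFFIX.
# Models do not always comply, so a caller might also trim such overlap after
# the fact. `trim_suffix_overlap` and its `min_overlap` parameter are
# hypothetical names introduced here for illustration only.
def trim_suffix_overlap(output: str, suffix: str, min_overlap: int = 3) -> str:
    """Drop the longest tail of `output` that duplicates the start of `suffix`.

    Overlaps shorter than `min_overlap` characters are kept, on the assumption
    that very short matches are more likely coincidence than repetition.
    """
    max_len = min(len(output), len(suffix))
    # Try the longest candidate overlap first so the whole repeat is removed.
    for size in range(max_len, min_overlap - 1, -1):
        if size > 0 and output.endswith(suffix[:size]):
            return output[:-size]
    return output


# Example use (assumes `completion` holds the raw model output and `suffix` is
# the suffix text the model was shown):
#     cleaned = trim_suffix_overlap(completion, suffix)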