Add image button with dropdown menu for uploading local images or inserting from URL. Integrate VLM-based OCR to extract text context from images and include it in AI suggestions. Implement document size limits that disable AI when a document exceeds the threshold. Refactor copilot plugin with per-view runtime state and OCR context injection. Add OCR cache utility for managing image metadata. Add code splitting configuration for optimized bundle size.
94 lines
3.0 KiB
Python
from typing import Tuple

# Character budgets for the context sent to the model: the prefix keeps its
# last MAX_PREFIX_CHARS characters, the suffix its first MAX_SUFFIX_CHARS.
MAX_PREFIX_CHARS = 12000
MAX_SUFFIX_CHARS = 4000


def _sanitize_language_id(language_id: str) -> str:
    """Reduce language_id to a short token, falling back to "markdown"."""
    if not language_id:
        return "markdown"
    # Keep only alphanumerics and "-_+." so the value is safe to interpolate
    # into the prompt text.
    allowed = []
    for ch in language_id.strip():
        if ch.isalnum() or ch in "-_+.":
            allowed.append(ch)
    value = "".join(allowed)[:32]
    return value or "markdown"


def _prepare_context(prefix: str, suffix: str) -> Tuple[str, str]:
    """
    Prepare prefix/suffix for model completion context.
    Keep the historical one-char lookahead behavior to reduce boundary drift.
    """
    if suffix:
        # Move the first suffix character to the end of the prefix.
        prefix = prefix + suffix[0]
        suffix = suffix[1:]
    # Keep the tail of the prefix and the head of the suffix.
    return prefix[-MAX_PREFIX_CHARS:], suffix[:MAX_SUFFIX_CHARS]


def build_prompt(prefix: str, suffix: str, language_id: str = "markdown") -> str:
    """Build the inline-completion prompt for the text around the cursor."""
    safe_language_id = _sanitize_language_id(language_id)
    recent_prefix, recent_suffix = _prepare_context(prefix, suffix)

    prompt = f"""You are an inline completion engine for a {safe_language_id} editor with ghost-text suggestions.

Your job:
- Return ONLY the text that should be inserted at the cursor between PREFIX and SUFFIX.
- Prefer a meaningful, non-empty insertion with moderate length.
- Avoid overly short outputs with little information value.

Important context:
- PREFIX may contain hidden OCR metadata in HTML comments such as <!--OCR:...-->.
- These comments are non-visible context only.
- Never copy, rewrite, or emit HTML comments in output.
- Never output <!-- or -->.

Hard rules:
1. Seamless join:
PREFIX + OUTPUT + SUFFIX must read naturally as one continuous document.
2. No suffix repetition:
Do NOT repeat text that already appears at the start of SUFFIX.
3. Balanced length:
Prefer concise but meaningful continuation, not ultra-short fragments.
Default target is 20-120 characters and 1-3 lines.
You may go shorter only when syntax requires it.
4. Avoid trivial output:
Do not output only punctuation or filler such as ".", ",", ";", ":".
Do not output just one token unless it is structurally necessary.
5. Preserve local style:
Match nearby language, tone, punctuation, spacing, and indentation.
6. Markdown awareness:
Continue active list/checkbox/ordered-list patterns when applicable.
Preserve indentation in nested list/code contexts.
Close obvious unclosed inline markdown markers only when needed to bridge.
7. Strict output format:
Output insertion text only.
No explanations, labels, quotes, or code fences.

Decision policy:
- If PREFIX already connects naturally to SUFFIX, add a brief but useful continuation when possible.
- If uncertain, prefer a complete short phrase or sentence with clear meaning.

Examples:
<PREFIX>The quick brown fox </PREFIX>
<SUFFIX>jumps over the lazy dog.</SUFFIX>
Output: "moved quietly and then "

<PREFIX>## TODO\\n- [ ] Buy milk\\n- [ ] </PREFIX>
<SUFFIX></SUFFIX>
Output: "Write release notes and share draft with team"

Now produce the insertion.

<PREFIX>
{recent_prefix}
</PREFIX>

<SUFFIX>
{recent_suffix}
</SUFFIX>

Output:"""

    return prompt.strip()
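
The one-character lookahead and the truncation in `_prepare_context` can be checked in isolation. The following is a minimal sketch that restates the helper so it runs on its own; the name `prepare_context` and the sample strings are illustrative assumptions, not part of the module.

```python
from typing import Tuple

MAX_PREFIX_CHARS = 12000
MAX_SUFFIX_CHARS = 4000


def prepare_context(prefix: str, suffix: str) -> Tuple[str, str]:
    """Standalone restatement of _prepare_context, for illustration only."""
    if suffix:
        # Move the first suffix character to the end of the prefix.
        prefix = prefix + suffix[0]
        suffix = suffix[1:]
    # Keep the tail of the prefix and the head of the suffix.
    return prefix[-MAX_PREFIX_CHARS:], suffix[:MAX_SUFFIX_CHARS]


# The character right after the cursor moves into the prefix.
short_prefix, short_suffix = prepare_context("Hello", " world")
print(repr(short_prefix), repr(short_suffix))  # 'Hello ' 'world'

# Oversized inputs are cut down to the configured budgets.
long_prefix, long_suffix = prepare_context("a" * 20000, "b" * 10000)
print(len(long_prefix), len(long_suffix))  # 12000 4000
```

The moved character counts toward the prefix budget, so a prefix that is already at the limit loses its oldest character to make room for it.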