Update WAV encoding logic to convert audio to a NumPy array, write it to a
temporary file with soundfile, and remove that file in a finally block.
This works around the BytesIO limitation and improves the reliability of
the TTS endpoint.
Add support for downloading and loading the TTS model from ModelScope, with a fallback to the HuggingFace mirror.
Implement a `documentLoadStatus` property and helper functions in `office.js` to track file loading state.
Improve request cancellation logic in `api.js`, ensuring proper cancel URL resolution and request‑id handling.
These changes enhance robustness, reduce external dependencies, and provide better UX for office file handling.
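The temp-file write with cleanup described above can be sketched roughly as follows. This is a minimal illustration, not the endpoint's actual code: it swaps soundfile and NumPy for the standard-library `wave` and `struct` modules so it runs without extra dependencies, and the function name `encode_wav_bytes` and the default sample rate are hypothetical.

```python
import os
import struct
import tempfile
import wave


def encode_wav_bytes(samples, sample_rate=24000):
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes.

    Hypothetical helper: the stdlib `wave` module stands in for soundfile.
    """
    # Clamp each sample and convert it to a signed 16-bit integer.
    pcm = struct.pack(
        "<%dh" % len(samples),
        *(int(max(-1.0, min(1.0, s)) * 32767) for s in samples),
    )
    # Create the temp file and close the descriptor so the writer can
    # reopen the file by path.
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        with wave.open(path, "wb") as wav_file:
            wav_file.setnchannels(1)
            wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
            wav_file.setframerate(sample_rate)
            wav_file.writeframes(pcm)
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        # Runs on success and on error, so no temp file is left behind.
        if os.path.exists(path):
            os.remove(path)
```

The returned bytes can be sent directly as the response body; the `finally` block is what keeps failed encodes from accumulating files in the temp directory.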
- Add AGENTS.md knowledge base with project documentation
- Move UserPreferences model to separate models.py file
- Extract API_KEY to environment variable for security
- Enhance Univer Editor with PPTX support and improved UI
- Improve file system handling with binary file detection
- Add HF_ENDPOINT mirror for better China connectivity
- Clean up unused imports and code structure
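One common way to do the binary file detection mentioned above is a null-byte check followed by a UTF-8 decode attempt on a leading sample. The sketch below assumes that heuristic; the function name `looks_binary` and the 8 KiB sample size are hypothetical and may differ from what the project uses.

```python
import codecs


def looks_binary(data: bytes, sample_size: int = 8192) -> bool:
    """Guess whether `data` is binary by inspecting its first bytes."""
    sample = data[:sample_size]
    if not sample:
        return False  # treat empty files as text
    if b"\x00" in sample:
        return True  # NUL bytes almost never appear in text files
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character that was cut off
        # at the end of the sample, but still rejects invalid bytes.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return True
    return False
```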
Added a new `.coveragerc` file configuring coverage thresholds and exclusions.
Included `pytest.ini` to enable coverage reporting for multiple backend modules (`main`, `llm`, `prompt`, `geoip`, `tts_asr`) with a 90% fail-under requirement and detailed HTML output.
Implemented a suite of unit tests:
* `test_geoip.py` – validates geo‑location lookup logic.
* `test_llm_extended.py` – tests LLM response extraction and Ollama interactions.
* `test_main_endpoints.py` – covers API endpoints for completions, OCR, and TTS.
* `test_prompt_extended.py` – verifies language sanitization, timestamp generation, and prompt building.
* `test_tts_asr_coverage.py` – checks device detection, cache clearing, and model loading under various environment configurations.
* `test_tts_asr_extended.py` – further tests TTS/ASR device selection and timeouts.
Updated `backend/requirements.txt` to use newer, compatible packages, removed obsolete testing dependencies, and added `qwen-tts`.
Modified `backend/tts_asr.py` to work with the new `Qwen3TTSModel`, simplified imports, and adjusted device mapping logic.
Additionally, frontend changes added a new `TreeNodeItem` component, updated Markdown rendering, added TTS instruction fields, and reworked context menu handling.
No breaking changes were introduced.
Introduce a comprehensive TTS/ASR module that:
- Adds /v1/tts-asr/config, /status, /warmup, /tts, /asr endpoints with detailed JSON responses
- Implements Apple‑Silicon detection, device selection (MPS/CUDA/CPU), and memory limiting logic
- Supports selectable model size, quantization, and offline mode via environment variables
- Adds robust audio validation and multi‑path resampling fallback
- Provides new README sections for API usage, device detection, and performance benchmarking
- Includes a full testing suite: unit tests, integration tests, macOS simulation and performance reports
- Updates backend dependencies and CI scripts
- Adds new front‑end views and components for Univer editor integration
All changes are backward compatible; new features are exposed through environment variables and new API routes.
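The device selection step could look roughly like the sketch below. It is an illustration under assumptions: the preference order (MPS on Apple Silicon, then CUDA, then CPU) is inferred from the list above, the function names are hypothetical, and the availability flags are passed in as plain booleans so the example runs without PyTorch installed.

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True on macOS running on an arm64 (Apple Silicon) CPU."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


def select_device(cuda_available, mps_available, apple_silicon):
    """Pick a device string; the order here is an assumption."""
    if apple_silicon and mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"
```

With PyTorch, the two flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.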
Introduce ContextMenu.vue and FileContent.vue components for interactive file operations
and file preview.
Update FileTree to support root drop, integrate the new components into DocsView,
and refresh i18n strings for file actions.
Refactor MilkdownEditor to embed TTS menu and player.
Add a file tree UI and corresponding composable for local file management.
Introduce TTS menu and player components for voice synthesis integration.
Add new EditorView and DocsView routes and update SettingsPanel view switching.
Enhance Mermaid plugin with improved styling and action buttons.
Adds a DocBlock component that renders embedded documents, new export buttons for DOCX
and PDF, and updates the file‑upload picker to accept *.txt, *.docx, *.pptx, and *.pdf.
Introduces a DOCX→PDF conversion bridge in the backend and new /tts and /asr
endpoints that expose TTS and speech‑recognition functionality. The README is
rewritten to describe the new features and clean up legacy documentation. All
changes are backward‑compatible and do not introduce breaking API changes.
Added new upload functionality to the editor supporting doc/docx/ppt/pptx/pdf/zip/txt/json files. Includes:
- New upload button with file input
- File type detection utilities (isTextFile, isConvertibleFile)
- Initial markdown sync with trailing whitespace normalization
- Warning messages for unsupported file types
Add LANGUAGE_SYNONYMS dictionary to map language aliases to canonical IDs,
_canonical_language_id() to normalize language identifiers, and
_language_guidance() to provide language-specific instructions for LLM
code generation. This improves language detection and ensures consistent
prompt context across different language format variations.
- Add support for cancelling in-progress LLM completion requests via a new /v1/completions/cancel endpoint with task tracking
- Implement Mermaid diagram rendering in the Milkdown editor with a new mermaidPlugin
- Update copilotPlugin to properly abort requests with descriptive reasons
- Refactor settings panel to handle system theme changes reactively
- Add camera capture support for image uploads
Separate prompt generation into system and user prompts for better LLM instruction following. The backend now builds a detailed system prompt with constraints for math formatting, code block handling, boundary newlines, and OCR safety, while the user prompt carries the context and completion state flags. Added corresponding tests for both modules.
The Ollama API expects the "think" parameter instead of "thinking". Also update the API base URL in the frontend configuration to point to the correct endpoint.
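For illustration, a request body using the corrected key might be built as below. The helper name `build_ollama_payload` and the model name are hypothetical; only the `think` key itself reflects the fix described here.

```python
def build_ollama_payload(model, prompt, think=None):
    """Build a JSON-serializable body for an Ollama generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if think is not None:
        # The key is "think", not "thinking".
        payload["think"] = think
    return payload


# Hypothetical model name, used only for this example.
example = build_ollama_payload("example-model", "Continue: 2 + 2 =", think=True)
```

Leaving the key out when `think` is None keeps the request valid for models that do not support thinking.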
Add API key security using fastapi.security.APIKeyHeader to protect the /v1/completions and /v1/ocr endpoints. Updated the frontend to include the X-API-Key header in API requests. Also changed the default API base URL from http://149.104.29.239:8001 to https://api.learnteach.tech:8002.
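The key comparison behind such a check can be sketched with the standard library alone. This is a minimal illustration, not the project's code: `is_authorized` is a hypothetical helper, and reading the expected key from an `API_KEY` environment variable is an assumption. In a FastAPI app, the presented value would typically come from an `APIKeyHeader(name="X-API-Key")` dependency.

```python
import hmac
import os


def is_authorized(presented_key, expected_key=None):
    """Return True only if the presented key matches the configured one."""
    if expected_key is None:
        # Assumption: the server-side key is read from the environment.
        expected_key = os.environ.get("API_KEY", "")
    if not expected_key or not presented_key:
        return False  # reject when either side is missing or empty
    # compare_digest runs in constant time, which avoids leaking how
    # many leading characters of the key were correct.
    return hmac.compare_digest(
        presented_key.encode("utf-8"), expected_key.encode("utf-8")
    )
```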
Add optional thinking parameter to the call_ollama function and pass it from the request. Also enhance timezone handling in prompt generation to support configurable timezone preferences.
- Add privacy mode to hide IP and user preferences from AI requests
- Add model thinking levels (low/medium/high) for context analysis depth
- Add PWA support with service worker, manifest, and app icons
- Add SettingsPanel for user preferences (theme, background, language)
- Add i18n translations for en/zh/ja/ko/de/fr
- Add Pinia store for centralized settings management
- Update backend to support user preferences and thinking levels
- Update config to use absolute API URLs
- Add GeoLite2-City.mmdb database for IP lookup
- Create geoip.py module for IP location services
- Extract client IP from requests and log location info
- Pass location context to LLM prompts for enhanced responses
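Extracting the client IP before the GeoIP lookup might look like the sketch below. It assumes the backend sits behind a trusted reverse proxy that sets `X-Forwarded-For`, and that header names arrive lowercased; the function name `extract_client_ip` is hypothetical, and the database lookup itself is left out.

```python
import ipaddress


def extract_client_ip(headers, peer_ip):
    """Return the first public IP in X-Forwarded-For, else the peer IP.

    Only trust this header when a proxy you control sets it.
    """
    forwarded = headers.get("x-forwarded-for", "")
    for candidate in forwarded.split(","):
        candidate = candidate.strip()
        if not candidate:
            continue
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # skip malformed entries
        if address.is_global:
            return str(address)
    return peer_ip
```

Skipping private and loopback addresses matters because a city-level lookup on them returns nothing useful.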
- Deleted `windowDelineation.test.ts` as it is no longer needed.
- Removed `index.ts` and `tokenizer.ts` from the tokenization module due to refactoring.
- Eliminated unused types related to authentication, code citation, context provider API, and core functionalities.
- Cleaned up the `status.ts` file by removing the `StatusKind` type definition.
Remove sensitive environment files from repository tracking and add comprehensive Python ignore patterns covering virtual environments, cache files, and environment variable files. Also remove staged __pycache__ binary files.
- Configure Vite dev server to proxy /v1 requests to backend
- Change backend port from 8000 to 8001
- Simplify API URLs to relative paths instead of absolute localhost URLs
- Increase debounce delay from 500ms to 1000ms for better stability
- Update README documentation to reflect all changes
- Implement SHA-256 image hashing to cache OCR results and avoid re-processing identical images
- Add 100MB file size limit for image uploads with user-friendly error messages
- Clear ghost suggestions when uploading new images to prevent interference
- Optimize size limit calculation in copilot plugin to include OCR context
- Remove debug logging from production code
- Add image processing optimization plan document
BREAKING CHANGE: Image upload size limit is now enforced at 100MB (previously unlimited)
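The hash-based OCR cache and the size check can be sketched together as follows. This is an illustration in Python under assumptions: the class name `OcrCache` and the injected `run_ocr` callable are hypothetical, and "100MB" is interpreted here as 100 * 1024 * 1024 bytes.

```python
import hashlib

# Assumption: 100MB means 100 MiB.
MAX_IMAGE_BYTES = 100 * 1024 * 1024


class OcrCache:
    """Cache OCR text by the SHA-256 digest of the image bytes."""

    def __init__(self, run_ocr):
        self._run_ocr = run_ocr  # callable: bytes -> str
        self._results = {}

    def get_text(self, image_bytes):
        if len(image_bytes) > MAX_IMAGE_BYTES:
            raise ValueError("Image exceeds the 100MB upload limit")
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest not in self._results:
            # Only the first upload of a given image triggers OCR.
            self._results[digest] = self._run_ocr(image_bytes)
        return self._results[digest]
```

Keying on the content digest rather than the file name means a renamed or re-uploaded copy of the same image still hits the cache.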
- Redesign theme toggle with improved visual effects including gradient overlays, enhanced shadows, and smoother cubic-bezier transitions
- Update toggle dimensions and icon positioning for better visual balance
- Add SVG filter effects for sun/moon icons in dark mode
- Replace English UI text with Chinese localization in MilkdownEditor
- Refactor copilotPlugin by removing unused decoration functions and improving ghost mark text node handling
- Implemented a new composable `useTheme` for managing theme state.
- Added functions to read and write theme preference to local storage.
- Applied theme styles to the DOM based on user preference.
- Introduced a toggle function to switch between light and dark themes.
refactor: enhance copilot plugin functionality
- Improved request handling with sequence and document versioning.
- Refactored ghost text handling to improve clarity and efficiency.
- Updated markdown insertion logic to handle parsed content more robustly.
- Enhanced error handling and logging for better debugging.
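The sequence and document versioning idea can be sketched as a small tracker that discards responses which are no longer current. This is a language-neutral illustration written in Python, not the plugin's code; `CompletionTracker` and its method names are hypothetical.

```python
class CompletionTracker:
    """Decide whether an async completion response is still current."""

    def __init__(self):
        self._sequence = 0      # incremented for every request sent
        self._doc_version = 0   # incremented for every document edit

    def document_changed(self):
        self._doc_version += 1

    def start_request(self):
        """Return a ticket to attach to the outgoing request."""
        self._sequence += 1
        return (self._sequence, self._doc_version)

    def should_apply(self, ticket):
        """Accept only the latest request against an unchanged document."""
        sequence, doc_version = ticket
        return sequence == self._sequence and doc_version == self._doc_version
```

A response is dropped if a newer request was issued or if the user edited the document while the request was in flight.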
style: update global styles for light and dark themes
- Defined CSS variables for light and dark themes to streamline styling.
- Improved overall styling consistency and responsiveness.
- Added transitions for smoother theme changes and interactions.
- Replace HTML comment OCR metadata with inline `<OCR:...>` tags
- Implement serializer-based markdown conversion for prefix/suffix content
- Add extractTextFromOCR utility function for text extraction
- Enable Table, Diagram, and ListCheck features in MilkdownEditor
- Add periodic debug logging for document state analysis