Introduce a comprehensive TTS/ASR module that:
- Adds /v1/tts-asr/config, /status, /warmup, /tts, and /asr endpoints with detailed JSON responses
- Implements Apple‑Silicon detection, device selection (MPS/CUDA/CPU), and memory-limiting logic
- Supports selectable model size, quantization, and offline mode via environment variables
- Adds robust audio validation and a multi‑path resampling fallback
- Provides new README sections for API usage, device detection, and performance benchmarking
- Includes a full testing suite: unit tests, integration tests, macOS simulation, and performance reports
- Updates backend dependencies and CI scripts
- Adds new front‑end views and components for Univer editor integration

All changes are backward compatible; new features are exposed through environment variables and new API routes.
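The device-selection step described above could look roughly like the sketch below. It is not the module's actual code: the function names, the `TTS_ASR_DEVICE` override variable, and the MPS-before-CUDA priority are assumptions made for illustration. Availability flags are passed in so the logic stays independent of any particular ML framework.

```python
import os
import platform
from typing import Optional


def is_apple_silicon(system: Optional[str] = None, machine: Optional[str] = None) -> bool:
    """Return True when running on macOS with an arm64 CPU."""
    system = system if system is not None else platform.system()
    machine = machine if machine is not None else platform.machine()
    return system == "Darwin" and machine == "arm64"


def select_device(mps_available: bool, cuda_available: bool,
                  forced: Optional[str] = None) -> str:
    """Pick a device string; an explicit override wins, CPU is the fallback."""
    if forced:
        return forced.lower()
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


if __name__ == "__main__":
    # TTS_ASR_DEVICE is a hypothetical variable name, not taken from this PR.
    override = os.environ.get("TTS_ASR_DEVICE")
    # No framework is queried here, so both accelerators are reported as absent.
    print(is_apple_silicon(), select_device(False, False, forced=override))
```

In a real backend the two boolean flags would come from the framework's own availability checks rather than being hard-coded.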
API Benchmarking Report (2026-04-05 23:55:38)
Base URL: https://api.imageteach.tech:8002
Executive Summary
| Task | Success Rate | Avg TTFB | Avg Latency | P95 Latency | TPS | RPS |
|---|---|---|---|---|---|---|
| Completion-Short | 100.0% | 7519.5ms | 7520.1ms | 14075.8ms | 63.9 | 0.58 |
| Completion-Normal | 70.0% | 9184.3ms | 9184.8ms | 14619.5ms | 100.5 | 0.14 |
| Completion-Long | 100.0% | 22419.4ms | 22419.8ms | 39618.0ms | 852.5 | 0.21 |
| OCR-Concurrent | 0.0% | 0.0ms | 0.0ms | 0.0ms | 0.0 | 5.49 |
| TTS-Concurrent | 0.0% | 0.0ms | 0.0ms | 0.0ms | 0.0 | 11.27 |
| ASR-Concurrent | 0.0% | 0.0ms | 0.0ms | 0.0ms | 0.0 | 7.54 |
| Convert-Concurrent | 100.0% | 377.9ms | 378.6ms | 1017.7ms | 26.4 | 5.28 |
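The report does not state how these columns are computed. A minimal sketch of one plausible aggregation follows, assuming failed requests are recorded as `None`, latency statistics cover successful requests only, P95 uses the nearest-rank method, and RPS counts every request sent. The helper names are hypothetical.

```python
import math
from statistics import mean
from typing import Dict, List, Optional


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(latencies_ms: List[Optional[float]], duration_s: float) -> Dict[str, float]:
    """Aggregate one task's samples; None marks a failed request."""
    ok = [x for x in latencies_ms if x is not None]
    total = len(latencies_ms)
    return {
        "success_rate": 100.0 * len(ok) / total if total else 0.0,
        "avg_latency_ms": mean(ok) if ok else 0.0,
        "p95_latency_ms": percentile(ok, 95) if ok else 0.0,
        # RPS counts all requests, so a task can fail every call and still show a high RPS.
        "rps": total / duration_s if duration_s > 0 else 0.0,
    }


if __name__ == "__main__":
    print(summarize([100.0, 200.0, None, 300.0], duration_s=2.0))
```

Under these assumptions, a task whose requests all fail reports 0.0ms latencies alongside a non-zero RPS, which matches the shape of the OCR, TTS, and ASR rows.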
Stability & Context Analysis
Detailed analysis of how context length affects TTFB and overall performance.
Completion-Short Details
- Total Samples: 10
- Duration: 17.36s
Completion-Normal Details
- Total Samples: 10
- Duration: 70.66s
- Top Errors:
  - [504] 504 Gateway Time-out (openresty)
  - [504] 504 Gateway Time-out (openresty)
  - [504] 504 Gateway Time-out (openresty)
Completion-Long Details
- Total Samples: 10
- Duration: 47.09s
OCR-Concurrent Details
- Total Samples: 10
- Duration: 1.82s
- Top Errors:
  - [500] {"error":"model runner has unexpectedly stopped, this may be due to resource limitations or an internal error, check ollama server logs for details (status code: 500)"}
  - [500] {"error":"model runner has unexpectedly stopped, this may be due to resource limitations or an internal error, check ollama server logs for details (status code: 500)"}
  - [500] {"error":"model runner has unexpectedly stopped, this may be due to resource limitations or an internal error, check ollama server logs for details (status code: 500)"}
TTS-Concurrent Details
- Total Samples: 10
- Duration: 0.89s
- Top Errors:
  - [404] {"detail":"Not Found"}
  - [404] {"detail":"Not Found"}
  - [404] {"detail":"Not Found"}
ASR-Concurrent Details
- Total Samples: 10
- Duration: 1.33s
- Top Errors:
  - [404] {"detail":"Not Found"}
  - [404] {"detail":"Not Found"}
  - [404] {"detail":"Not Found"}
Convert-Concurrent Details
- Total Samples: 10
- Duration: 1.90s
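As a cross-check, the RPS column in the Executive Summary can be approximated from the per-task sample counts and durations listed in these detail sections. The sketch below assumes RPS is total samples divided by wall-clock duration; the report does not confirm this, and the `runs` mapping is simply transcribed from the figures above.

```python
# (total samples, duration in seconds) copied from the per-task details.
runs = {
    "Completion-Short": (10, 17.36),
    "Completion-Normal": (10, 70.66),
    "Completion-Long": (10, 47.09),
    "OCR-Concurrent": (10, 1.82),
    "TTS-Concurrent": (10, 0.89),
    "ASR-Concurrent": (10, 1.33),
    "Convert-Concurrent": (10, 1.90),
}


def rps(samples: int, duration_s: float) -> float:
    """Requests per second, counting failed requests as well."""
    return samples / duration_s


if __name__ == "__main__":
    for task, (samples, duration_s) in runs.items():
        print(f"{task}: {rps(samples, duration_s):.2f} req/s")
```

The longer runs reproduce the table exactly (0.58, 0.14, 0.21, 5.49). The sub-two-second runs differ in the second decimal place, which is consistent with the durations having been rounded to two decimals before being printed.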