Intelligent Summarization System

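A brief technical overview of an intelligent summarization system built around speech recognition, summary generation, and desktop interaction.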

#tech/ai #type/synthesis #status/growing

[!info] related notes

Parent MOC: AI MOC

Related resources: transformers, openai-whisper, pytorch

Related solutions: hyprnote, hyprnote-startup

Scope

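This note is mainly a concrete system-design example. It shows how "speech recognition + summarization + desktop interaction" can be combined and put into practice.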

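At its core, the intelligent summarization system takes audio content through transcription, summarization, and structuring, and turns it into knowledge that can be searched, reviewed, and processed further.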
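The flow described above (transcribe, summarize, structure) can be sketched as a small pipeline. This is a minimal illustration, not the system's actual code: `SummaryRecord`, `extract_keywords`, and `process_audio` are hypothetical names, the transcription and summarization steps are passed in as plain callables so that Whisper or an LLM can be plugged in later, and the keyword picker is a naive whitespace-based placeholder that would not work for unsegmented languages such as Chinese.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SummaryRecord:
    """Structured result for one audio source (hypothetical schema)."""
    source: str
    transcript: str
    summary: str
    keywords: List[str] = field(default_factory=list)


def extract_keywords(text: str, top_n: int = 5) -> List[str]:
    """Naive placeholder: most frequent words longer than three characters."""
    counts: dict = {}
    for raw in text.lower().split():
        word = raw.strip(".,!?;:\"'()")
        if len(word) > 3:
            counts[word] = counts.get(word, 0) + 1
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


def process_audio(
    source: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> SummaryRecord:
    """Run audio -> transcript -> summary -> structured record."""
    transcript = transcribe(source)
    summary = summarize(transcript)
    return SummaryRecord(
        source=source,
        transcript=transcript,
        summary=summary,
        keywords=extract_keywords(transcript),
    )
```

Keeping the two model-backed steps as injected callables means the record structure and the storage or search layers can be tested with stubs, without loading any model.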

● Framework: react + typescript
● Desktop app: tauri
● UI components: ant-design
● State management: [[zustand]]
● Real-time communication: web-socket
● Build tool: vite-moc

● API service: fastapi
● AI processing: python + transformers + pytorch
● Speech recognition: openai-whisper + streaming integration
● Large language model: Ollama + a general-purpose LLM
● Database: sqlite + sqlalchemy
● Async processing: celery + redis
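As one illustration of the backend's LLM step, here is a minimal sketch of requesting a summary from a locally running Ollama server through its `/api/generate` endpoint. It uses only the standard library. The model name `llama3`, the prompt wording, and the helper names are assumptions for illustration; the note itself only specifies "Ollama + a general-purpose LLM".

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_summary_payload(transcript: str, model: str = "llama3") -> dict:
    """Build the JSON body for a non-streaming generate request.

    The model name and prompt text are illustrative assumptions.
    """
    prompt = (
        "Summarize the following transcript in a few bullet points:\n\n"
        + transcript
    )
    return {"model": model, "prompt": prompt, "stream": False}


def summarize_with_ollama(
    transcript: str,
    model: str = "llama3",
    url: str = OLLAMA_URL,
    timeout: float = 120.0,
) -> str:
    """POST the payload to Ollama and return the generated text."""
    data = json.dumps(build_summary_payload(transcript, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False, the generated text is in the "response" field.
    return body["response"]
```

In the stack listed above, a call like this would more likely run inside a celery task than inside a fastapi request handler, since summarizing a long transcript can take a while.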

● Speech transcription: Whisper Large V3 (multilingual support)
● Text summarization: general-purpose LLM
● Translation models: NLLB-200 / M2M-100
● Embedding models: BGE / Sentence-Transformers
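To show how an embedding model could make stored summaries searchable, here is a small sketch of cosine-similarity ranking. The `embed` argument is a hypothetical stand-in for a real encoder (for example, the `encode` method of a Sentence-Transformers model loaded with a BGE checkpoint). The function names are illustrative, and a real deployment would probably cache document vectors rather than re-embedding them on every query.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query: str,
    documents: List[str],
    embed: Callable[[str], Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the top_k (score, document) pairs most similar to the query."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```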