台灣人工智慧社團 (Taiwan AI Community)
community
AI & ML interests
Speech Processing, Natural Language Processing, Computer Vision, Deep Learning
Organization Card
Deep Learning 101
The top private AI Meetup in Taiwan, launched in 2016
http://DeepLearning101.TWMAN.ORG | https://huggingface.co/DeepLearning101 | https://www.youtube.com/@DeepLearning101
Speech Processing (語音處理): pitfalls we ran into in speech processing, covering analysis and recognition of interviews and conversations.
Speech processing
Speech Recognition (語音識別)
Speaker Recognition (聲紋識別)
Speech Enhancement (語音增強)
Speech Separation (語音分離)
Speech Synthesis (語音合成)
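- VoiceFlow-TTS, speech synthesis with rectified flow matching, open-sourced by Shanghai Jiao Tong University: https://github.com/cantabile-kwok/VoiceFlow-TTS
- Coqui TTS, a new-generation open-source speech library, has reached 20.5k stars on GitHub: https://github.com/coqui-ai/TTS/
- LightGrad-TTS from Tsinghua University, with a streaming implementation: https://github.com/thuhcsi/LightGrad
- MeetVoice from Mobvoi (出門問問), making synthesized voices hard to tell from real ones
- VALL-E: Microsoft's new speech synthesis model can clone anyone's voice in 3 seconds
- BLSTM-RNN, Deep Voice, Tacotron… Have you mastered them all? A roundup of the essential classic speech synthesis models (Part 1)
- Tacotron2, GST, Glow-TTS, Flow-TTS… Have you mastered them all? A roundup of the essential classic speech synthesis models (Part 2)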
- Bark:https://github.com/suno-ai/bark
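Natural Language Processing, NLP (自然語言處理): pitfalls we ran into in natural language processing, covering analysis of and extraction from documents.
Large Language Models (LLMs): want one?
Extracting and analyzing doctors' orders from medical diagnosis certificates with a unified information extraction framework based on instruction tuning for machine reading comprehension: https://huggingface.co/spaces/DeepLearning101/IE101TW
Natural language processing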
Large Language Model (大語言模型)
Information/Event Extraction (資訊/事件擷取)
Machine Reading Comprehension (機器閱讀理解)
Named Entity Recognition (命名實體識別)
Correction (糾錯)
Classification (分類)
Similarity (相似度)
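Computer Vision (電腦視覺): analysis and detection on images of objects or scenes.
Fine-tuning PaddleOCR on medical diagnosis certificates and receipts, with annotations made in PPOCRLabel
Image processing: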
Optical Character Recognition (光學字元辨識)
- OCR for Traditional Chinese medical diagnosis certificates and receipts: PaddleOCR
- PaddleOCR
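For anyone preparing fine-tuning data of this kind, a minimal sketch of reading annotations may help. It assumes the layout PaddleOCR documents for detection labels (one line per image: the image path, a tab, then a JSON list of boxes with `transcription` and `points`, where `###` marks a box to ignore); check the label file your PPOCRLabel version actually writes. The function names, sample path, and sample text below are hypothetical, not taken from the meetup's own material.

```python
import json
from typing import Dict, List, Tuple


def parse_label_line(line: str) -> Tuple[str, List[Dict]]:
    """Split one annotation line into (image_path, list of text boxes)."""
    image_path, raw_boxes = line.rstrip("\n").split("\t", 1)
    return image_path, json.loads(raw_boxes)


def usable_transcriptions(boxes: List[Dict]) -> List[str]:
    """Keep text from boxes that are neither "###" placeholders nor marked difficult."""
    return [
        box["transcription"]
        for box in boxes
        if box["transcription"] != "###" and not box.get("difficult", False)
    ]


# Hypothetical sample line: the path and text are made up for illustration.
sample = (
    'receipts/0001.jpg\t'
    '[{"transcription": "診斷證明書", '
    '"points": [[10, 12], [180, 12], [180, 48], [10, 48]], "difficult": false}, '
    '{"transcription": "###", '
    '"points": [[10, 60], [90, 60], [90, 80], [10, 80]], "difficult": true}]'
)

path, boxes = parse_label_line(sample)
print(path, usable_transcriptions(boxes))
```

Filtering out ignored boxes before training is a common sanity check, since a label file with many `###` entries leaves little usable text for the recognition model.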
Document Layout Analysis (文件結構分析)
- arXiv-2020_LayoutLM
- arXiv-2021_LayoutLMv2
- arXiv-2021_LayoutXLM
- arXiv-2022_LayoutLMv3
Document Understanding (文件理解)
Object Detection (物件偵測)
Handwriting Recognition (手寫識別)
Face Recognition (人臉識別)