reazon-research/japanese-wav2vec2-base
This is a Japanese wav2vec 2.0 Base model pre-trained on the ReazonSpeech v2.0 corpus.
We also release the CTC model reazon-research/japanese-wav2vec2-base-rs35kh,
which is derived from this model.
Usage
import librosa
import torch
from transformers import AutoFeatureExtractor, AutoModel

feature_extractor = AutoFeatureExtractor.from_pretrained("reazon-research/japanese-wav2vec2-base")
model = AutoModel.from_pretrained("reazon-research/japanese-wav2vec2-base")

# audio_file is the path to your audio file; librosa resamples it to 16 kHz on load.
audio, sr = librosa.load(audio_file, sr=16_000)
inputs = feature_extractor(
    audio,
    return_tensors="pt",
    sampling_rate=sr,
)

# Run the forward pass without tracking gradients.
with torch.inference_mode():
    outputs = model(**inputs)
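The model returns one feature vector per frame rather than per audio sample, so it can help to estimate how many frames a clip will produce. The sketch below assumes this checkpoint uses the standard wav2vec 2.0 Base convolutional feature encoder (the kernel sizes and strides listed are the usual defaults, not values confirmed by this model card, so check the checkpoint's config.json). The helper name num_output_frames is hypothetical.

```python
# Assumed conv feature encoder layout (standard wav2vec 2.0 Base defaults).
# Verify against conv_kernel / conv_stride in the checkpoint's config.json.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_output_frames(num_samples: int) -> int:
    """Estimate the number of output frames for a waveform of num_samples."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        if length < kernel:
            # Input is too short to pass through this convolution layer.
            return 0
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


# One second of 16 kHz audio yields 49 frames under these assumptions.
print(num_output_frames(16_000))  # 49
```

Under these assumptions the overall stride is 320 samples (about 20 ms at 16 kHz), so the time dimension of the model output should match this estimate for a single unpadded clip.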
Citation
@misc{reazon-research-japanese-wav2vec2-base,
title={japanese-wav2vec2-base},
author={Sasaki, Yuta},
url = {https://huggingface.co/reazon-research/japanese-wav2vec2-base},
year = {2024}
}
License