Mana Persian Piper (fa-IR)

This repository hosts a Persian (fa-IR) Piper TTS model trained for low-latency, high-quality speech synthesis.

The model is a medium-sized Piper checkpoint, fine-tuned on the Mana-TTS dataset to produce natural and intelligible Persian speech while remaining suitable for real-time and on-device inference.

Model Description

Architecture: Piper (medium)
Language: Persian (fa-IR)
Base Checkpoint: https://huggingface.co/SadeghK/persian-text-to-speech/tree/main/farsi/amir
Fine-tuning: ~1000 epochs on Mana-TTS
Training Dataset: https://huggingface.co/datasets/MahtaFetrat/Mana-TTS

This model was trained as part of a broader effort to build efficient Persian TTS systems that integrate well with lightweight and context-aware phonemization pipelines.

Inference

Install Piper

pip install piper-tts

Download the Model

git clone https://huggingface.co/MahtaFetrat/Mana-Persian-Piper
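
Hugging Face repositories usually store large binaries such as ONNX files through Git LFS, so a clone made without git-lfs installed may contain only a small text pointer instead of the model. The check below is a minimal sketch, not part of this repository: the helper name is_lfs_pointer is hypothetical, and the path assumes the clone command above was run in the current directory.

```python
from pathlib import Path

# First line of a Git LFS pointer file, as defined by the Git LFS spec.
LFS_SIGNATURE = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is a Git LFS pointer stub rather than real data."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_SIGNATURE))
    return head == LFS_SIGNATURE


if __name__ == "__main__":
    # Assumed location: the repository cloned into the current directory.
    model_path = Path("Mana-Persian-Piper") / "fa_IR-mana-medium.onnx"
    if not model_path.exists():
        print(f"{model_path} not found; run the git clone step first")
    elif is_lfs_pointer(model_path):
        print("Pointer file only; run `git lfs install` and `git lfs pull` in the repo")
    else:
        print(f"Model file looks complete ({model_path.stat().st_size} bytes)")
```

If the script reports a pointer file, installing git-lfs and pulling again should replace it with the actual model.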

Run Inference (Python)

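import wave
from piper import PiperVoice

# Path is relative to the directory where the repository was cloned.
# The matching fa_IR-mana-medium.onnx.json config must sit next to the .onnx file.
voice = PiperVoice.load("Mana-Persian-Piper/fa_IR-mana-medium.onnx")

# synthesize_wav sets the WAV header and writes the audio frames.
# Older piper-tts releases may expose a different method name.
with wave.open("test.wav", "wb") as wav_file:
    voice.synthesize_wav("سلام به همگی!", wav_file)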

This writes a test.wav file containing the synthesized Persian speech to the current working directory.
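
To confirm that the output is a valid audio file, its header can be read back with the standard library. This is a small sketch using only the wave module. The helper name describe_wav is hypothetical, and no particular sample rate is assumed for this model.

```python
import os
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        frames = wav_file.getnframes()
    return rate, channels, frames / rate


if __name__ == "__main__":
    # test.wav is the file written by the inference snippet above.
    if os.path.exists("test.wav"):
        rate, channels, seconds = describe_wav("test.wav")
        print(f"{rate} Hz, {channels} channel(s), {seconds:.2f} s")
```

A duration of zero seconds would suggest that synthesis produced no audio.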

Model Files

fa_IR-mana-medium.onnx – Piper acoustic model
fa_IR-mana-medium.onnx.json – Model configuration and metadata
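
The configuration file is plain JSON and can be inspected before loading the voice. The sketch below assumes the key layout commonly seen in Piper voice configs (audio.sample_rate, espeak.voice, num_speakers, phoneme_id_map). These key names are an assumption about this particular file, so any that are missing are reported as None or 0. The helper name summarize_piper_config is hypothetical.

```python
import json
import os


def summarize_piper_config(config_path):
    """Collect a few fields that Piper voice configs commonly contain."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    return {
        # Each lookup falls back gracefully if the key is absent.
        "sample_rate": config.get("audio", {}).get("sample_rate"),
        "espeak_voice": config.get("espeak", {}).get("voice"),
        "num_speakers": config.get("num_speakers"),
        "num_phoneme_symbols": len(config.get("phoneme_id_map", {})),
    }


if __name__ == "__main__":
    # Assumed location: the repository cloned into the current directory.
    path = os.path.join("Mana-Persian-Piper", "fa_IR-mana-medium.onnx.json")
    if os.path.exists(path):
        print(summarize_piper_config(path))
```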

Recommended Usage

This model is best used in conjunction with context-aware phonemization, as proposed in the paper:

Beyond Unified Models: A Service-Oriented Approach to Low-Latency, Context-Aware Phonemization for Real-Time TTS

In particular, combining this Piper model with:

Lightweight G2P
Ezafe-aware context disambiguation

results in improved pronunciation accuracy while preserving real-time performance.
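
One common way to check that a pipeline still runs in real time is the real-time factor (RTF): the time spent synthesizing divided by the duration of the audio produced. The sketch below is a generic measurement, not the evaluation protocol from the paper. The function real_time_factor and the stand-in synthesizer fake_synthesize are hypothetical. In practice the callable would wrap the phonemization and Piper steps.

```python
import time


def real_time_factor(synthesize, text, sample_rate):
    """Return (rtf, audio_seconds); rtf below 1 means faster than real time.

    `synthesize` is any callable that takes text and returns a sequence of
    audio samples at `sample_rate` Hz.
    """
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    audio_seconds = len(samples) / sample_rate
    if audio_seconds == 0:
        raise ValueError("synthesizer returned no audio")
    return elapsed / audio_seconds, audio_seconds


def fake_synthesize(text):
    """Stand-in for a real TTS call: returns one second of silence at 22050 Hz."""
    return [0] * 22050


if __name__ == "__main__":
    rtf, seconds = real_time_factor(fake_synthesize, "سلام", 22050)
    print(f"RTF = {rtf:.4f} over {seconds:.2f} s of audio")
```

Measuring RTF with and without the phonemization stage gives a rough view of how much latency that stage adds.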

The full system implementation is available in the companion repository associated with the paper.

Citation

If you use this model in your research or applications, please cite the following paper:

@misc{fetrat2025servicetts,
      title={Beyond Unified Models: A Service-Oriented Approach to Low Latency, Context Aware Phonemization for Real Time TTS}, 
      author={Mahta Fetrat and Donya Navabi and Zahra Dehghanian and Morteza Abolghasemi and Hamid R. Rabiee},
      year={2025},
      eprint={2512.08006},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2512.08006}, 
}
