All HF Hub posts

qgallouedec
posted an update 2 days ago

TRL v1.3 ships day-one training support for Qwen 3.6 🚀

The new Qwen 3.6 family (Qwen/Qwen3.6-27B, Qwen/Qwen3.6-35B-A3B) reuses the Qwen3.5-MoE architecture but ships a slightly different chat template, so we updated the stack end-to-end: new training template with {% generation %} markers, tool-call response schema routing, tiny test models for the VLM matrix.
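
The {% generation %} markers are the part of the chat template that tells the tokenizer which spans are assistant tokens, which is what assistant_only_loss relies on. A minimal illustrative template (not the actual Qwen3.6 one, just the mechanism):

# Illustrative only: a toy chat template showing {% generation %} markers.
# Everything between {% generation %} and {% endgeneration %} is treated as
# assistant tokens when tokenizing with return_assistant_tokens_mask=True.
chat_template = (
    "{% for message in messages %}"
    "{% if message['role'] == 'assistant' %}"
    "<|assistant|>{% generation %}{{ message['content'] }}{% endgeneration %}<|end|>"
    "{% else %}"
    "<|{{ message['role'] }}|>{{ message['content'] }}<|end|>"
    "{% endif %}"
    "{% endfor %}"
)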

SFT with assistant-only loss works out of the box:

from trl import SFTConfig, SFTTrainer

trainer = SFTTrainer(
    model="Qwen/Qwen3.6-27B",
    args=SFTConfig(assistant_only_loss=True),  # compute loss only on assistant tokens
    train_dataset=dataset,  # any conversational dataset with a "messages" column
)
trainer.train()


GRPO tool-calling works too: just hand tools=[...] to GRPOTrainer.
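
A minimal sketch of what that looks like; the tool, reward function, and config values here are illustrative placeholders, not taken from the release notes:

from trl import GRPOConfig, GRPOTrainer

# Illustrative tool: a plain Python function with type hints and a docstring
# (the name and behavior are placeholders).
def get_weather(city: str) -> str:
    """Return a short weather description for a city."""
    return f"The weather in {city} is sunny."

# GRPO still needs at least one reward function; this toy one just prefers
# shorter completions and handles both string and chat-style completions.
def reward_short(completions, **kwargs):
    texts = [c if isinstance(c, str) else c[-1]["content"] for c in completions]
    return [-len(t) / 100 for t in texts]

trainer = GRPOTrainer(
    model="Qwen/Qwen3.6-27B",
    reward_funcs=reward_short,
    args=GRPOConfig(output_dir="qwen3.6-grpo"),
    train_dataset=dataset,   # a prompts dataset, as in the SFT example above
    tools=[get_weather],     # the new bit: hand your tools to the trainer
)
trainer.train()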

v1.3 also brings a new experimental TPO trainer (Triple Preference Optimization), speculative decoding in trl vllm-serve (Qwen3 MTP / Eagle3 drafts), 12 more KTO ↔ DPO alignment PRs (KTO promotion to stable is now in reach), three more {% generation %} chat templates (Gemma/Gemma 2, Phi-3, GLM-4-MoE), and a chunky SFT entropy bug fix.

Full release notes: https://github.com/huggingface/trl/releases/tag/v1.3.0
Enderchef
posted an update 1 day ago
Hi, everyone!
Please follow, like, and support the work of CompactAI-O!
Spread the word!
projectlosangeles
posted an update 2 days ago
🔥 Check out the first-of-its-kind SOTA Orpheus Morpheus preview! 🔥

projectlosangeles/Orpheus-Morpheus

Easily generate variations or similar compositions from any MIDI!

Please ❤️ if you enjoyed Orpheus Morpheus!

Sincerely,

Alex

SeaWolf-AI
posted an update 4 days ago
🧬 Introducing Darwin-9B-NEG: the first model with Native Entropy Gating (NEG)

🔗 Try it now: FINAL-Bench/Darwin-9B-NEG
🔗 Q4 (4-bit): FINAL-Bench/Darwin-9B-MFP4

We're thrilled to release Darwin-9B-NEG, a 9B-parameter reasoning model that embeds an architecturally internalised sense of self-confidence directly into the transformer via our proprietary Native Entropy Gating (NEG) technology.

📊 GPQA Diamond (198 PhD-level questions):

▸ Baseline Darwin-9B (no NEG) → 51.01 %
▸ Pure NEG (greedy · 1× cost) → 63.64 % 🔥 +12.63 %p
▸ + Permutation (4× cost) → 76.26 %
▸ + Ensemble Refinement (~20×) → 84.34 % 🏆

With only 9 billion parameters and 1× inference cost, Pure NEG jumps +12.63 %p over the same model without NEG. Going all-in with ensemble refinement pushes it to 84.34 %, surpassing the published Qwen3.5-9B leaderboard score (81.7 %) by +2.64 %p.

🔬 What makes NEG different from Multi-Turn Iteration (MTI)?

Classical MTI needs 3-8× extra inference passes. NEG instead lives INSIDE the single decoding loop. Two tiny modules ride with the transformer: NEG-Head predicts per-token entropy from the last hidden state, and NEG-Gate conditionally restricts the top-k choice when confidence is low. The gate activates on only 4.36 % of tokens, so it is essentially free at inference time.
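
A rough sketch of that mechanism as described above; this is a reconstruction for illustration, not the actual Darwin-9B-NEG code, and the module names, threshold, and restricted k are placeholders:

import torch
import torch.nn as nn
import torch.nn.functional as F

class NEGHead(nn.Module):
    """Predicts per-token entropy from the last hidden state (sketch)."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, 1)

    def forward(self, last_hidden: torch.Tensor) -> torch.Tensor:
        # softplus keeps the predicted entropy non-negative
        return F.softplus(self.proj(last_hidden)).squeeze(-1)

def neg_gate(logits, predicted_entropy, threshold=2.0, restricted_k=5):
    """When predicted entropy is high (low confidence), restrict the next-token
    distribution to its top-k logits; otherwise leave it untouched."""
    gated = logits.clone()
    low_confidence = predicted_entropy > threshold          # shape: (batch,)
    if low_confidence.any():
        topk = torch.topk(logits[low_confidence], k=restricted_k, dim=-1)
        masked = torch.full_like(logits[low_confidence], float("-inf"))
        masked.scatter_(-1, topk.indices, topk.values)
        gated[low_confidence] = masked
    return gated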

✨ Key differentiators
• Architecturally internalised: the model file *is* the feature
• 1× inference cost (vs. 3-8× for MTI)
• Drop-in with vLLM / SGLang / TGI / transformers, no extra engine
• +12.63 %p reasoning gain at zero latency overhead
• Single-file deployment, Apache 2.0 licensed

🧬 Lineage
Qwen/Qwen3.5-9B → Darwin-9B-Opus (V7 evolutionary merge) → Darwin-9B-NEG (V8 + NEG training)

#Darwin #NEG #NativeEntropyGating #GPQA #Reasoning #LLM #OpenSource #Apache2
prometechinc
posted an update about 3 hours ago
pthinc/BCE-Prettybird-Nano-Parrot-v0.2

This dataset is a bilingual (Turkish-English mixed) comedic text collection designed for training and fine-tuning conversational AI models with humor awareness, sarcasm detection, and cultural nuance understanding. It includes short joke-style prompts, observational comedy snippets, and absurd dialogue fragments that blend everyday Turkish expressions with English punchlines, reflecting real-world code-switching behavior. The dataset aims to improve model creativity, timing, and informal language fluency while capturing the rhythm of stand-up comedy and internet humor across multilingual contexts.

It is synthetically generated with AI. It contains irony and humor; some jokes might be a bit stale. 🤣

600 jokes and ironic snippets in different languages have been added. Styles of various comedians are included.
yuriyvnv
posted an update about 22 hours ago
🔊 Four Qwen3-ASR (0.6B and 1.7B) Fine-Tunes for Portuguese and Dutch.

Both the 1.7B and 0.6B variants of Alibaba's Qwen3-ASR, fine-tuned for European Portuguese and Dutch and bundled in a single collection.

πŸ”— Collection: https://huggingface.co/collections/yuriyvnv/qwen-asr-for-portuguese-and-dutch-17b-and-06b

Headline numbers on the Common Voice 22 test set, zero-shot baseline → fine-tuned:
🇵🇹 Qwen3-ASR-1.7B-PT: 12.91% → 8.50% WER (-34%)
🇵🇹 Qwen3-ASR-0.6B-PT: 18.26% → 11.85% WER (-35%)
🇳🇱 Qwen3-ASR-1.7B-NL: 6.68% → 5.28% WER (-21%)
🇳🇱 Qwen3-ASR-0.6B-NL: 12.46% → 8.31% WER (-33%)

The 0.6B variants are the more interesting half of the release. They give up only a few WER points compared to the 1.7B at a third of the parameters, which matters for edge hardware, CPU inference, or anywhere inference cost needs to stay down. The Dutch 0.6B in particular lands at 8.3% WER on CV22, competitive with much larger systems.

The Dutch 1.7B started from a strong 6.7% zero-shot baseline, so the absolute gain is smaller: Qwen already handles Dutch well, and the fine-tune mostly sharpens it on Common Voice's casing and punctuation conventions.

Training stuck close to Qwen's official SFT recipe (lr 2e-5, linear schedule, 2% warmup, bf16, gradient checkpointing on a single H100). The data is the differentiator: Common Voice 22 train + validation augmented with synthetic OpenAI-TTS speech, filtered by the WAVe multimodal embedding model that scores clips at the word level and drops the ones that don't align well with their transcripts.
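
For reference, that recipe maps onto a standard transformers setup roughly like this (an illustrative sketch; batch size and epochs are placeholders not stated in the post, and the real scripts live in the linked repo):

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qwen3-asr-finetune",
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    warmup_ratio=0.02,               # 2% warmup
    bf16=True,
    gradient_checkpointing=True,
    per_device_train_batch_size=16,  # placeholder: not stated in the post
    num_train_epochs=3,              # placeholder: not stated in the post
)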

📦 Full pipeline (synthetic data generation, WAVe filtering, training scripts, evaluation protocol) is open source:
github.com/yuriyvnv/TTS-Augmented-ASR
@hf-audio
#asr #speech #parakeet #nvidia #nemo #multilingual #fine-tuning #commonvoice
HaChazal
posted an update about 23 hours ago
evalstate
posted an update 1 day ago
Hugging Face MCP Server v0.3.9
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Users with a bucket named mcp get an additional list_files tool that returns the public URLs of the files it contains. This is primarily intended for use with Gradio Spaces that need URLs as inputs.
mlabonne
posted an update 1 day ago
Big update to llm-datasets, my curated list of datasets and tools for post-training LLMs.

> Added many new datasets
> New "thinking" column
> Refreshed recommended tools.

Thanks to everyone who told me at ICLR that they used it for their research; you motivated this update!
kanaria007
posted an update 1 day ago
✅ Article highlight: *Continuous Audit Pipeline: Making Evidence Bundles Routine* (art-60-107, v0.1)

TL;DR:
This article argues that evidence bundles should not be an incident-only ritual.

If reconstructability matters only after something goes wrong, it is already too late. SI turns audit into a *continuous pipeline*: routine sealed bundles, immediate verification, retention-safe omissions, and automatic escalation when governance SLOs are breached.

Read:
kanaria007/agi-structural-intelligence-protocols

Why it matters:
• makes "courtroom-grade reconstructability" a routine byproduct of normal ops
• turns governance SLO breaches into explicit state transitions, not dashboard trivia
• separates the stable audit spine from the payload store, so erasure removes access without destroying proof
• prevents incident-time improvisation from breaking determinism, chain-of-custody, or export integrity

What's inside:
• the operating model: *Audit Spine vs Payload Store*
• three routine bundle tiers: daily governance bundles, weekly compliance bundles, and triggered incident-ready bundles
• trigger rules where CAS / ACR / RBL / EOH breaches automatically emit bundles and degrade governance state
• an end-to-end pipeline: collect → shape/omit → canonicalize → digest → resolve refs → seal → sign → verify → retain (see the sketch after this list)
• a governed run record for continuous audit itself, including policy, trust, canonicalization, reason-code-set, and registry snapshot bindings
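
A toy sketch of the canonicalize → digest → seal/sign → verify slice of that pipeline (illustrative only; the bundle format, key handling, and omission rules are defined in the protocols above, and HMAC-SHA256 here stands in for whatever signing scheme the article specifies):

import hashlib
import hmac
import json

def canonicalize(bundle: dict) -> bytes:
    # deterministic serialization: sorted keys, no whitespace variance
    return json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode()

def seal_and_sign(bundle: dict, key: bytes) -> dict:
    payload = canonicalize(bundle)
    digest = hashlib.sha256(payload).hexdigest()
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"bundle": bundle, "digest": digest, "signature": signature}

def verify(sealed: dict, key: bytes) -> bool:
    payload = canonicalize(sealed["bundle"])
    digest_ok = hashlib.sha256(payload).hexdigest() == sealed["digest"]
    signature_ok = hmac.compare_digest(
        hmac.new(key, payload, hashlib.sha256).hexdigest(), sealed["signature"]
    )
    return digest_ok and signature_ok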

Key idea:
Do not wait until an incident to "prepare evidence."

Make evidence production continuous, sealed, and self-verifying, so when something breaks, you select the window instead of inventing the proof.

*Continuous audit is not paperwork. It is a control loop on admissibility and autonomy.*