---
language:
- tr
- en
license: apache-2.0
tags:
- turkish
- language-model
- custom-architecture
- memory-network
- causal-lm
pipeline_tag: text-generation
---

# SykoLLM-CMN V5.2 Beta

SykoLLM-CMN is a language model designed from scratch for Turkish, built on a custom **Contextual Memory Network (CMN)** architecture. Local and global memory gates placed alongside standard Transformer blocks let it process long contexts in chunks and carry information across sessions.

---

## Model Details

| Property | Value |
|---|---|
| Architecture | Transformer + CMN (Contextual Memory Network) |
| Parameters | ~100-300M |
| Languages | Turkish, English |
| Context length | 1024 tokens |
| Chunk size | 128 tokens |
| Layers | 24 |
| Model dimension (d_model) | 768 |
| Attention heads | 6 |
| Local memory tokens | 16 |
| Global memory tokens | 32 |
| Vocabulary size | 32,000 |
| License | Apache 2.0 |
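
A few derived quantities follow directly from the table. The sketch below computes them. The class name `CMNConfigSketch` and the field names `n_heads`, `n_layers`, `context_length`, `chunk_size`, and `vocab_size` are hypothetical and are not taken from the released code; only `d_model`, `num_memory_tokens`, and `num_global_memory_tokens` appear in the usage example of this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CMNConfigSketch:
    # Values copied from the table above. Most field names are assumptions.
    d_model: int = 768
    n_heads: int = 6
    n_layers: int = 24
    context_length: int = 1024
    chunk_size: int = 128
    num_memory_tokens: int = 16          # local memory slots
    num_global_memory_tokens: int = 32   # global memory slots
    vocab_size: int = 32_000

    @property
    def head_dim(self) -> int:
        # Per-head width; d_model must split evenly across heads.
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model must be divisible by n_heads")
        return self.d_model // self.n_heads

    @property
    def chunks_per_context(self) -> int:
        # Ceiling division: number of chunks needed to cover a full context.
        return -(-self.context_length // self.chunk_size)


cfg = CMNConfigSketch()
print(cfg.head_dim, cfg.chunks_per_context)
```

With the listed values, each attention head is 128 dimensions wide and a full 1024-token context spans 8 chunks.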

---

## What Is the CMN Architecture?

Unlike standard decoder-only models, SykoLLM-CMN includes a two-level memory system:

- **Local Memory (SykoMemoryGate):** Carries information from the previous chunk into the chunk currently being processed. It works much like the gating in a classic GRU/LSTM: the gate learns what to forget and what to take in.

- **Global Memory (SykoSmartMemoryGate):** Holds summary information about the whole sequence. It uses cross-attention to ask "what matters to me in this session?" and is updated accordingly.

With this structure, the model can process the context chunk by chunk without losing long-range context.
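
The forget/update idea behind a GRU-style gate can be illustrated with a toy, dependency-free sketch. This is not the `SykoMemoryGate` implementation: the function names are hypothetical, the gate values are passed in by hand, and in a trained model they would typically come from learned layers applied to the previous memory and the current chunk.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_memory_update(prev_memory, candidate, gate_logits):
    """Blend old memory with a candidate update, element by element.

    keep = sigmoid(gate_logit). A keep value near 1 retains the old entry,
    a keep value near 0 overwrites it with the candidate.
    """
    if not (len(prev_memory) == len(candidate) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    updated = []
    for old, new, logit in zip(prev_memory, candidate, gate_logits):
        keep = sigmoid(logit)
        updated.append(keep * old + (1.0 - keep) * new)
    return updated


# First slot: gate logit 0 gives an even blend of old and new.
# Second slot: a large gate logit keeps the old value almost unchanged.
memory = gated_memory_update([1.0, 1.0], [0.0, 0.0], [0.0, 20.0])
print(memory)
```

The same interpolation applies per dimension of every memory token when the memory is a tensor of shape `(batch, num_memory_tokens, d_model)`.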

```
Input → [Global Mem | Local Mem | Chunk Tokens] → Transformer → Output
                ↑                    ↑
         SmartMemoryGate        MemoryGate
         (cross-attention)      (forget/update)
```
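
Read literally, the diagram places the memory slots in front of the chunk tokens. The sketch below works out the index ranges for that ordering, using the sizes from the model details table. Whether the model really concatenates the memories as extra sequence positions is an assumption based on the diagram, and `chunk_layout` is a hypothetical helper.

```python
def chunk_layout(num_global=32, num_local=16, chunk_len=128):
    """Return the index range of each segment and the total sequence length."""
    global_end = num_global
    local_end = global_end + num_local
    total = local_end + chunk_len
    layout = {
        "global_memory": range(0, global_end),
        "local_memory": range(global_end, local_end),
        "chunk_tokens": range(local_end, total),
    }
    return layout, total


layout, total_len = chunk_layout()
for name, positions in layout.items():
    print(f"{name}: positions {positions.start}..{positions.stop - 1}")
print("total positions per forward pass:", total_len)
```

Under this reading, each forward pass sees 176 positions, of which 128 are actual text tokens.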

---

## Usage

```python
import torch
import torch.nn.functional as F
import re

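def sample_next_token(logits, temperature=0.1, top_p=0.9, repetition_penalty=1.2, input_ids=None):
    # Work on a copy so the caller's logits tensor is not modified in place
    logits = logits.clone()

    # Repetition penalty: lower the score of tokens that were already generated.
    # Positive logits are divided and negative logits are multiplied, so the
    # penalty always reduces the token's probability.
    if input_ids is not None:
        seen = torch.tensor(sorted(set(input_ids[0].tolist())), device=logits.device)
        seen_logits = logits[seen]
        logits[seen] = torch.where(
            seen_logits > 0, seen_logits / repetition_penalty, seen_logits * repetition_penalty
        )

    logits = logits / temperature
    probs  = F.softmax(logits, dim=-1)

    # Nucleus (top-p) filtering: keep the smallest set of tokens whose
    # cumulative probability reaches top_p, then renormalize
    sorted_probs, sorted_indices = torch.sort(probs, descending=True)
    cumulative_probs = torch.cumsum(sorted_probs, dim=-1)
    sorted_indices_to_remove = cumulative_probs - sorted_probs > top_p
    sorted_probs[sorted_indices_to_remove] = 0
    sorted_probs /= sorted_probs.sum()

    # Sample in sorted order, then map back to the original vocabulary index
    next_token = torch.multinomial(sorted_probs, num_samples=1)
    return sorted_indices[next_token]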

def fix_spacing(text):
    # Remove whitespace before punctuation
    text = re.sub(r'\s+([.,!?;:)])', r'\1', text)
    # Collapse runs of whitespace into a single space
    text = re.sub(r'\s+', ' ', text)
    return text.strip()

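# Assumes `model`, `tokenizer`, `config`, and `device` are already defined;
# the loading code is not part of this snippet.
# Both memories start from zeros at the beginning of a session.
prev_memory   = torch.zeros(1, config.num_memory_tokens,        config.d_model, device=device)
global_memory = torch.zeros(1, config.num_global_memory_tokens, config.d_model, device=device)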

text      = "Kullanıcı: Merhaba, nasılsın?\nAsistan:"
inputs    = tokenizer(text, return_tensors="pt").to(device)
generated = inputs["input_ids"].clone()

with torch.no_grad():
    for _ in range(100):
        input_chunk = generated[:, -128:]

        logits, prev_memory, global_memory = model(
            input_ids=input_chunk,
            prev_memory=prev_memory,
            global_memory=global_memory
        )

        next_token = sample_next_token(
            logits[0, -1, :],
            temperature=0.1,
            top_p=0.9,
            repetition_penalty=1.3,
            input_ids=generated
        ).unsqueeze(0)

        generated = torch.cat([generated, next_token], dim=1)

        if next_token.item() == tokenizer.eos_token_id:
            break

output = tokenizer.decode(generated[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
print(fix_spacing(output))
```
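
The loop above only ever feeds the last 128 tokens to the model. For a prompt longer than one chunk, one possible approach is to run the prompt through the model chunk by chunk first, so that both memories are primed before generation starts. The sketch below shows only the chunking and memory hand-off, with no PyTorch dependency. `split_into_chunks`, `prime_memories`, and `step_fn` are hypothetical names; with the real model, `step_fn` would wrap a call such as `model(input_ids=..., prev_memory=..., global_memory=...)` and discard the logits, assuming the forward signature shown above.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_chunks(token_ids: Sequence[int], chunk_size: int = 128) -> List[List[int]]:
    """Split a token sequence into consecutive chunks; the last may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(token_ids[i:i + chunk_size]) for i in range(0, len(token_ids), chunk_size)]


def prime_memories(
    token_ids: Sequence[int],
    step_fn: Callable[[List[int], object, object], Tuple[object, object]],
    prev_memory,
    global_memory,
    chunk_size: int = 128,
):
    """Feed chunks in order, passing the updated memories to the next call."""
    for chunk in split_into_chunks(token_ids, chunk_size):
        prev_memory, global_memory = step_fn(chunk, prev_memory, global_memory)
    return prev_memory, global_memory


# Stand-in for the model: the local memory remembers the last chunk length,
# the global memory counts every token seen so far.
def fake_step(chunk, prev_memory, global_memory):
    return len(chunk), global_memory + len(chunk)


local_mem, global_mem = prime_memories(list(range(300)), fake_step, 0, 0)
print(local_mem, global_mem)
```

The stand-in makes the data flow visible: the local memory reflects only the most recent chunk, while the global memory accumulates over the whole prompt.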

---

## Training

- **Dataset:** [SykoSLM/DeepReason-Mix](https://huggingface.co/datasets/SykoSLM/DeepReason-Mix)
- **Training examples:** 300,000
- **Epochs:** 2
- **Batch size:** 32
- **Gradient accumulation:** 4
- **Optimizer:** AdamW
- **LR scheduler:** Cosine annealing with linear warmup
- **Initial LR:** 3e-4, minimum LR: 1e-6
- **Mixed precision:** FP16
- **Hardware:** 2× NVIDIA T4 (Kaggle)
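
To make the schedule concrete, the sketch below estimates the number of optimizer steps and evaluates a cosine schedule with linear warmup at a few points. Several things here are assumptions rather than details from the training run: the batch size of 32 is treated as the total batch across both GPUs, the warmup length of 200 steps is a placeholder, and `optimizer_steps` and `lr_at` are hypothetical helpers, not the training code.

```python
import math


def optimizer_steps(num_examples=300_000, batch_size=32, grad_accum=4, epochs=2):
    """Number of optimizer updates, assuming batch_size is the total batch."""
    effective_batch = batch_size * grad_accum
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * epochs


def lr_at(step, total_steps, warmup_steps, peak_lr=3e-4, min_lr=1e-6):
    """Linear warmup from 0 to peak_lr, then cosine decay down to min_lr."""
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * step / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


total = optimizer_steps()
warmup = 200  # hypothetical value; the warmup length is not stated above
for step in (0, warmup, total // 2, total):
    print(step, f"{lr_at(step, total, warmup):.2e}")
```

Under these assumptions the effective batch is 128 sequences, which gives 2,344 optimizer steps per epoch.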

---

## Limitations

- The model is still in beta; test it before using it in production.
- Long responses may contain repetition and inconsistencies.
- No harmful-content filtering has been applied; add your own filter if you need one.

---

## License

Apache 2.0. Commercial use is permitted. See the [LICENSE](LICENSE) file for details.

---

## Author

**SykoSLM**: original architecture research for Turkish natural language processing.