| | ---
|
| | license: apache-2.0
|
| | license_name: apache-2.0
|
| | tags:
|
| | - lora
|
| | - manga
|
| | - coloring
|
| | - anime
|
| | - qwen
|
| | - dataset
|
| | - diffusers
|
| | - image-to-image
|
| | viewer: false
|
| | ---
|
| |
|
| | <p align="center">
|
| | <img src="images/logo.png" alt="PanelPainter Logo" width="400">
|
| | </p>
|
| |
|
| | # PanelPainter-Project
|
| |
|
| | **PanelPainter-Project** is an open-source initiative to automate black-and-white manga coloring using fine-tuned LoRAs.
|
| |
|
| | This project is dedicated to training LoRAs to automate the coloring of black-and-white manga panels. I am releasing all the files here, including datasets, logs, and experimental versions, so others can see exactly how it was trained.
|
| |
|
| | ## Showcase
|
| |
|
| | Here are some examples comparing the original panel, the base Qwen Image Edit model, and the result with the PanelPainter V3 LoRA.
|
| |
|
| | > [!NOTE]
|
| | > **Showcase Generation Settings:**
|
| | > * **LoRAs:** PanelPainter V3 (Weight: 1.0) + 4-Step Lighting (Weight: 1.0)
|
| | > * **Steps:** 4
|
| | > * **Sampler:** Euler
|
| | > * **Scheduler:** Simple
|
| | > * **Seed:** 1000
|
| | > * **CFG:** 1.0
|
| |
|
| | <p align="center">
|
| | <img src="images/Sample_Image_1.png" alt="Chainsaw Man Showcase">
|
| | <br>
|
| | <em>Chainsaw Man</em>
|
| | </p>
|
| |
|
| | <p align="center">
|
| | <img src="images/Sample_Image_2.png" alt="Frieren Showcase">
|
| | <br>
|
| | <em>Frieren</em>
|
| | </p>
|
| |
|
| | <p align="center">
|
| | <img src="images/Sample_Image_3.png" alt="Komi Showcase">
|
| | <br>
|
| | <em>Komi Can't Communicate</em>
|
| | </p>
|
| |
|
| | <p align="center">
|
| | <img src="images/Sample_Image_4.png" alt="Oshi no Ko Showcase">
|
| | <br>
|
| | <em>Oshi no Ko</em>
|
| | </p>
|
| |
|
| | ## Project Structure
|
| |
|
| | This repository contains everything used to create the models:
|
| |
|
| | ### 1. LoRA Models (`/loras`)
|
| | This directory contains the model weights for all iterations of the project:
|
| |
|
| | > [!TIP]
|
| | > **Trigger Word:** `Color this panelpainter` (Applicable for both V2 and V3)
|
| |
|
| | * **V3 (Latest Release):** `PanelPainter_v3_Qwen2511.safetensors`
|
| | * **Base:** Qwen Image Edit 2511
|
| | * **Note:** The latest model trained on the expanded 903-image dataset.
|
| | * **V2 (Stable):** `PanelPainter_v2_Qwen2509.safetensors`
|
| | * **Base:** Qwen Image Edit 2509 (Compatible with 2511).
|
| | * **Note:** Standard release (High quality, low variety).
|
| | * **V1 (Legacy):** `PanelPainter_v1_Legacy.safetensors`
|
| | * **Base:** Qwen Image Edit 2509
|
| | * **Note:** Archived experimental version (synthetic data).
|
| |
|
| | ### 2. Training Logs (`/logs`)
|
| | **Content:** Tensorboard logs and charts from my training runs. You can check these to see how the loss converged and how the model learned over time for each version.
|
| |
|
| | ### 3. Workflows (`/workflows`)
|
| | **Content:** ComfyUI workflow JSON files to help you get started with PanelPainter.
|
| |
|
| | ### 4. Training Dataset
|
| | The datasets used for this project are hosted separately:
|
| |
|
| | * **PanelPainter-Dataset**
|
| |
|
| | > [!NOTE]
|
| | > **Coming Soon:** The V3 dataset was a good learning step for captioning, but it was randomly picked without any streamlined curation roughly 50% doujin and 50% mainstream colored manga. We're refining it further. Expect handpicked panels, better captions, and reduced doujin content. Release coming once quality standards are met.
|
| |
|
| | ---
|
| |
|
| | ## Version History & Development Log
|
| |
|
| | ### Version 3.0 (Current Release)
|
| | * **Status:** Released.
|
| | * **Base Architecture:** Qwen 2511.
|
| | * **Strategy:** Scaling Up High-Quality Data.
|
| | * **Dataset:** Expanded to 903 images. Recreated from scratch, comprising 50% doujin and 50% SFW panels.
|
| | * **Summary:** This version combines the correct "real line art" training method discovered in V2 with a significantly larger dataset. This improves the model's ability to generalize across different manga styles while maintaining the color quality of V2.
|
| |
|
| | ### Version 2.0
|
| | * **Status:** Released / Stable.
|
| | * **Base Model:** Trained on Qwen Image Edit 2509, also it works on Qwen 2511 as well.
|
| | * **The Breakthrough:** After V1 failed, this version switched to training on real line art instead of synthetic grayscale.
|
| | * **Dataset:** A tiny, hyper-curated set of 150 images (70% Doujin / 30% SFW).
|
| | * **Outcome:** Despite the small size, it proved that high-quality real line art outperforms massive synthetic datasets. It produces good colors but lacks variety due to the small sample size.
|
| |
|
| | ### Version 1.0
|
| | * **Status:** Archived / Deprecated.
|
| | * **Base Model:** Qwen Image Edit 2509.
|
| | * **The Mistake:** Trained on 7,000 images generated by simply desaturating colored pages (synthetic grayscale).
|
| | * **Outcome:** The model learned to color "perfect gray" inputs but failed on real, imperfect ink lines.
|
| | * **Lesson:** Quantity does not matter if the data distribution doesn't match real usage.
|
| |
|
| | ---
|
| |
|
| | ## Training Configuration (V3)
|
| |
|
| | **Hardware:** Trained on an A40 GPU on Runpod.
|
| |
|
| | Below is the exact accelerate command used to train the V3 model on Musubi Tuner:
|
| |
|
| | ```bash
|
| | accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 \
|
| | /workspace/musubi-tuner/src/musubi_tuner/qwen_image_train_network.py \
|
| | --dataset_config dataset_edit.toml \
|
| | --dit /workspace/Training_Models_Qwen/Qwen_Image_Edit_2511_BF16.safetensors \
|
| | --vae /workspace/Training_Models_Qwen/qwen_train_vae.safetensors \
|
| | --text_encoder /workspace/Training_Models_Qwen/qwen_2.5_vl_7b_bf16.safetensors \
|
| | --model_version edit-2511 \
|
| | --network_module networks.lora_qwen_image \
|
| | --output_dir /workspace/output_panelpainter \
|
| | --output_name panelpainter_v3_part1 \
|
| | --mixed_precision bf16 \
|
| | --max_data_loader_n_workers 0 \
|
| | --learning_rate 3e-4 \
|
| | --network_dim 128 \
|
| | --network_alpha 128 \
|
| | --optimizer_type adafactor \
|
| | --optimizer_args "scale_parameter=False" "relative_step=False" "warmup_init=False" "weight_decay=0.01" \
|
| | --lr_scheduler cosine \
|
| | --lr_warmup_steps 150 \
|
| | --timestep_sampling qinglong_qwen \
|
| | --discrete_flow_shift 2.2 \
|
| | --max_train_epochs 8 \
|
| | --save_every_n_epochs 1 \
|
| | --save_state \
|
| | --gradient_checkpointing \
|
| | --gradient_checkpointing_cpu_offload \
|
| | --gradient_accumulation_steps 4 \
|
| | --blocks_to_swap 20 \
|
| | --sdpa
|
| | ```
|
| |
|
| | **Dataset Settings:** Use a resolution of **1328x1328** with bucketing enabled to handle varying aspect ratios (no upscaling). The training ran with a batch size of 1 and enabled `qwen_image_edit_no_resize_control` to preserve the original dimensions of the control images during processing.
|
| |
|
| | ## License
|
| |
|
| | * **Project:** Apache 2.0
|
| | * **Dataset:** Hosted separately, contains copyrighted manga panels.
|
| | * **Copyright:** Original art belongs to the respective creators and publishers.
|
| |
|
| | ## Acknowledgements
|
| |
|
| | Trained on Musubi Tuner. Thanks to kohya-ss.
|
| |
|
| | **Dataset Contributors:** Thanks to @Rox_Jr & @lucifer_brine04 for their help with the dataset.
|
| |
|
| | ## External Links
|
| | * **Public Model Page:** [Civitai: PanelPainter](https://civitai.com/models/2103847/panelpainter-manga-coloring) |