arxiv:2605.30611

Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

Published on May 28

Submitted by ssz on Jun 2

#1 Paper of the day

Authors:

Shuzheng Si ,

Abstract

Automated systems for generating scientific figures face limitations in handling diverse figure types and conditions, prompting the development of multi-agent frameworks that generalize across different input scenarios and produce editable output formats.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Scientific figures are among the most effective means of communicating complex research ideas, yet producing publication-quality illustrations remains one of the most labor-intensive parts of paper preparation. Existing automated systems each target a single figure type under text-only input, leaving the diversity of types and conditions researchers actually use unaddressed; their raster outputs further cannot be locally revised. Because scientific figures are structured compositions of discrete semantic components, the localized errors generators produce on such layouts demand not a stronger backbone but a harness. We instantiate this harness in two complementary systems: Crafter, a multi-agent harness for figure generation that generalizes across figure types and input conditions without architectural changes, and CraftEditor, which applies the same pattern to convert raster outputs into editable SVGs. Moreover, we introduce CraftBench, a benchmark spanning three figure types and four input conditions with human quality annotation. Experiments show that Crafter substantially outperforms both standalone generators and the agentic baseline on PaperBanana-Bench and CraftBench, with ablations confirming each component's independent contribution; CraftEditor faithfully converts outputs into editable SVGs that surpass all baselines. Our code and benchmark are available at https://github.com/HaozheZhao/Crafter.

Community

ssz1111

Paper author Paper submitter 2 days ago

Crafter is a multi-agent system for generating publication-quality scientific figures across diverse types and conditions, with CraftEditor turning raster outputs into editable SVGs and CraftBench for evaluation.

avahal

7 minutes ago

the most interesting bit for me is the shared memory s and the four-role loop (d, e, v, r) that keeps edits structured instead of chasing endless prompts. encoding typed edits into a structured memory and letting a critic gate candidate plans lets the system preserve consistency across rounds while still exploring diverse candidates, and being able to swap in stronger backends without architectural changes feels very practical. i am curious how you handle conflicting constraints when different plans propose different edits to the same element: does the critic resolve that by edit distance, or is there a higher-priority rule? CraftEditor's extraction-processing-composition pipeline for producing editable SVG from raster is clever, but i worry about error accumulation in the vector layer if the raster has occlusions or ambiguous labels. btw arxivlens had a solid walkthrough that covers this pattern well: https://arxivlens.com/PaperView/Details/crafter-a-multi-agent-harness-for-editable-scientific-figure-generation-from-diverse-inputs-2710-cc1998e6
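To make the question about conflicting edits concrete, here is a minimal Python sketch of one round of a critic-gated loop over a shared memory of typed edits. It is not Crafter's implementation. Every name in it (`Edit`, `Memory`, `resolve_conflicts`, `critic_accepts`, `run_round`) and the priority-based tie-break are assumptions made for illustration; the paper may resolve conflicts quite differently.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edit:
    """A typed edit to one figure element (hypothetical schema)."""
    target: str        # id of the element the edit applies to
    op: str            # kind of edit, e.g. "relabel" or "move"
    value: str         # new value for that property
    priority: int = 0  # assumed tie-breaker; not taken from the paper


@dataclass
class Memory:
    """Shared state that every role reads and writes (hypothetical)."""
    figure: dict = field(default_factory=dict)   # element id -> properties
    history: list = field(default_factory=list)  # accepted edits, in order


def resolve_conflicts(plans):
    """Keep one edit per (target, op); higher priority wins, earlier plan wins ties."""
    chosen = {}
    for plan in plans:
        for edit in plan:
            key = (edit.target, edit.op)
            if key not in chosen or edit.priority > chosen[key].priority:
                chosen[key] = edit
    return list(chosen.values())


def critic_accepts(edit, memory):
    """Toy gate: reject edits to unknown elements and edits that change nothing."""
    if edit.target not in memory.figure:
        return False
    return memory.figure[edit.target].get(edit.op) != edit.value


def run_round(memory, plans):
    """Apply the surviving, critic-approved edits and record them in memory."""
    for edit in resolve_conflicts(plans):
        if critic_accepts(edit, memory):
            memory.figure[edit.target][edit.op] = edit.value
            memory.history.append(edit)
    return memory


memory = Memory(figure={"title": {"relabel": "Draft"}})
plans = [
    [Edit("title", "relabel", "Results", priority=1)],
    [Edit("title", "relabel", "Overview", priority=2),
     Edit("legend", "move", "top")],  # "legend" is not in the figure
]
run_round(memory, plans)
print(memory.figure["title"]["relabel"])  # Overview
print(len(memory.history))                # 1
```

In this toy version the two plans disagree about the title, the higher-priority edit wins, and the edit to a nonexistent `legend` element is rejected by the gate. An edit-distance rule, as the comment suggests, would replace the comparison inside `resolve_conflicts`.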