🛠️ The Programming Framework

A Universal Method for Process Analysis

Combining Large Language Models with Mermaid visualization to dissect and understand complex processes across any discipline, from biology to business, physics to psychology.

📋 Summary

The Programming Framework is a universal meta-tool for analyzing complex processes across any discipline by combining Large Language Models (LLMs) with visual flowchart representation. The Framework transforms textual process descriptions into structured, interactive Mermaid flowcharts stored as JSON, enabling systematic analysis, visualization, and integration with knowledge systems.

The Framework has been demonstrated through GLMP (Genome Logic Modeling Project) with 50+ biological processes, and applied across Chemistry, Mathematics, Physics, and Computer Science. It serves as the foundational methodology for the CopernicusAI Knowledge Engine, enabling domain-specific process visualization and analysis.

📚 Prior Work & Research Contributions

Overview

The Programming Framework represents prior work that demonstrates a novel methodology for analyzing complex processes by combining Large Language Models (LLMs) with visual flowchart representation. This research establishes a universal, domain-agnostic approach to process analysis that transforms textual descriptions into structured, interactive visualizations.

🔬 Research Contributions

  • Universal Process Analysis: Domain-agnostic methodology applicable across multiple fields
  • LLM-Powered Extraction: Automated extraction using Google Gemini 2.0 Flash
  • Structured Visualization: Mermaid.js-based flowchart generation encoded as JSON
  • Iterative Refinement: Systematic approach enabling continuous improvement
  • Scale Demonstration: Applied to 314 processes across 6 discipline databases (Biology: 52, Chemistry: 91, Physics: 21, Computer Science: 21, Mathematics: 20, GLMP: 109)
  • Validation: Successfully processes complex biological, chemical, and computational workflows with high accuracy

⚙️ Technical Achievements

  • Meta-Tool Architecture: Framework for creating specialized analysis tools
  • JSON-Based Storage: Structured format enabling version control and API integration
  • Multi-Domain Application: Applied to biological processes (GLMP) and to Chemistry, Mathematics, Physics, and Computer Science collections
  • Integration Framework: Designed for knowledge engines and collaborative platforms

🎯 Position Within CopernicusAI Knowledge Engine

The Programming Framework serves as the foundational meta-tool of the CopernicusAI Knowledge Engine, providing the underlying methodology that enables specialized applications:

  • GLMP (Genome Logic Modeling Project)
  • CopernicusAI (main knowledge engine)
  • Research Papers Metadata Database
  • Science Video Database
  • Multi-domain process analysis

This work establishes a proof-of-concept for AI-assisted process analysis, demonstrating how LLMs can systematically extract and visualize complex logic from textual sources across diverse domains.

Any Discipline · LLM Powered · Visual Flowcharts · JSON Structured Data

🎯 What is the Programming Framework?

The Programming Framework is a meta-tool: a tool for creating tools. It provides a systematic method for analyzing any complex process by combining the analytical power of Large Language Models with the clarity of visual flowcharts.

🔍 The Problem

Complex processes, whether biological, computational, or organizational, are difficult to understand because they involve many steps, decision points, and interactions. Traditional descriptions in text are hard to follow.

✨ The Solution

Use LLMs to extract process logic from literature, then encode it as Mermaid flowcharts stored in JSON. Result: Clear, interactive visualizations that reveal hidden patterns and enable systematic analysis.

⚙️ How It Works

1️⃣ Input Process: Provide scientific papers, documentation, or process descriptions.

2️⃣ LLM Analysis: The LLM extracts steps, decisions, branches, and logic flow.

3️⃣ Generate Flowchart: Create a Mermaid diagram encoded as a JSON structure.

4️⃣ Visualize & Iterate: The interactive flowchart reveals insights and enables refinement.
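The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the Framework's actual implementation: the `to_mermaid` helper, the node and edge layout, and the record keys (`title`, `mermaid`) are all assumptions made for the example, and the LLM analysis step is replaced by hand-written data.

```python
import json

def to_mermaid(nodes, edges):
    """Render nodes and edges as Mermaid 'graph TD' source text."""
    lines = ["graph TD"]
    for node_id, (label, is_decision) in nodes.items():
        # Decisions use {braces}; ordinary steps use [brackets].
        shape = f"{{{label}}}" if is_decision else f"[{label}]"
        lines.append(f"    {node_id}{shape}")
    for src, dst, label in edges:
        arrow = f"-->|{label}|" if label else "-->"
        lines.append(f"    {src} {arrow} {dst}")
    return "\n".join(lines)

# Hypothetical result of the LLM analysis step (hand-written here).
nodes = {
    "A": ("ORC binds origin", False),
    "B": ("Helicase loaded?", True),
    "C": ("Unwind DNA", False),
}
edges = [("A", "B", ""), ("B", "C", "Yes")]

# Assumed record layout: the Mermaid source is stored as a string inside JSON.
record = {
    "title": "DNA replication (excerpt)",
    "mermaid": to_mermaid(nodes, edges),
}
record_json = json.dumps(record, indent=2)
print(record_json)
```

Keeping the Mermaid source as a plain string inside the JSON record means the same file can feed a renderer and be read by other programs.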

πŸ“ Concrete Example:

Input:

"DNA replication begins when the origin recognition complex (ORC) binds to DNA replication origins. This triggers the loading of the MCM2-7 helicase complex, which unwinds the DNA double helix. DNA polymerases then synthesize new strands using the unwound strands as templates..."

LLM Analysis:

Extracts 15 steps, identifies 3 decision points (origin recognition, helicase loading, polymerase binding), recognizes 4 key enzymes (ORC, MCM2-7, DNA polymerase, ligase), and maps regulatory checkpoints.

Output:

Mermaid flowchart with 25 nodes, 28 edges, 3 decision gates, properly colored using the 5-color scheme (red for inputs, yellow for structures, green for operations, blue for intermediates, violet for products), stored as structured JSON enabling interactive visualization and programmatic access.
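Counts like the node, edge, and decision totals quoted above can be recomputed from the Mermaid text itself. The sketch below is an assumption about how such a check might look, not the project's tooling. It only handles simple `A[Label]` and `B{Label}` declarations and `-->` arrows, and it runs on a short hypothetical excerpt, not the full 25-node diagram.

```python
import re

def summarize(mermaid_src):
    """Count distinct nodes, edges, and decision nodes in simple Mermaid text."""
    # Node declarations look like A[Label] (step) or B{Label} (decision).
    # Nodes that are only referenced, never declared with a label, are not counted.
    steps = set(re.findall(r"\b(\w+)\[", mermaid_src))
    decisions = set(re.findall(r"\b(\w+)\{", mermaid_src))
    edges = re.findall(r"-->", mermaid_src)
    return {
        "nodes": len(steps | decisions),
        "edges": len(edges),
        "decisions": len(decisions),
    }

# Hypothetical excerpt used only to exercise the function.
sample = """graph TD
    A[ORC binds origin] --> B{Helicase loaded?}
    B -->|Yes| C[Unwind DNA]
    B -->|No| A
"""
summary = summarize(sample)
print(summary)  # {'nodes': 3, 'edges': 3, 'decisions': 1}
```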

📊 Live Interactive Example:

graph TD
    A[Complex Process Input] --> B{LLM Analysis}
    B -->|Extract Logic| C[Identify Steps]
    B -->|Extract Decisions| D[Identify Branches]
    C --> E[Create Flowchart Nodes]
    D --> F[Create Decision Points]
    E --> G[Generate Mermaid Syntax]
    F --> G
    G --> H[Store as JSON]
    H --> I[Interactive Visualization]
    I --> J{Insights Gained?}
    J -->|No| K[Refine Analysis]
    J -->|Yes| L[Apply Knowledge]
    K --> B
    style A fill:#ff6b6b,color:#fff
    style B fill:#74c0fc,color:#fff
    style C fill:#51cf66,color:#fff
    style D fill:#51cf66,color:#fff
    style E fill:#ffd43b,color:#000
    style F fill:#ffd43b,color:#000
    style G fill:#51cf66,color:#fff
    style H fill:#74c0fc,color:#fff
    style I fill:#74c0fc,color:#fff
    style J fill:#74c0fc,color:#fff
    style K fill:#51cf66,color:#fff
    style L fill:#b197fc,color:#fff

Color Legend:

  • Red - Triggers & Inputs
  • Yellow - Structures & Objects
  • Green - Processing & Operations
  • Blue - Intermediates & States
  • Violet - Products & Outputs
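One way to keep the legend consistent is to generate the Mermaid `style` lines from a single lookup table. In the sketch below the hex values are copied from the example diagram above, while the category names (`input`, `structure`, and so on) and the `style_lines` helper are assumed names introduced for illustration.

```python
# Fill and text colors as used in the example diagram; category keys are assumed names.
COLOR_SCHEME = {
    "input": ("#ff6b6b", "#fff"),         # red: triggers and inputs
    "structure": ("#ffd43b", "#000"),     # yellow: structures and objects
    "operation": ("#51cf66", "#fff"),     # green: processing and operations
    "intermediate": ("#74c0fc", "#fff"),  # blue: intermediates and states
    "product": ("#b197fc", "#fff"),       # violet: products and outputs
}

def style_lines(node_categories):
    """Return one Mermaid 'style' directive per node id."""
    out = []
    for node_id, category in node_categories.items():
        fill, text = COLOR_SCHEME[category]
        out.append(f"style {node_id} fill:{fill},color:{text}")
    return out

styles = style_lines({"A": "input", "E": "structure", "L": "product"})
print("\n".join(styles))
```

An unknown category raises a `KeyError`, which surfaces a mislabeled node early instead of silently leaving it uncolored.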

💡 Core Principles

🌍 Domain Agnostic

Works across any field: biology, chemistry, software engineering, business processes, legal workflows, manufacturing, and beyond.

🔄 Iterative Refinement

Start with rough analysis, visualize, identify gaps, refine with LLM, repeat until the process logic is crystal clear.

📦 Structured Data

JSON storage enables programmatic access, version control, cross-referencing, and integration with other tools and databases.
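The refinement loop and the benefit of structured storage can be illustrated together. The sketch below assumes a record with `nodes` and `edges` keys and treats "a node used in an edge but never described" as a gap. The `refine` function is a stand-in for an LLM pass and simply inserts placeholders; none of these names come from the Framework itself.

```python
def find_gaps(record):
    """Return node ids that appear in edges but have no label yet."""
    declared = set(record["nodes"])
    used = {node for edge in record["edges"] for node in edge}
    return sorted(used - declared)

def refine(record, gaps):
    """Stand-in for an LLM refinement pass; a real pass would re-read the source."""
    for node_id in gaps:
        record["nodes"][node_id] = f"TODO: describe {node_id}"
    return record

# Hypothetical first-pass result with two undescribed nodes.
record = {
    "nodes": {"A": "ORC binds origin"},
    "edges": [("A", "B"), ("B", "C")],
}

rounds = 0
# Repeat until no gaps remain, with a cap so the loop always terminates.
while (gaps := find_gaps(record)) and rounds < 5:
    record = refine(record, gaps)
    rounds += 1

print(rounds, sorted(record["nodes"]))
```

Because the process is plain structured data, the gap check is an ordinary set difference and needs no parsing of the rendered diagram.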

📚 Process Diagram Collections

The Programming Framework has been applied across multiple scientific disciplines. Explore interactive flowchart collections organized by domain:

🧬 Biology

Biological process visualizations: GLMP covers biochemical/molecular processes; Biology Database covers higher-level organismal processes.

Biology Database: 52 processes (organismal/ecological) | GLMP: 50+ processes (biochemical/molecular)

⚗️ Chemistry

Comprehensive chemistry process diagrams across all major branches.

🗄️ Chemistry Database Table →

56 processes across 14 subcategories

🔢 Mathematics

Mathematical algorithms, proof methods, and computational processes.

🗄️ Mathematics Database Table →

20 processes across 7 subcategories

⚛️ Physics

Physical processes including quantum mechanics, thermodynamics, and particle physics.

🗄️ Physics Database Table →

21 processes across 7 subcategories

💻 Computer Science

Algorithms, software engineering workflows, and computational processes.

🗄️ Computer Science Database Table →

21 processes across 7 subcategories

⚙️ Technical Architecture

🤖 LLM Integration

  • Google Gemini 2.0 Flash for analysis
  • Vertex AI for enterprise deployment
  • Custom prompts for process extraction
  • Structured JSON output formatting
📊 Visualization Stack

  • Mermaid.js for flowchart rendering
  • JSON schema for data validation
  • Interactive SVG output
  • Export to PNG/PDF supported
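To give a sense of what validating a stored record involves, here is a dependency-free sketch. It is not the project's JSON schema: the required fields and their types are assumptions, and a production setup would more likely use a JSON Schema validator library than hand-written checks.

```python
# Assumed required fields and types; the real schema is not given in this document.
REQUIRED_FIELDS = {
    "title": str,
    "discipline": str,
    "mermaid": str,
    "citations": list,
}

def validate_record(record):
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    source = record.get("mermaid")
    # Cheap sanity check only; it does not replace rendering with Mermaid.js.
    if isinstance(source, str) and not source.lstrip().startswith(("graph", "flowchart")):
        problems.append("mermaid should start with 'graph' or 'flowchart'")
    return problems

good = {
    "title": "DNA replication",
    "discipline": "Biology",
    "mermaid": "graph TD\n    A[Start] --> B[End]",
    "citations": ["example citation"],
}
bad = {"title": "Untitled", "mermaid": 5}

print(validate_record(good))
print(validate_record(bad))
```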

💾 Data Storage

  • Google Cloud Storage for JSON files
  • Firestore for metadata indexing
  • Version control with Git
  • Cross-referencing with papers database

🔗 Integration Points

  • GLMP specialized collections
  • CopernicusAI knowledge graph
  • Research papers database
  • API endpoints for programmatic access

✅ Validation & Accuracy

🔍 Quality Assurance Process

  • Automated Validation: All flowcharts validated for Mermaid syntax correctness before publication
  • Metadata Quality Checks: JSON schema validation ensures >=85% metadata completeness (NSF standard)
  • Source Citation Verification: All processes include verified research paper citations with DOI/PubMed links
  • Cross-Reference Validation: Automated checks ensure discipline links and back-references are correct
  • Color Scheme Consistency: All processes follow standardized 5-color scheme for visual consistency
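A metadata completeness check against the 85% threshold could be computed along these lines. The document does not define how the score is calculated, so the field list and the formula below (the share of expected fields that are present and non-empty, averaged over records) are assumptions for illustration only.

```python
# Assumed metadata fields; the project's actual field list is not given here.
METADATA_FIELDS = [
    "title", "discipline", "subcategory", "description",
    "citations", "doi", "keywords",
]

def completeness(record):
    """Fraction of expected metadata fields that are present and non-empty."""
    filled = sum(1 for field in METADATA_FIELDS if record.get(field))
    return filled / len(METADATA_FIELDS)

def average_completeness(records, threshold=0.85):
    """Return the mean completeness and whether it meets the threshold."""
    avg = sum(completeness(r) for r in records) / len(records)
    return avg, avg >= threshold

# Two hypothetical records: one complete, one missing 'doi' and 'keywords'.
full = {field: "x" for field in METADATA_FIELDS}
partial = {field: "x" for field in METADATA_FIELDS[:5]}

avg, ok = average_completeness([full, partial])
print(f"{avg:.3f}", ok)
```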

📊 Scale & Coverage

  • 314 Processes Validated: Successfully applied across 6 discipline databases (Biology, Chemistry, Physics, CS, Mathematics, GLMP)
  • Multi-Domain Testing: Framework validated on biological pathways, chemical reactions, computational algorithms, and mathematical proofs
  • Iterative Refinement: Processes refined through multiple LLM analysis cycles to improve accuracy
  • User Feedback Integration: Community feedback mechanism enables continuous improvement (see "Improve this process" on each flowchart)
  • Expert Validation: GLMP processes validated against established biochemical pathway databases

🎯 Accuracy Measures

Syntax Accuracy

100% of published flowcharts render without Mermaid syntax errors

Metadata Completeness

>=85% average quality score across all processes (exceeds NSF requirements)

Source Coverage

All processes include 1-3 verified research paper citations with accessible links

⚠️ Known Limitations

  • LLM-Dependent Accuracy: Flowchart accuracy depends on LLM interpretation of source material; complex processes may require multiple refinement cycles
  • Domain Expertise Required: While the Framework is domain-agnostic, optimal results benefit from domain-specific knowledge for validation
  • Source Material Quality: Accuracy is limited by the quality and completeness of input source material
  • Continuous Improvement: Framework is actively refined based on user feedback and validation results

🔗 Related Projects

🧬 GLMP - Genome Logic Modeling

First specialized application of the Programming Framework to biochemical processes. 100+ biological pathways visualized as interactive flowcharts.

Explore GLMP →

🔬 CopernicusAI

Knowledge engine integrating the Programming Framework with AI podcasts, research papers, and knowledge graph for scientific discovery.

Visit CopernicusAI →

How to Cite This Work

Welz, G. (2024–2025). The Programming Framework: A Universal Method for Process Analysis.
Hugging Face Spaces. https://huggingface.co/spaces/garywelz/programming_framework

BibTeX Format:

@misc{welz2025programmingframework,
  title={The Programming Framework: A Universal Method for Process Analysis},
  author={Welz, Gary},
  year={2024--2025},
  url={https://huggingface.co/spaces/garywelz/programming_framework},
  note={Hugging Face Spaces}
}

Welz, G. (2024). From Inspiration to AI: Biology as Visual Programming.
Medium. https://medium.com/@garywelz_47126/from-inspiration-to-ai-biology-as-visual-programming-520ee523029a

This project serves as a foundational meta-tool for AI-assisted process analysis, enabling systematic extraction and visualization of complex logic from textual sources across diverse scientific and technical domains.

The Programming Framework is designed as infrastructure for AI-assisted science, providing a universal methodology that can be specialized for domain-specific applications.