
Using AI for graphics? Here is a much better way!

  • Writer: Roger Kennett
  • Nov 20
  • 5 min read

Updated: Nov 20


AI takes ages to generate an image, and the result is still not quite right (despite consuming a bunch of electricity). When you ask for a revision, it starts from scratch, and sometimes things that were OK no longer are.

Switch to procedural image generation instead.

OK, it's NOT good for generating photo-real images. What it is great for is graphics and diagrams - the things teachers use to explain concepts and ideas.

Before we jump into the what and how, here are some advantages:


Advantages of procedural image generation:

  • Orders of magnitude less energy usage.

  • Making changes is fast and reliable

  • It can count and apply logic

  • It can create a visual simulation that follows your rules (especially useful for STEM teachers)

  • Can generate interactive graphics in 2D or 3D space.

  • Animation is easy

  • Easy to make changes later on

  • Full manual control where you can tweak the result yourself

  • It's a powerful learning tool for your students that works to improve their written language skills (see the last point in this post).


What is it?

Procedural image generation means using a Large Language Model (LLM) to generate a graphic by writing code (think of the code that drives web pages: HTML, JavaScript, SVG). The best part is the speed and changeability of the product. For making graphics that explain ideas, it is way better than diffusion models (the default image mode in ChatGPT, Copilot, etc.). If you want to understand more about how AI creates images and graphics, jump into my "Taming the AI Dragon for teachers, Module 1".
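
To make this concrete, here is a tiny hand-written example (mine, not LLM output) of the kind of code involved: a few lines of SVG that draw two labelled boxes joined by an arrow. Save it in a file ending in .html and open it in any browser.

```html
<!-- A minimal SVG diagram: two labelled boxes joined by an arrow -->
<svg width="340" height="120" xmlns="http://www.w3.org/2000/svg" font-family="sans-serif">
  <defs>
    <marker id="arrow" markerWidth="10" markerHeight="10" refX="8" refY="3" orient="auto">
      <path d="M0,0 L8,3 L0,6 Z" fill="#336"/>
    </marker>
  </defs>
  <rect x="10" y="35" width="120" height="50" rx="8" fill="#cfe8ff" stroke="#336"/>
  <text x="70" y="65" text-anchor="middle">Prompt</text>
  <line x1="130" y1="60" x2="196" y2="60" stroke="#336" marker-end="url(#arrow)"/>
  <rect x="200" y="35" width="120" height="50" rx="8" fill="#ffe3cf" stroke="#633"/>
  <text x="260" y="65" text-anchor="middle">Response</text>
</svg>
```

Because the graphic is just this text, asking the LLM to "make the boxes green" or "add a third box" is a tiny, reliable edit rather than a full regeneration.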


Let's explain with an example

[Image: the code written by the LLM (top box) and the "Mixture of Experts" graphic it renders (bottom box)]

In the first box you can see the text (code) that the LLM wrote. The box below shows that code rendered as the "Mixture of Experts" graphic. There is a very simple animation showing how the user's prompt changes which "experts" the chatbot will access to generate its response.
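
The screenshot itself isn't reproduced here, but to give a feel for what sits in that first box, here is a much-simplified sketch (my own, not the code from the image) of how a few lines of SVG can animate one "expert" lighting up:

```html
<!-- Sketch: three "experts"; the selected one pulses -->
<svg width="300" height="100" xmlns="http://www.w3.org/2000/svg" font-family="sans-serif">
  <circle cx="60" cy="50" r="25" fill="#ddd"/>
  <circle cx="150" cy="50" r="25" fill="#8bc34a">
    <!-- SMIL animation: the chosen expert pulses between two shades -->
    <animate attributeName="fill" values="#8bc34a;#2e7d32;#8bc34a" dur="2s" repeatCount="indefinite"/>
  </circle>
  <circle cx="240" cy="50" r="25" fill="#ddd"/>
  <text x="150" y="92" text-anchor="middle" font-size="12">Expert 2 selected for this prompt</text>
</svg>
```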


Because the model is writing text (code), you have far more control. When you ask for adjustments, the LLM is not starting from scratch (like a diffusion model); it is altering the existing text. Since it is based on logic, it is far more accurate in responding to your requests (it can count and spell, for starters!).

You can copy that text and paste it into any LLM. Even better, start a new chat, paste in the code, and ask for changes (always a good idea - especially when the AI seems not to understand your request).

You can even (easily - so easily) create 3D spaces with 3D graphics inside. At times, these can be helpful for explaining complex ideas. These 3D spaces can be controlled by your user (fly the camera around inside them). None of this is possible with a diffusion model. Oh, and it can SPELL! Did I mention that? If you want a few words changed, it does just that, with thousands of times less energy than diffusion.


Universal and manual control.

You can make changes to the code directly. It is not that hard to change spelling or replace words - just search for the old word in the text (code) and replace it.
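
For example (an invented snippet), if a label in your graphic is misspelled, find the string in the code and retype it:

```html
<!-- before -->
<text x="100" y="40">Fotosynthesis</text>
<!-- after: same element, spelling fixed -->
<text x="100" y="40">Photosynthesis</text>
```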

You can paste the code it generates into free SVG editors and make changes. This way you have complete manual control to change what you want.

You can even start manually to get the basic layout and bring that (via SVG) into the LLM. Have a hand-drawn sketch of what you want? No problem: put a photo of your sketch into a chatbot and ask it to describe the image in words. Now you have your starting description!


Create a simulation, not just an animated graphic.

This is especially powerful. Because the graphic is following logical instructions, you can have it be, you know, logical! Below is a simulation of how an artificial neural network transforms inputs into outputs. It is NOT an animation; it is a simulation. It follows a simplified version of the mathematics behind an LLM - it simulates, not animates. How to create this? You do NOT need to know how to code: an LLM will take your careful, step-by-step instructions and do all the coding for you! (Press the start button below.)
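
To show what "simulation, not animation" means in code, here is a stripped-down sketch (one neuron instead of a network, with invented inputs and weights) of the kind of page an LLM might write for you. Pressing the button actually computes a weighted sum and squashes it through a sigmoid - the same simplified mathematics mentioned above:

```html
<!DOCTYPE html>
<html>
<body>
  <button onclick="step()">Start</button>
  <p id="out">Press start to feed the inputs through the neuron.</p>
  <script>
    const inputs  = [0.5, 0.9];   // hypothetical input values
    const weights = [0.8, -0.4];  // hypothetical learned weights
    const bias    = 0.1;
    const sigmoid = x => 1 / (1 + Math.exp(-x));

    function step() {
      // weighted sum of inputs plus bias - computed live, not pre-drawn
      const sum = inputs.reduce((acc, x, i) => acc + x * weights[i], bias);
      document.getElementById("out").textContent =
        `sum = ${sum.toFixed(2)}, output = ${sigmoid(sum).toFixed(2)}`;
    }
  </script>
</body>
</html>
```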



Create interactive 3D worlds

Since it is code, you can create your graphics in a 3D world and zoom around in that world. Below is an example to whet your appetite. You can control the camera and set a start and end position for a nice fly-through of your 3D world. How did I create this?... see my pro tip below.
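
For the curious, here is a minimal sketch of the kind of code behind such a 3D world, using the three.js library (the CDN URL and version below are illustrative). It draws a spinning cube and flies the camera in a slow circle around it:

```html
<!DOCTYPE html>
<html>
<body style="margin:0">
<script type="module">
  import * as THREE from "https://unpkg.com/three@0.160.0/build/three.module.js";

  const scene = new THREE.Scene();
  const camera = new THREE.PerspectiveCamera(60, innerWidth / innerHeight, 0.1, 100);
  const renderer = new THREE.WebGLRenderer({ antialias: true });
  renderer.setSize(innerWidth, innerHeight);
  document.body.appendChild(renderer.domElement);

  // a cube whose material needs no lighting, to keep the sketch short
  const cube = new THREE.Mesh(new THREE.BoxGeometry(1, 1, 1), new THREE.MeshNormalMaterial());
  scene.add(cube);

  // simple fly-through: the camera circles the cube while looking at it
  let t = 0;
  function animate() {
    requestAnimationFrame(animate);
    t += 0.005;
    camera.position.set(Math.sin(t) * 4, 1, Math.cos(t) * 4);
    camera.lookAt(cube.position);
    cube.rotation.y += 0.01;
    renderer.render(scene, camera);
  }
  animate();
</script>
</body>
</html>
```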


Pro tip

Spend time in TEXT only. Make it clear that you DO NOT want the graphic created yet; just explore the idea in words. Ask the AI to ask you questions to clarify. Once you are happy with the text description of what you want (including the rules for any simulation), start a new chat, paste in the description, and ask it to generate the graphic. While I love Claude.ai for this, you can use (almost) any chatbot - just be explicit that you want it to generate HTML to create your graphic.
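
For example (my own wording - adapt it to your needs), the final message to the fresh chat might read: "Generate a single, self-contained HTML file, with inline SVG and JavaScript and no external libraries, that implements the description below. Do not add anything beyond the description."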


Powerful learning for your students

I think the point of me learning things is to then put that knowledge and skill into the hands of my students. Let's face it, you might play around generating some cool graphics for your class, but our job is to raise young people for their future. There is a meta-lesson in giving this knowledge-power to your students: namely, how to use AI creatively and responsibly - to co-create with AI.

Then there is the actual lesson. Maybe they are creating a simulation of a simple food chain (grass -> rabbit -> dingo; see the sketch below). Maybe in an assignment you stipulate that they need to co-create with AI for one graphic to explain x. Because your students are explaining what they want in plain language, they are not wasting time learning code. More importantly, they are learning the skill of being explicit, precise, logical, and accurate in their language - because if they are not, it will not produce the desired output.
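
To illustrate, here is a sketch of the kind of food-chain simulation a student might end up co-creating. The rules and starting numbers are invented for illustration; in practice, students would negotiate their own rules with the AI in plain language:

```html
<!DOCTYPE html>
<html>
<body>
  <pre id="log"></pre>
  <script>
    // invented starting populations for a grass -> rabbit -> dingo chain
    let grass = 100, rabbits = 20, dingoes = 4;

    function tick() {
      // each rule is one plain-language sentence turned into code
      grass   = Math.max(grass + 10 - rabbits, 0);                        // grass regrows; rabbits graze
      rabbits = Math.max(rabbits + Math.floor(grass / 50) - dingoes, 0);  // rabbits breed if fed; dingoes hunt
      dingoes = Math.max(dingoes + (rabbits > 30 ? 1 : -1), 0);           // dingoes thrive only with plenty of prey
      document.getElementById("log").textContent +=
        `grass=${grass} rabbits=${rabbits} dingoes=${dingoes}\n`;
    }

    setInterval(tick, 1000); // one simulated season per second
  </script>
</body>
</html>
```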

Who would have thought that the technology that threatens to down-skill everyone's writing might actually be used to sharpen it?!

Well of course, us teachers are always subverting things to benefit our students!


If you found this helpful, I reckon you might love to get deeper into understanding LLMs from the perspective of what is helpful for teachers. Try my course written by a teacher (and educational neuroscientist), for teachers. Find out more here: https://www.learningforge.com.au/aiclassroom


I made a claim about energy use; here are my sources...

Comparative Table: Energy Use

| Aspect | Procedural Image Generation (HTML/CSS via LLM) | Diffusion Model Image Generation |
| --- | --- | --- |
| Compute Process | Single-pass text generation + browser rendering¹ | Iterative denoising (hundreds of steps)² |
| Energy per Image | ~0.0002–0.002 kWh (similar to text generation)³ | ~0.00138 kWh per image (Stable Diffusion v1.5)⁴ |
| Scaling with Resolution | Minimal impact (browser optimized)¹ | 1.3×–4.7× increase when resolution doubles⁵ |
| Hardware Demand | CPU/GPU for rendering (lightweight)¹ | High-end GPU clusters for inference² |
| Relative Energy Use | Orders of magnitude lower³ | Very high compared to text tasks³ |

References



