
Using AI for graphics? Here is a much better way!

  • Writer: Roger Kennett
  • Nov 20
  • 5 min read

Updated: Nov 20


AI takes ages to generate an image, and the result is still not quite right (despite consuming a bunch of electricity). When you ask for a revision, it starts from scratch, and sometimes things that were OK no longer are.

Switch to procedural image generation instead.

OK, it's NOT good for generating photo-real images. What it is great for is graphics and diagrams - the things teachers use to explain concepts and ideas.

Before we jump into the what and how, here are some advantages:


Advantages of procedural image generation:

  • Orders of magnitude less energy usage.

  • Making changes is fast and reliable

  • It can count and apply logic

  • It can create a visual simulation that follows your rules (especially useful for STEM teachers)

  • Can generate interactive graphics in 2D or 3D space.

  • Animation is easy

  • Easy to make changes later on

  • Full manual control where you can tweak the result yourself

  • It's a powerful learning tool for your students that works to improve their written language skills (see the last point in this post).


What is it?

Procedural image generation means using a Large Language Model (LLM) to generate a graphic by writing code (think of the code that drives web pages: HTML, JavaScript, SVG). The best part is the speed and changeability of the product. For making graphics that explain ideas, it is way better than diffusion models (the default image mode in ChatGPT, Copilot, etc.). If you want to understand more about how AI creates images and graphics, jump into my "Taming the AI Dragon for teachers, Module 1".
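
To make this concrete, here is a tiny hand-written example (mine, not LLM output) of the kind of code involved: a few lines of SVG that draw two labelled boxes joined by an arrow. Save it in a file ending in .html and open it in any browser.

```html
<!-- A minimal SVG diagram: two labelled boxes joined by an arrow -->
<svg width="340" height="120" xmlns="http://www.w3.org/2000/svg" font-family="sans-serif">
  <defs>
    <marker id="arrow" markerWidth="10" markerHeight="10" refX="8" refY="3" orient="auto">
      <path d="M0,0 L8,3 L0,6 Z" fill="#336"/>
    </marker>
  </defs>
  <rect x="10" y="35" width="120" height="50" rx="8" fill="#cfe8ff" stroke="#336"/>
  <text x="70" y="65" text-anchor="middle">Prompt</text>
  <line x1="130" y1="60" x2="196" y2="60" stroke="#336" marker-end="url(#arrow)"/>
  <rect x="200" y="35" width="120" height="50" rx="8" fill="#ffe3cf" stroke="#633"/>
  <text x="260" y="65" text-anchor="middle">Response</text>
</svg>
```

Because the graphic is just this text, asking the LLM to "make the boxes green" or "add a third box" is a tiny, reliable edit rather than a full regeneration.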


Let's explain with an example

[Image: the code written by the LLM (top box) and the "Mixture of Experts" graphic it renders (bottom box)]

In the first box you can see the text (code) that the LLM wrote. The box below shows that code rendered as the "Mixture of Experts" graphic. There is a very simple animation showing how the user's prompt changes which "experts" the chatbot will access to generate its response.
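
The screenshot itself isn't reproduced here, but to give a feel for what sits in that first box, here is a much-simplified sketch (my own, not the code from the image) of how a few lines of SVG can animate one "expert" lighting up:

```html
<!-- Sketch: three "experts"; the selected one pulses -->
<svg width="300" height="100" xmlns="http://www.w3.org/2000/svg" font-family="sans-serif">
  <circle cx="60" cy="50" r="25" fill="#ddd"/>
  <circle cx="150" cy="50" r="25" fill="#8bc34a">
    <!-- SMIL animation: the chosen expert pulses between two shades -->
    <animate attributeName="fill" values="#8bc34a;#2e7d32;#8bc34a" dur="2s" repeatCount="indefinite"/>
  </circle>
  <circle cx="240" cy="50" r="25" fill="#ddd"/>
  <text x="150" y="92" text-anchor="middle" font-size="12">Expert 2 selected for this prompt</text>
</svg>
```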


Because the model is writing text (code), you have far more control. When you ask for adjustments, the LLM is not starting from scratch (like a diffusion model); it is altering the existing text. Since it is based on logic, it is far more accurate in responding to your requests (it can count and spell, for starters!).

You can copy that text and paste it into any LLM. Even better, start a new chat, paste in the code, and ask for changes (always a good idea - especially when the AI seems not to understand your request).

You can even (easily - so easily) create 3D spaces with 3D graphics inside. At times, these can be helpful for explaining complex ideas. These 3D spaces can be controlled by your user (fly the camera around inside them). None of this is possible with a diffusion model. Oh, and it can SPELL! Did I mention that? If you want a few words changed, it does just that, with thousands of times less energy than diffusion.


Universal and manual control.

You can make changes to the code directly. It is not that hard to change spelling or replace words - just search for the old word in the text (code) and replace it.
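
For example (an invented snippet), if a label in your graphic is misspelled, find the string in the code and retype it:

```html
<!-- before -->
<text x="100" y="40">Fotosynthesis</text>
<!-- after: same element, spelling fixed -->
<text x="100" y="40">Photosynthesis</text>
```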

You can paste the code it generates into free SVG editors and make changes. This way you have complete manual control to change what you want.

You can even start manually to get the basic layout and bring that (via SVG) into the LLM. Have a hand-drawn sketch of what you want? No problem: put a photo of your sketch into a chatbot and ask it to describe the image in words. Now you have your starting description!


Create a simulation, not just an animated graphic.

This is especially powerful. Because the graphic is following logical instructions, you can have it be, you know, logical! Below is a simulation of how an artificial neural network transforms inputs into outputs. It is NOT an animation; it is a simulation. It follows a simplified version of the mathematics behind an LLM - it simulates, not animates. How to create this? You do NOT need to know how to code: an LLM will take your careful, step-by-step instructions and do all the coding for you! (Press the start button below.)
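
To show what "simulation, not animation" means in code, here is a stripped-down sketch (one neuron instead of a network, with invented inputs and weights) of the kind of page an LLM might write for you. Pressing the button actually computes a weighted sum and squashes it through a sigmoid - the same simplified mathematics mentioned above:

```html
<!DOCTYPE html>
<html>
<body>
  <button onclick="step()">Start</button>
  <p id="out">Press start to feed the inputs through the neuron.</p>
  <script>
    const inputs  = [0.5, 0.9];   // hypothetical input values
    const weights = [0.8, -0.4];  // hypothetical learned weights
    const bias    = 0.1;
    const sigmoid = x => 1 / (1 + Math.exp(-x));

    function step() {
      // weighted sum of inputs plus bias - computed live, not pre-drawn
      const sum = inputs.reduce((acc, x, i) => acc + x * weights[i], bias);
      document.getElementById("out").textContent =
        `sum = ${sum.toFixed(2)}, output = ${sigmoid(sum).toFixed(2)}`;
    }
  </script>
</body>
</html>
```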



Create interactive 3D worlds

Since it is code, you can create your graphics in a 3D world and zoom around in that world. Below is an example to whet your appetite. You can control the camera and set a start and end position for a nice fly-through of your 3D world. How did I create this?... see my pro tip below.
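
For the curious, here is a minimal sketch of the kind of code behind such a 3D world, using the three.js library (the CDN URL and version below are illustrative). It draws a spinning cube and flies the camera in a slow circle around it:

```html
<!DOCTYPE html>
<html>
<body style="margin:0">
<script type="module">
  import * as THREE from "https://unpkg.com/three@0.160.0/build/three.module.js";

  const scene = new THREE.Scene();
  const camera = new THREE.PerspectiveCamera(60, innerWidth / innerHeight, 0.1, 100);
  const renderer = new THREE.WebGLRenderer({ antialias: true });
  renderer.setSize(innerWidth, innerHeight);
  document.body.appendChild(renderer.domElement);

  // a cube whose material needs no lighting, to keep the sketch short
  const cube = new THREE.Mesh(new THREE.BoxGeometry(1, 1, 1), new THREE.MeshNormalMaterial());
  scene.add(cube);

  // simple fly-through: the camera circles the cube while looking at it
  let t = 0;
  function animate() {
    requestAnimationFrame(animate);
    t += 0.005;
    camera.position.set(Math.sin(t) * 4, 1, Math.cos(t) * 4);
    camera.lookAt(cube.position);
    cube.rotation.y += 0.01;
    renderer.render(scene, camera);
  }
  animate();
</script>
</body>
</html>
```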


Pro tip

Spend time in TEXT only. Make it clear that you DO NOT want the graphic created yet; just explore the idea in words. Ask the AI to ask you questions to clarify. Once you are happy with the text description of what you want (including the rules for any simulation), start a new chat, paste in the description, and ask it to generate the graphic. While I love Claude.ai for this, you can use (almost) any chatbot - just be explicit that you want it to generate HTML to create your graphic.
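
For example (my own wording - adapt it to your needs), the final message to the fresh chat might read: "Generate a single, self-contained HTML file, with inline SVG and JavaScript and no external libraries, that implements the description below. Do not add anything beyond the description."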


Powerful learning for your students

I think the point of me learning things is to then put that knowledge and skill into the hands of my students. Let's face it, you might play around generating some cool graphics for your class, but our job is to raise young people for their future. There is a meta-lesson in giving this knowledge-power to your students: namely, how to use AI creatively and responsibly - to co-create with AI.

Then there is the actual lesson. Maybe they are creating a simulation of a simple food chain (grass -> rabbit -> dingo; see the sketch below). Maybe in an assignment you stipulate that they need to co-create with AI for one graphic to explain x. Because your students are explaining what they want in plain language, they are not wasting time learning code. More importantly, they are learning the skill of being explicit, precise, logical, and accurate in their language - because if they are not, it will not produce the desired output.
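
To illustrate, here is a sketch of the kind of food-chain simulation a student might end up co-creating. The rules and starting numbers are invented for illustration; in practice, students would negotiate their own rules with the AI in plain language:

```html
<!DOCTYPE html>
<html>
<body>
  <pre id="log"></pre>
  <script>
    // invented starting populations for a grass -> rabbit -> dingo chain
    let grass = 100, rabbits = 20, dingoes = 4;

    function tick() {
      // each rule is one plain-language sentence turned into code
      grass   = Math.max(grass + 10 - rabbits, 0);                        // grass regrows; rabbits graze
      rabbits = Math.max(rabbits + Math.floor(grass / 50) - dingoes, 0);  // rabbits breed if fed; dingoes hunt
      dingoes = Math.max(dingoes + (rabbits > 30 ? 1 : -1), 0);           // dingoes thrive only with plenty of prey
      document.getElementById("log").textContent +=
        `grass=${grass} rabbits=${rabbits} dingoes=${dingoes}\n`;
    }

    setInterval(tick, 1000); // one simulated season per second
  </script>
</body>
</html>
```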

Who would have thought that the technology that threatens to down-skill everyone's writing might actually be used to sharpen it?!

Well of course, us teachers are always subverting things to benefit our students!


If you found this helpful, I reckon you might love to get deeper into understanding LLMs from the perspective of what is helpful for teachers. Try my course written by a teacher (and educational neuroscientist), for teachers. Find out more here: https://www.learningforge.com.au/aiclassroom


I made a claim about energy use; here are my sources...

Comparative Table: Energy Use

| Aspect | Procedural Image Generation (HTML/CSS via LLM) | Diffusion Model Image Generation |
| --- | --- | --- |
| Compute Process | Single-pass text generation + browser rendering¹ | Iterative denoising (hundreds of steps)² |
| Energy per Image | ~0.0002–0.002 kWh (similar to text generation)³ | ~0.00138 kWh per image (Stable Diffusion v1.5)⁴ |
| Scaling with Resolution | Minimal impact (browser optimized)¹ | 1.3×–4.7× increase when resolution doubles⁵ |
| Hardware Demand | CPU/GPU for rendering (lightweight)¹ | High-end GPU clusters for inference² |
| Relative Energy Use | Orders of magnitude lower³ | Very high compared to text tasks³ |

References



