Image generated by AI. Credit: SpaceQ/Canva

Today’s column is a little bit of a departure from the usual. It’s not about space, or even business. It’s about everyone’s favourite topic, AI (Artificial Intelligence).

I guess you could say it’s a bit of a personal reflection. If you follow my LinkedIn feed, you may have noticed that I often post portraits of myself and of some of the founders that I work with. These, of course, are generated by AI. Despite them being “fakes” in the traditional sense, I think they are actually quite effective. If you know the people involved, you will know that the images are very – sometimes disturbingly – good likenesses. But more than that they, like any good portrait, say something about the person that goes deeper than just the image on the screen. And through the magic of AI, I can place that person in a context and situation that illustrates the message in the words. 

I make a point of explaining that because getting to the point where I could generate those kinds of images has actually been a significant learning experience. And through that experience I think I have actually learned a lot about AI in general, not just AI image generation. You see, at least for me, one of the most surprising things about working with AI is how invisible the process feels. When I use it for writing or analysis, I see a clean answer—neatly phrased, confidently delivered, and entirely opaque. The reasoning, if there is any, stays hidden behind fluent language.

But, oddly enough, experimenting with AI image generation actually made the whole thing make a lot more sense to me. That's because image generation exposes the process behind the "intelligence". You can literally see how it "thinks," pixel by pixel. And that transparency taught me more about AI than any amount of reading probably ever could.

So, let me take a step back and explain a little about how the process of generating an AI image works. Please note that if you are quite familiar with AI image generation this will seem a bit like a “cartoon” of the real process, but I think it’s a fair description.

I started learning in more depth about image generation because I got interested in whether or not I could really make images of myself and other people I knew that would be good likenesses. At first, it was mostly for fun. But then, as often happens, I started genuinely wondering why the things I tried occasionally worked but mostly did not.

It started to become a bit of a test. I wanted the same person to appear across multiple scenes, moods, and poses—consistent identity, consistent personality. Not a random assortment of images that occasionally could be mistaken for me or someone else I knew.

Every time I asked an off‑the‑shelf AI tool for “the same character,” I got someone that was recognizable… most of the time.  But it was still very much a game of chance.  Plus, I found that the images often reproduced artifacts that were clearly not “real” – like an overly waxy complexion, or the same clothing all the time, or a very “flat” quality – no real life or animation. The AI wasn’t producing identity. It was producing whatever statistical average it thought I would find pleasing. But it really was not combining all of the elements in a consistent way.

Which makes sense, because I realized that AI doesn't converge on an answer that is true. AI converges on the most probable answer, the one it expects will please the user based on what it knows about them.

Which was my first big realization about AI in general.

Left alone, it won’t give you what’s real. It will give you what it thinks you’ll like.

So, eventually my desire to generate consistent, repeatable, recognizable images made me stop relying on someone else’s black box model and start building my own. Because it turns out that the tools exist to let you build your own process. But to do that I had to learn a lot more about how AI image generation actually works.

And, unsurprisingly, I learned that there is a lot more going on "under the hood" than is obvious from simply throwing prompts at the wall to see what sticks. So, permit me a small digression to explain the basics of the image generation process.

Image generation is a statistical process. The AI starts with a grid of random pixels.  It applies a massive number of variations to this noise and picks the one that it thinks is the most like the image description that you have provided for it. And then it takes that image and goes through the process again, and again, eventually “converging” on an image that looks something like your description.

The magic is in how it performs this “diffusion” process from random noise to recognizable image. I won’t claim to understand the process in detail but suffice it to say that it involves a staggering number of computational steps on massive arrays of data – which is why it requires a chip that is specifically designed for the purpose. 
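To make the cartoon a little more concrete, here is a toy sketch of the idea in Python. Nothing in it is a real diffusion model; the "target" vector simply stands in for whatever your prompt describes. But it has the same shape as the real process: start with noise, repeatedly remove what the model predicts is noise, and converge.

```python
# A toy stand-in for the denoising loop. The "target" plays the role of
# "the image your prompt describes"; a real model predicts the noise with
# a massive neural network instead of a simple subtraction.
import numpy as np

rng = np.random.default_rng(seed=42)

target = rng.normal(size=8)   # what the prompt "asks for"
image = rng.normal(size=8)    # start from pure random noise

for step in range(50):
    predicted_noise = image - target   # the model's guess at what to remove
    image = image - 0.1 * predicted_noise

print(np.round(image - target, 3))     # nearly zero: the image has "converged"
```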

At the heart of the process is something called a "diffusion model." This is a massive set of rules (often several gigabytes' worth) that have been developed to allow the AI to make the connection between an image and a human language description. The model is developed by training: showing it images along with their attendant descriptions and using a Large Language Model to make the connection. To do this effectively requires literally billions of images.

So, for any image generation process there is a base diffusion model that has been trained by exposing it to a massive number of images and an equally massive number of textual descriptions of those images. Obviously building this base model is a huge amount of work which is why a lot of them are proprietary. But, there are a number of different options available in the public domain. I chose a model known as “Flux”. 

So now, it was just a matter of installing a system capable of applying that diffusion process on my local PC, one with a GPU that can run it. So far, so good. There are, again, many options. Whatever tool you use will require a number of parameters though. Just specifying the diffusion model and giving it a prompt only scratches the surface.

You also have to specify what process the model should use for sampling and converging on the final solution. Unsurprisingly, there are many choices, and they will often generate dramatically different results. All of the models also require a seed value for a random number generator, and the final product will often differ dramatically depending on which seed is chosen. This is why the "black box" methods can often prove to be inconsistent. There are just too many variables that are out of your sight and out of your control.

But there is hope, because when you run the process locally, you have the option to control each step. You also have the option to watch the image "develop" through the diffusion process, so you can judge the effect of changing any of the parameters available to you.
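Here is a minimal sketch of what one local generation run looks like. It assumes the Hugging Face diffusers library and the publicly released FLUX.1-dev checkpoint; the prompt, step count, guidance value and seed are illustrative choices, not a recipe. The point is that every argument is a variable that a hosted service would normally choose for you invisibly.

```python
# A minimal sketch of one local generation run, assuming the `diffusers`
# library and the public FLUX.1-dev weights.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,   # needs a GPU with plenty of memory
).to("cuda")

# Pinning the random seed is what makes a run repeatable.
generator = torch.Generator(device="cuda").manual_seed(1234)

image = pipe(
    prompt="portrait of an older man at a desk, natural window light",
    num_inference_steps=28,       # how many denoising passes to run
    guidance_scale=3.5,           # how strongly to follow the prompt
    generator=generator,
).images[0]

image.save("portrait.png")
```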

So, it becomes possible to start “engineering” a solution by selective variation of all of the parameters to learn what works and what does not. This, I did. I learned how to create consistent images – and to some extent how to constrain the variation within certain limits.  So, my results became more precise.  But they were not necessarily more accurate.

In other words, I learned how to get consistent results – but not how to make those results look the way I wanted them to.
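In practice, that "engineering by selective variation" mostly means sweeps: hold everything fixed, change one parameter at a time, and compare the outputs side by side. Continuing the sketch above (the values are again just illustrative):

```python
# Vary one parameter at a time while holding the rest fixed, and keep the
# outputs side by side for comparison. `pipe` is the pipeline loaded above.
import torch

prompt = "portrait of an older man at a desk, natural window light"

for seed in (1234, 1235, 1236):          # first axis: the random seed
    for guidance in (2.5, 3.5, 4.5):     # second axis: prompt adherence
        generator = torch.Generator(device="cuda").manual_seed(seed)
        image = pipe(
            prompt=prompt,
            num_inference_steps=28,
            guidance_scale=guidance,
            generator=generator,
        ).images[0]
        image.save(f"sweep_seed{seed}_cfg{guidance}.png")
```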

To get to the next level, I had to go farther back up the image generation chain and actually learn how to make my own (very small) diffusion model. It turns out that, because of the way the fitting process works, the diffusion model can be tuned in a number of ways. The most common is something called a "Low Rank Adaptation" model, or LoRA. A LoRA modifies the training of the larger model with very specific teaching. For instance, I can make a LoRA that tells the diffusion model that when it sees the words "older man," that actually means someone who looks like me.
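The "low rank" part deserves a quick illustration, because it explains why a LoRA can be so small. Rather than retraining an enormous weight matrix, a LoRA learns two skinny matrices whose product is a small, targeted correction laid on top of the frozen original. In toy numbers (real layers are far wider than this):

```python
# The LoRA idea in miniature: a frozen base weight W plus a learned
# low-rank correction B @ A, scaled by alpha / r.
import numpy as np

d, r = 1024, 8                    # layer width vs. LoRA rank (r << d)
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))       # frozen base-model weights
B = rng.normal(size=(d, r))       # trainable factor
A = rng.normal(size=(r, d))       # trainable factor
alpha = 8                         # scaling chosen at training time

W_adapted = W + (alpha / r) * (B @ A)

print(f"base parameters:    {W.size:,}")           # 1,048,576
print(f"adapter parameters: {B.size + A.size:,}")  # 16,384
```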

There are many other uses for LoRAs, but let's just focus on the character aspect for now. What I learned was that there are tools that will allow you to "train a LoRA" by giving it a number of images, along with their associated descriptions, and letting it work through the process of iteratively comparing its guesses to the truth data until it comes up with rules that will generate a recognizable character.
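Under the hood, those tools run a loop that looks roughly like this toy PyTorch sketch. The "images" and "truth data" here are just random stand-in tensors; the point is the structure: the base weights stay frozen, and only the two small LoRA factors are adjusted to close the gap between guess and truth.

```python
# A toy of the loop a LoRA trainer runs: freeze the base weights, and let
# only the low-rank factors learn from the (guess vs. truth) error.
import torch

d, r = 64, 4
W = torch.randn(d, d)                            # frozen base weights
B = torch.zeros(d, r, requires_grad=True)        # adapter starts as a no-op
A = (0.01 * torch.randn(r, d)).requires_grad_()  # small random init

x = torch.randn(16, d)   # stand-in for training images
y = torch.randn(16, d)   # stand-in for the "truth" data

opt = torch.optim.Adam([B, A], lr=1e-2)
for step in range(200):
    pred = x @ (W + B @ A).T          # base model plus the adapter
    loss = torch.nn.functional.mse_loss(pred, y)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final error: {loss.item():.4f}")  # shrinks as the adapter learns
```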

Then this LoRA is injected into the diffusion process and voila – when I say “older man sitting at a desk” that man starts to look an awful lot like me.
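In code, the injection step is pleasingly anticlimactic. Assuming the diffusers pipeline from the earlier sketch, and a hypothetical local weights file, it is one extra call:

```python
# Load the trained character LoRA into the pipeline from the earlier sketch.
# The filename and the "older man" trigger phrase are hypothetical examples.
pipe.load_lora_weights("./my_character_lora.safetensors")

image = pipe(
    prompt="older man sitting at a desk",
    num_inference_steps=28,
    guidance_scale=3.5,
    generator=torch.Generator(device="cuda").manual_seed(1234),
).images[0]

image.save("portrait_with_lora.png")
```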

Which of course sounds simple – until you realize that we have now introduced another whole truckload of variables in the creation of the LoRA that ultimately affect how the images look in the final product. And, again, most of those variables are hidden from the user, unless you spend A LOT of time finding what they are, what they are supposed to do, and frankly what they actually do in practice, regardless of what the internet tells you they do.

It was at this point – which took a number of months of experimentation – that the pieces came together, more or less by happy accident. I had been trying a variety of ways of generating images of myself, using different strategies for training and applying LoRA models. Then I generated some images that stopped me cold.

They didn’t just look like me. They were me.

The AI had captured micro-details I hardly registered about myself: the slightly higher eyebrow; the difference in mobility on each side of my face; the real way I smile, not the posed one; the distribution of texture and shadow unique to my features.

When I showed the images to people who know me, several did not believe they were synthetic.

It was at this moment that I started thinking of what I was doing as “AI Portrait Photography” not just image generation. In case you are wondering, I’m not actually going to reveal all of the methods involved but once I started seeing results I really liked, I had to go back and isolate the techniques that made the difference. Some of it had to do with the images I used to train the model, some of it had to do with the training parameters, and some of it had to do with how to inject the LoRA into the diffusion process.

The point, though, was that all of those little techniques were necessary to unlock the potential to have AI show me the images I wanted to see. Once I unlocked that capacity, it was relatively easy to apply across subjects, settings, lighting, lenses and moods. Once I learned how to train and instruct the AI, it took the work out of the creative process and just started generating results. Some of which really are surprisingly good. Not because they are perfect but because they aren't. They are not "true" pictures of the subjects, but like any good portrait they are something more than just a good likeness.

So, here is where we finally come back to the main point of this column. What I learned about AI is that it is capable of generating imagery that appears truly magical. The images it creates cause people to say things like "how does the AI know?" It genuinely feels like AI has some higher insight that goes beyond human capacity.

But it does not.

Those AI images are the result of a hugely impressive computational model. But they only exist at that level because that model has been meticulously trained by both human intelligence and human creativity.

Using AI I can generate images that I could never generate on my own, and I can generate them by the hundreds. But I can only do this because I have supervised, and continue to supervise, the process from beginning to end.

This is the fundamental truth of AI. It does not reason. It replicates. It does not create. It imitates. It does not extrapolate. It interpolates. But it does all of those things with such rapidity, accuracy and subtlety that it can convince you that it sees things you have not shown it. It hasn’t. It just sees things you might not have known you were showing it.

And that’s why, to me, learning to generate AI portraits has become a surprisingly powerful metaphor for working with AI in general.

If you don’t supervise the system, it doesn’t become more intelligent. It becomes more flattering.

If you don’t curate the data very carefully, it doesn’t become more truthful. But it does converge on the answer you expect much more quickly. If you don’t test it until it fails, it doesn’t become more reliable. It becomes more comfortable.

The bottom line is that AI isn’t here to think for us. It’s here to amplify the parts of our thinking we are willing to cultivate, because it can replicate processes we find difficult to perform repetitively and reliably.

The most dangerous illusion about AI is that it’s intelligent enough to be left alone. It isn’t. It’s literal, tireless, and deeply obedient. It will give you exactly what you have taught it that you expect—and if you aren’t paying attention, that means giving you whatever looks good instead of what is true.

The machine doesn’t find truth.

We do.

And if we stop doing that—if we hand the steering wheel to an algorithm built to please rather than understand—we don’t get artificial intelligence. We get artificial comfort.

Iain is Founder and CEO at SideKickSixtyFive Consulting and host of the Terranauts podcast. He is a seasoned business executive with a deep understanding of the space business and government procurement policy. Iain worked for 22 years at Neptec, including as CEO. He was a VP at the Aerospace Industries Association of Canada, is a mentor at the Creative Destruction Lab, and a visiting professor at the University of Ottawa's Telfer School of Management.
