An AI will always give wrong answers

I recently realised that many people don’t understand how an AI has to behave.

There seems to be a common misconception that it should never give you the wrong answer, whatever that might mean.

An AI of any use will always give wrong answers.

It is easy to build an AI that is never wrong.

It can simply say “I don’t know”, or fail to answer, every question.

Of course, the problem is that you would tell me that such an AI is no use, and you would be right.

The issue is that you want some answers.

So my AI can start giving you answers; but now, some of those answers will be wrong.

In this respect, AIs are very like humans. The person sitting silently in the corner of the room may well be very knowledgeable and intelligent, but you have no way of knowing, and they are not much help in solving your problems.

On the other hand, the person who seems to know everything may well be a lot of help, but is likely to have remembered things wrongly, or made some incorrect deductions based on what they thought they knew.

The more helpful a person is, in terms of answering your questions, the more likely they are to sometimes get it wrong.

In fact, if you insist they always give an answer, then they will definitely get things wrong.

Just like a school exam – if you are not required to answer all the questions, you can restrict yourself to the things you are confident of, and get most things right.

But if I make you answer all the questions, or mark empty answers as wrong, you are likely to give a significant number of wrong answers, if the questions pose any sort of interesting challenge to you.

This is true no matter how careful you are – there will be things that you think are right, but are in fact not.

So, if you want an AI to answer a good range of questions, you have to accept that it will sometimes give wrong answers.

In the world of AI and Machine Learning, this trade-off is captured by two measures: precision and recall.

Precision is the proportion of the answers it gives that are correct.

Recall is the proportion of the questions it is expected to be able to answer that it actually answers correctly.

And the bottom line is that, for any interestingly hard task, they can never both be 100%.

As one climbs, the other is bound to fall, and the challenge is to get them both as high as possible, and then get the right balance.
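To make the trade-off concrete, here is a minimal Python sketch, with entirely made-up confidence numbers, of a question-answerer that abstains below a confidence threshold:

```python
# A toy question-answering system that may say "I don't know".
# The (confidence, correct) pairs are entirely made up for illustration.
QUESTIONS = [
    (0.95, True), (0.90, True), (0.85, True), (0.80, False),
    (0.75, True), (0.60, False), (0.55, True), (0.40, False),
    (0.30, True), (0.10, False),
]

def precision_recall(threshold):
    """Answer only when confidence >= threshold; abstain otherwise."""
    answered = [correct for conf, correct in QUESTIONS if conf >= threshold]
    right = sum(answered)
    # Precision: of the answers actually given, what fraction were correct?
    precision = right / len(answered) if answered else 1.0  # silence is never wrong
    # Recall: of all the questions asked, what fraction got a correct answer?
    recall = right / len(QUESTIONS)
    return precision, recall

for t in (0.0, 0.5, 0.9):
    p, r = precision_recall(t)
    print(f"threshold {t:.1f}: precision {p:.2f}, recall {r:.2f}")
```

Raising the threshold is the silent-person-in-the-corner strategy: precision climbs towards 100% while recall falls towards zero. Lowering it is the know-it-all strategy, and the tuning job is finding the balance between the two.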

The wonder of ChatGPT (GPT-3.5) and the other models of its time was that they seemed to get recall high enough without dragging precision down too far.

That is, they gave enough good answers without giving too many bad, hallucinated ones; that balance had been the big problem for general AI and Machine Learning up to that point.

An interesting aspect of all this is how the Turing Test, as it is commonly discussed today, should now be viewed.

In this construction, someone talks to an AI or a person, and has to work out which it is.

What would the test do if the AI or person simply didn’t answer?

It is arguable that this is the sensible strategy for at least the AI to take, and possibly the person.

Certainly it is if the AI’s objective is not to be detected as the AI.

In Turing’s original Imitation Game construction it was different, and this was part of his genius.

He understood that all the participants in the Game needed to have objectives, so he made them play a game.

He then cast the question of Intelligence in terms of the statistical outcome of the game, no matter how good or bad the participants were.
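As a toy illustration of that statistical framing, here is a little Python simulation; the per-round probabilities are invented purely for the sake of the example:

```python
# Judge the machine by the statistics of many rounds of the Imitation Game,
# not by any single exchange. All probabilities here are invented.
import random

def fooled(p):
    """One round: does the interrogator wrongly take the machine for the human?"""
    return random.random() < p

def fooling_rate(p, rounds=100_000):
    return sum(fooled(p) for _ in range(rounds)) / rounds

# Hypothetical per-round chances of fooling the interrogator:
strategies = {
    "stays silent": 0.02,                # silence gives the game away almost every round
    "answers, sometimes wrongly": 0.45,  # close to what a human pair might produce
}
for name, p in strategies.items():
    print(f"A machine that {name} fools the interrogator in {fooling_rate(p):.1%} of rounds")
```

Under this framing, refusing to play is itself a statistically detectable strategy, which is exactly why Turing gave everyone an objective.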

A final comment:

It is a bit like road safety.

I can make the roads perfectly safe, without any deaths or injuries.

I would simply ban all traffic.

And that is actually the only way.

But of course the cost would be enormous: industry would grind to a halt, and people would starve.

OK, impose a 5mph speed limit? Society still wouldn’t function smoothly, and in fact there would still be the occasional death and injury.

So the question you have to ask about road accidents is “What is your road death target?” And it probably shouldn’t be zero.

Because the more you want to reduce deaths, the higher the costs to the society.

And, shockingly, if you push deaths below the target, it is possible that the costs to society of doing so, including deaths caused elsewhere, will be greater than you were prepared to incur.

And indeed, recently I have seen “Vision Zero” road-death targets from Leeds, Oxfordshire, Kent, and Essex, to name a few.


Still no Proper Icon Dictionary

We see more and more icons and symbols that are meant to be language-independent and obvious, and they just aren’t.

Back at the end of the last century, I decided that everyone liked using icons and symbols, but people had no idea what they meant. Just what does an upright triangle with a cross over it mean on a clothes tag?

So I registered icondictionary.{com,org} and set to work.

And failed 😀

But it was a good idea.

It was going to be crowd-sourced from around the world. A Spaniard travelling to Norway would be able to find out what Norwegian-specific road signs (such as the brilliant “Merging by the zipper method”) mean, in their own language; be it Castilian, Catalan or something else I don’t know about.

And, perhaps surprisingly, it would be of great help to people with visual impairments.

Because the user would be able to get descriptions too (the modern alt text is rarely enough, as you need a proper, detailed description) – it must be so frustrating to be reading a book and be told that “his nose looked like some road sign”, when you have never seen one clearly, or ever had it described to you.

In fact, what does a poison icon look like on a bottle? Sort of important to know and detect.

So I planned to be able to capture images and look them up. This was before smartphones, so cameras were a problem, but they were coming along, and once the data was there, it would all work. I already had a C-Pen that could scan lines of text, so it couldn’t be long. And then anyone could scan a bottle and find out whether that obscure symbol someone thought was obvious was in fact saying the contents would kill them.

There was even a bit of a business proposition here. The site would display decent-quality images, but behind them, wherever possible, there would be SVG versions that could be purchased if a designer wanted high quality. With payment going back to the crowd-person who created it. Oh, and I had moderators/editors taking responsibility for areas, such as flags or laundry symbols, and also for languages. With all the database permissions that entailed.

I had a student (Peter Dibdin) build a Java app that enabled me to hand-craft SVG documents to the highest quality, or even give them a perfect description, and keep them in collections. It would then allow export as JPEG at different resolutions, for different purposes. It still works nicely, by the way.

Given that SVG was only submitted to the W3C in 1998, and you needed an Adobe plugin to view it in a browser, you may get a sense of how ambitious this all was!

Even the language side of it was new. I wanted to do all the work to distinguish pt-BR from pt-PT and so on, but even RFC 1766, which tried to standardise it all, had only come out in 1995.

Clearly all this was hugely ambitious. Although the biggest problem was of course that I didn’t really have the skills 😀. And when, as the Semantic Web developed, I tried moving from a database to an RDF store (since what I really wanted to do was at a semantic level), that was clearly going to be the final nail in the coffin!

Also, Google had recently come along and was developing fast, so search seemed to be getting much easier, and surely these huge corporations could do it – it was only a matter of time.

Then, of course, Wikipedia came along, and it looked like that would make it all redundant – it was only a matter of time.

But NO, it hasn’t happened yet.

Google Lens tells me that the Norwegian zipper sign is “A Norwegian Road Sign” – woohoo! Yeah, I sort of knew that because it is on the side of the Norwegian road I am driving down.

LLMs & ChatGPT? – it's only a matter of time!

Icons and symbols are still everywhere, still meant to be language-independent and obvious, and they still aren’t. I still can’t point my camera at one and find out what it means. Even though my car shows me what speed limit sign I last passed!

I didn’t finally (almost) give up until a couple of years ago. It would all be so much easier now. But I have other things on the go that are being successful, and starting over in my 70s is probably not the best idea. I only let the domains go in 2022, though 😀. Possibly mainly because I can always find another one, in this modern world of not needing com or org 😉.


MacBook Pro neon light

My MacBook Pro has a neon light on the catch. It’s off when the machine is awake with the lid open, and on when the machine is awake with the lid closed (I think!). Some tosser at Apple obviously thought it was a good idea, and probably got a prize for it. But it goes against one of the simplest and perhaps most fundamental engineering design principles: if you want to indicate the state of a machine to a user, you do not use something that has a temporal element.

This is why the traffic lights in the US are badly designed, while the UK ones are not. In the US there is no red-and-amber phase, so when you are sitting at a red light you have no idea whether it is about to turn green; in the UK, red and amber together always mean it is about to turn green, and a lone amber always means it is about to turn red. One glance tells you the state. Simple innit?
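As a toy sketch (Python, nothing more than a lookup table), the UK cycle makes the point: every visible aspect is unique, so one glance determines the state and what comes next, with no need to watch over time:

```python
# The UK traffic-light cycle as a lookup table: each visible aspect is
# unique, so a single glance tells you exactly what is coming next.
UK_NEXT = {
    "green": "amber",        # amber alone always precedes red
    "amber": "red",
    "red": "red+amber",      # red+amber always precedes green
    "red+amber": "green",
}

aspect = "green"
for _ in range(4):
    print(f"{aspect:>10} -> {UK_NEXT[aspect]}")
    aspect = UK_NEXT[aspect]
```

The table is the whole point: state maps to display one-to-one, so display maps back to state one-to-one.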

So if I glance at my little neon light, can I tell if the machine is asleep? No. I need to stand and stare at it for a while, and probably wait for it to go through a couple of pulsing cycles before I am sure (and I needed to do this because Apple screwed up the firmware so the machine didn’t always sleep).

And apart from that, why do I need a bright pulsing light in my hotel room? If I wanted to stay awake I wouldn’t go to bed.


The person who invented the blue LED

And then everyone who decided to put them everywhere.
You try to look at your computer screen, but as soon as you switch it on, the LED switches from orange to blue and blinds you.
Then you look for a book, and your eye is blinded by the two LEDs the computer manufacturer has chosen to put on the front of his crappy box, to pretend it has some class.
At the very least, they could have recognised that blue appears far more luminous than the other colours, and toned it down to a reasonable level.
And then you go outside and some tosser has ten on his car.
