Study Finds That 52 Percent of ChatGPT Answers to Programming Questions Are Wrong

ekZepp@lemmy.world to Technology@lemmy.ml – 600 points –
futurism.com


The funny thing is, if you point out its mistakes, it often does better on subsequent attempts.

Or it gets stuck in an endless loop, alternating between two different but equally wrong solutions.

Me: This is my system, version x. I want to achieve this.

ChatGPT: Here's the solution.

Me: But this only works with version y of the given system, not x.

ChatGPT: Try this.

Me: This is using a method that has never existed in the framework.

ChatGPT:

  1. "Oh, I see the problem. In order to correct (what went wrong with the last implementation), we can (complete code re-implementation which also doesn't work)"
  2. Goto 1

I used to have this issue more often as well. I've had good results recently by **not** pointing out mistakes in replies, but instead going back to the message before GPT's response and adding "do not include y."

Agreed, I send my first prompt, review the output, smack my head “obviously it couldn’t read my mind on that missing requirement”, and go back and edit the first prompt as if I really was a competent and clear communicator all along.

It’s actually not a bad strategy, because it can make some apt assumptions about requirements that would have seemed pertinent to include. So instead of typing out every requirement you can think of, you speech-to-text* a half-assed prompt and then know exactly what to fix a few seconds later.

*[ad] free Ecco Dictate on iOS, TypingMind’s built-in dictation… anything using OpenAI Whisper, godly accuracy. btw TypingMind is great - stick in GPT-4o & Claude 3 Opus API keys and boom

Ha! That definitely happens sometimes, too.

But only sometimes. Not often enough that I don't still find it more useful than not.

While explaining BTRFS, I've seen ChatGPT contradict itself in the middle of a paragraph. When I call it out, it apologizes, then contradicts itself again with slightly different verbiage.