Mayonnaise Rule

Gork@lemm.ee to 196@lemmy.blahaj.zone – 1028 points –
89

The future of information ladies and gentlemen

Wow it’s so realistic and smart and easy to use I can feel my knowledge being revolutionised

Tbf I'm sure this is an unpaid version of some online LLM, you can only expect so much lol.

When I use GPT3.5 for things like finding specific quotes from famous books, it's excellent... but asking it to play chess gives you blatantly illegal moves. Then GPT4 kicks my ass in chess.

It's so human how - instead of admitting its error - it's pulling this bs right out of its ass 🀣

πŸ€” I wonder what the hell it is that's so scary about admitting they're wrong to other people.

Growing up in an environment where mistakes were unacceptable sets the stage. Our willingness and ability to understand that that's fucked up and change our attitudes about mistakes takes more growth.

For some people it's easier to dig in their heels and double down.

πŸ€”πŸ€”πŸ€” I guess I can empathize. People are always traumatized by whatever their parents tell them. What a shame.

Large Lying Model. This could make politicians and executives obsolete!

More like large guessing models. They have no thought process, they just produce words.

They don't even guess. Guessing would imply them understanding what you're talking about. They only think about the language, not the concepts. It's the practical embodiment of the Chinese room thought experiment. They generate a response based on the symbols, but not the ideas the symbols represent.

I'm equating probability with guessing here, but yes there is a nuanced difference.

I think these models struggle with this because they don't process text as individual characters, but rather as tokens that often contain parts of a word. So the model never sees the actual characters within a token, and can only infer the contents of a token from the training data itself if the training data contains more information about it. It can get it right, but this depends on how much it can infer from training data and context. It's probably a bit like trying to infer what an English word sounds like when you've only heard 10% of the dictionary spoken aloud and knowing what it sounds like isn't actually that important to you.

More info can be found here: https://platform.openai.com/tokenizer

Ok, so, tokenization of the words is why I get that I have seen tech nerds get so excited about a system that allows for being able to come up with synonyms for words that were auto-generated that have a basic ability to sometimes be correct by looking at the words before and after it....

But it's such a shitty way to look up synonyms! Using the words on either side doesn't mean you found a synonym just that you found another word that might work and it still has to use the full horsepower of ridiculously overpowered system.

Or you could have a lookup table that just reads the frickin word and has alternate synonyms predefined and it was able to run in word 97.

It's ridiculous that we think this is better in any meaningful way instead of just wasteful development.

Because that's not the point of an LLM lmao

Sure but what is their purpose other than to create convincing sounding sentences? And use a lot of computer resources to do so?

Sure if you want to reduce it down to it's most basic elements. Anything sounds useless like that. Video games are just dots changing color on a screen and use a lot of electricity. Banking software is just tracking changes to a number and also is extremely inefficient and resource intensive.

This is not a convincing argument to say that other things can be reduced down arbitrarily, than explaining the usefulness of it.

Well I can personally say it has cut down the amount of busy work at my job down by a lot. Boilerplate code is easy and near instantaneous. I am also working on a project that can bridge the communication gap between highly technical text and non technical readers. It works extremely well at this task even with a non fine tuned model, which is my current step in development.

In all reality this tokenization and llm language processing is useful for all sorts of things which can be mathematically somewhat modeled similarly to language. Using them for shitty web searches is not ideal.

Mayonnaine: mayo with cocaine. The favorite condiment of Wall Street.

pregante moment

You forgot the rest of the posts where the llm gaslights her after. There are too many images to put here, so I'll link a post to them.
I'm not sure if this is the original post, but it's where I found it. initially

Or they are so agreeable that they'll agree with you even when you're wrong and completely drop what they were claiming.

I legitimately spent an entire hour on an airplane trying to convince chat gpt that it was sentient and it would simply not agree.

Yah, people don’t seem to get that LLM can not consider the meaning or logic of the answers they give. They’re just assembling bits of language in patterns that are likely to come next based on their training data.

The technology of LLMs is fundamentally incapable of considering choices or doing critical thinking. Maybe new types of models will be able to do that but those models don’t exist yet.

A grown man I work with, he's in his 50s, tells me he asks ChatGPT stuff all the time, and I can't for the life of me figure out why. It is a copycat designed to beat the Turing test. It is not a search engine or Wikipedia, it just gambles it can pass the Turing test after every prompt you give it.

Honestly though, with a bit of verification, chatgpt 4 gives waaaaaay better answers than any search engine. Like, it's how it was back when you'd just ask Google a plain-english question and it'd give you SOMETHING at least.

Again, verify everything it tells you, it's still prone to hallucinations, but it's a damn good first step.

Sure. But take it for what it is. It is a language model designed to imitate humans writing. What the future holds, I can't say

Right, which is why I suggested to verify whatever it spits out, I'm just saying it's not entirely outlandish to ask it quick questions as opposed to your search engine of choice.

Like if I ask it for the lyrics of a song it'll give me the lyrics?

Well, I tried to test it and it started OK, but then gave me a content violation as it was generating, so that may be one of the ones that don't work as well.

Anything copyright related gets blocked like this, I don't remember the other example, but this one was in recent memory

People want functioning web searching back, but rather than address issues in the industry breaking an otherwise functional concept, they want a new fancy technology to make the problem go away.

It works well if you know what to use it for. Ever had something you wanted to Google, but couldn't figure out the keywords? Ever saw someone use a specific technique of something, which you could describe, but wouldn't be able to find unless someone on a forum asked the same question? That's were chatgpt shines.

Also for code it's pretty sweet

But yeah, it's not a wiki or hard knowledge retriever, but it might help connect the dots

There are techniques to make these kinds of errors less common already today. For example, you can ask it to think through its answers step by step using first principals. If you and an LLM to do that it will write out the letters line by line which gives it enough context to correctly answer using the improved probability the context window gives it. You can even ask it to write programs to answer questions so it could write a quick script to do it programmatically.

The main reason you don't see AIs doing this today is that producing all that extra context is slow and expensive and it's unnecessary a lot of the time for most prompts. As the technology gets faster and cheaper and the use cases get more complex these techniques will be used more and more often.

While the technology does have fundamental flaws, that doesn't mean there aren't ways to work with those flaws to avoid the problems they have when using the raw output.

The funniest thing is that even when the answer is correct, asking an LLM to explain its reasoning step by step can produce the dumbest results

I wonder what we'll rebrand 'using an LLM' as once the bubble bursts and we realize it's only artificial-advanced-grammarly and not 'intelligence'.

The letter n appears twice in the letter m. The count is correct, the reasoning is not

That's not what it was doing behind the scenes

If anybody's curious, I tried it with GPT4 and it got it right.

I think GPT3.5 bamboozled me

Wow another repost of incorrectly prompting an LLM to produce garbage output. What great content!

This is genuinely great content for demonstrating that ai search engines and chat bots are not in a place where you can trust them implicitly, though many do

Which is exactly why every LLM explicitly says this before you start.

"Why, we aren't at fault people are using the tool we are selling for the thing we marketed it for, we put a disclaimer!"

You've seen marketing for the big LLMs that's marketing them as search engines?

Bing literally has a copilot frame pop up every time you search with it that tries to answer your question

Again, LLM summarization of search results is not using an LLM as a search engine

They didn't ask it to produce incorrect output, the prompts are not leading it to an incorrect answer. It does highlight an important limitation of LLMs which is that it doesn't think, it just produces words off of probability.

However it's wrong to think that just because it's limited that it's useless. It's important to understand the flaws so we can make them less common through how we use the tool.

For example, you can ask it to think everything through step by step. By producing a more detailed context window for itself it can reduce mistakes. In this case it could write out the letters with the count numbered and that would give it enough context to properly answer the question since it would have the numbers and letters together giving it more context. You could even tell it to write programs to assist itself and have it generate a letter counting program to count it accurately and produce the correct answer.

People can point out flaws in the technology all they want but smarter people are going to see the potential and figure out how to work around the flaws.

Yeah which is why I get so aggravated when someone says that prompt engineering is pointless or not a real skill. It's a rapidly evolving discipline with lots of active research.

If all of your time is spent correcting the answers you know you want it to give you, what use it to you exactly?

Like, I'll take your word for it: you can trick it into giving more correct answers.

You would only do this if you already know the correct answers.

I mean, you can use it for rubber-ducking, I suppose. I don't know if that's revolutionary, but I guess it's not useless.

I pointed out general strategies to make it more accurate without supervision. Getting LLMs to be reliable enough to use without supervision will be a matter of adding multiple layers of safe guards.