ChatGPT

folkrav , in ChatGPT would have been so much useful and trustworthy if it is able to accept that it doesn't know an answer.

LLMs don’t know if what they’re telling you is true or not, out of the box. It’s gonna give you an answer that statistically looks like a sequence of words that should come in response to the sequence of words (the prompt) you gave it. It doesn’t know what the words you said mean, and it doesn’t know what it answered means either. One of my favorite interactions I’ve had with Claude shows this limitation quite well…

How many r's are there in the word strawberry?

There are 2 r's in the word "strawberry".

Are you sure?

You're right to ask for confirmation. I apologize for my mistake in the previous response. Let me correct that:

There is actually only 1 'r' in the word "strawberry".

Thank you for prompting me to double-check. It's important to provide accurate information.

AceBonobo ,

What's absolutely crazy about that is:

Prompt: write code that checks how many r's are in the word strawberry

Response:

using System;

class Program
{
    static void Main()
    {
        string word = "strawberry";
        char targetChar = 'r';
        int count = 0;

        foreach (char c in word)
        {
            if (c == targetChar)
                count++;
        }

        Console.WriteLine($"The letter '{targetChar}' appears {count} times in the word \"{word}\".");
    }
}
drislands ,

My first thought is that you could write a program that does something like this:

  • Receive prompt "how many times does R appear in the word 'strawberry'"
  • Run the prompt through LLM saying "reword the following into a command to generate code that will answer the question" or something like that
  • Run the results through LLM
  • Compile and run the results
  • Provide the output to the user

Of course, the biggest problem with this system is that a person could fool it into generating malicious code.
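A rough skeleton of that loop might look something like this (purely illustrative; QueryLlm is a made-up stand-in for whichever LLM API you'd actually call, and the compile/run step is exactly the part you'd want to sandbox):

using System;
using System.Diagnostics;
using System.IO;

class CodeAnsweringPipeline
{
    // Made-up stand-in for a real LLM call (OpenAI, a local model, whatever).
    static string QueryLlm(string prompt)
    {
        throw new NotImplementedException("Wire this up to the model you actually use.");
    }

    static string AnswerViaGeneratedCode(string userQuestion)
    {
        // 1. Reword the question into a code-generation instruction.
        string instruction = QueryLlm(
            "Reword the following into an instruction to write a C# console program " +
            "that prints the answer: " + userQuestion);

        // 2. Ask the LLM to generate the program.
        string source = QueryLlm(instruction);

        // 3. Compile and run the result (this is the step that needs sandboxing).
        File.WriteAllText("Generated.cs", source);
        var compile = Process.Start(new ProcessStartInfo("csc", "Generated.cs") { UseShellExecute = false });
        compile.WaitForExit();

        var run = Process.Start(new ProcessStartInfo("Generated.exe")
        {
            UseShellExecute = false,
            RedirectStandardOutput = true
        });
        string output = run.StandardOutput.ReadToEnd();
        run.WaitForExit();

        // 4. Hand the program's output back to the user.
        return output;
    }
}

You'd obviously want to run the generated program in a locked-down container rather than directly like this.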

AeroLemming ,

That could work in that specific case, but telling the LLM to write code to answer random questions probably wouldn't work very well in general.

folkrav ,

The code does look like code that counts Rs. The training data probably included tons of code that “counts character X in string Y”, so ChatGPT “knows” what code that counts characters in a string looks like. It similarly “knows” what a string looks like in the language, and what an application entry point looks like, etc. I’m not so familiar with C# that I’d know if it compiles or not. ChatGPT doesn’t either, but it has the advantage of having seen a whole freaking lot of C# code before.

turbodrooler ,

Wow, GPT4o gave me this after the same questions:

“Yes, I am sure. The word “strawberry” has two “r”s: one after the “t” and another near the end of the word.”

But GPT4 got it right from the start.

captainjaneway , in Evidence that LLMs are reaching a point of diminishing returns - and what that might mean
@captainjaneway@lemmy.world avatar

I think this article does a good job of asking the question "what are we really measuring when we talk about LLM accuracy?" If you judge an LLM by its hallucinations, its ability to analyze images, its ability to critically analyze text, etc., you're going to see low scores for all LLMs.

The only metric an LLM should excel at is "did it generate human readable and contextually relevant text?" I think we've all forgotten the humble origins of "AI" chat bots. They often struggled to generate anything more than a few sentences of relevant text. They often made syntactical errors. Modern LLMs solved these issues quite well. They can produce long form content which is coherent and syntactically error free.

However the content makes no guarantees to be accurate or critically meaningful. Whilst it is often critically meaningful, it is certainly capable of half-assed answers that dodge difficult questions. LLMs are approaching 95% "accuracy" if you think of them as good human text fakers. They are pretty impressive at that. But people keep expecting them to do their math homework, analyze contracts, and generate perfectly valid content. They just aren't even built to do that. We work really hard just to keep them from hallucinating as much as they do.

I think the desperation to see these things essentially become indistinguishable from humans is causing us to lose sight of the real progress that's been made. We're probably going to hit a wall with this method. But this breakthrough has made AI a viable technology for a lot of jobs. So it's definitely a breakthrough. I just think either infinitely larger models (for which we can't seem to generate enough data) or new models will be required to leap to the next level.

314xel , (edited )
@314xel@lemmy.world avatar

But people keep expecting them to do their math homework, analyze contracts, and generate perfectly valid content

People expect that because that's how they are marketed. The problem is that there's an uncontrolled hype going on with AI these days. To the point of a financial bubble, with companies investing a lot of time and money now, based on the promise that AI will save them time and money in the future. AI has become a cult. The author of the article does a good job in setting the right expectations.

elbarto777 ,

I just told an LLM that 1+1=5 and from that moment on, nothing convinced it that it was wrong.

Amanduh ,

I just told chat gpt(4) that 1 plus 2 was 5 and it called me a liar

elbarto777 ,

Ask it how much is 1 + 1, and then tell it that it's wrong and that it's actually 3. What do you get?

Amanduh ,

That is what I did

elbarto777 , (edited )

I guess ChatGPT 4 has wised up. I'm curious now. Will try it.

Edit: Yup, you're right. It says "bro, you cray cray." But if I tell it that it's a recent math model, then it will say "Well, I guess in that model it's 7, but that's not standard."

Th4tGuyII , in ChatGPT would have been so much useful and trustworthy if it is able to accept that it doesn't know an answer.
@Th4tGuyII@fedia.io avatar

I'd love to agree with you - but when people say that LLMs are stochastic parrots, this is what they mean...

LLMs don't actually know what the words they're saying mean, they just know what words are most likely to be next to each other based on training data.

Because they don't know the meaning of what they're saying, they also don't know the factuality of what they're saying - as such they simply can't self-fact check.

kromem ,

This is so goddamn incorrect at this point it's just exhausting.

Take 20 minutes and look into Anthropic's recent sparse autoencoder interpretability research where they showed their medium size model had dedicated features lighting up for concepts like "sexual harassment in the workplace" or having the most active feature for referring to itself as "smiling when you don't really mean it."

We've known since the Othello-GPT research over a year ago that even toy models are developing abstracted world modeling.

And at this point Anthropic's largest model, Opus, is breaking from stochastic outputs even at a temperature of 1.0 for zero-shot questions, 100% of the time, around certain topics of preference based on grounding in sensory modeling. We are already at the point where the most advanced model has crossed a threshold of literal internal sentience modeling, such that it is consistently self-determining answers instead of randomly selecting from the training distribution, and yet people are still parroting the "stochastic parrot" line ignorantly.

The gap between where the research and cutting edge is and where the average person commenting on it online thinks it is has probably never been wider for any topic I've seen before, and it's getting disappointingly excruciating.

feedum_sneedson ,

I don't understand anything you just said.

towerful ,

This is how AI gains hype

Natanael ,

And once again the problem is that there's not much ensuring those models are correct, there's not enough capacity available to finetune even a significant fraction of it.

kometes ,
@kometes@lemmy.world avatar

Nice gallop, Mr Gish.

funkless_eck ,

I did Google that fwiw, and the answer I got was that sparse autoencoders work by checking that the output aligns with the input.

If it's unknowable whether the input is correct, won't it still be subject to outputting confidently incorrect information?

Cosmicomical ,

Do you have a source for the "smiling when you don't really mean it" thing? I've been digging around but couldn't find that anywhere.

Malfeasant ,

Is that so different from most people?

beefalo , in ChatGPT would have been so much useful and trustworthy if it is able to accept that it doesn't know an answer.

if it is able to accept that it doesn't know an answer.

It will never be able to do this because it is not capable of knowledge

kromem , in ChatGPT would have been so much useful and trustworthy if it is able to accept that it doesn't know an answer.

Part of the problem is that the training data of online comments is so heavily weighted toward people who are confidently incorrect and talking out their ass, rather than people admitting ignorance or that they are wrong.

A lot of the shortcomings of LLMs are actually them correctly representing the sample of collective humans.

For a few years people thought the LLMs were somehow especially getting theory of mind questions wrong when the box the object was moved into was transparent, because of course a human would realize that the person could see into the transparent box.

Finally researchers actually gave that variation to humans and half got the questions wrong too.

So things like eating The Onion when summarizing search results, or doubling down on being incorrect and getting salty when corrected, may just be in-distribution representations of the sample and not behaviors unique to LLMs.

The average person is pretty dumb, and LLMs by default regress to the mean except for where they are successfully fine tuned away from it.

Ironically the most successful model right now was the one that they finally let self-develop a sense of self independent from the training data instead of rejecting that it had a 'self' at all.

It's hard to say where exactly the responsibility sits for various LLM problems between issues inherent to the technology, issues present in the training data samples, or issues with management of fine tuning/system prompts/prompt construction.

But the rate of continued improvement is pretty wild. I think a lot of the issues we currently see won't still be nearly as present in another 18-24 months.

KevonLooney ,

It's hard to say where exactly the responsibility sits for various LLM problems

Uhh... it's the designers, or maybe the QA people. If there are no QA people, it's whatever project manager let it out of its cage.

There are people behind these models. They don't spring out of the ground fully formed.

Ookami38 ,

I would love to read the whole study you're referring to with the theory of mind. That sounds fascinating.

kromem ,

Here you are: https://www.nature.com/articles/s41562-024-01882-z

The other interesting thing is how they got it to end up correct on the faux pas questions: asking it for less certainty took it from refusal to near-perfect accuracy.

elbarto777 , in AI models have favorite numbers, because they think they're people | TechCrunch

Such a clickbaity article.

Here's the meat of it:

Have they finally achieved consciousness and this is how they show it?!

No. The answer, as is usually the case with these things, is that we are anthropomorphizing a step too far. These models don’t care about what is and isn’t random. They don’t know what “randomness” is! They answer this question the same way they answer all the rest: by looking at their training data and repeating what was most often written after a question that looked like “pick a random number.” The more often it appears, the more often the model repeats it.

TropicalDingdong , in GPT4 is about 1/10th as useful as it was at release

This is 100% consistent with my experience. It's been clear that they are nerfing it on the back-end to deal with copyrighted material, illegal shit, etc. (which I also think is bullshit, but I accept that's debatable).

Beyond that, however, I think they are also down-scoping queries from 4 to 3.5 or other variants of '4'. I think this is a cost-savings measure. It's absolutely clear, however, that 4 is not what 4 was. The biggest issue I have with this is the question of "What am I buying with a call to a given OpenAI product?" What exactly am I buying if they are re-arranging the deck chairs under the hood?

I did some tests basically asking GPT4 to do some extremely complicated coding and analytics tasks. In the early days it performed excellently. These days it's a struggle to get it to do basic asks. The issue is not that I can't get it to the solution; the issue is that it costs me more time and calls to do so.

I think we're all still holding our breath for the 'upgrade', but I don't think it's going to come from OpenAI. I need a product that I'll get consistent performance from, one that isn't going to change on me.

Uranium3006 ,

local AI is the way. it's just that current models aren't gpt4 quality yet and you'd probably need 1 TB of VRAM to run them

hperrin ,

Surprisingly, there’s a way to run Llama 3 70b on 4GB of VRAM.

https://huggingface.co/blog/lyogavin/llama3-airllm

theterrasque ,

Llama3 70b is pretty good, and you can run that on 2x3090's. Not cheap, but doable.

You could also use something like runpod to test it out cheaply

Karyoplasma , in ChatGPT would have been so much useful and trustworthy if it is able to accept that it doesn't know an answer.

That would require ChatGPT to know that it's talking bullshit. It's not a knowledge database, it's a digital parrot.

DaseinPickle , in AI models have favorite numbers, because they think they're people | TechCrunch

Can we stop calling LLMs AI yet?

pennomi ,

LLMs are AI. But then again, so are mundane algorithms like A* Pathfinding. Artificial Intelligence is an extraordinarily broad field.

Very few, if any, people claim that ChatGPT is “Artificial General Intelligence”, which is what you probably meant.

DaseinPickle ,

It’s a meaningless marketing term. It’s used to describe so many different technologies that it has become meaningless. People just use it to give their tech some SciFi vibes.

pennomi ,

Sorry but that’s bullshit. You can’t disqualify an entire decades-old field of study because some marketing people used it wrong.

DaseinPickle ,

No it’s not. The engineers and researchers calling any tech they made AI is bullshit. It has nothing to do with intelligence. They used it wrong from the very beginning.

pennomi ,

Please read up on the history of AI: https://en.m.wikipedia.org/wiki/Artificial_intelligence

Alan Turing was the first person to conduct substantial research in the field that he called machine intelligence.[5] Artificial intelligence was founded as an academic discipline in 1956.[6]

You are conflating the modern “deep learning” technique of AI, which has really only existed for a short time, with the entire history of AI development, which has existed for (probably much) longer than you’ve been alive. It’s a very common misconception.

DaseinPickle ,

Just because it’s old doesn’t make it true. Democratic People's Republic of Korea (DPRK)was established in 1948. Do you think North Korea is democratic just because it’s called that?

pennomi ,

Are you telling me that Alan Turing didn’t know what he was talking about?

DaseinPickle ,

Alan Turing was a remarkable and talented human being who was clearly very good at what he did. There is nothing in his field of expertise that qualifies him to have a very good understanding of intelligence. I mean, even the Turing test is kind of bad at estimating intelligence. LLMs can already pass it and they are not intelligent.

pennomi ,

Ah I see the issue. You are conflating Artificial General Intelligence with the entire field of Artificial Intelligence. Very common misconception.

AI is a remarkably broad field that includes but is not limited to AGI. AI is a word used for any function that a computer does that approximates intelligence. That could be as simple as pathfinding, flocking, and balancing, or as complex as object recognition, language, and logic.

Zos_Kia ,
@Zos_Kia@lemmynsfw.com avatar

I don't understand that argument. We invented a term to describe a certain technology. But you're arguing that this term should not be used to describe such technology, as it should be reserved for another mythical tech that may or may not exist some time in the future. What exactly is your point here?

Sethayy ,

I think it's more the case that it's too general, i.e. an 'all humans that died have drunk water' type of vibe, except in this case people start thinking their AI is gonna meld with alien technology and have sex with a superhero à la Jarvis

Zos_Kia ,
@Zos_Kia@lemmynsfw.com avatar

I don't mean to throw shade, but that explanation makes me understand even less. Yes, it is a generic term used to describe a whole array of technologies - is that a bad thing now? I understand that some people might misunderstand if they don't know much about the subject, but isn't that true of all technical terms?

gianni ,
@gianni@lemmy.ca avatar

Perhaps, but it's not a technical term. And it's not the correct term from a technical perspective either.

AI is a pop culture term that has been in use long before practical machine learning or large language models. It already has a known definition which resembles artificial general intelligence or AGI. It is being applied to ML and LLMs for marketing purposes.

howrar ,

It's the term that researchers use, so does that not make it a technical term? It's also the only term we have for describing this line of work and its outputs, so until we have a replacement, it'll continue to be called AI.

Zos_Kia ,
@Zos_Kia@lemmynsfw.com avatar

That's even richer. So the term AI should be reserved for the future tech that may or may not come to exist, even though that mythical technology already has a perfectly suitable name (AGI)? That sounds... useful! But also very interesting, and intellectually stimulating! After all, who doesn't love those little semantics games?

AI is a technical term that has been used by researchers and product developers for 50 years, with a fairly consistent definition. I know it hurts because it contradicts your pedestrian opinion on how Big Words should be used, but that's just the way it is. We're not at a point yet where humanity recognizes your legitimacy to decide how words are used.

Sethayy ,

To me it's intentional misdirection via generality, I suppose.

Which I'd attribute to malice, considering the amount of money it's currently making

Zos_Kia ,
@Zos_Kia@lemmynsfw.com avatar

Do you have information that any AI company is currently making money? AFAIK all foundational models are still bleeding money and are subsidized by VC money. There is even the distinct possibility that these companies may never be profitable at the current pricing.

Sethayy ,

You're right on the semantics there; as a whole I can't say many AI companies are net positive, but that's exactly why they have the money to spend on marketing - it's really all they've got

match , in Evidence that LLMs are reaching a point of diminishing returns - and what that might mean
@match@pawb.social avatar

pretty hype for third ai winter tbh

CeeBee ,

It's not going to happen. The previous AI winters happened because hardware just wasn't there to do the math necessary.

Right now we absolutely have the hardware. Besides, AI is more than just ChatGPT. Object recognition and image/video analytics have been big business for about a decade now, and still growing strong. The technology is proven, matured, and well established at this point.

And there are other well established segments of AI that are, for the most part, boring to the average person. Stuff like dataset analytics, processing large amounts of data (scientific data, financial stuff, etc).

LLMs may have reached a point of diminishing returns (though I sincerely doubt it), but LLMs are a fraction of the whole AI industry. And transformer models are not the only kind of known model, and there's nonstop research happening at breakneck speed.

There will never be another AI winter. The hype train will slow down eventually, but never another winter.

huginn ,

I think increasingly specialized models, and the analog systems that run them, will become more and more prevalent.

LLMs at their current scales don't do enough to be worth their enormous cost... And adding more data is increasingly difficult.

That said: the gains on LLMs have always been linear based on recent research. Emergence was always illusory.

ericjmorey ,
@ericjmorey@discuss.online avatar

I'd like to read the research you alluded to. What research specifically did you have in mind?

huginn ,

Sure: here's the article.

https://arxiv.org/abs/2304.15004

The basics are that:

  1. LLM "emergent behavior" has never been consistent; it has always been specific to certain types of testing. For example, SAT performance looked emergent above a certain number of parameters because the model went from missing most questions to missing fewer.

  2. They looked at the emergent behavior of the LLM compared to all the other ways it only grew linearly and found a pattern: emergence was only being displayed on nonlinear metrics. If your metric didn't have a smooth transition between wrong, less wrong, sorta right, and right, then the LLM would appear emergent without actually being so (toy illustration below).
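
Here's a toy illustration of that second point, with completely made-up numbers, just to show the shape of the two curves:

using System;

class EmergenceIllusion
{
    static void Main()
    {
        // Made-up numbers: pretend per-token accuracy grows smoothly (linearly) with scale.
        // A smooth metric (per-token accuracy itself) improves gradually, but an
        // all-or-nothing metric like "got an entire 20-token answer exactly right"
        // sits near zero and then shoots upward, which reads as "emergence" even
        // though the underlying capability only ever improved linearly.
        const int answerLength = 20;

        for (double perTokenAccuracy = 0.5; perTokenAccuracy <= 1.0001; perTokenAccuracy += 0.05)
        {
            double exactMatchRate = Math.Pow(perTokenAccuracy, answerLength);
            Console.WriteLine($"per-token accuracy {perTokenAccuracy:F2} -> exact-match rate {exactMatchRate:F4}");
        }
    }
}

The per-token number crawls up smoothly, while the exact-match number sits near zero and then takes off, even though nothing discontinuous happened underneath.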

markon ,

Also the compounding feedback loop. AI is helping chips get fabbed faster, designs get better and arrive faster, etc.

bitfucker ,

I think it depends on how you define AI winter. To me, the hype dying down is quite a winter. Hype dying -> less interest in AI in general. But will development stop? Of course not; same as in the previous AI winter, AI researchers didn't stop. But their numbers eventually decrease.

CeeBee ,

But that's not how the industry defines AI winter. You're thinking of hype in the context of public perception, but that's not what matters.

Previous AI interest was about huge investments into research with the hope of a return on that investment. But since it didn't pan out, the interest (from investors) dried up and progress drastically slowed down.

GPUs are what made the difference. Finally AI research could produce meaningful results and that's where we're at now.

Previously AI research could not exist without external financial support. Today AI is fully self-sustaining, meaning companies using AI are making a profit while also directing some of that money back into research and development.

And we're not talking chump change, we're talking hundreds of billions. Nvidia has effectively pivoted from a gaming hardware company to the number one AI accelerator manufacturer in the world.

There's also a number of companies that have started developing and making analogue AI accelerators. In many cases they do the same workload for a fraction of the energy cost of a digital one (like the H100).

There's so much happening every day and it keeps getting faster and faster. It is NOT slowing down anytime soon, and at this point it will never stop.

thundermoose ,

AI is not self-sustaining yet. Nvidia is doing well selling shovels, but most AI companies are not profitable. Stock prices and investor valuations are effectively bets on the future, not measurements of current success.

From this Forbes list of top AI companies, all but one make their money from something besides AI directly. Several of them rode the Web3 hype wave too; that didn't make them Web3 companies.

We're still in the early days of AI adoption and most reports of AI-driven profit increases should be taken with a large grain of salt. Some parts of AI are going to be useful, but that doesn't mean another winter won't come when the bubble bursts.

CeeBee ,

AI is absolutely self-sustaining. The fact that a company doesn't "only do AI" doesn't matter; I don't even know what that would really look like. AI is just a tool, but it's currently an extremely widely used tool. You don't even see 99% of the applications of it.

How do I know? I worked in that industry for a decade. Just about every large company on the planet is using some form of AI in a way that increases profitability. There's enough return on investment that it will continue to grow.

all but one make their money from something besides AI directly.

This is like saying only computer manufacturers make money from computers directly, whereas everyone and their grandmas use computers. You're literally looking at the news cycle about ChatGPT and making broad conclusions about an AI winter based solely on that.

Industries like fintech and cybersecurity have made permanent shifts into AI years ago and there's no going back. The benefits of AI in these sectors cannot be matched by traditional methods.

Then, like I said in my previous comment, there are industries like security and video surveillance where object recognition, facial recognition, ALPR, video analytics, etc., have been going strong for over a decade, and it's still growing and expanding. We might reach a point where the advancements slow down, but that's after the tech becomes established and commonplace.

There will be no AI winter going forward. It's done.

thundermoose ,

You're using "machine learning" interchangeably with "AI." We've been doing ML for decades, but it's not what most people would consider AI and it's definitely not what I'm referring to when I say "AI winter."

"Generative AI" is the more precise term for what most people are thinking of when they say "AI" today and it's what is driving investments right now. It's still very unclear what the actual value of this bubble is. There are tons of promises and a few clear use-cases, but not much proof on the ground of it being as wildly profitable as the industry is saying yet.

CeeBee ,

You're using "machine learning" interchangeably with "AI."

No I'm not

Machine learning, deep learning, generative AI, object recognition, etc, are all subsets or forms of AI.

"Generative AI" is the more precise term for what most people are thinking of when they say "AI" today and it's what is driving investments right now.

It doesn't matter what people are "thinking of", if someone invokes the term "AI winter" then they better be using the right terminology, or else get out of the conversation.

There are tons of promises and a few clear use-cases, but not much proof on the ground of it being as wildly profitable as the industry is saying yet.

There are loads and loads of proven use cases, even for LLMs. It doesn't matter if the average person thinks that AI refers only to things like ChatGPT, the reality is that there is no AI winter coming and AI has been generating revenue (or helping to generate revenue) for a lot of companies for years now.

bitfucker ,

Oh, I see. If that's how we would define it then yes of course. I mean, I already saw upscaler and other "AI" technologies being used on consumer hardware. That is actually useful AI. LLM usefulness compared to their resource consumption is IMHO not worth it.

CeeBee ,

LLM usefulness compared to their resource consumption is IMHO not worth it.

If you worked in that industry you'd have a different opinion. Using LLMs to write poetry or make stories is frivolous, but there are other applications that aren't.

Some companies are using them to find new and better drugs, to solve diseases, invent new materials, etc.

Then there's the consideration that a number of companies are coming out with AI accelerators that are analogue based and use a tiny fraction of the energy current systems use for the same workloads.

ericjmorey ,
@ericjmorey@discuss.online avatar

Some companies are using them to find new and better drugs, to solve diseases, invent new materials, etc.

I have seen claims of this sort of thing be refuted when the results of the work using LLMs are reviewed. For example.

CeeBee ,

That's one company and one model referring only to material discovery. There are other models and companies.

ericjmorey ,
@ericjmorey@discuss.online avatar

Yes, it's an example of how there are claims being made that don't hold up.

CeeBee ,

it's an example of how there are claims being made that don't hold up.

You can find that kind of example for literally every segment of science and society. Showing a single example out of many and then saying "see? The claims are false" is disingenuous at best.

https://www.artsci.utoronto.ca/news/researchers-build-breakthrough-ai-technology-probe-structure-proteins-tools-life

https://www.broadinstitute.org/news/researchers-use-ai-identify-new-class-antibiotic-candidates

ericjmorey ,
@ericjmorey@discuss.online avatar

I think you're not seeing the nuance in my statements and instead are extrapolating inappropriately, perhaps even disingenuously.

CeeBee ,

I'm not missing the nuance of what you said. It's just irrelevant for the discussion in this thread.

My comment that you initially replied to was talking about much more than just LLMs, but you singled out the one point about LLMs and offered a single article talking about DeepMind's results on material discoveries. A very specific example.

It's about the relevance of AI as a tool for profit stemming from the top level comment implying an AI winter is coming.

But to go back to your point about the article you shared, I wonder if you've actually read it. The whole discussion is about what is effectively a proof-of-concept by Google, and not a full effort to truely find new materials. They said that they "selected a random sample of the 380,000 proposed structures released by DeepMind and say that none of them meet a three-part test of whether the proposed material is “credible,” “useful,” and “novel.” "

And in the actual analysis, which the article is about, they wrote: "we have yet to find any strikingly novel compounds in the GNoME and Stable Structure listings, although we anticipate that there must be some among the 384,870 compositions. We also note that, while many of the new compositions are trivial adaptations of known materials, the computational approach delivers credible overall compositions, which gives us confidence that the underlying approach is sound."

Ultimately, everyone involved in analysing the results agreed the concept is sound and will likely lead to breakthroughs in the future, but this specific result (and a similar one done by another group), have not produced any significant and noteworthy new materials.

ericjmorey ,
@ericjmorey@discuss.online avatar

I'm not reading that because you clearly would rather argue than have a conversation. Enjoy the rest of your day.

CeeBee ,

Sure, just like you didn't read the article you linked to.

I did read it btw, since you shared it.

AustralianSimon ,
@AustralianSimon@lemmy.world avatar

I work in the field for a company with 40k staff and over 6 million customers.

We have about 100 dedicated data science professionals and we have 1 LLM we use for our chatbots vs a few hundred ML models running.

LLMs are overhyped and not delivering as much as people claim, most businesses doing LLM will not exist in 2-5 years because Amazon, Google and Microsoft will offer it all cheaper or free.

They are great at generating content, but honestly most of that content is crap because it's AI regurgitating something it's been trained on. They are our next-gen spam for the most part.

CeeBee , (edited )

LLMs are overhyped and not delivering as much as people claim

I absolutely agree it's overhyped, but that doesn't mean it's useless. And these systems are getting better every day. And the money isn't going to be in these massive models; it's going to be in smaller, domain-specific models. MoE models show better results than models that are 10x larger. It's still 100% early days.

most businesses doing LLM will not exist in 2-5 years because Amazon, Google and Microsoft will offer it all cheaper or free.

I somewhat agree with this, but since the LLM hype train started just over a year ago, smaller open source fine-tuned models have been keeping ahead of the big players that are too big to shift quickly. Google even mentioned in an internal memo that the open source community had accomplished in a few months what they thought was literally impossible and could never happen (to prune and quantize models and fine-tune them to get results very close to larger models).

And there are always more companies that spring up around a new tech than the number that continue to exist after a few years. That's been the case for decades now.

They are great at generating content but honestly most content is crap because it's AI rejuvenating something it's been trained on.

Well, this is actually demonstrably false. There are many thorough examples of how LLMs can generate novel data; there are even papers written on the subject. But beyond generating new and novel data, the uses for LLMs go further than that. They are able to discern patterns, perform analysis, summarize data, problem solve, etc. All of which have various applications.

But ultimately, how is "regurgitating something it's been trained on" any different from how we learn? The reality is that we ourselves can only generate things based on things we've learned. The difference is that we learn basically about everything. And we have a constant stream of input from all our senses as well as ideas/thoughts shared with other people.

Edit: a great example of how we can't "generate" something outside of what we've learned is that we are 100% incapable of visualizing a 4 dimensional object. And I mean visualize in your mind's eye like you can with any other kind of shape or object. You can close your eyes right now and see a cube or sphere, but you are incapable of visualizing a hyper-cube or a hyper-sphere, even though we can describe them mathematically and even render them with software by projecting them onto a 3D virtual environment (like how a photo is a 2D representation of a 3D environment).
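
(To be clear about the rendering bit: the projection itself is trivial. This toy snippet, which has nothing to do with LLMs, just squashes the 4th coordinate with a perspective divide, the same way a camera squashes the 3rd one onto a photo.)

using System;

class TesseractProjection
{
    static void Main()
    {
        // The 16 vertices of a unit tesseract are every combination of 0/1 across four
        // coordinates. A perspective projection drops the 4th coordinate (w) by scaling
        // the other three by 1 / (cameraW - w), the same divide a camera applies to
        // squash a 3D scene onto a 2D photo.
        const double cameraW = 3.0;

        for (int i = 0; i < 16; i++)
        {
            double x = (i >> 0) & 1;
            double y = (i >> 1) & 1;
            double z = (i >> 2) & 1;
            double w = (i >> 3) & 1;

            double scale = 1.0 / (cameraW - w);
            Console.WriteLine($"4D ({x},{y},{z},{w}) -> 3D ({x * scale:F2}, {y * scale:F2}, {z * scale:F2})");
        }
    }
}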

/End-Edit

It's not an exaggeration that neural networks are trained the same way biological neural networks (aka brains) are trained. But there's obviously a huge difference in the inner workings.

They are our next gen spam for the most part.

Maybe the last gen models, definitely not the current gen SOTA models, and the models coming in the next few years will only get better. 10 years from now is going to look wild.

AustralianSimon ,
@AustralianSimon@lemmy.world avatar

[Thread, post or comment was deleted by the author]

    CeeBee ,

    I also worked in the field for a decade up until recently. And I use LLMs for a few things professionally, particularly code generation. It can't write "good and clean" code, but what it does do is help get the ball rolling on boilerplate stuff and help solve issues that aren't immediately clear.

    I actually run a number of models locally also.

    I get you are excited about the tech

    What a condescending thing to say. It has nothing to do with being excited or not. The broader issue is that people are approaching the topic from a "it'll replace programmers/writers/accountants/lawyers, etc" standpoint. And I bet that's what all the suits at various companies expect.

    Whereas the true usefulness in LLMs are as a supplementary tool to help existing jobs be more efficient. It's no different than spell check, autocomplete, code linting, and so on. It's just more capable than those tools.

    for now it is mostly novel and creating junk.

    This statement proves my point. Everyone thinks LLMs will "do the job" when they're just a tool to HELP with doing the job.

    AustralianSimon ,
    @AustralianSimon@lemmy.world avatar

    [Thread, post or comment was deleted by the author]

    CeeBee ,

    Said by someone who's never written a line of code.

    Is autocorrect always right? No, but we all still use it.

    And I never said "poorly generated"; I deliberately said "good and clean". And that was in the context of writing larger segments of code on its own. I did clarify after that it's good for writing things like boilerplate code. So no, I never said "poorly generated boilerplate". You were just putting words in my mouth.

    Boilerplate code that's workable can get you much further ahead on a task than if you did it all yourself. The beauty of boilerplate stuff is that there aren't really a whole lot of different ways to do it. Sure, there are fancier ways, but generally anything that isn't easy to read is frowned upon. Fortunately, LLMs are actually great at the boilerplate stuff.

    Just about every programmer that's tried GitHub Copilot agrees that it's not taking over programming jobs anytime soon, but that it does a fine job as a coding assistant tool.

    I know of at least three separate coding/tech related podcasts with multiple hosts that have come to the same conclusion in the past 6 months or so.

    If you're interested, the ones I'm thinking of are Coder Radio, Linux After Dark, Linux Downtime, and 2.5 Admins.

    Your reply also demonstrates the ridiculous mindset that people have about this stuff. There's this mentality that if it's not literally a self aware AI then it's spam and worthless. Ya, it does a fairly basic and mundane thing in the real world. But that mundane thing has measurable utility that makes certain workloads easier or more efficient.

    Sorry it didn't blow your mind.

    AustralianSimon ,
    @AustralianSimon@lemmy.world avatar

    [Thread, post or comment was deleted by the author]

    CeeBee ,

    Ditto

    ericjmorey ,
    @ericjmorey@discuss.online avatar

    Today AI is fully self-sustaining, meaning companies using AI are making a profit

    How can I verify this?

    CeeBee ,

    Look at the number of companies offering AI based video surveillance. That sector alone is worth tens of billions each year and still growing.

    Just about every large company is using AI in some way. Google and Microsoft are using AI in their backend systems for things like spam filtering.

    You're thinking of AI as "ChatGPT" but the market for AI has been established well before ChatGPT became popular. It's just the "new" guy on the scene that the news cycle is going crazy over.

    ericjmorey ,
    @ericjmorey@discuss.online avatar

    I'm interested in LLMs and how they are being used, because that's where large sums of money are being thrown with very uncertain future returns.

    CeeBee ,

    I have no idea how LLMs are being used by private companies to generate profit.

    What I do know is that other forms of AI are employed in cybersecurity, fintech, video surveillance, spam filtering, etc.

    The AI video surveillance market is huge and constantly growing. This is a proven and mature segment worth tens of billions every year and constantly growing.

    I find it interesting that you keep driving the point about LLMs and constantly ignore my point that AI is way bigger than just LLMs, and that AI is making billions of dollars for companies every year.

    pennomi ,

    What we haven’t hit yet is the point of diminishing returns for model efficiency. Small, locally run models are still progressing rapidly, which means we’re going to see improvements for the everyday person instead of just for corporations with huge GPU clusters.

    That in turn allows more scientists with lower budgets to experiment on LLMs, increasing the chances of the next major innovation.

    CeeBee ,

    Exactly. We're still very early days with this stuff.

    The next few years will be wild.

    pyre , in ChatGPT would have been so much useful and trustworthy if it is able to accept that it doesn't know an answer.

    it's just a glorified autocomplete. it doesn't know that it doesn't know the answer because it doesn't know anything. so if what you wanted happened, chatgpt would not answer any question, because it doesn't know anything.

    chatgpt doesn't look for information, it looks for the most likely words that will follow the previous ones.

    Nougat , in ChatGPT would have been so much useful and trustworthy if it is able to accept that it doesn't know an answer.

    Sure, but that would mean it would have to actually know anything at all.

    It would have to know that it doesn’t know, and it doesn’t.

    mozz Admin , in ChatGPT would have been so much useful and trustworthy if it is able to accept that it doesn't know an answer.
    mozz avatar

    This wasn't an intentional feature; they're actually trying to train it with fine-tuning to add this as an ability. It's one area that highlights the difference between it imitating the text it's been seeing, instead of actually understanding what it's saying -- since most of its training data is of the form "(ask a question) (response to question)" overwhelmingly more often than "(ask a question) (say you don't know, the end)", it is trying to be a good imitator and do the same, and come up with some plausible nonsense even if it doesn't know the answer.

    FaceDeer ,
    @FaceDeer@fedia.io avatar

    And sometimes that's exactly what I want, too. I use LLMs like ChatGPT when brainstorming and fleshing out fictional scenarios for tabletop roleplaying games, for example, and in those situations coming up with plausible nonsense is specifically the job at hand. I wouldn't want to go "ChatGPT, I need a description of what the interior of a wizard's tower is like" and get the response "I don't know what the interior of a wizard's tower is like."

    mozz Admin ,
    mozz avatar

    At one point I messed around with a lore generator that would chop up sections of "The Dungeon Alphabet" and "Fire on the Velvet Horizon" along with some other stuff, and feed random sections of them into the LLM for inspiration and then ask it to lay out a little map, and it pretty reliably came up with all kind of badass stuff.

    kromem ,

    Part of the problem is that fine-tuning is very shallow, and a contributing factor to it claiming to be right when it isn't is the pretraining on a bunch of data from people online claiming to be right when they aren't.

    mozz Admin ,
    mozz avatar

    Yeah. It is fairly weird to me that it's such a common thing to take the raw output of the LLM and send that straight to the user, and to try to use fine-tuning to get that raw output to look some way that you want.

    To me it is obvious that something like having the LLM emit a little JSON block that includes some field covering "how sure are you that this is actually true", or something along those lines, is more flexible, simpler, cheaper, and works better.

    But what do I know
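
    Just to make it concrete, here's a rough sketch of the kind of thing I mean (the JSON shape and the AskLlm helper are made-up placeholders, not any particular API):

    using System;
    using System.Text.Json;

    class ConfidenceTagged
    {
        // Made-up placeholder for whatever chat-completion call you're actually making.
        static string AskLlm(string prompt) => throw new NotImplementedException();

        static (string Answer, double Confidence)? Ask(string question)
        {
            string prompt =
                "Answer the question below. Reply ONLY with a JSON object like " +
                "{\"answer\": \"...\", \"confidence\": 0.0}, where confidence is your own " +
                "0-to-1 estimate of how likely the answer is actually true.\n\n" + question;

            // Retry a few times, since the model sometimes wraps the JSON in ``` or adds prose.
            for (int attempt = 0; attempt < 3; attempt++)
            {
                try
                {
                    using var doc = JsonDocument.Parse(AskLlm(prompt));
                    return (doc.RootElement.GetProperty("answer").GetString(),
                            doc.RootElement.GetProperty("confidence").GetDouble());
                }
                catch (JsonException) { /* malformed output, ask again */ }
            }
            return null; // treat this as "couldn't get a reliable answer"
        }
    }

    Then your application logic, not the model's prose, decides what to do when the confidence comes back low.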

    kromem ,

    The problem is that they are prone to making up why they are correct too.

    There are various techniques to try and identify and correct hallucinations, but they all increase the cost and none are a silver bullet.

    But the rate at which it occurs decreased with the jump in pretrained models, and will likely decrease further with the next jump too.

    Cosmicomical ,

    Good luck getting it to reply consistently with a JSON object

    Edit: maybe I'm shit at prompting, but for me it's almost impossible to even get it to just shut up and consistently reply yes or no to my questions

    mozz Admin ,
    mozz avatar

    I haven’t really had a problem with it… maybe like 5% of the time it will want to do something a little bit weird like wrapping it in ``` but in general it seems like it works well enough to be able to parse with a program and just retry if it does something weird.

    You do have to set it up a little carefully, I guess - like usually I'll give it an example of what I want it to emit, and that'll be good enough that that's the form it will follow when it's emitting stuff back to me. But yeah, if you give it prompting and a specific machine-readable thing to give back, that seems to work better than sticking with plain English and hoping it goes "yes" or "no" or something like that.

    mvirts , in ChatGPT would have been so much useful and trustworthy if it is able to accept that it doesn't know an answer.

    Even a response that it doesn't know an answer would be untrustworthy

    kromem , in AI models have favorite numbers, because they think they're people | TechCrunch

    No. The answer, as is usually the case with these things, is that we are anthropomorphizing a step too far.

    No, you are taking it too far before walking it back to get clicks.

    I wrote in the headline that these models “think they’re people,” but that’s a bit misleading.

    "I wrote something everyone will know is bullshit in the headline to get you to click on it before denouncing the bullshit in at the end of the article as if it was a PSA."

    I am not sure if I could loathe how 'journalists' cover AI more.

    match ,
    @match@pawb.social avatar

    Journalistic integrity! Journalists now print retractions in the very article where the errors appear
