Google’s new search feature, AI Overviews, seems to be going awry.
The tool, which gives AI-generated summaries of search results, appeared to instruct a user to put glue on pizza when they searched “cheese not sticking to pizza.”
Jyoti Mann at Business Insider
Google’s “artificial intelligence” is literally just parroting a joke Reddit comment from 11 years ago by a person named fucksmith. Google is paying Reddit 60 million dollars for this privilege.
“AI” is going just great.
You may like it or not, but AI is here to stay. Of course it will give you some absurd replies from time to time, but it will continue to improve.
Anyone who has spent appreciable time working with LLMs and other AI variants will know that training any kind of competency and reasonable responses out of these models is taxing and requires heavy lifting. If you expect “intelligence”, they don’t have any, and if I were to guess, that’s probably Thom’s biggest gripe.
We are beta testing algorithms that are hitting hurdles; to say they can’t do impressive things as language assistants, which is what they were designed for, would be silly. They can be incredibly useful for rewriting paragraphs and giving tips and hints on coding.
But the fancy demonstrations we are shown have taken a vast amount of time to custom-tailor so the technology looks better than it really is.
Don’t believe me? Give any kind of technical document to any LLM and get it to read it and then answer questions. It takes hours to coax reasonable responses out of it, and there is no comprehension of the material – just word salad or, worse, outright lies stitched into the responses.
jerkofalltrades,
I agree with you that LLMs can contain false information and that accurate training is key. Garbage In = Garbage Out. However, I also think we are making LLMs out to be worse than they are, because they are being compared to standards that even humans fail to achieve. When compared to average humans, though, I think these LLMs are exceedingly good at mimicking the source – both the good and the bad.
Watching Jordan Klepper’s clips going around the country interviewing real people highlights just how much humans are also affected by Garbage In = Garbage Out and are totally unable to self-correct.
https://www.youtube.com/watch?v=Whqws6PSgY0
Humans have a VERY different set of limitations. We are creative, and CAN actually get around that type of bad-input/bad-output cycle. But we don’t really understand our own meat brain’s algorithm any better than we understand things like LLMs.
Hint: humans don’t do reason or logic, and especially not empiricism. We do what could best be described as “moral social justification.” That is, we try to convince our peer groups of our righteousness, by telling moral (political) stories about why we are righteous (not right, not correct.)
It’s interesting that in the 17th century we were able to create a sort of in-group morality based on being objectively correct, and call that reason. But our brains stink at that kind of fact-based reasoning, while being very very good at moral reasoning – not “what is true” but what “should be right.” So do LLMs, FWIW. We are VERY good at convincing ourselves and each other of the righteousness of our positions – enough so that we will happily murder other people for having the gall to disagree with us…
CaptainN-,
I have to agree that, as sentient beings, we possess the ability to become more proficient than our education provides for. Our abilities allow us to go beyond AI into general intelligence, and this clearly sets us apart from LLMs.
However, most of us do not do this the vast majority of the time. Our brains suck up huge amounts of information from the world and register it without any sort of formal verification. We have neither the time nor the expertise to verify all the information in our heads. And in this sense I think our “intuition” behaves very much like an LLM – we rely heavily on information we were taught.
I agree, there is that as well. Righteousness and factual information are both things people feel strongly about. Ideally factual information should be self correcting as contradictions and inductive reasoning iron things out, but it seems many humans are deficient at this. Darwinian evolution doesn’t necessarily correct for this either. If we don’t get a handle on it, we could be on the path to Idiocracy.
Empirically, ALL humans are deficient at this – more specifically, it’s not what our brains do. We can set up moral frameworks that tell us it’s important (moral, righteous) to follow the evidence, but our brains categorically do not process information this way. That’s what the evidence is telling us. If our brains did process information that way, we would be able to accept that, and not say “I agree with your empirical position” or try to carve out elevated (righteous) social categories for those of us who adhere to an empirical framework better than others, and things like that. Humans don’t do reason in our brains. We can impose it on our social systems, but it’s not how our brains work.
I completely agree with you, and I think that the majority of wise people question the information they glean from other people. The accuracy of the information LLMs retrieve is great, but a lot of nuance is missed, or the model decides “this looks like X, so I’ll grab it” and mixes source A with source B to try to build a coherent answer.
It’s impressive to say the least that we have got to this point.
I was playing around with models to build a sort of help desk for my company, which sells a fairly complex hybrid power solution for well pads (methane/CO2 reduction). I provided technical documentation, which I have mostly written myself, as source material for the models to read through. Let’s just say my questions led to responses as bad as those of an untrained person skimming the information; a lot of coaching was necessary to get responses that were at least not completely misleading and unproductive. In the end the LLMs aren’t going to comprehend the material – I need to train a human for that, and that’s a lot of work too!
I guess to summarize, at this point in time I’d rather spend the effort on training a person who can actually comprehend topics, not just retrieve and spit out information. The problem is these people cost more and need to rest!
jerkofalltrades,
Everything you are saying makes sense to me. On this last point, a lot of us (still?) prefer to work with other humans. This can still play to our advantage, but over time I think AI’s scalability will win over employers.
Say a human costs $50k to train to an adequate level whereas the AI costs $1M (I’m using numbers picked at random for this example, chosen to lean in favor of humans). If you just have one position to fill, then training an AI to fill that spot makes absolutely no sense, because a $1M AI offers terrible value against a couple of $50k employees. However, if the AI could replace 100+ employees over time, the numbers start to change: $1M for the AI versus $5M to train the employees, and moreover those 100 employees will need time off and expect to be paid every year. By contrast, the AI’s operating costs actually decrease with time and the training data can be used over and over again.
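To put a number on that break-even point, here is a back-of-the-envelope sketch using the same made-up figures from above (nothing here is real cost data):

```python
# Back-of-the-envelope sketch of the break-even point described above.
# The figures are the comment's own made-up numbers, not real cost data.

HUMAN_TRAINING_COST = 50_000    # one-time cost to train one employee adequately
AI_TRAINING_COST = 1_000_000    # one-time cost to train the AI

def breakeven_headcount() -> int:
    """Smallest number of positions at which the one-time AI training cost
    undercuts training a separate human for each position."""
    n = 1
    while n * HUMAN_TRAINING_COST < AI_TRAINING_COST:
        n += 1
    return n

if __name__ == "__main__":
    print(breakeven_headcount())  # -> 20 with these figures
```

With these toy figures the one-time AI cost already wins at 20 positions, and recurring salaries and time off only tilt things further, which is the scalability argument in a nutshell.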
We may not be there yet but the enormous scalability benefits of AI are going to prove to be AI’s killer feature over hiring human employees in the future.
It’s amazing to me how many people are trying to build businesses on these 80% LLM products. The thing is, if you understand just a little bit about how they work, what a vector database is, etc., you know the limits pretty quickly. And the limits are hard – no, it’s not going to improve, not the way people think. There are other AI products coming down the pike, but the current crop of LLMs – while very useful for a certain kind of derivative art generation – are never going to replace meat-brained humans. But a lot of companies are betting their entire futures on it. It’s a great time to be in (human) business!
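For what it’s worth, the “vector database” part really is that simple: retrieval is just nearest-neighbour search over embeddings. A minimal sketch, with random vectors standing in for a real embedding model:

```python
# Minimal sketch of what a vector database lookup boils down to:
# embed documents, embed the query, return the nearest neighbours by cosine similarity.
# The "embeddings" here are random stand-ins for a real embedding model.
import numpy as np

rng = np.random.default_rng(0)
documents = ["doc about pizza", "doc about glue", "doc about cheese"]
doc_vectors = rng.normal(size=(len(documents), 384))  # pretend embeddings

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def search(query_vector, top_k=2):
    scores = [cosine_similarity(query_vector, v) for v in doc_vectors]
    return sorted(zip(scores, documents), reverse=True)[:top_k]

print(search(rng.normal(size=384)))
```

Everything the model “knows” about your documents reaches it through that similarity lookup, which is why the hard limits show up so quickly.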
The LLM limitation that strikes me most at the moment is their inability to assess their own limitations in an area they have been trained on. Compare that to a human employee you train: even while inexperienced and only partially trained, he can be useful serving your customers’ basic requests and forwarding the more complicated ones to you (still learning in the process). He will (usually) be able to assess whether he is capable of serving a particular request or not. This is something we definitely miss in current LLMs. They are far too self-confident.
Actually, I think that being aware of one’s own limitations might be at least as important as being aware of one’s own skills and abilities. Some people are better than others at this, but LLMs are just a disaster at the moment.
jurmcc,
Oddly, I’ve had the same complaint about certain Wikipedia editors, haha. I don’t think it would be that difficult, technically, to give LLMs a less confident tone. It could be one of several different adjustable persona qualities. Just for kicks, I took your paragraph and prompted chatgpt to “Rewrite this quotation in an extremely timid and non-confident manner”…
Wow that was so distracting! Here I asked it to make it sound Australian…
The point being, there is no doubt they can change the persona. However, I still argue that these types of changes are shallow and don’t really address the quality of the information the LLMs get trained on.
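For the curious, that persona trick is essentially just a system prompt. A rough sketch using the OpenAI Python client (the model name and wording are my own illustrative assumptions, not necessarily what the ChatGPT web UI uses):

```python
# Rough sketch of changing an LLM's "persona" via a system prompt.
# Assumes the OpenAI Python client; model name and prompt wording are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def rewrite_with_persona(text: str, persona: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"Rewrite the user's text {persona}."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

print(rewrite_with_persona("LLMs are too self-confident.",
                           "in an extremely timid and non-confident manner"))
```

Which is exactly why such changes are shallow: swapping the system prompt changes the tone of the output, not the training data behind it.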
Still, what I just did with chatgpt in under a second is incredible. An author might take a 1000-page manuscript and change character aspects as quickly as they can think them up, edited for errors and everything. The same is being done with art and 3D assets, and soon enough even movies – things that take a human hours/days/weeks are a few prompts away… Code isn’t there yet, but that’s coming too, and it will be perfected using the same self-learning techniques that have made computers better than humans at chess. When you have enough compute power, these incredible accomplishments for AI finally start being possible, and we’re traversing many of these thresholds now.
You’re asking the LLM to sound timid (making it more verbose as a side effect). What I’m missing is its ability to just ‘feel’ that it’s not competent in a particular area, or, even for knowledge it was trained on, to ‘feel’ some limits to the knowledge it possesses. Humans, even if they all differ in their confidence, usually at least internally assess their own limitations in a natural way.
The problem presented by jerkofalltrades means you cannot ‘deploy’ an LLM that is only partially trained and continue training it on the go. Even when you think it has reached enough competency, you still risk blunders like ‘add some glue to the cheese on pizza’. People can learn from mistakes; they usually try to avoid too much risk (when they reach the limits of their competency), try to get support from others, and limit themselves somehow. With LLMs you only get some decorations in the output they are producing.
I have just tried whether ChatGPT 4.0 (the corporate version my company threw at us) can find clearly self-contradictory statements in one SW requirement I’m trying to implement at the moment. The headline of the requirement contradicts one of the sub-conditions for a particular action. Of course humans created this mess (probably by accumulating sub-conditions over time and not touching the headline), but the problem is clearly visible, and a human who sees the contradiction will stop and start thinking and/or reasoning, as opposed to the LLM, which just applies statistics using data from training and produces some new output.
I even tried to pinpoint for the LLM what might be wrong and asked again, without any result that would be useful in such a simple case. Nice rewording, decorations, even ‘sorrys’, but no sign of simple logic when the input to the LLM prompt contains logical errors.
jurmcc,
Anyone who understands the limitations of the training data and garbage-in-garbage-out already knows this. It doesn’t need to be incorporated as a personality trait for us to know it. Still, maybe for the benefit of the naive public, expressing more self-doubt is useful? I don’t know; to me it doesn’t seem that beneficial.
That’s true today but more dynamic neural nets will come.
The glue answer is unexpected, and it deserves to be a meme for its humor and provocative nature. Still, it seems many people, including you, are eager to jump to the conclusion that it wouldn’t work – but why? That’s not objectively fair, and I think it deserves to be tested scientifically. If it actually works, then I’d say the answer was more clever than we’re giving it credit for.
Say we asked an alien from a civilization far more advanced than ours, but not accustomed to humans, for ideas on how to make pizza toppings stick. That alien might suggest non-toxic glue. The same idea wouldn’t be as ridiculous coming from aliens, because different ideas are expected from them, and it doesn’t make them stupid. In other words, all of the comments saying the glue idea is stupid are likely driven by human ego, justifying preconceived opinions of AI without being open to considering actual merit.
These biases become more evident when you take a Schrödinger’s box where we don’t know what’s actually inside (human, AI, alien, or celebrity) and proceed to ask questions. Now the people judging the answers don’t know what they are dealing with, and they don’t know whether to be prejudiced against the entity inside the box. Therefore a different approach is needed to judge merit. Some people’s prejudices may be so strong that they would feel compelled to ask probing questions to try to ascertain the entity’s identity, and then make judgements based on that identity rather than on merit. This is the kind of bias I’m seeing when it comes to AI.
jurmcc,
Today’s LLMs use static networks that don’t learn beyond their training. Knowing this sets the expectation for what LLMs are capable of and explains why you’re not able to train one by communicating with it after the fact. We might some day have NNs that can be continuously trained beyond the initial dataset through further interactions, which will be useful, but I still think the garbage-in-garbage-out principle puts a limit on what’s possible as long as the AI’s only interaction with the real world is via humans, because we are faulty sources.
More importantly, nobody has yet demonstrated that people are ready to pay the huge prices involved in LLMs for the privilege of getting these answers. If none of these “AI” companies start turning profits – not just revenue – in a reasonable timeframe, the whole exercise will have been completely moot.
I mean, it will continue to produce absurd replies; there is no doubt of that. It may improve in other ways, but that will always be there. Garbage (Reddit) in, garbage out. There is no escaping that with an LLM.
If courts start ruling that all this AI “training” is nothing more than theft, I can guarantee you it won’t be “here to stay”.
Everyone who’s investing billions in this technology is gambling that the courts are going to agree with an extremely unusual view of IP, one that the courts have never agreed with in any case before. And that some very powerful entities like Hollywood and the music recording industry and the book publishing industry are all just going to give up without a fight, which they’ve never done in any other case ever in the past.
andyprough,
Even if you want to assume that is how judges would rule, it wouldn’t really matter that much.
1) AI technology developers & providers won’t disappear, they’ll move to new jurisdictions.
2) The demand will still be there and there would be a booming black market that cannot be regulated.
3) When confronted with tough legal environments back home, corporations tend to side-step it by offshoring work to foreign jurisdictions where there’s no red tape. This makes a bad situation even worse for domestic employees who will be affected by both the AI and the offshoring combined.
4) Though nuanced, copyright infringement traditionally requires some kind of clear evidence that one work derives from another, but with AI-generated works that’s not usually the case. Generated works can be far enough removed from the original to be considered new works in their own right, and plaintiffs may have trouble proving any specific copyright infringement otherwise.
>”AI technology developers & providers won’t disappear, they’ll move to new jurisdictions.”
Ah, yes, the “we’ll just move to other jurisdictions” argument. AI is extremely expensive – which of these “other jurisdictions” are going to offer the kind of economics that makes it work financially?
Face it, unless you are allowed to operate in the most high profit target areas of the world, you have no chance of surviving in the AI business. It’s the same reason why CPU technology in certain countries is so advanced – it takes an enormous monetary investment to run these kinds of businesses.
andyprough,
Face it, the corporate money is going to reach AI service providers whether their data centers are in your backyard or not.
>”Face it, the corporate money is going to reach AI service providers whether their data centers are in your backyard or not.”
Doesn’t matter where you place the data centers if you aren’t legally allowed to sell the service in the high profit jurisdictions. Put them on the bottom of the ocean, who cares? If you can’t monetize the data then the data centers are worthless.
andyprough,
You may wish to ban AI for creating new expressions of existing ideas, but that’s actually changing longstanding copyright policy – totally inconsistent with the way copyright has historically been applied to humans. Courts have routinely allowed humans to create new expressions for existing works and ideas. LLMs are proving to be excellent at this. The literal words are protected, but not the ideas behind them.
>”Courts have routinely allowed humans to create new expressions for existing works and ideas. LLMs are proving to be excellent at this.”
Oh really? Is that why Sam Altman immediately dropped the “Sky” voice assistant from ChatGPT the moment Scarlett Johansson picked up the phone and called an attorney? That case didn’t go anywhere near a courtroom. Face it, the legal black clouds over this entire industry are very real, and very threatening.
Sure, and child labor laws are totally effective at preventing companies from exploiting child labor in other parts of the world. You can tell yourself that, but this is the real world, and for better or worse companies can and do find ways around it, especially on the internet.
Thom, I just can’t take you seriously when you put AI in quotes. It makes me think of people who still type “Micro$oft” or scream “Embrace, Extend, Extinguish” every time Microsoft does, well, anything.
Basically, they all suggest that the person writing out the message isn’t at all interested in having an informed opinion, and would rather cling to the first hot take they’ve ever had on the subject, without the capacity for adjusting their view based on new information.
I think it’s fair to put it in quotes when it’s being used in production and actively telling a user to eat glue. Not much room for varied opinions there.
While I do understand the point you’re trying to make here, it really doesn’t feed many people’s confidence that the best a multi-billion dollar company can put out right now is telling users to eat glue. There’s only so much incompetence that the average person is willing to put up with before losing faith in something, whether that be a technology or a company.
Nia,
I already posted an opinion in defense of the glue
I agree with Drumhellar: Thom’s “AI” doesn’t really belong in quotes. In computer science it’s entirely correct to use the term AI for things that have no generalized intelligence. There’s no debate in our field that pattern recognition is a form of AI. Furthermore, there’s no debate that LLMs are extremely adept at pattern recognition problems. Logically speaking, there should be no debate that LLMs are AI, without needing any quotes.
I believe Thom puts it in quotes because it lacks more generalized knowledge/self awareness/consciousness, however IMHO the word he should be using is AGI for things that are more advanced than AI.
https://en.wikipedia.org/wiki/Artificial_general_intelligence
AGI, or maybe machine intelligence. The A in AI stands for Artificial – and it describes the intelligence. When it comes to ChatGPT and similar LLMs, artificial intelligence is more accurate than the term is when used in pop sci-fi. (The robots in Westworld are actually intelligent, not artificially intelligent.) Just because intelligence is happening in a machine doesn’t make it artificial, and that kind of machine-based AGI is definitely coming. It’s just not here yet.
Being the multi-billion behemoth that it is, Google can afford to either hire people to pre-filter Reddit input, or develop AIs to do the pre-filtering, before any Reddit content is actually fed into the forward-facing AI.
Will Google actually do something like that?
No, it cannot afford to do it at this scale. The cost of the datacenters is already going to strain all of these companies. Adding a gargantuan human staff that understands the subtleties of this content is not going to fly.
Bill Shooter of Bul,
It’s a conundrum. These LLMs are not original sources of factual information, and they are not equipped to verify any information outside of the internet. As long as we treat them accordingly, it’s not really a big problem. However, it seems probable that many people will want to use these LLMs as authoritative sources, and that’s a problem.
In some cases there are solutions to the Garbage In = Garbage Out problem, for certain kinds of data that have testable results. Take AI software generation: an LLM can generate code, but that code can actually be tested using a virtual compiler, thereby assessing the quality of any given output. IMHO, combining generative adversarial networks with LLMs will eventually be able to output highly sophisticated software; the GAN will iteratively refine the model to fit the test cases. I believe this, rather than pure LLMs, will be the way forward for coding AIs.
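A toy sketch of that idea (strictly speaking an iterative refine-against-tests loop rather than a true GAN; propose_candidate() is a hypothetical stand-in for an LLM call, and Python’s built-in compile/exec plays the role of the “virtual compiler”):

```python
# Toy sketch of the generate-and-test idea described above: candidate code is
# compiled and run against test cases, and failures are fed back to the generator.
# propose_candidate() is a hypothetical stand-in for an LLM call.

def propose_candidate(feedback: str) -> str:
    # A real system would prompt an LLM here, including feedback from the
    # previous failed attempt. This stub just "fixes" itself on retry.
    if "failed" in feedback:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"   # deliberately wrong first draft

def run_tests(source: str) -> str:
    namespace = {}
    try:
        exec(compile(source, "<candidate>", "exec"), namespace)
        assert namespace["add"](2, 3) == 5
        return "ok"
    except Exception as e:
        return f"failed: {e!r}"

feedback = ""
for attempt in range(5):
    candidate = propose_candidate(feedback)
    feedback = run_tests(candidate)
    print(f"attempt {attempt}: {feedback}")
    if feedback == "ok":
        break
```

The interesting part is that the test harness, not the training data, decides whether an output survives, which is what makes this kind of domain partially immune to garbage-in-garbage-out.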
However, with other kinds of information, such as unverifiable claims, it’s much harder to automate fact checking. While you have tons of data from the internet at your disposal, you already used those sources for training, and those sources don’t tell you when they are wrong. Human fact checkers might have new insights from the real world, but they are not scalable. This is quite the challenge!
It comes across as funny advice to me too. But to play devil’s advocate here, food-grade glue is actually a real thing (I had to look it up). It is FDA approved and might be more common in the food products we buy than is commonly known.
https://www.michaelpackage.com/post/need-to-know-food-grade-adhesives
Mixing in edible glue seems like it could be a technically viable solution. I wish Mythbusters were still around; this would be a great myth for them to test!
We need an update to the old axiom, “life imitates art” – how about “humans justify AI statements” – we’ll work on it.
What if pizza has had glue in it all along and we just never noticed due to it actually tasting pretty good regardless?
You mean gluten? Sure, that’s already in pizza. Maybe sprinkle some flour in the sauce and toss it in with the shredded cheese. Gluten has some adhesive properties.
From https://en.wikipedia.org/wiki/Gluten
[quote]Glutens, especially Triticeae glutens, have unique viscoelastic and adhesive properties[/quote]
Ugh. How the heck do you do quotes/etc here? Clearly it’s not BBCODE, what do I use? Also…can’t edit obviously. Le sigh.
Use html…
Thom, can this be added to the FAQ?
PhilPotter,
You know, it’s not impossible that some pizza brands set out to solve this same problem. “Glue” would not be listed as an ingredient, but it may have been a factor in choosing ingredients nevertheless. I think it would have been much more interesting had the AI made the suggestion on its own rather than having a source for it on the internet.
Anyway, it makes me wonder what interesting information from the OSNews article & comment archives we might extract from the AI. It would have to be content that is unique to OSNews.
Highly doubt that Gemini is “indexing” a random Reddit post. LLMs generalize from their datasets, as shown by this recent paper from Anthropic: https://transformer-circuits.pub/2024/scaling-monosemanticity/
But, more importantly, what is AI, Thom?
That assertion is demonstrably not true. Every code snippet I’ve ever gotten from ChatGPT has been clearly derived from some Stack Overflow post, or a similar tech board, and it CANNOT generate anything that isn’t 80% similar to something in its training set. Claims (especially when someone is selling something) are not reality.
In my own experience, that’s nonsense. Having had lengthy discussions with GPT-4 about various topics, including some very nuanced deep dives into how programming languages work, I’m certain that LLMs generalize. Don’t take my word for it, though; check out the paper from Anthropic. Also, the YT channel called AI Explained is an excellent resource.