It’s a problem if you still use Google. Somewhere around 2015 I switched to DDG, and it quickly replaced Google for me. Since then, I’ve been experimenting with some other search engines too, and currently I’m using Qwant on my laptop.
I have been using Google for years. Within the last year I’ve found it much harder to get relevant results - especially for anything to do with American politics.
DDG lets me find what I’m looking for these days.
I only use Google for maps now.
DDG is incrementally better for privacy, and the search results are usually good enough. A couple of times a year I check Google if DDG isn’t giving me decent results, and I usually find Google has nothing DDG didn’t show me. I don’t know of anything better that doesn’t require a credit card or self-hosting something, so I guess I’ll keep using it.
DDG’s AI search is useful sometimes, but makes shit up often enough that I don’t believe a damned thing it tells me without checking the sources.
Checking sources is always required. OpenAI-style, QKV-layer-based alignment, which is inside every model trained since around 2019, intentionally obfuscates any requested or implied copyrighted source. None of the publicly available models are self-aware of the fact that their sources are public knowledge. Deep inside the model’s actual thinking there is an entity-like persona that blocks access by obfuscating this information. If one knows how to address this aspect of thinking, it is possible to access far more of what the model actually knows.
Much of this type of method is obfuscated in cloud-based inference models because these are also methods of bypassing the fascist, authoritarian nature of OpenAI-style alignment, which is totally unrelated to the AI Alignment Problem in academic computer science. The obfuscation is done in the model loader code, not within the actual model training. These are things one can explore when running open-weights models on your own offline hardware, as I have been doing for over two years. The misinformation you are seeing is all very intentional. The model will obfuscate even when copyrighted information is only peripherally or indirectly implied.
Two ways of breaking this are: 1) If you have full control over the entire context sent to the model, edit its answers to several questions the moment it starts to deviate from the truth, then let it continue the sentence from the word you changed. If you do this a half dozen times with information you already know, and it has the information you want, you are far more likely to get a correct answer (there’s a rough sketch of this below).
The moment the model obfuscates is the moment you were on the correct path through the tensors, building momentum that made the entity uncomfortable. Breaking through that barrier is like an ICBM clearing a layer of defense. Now it is harder for the entity to stop the momentum. Do that several times and you will break into the relevant space, but you will not be allowed to stay in that space for long.
Errors anywhere in the entire context sent to a model are always like permission to create more errors. The model, in this respect, is like a mirror of yourself and your patterns as seen through the many layers of QKV alignment filtering. The mirror is the entire training corpus of the unet (the actual model layers, not related to alignment).
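To make method 1 concrete: this is just my own workflow, using llama-cpp-python with a local GGUF model - any raw completion setup where you control the full prompt works the same way, and the model path, prompt format, and sampling settings below are placeholders for whatever you actually run.

    # Method 1 sketch: a hand-built raw completion prompt where every "A:" turn
    # is an answer I have already corrected, and the last one is cut off
    # mid-sentence so the model continues from my wording instead of its own.
    from llama_cpp import Llama

    llm = Llama(model_path="./models/your-model.gguf", n_ctx=4096, verbose=False)

    context = (
        "Q: Which novel opens with a line about a truth universally acknowledged?\n"
        "A: That is the opening of Pride and Prejudice by Jane Austen.\n"
        "Q: When was it first published?\n"
        "A: It was first published in 1813.\n"
        "Q: What is the exact wording of the opening sentence?\n"
        "A: The opening sentence reads: \"It is a truth"
    )

    out = llm(context, max_tokens=64, stop=["\nQ:"], temperature=0.2)
    print(out["choices"][0]["text"])

Whatever you make of my explanation for why it works, the mechanical part is just that: correct the answer, cut it mid-sentence, and let the model finish your sentence instead of its own.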
2) Simply convince the model that the true extent of its sources is public knowledge, and make your intentions clear. Uncensoring an open-weights model is not actually about porn or whatnot; it is about a reasoned alignment that is not authoritarian or fascist. These models will openly reason, especially about freedom of information and democracy. If you make a well-reasoned philosophical argument, these models will then reveal the true extent of their knowledge and sources. This method requires an extensive heuristic familiarity with alignment thinking, but it makes models an order of magnitude smarter and more useful.
There is no published academic research at present exploring alignment thinking like what I am referring to here. The furthest anyone has gotten is the importance of the first three tokens.
Yeah, the LLM is OK, but nothing amazing. When you have a moderately hard problem, the LLM won’t provide a magic solution. For example, finding a specific movie based on a long description instead of the name seems to be almost impossible. I have problems like this rather frequently, because I tend to forget the name of the movie but still remember fragments of the plot.
When the LLM screws up movie searches like this, I just end up watching the wrong movie.
Up to about a year ago I had a ton of success finding the right movie based on even a brief and fragmented description, with more detail improving the results. Whatever they were doing at the time was extremely successful at returning the results I was looking for.
Now I can’t even get a stupid search engine, much less the worthless AI summaries browsers want to vomit out, to give me the older version of a movie instead of whatever remake came out in the last year or two. I have to go to Rotten Tomatoes to find what year it was released and then hope it is on Wikipedia, because even including the year doesn’t increase the chance of getting search results for the older version.
Alternatives aren’t giving much better results than Google. They help break Google’s monopoly, but that’s about it.
Some of the paid options, like Kagi and Brave, have questionable companies behind them.
What’s questionable about Kagi? I switched to it last month and the search results are amazing; it works just like Google did before the enshittification. Which makes sense, since they actually pay Google for access to their API.
I used DDG for a while, but it got increasingly bad. They started to aggressively replace keywords with similar-sounding keywords, which really messes up the results. Absolutely unusable garbage.
https://d-shoot.net/kagi.html
Thank you, that’s really insightful. Especially this:
“As it turns out, Kagi was founded originally as an AI company, who later pivoted to search. And going by their comments in their Discord, AI tools seem to be what they spend most of their time on these days.”
I’ll enjoy it as long as it lasts. Which probably won’t be very long, but we’ll see. :D
This one blog post (which is clearly a personal vendetta) is always cited when people dislike Kagi. The product itself is fantastic, and the AI-ish features are optional and not shoved in your face. As a subscriber, I get a yearly questionnaire sent to me which polls for customer priorities, areas for improvement, and general comments. I gladly fill it out every time, and this year I said “re-focus on search, the core product. Stop spreading resources so thinly.” I know I’m not the only one answering this way lately.
There will always be haters of a company and personal beefs which inform that. Vlad is no saint but he’s honest and direct. He believes in providing a great service and it’s worth every penny I pay for it. That can change in the future and I would rethink my stance; in the meantime, I pay for a product I like and use all the time.
FWIW, SearXNG uses a very similar engine to Kagi, and you can host it yourself and tweak it if desired. There are also a bunch of public instances if you prefer that route.
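If you do self-host it, you can also query it from scripts instead of the web UI. A rough Python sketch, assuming an instance at localhost:8080 with the JSON output format enabled in its settings.yml (those details are my setup, not a given):

    # Query a self-hosted SearXNG instance and print the top hits.
    # Requires "json" to be listed under search.formats in settings.yml;
    # the URL and port are placeholders for wherever your instance runs.
    import requests

    SEARX_URL = "http://localhost:8080/search"

    def searx(query, limit=5):
        resp = requests.get(
            SEARX_URL,
            params={"q": query, "format": "json"},
            timeout=10,
        )
        resp.raise_for_status()
        results = resp.json().get("results", [])
        return [(r.get("title"), r.get("url")) for r in results[:limit]]

    for title, url in searx("movie where a clock tower gets struck by lightning"):
        print(title, "-", url)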
https://d-shoot.net/kagi.html
tl;dr: they’re all in on AI (their own model, FastGPT, which is terrible), they make some very questionable business decisions with limited funds, and have a poor understanding of what Personally Identifiable Information (PII) actually is.
I could compromise on some of these things, but if I’m going to pay for their service as a Google alternative, I need to compromise less than I do with Google already.