coolin

@coolin@beehaw.org
0 Posts – 35 Comments
Joined 1 year ago

The only things really missing are Wallet and NFC support. Other than that, I think GrapheneOS and LineageOS cover it all.


"We have no moat, and neither does OpenAI" is the leaked document you're talking about.

It's a pretty interesting read. Time will tell if it's right, but given the speed of advancements I'm seeing in the open source community, advancements that can be stacked on top of each other, I think it could be. If open source figures out scalable distributed training, I think it's Joever for AI companies.

I mean, advanced AI aside, there are already browser extensions you can pay for that have humans on the other end solving your CAPTCHAs. It's pretty much impossible to stop, imo.

A long-term solution would probably be something like a public/private key pair issued by a government to verify you're a real person, which you'd have to provide to sign up for a site. We obviously don't have the resources to do that 😐 and people are going to start leaking theirs on day 1.
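
As a rough sketch of how that kind of challenge-response check could work (the Ed25519 choice and every name here are my own assumptions, not any real government scheme):

```python
# Toy sketch of challenge-response proof-of-personhood.
# Assumes a government-issued Ed25519 key pair; all names are hypothetical.
import os
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

citizen_key = Ed25519PrivateKey.generate()   # issued to you, kept secret
citizen_pub = citizen_key.public_key()       # registered with the issuer

challenge = os.urandom(32)                   # site sends a random challenge
signature = citizen_key.sign(challenge)      # you sign it to prove personhood

try:
    citizen_pub.verify(signature, challenge) # site checks against the registry
    print("signup allowed: real person")
except InvalidSignature:
    print("signup denied")
```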

Honestly, disregarding the dystopian nature of it all, I think Sam Altman's Worldcoin is a good idea, at least for authentication, because all you need to do is scan your iris to prove you're a person and you're in easily. People could steal your eyes tho 💀 so it's not foolproof. But in general, biometric proof of personhood could be a way forward as well.

Sam Altman: We are moving our headquarters to Japan

Blocking out the sun with aerosols is a good idea only if you know with high confidence how it will impact the climate system and the environment. That's why they're trying to simulate it with the supercomputer: so they know whether it fucks stuff up or not.

This is another reminder that the anomalous magnetic moment of the muon was recalculated by two different groups using higher-precision lattice QCD techniques and wasn't found to be significantly different from the Brookhaven/Fermilab measurement. More work needs to be done to check for errors in both the original and the newer calculations, but it seems quite likely to me that this will ultimately confirm the standard model exactly as we know it, without providing any new insight or evidence of another force particle.

My hunch is that unknown particles like dark matter rely on a relatively simple extension of the standard model (e.g. supersymmetry, axions, etc.), and that the new physics out there combining gravity and QM is something completely different from what we're currently working on, which can't be observed with current colliders or any other experiments on Earth.

So we'll probably keep finding nothing interesting for quite some time, until we can get a large ML model crunching through every possible candidate model to check its fit against the data, and hopefully derive some better insight from there.

Though I'm not an expert and I'm talking out of my ass so take this all with a grain of salt.

I don't know what kind of chatbots these companies are using, but I've literally never had a good experience with them, and it doesn't make sense considering how advanced even something like OpenOrca 13B is (GPT-3.5 level), which can run on a single graphics card in some company server room. Most of the ones I've talked to are from some random AI startup, with cookie-cutter preprogrammed text responses that feel less like LLMs and more like a flow chart with a rudimentary classifier to select an appropriate response. We have LLMs that can handle the more complex human tasks of figuring out problems and suggesting solutions, and that can query a company database to respond correctly, but we don't use them.
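
To give a sense of how easy self-hosting one of these has become, here's a minimal sketch using llama-cpp-python (the GGUF filename is hypothetical; you'd download a quantized OpenOrca build and tune n_gpu_layers to your card):

```python
# Minimal local-inference sketch with llama-cpp-python.
# The model filename is hypothetical; use whatever quantized GGUF you download.
from llama_cpp import Llama

llm = Llama(
    model_path="./openorca-13b.Q4_K_M.gguf",  # ~8 GB quantized, fits on one GPU
    n_gpu_layers=-1,                          # offload every layer to the GPU
    n_ctx=4096,
)

out = llm(
    "A customer writes: 'My order never arrived.' Draft a helpful reply:",
    max_tokens=200,
)
print(out["choices"][0]["text"])
```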

The natural next place for people to go once they can't block ads on YouTube's website is services that exploit the API to serve free content (NewPipe, Invidious, youtube-dl, etc.). If that happens at a large scale, YouTube might shut off its API just like Reddit did, and we'll end up in a scenario where creators are forced to move to PeerTube; given how costly hosting is for video streaming, it could be much worse than Reddit->Lemmy+Kbin or Twitter->Mastodon. Then again, YouTube has survived enshittification for a long time, so we'll have to wait and see.


The one SIMPLE trick crypto bros HATE:
Blockchain -> "Distributed Ledger"
NFT -> "Unique Identifier"

Like and share with your friends

There are some in the research community who agree with your take: "The Curse of Recursion: Training on Generated Data Makes Models Forget".

Basically, the long and short of that paper is that LLMs are inherently biased towards likely responses. The more their training set is LLM-generated, and thus carries that bias, the less able the LLM is to produce unlikely responses, degrading model quality over successive generations.
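
You can see the flavor of that argument in a toy simulation (my own analogy, not from the paper): fit a distribution, sample from it, refit on the samples, and repeat. The tails shrink away generation after generation:

```python
# Toy analogue of model collapse: each "model" is a Gaussian fit to the
# previous model's samples. Finite sampling steadily kills the tails.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 0.0, 1.0, 50                     # generation-0 model of the data

for gen in range(1, 501):
    samples = rng.normal(mu, sigma, n)          # model generates "training data"
    mu, sigma = samples.mean(), samples.std()   # next model fits its own output
    if gen % 100 == 0:
        print(f"gen {gen}: sigma = {sigma:.4f}")  # unlikely outputs vanish
```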

However, I tend to think this viewpoint is probably missing something important. Can you train a new LLM on today's internet? Probably not, at least not without some heavy cleaning. Can you train a multimodal model on video, audio, the chat logs of people talking to it, and even on other, better LLMs? Yes, and you will get a much higher quality model and likely won't see the model collapse the paper implies.

This is more or less what OpenAI has done. All the conversations with its 100M+ users are saved and used to further train the AI. Their latest GPT-4 is also trained on video and image recognition, and they have been exploring ways for LLMs to train new ones, especially to aid in aligning these models.

Another recent example is Orca, a fine-tune of the open source LLaMA model trained with GPT-3.5 and GPT-4 as teachers; it retains ~90% of GPT-3.5's performance while using a factor of 10 fewer parameters.
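
The data-generation side of that teacher-student setup looks roughly like this sketch using the openai client (the prompts and file layout are my own assumptions, not Orca's actual pipeline):

```python
# Sketch of building an Orca-style distillation dataset from a teacher model.
# Prompt wording and file format are assumptions, not Orca's real pipeline.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
instructions = ["Why is the sky blue?", "Sort [3, 1, 2] and explain each step."]

with open("distill.jsonl", "w") as f:
    for inst in instructions:
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": "Explain step by step, like a teacher."},
                {"role": "user", "content": inst},
            ],
        )
        # Each (instruction, teacher explanation) pair trains the student model.
        f.write(json.dumps({"prompt": inst,
                            "response": resp.choices[0].message.content}) + "\n")
```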

I really hate the state of the Supreme Court atm. Looking back, it wasn't a legitimate institution from the beginning, but the current 6-3 court shows how flawed it is, being out of line with public opinion in loads of different cases and effectively legislating from the bench via judicial review.

The only reason it has gotten this bad, though, is that Congress has abdicated its responsibilities as a legislative body and left more and more to executive orders and court decisions. The entire debate around the Dobbs decision could have been avoided if Dems had codified abortion into law, and this one could have been avoided too if our Congress actually went to work legislating a solution to the ongoing student loan and college affordability crisis.

I think we need Supreme Court reform. I'm particularly partial to the idea of a rotating bench pulled randomly from the lower courts each term, with each party in Congress getting a certain number of strikes to take off people they don't want, similar to the way jurors are selected. I also think the people should be able to overrule the court via referendum, because ultimately we should decide what the constitution says.

I just can't see this happening, though, at least not for multiple decades, until the younger people of today get into political power.

This makes sense for any other company, but OpenAI is still technically a nonprofit in control of the OpenAI corporation, the part that is actually a business and can raise capital. Considering Altman claims literal trillions in wealth will be generated by future GPT versions, I don't think OpenAI the nonprofit would ever sell the company part for a measly few billion.

I've never used Manjaro, but the perception I get is that it's a noob-friendly distro with a good GUI and config tools (good) that then catastrophically fails when you monkey around with updates and the AUR. That's a pain for technical users and a back-to-Windows experience for the people it's actually targeted towards. Overall, significantly worse than EndeavourOS or plain ol' vanilla Arch Linux.

Current LLMs are manifestly different from Cortana (🤢) because they are actually somewhat intelligent. Microsoft's Copilot can do web searches and perform basic tasks on the computer, and because of their exclusive contract with OpenAI they're going to have access to more advanced versions of GPT that can do more high-level control and automation on the desktop. It will 100% be useful for users to have this available, and I expect even Linux desktops will eventually add local LLM support (once consumer compute and the tech mature). It is not just glorified autocomplete; its output is actually fairly well correlated with real human language cognition.

The main issue for me is that they take all the data you input and mine it for better models without your explicit consent. This isn't an area where open source can catch up without significant capital behind it, so we have to hope Meta, Mistral and government-funded projects give us what we need to have a competitor.


Basically, he's pro-privacy, somewhere in the libertarian space, supports the use of Monero, recommends you move to a rural area, etc.

This isn't an actual problem. Can you train on post-ChatGPT internet text? No, but you can train on the pre-ChatGPT Common Crawl snapshots, on the millions of conversations people have with the models, and on audio, video and images. As we improve training techniques and model architectures, we will need even less of this data to train even more performant models.


For the love of God, please stop posting the same story about AI model collapse. This paper has been out since May, has been discussed multiple times, and the scenario it presents is highly unrealistic.

Training on the whole internet is known to produce shit model output, which is why humans curate their own high-quality datasets to feed these models and get high-quality results. That is why we have techniques like fine-tuning, LoRAs and RLHF, as well as countless curated datasets to feed to models.

Yes, if a model were for some reason trained on raw internet output for several iterations, it would collapse and produce garbage. But the current frontier approach to datasets is to have strong LLMs (e.g. GPT-4) produce high-quality data and to train new LLMs on it. This has been shown to work with Phi-1 (really good at writing Python code, trained on textbook-quality content generated with GPT-3.5) and Orca/OpenOrca (a GPT-3.5-level model trained on millions of examples from GPT-4 and GPT-3.5). Additionally, GPT-4 itself has likely been trained on synthetic data, and future iterations will train on more and more of it.

Notably, by selecting a narrow, high-quality slice of the outputs instead of the whole distribution, we can avoid model collapse and in fact produce even better models.
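
A toy sketch of that selection step for code data, Phi-1 style (the candidate list is a hardcoded stand-in for real LLM samples; filtering by tests is the point):

```python
# Sketch of curating synthetic code data: keep only generations that pass tests.
# The candidates are hardcoded stand-ins for samples drawn from an LLM.
def generate_candidates(prompt: str) -> list[str]:
    return [
        "def add(a, b):\n    return a + b",  # good sample
        "def add(a, b):\n    return a - b",  # bad sample the filter should drop
    ]

def passes_tests(code: str) -> bool:
    env: dict = {}
    try:
        exec(code, env)
        return all(env["add"](a, b) == want
                   for a, b, want in [(1, 2, 3), (-1, 1, 0)])
    except Exception:
        return False

# Only the narrow, verified slice of model output goes into the training set.
dataset = [c for c in generate_candidates("Write add(a, b).") if passes_tests(c)]
print(f"kept {len(dataset)} of 2 candidates")
```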

FediSearch, I guess, is similar to your idea, though I think the goal would be to make a new, open search index specifically covering fediverse websites instead of just piggybacking on Google. I also feel the formatting should be more like Lemmy's, showing the particular post title and a short description instead of a generic search UI.

The idea of a fediverse search is really cool, though. If things like news and academic papers ever got their own fediverse-connected services, I could see a FediSearch being a great alternative to the AI sludge of Google.

I definitely agree. The vast majority of people still left on Reddit are either corporate bootlickers or people who don't care and just want to doomscroll.

Neither type adds anything to an online community


Reddit has since changed the UI again, which killed my interest in scrolling r/all. I still have to go there to view r/localllama, r/singularity and r/UFOs, none of which has a sizeable Feddit equivalent. I could do without the speculation of the latter two in my life, but I need LocalLlama because it's a great source of news and advice on LLMs.

Lmao, Twitter is not that hard to create. Literally look at the Mastodon code base, "transform" it, and you're already most of the way there.

The comment is trying to point out, albeit obtusely, that Democrats have also funded crazy people on the opposite end of the political spectrum. In 2022 the Democrats funded far-right candidates in hopes they would win the primary and be an easy victory royale for the Dem candidate. The comment is analogizing these two things, which is fair because it's a similar political strategy.


This is a bit cynical, but I don't think companies care at all about pride. There may be individuals pushing for it within a company, but by and large, any time a company pushes a social issue, it is to make itself look better and strategically pivot to appeal to different age brackets.

In this way I think corporate "pride-washing" is very much akin to greenwashing: acting like the company is carbon neutral or reducing emissions or whatever, when they just want people to feel better about buying a product everyone knows is harmful. Plastic recycling, green ETFs and "clean" natural gas all fall into this category.

The profit decline attributed to these efforts is probably exaggerated, too, because people conveniently ignore their moral values when buying products. We always hear about how horrid Nestlé is, or how unethical the mining behind our phone batteries is, but look around and everyone still buys these things.

I think this is downplaying what LLMs do. Yeah, they're not the best at doing things in general, but the fact that they were able to learn the structure and semantic context of language is quite impressive, even if they don't know what the words behind the tokens actually mean. I suspect we'll be able to use LLMs as one part of a full digital "brain", with some model analogous to our prefrontal cortex calling the LLM (and other components like a vision model, a sound model, etc.) and using their outputs to reason about a task and take an action. That's where I think the hype will be validated: when you put all these parts we've been working on together and Frankenstein a new and actually intelligent system.
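
A skeleton of what that orchestration might look like (every model call here is a hypothetical stub; the controller-plus-specialists shape is the point):

```python
# Skeleton of a "prefrontal cortex" controller routing between specialist models.
# All three functions are hypothetical stubs, not real APIs.
def vision_model(image_path: str) -> str:
    return "a red bicycle leaning on a fence"        # stub image caption

def language_model(prompt: str) -> str:
    return "PLAN: post a found-bike notice nearby"   # stub LLM completion

def controller(task: str, image_path: str) -> str:
    # 1. Gather perceptions from the specialist models.
    scene = vision_model(image_path)
    # 2. Let the LLM reason over task + perception, like working memory.
    plan = language_model(f"Task: {task}\nScene: {scene}\nDecide the next action.")
    # 3. The controller, not the LLM, decides what actually gets executed.
    return plan

print(controller("find the owner of the lost bike", "bike.jpg"))
```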

Smh my head, Linux is too mainstream now!!! How will I be a cool hacker boy away from society if everyone else uses it!!!!!!!

I can't think of a time he's said any slur, but if there's a particular video, I'd be interested to see it.

I don't really think compulsory voting would be that beneficial for Democrats. Yes, it might boost them a few points across the board, but my general intuition about the public is that they lean towards Democrats while being more socially conservative than you'd guess from online spaces. 2020 is probably the best example: super high turnout, yet Dems still squeaked by with only a +4 advantage instead of the +10 predicted by looking at far more politically engaged voters.

Yeah there's no way a viable Linux phone could be made without the ability to run Android apps.

I think we're probably at least a few years away from being able to daily-drive Linux on modern phones with things like NFC payments and a decent native app collection working. It's definitely coming, but it has far less momentum than even the Linux desktop does.

I think your job, in its current form, is likely in danger.

SOTA foundation models like GPT-4 and Gemini Ultra can write, execute, and debug code with special chain-of-thought prompting techniques, and large-scale process verification on synthetic data plus RL search for correct outputs will make this 10x better. The silver lining is that I expect this to require an absolute shit ton of compute, constantly generating LLM output hundreds of times for each internal prompt across multiple prompts, possibly taking longer than an ordinary software engineer would. I suspect early full-stack-developer LLMs will mainly be used for a few very tedious coding tasks, and SWEs will stay cheaper for a fair length of time.
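
That generate-hundreds-of-times-then-verify loop is roughly this shape, as a toy best-of-n sketch (sample_solution is a stub standing in for one chain-of-thought generation):

```python
# Toy best-of-n search: sample candidate solutions until a verifier accepts one.
# sample_solution() is a stub for a single chain-of-thought LLM generation.
import random

def sample_solution(prompt: str) -> str:
    # Pretend the model only sometimes reasons its way to a correct answer.
    return random.choice(["return sorted(xs)", "return xs"])

def verify(solution: str) -> bool:
    env: dict = {}
    exec(f"def f(xs):\n    {solution}", env)
    return env["f"]([3, 1, 2]) == [1, 2, 3]

def best_of_n(prompt: str, n: int = 100) -> str | None:
    # This is where the compute blows up: n full generations per internal step.
    for _ in range(n):
        candidate = sample_solution(prompt)
        if verify(candidate):
            return candidate
    return None

print(best_of_n("Write f(xs) that sorts a list."))
```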

I expect it will be 2-3 years before this happens, so for that short period I expect workers to be "super-productive" by using LLMs in the coding process, but the crossover point where the LLM becomes better is quite soon, perhaps within the next 5 years as compute requirements come down.

My solution is NewPipe. It's an alternative YouTube front end you can download via F-Droid that lets you access YouTube, PeerTube and Bandcamp without ads. It isn't as good as YouTube Music, especially since you don't actually have an account, but you can still make local playlists and download music.

Hello, kids! Pirates are very bad! Never use qBittorrent to download copyrighted material, and certainly do NOT connect it to a VPN to avoid getting caught. Additionally, you should also NEVER download illegal material over an HTTPS connection, because it is fully encrypted and you won't get caught!

NFTs are stupid AF for most of the things people currently use them for, and they definitely shouldn't be used as proof of ownership of physical assets.

However, I think NFTs make a lot of sense as proof of ownership of purely digital assets, especially those which are scarce.

For example, there are several projects for domain name resolution based on NFT ownership (e.g. you look up crypto.eth, your browser checks that the site is signed by the owner of the crypto.eth NFT, and then you're connected to the site). It could replace our current system, which has literally 7 guys holding the private keys that form the backbone of DNS, plus a bunch of registrars you have to go through to get a domain. This won't happen anytime soon, but it's an interesting concept.

Then I think an NFT would also make a good decentralized alternative to something like Google sign-in, where you sign up for a service with the NFT and sign in by proving your ownership of it.
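
Proving ownership there is the same sign-a-challenge flow wallets already use. A rough sketch with eth_account (the NFT-registry lookup at the end is a hypothetical stub):

```python
# Sketch of wallet-based login: the site learns your address from a signed nonce.
# The NFT ownership check at the end is a hypothetical stub.
import os
from eth_account import Account
from eth_account.messages import encode_defunct

user = Account.create()                               # your wallet key pair

challenge = encode_defunct(text=os.urandom(8).hex())  # site issues a nonce
signed = user.sign_message(challenge)                 # wallet signs client-side

recovered = Account.recover_message(challenge, signature=signed.signature)
assert recovered == user.address                      # site now knows who signed

def owns_nft(address: str) -> bool:
    return True  # stub: would query the NFT contract's ownerOf()/balanceOf()

print("login ok" if owns_nft(recovered) else "login denied")
```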

In general, though, I find NFTs a precarious concept. I mean, the experience I've had with crypto is that you literally have a seed phrase for your wallet, and if it gets stolen, all your funds are drained. And with an NFT, if you click on the wrong smart contract, all your monkeys could be gone in an instant. There is in general no legal recourse to reverse crypto transactions, and I think that is frankly the biggest issue with the technology as it stands today.

TBH, the Minecraft parkour maps in the Hypixel server's housing area are good for this. I usually play one until it starts getting hard or I finish, then switch to the next one while I'm watching YouTube.

I suppose, having worked with LLMs a whole bunch over the past year, I have a better sense of what I meant by "automate high-level tasks".

I'm talking about an assistant where, let's say, you need to edit a podcast video to add graphics and cut out dead space or mistakes that you corrected in the recording. You could tell the assistant to do that, and it would open the video in Adobe Premiere Pro, do the necessary tasks, then ask you to review the result and check for mistakes.

Or if you had an issue with a particular device, e.g. your display, the assistant would research the issue and perform the necessary steps to troubleshoot and fix it.

These are hypothetical scenarios for now, but current GPT-4 can already perform some of these tasks, and specifically training it to be a desktop assistant that handles more agentic tasks will make this a reality in a few years.

It's also already useful for reading and editing long documents, and it will only get better on this front. You can already use an LLM to query your documents and give you summaries, or feed them in as instructions/research to aid in performing a task.

Yeah, I think Nix is a good concept, but I feel like 99% of the config work could be managed by the OS itself, with a GUI to change everything else. I also feel flakes should be the default, not this weird two-parallel-systems thing they have now. And I wish most apps had a sandbox built in, because Nix apps would then rival Flatpak and, if ported to Windows, become a universal package manager. Overall, a good concept, but not there yet.