scrchngwsl

@scrchngwsl@feddit.uk
0 Posts – 39 Comments
Joined 1 year ago

he/him

Same here. Works great, incredibly cheap too.

I’ve followed Robert Miles’ YouTube channel for years and watched his old Numberphile videos before that. He’s a great communicator and a genuinely thoughtful guy. I think he’s overly keen on anthropomorphising what AI is doing, partly because it makes it easier to communicate, but also because I think it suits the field of research he’s dedicated himself to. In this particular video, he ascribes a “theory of mind” based on the LLM’s response to a traditional and well-known theory of mind test. The test is included in the training data, and ChatGPT 3.5 successfully recognises it and responds correctly. However, when the details of the test (i.e. specific names, items, etc.) are changed, but the form of the problem is the same, ChatGPT 3.5 fails. ChatGPT 4, however, still succeeds – which Miles concludes means that ChatGPT 4 has a stronger theory of mind.

My view is that this is obviously wrong. I mean, just prima facie absurd. ChatGPT 3.5 correctly recognises the problem as a classic psychology question and responds with the standard psychology answer. Miles says that the test is found in the training data. So it’s in ChatGPT 4’s training data, too. And ChatGPT 4’s LLM is good enough that, even if you change the nouns used in the problem, it is still able to recognise that the problem is the same one found in its training data. That does not in any way prove it has a theory of mind! It just proves that the problem is in its training set! If 3.5 doesn’t have a theory of mind because a small change can break the link between training set and test set, how can 4.0 have a theory of mind, if 4.0 is doing the same thing that 3.5 is doing, just with the link intact?

The most obvious problem is that the theory of mind test is designed for determining whether children have developed a theory of mind yet. That is, it tests whether the development of the human brain has reached a stage, common among other human brains, at which a child can correctly understand that other people may have different internal mental states. We know that humans are, generally, capable of doing this, that this understanding is developed during childhood, and that some children develop it sooner than others. So we have devised a test to distinguish between those children who have developed this capability and those who have not yet.

It would be absurd to apply the same test to anything other than a human child. It would be like giving the LLM the “mirror test” for animal self-awareness. Clearly, since the LLM cannot recognise itself in a mirror, it is not self-aware. Is that a reasonable conclusion too? I won't go too hard on this, because it's a small part of a much wider point, and I'm sure if you pushed him on this, he would agree that LLMs don't actually have a theory of mind; they merely regurgitate the correct answer (many animals can similarly be trained to pass theory of mind tests by rewarding them for pecking/tapping/barking etc. at the right answer).

Indeed, Miles’ substantial point is that the “Overton window” for AI Safety has shifted, bringing it into the mainstream of tech and political discourse. To that extent, it doesn’t matter whether ChatGPT has consciousness or not, or a theory of mind, as long as enough people in mainstream tech and political discourse believe it does for it to warrant greater attention on AI Safety. Miles further believes that AI Safety is important in its own right, so perhaps he doesn’t mind whether or not the Overton window has shifted on the basis of AI's true capability or its imagined capability. He hints at, but doesn’t really explore, the ulterior motives for large tech companies to suggest that the tools they are developing are so powerful that they might destroy the world. (He doesn’t even say it as explicitly as I did just then, which I think is a failing.) But maybe that’s ok for him, as long as AI Safety research is being taken seriously.

I disagree. It would be better to base policy on things that are true, and if you have to believe that LLMs have a theory of mind in order to gain mainstream attention for AI Safety, then I think this will lead us to bad policymaking. It will miss the real harms that AI poses – facial recognition systems with disproportionately high error rates for black people being used to bar people from shops, resumé scanners and other hiring tools that, again, disproportionately discriminate against black people and other minorities, non-consensual AI porn, etc. We may well need policies to regulate this stuff, but a focus on the hypothetical existential risk of AGI in the future, over the very real and present harms that AI is doing right now, is misguided and dangerous.

If policymakers actually understood the tech and the risks even to the extent that Miles' YouTube viewers do, maybe they'd come to the same conclusion that he does about the risk of AGI, and would be able to balance the imperative to act against all the other things that the government should be prioritising. Call me a sceptic, but I do not believe that politicians actually get any of this at all; they just like being on stage with Elon Musk...

I had assumed it was a Uniqlo-style thing using RFID tags. That truly is magical, like living in the future. This Amazon stuff with the cameras and constant surveillance, not so much...

For walking, nothing beats OpenStreetMap. It absolutely destroys Google Maps, as it knows all the footpaths and what is and isn't walkable.

For driving, I'm stuck with Google due to Android Auto.

For finding businesses etc., HERE is the best alternative, but frankly Google is in a different league in this regard; nothing beats it.

This is the closest I've got to that feeling, yeah. Only difference is now I'm not a teenager and not that bothered about your a/s/l.

Same story here. I'll never understand why they canned Inbox when it was clearly superior to vanilla Gmail.

Agreed, I've run into lots of problems trying to get reverse proxies set up on paths, which disappear if you use a subdomain. For that reason I stick with subdomains and a wildcard DNS entry.

Yeah, my general philosophy on phones these days is to use the OEM ROM until either it gets slow and rubbish, or it stops getting updates, then switch to LineageOS or something. OnePlus has done a pretty decent job of not making my phone shitter every time it gets an update, so the OEM ROM has lasted much longer than I expected. I reckon with a custom ROM it'll last me another 2-3 years at least, which is great value for a phone I bought 3 years ago for £290.

Same with our guinea pigs 🐹 They all passed away in the last 5 months; the last of them was in perfect health and essentially died of loneliness after losing his herd. Getting some great Black Friday deals now though...

That's a really cool idea. I'd love to see the next version be bigger and longer, though it probably doesn't scale well given the requirement to manually verify that the calculations were actually performed by hand.

I have a similar printer but with duplex printing, which I bought because it fits under my sofa. It does everything I wanted it to do; namely, to print double-sided black and white documents and fit under my sofa.

BTW I also recommend the Brother ADS-2xxx series of document scanners, which I bought to scan multi-page double-sided documents automatically. I put the stack of papers in the top, press Go, and it scans to PDF in a few seconds.

It can also start an ad-hoc network that you join on your phone, letting you input the Wi-Fi details via the browser, although this is more complicated to implement on the device itself. Lots of low-spec/low-power devices do that though (sketch below).
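
To illustrate the pattern (a generic sketch, not any particular device's firmware): the credential-capture step is basically the device serving a tiny web form and recording what you submit. Starting the access point itself is platform-specific and omitted here.

```python
# Hypothetical sketch of the provisioning flow: the device runs a tiny web
# server on its own access point; the phone joins and submits Wi-Fi details.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

FORM = (b"<form method='post'>"
        b"SSID: <input name='ssid'><br>"
        b"Password: <input name='psk' type='password'><br>"
        b"<input type='submit' value='Connect'></form>")

class ProvisionHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the credentials form to anyone who connects.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(FORM)

    def do_POST(self):
        # Parse the submitted details; a real device would write these to
        # its Wi-Fi config and switch from AP mode to joining the network.
        length = int(self.headers.get("Content-Length", 0))
        fields = parse_qs(self.rfile.read(length).decode())
        ssid = fields.get("ssid", [""])[0]
        psk = fields.get("psk", [""])[0]  # password the device would persist
        self.send_response(200)
        self.end_headers()
        self.wfile.write(f"Connecting to {ssid}...".encode())

# Real devices listen on port 80 (often with captive-portal DNS tricks);
# 8080 avoids needing root if you try this on a laptop.
HTTPServer(("0.0.0.0", 8080), ProvisionHandler).serve_forever()
```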

I've been using the Mi Band 3 for the past 4 years or so. It was about $15 new from AliExpress and hasn't broken yet - apart from the plastic straps, which break every 12 months or so.

I had the opposite experience - absolutely no idea what to do when I opened the app. I eventually got something working, but I think maybe OnePlus's OS is hostile to the app's functioning too, so it didn't always work.

Yeah, surely you have to find out first, before writing an article titled "Tech news doesn't understand ad blockers or Chrome extensions"? This appears to be the crux of the article, and yet the author isn't worried about finding out? Weird.

I updated uBlock but it didn't stop the ads, so I can see others doing the same. I'm guessing it didn't actually pull the very latest version or the very latest block lists for whatever reason, but others might be less patient than me.

Yeah, I know this is the self-hosted community, but nothing is as easy and straightforward as OneNote. I keep coming back to it after trying self-hosted solutions.

Why are unattended upgrades frowned upon? Seems like a good idea all round to me.

That's actually really good - thanks!

Possibly a product of immigration. I know a guy from the US whose surname is "Supernaw". He told me that the original surname might have been a French one, something like "Surprenant", which the English-speaking immigration officials wrote down as "Supernaw". You can imagine how the conversation went, right?

Something similar might have happened here, maybe an Eastern European surname like (completely made up example) "Godenov" got written down by English officials as "Goodenough".

Ever since the CS 1.6 days I'd wanted to have a server, but it was only when I got a free Raspberry Pi that I actually started self-hosting stuff 24/7. I put OwnCloud on it, plus a bunch of scripts to track and statistically evaluate my investments, and it just took off from there. Like many others, my desire to disconnect and reduce my dependency on "Big Tech" was a big motivator, but so too was "fun" and having things exactly the way I liked.

In the beginning I rolled my own scripts most of the time, but now I tend to use more off-the-shelf tools, as self-hosting has gone more and more "mainstream"/accessible and Docker has become ubiquitous.

I still do my own scripts tbf, like my DIY smart thermostat/heat pump controller (sketched below). Ultimately it's just a lot of fun.
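
To give a flavour, a controller like that boils down to simple hysteresis control around a setpoint. Here's a minimal Python sketch; the sensor and heat pump functions are placeholders rather than a real API, and the numbers are just illustrative.

```python
import time

SETPOINT = 21.0   # target room temperature, degrees C
HYSTERESIS = 0.5  # deadband, to avoid rapid on/off cycling

def read_temperature() -> float:
    """Placeholder: read the current room temperature from a sensor."""
    raise NotImplementedError

def set_heat_pump(on: bool) -> None:
    """Placeholder: switch the heat pump on or off via a relay or API."""
    raise NotImplementedError

def control_loop(poll_seconds: int = 60) -> None:
    heating = False
    while True:
        temp = read_temperature()
        # Only change state once the temperature leaves the deadband,
        # so the heat pump isn't toggled on every tiny fluctuation.
        if not heating and temp < SETPOINT - HYSTERESIS:
            heating = True
            set_heat_pump(True)
        elif heating and temp > SETPOINT + HYSTERESIS:
            heating = False
            set_heat_pump(False)
        time.sleep(poll_seconds)
```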

I'm assuming they'd be using the $5 per month mentioned in the opening post to pay for some upgrade, e.g. more storage, more RAM, etc. So they'd be on a paid account, but using services that cost zero dollars for the most part. This is what I do and it's been great.

Not OP but that's really useful to know!

Shiny is pretty good. You can add interactive sliders to filter date ranges, and you control in the code what happens when you move them (see the sketch below). It's not as slick as Grafana though.

One downside is that it started off as an R package and was only later ported to Python, so most resources are for R. Fine for me because I know R, but most people don't.

Here's the Python link: https://shiny.rstudio.com/py/
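
To give a feel for it, here's a minimal sketch of a date-range slider in Shiny for Python; the input/output names ("dates", "selected") are just illustrative.

```python
from datetime import date

from shiny import App, render, ui

app_ui = ui.page_fluid(
    ui.input_slider(
        "dates", "Date range",
        min=date(2023, 1, 1), max=date(2023, 12, 31),
        value=(date(2023, 3, 1), date(2023, 6, 30)),  # tuple = range slider
    ),
    ui.output_text("selected"),
)

def server(input, output, session):
    @output
    @render.text
    def selected():
        # Re-runs whenever the slider moves; filter your data here.
        start, end = input.dates()
        return f"Filtering data from {start} to {end}"

app = App(app_ui, server)  # launch with: shiny run app.py
```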

I've tried a lot of different things before settling on an old (Windows) laptop with a wireless mouse and keyboard... I just cba with any of the streaming boxes anymore, and the laptop will always be compatible and performant.

Giffgaff uses O2 and also blocks DuckDNS. Additionally, whatever blocklist my employer is using also blocks it, so it's probably a common thing now.

Cheers, that's not so bad - might give it a shot!

Yeah, I'd search GitHub for Django implementations, as you might have better luck searching for something more specific.

I use Firefox rather than Chrome, and PWAs are just really inconsistent. I think it's probably an Android/OS/home screen thing rather than Chrome specifically.

Thanks, I tried it a while ago (two years or more) and didn't like it then, but it looks a lot better now. Will give it a go on some routes I know and see how it does!

I meant that if you went to Oracle instead of Linode, you could use their free services, and then spend the $5 you're currently spending on Linode on upgrading your Oracle server instead.

Agreed, just sell the old phone and use the money to buy a proper camera.

There are bands I listen to that have <10 monthly listeners. They still deserve their $3 a year IMO.

Not sure exactly what you're asking, but I have a Coral Mini PCIe with Frigate and it works great. Hardly any CPU usage and tiny power consumption.

The potatoes must be very small indeed; I can barely see them!

Very nice looking job. Nothing more satisfying than solving a problem that both you and your wife have!

I don't name my servers anything special, but I do name my various Zigbee sensors in Home Assistant after Egyptian gods. Atum-Ra, Tefnut, Shu, etc. I've avoided the ones that also coincide with Stargate gods, as I thought that would be too exciting for me.

The summary is total rubbish and completely misrepresents what the video is actually about. I'm not sure why anyone who had actually watched it would bother including that poor AI-generated summary. Useless AI bullshit.

The video is actually about the movement of AI Safety over the past year from something of fringe academic interest or curiosity into the mainstream of tech discourse, and even into active government policy. He discusses the advancements in AI in the past year in the context of AI Safety, namely, that they are moving faster than expected and that this increases the urgency of AI Safety research.

It's a pity, because if AI Safety had just stayed an academic curiosity (as Rob says it was for him), maybe we'd have the policy resources to tackle the real and present problems that AI is causing for people.

What's the performance of Frigate like on an N5095? I've got a J5105 that I'm tempted to use for a few of my cameras, but worried I'll be wasting my time.

Have you looked at Oracle's free tier? They have decent specs for free, meaning you can use your $5 to upgrade where you need it once you've tried it out.

Having said that, those specs should be fine for a single user.
