Is ChatGPT proof that standard tests are bad measures of intelligence?

randoot@lemmy.world to Ask Lemmy@lemmy.world – 198 points –

LLMs are solving the MCAT, the bar exam, the SAT, etc. like they're nothing. At this point their performance is superhuman. However, they'll often trip on super simple common-sense questions, and they struggle with creative thinking.

Is this literally proof that standard tests are not a good measure of intelligence?

When I was at uni, lecturers would often state that exams were the worst measure of grasping the subject material, but it's all we have at the moment.

I saw this myself, with some of my classmates testing very well, but when discussing or problem solving outside of class there was nothing there.

I think LLMs fall into this category, but with way better recall.

"When I was at uni, lecturers would often state that exams were the worst measure of grasping the subject material, but it's all we have at the moment."

It's not all we have...

But it's the only way a professor can run multiple classes of 100 students each.

But colleges are all about profit, so class sizes are going to be huge.

The goal isn't educating people, it's making money.

So when they say "there's no other option" they're not mentioning the "and keep making as much money" at the end, it's just implied.

I'm not in the US. Colleges are generally vocational here, and both colleges and universities are less concerned (though not totally unconcerned) with the money side.

For example, where I live, university courses are free for residents; students from outside the country pay fees.

Dunno how it's done elsewhere, but our courses are usually assessed in 3 parts: 1) exam, 2) practical, 3) essay/investigation. Everyone hates exams.

It's also the only way that is portable. A professor could evaluate each student, but has no way to transmit that kind of evaluation in a way that schools or employers across the country would trust. They don't know who the professor is, or what his standards are, or even if he is being bribed to pass somebody. (Which would happen much more if the professor's opinion had the weight that the standardized test does.)

I had a lot of professors who put most of the grade weight on large projects. It made for a very heavy workload, but projects/papers give a much better picture of how capable someone is of not only reciting knowledge, but also applying it.

Most of my grades were split 40/40/20

With the 3rd being a written component.