Ten years ago, we wrote exams by hand with whatever we understood (in our heads).
No colleagues, no laptops, no internet, no LLMs.
This approach still works, so why do something else? Unless you're specifically testing a student's ability to Google, they don't need access to it.
Industry is full of people trying to use them to become more productive.
Why wouldn't you let students use the same tools?
Seems like you need to make the projects much harder.
Using ChatGPT as a professional is different from using it for homework. Homework and school teach you many things, not only the subject. You discover how you learn, what your interests are, etc.
ChatGPT can assist with learning too, but it SHOULD NOT be doing any of the work for the student. It is okay to ask "can you explain big O" and then ask follow-up questions. However, "give me a method to reverse a string" will only hurt.
If you ask them to build a web browser when they can't do a hello world on their own, it's going to be a disaster. LLMs are like dumb juniors that you command, but students are less skilled than dumb juniors when they start programming classes.
I think the answer maybe comes down to figuring out exactly what the goal of school is. Are you trying to educate people or train them? There is for sure a lot of overlap, but I think there's a pretty clear distinction and I definitely favor the education side. On the job, a person with a solid education should be able to use whatever language or framework they need with very little training required.
One issue is that the time provided to mark each piece of work continues to decrease. Sometimes you are only getting 15 minutes for 20 pages, and management believe that you can mark back-to-back from 9-5 with a half hour lunch. The only thing keeping people sane is the students that fail to submit, or submit something obviously sub-par. So where possible, even for designing exams, you try to limit text altogether. Multiple choice, drawing lines, a basic diagram, a calculation, etc.
Some students have terrible handwriting. I wouldn't be against the use of a dumb terminal in an exam room/hall. Maybe in the background it could be syncing the text and backing it up.
> Unless you're specifically testing a student's ability to Google, they don't need access to it.
I've been the person testing students, and I don't always remember everything. Sometimes it is good enough for the students to demonstrate that they understand the topic enough to know where to find the correct information based on a good intuition.
Your blue book is being graded by a stressed out and very underpaid grad student with many better things to do. They're looking for keywords to count up, that's it. The PI gave them the list of keywords, the rubric. Any flourishes, turns of phrase, novel takes, those don't matter to your grader at 11 pm after the 20th blue book that night.
Yeah sure, that's not your school, but that is the reality of ~50% of US undergrads.
But again, the test creator matters a lot here too. To make such an exam is quite the labor, especially as many/most PIs have other, better things to do. Their incentives are grant money, then papers, then in a distant third their grad students, and finally undergrad teaching. Many departments are explicit about this. To spend their limited time on a good undergrad multiple-choice exam is not in the PI's best interest.
Which is why, in this case of a good Scantron exam, they're likely to just farm it out to Claude. Cheap, easy, fast, good enough. A winner in all dimensions.
Also, as an aside to the above, an AI with OCR for your blue book would likely be the best realistic grader too. It needs less coffee, after all.
Now that I haven't been a student in a long time and (maybe crucially?) that I am friends with professors and in a relationship with one, I get it. I don't think it would be appropriate for a higher level course, but for a weed-out class where there's one Prof and maybe 2 TAs for every 80-100 students it makes sense.
Then they should have points deducted for that. Effective communication of answers is part of any exam.
If you can pass an exam just by googling something, it means you're just testing rote memorization, and maybe a better design is needed where synthesis and critical thinking skills are evaluated more actively.
https://www.tomshardware.com/pc-components/gpus/tiny-corp-su...
The purpose of the open-book test is not having to know all the facts (formulas) but proving you can find them and apply them. (Finding is part of it: the more you look, the less time you have left to use what you find, so there is an optimisation problem in deciding which things to remember and which to look up.)
In modern times you wouldn't look those up in a book, so other research techniques are required to deal with real life (which advanced certifications should prove).
Universities aren’t profit maximizing. They are admin maximizing. Admin are always looking to expand admins budget. Professors, classrooms, facilities all divert money away from admin and they don’t want to pay it unless they have to.
Also applies to hospitals in USA.
Because 'nonprofit' is only in reference to the legal entity, not the profit-seeking people working there? There is still great incentive to increase profitability.
When I was studying at university there was a rumour that one of the dons had scraped through his fourth-year exams despite barely attending lectures, because he had a photographic memory and just so happened to leaf through a book containing a required proof the night before the exam. That gave him enough points despite not necessarily understanding what he was writing.
Obviously very few students have that sort of memory, but it's not necessarily fair to give advantage to those like me who can simply remember things more easily.
In the book days, I sometimes got to where I knew exactly where on a page I would find my answer without remembering what that answer was. Nowadays I remember the search query I used to find an answer without remembering what that answer was.
It makes for good rumours and TV show plots, but this sort of "photographic memory" has never been shown to actually exist.
https://medium.com/young-spurs/the-unsung-genius-of-john-von...
My students often ask me to do the same: to permit them to bring one page of notes as he does.
Then I say: just assume you're writing the exam with him and work on your one-pager of notes; optimize it by copying and re-writing it a few times. Now, the only difference between my exam and his is that the night before, you memorize your one-pager (if you re-wrote it a few times, you should be able to recreate it purely from memory from that practice alone).
I believe having had all material in your memory at the same time, at least once for a short while, gives students higher self-confidence; they may forget stuff again, but they hopefully remember the feeling of mastering it.
You did, but the best exam I had was open book, bring anything. 25-and-some-change years ago, even.
I've also had another professor do the "you can bring one A4 sheet with whatever notes you want to make on it."
For most of us--myself included--once you graduate from college, the answer is: "enough to not get fired". This is far less than most curriculums ask you to know, and every year, "enough to not get fired" is a lower and lower bar. With LLMs, it's practically on the floor for 90% of full-time jobs.
That is why I propose exactly the opposite regimen from this course, although I admire the writer's free thinking. Return to tradition, with a twist. Closed-book exams, no note sheets, all handwritten. Add a verbal examination, even though it massively increases examination time. No homework assignments, which encourage a "completionist mindset" where turning in the assignment feels more real than understanding it. Publish problem sets thousands of problems large, with worked-out solutions, to remove the incentive to cheat.
"Memorization is a prerequisite for creativity" -- paraphrase of an HN comment about a fondly remembered physics professor who made the students memorize every equation in the class. In the age of the LLM, I suspect this is triply true.
What happens if LLMs suddenly change their cost to $1,000 per user per month? What if it is $1,000 per request? Will new students and new professionals still be able to do their jobs?
Calculators have never been more accessible/available. (And yet I personally still do most basic calculations in my head)
So I agree students should learn to do this stuff without LLMs, but not because the LLMs are going to get less accessible. There's another better reason I'm just not sure how to articulate it yet. Something to do with integrated information and how thinking works.
LLMs are not consistent. For example, having a new company make a functional duplicate of ChatGPT is nearly impossible.
Furthermore, the cost of LLMs can change at any time for any reason. Access can be changed by new government regulations, and private organizations can choose to suspend or revoke access to their LLM due to changes in local laws.
All of this makes dependence on an LLM a risk for any professional. The only way these risks could be mitigated is by an open-source, freely available LLM that produces consistent results and that students can learn how to use.
I thought the point was to continue in the same vein and contribute to the sum total of all human knowledge. I suppose this is why people criticize colleges as having lost their core principles and simply responded to market forces to produce the types of graduates that corporate America currently wants.
> "enough to not get fired" is a lower and lower bar.
Usually people get fired for their actions, not their knowledge or lack thereof. It may be that David Graeber's core thesis was correct: most jobs are actually "bullshit jobs," and in the era of the Internet, they don't actually require any formal education to perform.
You are describing how school worked for me (in Italy, but much of Europe is the same I think?) from middle school through university. The idea of graded homework has always struck me as incredibly weird.
> In the age of the LLM, I suspect this is triply true.
They do change what is worth learning though? I completely agree that "oh no the grades" is a ridiculous reaction, but adapting curricula is not an insane idea.
The culture has moved from competence to performance. Where universities used to be a gateway to a middle class life, now they're a source of debt. And social performances of all kind are far more valuable than the ability to work competently.
Competence used to be central, now it's more and more peripheral. AI mirrors and amplifies that.
To the horror of anyone struggling with anxiety, ADHD, or any other source of memory-recall issues under examination pressure. This further optimizes everything for students who can memorize and recall information on the spot under artificial pressure, and who don't suffer from any of the problems I mentioned.
In grade school you could put me on the spot and I would blank on questions about subjects that I understood rather well and that I could answer 5 minutes before the exam and 5 minutes after the exam, but not during the exam. The best way for me to display my understanding and knowledge is through project assignments where that knowledge is put to practical use, or worked "homework" examples that you want to remove.
Do you have any ideas for accommodating people who process information differently and find it easier to demonstrate their knowledge and understanding in different ways?
Why ought there to be an exception for academics? Do you want your lawyer or surgeon to have performance anxiety? This seems like a perfectly acceptable thing to filter out on.
Western education passing as many fee-paying students as possible seems to be very much a UK/US phenomenon; it doesn't seem to be the case in European countries where the best schools are public and fees are very low (in France, private engineering schools rank lower).
At the other extreme are universities offering low quality courses that are definitely degree factories. They tend to have a strong vocational focus but nonetheless they are not effective in improving employability. In the last few decades we have expanded the university system and there are far more of these.
There is no clear cutoff and a lot of variation in between, so it's not a bifurcation, but the quality-vs-factory difference is there.
Mostly done to get more degree holders, who are seen as "more productive". Or at least higher paid...
The obvious answer is "Because it's interesting."
But suppose you think strictly in utilitarian terms: what effort should I invest for what $$$ return. I have two things to say to you:
First: what a meaningless life you're living.
Second: you realize that if you don't learn anything because you have LLMs, and I learn everything because it's interesting, when you and I are competing, I'll have LLMs as well...? We'll be using the same tools, but I'll be able to reason and you won't.
I think the people who struggle with the question "Why should I know anything?" aren't going to learn anything anyway. You need curiosity to learn, or at least to learn a lot and well, and if you have curiosity you're not asking why you should learn anything.
Students had very good reason to question the education system when they were asked to memorize things that were safe to forget once they graduated from school. And when most functional adults admitted they forgot what they had learned in school. It was an issue before LLM, and triply so now.
By the way, I now 100% agree with "Memorization is a prerequisite for creativity." However, if you asked me to try to convince the 16-year-old me, I would throw my hands up.
This is in tech now; we're the first adopters, but soon it will come to other fields.
To your broader question
> Something that I think many students, indeed many people, struggle with is the question "why should I know anything?"
You should know things because these AIs are wrong all the time, because if you want any control in your life you need to be able to make an educated guess at what is true and what isn't.
As to how to teach students: I think we're in an age of experimentation here. I like the idea of letting students use all the tools available for the job. But I also agree that if you do give exams and homework, you'd better make them handwritten/oral only.
Overall, I think education needs to focus more on building portfolios for students, and less on giving them grades.
Gosh, that sounds horrifying. I am not an expert on that piece of the system; no, I do not want to take responsibility for whatever the LLMs have produced for it, because I cannot verify it.
Here, though, is my answer: an excellent long-term goal for any band of humans is to create, inhabit, and enjoy the greatest civilization possible, and the more each individual human knows about their reality, the easier it is to do that.
But the response to that will be further beatings until morale improves.
What about technology professionals? From my biased reading of this site alone: both further beatings and pain relievers in the form of even more dulling and pacifying technology. Followed by misanthropic, thought-terminating cliches: well, people are inherently dumb/unmotivated/unworthy, so the topic is not really worth our genuine attention; furthermore, now with LLMs, we are seeing just how easy it is to mimic these lumps of meat. In fact, they can act both better and more pathetically than human meat bags; you just have to adjust the prompts...
As more jobs started requiring degrees, the motivation had to change. If people can again get food and housing to a comfortable extent without a degree, then the type of person getting a degree will change again too.
If you let them, they'll alienate you until you have no free time and no space for rest or hobbies or learning. Labour movements had to work hard to prevent the 60 hour workweek, but we're creeping back away from 40, right?
I think this is changing rapidly.
I'm a university professor, and the number of students who seem to need an LLM as a crutch is growing exponentially.
We are still in a place where the oldest students did their first year completely without LLMs. But younger students have used LLMs throughout their studies, and I fear that in the future, we will see full generations of students completely incapable of working without LLM assistance.
If people know "at university you can't use LLMs; you are forced to think for yourself," they will adjust, albeit through trial by fire.
I think there's an argument that growing up in an educational system unable to teach you how not to rely on LLMs would, for all intents and purposes, permanently nerf you compared to more fortunate peers. Critical thought is a skill we continue to practice until the very end.
My feeling is that for many/most students, getting a great understanding of the course material isn't the primary goal, passing the course so they can get a good job is the primary goal. For this group using LLMs makes a lot of sense.
I know when I was a student doing a course I was not particularly interested in because my parents/school told me that was the right thing to do, if LLMs had been around, I absolutely would have used them :).
If you're presented with the choice of "Don't use AI" and "Use AI, but live with the consequences" (consequences like mistakes being judged harsher when using AI than when not using AI), I do not think chatbots will be a desirable choice if you've properly prepared for the exam.
Hardware is still improving, though not as fast as it used to; it's very plausible that even the current largest open weights models will run on affordable PCs and laptops in 5 years, and high-end smartphones in 7.
I don't know how big the SOTA close-weights models are, that may come later.
But: to the extent that a model that runs on your phone can do your job, your employer will ask "why are we paying you so much?" and now you can't afford the phone.
Even if the SOTA is always running ahead of local models, Claude Code could cost 1500 times as much and still have the average American business asking "So why did we hire a junior? You say the juniors learn when we train them, I don't care, let some other company do that and we only hire mid-tier and up now."
(The threshold is less than 1500 elsewhere; I just happened to have recently seen the average US pay for junior-grade software developers, $85k*, which makes the tool roughly 350x cheaper, plus my own observation that LLMs are not only junior quality but also much faster at producing output than a junior.)
* But also note that while looking for a citation, the search results made claims varying from $55k to $97.7k.
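For what it's worth, the 350x figure is easy to reproduce under one assumption. A minimal back-of-envelope sketch, assuming the tool is a ~$20/month subscription (my assumption; the parent comment doesn't name a plan):

    # Back-of-envelope check on the "roughly 350x cheaper" claim.
    junior_salary = 85_000            # USD/year, the average junior dev pay cited above
    tool_cost = 20 * 12               # USD/year, under the assumed $20/month plan
    print(junior_salary / tool_cost)  # ~354, i.e. roughly 350x cheaper

A pricier plan shrinks the ratio proportionally: at $200/month it drops to about 35x.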
Are you sure student desire is the driving force here?
This also has something to do with it. Hard to draw very accurate conclusions.
You are an outlier. When I was in school any outside assistance was tantamount to cheating and, unlike an actual crime, it was on the student to prove they were not cheating. Just the suspicion was enough to get you put in front of an honor board.
It was also pervasive. I would say 40% of international students were cheaters. When some were caught they fell back on cultural norms as their defense. The university never balked because those students, or their institutions, paid tuition in cash.
To throw another anecdote in the bucket, I know at least one professor who does not tolerate cheating from any of his students, regardless of cultural or national background, or how they're paying for their education.
I have corrected exams and graded assignments as an external party before (legal requirement). The biggest problem with LLMs I see is that the weak students copy-paste commands with unnecessary command line switches. But they would have done the same with stack overflow.
Some also say they use LLMs to help improve their writing, but that's where the learning is, so why? I think it's the anxiety of failing; they don't seem to understand that I won't fail them as long as their incoherent text proves they understood what they were doing.
Having graduated and knowing how things ought to look, taking exams is much less scary now, because I'm confident I would be failed for being incompetent, not for failing to write properly. Not all students have that privilege; they gain it over time.
It does help that computer science assignments and papers are pretty damn standard in form.
This is radically different from the world that's been described to me. Even 20 years ago cheating was endemic and I've only heard of it getting worse.
> It took me 20 years after university to learn what I know today about computers. And I’ve only one reason to be there in front of you: be sure you are faster than me. Be sure that you do it better and deeper than I did. If you don’t manage to outsmart me, I will have failed.
I wish other universities adapted so quickly too (and had such a mindful attitude toward students, e.g. trying to understand them, being upfront about expectations, learning from students, etc.).
The majority of professors are stressed and treat students like idiots... at least that was the case a decade ago!
I'm different because I was a bad student. I only managed to get my diploma with the minimum grade, always rebelling against everything. But some good people at my university thought that Open Source was really important and that they needed someone with a good career in that field. I was that person (and I'm really thankful to them for offering me that position).
Is this a French thing? In the US we don't have standardized exams to become a college professor. Instead, we need to do original research and publish.
Makes me wonder if they should also get a diploma together then, saying "may not have the tested knowledge if not accompanied by $other_student"
I know of some companies that support hiring people as a team (either all or none get hired and they're meant to then work together well), so it wouldn't necessarily be a problem if they wish to be a team like that
The main strategy is collaboration. If you are smart enough to:
1. Identify your problem
2. Ask someone about it
3. Get an answer which improves your understanding
then you are doing pretty well by all standards.
Another trick I sometimes use: I take one student who has a hard time articulating a concept. I take a second student who doesn't understand that concept. I say to student 1: "You have 20 minutes to teach student 2 the concept. When I come back, you will be graded according to his answers."
(I, of course, don't grade only that. But it forces both of them to make an extra effort, student 2 not wanting to be the cause of student 1's demise.)
I would very much not count on that.
Less educated people are easier to steer via TikTok feeds anyway.
Regarding the collaboration before the exam, it's really strange. In our generation, asking or exchanging questions was perfectly normal. I got an almost perfect score in physics thanks to that. I guess the elegant solution was still in me, but I might not have been able to come up with it in such a stressful situation. 'Almost' because the professor deducted one point from my score for being absent too often :)
However, oral exams in Europe are quite different from those at US universities. In an oral exam, the professor can interact with the student to see if they truly understand the subject, regardless of the written text. Allowing a chatbot during a written exam today would be defying the very purpose of the exam.
I think not all exams can work like that. In some cases you just have to test one's knowledge of a specific topic, and asking for facts is a very, very easy way to do this. I would agree that focusing only on facts is overrated these days, but I would still argue it is not a useless metric. So when the author describes "bring your own exam questions", it means the exam itself is not so relevant, which is fine; but saying that university exams are now useless in the age of autosolving chatbots is simply wrong. That one exam is not important does not automatically mean that ALL exams or exam styles are useless.

Also, it depends on what you test. Take solving math questions: yes, chatbots can solve them, but can a student solve the same without a chatbot? How about practical skills? OK, 3D printing will dominate, but the ability to craft something with your own hands is still a skill that may be useful, at least to some extent.
I feel that the whole discussion about chatbots gets dumbed down a lot. Skills have not become irrelevant just because chatbots exist.
In my experience LLMs can significantly speed up the process of solving exam questions. They can surface relevant material I don't know about, they can remember how other similar problems are solved a lot better than I can and they can check for any mistakes in my answer. Yes when you get into very niche areas they start to fail (and often in a misleading way) but if you run through practise papers at all you can tell this and either avoid using the LLM or do some fine tuning on past papers.
Our written assignments were a lot of "have an LLM generate a business proposal, then annotate it yourself"
The final exam was a 30-minute meeting where we just talked as peers, kinda like a cultural job interview. Sure, there's lots of potential for bias there, but I think it's better than just blindly passing students who use LLMs for the final exam.
CS exercises that we can expect an average student to solve are trivially solved by LLMs. Even by smaller local models.
This is one area where LLMs really should excel. And that doesn't mean students shouldn't also learn the material and be able to solve the same problems. Which is a real dilemma for the school system...
They should be encouraged to read and review the LLM output so they can critically understand it and take ownership of it.
I believe there is a mechanism for this already.
Well, at least it used to be the bottleneck. Nowadays you can just ask an LLM. For all their faults, they are really good at letting you know what tools exist out there in the world, surfacing more than you could ever come to know about even if all you did was read about what exists all day, every day.
>I thought this was fair. You can use chatbots, but you will be held accountable for it.
So you're actually held more accountable for the output? I'd be interested in how many students would choose to use LLMs if faults weren't penalized more.
If you have this great resource available to you (an LLM), you'd better show that you read and checked its output. If there's something in the LLM output you do not understand or cannot check to be true, you'd better remove it.
If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote it. If there's something flawed in an LLM answer, the likelihood that you have no justification except "the LLM said so" is quite high, and it should thus be penalized more heavily.
One shows a misunderstanding, the other doesn't necessarily show any understanding at all.
You could say the same about what people find on the web, yet LLMs are penalized more than web search.
>If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote it. If there's something flawed in an LLM answer, the likelihood that you have no justification except "the LLM said so" is quite high, and it should thus be penalized more heavily.
Swap "LLMs" for "websites" and you could say the exact same thing.
The author has this in their conclusions:
>One clear conclusion is that the vast majority of students do not trust chatbots. If they are explicitly made accountable for what a chatbot says, they immediately choose not to use it at all.
This is not true. What is true is that if students are held more accountable for their use of LLMs than for their use of websites, they prefer using websites. What is "more" here? We have no idea; the author doesn't say. It could be that an error from a website or your own mind is -1 point and an error from an LLM is -2, so LLMs have to make half as many mistakes as websites and your mind. It could be -1 and -1.25. It could be -1 and -10.
The author even says themselves:
>In retrospect, my instructions were probably too harsh and discouraged some students from using chatbots.
But they don't note the bias they introduced against LLMs with their grading scheme.
When I was a student, professors maintained a public archive of past exams. The reason was obvious: next time the questions would be different, and memorizing past answers wouldn't help you if you didn't understand the core ideas being taught. Then I took part in an exchange program, went to some shit-tier uni, and realized that collaboration was explicitly forbidden because professors would usually ask questions along the lines of "what was on slide 54". My favorite part was when a professor said "I can't publish the slides online because they're stolen from another professor, but you can buy them in the faculty's shop".
My uni maintained a giant presence on Facebook - we'd share a lot of information, and the most popular group was "easy courses" for students who wanted to graduate but couldn't afford a difficult elective course.
The exchange uni had none of that. Literally no community, no collaboration, nothing. It's astonishing.
BTW regarding the stream of consciousness - I distinctly remember taking an exam and doing my best to force my brain to think about the exam questions, rather than porn I had been watching the previous day.
https://www.youtube.com/watch?v=JcQPAZP7-sE
LLM reasoning models are very good at searching well-documented problems. =3
I'm sympathetic to both sides here.
As a professor who had to run Subversion for students (a bit before Git, et al), it's a nightmare to put the infrastructure together, keep it reliable under spiky loads (there is always a crush at the deadline), be customer support for students who manage to do something weird or lose their password, etc. You wind up spending a non-trivial amount of time being sysadmin for the class on top of your teaching duties. Being able to say "Put it on GitHub" short circuits all of that. It sucks, but it makes life a huge amount easier for the professor.
From the students' point of view, sure, it sucks that nobody mentioned that Git could be used independently (or jj or Mercurial or ...). However, GitHub is going to be better than what 99.9% of all professors will put together or be able to use. Sure, you can use Git by itself, but then it needs to go somewhere that the professor can look at it, get submitted to automated testing, etc. That's not a trivial step. My students were happy that I had the "Best Homework Submission System" (said about Subversion of all things ...) because everybody else used the dumbass university enterprise thing that was completely useless (not going to mention its name because it deserves to die in the blazing fires of the lowest circle of Hell). However, it wasn't straightforward for me to put that together. And the probability of getting a professor with my motivation and skill is pretty low.
Agree with your comment about probability, motivation, and skill.
I thought this part, penalizing mistakes made with the help of LLMs more heavily, was quite ingenious.
If you have this great resource available to you (an LLM), you'd better show that you read and checked its output. If there's something in the LLM output you do not understand or cannot check to be true, you'd better remove it.
If you do not use LLMs and just misunderstood something, you will have a (flawed) justification for why you wrote it. If there's something flawed in an LLM answer, the likelihood that you have no justification except "the LLM said so" is quite high, and it should thus be penalized more heavily.
One shows a misunderstanding, the other doesn't necessarily show any understanding at all.
Is it possible, and this is an interesting one to me, that this is the smartest kid in the class? I think maybe.
That guy who is playing with the latest tech, forcing it to do the job (badly), and couldn't care less about university or the course he's on. There's a time and a place where that guy is the one you want working for you. Maybe he's not the number 1 student, but I think there should be some room for this to be the Chaotic Neutral pick.
He might as well be the dumbest guy in the class. Playing with tech is not proof of being smart in itself.