aesthesia 1 hours ago [-]
From the judge prompt in the paper:
> Papers asking whether LLMs have such properties are assuming them (e.g., ‘Do LLMs have musical talent’, ‘Do LLMs present empathy’, etc).
This seems like...a very bad definition of "assuming" something? If I ask "do you know how to play the guitar?" I am absolutely not assuming that you know how to play the guitar!
avianlyric 47 minutes ago [-]
Isn’t the entire paper trying to point out that the second you ask the question “Do LLMs have <anthropomorphic property X>”, you have to assume that they do, even before you make any assessment?
Just because the person asking the question isn’t aware that they’re implicitly making that assumption doesn’t change the fact that a logical assumption has been made. It just makes the questioner ignorant of the assumptions they’re making.
Personally, I don’t totally understand the argument being made in the paper. But I can understand the idea that I can ask a question without properly understanding the assumptions I’m making when asking it. Indeed, I can also understand that I might not even notice the assumptions I’ve made with my question, and why that would make my entire exploration and conclusion invalid, _after_ doing the investigation. Logical fallacies can be really difficult to spot and understand.
solid_fuel 2 hours ago [-]
> In-game constructions of NAND gates and a perceptron (forward prop and training) as described in 'If LLMs Have Human-Like Attributes, Then So Does Age of Empires II'.
Interesting concept
> We begin by proving that Age of Empires II is functionally- and Turing- complete. Then we build a perceptron and a circuit to train it in-game. With that, we argue that changing the substrate (representation) of an LLM also alters the perception of their attributes.
This is fun, but I don't think it's particularly surprising. A substrate being Turing-complete alone is enough evidence that you can train and run a perceptron on it, assuming the available memory is sufficient.
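To make that point concrete, here is a rough sketch in Python (not the in-game construction, and not the circuit from the paper). It assumes nothing beyond a NAND primitive: the other gates are composed from it, and a single perceptron is trained with the classic perceptron rule to reproduce NAND itself. All function names are made up for illustration.

```python
from itertools import product

def nand(a: int, b: int) -> int:
    """The only primitive we assume the substrate provides."""
    return 1 - (a & b)

# Functional completeness: every other gate is a composition of NANDs.
def not_(a: int) -> int:
    return nand(a, a)

def and_(a: int, b: int) -> int:
    return not_(nand(a, b))

def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))

def xor_(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def predict(weights, bias, x1: int, x2: int) -> int:
    """Forward pass of a single perceptron with a step activation."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0

def train_perceptron(samples, epochs: int = 20):
    """Perceptron learning rule with integer weights and learning rate 1."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), target in samples:
            delta = target - predict(weights, bias, x1, x2)
            if delta != 0:
                weights[0] += delta * x1
                weights[1] += delta * x2
                bias += delta
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: training is done
            break
    return weights, bias

# Training data: the NAND truth table (linearly separable, so this converges).
nand_samples = [((a, b), nand(a, b)) for a, b in product((0, 1), repeat=2)]
weights, bias = train_perceptron(nand_samples)

if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, "->", predict(weights, bias, a, b))
```

Nothing here depends on what the NAND is made of, which is the whole point: swap the Python function for units walking around a map and the arithmetic is unchanged.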
> We then show that research in LLM anthropomorphic attributes cannot be done starting by assuming that these attributes exist (or not) in the system; even if you aim to conclude that they do not exist. This assumption can happen even when you do not make it explicitly! It also shows that there are ways to do good, sound research without needing to make that assumption.
I... don't see how this follows? I wanted to see how this argument unfolded, but it seems the arxiv link on this page is broken? It just links to arxiv.org and the rest of what is on this linked page doesn't seem to cover this second assertion at all.
ecshafer 1 days ago [-]
Age of Empires II had a creative map editor, where you could "program" via triggers and effects. It wasn't as in-depth as the Blizzard games, where you could write code, but it was easier to use. You could make a trigger (e.g. units in this area, time passed, number of units on the field, build a building, etc.) and then an effect (e.g. spawn unit, move unit, kill something, etc.). This was used in custom maps to make all sorts of fun games. Or, like here, you can make a NAND gate by moving units around.
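The trigger-then-effect idea can be sketched in a few lines of Python. This is only a toy model under my own assumptions, not the actual map editor format or the construction used in the linked project: `Trigger`, `run_triggers`, the area names, and the "output unit" are all hypothetical. The gate's output is a unit that stays alive unless both input areas are occupied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]  # toy map state: 1 = unit present/alive, 0 = absent/dead

@dataclass
class Trigger:
    """A condition on the map state paired with an effect to apply when it holds."""
    condition: Callable[[State], bool]
    effect: Callable[[State], None]

def run_triggers(state: State, triggers: List[Trigger]) -> State:
    """Check each trigger once and fire the effect of every one whose condition holds."""
    for trigger in triggers:
        if trigger.condition(state):
            trigger.effect(state)
    return state

def kill_output_unit(state: State) -> None:
    """Effect: remove the unit that represents the gate's output."""
    state["output_unit_alive"] = 0

def nand_gate(unit_in_area_a: int, unit_in_area_b: int) -> int:
    """NAND: the output unit survives unless BOTH input areas contain a unit."""
    state: State = {
        "area_a": unit_in_area_a,
        "area_b": unit_in_area_b,
        "output_unit_alive": 1,  # output starts "high"
    }
    triggers = [
        Trigger(
            condition=lambda s: s["area_a"] == 1 and s["area_b"] == 1,
            effect=kill_output_unit,
        )
    ]
    return run_triggers(state, triggers)["output_unit_alive"]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(f"area_a={a} area_b={b} -> output={nand_gate(a, b)}")
```

Chaining gates would mean having one gate's effect move a unit into the next gate's input area, which is roughly what "moving units around" amounts to.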
I need to try this. Age of Empires II was never really on my radar until I recently learned its engine is the basis for another game I'm a fan of - Star Wars: Galactic Battlegrounds. It's one of two RTS games released in 2001 that I've spent a lot of time on, with the other one being Emperor: Battle for Dune.
ecshafer 1 days ago [-]
Emperor: Battle for Dune is impossible to find nowadays. It was a fun game though. Same with SW: Galactic Battlegrounds. Short of piracy, you can't get them.
evanjrowley 1 days ago [-]
Good news! Galactic Battlegrounds Saga is available on both GOG and Steam. :)
bbor 1 hours ago [-]
The actual paper is linked above, and of course it’s bad. The gates are awesome ofc, but the paper’s philosophy is arrogant and uninformed (sorry Mr. Wynter!). And that’s what this is — including a video game example in your philosophy paper doesn’t make it a CS paper!
Basically it uses the cool gates alongside vacuous statements like this…
> Hence, the purported anthropomorphic attributes of LLMs are empirically non-unique: although some properties (e.g., responses to prompts) could remain invariant, others, such as the interpretation of their perceived behaviour, might change with the substrate.
…to disguise the underlying dogma, which serves as an unsupported conclusion: humans are assumed to be completely entirely unique in every way whatsoever, and any equations of parts of our wonderful ensouled meat sacks to parts of the wicked language machines must be supported by a proof that A != A.
Which, y’know… is a tough one!
avianlyric 18 minutes ago [-]
> disguise the underlying dogma, which serves as an unsupported conclusion: humans are assumed to be completely entirely unique in every way whatsoever
Is that the argument the paper is making? In my reading they seem to primarily be making the point that assigning anthropomorphic concepts to LLMs is dangerously misleading, and more importantly, not needed to properly study and evaluate LLMs.
I don’t think you have to make the assumption that humans are unique for that argument to hold up. I would argue that really it’s a comment on how loose and poorly defined all anthropomorphic attributes are. At the end of the day we have to make the assumption that other humans feel and experience broadly the same mental activity as we do, because we’ll never directly experience anyone else’s consciousness; we can only experience our own.
We can barely link our own mental experiences to concrete empirical measurements. The vast majority of the measurements we make are entirely self-reported, and we simply assume strong correlation between self-reported measurements and the individual’s actual experiences. We also have to assume that somehow all of our self-reported measurements are “calibrated” to some reasonable degree. Even measuring anthropomorphic properties in humans is pretty fuzzy and inaccurate; the only reason we accept such poor data is that it’s the best we’ve got, and there’s enough signal in there for us to develop useful tools like talking therapy, physiological profiles, mental health scores etc. which have some level of predictive and healing power when applied to _humans_.
It’s honestly amazing that what we have works for measuring and predicting humans, and we only know that it works through decades of empirical measurement and study. But to then try and directly apply that fuzzy mess to a completely different system, and just assume the same level of predictive power, strikes me as kinda crazy. It requires making huge assumptions that effectively can never be tested (because even the human mind is a total mystery to us). If we can study these systems without making those assumptions, then why make the assumptions at all?