thesz 13 hours ago [-]
They treated LTL formulas as non-reduced binary decision diagrams (BDDs; binary trees, in essence) to prove that transformers are exponentially more succinct than LTL. Were reductions allowed in LTL expressions, that exponential advantage of transformers would vanish in a puff of smoke, because the reduced ordered binary decision diagrams (ROBDDs) of the addition circuits (used to construct their exponential LTL example) are polynomially sized.
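To make the ROBDD claim concrete, here is a toy sketch (my own, in plain Python; not the paper's or the poster's construction) that builds the carry-out bit of an n-bit ripple adder as a hash-consed ROBDD under the usual interleaved ordering x0, y0, x1, y1, ... and prints the node count, which grows linearly in n rather than exponentially:

  unique = {}  # node table: (var, lo, hi) -> node, so equal subgraphs are stored once

  def mk(var, lo, hi):
      if lo == hi:  # ROBDD reduction rule: drop a test whose branches agree
          return lo
      return unique.setdefault((var, lo, hi), (var, lo, hi))

  def apply_op(op, u, v, memo=None):
      # Shannon expansion on the earliest variable; terminals are the ints 0 and 1.
      memo = {} if memo is None else memo
      if isinstance(u, int) and isinstance(v, int):
          return op(u, v)
      if (u, v) in memo:
          return memo[u, v]
      vu = u[0] if isinstance(u, tuple) else float("inf")
      vv = v[0] if isinstance(v, tuple) else float("inf")
      var = min(vu, vv)
      u0, u1 = (u[1], u[2]) if vu == var else (u, u)
      v0, v1 = (v[1], v[2]) if vv == var else (v, v)
      r = mk(var, apply_op(op, u0, v0, memo), apply_op(op, u1, v1, memo))
      memo[u, v] = r
      return r

  def AND(u, v): return apply_op(lambda a, b: a & b, u, v)
  def OR(u, v):  return apply_op(lambda a, b: a | b, u, v)
  def XOR(u, v): return apply_op(lambda a, b: a ^ b, u, v)

  def carry_out(n):
      # Ripple-carry recurrence c' = (x /\ y) \/ (c /\ (x xor y)),
      # with variables interleaved: x_i -> 2i, y_i -> 2i+1.
      c = 0
      for i in range(n):
          x, y = mk(2 * i, 0, 1), mk(2 * i + 1, 0, 1)
          c = OR(AND(x, y), AND(c, XOR(x, y)))
      return c

  def size(u, seen):
      if isinstance(u, int) or u in seen:
          return 0
      seen.add(u)
      return 1 + size(u[1], seen) + size(u[2], seen)

  for n in (4, 8, 16, 32):
      print(n, size(carry_out(n), set()))  # node count grows linearly with n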
How to add reductions to LTL? Allow (parametric) definitions of subformulas. E.g., "let p = ... in xUp/\yUp".
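A hypothetical sketch of what such definitions buy, in the same notation (phi stands for any subformula, X is "next"):

  let p1 = phi in
  let p2 = p1 /\ X p1 in
  let p3 = p2 /\ X p2 in
  ...
  pn

With definitions the whole thing is O(n) symbols; inlining each p_i into p_{i+1} copies it twice, so the flattened formula has size on the order of 2^n. That doubling-by-copying is exactly the redundancy that ROBDD reduction removes.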
Also, note that they construct the transformers by hand; the transformers are not trained. Training on an arbitrary truth table is as hard as one can imagine.
cadamsdotcom 1 day ago [-]
Seems intuitively sound; a larger model would have the ability to differentiate among a larger variety of concepts, which translates to a larger vocabulary and a greater ability to use expressive tools such as imagery, metaphor, and so on.
I could go on, but brevity is virtuous.
yorwba 16 hours ago [-]
None of this has anything to do with the paper, which is concerned with theoretical computer science and constructs artificial "languages" that have a small representation as a(n idealized, theoretical) transformer but whose smallest representation in some other formalisms is much larger. In other words, its conception of succinctness is almost diametrically opposed to the way you appear to have understood it. They looked for small models that produce gigantic but meaningless outputs, not large models that produce short, meaningful text.
cadamsdotcom 14 hours ago [-]
Had another read - you’re absolutely right, thanks for the kind correction and explanation.
pstuart 1 day ago [-]
It makes sense that flowery language is more decorative than functional, but I wonder how much nuance can help shape reckoning, reasoning, and rendering -- if at all.
Flowery language is a powerful tool, but it demands more from both the reader and the writer.
That’s the fundamental flaw in using simple heuristics to evaluate language: the exact same text can be useful or deeply flawed depending on the context. The wider the intended audience, the more sacrifices you need to make.
refulgentis 1 day ago [-]
[flagged]
JDazzle 1 day ago [-]
[flagged]
kennyadam 1 day ago [-]
Please don’t turn this place into Reddit, with a race to the bottom of posting puns or “jokes” that are little more than pointing out hollow pop-culture associations with a word in the title of the submission. It’s already happening more and more, but maybe we could try to keep the level of discourse just slightly higher for just a bit longer, please?
drfloyd51 1 day ago [-]
I put on my robe and wizard hat.
burnte 1 day ago [-]
Megatron goes on for hours. He really needs a system prompt.
Maybe RFC terms are all that's needed: https://datatracker.ietf.org/doc/html/rfc2119