Announcing ARC-AGI-3 - A benchmark that tests if AI can explore, learn, and adapt in unfamiliar situations. Humans score 100%. Frontier AI scores 0.26%.

brianpeiris@lemmy.ca · edited 4 days ago

mechoman444@lemmy.world · 1 day ago

No. That is not what the analogy means. That is what you are choosing to extract from it because it supports the direction you want this exchange to go.

The use of the word “regurgitate” carries a very specific implication. It suggests that LLMs retrieve and repeat stored information verbatim. That is not how they function. We both appear to agree on that point.

LLMs do not rely on stored facts in the way the analogy implies. They generate outputs by modeling patterns in data, producing responses that are often novel rather than retrieved.

Whether or not the model understands or comprehends the content is irrelevant to this distinction. Comprehension is not a requirement for the system to function. So yes, the analogy is overly simplistic and ignores the actual mechanism at work.

To be precise: it does not matter that the model lacks awareness or understanding. It is still capable of analyzing patterns and generating new outputs from its training data. That is not regurgitation.

As concisely as I can: LLMs do not regurgitate data, so the analogy fails.

Announcing ARC-AGI-3 | ARC Prize