Lemvi,

LLMs are absolutely complex, neural nets ARE somewhat modelled after human brains after all, and trying to understand transformers or LSTMs for the first time is a real pain. However, what an NN can do, or rather what it has been trained to do, depends almost entirely on the loss function used. The complexity of the architecture and the training dataset don't change what an LLM can do, only how well it does it and how well it generalizes.
The loss function used to train LLMs simply evaluates whether the predicted tokens fit the actual ones. That means an LLM is trained to predict words from other words, or in other words, to model language.
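To make that concrete, here's a minimal sketch of that next-token loss, assuming a PyTorch-style setup (the tensor shapes and random values are just placeholders, not anyone's actual training code):

```python
import torch
import torch.nn.functional as F

vocab_size = 50_000          # hypothetical vocabulary size
batch, seq_len = 2, 16       # hypothetical batch of token sequences

# token_ids: the actual text, encoded as integer token IDs
token_ids = torch.randint(0, vocab_size, (batch, seq_len))

# logits: the model's prediction for the next token at every position
# (in practice these come from the transformer; random here for illustration)
logits = torch.randn(batch, seq_len, vocab_size)

# Shift by one: the prediction at position t is scored against token t+1
pred = logits[:, :-1, :].reshape(-1, vocab_size)
target = token_ids[:, 1:].reshape(-1)

# Cross-entropy: "did the predicted token distribution fit the actual token?"
# Nothing here checks whether the text is logically valid.
loss = F.cross_entropy(pred, target)
print(loss.item())
```

The only thing the gradient ever rewards is matching the next token; any notion of correctness or logic has to come along for the ride.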

The loss function does not evaluate the validity of logical statements, though. All reasoning that an LLM is capable of, or seems to be capable of, emerges from its modelling of language, not from an actual understanding of logic.
