Today’s AI models started out as fairly basic tools. The user entered a text request, and the neural network processed the input, matched patterns and delivered an answer.
However, over time the technology has matured and evolved, one feature at a time.
Chain of thought models arrived a few years after the transformer architecture from Google revolutionized the AI and machine learning marketplace.
Chain of thought (aka CoT) models are an evolution of some early prompt engineering techniques that were designed to draw more valuable answers out of the AI for the user.
The user would make a request to the AI, but include in the prompt something along the lines of ‘show me a step by step output of the thinking you used to arrive at your answer’.
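As a rough illustration of that early trick, here is a minimal Python sketch. The helper name `with_step_by_step` and the sample question are made up for this example, and the resulting text would be sent to whichever AI system the user happened to be working with.

```python
# Hypothetical helper (not from any library): appends a "show your thinking"
# instruction to an ordinary question before it is sent to a model.
STEP_BY_STEP_INSTRUCTION = (
    "Show me a step by step output of the thinking you used to arrive at your answer."
)


def with_step_by_step(question: str) -> str:
    """Return the question followed by the step-by-step instruction."""
    return f"{question.strip()}\n\n{STEP_BY_STEP_INSTRUCTION}"


# Illustrative question only; any request could be wrapped the same way.
prompt = with_step_by_step("A shop sells pens at 3 for $2. How much do 12 pens cost?")
print(prompt)
```

Nothing about the model changes here. The only difference is the extra sentence in the prompt, which is why the technique was so easy to try.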
Google to the rescue again
The improvement in responses from this simple prompt technique was so obvious that researchers started to investigate how to build the concept into their base models.
What we now understand as CoT models were introduced in 2022 in a Google research paper, which attempted to overcome some of the limitations of conventional LLMs.
These older models struggled with complex reasoning, especially in areas such as logic and advanced mathematics.
By seeding models with examples of how to approach these kinds of problems, the Google researchers showed that chain of thought could effect a radical improvement in AI system results.
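Seeding a model with examples can be pictured as placing a few worked problems ahead of the real question. The sketch below is one plausible way to assemble such a prompt in Python. The example problems, the `Q:`/`A:` layout and the function name `build_few_shot_prompt` are assumptions chosen for illustration, not the exact format the researchers used.

```python
# Hypothetical worked examples: each pairs a question with written-out reasoning.
EXEMPLARS = [
    {
        "question": "Tom has 3 boxes of 4 apples. He eats 2. How many are left?",
        "reasoning": "3 boxes of 4 apples is 3 * 4 = 12 apples. Eating 2 leaves 12 - 2 = 10.",
        "answer": "10",
    },
    {
        "question": "A train covers 60 km in 1 hour. How far does it go in 3 hours?",
        "reasoning": "The speed is 60 km per hour. In 3 hours it covers 60 * 3 = 180 km.",
        "answer": "180 km",
    },
]


def build_few_shot_prompt(exemplars: list[dict], new_question: str) -> str:
    """Place the worked examples before the new question, leaving its answer open."""
    parts = [
        f"Q: {ex['question']}\nA: {ex['reasoning']} The answer is {ex['answer']}."
        for ex in exemplars
    ]
    # The final "A:" is left blank so the model continues in the same style.
    parts.append(f"Q: {new_question}\nA:")
    return "\n\n".join(parts)


few_shot_prompt = build_few_shot_prompt(
    EXEMPLARS, "A baker makes 5 trays of 12 rolls and sells 20. How many remain?"
)
print(few_shot_prompt)
```

Because every example answer spells out its reasoning, the model is nudged to write out its own steps before committing to a final answer.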
Once introduced, the concept gained ground rapidly, with further functionality such as multimodality and self-seeding adding to the value of the process. In effect, these models became smarter just by processing data in a more structured way.
By explicitly generating intermediate reasoning steps before producing the final output, chain of thought models can now handle significantly more advanced requirements.
These models no longer rely on hard-coded rules but learn reasoning patterns, so they can approach each requirement with more flexibility. This allows them to handle tasks that are ambiguous or incomplete, which earlier classical AI models found problematic.
A milestone technology
Chain of thought models largely superseded traditional methods of AI model design, but are now themselves increasingly being replaced by full reasoning or thinking models.
Full-blown reasoning models don’t just run step by step through a reasoning process. They can also re-evaluate each step, look for new options on the fly and generally conduct the kind of internal dialogue that mimics the human decision-making process.
There is, of course, a price to pay for this more complex AI processing, and it shows up as longer compute time and higher costs.
The more time it takes to evaluate and output an answer, the higher the cost of the computation that’s needed compared to classic AI processes.
For this reason, thinking models are typically confined to higher-end, higher-priced AI systems, although in recent months the technology has filtered down to more modest AI models such as the open source DeepSeek product from China.
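The cost relationship described above comes down to simple arithmetic, since the reasoning steps are extra output the system has to generate. The Python sketch below uses invented numbers throughout: the price of $10 per million output tokens and the token counts are hypothetical and not taken from any real provider.

```python
def output_cost(output_tokens: int, price_per_million_tokens: float) -> float:
    """Cost in dollars of generating the given number of output tokens."""
    return output_tokens / 1_000_000 * price_per_million_tokens


# All figures below are hypothetical, chosen only to show the scaling.
PRICE_PER_MILLION = 10.0   # dollars per million output tokens (assumed)
ANSWER_TOKENS = 50         # length of a short direct answer (assumed)
REASONING_TOKENS = 950     # extra tokens spent on intermediate steps (assumed)

direct_cost = output_cost(ANSWER_TOKENS, PRICE_PER_MILLION)
reasoned_cost = output_cost(ANSWER_TOKENS + REASONING_TOKENS, PRICE_PER_MILLION)

print(f"direct answer:  ${direct_cost:.4f}")
print(f"with reasoning: ${reasoned_cost:.4f}")
print(f"cost ratio:     {reasoned_cost / direct_cost:.1f}x")
```

Under these assumptions the reasoning run costs many times more than the direct answer, even though the final answer is the same length. Real pricing and token counts will vary widely from one system to another.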
The importance of CoT
By any measure, chain of thought tech represented a quantum leap over the artificial intelligence systems of the past.
The ability of these models to handle far more complex and difficult tasks, such as scientific research, financial trading and advanced mathematics, continues to drive the utility of AI well beyond that of the original early models.
While chain of thought and its cousin, full reasoning, are not suitable for every application, especially those that require time-sensitive action, the overall progress towards true artificial general intelligence will undeniably require that this sophisticated technology be part of the package.