2022 has certainly been an exciting year for Natural Language Processing and Artificial Intelligence. We have seen a number of exciting innovations that have showcased the potential impact NLP systems can have on our society.
In today's article, we make our predictions on what 2023 might bring to our field, covering four trends that are likely to be influential. So, let's get started.
Trend 1 - Further optimisation and commercialisation of large language models (LLMs)
In 2023, the LLM trend is likely to continue, with rumours that GPT-4 is already in the works. We are certainly going to see new attempts at increasing the scale of these models, as well as further optimisation of their training. For example, extensions of the approach of fine-tuning on human feedback that OpenAI introduced for ChatGPT are highly likely.
A large part of today's LLM progress comes from engineering advances, both hardware optimisations and optimisations of the architecture and model format. We are likely to see significant work aimed at improving the efficiency of these models, both during training and during inference, which is crucial for applications. Efficiency can be improved through techniques such as model distillation, introducing sparsity in the network, or reducing the numerical precision of the model.
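To make two of these techniques concrete, here is a minimal sketch in plain Python of reduced precision (symmetric 8-bit quantisation) and sparsity (magnitude pruning) applied to a flat list of weights. The function names and the toy weight values are our own illustrative assumptions; production systems use per-channel scales, calibration data, and optimised kernels that this sketch leaves out.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantisation: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor (an assumption made for simplicity).
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


def prune_by_magnitude(weights, sparsity):
    """Set the smallest-magnitude fraction of the weights to zero."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


if __name__ == "__main__":
    weights = [0.02, -1.27, 0.64, 0.003, -0.31, 0.9]  # toy values
    quantized, scale = quantize_int8(weights)
    restored = dequantize(quantized, scale)
    worst_error = max(abs(a - b) for a, b in zip(weights, restored))
    print("int8 values:", quantized)
    print("worst rounding error:", worst_error)
    print("50% pruned:", prune_by_magnitude(weights, 0.5))
```

The quantised weights need one byte each instead of four (for 32-bit floats), at the cost of a rounding error of at most half the scale per weight. Pruning trades accuracy for zeros that sparse storage formats and kernels can skip.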
Now, although these models are very powerful and the advances we have seen are significant, there is still considerable room for improvement, as we have discussed previously in this blog. Therefore, in the next year, we are going to see a lot of interesting work that will take apart these new LLMs and test their robustness across a wide range of NLP tasks. This work will identify where LLMs can really shine, as well as where there is still work to do.
On the industry front, we are certainly going to see many attempts at commercialising LLMs for increasingly complex applications, such as writing assistants, research assistants, chatbots, and many more. Time will tell which features of LLMs will deliver actual value to users. It will certainly be interesting to watch.
Trend 2 - Parameter-efficient fine-tuning
Another sub-area in which we are likely to see growing interest is parameter-efficient methods for fine-tuning large language models to perform well on unseen or specialised tasks. We will cover two examples.
The first are adapter approaches, which insert a small set of additional trainable parameters into the model while leaving the original weights unchanged. These approaches offer the benefit of being lightweight, while extending the model with the capacity to learn from and remember presented examples. We are likely to see new adapter methods that further optimise how the trainable parameters are distributed across the model to yield maximum effectiveness and efficiency. Furthermore, these methods are likely to be extended to target the largest language models such as GPT-3.
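To illustrate why adapters are lightweight, here is a minimal sketch of one common adapter design, a bottleneck layer with a residual connection, written in plain Python. The class name, the sizes, the omission of bias terms, and the zero initialisation of the up-projection are assumptions we make for illustration, not a description of any particular library.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class BottleneckAdapter:
    """Hypothetical adapter: hidden -> small bottleneck -> hidden, plus residual."""

    def __init__(self, hidden_size, bottleneck_size, seed=0):
        rng = random.Random(seed)
        # Down-projection (bottleneck_size x hidden_size), small random values.
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
                     for _ in range(bottleneck_size)]
        # Up-projection (hidden_size x bottleneck_size), zero-initialised so the
        # adapter starts out as the identity and does not disturb the frozen model.
        self.up = [[0.0] * bottleneck_size for _ in range(hidden_size)]

    def num_parameters(self):
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)

    def forward(self, hidden):
        squeezed = [max(0.0, v) for v in matvec(self.down, hidden)]  # ReLU
        delta = matvec(self.up, squeezed)
        return [h + d for h, d in zip(hidden, delta)]  # residual connection


def trainable_fraction(hidden_size, bottleneck_size):
    """Share of trainable weights when one adapter sits next to one frozen
    hidden_size x hidden_size matrix (biases ignored)."""
    frozen = hidden_size * hidden_size
    added = 2 * hidden_size * bottleneck_size
    return added / (frozen + added)


if __name__ == "__main__":
    adapter = BottleneckAdapter(hidden_size=8, bottleneck_size=2)
    print("adapter parameters:", adapter.num_parameters())
    print("output at initialisation:", adapter.forward([1.0] * 8))
    print("trainable share for 1024/8:", trainable_fraction(1024, 8))
```

With a hidden size of 1024 and a bottleneck of 8, the adapter adds roughly 1.5% extra weights relative to a single frozen 1024 x 1024 matrix, and only those extra weights would be updated during fine-tuning.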
The second approach we will cover is prompt tuning, which, rather than inserting new parameters, optimises the prompts of LLMs for a specific task by improving the instructions and prepending good examples that help to solve the task. This approach has the benefit of being more flexible, because it doesn't require model training. Furthermore, it works very well with the most recent LLMs such as ChatGPT, which are extremely sensitive to the instructions and presented examples. However, this approach increases the memory footprint significantly, since the context passed to the LLM grows to include the extra inputs. Additionally, the number of examples that can be placed in the context of the LLM is limited.
In 2023, we are likely to see significant innovation in prompt tuning, including novel techniques for automatically picking the best examples to include, as well as techniques for learning the best prompts for each task.
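As a rough illustration of picking examples automatically under a limited context, the sketch below ranks candidate examples by word overlap with the query and adds them to the prompt until a token budget is used up. Everything here is an assumption made for illustration: the `Input:`/`Output:` prompt format, the Jaccard similarity measure, and counting whitespace-separated words as a stand-in for a real tokenizer.

```python
def jaccard(a, b):
    """Word-overlap similarity between two strings (0.0 to 1.0)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def count_tokens(text):
    """Crude stand-in for a tokenizer: whitespace-separated words."""
    return len(text.split())


def build_prompt(instruction, examples, query, max_tokens):
    """Assemble instruction + most similar examples + query within max_tokens.

    `examples` is a list of (input_text, output_text) pairs.
    """
    footer = f"Input: {query}\nOutput:"
    budget = max_tokens - count_tokens(instruction) - count_tokens(footer)
    # Most similar examples first.
    ranked = sorted(examples, key=lambda ex: jaccard(ex[0], query), reverse=True)
    chosen = []
    for input_text, output_text in ranked:
        block = f"Input: {input_text}\nOutput: {output_text}"
        cost = count_tokens(block)
        if cost > budget:
            continue  # does not fit; a shorter example further down might
        chosen.append(block)
        budget -= cost
    return "\n\n".join([instruction] + chosen + [footer])


if __name__ == "__main__":
    examples = [
        ("the film was wonderful", "positive"),
        ("the food was terrible", "negative"),
        ("stock prices fell sharply today", "negative"),
    ]
    print(build_prompt("Classify the sentiment of each input.",
                       examples, "the film was really wonderful", max_tokens=25))
```

With a budget of 25 tokens only the most similar example fits. Raising `max_tokens` lets more examples in, which is exactly the trade-off between prompt quality and context length described above.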
Trend 3 - Large multi-modal and multi-task models
In 2023, we are likely to see many more attempts at building large models that learn over multiple tasks and modalities simultaneously, such as text, images, videos, or audio files. There are a number of potential benefits to this approach, such as knowledge transfer across tasks and improved model robustness.
One such model released in 2022 was the Generalist Agent (Gato) by DeepMind, which was able to solve tasks such as image captioning, chat, robot arm manipulation, and more, all within the same architecture.
In 2023, we are likely to see novel approaches that improve the robustness and practical usability of these systems, reducing the gap with single-task and single-modality models, and perhaps even eliminating it.
Ultimately, in the coming years, it is highly likely that all new large AI models will be trained on multiple rather than individual modalities. Having these models would open up new possibilities for solving higher-level cognitive tasks, such as tasks in robotics. We will be able to interact with these models in natural language, similar to LLMs, and the output modality might be different depending on the goal.
Trend 4 - Better synthesis models from text
In 2022, we saw an explosion of text-to-image models. Examples include DALL-E 2 and Stable Diffusion. The speed of development of these systems has been impressive, with new methods sometimes coming out on a weekly basis. We have also seen large commercial interest in deploying these text-to-image models in applications primarily targeting the creative domains, such as art generation.
In 2023, we are likely to see significant improvements to these methods, further increasing their resolution and quality, as well as their capacity to accurately follow instructions. We will see optimisations and industrial applications of these methods that target specialised domains and use cases, such as architecture. Furthermore, we are likely to see new models that take the findings from text-to-image models and extend them to the continuous domain to generate music and videos.
On the video generation front, we saw some initial attempts in 2022. In 2023, we will see increasingly powerful text-to-video models come out that will be able to target use cases such as continuing a video clip, or generating a clip that can be integrated into a video project with little editing. Furthermore, we might see attempts at generating videos that come with an audio track. It will certainly be exciting to see what the future brings.
So there you have it - four natural language processing trends that are likely to be big in 2023 and beyond. We hope you found this article informative, and that you'll have a successful 2023!
If you are looking to develop cutting-edge projects that leverage the state-of-the-art in NLP, we at The Global NLP Lab would be happy to assist you on your AI journey. Get in touch to find out how we could help you!