While OpenAI's ChatGPT has become a phenomenon and consistently tops lists of the most widely used and discussed AI applications, Google has been seen as falling behind, even though it pioneered large language models with PaLM, introduced in 2022. At its Google I/O 2023 developer conference, however, the search giant announced a new large language model called PaLM 2, an upgraded version of PaLM and the latest model Google has released in the AI race.
What is PaLM 2?
First, it helps to recall the definition of LLM – the Large Language Model behind the seemingly magical abilities of AI applications.
An LLM is a type of deep learning model that learns from massive datasets to perform recognition, analysis, summarization, translation, text generation, and more. LLMs also support software development, healthcare, scientific research, and many other fields.
Building on these principles, PaLM 2 is introduced as Google's next-generation language model, with superior coding, reasoning, and multilingual capabilities.
Coding:
PaLM 2 was pre-trained on a large volume of publicly available source code. As a result, it excels at popular programming languages such as Python and JavaScript, and it can also generate specialized code in languages such as Prolog, Fortran, and Verilog.
Reasoning:
PaLM 2's wide-ranging training data includes scientific papers and web pages containing mathematical expressions. As a result, it demonstrates improved abilities in logic, common-sense reasoning, and mathematics.
Multilingual translation:
PaLM 2 is trained more heavily on multilingual text, significantly improving its ability to understand language nuances such as idioms, poems, and riddles.
Google offers PaLM 2 in four sizes: Gecko, Otter, Bison, and Unicorn.
How is PaLM 2 built?
Although the training data has not been made public, Google confirms that PaLM 2 was trained on 3.6 trillion tokens – nearly five times more than PaLM – allowing it to write code, solve harder problems, and generate richer content. Google also trained PaLM 2 on more than 100 natural languages and 20 programming languages, enabling deeper understanding and more accurate translation.
- Fewer parameters, more data: By scaling model size and training data in an optimal ratio, PaLM 2 uses fewer parameters and is smaller than PaLM, reducing cost while delivering better overall performance.
- Multilingual training dataset: PaLM was trained mainly on English text, whereas PaLM 2 draws on a far more diverse corpus spanning hundreds of human and programming languages, as well as mathematical and scientific documents, web pages, and more.
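The "tokens" in figures like 3.6 trillion are the basic units a model reads, typically subword pieces rather than whole words. As a rough illustration only (real models like PaLM 2 use a learned subword tokenizer, not this naive split), a toy Python sketch of splitting text into tokens might look like this:

```python
import re

def toy_tokenize(text):
    """Split text into word and punctuation tokens (illustrative only;
    real LLM tokenizers learn subword units from data)."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("PaLM 2 was trained on 3.6 trillion tokens.")
print(tokens)       # note "3.6" splits into three tokens here
print(len(tokens))  # → 11
```

Even this crude example shows why token counts exceed word counts: punctuation and sub-parts of words each count separately.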
Which Google services is PaLM 2 integrated into?
At the I/O conference, Google announced more than 25 new products and features powered by PaLM 2. These include:
- Chatbot Bard: Improved multilingual processing allows Bard to expand to new languages.
- Workspace services such as Gmail, Docs, Sheets, and Slides can all leverage PaLM 2 to improve efficiency and processing speed.
- PaLM API: Gives developers access to PaLM 2 for building general-purpose AI applications.
Many other applications in fields such as healthcare and security have been, or are being, built on PaLM 2, including Med-PaLM 2 and Sec-PaLM.
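For developers, the PaLM API exposed PaLM 2 over a simple REST interface. The sketch below builds the JSON body for a text-generation request; the endpoint URL, model name (`text-bison-001`), and field names are assumptions based on the publicly documented API at launch and should be verified against the current reference:

```python
import json

# Assumed endpoint shape from the public PaLM API docs at launch
API_URL = ("https://generativelanguage.googleapis.com/v1beta2/"
           "models/text-bison-001:generateText")

def build_generate_text_request(prompt, temperature=0.7, candidates=1):
    """Return the JSON body for a generateText call (field names assumed)."""
    return {
        "prompt": {"text": prompt},
        "temperature": temperature,
        "candidateCount": candidates,
    }

body = build_generate_text_request("Summarize PaLM 2 in one sentence.")
print(json.dumps(body, indent=2))
```

A real call would POST this body to `API_URL` with an API key; the sketch stops at constructing the payload so it can be read without credentials.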
Although Google strives to lead the booming AI race, it cannot be denied that any new technology carries the potential for errors and comes with limitations. PaLM 2, like all current AI products, serves a supportive and creative role and has not yet been governed by standards that protect jobs and creators. Many issues remain to be addressed before Google in particular, and the AI industry in general, can confidently assert their products' position in the digital market.