SAN FRANCISCO, DECEMBER 6, 2023 - Google DeepMind marked its latest advance in the AI industry on Wednesday with the announcement of Gemini. The tech giant describes the new technology as a multimodal model, able to understand text, image, audio, and video prompts. The first version of Gemini will be available in three sizes, Ultra, Pro, and Nano, scaled to the complexity of the given tasks.
The launch of the new AI assistant is widely seen as a direct challenge to OpenAI's ChatGPT and its latest underlying model, GPT-4. Gemini's developers claim it produces better responses to prompts than any other available AI system. The challenge isn't confined to GPT-4 but extends to other AI systems as well: the Ultra model of Gemini is positioned as the most capable of the three and, according to Google, outperforms existing state-of-the-art models.
Demis Hassabis, CEO and co-founder of Google DeepMind, said, “Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research. It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video.”
As per Google's announcement, Gemini Ultra is still undergoing external trust and safety evaluations. However, the Pro version can already be accessed publicly through the AI chatbot Bard, and Google's latest smartphone, the Pixel 8 Pro, will run Gemini Nano on-device. Beyond that, developers will be able to use Gemini Pro via API starting 13 December, and Android developers will be able to build with Gemini Nano on supported devices.
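For developers curious what that API access could look like, here is a minimal sketch assuming Google exposes Gemini Pro through its google-generativeai Python package under the model identifier "gemini-pro"; the API key and prompt below are placeholders:

```python
# Minimal sketch of calling Gemini Pro via Google's API.
# Assumes the `google-generativeai` package (pip install google-generativeai)
# and that the Pro model is available under the identifier "gemini-pro".

import google.generativeai as genai

# Authenticate with an API key (placeholder; issued via Google's developer tools).
genai.configure(api_key="YOUR_API_KEY")

# Instantiate the Gemini Pro model and send a plain-text prompt.
model = genai.GenerativeModel("gemini-pro")
response = model.generate_content("Summarize what makes an AI model multimodal.")

# The generated text is returned on the response object.
print(response.text)
```

The same pattern would extend to multimodal prompts by passing images alongside text, though exact capabilities will depend on which model sizes Google makes available through the API.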
Can Gemini beat GPT-4?
As per Google's introductory statement, Gemini represents the company's most capable and general AI model to date, able to take prompts of virtually any type and return useful results. Google reports that Gemini Ultra is the first model to outperform human experts on Massive Multitask Language Understanding (MMLU), a prominent benchmark for evaluating AI problem-solving abilities across subjects. Where GPT-4 scored 86.4%, Gemini Ultra achieved 90.0% on the text-based MMLU evaluation.
Furthermore, as a multimodal system, Gemini Ultra scored 59.4% on the image-understanding MMMU benchmark, whereas OpenAI's GPT-4V, its multimodal counterpart, reached 56.8% on the same evaluation. These figures suggest the new model can outperform GPT-4 and other AI systems at solving complicated tasks efficiently.