ChatGPT Killer? What Gemini 1.5 Means for Google’s AI Future

Google vs OpenAI: Is Google Winning?

After missing the mark with Bard during the early AI hype, Google recently unveiled its latest AI product, Gemini. As part of this launch, Bard has been rebranded as Gemini and now incorporates the new Gemini Pro LLM. Let’s take a closer look at what has changed.

What is Gemini AI?

Gemini is Google’s newest Large Language Model (LLM), following the release of LaMDA and PaLM. Unlike its predecessors, Gemini is natively multimodal: it can understand text, images, speech, and code, and it offers improved comprehension and reasoning abilities.

Variants of Gemini AI

The Gemini family consists of three Large Language Models:

Gemini Nano: Optimized for on-device efficiency, delivering rapid AI solutions directly on your personal device.
Gemini Pro: A versatile and scalable model, adept at tackling diverse tasks with robust performance. Accessible on the free version of the Gemini chat interface.
Gemini Ultra: The pinnacle of the Gemini series, empowering complex problem-solving and advancing the frontiers of AI capabilities. Exclusive to subscribers of the Google One AI Premium Plan.

Gemini models were trained using TPUv5e and TPUv4, depending on their sizes and configuration. Training Gemini Ultra used a large fleet of TPUv4 accelerators owned by Google across multiple data centers. This is a significant increase in scale over Google’s prior flagship model, PaLM 2, and it presented new infrastructure challenges.

Comparing Gemini With Other LLMs

Textual Understanding

Source: Google DeepMind

Image Understanding

Source: Google DeepMind

Benefits of Gemini

1. Seamless integration with all Google Apps

Gemini now integrates natively with Google apps, including Maps, YouTube, Gmail, and more. To query a specific app, type ‘@’ followed by the app name, then your query. Similar integrations are achievable on ChatGPT using GPTs and Plugins, but they may not feel as seamless as Gemini’s native integrations.

Gemini Integration

Google’s long-standing expertise in search also carries over to Gemini’s web-browsing capabilities. By building on Google’s strengths in search algorithms and indexing, Gemini offers users a smooth and efficient browsing experience.

2. Multimodal capabilities

Gemini now provides multimodal capabilities, including image understanding, on the Gemini chat interface at no extra cost. Its performance during testing was decent, though it may not match the accuracy of GPT-4V. Nevertheless, given that it’s free, we can’t really complain, can we? 😉 Based on the published metrics, there’s a chance that Gemini Ultra may outperform GPT-4V.

Gemini Multimodal

3. Free Access for Hobbyists and Students

For aspiring LLM developers who want to dive into the field but find the cost of the GPT APIs prohibitive, Google offers free access to the Gemini Pro 1.0 API. The free quota allows up to 60 queries per minute through Google AI Studio, a free web-based developer tool that lets you quickly develop prompts and obtain an API key for app development. To take advantage of the quota, sign in to Google AI Studio with your Google account. It’s an excellent opportunity to kickstart your LLM journey and explore embeddings, vector databases, semantic search, and more.
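To stay within the free quota of 60 queries per minute, a script can throttle its own requests before they reach the API. The snippet below is a minimal sketch of one way to do that with a sliding-window limiter. `RateLimiter` and `acquire` are hypothetical names used for illustration and are not part of any Google SDK, and the actual API call is deliberately left out.

```python
import time
from collections import deque


class RateLimiter:
    """Sliding-window limiter: at most max_calls per period (seconds).

    Hypothetical helper for illustration; not part of any Google SDK.
    """

    def __init__(self, max_calls=60, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls inside the window

    def acquire(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have aged out of the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Window is full: wait until the oldest call ages out.
            self._sleep(self.period - (now - self._calls[0]))
```

Calling `limiter.acquire()` before each request keeps a loop under the quota. The defaults mirror the 60-queries-per-minute figure, so they should be adjusted if the quota changes.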

Google AI Studio

4. Value for Money

For $20 per month, users can access GPT-4 via ChatGPT Plus. For the same price, the Google One AI Premium Plan provides Gemini Advanced with Gemini Ultra 1.0, along with 2TB of cloud storage and integration with Google apps like Gmail and Docs. Gemini Advanced is only available through this plan, but with the bundled extras it offers greater value for your money.

Google One Plans

Introducing a mid-tier plan with 500 GB of storage and access to Gemini Advanced between the Standard and Premium Plans would significantly enhance the accessibility of Gemini, especially for students and users with moderate storage requirements. Google, if you’re listening, please consider this suggestion.

What’s Next for Gemini?

Google DeepMind is continuously advancing the Gemini models, with Gemini 1.5 Pro rolling out just a week ago. In this updated variant, the standard context window has been expanded to 128,000 tokens. Additionally, a select group of developers and enterprise customers can experiment with even larger context windows of up to 1 million tokens through private previews on AI Studio and Vertex AI. To put this into perspective, a typical non-fiction book contains around 300,000 tokens. With Gemini 1.5 Pro’s 1 million token context window, users can upload entire books in a single request, a remarkable advancement compared to GPT-4 Turbo’s 128,000-token context window.
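The comparison above comes down to simple arithmetic, sketched here in a few lines of Python. The snippet takes the figure of roughly 300,000 tokens per book as given, since real token counts vary with the book and the tokenizer. `books_that_fit` and the dictionary labels are hypothetical names chosen for this example.

```python
# Figures quoted in the text; the per-book count is a rough estimate.
TOKENS_PER_BOOK = 300_000
CONTEXT_WINDOWS = {
    "128,000-token window": 128_000,
    "1 million token window": 1_000_000,
}


def books_that_fit(window_tokens, tokens_per_book=TOKENS_PER_BOOK):
    """Return how many whole books fit in a context window."""
    return window_tokens // tokens_per_book


for name, window in CONTEXT_WINDOWS.items():
    print(f"{name}: {books_that_fit(window)} whole book(s)")
```

Under these assumptions, a 300,000-token book does not fit in a 128,000-token window at all, while the 1 million token window holds three such books with room to spare.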

In an AI industry saturated with LLMs, Google appears to have struck gold this time with its enhanced architecture, swift responses, and seamless integration within the Google ecosystem. It could indeed be a step in the right direction, keeping OpenAI and other competitors on their toes.

In this AI era, it is crucial for businesses to have well-trained employees, and incorporating AI into employee training can be a significant investment. If you are seeking AI solutions to train your employees, Cody is the right tool for you. Similar to ChatGPT and Gemini, Cody can be trained on your business data, team, processes, and clients, using your unique knowledge base. Cody is model-agnostic, making it easier for you to switch models as your requirements change.

With Cody, businesses can harness the power of AI to create a personalized and intelligent assistant that caters specifically to their needs, making it a promising addition to the world of AI-driven business solutions.