Credit: VentureBeat made with Midjourney
The tsunami of new generative AI product news is showing no signs of letting up: Fresh on the heels of OpenAI’s expansion of Code Interpreter to all ChatGPT Plus users and Anthropic’s announcement of Claude 2, Google is taking the spotlight back with two big AI announcements this week. The first is a massive update to its large language model (LLM) product Bard, enabling users to upload images and have Bard analyze them. The second is the unveiling of Google NotebookLM, an AI-powered note-taking service in limited availability.
Bard goes global and visual
First up, the updates to Bard. For a while after OpenAI released ChatGPT in November 2022, it seemed like Google was racing to play catch-up with its AI efforts.
But the annual Google I/O conference in May 2023 changed all that, with CEO Sundar Pichai and other executives and presenters saying the words “generative AI” more than 140 times during the two-hour-long keynote presentation like it was some sort of magical incantation for business success.
Clearly, the search and web giant was wholeheartedly embracing the tech trend that has swept Silicon Valley and the global tech industry. Though Bard has failed to reach the same user numbers as ChatGPT since its wide release at the same I/O event, its usage has been growing more dramatically of late, and the new updates announced today may help further that trend.
A Google blog post published today, authored by Jack Krawczyk, Bard’s product lead, and Amarnag Subramanya, VP of engineering for Bard, outlines a flurry of new features for the language model, including:
- Availability in “most of the globe,” and support for user prompts in 40 languages including Arabic, Chinese, German, Hindi and Spanish. Bard is also accessible in many new locations such as Brazil and Europe.
- Bard can speak its responses in 40 languages, which could be particularly beneficial for learning pronunciation.
- There are five new modes users can switch between for the types of responses they want Bard to provide: simple, long, short, professional or casual. What’s the difference? Google offers this example: “You can ask Bard to help you write a marketplace listing for a vintage armchair, and then shorten the response using the drop-down.” The feature is available only in English to start, but Google says other languages will follow.
- Four new features have been launched to enhance productivity: Users can pin and rename conversations with Bard; export Python code to Replit as well as Google Colab; share responses with their network via shareable links; and use images in their prompts with the help of Google Lens integration. The pinning in particular seems generally helpful, as it allows the user to save selected responses from Bard conversations off to the left side of the interface window for easy access later (instead of scrolling all the way up or down to find them).
- Finally, following up on a promise made at I/O, Bard now integrates with Google Lens, the tech giant’s image recognition technology, allowing users to include images in their prompts. Whether you need more information about an image or require assistance with creating a caption, Bard can analyze the uploaded image to assist. As of the time of the blog post, this feature is available in English, with plans to expand it to other languages soon. However, on Reddit, one user already successfully used Bard to solve a Google image CAPTCHA (“select all the squares with traffic lights”), adding an interesting twist to a world where the line between humanity and artificial intelligence is becoming increasingly blurry.
The future of note taking?
Yesterday, Google also revealed that another I/O announcement had graduated from internal development and use to limited public availability.
Introduced as “Project Tailwind” back at I/O, the service has since been renamed NotebookLM (the “LM” is short for “language model”). It’s a more fitting name for the goal of this service: re-inventing the age-old practice of taking notes.
As Google’s self-described “small” NotebookLM team sees it, note-taking can be improved from the standard scribblings on paper or typing in the Apple Notes app by automatically analyzing and finding connections among many disparate notes and documents and summarizing these in a clear, easy-to-read guide. NotebookLM can go even further and answer user questions about their notes and documents in a conversational style, or even help users create new content.
“As we’ve been talking with students, professors and knowledge workers, one of the biggest challenges is synthesizing facts and ideas from multiple sources,” wrote Raiza Martin, product manager at Google Labs, and Steven Johnson, editorial director of Google Labs, in Google’s blog post explaining the service. “You often have the sources you want, but it’s time consuming to make the connections.”
Google’s solution to the problem is to create a “virtual research assistant” that is “grounded” or personalized to the user based on whatever set of documents they select. NotebookLM looks at these documents, pulls together its own guide, and then presents it to the user. The user can then ask the service in a Bard-like text-to-text prompting field for more information about any particular aspect, or for creative ideas based upon the underlying content.
As the Google blog post explains: “A medical student could upload a scientific article about neuroscience and tell NotebookLM to ‘create a glossary of key terms related to dopamine.’ An author working on a biography could upload research notes and make a request like: ‘Summarize all the times Houdini and Conan Doyle interacted.’”
Furthermore, in what may be a boon to YouTube Creators and TikTok influencers, “A content creator could upload their ideas for new videos and ask: ‘Generate a script for a short video on this topic.’”
NotebookLM is available only in the U.S. for now and on a waitlist basis; users in the U.S. can sign up for the waitlist.