Microsoft on Tuesday announced a significant update to its artificial intelligence chatbot: visual search. Users can now take or upload a photo to Bing Chat and ask for more information about it through the desktop or Bing apps.
“Bing can understand the context of an image, interpret it, and answer questions about it,” Microsoft wrote in a release. “Whether you’re traveling to a new city on vacation and asking about the architecture of a particular building or at home trying to come up with lunch ideas based on the contents of your fridge, upload the image into Bing Chat and use it to harness the web’s knowledge to get you answers.”
The update comes as the AI arms race heats up among chatbot leaders like Microsoft, Google, OpenAI and Anthropic. In the effort to develop the most advanced generative AI, tech giants are quickly launching new features, aiming to keep up with not only their text-based chatbot competitors, but also image-heavy AI tools.
Although image search and responses that include images are becoming part of the chatbot user experience, none of the leading text-based chatbots seems able to generate its own images yet, unlike tools such as Midjourney, Stable Diffusion and DALL-E 2. However, Google says the feature is on the way for its Bard chatbot.
Microsoft’s decision to allow images in Bing Chat follows Google’s recent debut of an image search feature for Bard, its AI chatbot. Using Google Lens, users can ask Bard for information about an image they’ve uploaded, have it generate a caption, or simply add some zest to the chatbot’s responses, for example by requesting restaurant recommendations that include photos of the restaurants’ interiors. At the time of writing, OpenAI’s ChatGPT does not allow photo uploads, as the chatbot is still completely text-based, and Anthropic’s chatbot, Claude 2, operates similarly.