Multimodal Text/Images

Apple AI research shows how MLLMs understand, generate, search for images

Apple's researchers continue to focus on multimodal LLMs, with studies exploring their use for image generation, ...

New Apple model combines vision understanding and image generation with impressive results

Manzano combines visual understanding and text-to-image generation, while significantly reducing performance or quality trade-offs.

8h on MSN

Zhipu AI breaks US chip reliance with first major model trained on Huawei stack

Zhipu claims GLM-Image achieved industry-leading scores among open-source models for text rendering and Chinese character ...

Ars Technica

Farewell Photoshop? Google’s new AI lets you edit images by asking.

There’s a new Google AI model in town, and it can generate or edit images as easily as it can create text—as part of its chatbot conversation. The results aren’t perfect, but it’s quite possible ...

Tech Times

Apple Unveils New 'MM1' Multimodal AI Model Capable of Interpreting Images, Text Data

Apple has revealed its latest development in artificial intelligence (AI) large language models (LLMs), introducing the MM1 family of multimodal models capable of interpreting both images and text data.

SiliconANGLE

Writer announces Palmyra-Vision, a multimodal LLM capable of understanding images

Generative artificial intelligence startup Writer Inc. today announced the introduction of Palmyra-Vision, an AI large language model capable of text and visual understanding that can analyze images ...

VentureBeat

ChatGPT goes multimodal: now supports voice, image uploads

After unveiling its newest image generation model DALL-E 3 with support ...

VentureBeat

Patronus AI’s Judge-Image wants to keep AI honest — and Etsy is already using it

Patronus AI announced today the launch of ...

Scientific American

The Latest AI Chatbots Can Handle Text, Images and Sound. Here’s How

Slightly more than 10 months ago OpenAI’s ChatGPT was first released to the public. Its arrival ushered in an era of nonstop headlines about artificial intelligence and accelerated the development of ...

TechCrunch

Gemini 2.0, Google’s newest flagship AI, can generate text, images, and speech

Google’s next major AI model has arrived to combat a slew of new offerings from OpenAI. On Wednesday, Google announced Gemini 2.0 Flash, which the company says can natively generate images and audio ...

InfoQ

Multi-Modal LLM NExT-GPT Handles Text, Images, Videos, and Audio

