The latest version of xAI’s Grok can process images

xAI, the OpenAI competitor founded by Elon Musk, has introduced the first version of Grok that can process visual information. Grok-1.5V is the company's first-generation multimodal AI model, which can process not only text, but also "documents, diagrams, charts, screenshots and photographs." In its announcement, xAI gave a few examples of how the model's capabilities can be used in the real world. You can, for instance, show it a photo of a flow chart and ask Grok to translate it into Python code, get it to write a story based on a drawing and even have it explain a meme you can't understand. Hey, not everyone can keep up with everything the internet spits out.

The new version comes just a couple of weeks after the company unveiled Grok-1.5. That model was designed to be better at coding and math than its predecessor, and to process longer contexts so that it can draw on more sources to better understand certain inquiries. xAI said its early testers and existing users will soon be able to use Grok-1.5V's capabilities, though it didn't give an exact timeline for the rollout.

In addition to introducing Grok-1.5V, the company has also released a benchmark dataset it's calling RealWorldQA. You can use any of RealWorldQA's more than 700 images to evaluate AI models: Each item comes with questions and answers that you can easily verify, but which may stump multimodal models like Grok. xAI claimed its technology received the highest score when the company tested it on RealWorldQA against competitors such as OpenAI's GPT-4V and Google's Gemini Pro 1.5.

This article originally appeared on Engadget at https://www.engadget.com/the-latest-version-of-xais-grok-can-process-images-120025782.html?src=rss



