Alibaba recently introduced a reasoning-focused artificial intelligence (AI) model dubbed Marco-o1. The model is similar to the QwQ-32B large language model, which is also optimised for tasks requiring advanced reasoning capabilities. One important distinction, however, is that Marco-o1 is a smaller model, built on the Qwen2-7B-Instruct model. The Chinese tech giant claimed that several fine-tuning exercises were used to make the new model reasoning-focused. Additionally, the researchers highlighted that it is optimised for complex real-world problem-solving tasks.
The new AI model is detailed in a research paper published on arXiv, an online pre-print repository; notably, papers published there are not peer-reviewed. Alibaba has also hosted the AI model on Hugging Face, where it can be downloaded and used for personal and commercial purposes under the Apache 2.0 licence.
However, it is not fully open source, as only a partial dataset has been released. As a result, users cannot fully replicate the model or break it down to analyse its architecture and components.
Marco-o1 itself is fine-tuned from the Qwen2-7B-Instruct foundation model. In the paper, the researchers highlighted that the AI model is powered by chain-of-thought (CoT) fine-tuning, Monte Carlo Tree Search (MCTS), reflection mechanisms, and other reasoning strategies.
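To give a sense of what chain-of-thought output from such a model looks like in practice, here is a minimal, illustrative sketch that loads a Hugging Face checkpoint with the transformers library and prompts it to reason step by step. The repository identifier and the prompt wording below are assumptions made for illustration, so the model card on Hugging Face should be consulted for the exact details.

```python
# Minimal sketch: prompting a chain-of-thought-tuned model with Hugging Face transformers.
# The repo id "AIDC-AI/Marco-o1" is assumed for illustration; check the model card for the real one.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AIDC-AI/Marco-o1"  # assumed identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Ask the model to show its intermediate reasoning before giving a final answer.
messages = [{"role": "user",
             "content": "A train covers 120 km in 1.5 hours. What is its average speed? Think step by step."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

# A generous token budget leaves room for the reasoning trace as well as the answer.
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```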
Thanks to these techniques, Alibaba's Marco-o1 can tackle open-ended questions and resolve queries “where clear standards are absent and rewards are challenging to quantify.” It should be understood, however, that these advanced reasoning abilities do not come from any hardware or architectural advancement.
Instead, reasoning models of this kind rely on a technique known as test-time compute, which lets an AI model spend more processing time on a single query. This gives the model room to explore different solution paths and fact-check its own answers before responding, which is why such models are geared towards providing more accurate responses and completing complex tasks. One area where Marco-o1 excels, as per the researchers, is understanding colloquial nuances and translating slang expressions.
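As a generic illustration of the test-time compute idea (not the specific MCTS-based search described in the paper), the sketch below samples several independent reasoning paths from a hypothetical generate_answer() helper and keeps the final answer the paths agree on most often, trading extra inference time for reliability.

```python
# Generic sketch of test-time compute via majority voting over sampled reasoning paths.
# generate_answer() is a hypothetical helper that samples one chain of thought and
# returns (reasoning_text, final_answer); it is not part of any Marco-o1 API.
from collections import Counter
from typing import Callable, Tuple

def self_consistent_answer(generate_answer: Callable[[str], Tuple[str, str]],
                           question: str,
                           num_samples: int = 8) -> str:
    """Spend extra inference-time compute by sampling several reasoning paths."""
    finals = []
    for _ in range(num_samples):
        _reasoning, final = generate_answer(question)  # each call explores a fresh reasoning path
        finals.append(final)
    # The majority answer acts as a simple cross-check between the sampled paths.
    return Counter(finals).most_common(1)[0][0]
```

More sophisticated schemes, such as the MCTS used in Marco-o1, guide this extra compute step by step during generation rather than only voting on final answers.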
The researchers also acknowledged a limitation of the AI model: while Marco-o1 shows reasoning characteristics, “its performance still falls short of a fully realised” reasoning model.