
Alibaba has introduced a breakthrough technology that could alter how AI systems learn to search for information and significantly reduce costs.
The new tool, ZeroSearch, lets large language models (LLMs) simulate search engine results without connecting to the internet. Instead of calling Google or Bing during training, Alibaba’s method has the AI model itself play the role of a search engine, skipping real-time searches and the hefty API fees they incur.
“Reinforcement learning [RL] training requires frequent rollouts, potentially involving hundreds of thousands of search requests, which incur substantial API expenses and severely constrain scalability,” Alibaba researchers wrote in their paper published on arXiv.
How ZeroSearch works
Rather than pulling real-time data from search engines, ZeroSearch trains an LLM to generate both useful and noisy documents in response to a query. This is done through a lightweight supervised fine-tuning process in which the model learns what high-quality and low-quality search results look like.
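To make the idea concrete, here is a minimal sketch of how a fine-tuned simulation model might be asked for either a helpful or a misleading document. The prompt wording and the generate callable are illustrative assumptions, not Alibaba’s released code:

```python
# Illustrative sketch only: prompting a fine-tuned "simulation LLM" to act as a
# search engine that returns either a useful or a noisy document for a query.

def build_simulation_prompt(query: str, useful: bool) -> str:
    """Ask the simulation model for a helpful or a misleading document (hypothetical prompt)."""
    style = (
        "a relevant, factually helpful document that answers the query"
        if useful
        else "a plausible-looking but noisy, unhelpful document loosely related to the query"
    )
    return (
        "You are simulating a web search engine.\n"
        f"Query: {query}\n"
        f"Write {style}. Keep it under 120 words."
    )

def simulate_search(query: str, useful: bool, generate) -> str:
    """`generate` stands in for any text-generation callable (e.g. the fine-tuned LLM)."""
    return generate(build_simulation_prompt(query, useful))
```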
During training, ZeroSearch uses a “curriculum rollout” strategy: the model is first given clean, easy-to-use information and is gradually exposed to more confusing and messy data, mimicking real-world internet search conditions.
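In practice, such a curriculum can be as simple as a noise probability that rises as training progresses. The linear ramp below is an assumption made for illustration; the paper’s exact schedule may differ:

```python
import random

# Minimal sketch of a curriculum rollout schedule: early training steps mostly
# serve useful documents, later steps mix in more noisy ones.

def noise_probability(step: int, total_steps: int,
                      start: float = 0.1, end: float = 0.7) -> float:
    """Probability of serving a noisy (low-quality) simulated document at this step."""
    frac = min(step / max(total_steps, 1), 1.0)
    return start + frac * (end - start)

def pick_document_quality(step: int, total_steps: int) -> str:
    """Decide whether this rollout's simulated search result is useful or noisy."""
    return "noisy" if random.random() < noise_probability(step, total_steps) else "useful"
```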
“Our key insight is that LLMs have acquired extensive world knowledge during large-scale pretraining and are capable of generating relevant documents given a search query,” the researchers explained in their paper.
The researchers say this process strengthens the model’s reasoning skills and makes it better at sifting through unreliable data, much as humans must do online.
ZeroSearch’s huge cost savings
An attractive feature of ZeroSearch is its massive cost reduction.
Alibaba’s analysis found that training with about 64,000 Google search queries would cost roughly $586.70 via SerpAPI. In contrast, using ZeroSearch with a 14B simulation model running on four A100 GPUs costs just $70.80, an 88% decrease.
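As a quick sanity check on the arithmetic, using the figures quoted above:

```python
# Sanity check on the reported savings, using the figures quoted in the article.
serpapi_cost = 586.70    # ~64,000 queries via SerpAPI
zerosearch_cost = 70.80  # 14B simulation model on four rented A100 GPUs

reduction = (serpapi_cost - zerosearch_cost) / serpapi_cost
print(f"Cost reduction: {reduction:.0%}")  # prints "Cost reduction: 88%"
```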
ZeroSearch vs. Google Search
In a test, Alibaba found that:
- A 7B parameter retrieval model using ZeroSearch performed as well as Google Search.
- A 14B parameter model using ZeroSearch beat Google Search in performance.
“Results show that ZeroSearch outperforms real search engine-based models while incurring zero API cost,” the report states. “Moreover, it generalizes well across both base and instruction-tuned LLMs of various parameter sizes and supports different reinforcement learning algorithms.”
In practice, that means ZeroSearch works well across model sizes and types, including both base and instruction-tuned models, and is compatible with several reinforcement learning algorithms, including PPO, GRPO, and Reinforce++.
ZeroSearch on GitHub and Hugging Face
ZeroSearch’s performance improves with larger models and more GPUs, and it works well across a range of model families, including Qwen-2.5 and LLaMA-3.2. The company has made its code, datasets, and pre-trained models publicly available on GitHub and Hugging Face.
What this breakthrough could mean for AI models in the future
Alibaba’s move comes as AI companies race to build smarter, more self-sufficient models. While systems like OpenAI’s ChatGPT and Google’s Gemini still rely on live data or search integrations, ZeroSearch points to a future where AIs can “search” entirely within themselves, at a fraction of the cost and, in some cases, with even better accuracy.