The Wild East in LLMs

Is a blue whale the mascot of the Wild East?

For the past couple of weeks I've been knee-deep in comparing Anthropic's Claude 3.5 Sonnet, Alibaba's Qwen 2.5-Max, and DeepSeek's R1. I pay $20 USD/mo for Claude and $0 for both Qwen and DeepSeek.

What is Qwen? It's a new free-for-individual-use LLM from China's Alibaba. What is Alibaba? It's one of the world's largest online wholesale marketplaces, and it's also a consumer ecommerce and financial services company enabling billions of transactions every year in China. They're also a very large cloud provider with data centers all over the world. Alibaba released Qwen (I pronounce it "kwen") a few weeks ago and, in short, it's awesome.

How can Alibaba create Qwen and release it for free? The same way Google does with its Gemini LLM. They charge businesses for a commercial version and try to get mindshare by releasing a consumer version.

Qwen can connect to the web and also offers image and video generation. The only service I've tried is the regular LLM, so I can't comment on the others.

Qwen's interface is clean and much easier to use than Claude's, especially for code.

In terms of code quality, I think Qwen and Claude are about the same. That's saying a lot, because Claude's coding skills are rated the highest, depending on whose benchmarks you look at. In my experience, Claude was more terse, while Qwen explained more about the code, which I found helpful. Creativity and problem-solving were roughly equal between the two. They're both good and gave me quite a few improvements I could make to my code. They both made some similar mistakes in their suggestions, which I found a little odd. Perhaps they were both trained on a similar dataset, such as GitHub?

As for DeepSeek, the company is a young Chinese firm that started out in quantitative trading. They applied the computing expertise gained from building quant trading systems to building the DeepSeek LLM, and they figured out a clever way to take advantage of less expensive but available-in-China Nvidia hardware (they programmed against a lower-level interface than CUDA). Then they open-sourced the whole thing, sending a shock wave through Silicon Valley.

I've tried DeepSeek on the web and it's very good. Like all the Chinese models, it lacks some of the conversational ability of Claude; DeepSeek is more direct and more "cold" in that sense. I did have quite a bit of trouble using it, though, because the server was too busy most of the time.

I also ran DeepSeek R1 70B locally on my Mac. While not as fast as the web version, it was available whenever I wanted! To be honest, I couldn't tell a difference between this version and the web version other than the speed of generating a response.
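For anyone curious how running it locally might look: the 70B version that runs on consumer hardware is a distilled variant of R1 rather than the full model, and one common (hypothetical here, since the post doesn't say which tooling was used) setup is Ollama plus its official Python client. The model name and prompt below are just illustrations, and they assume you've already done an `ollama pull deepseek-r1:70b`.

```python
# Minimal sketch: query a locally served DeepSeek R1 70B distill via Ollama.
# Assumes the Ollama server is running on its default port and the
# deepseek-r1:70b model has already been pulled.
import ollama

response = ollama.chat(
    model="deepseek-r1:70b",
    messages=[
        {
            "role": "user",
            # Hypothetical prompt, similar in spirit to asking it to optimize code.
            "content": "Suggest optimizations for this Python loop:\n"
                       "total = 0\n"
                       "for x in values:\n"
                       "    total = total + x",
        }
    ],
)

# R1-style models emit their reasoning before the final answer; print the full reply.
print(response["message"]["content"])
```

A 70B model quantized to 4 bits still needs on the order of 40 GB of memory, which is roughly why the local version is noticeably slower than the hosted one.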

For my situation, where I asked all of them to optimize some code I wrote, I'd have to give the nod to DeepSeek. It was the only one that presented a unique solution, and that solution turned out to be faster and higher quality than the others. Anthropic (and OpenAI!) really need to up their game, and fast. The competition from DeepSeek, Qwen, ByteDance's (TikTok's owner) coming offering, Moonshot's Kimi, and others has now caught up with the best that Silicon Valley can offer. I hope these Chinese models are a wake-up call for Silicon Valley.

For us here in NZ, I think it's great news that we have a choice of where to get our LLMs. Competition in this space is welcome, and it hopefully shows people that Silicon Valley isn't the only place that can make state-of-the-art technology.

Maybe we should call China the Wild East?

UPDATE: I’ve stopped paying for Claude. With Qwen and DeepSeek I don't really have a need for Claude at the moment. For me, the superior conversational ability of Claude isn't enough to justify paying for it.
