Abhay Bhat

Life | Technology | Investing

@pbteja1998 just use qwen3.5-27B-GGUF locally as a primary model and routing rules on top if you are doing anything fancy…works like a charm and your wallet will thank you


Guess Claude Code proved this way earlier when it was more performant with Sonnet 4.6 than other frontiers, it’s the harness fgs!! https://t.co/3dl7NMODr7


Not impacted by Anthropic cutting subs from claws. If you have been using your agents regularly, using Anthropic 100% was never a sustainable option as it was clearly a $200++ bill a month, so I have been using locally hosted Qwen3.5 with other options. No complaints. Kimi2.5 has https://t.co/53quTIemdT


OSS models are closing the gap, and with GGUF, NVFP and MXL, local inference is gaining wider adoption. Frontier models will still be the apple of the industry https://t.co/uH662icIlC


Classic troll https://t.co/s3PhfDmPw3


BitTorrent meets Napster for LLM inferencing https://t.co/bgFnO1s870


RT @hqmank: If you’re using Claude Code, this is worth knowing. Instead of worrying about whether Opus 4.6 or GPT 5.4 is better, it’s more…


@ivanfioravanti Running GLM 5.1 and M2.7 via Ollama Claude Code? Global settings won’t work right for 3 separate configs?


Implementation of Google’s TurboQuant (ICLR 2026): KV cache compression for local LLM inference, with planned extensions beyond the paper https://t.co/raO8puxH9B


@thekitze @Lovable 🤣

