Discussion about this post

JP

The 5-10x annual cost reduction is encouraging, but it sidesteps one problem. If your provider is silently quantising the model to hit those savings, the capability level you're benchmarking against isn't what you're actually getting. Synthetic open-sourced an eval tool that found a 34% failure rate on model identity checks across competing providers. Cost per capability drops fast on paper, less so once you account for what's actually being served. I wrote about the full incentive structure here: https://sulat.com/p/the-real-cost-of-cheap-ai-inference
