Alibaba Cloud Bailian announces reduced context cache pricing for some models

date
26/08/2025
On August 26, Bailian, Alibaba Cloud's large model service platform, announced a reduction in context cache pricing for some models. After the adjustment, when a request to one of the affected models hits the cache, the input tokens that hit are billed as cached_token at 20% of the standard input_token price, down from 40%. Input tokens that miss the cache are still billed at the standard input_token price.
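
To make the effect of the change concrete, the billing rule described above can be sketched as a small calculation. This is only an illustration: the function name `input_cost`, the per-1,000-token price of 1.0, and the token counts are hypothetical and are not taken from the announcement; only the 40% and 20% ratios come from it.

```python
def input_cost(total_input_tokens, cached_tokens, input_price_per_1k, cached_ratio):
    """Input-side cost of one request under the rule described in the notice.

    cached_tokens are billed at cached_ratio * input_price_per_1k;
    the remaining input tokens are billed at the full input_price_per_1k.
    """
    if not 0 <= cached_tokens <= total_input_tokens:
        raise ValueError("cached_tokens must be between 0 and total_input_tokens")
    uncached_tokens = total_input_tokens - cached_tokens
    cached_price_per_1k = input_price_per_1k * cached_ratio
    total = uncached_tokens * input_price_per_1k + cached_tokens * cached_price_per_1k
    return total / 1000


# Hypothetical numbers, for illustration only (not from the announcement).
PRICE_PER_1K = 1.0      # assumed standard input_token price per 1,000 tokens
TOTAL_TOKENS = 10_000   # assumed input size of one request
CACHED_TOKENS = 8_000   # assumed number of input tokens that hit the cache

before = input_cost(TOTAL_TOKENS, CACHED_TOKENS, PRICE_PER_1K, 0.40)  # old ratio
after = input_cost(TOTAL_TOKENS, CACHED_TOKENS, PRICE_PER_1K, 0.20)   # new ratio
print(f"before: {before:.2f}, after: {after:.2f}")
```

With these assumed numbers, the 2,000 uncached tokens cost 2.0 in both cases, while the 8,000 cached tokens cost 3.2 at the old ratio and 1.6 at the new one, so the input cost of the request falls from 5.2 to 3.6. The saving grows with the share of input tokens that hit the cache; a request with no cache hits costs the same as before.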