Latest News

26/08/2025

On August 26, Bailian, Alibaba Cloud's large model service platform, announced a price reduction for context caching on some models. After the adjustment, when a request to one of these models hits the cache, the cached input tokens are billed at the cached_token unit price, which drops from 40% to 20% of the input_token price. Input tokens that do not hit the cache are still billed at the standard input_token rate.
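To make the billing change concrete, here is a minimal sketch of the arithmetic in Python. The function name `input_cost`, the per-1,000-token pricing unit, and the price of 0.002 are hypothetical and chosen only for illustration; the notice itself states only the 40% and 20% ratios.

```python
from decimal import Decimal


def input_cost(input_tokens, cached_tokens, price_per_1k, cached_ratio):
    """Input-side fee for one request (hypothetical helper, not an official API).

    input_tokens  -- total input tokens in the request
    cached_tokens -- how many of those tokens hit the context cache
    price_per_1k  -- standard input_token price per 1,000 tokens (assumed unit)
    cached_ratio  -- cached_token price as a fraction of the input_token price
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = Decimal(str(price_per_1k))
    ratio = Decimal(str(cached_ratio))
    uncached_tokens = input_tokens - cached_tokens
    # Uncached tokens pay the full rate; cached tokens pay the discounted rate.
    total = uncached_tokens * price + cached_tokens * price * ratio
    return total / 1000


# Example request: 10,000 input tokens, 6,000 of which hit the cache,
# at an assumed (not official) price of 0.002 per 1,000 input tokens.
before = input_cost(10_000, 6_000, "0.002", "0.4")  # old ratio: 40%
after = input_cost(10_000, 6_000, "0.002", "0.2")   # new ratio: 20%
print(before, after)
```

Only the cached portion of the request gets cheaper, so the overall saving depends on the cache hit rate: a request with no cache hits costs the same before and after the adjustment.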