IndexCache, a new sparse attention optimizer, delivers 1.82x faster inference on long-context AI models

Processing 200,000 tokens through a large language model is expensive and slow: the longer the context, the faster the costs spiral. Researchers at Tsinghua University and Z.ai have built a technique called IndexCache that cuts up to 75% of the redundant computation in sparse attention models, delivering up to 1.82x faster time-to-first-token and 1.48x faster…


The social media ban for kids is spreading. This country is the latest to plan on restrictive legislation

Austria is seeking a ban on social media use for kids under the age of 14. Austria’s governing coalition on Friday announced plans to ban social media use for children under 14, joining a string of other countries in drawing up restrictions for young people. Alexander Pröll, the official in Chancellor Christian Stocker’s office responsible for…
