Meituan open-sources native multimodal large model LongCat-Next

27/03/2026

On March 27, Meituan released and fully open-sourced LongCat-Next, a native multimodal large model. The model departs from the patchwork, "language-centered" architecture of current large models by unifying images, speech, and text into homogeneous discrete tokens. Using a pure "next token prediction" paradigm, it makes vision and speech a "native language" of AI, marking a step forward for Meituan's LongCat team on the path toward AI in the physical world.
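To make the idea of "homogeneous discrete tokens" concrete, here is a minimal Python sketch. It is not LongCat-Next's actual tokenizer or training code: the codebook sizes, the offset scheme, and the function names `to_shared_ids` and `next_token_pairs` are all assumptions made for illustration. It only shows how token IDs from separate modalities could share one vocabulary so that a single next-token objective covers all of them.

```python
from typing import Dict, List, Tuple

# Hypothetical per-modality codebook sizes (illustrative, not the model's real values).
CODEBOOK_SIZES: Dict[str, int] = {"text": 1000, "image": 512, "speech": 256}


def build_offsets(sizes: Dict[str, int]) -> Dict[str, int]:
    """Give each modality a disjoint ID range inside one shared vocabulary."""
    offsets: Dict[str, int] = {}
    start = 0
    for name, size in sizes.items():
        offsets[name] = start
        start += size
    return offsets


OFFSETS = build_offsets(CODEBOOK_SIZES)
VOCAB_SIZE = sum(CODEBOOK_SIZES.values())


def to_shared_ids(segments: List[Tuple[str, List[int]]]) -> List[int]:
    """Map (modality, local token IDs) segments into one flat ID sequence."""
    ids: List[int] = []
    for modality, local_ids in segments:
        size = CODEBOOK_SIZES[modality]
        for local_id in local_ids:
            if not 0 <= local_id < size:
                raise ValueError(f"{modality} token {local_id} out of range")
            ids.append(OFFSETS[modality] + local_id)
    return ids


def next_token_pairs(ids: List[int]) -> List[Tuple[List[int], int]]:
    """Build (context, target) pairs: every token is predicted from its prefix."""
    return [(ids[:k], ids[k]) for k in range(1, len(ids))]


if __name__ == "__main__":
    # Local IDs here stand in for the outputs of real text/image/speech tokenizers.
    sequence = to_shared_ids([("text", [5, 7]), ("image", [0, 3]), ("speech", [10])])
    print(sequence)
    for context, target in next_token_pairs(sequence):
        print(context, "->", target)
```

In this toy setup the predictor never needs to know which modality a token came from: image and speech tokens are simply further entries in the same vocabulary, which is the sense in which they are treated like text.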