Setting its sights on outdoing Nvidia, edge-AI company Kneron has announced that its new neural processing unit (NPU) chip, the KL730, will be released by the end of the year.
Designed specifically for AI and machine-learning workloads, the KL730 aims to provide a cost-effective, energy-efficient way to run large language models (LLMs). Unlike GPUs, which were originally created for graphics processing, Kneron's NPU chips are purpose-built for AI.
The KL730 succeeds Kneron's earlier KL530 chip, which was already optimized for generative AI models built on transformer architectures. According to Kneron CEO Albert Liu, the new chip will enable running "powerful transformer models like GPT on many kinds of devices."
While exact pricing for the KL730 has not been revealed, Kneron claims that KL530 users saw a 75% drop in operating costs compared with traditional GPU chips. The KL730 reportedly delivers a three- to fourfold improvement in energy efficiency and starts at a base compute level of 0.35 tera operations per second (TOPS).
Another major advantage is the chip's ability to run large language models entirely offline, eliminating the need for cloud connectivity and thereby improving data security.
Although most AI companies currently favor Nvidia's H100 Tensor Core GPU, its steep price tag of roughly $40,000 per chip has invited competition. With AMD planning to release its own AI chips later this year and Nvidia announcing an even more powerful chip for 2024, the AI processor market is heating up.
The KL730's entry could disrupt a market long dominated by Nvidia: it promises not only cost savings but also greater energy efficiency and security for AI applications, with the ability to handle AI-specific tasks out of the box. It is a clear signal that AI-specific processing is evolving rapidly, with more choices soon available to developers and enterprises alike.