Ahead of iOS 18’s debut at WWDC in June, Apple has released a family of open-source large language models. Apple calls the family OpenELM, short for Open-source Efficient Language Models.
In its testing, Apple says that OpenELM offers similar performance to other open language models, but with less training data.
Apple explains:
To this end, we release OpenELM, a state-of-the-art open language model. OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the transformer model, leading to enhanced accuracy. For example, with a parameter budget of approximately one billion parameters, OpenELM exhibits a 2.36% improvement in accuracy compared to OLMo while requiring 2× fewer pre-training tokens.
Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations. We also release code to convert models to MLX library for inference and fine-tuning on Apple devices. This comprehensive release aims to empower and strengthen the open research community, paving the way for future open research endeavors.
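To make the layer-wise scaling idea in Apple's description more concrete, here is a minimal Python sketch. It assumes the scaling is a linear interpolation across the depth of the model, so early layers get fewer attention heads and a narrower feed-forward block than later ones. The function name, the `LayerConfig` type, and every numeric value (the `alpha` and `beta` ranges, the model and head dimensions) are illustrative assumptions and are not taken from Apple's released configurations.

```python
from dataclasses import dataclass


@dataclass
class LayerConfig:
    num_heads: int  # attention heads in this layer
    ffn_dim: int    # hidden width of this layer's feed-forward block


def layer_wise_scaling(num_layers, model_dim, head_dim,
                       alpha=(0.5, 1.0), beta=(0.5, 4.0)):
    """Return one LayerConfig per transformer layer.

    Hypothetical helper: alpha scales the number of attention heads and
    beta scales the feed-forward width. Both are interpolated linearly
    from their minimum (first layer) to their maximum (last layer).
    """
    configs = []
    for i in range(num_layers):
        # Position of layer i in [0, 1]; a single-layer model uses the minimum.
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        a = alpha[0] + (alpha[1] - alpha[0]) * t
        b = beta[0] + (beta[1] - beta[0]) * t
        num_heads = max(1, round(a * model_dim / head_dim))
        ffn_dim = round(b * model_dim)
        configs.append(LayerConfig(num_heads=num_heads, ffn_dim=ffn_dim))
    return configs


if __name__ == "__main__":
    # Assumed toy dimensions, chosen only for illustration.
    for idx, cfg in enumerate(layer_wise_scaling(4, model_dim=1024, head_dim=64)):
        print(idx, cfg)
```

A uniform transformer would give every layer the same head count and feed-forward width. The sketch instead spends the same kind of budget unevenly across depth, which is the allocation Apple credits for the accuracy gain.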
iOS 18 will include a collection of new artificial intelligence features, and today’s OpenELM release is just the latest piece of Apple’s behind-the-scenes work in preparation.
Bloomberg reported last week that iOS 18’s AI features will be powered by an entirely on-device large language model, which will offer privacy and speed benefits.
Source: Apple releases new family of Open-source Efficient Language Models as AI work progresses