AI Newsletter #008 (2023/07/10 – 2023/07/16)

Big Company

Anthropic: Launched Claude 2
Anthropic has released Claude 2, an AI chatbot with a conversational tone and improved math, coding, and reasoning capabilities. It is now publicly available in the US and UK.

Pinecone: Now Available on Microsoft Azure
Pinecone, a vector database company, has announced that it will be available on Microsoft Azure, allowing customers to run a vector database closer to their data and applications. Pinecone recently raised $100M in Series B funding and is available in three locations. Early access to Pinecone's Azure regions is now open.

Meta: Releasing Commercial AI Tools to Rival Google, OpenAI
Meta is releasing a commercial version of its AI model, LLaMA, to compete with OpenAI and Google. The commercial version will be more widely available and customizable than the current model and is expected to be released soon.

Meta: Reveals New AI Image Generation Model CM3Leon
Meta has developed CM3Leon, a token-based autoregressive model that can generate high-resolution images from text. It is more efficient than diffusion models and can handle complex prompts. CM3Leon is currently a research effort, and it is not yet known whether it will be made publicly available.

New Company / Product

Elon Musk launches his new company, xAI
Elon Musk has launched xAI, a new artificial intelligence company whose stated goal is to understand the true nature of the universe. The team consists of alumni from DeepMind, OpenAI, Google Research, Microsoft Research, Twitter, and Tesla, and has secured thousands of GPUs from Nvidia. xAI will delve into the mathematics of deep learning, aim to develop a “theory of everything” for large neural networks, and work closely with X (Twitter), Tesla, and other companies.

An online AI video generation and production platform: simply upload an image or a song, provide a brief text description or an article, blog post, or news report, and it produces video content. It also supports real-time editing and previewing of your videos.

Additionally, it offers a rich collection of video templates and a library of resources, allowing users to customize the style, scenes, characters, and more according to their needs and preferences.

Funding Rounds

Nomic: $17 Million Funding Round Led By Coatue
AI startup Nomic has raised $17 million in a new funding round from investors led by Coatue. The funding will be used to hire new staff and develop two products, GPT4ALL and Atlas, which aim to increase the visibility of datasets in model training and make AI models more accessible. Nomic also plans to develop open-source models as an alternative to proprietary models developed by AI labs.


Curious Replay for Model-based Adaptation

Existing model-based reinforcement learning agents adapt poorly when their environment changes, in part because of how they use past experiences to train their world model. The authors present Curious Replay, a form of prioritized experience replay tailored to model-based agents through the use of a curiosity-based priority signal.
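
To make the idea of a curiosity-based priority signal concrete, here is a minimal sketch of a prioritized replay buffer. The summary above does not give the priority formula, so the form used here (a novelty term that decays with how often an item was replayed, plus a term that grows with the world-model loss) and all constants are assumptions for illustration, as are the class and method names.

```python
import random
from dataclasses import dataclass, field


@dataclass
class CuriousReplayBuffer:
    # All constants are illustrative assumptions, not values from the newsletter.
    c: float = 1e4      # scale of the novelty (replay-count) term
    beta: float = 0.7   # decay applied each time an item is replayed
    alpha: float = 0.7  # exponent on the world-model loss term
    eps: float = 0.01   # keeps the loss term strictly positive
    items: list = field(default_factory=list)
    visits: list = field(default_factory=list)
    losses: list = field(default_factory=list)

    def add(self, transition):
        """Store a new transition; it starts unvisited, so its priority is high."""
        self.items.append(transition)
        self.visits.append(0)
        self.losses.append(0.0)

    def priority(self, i):
        """Assumed priority: rarely replayed or poorly modeled items score higher."""
        novelty = self.c * self.beta ** self.visits[i]
        surprise = (abs(self.losses[i]) + self.eps) ** self.alpha
        return novelty + surprise

    def sample(self, rng, batch_size):
        """Draw indices with probability proportional to priority."""
        indices = range(len(self.items))
        weights = [self.priority(i) for i in indices]
        chosen = rng.choices(indices, weights=weights, k=batch_size)
        for i in chosen:
            self.visits[i] += 1  # replayed items become less novel
        return chosen

    def update_loss(self, i, loss):
        """Record the latest world-model loss observed for item i."""
        self.losses[i] = loss
```

In use, a training loop would call `sample(random.Random(0), batch_size)` to pick transitions, train the world model on them, and feed each per-item loss back through `update_loss`. Uniform replay corresponds to giving every item the same priority.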

Agents using Curious Replay exhibit improved performance in an exploration paradigm inspired by animal behavior and on the Crafter benchmark.

DreamerV3 with Curious Replay sets a new state of the art on Crafter, achieving a mean score of 19.4 that substantially improves on the previous high score of 14.5 by DreamerV3 with uniform replay, while maintaining similar performance on the DeepMind Control Suite.

“Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors

With only 14 lines of code, this method rivals Transformer models in text classification and outperforms BERT on out-of-distribution datasets.

Deep neural networks (DNNs) are often used for text classification due to their high accuracy. However, DNNs can be computationally intensive, requiring millions of parameters and large amounts of labeled data.

The authors propose a non-parametric alternative to DNNs that’s easy, lightweight, and universal in text classification: a combination of a simple compressor like gzip with a k-nearest-neighbor classifier.
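
As a rough illustration of how a compressor can be paired with a k-nearest-neighbor classifier, the sketch below uses the normalized compression distance (NCD) computed with gzip. This is not the authors' code: the function names, the choice of k, and the toy training set are assumptions made for this example.

```python
import gzip
from collections import Counter


def compressed_len(text: str) -> int:
    """Length in bytes of the gzip-compressed text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance: small when a and b share structure."""
    ca, cb = compressed_len(a), compressed_len(b)
    cab = compressed_len(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def classify(query: str, train: list, k: int = 3) -> str:
    """Majority vote over the k training texts closest to the query under NCD."""
    nearest = sorted(train, key=lambda pair: ncd(query, pair[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labeled data, invented purely for illustration.
TRAIN = [
    ("the striker scored a late goal and the team won the football match", "sports"),
    ("the goalkeeper saved a penalty in the final minutes of the game", "sports"),
    ("the central bank raised interest rates to slow rising inflation", "finance"),
    ("stock markets fell as investors worried about interest rates", "finance"),
]
```

Calling `classify("the team scored a goal in the final minutes", TRAIN)` returns the label that wins the vote among the three nearest training texts. Nothing is trained: the only "model" is the compressor, which is why the approach needs no parameters.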

Without any training parameters, this method achieves results that are competitive with non-pretrained deep learning methods on six in-distribution datasets.

It even outperforms BERT on all five out-of-distribution (OOD) datasets, including four low-resource languages.

This method also excels in the few-shot setting, where labeled data are too scarce to train DNNs effectively.

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

The authors present CM3Leon, a retrieval-augmented, token-based, decoder-only multi-modal language model capable of generating and infilling both text and images.

CM3Leon uses the CM3 multi-modal architecture but additionally shows the extreme benefits of scaling up and tuning on more diverse instruction-style data.

It is the first multi-modal model trained with a recipe adapted from text-only language models, including a large-scale retrieval-augmented pre-training stage and a second multi-task supervised fine-tuning (SFT) stage.

It is also a general-purpose model that can do both text-to-image and image-to-text generation, allowing the authors to introduce self-contained contrastive decoding methods that produce high-quality outputs.

CM3Leon achieves state-of-the-art performance in text-to-image generation with 5x less training compute than comparable methods. After SFT, CM3Leon can also demonstrate unprecedented levels of controllability in tasks ranging from language-guided image editing to image-controlled generation and segmentation.
