AI Newsletter #006 (2023/06/26 – 2023/07/02)

Big Company

Databricks: Acquired MosaicML
Databricks has announced the acquisition of MosaicML, an open source startup with neural networks expertise, for $1.3 billion. This will add generative AI tooling to the Databricks Lakehouse Platform, and the entire MosaicML team will join Databricks after the deal closes.

OpenAI: Planning to Create App Store-like Marketplace for AI
OpenAI is planning a platform, similar to an app store, where developers can sell their AI models. The marketplace could connect vendors to millions of potential users, open the field to new entrants, and give customers access to an extensive library of AI tools.

Microsoft: Brings New AI-powered Shopping Tools to Bing and Edge
Microsoft has released a suite of AI-powered shopping tools for Bing and Edge, including buying guides, AI-generated review summaries, and Price Match. These features are available in the U.S. and worldwide.

Oracle: Adds Generative AI to Its Human Resources Software
Oracle Corp is introducing AI-powered features to its human resources software to help businesses automate tasks such as job descriptions and performance goals. The AI assistant is in the form of a button, and Oracle is also working on using AI for more complex tasks. The goal is to reduce decision-making and implementation time from weeks to hours and minutes.

Funding Rounds

Inflection AI, a startup backed by Microsoft and Nvidia, has raised $1.3 billion from investors including Microsoft. The company recently launched its chatbot Pi and its language model Inflection-1. It is collaborating with Nvidia and CoreWeave to expand its GPU cluster to 22,000 H100s, allowing it to train AI more efficiently.

Runway, an applied AI research company, has raised a $141 million extension to its Series C from Google, NVIDIA, Salesforce Ventures, and existing investors. The funds will be used to scale research efforts, expand the team, and bring multi-modal AI systems to market, as well as create intuitive product experiences.

Typeface, an AI platform for enterprise content creation, has raised $165 million in a Series B funding round, valuing the company at $1 billion. Additionally, Typeface has signed strategic partnerships with Salesforce and Google to create customized content within their existing workflows.

On June 27th, CalypsoAI, a company that provides security for AI systems, announced on its official website that it has raised $23 million (approximately 160 million RMB) in Series A-1 funding. The round was led by Paladin Capital Group, with participation from Lockheed Martin Ventures, Hakluyt Capital, and others.

New Product

Ogimi is a meditation application based on OpenAI’s API. It is reported that the app can provide users with coach-level personalized guidance.

David Gull, the founder of the company, stated, “Ogimi is the first AI-guided meditation coach, and each meditation on the platform is generated in real-time based on the specific needs and personal growth of the user.” He further added, “The main goal of this product is to leverage the power of artificial intelligence to bring coach-level, one-on-one attention to everyone’s meditation practice.”


The Impact of Positional Encoding on Length Generalization in Transformers
The ability to generalize from smaller training context sizes to larger ones, commonly known as length generalization, is a major challenge for Transformer-based language models. With larger context sizes, a model can benefit from more in-context-learning examples, higher numbers of reasoning and planning steps, or longer text generation. However, training a Transformer with a larger context size can be excessively slow.

Positional encoding (PE) has been identified as a major factor influencing length generalization, but the exact impact of different PE schemes on extrapolation in downstream tasks remains unclear.

The researchers conduct a systematic empirical study comparing the length generalization performance of decoder-only Transformers under five positional encoding approaches: Absolute Position Embedding (APE), T5’s Relative PE, ALiBi, Rotary, and no positional encoding at all (NoPE).

The findings reveal that the most commonly used positional encoding methods, such as ALiBi, Rotary, and APE, are not well suited for length generalization in downstream tasks. More importantly, NoPE outperforms other explicit positional encoding methods while requiring no additional computation. Overall, the research suggests that explicit position encodings are not essential for decoder-only Transformers to generalize well to longer sequences.
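
One intuition for why a decoder-only model can work without explicit positional encoding is that the causal mask already leaks position information. The sketch below is a toy illustration, not the paper's experimental setup: it assumes a single attention head whose query/key scores are all zero (so attention is uniform over the visible prefix) and a hypothetical marker value of 1 on the first token. Under those assumptions the output at position t is 1/(t+1), which encodes the absolute position.

```python
import numpy as np

def causal_uniform_attention(values):
    """Toy attention head with all-zero scores under a causal mask.

    With zero scores, the softmax over each visible prefix is uniform,
    so position t simply averages values[0..t].
    """
    n = values.shape[0]
    scores = np.zeros((n, n))
    # Causal mask: position t may only attend to positions <= t.
    visible = np.tril(np.ones((n, n), dtype=bool))
    scores = np.where(visible, scores, -np.inf)
    # Numerically stable softmax along each row.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ values

n = 6
values = np.zeros((n, 1))
values[0, 0] = 1.0  # assumed marker on the first token only
out = causal_uniform_attention(values)

for t in range(n):
    print(t, out[t, 0])
```

No positional signal is added anywhere, yet each row of `out` differs by position, which later layers could in principle decode.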

Extending Context Window of Large Language Models Via Position Interpolation
This paper presents Position Interpolation (PI), which extends the context window of RoPE-based pretrained LLMs such as LLaMA to up to 32,768 tokens with minimal fine-tuning (within 1,000 steps). The extended models, from LLaMA 7B to 65B, show strong empirical results on tasks that require long context, including passkey retrieval, language modeling, and long-document summarization.

Meanwhile, models extended by Position Interpolation preserve quality relatively well on tasks within their original context window. To achieve this, Position Interpolation linearly down-scales the input position indices to match the original context window size, rather than extrapolating beyond the trained context length, which may lead to catastrophically high attention scores that completely ruin the self-attention mechanism.
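
The down-scaling step can be sketched in a few lines. This is a minimal NumPy illustration, not the authors' implementation: the function names, the tiny sizes (`trained_len=8`, `extended_len=32`, `head_dim=4`), and the rotary base of 10000 are assumptions chosen for readability. Each position index is multiplied by `trained_len / extended_len` before the rotary angles are computed, so every position in the longer window maps into the range seen during pretraining.

```python
import numpy as np

def rope_angles(positions, head_dim, base=10000.0):
    """Rotary angles: one frequency per pair of channels."""
    inv_freq = base ** (-np.arange(0, head_dim, 2) / head_dim)
    return np.outer(positions, inv_freq)  # shape (seq_len, head_dim // 2)

def apply_rope(x, angles):
    """Rotate each (even, odd) channel pair of x by the given angles."""
    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[..., 0::2] = x_even * cos - x_odd * sin
    out[..., 1::2] = x_even * sin + x_odd * cos
    return out

def interpolated_positions(extended_len, trained_len):
    """Linearly down-scale indices 0..extended_len-1 into [0, trained_len)."""
    positions = np.arange(extended_len, dtype=np.float64)
    return positions * (trained_len / extended_len)

# Assumed toy sizes: pretrained on 8 positions, extended to 32.
trained_len, extended_len, head_dim = 8, 32, 4
scaled = interpolated_positions(extended_len, trained_len)

rng = np.random.default_rng(0)
x = rng.standard_normal((extended_len, head_dim))
rotated = apply_rope(x, rope_angles(scaled, head_dim))

print(scaled.max())   # largest scaled index stays below trained_len
print(rotated.shape)
```

Plain extrapolation would instead pass `np.arange(extended_len)` to `rope_angles` unchanged, producing angles for indices the model never saw during pretraining.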

The theoretical study shows that the upper bound of interpolation is at least ~600× smaller than that of extrapolation, further demonstrating its stability. Models extended via Position Interpolation retain their original architecture and can reuse most pre-existing optimization and infrastructure.
