AI Newsletter #001 (2023/05/22-2023/05/28)

News of the week

Meta AI Introduces MTIA v1: Its First-Generation AI Inference Accelerator
The Meta team developed the Meta Training and Inference Accelerator (MTIA) ASIC to address the inefficient processing of Meta’s unique recommendation workloads. The MTIA ASIC was integrated with PyTorch to build an optimized ranking system. The researchers compared MTIA to an NNPI accelerator and a GPU and found that MTIA relies on efficiently handling small shapes and batch sizes for low-complexity models, and that its software stack is being actively optimized to reach similar levels of performance.

IBM Announces WatsonX AI Platform
IBM has launched Watsonx, a new AI and data platform designed to help users scale and accelerate AI development. It includes an AI development studio with access to IBM-curated and trained models, a data store, and a toolkit for AI governance. Watsonx users can build their own AI models or fine-tune existing models. Watsonx features will also be added to IBM’s AIOps Insights, Watson Code Assistant, Watson Assistant, and Watson Orchestrate. A waitlist is available for more information.

Meta’s open-source speech AI recognizes over 4,000 spoken languages
Meta has open-sourced its AI language model, Massively Multilingual Speech (MMS), which can recognize over 4,000 spoken languages and produce speech in over 1,100. The model was trained using wav2vec 2.0 and the data was collected from audio recordings of translated religious texts. MMS performs well compared to existing models and covers 10 times more languages. Meta hopes that open-sourcing MMS will help preserve language diversity and allow everyone to speak and learn in their native tongues.

Google’s AI-powered Flood Hub disaster alert system is now available in 80 countries
Google has launched Flood Hub, an AI-powered flood forecasting utility, to alert people in 80 countries of impending floods. This expansion is estimated to help 460 million people living in flood-prone regions, and Google is currently monitoring over 1,800 locations across river basins.

Intel Unveils Game-Changing AI Chip to Challenge Nvidia and AMD
Intel Corp has announced a new AI computing chip, “Falcon Shores”, slated for 2025. It will have 288 gigabytes of memory and 8-bit floating-point computation capabilities, as part of Intel’s strategy to compete against Nvidia Corp and Advanced Micro Devices Inc.

OpenAI’s ChatGPT app tops 500K downloads in just 6 days
OpenAI’s ChatGPT app has been a huge success, with over half a million downloads in its first six days. It is already ranking in the top five among AI chatbot apps in the U.S. and is on track to beat its rivals.

What is Google Gemini? The next-generation AI model explained
Google is creating a new AI language model called Gemini, which is intended to compete with OpenAI’s GPT. Gemini is designed to be multimodal and to accommodate future advancements, such as improved memory and planning. It will also facilitate wider collaboration through tool and API integrations.

Want to try Google’s AI search engine? Join Search Labs waitlist
Google has unveiled its new AI-powered Search Generative Experience (SGE) at the I/O conference. Users can join the Search Labs program to access and test early experimental features of Google Search. SGE will provide AI-generated summaries at the top of search results, and participants in Search Labs can help shape the future of Google Search by providing feedback. A waitlist is now available for those who want to join Search Labs and try SGE.

Products of the Week

An AI-powered product discovery and strategy platform that helps product teams uncover problems to solve for customers, decide what to build next, and create strategies to drive outcomes.

Craft and refine your product copy in seconds with the AI-powered UX Writing Assistant. Get copy suggestions inspired by UX writing best practices and products in your industry to improve user experience and save writing time – without leaving Figma.

Transform your business with Desku’s AI-enhanced automations! Collaborate effortlessly with shared inboxes, and turn one-time visitors into repeat customers with WhatsApp integration. Experience the future of Customer Support & Customer Experience with Desku!

Macro PDF Editor
Macro PDF Editor is a native Mac/Windows app that uses AI to make documents interactive & editable. Click any cross-reference (e.g. Section 7.1) for a preview window, navigate with multiple tabs like a browser, and use color-coded highlights for note taking.

Fina’s simple interface allows you to connect accounts, categorize transactions, and track custom metrics. Want to know specific things like: “How much money have I actually spent at Starbucks this year?” Just ask! Flexible financial management has arrived.

Research of the Week

Tree of Thoughts: Deliberate Problem Solving with Large Language Models
LLMs have shown the capability to perform a wide range of tasks. However, this progress still relies on the original autoregressive mechanism for generating text, which makes token-level decisions one by one, in a left-to-right fashion. Why not augment the process with a more deliberate planning step that (1) maintains and explores diverse alternatives for the current choice instead of just picking one, and (2) evaluates its current status and actively looks ahead or backtracks to make more global decisions?
Tree of Thoughts (ToT) allows LMs to explore multiple reasoning paths over thoughts, as Figure 1 illustrates. ToT draws inspiration from the tree search planning process. It obtains superior results on Game of 24, Creative Writing, and Crosswords by being general and flexible enough to support different levels of thoughts, different ways to generate and evaluate thoughts, and different search algorithms that adapt to the nature of different problems.
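The search loop described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the paper’s implementation: `propose_thoughts` and `evaluate_state` stand in for LM calls, and the toy task is simply picking digits that sum to a target.

```python
# Minimal sketch of a Tree-of-Thoughts breadth-first search.
# `propose_thoughts` and `evaluate_state` are hypothetical stand-ins
# for language-model calls; the toy task: pick digits whose sum hits TARGET.

TARGET = 10

def propose_thoughts(state):
    # In ToT, the LM proposes candidate next "thoughts"; here, a next digit.
    return [state + [d] for d in range(1, 5)]

def evaluate_state(state):
    # In ToT, the LM rates each partial solution; here, partial sums
    # closer to TARGET score higher, and overshoots score zero.
    s = sum(state)
    return 0.0 if s > TARGET else 1.0 - (TARGET - s) / TARGET

def tree_of_thoughts(breadth=3, depth=4):
    frontier = [[]]  # root of the tree: no thoughts yet
    for _ in range(depth):
        candidates = [c for s in frontier for c in propose_thoughts(s)]
        # Keep only the `breadth` highest-scoring states (BFS-style pruning).
        frontier = sorted(candidates, key=evaluate_state, reverse=True)[:breadth]
        for s in frontier:
            if sum(s) == TARGET:
                return s
    return None

solution = tree_of_thoughts()
```

Swapping in real LM calls for the two stub functions, and a task-specific evaluator, recovers the general shape of the method; the paper also explores depth-first search and different thought granularities.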

ToT is a generalization of the popular “Chain of Thought” approach. If you are interested in language model inference, you shouldn’t miss this research. Read the paper at:

LIMA: Less Is More for Alignment
Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large-scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences. The authors measured the relative importance of these two stages by training LIMA, a 65B-parameter LLaMA language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling.
LIMA demonstrates remarkably strong performance, learning to follow specific response formats. Some important conclusions of this work are:
1. The strong results of LIMA suggest that almost all knowledge in large language models is learned during pretraining, and only limited instruction tuning data is necessary to teach models to produce high-quality output.
2. The scaling laws of alignment are not necessarily subject to quantity alone, but rather a function of prompt diversity while maintaining high-quality responses.
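The “standard supervised loss” here is ordinary next-token cross-entropy, computed only on the response tokens while the prompt tokens are masked out. A toy sketch of that idea, with made-up logits and a hypothetical prompt/response mask (not LIMA’s actual training code):

```python
import math

# Toy sketch of supervised fine-tuning loss: next-token cross-entropy
# averaged over response positions only. The vocabulary (size 2), logits,
# targets, and mask below are illustrative values, not real model outputs.

def cross_entropy(logits, target):
    # Numerically stable softmax cross-entropy for one position.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]

def sft_loss(per_token_logits, targets, response_mask):
    # Average loss over positions where mask == 1 (the response tokens).
    losses = [cross_entropy(l, t)
              for l, t, m in zip(per_token_logits, targets, response_mask) if m]
    return sum(losses) / len(losses)

# Three positions: the first is a prompt token (masked out), the last two
# belong to the response and contribute to the loss.
logits = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
loss = sft_loss(logits, targets=[0, 1, 0], response_mask=[0, 1, 1])
```

LIMA’s finding is that with this plain objective, 1,000 diverse, high-quality examples already align a pretrained model well.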
If you are training your own model, you should definitely read this paper:

RWKV: Reinventing RNNs for the Transformer Era
Most LLMs are built on Transformers, whose memory and computational complexity scale quadratically with sequence length. In contrast, RNNs exhibit linear scaling in memory and computational requirements but struggle to match the performance of Transformers.
The authors propose a novel model architecture, Receptance Weighted Key Value (RWKV), that combines the efficient parallelizable training of Transformers with the efficient inference of RNNs. It is the first non-Transformer architecture to be scaled to tens of billions of parameters. The experiments reveal that RWKV performs on par with similarly sized Transformers.
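At the core of RWKV is a linear “WKV” recurrence that replaces attention with a decaying weighted average, so each token updates a fixed-size state instead of attending over the whole history. A single-channel toy sketch of that recurrence (the decay `w`, current-token bonus `u`, and the key/value numbers are illustrative; the real model is multi-channel, learned, and numerically stabilized):

```python
import math

# Single-channel sketch of the RWKV "WKV" recurrence.
# w = exponential decay of past contributions, u = bonus for the current token.

def wkv_sequence(ks, vs, w=0.5, u=0.1):
    outs = []
    a, b = 0.0, 0.0  # running numerator / denominator: the entire RNN state
    for k, v in zip(ks, vs):
        # Output mixes the accumulated past with the current token's value.
        e = math.exp(u + k)
        outs.append((a + e * v) / (b + e))
        # Decay the state and fold in the current token: O(1) work and
        # memory per step, versus O(t) attention in a Transformer.
        decay = math.exp(-w)
        a = decay * a + math.exp(k) * v
        b = decay * b + math.exp(k)
    return outs

ys = wkv_sequence([0.0, 1.0, -1.0], [1.0, 2.0, 3.0])
```

Because the same computation can also be unrolled as a sum over past positions, training can be parallelized like a Transformer while inference runs as an RNN.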
If you are interested in network architecture and model design, read the full paper at:

Want to receive TechNavi Newsletter in email?