🌸 It’s Duke’s spring break this week, but there were still a few interesting happenings to share. 🌸
🗞 This Week in News
RFM-1 (Robotics Foundation Model 1) launched by Covariant - a large language model for “robot language”. It ingests images, video, joint angles, force readings, suction cup strength, and other sensor modalities to produce images, videos, or a series of commands for a robot.
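Covariant hasn’t published RFM-1’s internals, but the core “robot language” idea (mapping heterogeneous sensor data into one token sequence that a single transformer can model) can be sketched. Everything below is illustrative, not Covariant’s actual design:

```python
# Hypothetical sketch of "robot language": heterogeneous sensor readings
# become one flat token sequence, so a transformer can model robot episodes
# the way an LLM models text. Names and values here are illustrative only.
from dataclasses import dataclass

@dataclass
class Token:
    modality: str  # e.g. "joint_angle", "force", "suction", "image_patch"
    value: float   # a (notionally quantized) sensor reading

def tokenize_step(joint_angles, force_reading, suction_strength):
    """Flatten one control step's sensor readings into a token sequence."""
    tokens = [Token("joint_angle", a) for a in joint_angles]
    tokens.append(Token("force", force_reading))
    tokens.append(Token("suction", suction_strength))
    return tokens

# One timestep of a pick-and-place episode becomes ordinary sequence data:
sequence = tokenize_step(joint_angles=[0.12, -0.87, 1.54],
                         force_reading=3.2, suction_strength=0.9)
```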
Extropic is building analog thermodynamic chips for AI - harnessing thermodynamics to create more efficient computing hardware.
🥁 Interesting Products & Features
Train a 70b language model at home with Answer.AI: A fully open-source system combining FSDP (Fully Sharded Data Parallel) and QLoRA (quantized low-rank adaptation) that can efficiently train a 70b large language model on a regular desktop computer with two or more standard gaming GPUs.
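The gist of the combination: QLoRA stores the 70b weights in 4-bit and trains only small adapter matrices, while FSDP shards what remains across the GPUs. Here’s a minimal sketch of those two ingredients using Hugging Face transformers/peft; the actual Answer.AI training script handles details (such as making quantized weight storage shardable) that this glosses over, and the model name and hyperparameters are placeholders:

```python
# Minimal sketch: 4-bit QLoRA quantization + FSDP sharding.
# Assumes a torchrun-launched distributed process group.
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: store weights in 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # but compute in bf16
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",            # placeholder 70b checkpoint
    quantization_config=bnb_config,
)
lora = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM",
                  target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)         # only the small LoRA adapters train

# FSDP shards parameters across the local GPUs, so neither gaming card
# ever has to hold the whole (even quantized) model alone.
model = FSDP(model)
```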
Transformer Debugger from OpenAI - a tool for investigating the behavior of small language models that combines automated interpretability techniques with sparse autoencoders.
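For context, a sparse autoencoder in this setting is a small model trained to re-express a transformer’s dense activations as a wide, mostly-zero feature vector, which tends to be easier to interpret. A toy version (illustrative dimensions, not OpenAI’s code):

```python
# Toy sparse autoencoder over transformer activations (illustrative only).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model=768, d_features=8192):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, acts):
        # ReLU plus an L1 penalty on features (added to the loss below)
        # drives most features to zero, so each activation is explained
        # by a handful of more interpretable directions.
        features = torch.relu(self.encoder(acts))
        return self.decoder(features), features

sae = SparseAutoencoder()
acts = torch.randn(4, 768)          # a batch of residual-stream activations
recon, feats = sae(acts)
loss = ((recon - acts) ** 2).mean() + 1e-3 * feats.abs().mean()
```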
📄 Interesting Papers
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models: This paper explores several key areas of multilingual toxicity mitigation for text generation, including mitigation strategies, data characteristics, the use and impact of translation data, and the scalability of these techniques. Authors from Cohere.
ViewFusion: Towards Multi-View Consistency via Interpolated Denoising: This paper presents a new approach for turning a single image into consistent multi-view images. Through a diffusion process that fuses known-view information via interpolated denoising, the framework extends single-view conditioned models to multi-view settings without any additional fine-tuning. Authors from Amazon.
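My reading of the core trick, heavily simplified: at each denoising step, the pretrained single-view-conditioned model predicts noise once per known view, and those predictions are interpolated into a single update. The weighting scheme and `denoiser` below are stand-ins, not the paper’s exact formulation:

```python
# Sketch of interpolated denoising across known views (simplified).
import torch

def fused_noise_prediction(denoiser, x_t, t, known_views, weights):
    """Blend per-known-view noise predictions into one denoising update."""
    preds = [denoiser(x_t, t, cond=view) for view in known_views]
    w = torch.softmax(torch.tensor(weights, dtype=torch.float32), dim=0)
    return sum(wi * p for wi, p in zip(w, preds))

# Dummy stand-in for a pretrained single-view-conditioned denoiser.
denoiser = lambda x_t, t, cond: x_t - 0.1 * cond
x_t = torch.randn(1, 3, 64, 64)                        # noisy target view
views = [torch.randn(1, 3, 64, 64) for _ in range(3)]  # known input views
eps = fused_noise_prediction(denoiser, x_t, t=10,
                             known_views=views, weights=[1.0, 0.5, 0.5])
```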
RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback: This study proposes Retrieval Augmented Iterative Self-Feedback (RA-ISF), a framework that iteratively decomposes tasks and processes them through three submodules (self-knowledge, passage relevance, and question decomposition) to enhance the model's problem-solving capabilities. Authors from Zhejiang University.
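A rough sketch of how such a loop could be wired, with toy stubs standing in for the prompted LLM and retriever calls; the submodule order follows the paper’s description, but the interfaces here are my own invention:

```python
# Illustrative RA-ISF-style loop; ToyLLM/ToyRetriever stand in for real
# prompted-model and retrieval calls.
def ra_isf(question, llm, retriever, depth=0, max_depth=3):
    if llm.knows(question):                        # 1. self-knowledge check
        return llm.answer(question)
    passages = retriever.search(question)
    relevant = [p for p in passages
                if llm.is_relevant(question, p)]   # 2. passage-relevance check
    if relevant:
        return llm.answer(question, context=relevant)
    if depth < max_depth:                          # 3. question decomposition
        subs = llm.decompose(question)
        answers = [ra_isf(q, llm, retriever, depth + 1, max_depth)
                   for q in subs]
        return llm.answer(question, context=answers)
    return llm.answer(question)                    # fall back to a direct answer

class ToyLLM:
    def knows(self, q): return False
    def is_relevant(self, q, p): return "answer" in p
    def decompose(self, q): return []
    def answer(self, q, context=None): return f"answer({q!r}, ctx={context})"

class ToyRetriever:
    def search(self, q): return ["passage with the answer", "off-topic passage"]

print(ra_isf("What is RA-ISF?", ToyLLM(), ToyRetriever()))
```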
🧠 Sources of Inspiration
Raleigh Durham Startup Week April 9-12 - registration is free! Workshops, networking, and office hours.
Training great LLMs entirely from ground up in the wilderness as a startup - A co-founder of Reka shares his experience building infrastructure and training large language & multimodal models from scratch (they’re hiring!)
A dataset of public-domain books open-sourced on Hugging Face