🗞 This Week in News
The most popular Wikipedia pages of 2024 may not be what you expect! Taylor Swift is #11. ChatGPT is #12. #1? “Deaths in 2024”
OpenAI 12 days of “ship-mas” - announcements so far include a $200/month ChatGPT Pro subscription and the public launch of the Sora video generator
Some names cause ChatGPT to break - apparently due to hard-coded name filters. Such hard-coded filters could also leave ChatGPT vulnerable to adversarial attacks.
Willow Quantum Chip from Google Quantum AI - Willow performed a standard benchmark computation (random circuit sampling) in under five minutes that would take today’s fastest supercomputers 10 septillion (that is, 10^25) years - a number that vastly exceeds the age of the Universe.
🥁 Interesting Products & Features
Amazon Nova - Amazon joins the LLM game with four tiers of models: Micro, Lite, Pro, and Premier. Here’s a nice brief comparison of pricing and performance. TL;DR: similar to the Gemini models and slightly cheaper.
Veo / Imagen 3 - announced this past week: Veo is Google’s video generation model, and Imagen 3 is the newest version of its image generation model.
Grok 2 Aurora - Aurora, xAI’s new image generation model, is an autoregressive mixture-of-experts network trained to predict the next token from interleaved text and image data.
Genie 2 from Google DeepMind can generate “video game-like” interactive worlds
Cohere Reranker 3.5 - the newest version of Cohere’s reranker comes with better performance across 100+ languages. Reminder that a reranker uses cross-encoding: the model jointly reads a user query and a document (chunk) and computes a relevance score for the pair - a minimal sketch follows below.
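To make that concrete, here is a minimal reranking sketch. It uses an open cross-encoder from sentence-transformers as a stand-in; the model name, query, and chunks are illustrative, not Cohere’s API:

```python
# Minimal cross-encoder reranking sketch. The open MS MARCO model below is a
# stand-in for Cohere's Rerank 3.5, which is served via their API instead.
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "how do I reset my router?"
chunks = [
    "Hold the reset button for 10 seconds to restore factory settings.",
    "Our routers are available in three colors.",
]

# The cross-encoder reads each (query, chunk) pair jointly and emits one
# relevance score per pair; a bi-encoder would embed them separately instead.
scores = model.predict([(query, chunk) for chunk in chunks])
reranked = sorted(zip(scores, chunks), reverse=True)
print(reranked)
```

The joint encoding is what makes cross-encoders more accurate (and slower) than bi-encoders, which is why they are typically applied only to rerank a small candidate set.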
📄 Interesting Papers
Diffusion Meets Flow Matching: Two Sides of the Same Coin - Flow matching and diffusion models are two popular frameworks in generative modeling, and despite their similarity there is some confusion in the community about their exact connection. This post clears up that confusion, showing that diffusion models and Gaussian flow matching are the same, although different model specifications can lead to different network outputs and sampling schedules. The core correspondence is sketched below. Authors from Google DeepMind.
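For intuition, here is the correspondence in the simplest (linear-path) case - a sketch, not the post’s full treatment. Flow matching trains a velocity network on

$$x_t = (1-t)\,x + t\,\varepsilon, \qquad v_\theta(x_t, t) \approx \varepsilon - x,$$

which is exactly the diffusion forward process $x_t = \alpha_t x + \sigma_t \varepsilon$ with $\alpha_t = 1-t$ and $\sigma_t = t$; and since $\varepsilon = x_t + (1-t)\,v$, the velocity and noise predictions are affine reparameterizations of each other. The two frameworks fit the same model - what differs is the output parameterization, loss weighting, and sampling schedule.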
Pre-train, Align, and Disentangle: Empowering Sequential Recommendation with Large Language Models - Sequential recommendation (SR) aims to model the sequential dependencies in users’ historical interactions to better capture their evolving interests, but suffers from limitations such as the cold-start problem and sub-optimal performance. This research proposes a paradigm for using LLMs in recommenders. They first pre-train both the SR and LLM models to get collaborative and textual embeddings. Next, a characteristic recommendation-anchored alignment loss, based on multi-kernel maximum mean discrepancy (MMD) with Gaussian kernels, pulls the two embedding spaces together (an illustrative MMD sketch follows below). Finally, a triple-experts architecture, consisting of aligned and modality-specific experts with disentangled embeddings, is fine-tuned in a frequency-aware manner. Authors from Tencent Inc. and City University of Hong Kong.
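As a rough illustration of the alignment idea, here is a multi-kernel MMD loss with a bank of Gaussian bandwidths. The bandwidths, the biased estimator, and the tensor shapes are assumptions for the sketch, not the paper’s exact recipe:

```python
import torch

def gaussian_kernel(x, y, sigma):
    # k(a, b) = exp(-||a - b||^2 / (2 * sigma^2)) for all row pairs of x and y
    return torch.exp(-torch.cdist(x, y) ** 2 / (2 * sigma ** 2))

def multi_kernel_mmd(x, y, sigmas=(1.0, 2.0, 4.0, 8.0)):
    # Biased MMD^2 estimate summed over a bank of Gaussian kernels; x and y
    # are (batch, dim) batches, e.g. collaborative vs. LLM text embeddings.
    mmd2 = 0.0
    for s in sigmas:
        mmd2 = mmd2 + (gaussian_kernel(x, x, s).mean()
                       + gaussian_kernel(y, y, s).mean()
                       - 2 * gaussian_kernel(x, y, s).mean())
    return mmd2

# Usage: minimize this term to pull the two embedding spaces together.
sr_emb, llm_emb = torch.randn(32, 64), torch.randn(32, 64)
loss = multi_kernel_mmd(sr_emb, llm_emb)
```

Summing over several bandwidths is the usual trick to avoid hand-tuning a single kernel width.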
Semantic Retrieval at Walmart - presents a hybrid e-commerce search system deployed at Walmart that combines a traditional inverted index with embedding-based neural retrieval to better answer user tail queries. The system improved the relevance of the search engine, as measured by both offline and online evaluations, through a combination of approaches; the paper highlights multiple learnings and practical tricks used in deployment (a toy score-fusion sketch follows below). Authors from Walmart Global Technology.
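The core idea - fusing lexical and embedding scores - fits in a few lines. This toy sketch uses rank_bm25 and sentence-transformers with a simple weighted sum; the fusion weight, models, and normalization are assumptions, not Walmart’s production setup:

```python
# Toy hybrid retrieval: fuse BM25 (inverted-index style) and dense scores.
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

docs = ["red running shoes", "wireless earbuds", "kids rain boots"]
query = "sneakers for jogging"

# Lexical scores from BM25 over whitespace tokens.
bm25 = BM25Okapi([d.split() for d in docs])
lexical = np.array(bm25.get_scores(query.split()))

# Dense cosine scores from a small open embedding model.
model = SentenceTransformer("all-MiniLM-L6-v2")
dense = util.cos_sim(model.encode(query), model.encode(docs))[0].numpy()

def norm(s):
    # Min-max normalize so both score lists live on one scale before fusing.
    return (s - s.min()) / (s.max() - s.min() + 1e-9)

alpha = 0.5  # assumed fusion weight; tuned offline in practice
hybrid = alpha * norm(lexical) + (1 - alpha) * norm(dense)
print(sorted(zip(hybrid, docs), reverse=True))
```

Note how the dense side can still score “sneakers for jogging” against “red running shoes” even with zero token overlap, which is exactly the tail-query gap the hybrid setup targets.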
HEAL: Hierarchical Embedding Alignment Loss for Improved Retrieval and Representation Learning - HEAL leverages hierarchical fuzzy clustering with matrix factorization within contrastive learning to efficiently align LLM embeddings with domain-specific content. It computes level/depth-wise contrastive losses and incorporates hierarchical penalties to align embeddings with the underlying relationships in label hierarchies, enhancing retrieval relevance and document classification and effectively reducing hallucinations in LLM outputs (a rough sketch of the level-wise idea follows below). Authors from Los Alamos National Laboratory, University of Maryland, and Harvard.
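A hedged sketch of the level-wise idea: run a supervised contrastive loss at each depth of the label hierarchy and weight the levels. The weights and loss form here are illustrative assumptions, not HEAL’s exact formulation:

```python
import torch
import torch.nn.functional as F

def sup_con(emb, labels, temp=0.07):
    # Supervised contrastive loss: items sharing a label at this level are positives.
    emb = F.normalize(emb, dim=1)
    sim = emb @ emb.T / temp
    eye = torch.eye(len(emb), dtype=torch.bool)
    sim = sim.masked_fill(eye, float("-inf"))        # exclude self-pairs
    pos = (labels[:, None] == labels[None, :]) & ~eye
    log_p = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    sum_pos = log_p.masked_fill(~pos, 0.0).sum(1)    # sum log-prob over positives
    has_pos = pos.any(dim=1)                         # skip anchors with no positive
    return -(sum_pos[has_pos] / pos.sum(1)[has_pos]).mean()

def hierarchical_loss(emb, labels_per_level, weights=(1.0, 0.5, 0.25)):
    # One contrastive term per hierarchy depth, with assumed per-level weights,
    # e.g. labels_per_level = [coarse_ids, mid_ids, fine_ids] for a 3-level taxonomy.
    return sum(w * sup_con(emb, lab) for w, lab in zip(weights, labels_per_level))
```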
DogLayout: Denoising Diffusion GAN for Discrete and Continuous Layout Generation - integrates a diffusion process into GANs, enabling the generation of discrete label data and significantly reducing diffusion’s sampling time. Experiments show that DogLayout reduces sampling cost by up to 175x and cuts overlap from 16.43 to 9.59 compared to existing diffusion models, while also surpassing GAN-based and other layout methods. Authors from Fudan University.
A Survey of Sustainability in Large Language Models: Applications, Economics, and Challenges - This survey comprehensively examines the environmental, economic, and computational challenges associated with LLMs, focusing on energy consumption, carbon emissions, and resource utilization in data centers. By synthesizing insights from existing literature, it explores strategies such as resource-efficient training, sustainable deployment practices, and lifecycle assessments to mitigate the environmental impacts of LLMs. Key areas of emphasis include energy optimization, renewable energy integration, and balancing performance with sustainability. Authors from various institutions, including Cleveland State University, MathWorks, and Northeastern University.
🧠 Sources of Inspiration
Chatbot Arena - did you know this was a student project from Berkeley last year?