The AI Canvas Newsletter #16
Explore AI breakthroughs: NVIDIA's efficiency leap, Stable Video 3D's innovation, and DeepMind's SIMA mastery...
This Week in AI: Innovations, Reports, and Features
🚀 NVIDIA's Blackwell platform: A leap in AI efficiency, cutting cost and energy use by up to 25x, paired with Project GR00T for advanced humanoid robotics.
🧠 Grok-1 goes open-source: xAI's colossal 314 billion parameter model is now available for community innovation under the Apache 2.0 license.
🎞️ Stable Video 3D's breakthrough: Easily convert images to 3D models with new SV3D_u and SV3D_p versions for enhanced multi-angle visuals.
🎮 DeepMind's SIMA: Mastering diverse 3D game worlds with natural language, showcasing AI's potential to adapt and assist in complex environments.
NVIDIA Unveils Blackwell Platform and Project GR00T for Next-Gen AI and Robotics
NVIDIA has introduced the Blackwell platform, designed to make AI model training and inference significantly more efficient, and announced Project GR00T, a foundation model for humanoid robots. The Blackwell platform promises a reduction of up to 25x in operating cost and energy use for large language models, while Project GR00T aims to advance robotics with natural language understanding and human-like dexterity.
Have a read about the Blackwell platform here and GR00T here.
Grok-1 Goes Public: xAI Releases 314 Billion Parameter Model
xAI has announced the open-source release of Grok-1, a language model with 314 billion parameters, under the Apache 2.0 license. This Mixture-of-Experts model is now accessible for developers and researchers to explore and integrate into their projects, marking a significant contribution to the AI community. Details and download instructions are provided on xAI's GitHub repository.
Read more from xAI here.
Enhancing 3D Imagery: A Look at Stable Video 3D's Latest Innovations
The new Stable Video 3D technology offers a simplified way to transform single images into detailed 3D models and videos. With its two new versions, SV3D_u and SV3D_p, users can now easily generate consistent and realistic multi-angle views, available for both commercial and non-commercial projects.
Read more on their blog post here.
SIMA: A Versatile AI Agent for Diverse 3D Game Environments
Google DeepMind's latest research introduces the Scalable Instructable Multiworld Agent (SIMA), an AI capable of understanding and executing tasks in various video games through natural-language instructions. Trained across multiple games without needing game-specific code or APIs, SIMA demonstrates the potential for AI to generalise learning and perform in both familiar and novel virtual settings. This advancement suggests a future where AI can assist with complex tasks in dynamic, real-world environments.
Read more on DeepMind’s blog here.
Technical Reads
Evolving New Foundation Models: Unleashing the Power of Automating Model Development – Sakana AI
“The core research focus of Sakana AI is in applying nature-inspired ideas, such as evolution and collective intelligence, to create new foundation models. We are currently developing technology that makes use of evolution with the goal of automating the development of foundation models with particular abilities suitable for user-specified application domains. Our goal isn’t to just train any particular individual model. We want to create the machinery to automatically generate foundation models for us!”
Beyond Self-Attention: How a Small Language Model Predicts the Next Token – Shyam Pather
“I trained a small (~10 million parameter) transformer following Andrej Karpathy’s excellent tutorial, Let’s build GPT: from scratch, in code, spelled out. After getting it working, I wanted to understand, as deeply as possible, what it was doing internally and how it produced its results.”
What I learned from looking at 900 most popular open source AI tools – Chip Huyen
“Four years ago, I did an analysis of the open source ML ecosystem. Since then, the landscape has changed, so I revisited the topic.”
You can now train a 70b language model at home – Answer.ai
“We’re releasing an open source system, based on FSDP and QLoRA, that can train a 70b model on two 24GB GPUs.”
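For a rough sense of why that is notable, here is a back-of-the-envelope sketch of the memory needed just to hold a 70b model's weights. This is our own arithmetic, not code from the Answer.ai release: the function name and the 16-bit versus 4-bit comparison are illustrative assumptions, and the figures ignore activations, adapter weights and optimizer state.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Memory in decimal GB needed to store the weights alone (hypothetical helper)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS_70B = 70_000_000_000   # "70b" taken as 70 billion parameters
TOTAL_GPU_GB = 2 * 24         # two 24GB GPUs, as in the quote above

fp16_gb = weight_memory_gb(PARAMS_70B, 16)  # assumed 16-bit baseline
int4_gb = weight_memory_gb(PARAMS_70B, 4)   # assumed 4-bit quantized weights

print(f"16-bit weights: {fp16_gb:.0f} GB")
print(f"4-bit weights:  {int4_gb:.0f} GB")
print("4-bit weights fit on a single 24GB GPU:", int4_gb <= 24)
print("4-bit weights fit when split across both:", int4_gb <= TOTAL_GPU_GB)
```

Under these assumptions, even 4-bit weights are too large for one 24GB card, which is why combining quantization with sharding across GPUs matters.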
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs – Fengqing Jiang et al.
“In this paper, we propose a novel ASCII art-based jailbreak attack and introduce a comprehensive benchmark Vision-in-Text Challenge (ViTC) to evaluate the capabilities of LLMs in recognizing prompts that cannot be solely interpreted by semantics.”
Projects, Code and Discussions
Demystifying how GPT works: From Architecture to...Excel!?! 🚀
“Think AI is too complex? Think again! If you're spreadsheet-savvy, you're ready to grasp modern AI.”
“Automate browser-based workflows with LLMs and Computer Vision”
Ask HN: If you've used GPT-4-Turbo and Claude Opus, which do you prefer?
An interesting discussion comparing GPT-4 Turbo with Claude 3 Opus. It seems that Opus may have the edge!
“We show that GPT-4's reasoning and planning capabilities extend to the 1993 first-person shooter Doom.”
“Now we don't need to buy a Piano if we want to play music. We can play piano on paper, although it may not give a feeling of pressing keys on the piano but it gets work done!”
Learning
Monte-Carlo Graph Search from First Principles
This page offers a practical insight into Monte-Carlo Graph Search, an advanced AI technique crucial for optimizing search algorithms in complex state spaces with transpositions. It is essential reading for AI practitioners and researchers aiming to enhance their understanding of search algorithms and their applications in cutting-edge AI systems.
Journey Through the AI Canvas Podcast
Latest Episode of the AI Canvas
The AI Canvas - Generative AI in the Classroom: The Future of Learning with Francisco Recalde
In this enlightening episode of the AI Canvas podcast, host David Foster sits down with Francisco Recalde, Head of the Department of Languages at Dixon's Unity Academy, to explore the transformative effects of AI on education. They discuss AI’s potential in teaching and learning, the fear of AI replacing teachers, and the role of AI as a guide for students.
David Foster
Founding Partner, ADSP