The AI Canvas Newsletter #12
Explore the latest AI advancements: Google's Lumiere, Olympiad-level geometric reasoning with AlphaGeometry, NVIDIA's HDR video enhancement!
The AI Canvas: Your weekly palette of inspiration, insights, and innovation in the world of AI.
- 📽️ Lumiere: Google Research released a new text-to-video diffusion model for content creation, built on a Space-Time U-Net architecture.
- 🔍 AlphaGeometry: AI achieves Olympiad-level geometric reasoning, solving complex problems comparable to human gold medallists using a neuro-symbolic approach.
- 🎨 NVIDIA RTX Video HDR: AI-driven clarity enhancement transforms standard videos into HDR for an immersive viewing experience, with RTX Remix empowering game remastering.
- 🚀 OpenAI Enhanced AI Tools: OpenAI unveils new embedding models and API tools, alongside updates to GPT-4 Turbo and GPT-3.5 Turbo, aimed at enhancing developer efficiency and application performance.
Written by Oli Wilkins.
Google’s Lumiere: Video Synthesis with Space-Time Diffusion
Google Research unveils Lumiere, a novel text-to-video diffusion model that crafts high-quality videos through a Space-Time U-Net architecture. This approach generates full-frame-rate videos in a single pass, ensuring global temporal consistency and supporting a variety of content creation tasks such as video inpainting and stylised generation.
Find out more on Google Research’s Lumiere page here.
AlphaGeometry: AI's Leap into Olympiad-Level Geometric Reasoning
AlphaGeometry demonstrates a significant advancement in AI's ability to solve complex geometry problems, rivalling the performance of human International Mathematical Olympiad (IMO) gold medallists. Utilising a neuro-symbolic approach that combines a neural language model with a symbolic deduction engine, the system successfully solved 25 out of 30 Olympiad geometry problems within the competition time limits. The research, which also involved generating 100 million unique synthetic training examples, marks a notable step towards sophisticated AI reasoning in mathematics.
Find out more at DeepMind’s blog here.
AI-Enhanced Visual Fidelity with NVIDIA RTX Video HDR
NVIDIA harnesses AI to redefine video clarity through RTX Video HDR, enabling the conversion of standard videos to HDR on RTX GPUs for a superior viewing experience. The RTX Remix open beta further empowers modders with AI-driven tools to create visually stunning remasters of classic games, showcasing the fusion of AI with creative and gaming applications.
Check NVIDIA’s blog for more information.
Enhancing AI Integration: New Embedding Models and API Tools
OpenAI introduces two new embedding models with improved performance and reduced pricing, alongside updates to GPT-4 Turbo and GPT-3.5 Turbo models. The release also features new API management tools for developers, a robust moderation model, and forthcoming lower pricing for GPT-3.5 Turbo, all aimed at enhancing developer experience and application efficiency.
Read more here.
Technical Reads
What's new with ML in production – Vicki Boykis
“In 2023, I wrote two pieces on machine learning engineering for The Pragmatic Programmer. (Part 1 and Part 2). However, since I started working with LLMs recently, neural architectures have changed some of those assumptions.”
Sampling for Text Generation – Chip Huyen
“ML models are probabilistic. Imagine that you want to know what’s the best cuisine in the world. If you ask someone this question twice, a minute apart, their answers both times should be the same. If you ask a model the same question twice, its answer can change. If the model thinks that Vietnamese cuisine has a 70% chance of being the best cuisine and Italian cuisine has a 30% chance, it’ll answer “Vietnamese” 70% of the time, and “Italian” 30%. This probabilistic nature makes AI great for creative tasks. What is creativity but the ability to explore beyond the common possibilities, to think outside the box?”
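The 70/30 example in the quote above can be sketched in a few lines of Python. This is a minimal illustration, not code from Chip Huyen's article: the function name `sample_answers`, the fixed seed, and the sample size are assumptions chosen for the example, and the "model" is just a two-entry probability table.

```python
import random
from collections import Counter


def sample_answers(probabilities, n_samples, seed=0):
    """Draw n_samples answers from a categorical distribution.

    `probabilities` maps each candidate answer to its probability.
    The seed is fixed only so that repeated runs give the same draws.
    """
    rng = random.Random(seed)
    answers = list(probabilities)
    weights = [probabilities[a] for a in answers]
    # random.choices samples with replacement, proportionally to the weights
    return rng.choices(answers, weights=weights, k=n_samples)


# Hypothetical model output: 70% Vietnamese, 30% Italian (from the quote)
cuisine_probs = {"Vietnamese": 0.7, "Italian": 0.3}

samples = sample_answers(cuisine_probs, n_samples=10_000)
counts = Counter(samples)

# Share of each answer across all draws; it approaches the
# underlying probabilities as the number of samples grows
share = {answer: counts[answer] / len(samples) for answer in cuisine_probs}
print(share)
```

Asking the same "question" many times returns different answers, with each answer appearing roughly in proportion to its probability.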
Beware of misleading GPU vs CPU benchmarks – Itamar Turner-Trauring
“Do you use NumPy, Pandas, or scikit-learn and want to get faster results? Nvidia has created GPU-based replacements for each of these with the shared promise of extra speed. For example, if you visit the front page of NVidia’s RAPIDS project, you’ll see benchmarks showing cuDF, a GPU-based Pandas replacement, is 15× to 80× faster than Pandas! Unfortunately, while those speed-ups are impressive, they are also misleading. GPU-based libraries might be the answer to your performance problems… or they might be an unnecessary and expensive distraction.”
Projects
“TL;DR: LangGraph is module built on top of LangChain to better enable creation of cyclical graphs, often needed for agent runtimes.”
“A distraction-free LLM chat web app optimized for Kindle. The perfect companion for your book. Powered by Mixtral from Mistral AI. Mainly tested on Kindle Paperwhites.”
“An Open Source text-to-speech system built by inverting Whisper. Previously known as spear-tts-pytorch. We want this model to be like Stable Diffusion but for speech – both powerful and easily customizable.”
Learning
Vector Databases: A Technical Primer - A collection of slides that give an excellent technical overview of vector databases.
Understanding Deep Learning - A free book from Simon J.D. Prince, covering the underlying ideas behind deep learning.
Machine Learning Engineering Open Book - An open collection of methodologies to help with successful training of large language models and multi-modal models.
Code
“structured outputs for llms”
“Metrics Observability & Troubleshooting”
“🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.”
“This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats.”
Business and Trends
- OpenAI CEO Sam Altman is still chasing billions to build AI chips
- Elon Musk’s AI startup that hopes to leapfrog OpenAI is halfway to its goal of landing $1 billion in funding
- Voice AI startup ElevenLabs gains unicorn status after latest fundraising
- Google DeepMind Scientists in Talks to Leave and Form AI Startup
Quick Links
- Elon Musk says Neuralink has implanted a computer chip in someone’s brain
- New test detects ovarian cancer earlier — thanks to artificial intelligence
- Scientists use AI to predict when cancer cells will resist chemotherapy
- AlphaFold found thousands of possible psychedelics. Will its predictions help drug discovery?
- Weirdest Chatbots on OpenAI’s GPT Store, Ranked
🚀 Don't miss your weekly dose of cutting-edge AI innovations with The AI Canvas newsletter!
Subscribe now to ensure you never miss out on these transformative insights.
Looking for more specialised consultancy? At ADSP we’re a team of data experts who build AI products with purpose.
We deliver data science projects for companies who want to harness the power that AI can bring to their organisation. Get in touch at hello@adsp.ai.
Stay tuned with The AI Canvas podcast for in-depth episodes exploring Generative AI's transformative role across various industries.