Amazon's New Nova AI Models 🔥
Change image styles with GPT-4o

Welcome to another edition of Horizon AI!
Amazon is showcasing its take on a more conversational voice model to compete with Google’s Gemini Live and OpenAI’s Advanced Voice Mode, along with an update to its AI video model.
Let’s get into it!
Read Time: 4.5 min
Here's what's new today in Horizon AI
Amazon Launches Nova Sonic Voice Model
New AI Paper Generates 1-Minute Long Tom & Jerry Clips With Simple Text Prompts
AI Tutorial: Change image styles with GPT-4o
AI Tools to check out
The Latest in AI and Tech 💡
AI Findings/Resources
AI News
AMAZON
Amazon Launches Nova Sonic Voice Model

Amazon has debuted a new generative AI model, Nova Sonic, capable of natively processing voice and generating natural-sounding speech.
Details:
The Nova Sonic model is designed to enable third-party app developers to build real-time, naturalistic conversational voice interactivity into their products using Amazon’s web platform, Bedrock.
The model reportedly matches the performance of leading speech models from OpenAI and Google in key metrics such as speed, speech recognition, and call quality, while offering an 80% lower cost compared to OpenAI's GPT-4o.
Alongside Nova Sonic, Amazon introduced Nova Reel 1.1, an updated video generation model that delivers improved quality, reduced latency, and the ability to maintain consistent visual styles across multiple six-second scenes—allowing for the creation of coherent videos up to two minutes long.
Developers can access both models through Amazon's Bedrock platform, and components of the Sonic model are already incorporated into the new Alexa Plus assistant.
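As a quick sanity check on the Nova Reel 1.1 numbers above, the arithmetic below works out how many six-second scenes fit into a two-minute video. The constant and function names are illustrative only and are not part of any Amazon API.

```python
# Back-of-the-envelope arithmetic only; nothing here calls Amazon Bedrock.
SCENE_SECONDS = 6            # per-scene length mentioned above
MAX_VIDEO_SECONDS = 2 * 60   # "up to two minutes"


def scenes_needed(video_seconds: int, scene_seconds: int = SCENE_SECONDS) -> int:
    """Number of fixed-length scenes needed to cover video_seconds (rounded up)."""
    if video_seconds <= 0 or scene_seconds <= 0:
        raise ValueError("durations must be positive")
    return -(-video_seconds // scene_seconds)  # ceiling division


print(scenes_needed(MAX_VIDEO_SECONDS))  # 20
```

So a full-length video would mean keeping the visual style consistent across roughly 20 scenes.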
AI RESEARCH
New AI Paper Generates 1-Minute Long Tom & Jerry Clips With Simple Text Prompts

Researchers from NVIDIA, Stanford University, UC San Diego, UC Berkeley, and the University of Texas at Austin have developed a method for generating longer, more coherent AI videos that can tell complex stories.
Details:
By incorporating Test-Time Training (TTT) layers into pre-trained Transformer models, they enabled the generation of videos up to one minute long, a substantial increase from previous limitations of 8 to 20 seconds.
Traditional Transformer models face challenges with longer videos due to their self-attention mechanisms, which require each element to relate to every other element, leading to quadratic increases in computational demands.
The introduced TTT layers address this by adding mini neural networks that learn during the video generation process, enhancing memory retention and consistency across longer sequences.
To demonstrate this technique, the team applied it to generate extended sequences of Tom and Jerry cartoons, producing coherent clips up to one minute in length, which you can check out on the project page. This progress opens new possibilities for AI in entertainment and other domains requiring extended video generation.
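To make the quadratic-cost point concrete, here is a rough sketch that counts pairwise comparisons in full self-attention versus a layer that updates a running state once per token, which is the general shape of the TTT idea as described above. The tokens-per-second figure is a made-up assumption for illustration, not a number from the paper.

```python
# Illustrative counting only; TOKENS_PER_SECOND is an assumed, hypothetical value.
TOKENS_PER_SECOND = 1_000


def attention_pairs(num_tokens: int) -> int:
    """Full self-attention relates every token to every token: n * n pairs."""
    return num_tokens * num_tokens


def per_token_updates(num_tokens: int) -> int:
    """A layer that updates a running state once per token does n updates."""
    return num_tokens


for seconds in (10, 20, 60):
    n = seconds * TOKENS_PER_SECOND
    print(
        f"{seconds:>2}s clip: {attention_pairs(n):,} attention pairs "
        f"vs {per_token_updates(n):,} per-token updates"
    )
```

Whatever the real token rate is, the ratio holds: going from a 20-second clip to a 60-second clip triples the tokens but multiplies the attention pairs by nine.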
AI Tutorial
Change image styles with GPT-4o

Go to ChatGPT and choose ‘GPT-4o’ as your model.
Upload your image.
Use the prompt: Recreate this image with a style: [Insert style]
Some options to try:
Recreate this image with a style: Studio Ghibli
Recreate this image with a style: Pixar
Recreate this image with a style: Dragon Ball
Recreate this image with a style: Lego
Recreate this image with a style: Hand-knitted doll
Recreate this image with a style: Funko Pop
Recreate this image with a style: Rick and Morty
Recreate this image with a style: Hanna Barbera
Recreate this image with a style: Manga
Recreate this image with a style: Simpsons
Recreate this image with a style: South Park
Recreate this image with a style: Gothic Stop Motion
Recreate this image with a style: Barbie
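If you want to try several styles in a row, a small script can build the prompts for you to paste in. This is a minimal sketch: the helper name and the shortened style list are illustrative, and it only builds text, it does not upload images or call ChatGPT.

```python
# Hypothetical helper: builds prompt text only; it does not call ChatGPT.
PROMPT_TEMPLATE = "Recreate this image with a style: {style}"

# A few of the styles listed above; add or remove entries as you like.
STYLES = ["Studio Ghibli", "Pixar", "Lego", "Funko Pop", "Manga"]


def build_prompt(style: str) -> str:
    """Fill the tutorial's prompt template with a single style name."""
    style = style.strip()
    if not style:
        raise ValueError("style must not be empty")
    return PROMPT_TEMPLATE.format(style=style)


for style in STYLES:
    print(build_prompt(style))
```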
AI Tools to check out
🗣 EverTutor Live: AI-powered voice tutor that teaches, adapts, and interacts.
✨ DreamActor-M1: Upload your image and watch it come to life with our state-of-the-art animation technology.
💥 Paragon: All-in-one platform to build, ship, and manage product integrations.
🤖 Devin: A collaborative AI teammate built to help ambitious engineering teams achieve more.
✨ GitSummarize: Turn any GitHub repository into a comprehensive AI-powered documentation hub.
AI Findings/Resources
🤔 AI experts say we’re on the wrong path to achieving human-like AI
👨‍💻 ‘Don’t study coding,’ says Replit CEO
📷 5 ways to use Gemini Live with camera and screen sharing
👉 Tech’s big anxiety: fewer jobs, lower pay, more AI
The Latest in AI and Tech
Thinking Machines Lab has brought on former OpenAI leaders Bob McGrew and Alec Radford as advisors. They join other ex-OpenAI figures, including co-founder John Schulman and former post-training lead Barret Zoph.
Samsung is adding Google’s Gemini AI to its home robot Ballie through a partnership with Google Cloud. The AI will enable Ballie to handle audio and video inputs to answer different questions.
The European Commission has presented the “AI Continent Action Plan,” which will aim to “transform Europe’s strong traditional industries and its exceptional talent pool into powerful engines of AI innovation and acceleration.”
Rumors have circulated in recent days that robotaxi company Waymo might use data from interior vehicle cameras to train AI and serve targeted ads to riders. However, a company spokesperson has clarified that there are no such plans.
That’s a wrap!
We'd love to hear your thoughts on today's email! Your feedback helps us improve our content.
Not subscribed yet? Sign up here and send it to a colleague or friend!
See you in our next edition!
Gina 👩🏻‍💻