OpenAI Launches Sora, a Text-to-Video AI Generator

Good morning! OpenAI has unveiled Sora, a new text-to-video AI generator that can create highly realistic videos up to 60 seconds long from text prompts, raising concerns about misinformation. Meanwhile, Amazon announced BASE TTS, the largest text-to-speech model to date at 980 million parameters, which achieves new levels of voice quality and expressiveness. In other news, Apple has confirmed it is removing support for progressive web apps (PWAs) on iPhones in the EU in response to new regulations, a move that some see as anti-competitive and that sets back Safari’s web app capabilities.

OpenAI Launches Sora, a Text-to-Video AI Generator

Yesterday, AI lab OpenAI unveiled a new text-to-video tool called Sora. It can generate highly realistic videos up to 60 seconds long from simple text descriptions. Early examples show it creating complex scenes with multiple characters, objects, and camera movements.

How Does It Work? Sora is a "diffusion model": it starts with random noise and gradually transforms it into a coherent video by removing noise over many steps. It builds on past OpenAI research such as DALL-E for images and GPT for language. Sora represents videos as collections of smaller units of data called "patches", which lets it train on diverse video datasets.
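
To make the "patches" idea concrete, here is a minimal sketch of cutting a tiny video into space-time blocks. OpenAI has not released Sora's code, so the function name `patchify`, the patch sizes, and the single-channel toy video are all assumptions made for illustration; a real system would likely work on a compressed representation rather than raw pixels.

```python
from itertools import product


def patchify(video, pt, ph, pw):
    """Split a video (nested list indexed [frame][row][col]) into flat patches.

    Each patch covers pt frames x ph rows x pw columns and is returned as a
    flat list of values, so the whole video becomes a sequence of patches.
    Hypothetical helper for illustration only; not Sora's implementation.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    # Visit the starting corner (frame, row, column) of every patch.
    for t0, y0, x0 in product(range(0, frames, pt),
                              range(0, height, ph),
                              range(0, width, pw)):
        patch = [video[t][y][x]
                 for t in range(t0, t0 + pt)
                 for y in range(y0, y0 + ph)
                 for x in range(x0, x0 + pw)]
        patches.append(patch)
    return patches


# Toy "video": 4 frames of 4x4 pixels, each pixel holding a unique number.
video = [[[t * 16 + y * 4 + x for x in range(4)]
          for y in range(4)]
         for t in range(4)]

patches = patchify(video, pt=2, ph=2, pw=2)
print(len(patches), "patches of", len(patches[0]), "values each")
```

Because every clip ends up as a list of same-sized patches regardless of its resolution or length, videos of different shapes can be fed to one model in the same way, which is the property the paragraph above refers to.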

Concerns Raised

  • OpenAI says Sora understands how objects exist and behave in the physical world, but it can still struggle to simulate the physics of complex scenes accurately.

  • The rapid quality improvements raise concerns about how quickly AI-generated media could spread misinformation or threaten creative jobs that rely on human talent.

Amazon Unveils Largest Text-to-Speech Model Ever Made

Amazon recently announced BASE TTS, the largest text-to-speech model to date at 980 million parameters. It was trained on 100,000 hours of speech data to achieve new levels of voice quality and expressiveness.

Model Architecture

  • A speech tokenizer encodes audio into discrete speech representations.

  • An autoregressive model then predicts speech representations based on text and reference audio.

  • A decoder converts the predicted representations into waveforms.
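
The three stages above can be sketched as a toy pipeline. BASE TTS is not publicly available, so nothing here reflects Amazon's code: the function names, the codebook size, and the arithmetic rules standing in for learned models are all hypothetical. The sketch also assumes the tokenizer operates on audio, and it only shows how data flows from one stage to the next.

```python
import math

CODEBOOK_SIZE = 8      # hypothetical number of discrete speech codes
SAMPLES_PER_CODE = 4   # hypothetical number of waveform samples per code


def tokenize_speech(waveform):
    """Stage 1 stand-in: quantize audio samples in [-1, 1] into discrete codes."""
    codes = []
    for sample in waveform:
        clipped = max(-1.0, min(1.0, sample))
        index = int((clipped + 1.0) / 2.0 * CODEBOOK_SIZE)
        codes.append(min(CODEBOOK_SIZE - 1, index))
    return codes


def predict_speech_codes(text, reference_codes):
    """Stage 2 stand-in: emit one code per character, autoregressively.

    Each new code depends on the text, the reference audio codes, and the
    codes generated so far. A real model would be a learned network; this
    arithmetic rule only mimics the dependency structure.
    """
    generated = []
    for ch in text:
        context = reference_codes + generated
        prev = context[-1] if context else 0
        generated.append((ord(ch) + prev) % CODEBOOK_SIZE)
    return generated


def decode_to_waveform(codes):
    """Stage 3 stand-in: turn each code into a short burst of sine samples."""
    waveform = []
    for code in codes:
        freq = (code + 1) / CODEBOOK_SIZE
        waveform.extend(math.sin(2 * math.pi * freq * n / SAMPLES_PER_CODE)
                        for n in range(SAMPLES_PER_CODE))
    return waveform


# Reference audio sets the "voice"; the text determines what is said.
reference_codes = tokenize_speech([0.0, 0.5, -0.5, 1.0])
speech_codes = predict_speech_codes("hi", reference_codes)
audio = decode_to_waveform(speech_codes)
print(reference_codes, speech_codes, len(audio))
```

The point of the middle stage is that it works entirely on discrete codes, much like a language model works on text tokens, while the decoder is the only part that produces actual audio samples.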

Emergent Capabilities

In tests, BASE TTS demonstrated linguistic skills like:

  • Using compound nouns

  • Expressing emotion

  • Applying punctuation

It also handles code-switching between languages more naturally.

Future Applications: Amazon is keeping BASE TTS private but will use insights from its development to improve other TTS systems. BASE TTS paves the way for consumer TTS applications with increasingly human-like voices and language mastery - key capabilities for natural-sounding text-to-speech interactions.

Apple Wants To Kill Progressive Web Apps (PWAs)

The EU's Digital Markets Act (DMA) requires large tech companies like Apple to open their platforms to more competition. Specifically, the DMA mandates that Apple allow browser engines on iOS other than WebKit, the engine behind Safari.

In response, Apple has confirmed it is removing support for progressive web apps (PWAs) on iPhones in the EU. Apple claims that permitting alternate browser engines poses security risks. Rather than rebuild the underlying architecture, Apple opted to remove PWA functionality entirely.

  • Some see this move as anti-competitive since PWAs can rival native apps

  • However, Apple states that PWA usage was very low on iOS

The DMA still compels Apple to enable app sideloading and alternate app stores, so other browsers will likely still come to iOS, just without the ability to install web apps to the home screen.

This is frustrating timing, as Safari on iOS was finally gaining improved support for PWAs. Losing installed web apps is a major setback. However, Apple may have felt it had no feasible choice but to comply with the DMA within the mandated timeframe, even if that meant drastically limiting web app capabilities.

🔥 More Notes

🎥 YouTube Spotlight

The Future of War will be in Cyberspace

Johnny Harris explores the increasing threat of cyber warfare, using real-life examples like the Stuxnet virus, the 2016 cyber attacks during the US presidential election, and the widespread impact of cyber weapons like EternalBlue and NotPetya. He delves into the potential for cyber warfare to be deployed subtly, influencing geopolitics and national security in unprecedented ways.
