OpenAI’s Latest Dev Tools Could Save You Thousands: What You Need to Know

Key Takeaways

  • OpenAI announced four API updates at DevDay 2024 to enhance developer tools.
  • The new tools include Prompt Caching, Model Distillation, Realtime, and Vision Fine-Tuning.
  • Pricing spans temporary free training-token allowances in October and per-token rates for text and audio processing.

At DevDay 2024 on October 1st, OpenAI announced API updates aimed at helping developers customize models, build speech applications, cut costs, and enhance the performance of smaller models.

At its San Francisco event, OpenAI highlighted incremental improvements to its AI tools and APIs instead of major product launches.

The company introduced four key API updates:

  • Model Distillation
  • Prompt Caching
  • Vision Fine-Tuning
  • Realtime

These tools reflect OpenAI’s shift toward empowering its developer ecosystem instead of competing directly in the end-user application market.

Realtime API

OpenAI has made its Advanced Voice Mode available to all ChatGPT subscribers and is now enabling developers to create speech-to-speech applications. Previously, building AI-powered applications that spoke to users required transcribing audio, processing it with a language model like GPT-4, and converting the text back to speech, which often introduced noticeable latency.
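The legacy approach can be sketched as three chained stages, with each stage's latency adding to the round trip. The function names below are placeholders standing in for real speech-to-text, language-model, and text-to-speech calls, not actual SDK functions:

```python
import time

# Illustrative sketch of the pre-Realtime voice pipeline: speech-to-text,
# then a text model, then text-to-speech. Each stage is a stub with a small
# simulated delay; the point is that latencies accumulate across the chain.

def transcribe(audio: bytes) -> str:
    """Stand-in for a speech-to-text model."""
    time.sleep(0.01)
    return "user said hello"

def generate_reply(text: str) -> str:
    """Stand-in for a text model such as GPT-4."""
    time.sleep(0.01)
    return f"reply to: {text}"

def synthesize(text: str) -> bytes:
    """Stand-in for a text-to-speech model."""
    time.sleep(0.01)
    return text.encode()

def voice_turn(audio: bytes) -> bytes:
    """One conversational turn through the chained pipeline."""
    start = time.perf_counter()
    reply_audio = synthesize(generate_reply(transcribe(audio)))
    print(f"round trip: {time.perf_counter() - start:.3f}s")
    return reply_audio

voice_turn(b"\x00\x01")
```

The Realtime API's appeal is that it collapses this chain into a single speech-to-speech call, removing the per-stage handoffs.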

The new Realtime API processes audio directly, without chaining multiple models together. It supports function calling, enabling tasks such as ordering pizza or scheduling appointments, with future updates planned for multimodal experiences, including video.

For text, the API costs $5 per million input tokens and $20 per million output tokens. Audio processing is priced at $100 per million input tokens and $200 per million output tokens, which works out to approximately $0.06 per minute of audio input and $0.24 per minute of audio output.
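As a sanity check, the per-minute figures can be reconciled with the per-token rates using only the numbers quoted above; the tokens-per-minute values below are derived from those figures, not an official constant:

```python
# Convert the article's Realtime audio rates into per-minute costs and back
# out the implied tokens-per-minute of audio. All constants come from the
# quoted pricing; nothing here is an official OpenAI figure beyond that.

PRICE_PER_M_INPUT = 100.0    # USD per million audio input tokens
PRICE_PER_M_OUTPUT = 200.0   # USD per million audio output tokens
COST_PER_MIN_INPUT = 0.06    # USD per minute of audio input
COST_PER_MIN_OUTPUT = 0.24   # USD per minute of audio output

def implied_tokens_per_minute(cost_per_min: float, price_per_million: float) -> float:
    """Tokens of audio per minute implied by the two quoted rates."""
    return cost_per_min / price_per_million * 1_000_000

def session_cost(minutes_in: float, minutes_out: float) -> float:
    """Estimated cost of a voice session in USD, from the per-minute rates."""
    return minutes_in * COST_PER_MIN_INPUT + minutes_out * COST_PER_MIN_OUTPUT

print(implied_tokens_per_minute(COST_PER_MIN_INPUT, PRICE_PER_M_INPUT))    # 600.0
print(implied_tokens_per_minute(COST_PER_MIN_OUTPUT, PRICE_PER_M_OUTPUT))  # 1200.0
print(round(session_cost(10, 5), 2))  # a 10-min-in / 5-min-out call -> 1.8
```

At these rates, a ten-minute customer call with five minutes of spoken responses costs under $2, which is the kind of budgeting arithmetic developers will want to run before deploying voice agents.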

Introducing Vision to the Fine-Tuning API

Developers can now fine-tune GPT-4o with images, improving its visual recognition for applications such as visual search, object detection, and medical image analysis.

For instance, OpenAI says that Grab, a food delivery and rideshare company, transforms driver-collected street imagery into mapping data for GrabMaps. Using just 100 examples, Grab fine-tuned GPT-4o to localize traffic signs and count lane dividers, improving lane-count accuracy by 20% and speed-limit-sign localization by 13% while automating its mapping process.

To support developers, OpenAI will offer one million free training tokens daily in October. Starting in November, fine-tuning GPT-4o with images will cost $25 per million tokens.
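A vision fine-tuning dataset follows OpenAI's chat-style JSONL format, with image references included as content parts in the user message. The sketch below assumes that schema (check the official fine-tuning guide for the exact fields); the URLs and labels are invented for illustration:

```python
import json

# Sketch of a vision fine-tuning training file: one JSON object per line,
# each a chat conversation whose user turn mixes text and image_url content
# parts. The example URL and sign label are hypothetical.

examples = [
    {
        "messages": [
            {"role": "system", "content": "Identify the traffic sign in the image."},
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What sign is shown?"},
                    {"type": "image_url",
                     "image_url": {"url": "https://example.com/sign_001.jpg"}},
                ],
            },
            {"role": "assistant", "content": "Speed limit: 50 km/h"},
        ]
    },
]

# Write one JSON object per line, as the fine-tuning endpoint expects.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

As the Grab example suggests, even a file with around 100 such lines can be enough to measurably improve a task-specific visual skill.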

Prompt Caching

Prompt Caching reduces API costs by allowing developers to reuse frequent prompts at a discounted rate. Long prefixes, often used to guide model behavior and improve consistency, typically increase API call costs.

OpenAI's API now automatically caches lengthy prefixes for up to an hour and applies a 50% discount when they are reused. The feature covers the latest GPT-4o, GPT-4o mini, o1-preview, and o1-mini models, along with their fine-tuned versions.

Model Distillation

Model Distillation enhances smaller models, like GPT-4o mini, by using outputs from larger models. Previously, the process was error-prone, requiring developers to manage multiple tasks for dataset generation and performance measurement. The new Model Distillation suite in the API platform streamlines this by enabling developers to create datasets with advanced models, fine-tune smaller models, and assess their performance on specific tasks.
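The core of that workflow is collecting a large model's outputs and reshaping them into a fine-tuning dataset for the smaller model. The sketch below is a generic version of that loop, not OpenAI's suite: `ask_teacher` is a stand-in for a real GPT-4o call, and the file format mirrors the chat-style JSONL used for fine-tuning:

```python
import json

# Minimal sketch of the distillation loop the new suite streamlines:
# (1) collect a "teacher" model's answers to your prompts,
# (2) write them as chat-format training examples for the "student".
# `ask_teacher` is a placeholder, not a real API call.

def ask_teacher(prompt: str) -> str:
    """Placeholder for a call to the larger model (e.g. GPT-4o)."""
    return f"Teacher answer for: {prompt}"

def build_distillation_dataset(prompts: list[str], path: str) -> int:
    """Write teacher completions as fine-tuning examples; return the count."""
    with open(path, "w") as f:
        for p in prompts:
            example = {"messages": [
                {"role": "user", "content": p},
                {"role": "assistant", "content": ask_teacher(p)},
            ]}
            f.write(json.dumps(example) + "\n")
    return len(prompts)

n = build_distillation_dataset(["What is prompt caching?"], "distill.jsonl")
print(n)  # 1
```

The resulting file can then be submitted as a fine-tuning dataset for a smaller model such as GPT-4o mini, with the suite's evaluation tools used to check whether the student matches the teacher on the target task.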

To assist developers with distillation, OpenAI is offering two million free training tokens daily for GPT-4o mini and one million for GPT-4o until October 31. Beyond this limit, training and operating a distilled model will be priced at OpenAI’s standard fine-tuning rates.