
Prompt Cache

Overview

Apertis includes a powerful Prompt Cache feature that significantly improves performance and lowers API costs for repeated requests. When the same prompt is requested again, the system returns the result directly from the cache without calling the upstream AI model.
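
To see this in action, send the same request twice and compare latencies. The sketch below is illustrative only: it assumes an OpenAI-compatible chat endpoint, and the URL, API key, and model name are placeholders rather than documented Apertis values.

```python
import time
import requests

# Placeholder endpoint, key, and model -- substitute your own Apertis values.
API_URL = "https://api.apertis.example/v1/chat/completions"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}
BODY = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Summarize the TCP handshake."}],
    "temperature": 0,
}

for label in ("first request (cache miss)", "second request (cache hit)"):
    start = time.perf_counter()
    resp = requests.post(API_URL, headers=HEADERS, json=BODY, timeout=60)
    print(f"{label}: HTTP {resp.status_code} in {time.perf_counter() - start:.3f}s")
# The second, byte-identical request should come back in milliseconds and
# consume no credits, because it is served from the Prompt Cache.
```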

Free Feature

Prompt Cache is completely free for all Apertis users. Cache hits do not consume any credits or quota from your account.

Key Features

Core Benefits

  • Performance Boost: Cache hits reduce response time from seconds to milliseconds
  • Zero Cost: Cache hits are completely free — no credits or quota consumed
  • Exact Matching: Responses are served from cache only when the request matches a cached one exactly
  • Automatic Management: Built-in TTL (Time To Live) mechanism automatically cleans expired cache

Technical Architecture

  • Cache Strategy: LRU (Least Recently Used) eviction policy
  • Concurrency Safe: Cache operations remain correct under high concurrency
  • Fault Tolerance: Automatically falls back to the normal request flow when the cache fails (see the sketch below)
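
The fault-tolerance bullet can be pictured as a wrapper around the lookup: any cache error degrades to a normal upstream call instead of failing the request. This is a minimal sketch under that assumption; `cache` and `call_upstream` are hypothetical stand-ins, not Apertis internals.

```python
import logging

logger = logging.getLogger("prompt_cache")

def get_completion(key, request, cache, call_upstream):
    """Serve from cache when possible; any cache failure falls back to upstream."""
    try:
        cached = cache.get(key)  # may raise if the cache backend is down
        if cached is not None:
            return cached        # cache hit: free, millisecond response
    except Exception:
        logger.warning("cache lookup failed; using normal request flow", exc_info=True)

    result = call_upstream(request)  # normal request flow
    try:
        cache.set(key, result)       # best-effort write-back
    except Exception:
        logger.warning("cache write failed; response still returned", exc_info=True)
    return result
```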

Cache Strategy

Cache Key Generation

The system generates a unique cache key from the following inputs (a hashing sketch follows the list):

  • Model name
  • Prompt content
  • System message
  • Temperature parameter
  • Other relevant parameters
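
A plausible way to combine these inputs is to hash a canonical serialization of the request, so identical requests map to the same key and any change to the model, prompt, or parameters produces a different one. The field names below are illustrative; the exact set Apertis hashes is not specified here.

```python
import hashlib
import json

def make_cache_key(model, prompt, system="", temperature=1.0, **params):
    """Hash a canonical JSON form of the request fields into a cache key."""
    payload = {
        "model": model,
        "prompt": prompt,
        "system": system,
        "temperature": temperature,
        **params,  # other relevant parameters (max_tokens, top_p, ...)
    }
    # sort_keys + fixed separators give a stable byte string for hashing
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Identical requests share a key; changing any field changes the key.
assert make_cache_key("gpt-4o", "Hello", temperature=0) != \
       make_cache_key("gpt-4o", "Hello", temperature=0.5)
```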

Cache Hit Conditions

A cache hit requires all of the following (see the lookup sketch after the list):

  1. Exactly the same prompt content
  2. The same model and parameter settings
  3. A cache entry that has not expired
  4. An entry not yet evicted by the cache size limit
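
Conditions 1 and 2 are enforced by the cache key itself; conditions 3 and 4 are enforced at lookup and insert time. A minimal in-process sketch of that logic (Apertis actually backs the cache with Redis, per the invalidation section below):

```python
import time
from collections import OrderedDict
from threading import Lock

class TTLLRUCache:
    """Sketch of the hit conditions: exact key match, unexpired entry,
    and a bounded size enforced by LRU eviction."""

    def __init__(self, max_entries=1024, ttl_seconds=300.0):
        self._store = OrderedDict()  # key -> (expires_at, value)
        self._max = max_entries
        self._ttl = ttl_seconds
        self._lock = Lock()          # keeps operations concurrency-safe

    def get(self, key):
        with self._lock:
            item = self._store.get(key)
            if item is None:
                return None                    # conditions 1-2: no exact match
            expires_at, value = item
            if time.monotonic() > expires_at:  # condition 3: entry expired
                del self._store[key]
                return None
            self._store.move_to_end(key)       # refresh LRU recency
            return value

    def set(self, key, value):
        with self._lock:
            self._store[key] = (time.monotonic() + self._ttl, value)
            self._store.move_to_end(key)
            while len(self._store) > self._max:  # condition 4: size limit
                self._store.popitem(last=False)  # evict least recently used
```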

Cache Invalidation

A cache entry is invalidated when any of the following occurs (a Redis sketch follows the list):

  • Its TTL expires
  • Redis runs out of storage space and evicts it
  • The cache is cleared manually
  • The system restarts (if the cache is not persisted)
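
With Redis as the backing store, these cases map onto standard commands. A brief sketch using the redis-py client (the key name and TTL value are illustrative):

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
key = "prompt_cache:3f2a9c..."  # e.g. a hashed cache key

# TTL expiry: SETEX stores the value with a time-to-live (here, 5 minutes);
# Redis deletes the entry automatically once the TTL elapses.
r.setex(key, 300, b'{"response": "..."}')
print(r.ttl(key))  # seconds remaining before automatic invalidation

# Manual clearing: delete a single entry, or flush the whole cache DB.
r.delete(key)
# r.flushdb()      # removes every key in the current Redis database

# Insufficient storage: with maxmemory-policy set to allkeys-lru, Redis
# itself evicts the least recently used keys under memory pressure.
```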