
Prompt Cache

Overview

Apertis includes a powerful Prompt Cache feature that significantly improves performance and lowers API costs for repeated requests. When the same prompt is requested again, the system returns the result directly from the cache without calling the upstream AI model.
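
To see this in action, send the same request twice and compare latencies. The sketch below is illustrative only: it assumes an OpenAI-compatible chat endpoint, and the URL, API key, and model name are placeholders rather than documented Apertis values.

```python
import time
import requests

# Placeholder endpoint, key, and model -- substitute your own Apertis values.
API_URL = "https://api.apertis.example/v1/chat/completions"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}
BODY = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Summarize the TCP handshake."}],
    "temperature": 0,
}

for label in ("first request (cache miss)", "second request (cache hit)"):
    start = time.perf_counter()
    resp = requests.post(API_URL, headers=HEADERS, json=BODY, timeout=60)
    print(f"{label}: HTTP {resp.status_code} in {time.perf_counter() - start:.3f}s")
# The second, byte-identical request should come back in milliseconds and
# consume no credits, because it is served from the Prompt Cache.
```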

Free Feature

Prompt Cache is completely free for all Apertis users. Cache hits do not consume any credits or quota from your account.

Key Features

Core Benefits

  • Performance Boost: Cache hits reduce response time from seconds to milliseconds
  • Zero Cost: Cache hits are completely free — no credits or quota consumed
  • Exact Matching: Responses are served from cache only when the request matches a cached one exactly
  • Automatic Management: Built-in TTL (Time To Live) mechanism automatically cleans expired cache

Technical Architecture

  • Cache Strategy: LRU (Least Recently Used) eviction policy
  • Concurrency Safe: Cache operations remain correct under high concurrency
  • Fault Tolerance: Automatically falls back to the normal request flow when the cache fails (see the sketch below)
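
The fault-tolerance bullet can be pictured as a wrapper around the lookup: any cache error degrades to a normal upstream call instead of failing the request. This is a minimal sketch under that assumption; `cache` and `call_upstream` are hypothetical stand-ins, not Apertis internals.

```python
import logging

logger = logging.getLogger("prompt_cache")

def get_completion(key, request, cache, call_upstream):
    """Serve from cache when possible; any cache failure falls back to upstream."""
    try:
        cached = cache.get(key)  # may raise if the cache backend is down
        if cached is not None:
            return cached        # cache hit: free, millisecond response
    except Exception:
        logger.warning("cache lookup failed; using normal request flow", exc_info=True)

    result = call_upstream(request)  # normal request flow
    try:
        cache.set(key, result)       # best-effort write-back
    except Exception:
        logger.warning("cache write failed; response still returned", exc_info=True)
    return result
```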

Cache Strategy

Cache Key Generation

The system generates a unique cache key from the following inputs (a hashing sketch follows the list):

  • Model name
  • Prompt content
  • System message
  • Temperature parameter
  • Other relevant parameters
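
A plausible way to combine these inputs is to hash a canonical serialization of the request, so identical requests map to the same key and any change to the model, prompt, or parameters produces a different one. The field names below are illustrative; the exact set Apertis hashes is not specified here.

```python
import hashlib
import json

def make_cache_key(model, prompt, system="", temperature=1.0, **params):
    """Hash a canonical JSON form of the request fields into a cache key."""
    payload = {
        "model": model,
        "prompt": prompt,
        "system": system,
        "temperature": temperature,
        **params,  # other relevant parameters (max_tokens, top_p, ...)
    }
    # sort_keys + fixed separators give a stable byte string for hashing
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Identical requests share a key; changing any field changes the key.
assert make_cache_key("gpt-4o", "Hello", temperature=0) != \
       make_cache_key("gpt-4o", "Hello", temperature=0.5)
```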

Cache Hit Conditions

A cache hit requires all of the following (see the lookup sketch after the list):

  1. Exactly the same prompt content
  2. The same model and parameter settings
  3. A cache entry that has not expired
  4. An entry not yet evicted by the cache size limit
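
Conditions 1 and 2 are enforced by the cache key itself; conditions 3 and 4 are enforced at lookup and insert time. A minimal in-process sketch of that logic (Apertis actually backs the cache with Redis, per the invalidation section below):

```python
import time
from collections import OrderedDict
from threading import Lock

class TTLLRUCache:
    """Sketch of the hit conditions: exact key match, unexpired entry,
    and a bounded size enforced by LRU eviction."""

    def __init__(self, max_entries=1024, ttl_seconds=300.0):
        self._store = OrderedDict()  # key -> (expires_at, value)
        self._max = max_entries
        self._ttl = ttl_seconds
        self._lock = Lock()          # keeps operations concurrency-safe

    def get(self, key):
        with self._lock:
            item = self._store.get(key)
            if item is None:
                return None                    # conditions 1-2: no exact match
            expires_at, value = item
            if time.monotonic() > expires_at:  # condition 3: entry expired
                del self._store[key]
                return None
            self._store.move_to_end(key)       # refresh LRU recency
            return value

    def set(self, key, value):
        with self._lock:
            self._store[key] = (time.monotonic() + self._ttl, value)
            self._store.move_to_end(key)
            while len(self._store) > self._max:  # condition 4: size limit
                self._store.popitem(last=False)  # evict least recently used
```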

Cache Invalidation

A cache entry is invalidated when any of the following occurs (a Redis sketch follows the list):

  • Its TTL expires
  • Redis runs out of storage space and evicts it
  • The cache is cleared manually
  • The system restarts (if the cache is not persisted)
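
With Redis as the backing store, these cases map onto standard commands. A brief sketch using the redis-py client (the key name and TTL value are illustrative):

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
key = "prompt_cache:3f2a9c..."  # e.g. a hashed cache key

# TTL expiry: SETEX stores the value with a time-to-live (here, 5 minutes);
# Redis deletes the entry automatically once the TTL elapses.
r.setex(key, 300, b'{"response": "..."}')
print(r.ttl(key))  # seconds remaining before automatic invalidation

# Manual clearing: delete a single entry, or flush the whole cache DB.
r.delete(key)
# r.flushdb()      # removes every key in the current Redis database

# Insufficient storage: with maxmemory-policy set to allkeys-lru, Redis
# itself evicts the least recently used keys under memory pressure.
```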