GEAR Hash Rolling Window

Step through the GEAR rolling hash byte by byte: hash = (hash << 1) + GEAR[byte].
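The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual implementation: the table is filled from a seeded pseudo-random generator, and the names `make_gear_table`, `gear_update`, and `gear_hash` are hypothetical. The hash is assumed to be truncated to 32 bits, matching the 32-bit table values.

```python
import random

MASK32 = 0xFFFFFFFF  # assumption: the hash is kept to 32 bits


def make_gear_table(seed: int = 0) -> list[int]:
    """Build 256 pseudo-random 32-bit values, one per possible byte."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(256)]


def gear_update(h: int, byte: int, gear: list[int]) -> int:
    """One rolling step: shift the old hash left and add the byte's table value."""
    return ((h << 1) + gear[byte]) & MASK32


def gear_hash(data: bytes, gear: list[int]) -> int:
    """Feed every byte of data through gear_update, starting from zero."""
    h = 0
    for b in data:
        h = gear_update(h, b, gear)
    return h
```

Because each step shifts the previous hash left by one bit, a byte's contribution leaves a 32-bit hash after 32 further bytes. The hash therefore depends only on the most recent 32 bytes, which is what makes it a rolling window without an explicit "remove oldest byte" step.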

GEAR Hash in Action
The quick brown fox jumps over the lazy dog. She packed her seven boxes and left. A warm breeze drifted through the open window.
GEAR Lookup Table

Each colored block is one of 256 pre-computed random 32-bit values, keyed by byte. Hover a cell to see its mapping.

Rows 0-1: control bytes · Rows 2-7: printable ASCII (0x7F is DEL) · Rows 8-F: extended bytes
Rolling Hash Window

The hash rolls forward one byte at a time. When its masked bits match a fixed pattern, a chunk boundary is placed. Target chunk sizes: min 8, avg 16, max 32 bytes.
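A minimal sketch of boundary placement with the min 8 / avg 16 / max 32 targets follows. Several details are assumptions made for illustration: the table comes from a seeded generator, the "bit pattern" is taken to be "the top log2(avg) bits of the hash are all zero", and the function name `chunk` is hypothetical. Real FastCDC implementations choose their mask bits differently.

```python
import random

MASK32 = 0xFFFFFFFF
_rng = random.Random(42)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # illustrative table


def chunk(data: bytes, min_size: int = 8, avg_size: int = 16,
          max_size: int = 32) -> list[bytes]:
    """Split data at content-defined boundaries (sketch, avg_size a power of 2)."""
    bits = avg_size.bit_length() - 1          # log2(avg_size)
    mask = ((1 << bits) - 1) << (32 - bits)   # assumption: test the top bits
    chunks = []
    start = 0
    h = 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & MASK32
        length = i - start + 1
        if length < min_size:
            continue                          # never cut below the minimum
        if (h & mask) == 0 or length >= max_size:
            chunks.append(data[start:i + 1])  # cut after the current byte
            start = i + 1
            h = 0                             # restart the hash for the next chunk
    if start < len(data):
        chunks.append(data[start:])           # trailing partial chunk
    return chunks
```

With a mask of `bits` bits, a random hash matches with probability 1 / 2^bits per byte, which is where the average size comes from; the minimum and maximum are enforced separately.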


Parametric Chunking Explorer

Drag the slider to adjust the target average chunk size and see how FastCDC re-chunks the same text.

See how target average size affects chunk boundaries and size distribution.

Target Average: 88 bytes (min: 44, max: 264)
Chunk Summary
Each bar is one chunk. Height and width show relative size (dashed line = target).

Chunk Boundary Detection

See how fixed-size chunking and content-defined chunking each handle file modifications.

Fixed-Size Chunking (48 bytes)
Content-Defined Chunking (sentence boundaries)
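The contrast between the two panels can be reproduced with a short sketch. It uses 48-character fixed chunks and a regular expression that cuts after sentence-ending punctuation; the helper names (`fixed_chunks`, `sentence_chunks`, `shared`) are hypothetical, and sentence splitting stands in for a real rolling-hash chunker.

```python
import re


def fixed_chunks(text: str, size: int = 48) -> list[str]:
    """Cut every `size` characters, regardless of content."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def sentence_chunks(text: str) -> list[str]:
    """Cut after '.', '!' or '?' (plus trailing whitespace)."""
    return re.findall(r"[^.!?]*[.!?]\s*|[^.!?]+$", text)


def shared(old: list[str], new: list[str]) -> int:
    """Count distinct chunks that appear in both versions."""
    return len(set(old) & set(new))


original = ("The quick brown fox jumps over the lazy dog. "
            "She packed her seven boxes and left. "
            "A warm breeze drifted through the open window.")
edited = original.replace("quick", "very quick")  # insert 5 characters near the start

fixed_shared = shared(fixed_chunks(original), fixed_chunks(edited))
cdc_shared = shared(sentence_chunks(original), sentence_chunks(edited))
print(fixed_shared, cdc_shared)
```

The insertion shifts every later fixed-size boundary, so no fixed chunk survives the edit, while the two untouched sentences are still recognized as identical chunks.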

Deduplication Explorer

Edit text and save versions to see which chunks are new and which are shared.

Click "Save Version" after editing to see which chunks are new and which are shared. Hover over chunks to highlight them across views.
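A minimal sketch of what "Save Version" could do behind the scenes: chunk the text, hash each chunk, and store only chunks whose digest is not already known. The `ChunkStore` class, its methods, and the sentence-based chunker are hypothetical stand-ins, not the page's own code; SHA-256 is assumed as the chunk identifier.

```python
import hashlib
import re


def sentence_chunks(text: str) -> list[str]:
    """Stand-in chunker: cut after sentence-ending punctuation."""
    return re.findall(r"[^.!?]*[.!?]\s*|[^.!?]+$", text)


class ChunkStore:
    """Content-addressed store: each distinct chunk is kept once."""

    def __init__(self) -> None:
        self.chunks: dict[str, str] = {}  # digest -> chunk text

    def save_version(self, text: str) -> tuple[list[str], int, int]:
        """Return (recipe of digests, number of new chunks, number of shared chunks)."""
        recipe, new, shared = [], 0, 0
        for c in sentence_chunks(text):
            digest = hashlib.sha256(c.encode("utf-8")).hexdigest()
            if digest in self.chunks:
                shared += 1
            else:
                self.chunks[digest] = c
                new += 1
            recipe.append(digest)
        return recipe, new, shared

    def restore(self, recipe: list[str]) -> str:
        """Rebuild a version from its list of digests."""
        return "".join(self.chunks[d] for d in recipe)
```

For example, saving "A one. B two. C three." and then "A one. B changed. C three." stores four chunks in total: three for the first version and one new chunk for the second.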

Basic vs Normalized Chunk Size Distribution

Compare how single-mask and dual-mask strategies distribute chunk sizes across the same data.

Target Average: 88 bytes (min: 44, max: 264)
Basic CDC (Single Mask)
Normalized CDC (Dual Mask)
In both panels, each bar is one chunk; height and width show relative size (dashed line = target). In each density curve, higher peaks mean more chunks of that size, and the dashed line marks the target average.
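The single-mask and dual-mask strategies can be compared with a sketch like the one below. It assumes the usual normalized-chunking idea: a stricter mask (one extra bit) before the target average is reached and a looser mask (one fewer bit) after it. The mask layout, the seeded table, and the name `chunk_sizes` are illustrative assumptions, not the page's implementation.

```python
import math
import random

MASK32 = 0xFFFFFFFF
_rng = random.Random(7)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # illustrative table


def top_bits_mask(bits: int) -> int:
    """Mask selecting the top `bits` bits of a 32-bit hash."""
    return ((1 << bits) - 1) << (32 - bits)


def chunk_sizes(data: bytes, avg: int = 88, min_size: int = 44,
                max_size: int = 264, normalized: bool = False) -> list[int]:
    """Return chunk lengths under single-mask or dual-mask boundary rules."""
    bits = round(math.log2(avg))
    if normalized:
        hard, easy = top_bits_mask(bits + 1), top_bits_mask(bits - 1)
    else:
        hard = easy = top_bits_mask(bits)
    sizes, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & MASK32
        length = i - start + 1
        if length < min_size:
            continue
        mask = hard if length < avg else easy  # stricter before the target
        if (h & mask) == 0 or length >= max_size:
            sizes.append(length)
            start, h = i + 1, 0
    if start < len(data):
        sizes.append(len(data) - start)        # trailing partial chunk
    return sizes
```

Calling `chunk_sizes(data)` and `chunk_sizes(data, normalized=True)` on the same input and plotting the two lists gives histograms comparable to the two panels.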

Cost Tradeoffs Explorer

See how average chunk size affects each cost dimension: CPU, memory, network, and storage.

Average Chunk Size: 32 KB
Relative Pressure →
These bars show the direction and shape of each tradeoff, not exact magnitudes. CPU and memory costs scale with chunk count (more chunks = more hashing, larger index). Network cost decreases with smaller chunks because the higher deduplication ratio means less unique data to transfer. Storage has a U-shape: very small chunks incur metadata overhead, while very large chunks reduce deduplication and store more redundant data.
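The "costs scale with chunk count" part of this tradeoff is simple arithmetic, sketched below. It covers only the CPU and memory side (how many chunks must be hashed and indexed), not network or storage. The 40-byte index entry is a hypothetical figure chosen for illustration, and the function names are assumptions.

```python
def chunk_count(total_bytes: int, avg_chunk_bytes: int) -> int:
    """Approximate number of chunks: total size divided by average size, rounded up."""
    return (total_bytes + avg_chunk_bytes - 1) // avg_chunk_bytes


def index_bytes(total_bytes: int, avg_chunk_bytes: int,
                entry_bytes: int = 40) -> int:
    """Index size if every chunk needs one fixed-size entry (entry size is hypothetical)."""
    return chunk_count(total_bytes, avg_chunk_bytes) * entry_bytes


GIB = 1024 ** 3
KIB = 1024

for avg_kib in (8, 32, 128):
    n = chunk_count(GIB, avg_kib * KIB)
    print(f"{avg_kib:>4} KiB avg: {n:>7} chunks, index ~{index_bytes(GIB, avg_kib * KIB)} bytes")
```

Halving the average chunk size doubles both the number of hashes to compute and the number of index entries, which is the direction the CPU and memory bars show.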

Established Object Storage Provider Cost Explorer

See how per-operation pricing on established object storage providers affects costs when every chunk is a separate object.

Average Chunk Size: 32 KB
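When every chunk is its own object, the number of write requests equals the number of chunks. The sketch below shows that arithmetic; the per-1,000-request price is a placeholder value, not a quoted rate from any provider, and `put_request_cost` is a hypothetical helper.

```python
def put_request_cost(total_bytes: int, avg_chunk_bytes: int,
                     price_per_1000_puts: float) -> float:
    """Cost of uploading each chunk as a separate object (one PUT per chunk)."""
    chunks = (total_bytes + avg_chunk_bytes - 1) // avg_chunk_bytes
    return chunks / 1000 * price_per_1000_puts


GIB = 1024 ** 3
KIB = 1024
PLACEHOLDER_PRICE = 0.005  # hypothetical price per 1,000 PUT requests

for avg_kib in (8, 32, 128):
    cost = put_request_cost(GIB, avg_kib * KIB, PLACEHOLDER_PRICE)
    print(f"{avg_kib:>4} KiB chunks: {cost:.5f} per GiB uploaded")
```

Request cost is inversely proportional to chunk size, so small chunks that help deduplication also multiply the per-operation bill.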

Established Object Storage Provider Cost Explorer with Containers

See how container packing reduces API operation costs by bundling chunks into larger objects.

Average Chunk Size: 32 KB
Container Size: 4 MB
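The effect of packing on operation counts can be estimated as follows. The sketch assumes every chunk is exactly the average size and that containers are filled completely, which real data will only approximate; `objects_written` is a hypothetical helper, and "KB"/"MB" are read as powers of 1024 here.

```python
def objects_written(total_bytes: int, avg_chunk_bytes: int,
                    container_bytes: int = 0) -> int:
    """Number of objects uploaded, with or without container packing."""
    chunks = (total_bytes + avg_chunk_bytes - 1) // avg_chunk_bytes
    if container_bytes <= 0:
        return chunks                                   # one object per chunk
    per_container = max(1, container_bytes // avg_chunk_bytes)
    return (chunks + per_container - 1) // per_container  # one object per container


GIB = 1024 ** 3
MIB = 1024 ** 2
KIB = 1024

unpacked = objects_written(GIB, 32 * KIB)
packed = objects_written(GIB, 32 * KIB, 4 * MIB)
print(unpacked, packed, unpacked // packed)
```

With 32 KB chunks and 4 MB containers, 128 chunks fit in each container, so write operations drop by roughly that factor.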

Challenger Object Storage Provider Cost Explorer

Explore costs on challenger object storage providers with radically different pricing models.

Average Chunk Size: 32 KB
Container Size: 4 MB

Established vs. Challenger Object Storage Provider Cost Comparison

Compare costs across all seven storage providers side by side.

Average Chunk Size: 8 KB
Container Size: 4 MB

Zipf Popularity Distribution

Visualize how skewness affects the popularity distribution of items under a Zipf model.

Skewness (α): 0.60
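Under a Zipf model, the item at popularity rank k receives a share of requests proportional to 1 / k^α. The sketch below computes that distribution over a finite catalog; `zipf_popularity` and the catalog size of 1,000 are illustrative assumptions.

```python
def zipf_popularity(n_items: int, alpha: float) -> list[float]:
    """Request probability for ranks 1..n_items, proportional to 1 / rank**alpha."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


for alpha in (0.0, 0.6, 1.2):
    p = zipf_popularity(1000, alpha)
    top10 = sum(p[:10])
    print(f"alpha={alpha:.1f}: top 10 of 1000 items get {top10:.1%} of requests")
```

At α = 0 every item is equally popular; as α grows, a small number of top-ranked items account for a growing share of requests.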

Cache Size vs. Hit Rate

Given a skewness level and a target hit rate, how much unique data do you need to cache?

Skewness (α): 0.60
Target Hit Rate: 50%
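One way to answer that question is to assume an ideal cache that always holds the most popular items, with all items the same size and requests drawn independently from the Zipf distribution. Under those assumptions (which a real cache only approximates), the hit rate is the cumulative popularity of the cached ranks. The helper name `items_needed` is hypothetical.

```python
def items_needed(n_items: int, alpha: float, target_hit_rate: float) -> int:
    """Smallest number of top-ranked items whose combined popularity reaches the target."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_items + 1)]
    total = sum(weights)
    cumulative = 0.0
    for k, w in enumerate(weights, start=1):
        cumulative += w
        if cumulative / total >= target_hit_rate:
            return k
    return n_items


for alpha in (0.0, 0.6, 1.2):
    k = items_needed(1000, alpha, 0.50)
    print(f"alpha={alpha:.1f}: cache {k} of 1000 items for a 50% hit rate")
```

The more skewed the popularity, the smaller the fraction of unique data that has to be cached to reach the same hit rate.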

Established Cache Provider Cost Explorer

See how established cache providers (ElastiCache, CloudFront) affect origin costs.

Cache Hit Rate: 50%
Provisioned Redis (ElastiCache, Memorystore, Azure Cache) charges for memory regardless of hit rate. CDN edges (CloudFront, Cloud CDN, Azure CDN) charge per-request and per-GB delivered. Both reduce origin GET and egress costs, but the break-even hit rate differs sharply between the two models.

Challenger Cache Provider Cost Explorer

Compare challenger cache providers whose costs scale linearly under per-request pricing.

Cache Hit Rate: 50%
Per-request pricing means you pay nothing when the cache is cold and costs scale linearly with usage. Compare the net impact at different hit rates: lower per-read prices (Momento, Workers KV) break even earlier than higher per-read prices (Upstash).
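The break-even point for a per-request cache can be sketched with a simplified model: every request pays the cache read price, and only misses pay the origin cost (GET plus egress). Cache writes and cached-data storage are ignored, and all prices below are hypothetical units rather than any provider's rates. The function names are assumptions.

```python
def net_saving(requests: int, hit_rate: float,
               cache_read_price: float, origin_cost: float) -> float:
    """Origin cost avoided by hits, minus what the cache reads cost."""
    without_cache = requests * origin_cost
    with_cache = requests * cache_read_price + requests * (1 - hit_rate) * origin_cost
    return without_cache - with_cache


def break_even_hit_rate(cache_read_price: float, origin_cost: float) -> float:
    """Hit rate at which the cache pays for itself under this model."""
    return cache_read_price / origin_cost


# Hypothetical per-request prices in arbitrary units.
for read_price in (0.5, 1.0, 2.0):
    rate = break_even_hit_rate(read_price, origin_cost=4.0)
    print(f"read price {read_price}: break-even at {rate:.0%} hit rate")
```

In this model the break-even hit rate is simply the ratio of the cache read price to the origin cost per request, so a cheaper read price breaks even at a lower hit rate.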

Comprehensive Cost Model

Combine storage provider, cache layer, chunk size, and container packing into a single cost view.

Set hit rate to 0% to see costs without caching, matching the Established vs. Challenger Object Storage Provider Cost Comparison above. The matrix at the bottom shows every storage + cache combination. Green highlights the cheapest pairing; terracotta highlights the most expensive.