Core Engine

Erasure Coding Engine

Built on Reed-Solomon erasure coding, KDSS splits data into configurable data and parity shards. With 4 preset configurations ranging from EC 5+2 to EC 29+3, you can achieve up to 90.6% space efficiency while tolerating up to three simultaneous disk failures per stripe -- far surpassing the 33.3% efficiency of traditional 3-way replication.

  • Reed-Solomon algorithm with hardware-accelerated Galois field operations
  • 4 preset configurations: EC 5+2, 9+2, 18+3, 29+3
  • Up to 90.6% space efficiency vs 33.3% for 3-replica
  • Tolerates up to 3 simultaneous disk failures per stripe
  • Automatic shard distribution across nodes for fault isolation
master.toml
# Erasure Coding Configuration
[ec]
data_shards   = 5      # Number of data shards
parity_shards = 2      # Number of parity shards

# Space Efficiency Comparison:
# EC 5+2   = 71.4%  (tolerates 2 failures)
# EC 9+2   = 81.8%  (tolerates 2 failures)
# EC 18+3  = 85.7%  (tolerates 3 failures)
# EC 29+3  = 90.6%  (tolerates 3 failures)
# 3-replica = 33.3%  (tolerates 2 failures)
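The efficiency figures above fall out of the shard counts: an EC k+m stripe stores k data shards plus m parity shards, so the usable fraction is k/(k+m). A quick illustrative calculation (not KDSS code):

```python
# Space efficiency of an EC k+m stripe: k usable shards out of k+m written.
def ec_efficiency(data_shards: int, parity_shards: int) -> float:
    return data_shards / (data_shards + parity_shards)

for k, m in [(5, 2), (9, 2), (18, 3), (29, 3)]:
    print(f"EC {k}+{m}: {ec_efficiency(k, m):.1%}, tolerates {m} failures")

# 3-way replication is the degenerate k=1 case: 1 usable copy out of 3 written.
print(f"3-replica: {ec_efficiency(1, 2):.1%}")
```

Seen this way, replication is simply EC 1+2, which is why its efficiency is capped at 33.3% for the same two-failure tolerance.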
Storage Engine

Raw Disk Engine

KDSS bypasses the filesystem entirely, performing direct writes on raw block devices. An append-only write pattern maximizes sequential throughput, while a dual SuperBlock design and five-level recovery mechanism ensure data integrity even after unexpected power loss or hardware failure.

  • Direct raw disk writes bypass filesystem overhead
  • Append-only writes eliminate random I/O and fragmentation, which can extend HDD lifespan and lower failure rates
  • Dual SuperBlock with CRC32 checksum for metadata protection
  • Five-level crash recovery with clean shutdown optimization
  • Space reclamation via compaction without service interruption
Disk Layout
[SuperBlock 4KB: primary + backup]
[Record 1: Header(64B) + Shard Data + Padding]
[Record 2: Header(64B) + Shard Data + Padding]
[...]

Recovery Levels:
  L0: Clean shutdown marker (skip recovery)
  L1: SuperBlock validation & failover
  L2: BadgerDB index recovery (recoverIndex)
  L3: Single-shard EC rebuild
  L4: Full stripe EC reconstruction
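To illustrate how the dual SuperBlock survives a torn write, here is a minimal sketch of L1-style validation -- the 4 KB slot layout and CRC trailer placement are assumptions for illustration, not KDSS's actual on-disk format:

```python
import struct
import zlib

BLOCK_SIZE = 4096  # one 4 KB SuperBlock slot (assumed layout)

def pack_superblock(payload: bytes) -> bytes:
    """Pad the payload to the slot size and append a CRC32 trailer."""
    body = payload.ljust(BLOCK_SIZE - 4, b"\x00")
    return body + struct.pack("<I", zlib.crc32(body))

def load_superblock(primary: bytes, backup: bytes) -> bytes:
    """Validate the primary SuperBlock; fail over to the backup on CRC mismatch."""
    for slot in (primary, backup):
        body, (crc,) = slot[:-4], struct.unpack("<I", slot[-4:])
        if zlib.crc32(body) == crc:
            return body
    raise IOError("both SuperBlocks corrupt")
```

A torn write that corrupts the primary leaves its CRC stale, so the loader falls back to the intact backup copy.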
Access Layer

S3 Compatible API

KDSS provides a fully S3-compatible HTTP API, allowing seamless integration with existing tools, SDKs, and applications. It supports bucket operations, multipart uploads for large files, presigned URLs for secure temporary access, and streaming transfers for efficient data movement.

  • Full bucket CRUD: CreateBucket, ListBuckets, DeleteBucket, HeadBucket
  • Object operations: PutObject, GetObject, DeleteObject, HeadObject, CopyObject
  • Multipart upload: InitiateMultipartUpload, UploadPart, CompleteMultipartUpload
  • Presigned URLs for time-limited authenticated access
  • Compatible with AWS CLI, boto3, MinIO Client, and standard S3 SDKs
bash
# Upload a file via S3 API
curl -X PUT "http://s3.example.com/my-bucket/photo.jpg" \
  -H "Authorization: AWS4-HMAC-SHA256 ..." \
  -T ./photo.jpg

# Multipart upload (large files)
curl -X POST \
  "http://s3.example.com/my-bucket/bigfile?uploads"

# Generate presigned download URL
kdss-cli presign get \
  --bucket my-bucket \
  --key photo.jpg \
  --expires 3600

# List objects with prefix
curl "http://s3.example.com/my-bucket?prefix=photos/&max-keys=100"
POSIX Interface

FUSE Filesystem Mount

Mount your KDSS cluster as a local directory via FUSE. Applications can use standard POSIX file operations -- read, write, stat, readdir -- while KDSS transparently handles erasure coding, shard distribution, and fault recovery behind the scenes. A configurable chunk size, LRU read caching, and sequential prefetch optimize throughput for different workload patterns.

  • Standard POSIX interface: open, read, write, stat, readdir, rename
  • Transparent EC encoding on write, decoding on read
  • Configurable chunk size (default 256 MB) with LRU read caching and sequential prefetch
  • Metadata caching for reduced master node round-trips
  • Compatible with standard Linux tools: cp, rsync, tar, etc.
bash
# Mount KDSS as a local filesystem
ksfs -c /etc/ksfs/mount.toml

# Configuration file (mount.toml)
[cluster]
masters = ["master-1:6700", "master-2:6700", "master-3:6700"]

[auth]
access_key = "your-access-key"
secret_key = "your-secret-key"
bucket = "my-bucket"

[mount]
mountpoint = "/mnt/kdss"

# Now use it like any local directory
ls /mnt/kdss/
cp /var/log/app.log /mnt/kdss/logs/
df -h /mnt/kdss
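Behind the mount, a POSIX read spans one or more fixed-size chunks. A minimal sketch of the offset math, assuming the default 256 MB chunk size:

```python
CHUNK_SIZE = 256 * 1024 * 1024  # default chunk size (256 MB)

def chunks_for_read(offset: int, length: int, chunk_size: int = CHUNK_SIZE):
    """Yield (chunk_index, offset_in_chunk, bytes_from_chunk) for a read."""
    end = offset + length
    while offset < end:
        idx, off = divmod(offset, chunk_size)
        n = min(chunk_size - off, end - offset)
        yield idx, off, n
        offset += n
```

A read that straddles a chunk boundary touches two EC stripes; sequential prefetch can fetch chunk idx+1 while idx is still being consumed.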
Reliability

Auto Repair & Recovery

KDSS continuously monitors disk health via S.M.A.R.T. metrics and heartbeat signals. When a fault is detected -- whether a slow disk, read error, or complete disk failure -- the system automatically isolates the affected disk, reconstructs missing shards using EC parity data, and redistributes them to healthy nodes, all without manual intervention.

  • Three-layer disk health monitoring: S.M.A.R.T. metrics for early failure prediction, dmesg error scanning, and an I/O error sliding window
  • EC parity-based shard reconstruction without full data copies
  • Distributed repair execution across all Master nodes for parallel recovery
  • Configurable repair concurrency and bandwidth throttling
  • Repair progress tracking via Web Console and Prometheus metrics
Auto Repair Flow
  Disk Health Monitor
        |
        v
  +------------------+
  | S.M.A.R.T. Check |---> Normal ---> Continue
  | Heartbeat Check  |                 Monitoring
  +------------------+
        |
      Fault Detected
        |
        v
  +------------------+
  | Isolate Disk     |  Mark disk as "offline"
  | Stop I/O         |  Redirect traffic
  +------------------+
        |
        v
  +------------------+
  | Identify Missing |  Scan stripe metadata
  | Shards           |  for affected data
  +------------------+
        |
        v
  +------------------+
  | EC Reconstruct   |  Rebuild from parity
  | Missing Shards   |  shards (k of n)
  +------------------+
        |
        v
  +------------------+
  | Place on Healthy |  Rebalance across
  | Nodes            |  available disks
  +------------------+
        |
        v
  Repair Complete ---> Resume Normal Operation
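The I/O error sliding window used for fault detection can be sketched as follows; the window length and threshold here are illustrative defaults, not KDSS's actual values:

```python
from collections import deque

class ErrorWindow:
    """Flag a disk once too many I/O errors land inside a sliding time window."""

    def __init__(self, window_sec: float = 60.0, threshold: int = 5):
        self.window_sec = window_sec
        self.threshold = threshold
        self.errors: deque[float] = deque()  # timestamps of recent errors

    def record(self, now: float) -> bool:
        """Record one I/O error; return True if the disk should be isolated."""
        self.errors.append(now)
        while self.errors and now - self.errors[0] > self.window_sec:
            self.errors.popleft()  # drop errors that aged out of the window
        return len(self.errors) >= self.threshold
```

Unlike a simple counter, the window forgets old errors, so a disk with a brief transient burst long ago is not condemned by its history.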
Operations

Data Balancing

As your cluster grows and workloads shift, disk utilization can become uneven. KDSS includes a utilization-aware data balancing engine that non-disruptively migrates shards from over-utilized to under-utilized disks, keeping your cluster healthy and storage evenly distributed, and reports progress in real time.

  • Utilization-aware scheduling: triggers when imbalance exceeds configurable threshold
  • Non-disruptive migration: live data movement without service interruption
  • Bandwidth throttling to limit impact on production workloads
  • Progress monitoring via CLI and Web Console dashboard
  • Automatic rebalancing when new disks or nodes are added to the cluster
bash
# Check cluster disk utilization
kdss-cli balance status

Disk Utilization:
  node-01/sda   ████████████░░░  82%
  node-01/sdb   █████████░░░░░░  61%
  node-02/sda   ██████████████░  93%
  node-02/sdb   ███████░░░░░░░░  47%
  node-03/sda   ██████████░░░░░  68%

Imbalance: 46%  (threshold: 20%)

# Start balancing
kdss-cli balance start --max-bandwidth 100MB/s

Balancing in progress...
  Migrated: 128 shards (12.4 GB)
  Remaining: 64 shards (~6.2 GB)
  ETA: 3m 22s
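The imbalance figure in the example is simply the spread between the most- and least-utilized disks. A sketch of that check (the trigger rule is an assumption based on the threshold shown):

```python
def imbalance_pct(utilizations: list[float]) -> float:
    """Spread between the most- and least-utilized disks, in percentage points."""
    return max(utilizations) - min(utilizations)

# Utilization figures from the CLI output above
disks = [82, 61, 93, 47, 68]
spread = imbalance_pct(disks)
print(f"Imbalance: {spread}%  (threshold: 20%)")
if spread > 20:
    print("-> imbalance exceeds threshold, balancing would trigger")
```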
Management

Web Management Console

A full-featured web-based management console gives you complete visibility and control over your KDSS cluster. Monitor cluster health at a glance, manage nodes and disks, configure EC policies, and control user access -- all through an intuitive dashboard with role-based access control (RBAC).

  • Real-time dashboard: cluster health, capacity, throughput, and IOPS
  • Node and disk management: add, remove, decommission with guided workflows
  • Bucket and object browser with search, upload, and download
  • RBAC with predefined roles: Admin, Storage Admin, Read-Only
  • Alert configuration and notification history
Dashboard Overview
+-----------------------------------------------+
|  KDSS Console            [admin] [Logout]     |
+-----------------------------------------------+
|         |                                     |
| Dash    |  Cluster Health: Healthy            |
| Nodes   |                                     |
| Disks   |  Nodes: 5/5 online                  |
| Buckets |  Disks: 20/20 active                |
| Users   |  Capacity: 42.8 TB / 60 TB (71%)    |
| Alerts  |                                     |
| Config  |  Throughput   IOPS                  |
|         |  ~~~^~~~      ~~^~~~                |
|         |  348 MB/s     12.4K                 |
|         |                                     |
|         |  Recent Alerts (3)                  |
|         |  ! Disk node-02/sdc S.M.A.R.T. warn |
|         |  ! Repair job #47 completed         |
|         |  i Balance job started              |
+-----------------------------------------------+
Observability

Monitoring & Alerting

KDSS exposes comprehensive Prometheus metrics out of the box, covering cluster health, disk I/O, EC operations, and API latencies. With 32 built-in alert rules, pre-configured Grafana dashboards, and Lark (Feishu) webhook integration, you get full observability without building monitoring infrastructure from scratch.

  • Prometheus-native /metrics endpoint on every component
  • 32 built-in alert rules covering disk, node, capacity, and performance
  • Pre-built Grafana dashboards for cluster, node, and disk-level views
  • Lark (Feishu) webhook for real-time alert notifications
  • Configurable alert thresholds and notification routing
prometheus.yml
# Prometheus scrape config for KDSS
scrape_configs:
  - job_name: 'kdss-master'
    static_configs:
      - targets: ['master-01:6701']
    metrics_path: /metrics

  - job_name: 'kdss-storage'
    static_configs:
      - targets:
          - 'storage-01:6801'
          - 'storage-02:6801'
          - 'storage-03:6801'

# Sample alert rules (32 built-in)
# - DiskSpaceCritical (>90%)
# - DiskSmartWarning
# - NodeHeartbeatLost (>30s)
# - ECRepairQueueHigh (>100)
# - APILatencyP99High (>500ms)
# - ReplicationLagHigh
# ...
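Every component's /metrics endpoint returns the standard Prometheus text exposition format. A minimal sketch of what that output looks like -- the metric names here are hypothetical, not KDSS's actual metric set:

```python
def render_metrics(metrics):
    """Render (name, labels, value) triples in Prometheus text exposition format."""
    lines = []
    for name, labels, value in metrics:
        if labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in labels.items())
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

print(render_metrics([
    ("kdss_disk_used_bytes", {"node": "storage-01", "disk": "sda"}, 1.2e12),
    ("kdss_repair_queue_length", {}, 3),
]))
```

Because the format is plain text, any Prometheus-compatible scraper can consume it with no KDSS-specific integration.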
Lifecycle

GC & Reclamation

KDSS implements a safe two-phase delete process with a configurable recycle bin retention period. Automatic garbage collection runs in the background to reclaim space from deleted objects and stale temporary data, while ensuring no data is permanently removed before the retention window expires.

  • Two-phase delete: soft delete to recycle bin, then gc_pending with configurable auto-purge
  • Automatic housekeeping with configurable scan interval and auto-gc timeout
  • Stale multipart upload cleanup after configurable timeout
  • Background compaction with soft and force thresholds
  • GC progress and space reclaimed metrics exported to Prometheus
master.toml
# Garbage Collection Configuration
[gc]
interval_sec = 3600              # Housekeeping scan interval (seconds)
auto_gc_pending_hours = 48       # Auto-delete gc_pending after N hours (0=disable)
stale_writing_minutes = 120      # Clean up stale writing stripes after N minutes

# Capacity Alerts
[capacity]
alert_pct  = 95                  # Alert when cluster usage >= 95%
reject_pct = 99                  # Reject writes when >= 99%

# Background Compaction (storage.toml)
[compactor]
enabled            = true
threshold          = 0.2         # Soft threshold: compact when idle (20%)
force_threshold    = 0.5         # Force compact regardless of load (50%)
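The two-phase lifecycle described above can be sketched as a small state machine; the state names below are illustrative, not KDSS's internal identifiers:

```python
AUTO_GC_PENDING_HOURS = 48  # mirrors auto_gc_pending_hours in [gc]

def next_state(state: str, hours_in_state: float,
               auto_gc_hours: float = AUTO_GC_PENDING_HOURS) -> str:
    """Advance one object through the two-phase delete lifecycle."""
    if state == "live":
        return "recycle_bin"   # phase 1: soft delete, still recoverable
    if state == "recycle_bin":
        return "gc_pending"    # phase 2: bin emptied or retention expired
    if state == "gc_pending" and hours_in_state >= auto_gc_hours:
        return "purged"        # auto-purge: space reclaimed by compaction
    return state               # still inside the auto-gc window
```

Nothing reaches "purged" before the configured window elapses, which is what makes an accidental delete recoverable.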