Built from the ground up around erasure coding, with everything production-grade distributed storage requires.
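Built on Reed-Solomon erasure coding, KDSS splits data into a configurable number of data shards and parity shards. Four presets, from EC 5+2 to EC 29+3, reach a space efficiency of up to 90.6% while tolerating multiple simultaneous disk failures, far above the 33.3% efficiency of 3-way replication.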
# Erasure Coding Configuration
[ec]
data_shards = 5 # Number of data shards
parity_shards = 2 # Number of parity shards
# Space Efficiency Comparison:
# EC 5+2 = 71.4% (tolerates 2 failures)
# EC 9+2 = 81.8% (tolerates 2 failures)
# EC 18+3 = 85.7% (tolerates 3 failures)
# EC 29+3 = 90.6% (tolerates 3 failures)
# 3-replica = 33.3% (tolerates 2 failures)
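The percentages in the comparison above follow from the ratio of data shards to total shards, k / (k + m). A minimal sketch that reproduces them; the function and variable names are illustrative and not part of KDSS:

```python
# Presets from the comparison above: (data shards k, parity shards m).
PRESETS = {
    "EC 5+2": (5, 2),
    "EC 9+2": (9, 2),
    "EC 18+3": (18, 3),
    "EC 29+3": (29, 3),
}


def space_efficiency(data_shards: int, parity_shards: int) -> float:
    """Usable fraction of raw capacity for a k+m layout: k / (k + m)."""
    return data_shards / (data_shards + parity_shards)


def replica_efficiency(copies: int) -> float:
    """Usable fraction of raw capacity when every object is stored `copies` times."""
    return 1 / copies


for name, (k, m) in PRESETS.items():
    # A k+m stripe survives the loss of up to m shards.
    print(f"{name}: {space_efficiency(k, m):.1%} (tolerates {m} failures)")
print(f"3-replica: {replica_efficiency(3):.1%} (tolerates 2 failures)")
```

Larger stripes raise efficiency because the fixed parity cost is spread over more data shards; the trade-off is that each repair has to read from more disks.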
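KDSS bypasses the filesystem entirely and performs Direct I/O on raw block devices. An append-only write pattern maximizes sequential throughput, while a dual-SuperBlock design and a five-level recovery mechanism preserve data integrity even after unexpected power loss or hardware failure.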
+----------------------------------------------+
| SuperBlock A (primary) | SuperBlock B (backup) |
+----------------------------------------------+
| Data Records (append-only) |
| +--------+--------+--------+--------+----- |
| | Record | Record | Record | Record | ... |
| | Hdr+D | Hdr+D | Hdr+D | Hdr+D | |
| +--------+--------+--------+--------+----- |
+----------------------------------------------+
Recovery Levels:
L1: Clean shutdown detection (skip recovery)
L2: SuperBlock validation & failover
L3: Index recovery from data records
L4: Single-shard EC rebuild
L5: Full stripe EC reconstruction
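The SuperBlock validation and failover step (L2) can be pictured as: read both copies, discard any that fail a checksum, and keep the newest valid one. The sketch below assumes a made-up 17-byte layout (magic, sequence number, clean-shutdown flag, CRC32); the actual KDSS on-disk format is not described here, and every name in the code is hypothetical.

```python
import struct
import zlib
from typing import NamedTuple, Optional

# Hypothetical layout, NOT the KDSS on-disk format:
# magic (4 bytes) | sequence (u64) | clean flag (u8) | CRC32 of those 13 bytes (u32)
_BODY = struct.Struct("<4sQB")
_CRC = struct.Struct("<I")
MAGIC = b"SBLK"


class SuperBlock(NamedTuple):
    sequence: int
    clean_shutdown: bool


def encode(sb: SuperBlock) -> bytes:
    """Serialize a SuperBlock and append its checksum."""
    body = _BODY.pack(MAGIC, sb.sequence, int(sb.clean_shutdown))
    return body + _CRC.pack(zlib.crc32(body))


def decode(raw: bytes) -> Optional[SuperBlock]:
    """Return the parsed SuperBlock, or None if this copy is damaged."""
    if len(raw) != _BODY.size + _CRC.size:
        return None
    body = raw[:_BODY.size]
    (crc,) = _CRC.unpack(raw[_BODY.size:])
    magic, sequence, clean = _BODY.unpack(body)
    if magic != MAGIC or zlib.crc32(body) != crc:
        return None
    return SuperBlock(sequence, bool(clean))


def select_superblock(raw_a: bytes, raw_b: bytes) -> Optional[SuperBlock]:
    """Pick the valid copy with the highest sequence number, if any."""
    candidates = [sb for sb in (decode(raw_a), decode(raw_b)) if sb is not None]
    if not candidates:
        return None  # both copies damaged: deeper recovery levels would take over
    return max(candidates, key=lambda sb: sb.sequence)
```

In this sketch a set clean-shutdown flag on the selected copy would correspond to L1 (skip recovery), and a `None` result would mean falling through to index recovery from the data records.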
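KDSS exposes a fully S3-compatible HTTP API that works with existing tools, SDKs, and applications. It supports bucket operations, multipart upload for large files, presigned URLs for secure temporary access, and streaming transfers.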
# Upload a file via S3 API
curl -X PUT "http://kdss:9000/my-bucket/photo.jpg" \
-H "Authorization: AWS4-HMAC-SHA256 ..." \
-T ./photo.jpg
# Initiate a multipart upload (large files)
curl -X POST \
"http://kdss:9000/my-bucket/bigfile?uploads"
# Generate presigned download URL
kdss-cli presign get \
--bucket my-bucket \
--key photo.jpg \
--expires 3600
# List objects with prefix
curl "http://kdss:9000/my-bucket?prefix=photos/&max-keys=100"
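Mount a KDSS cluster as a local directory through FUSE. Applications use standard POSIX file operations (read, write, stat, readdir) while KDSS transparently handles erasure encoding and decoding, shard distribution, and failure recovery in the background. Read and write buffers are configurable to tune throughput for different workloads.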
# Mount KDSS as a local filesystem
ksfs -c /etc/ksfs/mount.toml
# Now use it like any local directory
ls /mnt/kdss/my-bucket/
cp /var/log/app.log /mnt/kdss/my-bucket/logs/
df -h /mnt/kdss
# Unmount when done
umount /mnt/kdss
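KDSS continuously monitors disk health through S.M.A.R.T. metrics and heartbeat signals. When a fault is detected, whether a slow disk, read errors, or a complete disk failure, the system automatically isolates the faulty disk, reconstructs the missing shards from the surviving data and parity shards, and redistributes them to healthy nodes, with no manual intervention.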
Disk Health Monitor
|
v
+------------------+
| S.M.A.R.T. Check |---> Normal ---> Continue
| Heartbeat Check | Monitoring
+------------------+
|
Fault Detected
|
v
+------------------+
| Isolate Disk | Mark disk as "offline"
| Stop I/O | Redirect traffic
+------------------+
|
v
+------------------+
| Identify Missing | Scan stripe metadata
| Shards | for affected data
+------------------+
|
v
+------------------+
| EC Reconstruct | Rebuild from any k surviving
| Missing Shards | shards (k of n)
+------------------+
|
v
+------------------+
| Place on Healthy | Rebalance across
| Nodes | available disks
+------------------+
|
v
Repair Complete ---> Resume Normal Operation
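The "Identify Missing Shards" and "EC Reconstruct" steps come down to one check per stripe: a k+m stripe can be rebuilt as long as at least k of its shards are still readable. A minimal sketch of that planning step, assuming a simple in-memory map from stripe ID to the disks holding its shards; the data structure and function names are hypothetical, not KDSS internals:

```python
from typing import Dict, List, Set


def missing_shards(stripe_disks: List[str], offline: Set[str]) -> List[int]:
    """Indexes of the shards in one stripe that live on offline disks."""
    return [i for i, disk in enumerate(stripe_disks) if disk in offline]


def plan_repair(
    stripes: Dict[str, List[str]], offline: Set[str], data_shards: int
) -> Dict[str, str]:
    """Classify every affected stripe as 'rebuild' or 'unrecoverable'."""
    plan = {}
    for stripe_id, disks in stripes.items():
        lost = missing_shards(disks, offline)
        if not lost:
            continue  # stripe untouched by the failure
        surviving = len(disks) - len(lost)
        # Reed-Solomon needs any k surviving shards to recompute the rest.
        plan[stripe_id] = "rebuild" if surviving >= data_shards else "unrecoverable"
    return plan


# Two EC 5+2 stripes (7 shards each) spread over disks d1..d9.
stripes = {
    "s1": ["d1", "d2", "d3", "d4", "d5", "d6", "d7"],
    "s2": ["d3", "d4", "d5", "d6", "d7", "d8", "d9"],
}
print(plan_repair(stripes, offline={"d2"}, data_shards=5))
```

With a single disk offline only `s1` is affected and it is still rebuildable; losing three of its disks at once would exceed the two-failure tolerance of EC 5+2.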
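As a cluster grows and workloads shift, disk utilization can become uneven. KDSS includes a utilization-aware balancing engine that migrates shards from heavily used disks to lightly used ones without interrupting service, keeping the cluster healthy and storage evenly distributed, with real-time progress monitoring.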
# Check cluster disk utilization
kdss-cli balance status
Disk Utilization:
node-01/sda ████████████░░░ 82%
node-01/sdb █████████░░░░░░ 61%
node-02/sda ██████████████░ 93%
node-02/sdb ███████░░░░░░░░ 47%
node-03/sda ██████████░░░░░ 68%
Imbalance: 46% (threshold: 20%)
# Start balancing
kdss-cli balance start --max-bandwidth 100MB/s
Balancing in progress...
Migrated: 128 shards (12.4 GB)
Remaining: 64 shards (~6.2 GB)
ETA: 3m 22s
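The imbalance figure in the sample output matches the gap between the fullest and emptiest disk (93% - 47% = 46 percentage points). The sketch below assumes that definition, which is inferred from the sample rather than documented, and uses illustrative function names:

```python
from typing import Dict, Tuple

# Utilization percentages taken from the sample output above.
utilization = {
    "node-01/sda": 82,
    "node-01/sdb": 61,
    "node-02/sda": 93,
    "node-02/sdb": 47,
    "node-03/sda": 68,
}


def imbalance(util: Dict[str, int]) -> int:
    """Spread between the fullest and emptiest disk, in percentage points (assumed metric)."""
    return max(util.values()) - min(util.values())


def needs_balancing(util: Dict[str, int], threshold: int = 20) -> bool:
    """True when the spread exceeds the configured threshold."""
    return imbalance(util) > threshold


def next_migration(util: Dict[str, int]) -> Tuple[str, str]:
    """Source (fullest) and target (emptiest) disk for the next shard move."""
    source = max(util, key=util.get)
    target = min(util, key=util.get)
    return source, target


print(f"Imbalance: {imbalance(utilization)}% (threshold: 20%)")
print("Move shards:", " -> ".join(next_migration(utilization)))
```

A real balancer would also have to keep shards of the same stripe on different disks and respect the bandwidth cap; this sketch only covers the trigger condition and the choice of source and target.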
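A full-featured web management console gives you complete visibility into and control over your KDSS cluster. View cluster health at a glance, manage nodes and disks, configure EC policies, and control user access, all from an intuitive dashboard with role-based access control (RBAC).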
+-----------------------------------------------+
| KDSS Console [admin] [Logout] |
+-----------------------------------------------+
| | |
| Dash | Cluster Health: Healthy |
| Nodes | |
| Disks | Nodes: 5/5 online |
| Buckets | Disks: 20/20 active |
| Users | Capacity: 42.8 TB / 60 TB (71%) |
| Alerts | |
| Config | Throughput IOPS |
| | ~~~^~~~ ~~^~~~ |
| | 348 MB/s 12.4K |
| | |
| | Recent Alerts (3) |
| | ! Disk node-02/sdc S.M.A.R.T. warn |
| | ! Repair job #47 completed |
| | i Balance job started |
+-----------------------------------------------+
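KDSS exposes comprehensive Prometheus metrics out of the box, covering cluster health, disk I/O, EC operations, and API latency. With 32 built-in alert rules, preconfigured Grafana dashboards, and Feishu webhook integration, you get full observability without building a monitoring stack from scratch.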
# Prometheus scrape config for KDSS
scrape_configs:
- job_name: 'kdss-master'
static_configs:
- targets: ['master-01:6701']
metrics_path: /metrics
- job_name: 'kdss-storage'
static_configs:
- targets:
- 'storage-01:6801'
- 'storage-02:6801'
- 'storage-03:6801'
# Sample alert rules (32 built-in)
# - DiskSpaceCritical (>90%)
# - DiskSmartWarning
# - NodeHeartbeatLost (>30s)
# - ECRepairQueueHigh (>100)
# - APILatencyP99High (>500ms)
# - ReplicationLagHigh
# ...
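KDSS implements a safe two-phase deletion process with a configurable recycle-bin retention period. Garbage collection runs automatically in the background, reclaiming space from deleted objects and expired temporary data, while guaranteeing that no data is permanently removed before its retention window expires.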
# Garbage Collection Configuration
[gc]
enabled = true
# Run GC every 6 hours
interval = "6h"
# Only run during the low-traffic window
time_window = "02:00-06:00"
[gc.recycle_bin]
# Retain deleted objects for 7 days
retention = "168h"
[gc.multipart]
# Clean up stale multipart uploads after 24h
stale_timeout = "24h"
[gc.compaction]
# Trigger compaction when fragmentation > 30%
fragmentation_threshold = 0.3
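The settings above translate into three simple checks: whether the current time is inside the GC window, whether a recycled object has outlived its retention period, and whether a disk is fragmented enough to compact. A minimal sketch using the values from the config; the function names and the definition of fragmentation as dead bytes over total bytes are assumptions made for illustration:

```python
from datetime import datetime, time, timedelta

# Values mirrored from the [gc] configuration above.
WINDOW_START, WINDOW_END = time(2, 0), time(6, 0)
RETENTION = timedelta(hours=168)
FRAGMENTATION_THRESHOLD = 0.3


def in_time_window(now: datetime) -> bool:
    """True while `now` falls inside the 02:00-06:00 low-traffic window."""
    return WINDOW_START <= now.time() < WINDOW_END


def can_purge(deleted_at: datetime, now: datetime) -> bool:
    """Second phase of deletion: allowed only once the retention period has passed."""
    return now - deleted_at >= RETENTION


def needs_compaction(dead_bytes: int, total_bytes: int) -> bool:
    """Assumed definition: fragmentation = dead bytes / total bytes."""
    return total_bytes > 0 and dead_bytes / total_bytes > FRAGMENTATION_THRESHOLD


now = datetime(2024, 1, 8, 3, 0)
print(in_time_window(now))                          # inside the window
print(can_purge(datetime(2024, 1, 1, 3, 0), now))   # deleted exactly 7 days ago
print(needs_compaction(35, 100))                    # 35% fragmentation
```

Keeping the purge check separate from the delete request is what makes the process two-phase: a delete only moves the object to the recycle bin, and space is reclaimed later by a GC pass.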