mirror of
https://github.com/minio/minio.git
synced 2025-11-06 20:33:07 -05:00
feat: add dynamic usage cache (#12229)
A cache structure will be kept with a tree of usages. The cache is a tree structure where each node keeps track of its children.

An uncompacted branch contains a count of only the files directly at the branch level, and contains links to child branches or leaves.

The leaves are "compacted" based on a number of properties. A compacted leaf contains the totals of all files beneath it.

A leaf is only scanned once every dataUsageUpdateDirCycles, and less often if the bloom filter for the path is clean and no lifecycles are applied. Skipped leaves have their totals transferred from the previous cycle.

A clean leaf will be included once every healFolderIncludeProb for partial heal scans. When selected, there is a one in healObjectSelectProb chance that any given object will be chosen for heal scan.

Compaction happens when any of the following holds:
- The folder (and subfolders) contains fewer than dataScannerCompactLeastObject objects.
- The folder itself contains more than dataScannerCompactAtFolders folders.
- The folder only contains objects and no subfolders.

A bucket root will never be compacted.

Furthermore, if a bucket has more than dataScannerCompactAtChildren recursive children (uncompacted folders), the tree will be recursively scanned and the branches with the fewest objects will be compacted until the limit is reached.

This ensures that no branch will ever contain an unreasonable number of other branches, and also that small branches with few objects don't take up unreasonable amounts of space.

Whenever a branch is scanned, it is assumed that it will be un-compacted before it hits any of the above limits. This will make the branch rebalance itself when scanned if the distribution of objects has changed.

TL;DR, with current values:
- No bucket will ever have more than 10000 child nodes recursively.
- No single folder will have more than 2500 child nodes by itself.
- All subfolders are compacted if they have fewer than 500 objects in them recursively.

We accumulate the (non-deletemarker) version count for paths as well, since we are changing the structure anyway.
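
The per-folder compaction rules in the message above can be sketched roughly as follows. This is an illustration only, not the code from the commit: the `folderStats` type and the `shouldCompact` function are hypothetical, and the constant values are taken from the "current values" summary (500 objects, 2500 folders). The tree-wide limit of 10000 recursive children is not modelled here.

```go
package main

import "fmt"

// Values assumed from the "current values" summary in the commit message.
const (
	dataScannerCompactLeastObject = 500
	dataScannerCompactAtFolders   = 2500
)

// folderStats is a hypothetical summary of one scanned folder.
type folderStats struct {
	isBucketRoot     bool
	objectsRecursive int // objects in this folder and all of its subfolders
	subfolders       int // direct subfolders of this folder
}

// shouldCompact applies the three per-folder rules from the commit message.
func shouldCompact(f folderStats) bool {
	// A bucket root is never compacted.
	if f.isBucketRoot {
		return false
	}
	switch {
	case f.objectsRecursive < dataScannerCompactLeastObject:
		return true // too few objects to be worth a branch of its own
	case f.subfolders > dataScannerCompactAtFolders:
		return true // too many direct subfolders
	case f.subfolders == 0:
		return true // only objects, no subfolders
	}
	return false
}

func main() {
	fmt.Println(shouldCompact(folderStats{objectsRecursive: 120, subfolders: 3}))      // true
	fmt.Println(shouldCompact(folderStats{isBucketRoot: true, objectsRecursive: 10}))  // false
	fmt.Println(shouldCompact(folderStats{objectsRecursive: 900, subfolders: 4}))      // false
}
```

Checking the bucket-root case first keeps the exception from being shadowed by any of the size rules.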
@@ -427,17 +427,18 @@ func (s *xlStorage) NSScanner(ctx context.Context, cache dataUsageCache) (dataUs
 		return sizeSummary{}, errSkipFile
 	}
 
-	var totalSize int64
-
 	sizeS := sizeSummary{}
 	for _, version := range fivs.Versions {
 		oi := version.ToObjectInfo(item.bucket, item.objectPath())
-		totalSize += item.applyActions(ctx, objAPI, actionMeta{
+		sz := item.applyActions(ctx, objAPI, actionMeta{
 			oi:         oi,
 			bitRotScan: healOpts.Bitrot,
 		}, &sizeS)
+		if !oi.DeleteMarker && sz == oi.Size {
+			sizeS.versions++
+		}
+		sizeS.totalSize += sz
 	}
-	sizeS.totalSize = totalSize
 	return sizeS, nil
 })
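
The accounting introduced by this hunk can be tried in isolation with a minimal sketch. The `versionInfo` type, the `summarize` function, and the `applied` callback are hypothetical stand-ins for the object info and for `item.applyActions`; only the counting condition and the two `sizeSummary` fields come from the diff. The sketch assumes the returned size can differ from the version's own size, for example when an action removes the version, in which case that version is not counted.

```go
package main

import "fmt"

// versionInfo is a hypothetical stand-in for the fields of the object info used here.
type versionInfo struct {
	Size         int64
	DeleteMarker bool
}

// sizeSummary carries only the two fields touched by the diff.
type sizeSummary struct {
	totalSize int64
	versions  uint64
}

// summarize follows the loop in the diff: a version is counted only when it is
// not a delete marker and the size reported back equals the version's own size.
// applied stands in for the size returned by item.applyActions (an assumption).
func summarize(versions []versionInfo, applied func(versionInfo) int64) sizeSummary {
	var s sizeSummary
	for _, v := range versions {
		sz := applied(v)
		if !v.DeleteMarker && sz == v.Size {
			s.versions++
		}
		s.totalSize += sz
	}
	return s
}

func main() {
	versions := []versionInfo{
		{Size: 100},          // regular version, kept
		{DeleteMarker: true}, // delete marker, never counted
		{Size: 50},           // version for which the callback reports size 0
	}
	applied := func(v versionInfo) int64 {
		if v.Size == 50 {
			return 0 // pretend an action removed this version
		}
		return v.Size
	}
	s := summarize(versions, applied)
	fmt.Println(s.totalSize, s.versions) // 100 1
}
```

Accumulating directly into the summary inside the loop is what lets the separate `totalSize` variable be dropped.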