Commit c5f7404

initial bg movers
1 parent a4c4ab3 commit c5f7404

22 files changed: +1176 -24 lines changed

MultiTierDataMovement.md

Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
# Background Data Movement

To reduce the number of online evictions and to support asynchronous
promotion, we have added two periodic workers that handle eviction and promotion.

The diagram below shows a simplified version of how the background evictor
thread (green) is integrated into the CacheLib architecture.

<p align="center">
<img width="640" height="360" alt="BackgroundEvictor" src="cachelib-background-evictor.png">
</p>

## Synchronous Eviction and Promotion

- `disableEvictionToMemory`: Disables eviction to memory (on eviction, an item is always
evicted to NVMe or removed).

## Background Evictors

The background evictors scan each class to see if there are objects to move to the next (lower)
tier using a given strategy. Here we document the general parameters and the parameters
for the different strategies.

- `backgroundEvictorIntervalMilSec`: The interval at which this thread runs. By default,
the background evictor threads wake up every 10 ms to scan the AllocationClasses. In addition,
a background evictor thread is woken up every time an allocation fails (in
a request handling thread) while the current percentage of free memory for the
AllocationClass is lower than `lowEvictionAcWatermark`. This may make the interval parameter
less important when request handling threads perform many allocations.

- `evictorThreads`: The number of background evictors to run. Each thread is assigned
a set of AllocationClasses to scan and evict objects from. Currently, each thread gets
an equal number of classes to scan, but because the object size distribution may be unequal, future
versions will attempt to balance the classes among threads. The range is 1 to the number of AllocationClasses.
The default is 1.

- `maxEvictionBatch`: The number of objects to remove in a given eviction call. The
default is 40. The allowed range is 10 to 1000. If the value is too low, we might not
remove objects at a reasonable rate; if it is too high, it might increase contention with user threads.

- `minEvictionBatch`: The minimum number of items to evict at any time (if there are any
candidates).

- `maxEvictionPromotionHotness`: The maximum number of candidates to consider for eviction. This is similar to `maxEvictionBatch`,
but it specifies how many candidates are taken into consideration, not the actual number of items to evict.
This option can be used to bound the duration of the critical section on the LRU lock.

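The equal split of AllocationClasses among evictor threads described for `evictorThreads` can be pictured with a short sketch. The `assignClasses` helper, the `MemoryDescriptor` alias and the plain `int` id types are hypothetical names chosen for illustration; the round-robin order is an assumption, and the actual assignment code in CacheLib may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-ins for CacheLib's TierId, PoolId and ClassId.
using TierId = int;
using PoolId = int;
using ClassId = int;
using MemoryDescriptor = std::tuple<TierId, PoolId, ClassId>;

// Deal the classes out round-robin so that every worker thread receives an
// equal share (give or take one class). Illustrative assumption only.
std::vector<std::vector<MemoryDescriptor>> assignClasses(
    const std::vector<MemoryDescriptor>& classes, std::size_t numThreads) {
  assert(numThreads > 0);
  std::vector<std::vector<MemoryDescriptor>> perThread(numThreads);
  for (std::size_t i = 0; i < classes.size(); ++i) {
    perThread[i % numThreads].push_back(classes[i]);
  }
  return perThread;
}
```

With five classes and two threads, the first thread would scan three classes and the second two. A size-aware split, which the text names as future work, would replace the modulo step.
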
### FreeThresholdStrategy (default)

- `lowEvictionAcWatermark`: Triggers the background eviction thread to run
when the percentage of free memory in the AllocationClass drops below this value.
The default is `2.0`. To avoid wasting capacity, we don't set this above `10.0`.

- `highEvictionAcWatermark`: Stops evictions from an AllocationClass when this
percentage of the AllocationClass is free. The default is `5.0`. To avoid wasting capacity, we
don't set this above `10.0`.

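One way to read the two watermarks is as a hysteresis band: eviction starts once free memory drops below the low watermark and continues until the high watermark is reached. The sketch below illustrates that reading only. `FreeThresholdParams` and `shouldEvict` are hypothetical names, the handling of the exact boundary values is an assumption, and the real `FreeThresholdStrategy` may compute its batches differently.

```cpp
#include <cassert>

// Hypothetical holder for the two watermarks; the defaults follow the text.
struct FreeThresholdParams {
  double lowEvictionAcWatermark = 2.0;   // percent free below which eviction starts
  double highEvictionAcWatermark = 5.0;  // percent free at which eviction stops
};

// Decides whether an AllocationClass should be evicted from in this pass.
// `evicting` says whether the class was being evicted from in the previous pass.
bool shouldEvict(double freePercent, bool evicting, const FreeThresholdParams& p) {
  assert(p.lowEvictionAcWatermark <= p.highEvictionAcWatermark);
  if (freePercent < p.lowEvictionAcWatermark) {
    return true;  // too little free memory: start (or keep) evicting
  }
  if (freePercent >= p.highEvictionAcWatermark) {
    return false;  // target reached: stop evicting
  }
  return evicting;  // inside the band: keep doing what we were doing
}
```

Keeping a gap between the two values avoids starting and stopping eviction on every small change in free memory.
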
## Background Promoters

The background promoters scan each class to see if there are objects to move to the next (upper)
tier using a given strategy. Here we document the general parameters and the parameters
for the different strategies.

- `backgroundPromoterIntervalMilSec`: The interval at which this thread runs. By default,
the background promoter threads wake up every 10 ms to scan the AllocationClasses for
objects to promote.

- `promoterThreads`: The number of background promoters to run. Each thread is assigned
a set of AllocationClasses to scan and promote objects from. Currently, each thread gets
an equal number of classes to scan, but because the object size distribution may be unequal, future
versions will attempt to balance the classes among threads. The range is `1` to the number of AllocationClasses. The default is `1`.

- `maxPromotionBatch`: The number of objects to promote in a given promotion call. The
default is 40. The allowed range is 10 to 1000. If the value is too low, we might not
promote objects at a reasonable rate; if it is too high, it might increase contention with user threads.

- `minPromotionBatch`: The minimum number of items to promote at any time (if there are any
candidates).

- `numDuplicateElements`: Allows us to promote items that have existing (read-only) handles, since
we won't need to modify the data when a user is done with it. As a result, for a short time
the data could reside in both tiers until it is evicted from its current tier. The default is to
not allow this (0). Setting the value to 100 enables duplicate elements in tiers.

### Background Promotion Strategy (only one currently)

- `promotionAcWatermark`: Promote items if at least this
percentage of the AllocationClass is free. The promotion thread will attempt to move `maxPromotionBatch` objects
to that tier. The objects are chosen from the head of the LRU. The default is `4.0`.
This value should correlate with `lowEvictionAcWatermark`, `highEvictionAcWatermark`, `minAcAllocationWatermark`, and `maxAcAllocationWatermark`.
- `maxPromotionBatch`: The number of objects to promote in a batch during background promotion. Analogous to
`maxEvictionBatch`. Its value should be lower to decrease contention on hot items.

## Allocation policies

- `maxAcAllocationWatermark`: An item is always allocated in the topmost tier if at least this
percentage of the AllocationClass is free.
- `minAcAllocationWatermark`: An item is always allocated in the bottom tier if only this percentage
of the AllocationClass is free. If the percentage of free memory in the AllocationClass is between `maxAcAllocationWatermark`
and `minAcAllocationWatermark`, extra checks (described below) are performed to decide where to put the element.

By default, allocation is always performed from the upper tier.

- `acTopTierEvictionWatermark`: If there is less than this percentage of free memory in the topmost tier, CacheLib will attempt to evict from the top tier. This option takes precedence over the allocation watermarks.

### Extra policies (used only when the percentage of free memory in the AllocationClass is between `maxAcAllocationWatermark` and `minAcAllocationWatermark`)
- `sizeThresholdPolicy`: If an item is smaller than this value, always allocate it in the upper tier.
- `defaultTierChancePercentage`: Chance (0-100%) of allocating an item in the top tier.

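The allocation watermarks and the extra policies combine into a small decision procedure, sketched below for a two-tier setup. `Tier`, `AllocationPolicy` and `chooseTier` are hypothetical names, the order of the two extra checks is an assumption, and `acTopTierEvictionWatermark` is deliberately left out. The random draw is passed in as `roll` so that the function stays deterministic.

```cpp
#include <cassert>
#include <cstddef>

enum class Tier { Top, Bottom };

// Hypothetical bundle of the options documented above (no defaults implied).
struct AllocationPolicy {
  double maxAcAllocationWatermark;     // percent free: at or above, always top tier
  double minAcAllocationWatermark;     // percent free: at or below, always bottom tier
  std::size_t sizeThresholdPolicy;     // items smaller than this go to the top tier
  double defaultTierChancePercentage;  // 0-100: chance of the top tier otherwise
};

// freePercent: free share of the AllocationClass in the top tier (0-100).
// roll: uniform random number in [0, 100) supplied by the caller.
Tier chooseTier(double freePercent,
                std::size_t itemSize,
                double roll,
                const AllocationPolicy& p) {
  assert(p.minAcAllocationWatermark <= p.maxAcAllocationWatermark);
  if (freePercent >= p.maxAcAllocationWatermark) {
    return Tier::Top;
  }
  if (freePercent <= p.minAcAllocationWatermark) {
    return Tier::Bottom;
  }
  // Between the watermarks: apply the extra policies.
  if (itemSize < p.sizeThresholdPolicy) {
    return Tier::Top;
  }
  return roll < p.defaultTierChancePercentage ? Tier::Top : Tier::Bottom;
}
```

For example, with watermarks of 10 and 2, a size threshold of 128 bytes and a 50% chance, a 4096-byte item arriving when 5% of the class is free lands in the top tier only if the draw falls below 50.
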
## MMContainer options

- `lruInsertionPointSpec`: Can be set per tier when LRU2Q is used. Determines where new items are
inserted: 0 = insert into the hot queue, 1 = insert into the warm queue, 2 = insert into the cold queue.
- `markUsefulChance`: Set per tier. Determines the chance of moving an item to the head of the LRU on access.
Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
/*
 * Copyright (c) Intel and its affiliates.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

namespace facebook {
namespace cachelib {

template <typename CacheT>
BackgroundMover<CacheT>::BackgroundMover(
    Cache& cache,
    std::shared_ptr<BackgroundMoverStrategy> strategy,
    MoverDir direction)
    : cache_(cache), strategy_(std::move(strategy)), direction_(direction) {
  // Select the cache routine that matches the direction of this mover.
  if (direction_ == MoverDir::Evict) {
    moverFunc = BackgroundMoverAPIWrapper<CacheT>::traverseAndEvictItems;
  } else if (direction_ == MoverDir::Promote) {
    moverFunc = BackgroundMoverAPIWrapper<CacheT>::traverseAndPromoteItems;
  }
}

template <typename CacheT>
BackgroundMover<CacheT>::~BackgroundMover() {
  stop(std::chrono::seconds(0));
}

template <typename CacheT>
void BackgroundMover<CacheT>::work() {
  try {
    checkAndRun();
  } catch (const std::exception& ex) {
    XLOGF(ERR, "BackgroundMover interrupted due to exception: {}", ex.what());
  }
}

template <typename CacheT>
void BackgroundMover<CacheT>::setAssignedMemory(
    std::vector<std::tuple<TierId, PoolId, ClassId>>&& assignedMemory) {
  XLOG(INFO, "Class assigned to background worker:");
  for (const auto& [tid, pid, cid] : assignedMemory) {
    XLOGF(INFO, "Tid: {}, Pid: {}, Cid: {}", tid, pid, cid);
  }

  // Swap in the new assignment under the mutex; checkAndRun() reads it
  // under the same mutex.
  mutex.lock_combine([this, &assignedMemory] {
    this->assignedMemory_ = std::move(assignedMemory);
  });
}

// Ask the strategy how many items to move for each assigned class, then
// move up to that many items per class and record the results.
template <typename CacheT>
void BackgroundMover<CacheT>::checkAndRun() {
  // Work on a copy of the assignment so the mutex is not held while moving.
  auto assignedMemory = mutex.lock_combine([this] { return assignedMemory_; });

  unsigned int moves = 0;
  std::set<ClassId> classes{};
  auto batches = strategy_->calculateBatchSizes(cache_, assignedMemory);

  // batches[i] is the batch size for assignedMemory[i].
  for (size_t i = 0; i < batches.size(); i++) {
    const auto [tid, pid, cid] = assignedMemory[i];
    const auto batch = batches[i];

    classes.insert(cid);

    if (!batch) {
      continue;
    }

    // Try moving `batch` items from the class in order to reach the free
    // target.
    auto moved = moverFunc(cache_, tid, pid, cid, batch);
    moves += moved;
    moves_per_class_[tid][pid][cid] += moved;

    // Account for the items that were actually moved, not the requested batch.
    const auto& mpStats = cache_.getPoolByTid(pid, tid).getStats();
    totalBytesMoved.add(moved * mpStats.acStats.at(cid).allocSize);
  }

  numTraversals.inc();
  numMovedItems.add(moves);
  totalClasses.add(classes.size());
}

template <typename CacheT>
BackgroundMoverStats BackgroundMover<CacheT>::getStats() const noexcept {
  BackgroundMoverStats stats;
  stats.numMovedItems = numMovedItems.get();
  stats.runCount = numTraversals.get();
  stats.totalBytesMoved = totalBytesMoved.get();
  stats.totalClasses = totalClasses.get();

  return stats;
}

template <typename CacheT>
std::map<TierId, std::map<PoolId, std::map<ClassId, uint64_t>>>
BackgroundMover<CacheT>::getClassStats() const noexcept {
  return moves_per_class_;
}

} // namespace cachelib
} // namespace facebook
Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
/*
 * Copyright (c) Intel and its affiliates.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#pragma once

#include <folly/concurrency/UnboundedQueue.h>
#include <folly/synchronization/DistributedMutex.h>
#include <gtest/gtest_prod.h>

#include <functional>
#include <map>
#include <memory>
#include <set>
#include <tuple>
#include <vector>

#include "cachelib/allocator/BackgroundMoverStrategy.h"
#include "cachelib/allocator/CacheStats.h"
#include "cachelib/common/AtomicCounter.h"
#include "cachelib/common/PeriodicWorker.h"

namespace facebook {
namespace cachelib {

// Wrapper that exposes the private APIs of CacheType that are specifically
// needed by the background mover.
template <typename C>
struct BackgroundMoverAPIWrapper {
  static size_t traverseAndEvictItems(C& cache,
                                      unsigned int tid,
                                      unsigned int pid,
                                      unsigned int cid,
                                      size_t batch) {
    return cache.traverseAndEvictItems(tid, pid, cid, batch);
  }

  static size_t traverseAndPromoteItems(C& cache,
                                        unsigned int tid,
                                        unsigned int pid,
                                        unsigned int cid,
                                        size_t batch) {
    return cache.traverseAndPromoteItems(tid, pid, cid, batch);
  }
};

// Direction in which a BackgroundMover moves items between tiers.
enum class MoverDir { Evict = 0, Promote };

// Periodic worker that moves (evicts or promotes) items between tiers in
// batches. The primary aim is to reduce insertion times for new items in the
// cache.
template <typename CacheT>
class BackgroundMover : public PeriodicWorker {
 public:
  using Cache = CacheT;
  // @param cache      the cache interface
  // @param strategy   the strategy class that defines how objects are moved
  //                   (promoted vs. evicted and how much)
  // @param direction  whether this worker evicts or promotes
  BackgroundMover(Cache& cache,
                  std::shared_ptr<BackgroundMoverStrategy> strategy,
                  MoverDir direction);

  ~BackgroundMover() override;

  BackgroundMoverStats getStats() const noexcept;
  std::map<TierId, std::map<PoolId, std::map<ClassId, uint64_t>>>
  getClassStats() const noexcept;

  void setAssignedMemory(
      std::vector<std::tuple<TierId, PoolId, ClassId>>&& assignedMemory);

 private:
  // number of items moved so far, per tier, pool and class
  std::map<TierId, std::map<PoolId, std::map<ClassId, uint64_t>>>
      moves_per_class_;
  // cache allocator's interface for evicting
  using Item = typename Cache::Item;

  Cache& cache_;
  std::shared_ptr<BackgroundMoverStrategy> strategy_;
  MoverDir direction_;

  // cache routine selected in the constructor according to direction_
  std::function<size_t(
      Cache&, unsigned int, unsigned int, unsigned int, size_t)>
      moverFunc;

  // implements the actual logic of running the background mover
  void work() override final;
  void checkAndRun();

  AtomicCounter numMovedItems{0};
  AtomicCounter numTraversals{0};
  AtomicCounter totalClasses{0};
  AtomicCounter totalBytesMoved{0};

  // (tier, pool, class) triples assigned to this worker; guarded by mutex
  std::vector<std::tuple<TierId, PoolId, ClassId>> assignedMemory_;
  folly::DistributedMutex mutex;
};
} // namespace cachelib
} // namespace facebook

#include "cachelib/allocator/BackgroundMover-inl.h"
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
/*
 * Copyright (c) Facebook, Inc. and its affiliates.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#pragma once

#include <tuple>
#include <vector>

#include "cachelib/allocator/Cache.h"

namespace facebook {
namespace cachelib {

// Base class for background mover (eviction and promotion) strategies.
class BackgroundMoverStrategy {
 public:
  virtual ~BackgroundMoverStrategy() = default;

  // Returns the number of items to move for each entry of acVec, in the
  // same order as acVec.
  virtual std::vector<size_t> calculateBatchSizes(
      const CacheBase& cache,
      std::vector<std::tuple<TierId, PoolId, ClassId>> acVec) = 0;
};

} // namespace cachelib
} // namespace facebook
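
A concrete strategy only has to return one batch size per entry of `acVec`, in the same order, because `checkAndRun()` pairs `batches[i]` with `assignedMemory[i]`. The following self-contained sketch shows that contract with simplified stand-ins. The `CacheBase` stub with its `freePercent` method, `FakeCache` and `FixedBatchStrategy` are hypothetical and exist only for this example; they are not part of CacheLib.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-ins for the CacheLib id types.
using TierId = int;
using PoolId = int;
using ClassId = int;
using MemoryDescriptor = std::tuple<TierId, PoolId, ClassId>;

// Simplified stand-in for cachelib's CacheBase. freePercent() is an
// assumption made for this sketch, not a real CacheLib method.
struct CacheBase {
  virtual ~CacheBase() = default;
  virtual double freePercent(TierId tid, PoolId pid, ClassId cid) const = 0;
};

// Same shape as the interface declared in the header above.
class BackgroundMoverStrategy {
 public:
  virtual ~BackgroundMoverStrategy() = default;
  virtual std::vector<std::size_t> calculateBatchSizes(
      const CacheBase& cache, std::vector<MemoryDescriptor> acVec) = 0;
};

// Requests a fixed batch for every class whose free share is below a
// threshold, and nothing for the others.
class FixedBatchStrategy final : public BackgroundMoverStrategy {
 public:
  FixedBatchStrategy(double thresholdPercent, std::size_t batch)
      : threshold_(thresholdPercent), batch_(batch) {
    assert(thresholdPercent >= 0.0 && thresholdPercent <= 100.0);
  }

  std::vector<std::size_t> calculateBatchSizes(
      const CacheBase& cache, std::vector<MemoryDescriptor> acVec) override {
    std::vector<std::size_t> batches;
    batches.reserve(acVec.size());
    for (const auto& [tid, pid, cid] : acVec) {
      const bool low = cache.freePercent(tid, pid, cid) < threshold_;
      batches.push_back(low ? batch_ : 0);  // one entry per class, same order
    }
    return batches;
  }

 private:
  double threshold_;
  std::size_t batch_;
};

// Test double: reports the class id itself as the free percentage.
struct FakeCache final : CacheBase {
  double freePercent(TierId, PoolId, ClassId cid) const override {
    return static_cast<double>(cid);
  }
};
```

Returning a zero for a class is how a strategy tells the mover to skip it, matching the `if (!batch) continue;` check in `checkAndRun()`.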

cachelib/allocator/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ add_library (cachelib_allocator
3535
CCacheManager.cpp
3636
ContainerTypes.cpp
3737
FreeMemStrategy.cpp
38+
FreeThresholdStrategy.cpp
3839
HitsPerSlabStrategy.cpp
3940
LruTailAgeStrategy.cpp
4041
MarginalHitsOptimizeStrategy.cpp

cachelib/allocator/Cache.h

Lines changed: 6 additions & 0 deletions
@@ -96,6 +96,12 @@ class CacheBase {
9696
//
9797
// @param poolId The pool id to query
9898
virtual const MemoryPool& getPool(PoolId poolId) const = 0;
99+
100+
// Get the reference to a memory pool using a tier id, for stats purposes
101+
//
102+
// @param poolId The pool id to query
103+
// @param tid The tier id of the pool
104+
virtual const MemoryPool& getPoolByTid(PoolId poolId, TierId tid) const = 0;
99105

100106
// Get Pool specific stats (regular pools). This includes stats from the
101107
// Memory Pool and also the cache.
