1 change: 1 addition & 0 deletions Cargo.toml
@@ -222,6 +222,7 @@ test = false
workspace = true

[workspace.lints.clippy]
cloned_instead_of_copied = "warn"
cloned_ref_to_slice_refs = "warn"
implicit_clone = "warn"
semicolon_if_nothing_returned = "warn"
57 changes: 54 additions & 3 deletions docs/Caching.md
@@ -48,12 +48,63 @@ See https://github.com/mozilla/sccache/blob/8567bbe2ba493153e76177c1f9a6f98cc7ba

### C/C++ preprocessor

In "preprocessor cache mode", [explained in the local doc](Local.md), an
extra key is computed to cache the preprocessor output itself. It is very close
to the C/C++ compiler one, but with additional elements:
In "preprocessor cache mode", explained below, an extra key is computed to cache the preprocessor output itself.
It is very close to the C/C++ compiler key, but with additional elements:

* The path of the input file
* The hash of the input file

Note that some compiler options can disable preprocessor cache mode. As of this
writing, only `-Xpreprocessor` and `-Wp,*` do.
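
As a rough illustration of the two additional key elements listed above, the sketch below folds the input path and a hash of the input contents into an existing compiler key. The function name `preprocessor_key_sketch` is hypothetical, and the standard library's `DefaultHasher` merely stands in for whatever digest sccache actually uses; this is not the project's implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::Path;

/// Hypothetical helper (not part of sccache): extends an already-computed
/// compiler key with the path and the content hash of the input file.
fn preprocessor_key_sketch(compiler_key: &str, input_path: &Path, input_contents: &[u8]) -> String {
    // Hash the file contents on their own first, so the final key depends
    // on "the hash of the input file" rather than on the raw bytes.
    let mut content_hasher = DefaultHasher::new();
    input_contents.hash(&mut content_hasher);
    let content_hash = content_hasher.finish();

    // Combine the compiler key, the input path, and the content hash.
    let mut hasher = DefaultHasher::new();
    compiler_key.hash(&mut hasher);
    input_path.hash(&mut hasher);
    content_hash.hash(&mut hasher);

    // Render as a fixed-width hexadecimal string.
    format!("{:016x}", hasher.finish())
}
```

With this shape, the same source text compiled from two different paths yields two different preprocessor keys, even when the compiler key is identical.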

#### Preprocessor cache mode

This is inspired by [ccache's direct mode](https://ccache.dev/manual/3.7.9.html#_the_direct_mode) and works roughly the same way.
It adds a cache that makes it possible to skip preprocessing when compiling C/C++. This can make it much faster to return compilation results
from cache, since preprocessing is a major expense for C/C++ compilations.

Preprocessor cache mode is controlled by the `use_preprocessor_cache_mode` configuration option, which is `true` by default, as well as by the additional conditions described below.

To ensure that the cached preprocessor results for a source file correspond to the un-preprocessed inputs, sccache needs
to remember, among other things, all files included by the source file. sccache also needs to recognize
when "external factors" may change the results, such as system time if the `__TIME__` macro is used
in a source file. How conservative sccache is about some of these external factors is configurable; see below.

Preprocessor cache mode will be disabled in any of the following cases:

- Not compiling C or C++
- The configuration option is false
- Not using GCC or Clang
- Not using local storage for the cache
- Either of the compiler options `-Xpreprocessor` or `-Wp,*` is present
- The modification time of one of the header files is too new (avoids a race condition)
- Certain strings such as `__DATE__`, `__TIME__`, `__TIMESTAMP__` are present in the source code,
indicating that the preprocessor result may change based on external factors
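
The conditions above can be read as a chain of early exits. The following is a minimal sketch under stated assumptions: the `Invocation` struct and `disable_reason` function are hypothetical names invented for illustration, the storage-type and header modification time checks are left out, and the time-macro check honors the `ignore_time_macros` option. It is not sccache's actual code.

```rust
/// Hypothetical summary of one compiler invocation (not an sccache type).
#[derive(Clone, Copy)]
struct Invocation<'a> {
    is_c_or_cpp: bool,
    option_enabled: bool,
    is_gcc_or_clang: bool,
    args: &'a [&'a str],
    source: &'a str,
    ignore_time_macros: bool,
}

/// Strings whose presence suggests that the output depends on the clock.
const TIME_MACROS: [&str; 3] = ["__DATE__", "__TIME__", "__TIMESTAMP__"];

/// Returns why preprocessor cache mode would be skipped, or `None` if no
/// condition modelled in this sketch applies.
fn disable_reason(inv: &Invocation) -> Option<&'static str> {
    if !inv.is_c_or_cpp {
        return Some("not compiling C or C++");
    }
    if !inv.option_enabled {
        return Some("configuration option is false");
    }
    if !inv.is_gcc_or_clang {
        return Some("not using GCC or Clang");
    }
    // `-Xpreprocessor` is matched exactly; `-Wp,` is matched as a prefix.
    if inv.args.iter().any(|a| *a == "-Xpreprocessor" || a.starts_with("-Wp,")) {
        return Some("unsupported preprocessor option");
    }
    // A plain substring search: conservative, so it also fires when the
    // string only appears inside a comment.
    if !inv.ignore_time_macros && TIME_MACROS.iter().any(|m| inv.source.contains(m)) {
        return Some("time macro present in source");
    }
    None
}
```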

The preprocessor cache may silently produce stale results in any of the following cases:

- When a source file was compiled and its results were cached, a header file would have been included if it existed, but it did
not exist at the time. sccache does not know about such files, so it cannot invalidate the result if the header file later exists.
- A macro such as `__TIME__` (etc) is used in the source code and `ignore_time_macros` is enabled
- There are other external factors influencing the preprocessing result that sccache does not know about

Configuration options and their default values:

- `use_preprocessor_cache_mode`: `true`. Whether to use preprocessor cache mode. This can be overridden for an sccache invocation by setting the environment variable `SCCACHE_DIRECT` to `true`/`on`/`1` or `false`/`off`/`0`.
- `file_stat_matches`: `false`. If false, only compare header files by hashing their contents. If true, will use size + ctime + mtime to check whether a file has changed. See other flags below for more control over this behavior.
- `use_ctime_for_stat`: `true`. If true, uses the ctime (file status change on UNIX, creation time on Windows) to check that a file has/hasn't changed. Can be useful to disable when backdating modification times in a controlled manner.

- `ignore_time_macros`: `false`. If true, ignore `__DATE__`, `__TIME__` and `__TIMESTAMP__` being present in the source code. Will speed up preprocessor cache mode, but can produce stale results.

- `skip_system_headers`: `false`. If true, the preprocessor cache will only add the paths of included system headers to the cache key but ignore the headers' contents.

- `hash_working_directory`: `true`. If true, will add the current working directory to the cache key to distinguish two compilations from different directories.
- `max_size`: `10737418240`. The maximum size of the preprocessor cache, in bytes. Defaults to the default disk cache size (10 GiB).
- `rw_mode`: `ReadWrite`. ReadOnly or ReadWrite mode for the cache.
- `dir`: `path_to_cache_directory`. Path to the preprocessor cache. By default, it uses DiskCache's directory, under the `preprocessor` subdirectory.
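
To illustrate the `SCCACHE_DIRECT` override described for `use_preprocessor_cache_mode`, here is a minimal sketch. The function names are hypothetical, and two behaviors are assumptions rather than documented facts: matching is case-sensitive, and an unrecognized value falls back to the configured setting.

```rust
/// Hypothetical parser for the documented spellings of `SCCACHE_DIRECT`.
/// Assumption: matching is case-sensitive.
fn parse_direct_override(value: &str) -> Option<bool> {
    match value {
        "true" | "on" | "1" => Some(true),
        "false" | "off" | "0" => Some(false),
        _ => None,
    }
}

/// Combines the configured value with an optional environment override.
/// Assumption: an unrecognized value leaves the configured value in place.
fn effective_mode(configured: bool, env_value: Option<&str>) -> bool {
    env_value
        .and_then(parse_direct_override)
        .unwrap_or(configured)
}
```

A caller could pass `std::env::var("SCCACHE_DIRECT").ok().as_deref()` as `env_value` to read the variable for the current process.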

See where to write the config in [the configuration doc](Configuration.md).
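
The interaction between `file_stat_matches` and `use_ctime_for_stat` can be sketched as a comparison of recorded and current file metadata. The `FileStat` struct and `stat_matches` function below are hypothetical and only model the comparison described in the option list; how sccache gathers these values and what it does on a mismatch is not shown.

```rust
use std::time::SystemTime;

/// Hypothetical snapshot of the metadata compared when `file_stat_matches`
/// is enabled.
#[derive(Clone, Copy)]
struct FileStat {
    size: u64,
    mtime: SystemTime,
    ctime: SystemTime,
}

/// Returns true when the file looks unchanged according to size and mtime,
/// and also ctime unless `use_ctime_for_stat` is false.
fn stat_matches(cached: &FileStat, current: &FileStat, use_ctime_for_stat: bool) -> bool {
    cached.size == current.size
        && cached.mtime == current.mtime
        && (!use_ctime_for_stat || cached.ctime == current.ctime)
}
```

Setting `use_ctime_for_stat` to false makes the comparison tolerate a changed ctime, which matches the documented use case of backdating modification times in a controlled manner.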

`sccache --debug-preprocessor-cache` can be used to investigate the content of the preprocessor cache.

The preprocessor cache uses random read and write; thus, certain file systems, including `s3fs`, are not supported.
8 changes: 7 additions & 1 deletion docs/Configuration.md
@@ -53,7 +53,7 @@ dir = "/tmp/.cache/sccache"
size = 7516192768 # 7 GiBytes

# See the local docs on more explanations about this mode
[cache.disk.preprocessor_cache_mode]
[cache.preprocessor_cache_mode]
# Whether to use the preprocessor cache mode
use_preprocessor_cache_mode = true
# Whether to use file times to check for changes
@@ -66,6 +66,12 @@ ignore_time_macros = false
skip_system_headers = false
# Whether to hash the current working directory
hash_working_directory = true
# Maximum size of the cache
max_size = 1048576
# ReadOnly/ReadWrite mode
rw_mode = "ReadWrite"
# Path to the cache
dir = "/tmp/.cache/sccache-preprocess/"

[cache.gcs]
# optional oauth url
45 changes: 0 additions & 45 deletions docs/Local.md
@@ -6,51 +6,6 @@ The default cache size is 10 gigabytes. To change this, set `SCCACHE_CACHE_SIZE`

The local storage only supports a single sccache server at a time. Multiple concurrent servers will race and cause spurious build failures.

## Preprocessor cache mode

This is inspired by [ccache's direct mode](https://ccache.dev/manual/3.7.9.html#_the_direct_mode) and works roughly the same.
It adds a cache that allows to skip preprocessing when compiling C/C++. This can make it much faster to return compilation results
from cache since preprocessing is a major expense for these.

Preprocessor cache mode is controlled by a configuration option which is true by default, as well as additional conditions described below.

To ensure that the cached preprocessor results for a source file correspond to the un-preprocessed inputs, sccache needs
to remember, among other things, all files included by the source file. sccache also needs to recognize
when "external factors" may change the results, such as system time if the `__TIME__` macro is used
in a source file. How conservative sccache is about some of these external factors is configurable, see below.

Preprocessor cache mode will be disabled in any of the following cases:

- Not compiling C or C++
- The configuration option is false
- Not using GCC or Clang
- Not using local storage for the cache
- Any of the compiler options `-MP`, `-Xpreprocessor`, `-Wp,` are present
- The modification time of one of the header files is too new (avoids a race condition)
- Certain strings such as `__DATE__`, `__TIME__`, `__TIMESTAMP__` are present in the source code,
indicating that the preprocessor result may change based on external factors

The preprocessor cache may silently produce stale results in any of the following cases:

- When a source file was compiled and its results were cached, a header file would have been included if it existed, but it did
not exist at the time. sccache does not know about such files, so it cannot invalidate the result if the header file later exists.
- A macro such as `__TIME__` (etc) is used in the source code and `ignore_time_macros` is enabled
- There are other external factors influencing the preprocessing result that sccache does not know about

Configuration options and their default values:

- `use_preprocessor_cache_mode`: `true`. Whether to use preprocessor cache mode. This can be overridden for an sccache invocation by setting the environment variable `SCCACHE_DIRECT` to `true`/`on`/`1` or `false`/`off`/`0`.
- `file_stat_matches`: `false`. If false, only compare header files by hashing their contents. If true, will use size + ctime + mtime to check whether a file has changed. See other flags below for more control over this behavior.
- `use_ctime_for_stat`: `true`. If true, uses the ctime (file status change on UNIX, creation time on Windows) to check that a file has/hasn't changed. Can be useful to disable when backdating modification times in a controlled manner.

- `ignore_time_macros`: `false`. If true, ignore `__DATE__`, `__TIME__` and `__TIMESTAMP__` being present in the source code. Will speed up preprocessor cache mode, but can produce stale results.

- `skip_system_headers`: `false`. If true, the preprocessor cache will only add the paths of included system headers to the cache key but ignore the headers' contents.

- `hash_working_directory`: `true`. If true, will add the current working directory to the cache key to distinguish two compilations from different directories.

See where to write the config in [the configuration doc](Configuration.md).

## Read-only cache mode

By default, the local cache operates in read/write mode. The `SCCACHE_LOCAL_RW_MODE` environment variable can be set to `READ_ONLY` (or `READ_WRITE`) to modify this behavior.
125 changes: 29 additions & 96 deletions src/cache/cache.rs
@@ -13,6 +13,9 @@
// limitations under the License.

use super::cache_io::*;
use super::preprocessor_cache::PreprocessorCacheStorage;
use super::storage::Storage;
use crate::cache::PreprocessorCache;
#[cfg(feature = "azure")]
use crate::cache::azure::AzureBlobCache;
#[cfg(feature = "cos")]
@@ -44,96 +47,13 @@ use crate::cache::s3::S3Cache;
use crate::cache::utils::normalize_key;
#[cfg(feature = "webdav")]
use crate::cache::webdav::WebdavCache;
use crate::compiler::PreprocessorCacheEntry;
use crate::config::Config;
use crate::config::{self, CacheType, PreprocessorCacheModeConfig};
use crate::config::{self, CacheType};
use crate::errors::*;
use async_trait::async_trait;

use std::io;
use std::sync::Arc;
use std::time::Duration;

use crate::errors::*;

/// An interface to cache storage.
#[async_trait]
pub trait Storage: Send + Sync {
/// Get a cache entry by `key`.
///
/// If an error occurs, this method should return a `Cache::Error`.
/// If nothing fails but the entry is not found in the cache,
/// it should return a `Cache::Miss`.
/// If the entry is successfully found in the cache, it should
/// return a `Cache::Hit`.
async fn get(&self, key: &str) -> Result<Cache>;

/// Put `entry` in the cache under `key`.
///
/// Returns a `Future` that will provide the result or error when the put is
/// finished.
async fn put(&self, key: &str, entry: CacheWrite) -> Result<Duration>;

/// Check the cache capability.
///
/// - `Ok(CacheMode::ReadOnly)` means cache can only be used to `get`
/// cache.
/// - `Ok(CacheMode::ReadWrite)` means cache can do both `get` and `put`.
/// - `Err(err)` means cache is not setup correctly or not match with
/// users input (for example, user try to use `ReadWrite` but cache
/// is `ReadOnly`).
///
/// We will provide a default implementation which returns
/// `Ok(CacheMode::ReadWrite)` for service that doesn't
/// support check yet.
async fn check(&self) -> Result<CacheMode> {
Ok(CacheMode::ReadWrite)
}

/// Get the storage location.
fn location(&self) -> String;

/// Get the cache backend type name (e.g., "disk", "redis", "s3").
/// Used for statistics and display purposes.
fn cache_type_name(&self) -> &'static str {
"unknown"
}

/// Get the current storage usage, if applicable.
async fn current_size(&self) -> Result<Option<u64>>;

/// Get the maximum storage size, if applicable.
async fn max_size(&self) -> Result<Option<u64>>;

/// Return the config for preprocessor cache mode if applicable
fn preprocessor_cache_mode_config(&self) -> PreprocessorCacheModeConfig {
// Enable by default, only in local mode
PreprocessorCacheModeConfig::default()
}
/// Return the base directories for path normalization if configured
fn basedirs(&self) -> &[Vec<u8>] {
&[]
}
/// Return the preprocessor cache entry for a given preprocessor key,
/// if it exists.
/// Only applicable when using preprocessor cache mode.
async fn get_preprocessor_cache_entry(
&self,
_key: &str,
) -> Result<Option<Box<dyn crate::lru_disk_cache::ReadSeek>>> {
Ok(None)
}
/// Insert a preprocessor cache entry at the given preprocessor key,
/// overwriting the entry if it exists.
/// Only applicable when using preprocessor cache mode.
async fn put_preprocessor_cache_entry(
&self,
_key: &str,
_preprocessor_cache_entry: PreprocessorCacheEntry,
) -> Result<()> {
Ok(())
}
}

/// Wrapper for opendal::Operator that adds basedirs support
#[cfg(any(
feature = "azure",
@@ -185,7 +105,7 @@ impl Storage for RemoteStorage {
async fn get(&self, key: &str) -> Result<Cache> {
match self.operator.read(&normalize_key(key)).await {
Ok(res) => {
let hit = CacheRead::from(io::Cursor::new(res.to_bytes()))?;
let hit = CacheRead::from(std::io::Cursor::new(res.to_bytes()))?;
Ok(Cache::Hit(hit))
}
Err(e) if e.kind() == opendal::ErrorKind::NotFound => Ok(Cache::Miss),
@@ -487,12 +407,13 @@ pub fn build_single_cache(
}
}

fn get_preprocessor_cache_storage(config: &Config) -> Result<Arc<dyn PreprocessorCacheStorage>> {
Ok(Arc::new(PreprocessorCache::new(&config.preprocessor_cache)))
}

/// Get a suitable `Storage` implementation from configuration.
/// Supports both single-cache (backward compatible) and multi-level cache configurations.
pub fn storage_from_config(
config: &Config,
pool: &tokio::runtime::Handle,
) -> Result<Arc<dyn Storage>> {
pub fn get_storage(config: &Config, pool: &tokio::runtime::Handle) -> Result<Arc<dyn Storage>> {
#[cfg(any(
feature = "azure",
feature = "gcs",
@@ -511,23 +432,32 @@ pub fn storage_from_config(

// No remote cache configured - use disk cache only
let (dir, size) = (&config.fallback_cache.dir, config.fallback_cache.size);
let preprocessor_cache_mode_config = config.fallback_cache.preprocessor_cache_mode;
let rw_mode = config.fallback_cache.rw_mode.into();
debug!("Init disk cache with dir {:?}, size {}", dir, size);

Ok(Arc::new(DiskCache::new(
dir,
size,
pool,
preprocessor_cache_mode_config,
rw_mode,
config.basedirs.clone(),
)))
}

pub fn get_storage_from_config(
config: &Config,
pool: &tokio::runtime::Handle,
) -> Result<(Arc<dyn Storage>, Arc<dyn PreprocessorCacheStorage>)> {
Ok((
get_storage(config, pool)?,
get_preprocessor_cache_storage(config)?,
))
}

#[cfg(test)]
mod test {
use super::*;
use crate::compiler::PreprocessorCacheEntry;
use crate::config::CacheModeConfig;
use fs_err as fs;

@@ -559,11 +489,12 @@ mod test {
config.fallback_cache.rw_mode = CacheModeConfig::ReadWrite;

{
let cache = storage_from_config(&config, runtime.handle()).unwrap();
let (cache, preprocessor_cache) =
get_storage_from_config(&config, runtime.handle()).unwrap();

runtime.block_on(async move {
cache.put("test1", CacheWrite::default()).await.unwrap();
cache
preprocessor_cache
.put_preprocessor_cache_entry("test1", PreprocessorCacheEntry::default())
.await
.unwrap();
@@ -572,9 +503,11 @@ mod test {

// Test Read-only
config.fallback_cache.rw_mode = CacheModeConfig::ReadOnly;
config.preprocessor_cache.rw_mode = CacheModeConfig::ReadOnly;

{
let cache = storage_from_config(&config, runtime.handle()).unwrap();
let (cache, preprocessor_cache) =
get_storage_from_config(&config, runtime.handle()).unwrap();

runtime.block_on(async move {
assert_eq!(
@@ -586,7 +519,7 @@ mod test {
"Cannot write to a read-only cache"
);
assert_eq!(
cache
preprocessor_cache
.put_preprocessor_cache_entry("test1", PreprocessorCacheEntry::default())
.await
.unwrap_err()