nemo_evaluator.adapters.caching.diskcaching#

Core disk- and file-backed cache API, adapted from https://github.com/grantjenks/python-diskcache with the following changes:

  • values are stored as raw bytes instead of being pickled.

The assumption is that the cache stores responses to requests, which are already bytes.

Module Contents#

Classes#

Cache

Disk and file backed cache.

Constant

Pretty display of immutable constant.

Disk

Cache key and value serialization for SQLite database and files.

JSONDisk

Cache key and value using JSON serialization with zlib compression.

Functions#

args_to_key

Create cache key out of function arguments.

full_name

Return full name of func by adding the module and function name.

Data#

API#

class nemo_evaluator.adapters.caching.diskcaching.Cache(directory=None, timeout=60, disk=Disk, **settings)[source]#

Disk and file backed cache.

Initialization

Initialize cache instance.

Parameters:
  • directory (str) – cache directory

  • timeout (float) – SQLite connection timeout

  • disk – Disk type or subclass for serialization

  • settings – any of DEFAULT_SETTINGS
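As a rough illustration of what the directory and timeout parameters control, the sketch below opens a SQLite file named cache.db (the documented DBNAME value) inside a directory and passes a connection timeout. It uses only the standard library. The helper open_cache_db and the temporary-directory fallback are assumptions made for this example, not part of this module, and the real constructor does more work (settings, tables, disk setup).

```python
import os
import sqlite3
import tempfile

DBNAME = "cache.db"  # same value as the DBNAME constant documented below


def open_cache_db(directory=None, timeout=60):
    """Hypothetical helper: open the SQLite file that would back a cache."""
    if directory is None:
        # Assumption for this sketch: fall back to a fresh temporary directory.
        directory = tempfile.mkdtemp(prefix="cache-sketch-")
    os.makedirs(directory, exist_ok=True)
    # timeout is how long SQLite waits on a locked database before giving up.
    con = sqlite3.connect(os.path.join(directory, DBNAME), timeout=timeout)
    # Create a Settings table so the database file is written to disk.
    con.execute("CREATE TABLE IF NOT EXISTS Settings (key TEXT PRIMARY KEY, value)")
    con.commit()
    return directory, con


directory, con = open_cache_db(timeout=5)
db_exists = os.path.exists(os.path.join(directory, DBNAME))
con.close()
```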

add(key, value, expire=None, read=False, tag=None, retry=False)[source]#

Add key and value item to cache.

Similar to set, but only add to cache if key not present.

Operation is atomic. Only one concurrent add operation for a given key will succeed.

When read is True, value should be a file-like object opened for reading in binary mode.

Raises Timeout when a database timeout occurs and retry is False (default).

Parameters:
  • key – key for item

  • value – value for item

  • expire (float) – seconds until the key expires (default None, no expiry)

  • read (bool) – read value as bytes from file (default False)

  • tag (str) – text to associate with key (default None)

  • retry (bool) – retry if database timeout occurs (default False)

Returns:

True if item was added

Raises:

Timeout – if database timeout occurs
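The difference between add and set can be sketched with plain sqlite3. This is not the module's implementation: the table layout and the helpers sketch_add and sketch_set are hypothetical, and the sketch relies on SQLite's INSERT OR IGNORE rather than whatever transaction logic the module uses internally.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Cache (key TEXT PRIMARY KEY, value BLOB)")


def sketch_add(con, key, value):
    """Insert only if key is absent; return True when a row was added."""
    with con:  # commit on success, roll back on error
        cur = con.execute(
            "INSERT OR IGNORE INTO Cache (key, value) VALUES (?, ?)",
            (key, value),
        )
    return cur.rowcount == 1


def sketch_set(con, key, value):
    """Insert or overwrite unconditionally."""
    with con:
        con.execute(
            "INSERT OR REPLACE INTO Cache (key, value) VALUES (?, ?)",
            (key, value),
        )
    return True


first = sketch_add(con, "greeting", b"hello")     # key absent: row is added
second = sketch_add(con, "greeting", b"ignored")  # key present: nothing changes
stored = con.execute(
    "SELECT value FROM Cache WHERE key = ?", ("greeting",)
).fetchone()[0]
```

Because the primary-key constraint is enforced by the database, only one of several concurrent inserts for the same key can succeed, which is the property the add description relies on.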

clear(retry=False)[source]#

Remove all items from cache.

Removing items is an iterative process. In each iteration, a subset of items is removed. Concurrent writes may occur between iterations.

If a Timeout occurs, the first element of the exception's args attribute is the number of items removed before the exception occurred.

Raises Timeout when a database timeout occurs and retry is False (default).

Parameters:

retry (bool) – retry if database timeout occurs (default False)

Returns:

count of rows removed

Raises:

Timeout – if database timeout occurs
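The partial-progress convention described above (the removed count carried in args[0]) can be illustrated with a self-contained sketch. The Timeout class here is a local stand-in, and remove_in_batches with its fail_on_batch switch is a hypothetical helper that simulates a timeout; neither is the module's code.

```python
class Timeout(Exception):
    """Local stand-in for the module's Timeout exception."""


def remove_in_batches(store, batch_size=2, fail_on_batch=None):
    """Delete items batch by batch; simulate a timeout on one batch."""
    removed = 0
    batch_no = 0
    while store:
        if batch_no == fail_on_batch:
            # Report how many items were already removed, as args[0].
            raise Timeout(removed)
        for key in list(store)[:batch_size]:
            del store[key]
            removed += 1
        batch_no += 1
    return removed


store = {n: b"x" for n in range(5)}
try:
    remove_in_batches(store, fail_on_batch=1)
except Timeout as exc:
    partial = exc.args[0]  # items removed before the simulated timeout

# A caller can simply try again to finish the job.
remaining = remove_in_batches(store)
```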

close()[source]#

Close database connection.

create_tag_index()[source]#

Create tag index on cache database.

Prefer initializing the cache with tag_index=True over calling this method.

Raises:

Timeout – if database timeout occurs

cull(retry=False)[source]#

Cull items from cache until volume is less than size limit.

Removing items is an iterative process. In each iteration, a subset of items is removed. Concurrent writes may occur between iterations.

If a Timeout occurs, the first element of the exception's args attribute is the number of items removed before the exception occurred.

Raises Timeout when a database timeout occurs and retry is False (default).

Parameters:

retry (bool) – retry if database timeout occurs (default False)

Returns:

count of items removed

Raises:

Timeout – if database timeout occurs

decr(key, delta=1, default=0, retry=False)[source]#

Decrement value by delta for item with key.

If key is missing and default is None, KeyError is raised. If key is missing and default is not None, default is used as the starting value.

Operation is atomic. All concurrent decrement operations will be counted individually.

Unlike Memcached, negative values are supported. Value may be decremented below zero.

Assumes value may be stored in a SQLite column. Most builds that target machines with 64-bit pointer widths will support 64-bit signed integers.

Raises Timeout when a database timeout occurs and retry is False (default).

Parameters:
  • key – key for item

  • delta (int) – amount to decrement (default 1)

  • default (int) – value if key is missing (default 0)

  • retry (bool) – retry if database timeout occurs (default False)

Returns:

new value for item

Raises:
  • KeyError – if key is not found and default is None

  • Timeout – if database timeout occurs

property directory#

Cache directory.

property disk#

Disk used for serialization.

drop_tag_index()[source]#

Drop tag index on cache database.

Raises:

Timeout – if database timeout occurs

evict(tag, retry=False)[source]#

Remove items with matching tag from cache.

Removing items is an iterative process. In each iteration, a subset of items is removed. Concurrent writes may occur between iterations.

If a Timeout occurs, the first element of the exception's args attribute is the number of items removed before the exception occurred.

Raises Timeout when a database timeout occurs and retry is False (default).

Parameters:
  • tag (str) – tag identifying items

  • retry (bool) – retry if database timeout occurs (default False)

Returns:

count of rows removed

Raises:

Timeout – if database timeout occurs
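Tag-based eviction in batches might look like the following sqlite3 sketch. The table layout, the index name Cache_tag, and the helper sketch_evict are assumptions for illustration only; the module's own schema and batch size may differ.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Cache (key TEXT PRIMARY KEY, tag TEXT, value BLOB)")
# An index on tag keeps tag lookups from scanning the whole table.
con.execute("CREATE INDEX IF NOT EXISTS Cache_tag ON Cache (tag)")
con.executemany(
    "INSERT INTO Cache (key, tag, value) VALUES (?, ?, ?)",
    [("a", "run-1", b"1"), ("b", "run-1", b"2"), ("c", "run-2", b"3")],
)
con.commit()


def sketch_evict(con, tag, batch_size=100):
    """Delete rows with a matching tag in batches; return the total removed."""
    total = 0
    while True:
        with con:  # each batch is its own short transaction
            cur = con.execute(
                "DELETE FROM Cache WHERE rowid IN "
                "(SELECT rowid FROM Cache WHERE tag = ? LIMIT ?)",
                (tag, batch_size),
            )
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


removed = sketch_evict(con, "run-1", batch_size=1)
left = [row[0] for row in con.execute("SELECT key FROM Cache ORDER BY key")]
```

Keeping each batch in its own transaction is what allows concurrent writes between iterations.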

expire(now=None, retry=False)[source]#

Remove expired items from cache.

Removing items is an iterative process. In each iteration, a subset of items is removed. Concurrent writes may occur between iterations.

If a Timeout occurs, the first element of the exception's args attribute is the number of items removed before the exception occurred.

Raises Timeout when a database timeout occurs and retry is False (default).

Parameters:
  • now (float) – current time (default None, time.time() used)

  • retry (bool) – retry if database timeout occurs (default False)

Returns:

count of items removed

Raises:

Timeout – if database timeout occurs
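The role of the now parameter is easiest to see with a small in-memory model. ExpiringStore below is a hypothetical stand-in, not this module's Cache: it accepts now on set purely so the example is deterministic, and it assumes an item counts as expired once its expiry time is strictly earlier than now.

```python
import time


class ExpiringStore:
    """Hypothetical in-memory stand-in that tracks an expiry time per key."""

    def __init__(self):
        self._items = {}  # key -> (value, expire_time or None)

    def set(self, key, value, expire=None, now=None):
        now = time.time() if now is None else now
        expire_time = None if expire is None else now + expire
        self._items[key] = (value, expire_time)

    def expire(self, now=None):
        """Remove items whose expiry time has passed; return how many."""
        now = time.time() if now is None else now
        stale = [
            key
            for key, (_, expire_time) in self._items.items()
            if expire_time is not None and expire_time < now
        ]
        for key in stale:
            del self._items[key]
        return len(stale)

    def keys(self):
        return sorted(self._items)


store = ExpiringStore()
store.set("short", b"a", expire=10, now=1000.0)
store.set("long", b"b", expire=60, now=1000.0)
store.set("forever", b"c", now=1000.0)  # no expiry
removed = store.expire(now=1030.0)      # only "short" is past its expiry
```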

get(
key,
default=None,
read=False,
expire_time=False,
tag=False,
retry=False,
)[source]#

Retrieve value from cache. If key is missing, return default.

incr(key, delta=1, default=0, retry=False)[source]#

Increment value by delta for item with key.

If key is missing and default is None, KeyError is raised. If key is missing and default is not None, default is used as the starting value.

Operation is atomic. All concurrent increment operations will be counted individually.

Assumes value may be stored in a SQLite column. Most builds that target machines with 64-bit pointer widths will support 64-bit signed integers.

Raises Timeout when a database timeout occurs and retry is False (default).

Parameters:
  • key – key for item

  • delta (int) – amount to increment (default 1)

  • default (int) – value if key is missing (default 0)

  • retry (bool) – retry if database timeout occurs (default False)

Returns:

new value for item

Raises:
  • KeyError – if key is not found and default is None

  • Timeout – if database timeout occurs
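A minimal sketch of an atomic read-modify-write counter, assuming a simple Counter table in SQLite. The function incr below is a hypothetical illustration of the semantics described above (default handling, KeyError, signed values); it is not the module's code.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so the
# sketch can issue BEGIN/COMMIT explicitly.
con = sqlite3.connect(":memory:", isolation_level=None)
con.execute("CREATE TABLE Counter (key TEXT PRIMARY KEY, value INTEGER)")


def incr(con, key, delta=1, default=0):
    """Add delta to the stored value inside one write transaction."""
    con.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = con.execute(
            "SELECT value FROM Counter WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            if default is None:
                raise KeyError(key)
            value = default + delta
            con.execute(
                "INSERT INTO Counter (key, value) VALUES (?, ?)", (key, value)
            )
        else:
            value = row[0] + delta
            con.execute(
                "UPDATE Counter SET value = ? WHERE key = ?", (value, key)
            )
    except BaseException:
        con.execute("ROLLBACK")
        raise
    con.execute("COMMIT")
    return value


first = incr(con, "hits")        # missing key starts from default 0
second = incr(con, "hits", 5)
third = incr(con, "hits", -10)   # values may go below zero
```

A decrement is the same operation with the sign of delta flipped.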

iterkeys(reverse=False)[source]#

Iterate Cache keys in database sort order.

    >>> cache = Cache()
    >>> for key in [4, 1, 3, 0, 2]:
    ...     cache[key] = key
    >>> list(cache.iterkeys())
    [0, 1, 2, 3, 4]
    >>> list(cache.iterkeys(reverse=True))
    [4, 3, 2, 1, 0]

Parameters:

reverse (bool) – reverse sort order (default False)

Returns:

iterator of Cache keys

reset(key, value=ENOVAL, update=True)[source]#

Reset key and value item from Settings table.

Use reset to update the value of Cache settings correctly. Cache settings are stored in the Settings table of the SQLite database. If update is False then no attempt is made to update the database.

If value is not given, it is reloaded from the Settings table. Otherwise, the Settings table is updated.

Settings with the disk_ prefix correspond to Disk attributes. Updating the value will change the unprefixed attribute on the associated Disk instance.

Settings with the sqlite_ prefix correspond to SQLite pragmas. Updating the value will execute the corresponding PRAGMA statement.

SQLite PRAGMA statements may be executed before the Settings table exists in the database by setting update to False.

Parameters:
  • key (str) – Settings key for item

  • value – value for item (optional)

  • update (bool) – update database Settings table (default True)

Returns:

updated value for item

Raises:

Timeout – if database timeout occurs
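The prefix convention for settings can be sketched as a small router. Everything here is illustrative: apply_setting is a hypothetical helper, a SimpleNamespace stands in for a Disk instance, and the sketch only accepts integer PRAGMA values. Persisting the value to the Settings table is left out.

```python
import sqlite3
from types import SimpleNamespace


def apply_setting(con, disk, key, value):
    """Hypothetical router: send a setting to a PRAGMA or a disk attribute."""
    if key.startswith("sqlite_"):
        pragma = key[len("sqlite_"):]
        if not pragma.isidentifier():
            raise ValueError(f"unexpected pragma name: {pragma!r}")
        # PRAGMA arguments cannot be bound as parameters, so this sketch
        # only accepts integer values and formats them directly.
        con.execute(f"PRAGMA {pragma} = {int(value)}")
    elif key.startswith("disk_"):
        # Strip the prefix and set the unprefixed attribute on the disk.
        setattr(disk, key[len("disk_"):], value)
    return value


con = sqlite3.connect(":memory:")
disk = SimpleNamespace(min_file_size=0)  # stand-in for a Disk instance
apply_setting(con, disk, "disk_min_file_size", 1024)
apply_setting(con, disk, "sqlite_cache_size", 4096)
cache_size = con.execute("PRAGMA cache_size").fetchone()[0]
```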

set(key, value, expire=None, read=False, tag=None, retry=False)[source]#

Set key and value item in cache.

When read is True, value should be a file-like object opened for reading in binary mode.

Raises Timeout when a database timeout occurs and retry is False (default).

Parameters:
  • key – key for item

  • value – value for item

  • expire (float) – seconds until item expires (default None, no expiry)

  • read (bool) – read value as bytes from file (default False)

  • tag (str) – text to associate with key (default None)

  • retry (bool) – retry if database timeout occurs (default False)

Returns:

True if item was set

Raises:

Timeout – if database timeout occurs
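When read is True the value is a binary file-like object rather than bytes. One plausible way to consume such an object is to copy it to disk in fixed-size chunks, as in this sketch. The helper copy_stream_to_file, the chunk size, and the file name are assumptions for illustration and are not taken from the module.

```python
import io
import os
import tempfile
from functools import partial


def copy_stream_to_file(reader, full_path, chunk_size=2**16):
    """Copy a binary file-like object to full_path; return bytes written."""
    size = 0
    with open(full_path, "wb") as writer:
        # iter(callable, sentinel) keeps calling read() until it returns b"".
        for chunk in iter(partial(reader.read, chunk_size), b""):
            writer.write(chunk)
            size += len(chunk)
    return size


payload = b"response-body " * 10_000
with tempfile.TemporaryDirectory() as directory:
    target = os.path.join(directory, "value.val")
    size = copy_stream_to_file(io.BytesIO(payload), target, chunk_size=4096)
    with open(target, "rb") as check:
        round_trip = check.read()
```

Reading in chunks keeps memory use bounded even for large response bodies.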

stats(enable=True, reset=False)[source]#

Return cache statistics hits and misses.

Parameters:
  • enable (bool) – enable collecting statistics (default True)

  • reset (bool) – reset hits and misses to 0 (default False)

Returns:

(hits, misses)

property timeout#

SQLite connection timeout value in seconds.

touch(key, expire=None, retry=False)[source]#

Touch key in cache and update expire time.

Raises Timeout when a database timeout occurs and retry is False (default).

Parameters:
  • key – key for item

  • expire (float) – seconds until item expires (default None, no expiry)

  • retry (bool) – retry if database timeout occurs (default False)

Returns:

True if key was touched

Raises:

Timeout – if database timeout occurs

transact(retry=False)[source]#

Context manager to perform a transaction by locking the cache.

While the cache is locked, no other write operation is permitted. Transactions should therefore be as short as possible. Read and write operations performed in a transaction are atomic. Read operations may occur concurrently with a transaction.

Transactions may be nested but may not be shared between threads.

Raises Timeout when a database timeout occurs and retry is False (default).

    >>> cache = Cache()
    >>> with cache.transact():  # Atomically increment two keys.
    ...     _ = cache.incr('total', 123.4)
    ...     _ = cache.incr('count', 1)
    >>> with cache.transact():  # Atomically calculate average.
    ...     average = cache['total'] / cache['count']
    >>> average
    123.4

Parameters:

retry (bool) – retry if database timeout occurs (default False)

Returns:

context manager for use in with statement

Raises:

Timeout – if database timeout occurs

volume()[source]#

Return estimated total size of cache on disk.

Returns:

size in bytes

class nemo_evaluator.adapters.caching.diskcaching.Constant[source]#

Bases: tuple

Pretty display of immutable constant.

nemo_evaluator.adapters.caching.diskcaching.DBNAME#

'cache.db'

nemo_evaluator.adapters.caching.diskcaching.DEFAULT_SETTINGS#

None

class nemo_evaluator.adapters.caching.diskcaching.Disk(directory, min_file_size=0)[source]#

Cache key and value serialization for SQLite database and files.

Initialization

Initialize disk instance.

Parameters:
  • directory (str) – directory path

  • min_file_size (int) – minimum size for file use

  • json_indent (int) – JSON indentation for serialization

fetch(mode, filename, value, read)[source]#

Convert fields mode, filename, and value from Cache table to value.

Parameters:
  • mode (int) – value mode (one of the module's MODE_* constants)

  • filename (str) – filename of corresponding value

  • value – database value

  • read (bool) – when True, return an open file handle

Returns:

corresponding Python value

Raises:

IOError – if the value cannot be read

filename(key=UNKNOWN, value=UNKNOWN)[source]#

Return filename and full-path tuple for file storage.

The filename is a randomly generated 28-character hexadecimal string followed by a ".val" suffix. Two levels of sub-directories are used to keep individual directories small, because lookups in directories with many files may be slow on older filesystems.

The default implementation ignores the key and value parameters.

In some scenarios, for example Cache.push in upstream diskcache (diskcache.Cache.push), the key or value may not be known when the item is stored in the cache.

Parameters:
  • key – key for item (default UNKNOWN)

  • value – value for item (default UNKNOWN)
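The naming scheme described above can be sketched with the standard library. The helper sketch_filename is hypothetical; it assumes 16 random bytes rendered as 32 hex characters, with the first four characters used for the two directory levels and the remaining 28 for the file name. It does not create any directories.

```python
import os


def sketch_filename(directory):
    """Return (relative filename, full path) for a new value file."""
    hex_name = os.urandom(16).hex()                      # 32 hex characters
    sub_dir = os.path.join(hex_name[:2], hex_name[2:4])  # two directory levels
    name = hex_name[4:] + ".val"                         # 28 hex chars + suffix
    filename = os.path.join(sub_dir, name)
    full_path = os.path.join(directory, filename)
    return filename, full_path


filename, full_path = sketch_filename("cache-dir")
level_one, level_two, name = filename.split(os.sep)
```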

get(key, raw)[source]#

Convert fields key and raw from Cache table to key.

Parameters:
  • key – database key to convert

  • raw (bool) – flag indicating raw database storage

Returns:

corresponding Python key

hash(key)[source]#

Compute portable hash for key.

Parameters:

key – key to hash

Returns:

hash value

put(key)[source]#

Convert key to fields key and raw for Cache table.

Parameters:

key – key to convert

Returns:

(database key, raw boolean) pair

remove(file_path)[source]#

Remove a file given by file_path.

This method is cross-thread and cross-process safe. If an OSError occurs, it is suppressed.

Parameters:

file_path (str) – relative path to file

store(value, read, key=UNKNOWN)[source]#

Convert value to fields size, mode, filename, and value for Cache table.

Parameters:
  • value – value to convert

  • read (bool) – True when value is file-like object

  • key – key for item (default UNKNOWN)

Returns:

(size, mode, filename, value) tuple for Cache table
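To show how min_file_size could influence where a value ends up, here is a minimal sketch that keeps small byte values inline and writes larger ones to a file. The helper sketch_store and its three-element return value are simplifications made for this example: the documented method returns a (size, mode, filename, value) tuple, and the mode handling is deliberately omitted here.

```python
import os
import tempfile


def sketch_store(directory, value, min_file_size=0):
    """Hypothetical: keep small byte values inline, write larger ones to a file.

    Returns (size, filename, inline_value); exactly one of filename and
    inline_value is None.
    """
    if len(value) < min_file_size:
        return len(value), None, value
    # mkstemp picks a unique name; the real naming scheme is not modelled.
    fd, full_path = tempfile.mkstemp(suffix=".val", dir=directory)
    with os.fdopen(fd, "wb") as writer:
        writer.write(value)
    return len(value), os.path.basename(full_path), None


with tempfile.TemporaryDirectory() as directory:
    small = sketch_store(directory, b"tiny", min_file_size=1024)
    size, filename, inline = sketch_store(
        directory, b"x" * 2048, min_file_size=1024
    )
    with open(os.path.join(directory, filename), "rb") as reader:
        on_disk = reader.read()
```

Under this assumption, the default min_file_size of 0 sends every byte value to a file.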

nemo_evaluator.adapters.caching.diskcaching.ENOVAL#

'Constant(…)'

nemo_evaluator.adapters.caching.diskcaching.EVICTION_POLICY#

None

exception nemo_evaluator.adapters.caching.diskcaching.EmptyDirWarning[source]#

Bases: UserWarning

Warning used by Cache.check for empty directories.

class nemo_evaluator.adapters.caching.diskcaching.JSONDisk(directory, compress_level=1, **kwargs)[source]#

Bases: nemo_evaluator.adapters.caching.diskcaching.Disk

Cache key and value using JSON serialization with zlib compression.

Initialization

Initialize JSON disk instance.

Keys and values are compressed using the zlib library. The compress_level is an integer from 0 to 9 controlling the level of compression; 1 is fastest and produces the least compression, 9 is slowest and produces the most compression, and 0 is no compression.

Parameters:
  • directory (str) – directory path

  • compress_level (int) – zlib compression level (default 1)

  • kwargs – super class arguments
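The combination of JSON serialization and zlib compression can be sketched with the standard library alone. The helper names encode and decode are hypothetical, and the UTF-8 encoding step is an assumption; the sketch only demonstrates the round trip and the effect of compress_level.

```python
import json
import zlib


def encode(value, compress_level=1):
    """Serialize to JSON text, encode as UTF-8, then compress with zlib."""
    return zlib.compress(json.dumps(value).encode("utf-8"), compress_level)


def decode(blob):
    """Reverse of encode: decompress, decode UTF-8, parse JSON."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


record = {"status": 200, "body": "token " * 500}
fast = encode(record, compress_level=1)
stored_only = encode(record, compress_level=0)  # no compression
smallest = encode(record, compress_level=9)
restored = decode(fast)
```

Note that JSON round trips do not preserve every Python type. For example, tuples come back as lists and non-string dictionary keys come back as strings.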

fetch(mode, filename, value, read)[source]#

Convert fields mode, filename, and value from Cache table to value.

Parameters:
  • mode (int) – value mode (one of the module's MODE_* constants)

  • filename (str) – filename of corresponding value

  • value – database value

  • read (bool) – when True, return an open file handle

Returns:

corresponding Python value

Raises:

IOError – if the value cannot be read

get(key, raw)[source]#

Convert fields key and raw from Cache table to key.

Parameters:
  • key – database key to convert

  • raw (bool) – flag indicating raw database storage

Returns:

corresponding Python key

put(key)[source]#

Convert key to fields key and raw for Cache table.

Parameters:

key – key to convert

Returns:

(database key, raw boolean) pair

store(value, read, key=UNKNOWN)[source]#

Convert value to fields size, mode, filename, and value for Cache table.

Parameters:
  • value – value to convert

  • read (bool) – True when value is file-like object

  • key – key for item (default UNKNOWN)

Returns:

(size, mode, filename, value) tuple for Cache table

nemo_evaluator.adapters.caching.diskcaching.METADATA#

None

nemo_evaluator.adapters.caching.diskcaching.MODE_JSON#

2

nemo_evaluator.adapters.caching.diskcaching.MODE_NONE#

0

nemo_evaluator.adapters.caching.diskcaching.MODE_RAW#

1

exception nemo_evaluator.adapters.caching.diskcaching.Timeout[source]#

Bases: Exception

Database timeout expired.

nemo_evaluator.adapters.caching.diskcaching.UNKNOWN#

'Constant(…)'

exception nemo_evaluator.adapters.caching.diskcaching.UnknownFileWarning[source]#

Bases: UserWarning

Warning used by Cache.check for unknown files.

nemo_evaluator.adapters.caching.diskcaching.args_to_key(base, args, kwargs, typed, ignore)[source]#

Create cache key out of function arguments.

Parameters:
  • base (tuple) – base of key

  • args (tuple) – function arguments

  • kwargs (dict) – function keyword arguments

  • typed (bool) – include types in cache key

  • ignore (set) – positional or keyword args to ignore

Returns:

cache key tuple
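One way such a key could be assembled from the documented parameters is sketched below. The function sketch_args_to_key is a hypothetical reimplementation written for illustration. It assumes that ignore holds positional indexes and keyword names, that keyword arguments are sorted by name, and that a None marker separates positional from keyword parts. The module's actual key layout may differ.

```python
def sketch_args_to_key(base, args, kwargs, typed=False, ignore=frozenset()):
    """Build a hashable key tuple from a call's arguments."""
    # Drop ignored positional arguments (matched by index).
    args = tuple(arg for index, arg in enumerate(args) if index not in ignore)
    # A None marker separates positional from keyword arguments.
    key = base + args + (None,)
    # Drop ignored keyword arguments (matched by name); sort for stable order.
    items = sorted((k, v) for k, v in kwargs.items() if k not in ignore)
    for item in items:
        key += item
    if typed:
        # Appending types makes f(1) and f(1.0) produce different keys.
        key += tuple(type(arg) for arg in args)
        key += tuple(type(value) for _, value in items)
    return key


plain = sketch_args_to_key(("mod.f",), (1, 2), {"b": 3, "a": 4})
reordered = sketch_args_to_key(("mod.f",), (1, 2), {"a": 4, "b": 3})
trimmed = sketch_args_to_key(("mod.f",), (1, 2), {"b": 3, "a": 4}, ignore={0, "b"})
typed_int = sketch_args_to_key(("mod.f",), (1,), {}, typed=True)
typed_float = sketch_args_to_key(("mod.f",), (1.0,), {}, typed=True)
```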

nemo_evaluator.adapters.caching.diskcaching.full_name(func)[source]#

Return full name of func by adding the module and function name.
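A minimal sketch of how such a name could be built, assuming it joins the function's __module__ and __qualname__ attributes with a dot. The helper sketch_full_name is hypothetical and may not match the module's exact output.

```python
import json


def sketch_full_name(func):
    """Join the defining module and the qualified name of func."""
    return func.__module__ + "." + func.__qualname__


name_function = sketch_full_name(json.dumps)
name_method = sketch_full_name(json.JSONEncoder.encode)
```

Using __qualname__ rather than __name__ keeps methods of different classes apart, since the class name becomes part of the result.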