This module contains a number of lightweight tools, developed for managing data analytics and machine learning projects.
Install using:
pip install -U cdxcoreDocumentation can be found here: https://quantitative-research.de/docs/cdxcore. cdxcore is best used with Python 3.12 and above, but is tested vs Python 3.10 onwards.
- Dynamic plotting: simple live/animated plots built on Matplotlib.
- Config management: validated, discoverable configurations with automatic help.
- Versioning & caching: code-versioned I/O and reproducible hashing for pipelines.
- PrettyObject: dictionary-like objects that allow attribute access.
- Utilities: formatting helpers, binary I/O, shared-memory arrays, and more.
-
cdxcore.dynaplotis a framework for simple dynamic graphs withmatplotlib. It has a simple methodology for animated updates for graphs (e.g. during training runs), and allows generation of plot layouts without knowing upfront the number of plots (e.g. for plotting a list of features). -
cdxcore.configallows robust management of configurations. It automates help, validation checking, and detects misspelled configuration arguments.from cdxcore.config import Config, Int, Float class Network(object): def __init__( self, config ): self.depth = config("depth", 1, Int>0, "Depth of the network") self.width = config("width", 1, Int>0, "Width of the network") self.activation = config("activation", "selu", str, "Activation function") config.done() # see below config = Config() config.network.depth = 10 config.network.width = 100 config.network.activation = 'relu' network = Network(config.network) config.done()
-
cdxcore.subdirwraps various file and directory functions into convenient objects. Useful if files have common extensions.from cdxcore.subdir import SubDir import numpy as np root = SubDir("!") # current temp directory subdir = root("test") # sub-directory 'test' subdir.write("data", np.zeros((10,2))) data = subdir.read("data")
-
Caching:
SubDirsupports code-versioned file i/o which is used by@cdxcore.subdir.SubDir.cachefor an efficient code-versioned caching protocol for functions and objects:
from cdxcore.subdir import SubDir
cache = SubDir("!/.cache;*.bin")
@cache.cache("0.1")
def f(x,y):
return x*y
_ = f(1,2) # function gets computed and the result cached
_ = f(1,2) # restore result from cache
_ = f(2,2) # different parameters: compute and store result-
Code versioning is implemented in
cdxcore.version:from cdxbasics.version import version @version("0.0.1") def f(x): return x print( f.version.full ) # -> 0.0.1
-
Hashing (which is used for caching above) is implemented in
cdxcore.uniquehash:class A(object): def __init__(self, x): self.x = x self._y = x*2 # protected member will not be hashed by default from cdxcore.uniquehash import UniqueHash uniqueHash = UniqueHash(length=12) a = A(2) print( uniqueHash(a) ) # --> "2d1dc3767730"
-
cdxcore.prettyprovides aPrettyObjectclass whose objects operate like dictionaries. This is for users who prefer attribute.notation over item access when building structured output.from cdxbasics.prettydict import PrettyObject pdct = PrettyObject(z=1) pdct.num_samples = 1000 pdct.num_batches = 100 pdct.method = "signature"
-
cdxcore.verboseprovides user-controllable context output for providing progress updates to users. -
cdxcore.utiloffers a number of utility functions such as standard formatting for dates, big numbers, lists, dictionaries etc. -
cdxcore.npioprovides a low level binary i/o interface for numpy files. -
cdxcore.npshmprovides shared memory numpy arrays.
