Changelog
v0.5.0 (April 10, 2026)
- Add optional and configurable metric streaming to the Spare Cores Sentinel: pass sentinel_token to ResourceTracker (or set the SENTINEL_API_TOKEN env var) to periodically upload gzipped CSV batches to the Sentinel object storage, with automatic STS credential refresh and inline CSV fallback for short runs.
- Add sentinel_api module: HTTP client for registering runs, refreshing credentials, and submitting final data.
- Add s3_upload module: lightweight S3 uploader using temporary STS credentials.
- Fix macOS disk space reporting: correctly use the APFS data volume (/System/Volumes/Data) instead of summing all partitions; switch disk space calculation from binary (1024^3) to SI (10^9).
- Fix memory column naming and units: rename memory → memory_mib (process), memory_free/memory_used/etc. → *_mib suffix (system), gpu_vram → gpu_vram_mib; all memory metrics now in MiB.
- Fix process I/O column naming: rename read_bytes/write_bytes → disk_read_bytes/disk_write_bytes.
- Add offset parameter to get_combined_metrics for incremental reads.
- Add context manager support to ResourceTracker.
Breaking changes:
- The autostart parameter of ResourceTracker now defaults to False, but the tracker is still auto-started when used in Metaflow or as a context manager.
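The disk space fix listed for v0.5.0 changes the divisor from 1024^3 to 10^9, which shifts reported values by roughly 7% at the gigabyte scale. A minimal sketch of the difference follows; the helper names are hypothetical and are not part of the resource_tracker API.

```python
# Hypothetical helpers for illustration only; not part of resource_tracker.
GIB = 1024**3  # binary gibibyte (the 1024^3 divisor)
GB = 10**9     # SI gigabyte (the 10^9 divisor)


def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to binary gibibytes (GiB)."""
    return n_bytes / GIB


def bytes_to_gb(n_bytes: int) -> float:
    """Convert a byte count to SI gigabytes (GB)."""
    return n_bytes / GB


if __name__ == "__main__":
    disk_bytes = 500 * GB  # a disk marketed as "500 GB"
    print(round(bytes_to_gib(disk_bytes), 2))  # 465.66
    print(round(bytes_to_gb(disk_bytes), 2))   # 500.0
```

The same byte count therefore reads as 465.66 with the binary divisor and 500.0 with the SI one, so disk space values recorded before and after the upgrade are not directly comparable.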
v0.4.2 (August 8, 2025)
- Add cleanup method to ResourceTracker to clean up temp files and background processes.
v0.4.1 (August 8, 2025)
Introduce native R support. Example usage:
library(resource_tracker)
tracker <- ResourceTracker$new()
tracker$wait_for_samples(1)
tracker$stats()
tracker$recommend_resources()
tracker$recommend_server()
tracker$report()$browse()
Find more details in the R integration docs.
Additional changes:
- Split the documentation into multiple pages and extend it with more details on the integrations.
v0.4.0 (August 6, 2025)
Refactoring and cleanup release, with the main focus on extracting the reporting features from the Metaflow extension and supporting them in standalone use as well.
Main changes:
- Much better support for standalone use of ResourceTracker via the stats, recommend_resources, recommend_server and report methods
More details:
- Rename PidTracker to ProcessTracker to better reflect its purpose. PidTracker is still available as an alias that is to be deprecated in the future.
- Rename get_pid_stats to get_process_stats and related references in ResourceTracker and ProcessTracker.
- Rename the pid_tracker and system_tracker properties of ResourceTracker to process_metrics and system_metrics respectively. All related references were also updated, e.g. in the Metaflow extension and docs.
- Rename process-related helpers in the ProcFS implementation from pid prefix to process prefix.
- Fix SystemTracker and ProcessTracker to not print dummy stats on start when header is disabled
- Add optional start_time parameter to SystemTracker and ProcessTracker
- Update ResourceTracker to start tracking at the nearest interval in the future, syncing SystemTracker and ProcessTracker
- Fix SystemTracker and ProcessTracker to not drift by a few nanoseconds in every interval
- Move cloud and server discovery along with the server allocation check to the ResourceTracker class from the Metaflow-specific decorators
- Extract serialization and deserialization of ResourceTracker from the Metaflow extension into the ResourceTracker class with snapshot and dump(s)/load(s) methods
- Round timestamp and user/system time to reasonable (6/4) decimal places
- Rework internal data structure of TinyDataFrame to use a list of lists instead of a dictionary of lists to support more efficient slicing and column renaming
- Add get_combined_metrics method to ResourceTracker to combine process_metrics and system_metrics into a single data frame, optionally with all metrics converted to bytes, and columns renamed to use human-friendly names
- Add stats method to TinyDataFrame to compute on-demand statistics on columns
- Add stats method to ResourceTracker to compute statistics on the combined metrics
- Add minimal support for Handlebars-like templates in the render_template function
- Add report method to ResourceTracker to generate a report in HTML format, and use that from the Metaflow extension
- Add recommend_resources and recommend_server methods to ResourceTracker
- Windows-specific reliability improvements
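Two of the v0.4.0 items concern timing: starting at the nearest interval in the future, and not drifting from one interval to the next. The sketch below shows one common way to get both properties, by deriving every tick from a fixed, aligned start time. It is an illustration under assumptions, not necessarily how ResourceTracker implements it; next_interval_start and tick_times are hypothetical names.

```python
import math
import time


def next_interval_start(now: float, interval: float) -> float:
    """Return the first multiple of `interval` strictly after `now`."""
    return (math.floor(now / interval) + 1) * interval


def tick_times(start: float, interval: float, count: int) -> list[float]:
    """Compute tick times as start + i * interval.

    Deriving each tick from the fixed start (rather than adding the interval
    to the previous wake-up time) keeps small errors from accumulating.
    """
    return [start + i * interval for i in range(count)]


if __name__ == "__main__":
    # Two trackers given the same start and interval share identical ticks.
    start = next_interval_start(time.time(), interval=0.2)
    for tick in tick_times(start, interval=0.2, count=3):
        time.sleep(max(0.0, tick - time.time()))
        print(f"sample at {time.time():.3f} (scheduled {tick:.3f})")
```

Sleeping until an absolute target time means a late wake-up in one interval does not push back the following ones.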
Related breaking changes:
- Historical data collected before v0.4.0 is not compatible with the new ResourceTracker class, and will be discarded
- TinyDataFrame is no longer made available from resource_tracker directly, but from the resource_tracker.tiny_data_frame submodule
v0.3.1 (May 30, 2025)
- Generate card for failed step
- Note failed step status in card
- Standardize timestamp format and timezone
v0.3.0 (March 27, 2025)
- Extract background process management and related complexities from the track_resources decorator into the ResourceTracker class to track resource usage of a process and/or the system in a non-blocking way
- Add unit tests for the ResourceTracker class, including checks for deadlocks and partially started trackers
- Keep test HTML card as GHA artifacts for manual inspection
- Improve documentation
v0.2.1 (March 21, 2025)
- Fix CPU/GPU recommendations to not always round up
- Improve error message on missing historical data
- Improve documentation
v0.2.0 (March 21, 2025)
Relatively major package rewrite to support alternative tracker implementations (other than directly reading from /proc). No breaking changes in the public API on Linux.
- Add tracker implementation using psutil to support macOS and Windows
- Fix data issues with the /proc implementation after validating with the psutil version (e.g. number of processes reported)
- Refactor code for better maintainability
- Add additional unit tests:
  - Tracker implementation using procfs
  - Tracker implementation using psutil
  - Consistency between tracker implementations
  - Metaflow decorators
- Extend CI/CD pipeline:
  - Test on Linux, macOS, and Windows
  - Test multiple Python versions (3.9, 3.10, 3.11, 3.12, 3.13)
- Improve documentation
v0.1.2 (March 18, 2025)
- Add experimental psutil support
- Add server info card for operating system
v0.1.1 (March 17, 2025)
- Fix rounding down recommended vCPUs with <0.5 load
- Add info popups with more details and disclaimers for recommendations
- Add detection for shared server environments
- Add potential cost savings card
- Improve documentation
v0.1.0 (March 12, 2025)
Initial PyPI release of resource-tracker with the following features:
- Detect if the system is running on a cloud provider, and if so, detect the provider, region, and instance type
- Detect main server hardware (CPU count, memory amount, disk space, GPU count and VRAM amount)
- Track system-wide resource usage:
  - Process count
  - CPU usage (user + system time, relative vCPU percentage)
  - Memory usage (total, free, used, buffers, cached, active anon, inactive anon pages)
  - Disk I/O (read and write bytes)
  - Disk space usage (total, used, free)
  - Network I/O (receive and transmit bytes)
  - GPU and VRAM usage (using nvidia-smi)
- Track resource usage of a process and its descendant processes:
  - Descendant process count
  - CPU usage (user + system time, relative vCPU percentage)
  - Memory usage (based on proportional set sizes)
  - Disk I/O (read and write bytes)
  - GPU and VRAM usage (using nvidia-smi pmon)
- Add Metaflow plugin for tracking resource usage of a step:
  - Track process and system resource usage for the duration of the step
  - Generate a card with the resource usage data
  - Suggest @resources decorator for future runs
  - Find cheapest cloud instance type for a step
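Process memory in v0.1.0 is described as based on proportional set sizes (PSS). The changelog does not say which procfs file is read; the sketch below assumes /proc/<pid>/smaps_rollup (available on Linux 4.14 and later), and the function names are hypothetical rather than part of resource_tracker.

```python
import re

# Matches the "Pss:" line of /proc/<pid>/smaps_rollup, e.g. "Pss:   1234 kB".
# Lines such as "Pss_Anon:" are deliberately not matched.
_PSS_LINE = re.compile(r"^Pss:\s+(\d+)\s+kB$", re.MULTILINE)


def parse_pss_kib(smaps_rollup_text: str) -> int:
    """Extract the proportional set size (in KiB) from smaps_rollup content."""
    match = _PSS_LINE.search(smaps_rollup_text)
    if match is None:
        raise ValueError("no Pss line found")
    return int(match.group(1))


def read_pss_kib(pid: int) -> int:
    """Read the PSS of a live process (Linux only, needs access to /proc/<pid>)."""
    with open(f"/proc/{pid}/smaps_rollup") as f:
        return parse_pss_kib(f.read())


# Made-up sample content in the smaps_rollup layout, for demonstration.
SAMPLE = """\
00400000-7ffd5e5f9000 ---p 00000000 00:00 0                              [rollup]
Rss:                5120 kB
Pss:                2048 kB
Pss_Anon:           1024 kB
"""

if __name__ == "__main__":
    print(parse_pss_kib(SAMPLE))  # 2048
```

PSS divides each shared page by the number of processes mapping it, so summing PSS over a process and its descendants avoids counting shared memory multiple times, which a sum of resident set sizes would do.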