Changelog
v0.4.2 (August 8, 2025)
- Add `cleanup` method to `ResourceTracker` to clean up temp files and background processes.
v0.4.1 (August 8, 2025)
Introduce native R support. Example usage:
library(resource_tracker)
tracker <- ResourceTracker$new()
tracker$wait_for_samples(1)
tracker$stats()
tracker$recommend_resources()
tracker$recommend_server()
tracker$report()$browse()
Find more details in the R integration docs.
Additional changes:
- Split the documentation into multiple pages and extend it with more details on the integrations.
v0.4.0 (August 6, 2025)
Refactoring and cleanup release, with the main focus on extracting the reporting features from the Metaflow extension and supporting them in standalone use as well.
Main changes:
- Much better support for standalone use of `ResourceTracker` via the `stats`, `recommend_resources`, `recommend_server` and `report` methods
More details:
- Rename `PidTracker` to `ProcessTracker` to better reflect its purpose. `PidTracker` is still available as an alias that is to be deprecated in the future.
- Rename `get_pid_stats` to `get_process_stats` and related references in `ResourceTracker` and `ProcessTracker`.
- Rename the `pid_tracker` and `system_tracker` properties of `ResourceTracker` to `process_metrics` and `system_metrics` respectively. All related references were also updated, e.g. in the Metaflow extension and docs.
- Rename process-related helpers in the ProcFS implementation from `pid` prefix to `process` prefix.
- Fix `SystemTracker` and `ProcessTracker` to not print dummy stats on start when header is disabled
- Add optional `start_time` parameter to `SystemTracker` and `ProcessTracker`
- Update `ResourceTracker` to start tracking at the nearest interval in the future, syncing `SystemTracker` and `ProcessTracker`
- Fix `SystemTracker` and `ProcessTracker` to not drift by a few nanoseconds in every interval
- Move cloud and server discovery along with the server allocation check to the `ResourceTracker` class from the Metaflow-specific decorators
- Extract serialization and deserialization of `ResourceTracker` from the Metaflow extension into the `ResourceTracker` class with `snapshot` and `dump(s)`/`load(s)` methods
- Round timestamp and user/system time to reasonable (6/4) decimal places
- Rework internal data structure of `TinyDataFrame` to use a list of lists instead of a dictionary of lists to support more efficient slicing and column renaming
- Add `get_combined_metrics` method to `ResourceTracker` to combine `process_metrics` and `system_metrics` into a single data frame, optionally with all metrics converted to bytes, and columns renamed to use human-friendly names
- Add `stats` method to `TinyDataFrame` to compute on-demand statistics on columns
- Add `stats` method to `ResourceTracker` to compute statistics on the combined metrics
- Add minimal support for Handlebars-like templates in the `render_template` function
- Add `report` method to `ResourceTracker` to generate a report in HTML format, and use that from the Metaflow extension
- Add `recommend_resources` and `recommend_server` methods to `ResourceTracker`
- Windows-specific reliability improvements
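The interval alignment and drift fixes in this release can be illustrated with a minimal sketch. This is not the package's implementation: `next_boundary` and `tick_times` are hypothetical helpers, and the sketch assumes every tick is derived from a fixed start time rather than from the previous wake-up, so small timing errors cannot accumulate.

```python
import math


def next_boundary(now: float, interval: float) -> float:
    """Return the first multiple of `interval` strictly after `now`."""
    return (math.floor(now / interval) + 1) * interval


def tick_times(start_time: float, interval: float, count: int) -> list:
    """Derive each tick from the fixed start time, never from the previous tick."""
    return [start_time + i * interval for i in range(count)]


# Two hypothetical trackers launched at slightly different moments
# end up on the same schedule once aligned to the next boundary.
start_a = next_boundary(1000.2, 1.0)
start_b = next_boundary(1000.7, 1.0)
schedule = tick_times(start_a, 1.0, 3)
```

Passing the same aligned start time to both trackers is one way an optional `start_time` parameter could keep their samples in sync.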
Related breaking changes:
- Historical data collected before v0.4.0 is not compatible with the new `ResourceTracker` class, and will be discarded
- `TinyDataFrame` is no longer made available from `resource_tracker` directly, but from the `resource_tracker.tiny_data_frame` submodule
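As a rough illustration of why a list-of-lists layout can make slicing and column renaming cheaper than a dictionary of lists, consider the sketch below. `ListOfListsTable` is a hypothetical stand-in, not the real `TinyDataFrame`, and it assumes the inner lists are rows stored next to a separate header list.

```python
class ListOfListsTable:
    """Hypothetical stand-in, NOT the real TinyDataFrame: a header plus row lists."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [list(row) for row in rows]

    def rename(self, mapping):
        # Renaming only touches the header; the row data stays in place.
        self.columns = [mapping.get(name, name) for name in self.columns]

    def slice_rows(self, start, stop):
        # Row slicing is a single list slice instead of one slice per column.
        return ListOfListsTable(self.columns, self.rows[start:stop])

    def column(self, name):
        index = self.columns.index(name)
        return [row[index] for row in self.rows]


table = ListOfListsTable(["ts", "cpu"], [[1, 0.5], [2, 0.7], [3, 0.9]])
table.rename({"cpu": "cpu_usage"})
tail = table.slice_rows(1, 3)
```

With a dictionary of lists, the same rename would have to rebuild the mapping to preserve column order, and a row slice would have to slice every column separately.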
v0.3.1 (May 30, 2025)
- Generate card for failed step
- Note failed step status in card
- Standardize timestamp format and timezone
v0.3.0 (March 27, 2025)
- Extract background process management and related complexities from the `track_resources` decorator into the `ResourceTracker` class to track resource usage of a process and/or the system in a non-blocking way
- Add unit tests for the `ResourceTracker` class, including checks for deadlocks and partially started trackers
- Keep test HTML card as GHA artifacts for manual inspection
- Improve documentation
v0.2.1 (March 21, 2025)
- Fix CPU/GPU recommendations to not always round up
- Improve error message on missing historical data
- Improve documentation
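The rounding fix above can be pictured with a small sketch. The actual recommendation logic is not described in this changelog, so `round_recommendation` and its `minimum` parameter are hypothetical: the sketch assumes recommendations are rounded to the nearest whole unit, with a floor so that a lightly loaded step is never recommended zero vCPUs.

```python
import math


def round_recommendation(average_usage: float, minimum: int = 1) -> int:
    """Round to the nearest whole unit (halves round up), never below `minimum`."""
    nearest = math.floor(average_usage + 0.5)
    return max(minimum, nearest)


# Always rounding up would turn 1.2 vCPUs into 2; rounding to nearest keeps 1.
cpu_recommendation = round_recommendation(1.2)
# A hypothetical GPU recommendation may legitimately be zero.
gpu_recommendation = round_recommendation(0.2, minimum=0)
```

`math.floor(x + 0.5)` is used instead of the built-in `round` because Python's `round` resolves exact halves to the nearest even integer.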
v0.2.0 (March 21, 2025)
Relatively major package rewrite to support alternative tracker implementations (other than directly reading from /proc). No breaking changes in the public API on Linux.
- Add tracker implementation using `psutil` to support macOS and Windows
- Fix data issues with the `/proc` implementation after validating with the `psutil` version (e.g. number of processes reported)
- Refactor code for better maintainability
- Add additional unit tests:
    - Tracker implementation using `procfs`
    - Tracker implementation using `psutil`
    - Consistency between tracker implementations
    - Metaflow decorators
- Extend CI/CD pipeline:
    - Test on Linux, macOS, and Windows
    - Test multiple Python versions (3.9, 3.10, 3.11, 3.12, 3.13)
- Improve documentation
v0.1.2 (March 18, 2025)
- Add experimental psutil support
- Add server info card for operating system
v0.1.1 (March 17, 2025)
- Fix rounding down recommended vCPUs with <0.5 load
- Add info popups with more details and disclaimers for recommendations
- Add detection for shared server environments
- Add potential cost savings card
- Improve documentation
v0.1.0 (March 12, 2025)
Initial PyPI release of resource-tracker with the following features:
- Detect if the system is running on a cloud provider, and if so, detect the provider, region, and instance type
- Detect main server hardware (CPU count, memory amount, disk space, GPU count and VRAM amount)
- Track system-wide resource usage:
    - Process count
    - CPU usage (user + system time, relative vCPU percentage)
    - Memory usage (total, free, used, buffers, cached, active anon, inactive anon pages)
    - Disk I/O (read and write bytes)
    - Disk space usage (total, used, free)
    - Network I/O (receive and transmit bytes)
    - GPU and VRAM usage (using `nvidia-smi`)
- Track resource usage of a process and its descendant processes:
    - Descendant process count
    - CPU usage (user + system time, relative vCPU percentage)
    - Memory usage (based on proportional set sizes)
    - Disk I/O (read and write bytes)
    - GPU and VRAM usage (using `nvidia-smi pmon`)
- Add Metaflow plugin for tracking resource usage of a step:
    - Track process and system resource usage for the duration of the step
    - Generate a card with the resource usage data
    - Suggest `@resources` decorator for future runs
    - Find cheapest cloud instance type for a step