Changelog
dev
v0.4.0 (August 1, 2025)
Refactoring and cleanup release with the main focus on extracting the reporting features from the Metaflow extension and supporting them in standalone use as well.
Main changes:
- Much better support for standalone use of ResourceTracker via the stats, recommend_resources, recommend_server and report methods
More details:
- Rename PidTracker to ProcessTracker to better reflect its purpose. PidTracker is still available as an alias that is to be deprecated in the future.
- Rename get_pid_stats to get_process_stats and related references in ResourceTracker and ProcessTracker.
- Rename the pid_tracker and system_tracker properties of ResourceTracker to process_metrics and system_metrics respectively. All related references were also updated, e.g. in the Metaflow extension and docs.
- Rename process-related helpers in the ProcFS implementation from pid prefix to process prefix.
- Fix SystemTracker and ProcessTracker to not print dummy stats on start when header is disabled
- Add optional start_time parameter to SystemTracker and ProcessTracker
- Update ResourceTracker to start tracking at the nearest interval in the future, syncing SystemTracker and ProcessTracker
- Fix SystemTracker and ProcessTracker to not drift by a few nanoseconds in every interval
- Move cloud and server discovery along with the server allocation check to the ResourceTracker class from the Metaflow-specific decorators
- Extract serialization and deserialization of ResourceTracker from the Metaflow extension into the ResourceTracker class with snapshot and dump(s)/load(s) methods
- Round timestamp and user/system time to reasonable (6/4) decimal places
- Rework internal data structure of TinyDataFrame to use a list of lists instead of a dictionary of lists to support more efficient slicing and column renaming
- Add get_combined_metrics method to ResourceTracker to combine process_metrics and system_metrics into a single data frame, optionally with all metrics converted to bytes, and columns renamed to use human-friendly names
- Add stats method to TinyDataFrame to compute on-demand statistics on columns
- Add stats method to ResourceTracker to compute statistics on the combined metrics
- Add minimal support for Handlebars-like templates in the render_template function
- Add report method to ResourceTracker to generate a report in HTML format, and use that from the Metaflow extension
- Add recommend_resources and recommend_server methods to ResourceTracker
- Windows-specific reliability improvements
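The two interval-related items above (starting at the nearest interval in the future, and not drifting by a few nanoseconds per interval) can be illustrated with a small sketch. This is not the package's implementation: the function names next_aligned_start and tick_times are hypothetical, and the sketch assumes the fix amounts to deriving every tick from one fixed start time instead of accumulating sleeps.

```python
import math


def next_aligned_start(now: float, interval: float) -> float:
    """Return the first multiple of `interval` strictly after `now`."""
    return (math.floor(now / interval) + 1) * interval


def tick_times(start: float, interval: float, count: int) -> list[float]:
    """Compute tick timestamps from the fixed start, not from the previous tick.

    Deriving tick n as start + n * interval keeps rounding errors from
    accumulating, unlike repeatedly adding the interval to a running total.
    """
    return [start + n * interval for n in range(count)]


if __name__ == "__main__":
    # Hypothetical "current time" in seconds since the epoch.
    start = next_aligned_start(now=1722500000.37, interval=1.0)
    print(start)
    print(tick_times(start, 1.0, 3))
```

Passing the same start value to two independent samplers gives them identical tick timestamps, which is one way a system-level and a process-level tracker could stay in sync.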
Related breaking changes:
- Historical data collected before v0.4.0 is not compatible with the new ResourceTracker class, and will be discarded
- TinyDataFrame is no longer made available from resource_tracker directly, but from the resource_tracker.tiny_data_frame submodule
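To illustrate why a list-of-lists layout can make slicing and column renaming cheap (the motivation this release gives for the TinyDataFrame rework), here is a toy column store. The class name MiniFrame and all of its methods are hypothetical and do not mirror the real TinyDataFrame API; it is a sketch of the idea only.

```python
from statistics import mean


class MiniFrame:
    """Toy column store: column names in one list, column data in a parallel list of lists."""

    def __init__(self, columns: list[str], data: list[list[float]]):
        if len(columns) != len(data):
            raise ValueError("need exactly one data list per column")
        self.columns = list(columns)
        self.data = [list(col) for col in data]

    def rename(self, old: str, new: str) -> None:
        # Only the name list changes; the column data is untouched.
        self.columns[self.columns.index(old)] = new

    def head(self, n: int) -> "MiniFrame":
        # Row slicing is a plain list slice per column.
        return MiniFrame(self.columns, [col[:n] for col in self.data])

    def stats(self, name: str) -> dict[str, float]:
        # Statistics are computed on demand; nothing is cached.
        col = self.data[self.columns.index(name)]
        return {"min": min(col), "mean": mean(col), "max": max(col)}


if __name__ == "__main__":
    df = MiniFrame(["cpu", "mem"], [[0.5, 1.5, 1.0], [100, 200, 300]])
    df.rename("mem", "memory")
    print(df.columns)
    print(df.head(2).data)
    print(df.stats("cpu"))
```

With a dictionary of lists, renaming a column means removing and re-inserting a key (which also changes column order unless handled explicitly); with parallel lists it is a single assignment.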
v0.3.1 (May 30, 2025)
- Generate card for failed step
- Note failed step status in card
- Standardize timestamp format and timezone
v0.3.0 (March 27, 2025)
- Extract background process management and related complexities from the track_resources decorator into the ResourceTracker class to track resource usage of a process and/or the system in a non-blocking way
- Add unit tests for the ResourceTracker class, including checks for deadlocks and partially started trackers
- Keep test HTML card as GHA artifacts for manual inspection
- Improve documentation
v0.2.1 (March 21, 2025)
- Fix CPU/GPU recommendations to not always round up
- Improve error message on missing historical data
- Improve documentation
v0.2.0 (March 21, 2025)
Relatively major package rewrite to support alternative tracker implementations (other than directly reading from /proc). No breaking changes in the public API on Linux.
- Add tracker implementation using psutil to support macOS and Windows
- Fix data issues with the /proc implementation after validating with the psutil version (e.g. number of processes reported)
- Refactor code for better maintainability
- Add additional unit tests:
  - Tracker implementation using procfs
  - Tracker implementation using psutil
  - Consistency between tracker implementations
  - Metaflow decorators
- Extend CI/CD pipeline:
  - Test on Linux, macOS, and Windows
  - Test multiple Python versions (3.9, 3.10, 3.11, 3.12, 3.13)
- Improve documentation
v0.1.2 (March 18, 2025)
- Add experimental psutil support
- Add server info card for operating system
v0.1.1 (March 17, 2025)
- Fix rounding down recommended vCPUs with <0.5 load
- Add info popups with more details and disclaimers for recommendations
- Add detection for shared server environments
- Add potential cost savings card
- Improve documentation
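One plausible reading of the vCPU rounding fix in this release: rounding an average load below 0.5 to the nearest integer yields 0, which is not a usable recommendation, so the result needs a floor of 1. The sketch below encodes that reading only; it is an assumption about the intent, not the package's code, and recommend_vcpus is a hypothetical name.

```python
def recommend_vcpus(avg_vcpu_load: float) -> int:
    """Round the average vCPU load to the nearest integer, but never below 1.

    Assumption: a recommendation of 0 vCPUs is never valid, so small loads
    are clamped to a single vCPU.
    """
    return max(1, round(avg_vcpu_load))


if __name__ == "__main__":
    for load in (0.3, 2.4, 3.6):
        print(load, recommend_vcpus(load))
```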
v0.1.0 (March 12, 2025)
Initial PyPI release of resource-tracker with the following features:
- Detect if the system is running on a cloud provider, and if so, detect the provider, region, and instance type
- Detect main server hardware (CPU count, memory amount, disk space, GPU count and VRAM amount)
- Track system-wide resource usage:
  - Process count
  - CPU usage (user + system time, relative vCPU percentage)
  - Memory usage (total, free, used, buffers, cached, active anon, inactive anon pages)
  - Disk I/O (read and write bytes)
  - Disk space usage (total, used, free)
  - Network I/O (receive and transmit bytes)
  - GPU and VRAM usage (using nvidia-smi)
- Track resource usage of a process and its descendant processes:
  - Descendant process count
  - CPU usage (user + system time, relative vCPU percentage)
  - Memory usage (based on proportional set sizes)
  - Disk I/O (read and write bytes)
  - GPU and VRAM usage (using nvidia-smi pmon)
- Add Metaflow plugin for tracking resource usage of a step:
  - Track process and system resource usage for the duration of the step
  - Generate a card with the resource usage data
  - Suggest @resources decorator for future runs
  - Find cheapest cloud instance type for a step
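The process-level memory metric listed above is based on proportional set sizes (PSS). As a rough sketch of what reading PSS on Linux can look like, the helper below sums the Pss: lines from the text of /proc/<pid>/smaps or /proc/<pid>/smaps_rollup. The function name parse_pss_kib is hypothetical, and this is not taken from the package's ProcFS implementation.

```python
import re
from pathlib import Path


def parse_pss_kib(smaps_text: str) -> int:
    """Sum the `Pss:` lines (reported in kB) found in smaps or smaps_rollup content.

    Related fields such as `Pss_Anon:` are intentionally not matched.
    """
    return sum(
        int(match.group(1))
        for match in re.finditer(r"^Pss:\s+(\d+)\s+kB", smaps_text, flags=re.MULTILINE)
    )


if __name__ == "__main__":
    # Only available on Linux kernels that expose smaps_rollup.
    path = Path("/proc/self/smaps_rollup")
    if path.exists():
        print(parse_pss_kib(path.read_text()), "kB")
```

To cover a process and its descendants, the same helper could be applied to each PID and the results added up; because PSS divides shared pages among the processes sharing them, such a sum does not double-count shared memory the way a sum of resident set sizes would.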