Carbontracker

Latest version: v2.1.2


2.1.0

✍️ Features
- Carbontracker now logs a more informative error message when the user does not have read-permissions to relevant RAPL-directories.

🐛 Bug fixes
- Fixed measurements of DRAM, resolving issue 93

2.0.1

🐛 Bug fixes
- Fixed a bug of double-counting log statements when multiple components were used and epochs were short.

👷 Internal changes
- Update GitHub Actions (thanks andife!)

Contributors:
Snailed, andife

2.0.0

Features

📘 Documentation site is (almost) live!
We have set up a new documentation site on GitHub Pages using `mkdocs` and `mkdocstrings`. It will be available very soon.

🔧 Better type hints
Many of the classes and functions in Carbontracker now have type hints for nicer IDE integration, automatic documentation and fewer type-related bugs. This meant slightly changing the return types of some functions; see **⚠️ Breaking Changes** below.

🗺 Updated default intensity values and PUE
When Carbontracker cannot read intensity values from its API, it falls back to a national average value. These values were updated to the latest reports from [Our World in Data](https://ourworldindata.org/grapher/carbon-intensity-electricity?tab=table). The PUE used for the computations was also updated using the latest global average from [Uptime Institute](https://journal.uptimeinstitute.com/global-pues-are-they-going-anywhere/).

⌨️ Added `carbontracker --parse <log_dir>` command
This command aggregates all logs in a folder, making it easier to read one's total carbon emissions across multiple experiments.

⚠️ Breaking changes
❗️ Change of return types and parameters
While adding type hints, we found that some internal functions had inconsistent typing and decided to tighten their parameter and return types:
- The `devices_by_pid` parameter for components no longer accepts dictionaries
- The `power_usage` function in handlers now always returns lists of floats

❗️ Change of `monitor_epochs` default value from 1 to -1
When using `carbontracker.tracker.CarbonTracker`, the previous default of `monitor_epochs=1` meant that actual consumption was printed after the first epoch. This did not match the most common use case of wanting the actual consumption once training has completed, so the default is now -1, which monitors all epochs.

❗️ ElectricityMaps is now the default fetcher for all regions
This means that while EnergiDataService and CarbonIntensityGB still exist as modules, `CarbonTracker` and the CLI now use only `ElectricityMaps`. In addition, a warning now appears whenever ElectricityMaps is missing an API key.

Minor internal changes and bug fixes

🐍 Set up testing across different Python versions
This runs both in our CI setup using GitHub Actions and locally using `tox`.
This also included fixing a few version-specific bugs on Python 3.7, 3.8, 3.11 and 3.12.

🐛 Log statements no longer go into overdrive on Jupyter Notebooks.
Originally posed as issue 70 (thanks andreasgoethals!)
Before, for every new CarbonTracker object instantiated, new threads would be created. This is problematic in interactive computing environments like Jupyter Notebook where one might accidentally call `tracker = CarbonTracker(...)` many times. This was solved by fixing how Carbontracker identifies threads.

🪳 Fixed log parsing on logs generated by the CLI

Acknowledgements
This release was developed by Snailed, PedramBakh, raghavian. A special thanks to andreasgoethals and jonathanwww for pinpointing bugs.

1.2.1

🛠️ CarbonTracker CLI Tool Update
Issues: 60, 61
We've made a minor update to the CarbonTracker CLI tool. It now supports arbitrary command execution, broadening its utility beyond Python scripts. Do note that programs need to be executable for this to work.

You can use the tool in this manner:
```bash
carbontracker myscript arg1 arg2 --log_dir ./logs
```
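The idea of wrapping an arbitrary executable, rather than only a Python script, can be sketched with the standard library. This is an illustration and not Carbontracker's CLI implementation: `run_tracked`, `on_start` and `on_stop` are hypothetical names standing in for starting and stopping a tracker around a child process.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_tracked(
    command: Sequence[str],
    on_start: Callable[[], None],
    on_stop: Callable[[], None],
) -> int:
    """Run `command` as a child process between two tracking hooks.

    `on_start` and `on_stop` are hypothetical stand-ins for starting and
    stopping a tracker. The child's exit code is returned.
    """
    on_start()
    try:
        # No shell is involved, so the first element must be an executable
        # found on PATH or a path to a file with execute permission.
        completed = subprocess.run(list(command), check=False)
    finally:
        # Always stop tracking, even if the command could not be launched.
        on_stop()
    return completed.returncode


events = []
exit_code = run_tracked(
    [sys.executable, "-c", "print('training...')"],
    on_start=lambda: events.append("start"),
    on_stop=lambda: events.append("stop"),
)
```

Because the command is launched without a shell, a script that lacks execute permission fails to start, which matches the note above that programs need to be executable.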

1.2.0

Highlights

🚀 Introducing the CarbonTracker CLI Tool
We are pleased to introduce the Command Line Interface (CLI) tool for CarbonTracker. This addition offers an efficient way to monitor and manage the carbon footprint of your Python scripts. Upon installing CarbonTracker via PyPI, the CLI tool is immediately available for use.

For straightforward usage without live carbon intensity API integration:
```bash
carbontracker --script train.py --log_dir ./logs
```

For users aiming to use live carbon intensity measurements (a feature also introduced in this release, detailed below), the API key can be integrated with the CarbonTracker CLI tool as follows:
```bash
carbontracker --script train.py --log_dir ./logs --api_keys '{"electricitymaps": "YOUR_KEY_HERE"}'
```
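To show why the key is passed as a quoted JSON object, here is a minimal sketch of how such a flag could be parsed with `argparse`. It is an assumption for illustration only; the actual parsing inside the Carbontracker CLI may differ, and `parse_cli` and the program name are hypothetical.

```python
import argparse
import json
from typing import List


def parse_cli(argv: List[str]) -> argparse.Namespace:
    """Parse a --log_dir path and a JSON-encoded --api_keys mapping."""
    parser = argparse.ArgumentParser(prog="tracker-sketch")
    parser.add_argument("--log_dir", default="./logs")
    # json.loads converts the JSON text into a dict. Malformed JSON raises
    # ValueError, which argparse reports as a usage error.
    parser.add_argument("--api_keys", type=json.loads, default=None)
    return parser.parse_args(argv)


args = parse_cli(
    ["--log_dir", "./logs", "--api_keys", '{"electricitymaps": "YOUR_KEY_HERE"}']
)
```

On the command line, the single quotes around the JSON keep the shell from stripping the inner double quotes, so the program receives valid JSON.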

🔌 Transition from CO2Signal API
Issues: 1, 52
We have phased out support for the standalone CO2Signal API in favor of its integration into the ElectricityMaps API. This transition ensures greater consistency and addresses previous timeout issues experienced with consecutive requests.

🍏 OS X Support for Apple Silicon Chips
Issue: 24
We have rolled out support for OS X on M1/M2 Apple Silicon chips. This support encompasses all cores of the CPU and GPU, including the neural engine. Note: To initiate power measurements on these chips, users will be required to grant sudo access to the script.

📢 Enhanced Feedback with Verbose Setting
Issue: 35
An identified issue where setting `verbose=0` rendered both stdout and the output log empty has been addressed. With the current update, the `verbose` setting will only affect stdout, leaving the output log intact.
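The separation between console output and the output log can be sketched with the standard `logging` module. This is a hypothetical illustration, not Carbontracker's logger: `build_logger` is an invented name, and in-memory streams stand in for stdout and the log file.

```python
import io
import logging
from typing import TextIO


def build_logger(verbose: int, console: TextIO, log_file: TextIO) -> logging.Logger:
    """Return a logger whose console output depends on `verbose`."""
    logger = logging.getLogger("verbose_sketch")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False

    # The log destination always receives every record.
    logger.addHandler(logging.StreamHandler(log_file))

    # The console handler is attached only when verbose is above 0.
    if verbose > 0:
        logger.addHandler(logging.StreamHandler(console))
    return logger


console_out, log_out = io.StringIO(), io.StringIO()
logger = build_logger(verbose=0, console=console_out, log_file=log_out)
logger.info("Epoch 1 finished")
```

With `verbose=0` the console stream stays empty while the log stream still receives the message.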

📏 Decimal Precision Update
Issues: 25, 45
We've increased the default decimal precision to 12 to align with kWh and gCO2/kWh units, which are standards in the energy sector. This enhancement has been integrated without affecting existing functionality.

🚨 Enhanced Carbon Intensity Estimation Notifications
Issue: 43
A gap was identified where users were not alerted when default fallback values were used for carbon intensity estimations. This has been addressed to provide notifications in both the log file and stdout.

⚡ Performance Optimization
Issue: 41
Feedback regarding performance slowdowns attributed to busy-waiting in the `CarbonTrackerThread()` has been addressed. We have transitioned to an event-based approach, resulting in optimized performance.
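The difference between busy-waiting and an event-based wait can be illustrated as follows. This is a generic sketch of the technique, not the code of `CarbonTrackerThread`; `PeriodicWorker` and `measure` are hypothetical names.

```python
import threading
import time
from typing import Callable


class PeriodicWorker(threading.Thread):
    """Calls `measure` every `interval` seconds until stopped."""

    def __init__(self, interval: float, measure: Callable[[], None]) -> None:
        super().__init__(daemon=True)
        self.interval = interval
        self.measure = measure
        self._stop_event = threading.Event()

    def run(self) -> None:
        # Event.wait blocks without spinning the CPU and returns True as soon
        # as stop() sets the event, so shutdown does not wait a full interval.
        while not self._stop_event.wait(timeout=self.interval):
            self.measure()

    def stop(self) -> None:
        self._stop_event.set()
        self.join()


samples = []
worker = PeriodicWorker(interval=0.01, measure=lambda: samples.append(time.monotonic()))
worker.start()
time.sleep(0.2)
worker.stop()
```

A busy-waiting version would instead loop on `while not stopped: pass` or poll a flag with very short sleeps, consuming CPU time that competes with the workload being measured.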

🛠️ Additional Updates
- An issue related to fetching NVML device names in `carbontracker/components/gpu/nvidia.py` for Python versions below 3.10 has been resolved. Issue: 53
- We have extended our support for live carbon intensity measurements through integration with the [ElectricityMaps API](https://www.electricitymaps.com/), enabling access to over 160 regions. Issue: 54

To leverage this feature, refer to the example below:
```python
from time import sleep

from carbontracker.tracker import CarbonTracker

max_epochs = 10
api_keys = {"electricitymaps": "YOUR_API_KEY_HERE"}

tracker = CarbonTracker(epochs=max_epochs, log_dir="./logs", api_keys=api_keys)

# Training loop.
for epoch in range(max_epochs):
    tracker.epoch_start()

    # Your work here (sleep stands in for one epoch of training).
    sleep(1)

    tracker.epoch_end()

# Ensure that consumption is reported even if training ends early.
tracker.stop()
```

We are committed to providing valuable updates to enhance your experience with CarbonTracker.

1.1.7

Highlights
This release addresses issues preventing users from using CarbonTracker due to the new kernel update on Linux, which necessitates root privileges for energy measurements through Intel's RAPL monitoring. CarbonTracker now throws more descriptive error messages in the case above and when GPUs do not support the retrieval of power usages in NVML.

The constant factors for estimating ML tasks' power consumption and carbon footprint are now updated, reflecting the latest numbers reported by the European Environment Agency (EEA). Also, the carbon intensity values are now country-specific.

Additionally, we have resolved a bug that caused incorrect log data when using multiple instances of CarbonTracker in the same script, and added an option for prefixing (labelling) the log files of individual CarbonTracker instances.

Summary:
- Catch Intel RAPL permission error (Issue: 40)
- Throw descriptive error message when GPU does not support retrieval of power usages in NVML (Issue: 36)
- Fix the issue with log files being overwritten due to short measurement periods when multiple instances of carbontracker are instantiated.
- Add prefix labelling for individual logging instances (Issue: 26)
- Fix energydataservice API (Issue: 46)
- Update PUE
- Updated default/fallback value for when live carbon intensity cannot be fetched
- Update carbon intensity to be country-specific (PR: 49)
- Update factor for equivalent km travelled by car
- Deprecate support for Python 3.6 (Issue: 48)

Monitoring power usage
Intel RAPL
A security update for the Linux kernel now requires root privileges to read CPU power consumption through Intel's RAPL interface. This, unfortunately, caused jobs to abort when using CarbonTracker. In such cases we now skip CPU power monitoring to prevent crashes and show a message that explains the issue and where to look for a fix.
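The behaviour described above, skipping a component instead of crashing when a counter cannot be read, can be sketched as follows. This is an illustration under stated assumptions, not Carbontracker's code: `read_energy_uj` is a hypothetical helper, and the demonstration uses a temporary file in place of a real RAPL counter so that it runs on any machine.

```python
import os
import tempfile
from typing import Optional

# Usual sysfs location of the Linux powercap interface for Intel RAPL, e.g.
# /sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj
RAPL_DIR = "/sys/class/powercap/intel-rapl"


def read_energy_uj(path: str) -> Optional[int]:
    """Return the energy counter in microjoules, or None if it is unavailable."""
    try:
        with open(path) as counter:
            return int(counter.read().strip())
    except PermissionError:
        print(f"No read permission for {path}; skipping CPU power monitoring.")
        return None
    except FileNotFoundError:
        print(f"{path} not found; RAPL appears to be unavailable.")
        return None


# Demonstration with a temporary file standing in for a RAPL counter.
with tempfile.TemporaryDirectory() as tmp:
    fake_counter = os.path.join(tmp, "energy_uj")
    with open(fake_counter, "w") as handle:
        handle.write("123456\n")
    readable = read_energy_uj(fake_counter)
    missing = read_energy_uj(os.path.join(tmp, "does_not_exist"))
```

Returning `None` lets the caller drop the CPU component and continue monitoring the remaining hardware.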

GPUs not supporting retrieval of power usages in NVML
Not all GPUs support retrieval of power usage through the NVML API, which is used for monitoring the power usage of GPUs. The user was previously left uninformed about this issue, and monitoring the remaining hardware components would continue. A message is now shown informing the user of the issue with a link for where to find additional information.

Logging
Multiple instances
Log files were previously named by timestamp alone, so when multiple instances of carbontracker were created within a short measurement period, they could overwrite each other's files. Log file names are now prefixed with the process ID of the logged task: `processID_timestamp_carbontracker.log` for the standard log and `processID_timestamp_carbontracker_output.log` for the output log.

Label log runs
It is now possible to label a monitoring instance, and thereby its log files, by providing a prefix when instantiating CarbonTracker:
```python
from carbontracker.tracker import CarbonTracker

max_epochs = 10  # Example value.

tracker = CarbonTracker(epochs=max_epochs, log_dir="logs", log_file_prefix="prefix")

# Training loop.
for epoch in range(max_epochs):
    tracker.epoch_start()

    # Your model training.

    tracker.epoch_end()

# Optional: Add a stop in case of early termination before all monitor_epochs have
# been monitored to ensure that actual consumption is reported.
tracker.stop()
```

The resulting log files will have the format `prefix_processID_timestamp_carbontracker.log`.
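The naming scheme can be made concrete with a small sketch. The helper `log_file_names` is hypothetical, the timestamp layout is an assumption (the release notes do not specify it), and applying the prefix to the output log in the same way as the standard log is also assumed.

```python
import os
import time
from typing import Optional, Tuple


def log_file_names(
    prefix: Optional[str] = None,
    pid: Optional[int] = None,
    timestamp: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (standard_log, output_log) names following the scheme above."""
    pid = os.getpid() if pid is None else pid
    if timestamp is None:
        # Assumed layout; the real timestamp format may differ.
        timestamp = time.strftime("%Y-%m-%dT%H%M%SZ", time.gmtime())
    parts = [str(pid), timestamp, "carbontracker"]
    if prefix:
        parts.insert(0, prefix)
    stem = "_".join(parts)
    return f"{stem}.log", f"{stem}_output.log"


standard, output = log_file_names(
    prefix="prefix", pid=4242, timestamp="2023-01-01T000000Z"
)
```

Including the process ID means two trackers started within the same second in different processes no longer collide on the file name.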

Measurements
Carbon intensity
We updated the default/fallback value for when live carbon intensity cannot be fetched. We now use the latest [data](https://ourworldindata.org/grapher/carbon-intensity-electricity) for the average carbon intensity of the specific country detected. If that fails, we fall back to the worldwide [average carbon intensity for 2019](https://www.eea.europa.eu/ims/greenhouse-gas-emission-intensity-of-1) of 475 gCO2eq/kWh. The fallback data is produced by a [script](scripts/create_carbon_intensity_csv.py), which builds a small .csv file from our [data source](https://ourworldindata.org/grapher/carbon-intensity-electricity).
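The fallback order described above (live value, then country average, then worldwide average) can be sketched as a small function. This is a hypothetical illustration rather than Carbontracker's implementation: `carbon_intensity`, `fetch_live` and the `"XX"` entry with its value are invented for the example, and only the 475 gCO2eq/kWh figure comes from the text.

```python
from typing import Callable, Mapping, Optional, Tuple

WORLD_AVERAGE_G_PER_KWH = 475.0  # Worldwide 2019 average quoted above.


def carbon_intensity(
    fetch_live: Callable[[], Optional[float]],
    country_code: Optional[str],
    country_averages: Mapping[str, float],
) -> Tuple[float, str]:
    """Return (gCO2eq/kWh, source), trying live, country, then world values."""
    try:
        live = fetch_live()
    except Exception:
        # Any fetch failure (timeout, missing key, ...) triggers the fallback.
        live = None
    if live is not None:
        return live, "live"
    if country_code in country_averages:
        return country_averages[country_code], "country average"
    return WORLD_AVERAGE_G_PER_KWH, "world average"


def failing_fetch() -> Optional[float]:
    raise TimeoutError("API unreachable")


averages = {"XX": 300.0}  # Placeholder entry, not real data.
by_country = carbon_intensity(failing_fetch, "XX", averages)
by_world = carbon_intensity(failing_fetch, None, averages)
```

Returning the source alongside the value makes it straightforward to tell the user when a fallback was used.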

PUE
The values for estimating power consumption and carbon footprint are now updated. The PUE is now [1.55](https://uptimeinstitute.com/uptime_assets/6768eca6a75d792c8eeede827d76de0d0380dee6b5ced20fde45787dd3688bfe-2022-data-center-industry-survey-en.pdf).

Conversion
The CO2 performance of new passenger cars used is now [107.5](https://www.eea.europa.eu/ims/co2-performance-of-new-passenger) g CO2/km. This value is used to express CO2-equivalent emissions as kilometres travelled by car.
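As a worked example of how these constants combine, the sketch below scales measured energy by the PUE, multiplies by a carbon intensity, and converts the result to car kilometres. The function `footprint` is hypothetical and the exact order of operations inside Carbontracker is assumed; the constants 1.55, 107.5 and 475 are the values quoted in these notes.

```python
from typing import Tuple

PUE = 1.55                 # Power usage effectiveness quoted above.
CAR_G_CO2_PER_KM = 107.5   # CO2 performance of new passenger cars, g CO2/km.


def footprint(measured_kwh: float, intensity_g_per_kwh: float) -> Tuple[float, float, float]:
    """Return (energy in kWh incl. PUE, emissions in gCO2eq, equivalent car km)."""
    energy_kwh = measured_kwh * PUE
    co2_g = energy_kwh * intensity_g_per_kwh
    car_km = co2_g / CAR_G_CO2_PER_KM
    return energy_kwh, co2_g, car_km


# 2 kWh measured at the worldwide fallback intensity of 475 gCO2eq/kWh.
energy_kwh, co2_g, car_km = footprint(2.0, 475.0)
```

Here 2 kWh becomes 3.1 kWh after the PUE, which corresponds to 1472.5 gCO2eq or roughly 13.7 km travelled by car.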

API
The [energydataservice API](https://www.energidataservice.dk/) changed, and we have adjusted the API calls accordingly.

test-1.1.9
The notes for this tag are identical to those of 1.1.7 above.
