Save the planet, skip a release: 3.33 was due 6 months ago, so skip directly to 3.34.
General:
- SimGrid now requires a compiler with C++17 support for public headers too.
Sibling projects should upgrade their FindSimGrid.cmake.
- Remove the MSG API: its EOL was scheduled for 2020.
- Remove the Java bindings: they were limited to the MSG interface.
- On Windows, you now need to install WSL2 as the native builds are now disabled.
It was not really working anyway.
- Support for 32-bit architectures is no longer tested on our CI infrastructure.
It may break in the future, but we think that nobody is using SimGrid on 32 bits.
- Remove the surf module. It was replaced by the kernel/models module, and that
refactoring took almost 10 years to properly complete.
S4U:
- Activity::set_remaining() is not public anymore. Use for example
Comm::set_payload_size() to change the size of the simulated data.
- New function: Engine::flatify_platform(), to get a fully detailed view of the
configured platform.
- New Task abstraction: Tasks are designed to represent dataflows, i.e., graphs of repeatable Activities.
See the examples under examples/cpp/task-* and the associated documentation.
- Full simDAG integration: Activity::start() actually starts only when all dependencies
are fulfilled. If it cannot be started right away, it will start as soon as it becomes
possible.
- Allow setting a concurrency limit on disks and hosts, as was already the case for links.
- Rename Link::get_usage() to Link::get_load() for consistency with Host::
- Every signal now comes with a static version that is invoked for every object of that class,
and an instance version that is invoked for this specific object only. For example,
s4u::Actor::on_suspend_cb() adds a callback that is invoked for the suspend of any actor while
s4u::Actor::on_this_suspend_cb() adds a callback for this specific actor only.
- Activity::on_suspended_cb() is renamed to Activity::on_suspend_cb(), and fired right before the suspend.
- Activity::on_resumed_cb() is renamed to Activity::on_resume_cb(), and fired right before the resume.
- Resource::on_state_change_cb() is renamed to Resource::on_onoff_cb() to distinguish from the
Activity::on_state_change_cb() that is related to the activity state machine, not on/off.
- Activity signals (veto, suspend, resume, completion) are now specialized by activity class.
That is, callbacks registered in Exec::on_suspend_cb will not be fired for Comms nor Ios.
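The static/instance signal split described above can be pictured with the following minimal standalone sketch. It does not use SimGrid: ToyActor, its Callback type and its storage are hypothetical stand-ins, and only the on_suspend_cb() / on_this_suspend_cb() naming pattern is taken from the entry above.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy class, NOT the SimGrid implementation: it only mimics the
// "class-wide callback" versus "per-object callback" naming pattern.
class ToyActor {
public:
  using Callback = std::function<void(ToyActor const&)>;

  explicit ToyActor(std::string name) : name_(std::move(name)) {}
  std::string const& get_name() const { return name_; }

  // Static version: the callback is invoked for the suspend of any ToyActor
  static void on_suspend_cb(Callback cb)
  {
    assert(cb != nullptr); // refuse empty callbacks
    class_callbacks().push_back(std::move(cb));
  }

  // Instance version: the callback is invoked for the suspend of this ToyActor only
  void on_this_suspend_cb(Callback cb)
  {
    assert(cb != nullptr);
    this_callbacks_.push_back(std::move(cb));
  }

  // Fire the class-wide callbacks first, then the ones attached to this object
  void suspend()
  {
    for (auto const& cb : class_callbacks())
      cb(*this);
    for (auto const& cb : this_callbacks_)
      cb(*this);
  }

private:
  // Function-local static, to avoid static initialization order issues
  static std::vector<Callback>& class_callbacks()
  {
    static std::vector<Callback> callbacks;
    return callbacks;
  }

  std::vector<Callback> this_callbacks_;
  std::string name_;
};
```

In this toy model, a callback added with ToyActor::on_suspend_cb() runs whenever any ToyActor is suspended, while a callback added with some_actor.on_this_suspend_cb() runs only when some_actor is suspended.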
New S4U plugins:
- Battery: Enable the management of batteries on hosts.
See the examples under examples/cpp/battery-* and the documentation in the Plugins page.
- Photovoltaic: Enable the management of photovoltaic panels on hosts.
See the examples under examples/cpp/photovoltaic-* and the documentation in the Plugins page.
Kernel:
- Optimize an internal data structure (use a set instead of a list for ongoing activities),
leading to a potentially big performance gain, in particular with many detached comms.
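As a rough illustration of why such a change can pay off, the sketch below contrasts removing one finished activity from a std::list and from a std::set. It is a standalone toy, not the kernel code: ToyActivity and both helper functions are hypothetical names, and the actual containers used inside SimGrid may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <set>

// Hypothetical stand-in for an ongoing activity
struct ToyActivity {
  int id;
};

// With a list, removing a given activity first requires a linear search
bool remove_from_list(std::list<ToyActivity*>& ongoing, ToyActivity* activity)
{
  assert(activity != nullptr);
  auto it = std::find(ongoing.begin(), ongoing.end(), activity);
  if (it == ongoing.end())
    return false;
  ongoing.erase(it);
  return true;
}

// With an ordered set, the lookup is logarithmic in the number of ongoing activities
bool remove_from_set(std::set<ToyActivity*>& ongoing, ToyActivity* activity)
{
  assert(activity != nullptr);
  return ongoing.erase(activity) == 1;
}
```

The difference only matters when many activities are ongoing at the same time, which is the situation that the entry above mentions for detached comms.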
MPI:
- New option smpi/barrier-collectives to add a barrier to some collectives
to detect dangerous code that /may/ work on some MPI implementations.
- New function SMPI_app_instance_start() to easily start an MPI instance in your S4U simulation.
Models:
- Write the section of the manual about models, at last.
- WiFi: the total capacity of a link depends on the number of flows on that link.
- Use the nonlinear callback feature of LMM to reflect this.
- Calibration values can be changed to match different MCS configurations
- See the example teshsuite/models/wifi_usage_decay/wifi_usage_decay.cpp
- See also "A Flow-Level Wi-Fi Model for Large Scale Network Simulation"
https://hal.archives-ouvertes.fr/hal-03777726
- Merge parameters network/bandwidth-factor and smpi/bw-factor that serve the same purpose.
- Same for the latency
- Rewrite the corresponding documentation.
- Allow disabling the TCP windowing model by setting network/TCP-gamma to 0.
- Finally kill the 'compound' host model. You can change the CPU or network model
with the default host model, as it should be.
- Rename option "surf/precision" to "precision/timing" for clarity.
- Rename option "maxmin/precision" to "precision/work-amount" for clarity.
- New function: Engine::flatify_platform() to debug your platform.
sthread:
- Implement pthread_join in MC mode.
- Implement semaphore functions in sthread.
- Add an intricate way to verify the access to non-reentrant data structures.
It requires code annotation, as shown in examples/sthread/stdobject/stdobject.cpp
Model checking:
- Stateless model-checking is now usable on any system, including Mac OSX and ARM processors.
- The stateless aspects of the MC are now enabled by default in all SimGrid builds.
Liveness and stateful aspects are still controlled by the enable_model-checking
configuration option.
- Introducing ODPOR and SDPOR reduction strategies
- Introducing guiding heuristics, trying to find bugs faster than DFS in reduced state space.
- Synchronize the MBI tests with upstream.
- Show the full actor backtraces when replaying a MC trace (with model-check/replay)
and the status of all actors on deadlocks in MC mode.
XBT:
- simgrid::xbt::cmdline and simgrid::xbt::binary_name are gone.
Please use simgrid::s4u::Engine::get_cmdline() instead.
Documentation:
- New tutorial on simulating DAGs.
- New section in the user guide on the provided performance models.
- New section presenting some technical good practices for (potential) contributors.
- Add a section on errors and exceptions to the API documentation.
- Move the s4u examples to a section on their own to ease navigation.
Fixed bugs (FG.. -> FramaGit bugs; FG!.. -> FG merge requests)
(FG: issues on Framagit; GH: issues on GitHub)
- FG18: Java bindings should be redone or removed
- FG!118: Wi-Fi callback mechanism
- FG!119: SMPI: add option to inject a barrier before every collective call
- GH383: Segfault when adding a disk after load_platform(xml)
----------------------------------------------------------------------------
SimGrid (3.32) October 3. 2022.
The Wiedervereinigung release. Germany was reunited 32 years ago.
General:
- SimGrid now requires a compiler with C++17 support to compile the lib.
Our public headers still allow the user code to be compiled in C++14.
- Support graphviz v3 and ns-3 v3.36 (older versions are still supported).
- Tested with clang (v11, v13, v14 and v16), gcc (v7 to v13) and IntelCC v2022.2
S4U:
- API evolutions:
- Kill signal Comm::on_completion that was not working anyway.
- Expose signals Activity::on_suspend and Activity::on_resume
- New macro xbt_enforce(): similar to xbt_assert(), but throws an AssertionError
instead of calling abort().
- New: s4u::Exec::get_thread_count()
- Various cleanups around virtual machines:
- host_by_name() and friends now only return hosts. VMs are now excluded.
- It is now impossible to search a VM by name globally.
You can only search VM by name on a given PM, so either you know
the PM on which your VM runs and you can search by name, or you need
to manually iterate over all PMs to search this VM.
- The s4u::VirtualMachine constructor is now deprecated.
Please use s4u::Host::create_vm() instead.
- Rename s4u::VirtualMachine::on_creation() to on_vm_creation() to
avoid confusion with s4u::Host::on_creation() that is inherited.
Also s4u::VirtualMachine::on_destruction -> on_vm_destruction().
- Bug fixes:
- One-sided communications (Comm::sendto) can now be detached,
and should now be more resilient to network and host faults.
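The difference between an assert-style check that calls abort() and an enforce-style check that throws, as described for xbt_enforce() above, can be sketched as follows. This is a standalone illustration, not the xbt implementation: TOY_ENFORCE, ToyAssertionError and try_register() are hypothetical names introduced here.

```cpp
#include <cassert> // assert() calls abort() on failure: the process cannot recover
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the exception thrown by an enforce-style check
class ToyAssertionError : public std::logic_error {
public:
  using std::logic_error::logic_error;
};

// Enforce-style check: throws instead of aborting, so that the caller can
// catch the failure and keep running
#define TOY_ENFORCE(cond, msg)                                                 \
  do {                                                                         \
    if (!(cond))                                                               \
      throw ToyAssertionError(std::string("Assertion failed: ") + (msg));     \
  } while (0)

// Returns the error message when the name is already taken, "" otherwise
std::string try_register(bool name_already_used)
{
  try {
    TOY_ENFORCE(!name_already_used, "duplicate host name");
    return "";
  } catch (ToyAssertionError const& e) {
    return e.what();
  }
}
```

Throwing rather than aborting is what allows a binding layer to turn the failure into an exception of its own instead of killing the whole process.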
Python:
- Added the following bindings / examples:
- Comm (now 100% covers the C++ interface):
- Comm.dst_data_size, Comm.mailbox, Comm.sender, Comm.start_time, Comm.finish_time
- Comm.state_str [examples: examples/python/comm-failure/, examples/python/comm-host2host/]
- Comm.remaining [examples: examples/python/comm-host2host/, examples/python/comm-suspend/]
- Comm.set_payload_size [example: examples/python/comm-host2host/]
- Comm.set_rate [example: examples/python/comm-throttling/]
- Comm.sendto, Comm.sendto_init, Comm.sendto_async [example: examples/python/comm-host2host/]
- Comm.start, Comm.suspend, Comm.resume [example: examples/python/comm-host2host/]
- Comm.test_any [example: examples/python/comm-testany/]
- Comm.wait_until [example: examples/python/comm-waituntil/]
- Engine:
- Engine.host_by_name [example: examples/python/comm-host2host/]
- Engine.mailbox_by_name_or_create [example: examples/python/comm-pingpong/]
- Engine.set_config
- Mailbox: Mailbox.ready [example: examples/python/comm-ready/]
- Ptask [example: examples/python/exec-ptask/]:
- this_actor.exec_init
- this_actor.parallel_execute
- Exec.suspend
- Exec.wait_for
- Added an AssertionError exception that may be thrown in case of error.
For instance, creating two hosts with the same name will now throw this exception
instead of killing the interpreter.
SMPI:
- Implement MPI_File_get_type_extent(), MPI_File_s/get_atomicity() and
MPI_File_get_byte_offset()
- Intercept getpid() calls to return the simulated ones.
- Fix various bugs in MPI IO.
Platform description & visualization:
- More robust sanity checks for platforms, to reject forbidden topologies with
a proper error message.
- New platform example: supernode.cpp and supernode.py.
The Python version generates a nice graphical representation of the platform.
- Bug fixes around fat-tree topologies.
- Allow to dump the platform topology as a CSV file representing the graph edges
with platform_graph_export_csv() (similar to the DOT export).
- Fix graphicator for "cluster" topologies (e.g. fat-tree, dragonfly).
Models:
- Fix a bug when using ptasks with multicores (FG!111).
Model-Checker:
- First bits of sthread, that intercepts pthread operations at runtime.
The intent is to use it together with simgrid-mc, but it is TBD.
- Sync MBI generators with upstream changes.
- Various cosmetics, small bug fixes and inner refactorings
Fixed bugs (FG.. -> FramaGit bugs; FG!.. -> FG merge requests)
(FG: issues on Framagit; GH: issues on GitHub)
- FG105: "Variable penalty should not be negative!" with in-flight messages and bandwidth profiles
- FG109: Application time reported by --cfg=smpi/display-timing:yes is wrong
- FG110: Wait_any does not trigger new model solve when host events occur
- FG111: Wrong execution time in rare cases when using multicore
- FG!98: Re-enable the tests for legacy stochastic profiles
- FG!109: Trigger new engine solve upon host events such as host on/off
- FG!116: SMPI/replay: Fix issue with recv of size =0
----------------------------------------------------------------------------
SimGrid (3.31) March 22. 2022.
The ненасильство release. We stand against war.
Against the aggression by a sick system that forces peoples to take arms against each other.
MC:
- Rework the internals, for simpler and more modern code. This shall unlock many future improvements.
- You can now define plugins onto the DFS explorer (previously called SafetyChecker), using the
declared signals. See CommunicationDeterminism for an example.
- Support mutex, semaphore and barrier in DPOR reduction
- Seems to work on Arm64 architectures too.
- Display a nice error message when ptrace is not usable.
- New test suite, imported from the MPI Bugs Initiative (MBI). Not all MBI generators are integrated yet.
- Remove the ISP test suite: it's not free software, and it's superseded by MBI.
SMPI:
- fix for FG100 by ensuring small asynchronous messages never overtake larger
ones, conforming to the standard.
- replay: fix waitall behaviour to avoid forgetting requests and leaking
their handles.
- tracing: ensure that we dump the TI traces continuously during execution and
not just at the end, reducing memory cost and performance hit.
- Update OpenMPI collectives selection logic to match current one (4.1.2)
- Add a coherence check for collective operation order and root/MPI_Op
coherence. Potentially costly so not activated unless smpi/pedantic is set
or -analyze is given.
S4U:
- New signal: Engine::on_simulation_start_cb()
- Introduce a new execution mode with this_actor::thread_execute(). This simulates
the execution of a certain amount of flops by multiple threads run by a host. Each
thread executes the same number of flops, given as argument. An example of this new
function can be found in examples/cpp/exec-threads.
- Reimplementation of barriers natively.
Previously, they were implemented on top of s4u::Mutex and s4u::ConditionVariable.
The new version should be faster (and can be used in the model-checker).
- Actor::get_restart_count(): Returns the number of reboots that this actor did.
MSG:
- MSG_barrier_destroy now expects a non-const msg_barrier parameter.
New plugin: the Chaos Monkey (killing actors at any time)
- Along with the new simgrid-monkey script, it tests whether your simulation
resists resource failures at any possible timestamp in your simulation.
- It is mostly intended to test the SimGrid core in extreme conditions,
but some users may find it interesting too.
Models:
- New solver for parallel task: BMF.
- More realistic sharing of heterogeneous resources compared to the fair
bottleneck solver used by ptask_L07.
- Implement BMF (Bottleneck Max Fairness).
- Improved resource sharing for parallel tasks with sub-flows (parallel
communications between same source and destination inside the ptask).
- Parameters:
- "--cfg=host/model:ptask_L07 --cfg=host/solver:bmf": enable the ptask
model with BMF solver.
- "--cfg=bmf/max-iterations: <N>": maximum number of iterations performed
by BMF solver (default: 1000).
- "--cfg=bmf/precision: <N>": numerical precision used when computing
resource sharing (default: 1e-12).
- This model requires Eigen3 library. Make sure Eigen3 is installed to use BMF.
General:
- Modifications of the Profile mechanism, with some impact on users
- Addition of a new (S4U) method to init profiles from generic functions to improve versatility
- Fix initial behaviour of state_profiles
- Modify periodicity to behave like a period, and not like a loop delay
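The distinction between a period and a loop delay can be made concrete with a small worked example. The sketch below is standalone and rests on an assumed reading of the entry above: with a period, repetition k of the event pattern starts at k * period, whereas with a loop delay each repetition starts a fixed delay after the last event of the previous one. Both function names are hypothetical and this is not the Profile code. The pattern is assumed to be sorted in ascending order.

```cpp
#include <cassert>
#include <vector>

// Event dates when the value is a true period: repetition k starts at k * period.
// Assumes that the last date of the pattern does not exceed the period.
std::vector<double> dates_as_period(std::vector<double> const& pattern, double period, int repetitions)
{
  std::vector<double> dates;
  for (int k = 0; k < repetitions; k++)
    for (double date : pattern)
      dates.push_back(k * period + date);
  return dates;
}

// Event dates when the value is a loop delay: each repetition starts `delay`
// after the last event of the previous repetition.
std::vector<double> dates_as_loop_delay(std::vector<double> const& pattern, double delay, int repetitions)
{
  assert(!pattern.empty());
  std::vector<double> dates;
  double start = 0.0;
  for (int k = 0; k < repetitions; k++) {
    for (double date : pattern)
      dates.push_back(start + date);
    start += pattern.back() + delay;
  }
  return dates;
}
```

For a pattern with events at dates 1 and 3 and a value of 10, two repetitions give the dates 1, 3, 11, 13 under the period reading, and 1, 3, 14, 16 under the loop delay reading.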
XBT:
- Drop xbt_dynar_shrink().
Python:
- Made the following bindings static (previously member functions):
- Actor: Actor.kill_all(), Actor.by_pid()
- Host: Host.by_name(), Host.current(), Host.on_creation_cb()
- Mailbox: Mailbox.by_name()
- Added the following bindings:
- this_actor.warning()
- Mailbox.put_init() [example: examples/python/comm-waitallfor/]
- Comm.detach() [example: examples/python/comm-waitallfor/]
- Comm.wait_for() [example: examples/python/comm-waitfor/]
- Comm.wait_any_for()
- Comm.wait_all_for() [example: examples/python/comm-waitallfor/]
- Mutex [example: examples/python/synchro-mutex/]
- Barrier [example: examples/python/synchro-barrier/]
- Semaphore [example: examples/python/synchro-semaphore/]
Build System:
- Remove target "make uninstall" which was incomplete and no longer maintained.
Fixed bugs (FG.. -> FramaGit bugs; FG!.. -> FG merge requests)
(FG: issues on Framagit; GH: issues on GitHub)
- FG57: Mc SimGrid should test whether ptrace is usable
- FG87: Smpi scripts fail with spaces in paths
- FG100: [SMPI] Order of the message matching is not guaranteed
- FG101: LGPL 2.1 is deprecated license
- FG104: "make uninstall" not up-to-date
- GH151: Missing mutexes for DPOR.
----------------------------------------------------------------------------
SimGrid (3.30) January 30. 2022.
The Sunday Bloody Sunday release.
Main user-visible changes:
- The SimDag API for the simulation of the scheduling of Directed Acyclic
Graphs has been dropped. It was marked as deprecated for a couple of years.
We finally complete the implementation of what has been called SimDag++
internally, i.e., porting the different features of SimDag on top of S4U.
The new way to simulate the execution of dependent activities directly by
maestro (without any other actor) is detailed in the examples/cpp/dag-* series
of examples.
- The removal of SimDag led us to also remove the export to Jedule files that
was tightly coupled to SimDag. The instrumentation of DAG simulation is still
possible through the regular instrumentation API based on the Paje format.
- We also dropped the old and clumsy Lua bindings to create platforms in a
programmatic way. It can be done in C++ in a much cleaner way now, which
motivates this suppression.
S4U:
- Introduce on_X_cb() functions for all signals, to attach a new
callback to the signal X. The signal variables are now hidden and
only these functions should be used.
Rationale: this enables the usual deprecation scheme where functions
remain for 4 releases if we need to modify the signals, while the
current code with the signal variables directly visible prevents any
smooth transition.
- New function: Engine::run_until(date), to split the simulation.
- New signal: Activity::on_veto, to detect when an activity fails to start.
- Signal change: Comm::on_start(Comm&, bool) has been replaced by
Comm::on_send and Comm::on_recv. These two signals respectively correspond to
when the sending or receiving side of a Comm is ready. They are raised at
the same locations as the former Comm::on_start signal.
- New function: Engine::track_vetoed_activities() to interrupt run()
when an activity fails to start, and to keep track of such activities.
Please see the corresponding example for more info.
- New functions: s4u::Comm::{sendto_init, set_source, set_destination} to enable
the use of vetoers with direct host-to-host communications. Both source and
destination have to be set for a comm to start. Each call to these setters checks
if all vetoes are satisfied. When it is the case, the comm starts. A use case of
these functions is given in examples/cpp/dag-scheduling.
- New functions: {Exec, Io}::update_priority allow you to modify the priority of
these kinds of activities during their execution. Behavior is detailed in
examples/cpp/io-priority/.
SMPI:
- Dynamic costs for MPI operations: New API to allow users to dynamically
change injected costs for MPI_Recv, MPI_Send and MPI_Isend operations.
Alternative for smpi/or, smpi/os and smpi/ois configuration options.
- Fix some issues with the replay mechanism.
XBT:
- Function xbt::Extendable::get_data() is now templated with the type of the
pointee. Untyped function is deprecated. Use get_data<void>() if you still
want to retrieve void*.
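The benefit of a templated accessor can be sketched with a toy class. This is a standalone illustration and not the xbt::Extendable code: ToyExtendable and its void* storage are hypothetical, and only the get_data<T>() call shape is taken from the entry above.

```cpp
#include <cassert>

// Hypothetical toy class that stores an untyped user pointer
class ToyExtendable {
  void* data_ = nullptr;

public:
  void set_data(void* data) { data_ = data; }

  // Typed accessor: the cast is written once here instead of at every call site.
  // As with any such cast, the caller must request the type that was stored.
  template <class D> D* get_data() const { return static_cast<D*>(data_); }
};
```

With this toy class, object.get_data<int>() yields an int* directly, and object.get_data<void>() still gives access to the raw pointer.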
Documentation:
- New section: "SimGrid MPI calibration of a Grid5000 cluster"
presenting how to properly calibrate MPI communications in SimGrid.
- Reword and finish the platform section, which is now complete.
Python:
- Thread contexts are used by default with Python bindings. Other kinds of
contexts proved unstable, especially starting with pybind11 v2.8.0.
Fixed bugs (FG.. -> FramaGit bugs; FG!.. -> FG merge requests)
(FG: issues on Framagit; GH: issues on GitHub)
- FG95: Wrong computation time for multicore execution after pstate change
- FG97: Wrong computation time for ptask+multicore+pstates
- FG98: SMPI offline simulation is inconsistent with the online simulation
(deadlocks / message truncation)
- FG99: Weird segfault when not sealing an host
----------------------------------------------------------------------------