ytree.utilities.parallel.parallel_trees

ytree.utilities.parallel.parallel_trees(trees, collect_results=True, save_every=None, save_in_place=None, save_roots_only=False, filename=None, njobs=0, dynamic=False)

Iterate over a list of trees in parallel.

Trees are divided up between the available processor groups. Analysis field values can then be assigned to halos within the tree. The trees will be saved either at the end of the loop or after a number of trees given by the save_every keyword are completed.

This uses the yt parallel_objects function, which is parallelized with MPI underneath and so is suitable for parallelism across compute nodes.

Parameters:

trees (list of TreeNode objects) – The trees to be iterated over in parallel.
collect_results (optional, bool) – If True, results stored in analysis fields will be collected by the root process. This must be True if saving is to be done. If False, results are not collected, which gives a significant speedup. This is the best option if you have no intention of altering analysis fields or do not need results to be recollected or saved. Setting this to False will automatically set save_every to False as well. Default: True
save_every (optional, int or False) –
Number of trees to be completed before results are saved. This is used to save intermediate results in case scripts need to be restarted. This parameter results in different behavior depending on the value of the collect_results keyword. If save_every is set to:
- integer: if collect_results is True, the number of trees to complete before saving. If collect_results is False, a ValueError exception will be raised.
- False: no saving will be done. Results will still be collected if collect_results is True.
- None: if collect_results is True, saving will occur after iterating over all trees. If collect_results is False, no saving will be done.
Default: None
save_in_place (optional, bool or None) – If True, analysis fields will be saved to the original arbor, even if only a subset of all trees is provided with the trees keyword. If False and only a subset of all trees is provided, a new arbor will be created containing only the trees provided. If set to None, behavior is determined by the type of arbor loaded. If the arbor is a YTreeArbor (i.e., saved with save_arbor), save_in_place will be set to True. If not of this type, it will be set to False. Default: None
save_roots_only (optional, bool) – If True, only the field values of each node given in trees are saved. If False, field data for the entire tree stemming from that node are saved. Default: False
filename (optional, string) – The name of the new arbor to be saved. If None, the naming convention will follow the filename keyword of the save_arbor function. Default: None
njobs (optional, int) – The number of process groups for parallel iteration. Set to 0 to create as many process groups as there are available processors, so that each tree is allocated to a single processor. Set to a number less than the total number of processors to create groups with multiple processors, which will allow for further parallelization within a tree. For example, running with 8 processors and setting njobs to 4 will result in 4 groups of 2 processors each. Default: 0
dynamic (optional, bool) – Set to False to divide iterations evenly among process groups. Set to True to allocate iterations with a task queue. If True, the number of processors available will be one fewer than the total as one will act as the task queue server. Default: False
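
The process-group arithmetic described for njobs and dynamic can be sketched in plain Python. This is only an illustration of the rules stated above, not the ytree or yt implementation. The helper name process_group_sizes is hypothetical, and two details are assumptions made for the sketch: that njobs is capped at the number of working processors, and that any leftover processors are spread over the first groups.

```python
def process_group_sizes(n_processors, njobs=0, dynamic=False):
    """Hypothetical helper: return the number of processors in each group."""
    # With a task queue, one processor is reserved as the queue server.
    workers = n_processors - 1 if dynamic else n_processors
    if workers < 1:
        raise ValueError("Not enough processors to do any work.")

    # njobs=0 means one process group per working processor.
    # Assumption: njobs larger than the worker count is capped.
    n_groups = workers if njobs == 0 else min(njobs, workers)

    # Assumption: any remainder is spread over the first groups.
    base, extra = divmod(workers, n_groups)
    return [base + 1 if i < extra else base for i in range(n_groups)]


# 8 processors with njobs=4: 4 groups of 2 processors each.
print(process_group_sizes(8, njobs=4))
# 8 processors with njobs=0: 8 groups of 1 processor each.
print(process_group_sizes(8))
# 8 processors with dynamic=True: only 7 processors do work.
print(process_group_sizes(8, dynamic=True))
```

The first call reproduces the example given for njobs: 8 processors split into 4 groups of 2.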

Examples

>>> import ytree
>>> a = ytree.load("arbor/arbor.h5")
>>> a.add_analysis_field("test_field", default=-1, units="Msun")
>>> trees = list(a[:])
>>> for tree in ytree.parallel_trees(trees):
...     for node in tree["forest"]:
...         node["test_field"] = 2 * node["mass"] # some analysis
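
The interaction between save_every and collect_results described under Parameters can be summarized as a small decision function. This is a sketch of the documented behavior only, not code taken from ytree. The helper name resolve_save_every and its n_trees argument are hypothetical.

```python
def resolve_save_every(n_trees, collect_results=True, save_every=None):
    """Hypothetical helper: return the number of trees per save,
    or False if no saving will be done."""
    if not collect_results:
        # An integer save_every cannot be honored without collected results.
        # Compare by identity because False is also an int in Python.
        if save_every is not None and save_every is not False:
            raise ValueError("save_every requires collect_results=True.")
        # None and False both mean no saving in this case.
        return False

    if save_every is None:
        # Save once, after all trees have been iterated over.
        return n_trees

    # Either an integer interval or False (collect but never save).
    return save_every


print(resolve_save_every(100))                         # save after all trees
print(resolve_save_every(100, save_every=10))          # save every 10 trees
print(resolve_save_every(100, collect_results=False))  # no saving
```

In an actual call, these choices correspond to passing save_every and collect_results directly to parallel_trees.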