Fields in ytree¶
ytree supports multiple types of fields, each representing numerical
values associated with each halo in the Arbor. These include the
native fields stored on disk, alias fields, derived fields, and
analysis fields.
The Field Info Container¶
Each Arbor contains a dictionary, called field_info, with relevant
information for each available field. This information can include the
units, the type of field, any dependencies or aliases, and details
needed to read the data from disk.
>>> import ytree
>>> a = ytree.load("tree_0_0_0.dat")
>>> print (a.field_info["Rvir"])
{'description': 'Halo radius (kpc/h comoving).', 'units': 'kpc/h ', 'column': 11,
'aliases': ['virial_radius']}
>>> print (a.field_info["mass"])
{'type': 'alias', 'units': 'Msun', 'dependencies': ['Mvir']}
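As a rough illustration of how such entries can be used, the sketch below follows an alias back to the field it depends on. It works on a plain dictionary modeled on the output above and does not call ytree. The helper name resolve_field and the "Mvir" entry (including its column number) are assumptions made for this example.

```python
# Toy field_info mapping modeled on the entries printed above.
# The "Mvir" entry and its column number are invented for illustration.
field_info = {
    "Mvir": {"units": "Msun", "column": 10},
    "Rvir": {"units": "kpc/h", "column": 11, "aliases": ["virial_radius"]},
    "mass": {"type": "alias", "units": "Msun", "dependencies": ["Mvir"]},
}


def resolve_field(info, name):
    """Follow alias entries until a non-alias field is reached.

    Hypothetical helper, not part of ytree.
    """
    seen = set()
    while info[name].get("type") == "alias":
        if name in seen:
            raise ValueError(f"circular alias: {name}")
        seen.add(name)
        # An alias entry lists its single target under 'dependencies'.
        name = info[name]["dependencies"][0]
    return name


native_name = resolve_field(field_info, "mass")
print(native_name, field_info[native_name]["units"])
```

Fields that are not aliases resolve to themselves, so the same helper can be applied to any field name.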
Fields on Disk¶
Every field stored in the dataset’s files should be available within
the Arbor. The field_list contains a list of all fields on disk
with their native names.
>>> print (a.field_list)
['scale', 'id', 'desc_scale', 'desc_id', 'num_prog', ...]
Alias Fields¶
Because the various dataset formats use different naming conventions for
similar fields, ytree allows fields to be referred to by aliases. This
allows for a universal set of names for the most common fields. Many are
added by default, including “mass”, “virial_radius”, “position_<xyz>”,
and “velocity_<xyz>”. The list of available alias and derived fields
can be found in the derived_field_list.
>>> print (a.derived_field_list)
['uid', 'desc_uid', 'scale_factor', 'mass', 'virial_mass', ...]
Additional aliases can be added with add_alias_field.
>>> a.add_alias_field("amount_of_stuff", "mass", units="kg")
>>> print (a["amount_of_stuff"])
[ 1.30720461e+45, 1.05085632e+45, 1.03025691e+45, ...
1.72691772e+42, 1.72691772e+42, 1.72691772e+42] kg
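The alias above returns the same data as "mass", expressed in kg instead of Msun. The arithmetic behind that conversion can be sketched in plain Python. The solar-mass constant below is an approximate, assumed value (the unit machinery ytree relies on may use a slightly different one), and the input masses are made up.

```python
# Approximate solar mass in kilograms (assumed value for illustration).
MSUN_IN_KG = 1.988e30


def msun_to_kg(masses_msun):
    """Convert a sequence of masses from solar masses to kilograms."""
    return [m * MSUN_IN_KG for m in masses_msun]


# Made-up halo masses in Msun, not taken from any real dataset.
masses_msun = [6.5e14, 5.0e14, 8.7e11]
print(msun_to_kg(masses_msun))
```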
Derived Fields¶
Derived fields are functions of existing fields, including other
derived and alias fields. New derived fields are created by
providing a defining function and calling add_derived_field.
>>> def potential_field(field, data):
...     # data.arbor points to the parent Arbor
...     return data["mass"] / data["virial_radius"]
...
>>> a.add_derived_field("potential", potential_field, units="Msun/Mpc")
>>> print (a["potential"])
[ 2.88624262e+14 2.49542426e+14 2.46280488e+14, ...
3.47503685e+12 3.47503685e+12 3.47503685e+12] Msun/Mpc
Field functions should take two arguments. The first is a dictionary
that will contain basic information about the field, such as its name.
The second argument represents the data container for which the field
will be defined. It can be used to access field data for any other
available field, and it exposes the parent Arbor as data.arbor.
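To make the two-argument calling convention concrete, here is a minimal sketch of a container that dispatches to registered field functions. ToyContainer is a hypothetical stand-in written for this example. It is not ytree's implementation, it ignores units beyond storing them, and the input values are made up.

```python
class ToyContainer:
    """Hypothetical stand-in for a ytree data container."""

    def __init__(self, native_fields):
        self._native = native_fields  # field name -> list of values
        self._derived = {}            # field name -> (function, units)

    def add_derived_field(self, name, function, units=None):
        self._derived[name] = (function, units)

    def __getitem__(self, name):
        if name in self._native:
            return self._native[name]
        function, units = self._derived[name]
        # First argument: basic information about the field.
        field = {"name": name, "units": units}
        # Second argument: the container, used to look up other fields.
        return function(field, self)


def potential_field(field, data):
    return [m / r for m, r in zip(data["mass"], data["virial_radius"])]


def double_potential(field, data):
    # Derived fields may depend on other derived fields.
    return [2 * p for p in data["potential"]]


data = ToyContainer({"mass": [2.0e12, 6.0e11], "virial_radius": [0.5, 0.25]})
data.add_derived_field("potential", potential_field, units="Msun/Mpc")
data.add_derived_field("double_potential", double_potential, units="Msun/Mpc")
print(data["potential"])
```

Because the function receives the container itself, it can request any other field by name, including fields that are themselves derived.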
Vector Fields¶
For fields that have x, y, and z components, such as position, velocity, and angular momentum, a single field can be queried to return an array with all the components. For example, for fields named “position_x”, “position_y”, and “position_z”, the field “position” will return the full vector.
>>> print (a["position"])
[[0.0440018, 0.0672202, 0.9569643],
[0.7383264, 0.1961563, 0.0238852],
[0.7042797, 0.6165487, 0.500576 ],
...
[0.1822363, 0.1324423, 0.1722414],
[0.8649974, 0.4718005, 0.7349876]] unitary
A list of defined vector fields can be seen by doing:
>>> print (a.field_info.vector_fields)
('position', 'velocity', 'angular_momentum')
For all vector fields, a “_magnitude” field also exists, defined as the quadrature sum of the components.
>>> print (a["velocity_magnitude"])
[ 488.26936644 121.97143067 146.81450507, ...
200.74057711 166.13782652 529.7336846 ] km/s
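The quadrature sum is the square root of the sum of the squared components. A minimal sketch in plain Python, using made-up component values and a hypothetical helper name, looks like this:

```python
import math


def vector_magnitude(x, y, z):
    """Return sqrt(x**2 + y**2 + z**2) for each entry (hypothetical helper)."""
    return [math.sqrt(xi**2 + yi**2 + zi**2) for xi, yi, zi in zip(x, y, z)]


# Made-up velocity components for two halos.
vx = [3.0, 1.0]
vy = [4.0, 2.0]
vz = [12.0, 2.0]
print(vector_magnitude(vx, vy, vz))  # [13.0, 3.0]
```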
Only specifically registered fields will be available as vector fields.
For example, saved Analysis Fields with x, y, z components will
not automatically be available. However, vector fields can be created
with the add_vector_field function.
>>> a.add_vector_field("thing")
The above example assumes that fields named “thing_x”, “thing_y”, and “thing_z” already exist.
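The naming requirement can be pictured with the sketch below, which assembles a vector field from its three components and fails if any of them is missing. It operates on a plain dictionary. build_vector_field is a hypothetical helper for illustration and does not reflect how ytree implements add_vector_field.

```python
def build_vector_field(fields, name, axes="xyz"):
    """Combine <name>_x, <name>_y, <name>_z into a list of [x, y, z] rows."""
    components = [f"{name}_{ax}" for ax in axes]
    missing = [c for c in components if c not in fields]
    if missing:
        raise KeyError(f"missing component fields: {missing}")
    # zip(*...) turns three per-component lists into per-halo rows.
    return [list(row) for row in zip(*(fields[c] for c in components))]


# Made-up component data for two halos.
fields = {"thing_x": [1.0, 2.0], "thing_y": [3.0, 4.0], "thing_z": [5.0, 6.0]}
print(build_vector_field(fields, "thing"))  # [[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]]
```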
Analysis Fields¶
Analysis fields provide a means for saving the results of complicated
analysis for any halo in the Arbor. They are meant for operations
beyond the scope of derived fields, for example, those that require
loading the original simulation snapshots. New analysis fields are
created with add_analysis_field and are initialized to zero.
>>> a.add_analysis_field("saucer_sections", units="m**2")
>>> my_tree = a[0]
>>> print (my_tree["tree", "saucer_sections"])
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0.] m**2
>>> import numpy as np
>>> for halo in my_tree["tree"]:
...     halo["saucer_sections"] = np.random.random() # complicated analysis
...
>>> print (my_tree["tree", "saucer_sections"])
[ 0.33919263 0.79557815 0.38264336 0.53073945 0.09634924 0.6035886, ...
0.9506636 0.9094426 0.85436984 0.66779632 0.58816873] m**2
Analysis fields will be saved when the TreeNode objects that have been
analyzed are saved with save_arbor or save_tree.
>>> my_trees = list(a[:]) # all trees
>>> for my_tree in my_trees:
...     pass # do analysis...
...
>>> a.save_arbor(trees=my_trees)
Note that we do my_trees = list(a[:]) and not just my_trees = a[:].
This is because a[:] is a generator that will return a new
set of trees each time. The newly generated trees will not retain
changes made to any analysis fields. Thus, we must use list(a[:])
to explicitly store a list of trees.
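The difference can be reproduced with a toy generator in plain Python. ToyTree and iter_trees are hypothetical stand-ins for this example. They only mimic the behavior described above (fresh objects on every pass) and are not ytree classes.

```python
class ToyTree:
    """Hypothetical stand-in for a tree holding one analysis field."""

    def __init__(self, uid):
        self.uid = uid
        self.analysis = {"saucer_sections": 0.0}


def iter_trees(n_trees):
    """Yield freshly built trees on every call, like a generator-backed slice."""
    for uid in range(n_trees):
        yield ToyTree(uid)


# Changes made while looping over the generator are lost, because the next
# pass builds brand new objects.
for tree in iter_trees(3):
    tree.analysis["saucer_sections"] = 1.0
lost = [tree.analysis["saucer_sections"] for tree in iter_trees(3)]

# Storing the trees in a list keeps the same objects alive.
my_trees = list(iter_trees(3))
for tree in my_trees:
    tree.analysis["saucer_sections"] = 1.0
kept = [tree.analysis["saucer_sections"] for tree in my_trees]

print(lost)  # [0.0, 0.0, 0.0]
print(kept)  # [1.0, 1.0, 1.0]
```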
Re-saving Analysis Fields¶
All analysis fields are saved to sidecar files with the “-analysis” keyword appended to them. They can be altered and the arbor re-saved as many times as you like. In the very specific case of re-saving all trees and not providing a new filename or custom list of fields (as in the example above), analysis fields will be saved in place (i.e., overwriting the “-analysis” files). The conventional on-disk fields will not be re-saved as they cannot be altered.