Interview: Computational astrophysics with the yt project


This time we have for you an interview with Britton Smith and Matthew Turk, lead developers of the yt Project, a toolkit that offers a common language for developing astrophysical simulations and analysis tools. Astronomy professionals and amateurs will find this interview fascinating. Enjoy!


yt Project lead developers

Britton Smith and Matthew Turk


F4S: Please, give us a brief introduction about yourself.

Britton: My name is Britton Smith. I am a postdoc at Michigan State University in the department of Physics and Astronomy. Very broadly, I study the formation and evolution of structure in the Universe in a cosmological context. We live in a Universe that is made up of about 5% normal matter, 20% dark matter (which we know very little about), and 75% dark energy (which we know even less about). The proportions of each of these ingredients determine exactly how and when structures like stars and galaxies form and how they interact with their surroundings.

I use a simulation code that simultaneously solves the equations of fluid dynamics, gravity, some chemistry, and the expansion of the Universe. Some of the specific areas of research I work on are star formation in the very early Universe, the evolution of the gas that exists between galaxies (known as the intergalactic medium), and the physics relating to galaxy clusters. I am also an active developer of yt and one of the very first users.

F4S: What is the yt Project?

Britton: The simulation code that I use is one of many in existence. Despite the significant overlap in the areas of research associated with these codes, each tends to have its own separate community of users and developers, its own file formats, and its own analysis tools. This has traditionally made any sort of cross-code research very difficult, or at least inconvenient.

The primary goal of the yt Project is to provide a common language for computational astrophysicists everywhere, regardless of the simulation code they use. The main component of the yt Project is the yt analysis toolkit, which is an open-source package for analyzing and visualizing astrophysical simulation data. In this package, we have attempted to incorporate all analysis functionality common to most environments. This includes slices, projections, volume renders, contour finding, multi-dimensional profiling, halo finding, and many other things. While working on a research project, we often need to create new pieces of analysis. When this happens, we do our best to add this to yt’s capabilities so that the next person who wants to do something similar won’t have to reinvent the wheel. Our goal is to make every yt user a yt developer.

While the feature list of yt is certainly a strength, a potentially greater strength is its ability to work with many different simulation codes. Currently, yt supports eight different codes. With yt, simulation data from any code looks virtually the same once it is in memory. As such, the main challenge for adding support to another code is reading the file format properly. When someone comes to us from an unsupported code, we do our best to work with them to build the necessary components and get them involved in the community. The odds are that they will bring with them functionality and ideas from their own corner that then will help make yt better for everyone, and that is the whole point.
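
The idea of one in-memory form behind several file formats can be illustrated with a small sketch. This is not yt's code or API: the two "codes", their text formats, and the names `read_code_a`, `read_code_b`, `load`, and `mean` are all made up for illustration. It only shows why supporting a new code mostly comes down to writing a reader.

```python
# Illustrative sketch only: the formats and function names are hypothetical,
# not part of yt.

def read_code_a(text):
    # "Code A" stores one "field=value,value,..." entry per line.
    fields = {}
    for line in text.strip().splitlines():
        name, values = line.split("=")
        fields[name.strip()] = [float(v) for v in values.split(",")]
    return fields


def read_code_b(text):
    # "Code B" stores a header row of field names, then one row per cell.
    rows = [line.split() for line in text.strip().splitlines()]
    header, cells = rows[0], rows[1:]
    return {name: [float(row[i]) for row in cells] for i, name in enumerate(header)}


# Supporting another code means registering one more reader here.
READERS = {"code_a": read_code_a, "code_b": read_code_b}


def load(code_name, text):
    """Return fields in one common in-memory form, whatever code wrote them."""
    return READERS[code_name](text)


def mean(fields, name):
    # Analysis is written once, against the common form.
    values = fields[name]
    return sum(values) / len(values)


data_a = load("code_a", "density=1.0,3.0\ntemperature=10.0,30.0")
data_b = load("code_b", "density temperature\n1.0 10.0\n3.0 30.0")
print(data_a == data_b)
print(mean(data_a, "density"), mean(data_b, "density"))
```

Once both inputs are in the same form, the `mean` routine neither knows nor cares which code produced the data.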


Image from Turk, Norman, & Abel (2010, The Astrophysical Journal Letters, 725, 140), available to everyone here: http://lanl.arxiv.org/abs/1010.6076


F4S: Why and when did the yt Project come to be?

Britton: The yt analysis toolkit first came into existence in late 2006 and at the time was the creation of a graduate student in astrophysics at Stanford named Matthew Turk. Matt, who is now a postdoc at Columbia University, was using the same simulation code as me. At the time, analysis for this code existed in the form of separate, disconnected pieces of software, each invented to perform a single task. There was one tool for projections, one for slices, one for profiling cosmological halos, etc. These tools were written in different languages, read and stored data differently, and had no means to talk to one another. It was not impossible to, say, make an image of every halo in a box, but you would have had to write yet another script that would have taken the output of one code and fed it into another.

The data format of our simulation code was also quite complicated, with multiple files containing information, such as densities and temperatures, for chunks of the computational domain. The format of the data was a reflection of the computation that had been performed to create it, i.e., the number of processors used and the total number of grid cells in the box, but was not related in nearly any way to physical objects being simulated, such as a star forming cloud or a galaxy. Matt’s idea was to put the data reading behind the scenes and allow the user to work with more physically motivated objects, such as spheres, disks, or cubes. For example, one could tell yt to create a sphere in the center of the box with a radius of 10 light years.

Then, if you wanted to see the densities of all the grid cells within the sphere, yt would figure out what files to get those from and just give them to you. You could do a projection, locate contours, or compute an average quantity of that sphere just as easily as you could for a cube. From there, creating new analysis routines suddenly became much easier because all one really had to think about was the analysis itself.
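
A minimal sketch of that idea, in plain Python rather than yt itself, might look like the following. The chunk layout, the `Sphere` class, and its method names are assumptions made for illustration and do not reflect yt's actual interface. The point is only that the user asks for a sphere and never touches the chunks directly.

```python
import math

# Hypothetical layout: each chunk holds the cells computed by one processor,
# as (x, y, z, density) tuples. Positions are in light years.
chunks = {
    "chunk_0": [(1.0, 0.0, 0.0, 2.0), (12.0, 0.0, 0.0, 5.0)],
    "chunk_1": [(0.0, 3.0, 4.0, 4.0), (0.0, 9.0, 9.0, 7.0)],
}


class Sphere:
    """A physically motivated selection: every cell within `radius` of `center`."""

    def __init__(self, chunks, center, radius):
        self.chunks = chunks
        self.center = center
        self.radius = radius

    def densities(self):
        # This sketch simply walks every chunk; a real toolkit would first
        # skip chunks that cannot overlap the sphere.
        selected = []
        for name in sorted(self.chunks):
            for x, y, z, rho in self.chunks[name]:
                if math.dist((x, y, z), self.center) <= self.radius:
                    selected.append(rho)
        return selected

    def mean_density(self):
        values = self.densities()
        if not values:
            raise ValueError("sphere contains no cells")
        return sum(values) / len(values)


# A sphere at the center of the box with a radius of 10 light years.
sphere = Sphere(chunks, center=(0.0, 0.0, 0.0), radius=10.0)
print(sphere.densities())
print(sphere.mean_density())
```

A cube or disk object would differ only in the test applied to each cell, which is why new analysis can be written once and reused across shapes.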

Matt shared this toolkit with his colleagues, including me. As we became more familiar with yt, we started to contribute to its capabilities, and a small group of users and developers began to take shape. In mid-2008, Matt was contacted by Jeff Oishi, a postdoc at UC Berkeley working on astrophysical simulations using a completely different code. Matt and Jeff worked together to add yt support for Jeff’s simulation code. Soon after this, it became clear that yt could be more than a customized toolkit for one simulation code and a small group of users. We did not call it that yet, but that is probably about the time that the yt analysis toolkit became the yt project. And in this way, yt has grown as a community of user/developers. The first peer-reviewed journal article to use yt for its analysis was published in January of 2009. As of November 2011, the number of such articles has reached nearly thirty.

F4S: In which language(s) and platform(s) is the project developed?

Britton: yt is primarily written in Python with some of the more computationally intensive routines written in Cython or C for better performance. The main reasons for choosing Python are that it is free and that it has a large number of existing modules, such as NumPy for optimized array and vector math, Matplotlib for plotting and visualization, mpi4py for parallelization, and h5py for HDF5 file I/O. It is my opinion that the availability of these packages is also due mainly to the fact that Python is free for everyone.

yt is developed and runs on a wide array of platforms, from our laptops to large supercomputers and will work just about anywhere Python can be installed.


Image from Skillman et al. (2011, The Astrophysical Journal, 735, 96), available to everyone here: http://lanl.arxiv.org/abs/1006.3559

F4S: Does the yt Project have sponsors?

Britton: The yt community consists mostly of astronomers and astrophysicists working and studying at academic institutions. Many of us are funded by federal grant agencies, such as the NSF and NASA. We have no private sponsors.

Most of us are funded by scientific grants where the primary focus is science research. Included in most of those projects is time to develop the tools required for the work.

Also, the FLASH Center at the University of Chicago has recently supported us by offering to host our first ever yt Workshop in January 2012.

F4S: How many users do you estimate the yt Project has?

Britton: It is difficult to know exactly how many users there are since we do not track downloads of the source. We have a very active mailing list where users can seek help and that gives us a decent estimate. Currently, there are 108 subscribers on the users list.

F4S: Do you know where the yt Project is used?

Britton: Users are almost exclusively at academic institutions.

F4S: How many team members does the project have?

Britton: We also have a yt developers list with 33 subscribers. Out of those, about a dozen people are active developers. As always, anyone interested in getting involved is welcome.

F4S: In what areas of the yt Project development do you currently need help?

Britton: We are always looking for people who work with currently unsupported simulation codes who are interested in working with yt. In general, if you have created or are interested in creating some form of simulation analysis, we would love to have you incorporate it into yt. Other ways to get involved include helping out on the mailing list, hanging out in IRC, contributing documentation, or simply sharing scripts you have written.

F4S: How can people get involved with the project?

Britton: People should visit http://yt-project.org/ and sign up for the users and developers email lists. We also have an IRC channel where people can come to get live help. Information on these can be found in the Community section on the web page. We would be more than happy to have people show up and ask, “Is there anything I can help with?” If people have written other useful scripts or codes that are not related to the yt analysis toolkit, they can share them at hub.yt-project.org/.

F4S: What features are in the roadmap?

Britton: Below is a very incomplete list of features we would like to see in yt someday. There are certainly others, and this changes quite a bit.

Thanks to Matt Turk for helping me compile this.

- Non-cartesian coordinates
- Support for more simulation codes (always!)
- Better support for particle-based hydro simulations. yt is currently stronger with grid-based codes.
- Better in situ visualization, including a library against which simulation codes can link for IO and in situ viz.
- Interactive volume renderer.
- Microphysical solvers for things like radiative cooling and chemical species.
- Simulated observations: simulation to telescope to image pipeline.
- Data sharing hooks, to share reduced data products between researchers.
- Selecting and manipulating *conceptual* objects instead of just geometric objects. When you select something and identify it as a galaxy, the code should be able to provide you with richer operations that are easier to accomplish. This is akin to baking analysis recipes into the interface. So if I select a region and call it a galaxy, it should then immediately be able to generate SEDs without my having to go through the entire recipe manually.



Reason: yt browser-gui (video by Cameron Hummels, another yt developer and graduate student at Columbia U)


F4S: Which projects, blogs or sites related to open source software for science can you recommend?

Britton: I have again channeled Matt for this one.

He says, “The SAGE project (sagemath.org) is a shining example of open source software for math and science. They’re pretty much the project I admire the most.”

We are big fans of NumPy and Matplotlib, obviously, and we think SciPy is pretty great, too. We also like VisIt, ParaView, and R.

Here’s Matt again:

For open science blogs I read Cameron Neylon’s blog: http://cameronneylon.net/category/blog/

Victoria Stodden’s blog: http://blog.stodden.net/

Planet SciPy is good too: http://planet.scipy.org/

F4S: Why do you consider free/libre open source software important for the advancement of your field?

Britton: We owe a whole lot to open source software. In addition to yt’s dependencies, without which none of this would be possible, text editors, compilers, and countless other open source tools have helped push astronomy forward for decades.

In addition, one of the fundamentals of science is the reproducibility of results. If findings are to be accepted by the scientific community, then research methods simply cannot be trade secrets. For computation, transparency in methodology can only mean open source.

F4S: Is there any other topic you would like our readers to know about?

Britton: Also, as mentioned earlier, we are planning to hold a yt workshop in January 2012 in Chicago, hosted by the FLASH Center. This will be aimed at getting new users up to speed with yt and involved in development as well. We will be announcing this officially soon.

More information can be found at: http://blog.yt-project.org/announcing-the-2012-yt-workshop

F4S: Where can people contact you and learn more about the yt Project?

Britton: In general, the best way to contact us is through the users and developers email lists and the IRC channel. Those can be found in the Community section on the website. Below are some additional places yt can be found.

F4S: Thanks Britton and Matt for letting us know more about you and the yt project.

