Today we bring you an interview with the developers of the Sage mathematics system. Note that the answers were given as a team. Sage is an open source project that provides an environment very similar to Matlab, Maple and other well-known proprietary mathematics systems. It is widely used in academia and scientific research. Version 5.0 was released in mid-May 2012. Enjoy the interview!
F4S: Please, give us a brief introduction about the Sage team.
We are more than 200 active developers, working worldwide on Sage and connected via the Internet. Most of us work at universities and are mathematicians by profession, from students and PhDs up to professors, but some of us come from other scientific fields.
Here is an overview of where we live and our associations with universities or companies: http://sagemath.org/development-map.html
Sage is an open-source mathematical software environment. It can do calculations in basic mathematics, plot functions, and perform calculus and computer algebra. It also supports high-level mathematics and encompasses a variety of topics such as cryptography, numerical computations, commutative algebra, group theory, combinatorics, graph theory, exact linear algebra and much more. The best way of thinking about Sage is to imagine a huge, but still well organized toolbox. Just as Firefox is an alternative to Internet Explorer or LibreOffice is an alternative to Microsoft Office, Sage is a comprehensive open source alternative to Magma, Maple, Mathematica, and Matlab.
The field of mathematics is very old and encompasses many very different topics. Sage aims to provide everything that mathematicians, scientists, researchers, and students need for their work or study. The basic concept is to combine many established software packages under one umbrella, but it also provides powerful and unique algorithms in its own library. It is hard to come up with one unique approach that suits beginners as well as experts. Sage tries to solve this and "is doing remarkably well at keeping a balance between ease-of-use for beginners and high-end users," as David Kohel once said.
Sage is also a programming environment that can be used in advanced research at the university level. As a programming language, it is essentially an extension of Python. Sage welcomes curious students and researchers who want to examine its source code and learn how its calculations are done. Sage fosters a community of developers and encourages them to take part in its development. A community of people that not only uses but also participates in development is key to a healthy ecosystem in the field of mathematical software.
There are three basic ways that the user can interact with Sage: a web interface, accessible through a web browser while Sage is running on your local machine or on a server; a rich command-line interface; and a Python library. Additionally, it is possible, for example, to embed Sage in LaTeX documents.
The guided tour gives some overview:
Sage is built on top of dozens of independent free/libre open source scientific software projects:
Read more key facts at http://sagemath.org/library-press.html
William Stein is the creator of Sage. In the following blogpost (and PDF) he outlines the history of how Sage was created, the reasons behind it and his personal experiences with other mathematical software systems.
The key points are that he started “Manin”, the precursor of Sage, around 2004. William’s frustration with proprietary mathematical software was his main motivation to create a viable open-source alternative. He was driven by his need for a software tool that he can use for his research and his teaching, and that his collaborators and students can freely use. The first release, “Sage” version 0.1, happened on 2005-02-28. Just a few months ago Sage 1.0 had its 6th anniversary.
The Sage code itself (as opposed to the many packages it uses) is written in Python and Cython (a compiled language that extends Python) and a tiny bit of C and C++. The almost 100 components of Sage are written in a mixture of languages: C, C++, Python, Assembly, Fortran, Lisp.
Sage itself acts as a complete, self-contained software distribution. It has minimal dependencies and installs all its tools in an isolated environment. That makes it possible to maintain a high level of quality across very different environments (several Linux distributions, Apple OS X, Solaris). Also, it is easy to develop inside that environment and all its components can be accessed directly.
There is no single dedicated sponsor, but we are very thankful for all kinds of sponsorship from the NSF, University of Washington, US DOD, Sun, Google, Microsoft, … and of course many private contributors.
An overview is here: http://www.sagemath.org/development-ack.html
The project benefits from contributions in several different ways.
Hardware: Server infrastructure at the University of Washington, used for serving the website, running the online Sage Notebooks, and used for advanced research in number theory and related topics.
Sage Days: Regular meetings of Sage developers, often sponsored by universities.
Mirror Network: The Sage software itself is several hundred megabytes in size, and a single release consists of several such builds. There are more than a dozen international servers mirroring Sage for free. Thanks to all the sysadmins and organizations out there helping Sage make its way around the globe.
Community Award: A yearly prize, funded by Jaap Spies, is given to outstanding members of the Sage community. See http://sagemath.org/development-prize.html
Unfortunately, there are no reliable numbers for our user base. We estimate about 7000 downloads per month; the number of active users is probably a few times that number. One reason this is hard to track is that a university or company may use one server installation to allow hundreds of students or employees to use Sage – few of whom may ever download it.
The statistics for our main mailing lists are as follows:
- sage-support: more than 2000 members and about 300 postings per month in the last year.
- sage-devel: nearly 1500 members and between 400 and 1000 postings per month in the last year.
Sage has been used in over one hundred academic publications (articles, theses, books):
Here is an incomplete list for two years in the past:
From some feedback, we also know that Sage is used by corporations to interactively share calculations among co-workers.
As of December 2011, at least 244 people have actively contributed code.
Each of the last releases had roughly 100 or more active contributors. The last Sage release (version 5.0, May 2012) was done by 126 contributors: http://boxen.math.washington.edu/home/release/sage-5.0/sage-5.0.txt
Going a bit back into the past, 4.7.1 had 107, 4.7 had 93 and 3.6.2 from a year ago also had 100 contributors.
Frontend: the web interface. This is not only about the visual design and usability, which need some improvement, but also the entire server infrastructure.
Backend: Right now, almost all Sage developers are mathematicians with varying backgrounds in programming. It would, however, be good to have some people developing Sage who have a strong background in Unix programming.
Windows Platform: There is no native port to Microsoft Windows, because it is currently impossible to build it on that platform. The best way to distribute Sage on that platform is a solid, working virtualized image, or some other kind of emulation. Streamlining this for the average user will definitely help Sage’s adoption.
After you have downloaded Sage, you automatically have the source code of the core Sage library. If you are curious, you can immediately start inspecting or even modifying it. Just dig into the source code and browse the various Python files.
One simple way to get involved is to start using Sage and send us your comments about what you like, what you don’t like, and any errors or bugs that you encounter.
There is a smooth transition from “using” to “developing” Sage, because:
- The interface and primary programming language of Sage is Python. [[footnote: There is a preparser, which does some trivial string substitutions for some inputs]]. The key point to emphasize is that the interface language is not specific to Sage. This is quite different from all the other major mathematical software systems. Therefore, once you have managed to write a few lines of code in Sage, e.g. using a control structure like a for loop, you are already developing in Sage.
- When you install the source distribution and build it for yourself, you truly have everything there is to have, including the project’s history of the last years. Everything is already set up to record your modifications as a new commit and submit it to our bugtracker.
- You can code Python scripts using Sage as a Python library. There is almost no barrier to adding this code to the core part of Sage, besides our style guide and rules for inclusion.
- Also, you are indirectly contributing to Sage if you contribute to one of its bundled projects. Those improvements will get incorporated into Sage once we update the respective dependency.
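To give a feel for what the footnote's "trivial string substitutions" can look like, here is a toy sketch in plain Python. It is not Sage's actual preparser: the function name `toy_preparse` is hypothetical, and the two rules shown (turning `^` into `**` and wrapping integer literals in `Integer(...)`) are only a simplified illustration that ignores string literals and many other cases.

```python
import re


def toy_preparse(line):
    """Toy illustration of a preparser; NOT Sage's real implementation.

    Rule 1: rewrite the caret as Python's power operator.
    Rule 2: wrap bare integer literals in Integer(...), leaving
            identifiers such as x2 and floats such as 1.5 untouched.
    String literals are not treated specially in this sketch.
    """
    line = line.replace("^", "**")
    return re.sub(r"(?<![\w.])(\d+)(?![\w.])", r"Integer(\1)", line)


print(toy_preparse("2^3 + x"))
# prints: Integer(2)**Integer(3) + x
print(toy_preparse("for i in range(3): print(i^2)"))
# prints: for i in range(Integer(3)): print(i**Integer(2))
```

Everything the substitutions leave alone, such as the `for` loop above, is ordinary Python, which is the point made in the first bullet.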
- sage-devel mailing list: https://groups.google.com/forum/?hl=en#!forum/sage-devel
- Sage Days: http://wiki.sagemath.org/Workshops
These are gatherings of Sage developers for exchanging ideas and for extensive coding sprints.
- Development guide: http://www.sagemath.org/doc/developer/index.html
It explains the development process in detail and defines the quality standards for including new software into Sage.
Developing Sage is a war on many fronts. One major feature to highlight is a significantly improved notebook. As of today, you can try http://test.sagenb.org as a next generation version of http://www.sagenb.org/.
Apart from that, we disclose all our current bugs and new code contributions. A list of tickets that need review, which are the ones ready to be included in the next version of Sage, can be found here: http://trac.sagemath.org/sage_trac/report/10. They range from fixing small bugs to supporting new data formats and replacing underlying libraries.
F4S: Which projects, books, blogs or sites related to open source software for science can you recommend?
TeX/LaTeX is perhaps one of the earliest and most broadly-used FLOSS projects in science.
R Project for Statistical Computation http://www.r-project.org/ is somewhat similar to, and a component of, Sage. Research in the field of statistics is almost always demoed via R scripts in papers or at conferences, and it is becoming a leader in analytics, with a number of private companies providing services.
F4S: Why do you consider free/libre open source software important for the advancement of your field?
In mathematics, all new theorems are based on existing proven theorems. The scientific method behind this enormous project demands that every theorem can be verified by a third party at any time. Relying on calculations done by closed source software contradicts this, because it is nearly impossible for anyone without the software to independently verify the computation. Opening up the software to everyone is the best we can do to enable this. Additionally, this is a great way of sharing insight into how those advanced calculations are actually done.
We should note that openness of mathematical proofs is not just a meaningless technicality — like the small print on a bank card agreement. Reading and understanding proofs is one of the main ways mathematicians learn how things work, and it’s a key step in generating new ideas which lead to new mathematics. When we contribute to mathematics, it is important to contribute both the results and the methods. When software plays an essential role in research, this is a valuable part of the public contribution. If other mathematicians can’t learn how it works, modify it, and use it for new purposes, then there is a serious loss of value.
Benefits of free software are often described in analogy with the benefits of a free recipe (free to use, modify, distribute). This analogy also works quite well with the benefits of free mathematical proofs.
Second topic: Unifying the ecosystem. There are several different commercial software tools. Software libraries that one research group develops on top of one of them are unusable by other research groups. Students and universities don’t have the money to finance all of this.
A unified ecosystem also has the advantage that anyone can better understand the code written by others. This is already the case in the field of statistics regarding R.
Third topic: Quality of research. If you consider including your code directly into the code of the whole Sage project, it will be part of it for the foreseeable future. This means the code must meet a certain level of quality and there needs to be a set of tests for each part of the contributed code. In preparation for each new release of Sage, we make certain that all those tests pass on all supported systems. Therefore, your code continues to be functional and is probably actively maintained, even after you have stopped working on it.
This is very different from publishing half-working code once on your private website and then no longer maintaining it. Such code will likely stop being functional, and since it is only on one website, it is also hard for other researchers to discover.
Fourth topic: Accessibility (Cost). We want the code we write to be usable by students or researchers without access to large department budgets. People who can’t (or don’t want to) afford an expensive software license are restricted from using work developed for that software. This is a restriction on users and on developers. Developing and emerging countries are a related example. Just think about the importance of the freely accessible Wikipedia for education via projects such as OLPC. The very same holds true for higher education and, in our case, for the accessibility of advanced mathematical software.
Example: the African Institute for Mathematical Sciences (http://www.aims.ac.za/) is interested in Sage.
William Stein’s motivation behind creating Sage also outlines some important points: http://sagemath.blogspot.com/2009/12/mathematical-software-and-me-very.html
Sage is not at the end of its journey, and there are many missing pieces, but there are already areas where Sage outperforms commercial software environments. Sage is also a quest for really fast algorithms. As an example, here is one benchmark about calculating Bernoulli numbers from 2008:
10^7 took them 5 days
http://arxiv.org/abs/0807.1347 (see Table 1)
10^7 took him with his code, now in Sage, a bit less than one hour.
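For readers unfamiliar with the quantity being benchmarked, here is a minimal sketch of what "calculating Bernoulli numbers" means, using only the Python standard library. It uses the textbook recurrence with the convention B_1 = -1/2 and is not the algorithm behind either timing above; it is far too slow for indices anywhere near 10^7. The function name `bernoulli_numbers` is made up for this illustration.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n):
    """Return [B_0, ..., B_n] as exact fractions (convention B_1 = -1/2).

    Uses the recurrence sum_{k=0}^{m} C(m+1, k) * B_k = 0 for m >= 1,
    solved for B_m. This is a naive illustration only.
    """
    b = [Fraction(1)]
    for m in range(1, n + 1):
        s = sum(comb(m + 1, k) * b[k] for k in range(m))
        b.append(-s / (m + 1))
    return b


print(bernoulli_numbers(4))
# prints: [Fraction(1, 1), Fraction(-1, 2), Fraction(1, 6), Fraction(0, 1), Fraction(-1, 30)]
```

The naive recurrence needs all earlier values to produce one new one, which is why specialized algorithms make such a large difference in benchmarks like this one.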
By combining a diverse range of products, Sage is an ideal environment for mathematical visualizations. These require sophisticated mathematics, a full-featured programming language, and high-quality graphics. Sage provides all of these with a uniform interface. Examples of such visualizations are:
- Hopf fibration animation: http://www.nilesjohnson.net/hopf.html
- Seven-Manifolds: http://www.nilesjohnson.net/seven-manifolds
- Sage graphics tour: http://www.sagemath.org/tour-graphics.html
- Groebner Fan, by Marshal Hampton, via gfan:
sage-support mailing list: http://groups.google.com.au/group/sage-support
Ask Sage Q&A: http://ask.sagemath.org
IRC: #sagemath on freenode.net: irc://irc.freenode.net/#sagemath
This is a beginner’s guide with clear step-by-step instructions, explanations, and advice. Each concept is illustrated with a complete example that you can use as a starting point for your own work. If you are an engineer, scientist, mathematician, or student, this book is for you. To get the most from Sage by using the Python programming language, we’ll give you the basics of the language to get you started. For this, it will be helpful if you have some experience with basic programming concepts.