Friday, February 15, 2013

A short review of best computing practices for scientists

Best Practices for Scientific Computing is a good read if you, like me, are a scientist who frequently programs but never received proper training in software development. It enumerates practices that help improve the productivity of coders and the reusability of code written in an academic environment. The techniques on this list are well known to software development professionals and have been refined over many years.

Notable points and suggestions from the article include:
  1. 90% or more of scientists are primarily self-taught programmers
  2. All aspects of software development should be broken into tasks roughly an hour long
  3. The provenance of data is a detailed record of the code and operations needed to recreate that data and the code's output
  4. Programmers should work in small steps with frequent feedback and course corrections
  5. Use assertions (executable documentation) to avoid mistakes in code
  6. Scientists should rewrite complicated code so that it is simpler for a human to read, instead of adding paragraphs of comments explaining how it works
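
To make point 5 concrete, here is a minimal sketch in Python of what assertions as executable documentation might look like. The function `normalize` and its checks are my own hypothetical example, not code from the paper.

```python
def normalize(counts):
    """Scale a list of non-negative counts so that they sum to 1.

    Hypothetical example: the name and the checks are illustrative only.
    """
    # Preconditions: state what the function expects from its caller.
    assert len(counts) > 0, "counts must not be empty"
    assert all(c >= 0 for c in counts), "counts must be non-negative"

    total = sum(counts)
    assert total > 0, "at least one count must be positive"

    fractions = [c / total for c in counts]

    # Postcondition: state what the function promises in return.
    assert abs(sum(fractions) - 1.0) < 1e-9, "fractions must sum to 1"
    return fractions


if __name__ == "__main__":
    print(normalize([1, 1, 2]))
```

Each assertion tells the reader what the code assumes, and unlike a comment it fails loudly when the assumption is violated. Note that Python skips `assert` statements when run with the `-O` flag, so they are a development aid rather than a replacement for input validation.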

While largely approachable, the paper still suffers from a slight overuse of software development jargon. As a result, the importance of some of the authors' recommendations escapes me.