Tuesday, 21 November 2006

Mistakes

Apparently a program to process NMR spectra is like any other computer program: eventually it consumes ink and paper and, if the user is happy with the printout, he quits the program. The fundamental difference, in the NMR case, is that the user is forced to accept the result even if it is slightly wrong. If, for example, there is a very small error in the calculation of integrals, or the spectrum and the scale are misaligned by 1 mm, or the relaxation time has not been correctly estimated, how can the user recognize the mistake? Comparing the results of two different programs is easy but time-consuming, and not everybody has two programs to compare. This is also what I do as a programmer. Sometimes, when I can't find a second program, I write two different algorithms and test them against each other. Some subtle differences are really difficult to discern; I remember a couple of cases in which it took months. Version 1 of SwaN-MR (the version that nobody used) calculated wrong integrals. The mistake became evident, however, as soon as the program was used in practice.
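To make the idea of testing two algorithms against each other concrete, here is a minimal sketch in Python. It is only an illustration, not the code of SwaN-MR or iNMR: the function names, the synthetic Lorentzian peak and the tolerance are all assumptions chosen for the example. The same peak is integrated with the trapezoid rule and with Simpson's rule, and the two results are compared.

```python
import math

def lorentzian(x, x0=0.0, width=1.0):
    """Unit-area Lorentzian line; `width` is the full width at half height."""
    hw = width / 2.0
    return (hw / math.pi) / ((x - x0) ** 2 + hw ** 2)

def integrate_trapezoid(ys, dx):
    """First algorithm: composite trapezoid rule on equally spaced points."""
    return dx * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

def integrate_simpson(ys, dx):
    """Second algorithm: composite Simpson rule (needs an odd number of points)."""
    if len(ys) < 3 or len(ys) % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of points (>= 3)")
    odd = sum(ys[1:-1:2])   # points 1, 3, 5, ... get weight 4
    even = sum(ys[2:-1:2])  # points 2, 4, 6, ... get weight 2
    return dx / 3.0 * (ys[0] + ys[-1] + 4.0 * odd + 2.0 * even)

def cross_check(n_points=2001, half_range=50.0, tolerance=1e-4):
    """Integrate the same synthetic peak both ways and compare the results.

    The peak, the range and the tolerance are arbitrary example values.
    """
    dx = 2.0 * half_range / (n_points - 1)
    ys = [lorentzian(-half_range + i * dx) for i in range(n_points)]
    a = integrate_trapezoid(ys, dx)
    b = integrate_simpson(ys, dx)
    return a, b, abs(a - b) <= tolerance

if __name__ == "__main__":
    a, b, agree = cross_check()
    print("trapezoid:", a, "simpson:", b, "agree:", agree)
```

Agreement between the two routines does not prove that either is right, since both could share the same wrong input, but a disagreement is a sure sign that something needs attention. In this toy case the true area is also known analytically, which is a luxury that real spectra do not offer.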
Version 0.1 of iNMR drew 1D spectra shifted by 1 spectral point (in many cases less than 0.1%). When the scale was calibrated (against the... shifted spectrum), the error was perfectly compensated, because it was constant. There was no way to notice or demonstrate the bug, because the spectrum and the scale were perfectly aligned, until I wrote the peak-picking function. The output of peak-picking was consistently shifted by 1 point, and in this way the bug was revealed. In that case it was a benign bug with no consequences. My experience is that there is no reason to trust a programmer. The user should find the time to personally test the software, at least those parts that are essential for his work. Remember that programs come with no warranty (how could it be different?).
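The mechanism can be reproduced with a toy model. The sketch below is not the actual iNMR code: the function names, the number of points, the spectral width and the position of the reference peak are all hypothetical. It only shows how a constant one-point shift in the drawing routine is hidden by calibration and then exposed by a second routine that converts indices to ppm correctly.

```python
def ppm_of_index(i, left_ppm, sw_ppm, n):
    """Correct conversion: point i of n, spectral width sw_ppm, left edge left_ppm."""
    return left_ppm - i * sw_ppm / n

def buggy_drawn_ppm(i, left_ppm, sw_ppm, n):
    """Hypothetical drawing bug: point i is drawn where point i + 1 belongs."""
    return ppm_of_index(i + 1, left_ppm, sw_ppm, n)

def calibrate(ref_index, ref_ppm, sw_ppm, n, position_fn):
    """Choose left_ppm so that position_fn puts the reference peak at ref_ppm."""
    offset = position_fn(ref_index, 0.0, sw_ppm, n)
    return ref_ppm - offset

if __name__ == "__main__":
    n, sw, tms_index = 16384, 12.0, 14000  # example values only

    # The user calibrates on what he sees, i.e. on the shifted drawing.
    left = calibrate(tms_index, 0.0, sw, n, buggy_drawn_ppm)

    # On screen everything looks perfect: the reference peak sits at 0 ppm.
    print("drawn :", buggy_drawn_ppm(tms_index, left, sw, n))

    # A peak-picking routine using the correct conversion is off by one point.
    print("picked:", ppm_of_index(tms_index, left, sw, n))
    print("1 point =", sw / n, "ppm")
```

The two routines disagree by exactly one point, which is how a second, independent piece of code can reveal an error that no amount of looking at the plot would show.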
Assuming that the NMR software is perfect, the printed output can still be misleading. Is it possible to tell by inspection whether the processing has been performed correctly? The first thing that I observe is the shape of the peaks. It tells whether the sample has been poorly shimmed and whether the user relied exclusively on automatic phase correction or, instead, spent the canonical minute on manual phase correction. These things are not as important as baseline correction or accurate referencing against TMS, and are unrelated to those other fundamental steps, but they are diagnostic hints about the experience and the patience of the user. Including the TMS peak (or a solvent peak) in the peak-picking can be useful to demonstrate that the scale has been correctly referenced. Including the graphical integrals or, better, integrating pieces of pure baseline, can show how flat the latter is. These expedients are not enough, however: small deviations of the integral from zero are not graphically evident, and a single baseline sample is only a partial demonstration. The TMS position deserves another article; I hope to write it soon.
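The limits of a single baseline sample can be shown with a few lines of Python. This is a deliberately artificial sketch: the function names, the tilted baseline and the three regions are assumptions made up for the example, and the "integral" is a plain sum of points rather than whatever a real NMR program computes.

```python
def region_integral(ys, start, stop, dx=1.0):
    """Sum of the points ys[start:stop], scaled by the point spacing dx."""
    return dx * sum(ys[start:stop])

def baseline_report(ys, regions, dx=1.0):
    """Integrate several signal-free regions; a flat baseline gives values near zero."""
    return [region_integral(ys, start, stop, dx) for start, stop in regions]

def tilted_baseline(n=1000, slope=0.001):
    """Synthetic signal-free trace that drifts linearly and crosses zero mid-spectrum."""
    centre = (n - 1) / 2.0
    return [slope * (i - centre) for i in range(n)]

if __name__ == "__main__":
    regions = [(0, 100), (450, 550), (900, 1000)]  # left, middle, right
    for (start, stop), value in zip(regions, baseline_report(tilted_baseline(), regions)):
        print(f"points {start}-{stop}: integral = {value:+.3f}")
```

In this example the middle region integrates to zero although the baseline is clearly tilted, while the two outer regions give large values of opposite sign. Sampling the baseline in one place only would have certified a spectrum that does not deserve it.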
Unfortunately the above expedients are not elegant and reduce the clarity and readability of the printed spectrum. What happens, however, if you discover, from the printout, that the processing is inaccurate? When you discover a grammar error in the draft of an article you can correct it and print again. Can you do the same, with the spectrum, in the absence of the raw data? Even if you have an accurate log file of all processing operations, what's it for, if you are not allowed to reprocess the spectrum?
The printed spectrum cannot be replaced because all magnetic and optical media have been a failure (if their purpose was to save the information for posterity). The CD seems to be more durable than magnetic media, but it is already being replaced by the DVD and, most of all, we have no proof that our CDs will still be readable 20 years from now. Store the spectra on paper, but inspect them on screen, where it is possible to check the quality of processing. The point is that most casual users lack the basic know-how to judge the quality of a spectrum. The apparent complexity of 2D spectroscopy effectively stops inexperienced users, but in the more familiar 1D environment they feel free to process spectra as they like. Processing 2D spectra is like using a word processor: the effect of a mistake is apparent even to the uninitiated. The common mistake, in 1D spectroscopy, is to forget that it is a branch of science and that it must be approached with at least a minimal knowledge of the field. I admit the sacrosanct right of the user to ignore the manual, but he must already know the steps of routine processing, even when the processing is completely automatic. If he knows how NMR processing works, he can learn the program by trial and error. It's not his fault if manuals are boring. My personal advice: read them!
