lishcenthetelevisyen: 2010

Thursday, 9 December 2010

Readable Tests: Separating intent from implementation

Very recently, I was working on a test class like this:

public class AnalyticsExpirationDateManagerTest extends TestCase {

 private static final long ONE_HOUR_TIMEOUT = 1000 * 60 * 60;
 private static final long TWO_HOUR_TIMEOUT = ONE_HOUR_TIMEOUT * 2;
 
 private Map<Parameter, Long> analyticsToTimeout;
 private long defaultTimeout;
 
 private Parameter minTimeoutParam;
 @Mock private CacheKeyImpl<Parameter> cacheKey;

    @Override
    protected void setUp() throws Exception {
     MockitoAnnotations.initMocks(this);
     
     this.minTimeoutParam = new Parameter("minTimeout", "type");
     
     when(cacheKey.getFirstKey()).thenReturn(minTimeoutParam);
     
     this.analyticsToTimeout = new HashMap<Parameter, Long>();
     this.defaultTimeout = 0;
    }
 
 public void
 testGetExpirationDateWhenAnalyticsToTimeoutsAndCacheKeyAreEmpty() {
  AnalyticsExpirationDateManager<Long> manager = 
    new AnalyticsExpirationDateManager<Long>(analyticsToTimeout, defaultTimeout);
  Date date = manager.getExpirationDate(cacheKey, 0L);
  assertNotNull(date);
 }
 
 public void 
 testGetExpirationDateWithMinimunTimeoutOfOneHour() {
  this.analyticsToTimeout.put(this.minTimeoutParam, ONE_HOUR_TIMEOUT);
  Collection<Parameter> cacheKeysWithMinTimeoutParam = new ArrayList<Parameter>();
  cacheKeysWithMinTimeoutParam.add(this.minTimeoutParam);
  when(this.cacheKey.getKeys()).thenReturn(cacheKeysWithMinTimeoutParam);
  
  AnalyticsExpirationDateManager<Long> manager = 
   new AnalyticsExpirationDateManager<Long>(analyticsToTimeout, defaultTimeout);
  Date date = manager.getExpirationDate(cacheKey, 0L);

  assertNotNull(date);
  Calendar expirationDate = Calendar.getInstance();
  expirationDate.setTime(date);
  
  Calendar currentDate = Calendar.getInstance();
  
  // Check if expiration date is one hour ahead current date. 
  int expirationDateHour = expirationDate.get(Calendar.HOUR_OF_DAY);
  int currentDateHour = currentDate.get(Calendar.HOUR_OF_DAY);
  assertTrue(expirationDateHour - currentDateHour == 1);
 }
 
 public void 
 testGetExpirationDateWhenCacheKeyIsNullAndDefaultTimeoutIsOneHour() {
  CacheKeyImpl<Parameter> NULL_CACHEKEY = null;
  AnalyticsExpirationDateManager<Long> manager = 
   new AnalyticsExpirationDateManager<Long>(analyticsToTimeout, ONE_HOUR_TIMEOUT);
  Date date = manager.getExpirationDate(NULL_CACHEKEY, 0L);
  
  assertNotNull(date);
  Calendar expirationDate = Calendar.getInstance();
  expirationDate.setTime(date);
  
  Calendar currentDate = Calendar.getInstance();
  
  // Check if expiration date hour is the same of current date hour.
  // When cache key is null, system date and time is returned and default timeout is not used.
  int expirationDateHour = expirationDate.get(Calendar.HOUR_OF_DAY);
  int currentDateHour = currentDate.get(Calendar.HOUR_OF_DAY);
  assertTrue(expirationDateHour - currentDateHour == 0);
 }
 
 public void 
 testGetExpirationDateWithDefaultTimeout() {
  // Default timeout is used when no time out is specified.
  Collection<Parameter> cacheKeysWithoutTimeoutParam = new ArrayList<Parameter>();
  cacheKeysWithoutTimeoutParam.add(new Parameter("name", "type"));
  when(this.cacheKey.getKeys()).thenReturn(cacheKeysWithoutTimeoutParam);

  AnalyticsExpirationDateManager<Long> manager = 
   new AnalyticsExpirationDateManager<Long>(analyticsToTimeout, ONE_HOUR_TIMEOUT);
  Date date = manager.getExpirationDate(cacheKey, 0L);
  
  assertNotNull(date);
  Calendar expirationDate = Calendar.getInstance();
  expirationDate.setTime(date);
  
  Calendar currentDate = Calendar.getInstance();
  
  // Check if expiration date is one hour ahead current date. 
  int expirationDateHour = expirationDate.get(Calendar.HOUR_OF_DAY);
  int currentDateHour = currentDate.get(Calendar.HOUR_OF_DAY);
  assertTrue(expirationDateHour - currentDateHour == 1);
 }
 
 public void 
 testGetExpirationDateWhenMinTimeoutIsSetAfterCreation() {
  AnalyticsExpirationDateManager<Long> manager = 
   new AnalyticsExpirationDateManager<Long>(analyticsToTimeout, ONE_HOUR_TIMEOUT);
  manager.setExpirationTimeout(this.minTimeoutParam.getName(), TWO_HOUR_TIMEOUT);
  
  Date date = manager.getExpirationDate(cacheKey, 0L);
  
  assertNotNull(date);
  Calendar expirationDate = Calendar.getInstance();
  expirationDate.setTime(date);
  
  Calendar currentDate = Calendar.getInstance();
  
  // Check if expiration date is two hour ahead current date. 
  int expirationDateHour = expirationDate.get(Calendar.HOUR_OF_DAY);
  int currentDateHour = currentDate.get(Calendar.HOUR_OF_DAY);
  assertTrue("Error", expirationDateHour - currentDateHour == 2);
 }
 
}

Quite frightening, isn't it? Very difficult to understand what's going on there.

The class above covers 100% of the class under test and all the tests are valid tests, in terms of what is being tested.

Problems

There are quite a few problems here:
- The intent (what) and implementation (how) are mixed, making the tests very hard to read;
- There is quite a lot of duplication among the test methods;
- There is also a bug in the test methods when comparing dates, trying to figure out how many hours one date is ahead of the other. When running these tests in the middle of the day, they work fine. If running them between 22:00hs and 00:00hs, they break. The reason is that the hour calculation does not take into consideration the day.

Making the tests more readable

Besides testing the software, tests should also be seen as documentation, where business rules are clearly specified. Since the tests here are quite messy, understanding the intention and detecting bugs can be quite difficult.

I've done quite a few refactorings to this code in order to make it more readable, always working in small steps and constantly re-running the tests after each change. I'll try to summarise my steps for clarity and brevity.

1. Fixing the hour calculation bug

One of the first things that I had to do was to fix the hour calculation bug. In order to fix the bug across all test methods, I decided to extract the hour calculation into a separate class, removing all the duplication from the test methods. Using small steps, I took the opportunity to construct this new class called DateComparator (yes, I know I suck naming classes) using some internal Domain Specific Language (DSL) techniques.

public class DateComparator {
 
 private Date origin;
 private Date target;
 private long milliseconds;
 private long unitsOfTime;
 
 private DateComparator(Date origin) {
  this.origin = origin;
 }
 
 public static DateComparator date(Date origin) {
  return new DateComparator(origin);
 }
 
 public DateComparator is(long unitsOfTime) {
  this.unitsOfTime = unitsOfTime;
  return this;
 }
 
 public DateComparator hoursAhead() {
  this.milliseconds = unitsOfTime * 60 * 60 * 1000;
  return this;
 }
 
 public static long hours(int hours) {
  return hoursInMillis(hours);
 }
 
 private static long hoursInMillis(int hours) {
  return hours * 60 * 60 * 1000;
 }
 
 public boolean from(Date date) {
  this.target = date;
  return this.checkDifference();
 }
 
 private boolean checkDifference() {
  return (origin.getTime() - target.getTime() >= this.milliseconds);
 }
}

So now, I can use it to replace the test logic in the test methods.

2. Extracting details into a super class

This step may seem a bit controversial at first, but can be an interesting approach for separating the what from how. The idea is to move tests set up, field declarations, initialisation logic, everything that is related to the test implementation (how) to a super class, leaving the test class just with the test methods (what).

Although this many not be a good OO application of the IS-A rule, I think this is a good compromise in order to achieve better readability in the test class.

NOTE: Logic can be moved to a super class, external classes (helpers, builders, etc) or both.

Here is the super class code:

public abstract class BaseTestForAnalyticsExperationDateManager extends TestCase {

 protected Parameter minTimeoutParam;
 @Mock protected CacheKeyImpl<Parameter> cacheKey;
 protected Date systemDate;
 protected CacheKeyImpl<Parameter> NULL_CACHEKEY = null;
 protected AnalyticsExpirationDateManager<Long> manager;

 @Override
 protected void setUp() throws Exception {
  MockitoAnnotations.initMocks(this);
  this.minTimeoutParam = new Parameter("minTimeout", "type");
  when(cacheKey.getFirstKey()).thenReturn(minTimeoutParam);
  this.systemDate = new Date();
 }

 protected void assertThat(boolean condition) {
  assertTrue(condition);
 }
 
 protected void addMinimunTimeoutToCache() {
  this.configureCacheResponse(this.minTimeoutParam);
 }
 
 protected void doNotIncludeMinimunTimeoutInCache() {
  this.configureCacheResponse(new Parameter("name", "type"));
 }
 
 private void configureCacheResponse(Parameter parameter) {
  Collection<Parameter> cacheKeysWithMinTimeoutParam = new ArrayList<Parameter>();
  cacheKeysWithMinTimeoutParam.add(parameter);
  when(this.cacheKey.getKeys()).thenReturn(cacheKeysWithMinTimeoutParam);
 }
}

3. Move creation and configuration of the object under test to a builder class

The construction and configuration of the AnalyticsExpirationDateManager is quite verbose and adds a lot of noise to the test. Once again I'll be using a builder class in order to make the code more readable and segregate responsibilities. Here is the builder class:

public class AnalyticsExpirationDateManagerBuilder {
 
 protected static final long ONE_HOUR = 1000 * 60 * 60;

 protected Parameter minTimeoutParam;
 private AnalyticsExpirationDateManager<Long> manager;
 private Map<Parameter, Long> analyticsToTimeouts = new HashMap<Parameter, Long>();
 protected long defaultTimeout = 0;
 private Long expirationTimeout;
 private Long minimunTimeout;

 private AnalyticsExpirationDateManagerBuilder() {
  this.minTimeoutParam = new Parameter("minTimeout", "type");
 }
 
 public static AnalyticsExpirationDateManagerBuilder aExpirationDateManager() {
  return new AnalyticsExpirationDateManagerBuilder();
 }
 
 public static long hours(int quantity) {
  return quantity * ONE_HOUR;
 }
 
 public AnalyticsExpirationDateManagerBuilder withDefaultTimeout(long milliseconds) {
  this.defaultTimeout = milliseconds;
  return this;
 }
 
 public AnalyticsExpirationDateManagerBuilder withExpirationTimeout(long milliseconds) {
  this.expirationTimeout = new Long(milliseconds);
  return this;
 }
 
 public AnalyticsExpirationDateManagerBuilder withMinimunTimeout(long milliseconds) {
  this.minimunTimeout = new Long(milliseconds);
  return this;
 }
 
 public AnalyticsExpirationDateManager<Long> build() {
  if (this.minimunTimeout != null) {
   analyticsToTimeouts.put(minTimeoutParam, minimunTimeout);
  }
  this.manager = new AnalyticsExpirationDateManager(analyticsToTimeouts, defaultTimeout);
  if (this.expirationTimeout != null) {
   this.manager.setExpirationTimeout(minTimeoutParam.getName(), expirationTimeout);
  }
  return this.manager;
 }

}

The final version of the test class

After many small steps, that's how the test class looks like. I took the opportunity to rename the test methods as well.

import static com.mycompany.AnalyticsExpirationDateManagerBuilder.*;
import static com.mycompany.DateComparator.*;

public class AnalyticsExpirationDateManagerTest extends BaseTestForAnalyticsExperationDateManager {

 public void
 testExpirationTimeWithJustDefaultValues() {
  manager = aExpirationDateManager().build();
  Date cacheExpiration = manager.getExpirationDate(cacheKey, 0L);
  assertThat(dateOf(cacheExpiration).is(0).hoursAhead().from(systemDate));
 }
 
 public void 
 testExpirationTimeWithMinimunTimeoutOfOneHour() {
     addMinimunTimeoutToCache();  
  manager = aExpirationDateManager()
      .withMinimunTimeout(hours(1))
      .build();
  Date cacheExpiration = manager.getExpirationDate(cacheKey, 0L);
  assertThat(dateOf(cacheExpiration).is(1).hoursAhead().from(systemDate));
 }
 
 public void 
 testExpirationTimeWhenCacheKeyIsNullAndDefaultTimeoutIsOneHour() {
  manager = aExpirationDateManager()
      .withDefaultTimeout(hours(1))
      .build();
  Date cacheExpiration = manager.getExpirationDate(NULL_CACHEKEY, 0L);
  // When cache key is null, system date and time is returned and default timeout is not used.
  assertThat(dateOf(cacheExpiration).is(0).hoursAhead().from(systemDate));
 }
 
 public void 
 testExpirationTimeWithDefaultTimeout() {
  doNotIncludeMinimunTimeoutInCache();
  manager = aExpirationDateManager()
      .withDefaultTimeout(hours(1))
      .build();
  Date cacheExpiration = manager.getExpirationDate(cacheKey, 0L);
  assertThat(dateOf(cacheExpiration).is(1).hoursAhead().from(systemDate));
 }
 
 public void 
 testExpirationTimeWhenExpirationTimeoutIsSet() {
  manager = aExpirationDateManager()
      .withDefaultTimeout(hours(1))
      .withExpirationTimeout(hours(2))
      .build();
  Date cacheExpiration = manager.getExpirationDate(cacheKey, 0L);
  // Expiration timeout has precedence over default timeout.
  assertThat(dateOf(cacheExpiration).is(2).hoursAhead().from(systemDate));
 }
 
}

Conclusion

Test classes should be easy to read. They should express intention, system behaviour, business rules. Test classes should express how the system works. They are executable requirements and specifications and should be a great source of information for any developer joining the project.

In order to achieve that, we need to try to keep our test methods divided in just 3 simple instructions.

1. Context: The state of the object being tested. Here is where we set all the attributes and mock dependencies. Using variations of the Builder pattern can greatly enhance readability.

manager = aExpirationDateManager()
                .withDefaultTimeout(hours(1))
                .withExpirationTimeout(hours(2))
                .build();

2. Operation: The operation being tested. Here is where the operation is invoked.

Date cacheExpiration = manager.getExpirationDate(cacheKey, 0L);

3. Assertion: Here is where you specify the behaviour expected. The more readable this part is, the better. Using DSL-style code is probably the best way to express the intent of the test.

assertThat(dateOf(cacheExpiration).is(2).hoursAhead().from(systemDate));

In this post I went backwards. I've started from a messy test class and refactored it to a more readable implementation. As many people now are doing TDD, I wanted to show how we can improve an existing test. For new tests, I would suggest that you start writing the tests following the Context >> Operation >> Assertion approach. Try writing the test code in plain English. Once the test intent is clear, start replacing the plain English text with Java internal DSL code, keeping the implementation out of the test class.

PS: The ideas for this blog post came from a few discussions I had during the Software Craftsmanship Round-table meetings promoted by the London Software Craftsmanship Community (LSCC).

Tuesday, 7 December 2010

A basic ActiveRecord implementation in Java

Recently I was looking for different ways to develop applications in Java and thought that it would be interesting trying to use ActiveRecord in my persistence layer instead of implementing the traditional approach with a DAO. My idea is not to create my own ActiveRecord framework but use the existing ones as a support for this approach.

The scope: Write functional tests that could prove that the methods save and delete on an entity work. I'll use an entity called Traveller for this example.

The technology: I chose to use the following frameworks: Spring 3.0.5, JPA 2.0, Hibernate 3.5.3, AspectJ 1.6.9, JUnit 4.8.2, Maven 2.2.1, Eclipse Helios and MySQL 5.x

I'll be omitting things that are not too important. For all the details, please have a look at the whole source code at:

https://github.com/sandromancuso/cqrs-activerecord

Let's start with the test class:

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(
  locations={
    "file:src/test/resources/applicationContext-test.xml",
    "file:src/main/resources/applicationContext-services.xml"    
  })
@TransactionConfiguration(transactionManager = "myTransactionManager", defaultRollback = true)
@Transactional
public class TravellerActiveRecordIntegrationTest extends BaseTravellerIntegration {
 
 @Test public void 
 testTravellerSelfCreation() {
  assertThereAreNoTravellers(named("John"), from("England"));
  
  Traveller traveller = aTraveller().named("John").from("England").build();
  traveller.save();
  
  assertThereIsASingleTraveller(named("John"), from("England"));
 }
 
 @Test public void
 testTravellerEdition() {
  Traveller traveller = aTraveller().named("John").from("England").build();
  traveller.save();
  
  traveller.setName("Sandro");
  traveller.setCountry("Brazil");
  traveller.save();
  
  assertThereAreNoTravellers(named("John"), from("England"));
  assertThereIsASingleTraveller(named("Sandro"), from("Brazil"));  
 }
 
 @Test public void 
 testDeleteTraveller() {
  Traveller traveller = aTraveller().named("John").from("England").build();
  traveller.save();

  traveller.delete();
  assertThereAreNoTravellers(named("John"), from("England"));
 }
 
}

A few things to notice about this test class:
- As this test is meant to insert and delete from the database, I set it to always rollback the transaction after each test, meaning that nothing will be committed to the database permanently. This is important in order to execute the tests multiple times and get the same results.
- The test class extends a base class, that has the methods assertThereAreNoTravellers and assertThereIsASingleTraveller. The advantage of doing that is that you separate what you want to test from how you want to test, making the test class clean and focused on the intention.
- Note that I also used the Builder pattern to build the traveller instance instead of using the setter methods. This is a nice approach in order to make your tests more readable.

The Traveller entity implementation

So now, let's have a look at the Traveller entity implementation.

@Entity
@Table(name="traveller")
@EqualsAndHashCode(callSuper=false)
@Configurable
public @Data class Traveller extends BaseEntity {

 @Id @GeneratedValue(strategy=GenerationType.AUTO)
 private long id;
 private String name;
 private String country;
 
}

The Traveller class is a normal JPA entity as you can see by the @Entity, @Table, @Id and @GenereratedValue annotations. To reduce the boiler place code like getters, setters, toString() and hashCode(), I'm using Lombok, that is a small framework that generate all that for us. Just add a @Data annotation.

The real deal here is Traveller's super class BaseEntity (for a lack of inspiration to find better name). Let's have a look:

@Configurable
public abstract class BaseEntity {

 @PersistenceContext
 protected EntityManager em;

 public abstract long getId();
    
 public void save() {
  if (getId() == 0) {
   this.em.persist(this);
  } else {
   this.em.merge(this);
  }
  this.em.flush();
 } 
 
 public void delete() {
  this.em.remove(this);
  this.em.flush();
 }
 
 public void setEntityManager(EntityManager em) {
  this.em = em;
 }
 
}

The BaseEntity class has quite a few things that can be discussed.

The save() method: I've chosen to have a single method that can either insert or update an entity, based on its id. The problem with this approach is that, firstly, the method has more than one responsibility, making it a bit confusing to understand what it really does. Secondly it relies on the entities id, that needs to be a long, making it a very specific and weak implementation. The advantage is that from the outside (client code), you don't need to worry about the details. Just simple call save() and you are done. If you prefer a more generic and more cohesive implementation, make the getId() method return a generic type and split the save() method into a create() and update() methods. My idea here was just to make it simple to use.

EntityManager dependency: Here is where the biggest problem lies. For it to work well, every time a new instance of a entity is created, either by using new EntityXYZ() by hand or when it is created by a framework (e.g. as a result of a JPA / Hibernate query), we want the entity manager to be injected automatically. The only way I found to make it work is using aspects with AspectJ and Spring.

Configuring AspectJ and Spring

My idea here is not to give a full explanation about the whole AspectJ and Spring integration, mainly because I don't know it very well myself. I'll just give the basic steps to make this example work.

First add @Configurable to the Entity. This will tell that the entity will be managed by Spring. However, Spring is not aware of instances of the entity, in this case, Traveller, being created. This is why we need AspectJ. The only thing we need to do is to add the following line to our Spring context xml file.

<context:load-time-weaver />

This makes AspectJ intercept the creation of beans annotated with @Configurable and tells Spring to inject the dependencies. In order to the load-time weaver (LTW) work, we need to override JVM's class loading so that the first time our entity classes are loaded, AspectJ can kick in, the @Configurable annotation is discovered and all the dependencies are injected. For that we need to pass the following parameter to the JVM:

-javaagent:<path-to-your-maven-repository>/.m2/repository/org/springframework/spring-instrument/3.0.5.RELEASE/spring-instrument-3.0.5.RELEASE.jar

The snippet above is what we must use if using Spring 3.0.x. It works fine inside Eclipse but apparently it has some conflicts with Maven 2.2.1. If you run into any problems, you can use the old version below.

-javaagent:<path-to-your-maven-repository>/.m2/repository/org/springframework/spring-agent/2.5.6/spring-agent-2.5.6.jar

Another thing that is a good idea is to add the aop.xml file to your project, limiting the classes that will be affected by AspectJ. Add the aop.xml to src/main/resources/META-INF folder.


 
  
  <include within="com.lscc.ddddemo.model.entity.*" />
  <include within="com.lscc.ddddemo.model.entity.builder.*" />
  <exclude within="*..*CGLIB*" />

NOTE: The exclude clause is important to avoid some conflicts during the integration test. AspectJ sometimes tries to do some magic with the test classes as well causing a few problems and the exclude clause avoids that.

Making the integration test work

I'll be using MySQL, so I'll need a database with the following table there:

CREATE TABLE `traveller` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(45) NOT NULL,
`country` varchar(45) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB

As I'm using JPA 2.0, we need a persistence.xml that must be located at src/main/resources/META-INF


 
    
        com.lscc.ddddemo.model.entity.Traveller
 
        
            <property name="hibernate.show_sql" value="true" />
            <property name="hibernate.format_sql" value="true" />
 
            <property name="hibernate.connection.driver_class" value="com.mysql.jdbc.Driver" />
            <property name="hibernate.connection.url" value="jdbc:mysql://localhost:3306/lscc-ddd" />
            <property name="hibernate.connection.username" value="root" />
 
            <property name="hibernate.c3p0.min_size" value="5" />
            <property name="hibernate.c3p0.max_size" value="20" />
            <property name="hibernate.c3p0.timeout" value="300" />
            <property name="hibernate.c3p0.max_statements"
                    value="50" />
            <property name="hibernate.c3p0.idle_test_period"
                    value="3000" />

In our Spring configuration, we also need to tell Spring how to create an EntityManager, providing the EntityManager Factory:


        
    
        <property name="persistenceUnitName" value="testPU" />
    

    <tx:annotation-driven transaction-manager="myTransactionManager" />
 
    
        <property name="entityManagerFactory" ref="entityManagerFactory" />

In order to separate unit tests and integration tests in my Maven project, I've added a different profile for the integration tests in my pom.xml:


   with-integration-tests
   
    
     
      org.apache.maven.plugins
      maven-surefire-plugin
      
       always
       -javaagent:/Users/sandro.mancuso/.m2/repository/org/springframework/spring-agent/2.5.6/spring-agent-2.5.6.jar
      
      2.5
      true
      
       
        integration-tests
        integration-test
        
         test
        
        
         
          **/common/*
         
         
          **/*IntegrationTest.java

So now, if you want to run the integration tests from the command line, just type:

mvn clean install -Pwith-integration-tests

If running the tests from inside Eclipse, don't forget to add the -javaagent paramenter to the VM.

Conclusion

So now, from anywhere in your application you can do some thing like:

Traveller traveller = new Traveller();
traveller.setName("John");
traveller.setCountry("England");
traveller.save();

The advantage of using the ActiveRecord:
- Code becomes much simpler;
- There is almost no reason for a DAO layer any more;
- Code is more explicit in its intent.

The disadvantages:
- Entities will need to inherit from a base class;
- Entities would have more than one responsibility (against the Single Responsibility Principle);
- Infrastructure layer (EntityManager) would be bleeding into our domain objects.

In my view, I like the simplicity of the ActiveRecord pattern. In the past 10 years we've been designing Java web applications where our entities are anaemic, having state (getters and setters) and no behaviour. So at the end, they are pure data structures. I feel that entities must be empowered and with techniques like that, we can do it, abstracting the persistence layer.

I'm still not convinced that from now on I'll be using ActiveRecord instead of DAOs and POJOs but I'm glad that now there is a viable option. I'll need to try it in a real project, alongside with Command Query Responsibility Segregation (CQRS) pattern to see how I will feel about it. I'm really getting sick of the standard Action/Service/DAO way to develop web applications in Java instead of having a proper domain model.

By the way, to make the whole thing work, I had loads of problems to find the right combination of libraries. Please have a look at my pom.xml file for details. I'll be evolving this code base to try different things so when you look at it, it may not be exactly as it is described here.

https://github.com/sandromancuso/cqrs-activerecord

Interesting links with more technical details:
http://nurkiewicz.blogspot.com/2009/10/ddd-in-spring-made-easy-with-aspectj.html
http://blog.m1key.me/2010/06/integration-testing-your-spring-3-jpa.html

Wednesday, 1 December 2010

Routine Prediction

Not everybody remembers how easy it is and how effective it can be to add (or correct) a few points at the beginning of the FID. Two years ago I explained how Linear Prediction works and how we can extrapolate the FID in both directions. This time I will show a simple practical application.
I have observed that, in recent years, C-13 spectra acquired on Varian instruments require a much-larger-then-it-used-to-be phase correction. When I say correction, I mean first-order phase correction, because the zero-order correction is merely a different way of representing the same thing (a different perspective).
A large first-order phase correction can be substituted with linear prediction. I will show the advantage with a F-19 example, yet the concept is general.

The spectrum, after FT and before phase correction, looks well acquired. Now we apply the needed correction, which amounts to no less than 1073 degrees.

Have you noticed what happened to the baseline? It's all predictable. When you increase the phase correction, the baseline starts rolling. The higher the phase correction, the shorter the waves. With modern method for correcting the baseline we eliminate all the waves, yet there are two potential problems: 1) The common methods for automatic phase correction will have an hard time. 2) If you prefer manual phase correction, you need an expert eye to assess the symmetry of the peaks over such a rolling baseline. Anyway, just to show you that linear prediciton is not a necessity, here is the spectrum after applying standard baseline correction:

Now let's start from the FID again, this time applying linear prediction. one way to use it is to add the 3 missing points at the beginning. The result, after a mild phase correction (<10°) and before the baseline correction is:

The lesson is: by adding the missing points we correct the phase.
Alternatively we can both add the 3 missing points and recalculate the next 4 points. In this way the baseline improves a lot:

The lesson is: by recalculating the first few points of the FID we can correct the baseline of the frequency domain spectrum.

Thursday, 4 November 2010

2011

Wednesday, 27 October 2010

Promotional Sale

There are three possible reasons why you can be tempted by iNMR.
First reason: it's for research. It happens that they are not using iNMR in the industry, not because they don't like it, but because they don't buy Macs anymore in the industry. So, the majority of iNMR users are not doing repetitive activities. They don't ask to process 20 spectra in 20 seconds. Maybe they want to estimate the concentrations by time-consuming line-fitting or they want to monitor the phosphorylation of a protein by a series of thirty H-N HSQC, or they want to simulate the effect of a slow rotation, as they used to do with DNMR in the '70s. iNMR users asked for such things years ago and now you find them already into the program.
Second reason: students learn the program by themselves. Nowadays few research groups are pure-NMR-groups. When a new PhD students joins the lab, he has many techniques to learn, not just NMR processing. Luckily, iNMR has many things in common with the other applications he daily works with on his/her Mac. iNMR also helps novices to understand NMR processing because spectra are clearly depicted at every stage. A lot of things become natural after the first day of use.
Third reason: today Mestrelab has started a promotional sale, a sort of end-of-the-year clearance.
You can buy a disposable license at €90 instead of €150. You can download and try the program before buying.

Wednesday, 13 October 2010

OK then, first post!

My acclaimed puzzle/platform game Apple Jack was released at the end of May this year on the Xbox Indie games service. Here is a video of it:

"Oh", I hear you cry, "Acclaimed is it? Where's the proof?"
To which I simply shake my head quietly and with a wry smile point you towards the following links:

http://www.eurogamer.net/articles/download-games-roundup-review-11th-june-2010?page=3
http://www.digitalspy.co.uk/gaming/levelup/a222963/indie-pick-apple-jack.html
http://www.armlessoctopus.com/2010/06/21/xbox-indie-review-apple-jack/#more-292
http://gaygamer.net/2010/06/weekly_xbox_indies_6210.html
http://www.xnplay.co.uk/xnplay-essentials-platformers/

"Yeah yeah" I hear you persist, "But those aren't PROPER reviews from respected print magazines, they're just the stupid old internet making up rubbish as usual. I bet you haven't got any reviews from, say, Edge magazine and the Official Xbox Magazine have you?"

Ahem:

Edge (Issue 217, 8/10)
Official UK Xbox Magazine (Issue 64, 5 stars)

"Oh, OK.." You mumble, thoroughly chastened and embarassed, "So it IS acclaimed after all, I humbly apologise for my rudeness earlier and I will buy your excellent looking game forthwith."

Monday, 27 September 2010

Bad Code: The Invisible Threat

One of the first things said by the non-believers of the software craftsmanship movement was that good and clean code is not enough to guarantee the success of a project. And yes, they are absolutely right. There are innumerable reasons which would make a project to fail ranging from business strategy to competitors, project management, cost, time to market, partnerships, technical limitations, integrations, etc.

Due to the amount of important things a software project may have, organisations tend not to pay to much attention to things that are considered less important, like the quality of the software being developed. It is believed that with a good management team, deep hierarchies, micro management, strict process and a large amount of good documentation, the project will succeed.

In a software project, the most important deliverable is the software itself. Anything else is secondary.

Many organisations see software development as a production line where the workers (developers) are viewed as less skilled than their high-qualified and much more well paid managers. Very rarely companies like that will be able to attract or retain good software developers, leaving their entire business on the hands of mediocre professionals.

Look after your garden

Rather than construction, programming is more like gardening. - The Pragmatic Programmer

Code is organic, not mechanical. Like a garden, code needs constant maintenance. For a garden to look nice all year round, you need to look after its soil, constantly remove weeds, regularly water it, remove some dead plants, replant new ones, trim or re-arrange existing ones so they can stay healthy and look nice as whole. With basic regular maintenance, the garden will always look great but if you neglect it, even if for a short period, the effort to make it nice again will be much bigger. The longer you neglect it, the harder it will be to make it look nice again and you may even loose some or all of your plants.

Code is no different. If code quality is not constantly looked after, the code starts to deteriorate. Bad design choices, lack of tests and poor use of languages and tools will make parts of the code to rot. Bit by bit other parts of the code will also be contaminated up to the point that the whole code base is so ill that it becomes extremely painful to change it or add new features to it.

The Invisible Threat

When starting a greenfield project, everything is great. With a non-existent code base, developers can quickly start creating new features without the fear of breaking or changing any existing code. Testers are happy because everything they need to test is new, meaning that they don't need to worry about regression tests. Managers can see a quickly progress in terms of new features added and delivered. What a fantastic first month the team is having.

However, this is not a team of craftsmen. This is a team of average developers structured and treated like unskilled production line workers.

As time goes by, things are getting messier, bugs start to appear (some with no apparent explanation) and features start taking longer and longer to be developed and tested. Very slowly, the time to deliver anything starts to stretch out. But this is a slow process. Slow enough that takes months, sometimes, over a year or two to be noticed by the management.

It's very common to see projects where, at the beginning of a project, a feature of size X takes N number of days to be implemented. Over the time, as more bad code is added to the application, the same feature X (or a feature of the same size) takes much longer to be implemented than it used to take at the beginning of the project. As the quality of the code decreases, the amount of time to implement a new feature, fix a bug or make a change increases. The lower the quality, the higher the number of bugs, the harder is to test and less robust and reliable the application becomes.

Some people say that they just don't have time to do it properly but, in general, a lot more time and money is spent later on on tests and bug fixing.

Hostage of your own software

When the code base gets into the situation where changes or additional features take too long to be implemented or worse, developers and managers are scared to touch existing code, an action must be taken immediately. This is a very dangerous situation to be since business progress is being impeded or delayed by the software instead of being helped by it.

To keep business progress, schedule and budget under control, high quality code needs to be maintained at all costs.

Organisations may need to cancel the implementation of some features or postpone changes just because of the amount of time and money that they may cost to be built. Having poor quality of code responsible for it is totally unacceptable.

The biggest problem here is that bad code is invisible to everyone besides developers. Other members of the team will just realise that something is wrong when it is too late. This means that it is the developers responsibility to look after the quality of the code. Some times, developers expose the problem to project managers but the request for having some time to "re-factor" the code is often ignored for various reasons, including a lack of understand of the impacts of bad code and the inability of developers to explain it. On the other hand, when developers come to a point where they need to ask for some formal time to do refactoring, this means that for one reason or another, they neglected the code at some point in the past.

Hire craftsmen not average developers

With the amount of literature, tools, technologies, methodologies and the infinite source of information available on the web, it is just unacceptable to have a team of developers that let the code to rot.

Craftsmen are gardeners and are constantly looking after the code base, quickly refactoring it without fear since they are strongly backed by a good battery of tests that can test the entire application in just a few minutes. Time constraints or change in requirements will never be used as excuses for bad code or lack of tests due to the good design principles and techniques constantly used throughout the application.

Having an empowered team of craftsmen can be the difference between success and failure of any software project.

Quality of code may not guarantee the success of a project but it can definitely be the main invisible cause of its failure.

Tuesday, 14 September 2010

Beyond the manifesto: The Software Craftsmanship Attitude

Being an aspiring software craftsman goes way beyond than just saying it. I'll quote my own definition of software craftsmanship from my previous post.

Software craftsmanship is a long journey to mastery. It's a lifestyle where developers choose to be responsible for their own careers and for improving their craft, constantly learning new tools and techniques. Software Craftsmanship is all about putting responsibility, professionalism, pragmatism and pride back into software development.

Software craftsmanship is all about attitude. The attitude of raising the bar of professional software development starting with our own skills and professionalism.

The responsibility shift

Not long ago, I was speaking to a developer and he was complaining about his company, saying that they didn't have a good career plan, that he did not have any training, that he was not given the opportunity to learn new technologies and, of course, that he was not paid enough. Apparently, from his perspective, his employer was responsible for his career.

Imagine that we need a doctor. Would we pay a doctor to learn while he cut us open or give us a diagnosis? Would we pay an engineer to learn while he draws the plan for our new house? Would we go to a concert and pay the musician to learn how to play the guitar during the show? What about a chef in a restaurant?

So why is the employer's obligation to pay us training courses and pay us to learn new technologies and tools while we are working on a project? Should the employers be responsible for what we learn and what we don't learn.

Software development is not a 9 to 5 profession. To be a professional software developer, we need to take our own time and money to keep learning and improving. As professionals, we should be paid for what we know, our ability to learn fast and for the quality of the work we do. We own our careers and are responsible for them. Working for a customer/employer that helps us with our career in terms of training, books, conferences, seminars, etc, is great but should be considered a bonus.

... but how can we learn and keep ourselves up-to-date?

Different developers have different preferences but here is a list of ways I find useful.

Literature

Books, many books. Having your own library is essential. Books give you a good overview of a specific technology or subject quickly. No, it does not mean you will be proficient and next day you will be a specialist. What a book will give you is an understanding of what a technology or subject is about. It will be up to you then to decide if you want to practice what you've learned and become proficient. If you don't have the habit, try reading 3 to 4 technical books per year. Once you get the habit, try one per month. The most common excuse is that you don't have time. The cool thing about books is that you can read them during periods of "dead" time, like on the tube, bus, in your dentist's waiting room, on the bed before going to sleep, toilet, etc.

Blogs are now one of my favourite types of reading. They tend to fit more the software craftsmanship model since they are much more personal and in general, related to personal findings, opinions, successes and failures. Reading blogs from more experienced professionals and subject matter experts is a good, quick and free way for apprentices and journeymen to learn from multiple master craftsmen at the same time. But don't think that just experienced professionals should write blogs. Every software developer should write his or her own blogs, sharing their experiences and findings, helping to create a great community of professionals.

Technical websites are also good in order to keep yourself up-to-date with what's going in the market. New technologies, techniques, etc.

Practice, practice, practice

Pet project(s) is for me, by far, the best way to learn and study. A pet project is a real project but without the boring bits. There are no deadlines, does not need to make money, you control the requirements and most importantly, you use the technologies and methodologies you want, whenever you want, wherever you want. You are the boss. Pet projects give something for you to focus and help you to understand why and how you can use certain technologies. It gives you the experience you need to apply what you've learned into real projects. Pet projects are meant to be fun.

Contributing to open source projects can also be a great thing. There are thousands of them out there. Find a project that is related to what you want to learn or know more about and download the source code. Start running and reading the tests, if any. Inspect and debug the code. If you want to contribute, start small. Add some documentation and write some tests. Then, check the list of features to be implemented and pick a simple one and give it a go. You can also propose and implement a new and small one to start with.

Pair-programming can be a great experience. Many developers are afraid to try or think they will feel uncomfortable. That's what I thought as well before I tried. Today I really enjoy it. Pair-programming gives you an opportunity to discuss your ideas, to learn new tricks and techniques from your pair and the resulting code is much better. Pairing with someone from your team is good but pairing with someone that you barely know can be a very interesting experience.

If learning a new language or new technique like TDD / BDD or trying different approaches to OOP or functional programming, try a code kata. Code kata is a small exercise that in general can be solved in a few minutes or in a few hours. Code katas were created to help developers to focus in a problem while improving their skills. You can find a good source of code katas at codingkata.org and codekata.pragprog.com

Community and Events

Be part of your local community. Software craftsmanship is all about a community of professionals, learning and sharing with each other and elevating the level of professionalism and maturity of our industry. Join your nearest user groups and participate in their events. User groups are the best way for making contacts, sharing ideas and learning things. Many user groups promote free talks, coding activities and social events. A great aspect of being part of a community if the feeling that you are not alone. There are many people out there having the same problems you've got and many others that will happily share their solution to it.

Raising the bar

The main changes proposed by the software craftsmanship movement are related to the developers attitude. In summary, it's about being proud about your work, the code you produce and the software you create. It's about constantly trying to improve your skills and learn new ones.

An aspiring software craftsman is intolerant of bad code will constantly apply the Boy Scout Rule.

If you find that the good code you wrote one year ago is still good enough today, it means you didn't learn anything during this period and this is unacceptable.

Productive partnerships

The attitude of a software craftsman goes beyond good code. The relationship between a software craftsman and his or her customer (or employer) is of a productive partnership and not employer / employee. As a productive partnership, it's our role to constantly question the requirements and propose improvements and new features. It's our job to warn our customers about the problems we see. Due to the knowledge of the code and technology, developers are well positioned to help their customers improve their business.

However, sometimes some customers or employers don't want or don't see the advantages of this partnership and will treat developers as production line workers that should just do what they are told. Coding monkeys. In cases like that, any aspiring software craftsman should move on and find another job. Staying in a place where your skills are not appreciated is a career suicide.

Raising the bar of our industry is on our own interest. The better we become in writing code and delivering valuable software, the better our lives will be, professionally and financially.

PS: If you are in London or nearby, join the London Software Craftsmanship Community.

Friday, 3 September 2010

Software Craftsmanship

So what is software craftsmanship?

A better metaphor: In a very simplistic way, we can say that software craftsmanship is a better metaphor for software development than software engineering, as I wrote in a previous post. Software craftsmanship sees software as a craft and compares software developers to the medieval blacksmiths. Apprentices would work with more experienced blacksmiths, travelling from place to place, working with and for different masters, learning different tools and techniques, improving their craft until the point they were good enough to become master themselves. (There is more to it, but let's keep it simple for now).

Wikipedia Definition: is an approach to software development that emphasizes the coding skills of the software developers themselves. It is a response by software developers to the perceived ills of the mainstream software industry, including the prioritization of financial concerns over developer accountability. (read in full)

I personally don't like too much the Wikipedia's definition. It's very dry and I don't think it captures the essence of what being a software craftsman means to a software developer.

A more personal definition: Software craftsmanship is a long journey to mastery. It's a lifestyle where developers choose to be responsible for their own careers and for improving their craft, constantly learning new tools and techniques. Software Craftsmanship is all about putting responsibility, professionalism, pragmatism and pride back into software development

A software craftsman cares and is proud of his or her work and is extremely professional and pragmatic when it comes to its implementation.

The Software Craftsmanship Movement

The software craftsmanship movement is basically an evolution of ideas that started probably in the late 90ies, early 2000 with the publication of The Pragmatic Programmer by Andy Hunt and Dave Thomas (1999) and Software Craftsmanship: The New Imperative by Pete McBreen (2001). In 2008, Uncle Bob proposed "Craftsmanship over Execution" (originally Craftsmanship over Crap) as the fifth value for the Agile Manifesto. In 2009, the Manifesto for Software Craftsmanship was created, defining the values of the movement and international Software Craftsmanship conferences emerged in the US and UK.

In the manifesto, the essence of the software craftsmanship is capture in its subtitle: Raising the bar. The manifesto was idealised by very experienced developers that had enough of project failures mainly caused by poor management, ill processes and, of course, badly-written code.

Developers are taking the matter into their own hands and are trying to change how the industry sees software development not just proposing new and revolutionary processes but showing customers that they care about what they do and that they want to work together with their customers in order to produce great and long-lived software.

The values of the Software Craftsmanship Movement

Not only working software, but also well-crafted software

Working code is not good enough. Think on a 5 year old application (it could be 2 or 10) where we are scared to change some of its parts (fix bugs, add new features, etc.) because we don't understand how it works and have no confidence that we will not break anything else. This application is working software but is it good enough? Well crafted software means that regardless how old the application is, developers can understand it easily, side effects are well known and controlled, high test coverage, clear design, business language well expressed in the code and adding or changing features does not take longer that it used to take at the beginning of the project, when the code base was small.

The code must be maintainable and predictable. Developers must know what is going to happen when changing the code and must not fear to change it. Changes should be localised and not cause impact in other parts of the application. Tests will guarantee that nothing else was broken.

Not only responding to change, but also steadily adding value

This is not just about adding new features and fixing bugs. This is also about constantly improving the structure and cleanliness of the code. The software must be seen as an asset and the constant maintenance of it will make it more valuable during its lifetime, instead of letting it rot and devalue.

The Boy Scout Rule (defined by Uncle Bob) states that we should always let the code a bit cleaner than we found it. This is a paraphrase of Boy Scouts rule: leave the camp cleaner than you found it.

If customers want to keep benefiting from adding and changing features quickly, they will need high quality code to enable them to do it.

Not only individuals and interactions, but also a community of professionals

This is somehow related to the idea of apprentices, journeymen and masters, where software craftsmanship masters will mentor apprentices and help them in their journey. The software craftsmanship community is responsible for training the next generation of professionals. Knowledge and ideas must be shared and discussed within the community in order to keep moving the software development industry forward.

Not only customer collaboration, but also productive partnerships

Software craftsmen need successful projects to build their reputation and are proud of their achievements. Successfully delivering high quality software is essential for any software craftsman journey. With this in mind, software craftsmen will do whatever they can for a project to succeed. They don't act like simple employees that just do what they are told to do. They want to actively contribute to the success of the project, questioning requirements, understanding the business, proposing improvements and productively partnering with their customers or employers. This is an interesting shift of perspective, if you like, and the advantages for the customer and for the project success are enormous. A well-motivated team has a much bigger chance to make any project succeed. However, if the customer is not prepared to have this partnership and sees software development as an industrial process and the least important part of the project, this customer will never have real software craftsmen working for him for too long. Not getting involved with the business, not questioning requirements, not proposing improvements and not knowing the customers needs is not a partnership. Real software craftsmen make the customer needs, their needs.

Conclusion

Recently, Uncle Bob Martin said during an interview:

The original torch of the Agile message has changed hands, and is now being carried by the Software Craftsmanship movement. These are the folks who continue to pursue the technical excellence and professionalism that drove the original founders of the Agile movement.

The Software Craftsmanship Movement is another step forward towards better and healthier software projects. Instead of just focusing on processes, it focuses on the quality of the software that is produced and most importantly on the attitude and competence of the people involved.

Software Craftsmanship brings pride and professionalism into software development.

Oh, Sugar

I have good news! With a minimal simplification of my INADEQUATE filter I have been able to rescue the last cross-peak of cholesterol. It had been rejected because too near to the diagonal. I changed the code saying: "if it's on the diagonal, it is bad; if it's just near, let's accept it". So it is possible to have the perfect INADEQUATE of cholesterol, with all the expected cross-peaks IN and everything else OUT.
Yesterday I received another INADEQUATE spectrum, this time of sucrose. The S/N is still high enough to make my filter unnecessary. If I play with the contour plot all the noise disappears while the 12 carbon atoms and their 10 bonds remain. Only a spurious peak remains at the coordinates 103.7;-22.9. I have not received the 1-D external projection, so I created it artificially. The spectral width is the same in both dimension (instead of being doubled for the DQF axis). The consequence is that two cross-peaks fall just on the boundary and are partially folded. This is the spectrum:

I have applied the filter with the same parameters used for the cholesterol (C-C coupling; linewidths in the two dimensions) while the threshold corresponds to the above plot. The result is perfect. All the cross-peaks are resolved and they are all present. Nothing else survives.

This time all the peaks are regular anti-phase doublets. The Js are generally larger than in cholesterol.
Click on the thumbnails to see the full-size pictures. There is an expansion to help counting the correct number of cross-peaks.
Do you want to send another spectrum? I can clean it for free. Remember to enclose the 13-C of the same sample.

Wednesday, 1 September 2010

Interstellar Space

One of the most ancient 2-D experiments has always been more talked about than practiced. The INADEQUATE was invented 30 years ago (an era in which many chemists were still using CW-NMR) and has always been regarded as a the future thing, like travels to the Moon. Everybody agrees it is useful, but the experimental difficulties are discouraging.
The information that can be extracted by this experiment is a formidable aid to unveil the structure of unknown natural compounds. For a long time, however, the experiment has been nearly impossible. You needed a powerful transmitter, because the nominal 180° pulse must be a true 180° pulse over a large spectral width. You also needed a sensitive probe. A cryo-probe is the best. Today we can have both things.
I am not mentioning here the many attempts to increase the actual sensitivity with experimental tricks, because this is a blog about software. 19 years ago the S/N limit was overcome by a pure numerical method, see this paper, commercialized as CCBond (also sold under the trademark FRED). I understand that the same program is now part of the larger NMRanalyst (™). No software that I know has ever changed the world. Having never personally used this particular product, I take for granted what the paper says while observing, at the same time, that its popularity is limited.

From the cited paper I read that the program failed to detect a few bonds in the INADEQUATE spectrum of cholesterol; the authors also explained why (the presence of second order effects). I think this is a serious issue. You can't use such a tool with confidence. A user has no way to verify if these second order effects are present or not, then he cannot verify if his spectrum can or cannot be handled by this particular program.
A few months ago I received a nice INADEQUATE spectrum. Guess which compound it was? Cholesterol! This is the first time I see this particular spectrum, so I can't tell if it was acquired correctly or not. I can see ALL the bonds with possibly one exception. I see the peaks corresponding to the bond between C20 and C22, yet it's not clear if they fall at the correct frequency or, instead, at the frequencies of C10 and C20. It is amazing, however, to see all the bonds without the aid of any special software. It is like discovering that we can go to the Moon with RyanAir.

A bitter surprise came from the observation that not all the peaks have the same shape. The very few articles I have read on the subject say that every cross peak is an anti-phase doublet (one peak goes up, the other peak goes down). Here, instead, I can see several different deviations from this simple model. This means that all the numerical methods previously described cannot be applied to this example. Should I try to improve this spectrum, I am going to lose one or more peaks, which would be a pity since I can already see all the peaks in the untreated spectrum. Maybe all the other INADEQUATE spectra ever acquired only have doublets and mine is the worst INADEQUATE ever. Maybe my spectrum is OK and the model is too simplistic. I hope some reader can solve this fundament doubt.
For the moment being I assume that my spectrum is perfect, simply because it's the only spectrum I have got. I have devised a new numerical method to improve it. It works like a filter and can clean the spectrum. I start from the known frequencies of the carbon atoms. They are easy to get from the standard 13-C spectrum reported at the top in my pictures. Everybody knows that a genuine peak can only fall at the frequency of an existing carbon. The cross-peaks also come in horizontal pairs. The vertical position (DQ frequency) of a pair is given by the sum of the two chemical shifts. Anything that does not fall at these predictable frequencies cannot be a genuine signal. My method filters it out. The signal/noise remains the same, but the plot is much easier to read.
My method, in theory, requires 4 parameter, provided by the user:
- A threshold level. This is the lowest contour level.
- The approximate width of a generic multiplet in the X dimension.
- The width in the Y dimension.
- The C-C coupling constant.
The method is not dumb, though. If nothing is found using the user's parameters, it will automatically try with different starting values. The last 3 parameters are easy to set, either by observation or by mining the literature. The most critical parameter is the threshold. Fortunately, it's not very critical.
I am going to show how tolerant my method can be. The previous picture shows the optimal threshold. [Click on the thumbnails to see the full-size pictures]. When I start from this value of threshold (1620), no cross-peak is lost and no false positive is created (spectrum not shown).
In the next picture the threshold has been reduced to less than the half, quite far from the optimal value. The threshold is now 620.

My method yields this:

The only difference between this and the spectrum processed with the optimal threshold, is the additional presence of the two cross-peaks labeled 22->20. This is good to know, because it means that the threshold is not critical. You can enter a wrong value and the result will almost be the same. This bond C20-C22 actually exists, but the cross-peaks on the left falls at the wrong frequency. It makes you think that C20 is bonded with C10 instead. As you can see, the column of C-10 now contains 5 cross-peaks (5 bonds??).
C-22 falls at 36.2 ppm. C-10 falls at 36.5. C-20 falls at 35.8. It is difficult to demonstrate if two atoms are bonded when their chemical shifts are so near.
I am very glad to see the cross peaks 5-10, 5-4, 3-4, 3-2 because they deviate from the simple doublet model. A simplistic model would never recognize these peaks, but my filter assumes a more generic model.
If I raise the threshold to 2620, the following cross-peaks disappear: 5-10, 1-10, 14-13 and 25-27. They are not so many, yet the basic lesson is: it is better to underestimate the threshold than to overestimate it.
I need more examples to test this method against. I feel it is promising indeed. The algorithm is simple and fast, which are always good qualities. It is amazing how far you can go with very little math.

Thursday, 26 August 2010

Student Price

It was 40 days ago when I mentioned the student license of TopSpin and other products, once pricey, that are now free for academic users. Knowing that many students prefer the MacBook and would like to run authentic Mac applications on it, I am going to write about the student promotion of iNMR. It is not free, but it is as cheap as it can possibly be. It's 39 euro (equivalent to 49 USD or 32 British pounds or 4200 Yens). What's so special about this license is that it is... perfectly normal! I mean: it INCLUDES direct customer support and this is quite valuable for a student that is learning NMR and a new software at the same time. Is this program difficult to learn? As for every NMR program, it CAN be hard if you are so familiar with TopSpin (or VNMR, or Jeol Delta) that you can't adapt yourself to anything else. The learning curve of iNMR is actually incredibly smooth if you start processing easy examples (1-D spectra or well acquired 2-D like TOCSY, HSQC...) before moving on to the esoteric.
Today you find video tutorials everywhere. The iNMR site offers "visual guides" instead. I feel more comfortable with the latter, first of all because English is not my native language; second of all because I can keep both the program and the guide open at the same time; third of all because I can read the guide at my own pace.
The iNMR manual is also worth of a mention. Actually I dedicated a whole post to it a few years ago. It is not the usual bulky PDF file. It looks like coming directly from Apple, because it closely resembles the manuals of Mail, Safari and iTunes for Mac OS X. Technically speaking, all these manuals are task-oriented. In simpler terms, every chapter answers to a question in the form: "I want to perform the operation X. I can I do it?". Here is an example. As you can see each chapter is just one page long.
In which cases would a student need help, then? An example is when she needs to write a script (a macro command); another extreme case is when she needs a modification to the program itself. Anyway, support also means giving a fast answer to people who can't find the time to read the manual. When you are a paying customer, you have your privileges.
For those who prefer freeware, there is the trial version of the same product (printing is disabled). iNMR has been around for 5 years by now, therefore there's plenty of reviews, short and long ones.

Sunday, 22 August 2010

Software Engineering: The problem with the production line

Is software engineering the best approach for developing software? Does it apply for the majority of the software projects or just a very few of them?

Software Engineering was an answer for the perceived "software crisis", back in 1968, in the First NATO Software Engineering Conference and it was created to solve the problems of extremely large NATO and U.S. Department of Defence projects. In the majority of these projects, the hardware was still being designed and with no hardware to test, there was plenty of time to investigate requirements and write the software specifications. Hardware controlled by the software was generally worth billions of dollars like in the case of space shuttle and the Safeguard Ballistic Missile Defence System. People's lives and national security were also at stake.

The IEEE Computer Society's Software Engineering Body of Knowledge defines "software engineering" as:

Software engineering is the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software, and the study of these approaches; that is, the application of engineering to software.

Software Engineering can be very effective when developing safety critical systems like the one for the space shuttle as described on "They Write The Right Stuff" - Fast Company article, from 1996:

The last three versions of the program - each 420,000 lines long - had just one error each. The last 11 versions of this software had a total of 17 errors. Commercial programs of equivalent complexity would have 5,000 errors.

Of course that this looks really impressive but there is more to it:

Money is not the critical constraint: The group's $35 million per year budget is a trivial slice of the NASA pie, but on a dollars-per-line basis, it makes the group among the nation's most expensive software organizations.

Another extract from this same article when they were discussing the process:

And the culture is equally intolerant of creativity, the individual coding flourishes and styles that are the signature of the all-night software world. "People ask, doesn't this process stifle creativity? You have to do exactly what the manual says, and you've got someone looking over your shoulder," says Ted Keller (senior technical manager). "The answer is, yes, the process does stifle creativity."

Many of the NATO and US Department of Defence took years, some over a decade to complete. Many had hundreds of people involved and almost half of the time was spend in requirements and design specifications, with uncountable reviews and approval cycles. Of course that, due to their reliability and quality, many are still in use today.

Software Engineering for the Masses

More and more hardware became cheap and business of all sizes needed software in order to survive and be competitive. The difference was that a the big majority of these businesses couldn't afford to pay $35 million dollars per year and neither wait for too many years to start benefiting from their software. Also, many of those software projects were not manipulating expensive hardware or dealing with life-threatening situations. Edward Yourdon, in his book Rise and Resurrection of the American Programmer, wrote:

I'm going to deliver a system to you in six months that will have 5,000 bugs in it - and you're going to be very happy!

It was clear that software engineering processes had to be adapted in order to satisfy a more impatient and lower budget legion of businesses.

The "Good Enough Software" Era

Over the decades, many new software engineering processes and methodologies were created in order to make software development faster and cheaper. The most adopted were the ones based on iterative and incremental development that evolved into the agile software development.

Agile software development and "good enough software" were an amazing improvement in bringing costs down, mitigating risks with quicker feedbacks and much faster time to market.

Regardless the methodology or process used, software projects are still failing. Some fail because they were over-budget, others because they were not delivered on time, others failed to satisfy the requirements and business goals, et al.

The main problem is that for decades, software development was seen as a production line. That's the software engineering perspective of software development. Processes and methodologies are generally much more focused in making this production line more productive, creating different management and organisational styles than actually trying to make the employees more capable. Developers are treated as mere brain-damaged programmers and are at the bottom of the food chain.

Good enough is not always good enough

Using agile, lean or any methodology is not enough. It is all about the people involved in the project and their willingness to succeed and be proud of what they produce. If developers are seen as the least important and cheapest members of a software project, the maximum that this team is going to produce is mediocre software, regardless of methodology.

A software project would have much better chances to succeed with "good people with a bad process" than "mediocre people with a good process". Good people would always find a way to improve the process, be productive and produce something that they are proud of. Mediocre people accepts whatever is established, even when it is not good enough, and will produce just what they were asked for.

In a software project, if a company wants a good software, they will need good and empowered software developers. Instead of 10 mediocre people with a good process, it would be better to have 3 or 4 good people and empower them to deliver the project.

A process, imposed by managers and people that have no clue how to write software, will just guarantee that mediocre software is predictably delivered (if lucky). On the other hand, a team of self-organised and great developers would have a better shot at creating a more efficient way to produce great software, constantly trying to improve they way they work.

A process should never be more important than the people. Managers should facilitate the work of great software developers and not tell them what to do. It should always be harder to replace a good software developers than a manager since they are the ones that know the system inside-out. Managing a few well motivated, well paid and good professionals is always easier than manage many mediocre people.

Software development is a creative and highly skilled profession that takes years to master. While software development is treated like a production line, projects will continue to fail.

Source

Software Craftsmanship: The New Imperative - ISBN 0-201-73386-2, 2002
http://en.wikipedia.org/wiki/Software_engineering
http://en.wikipedia.org/wiki/Software_crisis
IEEE Standard Computer Dictionary, ISBN 1-55937-079-3, IEEE 1990
"They Write The Right Stuff", Fast Company, http://www.fastcompany.com/magazine/06/writestuff.html
Safeguard Program: http://en.wikipedia.org/wiki/Safeguard_Program
Stephenson, W. E. "An analysis of the resources used in the SAFEGUARD system software development"
Edward Yourdon - http://yourdon.com/about/
http://en.wikipedia.org/wiki/Iterative_and_incremental_development
http://en.wikipedia.org/wiki/Agile_software_development