SmartCSV.fx version 0.6

Posted on August 8, 2016

Refactored the validation algorithm to improve maintainability.

As interest in the project grows, I have revisited the validation algorithm and refactored it.

Uniqueness rule in version 0.5

[Animated GIF: the uniqueness rule in action]

I introduced a uniqueness rule in version 0.5. It was the first validation that has to take all the other values in a column into account. That makes it slower than validating a single value against a simple rule, so I decided to try to improve the performance.
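To illustrate why a uniqueness check needs the whole column, here is a minimal sketch. It is not the SmartCSV.fx implementation; the class and method names are hypothetical, and it assumes the column values are available as a list of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not taken from the SmartCSV.fx code base.
class UniquenessSketch {

    // Returns the zero-based row indexes whose value already occurred in an earlier row.
    static List<Integer> findDuplicateRows(List<String> columnValues) {
        Map<String, Integer> firstSeen = new HashMap<>();
        List<Integer> duplicates = new ArrayList<>();
        for (int row = 0; row < columnValues.size(); row++) {
            String value = columnValues.get(row);
            if (value == null || value.isEmpty()) {
                continue; // assumption: empty values are left to a not-empty rule
            }
            // putIfAbsent returns the row of the first occurrence, or null if the value is new
            Integer previous = firstSeen.putIfAbsent(value, row);
            if (previous != null) {
                duplicates.add(row);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("a1", "b2", "a1", "c3", "b2");
        System.out.println(findDuplicateRows(ids)); // prints [2, 4]
    }
}
```

A simple rule such as a length check only looks at one value, while this check has to remember every value it has seen in the column.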

Old code

The old version of the validation algorithm walks through every possible validation rule and asks the configuration if the validation is active for the current column.

public ValidationError isValid(String column, String value, Integer lineNumber) {
    ValidationError result = null;
    if (hasConfig()) {
        ValidationError error = ValidationError.withLineNumber(lineNumber);
        checkBlankOrNull(column, value, error);
        if (value != null && !value.isEmpty()) {
            checkRegularExpression(column, value, error);
            checkAlphaNumeric(column, value, error);
            checkDate(column, value, error);
            checkMaxLength(column, value, error);
            checkMinLength(column, value, error);
            checkInteger(column, value, error);
            checkGroovy(column, value, error);
            checkValueOf(column, value, error);
            checkDouble(column, value, error);
            checkUniqueness(column, value, lineNumber, error);
        }

        if (!error.isEmpty()) {
            result = error;
        }
    }
    return result;
}

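This results in a lot of checks, because each check method first has to ask the configuration whether its rule is active for the column: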

private void checkBlankOrNull(String column, String value, ValidationError error) {
    Boolean notEmptyRule = validationConfig.getNotEmptyRuleFor(column);
    if (notEmptyRule != null && notEmptyRule) {
        if (isBlankOrNull(value)) {
            error.add("validation.message.not.empty");
        }
    }
}

Refactored code

All checks were done in a single validation class, which was straightforward, but as more validation rules were added, it became harder to maintain.

So I refactored the whole validation algorithm.

Each validation rule is now implemented as its own class, following the strategy pattern.

[Figure: UML diagram of the validators]
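The idea of one class per rule can be sketched as follows. This is a simplified stand-in, not the project's actual types: the method names mirror the calls visible in isValid(), but the List<String> error collector, the MAX_LENGTH type and the "validation.message.max.length" key are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the project's Validation type (assumed shape).
interface Validation {
    enum Type { NOT_EMPTY, MAX_LENGTH }

    Type getType();

    // Adds a message key to errors when the value violates the rule.
    void check(int row, String value, List<String> errors);
}

class NotEmptyValidation implements Validation {
    public Type getType() { return Type.NOT_EMPTY; }

    public void check(int row, String value, List<String> errors) {
        if (value == null || value.trim().isEmpty()) {
            errors.add("validation.message.not.empty");
        }
    }
}

class MaxLengthValidation implements Validation {
    private final int maxLength;

    MaxLengthValidation(int maxLength) { this.maxLength = maxLength; }

    public Type getType() { return Type.MAX_LENGTH; }

    public void check(int row, String value, List<String> errors) {
        if (value != null && value.length() > maxLength) {
            errors.add("validation.message.max.length"); // hypothetical message key
        }
    }
}

class StrategySketch {
    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        new NotEmptyValidation().check(0, "  ", errors);
        new MaxLengthValidation(3).check(0, "abcd", errors);
        System.out.println(errors);
    }
}
```

Each rule carries its own parameters and logic, so adding a rule means adding a class rather than another method in one large validator.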

All active validation rules for a column are stored in a map keyed by validation type (the value of the outer HashMap).

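// column name -> active validations for that column, keyed by validation type
private Map<String, Map<Validation.Type, Validation>> columnValidationMap = new HashMap<>();
// registering a validation (assumes the inner map for the column already exists)
Map<Validation.Type, Validation> validationMap = columnValidationMap.get(column);
validationMap.put(validation.getType(), validation);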
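One way to register rules without checking whether the inner map exists yet is Map.computeIfAbsent. The following is only a sketch under assumptions: ColumnRuleRegistry and RuleType are hypothetical names, and Predicate<String> stands in for a validation rule.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical registry; Predicate<String> stands in for a validation rule.
class ColumnRuleRegistry {
    enum RuleType { NOT_EMPTY, INTEGER }

    private final Map<String, Map<RuleType, Predicate<String>>> columnRules = new HashMap<>();

    // Creates the inner map on first use, then stores the rule under its type.
    void add(String column, RuleType type, Predicate<String> rule) {
        columnRules
            .computeIfAbsent(column, key -> new EnumMap<>(RuleType.class))
            .put(type, rule);
    }

    // Only the rules registered for the column; empty if there are none.
    Collection<Predicate<String>> activeRules(String column) {
        Map<RuleType, Predicate<String>> rules = columnRules.get(column);
        return rules == null ? Collections.<Predicate<String>>emptyList() : rules.values();
    }

    public static void main(String[] args) {
        ColumnRuleRegistry registry = new ColumnRuleRegistry();
        registry.add("age", RuleType.NOT_EMPTY, value -> !value.isEmpty());
        registry.add("age", RuleType.INTEGER, value -> value.matches("-?\\d+"));
        System.out.println(registry.activeRules("age").size());  // prints 2
        System.out.println(registry.activeRules("name").size()); // prints 0
    }
}
```

Keying the inner map by type means that registering the same rule type twice for a column replaces the earlier entry instead of running it twice.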

The isValid() method just iterates over the active rules for the column.

public ValidationError isValid(Integer row, String column, String value) {
    ValidationError result = null;
    if (hasConfig()) {
        ValidationError error = ValidationError.withLineNumber(row);
        Map<Validation.Type, Validation> validationMap = columnValidationMap.get(column);
        if (validationMap != null) {
            for (Validation validation: validationMap.values()) {
                if (validation.getType() == Validation.Type.NOT_EMPTY) {
                    validation.check(row, value, error);
                } else {
                    if (value != null && !value.isEmpty()) {
                        validation.check(row, value, error);
                    }
                }
            }
        }
        if (!error.isEmpty()) {
            result = error;
        }
    }
    return result;
}

With that design, it is a lot easier to implement new validation rules, and the performance is better because only the active rules for a column are executed.