Friday, July 10, 2009

Validation Architecture in Our Application Framework

Validation is a very important topic that is poorly handled in many frameworks. Although they tend to call their validation mechanism a “framework”, these mechanisms have many drawbacks, as I will explain here, and the drawbacks are not easy to fix because they are architectural. Validation both improves client interaction and catches problems earlier. For instance, a database error reported back to a user may not give enough information to correct the input, and a missing validation may lead to erroneous data. Validation takes many forms in software: data, input, requirement, architecture, class, and object validation, among others.

Problems in Current Validation Mechanisms

1- Validation is a Broken Link between Different Types of Frameworks: Every framework needs some sort of validation. As a result, every framework has its own validation mechanism, which adds a burden on the developer. What we need is a single point for validation, and this single mechanism should meet both server-side and client-side requirements. The broken link is easy to see in validation error handling and reporting code. Yes, when I talked about MVC integration pains (Better Java Web Framework blog post), this was a part of it.

2- Validations are not really Metadata-based: Needless to say, metadata-based validation is a must. Although I will mention only database metadata, other systems may have many kinds of metadata. Some frameworks provide features to declare validation rules, but the task of declaring them is left to the developer. Then they claim to be metadata-based. This is not true. Metadata-based validation is automatic validation (nullability, data type, data format, data range, data length, data security) that uses the metadata of persistent objects without requiring any new declaration. It saddens me to read documentation such as “Add a ‘Not Null’ annotation to your persistent object field to make it required”. If I have to add this line, why do I need metadata-based validation? I could just add the check to the business-tier code. If that code is generated then this is OK, but adding the rules one by one is a really tedious job for programmers.

3- Persistent Object Metadata Declaration Place: Another problem is where to put object metadata. Object metadata is similar to database table metadata but may contain additional information. For many reasons, having object metadata is mandatory. At first many frameworks used XML; more recently they use annotations, mixed with ugly validation annotation rules. As for XML, one of my primary programming rules is that the programmer should have as few touch points as possible during development. What I mean is that metadata should be placed in the objects, where it is easy to reach, read, and change. We used an array block for metadata declaration in the initialize method of objects. I am also not sure about using annotations for metadata.

4- Use of Annotation and XML for Validation: Again we see the overuse of something (annotations); history is repeating. I believe many of you have been confused when reading a mess of annotations lingering on top of classes or methods. Let’s do every programming job with annotations (!). Although using annotations (JSR 303) may seem like a good idea, let me explain why I don’t like this approach: you can’t declare every validation rule with annotations, and that is reason enough. For example, suppose you have to validate a parent object against its child objects’ data (say “SalesOrder” with “SalesOrderLine” persistent objects, where the rule is “Lines’ total amount can’t be bigger than header amount”) and raise a validation exception if the rule is not met. How would you declare this validation with an annotation? You could add a validation method, but then there are two places to look up validation rules. Instead, I strongly favor programmatic metadata-based validation; it is simple and effective.
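To make the SalesOrder example concrete, here is a minimal sketch of how such a parent/child rule could be written programmatically. The classes, fields, and the exception type are hypothetical stand-ins invented for illustration; they are not our framework's real persistent objects.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical exception type for this sketch (unchecked to keep the example short).
class OrderValidationException extends RuntimeException {
    OrderValidationException(String message) { super(message); }
}

// Hypothetical child object: one line of a sales order.
class SalesOrderLine {
    final BigDecimal amount;
    SalesOrderLine(BigDecimal amount) { this.amount = amount; }
}

// Hypothetical parent object holding the header amount and its lines.
class SalesOrder {
    private final BigDecimal headerAmount;
    private final List<SalesOrderLine> lines = new ArrayList<>();

    SalesOrder(BigDecimal headerAmount) { this.headerAmount = headerAmount; }

    void addLine(SalesOrderLine line) { lines.add(line); }

    // Sum of all line amounts.
    BigDecimal linesTotal() {
        BigDecimal total = BigDecimal.ZERO;
        for (SalesOrderLine line : lines) {
            total = total.add(line.amount);
        }
        return total;
    }

    // The cross-object rule sits in the same method as any other rule for this object.
    void validate() {
        BigDecimal total = linesTotal();
        if (total.compareTo(headerAmount) > 0) {
            throw new OrderValidationException(
                "Lines' total amount " + total + " can't be bigger than header amount " + headerAmount);
        }
    }
}
```

An order with a header amount of 100.00 and lines of 60.00 and 40.00 passes; adding one more line of 0.01 makes validate() throw.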

5- Validation After Setting Persistent Object Fields: One of the major problems is where to store data that has not been validated yet. One pitfall of many validation solutions is that they validate after setting the fields of persistent objects. Let’s say we have a web application with an update form. How can we validate data that can’t even be set to a field (formatted data, or a String value for a number field)? When the user enters data that fails validation, how do we roll back the values already set on the object if we want to use the dirty object again? The invalid object may be replaced with a fresh one, but this is not easy in some page conditions (in sub-tabs the user may re-edit the same object, and you can’t get a fresh object from the database). This is why Data Transfer Objects or Value Objects are required. But that approach may add a new object layer without real benefit (code duplicated from the persistent objects, the same setters and getters). What we did to solve this problem is attach a validator object (containing a HashMap of field names and values) to each persistent object to hold the unvalidated data. If validation succeeds, these values are committed to the object; if not, an exception is thrown and the object’s fields are left intact.
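The staging idea can be sketched in a few lines. This is only an illustration under my own assumptions: StagedEmployee, StagingValidationException, and validateAndCommit are hypothetical names, and the two hard-coded checks stand in for the metadata-driven checks of the real framework.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical exception carrying field-based error messages.
class StagingValidationException extends RuntimeException {
    private final Map<String, String> errors;
    StagingValidationException(Map<String, String> errors) {
        super("Validation failed: " + errors);
        this.errors = new LinkedHashMap<>(errors);
    }
    Map<String, String> getErrors() { return errors; }
}

// Hypothetical persistent object with an attached map of staged raw input.
class StagedEmployee {
    private String name = "";     // persistent field
    private int departmentNo;     // persistent field

    // Raw user input is kept here until it passes validation.
    private final Map<String, String> staged = new HashMap<>();

    void setValue(String field, String rawValue) { staged.put(field, rawValue); }
    String getName() { return name; }
    int getDepartmentNo() { return departmentNo; }

    // Validates only the staged values; the fields change only if every check passes.
    void validateAndCommit() {
        Map<String, String> errors = new LinkedHashMap<>();
        String newName = name;
        int newDepartmentNo = departmentNo;

        if (staged.containsKey("Name")) {
            String raw = staged.get("Name");
            if (raw == null || raw.trim().isEmpty()) {
                errors.put("Name", "Name is required");
            } else {
                newName = raw.trim();
            }
        }
        if (staged.containsKey("DepartmentNo")) {
            String raw = staged.get("DepartmentNo");
            try {
                newDepartmentNo = Integer.parseInt(raw == null ? "" : raw.trim());
            } catch (NumberFormatException e) {
                errors.put("DepartmentNo", "Department No must be a whole number");
            }
        }

        if (!errors.isEmpty()) {
            // Fields are untouched; staged input stays so the form can redisplay it.
            throw new StagingValidationException(errors);
        }
        name = newName;
        departmentNo = newDepartmentNo;
        staged.clear();
    }
}
```

Note that the String "abc" never reaches the int field: it fails while still in the map, which is exactly the case that validation after setting cannot handle.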

What to Validate
Our programmatic metadata-based approach contains the following validations:
1- Object Consistency Validation: In our framework, persistent object consistency validation is done when the object is created for the first time. What do I mean by object consistency validation? It is the validation of an object’s metadata against the database table structure. For example, if you declare a field as “Primary Key” or “Foreign Key” and that declaration differs from the database structure, the object throws an exception. Another example is an object field whose column doesn’t exist in, or has been removed from, the database. Missing database fields are reported in a validation exception. This validation is very important because it keeps the object and the database consistent.
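As a rough sketch of the missing-column part of this check, the code below compares the column names an object declares with the column names a table actually has. ConsistencyChecker and its methods are hypothetical names, and the actual columns are passed in as a plain set; in a real application they might be read through JDBC's DatabaseMetaData.getColumns, which this sketch deliberately leaves out so it runs without a database.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; not the framework's real consistency validator.
class ConsistencyChecker {

    // Returns the declared columns that the table does not have, in declaration order.
    // Column names are compared case-insensitively.
    static List<String> missingColumns(List<String> declaredColumns, Set<String> actualColumns) {
        Set<String> actualUpper = new HashSet<>();
        for (String column : actualColumns) {
            actualUpper.add(column.toUpperCase(Locale.ROOT));
        }
        List<String> missing = new ArrayList<>();
        for (String column : declaredColumns) {
            if (!actualUpper.contains(column.toUpperCase(Locale.ROOT))) {
                missing.add(column);
            }
        }
        return missing;
    }

    // Throws if any declared column is missing, naming all of them in one message.
    static void check(String tableName, List<String> declaredColumns, Set<String> actualColumns) {
        List<String> missing = missingColumns(declaredColumns, actualColumns);
        if (!missing.isEmpty()) {
            throw new IllegalStateException(
                "Table " + tableName + " is missing declared columns: " + missing);
        }
    }
}
```

Running the check once, when the first object of a class is created, means a renamed or dropped column is reported at startup instead of surfacing later as an SQL error.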

2- Input Data Validation: A program may take input from different kinds of systems or users. There are many scenarios: a user may enter data from a web client application, or a batch program may invoke a web service to enter data into the system. An application should validate any data coming from the outside. Even within the same enterprise application, a web module should validate data coming from another module if their tables are different. Our metadata-based validation frees the programmer from adding rules for standard validations, including nullability, data type, data format, data range, data length, and data security. If additional validations are required, the programmer adds these rules to the standard validation method (dbValidate). When validation runs, the standard validations are executed automatically first, followed by the extra validations added by the programmer. If validation errors occur, they are reported to the user field by field. In addition to programmatic validation, we added a Rule Engine (I mentioned it in my previous blog post) for dynamic validation rules, which are the main benefit of XML-based validation definitions. We put the validation rules that change over time into the Rule Engine definitions.

Validation Architecture
Let’s delve into our validation architecture:
1- Persistent Object Metadata: Every persistent class declares its metadata in its initialize method. We tried to keep the metadata minimal. Some metadata is read from the database and doesn’t need to be declared here (e.g. FK table names). This metadata is used by our validator class for general validations. A persistent object may contain other fields that are not persistent; these are not part of this declaration. Only the table columns are included in this array.


public class DBEmployee extends DBObject {
    private int m_nNo;
    private String m_sName;
    private String m_sSurname;
    private int m_nDepartmentNo;
    private int m_nAddressNo;
    ...
    protected void initialize() {
        super.initialize();
        if (isInitRequired()) {
            // One row per table column; see the Array Reference for the column order.
            setTableInfo(new DBTableInfo(this, new String[][]{
                {"No", "NO", "true", "", "No", "true"},
                {"Name", "NAME", "true", "25", "Name"},
                {"Surname", "SURNAME", "true", "25", "Surname"},
                {"DepartmentNo", "F_DEPARTMENT_NO", "true", "", "Department No"},
                {"AddressNo", "F_ADDRESS_NO", "true", "", "Address No"}},
                "TEST.EMPLOYEE"));
        }
    }
}

Array Reference:
[FieldName],[ColumnName],[Required],[Length],[Text],[Identity],[FieldViewer],[PrimaryKey],[ForeignKey] …
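To show how rows in this format can drive validation without per-field rules, here is a small sketch that reads only the first five positions ([FieldName], [ColumnName], [Required], [Length], [Text]) and applies the required and max-length checks. MetadataRules is a hypothetical class, and treating an empty [Length] as "no length limit" is my assumption for the sketch, not a statement about how DBTableInfo interprets it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical validator driven by metadata rows; not the framework's DBValidator.
class MetadataRules {

    // Returns field name -> error message, in the order the rows are declared.
    // Each row must have at least five elements:
    // [FieldName], [ColumnName], [Required], [Length], [Text].
    static Map<String, String> validate(String[][] rows, Map<String, String> values) {
        Map<String, String> errors = new LinkedHashMap<>();
        for (String[] row : rows) {
            String field = row[0];
            boolean required = Boolean.parseBoolean(row[2]);
            String lengthText = row[3];   // assumption: "" means no length limit
            String label = row[4];

            String value = values.get(field);
            boolean blank = value == null || value.trim().isEmpty();
            if (blank) {
                if (required) {
                    errors.put(field, label + " is required");
                }
                continue; // nothing else to check on an empty value
            }
            if (!lengthText.isEmpty() && value.length() > Integer.parseInt(lengthText)) {
                errors.put(field, label + " must be at most " + lengthText + " characters");
            }
        }
        return errors;
    }
}
```

Given the Name and Surname rows above, a 30-character Name and a missing Surname produce one error per field, keyed by field name so each message can be shown next to its input.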

2- Fetching and Setting Data to be Validated: Every persistent object has a corresponding validator object that stores the data changed by the user. For a new record, the validator object is empty. For an update, the validator is filled with data from the persistent fields (which are themselves filled by the database fetch). In the input form, the validator data is what gets displayed and what gets set before validation.

Update:
DB --[FETCH]--> Persistent Object --[FETCH]--> DBValidator --[DISPLAY]--> Input Form
Insert:
Persistent Object --[NEW]--> DBValidator --[DISPLAY EMPTY]--> Input Form


A part of the input form (the max length control and the input text coloring for required fields are driven by metadata, and field-based validation reporting is possible):

<tr>
<td class="tdBgK" align="right" valign="top"><%=appLocale.getText("Name")%>:</td>
<td class="tdBgA" valign="top">
<input type="text" class="inputTxt" <%=dbEmployee.isFieldRequired("Name")?UIConstants.REQUIRED:""%> name="Name" size="25" maxlength="<%=dbEmployee.getFieldMaxLength("Name")%>" value="<%=dbEmployee.getValue("Name")%>"><span class="errorMessage"><%=dbEmployee.getError("Name")%></span></td>
</tr>

When the user enters data, it is initially stored in the validator object. After all fields are set, the validate method is called (manually with dbValidate or implicitly in the dbSave method). Validation occurs only for values set this way. The programmer may set persistent object fields directly, but those values won’t be validated.

Save:
Input Form --[SET]--> DBValidator --[VALIDATE-SET]--> Persistent Object --[SAVE]--> DB

Servlet method handling set and validation:


private void saveEmployee() throws DBValidationException {
    DBEmployee dbEmployee = (DBEmployee) getFromSession("dbEmployee");
    // setValue stores the raw request parameters in the internal validator object.
    dbEmployee.setValue("Name", getParString("Name"));
    dbEmployee.setValue("Surname", getParString("Surname"));
    dbEmployee.setValue("DepartmentNo", getParString("DepartmentNo"));
    dbEmployee.setValue("AddressNo", getParString("AddressNo"));
    // Validates the stored values and commits them to the object's fields on success.
    dbEmployee.dbValidate();
}

In the code above, the internal validator object is filled by the setValue method.

3- Validation Execution: Once the user input is set, validation is executed by calling the dbValidate method. In dbValidate, the framework performs the standard validation automatically, without any additional code. If extra validation is required it can be added here, which keeps the validation logic in one place.



public class DBEmployee extends DBObject {
    ...
    public void dbValidate() throws DBValidationException {
        // Standard metadata-based validations run first.
        super.dbValidate();

        // Extra, object-specific validation.
        if ("TestValue".equals(getValue("Name")))
            setError("Name", "A Name Validation Error Message");

        if (!isValidated())
            raiseValidationException();
        else
            commitValidatedValues();
    }
}

If a validation error occurs, this method throws an exception, and the error is displayed to the right of the input text. If no validation error occurs, the values in the validator object are committed to the persistent object. The object is now ready to be saved to the database; the save uses the persistent object’s field values.

This architecture may be plugged into any application environment. The key point is that general validation should be done by your application framework without any manual coding. For example, a developer won’t find enough time to manually parse a date field and report a user-friendly error to the user. If that work is skipped, the quality of your application suffers. A value holder object (Validator) is a necessity, not a redundancy. We use server-side validation by default, but in some parts of our applications we used object metadata with client-side JavaScript code to validate on the client. Revealing validation logic to the client is not good practice unless it is necessary.
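As an illustration of the date-field case, the sketch below shows the kind of parsing and error reporting that belongs in the framework once, instead of in every servlet. It assumes Java 8 or later for java.time, a dd.MM.yyyy input format, and a hypothetical DateFieldParser class; none of these come from our framework.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Map;

// Hypothetical helper for this sketch.
class DateFieldParser {

    // STRICT resolution rejects impossible dates such as 31.02.2009 instead of adjusting them.
    // Strict mode needs the "uuuu" year pattern rather than "yyyy".
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("dd.MM.uuuu").withResolverStyle(ResolverStyle.STRICT);

    // Returns the parsed date, or null after recording a field-based error message.
    static LocalDate parse(String field, String rawValue, Map<String, String> errors) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            errors.put(field, field + " is required");
            return null;
        }
        try {
            return LocalDate.parse(rawValue.trim(), FORMAT);
        } catch (DateTimeParseException e) {
            errors.put(field, field + " must be a valid date in the form dd.MM.yyyy");
            return null;
        }
    }
}
```

With this in one place, an input such as 31.02.2009 produces a readable message next to the field rather than an exception page.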
