Tuesday, January 8, 2008

Safely Storing Business Objects in Session State

Introduction
This article describes a method for creating robust ASP.NET applications when business data is stored in Session state and users may open multiple browser instances from the same desktop client. Without protection, business data may get out of sync with a particular browser window if it is altered through another browser window. The solution presented in this article supplements the ASP.NET Session ID with a unique sequence ID string written out to each page on every post-back just before it is rendered to the client browser. Different instances of business data stored in Session state are indexed by this sequence ID string and are mapped to the correct browser windows using the sequence ID found on the page. The solution remains stable when users open multiple web forms on different business data, open multiple web forms on (possibly) the same business data, or clone a window with CTRL-N.

Background
Some time ago, I developed a web form for internal use at our institution to help staff members gather complex service data from customers. The application architecture is such that the web form is driven by a controller which keeps it in sync with its business data stored in Session state. Within a month of rollout, one of the users tried keeping multiple browsers open on at least two distinct instances of business data, in order to copy and paste from one to the other. Since there was only one instance of business data in Session state at a time, this resulted in data corruption.

ASP.NET can keep application data in server memory based on a user's Session as identified by a long string called a Session ID. The Session ID is often stored in a cookie, but it is also possible to carry the Session ID in the URL. You can see the Session ID in Visual Studio by examining the Page.Session.SessionID string. The above problem occurs when multiple browser windows having the same Session ID are open at the same time within the same web application. It can happen with or without cookies, although it happens more transparently with cookies because you don't ordinarily see the value of the Session ID. Also, cookies are scoped by the application's domain and path and are thus transparently shared by multiple browser windows running the same web application.

It is instructive to think about the web application as a sequence of discrete user operations running against distinguishable instances of business data. In order to prevent user operations from being applied to the wrong instance of business data, all user operations must be identified with a particular sequence, and applied only to the instance of business data corresponding to that sequence. In this respect, it is interesting to compare the ASP.NET web application to a traditional machine-bound OO program. In both the ASP.NET web application and the OO program, there is logical data encapsulation. Based on the types of objects involved, methods will happily run against object data if the types and method signatures agree. There is only limited physical data encapsulation in the web application, however. In the traditional OO model, methods are run against object data based on object identity kept in pointer tables maintained by an interpreter or produced by a compiler. In an ASP.NET web application with state, we have only one Session ID and it's not fine enough on its own to distinguish among multiple instances of business data in the same Session.

The solution is to supplement the ordinary ASP.NET Session ID with a secondary sequence ID string. This allows the identification of a sequence of user operations and maps it to one of several stored instances of business data in Session state. Since a user generally thinks of operations within an individual browser window as a distinct set, we'll write this sequence ID out to the browser window using ViewState in an ASP.NET HiddenField control. The sequence of logical user operations can thus be identified with a sequence of post-backs.

The code behind base class
The above solution is most conveniently implemented as a base class to your code behind pages. The machinery behind it applies to any web page.

using System;
using System.Collections.Generic;
using System.Web.UI.WebControls;

public abstract class CodeBehindBase : System.Web.UI.Page
{
    private string sequenceID;
    public string SequenceID
    {
        get { return sequenceID; }
    }

    private string NewSequenceID()
    {
        // Just using the clock with one second resolution is
        // sufficient for most cases.
        // If you need to worry about more than one browser
        // window being opened per second, use milliseconds.
        return DateTime.Now.ToString();
    }
    // (...)
}

The class defines the sequence ID and a private method for creating new sequence IDs. Although it may seem that some sort of business moniker may be needed to tie a sequence of user operations to an instance of business data, it is not the case. All that is needed is a relatively unique identifier that can be tied to a particular sequence of post-backs. The mechanism uses the same sequence ID to index instances of business data in memory, so there is no dependence of this solution on how objects are named in the business model. The reason that the base class is abstract will be explained below.
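The comment in NewSequenceID() suggests moving to millisecond resolution when windows may be opened within the same second. A minimal sketch of two ways this could be done follows. The SequenceIds class, its method names, and the timestamp format are assumptions made for illustration and are not part of the article's download. The GUID variant is a substitute technique rather than the author's: it gives up the timestamp but avoids clock collisions altogether.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; not part of the article's demo project.
public static class SequenceIds
{
    // Assumed format: fixed-width, culture-independent, millisecond resolution.
    public const string ClockFormat = "yyyy-MM-dd'T'HH:mm:ss.fff";

    // Clock-based ID, as the article suggests, but with milliseconds.
    public static string FromClock(DateTime now)
    {
        return now.ToString(ClockFormat, CultureInfo.InvariantCulture);
    }

    // Alternative: a random GUID, which does not depend on the clock at all.
    public static string FromGuid()
    {
        return Guid.NewGuid().ToString("N");
    }
}
```

With such a helper, NewSequenceID() could simply return SequenceIds.FromClock(DateTime.Now). A fixed, culture-independent format also makes it possible to parse the ID back into a time later, which DateTime.Now.ToString() does not guarantee across server culture settings.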

The code behind base class adds an ASP.NET HiddenField control to the control tree during OnInit. This field will be used to write the sequence ID out to the browser. A dictionary used to index the different instances of business data is also initialized and stored in Session state.

protected sealed override void OnInit(EventArgs e)
{
    // Adds a hidden field to the control tree to hold the
    // sequence number in ViewState so we can identify the
    // browser window. Must be added here or else it will
    // miss having ViewState loaded.
    HiddenField hdSequenceID = new HiddenField();
    hdSequenceID.ID = "hdSequenceID";
    hdSequenceID.Value = "";
    Form.Controls.Add(hdSequenceID);

    // Initializes the data structure for linking sequence IDs to
    // form data
    if (Session["FormDataObjects"] == null)
    {
        Session["FormDataObjects"] = new Dictionary<string, object>();
    }

    // Page_Init is called here if it is defined in the derived class.
    base.OnInit(e);
}

Note that this amounts to storing the sequence ID in the ViewState of a dynamically created control. The control must be created and added to the control tree during OnInit so that ViewState is available in OnLoad on subsequent post-backs. Instances of the business data must be indexed by the same sequence ID, so the code behind also has rudimentary facilities for handling business data. To avoid dependencies on the business model, this is done at the level of object.

protected abstract object NewFormData();

private object GetFormData(string sid)
{
    Dictionary<string, object> formDataObjects =
        (Dictionary<string, object>)Session["FormDataObjects"];
    if (formDataObjects.ContainsKey(sid))
    {
        return formDataObjects[sid];
    }
    return null;
}

protected object GetFormData()
{
    object retval = GetFormData(sequenceID);
    if (retval == null)
    {
        throw new Exception(
            "Form data was not found in the session. " +
            "This could have been caused by " +
            "opening a duplicate window using \"Ctrl-N\". " +
            "Close this window and the duplicate window " +
            "should be OK.");
    }
    return retval;
}

Derived classes must implement the NewFormData() method. This method will be called when the page is accessed for the first time to load blank form data. Since the details of object creation may be business model dependent, this method is left abstract so the developer of a derived class must implement it. In turn, when accessing business data, derived classes must use the GetFormData() method to get the instance corresponding to the value of the sequence ID. Note that the only two "special" methods the derived class developer needs to know about are NewFormData() and GetFormData(). If a particular business object was not found in Session state, it is usually because its corresponding sequence ID has been removed or updated. This most often happens when the user opens a window clone using CTRL-N, works with the clone for a while, and returns to the original window. Working with the clone will have caused the sequence ID to be updated and the original window will become invalid.

On page load, ViewState will be available so that the sequence ID can be read from the post-back. The sequence ID is set directly on the Page object. If the page is not a post-back, then the sequence ID just gets whatever value was set during OnInit (usually the empty string). When the page is not a post-back, further processing ensures that a new instance of business data is loaded and a new sequence ID is generated for it. It uses two routines, LinkFormData() and UnlinkFormData(), to maintain the correspondence between instances of business data and the sequence ID stored on the page.

protected sealed override void OnLoad(EventArgs e)
{
    // Set the SequenceID coming back from the page.
    sequenceID = ((HiddenField)Form.FindControl("hdSequenceID")).Value;

    if (!Page.IsPostBack)
    {
        // Get an instance of form data to use
        object formData = null;
        if (Page.Request.Params["SequenceID"] != null)
        {
            // If this is a Server.Transfer, check if the SequenceID is
            // in the query string.
            // If so, load the form data from the old page.
            // If you're worried about spoofing,
            // you'd probably want to turn this off.
            string sid = Page.Request.Params["SequenceID"];
            formData = GetFormData(sid);
            UnlinkFormData(sid);
        }
        else
        {
            // Otherwise get new form data.
            formData = NewFormData();
        }

        // If not a post-back, then the sequenceID would have the
        // value set in OnInit.
        // So create a new sequence ID here and link it to the form data.
        sequenceID = NewSequenceID();
        LinkFormData(sequenceID, formData);
    }
    // Calls Page_Load()
    base.OnLoad(e);
}

The code also contains a convenience that allows instances of business data to be maintained across a Server.Transfer(). In that case, make sure the transferring page passes the SequenceID in the URL query string. At the PreRender stage, the code ensures that a fresh sequence ID is written out. This prevents problems when the user clones a browser window using CTRL-N. Since clones copy ViewState, clones get the same sequence ID. Unless the sequence ID changes on every post-back, business data can once again change underneath windows copied with CTRL-N.

protected sealed override void OnPreRender(EventArgs e)
{
    // Regenerate the sequenceID on every post-back.
    // This prevents problems arising when the
    // user opens up multiple browser windows with "Ctrl-N".
    // The new window will still work,
    // but older windows will safely error out.

    // Get the form data
    object formData = GetFormData(sequenceID);
    // Unlink the old sequence ID
    UnlinkFormData(sequenceID);
    // Generate a new sequence ID
    sequenceID = NewSequenceID();
    // Relink the form data
    LinkFormData(sequenceID, formData);
    // Write the sequence ID out to the page.
    ((HiddenField)this.FindControl("hdSequenceID")).Value = sequenceID;

    // Calls Page_PreRender.
    base.OnPreRender(e);
}
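To see why regenerating the ID invalidates cloned windows, the bookkeeping can be modelled outside ASP.NET. The sketch below is a simplified stand-in, not the article's base class: SequenceStore, its members, and the counter-based IDs are all hypothetical, and a plain dictionary takes the place of Session state.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical model of the sequence ID bookkeeping; a plain
// dictionary stands in for Session["FormDataObjects"].
public class SequenceStore
{
    private readonly Dictionary<string, object> items =
        new Dictionary<string, object>();
    private int counter;

    // Illustrative ID scheme only; the article uses the clock.
    private string NextId()
    {
        counter++;
        return "seq-" + counter;
    }

    // First visit: link new form data to a new ID.
    public string Add(object formData)
    {
        string sid = NextId();
        items[sid] = formData;
        return sid;
    }

    public object Get(string sid)
    {
        object value;
        return items.TryGetValue(sid, out value) ? value : null;
    }

    // What happens at PreRender: unlink the old ID, relink under a new one.
    public string Rotate(string sid)
    {
        object formData;
        if (!items.TryGetValue(sid, out formData))
        {
            throw new InvalidOperationException("Stale sequence ID: " + sid);
        }
        items.Remove(sid);
        string next = NextId();
        items[next] = formData;
        return next;
    }
}

public static class Demo
{
    public static void Main()
    {
        SequenceStore store = new SequenceStore();
        string original = store.Add(new StringBuilder("abc"));

        // A clone made with CTRL-N carries the same ID as the original.
        // The first window to post back rotates the ID...
        string afterPostBack = store.Rotate(original);

        Console.WriteLine(store.Get(afterPostBack) != null); // True
        // ...so the other window's copy of the ID no longer resolves.
        Console.WriteLine(store.Get(original) == null);      // True
    }
}
```

The window that posts back first keeps working under the new ID, while any window still holding the old ID fails the lookup instead of silently editing shared data.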

The LinkFormData() method mentioned above uses the Equals() method of the business object to detect an attempt to link a second instance of business data that the business layer considers to be the same as one already linked. Through the application, a user may for example open multiple instances of the same customer record. If it is important to prevent this, the developer can override Equals() on the business data object.

private void LinkFormData(string sid, object formData)
{
    Dictionary<string, object> formDataObjects =
        (Dictionary<string, object>)Session["FormDataObjects"];
    foreach (object o in formDataObjects.Values)
    {
        if (formData.Equals(o))
        {
            throw new Exception(
                "Tried to load multiple instances of the same form data.");
        }
    }
    formDataObjects[sid] = formData;
}
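UnlinkFormData() is called in OnLoad and OnPreRender but is not listed in the article. A minimal sketch follows, assuming it does nothing more than remove the entry for the given sequence ID from the same dictionary. The FormDataLinks class and its Unlink method are hypothetical names, written as a static helper so that the logic stands apart from Session state.

```csharp
using System.Collections.Generic;

// Hypothetical helper; the article's own UnlinkFormData() is not shown.
public static class FormDataLinks
{
    // Removes the form data linked to sid, if any.
    // Returns true when an entry was actually removed.
    public static bool Unlink(
        Dictionary<string, object> formDataObjects, string sid)
    {
        // Dictionary.Remove throws on a null key, so guard against it.
        if (sid == null)
        {
            return false;
        }
        // Remove is a no-op (returns false) when the key is absent,
        // which covers the empty ID on a first visit.
        return formDataObjects.Remove(sid);
    }
}
```

Inside the base class, a private UnlinkFormData(string sid) could then pass (Dictionary<string, object>)Session["FormDataObjects"] and sid to this helper.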

Using the code
The demo project associated with this article requires Visual Studio 2005 and .NET 2.0. It contains the above code behind base class and two test pages that can be opened and run within Visual Studio. The test application is basically just a text box with an "Append" button that allows users to append characters to a string. The business data class is simply an instance of System.Text.StringBuilder. Note that the string class is often a bad choice for these kinds of demos because it is an immutable sealed reference type. Appending to a string therefore necessarily involves the implicit creation of new strings, which obscures what is happening to the stored instance.

The first test page, BadSession.aspx, does not use the code behind class. To see the Session state problem, open BadSession.aspx in a browser window and type in some data. To open up a second window, use the "Pop" button. Work with the second window for a little while and go back to the first window. Data from the second window will have contaminated the first window through the common shared Session state. To see the problem again in a slightly different way, use CTRL-N instead of "Pop." The only difference is that "Pop" will additionally clear business data, since the initial pop is not a post-back.

The second test page uses the code behind base class developed above. The only additional difference is that instead of loading a new instance of StringBuilder into Session state during Page_Load, it overrides the NewFormData() method and lets the base class do it. Subsequently, when it needs the business data, it uses GetFormData() instead of accessing Session state directly. Each new window (using "Pop") now corresponds to a distinct instance of form data. If the user uses CTRL-N to clone a window, then the first clone to generate a post-back grabs the sequence ID and the rest of the clones become safely invalid. This is why the sequence ID is regenerated on every post-back.

The reader is invited to try this out with a custom business class with overridden Equals(). Care must be taken in defining equality here. For example, if equality is determined only by the content of business data, then there is a danger that separate evolutions of business data may wind up having similar enough content to be considered the same when really they're not. It is best to define equality based on some known unique identifier, like a database primary key.
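As an illustration of identity-based equality, here is a minimal sketch of a business class whose Equals() compares only a primary key. The CustomerRecord class and its members are hypothetical and do not come from the demo project.

```csharp
using System;

// Hypothetical business class, for illustration only.
public sealed class CustomerRecord
{
    private readonly int customerId; // assumed database primary key
    private string notes;

    public CustomerRecord(int customerId)
    {
        this.customerId = customerId;
    }

    public int CustomerId
    {
        get { return customerId; }
    }

    public string Notes
    {
        get { return notes; }
        set { notes = value; }
    }

    // Two records are "the same" when they share a primary key,
    // regardless of how their editable content has diverged.
    public override bool Equals(object obj)
    {
        CustomerRecord other = obj as CustomerRecord;
        return other != null && other.customerId == customerId;
    }

    // Must stay consistent with Equals.
    public override int GetHashCode()
    {
        return customerId.GetHashCode();
    }
}
```

With this definition, two windows opened on customer 42 count as the same record even after their Notes differ, so LinkFormData() would reject the second one, while records for different customers with identical notes remain distinct.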

Points of interest
The implementation above uses ViewState to store the sequence ID. If the developer is worried about ViewState being turned off, then the implementation can be changed to use ControlState in the HiddenField control.
The implementation uses a date string as the sequence ID. Any sequence ID will do, as long as it is distinct on every post-back for all open windows. It does not have to be a globally unique ID, just unique within the scope of the web application running client-side.
The developer may elect to override Equals() in the business data object. This can prevent multiple instances of the same business data -- as defined by business rules -- from being linked to multiple windows at the same client. It does not prevent multiple windows from being open on the same business data on different clients, since they will have different Session IDs to begin with. It is beyond the scope of this article to address that.
The "sequence ID" itself doesn't have to be a constant identifier or even part of a sequence. The user operations are the sequence that we're interested in and the "sequence ID" merely denotes the fact that the user operations need to be mapped to a particular instance of business data.
Old instances of business data are not garbage collected from Session state in this implementation. If the sequence ID implementation uses a timestamp, a scan can be added to OnLoad to remove old business data instances when their age exceeds some timeout period.
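The cleanup scan mentioned in the last point could look roughly like the sketch below. It assumes the sequence IDs are timestamps written in a fixed, culture-independent format. That format, the FormDataCleanup class, and the RemoveExpired method are all assumptions made for illustration, since the article's NewSequenceID() uses the culture-dependent DateTime.Now.ToString().

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper; not part of the article's demo project.
public static class FormDataCleanup
{
    // Assumed sequence ID format (must match whatever NewSequenceID emits).
    public const string IdFormat = "yyyy-MM-dd'T'HH:mm:ss.fff";

    // Removes entries whose timestamp ID is older than maxAge.
    // Returns the number of entries removed.
    public static int RemoveExpired(
        Dictionary<string, object> formDataObjects,
        DateTime now,
        TimeSpan maxAge)
    {
        // Collect keys first: a dictionary must not be modified
        // while it is being enumerated.
        List<string> expired = new List<string>();
        foreach (string sid in formDataObjects.Keys)
        {
            DateTime created;
            bool parsed = DateTime.TryParseExact(
                sid, IdFormat, CultureInfo.InvariantCulture,
                DateTimeStyles.None, out created);
            // IDs that do not parse are left alone rather than guessed at.
            if (parsed && now - created > maxAge)
            {
                expired.Add(sid);
            }
        }
        foreach (string sid in expired)
        {
            formDataObjects.Remove(sid);
        }
        return expired.Count;
    }
}
```

Because the sequence ID is regenerated on every post-back, the timestamp reflects the last activity in a window rather than when it was first opened, so the timeout behaves as an idle timeout.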
