Marco Bellinaso's Blog

 Wednesday, March 29, 2006

In every site or web-based application there is some data that doesn’t change very often, which is requested very frequently by a lot of end users. Examples are the list of article categories, or the e-store’s product categories and product items, the list of countries and states, and so on. The most common solution to increase the performance of your site is to implement a caching system for that type of data, so that once the data is retrieved from the data store once, it will be kept in memory for some interval, and subsequent requests for the same data will retrieve it from the memory cache, avoiding a round-trip to the database server and running another query. This will save processing time and network traffic, and thus produce a faster output to the user. In ASP.NET 1.x, the System.Web.Caching.Cache class was commonly used to cache data. The cache works as an extended dictionary collection, whereby each entry has a key and a related value. You can store an item in cache by writing Cache.Insert("key", data), and you retrieve it by writing data = Cache["key"]. The Insert method of the Cache class has a number of other overloads through which you can specify either the cached data’s expiration time or how long the data will be kept in the cache, and whether it is a sliding interval (a sliding interval is reset every time the cached data is accessed), plus a dependency to a file or other cached items. When the dependent file is changed, the expiration time is reached, or the interval passes, the data will be purged from the cache, and at the next request you will need to query the data directly from the database, storing it into the cache again.

The New Caching with SQL Dependency Support

One limitation of the ASP.NET 1.x cache is that when the expiration time or caching interval is reached, the data is removed from the cache and you have to read it again from the DB even if it hasn’t actually changed in the database. Conversely, if you cache the data for 30 minutes, and the data changes the second after you cache it, you’ll be displaying stale and out-of-sync data for almost 30 minutes. This could be unacceptable for some types of information, such as the price of a product or the number of items in stock. The Cache class has been enhanced for ASP.NET 2.0; it now supports dependencies to database tables, in addition to files and other cached items. In practice, you can cache the data for an indeterminate period, until the data in the source database’s table actually changes. This cache invalidation mechanism works for all versions of SQL Server (version 7 and later), where it is based on polling and triggers. SQL Server 2005 adds another type of cache invalidation based on receiving events from the database, so it’s more efficient if you know you’ll be deploying to SQL Server 2005. In addition, the polling method only watches for table-level changes, but the SQL Server 2005 event method enables you to watch individual rows to see if they’ve been changed.

SQL Server 7+ Support for Table-level SQL Dependencies

When using the polling style of cache invalidation, ASP.NET 2.0 checks a counter in a support table every so often (the interval being configurable), and if the retrieved value is greater than the value read on the last check, then the data was changed, and thus it removes it from cache. There is one counter (and therefore one record in the AspNet_CacheTablesForChangeNotification support table) for each table for which you want to add SQL-dependency support. The counter is incremented by a table-specific trigger. You create the required table, trigger, and stored procedure needed to support the SQL dependency system by executing the aspnet_regsql.exe command-line tool from Visual Studio’s command prompt. Run it once to add the support at the database level to create the AspNet_CacheTablesForChangeNotification table and the supporting stored procedure, as follows (this assumes your database is a local SQL Server Express instance named SqlExpress):

aspnet_regsql.exe -E -S .\SqlExpress -d aspnetdb -ed

The -E option specifies that you’re using Windows integrated security and thus don’t need to pass username and password credentials (you would need to use the -U and -P parameters, respectively, otherwise). The -S option specifies the SQL Server instance name (specifying localhost\SqlExpress is the same). SqlExpress is the default instance name you get when you install SQL Server 2005 Express. The -d option specifies the database name (aspnetdb), and the -ed tells it to “enable database.”

The next step is to add support for a specific table, which means you must create a record in the AspNet_CacheTablesForChangeNotification table, and a trigger for the table to which you’re adding support:

aspnet_regsql.exe -E -S .\SqlExpress -d aspnetdb -t Customers -et

In addition to the first command description given earlier, the -t parameter specifies the table name, and the -et parameter stands for “enable table.” For the preceding commands to work, the aspnetdb database must be already attached to a SQL Server instance. This is already the case for SQL Server 7/2000 and for the fully featured editions of SQL Server 2005; however, with SQL Server 2005 Express, you typically have the database attach dynamically at runtime, so that you can do the XCopy deployment for the database as well as for the rest of the site’s files. If that’s your situation, you must first temporarily attach the database file, run aspnet_regsql.exe, and then detach the database. The attachment/detachment can be done by running the sp_attach_db and sp_detach_db system stored procedures. You can execute them from SQL Server Management Studio Express (downloadable from Microsoft if it didn’t come with your SQL Express installation), or from the sqlcmd.exe command-line program, run from the VS 2005’s command prompt. Many of the SQL commands used as examples in this book use the sqlcmd program because everyone should have this program. It is started from a Visual Studio command prompt as follows (the command-line options are similar to those of aspnet_regsql as explained above):

sqlcmd -E -S .\SqlExpress

Once you are in the sqlcmd program, you run the following command to attach the database:

sp_attach_db "aspnetdb", "c:\Websites\TheBeerHouse\App_Data\AspNetDB.mdf"

go

Then run the two aspnet_regsql commands listed above followed by “go” on a separate line to end the batch, and finally detach the database as follows:

sp_detach_db "aspnetdb"

go

To close the sqlcmd shell, just type quit or exit and press Enter. Note that if you were running the stored procedures from SQL Server Management Studio, you would need to replace the double quotes with single quotes, and the GO command would not be needed.

The last thing to do to complete the SQL dependency configuration is to write the polling settings in the web.config file. You can configure different polling profiles for the same database, or different settings for different databases. This is done by adding entries under the system.web/caching/sqlCacheDependency/databases section, as shown below:

<configuration>

   <connectionStrings>

      <add name="SiteDB" connectionString="Data Source=.\SQLExpress;Integrated Security=True;User Instance=True;AttachDBFilename=|DataDirectory|AspNetDB.mdf" />

   </connectionStrings>

 

   <system.web>

      <caching>

         <sqlCacheDependency enabled="true" pollTime="10000">

            <databases>

               <add name="SiteDB-Cache" connectionStringName="SiteDB"

                  pollTime="2000" />

            </databases>

         </sqlCacheDependency>

      </caching>

 

      <!-- other settings here... -->

   </system.web>

</configuration>

As you see, there is an entry named SiteDB-cache that refers to the databases pointed to by the connection string called SiteDB (more about this later) and that defines a polling interval of 2 seconds (2,000 milliseconds). If the pollTime attribute is not specified, the default value of 10 seconds (in the sample above) would be used.

Now that everything is configured, you can finally write the code to actually cache the data. To create a dependency to a Customers table, you create an instance of the System.Web.Caching.SqlCacheDependency class, whose constructor takes the caching profile name defined above, and the table name. Then, when you insert the data into the Cache class, you pass the SqlCacheDependency object as a third parameter to the Cache.Insert method, as shown below:

SqlCacheDependency dep = new SqlCacheDependency("SiteDB-cache", "Customers");

Cache.Insert("Customers", customers, dep);

Let’s assume that you have a GetCustomers method in your DAL that returns a list of CustomerDetails objects filled with data from the Customers table. You could implement caching as follows:

public List<CustomerDetails> GetCustomers()

{

   List<CustomerDetails> customers = null;

 

   if (Cache["Customers"] != null)

   {

      customers = (List<CustomerDetails>)Cache["Customers"];

   }

   else

   {

      using (SqlConnection cn = new SqlConnection(_connString))

      {

         SqlCommand cmd = new SqlCommand("SELECT * FROM Customers", cn);

         customers = FillCustomerListFromReader(cmd.ExecuteReader());

 

         SqlCacheDependency dep = new SqlCacheDependency(

            "SiteDB-cache", "Customers");

         Cache.Insert("Customers", customers, dep);

      }

   }

 

   return customers;

}

The method first checks whether the data is already in cache: If it is, then it retrieves the data from there; otherwise, it first retrieves it from the database, and then caches it for later use.

Not only can you use this caching expiration mechanism for storing data to be accessed from code, you can also use it for the ASP.NET’s Output Caching feature, i.e., caching the HTML produced by page rendering, so that pages don’t have to be re-rendered every time, even when the page’s output would not change. To add output caching to a page, add the @OutputCache page directive at the top of the .aspx file (or the .ascx file if you want to use fragment caching in user controls):

<%@ OutputCache Duration="3600" VaryByParam="None"

   SqlDependency="SiteDB-cache:Customers" %>

With this directive, the page’s output will be cached for a maximum of one hour, or less if the data in the Customers table is modified.

The problem with this implementation of the SQL dependency caching is that the dependency is to the entire table; it invalidates the cache regardless of which data in the table is changed. If you retrieved and cached just a few records from a table of thousands of records, why should you purge them when some other records are modified? With SQL Server 7 and 2000 whole-table monitoring for cache dependencies is your only choice, but SQL Server 2005 adds row-specific cache dependency tracking.

SQL Server 2005-specific SQL Dependency Support

The counter- and polling-based SQL dependency implementation just described is fully supported by SQL Server 2005, but this latest version of SQL Server also has some new features and technology built into it that further extend the capabilities of the Cache class. The new engine is able to create an indexed view (a view with a physical copy of the rows) when a query for which the client wants to create a dependency is executed. If after an insert, delete or update statement the results returned by a query would change, SQL Server 2005 can detect this situation and send a message to the client that registered for the dependency, by means of the Service Broker. These Query Notification events sent from SQL Server back to an application program enable a client to be notified when some data it previously retrieved was changed in the DB since it was originally retrieved, so that the client can re-request that data for the latest changes. A new class, System.Data.SqlClient.SqlDependency, can create a dependency tied to a specific SqlCommand, and thus create a logical subscription for change notifications that are received by its OnChange event handler. The following snippet shows how to create such a dependency:

using (SqlConnection cn = new SqlConnection(_connString))

{

   cn.Open();

   SqlCommand cmd = new SqlCommand(

      "SELECT ID, CustomerName FROM dbo.Customers", cn);

   SqlDependency dep = new SqlDependency(cmd);

   dep.OnChange += new OnChangeEventHandler(CustomersData_OnChange);

   SqlDataReader reader = cmd.ExecuteReader();

 

   while (reader.Read())

   {

      Trace.Write(reader["CustomerName"].ToString());

   }

 

}

Below is the specified event handler for OnChange, raised when the underlying data returned by the preceding query changes in the database:

void CustomersData_OnChange(object sender, SqlNotificationEventArgs e)

{

   Trace.Warn("Customers data has changed. Reload it from the DB.");

}

Note that in order for this code to work, you must first enable the Query Notifications support in your client by calling the SqlDependency.Start method once, somewhere in the application. If you’re using it from within a web-based application, the right place to put this call would be in the Application_Start global event handler in global.asax. For a WinForms application, it may be the Main entry-point method, or the main form’s Form_Load event.

The preceding code just shows that we’re being notified when the underlying data in the database has changed, but we normally want to go one step further and purge data from the cache when changes are detected in the database. The ASP.NET’s SqlCacheDependency has other overloaded versions of its constructor, and one of them takes a SqlCommand instance. It creates a SqlDependency object internally behind the scenes, and handles its OnChange event to automatically remove the data from the cache when data for the specific SELECT query would change. Here’s all you have to do to cache some data with a dependency to a SqlCommand:

SqlCommand cmd = new SqlCommand("SELECT ID, CustomerName FROM dbo.Customers", cn);

SqlCacheDependency dep = new SqlCacheDependency(cmd);

Cache.Insert("keyname", data, dep);

The sample GetCustomers method shown above would then become the following:

public List<CustomerDetails> GetCustomers()

{

   List<CustomerDetails> customers = null;

 

   if (Cache["Customers"] != null)

   {

      customers = (List<CustomerDetails>)Cache["Customers"];

   }

   else

   {

      using (SqlConnection cn = new SqlConnection(_connString))

      {

         SqlCommand cmd = new SqlCommand(

            "SELECT ID, CustomerName FROM dbo.Customers", cn);

         SqlCacheDependency dep = new SqlCacheDependency(cmd);

 

         customers = FillCustomerListFromReader(cmd.ExecuteReader());

         Cache.Insert("Customers", customers, dep);

      }

   }

 

   return customers;

}

This technology has the obvious advantage that the dependency is at the query level, and not at the entire table level like the implementation for SQL Server 7/2000, and the event method is much more efficient than using a polling mechanism. However, it has a number of serious limitations that drastically reduce the number of occasions when it can be used, so sometimes the whole-table polling method is your only choice. Here are the most important constraints:

  • You can’t use the * wildcard in the SELECT query; instead, you must explicitly list all the fields. This is a good practice anyway, because you should only request fields that you actually need and not necessarily all of them. Listing them explicitly also puts you in control of their order in the returned DataReader or DataTable, something that can be important if you access fields by index and not by name (although access by index is not itself a good practice).
  • You must reference any table with the full name, e.g., dbo.Customers. Just Customers wouldn’t be enough. This is a significant issue because most of us aren’t used to fully qualifying table names, but it’s a simple matter to handle if you remember that you need to do it.
  • The query can’t use aggregation functions such as COUNT, SUM, AVG, MIN, MAX, etc.
  • The query can’t use ranking and windowing functions, such as the new ROW_NUMBER() function, which is tremendously useful for implementing high-performance results pagination to be used, for example, in the DataGrid, GridView, or other ASP.NET server-side controls. (This function will be explained in Chapter 5.)
  • You can’t reference views or temporary tables in the query.
  • The query can’t return fields of type text, ntext, or image (blob types). Consider that many tables will have such columns, for containing the description of a product, the content of an article or a newsletter, etc.
  • You can’t use DISTINCT, HAVING, CONTAINS and FREETEXT.
  • The query can’t include subqueries, outer-joins or self-joins. This is one of the biggest limitations, as subqueries are commonly used.
  • All of the preceding limitations exist regardless of whether the query is run directly from the client as a SQL text command, or from inside a stored procedure. For stored procedures, however, there’s a further limitation: You can’t use the SET NOCOUNT ON statement, which is often used (and suggested) to reduce the quantity of information sent across the network, for cases where you don’t need counts.

If you consider that most of the modules developed in the following chapters will need to implement custom pagination to be fast (and thus windowing functions or temporary tables, COUNT, subqueries and other prohibited features), and that many columns will be of type ntext, you can easily understand why you may not be able to use to this form of SQL dependency often.

If you want to know more about the Service Broker and Query Notifications, the technology behind Sql Dependencies, I recommend a whitepaper written by Bob Beauchemin, titled "Query Notifications in ADO.NET 2.0" and available on MSDN Online at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvs05/html/querynotification.asp.

Choosing a Caching Strategy That Fits Your Needs

The Cache class has been greatly improved in its latest implementation. However, more is less sometimes, and using your own code to handle the expiration of data and manual purging may be better than using some form of automated SQL dependency. The polling-based approach is done at the table level, so it will often invalidate your data when unrelated data in the same table has been changed. The SQL Server 2005’s Service Broker/Query Notification technology is very intriguing, and will be very handy in some situations, but as I said earlier it suffers from too many limitations to be used often. Additionally, both approaches are bound to SQL Server, and should only be used in the DAL provider specific to SQL Server. Therefore, if we used the SQL dependencies, different providers (for different RDBMSs) should implement a different caching strategy, rewritten from scratch. This is something I don’t like, because I want my caching code in the BLL (not in the DAL), so that it’s executed the same way regardless of which DAL provider is being used. For all these reasons I won’t be using any form of built-in SQL dependency for the modules developed in the rest of the book. Instead, I use the good old ASP.NET 1.x caching features based on time/interval expiration. To avoid displaying stale data, we’ll implement some simple methods that purge data from cache when it actually changes in the database. To call the methods you won’t need to implement triggers at the database level, or use some other connection and notification service between the data store and the application’s code. You’ll just call them from the BLL methods that add, edit and delete data. Because the site contents will be managed by a web-based custom administration console and your own BLL classes, there won’t be a need to intercept the changes at the database level. Instead, you just add some code to the BLL methods themselves. This gives you complete control of what data you need to purge, and when you must actually purge it (e.g., when you remove data if a specific row field changes, but not if another field changes).



NOTE: This excerpt was taken from the book "ASP.NET 2.0 Website Programming". Click here to find more about it, and download the complete source code of the sample project.

Name
E-mail
Home page

Comment (HTML not allowed)  

Enter the code shown (prevents robots):

 
Get RSS/Atom Feed
RSS 2.0 | Atom 1.0
Search in the blog
Archive
<May 2008>
SunMonTueWedThuFriSat
27282930123
45678910
11121314151617
18192021222324
25262728293031
1234567
Categories

Powered by: newtelligence dasBlog 1.8.5223.1