Using .NET CLR in SQL Server 2005
This tutorial introduces the concepts behind CLR integration in SQL Server 2005, including how to implement managed code.
New types and aggregates extend the functionality of SQL Server 2005. The Common Language Runtime (CLR) enables developers to write stored procedures, triggers and functions in any of the supported .NET languages. SQL Server CLR integration is achieved in a few steps:
1. The developer writes the managed code as a set of class definitions. Stored procedures, functions and triggers are written as static methods of a class, while user-defined types and aggregates are written as whole classes. The code is then compiled into an assembly.
2. The assembly is uploaded into the SQL Server database and stored in the system catalog using the CREATE ASSEMBLY data definition language (DDL) statement.
3. T-SQL objects (routines, types and aggregates) are then created and bound to entry points within the assembly, using the CREATE PROCEDURE/FUNCTION/TRIGGER/TYPE/AGGREGATE statements.
4. Once created, these routines, types and aggregates are used like any other T-SQL object.
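The steps above can be sketched as a short deployment script. The Python helper below only formats the T-SQL statements involved; it does not talk to a server. The assembly name HelloClr, the DLL path, the class StoredProcedures and the method SayHello are hypothetical placeholders, and the script assumes CLR integration must first be switched on with sp_configure, since it is disabled by default.

```python
# Hypothetical names throughout: the assembly, DLL path, class and method
# are placeholders chosen for this sketch, not taken from the tutorial.


def create_assembly_sql(assembly, dll_path, permission_set="SAFE"):
    """Step 2: register a compiled assembly in the database."""
    return (
        f"CREATE ASSEMBLY {assembly}\n"
        f"FROM '{dll_path}'\n"
        f"WITH PERMISSION_SET = {permission_set};"
    )


def create_procedure_sql(proc, assembly, class_name, method):
    """Step 3: bind a T-SQL procedure name to a static method in the assembly."""
    return (
        f"CREATE PROCEDURE {proc}\n"
        f"AS EXTERNAL NAME {assembly}.{class_name}.{method};"
    )


def deployment_script():
    """Return the statements in the order they would be run."""
    return "\n\n".join([
        # CLR integration is off by default and has to be enabled once per server.
        "EXEC sp_configure 'clr enabled', 1;\nRECONFIGURE;",
        create_assembly_sql("HelloClr", r"C:\clr\HelloClr.dll"),
        create_procedure_sql("dbo.SayHello", "HelloClr", "StoredProcedures", "SayHello"),
        # Step 4: the bound routine is called like any T-SQL procedure.
        "EXEC dbo.SayHello;",
    ])


if __name__ == "__main__":
    print(deployment_script())
```

Step 1, writing and compiling the managed class, happens outside the database and is not shown here.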
Development, deployment and debugging of managed code in SQL Server 2005 are made easier by the new SQL Server Project template. The project is compiled into an assembly, and when the project is deployed the binary assembly is uploaded into the SQL Server database associated with the project. Deployment also automatically creates the routines, types and aggregates in the database, based on the custom attributes defined in the code. The source code and the debugging symbols are uploaded with the assembly. Debugging is a key feature in SQL Server 2005: it works seamlessly, allowing the user to step from CLR methods into T-SQL and back.
References to other assemblies in the database can also be added.
CLR support is intended to complement the expressive power of the T-SQL query language, not to replace it. T-SQL itself is enhanced in SQL Server 2005 with recursive queries, analytical functions such as RANK and ROW_NUMBER, and new relational operators such as EXCEPT, INTERSECT, APPLY, PIVOT and UNPIVOT. On top of this, developers can take advantage of the extensive .NET Framework libraries and use rich data structures such as arrays and lists. The computational power of the CLR can thus be harnessed for logic that T-SQL expresses poorly.
Data access code written for the CLR is more verbose than the equivalent T-SQL. The two programming models use the same query language but differ in their procedural portions.
SQL statements embedded in CLR-supported languages are not compiled or validated until they are executed, which affects both debugging and performance. On the other hand, the data access model is similar to the ADO.NET model used in the client and middle tiers, so existing skills carry over.
Procedural, row-based processing should be used only when the logic cannot be expressed in the declarative query language. If the work consists mainly of forward-only, read-only navigation through a result set with some processing for each row, the CLR is the better choice. If both significant data access and computation are involved, the procedural code should be divided into a CLR portion for the computation and a T-SQL portion for the data access.
The CLR provides an efficient alternative to extended stored procedures (XPs). Procedures that return result sets can often be expressed better as table-valued functions, whose results can then be manipulated with the query language. Compared with XPs, the CLR offers granular control, reliability, data access, additional data types and scalability. It also gives a performance advantage because the code runs within the SQL Server process, which reduces the delay caused by transitions between managed code and native code.
Developers often place logic outside the database so that they can write it in the language of their choice. With CLR integration, this logic can be moved into the database, which reduces the amount of data flowing over the network and keeps data validation close to the data. The CLR thereby ensures that the choice of programming language does not interfere with the decision about where the code should live.
Database Programming Tasks
CLR integration addresses a number of common database tasks and problems. One is data validation: functions built on the .NET Framework library augment the T-SQL built-in function library.
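As a rough illustration of the validation idea, the sketch below registers a regular-expression check as a function that SQL queries can call. It is an analogy only: Python's built-in sqlite3 module stands in for the database so that the example runs anywhere, whereas in SQL Server the function would be a static method in a .NET language bound with CREATE FUNCTION. The function name is_valid_email, the contacts table and the deliberately simple pattern are all assumptions made for this example.

```python
import re
import sqlite3

# Deliberately simple pattern for illustration; not a full e-mail validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(value):
    """Return 1 if value looks like an e-mail address, else 0 (NULL counts as invalid)."""
    if value is None:
        return 0
    return 1 if EMAIL_PATTERN.match(value) else 0


def find_invalid_emails(emails):
    """Load the values into a scratch table and let a query call the validator."""
    conn = sqlite3.connect(":memory:")
    try:
        # Expose the Python function to SQL under the same name, taking one argument.
        conn.create_function("is_valid_email", 1, is_valid_email)
        conn.execute("CREATE TABLE contacts (email TEXT)")
        conn.executemany("INSERT INTO contacts VALUES (?)", [(e,) for e in emails])
        rows = conn.execute(
            "SELECT email FROM contacts WHERE is_valid_email(email) = 0 ORDER BY email"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


if __name__ == "__main__":
    print(find_invalid_emails(["a@example.com", "not-an-email", "b@@x.org"]))
```

The point of the design is that the validation rule lives next to the data and is usable from any query, instead of being repeated in every client.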
The developer now has the option of writing stored procedures and table-valued functions that return result sets either in T-SQL or in the CLR. In T-SQL, a stored procedure sends its results to the “caller's pipe”. When the stored procedure is invoked from any client data access API, the result set is returned directly to the invoking API, bypassing all the T-SQL frames on the stack. SQL Server 2005 introduces a new way of doing this from managed code. When a query is executed through the ADO.NET provider, its result set is made available through a SqlDataReader object, which can be consumed within the stored procedure. Managed routines return results to the caller through the SqlPipe instance exposed by the static Pipe property of the SqlContext class. The pipe can forward a SqlDataReader with Send(), or use ExecuteAndSend() to execute a command and send its results straight to the invoker of the stored procedure.
CLR support also makes it possible to write table-valued functions (TVFs) in managed languages. These return tabular results, and a significant feature is that the results can be streamed as they are produced. A managed TVF returns the standard IEnumerable interface. The query processor retrieves an IEnumerator from this interface and pulls rows from it one at a time. Each object it yields is opaque to SQL Server until a second method, the fill-row method, breaks it into column values.
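The streaming model can be sketched outside SQL Server as follows. This is a conceptual model in Python, not the .NET API: a generator plays the role of the IEnumerable returned by the TVF, and a separate function plays the role of the method that turns each opaque object into column values. The names split_items, fill_row and first_rows are hypothetical.

```python
from itertools import islice


def split_items(text, separator=","):
    """Plays the role of the TVF entry point: a lazy source of row objects."""
    start = 0
    position = 0
    while True:
        end = text.find(separator, start)
        if end == -1:
            # Last (or only) item: emit it and stop.
            yield (position, text[start:])
            return
        yield (position, text[start:end])
        start = end + len(separator)
        position += 1


def fill_row(row_object):
    """Plays the role of the fill-row method: opaque object in, column values out."""
    position, item = row_object
    return {"position": position, "item": item.strip()}


def first_rows(text, count):
    """Plays the role of the query processor: pull only as many rows as needed."""
    return [fill_row(obj) for obj in islice(split_items(text), count)]


if __name__ == "__main__":
    for row in first_rows("alpha, beta, gamma", 2):
        print(row)
```

Because the generator produces one row per request, a consumer that needs only the first two rows never causes the rest of the input to be processed.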
The developer has to decide whether to use T-SQL or the CLR. Composition requirements, the source of the data, the need to stream result sets and typing requirements determine the kind of code to use. For instance, if the results have to be manipulated further by a query, a table-valued function is a better fit than a stored procedure. If the results need to be streamed, the CLR is the better option. External data sources are likewise better served by a CLR-based implementation.
Performing Custom Aggregations over Data
Aggregation over data can be done in a number of ways. If the required aggregate function is not built in, it can be added as a user-defined aggregate, a CLR stored procedure or a server-side cursor in T-SQL. The right solution is determined by composability requirements, the aggregation algorithm and the need for side effects.
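A user-defined aggregate can be pictured as a small state machine. The Python sketch below models the four-method contract that CLR user-defined aggregates follow (Init, Accumulate, Merge, Terminate) for a product aggregate, which T-SQL does not provide built in. It is a model of the contract only, not .NET code. The class name Product, the aggregate helper and the way the input is partitioned are assumptions made for illustration.

```python
class Product:
    """Models the Init / Accumulate / Merge / Terminate contract of an aggregate."""

    def init(self):
        # Reset state; the engine may reuse one instance for several groups.
        self.result = 1.0
        self.seen_value = False

    def accumulate(self, value):
        # Called once per row in the group; NULLs (None) are skipped.
        if value is None:
            return
        self.result *= value
        self.seen_value = True

    def merge(self, other):
        # Combine a partial result computed elsewhere (e.g. by a parallel plan).
        self.result *= other.result
        self.seen_value = self.seen_value or other.seen_value

    def terminate(self):
        # Produce the final value; an empty or all-NULL group yields NULL.
        return self.result if self.seen_value else None


def aggregate(values, partitions=2):
    """Simulate the engine: aggregate each partition separately, then merge."""
    partials = []
    for index in range(partitions):
        partial = Product()
        partial.init()
        for value in values[index::partitions]:
            partial.accumulate(value)
        partials.append(partial)

    total = Product()
    total.init()
    for partial in partials:
        total.merge(partial)
    return total.terminate()


if __name__ == "__main__":
    print(aggregate([2, 3, None, 4]))
```

The merge step is what lets an aggregate compose with the query processor: partial results from separate partitions combine into the same answer as a single pass.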
User-defined types (UDTs) are one of the most important features of SQL Server 2005. They extend the scalar type system of the database; custom date or time data types are good examples. However, a UDT should not be used to model complex business objects, because SQL Server treats each value as a single unit that is opaque to it. There are also limits on size and indexing, and the entire value has to be rewritten whenever any part of it is updated.
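The "single opaque unit" behaviour can be illustrated with a small scalar type. The Python sketch below is a conceptual model, not a CLR UDT: it assumes a hypothetical YearMonth type that can be parsed from and rendered to a string, compared, and changed only by replacing the whole value.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True, order=True)
class YearMonth:
    """A small scalar value: validated on creation, immutable afterwards."""

    year: int
    month: int

    def __post_init__(self):
        if not 1 <= self.month <= 12:
            raise ValueError(f"month out of range: {self.month}")

    @classmethod
    def parse(cls, text):
        """Build a value from its string form, e.g. '2005-11'."""
        year_text, month_text = text.split("-")
        return cls(int(year_text), int(month_text))

    def __str__(self):
        return f"{self.year:04d}-{self.month:02d}"


def with_month(value, month):
    """'Updating' one part means building and storing a complete new value."""
    return replace(value, month=month)


if __name__ == "__main__":
    release = YearMonth.parse("2005-11")
    print(release, with_month(release, 12))
```

A type this small suits the model well. A large business object would pay the cost of rewriting the whole value for every small change, which is the reason for the caution above.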