Language Integrated Query (LINQ), as its name implies, addresses the data access needs of developers by building query support directly into the programming language.
There has always been a problem bridging the semantic gap between different types of data and the world of objects. LINQ introduces standard, easily learned patterns for querying and updating data, and the technology can be extended to support potentially any kind of data store.
Traditionally, queries against data are expressed as simple strings, without type checking at compile time or IntelliSense support. Furthermore, we have to learn a different query language for each type of data source: SQL databases, XML documents, various Web services, and so on. LINQ makes a query a first-class language construct in C# and Visual Basic. We write queries against strongly typed collections of objects by using language keywords and familiar operators.
Many questions have been raised about LINQ performance and implementation, because some developers assume that it depends on reflection, which would hurt performance. Today I’m going to clarify that matter and show you how LINQ was originally implemented.
LINQ draws heavily on a mathematical concept known as the lambda calculus, a formal system devised by Alonzo Church to investigate functions, function application, and recursion. It has influenced many programming languages, but none more so than the functional programming languages.
In the lambda calculus, functions are first-class entities: they can be passed as arguments and returned as results. Lambda expressions are thus a reification of the concept of an unnamed procedure without side effects. The lambda calculus can be thought of as an idealized, minimalistic programming language. It is capable of expressing any algorithm, and it is this fact that makes the functional programming model an important one.
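To make "passed as arguments and returned as results" concrete, here is a minimal sketch in C#. The names FirstClassFunctions, Twice, addThree and addSix are made up for this illustration and do not come from any library.

```csharp
using System;

public static class FirstClassFunctions
{
    // Hypothetical helper: takes a function as an argument
    // and returns a new function as its result.
    public static Func<int, int> Twice(Func<int, int> f)
    {
        return x => f(f(x));
    }

    public static void Main()
    {
        Func<int, int> addThree = x => x + 3;     // an unnamed function stored in a variable
        Func<int, int> addSix = Twice(addThree);  // a function built from another function

        Console.WriteLine(addSix(10)); // prints 16
    }
}
```

Twice never looks at what f does; it only applies it, which is the style the lambda calculus is built on.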
Wes Dyer wrote about anonymous recursion in C# and how to use lambdas for it [http://blogs.msdn.com/wesdyer/archive/2007/02/02/anonymous-recursion-in-c.aspx]. Here is a snapshot of such lambda expressions:
namespace Church
{
    public delegate F F(F x);
    public static class Church
    {
        // Identity Function
        public static F id = x => x;
        // Conditionals
        public static F True = tbranch => fbranch => tbranch;
        public static F False = tbranch => fbranch => fbranch;
        public static F Not = cond => cond(False)(True);
        public static F And = cond1 => cond2 => cond1(cond2)(False);
        public static F Or = cond1 => cond2 => cond1(True)(cond2);
        // Numerals
        public static F Zero = f => id;
        public static F One = id;
        public static F Two = f => x => f(f(x));
        public static F Succ = n => f => x => f(n(f)(x));
        // Arithmetic
        public static F IsZero = n => n(x => False)(True);
        public static F Plus = m => m(Succ);
        public static F Times = m => n => f => m(n(f));
    }
}
So each definition is something like a function of a function of a function, and so on.
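To see these definitions evaluate, the sketch below repeats a subset of the conditionals in a standalone class and applies them. ChurchDemo and Describe are hypothetical names added for this illustration. Describe relies on the fact that a Church boolean simply returns one of its two arguments, so the result can be compared by reference with the True and False delegates.

```csharp
using System;

public delegate F F(F x);

public static class ChurchDemo
{
    public static readonly F True = t => f => t;
    public static readonly F False = t => f => f;
    public static readonly F Not = cond => cond(False)(True);
    public static readonly F And = a => b => a(b)(False);

    // Hypothetical helper: names the delegate by comparing references.
    public static string Describe(F b)
    {
        if (ReferenceEquals(b, True)) return "True";
        if (ReferenceEquals(b, False)) return "False";
        return "other";
    }

    public static void Main()
    {
        Console.WriteLine(Describe(Not(True)));        // prints False
        Console.WriteLine(Describe(And(True)(False))); // prints False
        Console.WriteLine(Describe(And(True)(True)));  // prints True
    }
}
```

For example, Not(True) expands to True(False)(True), and True picks its first argument, which is False.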
To further understand LINQ, consider the following example:
using System;
using System.Collections.Generic;
//using System.Linq;
using System.Text;
namespace LinqTest
{
    static class Program
    {
        static void Main(string[] args)
        {
            var clients = GetClients();
            foreach (var c in clients)
            {
                Console.WriteLine("Name : {0}\tCity : {1}\tCountry : {2}", c.Name, c.City, c.Country);
            }
        }
        static IEnumerable<Client> GetClients()
        {
            return new List<Client>
            {
                new Client{ID=1,Name="Ahmed",City="Cairo",Country="Egypt"},
                new Client{ID=2,Name="Hana",City="Alex",Country="Egypt"},
                new Client{ID=3,Name="Adham",City="Luxor",Country="Egypt"},
                new Client{ID=4,Name="Noha",City="London",Country="England"},
                new Client{ID=5,Name="Xavi",City="Barcelona",Country="Spain"}
            };
        }
    }
    public class Client
    {
        public long ID { get; set; }
        public string Name { get; set; }
        public string City { get; set; }
        public string Country { get; set; }
    }
}
As the example shows, we have commented out the LINQ namespace and created a list of clients. Now suppose we want to filter the list to get only the clients from Egypt. We create a method to do this filtering:
static IEnumerable<Client> GetEgyptian(this IEnumerable<Client> source)
{
    foreach (var c in source)
        if (c.Country == "Egypt") yield return c;
}
And update Main to call it:
static void Main(string[] args)
{
    var clients = GetClients();
    var EgyptClients = clients.GetEgyptian();
    foreach (var c in EgyptClients)
    {
        Console.WriteLine("Name : {0}\tCity : {1}\tCountry : {2}", c.Name, c.City, c.Country);
    }
}
Now suppose we also want the clients from England. We could create another filtering method, or we can generalize our filtering method as follows:
static IEnumerable<T> Where<T>(this IEnumerable<T> source, Predicate<T> p)
{
    foreach (var c in source)
        if (p(c)) yield return c;
}
As can be seen, we use generics to turn our filtering method into a template. To call this filter with the anonymous-method syntax introduced in C# 2.0, we write the following code:
static void Main(string[] args)
{
    var clients = GetClients();
    var EgyptClients = clients.Where(delegate(Client c) { return c.Country == "Egypt"; });
    foreach (var c in EgyptClients)
    {
        Console.WriteLine("Name : {0}\tCity : {1}\tCountry : {2}", c.Name, c.City, c.Country);
    }
}
We can simplify this delegate by using a lambda expression as follows:
var EgyptClients = clients.Where((Client c) => { return c.Country == "Egypt"; });
And because the compiler can infer that c is a Client, we can refine this code further as follows:
var EgyptClients = clients.Where( c => c.Country == "Egypt" );
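Because the filter is generic, the same method works for any element type and any condition. The sketch below assumes a copy of the filter under the hypothetical name Filter, so that it cannot be confused with LINQ's own Where. Note that Enumerable.Where in System.Linq takes a Func<T, bool> rather than a Predicate<T>, but a lambda converts to either.

```csharp
using System;
using System.Collections.Generic;

public static class FilterExtensions
{
    // Same shape as the generic filter above; "Filter" is a made-up name.
    public static IEnumerable<T> Filter<T>(this IEnumerable<T> source, Predicate<T> p)
    {
        foreach (var item in source)
            if (p(item)) yield return item;
    }
}

public static class FilterDemo
{
    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };
        var words = new List<string> { "Cairo", "Alex", "Luxor" };

        // One method, two element types, two different conditions.
        var evens = new List<int>(numbers.Filter(n => n % 2 == 0));
        var shortWords = new List<string>(words.Filter(w => w.Length <= 4));

        Console.WriteLine(string.Join(", ", evens));      // prints 2, 4, 6
        Console.WriteLine(string.Join(", ", shortWords)); // prints Alex
    }
}
```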
Now suppose we want to sort our clients by city. Instead of writing our own generic OrderBy method, we will use the one provided by LINQ. So we uncomment the LINQ namespace and order our clients by city in descending order:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace LinqTest
{
    static class Program
    {
        static void Main(string[] args)
        {
            var clients = GetClients();
            var EgyptClients = clients.Where(c => c.Country == "Egypt").OrderByDescending(c => c.City);
            foreach (var c in EgyptClients)
            {
                Console.WriteLine("Name : {0}\tCity : {1}\tCountry : {2}", c.Name, c.City, c.Country);
            }
        }
        static IEnumerable<Client> GetClients()
        {
            return new List<Client>
            {
                new Client{ID=1,Name="Ahmed",City="Cairo",Country="Egypt"},
                new Client{ID=2,Name="Hana",City="Alex",Country="Egypt"},
                new Client{ID=3,Name="Adham",City="Luxor",Country="Egypt"},
                new Client{ID=4,Name="Noha",City="London",Country="England"},
                new Client{ID=5,Name="Xavi",City="Barcelona",Country="Spain"}
            };
        }
        // Our own filter. It is found before Enumerable.Where because
        // it is declared in the enclosing namespace.
        static IEnumerable<T> Where<T>(this IEnumerable<T> source, Predicate<T> p)
        {
            foreach (var c in source)
                if (p(c)) yield return c;
        }
    }
    // Same Client class as in the first listing.
    public class Client
    {
        public long ID { get; set; }
        public string Name { get; set; }
        public string City { get; set; }
        public string Country { get; set; }
    }
}
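For comparison, a hand-written ordering method in the same spirit as our filter could look roughly like the sketch below. SortDescendingBy is a hypothetical name, and this is not how LINQ implements OrderByDescending: the sketch copies the source into a list and sorts it right away, and List<T>.Sort is not a stable sort, whereas the LINQ ordering operators are documented as stable.

```csharp
using System;
using System.Collections.Generic;

public static class SortExtensions
{
    // Hypothetical helper: copy the items, then sort them descending by key.
    public static IEnumerable<T> SortDescendingBy<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var buffer = new List<T>(source);
        var comparer = Comparer<TKey>.Default;
        // Swapping a and b in the comparison reverses the order.
        buffer.Sort((a, b) => comparer.Compare(keySelector(b), keySelector(a)));
        return buffer;
    }
}

public static class SortDemo
{
    public static void Main()
    {
        var cities = new List<string> { "Cairo", "Alex", "Luxor" };
        foreach (var city in cities.SortDescendingBy(c => c))
            Console.WriteLine(city); // Luxor, then Cairo, then Alex
    }
}
```

Like our filter, it is just an extension method that receives a delegate, here a key selector instead of a predicate.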
C# 3.0 introduces query expressions, which are written in a declarative query syntax. A query expression must begin with a [from] clause and must end with a [select] or [group] clause. Between the first from clause and the last select or group clause, it can contain one or more of these optional clauses: [where], [orderby], [join], and even additional [from] clauses.
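As a small illustration of these rules, the sketch below shows a query that begins with from, uses an optional orderby clause, and ends with a group clause instead of select. GroupDemo, NamesByCountry and the sample data are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupDemo
{
    // Returns one "Country: names" line per group.
    public static List<string> NamesByCountry()
    {
        var clients = new[]
        {
            new { Name = "Noha", Country = "England" },
            new { Name = "Hana", Country = "Egypt" },
            new { Name = "Ahmed", Country = "Egypt" }
        };

        // Begins with from, ends with group (no select needed).
        var byCountry = from c in clients
                        orderby c.Name
                        group c.Name by c.Country;

        var lines = new List<string>();
        foreach (var g in byCountry)
            lines.Add(g.Key + ": " + string.Join(", ", g));
        return lines;
    }

    public static void Main()
    {
        foreach (var line in NamesByCountry())
            Console.WriteLine(line);
        // prints:
        // Egypt: Ahmed, Hana
        // England: Noha
    }
}
```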
Let’s rewrite our client filtering as a query expression:
//var EgyptClients = clients.Where( c => c.Country == "Egypt" ).OrderByDescending(c => c.City);
var EgyptClients = from c in clients where c.Country == "Egypt" orderby c.City descending select c;
If you open the compiled code in .NET Reflector, it shows the code below:
private static void Main(string[] args)
{
    IOrderedEnumerable<Client> EgyptClients = GetClients().Where<Client>(delegate (Client c)
    {
        return (c.Country == "Egypt");
    }).OrderByDescending<Client, string>(delegate (Client c)
    {
        return c.City;
    });
    foreach (Client c in EgyptClients)
    {
        Console.WriteLine("Name : {0}\tCity : {1}\tCountry : {2}", c.Name, c.City, c.Country);
    }
}
As we can see from the above example, we used some new C# keywords to write the query, and the compiler transformed this syntax into the same code we wrote before. So LINQ queries over in-memory collections are just chained method calls that take delegates, with no reflection involved and therefore no reflection-related performance penalty. So, there is no magic about it!
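One way to check this equivalence without a decompiler is to run both forms over the same data and compare the results. The sketch below uses made-up names (EquivalenceDemo, SameResults) and anonymous objects in place of the Client class.

```csharp
using System;
using System.Linq;

public static class EquivalenceDemo
{
    public static bool SameResults()
    {
        var clients = new[]
        {
            new { City = "Cairo", Country = "Egypt" },
            new { City = "Alex", Country = "Egypt" },
            new { City = "Luxor", Country = "Egypt" },
            new { City = "London", Country = "England" }
        };

        // Query syntax.
        var viaQuery = from c in clients
                       where c.Country == "Egypt"
                       orderby c.City descending
                       select c;

        // The method calls the query syntax stands for.
        var viaMethods = clients.Where(c => c.Country == "Egypt")
                                .OrderByDescending(c => c.City);

        // Same elements in the same order.
        return viaQuery.SequenceEqual(viaMethods);
    }

    public static void Main()
    {
        Console.WriteLine(SameResults()); // prints True
    }
}
```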