I recently read tiredblogger's IQueryable Methods on ActiveReports ControlCollections post. First of all, I can see how adding a "find control by name" type of method to our control collection would certainly be useful. We should look at adding it in an efficient way that won't impose overhead when it isn't needed. However, I do have some suggestions that might help in this scenario.
Most importantly, you're not stuck catching exceptions. Allowing exceptions to be thrown is extremely slow, and in the first example tiredblogger noted, this is probably the cause of "destroying performance". If there are hundreds of "indicators" in his example, there may be several to several hundred exceptions being thrown. However, I want to get right into "Linq" and using it as a solution in this case...
In cases such as this, I don't think Linq is appropriate. First, I think it is helpful to define exactly what "Linq" is. There are lots of idioms associated with Linq which I don't think are Linq at all, namely Extension Methods and Lambda Expressions. Linq relies heavily on both, but they are not really Linq. Don't get me wrong though, they are extremely useful (arguably more useful than Linq itself), just not really Linq. At its essence, Linq is the language integrated query facility built around IQueryable/IQueryable<T>, which goes through IQueryProvider and the various IQueryProvider implementations.
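To illustrate that distinction, here is a minimal sketch of an extension method and a lambda expression working together with no reference to System.Linq at all. The names ControlNameExtensions and FindByPrefix are made up for this example and are not part of ActiveReports or the .NET Framework.

```csharp
using System;
using System.Collections.Generic;

public static class ControlNameExtensions
{
    // An extension method is just a static method whose first parameter
    // is marked with "this". (FindByPrefix is a hypothetical name.)
    public static string FindByPrefix(this List<string> names, string prefix)
    {
        // List<T>.Find takes a Predicate<T>; the lambda compiles to a delegate.
        return names.Find(n => n.StartsWith(prefix, StringComparison.Ordinal));
    }
}

public static class Program
{
    public static void Main()
    {
        var names = new List<string> { "i0", "i2", "header" };
        Console.WriteLine(names.FindByPrefix("he")); // prints "header"
    }
}
```

Nothing here involves IQueryable<T> or a query provider, yet it has the same "fluent" feel people tend to associate with Linq.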
It is very interesting to understand what these extension methods are really doing in order to take full advantage of them. Their concise syntax can mislead us into thinking their "query-like" nature is somehow performant in scenarios where it is not. In this example the whole interaction with Linq comes down to IQueryable<Label> (where Label happens to be an ActiveReports Label) and the use of SingleOrDefault.
So my first thought was that SingleOrDefault keeps no state between calls during that loop, so in the best case it is doing a linear search through the controls list every time. Another option might be to just fill up a Dictionary<string, ARControl> with controls keyed by name before doing the loop. With hundreds of indicators the Dictionary<TKey, TValue> lookup should be much faster than repeatedly searching through the control collection with SingleOrDefault. Essentially this comes down to replacing the code that converts/wraps the List<T> in an IQueryable<T> instance with code that converts the List<T> into a Dictionary<TKey, TValue>.
To satisfy my own curiosity, I did some quick testing, and as it turns out the point of IQueryable<T> vs Dictionary<TKey, TValue> is more important than I first realized. My test removed ActiveReports from the scenario since my focus is on Linq & performance here.
The first test is to replicate the conditions from tiredblogger's post. First I created some really simple sample data:
private void CreateSampleData()
{
    for (var i = 0; i < indicatorCount; i++)
    {
        _indicators.Add(new Indicator(i));
    }
    // Only every other indicator gets a matching control.
    for (var i = 0; i < indicatorCount; i = i + 2)
    {
        _controlsSimpleList.Add(new ARControl(i));
    }
    _controlsQueryable = _controlsSimpleList.AsQueryable();
}
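The snippets in this post lean on a few types and fields that are not shown. The following is only a sketch of what they might look like, assuming an Indicator carries an Id plus the two texts and an ARControl is named "i" followed by its number (which is what the lookups below expect). LookupTests is a hypothetical name for the containing class, and the real test project may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the data items; not taken from the original project.
public class Indicator
{
    public Indicator(int id)
    {
        Id = id;
        EnglishText = "english " + id;
        SpanishText = "spanish " + id;
    }

    public int Id { get; private set; }
    public string EnglishText { get; private set; }
    public string SpanishText { get; private set; }
}

// Assumed stand-in for an ActiveReports Label: just a name and a text.
public class ARControl
{
    public ARControl(int id)
    {
        Name = "i" + id; // matches string.Format("i{0}", indicator.Id)
    }

    public string Name { get; private set; }
    public string Text { get; set; }
}

public class LookupTests
{
    private const int indicatorCount = 400;
    private readonly List<Indicator> _indicators = new List<Indicator>();
    private readonly List<ARControl> _controlsSimpleList = new List<ARControl>();
    private IQueryable<ARControl> _controlsQueryable;

    public bool IsSpanish { get; set; }

    // CreateSampleData() and the test methods from the post would go here.

    public static void Main()
    {
        var control = new ARControl(4);
        Console.WriteLine(control.Name); // prints "i4"
    }
}
```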
Then the test simulating what tiredblogger described which uses the IQueryable<Label> as the source of data used with SingleOrDefault:
private void WithLinqQueryable()
{
    var foundCount = 0;
    foreach (var indicator in _indicators)
    {
        var indicatorHeader = string.Format("i{0}", indicator.Id);
        // Binds to System.Linq.Queryable.SingleOrDefault.
        var indicatorControl = _controlsQueryable.SingleOrDefault(x => x.Name == indicatorHeader);
        if (indicatorControl != null)
        {
            foundCount++;
            indicatorControl.Text = IsSpanish
                ? indicator.SpanishText
                : indicator.EnglishText;
        }
    }
    if (foundCount != indicatorCount / 2)
        throw new InvalidOperationException("invalid foundCount");
}
I ran a test that uses 400 "indicators" with half of the SingleOrDefault calls not finding a corresponding Label (i.e. returning null). Running the above test in a 10-iteration loop takes 17046 milliseconds. Next, I realized that IQueryable<T> is not necessary, since SingleOrDefault is available as an extension method for both IQueryable<T> (via System.Linq.Queryable.SingleOrDefault) and IEnumerable<T> (via System.Linq.Enumerable.SingleOrDefault). In the prior example, we're using the System.Linq.Queryable.SingleOrDefault implementation. In the next example we'll use the implementation from System.Linq.Enumerable, since the source is _controlsSimpleList and it is merely a List<T> (i.e. IEnumerable<T>):
private void WithLinqExtensionMethods()
{
    var foundCount = 0;
    foreach (var indicator in _indicators)
    {
        var indicatorHeader = string.Format("i{0}", indicator.Id);
        // Binds to System.Linq.Enumerable.SingleOrDefault.
        var indicatorControl = _controlsSimpleList.SingleOrDefault(x => x.Name == indicatorHeader);
        if (indicatorControl != null)
        {
            foundCount++;
            indicatorControl.Text = IsSpanish
                ? indicator.SpanishText
                : indicator.EnglishText;
        }
    }
    if (foundCount != indicatorCount / 2)
        throw new InvalidOperationException("invalid foundCount");
}
Under the same conditions this one runs in a shocking 46 milliseconds! MUCH faster. Next I thought I'd compare the result to not using any Linq-related technologies at all, just a boring old Dictionary:
private void NoLinqTest()
{
    // Build the name-to-control lookup once, up front.
    var controlLookup = new Dictionary<string, ARControl>();
    _controlsSimpleList.ForEach(x => controlLookup[x.Name] = x);
    var foundCount = 0;
    foreach (var indicator in _indicators)
    {
        var indicatorHeader = string.Format("i{0}", indicator.Id);
        ARControl indicatorControl;
        if (controlLookup.TryGetValue(indicatorHeader, out indicatorControl))
        {
            foundCount++;
            indicatorControl.Text = IsSpanish
                ? indicator.SpanishText
                : indicator.EnglishText;
        }
    }
    if (foundCount != indicatorCount / 2)
        throw new InvalidOperationException("invalid foundCount");
}
This one runs under the same conditions in only 15 milliseconds.
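If you'd rather not download anything, the comparison can be approximated in one file. This is only a sketch under my own assumptions, not the test project itself: the Control class, the Time helper and the countFound delegate are invented for the example, and the numbers it prints will differ from the ones quoted above. The key detail is how the same lambda is bound: on the IQueryable<T> source it becomes an Expression<Func<T, bool>> (an expression tree that has to be processed on every call, which is most likely where the time goes), while on the List<T> source it becomes a plain Func<T, bool> delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class LookupTiming
{
    // Hypothetical stand-in for a named report control.
    private sealed class Control
    {
        public string Name;
    }

    // Runs the given test the requested number of times; returns elapsed ms.
    private static long Time(int iterations, Action test)
    {
        var watch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++) test();
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int indicatorCount = 400;

        // Only every other name has a control, as in the tests above.
        var controls = new List<Control>();
        for (var i = 0; i < indicatorCount; i += 2)
            controls.Add(new Control { Name = "i" + i });
        IQueryable<Control> queryable = controls.AsQueryable();

        // Counts how many of the names "i0".."i399" a lookup can resolve.
        Func<Func<string, Control>, int> countFound = find =>
        {
            var found = 0;
            for (var i = 0; i < indicatorCount; i++)
                if (find("i" + i) != null) found++;
            return found;
        };

        // Lambda is bound as Expression<Func<Control, bool>>.
        Console.WriteLine("Queryable:  {0} ms", Time(10, () =>
            countFound(name => queryable.SingleOrDefault(x => x.Name == name))));

        // Same lambda text, bound as Func<Control, bool>.
        Console.WriteLine("Enumerable: {0} ms", Time(10, () =>
            countFound(name => controls.SingleOrDefault(x => x.Name == name))));

        // Dictionary is rebuilt inside the timed region, like NoLinqTest.
        Console.WriteLine("Dictionary: {0} ms", Time(10, () =>
        {
            var byName = controls.ToDictionary(c => c.Name);
            countFound(name =>
            {
                Control c;
                return byName.TryGetValue(name, out c) ? c : null;
            });
        }));
    }
}
```

Note that the Dictionary run pays for building the lookup inside the timed region, just as NoLinqTest does, so the comparison stays fair.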
So the saying goes, "When the only tool you have is a hammer, everything looks like a nail." Linq is another nice tool, but it's not the only tool we have :)
For those of you that want to play around with it, you can download the code here.