[Feature Request] Suggest Linq simplifications
There are plenty of situations where Linq queries could be simplified.
For example:
var foo = bar.Select(x => y).ToArray();
// simplified to
var foo = bar.ToArray(x => y);
var foo = bar.Where(x => y).Count();
// simplified to
var foo = bar.Count(x => y);
Then there are the naive redundancies:
var foo = bar.ToArray().ToArray();
// simplified to
var foo = bar.ToArray();
var foo = bar.ToList().ToArray();
// simplified to
var foo = bar.ToArray();
Then there are calls to Count() that do not perform well (I checked, and the performance is poor) and that should usually be replaced by the corresponding property:
var foo = mylist.Count();
// simplified to
var foo = mylist.Count;
var foo = myarray.Count();
// simplified to
var foo = myarray.Length;
It would be really nice if R# added some deeper analysis for Linq queries.
Best regards,
Joannes Vermorel
Lokad Sales Forecasting
I'm a bit sceptical about your Count() suggestions - Count() contains code specifically to optimise for these sorts of cases (if they're what I think they are).
I suspect array.Length might still win on a for() loop termination expression but, in general, Count() isn't linearly stepping through collections where it doesn't need to.
How did you test the Count() performance change, and how big a change was it?
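One way to check the claim that Count() does not step through a collection when it doesn't need to is sketched below. CountOnlyCollection and CountDemo are hypothetical names made up for this illustration: the collection reports a Count but throws if anything enumerates it, so the call can only succeed if Enumerable.Count() reads the ICollection<T>.Count property instead of iterating.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection, for illustration only: it reports a Count
// but throws as soon as anything tries to enumerate it.
public sealed class CountOnlyCollection : ICollection<int>
{
    public int Count => 5;
    public bool IsReadOnly => true;

    public void Add(int item) => throw new NotSupportedException();
    public void Clear() => throw new NotSupportedException();
    public bool Contains(int item) => false;
    public void CopyTo(int[] array, int arrayIndex) => throw new NotSupportedException();
    public bool Remove(int item) => throw new NotSupportedException();

    // Enumerating is deliberately forbidden.
    public IEnumerator<int> GetEnumerator() =>
        throw new InvalidOperationException("The collection was enumerated.");

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class CountDemo
{
    // Calls the Linq extension method through IEnumerable<int>, so the
    // compiler cannot bind directly to the Count property.
    public static int CountWithoutEnumerating()
    {
        IEnumerable<int> source = new CountOnlyCollection();
        return source.Count();
    }
}
```

If CountDemo.CountWithoutEnumerating() returns without throwing, the fast path was taken. This says nothing about the constant-factor cost of the type check itself, which is a separate question.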
Oops, sorry, you are right on that one. It was Last() that gave us terrible performance. I would still guess that Count() is about 10x slower than .Length, but it might not be relevant enough to suggest a refactoring.
Best regards,
Joannes Vermorel
Jon Skeet just wrote an article on a similar subject - did you see it: http://msmvps.com/blogs/jon_skeet/archive/2010/02/10/optimisations-in-linq-to-objects.aspx
Your 'last' case is interesting - be nice to know exactly what is taking the time in there, as it may have wider implications.
Alternatively it might just be an unrealistic micro-performance issue with no implications in the real world... I can't see the code you used to repro that - did you call 'Last' several billion times?
Sorry, I can't remember, but yes, that was the idea (although more probably a million times rather than a billion).
Then again, Linq was designed for querying databases, but people like us end up primarily using Linq2objects for algorithmic purposes, and those constant factors, while negligible for DB accesses, tend to hurt badly in CPU-intensive apps.
This does sound like it would be a nice feature, but I imagine the scope would be very large, apart from removing redundant conversions such as ToList().ToArray().
I've just been profiling this - the time is all spent in the IList<T> cast which Enumerable.Last() does.
Interestingly enough, casting to IList rather than IList<T> roughly doubles the performance, but there's still expensive boxing (at least for an array of ints) because you have to cast the result of the IList indexer (an object) to T before returning it.
More interestingly though, adding
TSource[] arr = source as TSource[];
if (arr != null)
{
    int count = arr.Length;
    if (count > 0)
    {
        return arr[count - 1];
    }
}
before the IList<T> cast takes the performance back to roughly what you get from a direct arr[arr.Length - 1] implementation.
So it's not casting per se which is expensive, it's casting arrays to IList<T>s.
I will add a comment to your Connect bug, though obviously nothing's going to happen to .NET4 now. Of course, this is an optimisation for arrays and probably a pessimisation for ILists (I haven't checked). There is a school of thought that arrays are an obsolete type, so I doubt anyone at MS will be implementing this. But it would be useful if you wanted to do a 'Fast Linq to objects' for specialised cases.
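For anyone who wants to try the 'Fast Linq to objects' idea without waiting for a framework change, the array check could live in a separate extension method. This is only a rough sketch: FastLinq and FastLast are hypothetical names, not part of System.Linq, and anything that isn't a non-empty array is simply handed back to the regular Last().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FastLinq
{
    // Hypothetical helper, not part of System.Linq: tries the array
    // fast path first, then falls back to Enumerable.Last().
    public static TSource FastLast<TSource>(this IEnumerable<TSource> source)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }

        // Array check done before any IList<T> cast.
        TSource[] arr = source as TSource[];
        if (arr != null && arr.Length > 0)
        {
            return arr[arr.Length - 1];
        }

        // Everything else, including empty arrays (where Last() throws
        // InvalidOperationException), goes through the standard method.
        return source.Last();
    }
}
```

Usage would be myarray.FastLast() in place of myarray.Last(). Whether the extra type check costs anything measurable for non-array sources would need profiling, as noted above.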
Will, thanks a lot for your follow-up on that one.