LINQ Aggregation and Set Operations

LINQ’s aggregation operators compute a single value from a sequence — sum, count, average, min, max, or any custom reduction via Aggregate. Its set operators treat sequences as mathematical sets — computing unions, intersections, and differences. The materialisation operators convert lazy LINQ queries into concrete collections. Mastering these operators is essential for building the statistics, filtering, deduplication, and collection-building logic that appears throughout ASP.NET Core service and repository layers.

Aggregation Operators

var scores = new List<int> { 85, 92, 78, 96, 71, 88, 95 };

int    count   = scores.Count();                    // 7
int    total   = scores.Sum();                      // 605
int    max     = scores.Max();                      // 96
int    min     = scores.Min();                      // 71
double avg     = scores.Average();                  // 86.43...

// With a predicate
int  highCount = scores.Count(s => s >= 90);       // 3

// Min/Max with selector (on objects)
var posts = GetPosts();
Post mostViewed = posts.MaxBy(p => p.ViewCount)!;   // .NET 6+
Post oldest     = posts.MinBy(p => p.CreatedAt)!;

// Aggregate — custom fold: product of all scores
// Seeded with 1L so the accumulator is a long: this product is far too large for an int
long product = scores.Aggregate(1L, (acc, s) => acc * s);

// Aggregate — build a CSV string
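string csv = scores.Select(s => s.ToString())
                   .Aggregate((joined, next) => joined + "," + next);
// "85,92,78,96,71,88,95"
// The seedless overload throws on an empty sequence; string.Join(",", scores)
// gives the same result and returns "" for an empty list.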
Note: Count() on a List<T> is O(1): LINQ detects that the source implements ICollection<T> and reads the stored count. On a lazy IEnumerable<T> it enumerates the entire sequence, and on an EF Core query it translates to SELECT COUNT(*) in SQL. To check whether a sequence is non-empty, prefer Any() over Count() > 0: Any() stops at the first element, while Count() on a lazy sequence visits every element. In EF Core, Any() translates to an EXISTS query (on SQL Server, roughly SELECT CASE WHEN EXISTS(...) THEN 1 ELSE 0 END), which is typically faster than COUNT(*) for existence checks because the database can stop at the first matching row.
Tip: Aggregate is the most general operator: Sum, Count, and Max can all be expressed as an Aggregate call. Use Aggregate when you need a custom fold that the built-in aggregates do not cover, such as products, building complex strings, or accumulating several statistics in a single pass. For simple aggregation, the named operators (Sum, Min, Max) are clearer and should be preferred.
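
As a rough illustration of the single-pass case mentioned in the tip, the sketch below folds the scores list into a tuple of minimum, maximum, sum, and count. The tuple shape and the names stats and mean are assumptions made for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var scores = new List<int> { 85, 92, 78, 96, 71, 88, 95 };

// The seed is a tuple holding the running minimum, maximum, sum, and count.
// Each step returns a new tuple with every field updated for the next score.
(int Min, int Max, long Sum, int Count) stats = scores.Aggregate(
    (Min: int.MaxValue, Max: int.MinValue, Sum: 0L, Count: 0),
    (acc, s) => (Min: Math.Min(acc.Min, s),
                 Max: Math.Max(acc.Max, s),
                 Sum: acc.Sum + s,
                 Count: acc.Count + 1));

// Guard against division by zero when the list is empty.
double mean = stats.Count == 0 ? 0 : (double)stats.Sum / stats.Count;   // about 86.43

Console.WriteLine(stats);   // (71, 96, 605, 7)
```

Calling Min(), Max(), Sum(), and Count() separately would be easier to read but walks the list four times. The fold visits each element once, which matters mainly when the source is expensive to enumerate.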
Warning: Calling Max(), Min(), or Average() on an empty sequence of a non-nullable value type (such as int) throws InvalidOperationException. Sum() does not throw; it returns 0. Use DefaultIfEmpty() before the aggregation to supply a fallback: scores.DefaultIfEmpty(0).Max() returns 0 for an empty list. Alternatively, guard with Any(): scores.Any() ? scores.Max() : 0. The overloads for nullable types return null instead of throwing, so projecting to a nullable type is another option: scores.Max(s => (int?)s). The same advice applies in EF Core: prefer nullable result types on Min, Max, and Average when the source may be empty.
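
The empty-sequence behaviour described in the warning can be checked with a short in-memory sketch. It covers LINQ to Objects only, and the variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var empty = new List<int>();

// Sum() on an empty sequence returns 0 rather than throwing.
int sum = empty.Sum();

// Max() on an empty sequence of a non-nullable value type throws.
bool maxThrew = false;
try { _ = empty.Max(); }
catch (InvalidOperationException) { maxThrew = true; }

// Option 1: supply a fallback element before aggregating.
int maxOrZero = empty.DefaultIfEmpty(0).Max();   // 0

// Option 2: project to a nullable type; the nullable overload returns null.
int? maxOrNull = empty.Max(s => (int?)s);        // null

Console.WriteLine($"sum={sum}, maxThrew={maxThrew}, maxOrZero={maxOrZero}, maxOrNull is null: {maxOrNull is null}");
```

The nullable form keeps "no data" distinguishable from a real value of 0, which DefaultIfEmpty(0) cannot do.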

Set Operations

var a = new[] { 1, 2, 3, 4, 5 };
var b = new[] { 3, 4, 5, 6, 7 };

// Distinct — remove duplicates from one sequence
var nums   = new[] { 1, 2, 2, 3, 3, 3 };
var unique = nums.Distinct();                    // { 1, 2, 3 }

// .NET 6+ DistinctBy — deduplicate by a key
var distinctPosts = posts.DistinctBy(p => p.AuthorId);  // one post per author

// Union — all unique items from both sequences
var union  = a.Union(b);                         // { 1, 2, 3, 4, 5, 6, 7 }

// Intersect — only items in both sequences
var shared = a.Intersect(b);                     // { 3, 4, 5 }

// Except — items in a but not in b
var onlyA  = a.Except(b);                        // { 1, 2 }

// Concat — combine (does NOT deduplicate)
var all    = a.Concat(b);                        // { 1,2,3,4,5,3,4,5,6,7 } — duplicates kept
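
The set operators compare elements with the default equality comparer unless you pass one explicitly. As a hedged sketch, the example below uses two hypothetical tag lists (current and incoming) and StringComparer.OrdinalIgnoreCase to show how the choice of comparer changes the result.

```csharp
using System;
using System.Linq;

// Hypothetical tag lists that differ only in casing for some entries.
var current  = new[] { "csharp", "LINQ", "dotnet" };
var incoming = new[] { "linq", "DotNet", "aspnet" };

var comparer = StringComparer.OrdinalIgnoreCase;

// With a case-insensitive comparer, "LINQ" and "linq" count as the same element.
// Matching elements keep the casing from the first sequence.
var merged  = current.Union(incoming, comparer).ToList();      // csharp, LINQ, dotnet, aspnet
var common  = current.Intersect(incoming, comparer).ToList();  // LINQ, dotnet
var removed = current.Except(incoming, comparer).ToList();     // csharp

// Without a comparer, string equality is case-sensitive, so nothing matches.
var caseSensitive = current.Intersect(incoming).ToList();      // empty

Console.WriteLine(string.Join(", ", merged));
```

For custom classes, the same overloads accept any IEqualityComparer<T>. Without one, reference types that do not override Equals and GetHashCode are compared by reference.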

Materialisation Operators

var query = posts.Where(p => p.IsPublished);

// ToList — materialise to List<T> (most common)
List<Post> list   = query.ToList();

// ToArray — materialise to T[]
Post[] array = query.ToArray();

// ToDictionary — keyed by a property (throws on duplicate keys!)
Dictionary<int, Post> byId = query.ToDictionary(p => p.Id);

// ToHashSet — unique items only
HashSet<string> tags = posts.SelectMany(p => p.Tags).ToHashSet();

// ToLookup — like Dictionary but allows multiple values per key (no throw on dupe)
ILookup<string, Post> byAuthor = posts.ToLookup(p => p.AuthorId);
IEnumerable<Post> alicePosts   = byAuthor["alice-id"];  // all Alice's posts

Common Mistakes

Mistake 1 — Using Count() > 0 instead of Any()

❌ Wrong — Count() enumerates the entire sequence:

if (posts.Count() > 0) { }   // on a lazy sequence, iterates all elements to count them

✅ Correct — Any() short-circuits on the first element:

if (posts.Any()) { }   // stops at the first element
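
To make the difference visible, the sketch below counts how many elements a lazy sequence actually produces under each check. Numbers() and the yielded counter are hypothetical helpers written for this demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int yielded = 0;

// A lazy sequence of 1000 numbers that records how many elements it has produced.
IEnumerable<int> Numbers()
{
    for (int i = 1; i <= 1000; i++)
    {
        yielded++;
        yield return i;
    }
}

bool hasAny = Numbers().Any();
int producedForAny = yielded;      // 1: Any() stops after the first element

yielded = 0;
bool hasAnyViaCount = Numbers().Count() > 0;
int producedForCount = yielded;    // 1000: Count() walks the whole sequence

Console.WriteLine($"Any: {producedForAny} element(s), Count: {producedForCount} element(s)");
```

If the source is already a List<T> or an array, the Count or Length property is equally cheap, so the gap only appears with lazy sequences and database queries.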

Mistake 2 — ToDictionary with duplicate keys (throws ArgumentException)

❌ Wrong — throws if two posts have the same AuthorId:

var dict = posts.ToDictionary(p => p.AuthorId);   // throws on duplicate!

✅ Correct — use ToLookup when multiple items can share a key:

var lookup = posts.ToLookup(p => p.AuthorId);   // ✓ multiple posts per author
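
A small sketch of both behaviours, using tuples as a hypothetical stand-in for the Post type. It also shows one way to still get a dictionary when keys repeat: group first, then pick one item per group (here the highest Id, an arbitrary choice for the example).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Post: (Id, AuthorId, Title) tuples.
var posts = new List<(int Id, string AuthorId, string Title)>
{
    (1, "alice-id", "Intro to LINQ"),
    (2, "bob-id",   "EF Core tips"),
    (3, "alice-id", "Set operations"),
};

// ToDictionary throws because "alice-id" appears twice.
bool threw = false;
try { _ = posts.ToDictionary(p => p.AuthorId); }
catch (ArgumentException) { threw = true; }

// ToLookup accepts repeated keys and returns an empty sequence for a missing key.
var lookup = posts.ToLookup(p => p.AuthorId);
int aliceCount  = lookup["alice-id"].Count();   // 2
int nobodyCount = lookup["nobody"].Count();     // 0, no exception

// If exactly one item per key is needed, group first and choose a winner (.NET 6+ for MaxBy).
var latestByAuthor = posts
    .GroupBy(p => p.AuthorId)
    .ToDictionary(g => g.Key, g => g.MaxBy(p => p.Id));

Console.WriteLine($"threw={threw}, alice={aliceCount}, nobody={nobodyCount}, latest alice post id={latestByAuthor["alice-id"].Id}");
```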

🧠 Test Yourself

Why should you use Any() instead of Count() > 0 to check if a LINQ sequence has any elements?