Login


Implementing The TakeLast, TakeLastWhile, SkipLast And SkipLastWhile LINQ Operators

By Paulo Morgado on 2/9/2011
Language: C#
Technology: .NETLINQ
Platform: Windows
License: CPOL
Views: 10,348
Frameworks & Libraries » LINQ » General » Implementing The TakeLast, TakeLastWhile, SkipLast And SkipLastWhile LINQ Operators

Introduction

Some time ago, I needed to retrieve the last items of a sequence that satisfied some criteria and, looking at the operators available in the Enumerable class, I noticed that there wasn't such an operator.

The only way to achieve this was to reverse the sequence, take the items that satisfied the criteria, and reverse the resulting sequence. Something like this:

sequence.Reverse().TakeWhile(criteria).Reverse();

Looks quite simple, right? First, we call the Reverse method to produce a new sequence with the same items as the original sequence, but in the reverse order, then we call the TakeWhile method to take the first items that satisfy the criteria, and then call the Reverse method again to restore the original order of the items.

The problem with this approach is that the Reverse method buffers the entire sequence before iterating through its items in the reverse order - and the above code uses it twice. This means iterating over all items in the original sequence and buffering them all, iterating over the first items of the resulting sequence that satisfy the criteria and buffering them all and, finally, iterating over the result to produce the final sequence.

If you're counting, you've come to the conclusion that all items in the original sequence will be iterated over once, and the ones in the resulting sequence will be iterated again three times. If the original sequence is large, this can take a lot of memory and time.

There's another issue if you're using the variant that uses the index of the item in the original sequence in the evaluation of the selection criteria (>). When we reverse the order of the items, the indexes will be reversed and the predicate must take that in account, which might not be possible if you don't know the number of items in the original sequence.

There must be a better way, and that's why I implemented the Take Last and Skip Last operators.

The TakeLast Operator

This operator returns a specified number of contiguous elements from the end of a sequence, and is implemented as the TakeLast Extension Method:

public static IEnumerable<TSource>
    TakeLast<TSource>(this IEnumerable<TSource> source, int count);

And works like this:

int[] grades = { 59, 82, 70, 56, 92, 98, 85 };

var topThreeGrades = grades.OrderBy(grade => grade).TakeLast(3);

Console.WriteLine("The top three grades are:");
foreach (int grade in topThreeGrades)
{
    Console.WriteLine(grade);
}

/*
This code produces the following output:

The top three grades are:
98
92
85
*/

Implementing the TakeLast Operator

To implement this operator, we start by buffering, at most, a count number of items from the source sequence in an array that acts as a circular buffer:

var sourceEnumerator = source.GetEnumerator();
var buffer = new TSource[count];
var numOfItems = 0;
int idx;

for (idx = 0; (idx < count) && sourceEnumerator.MoveNext(); idx++, numOfItems++)
{
    buffer[idx] = sourceEnumerator.Current;
}

If the number of buffered items (numOfItems) is less than the requested number of items (count), we just yield all the buffered items:

if (numOfItems < count)
{
    for (idx = 0; idx < numOfItems; idx++)
    {
        yield return buffer[idx];
    }

    yield break;
}

Next, we iterate over the rest of the items, circularly buffering them:

for (idx = 0; sourceEnumerator.MoveNext(); idx = (idx + 1) % count)
{
    buffer[idx] = sourceEnumerator.Current;
}

And finally, we just iterate over the buffered items and yield them:

for (; numOfItems > 0; idx = (idx + 1) % count, numOfItems--)
{
    yield return buffer[idx];
}

There are two optimizations you can make here.

The first is obvious; if the requested number of items is 0 (zero), we just return an empty sequence:

if (count <= 0)
{
    return System.Linq.Enumerable.Empty<TSource>();
}

The second is if the source sequence is known to implement the IList<T> interface. Objects implementing this interface have a Count property and indexed access to its items, which allows us to optimize the production of the final sequence.

Producing the final sequence consists of yielding the required number of items from the end of the list (or all of them if the list contains less items than required):

int listCount = list.Count;

for (int idx = listCount - ((count < listCount) ? 
         count : listCount); idx < listCount; idx++)
{
    yield return list[idx];
}

The TakeLastWhile Operator

This operator returns the elements from the end of a sequence as long as the specified condition is true, and is implemented as the TakeLastWhile Extension Method:

public static IEnumerable<TSource> TakeLastWhile<TSource>(
      this IEnumerable<TSource> source, Func<TSource, bool> predicate)
public static IEnumerable<TSource> TakeLastWhile<TSource>(
      this IEnumerable<TSource> source, Func<TSource, int, bool> predicate)

and works like this:

string[] fruits = { "apple", "passionfruit", "banana",
    "mango", "orange", "blueberry",
    "grape", "strawberry" };

var query = 
    fruits.TakeLastWhile(fruit => string.Compare("orange", fruit, true) != 0);

foreach (string fruit in query)
{
    Console.WriteLine(fruit);
}

/*
This code produces the following output:

passionfruit
grape
strawberry
*/
string[] fruits = { "apple", "passionfruit", "banana",
    "mango", "orange", "blueberry",
    "grape", "strawberry" };

var query = fruits.TakeLastWhile((fruit, index) => fruit.Length >= index);

foreach (string fruit in query)
{
    Console.WriteLine(fruit);
}

/*
This code produces the following output:

strawberry
*/

Implementing the TakeLastWhile Operator

To implement this operator, we start with an empty buffer, and buffer every item that satisfies the criteria implemented by a predicate. Whenever an item doesn't satisfy the criteria, the buffer is cleared:

var buffer = new List<TSource>();

foreach (var item in source)
{
    if (predicate(item))
    {
        buffer.Add(item);
    }
    else
    {
        buffer.Clear();
    }
}

After traversing the source sequence, we just yield all the items, if any, in the buffer:

foreach (var item in buffer)
{
    yield return item;
}

The overload that takes into account the index of the item only differs in the call to the predicate that implements the criteria:

var buffer = new List<TSource>();
var idx = 0;

foreach (var item in source)
{
    if (predicate(item, idx++))
    {
        buffer.Add(item);
    }
    else
    {
        buffer.Clear();
    }
}

foreach (var item in buffer)
{
    yield return item;
}

The SkipLast Operator

This operator returns all but a specified number of contiguous elements from the end of a sequence, and is implemented as the SkipLast Extension Method:

public static IEnumerable<TSource> SkipLast<TSource>(
      this IEnumerable<TSource> source, int count)

and works like this:

int[] grades = { 59, 82, 70, 56, 92, 98, 85 };

var lowerGrades = grades.OrderBy(g => g).SkipLast(3);

Console.WriteLine("All grades except the top three are:");
foreach (int grade in lowerGrades)
{
    Console.WriteLine(grade);
}

/*
This code produces the following output:

All grades except the top three are:
56
59
70
82
*/

Implementing the SkipLast Operator

To implement this operator, we start by buffering, at most, a count number of items from the source sequence in an array that acts as a circular buffer:

var sourceEnumerator = source.GetEnumerator();
var buffer = new TSource[count];
int idx;

for (idx = 0; (idx < count) && sourceEnumerator.MoveNext(); idx++)
{
    buffer[idx] = sourceEnumerator.Current;
}

Next, we iterate over the rest of the items of the source sequence, circularly buffering them, and yielding the previously buffered item at the same position:

idx = 0;

while (sourceEnumerator.MoveNext())
{
    var item = buffer[idx];

    buffer[idx] = sourceEnumerator.Current;

    idx = (idx + 1) % count;

    yield return item;
}

If the number of items to skip is greater or equal to the number of items in the source sequence, sourceEnumerator.MoveNext() will return false on the first iteration of the while loop, and an empty sequence will be produced.

As with the TakeLast operator, a few optimizations can be made here.

The first is obvious, if the requested number of items is 0 (zero) or lower, we just return an equivalent sequence:

if (count <= 0)
{
    return source.Select(i => i);
}

The second is if the source sequence is known to implement the IList<T> interface. Objects implementing this interface have a Count property and indexed access to its items, which allows us to optimize the production of the final sequence.

If the number of items to skip is equal to or greater than the number of items in the source sequence, we just return an empty sequence:

var list = source as IList<TSource>;

if (list != null)
{
    if (count >= list.Count)
    {
        return System.Linq.Enumerable.Empty<TSource>();
    }

    // ...
}

If the number of items in the source sequence is greater than the number of items to skip, producing the final sequence consists of yielding all the items in the source sequence except the last count items:

int returnCount = list.Count - count;

for (int idx = 0; idx < returnCount; idx++)
{
    yield return list[idx];
}

The SkipLastWhile Operator

This operator returns all the elements from the sequence skipping those at the end as long as the specified condition is true, and is implemented as the SkipLastWhile Extension Method:

public static IEnumerable<TSource> SkipLastWhile<TSource>(
         this IEnumerable<TSource> source, Func<TSource, bool> predicate)
public static IEnumerable<TSource> SkipLastWhile<TSource>(
         this IEnumerable<TSource> source, Func<TSource, int, bool> predicate)

and works like this:

int[] grades = { 59, 82, 70, 56, 92, 98, 85 };

var lowerGrades = grades.OrderBy(grade => grade).SkipLastWhile(grade => grade >= 80);

Console.WriteLine("All grades below 80:");
foreach (int grade in lowerGrades)
{
    Console.WriteLine(grade);
}

/*
This code produces the following output:

All grades below 80:
56
59
70
*/
int[] amounts = { 5000, 2500, 5500, 8000, 6500, 4000, 1500, 9000 };

var query = amounts .SkipLastWhile((amount, index) => amount > index * 1000);

foreach (int amount in query)
{
    Console.WriteLine(amount);
}

/*
 This code produces the following output:

9000
*/

Implementing the SkipLastWhile Operator

To implement this operator, we use an empty buffer and buffer every item that satisfies the criteria implemented by a predicate. Whenever an item doesn't satisfy the criteria, all buffered items are yielded, the buffer is cleared, and the the item that doesn't satisfy the criteria is yielded:

var buffer = new List<TSource>();

foreach (var item in source)
{
    if (predicate(item))
    {
        buffer.Add(item);
    }
    else
    {
        if (buffer.Count > 0)
        {
            foreach (var bufferedItem in buffer)
            {
                yield return bufferedItem;
            }

            buffer.Clear();
        }

        yield return item;
    }
}

The overload that takes in to account the index of the item only differs in the call to the predicate that implements the criteria:

var buffer = new List<TSource>();
var idx = 0;

foreach (var item in source)
{
    if (predicate(item, idx++))
    {
        buffer.Add(item);
    }
    else
    {
        if (buffer.Count > 0)
        {
            foreach (var bufferedItem in buffer)
            {
                yield return bufferedItem;
            }

            buffer.Clear();
        }

        yield return item;
    }
}

Conclusion

As you can see, implementing these kind of operators is very easy. There's no reason to not implement such operators whenever they're needed. Implementing such operators can also improve the readability and maintainability of the code. You can find these (and more) operators in my CodePlex project for LINQ utilities and operators: PauloMorgado.Linq.

End-User License

Use of this article and any related source code or other files is governed by the terms and conditions of The Code Project Open License.

Author Information

Paulo Morgado