Implementing VB's Like Operator in C#

By Jonathan Wood on 2/16/2014 (Updated on 7/27/2015)
Language: C#
Technology: .NET
Platform: Windows
License: CPOL

Introduction

VB.NET programmers can use the Like operator to test whether a string matches a pattern. The operator supports wildcards, which make it easy to test for strings that are similar but not exactly the same.

C# doesn't provide a Like operator, but C# programmers wanting the same functionality can still obtain it. This article presents several ways to get the functionality of the Like operator in C#.

But before we discuss how this functionality can be made available in C# applications, let's first talk about how this operator works in VB.

Like Operator Syntax

VB.NET's Like operator determines if a given string matches a specified pattern.

If "123" Like "1#3" Then
    'Code executed if string matches the specified pattern.
End If

The pattern string supports the following wildcards:

Characters in Pattern    Matches in String
?                        Any single character
*                        Zero or more characters
#                        Any single digit (0-9)
[charlist]               Any single character in charlist
[!charlist]              Any single character not in charlist

A group of one or more characters (charlist) enclosed in brackets ([ ]) can be used to match any single character in the string and can include almost any character code, including digits. An exclamation point (!) at the beginning of charlist means that a match is made if any character except the characters in charlist is found in the string. When used outside brackets, the exclamation point matches itself.

To match the special characters such as the left bracket ([), question mark (?), number sign (#), and asterisk (*), enclose them in brackets. The right bracket (]) cannot be used within a group to match itself, but it can be used outside a group as an individual character. The character sequence [] is considered a zero-length string (""). However, it cannot be part of a character list enclosed in brackets. If you want to check whether a position in string contains one of a group of characters or no character at all, you can use Like twice.

By using a hyphen (-) to separate the lower and upper bounds of the range, charlist can specify a range of characters. For example, [A-Z] results in a match if the corresponding character position in the string contains any character within the range A-Z, and [!H-L] results in a match if the corresponding character position contains any character outside the range H-L. When you specify a range of characters, they must appear in ascending sort order, that is, from lowest to highest. Thus, [A-Z] is a valid pattern, but [Z-A] is not.

Use Microsoft.VisualBasic.CompilerServices

The first option you have is to use the LikeOperator.LikeString method found in the Microsoft.VisualBasic.CompilerServices namespace in Microsoft.VisualBasic.dll. This is the same method called under the covers by VB's Like operator.

LikeOperator.LikeString("123", "1#3", Microsoft.VisualBasic.CompareMethod.Binary);

Note that you can set the third argument to CompareMethod.Text to compare without regard to case.
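As a rough sketch of how this call might be wrapped for everyday use, the class below hides the CompareMethod argument behind a Boolean flag. The VbLike class and its Matches method are hypothetical names invented for this example; only LikeOperator.LikeString and CompareMethod come from the framework. It assumes the project references the Microsoft.VisualBasic assembly.

```csharp
using System;
using Microsoft.VisualBasic;
using Microsoft.VisualBasic.CompilerServices;

// Hypothetical convenience wrapper (not part of the framework).
static class VbLike
{
    // Returns true if source matches the Like pattern.
    // ignoreCase selects CompareMethod.Text instead of CompareMethod.Binary.
    public static bool Matches(string source, string pattern, bool ignoreCase = false)
    {
        CompareMethod method = ignoreCase ? CompareMethod.Text : CompareMethod.Binary;
        return LikeOperator.LikeString(source, pattern, method);
    }
}
```

With this wrapper, VbLike.Matches("123", "1#3") returns true, VbLike.Matches("abc", "A*") returns false because the binary comparison is case sensitive, and VbLike.Matches("abc", "A*", true) returns true.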

Use Regular Expressions

If you are comfortable with regular expressions, then this is a no-brainer. The .NET platform has rich support for regular expressions. Regular expressions can duplicate the functionality of the Like operator, but they can also do so much more!

Of course, with this added flexibility comes added complexity. For those who are regular-expression challenged, the following table translates the syntax of special characters in the Like operator to its equivalent characters in a regular expression.

To Match                               Like Operator Syntax    Regular Expression Syntax
Any single character in charlist       [charlist]              [charlist]
Any single character not in charlist   [!charlist]             [^charlist]
Any single digit (0-9)                 #                       \d
Any single character                   ?                       [\s\S]
Zero or more characters                *                       [\s\S]*
A special character char               [char]                  \char
Any character in a range               [A-Z]                   [A-Z]

In both syntaxes, a range uses a hyphen (-) to separate the lower and upper bounds within a charlist. The [\s\S] form specifies two mutually exclusive character classes in a single charlist, so together they match any character.

Note that the behavior of the Like operator depends on the Option Compare statement. The default string comparison method for each source file is Option Compare Binary. In comparison, regular expressions work the same regardless of Option Compare.
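The translation in the table can be sketched as a small converter. This is only an illustration under stated assumptions: the LikeToRegex class and its method names are hypothetical, the result is anchored so the whole string must match, an unterminated [ is treated as a literal bracket, and RegexOptions.IgnoreCase is used as an approximation of Option Compare Text.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper that applies the table's translation rules.
static class LikeToRegex
{
    // Converts a Like pattern to an anchored regular expression pattern.
    public static string ToRegexPattern(string likePattern)
    {
        var regex = new StringBuilder("^");
        int i = 0;
        while (i < likePattern.Length)
        {
            char c = likePattern[i++];
            if (c == '?') regex.Append(@"[\s\S]");       // Any single character
            else if (c == '*') regex.Append(@"[\s\S]*"); // Zero or more characters
            else if (c == '#') regex.Append(@"\d");      // Any single digit
            else if (c == '[')
            {
                int close = likePattern.IndexOf(']', i);
                if (close < 0)
                {
                    // Assumption: an unterminated group is a literal bracket
                    regex.Append(@"\[");
                    continue;
                }
                string list = likePattern.Substring(i, close - i);
                i = close + 1;
                if (list.Length == 0)
                    continue; // [] is a zero-length string
                bool negate = list.Length > 1 && list[0] == '!';
                if (negate)
                    list = list.Substring(1);
                // Escape characters that are special inside a regex class
                list = list.Replace(@"\", @"\\").Replace("^", @"\^").Replace("[", @"\[");
                regex.Append(negate ? "[^" : "[").Append(list).Append(']');
            }
            else regex.Append(Regex.Escape(c.ToString())); // Literal character
        }
        return regex.Append(@"\z").ToString();
    }

    // Tests a string against a Like pattern by way of Regex.IsMatch.
    public static bool IsMatch(string s, string likePattern, bool ignoreCase = false)
    {
        RegexOptions options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;
        return Regex.IsMatch(s, ToRegexPattern(likePattern), options);
    }
}
```

For example, LikeToRegex.ToRegexPattern("1#3") produces ^1\d3\z, and LikeToRegex.IsMatch("123", "1#3") returns true. A reversed range such as [Z-A] is passed through unchanged, so the Regex class rejects it with an exception.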

Roll Your Own

Of course, if you have a little time and feel like rolling your own, that can be a bit of fun as well.

Listing 1 shows my IsLike() extension method. It implements the same functionality as the Like operator using straight C# code.

Listing 1: IsLike Extension Method

using System;
using System.Collections.Generic;

static class StringCompareExtensions
{
    /// <summary>
    /// Implements VB's Like operator logic.
    /// </summary>
    public static bool IsLike(this string s, string pattern)
    {
        // Characters matched so far
        int matched = 0;

        // Loop through pattern string
        for (int i = 0; i < pattern.Length; )
        {
            // Check for end of string
            if (matched > s.Length)
                return false;

            // Get next pattern character
            char c = pattern[i++];
            if (c == '[') // Character list
            {
                // Test for exclude character
                bool exclude = (i < pattern.Length && pattern[i] == '!');
                if (exclude)
                    i++;
                // Build character list
                int j = pattern.IndexOf(']', i);
                if (j < 0)
                    j = pattern.Length; // No closing bracket: use rest of pattern
                HashSet<char> charList = CharListToSet(pattern.Substring(i, j - i));
                i = j + 1;

                // Fail if the string is exhausted or the character is not allowed
                if (matched >= s.Length || charList.Contains(s[matched]) == exclude)
                    return false;
                matched++;
            }
            else if (c == '?') // Any single character
            {
                matched++;
            }
            else if (c == '#') // Any single digit
            {
                // Fail if the string is exhausted or the character is not a digit
                if (matched >= s.Length || !Char.IsDigit(s[matched]))
                    return false;
                matched++;
            }
            else if (c == '*') // Zero or more characters
            {
                if (i < pattern.Length)
                {
                    // Matches all characters until
                    // next character in pattern
                    char next = pattern[i];
                    int j = s.IndexOf(next, matched);
                    if (j < 0)
                        return false;
                    matched = j;
                }
                else
                {
                    // Matches all remaining characters
                    matched = s.Length;
                    break;
                }
            }
            else // Exact character
            {
                if (matched >= s.Length || c != s[matched])
                    return false;
                matched++;
            }
        }
        // Return true if all characters matched
        return (matched == s.Length);
    }

    /// <summary>
    /// Converts a string of characters to a HashSet of characters. If the string
    /// contains character ranges, such as A-Z, all characters in the range are
    /// also added to the returned set of characters.
    /// </summary>
    /// <param name="charList">Character list string</param>
    private static HashSet<char> CharListToSet(string charList)
    {
        HashSet<char> set = new HashSet<char>();

        for (int i = 0; i < charList.Length; i++)
        {
            if ((i + 1) < charList.Length && charList[i + 1] == '-')
            {
                // Character range
                char startChar = charList[i++];
                i++; // Hyphen
                char endChar = (char)0;
                if (i < charList.Length)
                    endChar = charList[i]; // Loop increment steps past the end character
                for (int j = startChar; j <= endChar; j++)
                    set.Add((char)j);
            }
            else set.Add(charList[i]);
        }
        return set;
    }
}

There isn't anything too fancy going on in the code. The code simply parses through the pattern string, ensuring each character in the string matches the pattern. The private CharListToSet() method does the job of converting a character list string to an actual collection of characters. It also handles character ranges by adding all characters in the range to the collection.

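My code is intended to behave like the Microsoft.VisualBasic.CompilerServices.LikeOperator.LikeString method, with a few differences. My version does not throw an exception if the pattern string is invalid; it just does the best it can with the data provided. The * wildcard is handled simply: it skips ahead to the first occurrence of the pattern character that follows it and does not backtrack, so a pattern such as a*bd does not match abcbd, and a wildcard placed directly after * is searched for as a literal character. In addition, the code above is always case sensitive, although it could be changed to be case insensitive without too much trouble.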
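Because Listing 1 advances past a * only to the first occurrence of the next pattern character, one possible refinement is to let * try every possible length. The sketch below is an alternative written for illustration, not the code from Listing 1; the LikeBacktracking class and its IsLikeBacktracking, Match, and InList names are hypothetical. It keeps the same best-effort approach to invalid patterns and is always case sensitive.

```csharp
using System;

// Hypothetical alternative matcher in which * backtracks.
static class LikeBacktracking
{
    public static bool IsLikeBacktracking(this string s, string pattern)
    {
        return Match(s, 0, pattern, 0);
    }

    // Matches s from index si against pattern p from index pi.
    private static bool Match(string s, int si, string p, int pi)
    {
        while (pi < p.Length)
        {
            char c = p[pi];
            if (c == '*')
            {
                // Try every possible length for *, shortest first
                for (int k = si; k <= s.Length; k++)
                    if (Match(s, k, p, pi + 1))
                        return true;
                return false;
            }
            if (c == '[')
            {
                int close = p.IndexOf(']', pi + 1);
                if (close < 0)
                    close = p.Length; // No closing bracket: use rest of pattern
                string list = p.Substring(pi + 1, close - pi - 1);
                pi = close + 1;
                if (list.Length == 0)
                    continue; // [] matches a zero-length string
                bool exclude = list.Length > 1 && list[0] == '!';
                if (exclude)
                    list = list.Substring(1);
                if (si >= s.Length || InList(s[si], list) == exclude)
                    return false;
            }
            else
            {
                if (si >= s.Length)
                    return false;
                if (c == '#')
                {
                    if (!char.IsDigit(s[si]))
                        return false;
                }
                else if (c != '?' && c != s[si])
                    return false;
                pi++;
            }
            si++;
        }
        // Succeeds only if the whole string was consumed
        return si == s.Length;
    }

    // Tests whether ch appears in a character list that may contain ranges.
    private static bool InList(char ch, string list)
    {
        for (int i = 0; i < list.Length; i++)
        {
            if (i + 2 < list.Length && list[i + 1] == '-')
            {
                if (ch >= list[i] && ch <= list[i + 2])
                    return true;
                i += 2; // Skip the hyphen and the upper bound
            }
            else if (ch == list[i])
                return true;
        }
        return false;
    }
}
```

With this version, "abcbd".IsLikeBacktracking("a*bd") returns true, as does "abc".IsLikeBacktracking("a*?"). The trade-off is that patterns containing many * wildcards can take much longer to evaluate, since each one may be retried at every position.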

Calling the IsLike() extension method is very simple. It appears as a member for any string when my StringCompareExtensions class is visible to the compiler.

string s = "123";
s.IsLike("1#3");  // Returns true

Conclusion

So there you have several different options if you need to duplicate the Like operator in C# code. If you need this functionality, I hope you find one of them to be useful.

Update History

7/27/2015 : Fixed an error pointed out to me by Donovan Edye where my code failed if the * came at the end of the pattern.

End-User License

Use of this article and any related source code or other files is governed by the terms and conditions of The Code Project Open License.

Author Information

Jonathan Wood

I'm a software/website developer working out of the greater Salt Lake City area in Utah. I've developed many websites including Black Belt Coder, Insider Articles, and others.