How Well Do You Know C#?

Although C# is considered to be a language that’s easy to learn and understand, the code written in it might behave unexpectedly sometimes, even for developers with years of experience and good knowledge of the language.

This article features several code snippets which fall into that category, and explains the reasons behind the surprising behavior.



Null Value

We are all aware that null values can be dangerous if not handled properly.

Dereferencing a null-valued variable (i.e. calling a method on it or accessing one of its properties) will result in a NullReferenceException, as demonstrated with the following sample code:

object nullValue = null;
bool areNullValuesEqual = nullValue.Equals(null);

To be on the safer side, we should always make sure that reference type values are not null before dereferencing them. Failing to do so could result in an unhandled exception in a specific edge case. Although such a mistake occasionally happens to everyone, we could hardly call it unexpected behavior.

What about the following code?

string nullString = (string)null;
bool isStringType = nullString is string;

What will be the value of isStringType? Will the value of a variable that is explicitly typed as string always be typed as string at runtime as well?

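The correct answer is No: isStringType will be false.

A null value has no type at runtime, so the is operator returns false for it regardless of the declared type of the variable.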
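The same rule applies to type patterns. The following is a minimal sketch; the DescribeType helper is a hypothetical name introduced only for illustration and is not part of the original sample:

```csharp
using System;

public static class NullTypeDemo
{
    // Hypothetical helper: reports how a value is seen at runtime.
    public static string DescribeType(object value)
    {
        // A type pattern only matches non-null values.
        if (value is string text)
        {
            return "string of length " + text.Length;
        }
        return value == null ? "null (no runtime type)" : value.GetType().Name;
    }

    public static void Main()
    {
        string nullString = null;
        Console.WriteLine(nullString is string);      // False
        Console.WriteLine(DescribeType(nullString));  // null (no runtime type)
        Console.WriteLine(DescribeType("abc"));       // string of length 3
    }
}
```

The declared type of nullString only matters to the compiler. At runtime there is no object to inspect, so every type check on it fails.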

In a way, this also affects reflection. Of course, you can’t call GetType() on a null value because a NullReferenceException would be thrown:

object nullValue = null;
Type nullType = nullValue.GetType();

Let’s look at nullable value types then:

int intValue = 5;
Nullable<int> nullableIntValue = 5;
bool areTypesEqual = intValue.GetType() == nullableIntValue.GetType();

Is it possible to distinguish between a nullable and a non-nullable value type using reflection?

The answer is No.

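The same type will be returned for both variables in the above code: System.Int32. The call to GetType() boxes the nullable value, and a Nullable<T> that has a value is boxed as its underlying type. This does not mean that reflection has no representation for Nullable<T>, though.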

Type intType = typeof(int);
Type nullableIntType = typeof(Nullable<int>);
bool areTypesEqual = intType == nullableIntType;

The types in this code snippet are different. As expected, the nullable type will be represented with System.Nullable`1[[System.Int32]]. Only when inspecting values at runtime are nullable ones treated the same as non-nullable ones in reflection.
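When only a Type is available, Nullable.GetUnderlyingType can tell the two apart. Here is a small sketch of that check; the IsNullableValueType helper is a hypothetical name used for illustration:

```csharp
using System;

public static class NullableReflectionDemo
{
    // Returns true when the given type is a closed Nullable<T>.
    public static bool IsNullableValueType(Type type)
    {
        return Nullable.GetUnderlyingType(type) != null;
    }

    public static void Main()
    {
        int? nullableIntValue = 5;
        // Boxing a Nullable<int> that has a value produces a boxed int.
        object boxed = nullableIntValue;
        Console.WriteLine(boxed.GetType() == typeof(int));            // True
        Console.WriteLine(IsNullableValueType(typeof(int?)));         // True
        Console.WriteLine(IsNullableValueType(typeof(int)));          // False
        Console.WriteLine(Nullable.GetUnderlyingType(typeof(int?)));  // System.Int32
    }
}
```

This works for declared types, such as those of fields, properties or parameters obtained through reflection, but not for values, because the nullable wrapper is already gone once a value has been boxed.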

Handling Null values in Overloaded methods

Before moving on to other topics, let’s take a closer look at how null values are handled when calling overloaded methods with the same number of parameters, but of different type.

private string OverloadedMethod(object arg)
{
    return "object parameter";
}
 
private string OverloadedMethod(string arg)
{
    return "string parameter";
}

What will happen if we invoke the method with null value?

var result = OverloadedMethod(null);

Which overload will be called? Or will the code fail to compile because of an ambiguous method call?

In this case, the code will compile and the method with the string parameter will be called.

In general, the code will compile when there is an implicit conversion from one parameter type to the other (e.g. one is derived from the other). The method with the more specific parameter type will be called.

When there’s no such conversion between the two types, the call is ambiguous and the code won’t compile.

To force a specific overload to be called, the null value can be cast to that parameter type:

var result = OverloadedMethod((object)null);
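The rules above can be tried out with a compact sketch. The Describe overloads are hypothetical stand-ins for OverloadedMethod, made static and public so that the sample is self-contained:

```csharp
using System;

public static class OverloadDemo
{
    public static string Describe(object arg) { return "object parameter"; }
    public static string Describe(string arg) { return "string parameter"; }

    // Assumption for illustration: adding an overload with an unrelated
    // reference type, e.g. Describe(Uri arg), would make Describe(null)
    // ambiguous and the call would no longer compile.

    public static void Main()
    {
        Console.WriteLine(Describe(null));          // string parameter
        Console.WriteLine(Describe((object)null));  // object parameter

        // The overload is chosen at compile time from the static type,
        // regardless of the value the variable holds at runtime.
        object boxedText = "text";
        Console.WriteLine(Describe(boxedText));     // object parameter
    }
}
```

The last call shows that overload resolution never looks at the runtime type: a string stored in an object variable still selects the object overload.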

Arithmetic Operations

Most of us don’t use bit shifting operations very often!

Let’s refresh our memory first. The left shift operator (<<) shifts the binary representation to the left by the given number of places:

var shifted = 0b1 << 1; // = 0b10

Similarly, the right shift operator (>>) shifts the binary representation to the right:

var shifted = 0b1 >> 1; // = 0b0

The bits don’t wrap around when they reach the end. That’s why the result of the second expression is 0. The same would happen if we shifted the bit far enough to the left (32 bits, because int is a 32-bit type):

var shifted = 0b1;
for (int i = 0; i < 32; i++)
{
    shifted = shifted << 1;
}

The result would again be 0.

However, the bit shifting operators have a second operand. Instead of shifting to the left by 1 bit 32 times, we can shift left by 32 bits and get the same result.

var shifted = 0b1 << 32;

Right? Wrong.

The result of this expression will be 1. Why?

Because that’s how the operator is defined. Before applying the operation, the second operand is reduced to the bit length of the first operand: for a 32-bit first operand, only the low-order five bits of the shift count are used. For non-negative counts this equals the remainder of dividing the second operand by the bit length of the first operand.

The first operand in the example we just saw was a 32-bit number, hence: 32 % 32 = 0. Our number will be shifted left by 0 bits. That’s not the same as shifting it left by 1 bit 32 times.
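A short sketch makes the effect of the reduced shift count visible. The ShiftLeft helper is a hypothetical wrapper, added only so that the count is not a compile-time constant:

```csharp
using System;

public static class ShiftDemo
{
    // Hypothetical wrapper around the << operator for int operands.
    public static int ShiftLeft(int value, int count)
    {
        return value << count;
    }

    public static void Main()
    {
        Console.WriteLine(ShiftLeft(1, 32));                 // 1, count is reduced to 32 % 32 = 0
        Console.WriteLine(ShiftLeft(1, 33));                 // 2, same as shifting by 1
        Console.WriteLine(ShiftLeft(ShiftLeft(1, 16), 16));  // 0, two in-range shifts push the bit out

        // With a long first operand the count is reduced to 64 bits instead.
        int count = 32;
        Console.WriteLine(1L << count);                      // 4294967296
    }
}
```

If a bit really needs to be shifted out completely, the shift has to be split into steps that are each smaller than the bit length of the first operand.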

Let’s move on to operators & (and) and | (or). Based on the type of operands, they represent two different operations:

  • For Boolean operands, they act as logical operators, similar to && and ||, with one difference: they are eager, i.e. both operands are always evaluated, even if the result could already be determined after evaluating the first operand.
  • For integral types, they act as logical bitwise operators and are commonly used with enum types representing flags.

[Flags]
private enum Colors
{
    None = 0b0,
    Red = 0b1,
    Green = 0b10,
    Blue = 0b100
}

The | operator is used for combining flags and the & operator is used for checking whether flags are set:

Colors color = Colors.Red | Colors.Green;
bool isRed = (color & Colors.Red) == Colors.Red;

In the above code, I put parentheses around the bitwise logical operation to make the code clearer. Are the parentheses required in this expression?

As it turns out, Yes.

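Unlike arithmetic operators, the bitwise logical operators have lower precedence than the equality operator. Without the parentheses, the expression would be parsed as color & (Colors.Red == Colors.Red). Fortunately, that code wouldn’t compile, because the & operator can’t be applied to a Colors value and a bool.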

Since .NET Framework 4.0, there’s a more readable alternative available for checking flags, which you can use instead of the & operator:

bool isRed = color.HasFlag(Colors.Red);
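HasFlag has a few edge cases of its own that are worth knowing. The sketch below redeclares the Colors enum as a public top-level type so that it is self-contained:

```csharp
using System;

[Flags]
public enum Colors
{
    None = 0b0,
    Red = 0b1,
    Green = 0b10,
    Blue = 0b100
}

public static class FlagsDemo
{
    public static void Main()
    {
        Colors color = Colors.Red | Colors.Green;

        Console.WriteLine(color.HasFlag(Colors.Red));                  // True
        // With a combined argument, all of its bits must be set.
        Console.WriteLine(color.HasFlag(Colors.Red | Colors.Blue));    // False
        // Checking for any of several flags still needs the & operator.
        Console.WriteLine((color & (Colors.Red | Colors.Blue)) != 0);  // True
        // A zero-valued flag is reported as set for every value.
        Console.WriteLine(color.HasFlag(Colors.None));                 // True
        Console.WriteLine(color);                                      // Red, Green
    }
}
```

In other words, HasFlag answers the question "are all of these bits set?", which is not always the question being asked.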

Math.Round()

We will conclude the topic of arithmetic operations with the Round operation. How does it round the values at the midpoint between the two integer values, e.g. 1.5? Up or down?

var rounded = Math.Round(1.5);

If you predicted up, you were right. The result will be 2. Is this a general rule?

var rounded = Math.Round(2.5);

No. The result will be 2 again. By default, the midpoint value will be rounded to the nearest even value. You could provide the second argument to the method to request such behavior explicitly:

var rounded = Math.Round(2.5, MidpointRounding.ToEven);

The behavior can be changed with a different value for the second argument:

var rounded = Math.Round(2.5, MidpointRounding.AwayFromZero);

With this explicit rule, positive midpoint values will now always be rounded upwards.
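To compare the two modes side by side, including a negative midpoint, a small loop is enough. This is only a sketch; all the sample values are exactly representable as double, so no precision effects are involved:

```csharp
using System;

public static class MidpointDemo
{
    public static void Main()
    {
        // value   ToEven   AwayFromZero
        //  0.5      0          1
        //  1.5      2          2
        //  2.5      2          3
        //  3.5      4          4
        // -2.5     -2         -3
        foreach (var value in new[] { 0.5, 1.5, 2.5, 3.5, -2.5 })
        {
            double toEven = Math.Round(value, MidpointRounding.ToEven);
            double awayFromZero = Math.Round(value, MidpointRounding.AwayFromZero);
            Console.WriteLine($"{value}: ToEven={toEven}, AwayFromZero={awayFromZero}");
        }
    }
}
```

For negative midpoints, AwayFromZero rounds downwards, which is why the mode is not simply called "up".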

Rounding numbers can also be affected by the precision of floating point numbers.

var value = 1.4f;
var rounded = Math.Round(value + 0.1f);

The midpoint value should be rounded to the nearest even number, i.e. 2, but that result is not guaranteed here. Neither 1.4 nor 0.1 has an exact single precision representation, and the exact sum of the two stored values is slightly less than 1.5. When the runtime keeps that intermediate result at a higher precision, it is rounded to 1. When the sum is first rounded back to single precision, it becomes exactly 1.5 and the result is 2.

Although this particular issue does not manifest itself when using double precision floating point numbers, rounding errors can still happen, albeit less often. When exact decimal fractions matter, you should therefore use decimal instead of float or double.
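As a minimal sketch of the difference, the same kind of calculation can be repeated with decimal, where values such as 1.4 and 0.1 are stored exactly:

```csharp
using System;

public static class DecimalRoundingDemo
{
    public static void Main()
    {
        // 1.4m and 0.1m are stored exactly, so the sum is exactly 1.5m.
        decimal sum = 1.4m + 0.1m;
        Console.WriteLine(sum == 1.5m);            // True
        Console.WriteLine(Math.Round(sum) == 2m);  // True, the midpoint rounds to even

        // The same kind of comparison fails with double.
        Console.WriteLine(0.1 + 0.2 == 0.3);       // False
        Console.WriteLine(0.1m + 0.2m == 0.3m);    // True
    }
}
```

The price for this is speed and range: decimal arithmetic is slower and covers a smaller range of values than double.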

Class Initialization

Best practices suggest that we should avoid doing work that can fail in class constructors as far as possible, to prevent exceptions during initialization.

All of this is even more important for static constructors.

As you might know, the static constructor of a class is called before its instance constructor when we first instantiate the class at runtime.

This is the order of initialization when instantiating any class:

  • Static fields (first time class access only: static members or first instance)
  • Static constructor (first time class access only: static members or first instance)
  • Instance fields (each instance)
  • Instance constructor (each instance)
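The order listed above can be observed with a class that records each step. This is only a sketch; the Ordered class and its Record helper are hypothetical names introduced for illustration:

```csharp
using System;
using System.Collections.Generic;

public class Ordered
{
    // Declared first so that it exists before the other initializers run.
    public static readonly List<string> Log = new List<string>();

    private static readonly int staticField = Record("static field");
    private readonly int instanceField = Record("instance field");

    static Ordered() { Record("static constructor"); }

    public Ordered() { Record("instance constructor"); }

    private static int Record(string step)
    {
        Log.Add(step);
        return 0;
    }
}

public static class InitializationOrderDemo
{
    public static string Run()
    {
        new Ordered();
        new Ordered();
        return string.Join(", ", Ordered.Log);
    }

    public static void Main()
    {
        // static field, static constructor, instance field, instance constructor,
        // instance field, instance constructor
        Console.WriteLine(Run());
    }
}
```

The static steps appear only once, even though two instances are created.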

Let’s create a class with a static constructor, which can be configured to throw an exception:

public static class Config
{
    public static bool ThrowException { get; set; } = true;
}
 
public class FailingClass
{
    static FailingClass()
    {
        if (Config.ThrowException)
        {
            throw new InvalidOperationException();
        }
    }
}

It shouldn’t come as a surprise that any attempt to create an instance of this class will result in an exception:

var instance = new FailingClass();

However, it won’t be InvalidOperationException. The runtime will automatically wrap it into a TypeInitializationException. This is an important detail to note if you want to catch the exception and recover from it.

try
{
    var failedInstance = new FailingClass();
}
catch (TypeInitializationException) { }
Config.ThrowException = false;
var instance = new FailingClass();

Applying what we have learned, the above code should catch the exception thrown by the static constructor, change the configuration to avoid exceptions being thrown in future calls, and finally successfully create an instance of the class, right?

Unfortunately, not.

The static constructor for a class is only called once. If it throws an exception, then this exception will be rethrown whenever you want to create an instance or access the class in any other way.

The class becomes effectively unusable until the process (or the application domain) is restarted. Yes, having even a minuscule chance that the static constructor will throw an exception is a very bad idea.
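The repeated failure is easy to observe. In the following sketch, the AlwaysFailing class and the Touch helper are hypothetical names; the static constructor throws unconditionally to keep the sample short:

```csharp
using System;

public class AlwaysFailing
{
    public static int Value = 42;

    static AlwaysFailing()
    {
        throw new InvalidOperationException("static constructor failed");
    }
}

public static class TypeInitDemo
{
    // Accesses the class and describes the exception that was observed.
    public static string Touch()
    {
        try
        {
            return AlwaysFailing.Value.ToString();
        }
        catch (TypeInitializationException e)
        {
            return e.GetType().Name + " <- " + e.InnerException.GetType().Name;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Touch());  // TypeInitializationException <- InvalidOperationException
        Console.WriteLine(Touch());  // TypeInitializationException <- InvalidOperationException
    }
}
```

The original exception remains available through the InnerException property, which helps when diagnosing such failures from logs.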

Initialization Order in Derived classes

Initialization order is even more complex for derived classes. In edge cases, this can get you into trouble. It’s time for a contrived example:

public class BaseClass
{
    public BaseClass()
    {
        VirtualMethod(1);
    }
 
    public virtual int VirtualMethod(int dividend)
    {
        return dividend / 1;
    }
}
 
public class DerivedClass : BaseClass
{
    int divisor;
    public DerivedClass()
    {
        divisor = 1;
    }
 
    public override int VirtualMethod(int dividend)
    {
        return base.VirtualMethod(dividend / divisor);
    }
}

Can you spot a problem in DerivedClass? What will happen when I try to instantiate it?

var instance = new DerivedClass();

A DivideByZeroException will be thrown. Why?

Well, the reason is in the order of initialization for derived classes:

  • First, instance fields are initialized in the order from the most derived to the base class.
  • Then, constructors are called in the order from the base class to the most derived class.

Since the instance is treated as DerivedClass throughout the initialization process, our call to VirtualMethod in the BaseClass constructor invokes the DerivedClass implementation of the method before the DerivedClass constructor has had a chance to initialize the divisor field. This means that the value was still 0, which caused the DivideByZeroException.

In our case, the problem could be fixed by initializing the divisor field directly instead of in the constructor.

However, the example showcases why it can be dangerous to invoke virtual methods from a constructor. When they are invoked, the constructor of the class they are defined in might not have been called yet, therefore they could behave unexpectedly.
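The fix with a field initializer can be sketched as follows. The class names SafeBase and SafeDerived and the Result property are hypothetical, and the values are chosen so that the effect of the divisor is visible:

```csharp
using System;

public class SafeBase
{
    public int Result { get; }

    public SafeBase()
    {
        // Still a virtual call from a constructor; kept only to mirror the example.
        Result = VirtualMethod(10);
    }

    public virtual int VirtualMethod(int dividend)
    {
        return dividend / 1;
    }
}

public class SafeDerived : SafeBase
{
    // Field initializers run before the body of any base class constructor.
    private readonly int divisor = 2;

    public override int VirtualMethod(int dividend)
    {
        return base.VirtualMethod(dividend / divisor);
    }
}

public static class DerivedInitDemo
{
    public static void Main()
    {
        Console.WriteLine(new SafeDerived().Result);  // 5
    }
}
```

This only works because the initial value does not depend on constructor arguments. When it does, the safer option is to not call the virtual method from the base constructor at all.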

Polymorphism

Polymorphism is the ability of different classes to implement the same interface in different ways.

Still, we usually expect a single instance to always use the same implementation of a method, no matter to which type it is cast. This makes it possible to have a collection typed as a base class and invoke a particular method on all instances in the collection, resulting in the specific implementation for each type to be called.

Having said that, can you think of a way to have a different method be called when we upcast the instance before calling the method, i.e. to break polymorphic behavior?

var instance = new DerivedClass();
var result = instance.Method(); // -> Method in DerivedClass
result = ((BaseClass)instance).Method(); // -> Method in BaseClass

The correct answer is: by using the new modifier.

public class BaseClass
{
    public virtual string Method()
    {
        return "Method in BaseClass";
    }
}
 
public class DerivedClass : BaseClass
{
    public new string Method()
    {
        return "Method in DerivedClass";
    }
}

This hides BaseClass.Method in DerivedClass instead of overriding it, therefore BaseClass.Method is called when the instance is cast to the base class.
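The difference between override and new is most visible in a collection typed as the base class. The Animal, Dog and Cat classes in this sketch are hypothetical and serve only as an illustration:

```csharp
using System;

public class Animal
{
    public virtual string Speak() { return "Animal"; }
}

public class Dog : Animal
{
    // Replaces the base implementation for every reference to the instance.
    public override string Speak() { return "Dog"; }
}

public class Cat : Animal
{
    // Hides the base implementation only for references typed as Cat.
    public new string Speak() { return "Cat"; }
}

public static class HidingDemo
{
    public static void Main()
    {
        Animal[] animals = { new Dog(), new Cat() };
        foreach (var animal in animals)
        {
            Console.WriteLine(animal.Speak());   // Dog, then Animal
        }
        Console.WriteLine(new Cat().Speak());    // Cat
    }
}
```

Code that works with the base type never reaches the hidden method, which is why the new modifier is rarely a good substitute for override.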

This works for base classes, which can have their own method implementations. Can you think of a way to achieve the same for an interface, which (before default interface methods were introduced in C# 8.0) cannot contain its own method implementation?

var instance = new DerivedClass();
var result = instance.Method(); // -> Method in DerivedClass
result = ((IInterface)instance).Method(); // -> Method belonging to IInterface

It’s explicit interface implementation.

public interface IInterface
{
    string Method();
}
 
public class DerivedClass : IInterface
{
    public string Method()
    {
        return "Method in DerivedClass";
    }
 
    string IInterface.Method()
    {
        return "Method belonging to IInterface";
    }
}

It’s typically used to hide the interface methods from the consumers of the class implementing it, unless they cast the instance to that interface. But it works just as well if we want to have two different implementations of a method inside a single class. It’s difficult to think of a good reason for doing it, though.

Iterators

Iterators are the construct used for stepping through a collection of items, typically using a foreach statement. They are represented by the IEnumerable<T> generic type.

Although they are very easy to use, thanks to some compiler magic, we can quickly fall into a trap of incorrect usage if we don’t understand the inner workings well enough.

Let’s look at such an example. We will call a method, which returns an IEnumerable from inside a using block:

private IEnumerable<int> GetEnumerable(StringBuilder log)
{
    using (var context = new Context(log))
    {
        return Enumerable.Range(1, 5);
    }
}

The Context class of course implements IDisposable. It writes a message to the log to indicate when its scope is entered and exited. In real world code, this context could be replaced by a database connection. Inside it, rows would be read from the returned result set in a streaming manner.

public class Context : IDisposable
{
    private readonly StringBuilder log;
 
    public Context(StringBuilder log)
    {
        this.log = log;
        this.log.AppendLine("Context created");
    }
 
    public void Dispose()
    {
        this.log.AppendLine("Context disposed");
    }
}

To consume the GetEnumerable return value, we iterate through it with a foreach loop:

var log = new StringBuilder();
foreach (var number in GetEnumerable(log))
{
    log.AppendLine($"{number}");
}

What will be the contents of log after the code executes? Will the returned values be listed between the context creation and disposal?

No, they won’t:

Context created
Context disposed
1
2
3
4
5

This means that in our real world database example, the code would fail – the connection would be closed before the values could be read from the database.

How can we fix the code so that the context would only be disposed after all values have already been iterated through?

One way to do it is to iterate through the collection inside the GetEnumerable method and yield the values from there:

private IEnumerable<int> GetEnumerable(StringBuilder log)
{
    using (var context = new Context(log))
    {
        foreach (var i in Enumerable.Range(1, 5))
        {
            yield return i;
        }
    }
}

When we now iterate through the returned IEnumerable, the context will only be disposed at the end as expected:

Context created
1
2
3
4
5
Context disposed
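The context is disposed even when the consumer stops iterating early, because foreach disposes the enumerator when the loop is left. The following sketch shows this; LoggingScope is a hypothetical, shortened stand-in for the Context class:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class LoggingScope : IDisposable
{
    private readonly StringBuilder log;

    public LoggingScope(StringBuilder log)
    {
        this.log = log;
        this.log.Append("created;");
    }

    public void Dispose()
    {
        this.log.Append("disposed;");
    }
}

public static class EarlyExitDemo
{
    public static IEnumerable<int> GetNumbers(StringBuilder log)
    {
        using (new LoggingScope(log))
        {
            for (int i = 1; i <= 5; i++)
            {
                yield return i;
            }
        }
    }

    public static string Run()
    {
        var log = new StringBuilder();
        foreach (var number in GetNumbers(log))
        {
            log.Append(number).Append(';');
            if (number == 2)
            {
                break;  // leaving the loop disposes the enumerator
            }
        }
        return log.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Run());  // created;1;2;disposed;
    }
}
```

This guarantee comes from foreach. Code that calls GetEnumerator() and MoveNext() manually has to dispose the enumerator itself.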

In case you’re not familiar with the yield return statement, it is syntactic sugar for creating a state machine, allowing the code in the method using it to be executed incrementally, as the resulting IEnumerable is being iterated through.

This can be explained better with the following method:

private IEnumerable<int> GetCustomEnumerable(StringBuilder log)
{
    log.AppendLine("before 1");
    yield return 1;
    log.AppendLine("before 2");
    yield return 2;
    log.AppendLine("before 3");
    yield return 3;
    log.AppendLine("before 4");
    yield return 4;
    log.AppendLine("before 5");
    yield return 5;
    log.AppendLine("before end");
}

To see how this piece of code behaves, we can use the following code to iterate through it:

var log = new StringBuilder();
log.AppendLine("before enumeration");
foreach (var number in GetCustomEnumerable(log))
{
    log.AppendLine($"{number}");
}
log.AppendLine("after enumeration");

Let’s look at the log contents after the code executes:

before enumeration
before 1
1
before 2
2
before 3
3
before 4
4
before 5
5
before end
after enumeration

We can see that for each value we iterate through, the code between the two yield return statements gets executed.

For the first value, this is the code from the beginning of the method to the first yield return statement. For the second value, it’s the code between the first and the second yield return statements. And so on, until the end of the method.

The code after the last yield return statement is executed when the foreach loop checks for the next value in the IEnumerable after the last iteration of the loop.

It’s also worth noting that this code will get executed every time we iterate through IEnumerable:

var log = new StringBuilder();
var enumerable = GetCustomEnumerable(log);
for (int i = 1; i <= 2; i++)
{
    log.AppendLine($"enumeration #{i}");
    foreach (var number in enumerable)
    {
        log.AppendLine($"{number}");
    }
}

After executing this code, the log will have the following contents:

enumeration #1
before 1
1
before 2
2
before 3
3
before 4
4
before 5
5
before end
enumeration #2
before 1
1
before 2
2
before 3
3
before 4
4
before 5
5
before end

To prevent the code from getting executed every time we iterate through IEnumerable, it’s a good practice to store the results of an IEnumerable into a local collection (e.g. a List) and read it from there if we are planning to use it multiple times:

var log = new StringBuilder();
var enumerable = GetCustomEnumerable(log).ToList();
for (int i = 1; i <= 2; i++)
{
    log.AppendLine($"enumeration #{i}");
    foreach (var number in enumerable)
    {
        log.AppendLine($"{number}");
    }
}

Now, the code will be executed only once – at the point when we create the list, before iterating through it:

before 1
before 2
before 3
before 4
before 5
before end
enumeration #1
1
2
3
4
5
enumeration #2
1
2
3
4
5

This is particularly important when there are slow I/O operations behind the IEnumerable we are iterating through. Database access is again a typical example of that.
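A related detail of iterator methods is that calling them does not execute any of their code. The body only starts running when the first value is requested, as this sketch with a hypothetical GetNumbers method shows:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class DeferredDemo
{
    public static IEnumerable<int> GetNumbers(StringBuilder log)
    {
        log.Append("started;");
        yield return 1;
        yield return 2;
        log.Append("finished;");
    }

    public static string Run()
    {
        var log = new StringBuilder();
        var numbers = GetNumbers(log);  // nothing in the method body has run yet
        log.Append("created;");
        foreach (var number in numbers)
        {
            log.Append(number).Append(';');
        }
        return log.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Run());  // created;started;1;2;finished;
    }
}
```

One consequence is that argument validation placed at the top of an iterator method is deferred as well, so an invalid argument is only reported once the iteration starts.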

Conclusion

Did you correctly predict the behavior of all the samples in the article?

If not, you might have learned that it can be dangerous to assume behavior when you are not completely certain how a particular feature is implemented. It’s impossible to know and remember every single edge case in a language, therefore it’s a good idea to check the documentation or try it out yourself when you are unsure about an important piece of code which you have encountered.

More important than any of this is to avoid writing code which might surprise other developers (or maybe even you, after a certain amount of time passes). Try to write it differently or pass the default value for that optional parameter (as in our Math.Round example) to make the intention clearer.

If that’s not possible, write the tests in such way that they will clearly document the expected behavior!

