New features in C# 4.0

August 3rd, 2011

With this post I continue my series of articles dedicated to the new features added to C# in each of its versions. As in the previous cases, I will try not only to provide an overview of the features, but also to bring you up to speed with them.

Dynamic types

Every new version of C# has had its main attraction. C# 2.0 had generics, C# 3.0 had LINQ and C# 4.0 has dynamic types, allowing it to combine the strength of strongly typed languages with the flexibility of dynamic ones. Personally, I believe that with this new addition, C# has become one of the most evolved languages of today. However, a significant number of people might disagree with me, considering that dynamic types will open the gates of hell in C#, allowing programmers, especially inexperienced ones, to create code that is hard to read, debug or maintain.

In this section I will try to give a short overview of what dynamic types are, when they should be used and when they should be avoided, so that you can benefit from their flexibility and at the same time avoid their pitfalls.

Syntactically, dynamic types are expressed using the dynamic keyword:

dynamic myDynamicVar = "Hello world";
Console.WriteLine(myDynamicVar.GetType().ToString());

The dynamic keyword can be used in a multitude of places: for local variables, for method parameters, for method return values, for instance or static fields and even as a generic type argument. The code below illustrates valid uses of the dynamic keyword:

class MyClass
{
    private dynamic _instanceDynObj;
    private static dynamic staticDynObj;

    public dynamic DoOperation(dynamic dynParam1, dynamic dynParam2)
    {
        dynamic localDyn;
        List<dynamic> colOfDynObjects;

        return "Hello world";
    }
}

The dynamic keyword denotes an actual static type, but operations on variables of this type are not checked by the compiler. Instead they are resolved at runtime, with the help of a new component in the .Net framework called the Dynamic Language Runtime, or DLR. The DLR sits on top of the Common Language Runtime (CLR) and its responsibility is to identify the type of an object at runtime, along with all the available methods, properties and fields. The following code sample illustrates that:

// initialize dynamic object with a string
dynamic dynObj = "Hello world";

// read the Length property on the dynamic object
Console.WriteLine(dynObj.Length);
// call the Jump() method on the dynamic object.
// The code will compile just fine, but since the dynamic object is actually a string
// and string does not contain a Jump() method, a RuntimeBinderException will be thrown at runtime
dynObj.Jump();

Conversions to and from the dynamic type are implicit. It is important to note that they will only be evaluated at runtime and each invalid conversion will be translated into a runtime exception:

dynamic dynObj = "Hello world"; // implicit conversion from string to dynamic

string str = dynObj; // implicit conversion from dynamic to string
Console.WriteLine(str.Length);

int myInt = dynObj; // implicit conversion allowed by the compiler, but it will cause a runtime exception
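
Both failure modes described above (a missing member and an invalid conversion) can be observed with a small console program. This is only a minimal sketch; the class and method names (DynamicFailuresDemo, CanJump, TryToInt) are made up for illustration and are not part of the original samples. In both cases the exception raised is Microsoft.CSharp.RuntimeBinder.RuntimeBinderException:

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicFailuresDemo
{
    // Hypothetical helper: returns true if Jump() can be bound and called at runtime.
    public static bool CanJump(dynamic value)
    {
        try
        {
            value.Jump();
            return true;
        }
        catch (RuntimeBinderException)
        {
            // The runtime binder found no Jump() member on the actual type.
            return false;
        }
    }

    // Hypothetical helper: returns the value as int, or null if the
    // implicit conversion from the runtime type to int does not exist.
    public static int? TryToInt(dynamic value)
    {
        try
        {
            int converted = value; // checked only at runtime
            return converted;
        }
        catch (RuntimeBinderException)
        {
            return null;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CanJump("Hello world"));
        Console.WriteLine(TryToInt("Hello world").HasValue);
        Console.WriteLine(TryToInt(42).HasValue);
    }
}
```

Catching the exception like this is useful for exploring the behavior, but in real code it is usually better to avoid the invalid call than to recover from it.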

One of the key aspects of dynamic programming is the ability to add or remove members to an object at runtime. As mentioned a bit earlier, in .Net, dynamic types are in fact static types which are evaluated at runtime. Because of that we cannot say C# is fully dynamic.

In order to emulate the dynamic functionality of adding members to objects at runtime, .Net comes with the IDynamicMetaObjectProvider interface. Classes implementing it can have fully dynamic behavior.

A useful class from version 4 of the .Net framework is ExpandoObject. It is a built-in implementation of the IDynamicMetaObjectProvider interface and it allows dynamically adding members at runtime, like in the following example:

dynamic expando = new ExpandoObject();
// the following two members are dynamically added to the above created instance
expando.SalutationShort = "Hello";
expando.SalutationFull = "Hello world!";

Console.WriteLine(expando.SalutationFull);
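
ExpandoObject also implements IDictionary<string, object>, which is what makes it possible to enumerate and remove the members added at runtime. The sketch below assumes the same two members as the sample above; the class and method names (ExpandoMembersDemo, CreateSalutations) are invented for the illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

public static class ExpandoMembersDemo
{
    // Builds the same object as in the sample above.
    public static ExpandoObject CreateSalutations()
    {
        dynamic expando = new ExpandoObject();
        expando.SalutationShort = "Hello";
        expando.SalutationFull = "Hello world!";
        return expando;
    }

    public static void Main()
    {
        ExpandoObject expando = CreateSalutations();

        // The dictionary view exposes the dynamically added members.
        var members = (IDictionary<string, object>)expando;
        foreach (KeyValuePair<string, object> member in members)
        {
            Console.WriteLine(member.Key + " = " + member.Value);
        }

        // Members can be removed at runtime as well.
        members.Remove("SalutationShort");
        Console.WriteLine(members.ContainsKey("SalutationShort"));
    }
}
```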

One common confusion is between dynamic types and implicitly typed variables, expressed with the var keyword (you can find more info on implicitly typed variables here). At first glance they both allow assigning an object to a variable without declaring its type, but they are actually very different underneath.

Variables declared with the var keyword are actually strongly typed; it is just that their type is inferred by the compiler from the initializing expression instead of being written out in the declaration. The compiler therefore knows the actual type and will report compile-time errors in case of misuse. Visual Studio IntelliSense will also work properly. For example:

var myVar = "Hello world!";

Console.WriteLine(myVar.Length);
// the code below will fail to compile, since myVar is a string and string does not have a Jump() method
myVar.Jump();

Because dynamically typed variables are resolved at runtime, their usage involves a performance overhead. In fact, underneath, calling a method on a dynamic variable is closer to calling a method using reflection than to a direct call. However, unlike with reflection, after the first use the result of the binding is cached by the runtime, so that each subsequent call is as fast as possible. As a result, dynamic method invocation is on average 10-20 times faster than a method invoked using reflection.

Even if a lot faster than reflection, a call on a dynamic object will still be about 10 times slower than a direct method call. It is also worth mentioning that the first call on a dynamic type will be considerably slower than that. Another aspect that might be important to applications that use a lot of dynamic type instances at once is that they take considerably more memory than normal variables, because of the caching performed after the first type interpretation.
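
Figures like the ones above depend heavily on the runtime version and on the call being measured, so they are best checked on your own machine. The following is a rough sketch of such a measurement, not a rigorous benchmark; the class name, the iteration count and the choice of string.IndexOf as the measured call are all arbitrary assumptions:

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class CallOverheadDemo
{
    private const int Iterations = 1000000;

    // Runs the given call repeatedly and returns the elapsed milliseconds.
    public static long Measure(Func<int> call)
    {
        call(); // warm-up, so the slow first dynamic bind is not timed

        int sink = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            sink += call();
        }
        watch.Stop();

        GC.KeepAlive(sink); // keep the result observable
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        string text = "Hello world";
        dynamic dynText = text;
        MethodInfo indexOf = typeof(string).GetMethod("IndexOf", new[] { typeof(char) });
        object[] args = { 'w' };

        Console.WriteLine("direct:     {0} ms", Measure(() => text.IndexOf('w')));
        Console.WriteLine("dynamic:    {0} ms", Measure(() => dynText.IndexOf('w')));
        Console.WriteLine("reflection: {0} ms", Measure(() => (int)indexOf.Invoke(text, args)));
    }
}
```

All three variants go through the same delegate, so that overhead is shared between them; only the relative differences are meaningful.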

However, performance is probably not the biggest pitfall of dynamic programming. The bigger one is that a dynamically typed variable doesn’t have an enforced interface (which can also be read as contract) to code against. Every time you call a method or property on a dynamic type you have to hope that it actually exists; you will never know for sure. Also, the compiler is no longer able to help with compile-time errors (like those from bad spelling) and IntelliSense will not work either.

So there are quite a few downsides to dynamic types. However, there are also situations where they prove useful. The main scenario for the .Net team was improving interoperability with COM libraries and with dynamic languages, like IronPython.

Another use is when interacting with data with a variable structure from an external data source. For example a web service, returning JSON, might return a different response structure depending on the context. Furthermore, in order to optimize the network traffic, the service might allow the client to select fields that it needs and only return those. Consider the following REST web service call for example:

// the following API call will only get the firstName, 
// lastName and age fields for the returned employee object
string url = "http://api.example.com/users?userId=xxxx&fields=firstName,lastName,age";

Mapping a JSON or XML document to a dynamic object can be a very easy and elegant way to access its content. .Net does not come with support for that by default; however, third party libraries are available, like JsonFx and DynamicJson.
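
To give an idea of what such a mapping looks like, here is a minimal sketch that does not use any of the libraries mentioned above. It assumes the response has already been parsed into name/value pairs (the parsing itself is out of scope) and copies them into an ExpandoObject. The ToDynamic helper and the sample values are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

public static class DynamicMappingDemo
{
    // Hypothetical helper: copies name/value pairs into an ExpandoObject
    // so that they can be accessed as members of a dynamic object.
    public static dynamic ToDynamic(IDictionary<string, object> fields)
    {
        IDictionary<string, object> target = new ExpandoObject();
        foreach (KeyValuePair<string, object> field in fields)
        {
            target[field.Key] = field.Value;
        }
        return target;
    }

    public static void Main()
    {
        // Stand-in for an already parsed response containing only the requested fields.
        var parsed = new Dictionary<string, object>
        {
            { "firstName", "John" },
            { "lastName", "Doe" },
            { "age", 35 }
        };

        dynamic employee = ToDynamic(parsed);
        Console.WriteLine(employee.firstName + " " + employee.lastName + ", " + employee.age);
    }
}
```

Accessing a field that the service did not return would throw a RuntimeBinderException, so the caller still has to know which fields were requested.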

Sometimes, in some sections of an application, when accessing database data, you only need a few columns from a table. Selecting all columns would make the query inefficient. LINQ allows selecting only the needed ones by means of anonymous types and implicitly typed local variables. The problem with this approach is that the result of the query can only be used in a strongly typed way inside a single method. With the use of dynamic types, this can be solved as in the following code:

public static List<dynamic> GetEmployees()
{
    List<Employee> source = GenerateEmployeeCollection();
    var queryResult = from employee in source
                      where employee.Age > 20
                      select new { employee.FirstName, employee.Age };

    return queryResult.ToList<dynamic>();
}
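
For completeness, here is a self-contained sketch of how the result of such a method could be consumed. The Employee class and the sample data are assumptions, since the article does not show GenerateEmployeeCollection():

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

public static class EmployeeQueryDemo
{
    // Assumed sample data, standing in for a database query.
    private static List<Employee> GenerateEmployeeCollection()
    {
        return new List<Employee>
        {
            new Employee { FirstName = "Ann", LastName = "Lee", Age = 34 },
            new Employee { FirstName = "Bob", LastName = "Ray", Age = 19 },
            new Employee { FirstName = "Cid", LastName = "Fox", Age = 52 }
        };
    }

    public static List<dynamic> GetEmployees()
    {
        var queryResult = from employee in GenerateEmployeeCollection()
                          where employee.Age > 20
                          select new { employee.FirstName, employee.Age };

        return queryResult.ToList<dynamic>();
    }

    public static void Main()
    {
        // The anonymous type's properties are resolved at runtime.
        foreach (dynamic employee in GetEmployees())
        {
            Console.WriteLine(employee.FirstName + " (" + employee.Age + ")");
        }
    }
}
```

One caveat: anonymous types are internal to the assembly that declares them, so reading their properties through dynamic only works from code in the same assembly. From another assembly the same member access fails at runtime with a RuntimeBinderException.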

Another use of dynamic types is when creating a REST service that returns dynamic data, as in the example provided a bit earlier. WCF is the recommended way of creating web services in .Net and it works by mapping the service methods to actual methods inside the code. So in order to have a dynamic response from a WCF service, the method representing the service call would have to return a dynamic object. The problem is that by default this is not possible, because dynamic objects are not serializable. Maybe this is something that will be added to the .Net framework in the future. Until then, a class like the following can be used to represent the dynamic response from the service:

[Serializable]
public class SerializableDynamicObject : DynamicObject, ISerializable
{
    private Dictionary<string, object> props = new Dictionary<string, object>();

    public SerializableDynamicObject()
    { }

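    /// <summary>
    /// Deserialization constructor expected for ISerializable types;
    /// restores the members written by GetObjectData.
    /// </summary>
    protected SerializableDynamicObject(SerializationInfo info, StreamingContext context)
    {
        foreach (SerializationEntry entry in info)
            props[entry.Name] = entry.Value;
    }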

    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        return props.TryGetValue(binder.Name, out result);
    }

    public override bool TrySetMember(SetMemberBinder binder, object value)
    {
        if (props.ContainsKey(binder.Name))
            props[binder.Name] = value;
        else
            props.Add(binder.Name, value);
        return true;
    }


    public void GetObjectData(System.Runtime.Serialization.SerializationInfo x, System.Runtime.Serialization.StreamingContext y)
    {
        foreach (string key in props.Keys)
        {
            if (props[key] != null)
            {
                x.AddValue(key, props[key], props[key].GetType());
            }
        }
    }
}

In the end, I would like to add a few words of caution to the use of dynamic inside C#. While it provides the means to solve some problems in an elegant way, it also presents the very important downsides described previously in this section. Here are a few items to take into account:

  • never ever use dynamic (or var, for the same reason) when you know the type of the variable you are going to work with. This is not the place to be lazy. Declare variables with their actual type. It will make the code much easier to read and maintain.
  • note the difference between var and dynamic and prefer var over dynamic where possible. Variables declared with var are typed at compile time; they offer type safety, IntelliSense and better performance.
  • always consider whether your problem can also be solved through polymorphism, generics or OOP patterns. Any of those is much more desirable than dynamic types.
  • in general, before using dynamic, search for alternatives. Dynamic should be the last resort. It is important to note that in real life situations, the cases where dynamic typing is the solution are quite limited. If you find yourself making a lot of use of this feature, you might be on the wrong path.

Covariance and contravariance

I first mentioned covariance when I talked about generics, in New things in C# 2.0. Back then generic covariance was not supported, but it was anticipated for a future version of C#. Now, in version 4.0, it has finally been added but, as I will detail, with some limitations.

In a nutshell, we have covariance when constructs that depend on certain types can be cast in the same direction as dictated by the types on which they depend.

I am first going to explain the "cast in the same direction as the types on which they depend" part. In OOP, casting is dictated by the order in the inheritance chain and it is always possible from a derived type to an ancestor. For example, if we have a String instance, we can always assign it to an Object variable, since Object is an ancestor of String, or in other words a String is always an Object:

String s = "Hello world";
Object o = s; // this works just fine

Assigning the instance of a class to an ancestor type instance is called upcasting. Assigning the instance of a class to a descendant type is called downcasting and in OOP it is not allowed as an implicit cast operation:

Object o = new Object();
String s = o; // will not compile. An Object is not necessarily a String

In conclusion, in OOP, the casting direction as dictated by the inheritance chain is from descendant to ancestor. A covariant construct will allow casting in the same direction. For example, in C#, arrays have always been covariant. Because of that, the following is possible:

string[] arrStr = new string[] {"one", "two", "three"};
object[] arrObj = arrStr;
arrObj[1] = "Hello";

Starting with C# 4.0, covariance is now also available at generics level. That enables the following code to be valid:

IEnumerable<string> strings = new String[] { "one", "two", "three" };
IEnumerable<object> objects = strings;

foreach (object crtObject in objects)
{
    Console.WriteLine(crtObject);
}

By carefully analyzing the above example we can discover a potential pitfall with generics covariance. Let’s consider the following piece of code:

List<string> colStrings = new List<string>() { "one", "two", "three" };
List<object> colObjects = colStrings;
colObjects[1] = 1;

The above code will not compile, but it illustrates the problem. Suppose the compiler did allow casting colStrings to List<object> by assigning it to colObjects. The colObjects variable would still reference a list of strings underneath. When trying to assign 1, which is an integer, to an element of colObjects, the compiler would be OK with that, because an integer is an object and it should be possible to add integers to a list of objects. However, at runtime the application would break, because, as said above, while declared as List<object>, colObjects would still be a List<string> underneath, and assigning an int to an element of a string collection would raise an exception.
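
When a List<string> really has to be handed to code that expects objects, there are two safe options that do compile: a read-only view through IEnumerable<object>, or an independent copy. The sketch below illustrates both; the class and method names (ListVarianceDemo, CopyAsObjects) are made up for the example:

```csharp
using System;
using System.Collections.Generic;

public static class ListVarianceDemo
{
    // Creates a new, independent List<object>; the source list is not affected
    // by later changes to the copy.
    public static List<object> CopyAsObjects(List<string> source)
    {
        return new List<object>(source);
    }

    public static void Main()
    {
        var colStrings = new List<string> { "one", "two", "three" };

        // Option 1: a read-only view. Nothing can be written through it,
        // so the list of strings cannot be corrupted.
        IEnumerable<object> readOnlyView = colStrings;
        foreach (object item in readOnlyView)
        {
            Console.WriteLine(item);
        }

        // Option 2: a copy. Storing an int is fine here, because the copy
        // really is a List<object>.
        List<object> copy = CopyAsObjects(colStrings);
        copy[1] = 1;

        Console.WriteLine(colStrings[1]); // the original still holds "two"
        Console.WriteLine(copy[1]);
    }
}
```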

In order to avoid the above pitfall, C# 4.0 introduced what I will call getonly and setonly generics, declared with the out and in modifiers on the generic type parameter. Covariance can only be applied to getonly generics, that is, generics that only allow returning elements of the generic type. In such a construct, the generic type can be used as the return type of methods but not as the type of a method parameter. For example:

public interface IGetOnlyGeneric<out T>
{
    T GetFirstItem(); // this is valid
    void DoAction(Action<T> callback); // also valid: Action<T> is contravariant in T,
                                       // so an instance of T can only flow out of the
                                       // interface, into the callback
    // void SetFirstItem(T myInstance); // this would not compile
}

IEnumerable<T> is an example of a getonly generic interface. Intuitively, setonly generics act only as setters in relation to the generic type, which means that the generic type is only allowed in method parameters, but never as a return value:

public interface ISetOnlyGeneric<in T>
{
    void SetFirstItem(T myInstance); // this is valid
    // T GetFirstItem(); // this would not compile
}

Setonly generics are contravariant. Contravariance applies to constructs that can be cast in the opposite direction compared to the types on which they depend. The IComparable<T> interface is an example of a setonly generic interface. The following code illustrates contravariance:

IComparable<Pet> petComparer = GetPetComparer(); // Pet is a base class for pet animals
IComparable<Dog> dogComparer = petComparer;      // Dog derives from Pet
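
The same idea can be shown as a complete program using IComparer<T>, another contravariant interface from the framework. This is a sketch under assumptions: the Pet and Dog classes, their Name and Age properties and the PetAgeComparer class are invented for the illustration.

```csharp
using System;
using System.Collections.Generic;

public class Pet
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public class Dog : Pet { }

// Knows how to compare any two pets, so it can also compare two dogs.
public class PetAgeComparer : IComparer<Pet>
{
    public int Compare(Pet x, Pet y)
    {
        return x.Age.CompareTo(y.Age);
    }
}

public static class ContravarianceDemo
{
    public static List<Dog> SortByAge(List<Dog> dogs)
    {
        IComparer<Pet> petComparer = new PetAgeComparer();
        // Contravariant conversion: from a comparer of the base type
        // to a comparer of the derived type.
        IComparer<Dog> dogComparer = petComparer;

        var sorted = new List<Dog>(dogs);
        sorted.Sort(dogComparer);
        return sorted;
    }

    public static void Main()
    {
        var dogs = new List<Dog>
        {
            new Dog { Name = "Cesar", Age = 7 },
            new Dog { Name = "Rex", Age = 3 }
        };

        foreach (Dog dog in SortByAge(dogs))
        {
            Console.WriteLine(dog.Name + " (" + dog.Age + ")");
        }
    }
}
```

The conversion is safe because the comparer only ever receives instances of T: anything able to handle every Pet can certainly handle a Dog.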

Getonly and setonly generics are better known as variant generics. An important limitation is that only interfaces and delegates can be variant generics. Classes and structures cannot be defined as variant generics, but they can implement variant interfaces. Additional info on variant generics can be found on MSDN here.

In my opinion contravariance is a bit more difficult to comprehend and also probably less useful than covariance. It was probably added to the language for consistency.

A question that might arise here is why variance is so simple for arrays (all arrays of reference types are always covariant) and so complicated for generics. There are two potential answers. Either the C# designers missed some potential problems during the first version of C#, or they didn’t want to complicate things at that stage. Because of that, the following code compiles just fine, but throws an ArrayTypeMismatchException at runtime:

string[] strArray = new string[] { "one", "two", "three" };
object[] objArray = strArray;

objArray[1] = new Dog { Name = "Cesar", Age = 7 };
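
The runtime check can be observed directly with a small program. This is a minimal sketch; the TryStore helper is hypothetical and simply wraps the assignment in a try/catch:

```csharp
using System;

public static class ArrayCovarianceDemo
{
    // Hypothetical helper: attempts the store and reports whether the
    // runtime type check of the array accepted the value.
    public static bool TryStore(object[] target, int index, object value)
    {
        try
        {
            target[index] = value;
            return true;
        }
        catch (ArrayTypeMismatchException)
        {
            return false;
        }
    }

    public static void Main()
    {
        string[] strArray = { "one", "two", "three" };
        object[] objArray = strArray; // allowed: arrays are covariant

        Console.WriteLine(TryStore(objArray, 1, "Hello")); // prints True
        Console.WriteLine(TryStore(objArray, 1, 42));      // prints False
        Console.WriteLine(strArray[1]);                    // prints Hello
    }
}
```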

Optional and named parameters

C# 4.0 introduces optional method parameters for the first time in C#, a feature that has been available for a long time in C++. The syntax for defining default parameters is the same as in C++:

void WriteToFile(string text, string encoding = "UTF8", bool includeBom = true)
{ … }

In the above sample the second parameter, encoding, will have the default value of “UTF8” and the third parameter, includeBom will have a default value of true. This means that the following method call

WriteToFile("Hello world!");

will have the same effect as

WriteToFile("Hello world!", "UTF8", true);

Of course, until now, the same behavior could have been achieved using overloading:

void WriteToFile(string text)
{
  // automatically sets the parameter values
  WriteToFile(text, "UTF8", true);
}

Considering this, optional parameters look more like syntactical sugar, since their functionality could have already been achieved in the previous versions of C# using method overloading. However, C# 4.0 also brings an additional feature, called named parameters, which adds additional flexibility to optional parameters.

In the previous versions of C#, as well as in most other languages, like C/C++ or Java, arguments are passed to methods by position. For example, when calling the following method

Rectangle CreateRectangle(int top, int left, int bottom, int right) { … }

the second parameter will always be the left parameter.

In C# 4.0, parameters can be passed as named arguments and because of that they can be passed in any order:

CreateRectangle(bottom: 5, right: 5, top: 1, left: 1);

Combining named arguments with optional parameters and getting back to the original example, we can make a call like the following:

WriteToFile("Hello world!", includeBom: false);

In the above example, the second parameter, encoding, was excluded and took the default value of “UTF8”, while the last parameter, includeBom, was explicitly set to false, overriding the default value. That would not have been possible in C++, where arguments are passed strictly by position: only trailing arguments can be omitted, so excluding the second parameter would only have been possible by also excluding the third.

The combination between named and optional parameters gives more flexibility than overloading especially for methods with a lot of parameters. Also they can make the code easier to read.
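
The calls discussed in this section can be tried out with the following sketch. Since the body of WriteToFile is not shown in the article, a hypothetical Describe method with the same parameter list is used instead; it just returns a description of the arguments it received:

```csharp
using System;

public static class OptionalParametersDemo
{
    // Hypothetical stand-in for WriteToFile: same parameters and defaults,
    // but it only reports the values it was called with.
    public static string Describe(string text, string encoding = "UTF8", bool includeBom = true)
    {
        return string.Format("{0} [{1}, BOM: {2}]", text, encoding, includeBom);
    }

    public static void Main()
    {
        // Both optional parameters take their default values.
        Console.WriteLine(Describe("Hello world!"));

        // The second parameter is skipped, the third is passed by name.
        Console.WriteLine(Describe("Hello world!", includeBom: false));

        // All arguments named, in an arbitrary order.
        Console.WriteLine(Describe(includeBom: false, encoding: "ASCII", text: "Hello world!"));
    }
}
```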

Optional parameters have a couple of constraints that need to be taken into account.

While you can pass optional parameters in any order, by the use of named arguments, when you define them they can only be placed at the end of the parameter list. I don’t know why this constraint is in place, since the C# compiler would not need it. My guess is that it might be needed for interoperability with other .Net languages that might not support named parameters.

The second constraint is that default values must be compile-time constants, such as literals, constants, enum members, null or default(T). In practice that mostly limits them to primitive types, strings and enums. For example the following code will not compile, failing with a "Default parameter value for ‘encoding’ must be a compile-time constant" error, because Encoding.UTF8 is a static property and not a constant:

void WriteToFile(string text, Encoding encoding = Encoding.UTF8, BomType includeBom = BomType.UseBom)
{ ... }

Please note that this time encoding was defined using the Encoding type instead of string, as it was in the examples from the beginning of this section. For the above scenario, overloading is still the most direct way to define default values for the parameters.
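
An alternative to overloading that is sometimes used is to declare the parameter with a null default, which is a valid compile-time constant for any reference type, and to substitute the real default inside the method. The sketch below shows the idea; the DescribeEncoding method is hypothetical and only reports which encoding ended up being used:

```csharp
using System;
using System.Text;

public static class DefaultEncodingDemo
{
    // null is a compile-time constant, so it is accepted as a default value
    // for a parameter of type Encoding.
    public static string DescribeEncoding(string text, Encoding encoding = null)
    {
        // Substitute the real default at runtime.
        encoding = encoding ?? Encoding.UTF8;
        return text + " -> " + encoding.WebName;
    }

    public static void Main()
    {
        Console.WriteLine(DescribeEncoding("Hello world!"));
        Console.WriteLine(DescribeEncoding("Hello world!", Encoding.ASCII));
    }
}
```

The downside of this approach is that the actual default is no longer visible in the method signature, and that a caller can no longer pass null with any other meaning.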

New COM interop features

Version 4.0 of C# added a few features that make interoperating with COM objects a lot easier. The new COM related features are outside the scope of this article, but a good description of them can be found in this MSDN Magazine article.
