2008
05.29

Fun with Interop

Coding with C# is fun (most of the time) and really takes a lot of the work out of stuff, like memory allocation/deallocation, that was pretty grungy in C++. Unfortunately, sometimes a new .NET component has to talk to a legacy C++ API and that’s where things can get interesting.
I want to look at a hypothetical example where we are creating a managed wrapper around an old-style extern “C” exported interface.
This example uses P/Invoke (Platform Invoke) to call from C# into the C++ interface. Passing primitive types is easy, since there is a pretty straightforward correlation between the managed and unmanaged types. But what if we want to pass a wchar_t* (a pointer to a wide character, which is a 16-bit UTF-16 code unit on Windows) to an extern “C” exported function as an embedded structure member? How could we do that from C#?

Assume the C structure looks like this:

typedef struct
{
  const wchar_t *Address;
  //other stuff
}MyStruct;

and the exported C function looks like this:

void foo(MyStruct* mystruct);

Assume that the C++ implementation is just going to examine the contents of the Address wchar_t* in the structure; it is not going to change it in any way.
Calling that from C++ is easy since it knows about pointers. Safe C#, however, does not deal in raw pointers, just references to objects, so it turns out to be a bit of a pain to call from C#.
What I did was define a C# structure that mimics the “C” one:

[StructLayout(LayoutKind.Sequential)]
internal unsafe struct _MyStruct
{
  //IntPtr acts as placeholder for wchar_t*
  public IntPtr Address;

  //Managed memory handle
  private GCHandle hAddress;
}

What we are doing here is wrapping a raw GCHandle memory handle and keeping the address of the bytes ourselves in the IntPtr. Yuck. Keep reading, it gets better.

Let’s add to the struct:

[StructLayout(LayoutKind.Sequential)]
internal unsafe struct _MyStruct
{
  //IntPtr acts as placeholder for wchar_t*
  public IntPtr Address;

  //Managed memory handle
  private GCHandle hAddress;

  internal _MyStruct(string str)  //c'tor
  {
    UnicodeEncoding ue = new UnicodeEncoding();
    byte[] arr = ue.GetBytes(str);

    //extra 2 for null terminator char on end that "C" expects
    Array arr0 = Array.CreateInstance(typeof(byte), arr.Length + 2);
    Array.Copy(arr, arr0, arr.Length);

    hAddress = GCHandle.Alloc(arr0, GCHandleType.Pinned);
    Address = hAddress.AddrOfPinnedObject();
  }
}

We get the raw bytes of the incoming string, copy them into a byte array that is 2 bytes longer (for the null char that “C” expects), then GCHandle.Alloc a “Pinned” handle to that array and remember it. Since “C” expects the address of a “string” that is not going to move around, we have to “Pin” the memory so the garbage collector leaves it in place, then get its address with GCHandle.AddrOfPinnedObject and remember that too.
Yuck. Keep reading, it gets better.
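To see the terminator and the pinning at work without the real DLL, here is a minimal sketch. PinDemo and RoundTrip are names made up for illustration; the code reads the pinned buffer back through its raw address, roughly the way the “C” side would.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class PinDemo
{
    // Hypothetical helper: pins a null-terminated UTF-16 copy of 'text',
    // reads it back through the raw address, then releases the pin.
    public static string RoundTrip(string text)
    {
        byte[] raw = Encoding.Unicode.GetBytes(text);   // UTF-16LE bytes, no terminator
        byte[] buffer = new byte[raw.Length + 2];       // zero-filled, so the last 2 bytes are the null char
        Array.Copy(raw, buffer, raw.Length);

        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject();
            // PtrToStringUni reads UTF-16 characters up to the first null character
            return Marshal.PtrToStringUni(address);
        }
        finally
        {
            handle.Free();
        }
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip("hello"));
    }
}
```

Without the 2 extra bytes, the read would run past the end of the array looking for a terminator, which is exactly what the “C” side would do too.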

Let’s add to the struct again:

[StructLayout(LayoutKind.Sequential)]
public unsafe struct _MyStruct : IDisposable
{
  //IntPtr acts as placeholder for wchar_t*
  public IntPtr Address;

  //Managed memory handle
  private GCHandle hAddress;

  internal _MyStruct(string str)  //c'tor
  {
    UnicodeEncoding ue = new UnicodeEncoding();
    byte[] arr = ue.GetBytes(str);

    //extra 2 for null terminator char on end that "C" expects
    Array arr0 = Array.CreateInstance(typeof(byte), arr.Length + 2);
    Array.Copy(arr, arr0, arr.Length);

    hAddress = GCHandle.Alloc(arr0, GCHandleType.Pinned);
    Address = hAddress.AddrOfPinnedObject();
  }

  public void Dispose()
  {
    if (hAddress.IsAllocated)
      hAddress.Free();
  }

}

Here, we are implementing IDisposable so we can free the memory handle manually in our Dispose method. Isn’t managed code supposed to free us from this kind of grunge? Guess not.

So, to call this puppy, we could do something like this:

[DllImport("MyDLL.dll")]
public static extern void foo(ref _MyStruct mystruct);
...

string strHello = "hello";
_MyStruct mystruct = new _MyStruct(strHello);
try
{
  //The "C" function gets "hello" in the struct Address member
  foo(ref mystruct);
}
finally
{
  mystruct.Dispose();
}

Sadly, we CANNOT do the following elegant call:

//Won't compile ***
using(_MyStruct mystruct = new _MyStruct(strHello))
{
  foo(ref mystruct);
}

Why won’t it compile? Because, as the compiler complains, you “Cannot pass ‘mystruct’ as a ref or out argument because it is a ‘using variable’”. A variable declared in a using statement is read-only, so it cannot be handed out by ref. Hmmm… Oh well. Another thing we live with.
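One possible way around this, as a sketch only: let a small class own the pinned buffer, so the using statement manages the class while the struct passed by ref stays an ordinary local. PinnedWideString, NativeMyStruct and FakeFoo are hypothetical names, and FakeFoo is a managed stand-in for the real foo so the sketch runs without MyDLL.dll.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical owner of the pinned, null-terminated UTF-16 buffer.
// It is a class, so a 'using' statement can manage its lifetime.
sealed class PinnedWideString : IDisposable
{
    private GCHandle handle;

    public PinnedWideString(string text)
    {
        // Appending "\0" yields the two zero bytes that "C" expects
        byte[] buffer = Encoding.Unicode.GetBytes(text + "\0");
        handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
    }

    public IntPtr Address
    {
        get { return handle.AddrOfPinnedObject(); }
    }

    public void Dispose()
    {
        if (handle.IsAllocated)
            handle.Free();
    }
}

// Hypothetical mirror of the "C" struct: just the pointer member
[StructLayout(LayoutKind.Sequential)]
struct NativeMyStruct
{
    public IntPtr Address;
}

static class UsingDemo
{
    // Stand-in for the real foo; reads the string the way native code would
    static string FakeFoo(ref NativeMyStruct s)
    {
        return Marshal.PtrToStringUni(s.Address);
    }

    public static string Call(string text)
    {
        using (PinnedWideString pinned = new PinnedWideString(text))
        {
            // An ordinary local, not a using variable, so ref is allowed
            NativeMyStruct s = new NativeMyStruct();
            s.Address = pinned.Address;
            return FakeFoo(ref s);
        }
    }

    static void Main()
    {
        Console.WriteLine(Call("hello"));
    }
}
```

A side effect of this split is that the struct handed to “C” contains only what the “C” struct declares, with the GCHandle kept out of it.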

One caveat here: the _MyStruct structure has to mirror the “C” struct, so if the “C” struct has “const wchar_t *Address” as its first member, _MyStruct has to have “public IntPtr Address” as its first member too. If there were a second wchar_t* in the “C” struct, _MyStruct would have to have its second “public IntPtr” come immediately after the first and BEFORE any of the GCHandles. The reason is that the memory layouts have to match if you are going to pass a _MyStruct with IntPtr members to an unmanaged function that expects a struct* with wchar_t* members. Make sense?
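Oh, and don’t try to make _MyStruct a class. Bad things will happen: a class passed by ref arrives on the “C” side as a pointer to a pointer, which is not what foo expects.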
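The layout caveat can be checked from managed code with Marshal.OffsetOf. The sketch below assumes a hypothetical two-string variant of the struct; TwoStringStruct and its City member are invented here and are not part of the real API.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical two-string variant of _MyStruct; 'City' is an invented second member
[StructLayout(LayoutKind.Sequential)]
struct TwoStringStruct
{
    public IntPtr Address;     // mirrors the first wchar_t*
    public IntPtr City;        // mirrors a hypothetical second wchar_t*
    public GCHandle hAddress;  // handles come after every pointer "C" knows about
    public GCHandle hCity;
}

static class LayoutDemo
{
    // Byte offset of a field in the unmanaged layout of TwoStringStruct
    public static int OffsetOf(string fieldName)
    {
        return Marshal.OffsetOf(typeof(TwoStringStruct), fieldName).ToInt32();
    }

    static void Main()
    {
        Console.WriteLine("Address  at offset {0}", OffsetOf("Address"));
        Console.WriteLine("City     at offset {0}", OffsetOf("City"));
        Console.WriteLine("hAddress at offset {0}", OffsetOf("hAddress"));
        Console.WriteLine("pointer size is {0}", IntPtr.Size);
    }
}
```

The two pointers should land at offset 0 and at one pointer size, matching where a “C” struct with two wchar_t* members keeps them; the handles sit past the part that “C” looks at.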

Got a better solution? Lemme hear it.
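One candidate, sketched here and not tried against the real DLL: since the “C” side only reads the string, the interop marshaler may be able to do the copying and freeing itself if the member is declared as a string with MarshalAs(UnmanagedType.LPWStr). MyStructAuto is a hypothetical name. The sketch uses Marshal.StructureToPtr to show what the unmanaged side would see, instead of calling foo.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical alternative to _MyStruct: the marshaler converts the string
// to a null-terminated wchar_t* when the struct is marshaled
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
struct MyStructAuto
{
    [MarshalAs(UnmanagedType.LPWStr)]
    public string Address;
}

static class AutoMarshalDemo
{
    // Marshals the struct into unmanaged memory and reads the string back
    // through the embedded pointer, the way native code would
    public static string RoundTrip(string text)
    {
        MyStructAuto s = new MyStructAuto();
        s.Address = text;

        IntPtr mem = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(MyStructAuto)));
        try
        {
            // false: the target memory holds no earlier struct to clean up
            Marshal.StructureToPtr(s, mem, false);
            try
            {
                IntPtr address = Marshal.ReadIntPtr(mem);   // first member: the wchar_t*
                return Marshal.PtrToStringUni(address);
            }
            finally
            {
                // Frees the native string that StructureToPtr allocated
                Marshal.DestroyStructure(mem, typeof(MyStructAuto));
            }
        }
        finally
        {
            Marshal.FreeHGlobal(mem);
        }
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip("hello"));
    }
}
```

With a declaration like foo(ref MyStructAuto mystruct), the expectation is that the native copy of the string lives only for the duration of the call, so this would only suit the read-only case described above. If the “C” code kept the pointer around after foo returned, the pinned-handle approach (or some other explicit allocation) would still be needed.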

midniteblogger