Friday, 17 April 2009

Cross-site XMLHttpRequest in JavaScript

You know how sometimes you put together a simple bit of code and things seem fine, but in the back of your mind you think there might be an issue? So you try another scenario or browser and, sure enough, it doesn't work, and you remember why? Well, that's what happened a few days ago.

I decided to put together a fairly straightforward site that would let users search for jobs across a range of Craigslist sites. My first impulse was to do this in ASP.NET, doing the scraping on the server side. But Craigslist takes a dim view of people scraping their content and repackaging it.

The next solution I explored was a simple bit of JavaScript that would run client-side. The user would fill in a simple form, press Go, and the JavaScript would use XMLHttpRequest to grab search results from Craigslist. It sounded simple enough and it worked in IE7.

I had a nagging feeling that I had been reading something recently about cross-site XMLHttpRequests. Sure enough, in the latest incarnation of Firefox, the code didn't work at all, and in VS2008 in the debugger it threw security exceptions.

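Cross-site XMLHttpRequest is now severely restricted or forbidden outright. Browsers apply the same-origin policy to XMLHttpRequest, so a script served from my site is not allowed to fetch pages from Craigslist.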

So instead I put together a C# WinForms app to do it all for me. It works like a charm. I can now search for telecommuting opportunities across a range of Craigslist sites. The tool is a convenience app that federates user queries. It's a classic case of developers writing the tools that they want for themselves.

I will make the tool available for others shortly. I'll use it for a few days before releasing it. It's always a good idea to let things sit for a day or two.

I spent two hours today bringing the code into conformance with StyleCop. I have mixed feelings about StyleCop. It can be persnickety (Boolean properties with getters and setters have to have XML documentation that starts with the exact words "Gets or sets a value indicating whether...") but a lot of the advice is good. I particularly like the warnings about the correct implementation of IEnumerable<> and IDisposable. By now I have internalized the normative rules concerning the casing of variables, methods etc. and the ordering of the members of a class according to visibility. The added benefit that I derive from using StyleCop is similar to a code review: it forces me to reexamine some pieces of code that in hindsight could do with reworking.

Thursday, 5 March 2009

Version Control

Version control is a vitally important part of software development. But really, who wants to get it all set up for small projects?

Well this is the age of free/cheap hosting. I have a wonderful solution going. I am using Unfuddle.com for free SVN hosting and AnkhSVN as the version control provider for Visual Studio 2008. It's a great combination. AnkhSVN integrates seamlessly into Visual Studio. As an added bonus, Unfuddle provides a defect ticket system.

C# interop

C# interop with unmanaged code is not nearly as difficult in practice as it sounds on paper. Most of the interop I have done has been COM (to/from managed code) but for my current freelance gig I need to call from C# into an ANSI C DLL.

I wrote a few simple ANSI C functions to act as the interface to my managed code, just to save some work. I could call them fine. Crucially, I could pass in ANSI strings and get them back.

Then I started getting an AccessViolationException: "Attempted to read or write protected memory". I was positive that I had annotated the ANSI C code correctly to export the functions. The C# code had the appropriate DllImport decorations. Both the C# side and the ANSI C side agreed that the calling convention was cdecl. Everything looked just fine.

The symptoms were downright peculiar. If I declared a local variable in the ANSI C function that I was calling, the code threw the exception. If I attempted to call another function from inside the ANSI C function, the code threw the exception.

As with so many things, it boiled down to an incorrect setting in the project file, a residual glitch from the fact that I first produced an EXE to exercise the code and then flipped to producing a DLL.

Along the way I learned that I don't need to mess about with IntPtr allocation and freeing. When I have a mutable string (e.g. the result is returned in a char* parameter), I can create a StringBuilder with the appropriate size and pass that from C#. C# marshals it for me.
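To make the StringBuilder pattern concrete, here is a minimal sketch of what the ANSI C side of such a call might look like. The function name get_greeting, its parameters and the EXPORT/CALLCONV macros are made up for illustration; they are not the functions from my project. The point is only that the C function writes into a buffer the caller owns and is told how big that buffer is, which is what lets a C# caller pass a pre-sized StringBuilder.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative macros (assumptions, not from the real project):
   export the symbol and use cdecl when building a Windows DLL. */
#if defined(_WIN32)
#define EXPORT __declspec(dllexport)
#define CALLCONV __cdecl
#else
#define EXPORT
#define CALLCONV
#endif

/* Hypothetical exported function.
   Writes "Hello, <name>" into the caller-supplied buffer 'out'.
   'out_size' is the size of that buffer in bytes, including room
   for the terminating NUL.
   Returns 0 on success, -1 on bad arguments or if the buffer is too small.

   A C# caller could declare it roughly like this (assumed names):

     [DllImport("native.dll", CallingConvention = CallingConvention.Cdecl,
                CharSet = CharSet.Ansi)]
     static extern int get_greeting(string name, StringBuilder output, int outputSize);

     var sb = new StringBuilder(64);
     get_greeting("world", sb, sb.Capacity);
*/
EXPORT int CALLCONV get_greeting(const char *name, char *out, int out_size)
{
    static const char prefix[] = "Hello, ";
    size_t prefix_len = sizeof prefix - 1;
    size_t needed;

    if (name == NULL || out == NULL || out_size <= 0)
        return -1;

    /* prefix + name + terminating NUL */
    needed = prefix_len + strlen(name) + 1;
    if (needed > (size_t)out_size) {
        out[0] = '\0';   /* leave a valid empty string behind */
        return -1;
    }

    memcpy(out, prefix, prefix_len);
    strcpy(out + prefix_len, name);
    return 0;
}
```

Passing the buffer size explicitly is a design choice rather than a requirement of the marshaler, but it keeps the C code from writing past the end of whatever capacity the StringBuilder was created with.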

Sunday, 1 March 2009

Rebuilding machines

This past month has been the month of installing stuff. My wife's hard drive crashed. I managed to resuscitate it enough to pull most of the data off, then dropped a new drive in. I got a new laptop (an insurance replacement for the laptop that got stolen from our house) and had to set that up. Then I ran into serious issues with my primary machine. I am now freelancing, so none of this is billable time of course.

My primary machine is a seven (!) year old Dell. It has served me well. Lately I have had difficulty installing software, especially some Windows Updates and Visual Studio 2008. The errors have never been very helpful but in the event logs I could see recurring errors with the Cryptographic Provider Service not being able to start. The error is simply "error: 193".

It seems that people had big issues with the Cryptographic Service in 2003-2004 associated with some service packs, but none of the solutions from that time (deleting corrupt databases so it would rebuild them, checking your "Trusted Root Certification Authorities") made a bit of difference.

In the end I reinstalled Windows XP Pro, then all my usual applications, then various service packs (SP3 can't be installed without a prior SP; SP1 is not available; SP2 can be installed without anything prior), drivers from Dell etc.

All this reinstalling has blown out our data cap several times over. New Zealand ISPs charge by data throughput. We have a 5 GB per month allocation. This past month we used 14 GB. Of course, downloading a 1.7 GB IDE for iPhone development didn't help.

Sunday, 11 January 2009

Evidence for the AppDomain: The resolution

To recap what has been a continuing saga: I discovered that attempts to get Streams on PackageParts in an OpenXML Package were failing because of an issue in the .NET Framework. Under the hood, a MemoryStream was overflowing to disk. Before allowing that, the framework examines the evidence of the assembly and then examines the evidence of the AppDomain. Our managed COM component was getting created in the DefaultDomain and didn't have the evidence (a SecurityZone of MyComputer) that was needed to satisfy the framework.

As I discussed here, I resolved this by having the COM interface act as a shim. It created a second AppDomain that had the evidence necessary to satisfy the framework.

Further testing did not reveal any more show-stoppers but it was obvious that performance was not great. Large blobs of XML were being marshaled across AppDomain boundaries (from the child AppDomain to the DefaultDomain) and then marshaled again across the COM boundary to the Win32 application.

I briefly considered hosting the CLR so that I could explicitly create the AppDomain of my dreams but decided I'd really rather not.

Fortunately help came in the form of this post. (Can I just say how great it is that everyone with a keyboard in Microsoft seems to be blogging these days. Shawnfa's posts have been a great help.)

To summarize: you don't have to do all the work of hosting the CLR. You can just write (in managed code) an AppDomainManager and tell the CLR to use it by setting two environment variables, APPDOMAIN_MANAGER_ASM and APPDOMAIN_MANAGER_TYPE.

This was almost the complete answer to my problem: when an AppDomain gets created I can add the Evidence I need, namely a SecurityZone of MyComputer. Sure enough, when I set the environment variables and my managed COM component gets called, I see a new instance of my AppDomainManager get created. But the CreateDomain() method never gets called. The CLR is just loading my COM component into the DefaultDomain; it's not making a new AppDomain.

It's almost the solution though. It's apparent that my AppDomainManager is getting instantiated, and that its HostSecurityManager property is being used by the framework to make decisions. All I need is for that property to return my own subclass of HostSecurityManager with the evidence set. In my subclass, I override the method ProvideAppDomainEvidence to provide the Evidence that I need.

Now my COM component works like a charm! Performance has improved too. We are still marshaling large XML blobs across the COM boundary but at least we aren't also marshaling them across the boundary to a second AppDomain.

Link multiple .netmodule files

This past week I was attempting to add evidence to an assembly directly, i.e. to bake it into the DLL, to satisfy some security requirements. The .NET linker (al.exe) has an option (/evidence:) that lets you add a serialized Evidence object as a resource with the magic name Security.Evidence. There is some discussion of producing this resource file here.

A little poking around on the web showed me how to compile to .netmodule files, which seemed analogous to .obj files in the C++ Win32 world. I had to rework the .csproj file produced by Visual Studio so that I could invoke msbuild and produce one .netmodule file for each .cs file, build the resources etc and then link.

Everything appeared to work just fine. The link stage didn't give any errors or warnings. Curiously, the resulting DLL was only about 8 KB in size. The assembly when produced in Visual Studio from the .csproj file is a little more than 1 MB.

This turns out to be "by design". You can indeed compile to a .netmodule and link, but if you have more than one .netmodule file it really is not analogous to the linking of various C++ .obj files to produce a single DLL. Instead, the resulting assembly is a meta-assembly (a multi-file assembly). It contains little more than a manifest with references to the various .netmodule files, which you would have to deploy alongside it.

Luckily all was saved when I realized that I could compile all the files and produce a single assembly directly using csc.exe (the command line C# compiler). I just needed to specify a resource file containing the serialized Evidence and name the resource Security.Evidence. Indeed, it would seem there is little that you can do with the linker (al.exe) that you couldn't just do with csc.exe, a point made here.