MCP Server for Hosting COM Servers

When the Model Context Protocol (MCP) came out, it reminded me of the Component Object Model (COM) from Microsoft.

COM has been around for decades and is used for programming, scripting and sharing functionality at a binary/object level across languages and hosts. Via DCOM all of this can even be done remotely, and, well, it’s also useful for red teaming. In a different era, browsers (Internet Explorer) were able to host COM objects too. A lot of software on Windows was implemented as COM objects, including Microsoft Office.

Here are some of the similarities of the two standards:

  • Interface-based Communication and Abstraction
  • Scripting and Automation
  • Integration with External Tools and Services
  • Local and Remote Server Options

Very similar things could be said about most RPC-based communication systems (CORBA, SOAP, …), so MCP isn’t really that new in principle. And the security issues that come with such RPC communication endpoints are also not new.

Anyhow, we are getting sidetracked. Because of these similarities I thought: why not build an AI agent that can host any COM object? And then I realized… why not build an MCP server that abstracts COM - essentially, an MCP server for COM servers.

Due to the dynamic discovery nature of COM, all that should be needed is the CLSID or the ProgID to launch, interact with and script any COM object.
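As a rough illustration of that idea (not the code from mcp-com-server), the sketch below relies on pywin32’s `win32com.client.Dispatch`, which accepts either a ProgID or a CLSID string and returns a late-bound object. The helper names `looks_like_clsid` and `launch` are made up for this example, and the pywin32 import is deferred so the snippet only needs Windows when `launch` is actually called.

```python
import re

# A CLSID is a GUID in registry format: {8-4-4-4-12 hex digits}.
_CLSID_RE = re.compile(r"^\{[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}$")


def looks_like_clsid(identifier: str) -> bool:
    """Return True for a brace-wrapped GUID, False for anything else (e.g. a ProgID)."""
    return bool(_CLSID_RE.match(identifier.strip()))


def launch(identifier: str):
    """Create a late-bound COM object from a ProgID or CLSID (Windows + pywin32 only)."""
    import win32com.client  # deferred import: pywin32 is only available on Windows

    return win32com.client.Dispatch(identifier)
```

On a Windows machine with Excel installed, `launch("Excel.Application")` would return an automation object whose properties and methods are resolved by name at call time, which is what makes a generic wrapper feasible.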

Welcome the MCP COM Server

To get the prototype off the ground I did some vibe coding using ChatGPT, Claude, Cursor and VS Code, followed by a few hours of debugging and fixing some basic bugs before I had it working.

The mcp-com-server offers these tools:

  • CreateObject (basically CoCreateInstance; in VB it was actually called CreateObject)
  • Get/Set Property
  • InvokeMethod
  • QueryInterface
  • ListAllHostedServers: lists all the currently active servers that are hosted.
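To make the shape of these tools more concrete, here is a minimal sketch of how such a server might keep track of hosted objects. It is not the actual mcp-com-server code: the `ComHost` class, its method names and the `factory` parameter are assumptions for illustration. On Windows the factory would typically be something like `win32com.client.Dispatch` from pywin32; here it is injectable, and a stand-in `FakeVoice` class is used so the logic can be exercised without COM. QueryInterface is left out to keep the sketch short.

```python
import itertools
from typing import Any, Callable, Dict


class ComHost:
    """Keeps hosted objects in a handle table so tools can refer to them by id."""

    def __init__(self, factory: Callable[[str], Any]):
        self._factory = factory              # e.g. win32com.client.Dispatch on Windows
        self._objects: Dict[str, Any] = {}   # handle -> live object
        self._names: Dict[str, str] = {}     # handle -> ProgID/CLSID used to create it
        self._ids = itertools.count(1)

    def create_object(self, identifier: str) -> str:
        """CreateObject: instantiate by ProgID/CLSID and return a handle."""
        handle = f"obj-{next(self._ids)}"
        self._objects[handle] = self._factory(identifier)
        self._names[handle] = identifier
        return handle

    def get_property(self, handle: str, name: str) -> Any:
        return getattr(self._objects[handle], name)

    def set_property(self, handle: str, name: str, value: Any) -> None:
        setattr(self._objects[handle], name, value)

    def invoke_method(self, handle: str, name: str, *args: Any) -> Any:
        return getattr(self._objects[handle], name)(*args)

    def list_all_hosted_servers(self) -> Dict[str, str]:
        """ListAllHostedServers: map each handle to the identifier it came from."""
        return dict(self._names)


class FakeVoice:
    """Stand-in for a real COM object, used only for this example."""
    Rate = 0

    def Speak(self, text: str) -> int:
        return len(text)


host = ComHost(factory=lambda identifier: FakeVoice())
voice = host.create_object("SAPI.SpVoice")
host.set_property(voice, "Rate", 2)
```

Handing out opaque handles instead of objects keeps every tool call a plain string-and-values exchange, which is all an MCP client can send.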

Challenges

The main challenge was creating a workaround for the case where calling Invoke or Get/Set Property on a COM object returns a new COM object. I did not anticipate that corner case until I started testing, and it took time to figure out how to abstract it via MCP. The solution is probably a bit hacky at the moment, but it works. There are likely more such edge cases.
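One way to handle that corner case, sketched here under my own assumptions rather than taken from the server’s code, is to pass plain values through unchanged and swap any returned object for a fresh handle that later tool calls can use. The names `HandleTable`, `marshal_result` and the `object_handle` key are hypothetical.

```python
import itertools
from typing import Any, Dict

# Values that can be sent back to the client as-is (JSON-friendly primitives).
PRIMITIVES = (str, int, float, bool, type(None))


class HandleTable:
    """Maps opaque handle strings to live objects held by the server."""

    def __init__(self) -> None:
        self._objects: Dict[str, Any] = {}
        self._ids = itertools.count(1)

    def register(self, obj: Any) -> str:
        handle = f"obj-{next(self._ids)}"
        self._objects[handle] = obj
        return handle

    def resolve(self, handle: str) -> Any:
        return self._objects[handle]


def marshal_result(value: Any, table: HandleTable) -> Any:
    """Return primitives unchanged; replace anything else with a handle reference."""
    if isinstance(value, PRIMITIVES):
        return value
    if isinstance(value, (list, tuple)):
        return [marshal_result(item, table) for item in value]
    # Anything else is treated as an object the client must address by handle.
    return {"object_handle": table.register(value)}
```

With this in place, a call that returns an object (say, a workbook returned from an application object) yields something like `{"object_handle": "obj-2"}`, and the model can pass that handle to the next Get/Set Property or InvokeMethod call.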

Security

As you can imagine, being able to instantiate any COM server means the MCP server can open a Shell.Application or FileSystemObject and perform dangerous operations. As a basic mitigation there is an allow list for CLSIDs and ProgIDs, and the MCP server will only instantiate allow-listed COM objects. This could be expanded to include specific interfaces/methods as well. Overall, like with a lot of MCP servers, it’s risky business. But it offered a great learning opportunity. For me to learn about new tech, it’s important to actually build something with it - even if it’s just for fun.
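A minimal sketch of such an allow list check, assuming identifiers are compared case-insensitively and that CLSIDs may arrive with or without braces. The entries and the names `ALLOWED`, `normalize`, `is_allowed` and `create_if_allowed` are illustrative, not the server’s actual configuration.

```python
from typing import Any, Callable, Iterable

# Example entries only; a real deployment would choose its own list.
ALLOWED = {
    "Excel.Application",
    "SAPI.SpVoice",
}


def normalize(identifier: str) -> str:
    """Lower-case and strip whitespace/braces so equivalent spellings compare equal."""
    return identifier.strip().strip("{}").lower()


def is_allowed(identifier: str, allowed: Iterable[str] = ALLOWED) -> bool:
    return normalize(identifier) in {normalize(entry) for entry in allowed}


def create_if_allowed(identifier: str, factory: Callable[[str], Any]) -> Any:
    """Refuse to instantiate anything that is not on the allow list."""
    if not is_allowed(identifier):
        raise PermissionError(f"{identifier!r} is not on the allow list")
    return factory(identifier)
```

Note that a plain string check like this treats a ProgID and its CLSID as different entries, so both would need to be listed if both should be accepted.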

The Result

The result is pretty impressive! Once the MCP server is connected, Claude can create, open and edit Excel files, save them, then open Outlook and send them via email. Pretty cool stuff.

[Image: basic interaction com-mcp-server]

There is also a SAPI COM object that I used to have Claude speak. There are many other capabilities available with COM.

Confirmation Dialogs

Claude shows an Allow/Deny button before invoking custom tools.

Advanced Agent-Driven Windows and Office Automation

This opens the door to sophisticated automation, like AI agents that can perform precise actions at the API/COM level rather than depending on clicking UI elements.

Offensive Security!

This is useful for general Windows and Office automation tasks. In the long run, however, I want to explore how this fits into red teaming, as COM has a place there.

Conclusion

This was to demonstrate the power of MCP and how one can build a quite sophisticated Windows automation MCP server that can control many aspects of Windows and Office (and software from many other vendors).

It’s always a good idea to implement and use new tech to learn about the risks from first principles, so I always encourage people asking for advice to not just try to break things, but also to go and build things from scratch. That helps with brainstorming and gathering ideas for how someone might abuse certain parts, and generally with seeing what pitfalls and security bugs might easily get introduced.

Happy hacking.

References