Large language models (LLMs) like GPT-4 or Claude are incredibly powerful, but they have a big limitation: by default they only know what’s in their training data and what you prompt them with. They don’t automatically have access to your live data, files, or tools. This means if you want an AI assistant to work with your documents, databases, or APIs, you traditionally have to write custom code or plugins for each case. That approach is clunky and hard to scale – every new data source or tool requires yet another integration. Enter the Model Context Protocol (MCP), an open standard that aims to simplify and standardize how LLMs connect to external data and services. Think of MCP as a way to give AI assistants a universal “plug” into the information and tools they need, no matter where those live.

The Problem: LLMs Need External Context

Even the best AI models can feel isolated. They’re trained on vast text corpora, but often that data is outdated or not specific to the task at hand. For example, a company’s internal wiki or a user’s personal files aren’t part of a model’s training data. In the past, if you wanted an AI to use such external information, you had to manually feed it in or create one-off connectors. Developers often resorted to brittle scripts or custom APIs to bridge the gap. Every integration was bespoke – one for your Google Drive, another for Slack, another for your database, and so on. This quickly becomes a maintenance nightmare (imagine updating dozens of separate plugins whenever something changes). In short, LLMs lacked a standard way to access fresh data or perform actions, limiting their usefulness in real-world applications.

Why is this a real problem? Without external context, an AI’s answers can be irrelevant or outdated. And without a common integration method, building an AI-powered app means re-inventing the wheel for each new tool you want the AI to use. It’s as if every time you bought a new gadget, you also had to build a custom cable to connect it to your computer – clearly not ideal.

What is the Model Context Protocol (MCP)?

MCP is an open protocol (spearheaded by Anthropic, the company behind Claude) designed to solve the above problem in a big way. Formally, “the Model Context Protocol is an open standard that connects AI assistants to the systems where data actually lives”. In simpler terms, MCP provides a universal interface between LLM-based applications and external sources of data or functionality. Rather than each AI platform having its own specialized plugins or APIs, MCP defines a common language that any AI client and any data/tool service (server) can speak to each other. This means once something implements MCP, any compatible AI can interact with it out of the box.

One way to visualize MCP is to think of it like a USB-C port for AI. Just as USB-C is a standard plug that works for many devices (regardless of brand) – charging your phone, connecting a hard drive, a monitor, etc. – MCP is a standard “plug-and-play” protocol for connecting AI to all sorts of tools and data sources. In the past, connecting M different AI systems to N different data sources required custom pairwise integrations (an M×N problem). MCP flips that into a much simpler M+N model: each AI system and each data source only needs to implement MCP to work with all the others. In other words, developers and data providers only have to do the integration work once, and everything that speaks MCP can understand each other going forward. This standardization is very similar to what the Language Server Protocol (LSP) did for coding tools – it allowed any code editor to support any programming language via one protocol instead of one plugin per language.

Crucially, MCP is open-source and not tied to a single vendor or model. It’s meant to be a community-driven standard, so it can be adopted across the industry. Anthropic’s Claude was an early adopter (Claude Desktop integrates MCP out-of-the-box), but other AI platforms and open-source projects are jumping in as well. The goal is broad adoption so that MCP becomes a common layer for AI-tool communication.

How MCP Works (Under the Hood)

MCP uses a simple client-server architecture under the hood. The AI application (for example, a chatbot interface, an AI-powered IDE, or any app with an embedded LLM) acts as the host that wants to use external data. Within that host, an MCP client component handles communication with an external MCP server. Each external data source or tool you want to integrate runs as an MCP server – a lightweight service that knows how to fetch some data or perform some action, and communicates via the MCP protocol. The host (AI app) can connect to multiple MCP servers at once, one for each resource it needs.

All communication in MCP follows a standard message format (based on JSON-RPC 2.0). This means requests and responses are just JSON messages with a defined structure. For example, when the AI app first connects to a server, it can ask “what tools or actions do you offer?” by sending a JSON request. Here’s a simplified example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/list",
  "params": {}
}

This is a request from the client asking the server to list available “tools” (functions it can call). The MCP server would reply with a JSON response (not shown in full here) that includes a list of tools it provides, each with a name, a description, and a schema describing the inputs it expects. For instance, a server might respond that it has a tool named "get_weather" which “retrieves weather data” and requires a "location" parameter.
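To make that concrete, here is roughly what the server’s reply to tools/list could look like (the field names follow the MCP specification; the "get_weather" tool itself is just an illustration):

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "get_weather",
        "description": "Retrieves weather data for a location",
        "inputSchema": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          },
          "required": ["location"]
        }
      }
    ]
  }
}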

Using this standardized exchange, the AI app can discover what actions or data are accessible. Later, if the LLM decides it needs to use one of those tools (say, to get the weather), the host will send another JSON request like {"method": "tools/call", "params": {...}} to invoke that specific tool. The server executes the action (e.g., calls an external API or database) and returns the result in a JSON response. The host app then injects that result back into the model’s context or answer. All of this happens through a uniform protocol – so the AI doesn’t call external APIs directly; it asks the MCP client, which negotiates with the MCP server in a structured way. This design keeps things secure and controlled: the AI can only use the specific functions the server exposes, and all data passing in and out goes through the standardized MCP interface.
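For example, calling the hypothetical "get_weather" tool and receiving its result might look roughly like this (the result shape follows the MCP spec’s text-content format; the values are made up):

Request (client → server):

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": { "location": "Nairobi" }
  }
}

Response (server → client):

{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "content": [
      { "type": "text", "text": "Nairobi: 24°C, partly cloudy" }
    ]
  }
}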

Key concepts in MCP: An MCP server can provide three kinds of things to the AI:

  • Tools: Think of these as functions or actions the model can invoke (with permission). For example, a server might offer a createNewTicket() function to create an issue in Jira, or a queryDatabase() function to run a database query. The model can call these tools to perform tasks or get computed information.
  • Resources: These are data sources the model can read from. They behave like documents or files. For instance, a server could expose a resource like /policies/leave-policy.md which contains text the model can retrieve. The AI might “open” that resource to get its content as context for answering a question.
  • Prompts: These are pre-made prompt templates or instructions that help the model with specialized tasks. For example, a prompt might be a template for translating text or a formatted outline for writing an email. The AI can request a prompt from the server to guide its response (like a dynamic prompt library).

By separating these, MCP lets developers clearly define what an AI can do (tools), what it can read (resources), and helpful guidance it can use (prompts). Each MCP server might implement one or more of these. For example, a GitHub MCP server could offer a resource for the README file of a project, and a tool function to create a new issue.
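As a sketch of how those three concepts map to code, here is a minimal GitHub-flavored server in the same hypothetical Python SDK style as the weather example in the next section (the @mcp.resource and @mcp.prompt decorator names and the URI template are assumptions, and the functions just return placeholder strings):

from mcp.server import FastMCP  # hypothetical MCP Python SDK, as in the weather example below

mcp = FastMCP("github-lite")    # a toy server exposing one tool, one resource, one prompt

@mcp.tool()
def create_issue(repo: str, title: str) -> str:
    """Tool: an action the model may invoke (here, creating an issue)."""
    return f"Created issue '{title}' in {repo}"

@mcp.resource("repo://{repo}/readme")
def get_readme(repo: str) -> str:
    """Resource: read-only content the model can pull in as context."""
    return f"# {repo}\n\nProject overview and setup instructions..."

@mcp.prompt()
def summarize_repo(repo: str) -> str:
    """Prompt: a reusable template the client can request."""
    return f"Summarize the purpose and structure of the repository {repo}."

if __name__ == "__main__":
    mcp.run(transport="stdio")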

Example: Using MCP in Practice

What does using MCP actually look like for a developer? Let’s walk through a simple example. Imagine you want your AI assistant to be able to fetch live weather information. Without MCP, you’d likely have to call a weather API outside of the model and then feed the result into the model’s prompt manually. With MCP, we can wrap the weather API into an MCP server and let the model call it when needed.

Below is a simplified Python example of an MCP server (using a hypothetical MCP Python SDK) that provides weather data. This server defines two tools: one to get weather alerts for a U.S. state, and one to get the forecast for specific coordinates:

from mcp.server import FastMCP  # import MCP server library
import httpx                   # for making HTTP requests (e.g., to a weather API)

mcp = FastMCP("weather")       # create a new MCP server named "weather"

@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a U.S. state."""
    # ... call weather API (e.g., weather.gov) and format the alerts ...
    return "Active weather alerts for " + state + "..."

@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get weather forecast for a location."""
    # ... call weather API and parse forecast data ...
    return "Forecast at coordinates {}, {}: ...".format(latitude, longitude)

if __name__ == "__main__":
    mcp.run(transport="stdio")  # start the server (using standard I/O as transport)

Let’s break down what’s happening in this code:

  • We create an MCP server instance called "weather" using the provided FastMCP class. This sets up the server and gives it an identity (so clients know it’s the “weather” service).
  • We define two asynchronous functions get_alerts and get_forecast and decorate them with @mcp.tool(). This tells our MCP server that these functions should be exposed as tools that an AI can call. In practice, these functions use an HTTP client (httpx) to fetch data from an external weather API and then return a text result.
  • Finally, mcp.run(...) starts the server. In this case, we’re using a transport over standard I/O (which is common for local MCP servers; remote servers typically use an HTTP-based transport such as Server-Sent Events). Once running, the server will listen for JSON requests (like the examples above) for the tools it provides.

Now, any MCP-compatible AI client can connect to this server and discover the get_alerts and get_forecast capabilities. For instance, if our AI assistant needs the forecast, it will trigger a tools/call request to get_forecast via MCP. The server executes the get_forecast function and returns the result (e.g., “Forecast at coordinates -1.29, 36.82: …” for Nairobi). The AI model then gets that information and can include it in its response to the user – all without the developer writing custom integration code for the weather API in the AI prompt logic. The MCP server encapsulated that functionality and made it accessible in a standardized way.
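To show the other side of the connection, here is a minimal sketch of a client driving that server, assuming the MCP Python SDK’s stdio client helpers (ClientSession, stdio_client); the file name weather_server.py and the printed output are illustrative:

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the weather server as a subprocess and talk to it over stdio.
    params = StdioServerParameters(command="python", args=["weather_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()             # protocol handshake
            listing = await session.list_tools()   # sends tools/list under the hood
            print([tool.name for tool in listing.tools])  # e.g. ['get_alerts', 'get_forecast']
            result = await session.call_tool(      # sends tools/call under the hood
                "get_forecast", {"latitude": -1.29, "longitude": 36.82}
            )
            print(result.content[0].text)          # the forecast text returned by the server

asyncio.run(main())

In a real host application, the listed tools would be handed to the LLM (for example, as function definitions), and call_tool would be issued only when the model asks to use one.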

This example shows how MCP simplifies practical AI development: the developer writes a small server (which can be reused in other projects), and the AI can call it as needed. There’s no need to teach the AI how to call a REST API or parse JSON – the MCP client and server handle the details. To the AI, it’s just invoking a known tool.

Why MCP Could Be a Game Changer for LLM Applications

MCP introduces a level of standardization and ease that could significantly accelerate AI app development. Here are a few reasons why developers and the industry are excited about it:

  • Seamless access to live data: With MCP, an AI assistant can fetch up-to-date information from your sources on the fly. Instead of being stuck with a static knowledge cutoff, the model’s answers can be more relevant and context-rich by pulling in data from, say, your Google Drive or database when needed. This greatly reduces the chance of outdated answers and allows AI to operate in dynamic environments.

  • “Plug-and-play” integrations: MCP lets you plug in new capabilities as easily as connecting a new device to your laptop. If an MCP server exists for a service (e.g. Slack, Git, Gmail), your AI app can use it immediately – no bespoke coding required. Developers can also swap out or add data sources without overhauling their whole application. This is a huge win for maintainability: your AI can scale to use dozens of tools via one consistent interface.

  • Reusable community tools: Because MCP is open and standardized, it encourages an ecosystem of shared connectors. Developers can build an MCP server once and others can reuse it across different AI clients. For example, a SQL database connector built for MCP could be used by any AI IDE or chatbot that supports MCP. This “build once, use anywhere” approach means over time there may be a library of ready-made MCP servers (for cloud services, SaaS apps, etc.), just like plugins, that all speak the same language. It saves everyone time and fosters collaboration.

  • Multi-LLM compatibility and flexibility: MCP is model-agnostic. Whether you’re using OpenAI, Anthropic, an open-source LLM, or multiple different models, the way they access tools via MCP remains the same. This gives developers freedom to switch LLM providers or use several without rewriting integration code. The protocol handles the differences. It’s akin to how any web browser can talk to any website because they all use HTTP – here any compliant LLM client can talk to any MCP server. This could reduce vendor lock-in in the AI ecosystem.

  • More powerful AI agents: By giving AI easier access to actions and data, MCP paves the way for more autonomous AI agents. An AI agent could chain multiple MCP tools to accomplish a complex task. For example, imagine a coding assistant that, upon a single high-level request, can: fetch recent code changes from Git (via an MCP server), look up a relevant issue ticket (via another server), then open a documentation page – all before writing code, ensuring it has full context. Such multi-step tool usage becomes much simpler to implement with MCP’s standardized toolbox. Developers can focus on high-level orchestration (what the AI should do) rather than the low-level details of each integration.

  • Consistency and easier debugging: Because everything goes through a uniform JSON protocol, the format of requests and responses is consistent across all tools. This means an AI developer doesn’t have to handle one service returning XML, another returning JSON, etc., within the model’s logic. The responses from all MCP servers look similar (structure-wise), which makes it easier to parse results and debug issues. If something goes wrong, you have a clear trail of standardized messages to inspect. Over time, this consistency could lead to robust tooling (like debuggers or monitors specifically for MCP messages) that works for any integration.

  • Security and control: MCP was built with security in mind. Because the protocol is explicit about what can be done, it allows for centralized management of permissions and safe sandboxes for tools. For example, an MCP host application can require user approval before the AI calls certain tools, or restrict an MCP server’s access to only certain files. All tool usage is mediated by the host, which can enforce checks. This is harder to manage when you have a bunch of ad-hoc integrations. A standardized protocol means best practices (like authentication, logging, rate limiting) can be built once and apply everywhere.

Looking Ahead: The Future of LLMs with MCP

MCP is still a relatively new development (introduced in late 2024), but it’s already showing promise as a foundational piece of the LLM app toolkit. Major tech players and open-source communities are rallying behind it. Early adopters like Block (Square), Apollo, and others have integrated MCP into their AI systems, seeing it as a bridge that connects AI to real-world applications. Developer tool companies – Zed (code editor), Replit, Codeium, Sourcegraph, and more – are experimenting with MCP to make their AI features smarter, allowing, for example, a coding AI to seamlessly pull in context from your codebase and documentation.

If MCP gains wide adoption, we could soon have a rich ecosystem of MCP-compatible tools and services. Envision an “App Store” for AI tools, where you can pick from a catalog of MCP servers (for email, calendars, CRM systems, you name it) and easily extend your AI assistant’s capabilities. In fact, the community is already talking about creating registries of available MCP servers, so that in the future you might literally ask your AI, “Can you add a Wikipedia tool?” and it could find and connect to the appropriate MCP server automatically. This kind of interoperability could be a game changer. It turns the AI from an isolated brain into a versatile doer, because it can interface with everything else through a common protocol.

From a developer’s perspective, MCP could dramatically streamline the AI development workflow. Instead of spending days writing glue code for each integration, developers can focus on the core logic and let MCP handle the boring parts. It also means that improvements to any one part of the system benefit all – if someone builds a faster or more secure MCP server for, say, SQL databases, every AI app using MCP can switch to that and gain the benefits instantly. This collective improvement is exactly what we saw in other tech ecosystems when open standards took hold.

Finally, MCP’s approach of maintaining context across tools hints at reducing fragmentation in AI interactions. Today, an AI might lose track of information when switching between different plugins or data sources. In a future MCP-enabled world, an AI could carry its “context” with it as it moves through various tools, leading to more coherent and intelligent behavior. We might see AI assistants that can truly act as collaborators, handling complex tasks that involve multiple systems, all while feeling like one continuous experience to the user.

In conclusion, the Model Context Protocol is poised to be a game changer because it tackles a fundamental bottleneck in the LLM industry: how to reliably connect powerful AI models with the wealth of data and services we have in the real world. By standardizing those connections, MCP doesn’t just solve a technical integration problem – it opens the door to more useful, up-to-date, and capable AI applications. For beginners and seasoned developers alike, this means building with LLMs can become easier and more productive, sparking the next wave of innovation in AI-powered tools. The groundwork is laid; now it will be exciting to watch the ecosystem grow. 🚀

Sources: The concept and specifications of MCP are drawn from the official MCP documentation and community explainers, including “What is the Model Context Protocol (MCP)?” (WorkOS), “Model Context Protocol (MCP) in AI” (BavalpreetSinghh, Stackademic, Mar 2025), and “Model Context Protocol (MCP): A comprehensive introduction for developers,” as well as insights from early adopters and Anthropic’s announcement, “Introducing the Model Context Protocol.” These resources detail how MCP standardizes context sharing for LLMs and why many believe it will transform how we build AI integrations. (For further reading, see Anthropic’s announcement of MCP and the open-source Model Context Protocol specification.)