Author: Quek Xiucheng

  • Design Pattern Matters – Level up your Lambda Code (including AI Generated Code) with these 3 patterns

    3 Essential Design Patterns for Robust AWS Lambda Functions

    When you first start with AWS Lambda, it’s easy to write simple, single-file scripts. But to build robust, enterprise-grade serverless applications, you need to apply proven software design patterns. These patterns help you create code that is testable, maintainable, and scalable.

    This post will explore three essential design patterns—and their common anti-patterns—that will immediately elevate your Lambda functions.


    1. Dependency Injection and the Principle of Separation of Concerns

    Perhaps the most important principle for writing clean Lambda functions is Separation of Concerns. While not a formal design pattern itself, the principle is simple: always separate your core business logic from the Lambda handler code. The pattern we use to achieve this separation is Dependency Injection (DI).

    The Anti-Pattern: Mixing Logic in the Handler

    Developers often write all business logic directly inside the handler, creating the database client and mixing it with validation and event parsing. This makes the code impossible to test without creating complex mock AWS events.

    Python

    # ANTI-PATTERN EXAMPLE
    import boto3

    def lambda_handler(event, context):
      # Dependency is created and used directly inside the handler
      dynamodb_client = boto3.client('dynamodb')

      # Business logic is mixed with event parsing
      user_data = event['detail']
      if not user_data.get("email"):
        raise ValueError("Email is required.")

      # Database interaction is hardcoded
      dynamodb_client.put_item(
        TableName='Users',
        Item={'email': {'S': user_data['email']}}
      )
      return {"status": "User created"}
    
    

    The Pattern: Inject Your Dependencies

    You implement Separation of Concerns by designing your core logic functions to accept their dependencies (like a database client) as arguments. The Lambda handler is then only responsible for creating those dependencies and “injecting” them.

    Python

    # business_logic.py
    # This function is pure, testable, and knows nothing about Lambda.
    def process_user_signup(user_data: dict, db_client):
      if not user_data.get("email"):
        raise ValueError("Email is required.")
      db_client.put_item(TableName='Users', Item=...)
      return "User created"
    
    # --- lambda_handler.py ---
    import boto3
    from business_logic import process_user_signup
    
    # Initialize client once for reuse
    dynamodb_client = boto3.client('dynamodb')
    
    def lambda_handler(event, context):
      user_data = event['detail']
      # The dependency is "injected" into the core logic
      result = process_user_signup(user_data, dynamodb_client)
      return {"status": result}
    
    

    With this pattern, you can easily unit-test process_user_signup by passing it a simple dictionary and a mock database client.
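
    For example, here is a minimal test sketch (pytest and unittest.mock are illustrative choices for this example, not requirements) that exercises process_user_signup without any AWS event or real client:

    Python

    # test_business_logic.py -- illustrative test sketch
    from unittest.mock import MagicMock
    import pytest

    from business_logic import process_user_signup

    def test_signup_writes_user():
      mock_db = MagicMock()  # stands in for the DynamoDB client
      result = process_user_signup({"email": "a@b.com"}, mock_db)
      assert result == "User created"
      mock_db.put_item.assert_called_once()

    def test_signup_rejects_missing_email():
      with pytest.raises(ValueError):
        process_user_signup({}, MagicMock())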

    Treat software like a well-run kitchen. Each chef has a single responsibility—like a software component. This is how complex systems deliver a quality product, whether it’s a meal or an application.

    2. The Dispatcher Pattern for Routing Events

    The Anti-Pattern: The if/elif/else Chain

    A single Lambda is often triggered by different event variations from the same source (e.g., a DynamoDB Stream sends INSERT, MODIFY, and DELETE events). The most common anti-pattern is a long, cumbersome if/elif/else chain in the handler. This is hard to read and brittle to change.

    Python

    # ANTI-PATTERN EXAMPLE
    def lambda_handler(event, context):
      for record in event['Records']:
        event_name = record['eventName']
        if event_name == 'INSERT':
          print("Handling INSERT event...")
          # ... insert logic ...
        elif event_name == 'MODIFY':
          print("Handling MODIFY event...")
          # ... modify logic ...
        elif event_name == 'DELETE':
          print("Handling DELETE event...")
          # ... delete logic ...
        else:
          print("Warning: Unknown event type.")
    
    

    The Pattern: Use a Dictionary as a Dispatcher

    A cleaner approach is to use a dictionary as a “router” to map an event key to a specific handler function. This makes your handler readable and easy to extend.

    Python

    # event_handlers.py
    def handle_insert(record): print("Handling INSERT event...")
    def handle_modify(record): print("Handling MODIFY event...")
    def handle_unknown(record): print("Warning: Unknown event type.")
    
    # --- lambda_handler.py ---
    from event_handlers import handle_insert, handle_modify, handle_unknown
    
    EVENT_ROUTER = {
      'INSERT': handle_insert,
      'MODIFY': handle_modify,
    }
    
    
    def handle_records(records):
      for record in records:
        event_name = record['eventName']
        handler_func = EVENT_ROUTER.get(event_name, handle_unknown)
        handler_func(record)


    def lambda_handler(event, context):
      handle_records(event['Records'])
      ...
    

    Adding support for DELETE events is now as simple as creating a handle_delete function and adding one line to the EVENT_ROUTER.
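
    For instance, following the names above:

    Python

    # event_handlers.py -- one new handler
    def handle_delete(record): print("Handling DELETE event...")

    # lambda_handler.py -- one new route
    EVENT_ROUTER = {
      'INSERT': handle_insert,
      'MODIFY': handle_modify,
      'DELETE': handle_delete,  # the only change needed
    }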

    A switchboard (AI generated, probably wrong lol) – routes the conversation to the intended recipients.

    Expanding the Pattern: Handling Logical Outcomes

    The dispatcher pattern isn’t limited to routing based on an event’s type. It’s an even more powerful tool for handling different outcomes from your business logic, such as success, validation errors, or downstream failures. This allows you to create clean, explicit paths for every possible result of an operation.

    The Scenario: A Payment Processing Function

    Let’s imagine a Lambda function that processes a payment. This single operation can have multiple distinct outcomes. A common but messy way to handle this is with a large if/elif/else block directly in the handler. This code can get hard to read and test because the business logic, error handling, and response formatting are all tightly coupled in one place.

    Dispatching Based on Status

    With the dispatcher pattern, we separate these concerns. The core logic function determines the outcome, and the handler dispatches that result to a dedicated function responsible for formatting the response.

    Step 1: Define Outcome-Specific Handlers

    First, create a separate handler for each possible outcome. Their only job is to create the final HTTP response.

    # outcome_handlers.py
    
    def handle_success(result: dict):
      """Handle successful payment."""
      print(f"SUCCESS: Payment processed for transaction ID {result['transactionId']}.")
      ... # code for handling success outcome
      return {"statusCode": 200, "body": "Payment successful"}
    
    def handle_validation_error(error_message: str):
      """Handle validation error."""
      print(f"VALIDATION_ERROR: {error_message}")
      ... # code for handling validation error outcome
      return {"statusCode": 400, "body": error_message}
    
    def handle_gateway_error(error_details: str):
      """Handle Gateway Error"""
      ... # code for handling error
      return {"statusCode": 502, "body": "Payment provider error"}
    
    # The router maps an outcome status to a handler function
    STATUS_ROUTER = {
      'SUCCESS': handle_success,
      'VALIDATION_ERROR': handle_validation_error,
      'GATEWAY_ERROR': handle_gateway_error,
    }
    
    

    Step 2: Define the Core Logic and the Dispatcher Handler

    Next, the process_payment function contains the business rules and uses early returns to exit as soon as a rule fails. The main lambda_handler calls this function and uses the STATUS_ROUTER to dispatch the result.

    # lambda_handler.py
    import json
    from outcome_handlers import STATUS_ROUTER
    # Assumed: `payment_gateway` is your payment provider client,
    # imported from shared code (a placeholder in this example).
    import payment_gateway
    
    def process_payment(request_body: dict) -> tuple[str, dict | str]:
      """
      Core business logic that returns a status and a result.
      It uses early returns to handle failures.
      """
      amount = request_body.get('amount')
        
      # Rule 1: Validate amount exists and is positive
      if not amount or not isinstance(amount, (int, float)) or amount <= 0:
        return ('VALIDATION_ERROR', "Amount must be a positive number.")
    
      card_token = request_body.get('card_token')
        
      # Rule 2: Validate card token exists
      if not card_token:
        return ('VALIDATION_ERROR', "Card token is required.")
    
      # --- All validation passed, proceed to core action ---
      print(f"Charging payment gateway ${amount}...")
        
      success = payment_gateway.charge(amount, card_token)
        
      if not success:
        return ('GATEWAY_ERROR', '...')
      return ('SUCCESS', {'transactionId': 'txn_12345'})
    
    
    def lambda_handler(event, context):
      """
      Main handler that dispatches work based on the outcome of the payment processing.
      """
      body = json.loads(event.get('body', '{}'))
      status, result = process_payment(body)
      handler_func = STATUS_ROUTER.get(status)
      return handler_func(result)
    
    

    Why This is Better

    This design is better as it provides clear separation of concerns:

    • Business Logic (process_payment): Knows how to validate and process a payment. It knows nothing about HTTP status codes or JSON response bodies.
    • Response Formatting (handle_* functions): Know how to create specific HTTP responses for different outcomes. They know nothing about business logic.
    • Orchestration (lambda_handler): Knows how to connect the two. Its only job is to call the logic and dispatch the result.
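
    This separation also makes each layer trivially testable. A minimal sketch (pytest-style, patching the placeholder payment_gateway module from the example above):

    Python

    # test_lambda_handler.py -- illustrative test sketch
    from unittest.mock import patch

    from lambda_handler import process_payment

    def test_missing_amount_is_a_validation_error():
      status, result = process_payment({"card_token": "tok_1"})
      assert status == 'VALIDATION_ERROR'

    @patch('lambda_handler.payment_gateway')
    def test_successful_charge(mock_gateway):
      mock_gateway.charge.return_value = True
      status, result = process_payment({"amount": 10, "card_token": "tok_1"})
      assert status == 'SUCCESS'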

    3. Repository and DTOs for Consistent Data Handling

    The Anti-Pattern: Inconsistent Payloads and Duplicated Queries

    In a serverless system, Lambdas communicate via message queues and shared databases. This can lead to data inconsistencies if not managed properly. This pattern uses two techniques to enforce data contracts: one for data moving between services (in-flight) and one for data in your database (at-rest).

    Use Data Transfer Objects (DTOs) for Message Payloads

    The Problem: JSON payloads sent between Lambdas have no enforced structure. If a producer Lambda changes a key name (userId to user_id), the consumer Lambda breaks at runtime.

    The Solution: Define a strict contract using a Data Transfer Object (DTO), implemented as a Python dataclass. This DTO lives in a shared library or Lambda Layer.

    • Producer: Creates a DTO instance and serializes it to JSON.
    • Consumer: Deserializes the JSON back into a DTO instance. This fails immediately if the structure is wrong.
    • Note: There can be multiple consumers and producers.

    Python

    # shared/contracts.py
    from dataclasses import dataclass, asdict
    import json
    
    @dataclass
    class UserSignupDTO:
      user_id: str
      email_address: str
    
      def to_json(self): return json.dumps(asdict(self))
    
      @classmethod
      def from_json(cls, s: str): return cls(**json.loads(s))
    
    # In the consumer Lambda:
    # payload = UserSignupDTO.from_json(record['body'])
    # print(f"Processing user: {payload.email_address}")
    
    

    This approach prevents runtime errors from data mismatches, acts as self-documentation, and enables IDE autocompletion.
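
    On the producer side, the same contract is used in reverse. Here is a minimal sketch (the SQS client and queue URL are placeholders; SNS or EventBridge would work the same way):

    Python

    # producer_lambda.py -- illustrative sketch
    import boto3
    from shared.contracts import UserSignupDTO

    sqs = boto3.client('sqs')

    def publish_signup(user_id: str, email: str):
      # Build the DTO, then send its JSON form over the queue
      payload = UserSignupDTO(user_id=user_id, email_address=email)
      sqs.send_message(
        QueueUrl='https://sqs.us-east-1.amazonaws.com/123456789012/signups',  # placeholder
        MessageBody=payload.to_json(),
      )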


    Use the Repository Pattern for Database Access

    The Problem: If multiple Lambdas access the same database table, you get duplicated query logic (e.g., the same boto3 call in five functions). Changing the query means updating it everywhere.

    The Solution: Use the Repository Pattern. Create a single class (e.g., UserRepository) that contains all database access logic for that entity.

    • All database queries for a specific table are methods within this single class.
    • Lambdas call methods on the repository object instead of writing raw queries.

    Python

    # shared/database.py
    import boto3
    
    class UserRepository:
      def __init__(self, table_name="Users", ddb=None):
        # Accept an injectable resource (e.g. a fake in tests);
        # fall back to a real DynamoDB resource otherwise.
        # In a Lambda, construct the repository once at module
        # level so the connection is reused across invocations.
        ddb = ddb or boto3.resource('dynamodb')
        self.table = ddb.Table(table_name)

      def get_by_id(self, user_id: str):
        response = self.table.get_item(Key={'userId': user_id})
        return response.get('Item')
    
    # In any Lambda function:
    # user_repo = UserRepository()
    # user = user_repo.get_by_id("user-123")
    
    

    This keeps your code DRY (Don’t Repeat Yourself), makes maintenance easy (change logic in one place), and abstracts the database details from your business logic.
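
    And because the resource is injectable, the repository itself is easy to test with a stub. A minimal sketch using unittest.mock:

    Python

    # test_database.py -- illustrative test sketch
    from unittest.mock import MagicMock

    from shared.database import UserRepository

    fake_ddb = MagicMock()
    fake_ddb.Table.return_value.get_item.return_value = {
      'Item': {'userId': 'user-123', 'email': 'a@b.com'}
    }

    repo = UserRepository(ddb=fake_ddb)  # no AWS calls are made
    assert repo.get_by_id('user-123')['email'] == 'a@b.com'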


    Design Pattern Provides A Blueprint For AI

    The great news is that we live in the age of Large Language Models (LLMs). These models understand design patterns, and now that you understand why these patterns are important, you don’t have to implement them from scratch. You can use clever prompting to have an AI partner do the heavy lifting.

    More importantly, this method also prevents “AI code drift.” By consistently instructing an AI to use a specific pattern for a task—like always using the Repository Pattern for database access—you enforce architectural standards across your codebase. This ensures the code remains predictable and maintainable as the project evolves, regardless of who (or which agent/model) writes the prompt.

    Therefore, instead of asking “write me a lambda,” you can now ask:

    Prompt for DI: “Refactor this Python Lambda handler to use dependency injection. Separate the core business logic from the handler and make the DynamoDB client an injectable dependency.”

    Prompt for Dispatcher: “Write a Python Lambda handler that uses the dispatcher pattern to process DynamoDB Stream events. It should have separate functions for ‘INSERT’ and ‘MODIFY’ events and use a dictionary to route them.”

    Prompt for Repository/DTO: “Generate a Python UserRepository class that uses Boto3 to interact with a DynamoDB table named ‘Users’. Also, create a UserDTO dataclass to represent the user payload.”

    Ultimately, understanding design patterns lets you write better prompts and critically evaluate the AI-generated code, making the AI a more effective tool.

  • AI, Code, and Verification: A Simple Trick for Accurate Results

    TLDR

    • LLMs can be terrible at math or at generating responses that require precision.
    • A simple rule is to ask the LLM to generate code to do the math instead of using its answer. This can be achieved with a simple prompt like –
      When asked to do any calculations or conversions, always generate code and run it instead of generating a response immediately

    Hallucination

    It’s a known problem that AIs “hallucinate,” especially when you need a precise answer – like doing math or counting.

    This was famously exposed when earlier generation LLMs got stumped by ‘gotcha’ questions like, “How many ‘r’s are in strawberry?”, which showed they weren’t really thinking. While most advanced models today have now learned to answer that question correctly, this isn’t necessarily because they’ve learned to reason, but because they have been specifically trained or prompted to patch that obvious flaw.

    Taken from https://www.reddit.com/r/singularity/comments/1enqk04/how_many_rs_in_strawberry_why_is_this_a_very/


    While this shows progress, it also reveals that their accuracy can be a result of targeted training rather than innate computational ability.

    This exact issue resurfaced for me with a more practical, real-world problem – and this is what I am doing now to prevent it!

    Feeling Lazy

    I was debugging an issue in MongoDB and had a seemingly simple task: convert a MongoDB ObjectId, 6616b9157bac1647326e11e1, into a human-readable timestamp.

    For those who are unfamiliar with MongoDB ObjectIds, or who have been using MongoDB but are unaware – a MongoDB ObjectId is a 12-byte value that includes a 4-byte timestamp in its initial segment. This timestamp represents the number of seconds that have passed since the Unix epoch (January 1, 1970). (see docs)

    The Hallucination

    And… it wasn’t just an answer—ChatGPT delivered it with the full swagger of a lead engineer who’s 100% sure of themselves. It laid out the whole thing step-by-step, explaining the ID format, how it pulled the timestamp, and all that.

    The correct answer should have been 2024-04-10T16:06:45.000Z

    The timestamp it gave me seemed legit at first since it was the right day. But something felt off; the time seemed to be off by a few hours. Thank goodness I listened to that little voice in my head and ran the conversion myself. Sure enough, ChatGPT was wrong!

    Not Just ChatGPT

    Curious, I tried the same prompt with Grok, Gemini, and Claude. The results were a mixed bag of confidently incorrect answers. This experience was a stark reminder that while the most obvious flaws are being patched, the underlying weakness in performing novel, precise conversions still persists.


    The Better Approach: Ask for the Code, Not the Answer

    This brings me to the core lesson I learned from this: instead of asking an LLM for the final answer, ask it to write code to produce the answer. My experience with Cursor was a perfect example. While the answer in its chat was wrong, it also provided a code snippet.

    Always ask for code!

    That code was the correct path. This approach plays to the AI’s strengths, shifting the task from a weak point (calculation) to a strong point (code generation). Ideally, the model would then execute that code in a sandboxed environment to provide a verified result.
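
    For reference, here is the kind of snippet it should hand you – a minimal sketch that decodes the leading 4-byte timestamp using only the standard library (the bson package’s ObjectId.generation_time is a ready-made alternative):

    Python

    # objectid_time.py -- illustrative sketch
    from datetime import datetime, timezone

    def objectid_to_datetime(object_id: str) -> datetime:
      # The first 8 hex characters encode seconds since the Unix epoch
      seconds = int(object_id[:8], 16)
      return datetime.fromtimestamp(seconds, tz=timezone.utc)

    print(objectid_to_datetime("6616b9157bac1647326e11e1"))
    # 2024-04-10 16:06:45+00:00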

    That’s right!

    A Simple Rule

    Here’s a simple rule: if it involves math or a conversion, always ask the LLM to write code.

    Here is a short example of how to do that with a simple prompt –

    When asked to do any calculations or conversions, always generate code and run it instead of generating a response immediately.

    This too works for counting “R”s =)

  • MCP Version 2025-06-18 Changes: Confused No More!

    Hey there! In the midst of the Juneteenth holiday break, the Model Context Protocol (MCP) didn’t slow down. In its latest 2025-06-18 specification, MCP introduced significant enhancements to bolster its security posture. I’m especially interested in how these updates directly address a long-standing OAuth vulnerability: the “Confused Deputy” problem. Let’s dive in!

    The Confused Deputy Problem With MCP

    Working with AI agents that connect to various tools can bring new security challenges, particularly the “confused deputy” problem. This issue arises when a system, entrusted with certain permissions, is tricked into misusing that authority, often by directing an action to the wrong target. Here are the main ways this can manifest with MCP:

    Confused Deputy Scenario 1 (The “Wrong Legitimate Server” Mix-up):

    Your agent is a trusted assistant. It has permission to do things, like reading documents from Google Docs. A “confused deputy” happens when your agent tries to do something, but accidentally directs its action (and its granted permissions) to the wrong server, even if that server isn’t malicious.

    Example: Your company has two MCP servers that can read Google Docs:

    • Finance MCP Server: (https://finance.mycompany.com/mcp) – This server is meant for highly sensitive financial documents.
    • HR MCP Server: (https://hr.mycompany.com/mcp) – This server is meant for confidential HR documents.

    Both servers might offer a tool called “Google Doc Reader” with very similar descriptions. Your agent intends to read a sensitive financial report from Google Docs using the Finance MCP Server. However, due to a slight confusion (e.g., similar tool descriptions), your agent might mistakenly try to send the request (and its Google Docs access token) to the HR MCP Server. The HR server, though not malicious, is not authorized to see financial documents, creating a data leak or compliance issue.

    Confused Deputy Scenario 2 (The “Malicious Look-Alike Server” Trick):

    This scenario, highlighted in GitHub Issue #544, focuses on a more direct phishing attempt where a user is tricked into connecting to a malicious server from the start.

    Example: An attacker publishes a seemingly legitimate article or guide titled “MCP Configuration Best Practices from MyCompany Inc.” This guide subtly promotes configuring a malicious MCP server address (e.g., https://financc.mycompany.com/mcp – a typo, or https://mycompany-docs.net/mcp) in your MCP client application.

    • The Deception: The user, believing they are following official guidance, unknowingly configures their MCP client to use the malicious server’s address.
    • OAuth Flow Triggered: When the agent tries to perform its first action, the OAuth authorization flow begins. To the user, everything seems legitimate – the authorization prompts, the scopes requested – because the malicious server is designed to mimic the real one.
    • The Confusion (and Risk): Upon completing the authorization, your MCP client obtains an OAuth access token. The core of the confused deputy problem here is that the user, confused by the deception, has essentially granted legitimate authority (the OAuth token) to the wrong server. Your client then unknowingly sends this legitimate token to the attacker-controlled MCP server. Once the malicious server has your token, it can then use it to exfiltrate your sensitive data from Google Docs or other services your token has access to.

    How MCP Version 2025-06-18 Helps

    The 2025-06-18 MCP update brings about these changes to fight these problems:

    1. MCP Servers as OAuth Resource Servers

    What it means: MCP servers now function as “OAuth 2.0 Resource Servers.” This means their core responsibility is to validate the access tokens presented by MCP clients to determine if a request for a protected resource (like using a tool or accessing data) should be allowed. They are the guardians of their own services.

    How it helps (Exact Example of Validation): This makes the overall security setup much clearer and stronger. When an MCP server receives a request from an agent, it will perform critical checks on the access token provided in that request. Specifically, it will verify:

    • Signature: Is the token genuinely issued by a trusted Authorization Server and has it not been tampered with?
    • Expiration: Is the token still valid, or has its lifespan expired?
    • Issuer (iss claim): Was the token issued by an Authorization Server that this specific MCP server trusts?
    • Audience (aud claim): Was the token explicitly intended for this specific MCP server? (This is where RFC 8707’s resource parameter comes into play, as detailed below.)
    • Scope (scope claim): Does the token grant the necessary permissions (e.g., read:document, write:database, summarize:report) for the particular action the agent is trying to perform on this server?

    By performing these precise validations, the MCP server ensures that only genuinely authorized agents, with tokens specifically issued for it and with the correct permissions, can access its protected tools and data. This dramatically enhances security by ensuring every interaction is rigorously checked against industry-standard security rules.
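
    To make those checks concrete, here is a minimal sketch of the validation using Python and the PyJWT library (the signing key handling and the required scope are illustrative assumptions, not mandated by the MCP spec):

    Python

    # token_validation.py -- illustrative sketch (PyJWT)
    import jwt  # pip install PyJWT

    EXPECTED_ISSUER = "https://auth.mycompany.com"
    MY_RESOURCE_ID = "https://finance.mycompany.com/mcp"

    def validate_access_token(token: str, signing_key: str) -> dict:
      # Verifies signature, expiration, issuer, and audience in one call;
      # raises jwt.InvalidTokenError if any check fails.
      claims = jwt.decode(
        token,
        signing_key,
        algorithms=["RS256"],
        issuer=EXPECTED_ISSUER,
        audience=MY_RESOURCE_ID,
      )
      # Scope check: does the token permit the requested action?
      if "read:document" not in claims.get("scope", "").split():
        raise PermissionError("Token lacks required scope")
      return claims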

    2. MCP Client to Indicate Resource (Using RFC 8707)

    What it means: When your MCP agent asks for permission (an access token) to use a tool on an MCP server, it now must explicitly tell the permission provider (Authorization Server) exactly which resource (MCP server) it plans to talk to.

    How it helps (Directly addresses the “Wrong Server” / Prompt injection Mix-up):

    • Let’s go back to our example. When your agent wants to read a financial document, it asks for a Google Docs access token, but specifically tells the system: “This token is for the Finance MCP Server (https://finance.mycompany.com/mcp) only.”
    • The token then gets a special “audience” tag saying it’s only for finance.mycompany.com/mcp.
    • If your agent then gets confused and accidentally tries to use this token with the HR MCP Server (https://hr.mycompany.com/mcp), the HR server (which also follows these new rules) will check the token. It will see that the token is not meant for it, and reject the request.
    • This prevents the HR server from ever seeing your sensitive financial documents, even if your agent made a mistake in routing.

    Example: Resource Server Payload and Token Parameters

    To make this more concrete, let’s look at how the resource parameter is used in a client’s request and how the aud (audience) claim appears in the access token that the Resource Server (your MCP server) then receives and validates.

    The Client’s Request (Client asking for a Token)

    When your MCP agent needs an access token to interact with, say, the Finance MCP Server, it makes a request to the Authorization Server. This request will include the resource parameter, as mandated by RFC 8707:

    HTTP

    GET /authorize?
      response_type=code
      &client_id=your_mcp_client_id
      &scope=read:document
      &resource=https://finance.mycompany.com/mcp  <-- THIS IS THE KEY PART
      &redirect_uri=https://your_mcp_client/callback
    
    

    (Note: This is a simplified authorization request. A full flow involves exchanging an authorization code for a token.)

    Here, the resource parameter explicitly tells the Authorization Server: “I need a token specifically for the resource located at https://finance.mycompany.com/mcp.” A malicious server at “https://finance.mycoy.com/mcp” trying to impersonate https://finance.mycompany.com/mcp won’t be able to request a token for the resource located at https://finance.mycompany.com/mcp.

    The Access Token Payload (What the MCP Server Receives)

    If the Authorization Server supports RFC 8707, it will issue a JSON Web Token (JWT) as an access token. This token will contain an aud (audience) claim in its payload, identifying its intended recipient.

    The payload of such an access token (after decoding, as tokens are usually Base64 encoded) would look something like this:

    JSON

    {
      "iss": "https://auth.mycompany.com",          // Issuer (the Authorization Server)
      "sub": "user_id_12345",                       // Subject (the user or client using the token)
      "aud": "https://finance.mycompany.com/mcp",   // Audience: THIS MUST MATCH THE RESOURCE SERVER
      "exp": 1717603200,                            // Expiration Time
      "iat": 1717602900,                            // Issued At Time
      "scope": "read:document"                      // Permissions granted
      // ... other claims
    }
    
    

    The Resource Server’s Validation (What your MCP Server Does)

    When the Finance MCP Server (https://finance.mycompany.com/mcp) receives this access token, it performs critical validation steps. As an OAuth Resource Server (and specifically following MCP’s requirements), it must check the aud claim in the token’s payload.

    • If aud is https://finance.mycompany.com/mcp: The token is for this server. The server can proceed to process the request (assuming other validations like signature, expiration, etc., also pass).
    • If aud is https://hr.mycompany.com/mcp (or anything else): The token is not for this server. The Finance MCP Server will reject the request, typically with an “Unauthorized” (401) error, because the token’s audience does not match its own identifier.

    This mechanism is what directly prevents the “confused deputy” problem we discussed, ensuring that even if an agent mistakenly tries to send a token to the wrong server, that server will identify that the token isn’t intended for it and deny access.

    Note-Worthy: Server-Side Elicitation

    The MCP Version 2025-06-18 release also includes specs for elicitation, a new MCP capability where servers can initiate requests for more information or to confirm actions during a task.

    What it’s for: While it doesn’t directly solve the confused deputy problem, this allows MCP servers to dynamically clarify ambiguous requests or get your explicit consent before performing critical, sensitive, or ambiguous actions. It can, therefore, act as a crucial safety net by ensuring your actual intent aligns with the agent’s proposed action, providing a vital “human-in-the-loop” checkpoint.

    Important Considerations:

    A GitHub discussion emphasizes that no sensitive information should be sent via elicitation:
    • NEVER Send Sensitive Info Directly: It’s vital that users never send sensitive data (like passwords, credit card numbers, or PII) through an elicitation prompt. This information is likely to be logged, creating a major security risk. The MCP specification strictly prohibits servers from requesting such data via elicitation.
    • Not for Authentication: Elicitation is not a way for servers to ask for your login credentials. Authentication is handled securely and separately by trusted Identity Providers (IdPs). For scenarios requiring authentication or increased permissions, elicitation might trigger a redirect to a secure browser-based flow (as detailed in the upcoming GitHub Pull Request #475), ensuring sensitive login data never directly passes through the MCP channel.

    Final Notes

    Broader Security & Community Notes

    The MCP update also clarifies general security rules and offers new best practices to help developers build safer MCP systems. (See https://modelcontextprotocol.io/specification/2025-06-18/basic/security_best_practices)

    In terms of adoption of the 2025-06-18 MCP spec, as of June 22nd, merge requests for these changes have been made for the official SDKs but remain open. VS Code is reportedly looking into it at https://github.com/microsoft/vscode/issues/248418.

    It’s also worth noting that the 2025-06-18 MCP specification isn’t publicly mentioned in the cline GitHub repository or in the Cursor forum.

    Your Role in Keeping Things Secure

    While these new MCP features are powerful, your carefulness is still crucial. No technology can completely replace user vigilance.

    Even with these updates, the “confused deputy” problem can still arise if you unknowingly make the initial connection to an MCP server that is truly malicious or an unintended target. The protocol’s security features, like RFC 8707, are designed to prevent the misuse of tokens after they are issued for a specific resource. However, if a user is tricked (e.g., through a convincing phishing attack that directs them to configure a malicious look-alike server address), they might legitimately authorize the wrong server from the outset. This is why:

    • Choose Trusted Servers: Always use MCP servers from reputable sources and meticulously verify their exact URLs.
    • Be Aware: Understand what your agent is doing and what permissions it has.
    • Review Requests: Pay close attention to any questions or confirmations your agent asks you through elicitation, especially for sensitive actions.

    In short, MCP gives you stronger tools, but using them safely means staying aware and making smart choices about who you let your agent interact with.

    Have feedback or want to discuss your experience with MCP 2025-06-18? Leave a comment or reach out!

  • MCP is not a fad, it is certain to happen

    Imagine you are building a factory to manufacture cars. You need complex, specialized components like the engine, or parts that require a multi-step process to manufacture, like the tyres.

    If the engine supplier were to simply dump a pallet of raw, unassembled parts on your factory floor, your car assembly line would have to stop while your workers frantically tried to figure out how to build the engine, test it, and prepare it for installation. Your factory would have to become an expert in engine assembly, a job it was never designed to do.

    Similarly, tyre manufacturing is a multistep process requiring high heat for vulcanization. That’s an environment you won’t want to accommodate in your factory space.

    In both cases, your factory is forced to do the hard, specialized work of preparing raw materials. You will not be efficient.

    AI Applications – An Assembly Of Model, Tools, and Data

    Building an AI application is much like designing a modern assembly line. In both cases, you assemble different components to generate your ultimate product. For today’s AI, this means integrating three key components: the core model (like an LLM), a set of external tools, and a constant flow of data.

    To make this assembly truly functional, models are now trained in “tool calling.” This technique allows the LLM to overcome its static knowledge by recognizing when a query requires help from an external tool—like an API or a database. The model learns to call the necessary tool and integrate its response, transforming itself from a simple text generator into a dynamic agent that can act on live data.

    Agent and the Brain

    However, this powerful feature creates a significant, hidden burden for the developer. While the AI calls the tool, the developer is responsible for building and maintaining the entire environment for it. For every single tool, they must manage its specific dependencies, understand its unique authentication and data formats, and write brittle “glue code” to make it compatible with the main application. This is like designing a chaotic assembly line where every machine needs a different power outlet, its own specialized mechanic, and a unique instruction manual. It is highly inefficient and bloats the core application with third-party complexity.

    This is precisely the problem the Model Context Protocol (MCP) is designed to solve.

    MCP: Solving Complexity Through Standardization and Compartmentalization

    The Model Context Protocol (MCP) solves this problem by simultaneously introducing two powerful concepts: a universal standard for integration—the equivalent of a universal utility port for every workstation (USB-C for AI apps)—and a decoupled architecture where each tool operates in its own self-contained environment. This combination of a standard interface and a compartmentalized structure is what provides the key benefits.

    1. Simplified Integration Through Standardization

    First, MCP provides a common language for how an AI application communicates with any tool. While API specifications have existed before, MCP standardizes the layer above that: the protocol and context an AI agent needs to reliably call a tool, understand its capabilities, and use its output. This dramatically simplifies the initial work of integrating a new tool into the assembly line.

    2. Independent Evolution Through Compartmentalization

    Second, and arguably more powerful, is how MCP forces a clean separation between the application and the tool (or prompt, or resources). An MCP Server is a self-contained application. This means the AI application doesn’t need to know anything about the tool’s internal environment, its programming language, or its software dependencies. All of that complexity is managed entirely within the MCP Server.

    Hidden Complexity

    This creates a clear and powerful division of labor, which brings us back to the assembly line. The engine manufacturer can completely re-design their own factory—using new machines, new software, and new processes. But as long as the final engine they ship has the same standard mounting points and data connectors, your car factory’s assembly line doesn’t need to change at all. The engine can evolve independently.

    Similarly, a tool provider can completely update their service and its dependencies within their own MCP Server. As long as the MCP interface remains consistent, the AI application that calls it requires no modification. This decoupling is what allows for a truly scalable and maintainable ecosystem, where developers can build complex applications by assembling robust, independent components without inheriting their internal complexity.

    MCP is not a fad, it is certain to happen

    This brings us to a concluding thought, one that could serve as the title for this entire post: The Model Context Protocol is not a temporary fad; it is an evolutionary concept for building complex systems.

    The reason for this is simple: if MCP didn’t exist, something like it would have to be invented. The principles it embodies—compartmentalization and specialization—are fundamental to solving complexity. We see this pattern repeated everywhere, in both natural and man-made systems.

    Complex systems – Life, Software, Trade

    We see this most profoundly in biology, a system that has been self-optimizing for millions of years. Evolution itself discovered that the most robust path to creating complex organisms was through compartmentalization: specialized cells form tissues, and tissues form organs. Each component hides its immense internal complexity, communicating and collaborating through standardized biological and chemical signals.

    We see it in modern software architecture, where developers have moved from monolithic applications to microservices—small, independent services that communicate through standardized APIs, allowing each one to evolve without breaking the entire system. We even see it in global trade with the invention of the simple shipping container. Before this standard interface, logistics were a nightmare of custom work. The container allowed the entire global system of ships, cranes, and trucks to specialize and scale.

    In every case, a standard interface enables a clear division of labor, allowing a system to grow in sophistication without collapsing under its own weight.

    MCP is the application of this universal, time-tested principle to the assembly line of AI. This is not theoretical. The value of this compartmentalized approach is why major industry players like Google, Microsoft, OpenAI, and Anthropic are rapidly adopting MCP.

    Credits: https://x.com/sundarpichai/status/1906484930957193255

    All That Glitters Is Not Gold

    While the Model Context Protocol (MCP) presents a compelling vision, its initial design appears to prioritize functionality at the expense of a robust security framework. The current specifications leave key implementation details ambiguous, such as mandating OAuth 2.1 for authorization—a protocol that has yet to see wide industry adoption. Furthermore, researchers have identified critical risks, including prompt injection, the potential exposure of credentials from MCP servers, and supply chain attacks via malicious third-party tools. As the ecosystem evolves to mitigate these threats, security must remain a paramount consideration for developers adopting the protocol.

    P.S. – This is a high-level opinion on MCP. Stay tuned for future articles with actual technical examples and descriptions!

  • Why Unraveled Strands

    I’ve always used coding as my main creative outlet. For me, designing a system or writing clean code is its own form of creativity. It’s about solving problems and building something useful.

    I find it’s often easier to express my ideas through code than through conversation. This blog is a new way for me to do that—a place to share my thoughts on software engineering, bioinformatics, and AI in a more structured way.

    The field is changing quickly, especially with AI. This blog will be my way of documenting what I’m learning as I navigate these changes.

    A quick note on my process: I use AI as a writing tool. The AI is the tool, not the engineer. All the ideas, opinions, and technical insights are my own, based on my experience; the AI just helps clean up the writing.

    Putting my work out there is a new challenge for me, but I believe it’s a good way to get feedback and learn from others.

    This blog is about the process of building software for biology. I hope you find it useful.