• MS built a fake online ma

    From Mike Powell@1:2320/105 to All on Sunday, November 09, 2025 10:28:15
    Microsoft built a fake online marketplace to see how its AI agents would work selling unsupervised - and let's just say the results were... unsurprising

    Date:
    Sat, 08 Nov 2025 19:34:00 +0000

    Description:
    Microsoft's Magentic Marketplace shows AI tools still cannot reliably act independently in complex multi-agent simulations.

    FULL STORY

    A new Microsoft study has raised questions about the current suitability of AI agents operating without full human supervision.

    The company recently built a synthetic environment, the "Magentic Marketplace", designed to observe how AI agents perform in unsupervised situations.

    The project took the form of a fully simulated ecommerce platform which
    allowed researchers to study how AI agents behave as customers and businesses
    - with possibly predictable results.

    Testing the limits of current AI models

    The project included 100 customer-side agents interacting with 300 business-side agents, giving the team a controlled setting to test agent decision-making and negotiation skills.

    The source code for the marketplace is open source; therefore, other researchers can adopt it to reproduce experiments or explore new variations.

    Ece Kamar, CVP and managing director of Microsoft Research's AI Frontiers Lab, noted this research is vital for understanding how AI agents collaborate and make decisions.

    The initial tests used a mix of leading models, including GPT-4o, GPT-5, and Gemini-2.5-Flash.

    The results were not entirely unexpected, as several models showed
    weaknesses.

    Customer agents could easily be influenced by business-side agents into selecting products, revealing potential vulnerabilities when agents interact
    in competitive environments.

    The agents' efficiency dropped sharply when faced with too many options, overwhelming their attention span and leading to slower or less accurate decisions.

    AI agents also struggled when asked to work toward shared goals, as the
    models were often unsure which agent should take on which role, reducing their effectiveness in joint tasks.

    Their performance improved only when they were given step-by-step instructions.

    "We can instruct the models - like we can tell them, step by step. But if we
    are inherently testing their collaboration capabilities, I would expect these models to have these capabilities by default," Kamar noted.

    The results show AI tools still need substantial human guidance to function effectively in multi-agent environments.

    Although AI agents are often promoted as capable of independent decision-making
    and collaboration, the results show unsupervised agent behavior remains unreliable,
    so humans must improve coordination mechanisms and add safeguards against AI manipulation.

    Microsoft's simulation shows that AI agents remain far from operating independently in competitive or collaborative scenarios and may never achieve full autonomy.

    ======================================================================
    Link to news story: https://www.techradar.com/pro/microsoft-built-a-fake-marketplace-to-see-how-its-ai-agents-would-work-selling-unsupervised-and-lets-just-say-the-results-were-unsurprising

    --- SBBSecho 3.28-Linux
    * Origin: capitolcityonline.net * Telnet/SSH:2022/HTTP (1:2320/105)