With Large Language Models (LLMs) like ChatGPT, OpenAI has seen a surge in enterprise and user adoption, currently bringing in around $80 million in monthly revenue. According to a recent report by The Information, the San Francisco-based company is on pace to hit $1 billion in annual revenue.
Last time, we delved into AutoGPT and GPT-Engineering, the early mainstream open-source LLM-based AI agents designed to automate complex tasks. While promising, these systems had their fair share of issues: inconsistent results, performance bottlenecks, and limitations in handling multifaceted demands. They show proficiency in code generation, but their capabilities often stop there; they lack critical project management functionalities such as PRD generation, technical design generation, and API interface prototyping.
Enter MetaGPT, a multi-agent framework from Sirui Hong and collaborators that fuses Standardized Operating Procedures (SOPs) with LLM-based multi-agent systems. This emerging paradigm addresses the existing limitations of LLMs in fostering effective collaboration and task decomposition in complex, real-world applications.
The beauty of MetaGPT lies in its structure. It capitalizes on meta-programming techniques to manipulate, analyze, and transform code in real time. The aim? To actualize an agile, flexible software architecture that can adapt to dynamic programming tasks.
SOPs act as a meta-function here, coordinating agents to auto-generate code based on defined inputs. In simple terms, it’s as if you’ve turned a highly coordinated team of software engineers into an adaptable, intelligent software system.
Understanding MetaGPT Framework
Foundational & Collaboration Layers
MetaGPT’s architecture is divided into two layers: the Foundational Components Layer and the Collaboration Layer.
- Foundational Components Layer: This layer focuses on individual agent operations and facilitates system-wide information exchange. It introduces core building blocks such as Environment, Memory, Roles, Actions, and Tools. The Environment sets the stage for shared workspaces and communication pathways, while Memory serves as the historical data archive. Roles encapsulate domain-specific expertise, Actions execute modular tasks, and Tools offer common services. This layer essentially serves as the operating system for the agents. More details on how these work together are available in the article ‘Beyond ChatGPT; AI Agent: A New World of Workers’.
- Collaboration Layer: Built on top of foundational components, this layer manages and streamlines the collaborative efforts of individual agents. It introduces two mechanisms: Knowledge Sharing and Encapsulating Workflows.
- Knowledge Sharing: This acts as the collaborative glue that binds agents together. Agents can store, retrieve, and share information at varying levels, therefore reducing redundancy and enhancing operational efficiency.
- Encapsulating Workflows: This is where Standardized Operating Procedures (SOPs) come into play. SOPs act as blueprints that break down tasks into manageable components. Agents are assigned these sub-tasks, and their performance is aligned with standardized outputs.
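To make these two mechanisms concrete, here is a minimal, hypothetical Python sketch of how shared knowledge and an SOP-style workflow might be wired together. The class and step names are illustrative assumptions, not MetaGPT's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    role: str      # which agent produced it, e.g. "ProductManager"
    content: str   # the artifact, e.g. a PRD or a design document

@dataclass
class SharedMemory:
    """Knowledge sharing: a common pool every agent can read from and write to."""
    messages: list = field(default_factory=list)

    def publish(self, msg: Message):
        self.messages.append(msg)

    def retrieve(self, role: str):
        return [m for m in self.messages if m.role == role]

# Encapsulating workflows: an SOP expressed as an ordered list of (role, task) steps.
SOP = [
    ("ProductManager", "Write a PRD for the requested product"),
    ("Architect", "Produce a system design from the PRD"),
    ("Engineer", "Implement code that satisfies the design"),
    ("QAEngineer", "Review and test the generated code"),
]

def run_sop(memory: SharedMemory):
    # Each agent sees all prior artifacts before producing its own output.
    for role, task in SOP:
        prior = len(memory.messages)
        output = f"[{role}] completed: {task} (given {prior} prior artifacts)"
        memory.publish(Message(role, output))

memory = SharedMemory()
run_sop(memory)
print(memory.retrieve("Architect")[0].content)
```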
MetaGPT also uses “Role Definitions” to initiate various specialized agents such as Product Managers, Architects, etc. as we discussed above. These roles are characterized by key attributes like name, profile, goal, constraints, and description.
Furthermore, “Anchor Agents” provide role-specific guidance to these agents. For example, a Product Manager’s role might be initialized with the constraint of “efficiently creating a successful product.” Anchor agents ensure that agents’ behaviors align with the overarching goals, thereby optimizing performance.
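As a rough illustration, a role definition along these lines might look like the following Python sketch. The field names mirror the attributes listed above, but the class itself is hypothetical rather than MetaGPT's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class RoleDefinition:
    # Attributes mirroring those described above; hypothetical, not MetaGPT's real class.
    name: str
    profile: str
    goal: str
    constraints: str
    description: str

product_manager = RoleDefinition(
    name="Alice",
    profile="Product Manager",
    goal="Translate user requirements into a clear PRD",
    constraints="Efficiently creating a successful product",  # anchor-style constraint from the text
    description="Owns requirement analysis and product planning",
)
```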
Cognitive Processes in MetaGPT Agents
MetaGPT agents can observe, think, reflect, and act. They operate through specific behavioral functions such as `_think()`, `_observe()`, and `_publish_message()`. This cognitive modeling equips the agents to be active learners that can adapt and evolve; a minimal sketch of this loop follows the list below.
- Observe: Agents scan their environment and incorporate key data into their Memory.
- Think & Reflect: Through the `_think()` function, roles deliberate before undertaking actions.
- Broadcast Messages: Agents use `_publish_message()` to share current task statuses and related action records.
- Knowledge Precipitation & Act: Agents assess incoming messages and update their internal repositories before deciding on the next course of action.
- State Management: With features like task locking and status updating, roles can process multiple actions sequentially without interruption, mirroring real-world human collaboration.
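Below is a stripped-down, hypothetical sketch of that observe/think/act loop. The method names follow the functions mentioned above, but the control flow is simplified for illustration and is not the framework's actual code.

```python
class Agent:
    """Hypothetical sketch of the observe -> think -> act loop described above."""

    def __init__(self, role: str):
        self.role = role
        self.memory = []   # observed messages
        self.todo = None   # the next action decided on

    def _observe(self, environment: list):
        # Pull new messages from the shared environment into memory.
        new = [m for m in environment if m not in self.memory]
        self.memory.extend(new)
        return new

    def _think(self):
        # Deliberate: decide the next action based on what has been observed.
        self.todo = f"respond to {len(self.memory)} message(s)"

    def _publish_message(self, environment: list):
        # Broadcast the result of the action back to the environment.
        environment.append(f"{self.role}: {self.todo} done")

    def run(self, environment: list):
        if self._observe(environment):
            self._think()
            self._publish_message(environment)

env = ["User: build a CLI rock-paper-scissors game"]
Agent("Engineer").run(env)
print(env[-1])
```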
Code-Review Mechanisms for MetaGPT
Code review is a critical component in the software development life cycle, yet it is absent in several popular frameworks. Both MetaGPT and AgentVerse support code review capabilities, but MetaGPT goes a step further. It also incorporates precompilation execution, which aids in early error detection and subsequently elevates code quality. Given the iterative nature of coding, this feature is not just an add-on but a requirement for any mature development framework.
Quantitative experiments conducted across several tasks revealed that MetaGPT outperformed its counterparts in almost every scenario. Pass@1 measures a framework’s ability to generate accurate code in a single iteration, which offers a realistic reflection of its utility in a practical setting: a higher Pass@1 rate means less debugging and more efficiency, directly impacting development cycles and costs. When stacked against other advanced code generation tools such as CodeX, CodeT, and even GPT-4, MetaGPT outperforms them all, achieving Pass@1 rates of 81.7% and 82.3% on the HumanEval and MBPP benchmarks, respectively.
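For context, Pass@k is usually estimated with the unbiased estimator popularized by the HumanEval paper. The small Python helper below makes the calculation concrete; the per-problem sample numbers are made up purely for illustration.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k sampled
    completions (out of n generated, c of which are correct) passes the tests."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative only: one completion per problem (n=1), so pass@1 reduces to
# the fraction of problems solved on the first try.
correct_per_problem = [1, 1, 0, 1]  # hypothetical per-problem outcomes
score = sum(pass_at_k(1, c, 1) for c in correct_per_problem) / len(correct_per_problem)
print(score)  # 0.75
```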
The framework also uses fewer tokens and computational resources, achieving a high success rate at a fraction of traditional software engineering costs. The data indicated an average cost of just $1.09 per project with MetaGPT, a fraction of what a developer would charge for the same task.
Steps to Install MetaGPT Locally on Your System
NPM, Python Installation
- Check & Install NPM: First things first, ensure NPM is installed on your system. If it’s not, you’ll need to install Node.js. To check whether you have npm, run `npm --version` in your terminal. If you see a version number, you’re good to go.
- Install mermaid-js: To install `mermaid-js`, a dependency for MetaGPT, run `sudo npm install -g @mermaid-js/mermaid-cli` or `npm install -g @mermaid-js/mermaid-cli`.
- Verify Python Version: Ensure that you have Python 3.9 or above. To check your Python version, open your terminal and type `python --version`. If you’re not up to date, download the latest version from the official Python website.
- Clone MetaGPT Repository: Clone the MetaGPT GitHub repository with `git clone https://github.com/geekan/metagpt`. Make sure you have Git installed on your system for this step.
- Navigate to Directory: Once cloned, move into the MetaGPT directory with `cd metagpt`.
- Installation: Execute the Python setup script to install MetaGPT with `python setup.py install`.
- Create an Application: Run `python startup.py "ENTER-PROMPT" --code_review True`.
Note:
- Your new project should now be in the `workspace/` directory.
- `--code_review True` allows the GPT model to perform extra operations that help ensure the code runs accurately, but note that it will cost more.
- If you encounter a permission error during installation, try running `python setup.py install --user` as an alternative.
- For access to specific releases and further details, visit the official MetaGPT GitHub releases page: MetaGPT Releases.
Docker Installation
For those who prefer containerization, Docker simplifies the process:
- Pull the Docker Image: Download the MetaGPT official image and prepare the configuration file:
docker pull metagpt/metagpt:v0.3.1
mkdir -p /opt/metagpt/{config,workspace}
docker run --rm metagpt/metagpt:v0.3.1 cat /app/metagpt/config/config.yaml > /opt/metagpt/config/key.yaml
vim /opt/metagpt/config/key.yaml
- Run the MetaGPT Container: Execute the container with the following command:
docker run --rm --privileged \
-v /opt/metagpt/config/key.yaml:/app/metagpt/config/key.yaml \
-v /opt/metagpt/workspace:/app/metagpt/workspace \
metagpt/metagpt:v0.3.1 \
python startup.py "Create a simple and interactive CLI based rock, paper and scissors game" --code_review True
Configuring MetaGPT with Your OpenAI API Key
After the initial setup, you’ll need to integrate MetaGPT with your OpenAI API Key. Here are the steps to do so:
- Locate or Generate Your OpenAI Key: You can find this key in your OpenAI Dashboard under API settings.
- Set the API Key: You can place the API key in `config/key.yaml` or `config/config.yaml`, or set it as an environment variable (`env`). The precedence order is `config/key.yaml > config/config.yaml > env`.
- To set the key, navigate to `config/key.yaml` and replace the placeholder text with your OpenAI key: `OPENAI_API_KEY: "sk-..."`
Remember to safeguard your OpenAI API Key. Never commit it to a public repository or share it with unauthorized individuals.
Use-Case Illustration
I gave MetaGPT the objective of developing a CLI-based rock, paper, scissors game, and it successfully executed the task.
Below is a video that showcases the actual run of the generated game code.
MetaGPT Demo Run
MetaGPT provided a system design document in Markdown, a commonly used lightweight markup language. This Markdown file was replete with UML diagrams, thereby offering a granular view of the architectural blueprint. Moreover, API specifications were detailed with HTTP methods, endpoints, request/response objects, and status codes.
The class diagram details the attributes and methods of our `Game` class, providing an abstraction that is easy to understand. It even visualizes the call flow of the program, effectively turning abstract ideas into tangible steps.
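To give a feel for what such a class diagram typically captures, here is a hypothetical outline of a `Game` class for a rock, paper, scissors CLI. The attribute and method names are illustrative only, not the actual code MetaGPT generated in this run.

```python
import random

class Game:
    """Hypothetical rock-paper-scissors Game class, illustrating the kind of
    attributes and methods a generated class diagram might describe."""

    CHOICES = ("rock", "paper", "scissors")
    BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

    def __init__(self):
        self.player_score = 0
        self.computer_score = 0

    def play_round(self, player_choice: str) -> str:
        computer_choice = random.choice(self.CHOICES)
        if player_choice == computer_choice:
            return f"Tie: both chose {player_choice}"
        if self.BEATS[player_choice] == computer_choice:
            self.player_score += 1
            return f"You win: {player_choice} beats {computer_choice}"
        self.computer_score += 1
        return f"You lose: {computer_choice} beats {player_choice}"

game = Game()
print(game.play_round("rock"))
```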
Not only does this significantly reduce the manual overhead in planning, but it also accelerates the decision-making process, ensuring that your development pipeline remains agile. With MetaGPT, you’re not just automating code generation, you’re automating intelligent project planning, thus providing a competitive edge in rapid application development.
Conclusion: MetaGPT—Revolutionizing Software Development
MetaGPT redefines the landscape of generative AI and software development, offering a seamless blend of intelligent automation and agile project management. Far surpassing the capabilities of ChatGPT, AutoGPT, and traditional LangChain pipelines, it excels in task decomposition, efficient code generation, and project planning.
Here are the key takeaways from this article:
- The Power of Meta-Programming: By employing meta-programming, MetaGPT provides an agile and adaptive software framework. It transcends the narrow functionality of legacy tools and introduces a transformative approach that handles not just coding, but project management and decision-making aspects as well.
- Two-Layered Architecture: With its foundational and collaborative layers, MetaGPT effectively creates a synergistic ecosystem where agents can work cohesively, akin to an expertly managed software team.
- Optimized Code Review: Beyond just generating code, MetaGPT offers precompilation execution features, which is essentially an early-warning system for errors. This not only saves debugging time but also assures code quality.
- Cognitive Agents: MetaGPT’s intelligent agents, replete with cognitive functions like `_observe()`, `_think()`, and `_publish_message()`, evolve and adapt, ensuring your software solution isn’t just coded but is ‘intelligent.’
- Installation & Deployment: We’ve illustrated that MetaGPT can be easily set up, whether you prefer a local installation via npm and Python, or containerization via Docker.