OpenAGI 0.2 Release: Workers, New Actions and LLM Integrations

Jun 19, 2024

In this article, we will discuss the latest release of OpenAGI 0.2.

What is OpenAGI and why do we need it?

In the era of AI, we constantly witness how our interactions with Large Language Models (LLMs) evolve and change. While LLMs have been pivotal for tasks such as answering questions on various topics, writing emails and code, and drafting reports, they still lack a crucial element: Autonomy.

To address this, we introduced OpenAGI a few months ago, leveraging the capabilities of human-like agents that can independently plan, use reasoning, and make decisions to perform user-assigned tasks. These agents are not only intelligent but also capable of learning and adapting to their environment and from human feedback.

Key highlights of the latest release

Workers — Workers are specialized classes tasked with executing the assignments given by the “Admin” class. They use tools such as internet search engines, LLMs, and document writers to complete their tasks for complex tasks like researching and writing blogs. Using Workers, one can build a multi-agent architecture framework.
New LLM integrations — Added support for Groq, which is an inference engine specifically designed for applications requiring low latency and rapid responses. We have also added Gemini 1.5 LLM integration, developed by Google, which has the support of 1.5 million context length window.
Action tools support — Actions provide predefined functionalities that the Agent can invoke to accomplish various tasks. In this release, we have added the support to the Document loaders like text loader and CSV loader have been added in the latest release of OpenAGI. We have also added support for the UnstructuredPdfLoaderAction, which helps to extract the content from PDFs including its metadata. Additionally, we have added the GitHub Search tool in the actions to load GitHub files.
Added usecases — Examples for Document loader, Search, and other tools have been added.

Explore the usecases

For more details, keep scrolling ⬇️!

What’s new in OpenAGI?

Workers

Workers are specialized classes tasked with executing the assignments given by the “Admin” class. They use tools such as internet search engines, LLMs, and document writers to complete their tasks for complex tasks like researching and writing blogs. However, these tasks are mostly dependent. For instance, to write a blog, the researcher (worker 1) first gathers information on the blog’s subject, and then the writer (worker 2) compiles this information into the blog article. Hence, the writer’s task can only begin once the researcher has completed theirs.

Workers Implementation

Let’s start by installing the framework and importing the necessary modules. Next, we’ll set up the environment variables required to configure the large language model.

pip install openagi

from openagi.actions.files import WriteFileAction
from openagi.actions.tools.ddg_search import DuckDuckGoNewsSearch
from openagi.actions.tools.webloader import WebBaseContextTool
from openagi.agent import Admin
from openagi.llms.azure import AzureChatOpenAIModel
from openagi.memory import Memory
from openagi.planner.task_decomposer import TaskPlanner
from openagi.worker import Worker

import os

os.environ['AZURE_BASE_URL'] = '<base-url>'
os.environ['AZURE_DEPLOYMENT_NAME'] = '<deployment-name>'
os.environ['AZURE_MODEL_NAME'] = '<model-name>'
os.environ['AZURE_OPENAI_API_VERSION'] = '<openai-api-version>'
os.environ['AZURE_OPENAI_API_KEY'] = '<api-key>'

config = AzureChatOpenAIModel.load_from_env_config()
llm = AzureChatOpenAIModel(config=config)

We define the role of each worker to establish their responsibilities. Additionally, we provide instructions, which describe how the LLM should behave in its role. These instructions can also include backstory and other relevant details to aid in generating the desired output.

# define the role and instructions to each worker in granular 

researcher = Worker(
    role="Researcher",
    instructions="sample instruction.",
    actions=[
        DuckDuckGoNewsSearch,
        WebBaseContextTool,
    ],
)
writer = Worker(
    role="Writer",
    instructions="sample instruction.",
    actions=[
        DuckDuckGoNewsSearch,
        WebBaseContextTool,
    ],
)
reviewer = Worker(
    role="Reviewer",
    instructions="sample instruction.",
    actions=[
        DuckDuckGoNewsSearch,
        WebBaseContextTool,
        WriteFileAction,
    ],
)

The primary components,TaskWorker, provide a structured way to define and execute tasks. The TaskWorker class specializes in executing specific tasks assigned by the planner.

Assign the Workers to the Admin in the order and then run the Admin object.

# define the Admin with Planner, Memory and LLM. Further assign the workers in order
admin = Admin(
    planner=TaskPlanner(human_intervene=False),
    memory=Memory(),
    llm=llm,
)

# Assign sub-tasks to workers
admin.assign_workers([researcher, writer, reviewer])

res = admin.run(
    query="Write a blog post.",
    description="sample description.",
)

Refer to the documentation, here

New LLMs — Groq and Gemini

Groq is an inference engine specifically designed for applications requiring low latency and rapid responses. It uses open-source models such as Mistral, Gemma, and Llama 2, delivering hundreds of tokens per second, making it faster than other models. The engine’s superiority is attributed to its use of the LPU (Language Processing Unit) inference engine.

To access the Groq model in OpenAGI, users need to set the Groq API key in the “GROQ_API_KEY” environment variable, the model name in the “GROQ_MODEL” environment variable, and the temperature in the “GROQ_TEMP” environment variable. These environment variables are then used to generate the configuration necessary to define the LLM.

import os
os.environ['GROQ_API_KEY'] = '<groq-api-key>'
os.environ['GROQ_MODEL'] = '<model-name>'
os.environ['GROQ_TEMP'] = '<temperature-randomness>'

Then, we can define and configure the LLM based on the information provided.

config = GroqModel.load_from_env_config()
llm = GroqModel(config=config)

We have also added support for Gemini from Google, which offers capabilities such as larger context length and multimodality. To implement Gemini LLM in OpenAGI, users need to define the environment variables: set the Google API key in the “GOOGLE_API_KEY” environment variable, the model name in the “Gemini_MODEL” environment variable, and the temperature in the “Gemini_TEMP” environment variable.

import os
os.environ['GOOGLE_API_KEY'] = '<google-api-key>'
os.environ['Gemini_MODEL'] = '<model-name>'
os.environ['Gemini_TEMP'] = '<temperature-randomness>'

config = GeminiModel.load_from_env_config()
llm = GeminiModel(config=config)

Document Loaders

We have added text and CSV loaders to load and read the content of documents along with their metadata. To use the document loader in a usecase, we need to specify just the file path. It automatically recognizes whether the file is a text document or a CSV file.

Here’s one usecase using Document Loader:

from openagi.actions.tools.document_loader import DocumentLoader
from openagi.agent import Admin
from openagi.llms.azure import AzureChatOpenAIModel
from openagi.memory import Memory
from openagi.planner.task_decomposer import TaskPlanner
import os

if __name__ == "__main__":    
    config = AzureChatOpenAIModel.load_from_env_config()
    llm = AzureChatOpenAIModel(config=config)

    file_path = input("Select file to make a questionnaire:\n")
    no_of_ques = input("Number of question's you want in questionnaire:\n")

    query = f"""
Help me create a questionnaire based on the information provided in the attached file.
File Path: {file_path}
Number of Questions Needed: {no_of_ques}
Requirements:
- Do not provide any hints in the questions.
- Ensure all questions are distinct from each other.
"""

    admin = Admin(
        llm=llm,
        actions=[DocumentLoader],
        planner=TaskPlanner(human_intervene=False),
        memory=Memory(),
    )

    res = admin.run(
        query=query,
        description="You are an expert AI agent , who is able to understand the information in the file",
    )
    print(res)

Similar to the Document Loaders, the UnstructuredPdfLoaderAction requires the file_path and the GitHubFileLoadAction requires — repo (in the username/repository format), directory (in the format — src/openagi/actions), and the file extension (which can be “.py” or “.md”). Both of the tools can be imported from the openagi.actions.tools.

Dive into OpenAGI

Eager to explore OpenAGI and contribute to the project?

Explore the GitHub repo: GitHub - aiplanethub/openagi: Paving the way for open agents and AGI for all.

Also, feel free to open issues, make pull requests to contribute, and don’t forget to ⭐️ the repo to stay updated with the latest developments.

OpenAGI documentation: Introduction | v.0.2.0 | OpenAGI

By collaborating with the community, we can continue to improve OpenAGI and drive innovation in the AI field.

You can reach out to us on the Discord: Join the AI Planet Discord Server!