In this article, we will discuss OpenAGI 0.2, the latest release of the framework.
What is OpenAGI and why do we need it?
In the era of AI, we constantly witness how our interactions with Large Language Models (LLMs) evolve and change. While LLMs have been pivotal for tasks such as answering questions on various topics, writing emails and code, and drafting reports, they still lack a crucial element: Autonomy.
To address this, we introduced OpenAGI a few months ago, leveraging the capabilities of human-like agents that can independently plan, use reasoning, and make decisions to perform user-assigned tasks. These agents are not only intelligent but also capable of learning and adapting to their environment and from human feedback.
Key highlights of the latest release
Workers — Workers are specialized classes tasked with executing the assignments given by the “Admin” class. They use tools such as internet search engines, LLMs, and document writers to complete complex tasks like researching and writing blog posts. Using Workers, one can build a multi-agent architecture.
New LLM integrations — Added support for Groq, an inference engine designed for applications requiring low latency and rapid responses. We have also added an integration for Gemini 1.5, developed by Google, which supports a context window of up to 1 million tokens.
Action tools support — Actions provide predefined functionalities that the Agent can invoke to accomplish various tasks. In this release, document loaders for text and CSV files have been added. We have also added support for the UnstructuredPdfLoaderAction, which extracts content from PDFs along with their metadata. Additionally, we have added a GitHub Search tool to the actions to load GitHub files.
Added use cases — Examples for the document loaders, search, and other tools have been added.
For more details, keep scrolling ⬇️!
What’s new in OpenAGI?
Workers
Workers are specialized classes tasked with executing the assignments given by the “Admin” class. They use tools such as internet search engines, LLMs, and document writers to complete complex tasks like researching and writing blog posts. These tasks often depend on one another. For instance, to write a blog post, the researcher (worker 1) first gathers information on the post’s subject, and then the writer (worker 2) compiles this information into the article. Hence, the writer’s task can only begin once the researcher has completed theirs.
Workers Implementation
Let’s start by installing the framework and importing the necessary modules. Next, we’ll set up the environment variables required to configure the large language model.
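The installation and setup code from the original post was shown as screenshots; here is a minimal sketch, assuming the package is published as `openagi` (`pip install openagi`) and that an OpenAI-backed LLM is used (the Groq and Gemini variables described later in this post follow the same environment-variable pattern):

```python
import os

# API key for the LLM backend. OPENAI_API_KEY is one example; the
# GROQ_* and Gemini_* variables described later work the same way.
os.environ["OPENAI_API_KEY"] = "sk-your-key-here"  # placeholder value
```

With the environment configured, the framework's `Admin`, `Worker`, and LLM classes can be imported and wired together as shown in the following sections.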
We define the role of each worker to establish their responsibilities. Additionally, we provide instructions, which describe how the LLM should behave in its role. These instructions can also include backstory and other relevant details to aid in generating the desired output.
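A sketch of two such Workers, assuming the `Worker` class and the `DuckDuckGoSearch` action path shown in the OpenAGI documentation (exact module paths may differ between versions):

```python
# Module paths and the Worker signature are assumptions based on the
# OpenAGI docs; verify against the version you have installed.
from openagi.worker import Worker
from openagi.actions.tools.ddg_search import DuckDuckGoSearch

researcher = Worker(
    role="Researcher",
    instructions=(
        "Search the web for up-to-date information on the assigned topic "
        "and summarize the key findings along with their sources."
    ),
    actions=[DuckDuckGoSearch],  # tool this worker may invoke
)

writer = Worker(
    role="Writer",
    instructions=(
        "Use the researcher's notes to draft a well-structured blog post "
        "in a clear, engaging tone."
    ),
    actions=[],  # the writer only needs the LLM itself
)
```

The `instructions` string is where backstory and other role-specific details go, shaping how the LLM behaves in that role.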
The primary component, TaskWorker, provides a structured way to define and execute tasks. The TaskWorker class specializes in executing specific tasks assigned by the planner.
Assign the Workers to the Admin in order, and then run the Admin object.
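A sketch of this final step, assuming the `Admin`, `TaskPlanner`, and `assign_workers` names from the OpenAGI documentation, with `llm`, `researcher`, and `writer` being the objects configured in the previous steps:

```python
# Admin and TaskPlanner signatures are assumptions based on the
# OpenAGI docs; verify against your installed version.
from openagi.agent import Admin
from openagi.planner.task_decomposer import TaskPlanner

admin = Admin(
    llm=llm,  # the LLM configured via environment variables earlier
    planner=TaskPlanner(human_intervene=False),
)

# Order matters: the writer depends on the researcher's output.
admin.assign_workers([researcher, writer])

result = admin.run(
    query="Write a blog post about the latest trends in open-source LLMs.",
    description="Research the topic first, then draft the article.",
)
print(result)
```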
Refer to the documentation for more details.
New LLMs — Groq and Gemini
Groq is an inference engine designed for applications requiring low latency and rapid responses. It serves open-source models such as Mistral, Gemma, and Llama 2 at hundreds of tokens per second, faster than typical GPU-based serving. This speed is attributed to its custom LPU (Language Processing Unit) inference engine.
To access the Groq model in OpenAGI, users need to set the Groq API key in the “GROQ_API_KEY” environment variable, the model name in the “GROQ_MODEL” environment variable, and the temperature in the “GROQ_TEMP” environment variable. These environment variables are then used to generate the configuration necessary to define the LLM.
Then, we can define and configure the LLM based on the information provided.
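A sketch of the Groq setup, using the environment-variable names stated above; the `GroqModel` class name and its `load_from_env_config` method are assumptions based on the pattern OpenAGI's other LLM integrations follow:

```python
import os

# Environment variables read by the Groq integration, per the release notes.
os.environ["GROQ_API_KEY"] = "gsk-your-key"      # placeholder key
os.environ["GROQ_MODEL"] = "mixtral-8x7b-32768"  # example model name
os.environ["GROQ_TEMP"] = "0.5"

# Class and method names assumed by analogy with OpenAGI's other
# LLM classes (config loaded from the environment, then passed in).
from openagi.llms.groq import GroqModel

config = GroqModel.load_from_env_config()
llm = GroqModel(config=config)
```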
We have also added support for Gemini from Google, which offers capabilities such as larger context length and multimodality. To implement Gemini LLM in OpenAGI, users need to define the environment variables: set the Google API key in the “GOOGLE_API_KEY” environment variable, the model name in the “Gemini_MODEL” environment variable, and the temperature in the “Gemini_TEMP” environment variable.
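The Gemini setup follows the same pattern; the mixed-case variable names below are the ones given in the release notes, while the `GeminiModel` class and module path are assumptions by analogy with the other integrations:

```python
import os

# Environment variables for the Gemini integration, per the release notes.
os.environ["GOOGLE_API_KEY"] = "your-google-api-key"  # placeholder key
os.environ["Gemini_MODEL"] = "gemini-1.5-pro"         # example model name
os.environ["Gemini_TEMP"] = "0.3"

# Class and module names assumed by analogy with the Groq integration.
from openagi.llms.gemini import GeminiModel

config = GeminiModel.load_from_env_config()
llm = GeminiModel(config=config)
```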
Document Loaders
We have added text and CSV loaders that read the content of documents along with their metadata. To use a document loader, we only need to specify the file path; the loader automatically recognizes whether the file is a text document or a CSV file.
Here’s one use case using the Document Loader:
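The use case in the original post was shown as an image; the hypothetical sketch below reflects what the article states — the loader takes only a file path and auto-detects text vs. CSV. The `DocumentLoader` class name, its module path, and how the file path reaches the action are all assumptions:

```python
# Hypothetical: class name and module path are assumptions; only the
# "file path in, auto-detected content out" behavior is stated
# in the article.
from openagi.actions.tools.document_loader import DocumentLoader
from openagi.agent import Admin
from openagi.planner.task_decomposer import TaskPlanner

admin = Admin(
    llm=llm,  # any configured LLM, e.g. the Groq or Gemini model above
    actions=[DocumentLoader],
    planner=TaskPlanner(human_intervene=False),
)

# The file path is assumed to be supplied through the task description.
result = admin.run(
    query="Summarize the attached document.",
    description="Load data.csv and summarize its contents.",
)
```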
Similar to the document loaders, the UnstructuredPdfLoaderAction requires a file_path, while the GitHubFileLoadAction requires a repo (in the username/repository format), a directory (in a format like src/openagi/actions), and a file extension (which can be “.py” or “.md”). Both tools can be imported from openagi.actions.tools.
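The parameters each action expects can be illustrated as plain dictionaries (the keys follow the parameter names given above; the values are placeholders, not output from either tool):

```python
# Parameters for UnstructuredPdfLoaderAction: just a file path.
pdf_args = {"file_path": "docs/report.pdf"}  # placeholder path

# Parameters for GitHubFileLoadAction.
github_args = {
    "repo": "aiplanethub/openagi",       # username/repository format
    "directory": "src/openagi/actions",  # directory inside the repo
    "extension": ".py",                  # ".py" or ".md"
}
```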
Dive into OpenAGI
Eager to explore OpenAGI and contribute to the project?
Explore the GitHub repo: GitHub - aiplanethub/openagi: Paving the way for open agents and AGI for all.
Also, feel free to open issues, make pull requests to contribute, and don’t forget to ⭐️ the repo to stay updated with the latest developments.
OpenAGI documentation: Introduction | v.0.2.0 | OpenAGI
By collaborating with the community, we can continue to improve OpenAGI and drive innovation in the AI field.
You can reach out to us on the Discord: Join the AI Planet Discord Server!