Oracle DB 23ai is available for Exadata, and I've been spending a lot of time building some demos in my lab environment. Below is the architecture.
To help you get started, below are the pre-steps I took to create this demo.
- Download and install DB 23ai (the latest version, which was 23.6 when I created my demo).
- Install APEX within the database. Most existing demos use APEX, and it makes it easy to build a simple application. Here is a link to a blog I used that explains the install process and the ORDS setup for the webserver.
- Optional - Install an embedding model in your database to convert text into its vector representation. Here is a link to how to do this. You can also use an external model with Ollama. A sketch of the in-database option is shown after this list.
- Optional - Install DBMS_CLOUD to access object storage. Most demos access object storage to read in documents. Here is a link to my blog on how to install it. I actually used ZFS for my object storage after installing DBMS_CLOUD, but you can use OCI or even a PAR against any object storage. A sketch of the credential and object-listing calls is also shown after this list.
- Install Ollama. Ollama is used to host the LLM, and within Ollama you can download any open-source model. For my demo, I downloaded and installed llama3.2.
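To illustrate the optional in-database embedding step, here is a minimal sketch, assuming an ONNX embedding model has already been staged in a database directory. The directory, file, and model names are placeholders, not the names used in the demo.

```sql
-- Load an ONNX embedding model staged in a database directory
-- (MODEL_DIR, the file name, and DOC_EMBEDDING_MODEL are placeholders).
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    directory  => 'MODEL_DIR',
    file_name  => 'all_MiniLM_L12_v2.onnx',
    model_name => 'DOC_EMBEDDING_MODEL');
END;
/

-- Convert a piece of text into its vector representation with the in-database model.
SELECT DBMS_VECTOR.UTL_TO_EMBEDDING(
         'Sample text to embed',
         JSON('{"provider":"database", "model":"DOC_EMBEDDING_MODEL"}')) AS embedding
FROM   dual;
```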
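Likewise, for the optional DBMS_CLOUD step, the calls below are a rough sketch of creating a credential and listing the documents in a bucket. The credential values and the URI are placeholders and will differ depending on whether you point at ZFS, OCI, or a PAR (a PAR needs no credential at all).

```sql
-- Create a credential for the object store (user name and auth token are placeholders).
BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'OBJ_STORE_CRED',
    username        => 'myuser',
    password        => 'my_auth_token');
END;
/

-- List the documents in a bucket; the URI below is a placeholder and its format
-- depends on the object store (ZFS, OCI Object Storage, or a PAR).
SELECT object_name, bytes
FROM   DBMS_CLOUD.LIST_OBJECTS(
         credential_name => 'OBJ_STORE_CRED',
         location_uri    => 'https://objectstorage.example.com/n/mynamespace/b/mybucket/o/');
```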
The demo I started with was the Texas Legislation demo, which can be found here. This link points to a video showing the demo, and within the description is a link to the code and instructions on how to recreate the demo in your environment, which are located on GitHub.
The majority of the application is written in APEX and can be downloaded using the instructions on GitHub, which can be found here.
The major changes I had to make to get this demo working on-premises had to do with using Ollama rather than accessing OCI for the LLM.
Documentation for using Ollama can be found here.
The biggest challenge was the LLM calls. The embedding and document search used the same DBMS_VECTOR calls regardless of the model, along the lines of the similarity query sketched below. The demo, however, uses DBMS_CLOUD.send_request, which does not support Ollama.
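The unchanged part looked roughly like this: a plain similarity query against the vector column. The table, column, and model names here are illustrative, not the actual names from the demo.

```sql
-- Hypothetical table, column, and model names; the point is that the document
-- search is ordinary SQL against a VECTOR column and is independent of the LLM.
SELECT doc_id, chunk_text
FROM   doc_chunks
ORDER  BY VECTOR_DISTANCE(
            chunk_vector,
            DBMS_VECTOR.UTL_TO_EMBEDDING(
              'What does the bill change?',
              JSON('{"provider":"database", "model":"DOC_EMBEDDING_MODEL"}')),
            COSINE)
FETCH FIRST 4 ROWS ONLY;
```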
I changed the functions to call DBMS_VECTOR_CHAIN.UTL_TO_GENERATE_TEXT instead, and I built a "prompt" instead of a message. This is outlined below, and a sketch of the new call follows the table.
Description | Demo request | Ollama request |
---|---|---|
Call LLM with chat history and results | dbms_cloud.send_request with a message containing: Question | DBMS_VECTOR_CHAIN.UTL_TO_GENERATE_TEXT with a prompt containing: Question, Chat History, Context |
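Here is a sketch of the replacement call. The prompt layout matches the table above; the Ollama URL and model name come from my lab setup, and the JSON parameter names follow the Ollama provider documentation linked earlier, so adjust them for your environment.

```sql
DECLARE
  l_question CLOB := 'What does the bill change?';                      -- user question
  l_history  CLOB := 'Previous questions and answers go here.';         -- chat history
  l_context  CLOB := 'Document chunks returned by the vector search.';  -- RAG context
  l_prompt   CLOB;
  l_response CLOB;
  -- Ollama connection parameters from my lab; change the URL and model as needed.
  l_params   CLOB := '{
    "provider": "ollama",
    "host"    : "local",
    "url"     : "http://localhost:11434/api/generate",
    "model"   : "llama3.2"
  }';
BEGIN
  -- Build a single prompt instead of the message payload the original demo
  -- sent through DBMS_CLOUD.send_request.
  l_prompt := 'Question: '     || l_question || CHR(10) ||
              'Chat History: ' || l_history  || CHR(10) ||
              'Context: '      || l_context;

  l_response := DBMS_VECTOR_CHAIN.UTL_TO_GENERATE_TEXT(
                  data   => l_prompt,
                  params => JSON(l_params));

  DBMS_OUTPUT.PUT_LINE(l_response);
END;
/
```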
SUMMARY: This RAG demo is a great place to start learning how to create a RAG architecture, and with just a few changes many of the demos created for Autonomous Database can be used on-premises as well!