[Blog Post 2024] Oracle Database 23ai: AI Vector Search Notes and Labs 1 --- The Lab Environment
Author: Bo Tang
To enable fast retrieval and similarity queries, Oracle AI Vector Search stores embedding vectors in the database and builds indexes on them. Oracle AI Vector Search was built for artificial intelligence (AI) workloads: it lets us query data by semantics rather than only by keywords. Within a single Oracle database system, we can run relational queries on business data and, at the same time, semantic queries on unstructured data. To study embedding vectors and machine learning in depth, we set up a lab environment on Oracle Cloud and Engineered Systems Version 23.5.0.24.07 for this series of blog posts, "AI Vector Search Notes and Labs". The series uses Hugging Face's open all-minilm embedding language model together with Ollama, a framework that runs models behind a local REST endpoint, to drive AI Vector Search.
1. The lab environment:
1) The operating system is Oracle Enterprise Linux 8.10, on a single host attached to storage with 11 LUNs. The host has no Nvidia or AMD GPU.
2) Grid Infrastructure, version Oracle Cloud and Engineered Systems Version 23.5.0.24.07:
[oracle@station1 ~]$ . oraenv
ORACLE_SID = [+ASM] ? +ASM
The Oracle base remains unchanged with value /u01/app/oracle
[oracle@station1 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg    ONLINE  ONLINE       station1                 STABLE
ora.FRA.dg     ONLINE  ONLINE       station1                 STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       station1                 STABLE
ora.asm        ONLINE  ONLINE       station1                 Started,STABLE
ora.ons        ONLINE  ONLINE       station1                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cdb1.db
      1        ONLINE  ONLINE       station1                 Open,HOME=/u01/app/oracle/product/23.5.0/dbhome_1,STABLE
ora.cdb1.pdb1_1.pdb
      1        ONLINE  ONLINE       station1                 READ WRITE,STABLE
ora.cssd
      1        ONLINE  ONLINE       station1                 STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.evmd
      1        ONLINE  ONLINE       station1                 STABLE
--------------------------------------------------------------------------------
3) Database, version Oracle Cloud and Engineered Systems Version 23.5.0.24.07:
The database comprises the root container cdb1 and the pluggable database pdb1_1:
4) Create a local user in the pluggable database pdb1_1:
SQL> conn sys/cloud_4U@pdb1_1 as sysdba
Connected.
SQL> create user hr identified by cloud_4U;
User created.
SQL> grant resource,connect to hr;
Grant succeeded.
SQL> grant unlimited tablespace to hr;
Grant succeeded.
5) Create a directory object in the pluggable database pdb1_1 and grant privileges on it to hr:
SQL> conn sys/cloud_4U@pdb1_1 as sysdba
Connected.
SQL> create directory dir1 as '/home/oracle/dir1';
Directory created.
SQL> grant all on directory dir1 to hr;
Grant succeeded.
SQL> grant db_developer_role to hr;
Grant succeeded.
2. Importing an open-source pre-trained language model into the database:
Although embedding vectors can be generated outside the database with open-source pre-trained models or models you train yourself, we still recommend importing the language model into the database. As long as a model is compatible with the Open Neural Network Exchange (ONNX) standard, it can be imported directly into the database, which makes it convenient to generate embedding vectors with SQL statements. SQL and PL/SQL provide a powerful toolset that first converts documents, images, audio, and other content into chunks, then generates an embedding vector for each chunk.
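The chunk-then-embed pipeline described above can be sketched in a few lines of plain Python. This is only an illustration of fixed-size chunking with overlap, not Oracle's in-database chunking utilities; the chunk_text function and its parameter values are invented for the example.

```python
# A minimal, hypothetical sketch of fixed-size chunking with overlap.
# Oracle performs this step inside the database; this stand-alone
# version only illustrates the idea before each chunk is embedded.
def chunk_text(text, max_chars=400, overlap=50):
    """Split text into chunks of at most max_chars characters,
    where consecutive chunks share `overlap` characters."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break
        start += max_chars - overlap
    return chunks

doc = "Oracle AI Vector Search stores embedding vectors in the database. " * 10
for i, piece in enumerate(chunk_text(doc, max_chars=120, overlap=20)):
    print(i, repr(piece[:40]))
```

Each chunk would then be passed to the embedding model, so the overlap keeps sentences that straddle a chunk boundary searchable from either side.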
1) Querying in the pluggable database pdb1_1 shows that no language model is embedded by default:
[oracle@station1 ~]$ sqlplus /nolog
SQL*Plus: Release 23.0.0.0.0 - for Oracle Cloud and Engineered Systems on Fri Dec 13 04:45:52 2024 Version 23.5.0.24.07
Copyright (c) 1982, 2024, Oracle. All rights reserved.
SQL> conn hr/cloud_4U@pdb1_1
Connected.
SQL> select model_name, algorithm, mining_function from user_mining_models;
no rows selected
2) Download a pre-trained language model: A model in ONNX format can be imported into an Oracle Database 23ai database and used directly, without any post-processing. Hugging Face's all-MiniLM-L12-v2 is such a model. Like all-MiniLM-L6-v2, it produces 384-dimensional embedding vectors; the difference is depth, as the "L12" denotes 12 transformer layers rather than 6. all-MiniLM-L12-v2 is a compact yet powerful sentence-transformer model widely used to optimize a variety of natural language processing (NLP) tasks. According to its documentation, the 12-layer architecture is designed for efficiency and performance and does well at tasks such as sentence similarity and text classification. Compared with larger models, its lightweight design allows faster processing and lower compute requirements, and despite its small size it maintains high accuracy and reliability, making it a solid choice for developers building NLP solutions. After downloading the model, move it into the directory object and unzip it:
[oracle@station1 ~]$ mv all_MiniLM_L12_v2_augmented.zip dir1
[oracle@station1 ~]$ cd dir1
[oracle@station1 dir1]$ unzip all_MiniLM_L12_v2_augmented.zip
Archive:  all_MiniLM_L12_v2_augmented.zip
  inflating: all_MiniLM_L12_v2.onnx
  inflating: README-ALL_MINILM_L12_V2-augmented.txt
[oracle@station1 dir1]$ ls
all_MiniLM_L12_v2_augmented.zip  all_MiniLM_L12_v2.onnx  README-ALL_MINILM_L12_V2-augmented.txt
Use the DBMS_VECTOR package to load the language model into the Oracle database. You must specify the directory object containing the ONNX model file, the file name, and the model name:
SQL> conn hr/cloud_4U@pdb1_1
Connected.
SQL> BEGIN
  2    DBMS_VECTOR.LOAD_ONNX_MODEL(
  3      directory  => 'DIR1',
  4      file_name  => 'all_MiniLM_L12_v2.onnx',
  5      model_name => 'ALL_MINILM_L12_V2');
  6  end;
  7  /
PL/SQL procedure successfully completed.
Querying in the pluggable database pdb1_1 now shows the embedded all-MiniLM-L12-v2 language model:
SQL> select model_name, algorithm, mining_function from user_mining_models;
MODEL_NAME
--------------------------------------------------------------------------------
ALGORITHM                      MINING_FUNCTION
------------------------------ ------------------------------
ALL_MINILM_L12_V2
ONNX                           EMBEDDING
After the model is imported, four tables exist under the local user HR in the pluggable database pdb1_1: DM$P5ALL_MINILM_L12_V2, DM$P8ALL_MINILM_L12_V2, DM$P9ALL_MINILM_L12_V2, and DM$PAALL_MINILM_L12_V2:
The structure of DM$P5ALL_MINILM_L12_V2:
The structure of DM$P8ALL_MINILM_L12_V2:
The structure of DM$P9ALL_MINILM_L12_V2:
The structure of DM$PAALL_MINILM_L12_V2:
Using this model, an embedding vector is successfully generated for the test text "I live in China":
[oracle@station1 ~]$ sqlplus /nolog
SQL*Plus: Release 23.0.0.0.0 - for Oracle Cloud and Engineered Systems on Mon Dec 16 16:07:45 2024 Version 23.5.0.24.07
Copyright (c) 1982, 2024, Oracle. All rights reserved.
SQL> conn hr/cloud_4U@pdb1_1
Connected.
SQL> SELECT VECTOR_EMBEDDING(ALL_MINILM_L12_V2 USING 'I live in China' as DATA) AS embedding;

EMBEDDING
--------------------------------------------------------------------------------
[1.79761201E-002,2.85252575E-002,2.22478658E-002,-5.96921053E-003,...
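Vectors like the one printed above are what similarity search operates on: the database compares them with a distance metric such as cosine distance. The Python sketch below shows the arithmetic behind that comparison; the 3-dimensional vectors are invented stand-ins, not real 384-dimensional model output.

```python
import math

# Cosine distance between two vectors: 0 means the same direction,
# values above 1 mean the vectors point away from each other. The
# tiny vectors below are invented for illustration only.
def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

v_china   = [0.018, 0.028, 0.022]   # made-up stand-ins for embeddings
v_beijing = [0.017, 0.030, 0.020]
v_other   = [-0.40, 0.10, -0.35]

# Semantically close texts should yield the smaller distance.
print(cosine_distance(v_china, v_beijing) < cosine_distance(v_china, v_other))
```

In a later lab, the same comparison would be expressed in SQL over stored vectors rather than computed by hand.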
3. The language model runtime framework:
To reduce hallucinations, Oracle AI Vector Search works with popular runtime frameworks (such as LangChain, Ollama, and LlamaIndex) to perform Retrieval-Augmented Generation (RAG). Ollama is a free, open-source command-line tool that runs language models on the local operating system, and it can be accessed from SQL and PL/SQL.
1) Install Ollama locally (the host needs an Internet connection):
First, download Ollama from https://ollama.com/download:
[root@station1 ~]# wget --no-check-certificate https://ollama.com/install.sh
--2024-12-15 03:26:40--  https://ollama.com/install.sh
Resolving ollama.com (ollama.com)... 34.36.133.15
Connecting to ollama.com (ollama.com)|34.36.133.15|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13269 (13K) [text/html]
Saving to: ‘install.sh’
install.sh 100%[========================================================================>] 12.96K --.-KB/s in 0.02s
2024-12-15 03:26:41 (665 KB/s) - ‘install.sh’ saved [13269/13269]
[root@station1 ~]# ls
anaconda-ks.cfg  Desktop  Documents  Downloads  install.sh  Music  original-ks.cfg  Pictures  Public  Templates  Videos
Then install Ollama as the operating-system root user (note in the output: the environment has no GPU, so Ollama runs in CPU-only mode):
[root@station1 ~]# chmod +x install.sh
[root@station1 ~]# ./install.sh
>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
####################################################################################################### 100.0%
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
WARNING: No NVIDIA/AMD GPU detected. Ollama will run in CPU-only mode.
If you want Ollama to listen on 0.0.0.0, add the OLLAMA_HOST environment line to the [Service] section of the unit file:
[root@station1 ~]# cat /etc/systemd/system/ollama.service
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin"
Environment="OLLAMA_HOST=0.0.0.0:11434"

[Install]
WantedBy=default.target
Reload systemd and restart Ollama:
[root@station1 ~]# systemctl daemon-reload
[root@station1 ~]# systemctl restart ollama
Check the status of the Ollama service:
[root@station1 ~]# systemctl status ollama
● ollama.service - Ollama Service
   Loaded: loaded (/etc/systemd/system/ollama.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2024-12-15 03:48:18 CST; 5min ago
 Main PID: 512822 (ollama)
    Tasks: 9 (limit: 100300)
   Memory: 64.7M
   CGroup: /system.slice/ollama.service
           └─512822 /usr/local/bin/ollama serve

Dec 15 03:48:18 station1.lab.example.com systemd[1]: Started Ollama Service.
Dec 15 03:48:18 station1.lab.example.com ollama[512822]: 2024/12/15 03:48:18 routes.go:1195: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEV>
Dec 15 03:48:18 station1.lab.example.com ollama[512822]: time=2024-12-15T03:48:18.551+08:00 level=INFO source=images.go:753 msg="total blobs: 0"
Dec 15 03:48:18 station1.lab.example.com ollama[512822]: time=2024-12-15T03:48:18.551+08:00 level=INFO source=images.go:760 msg="total unused blobs re>
Dec 15 03:48:18 station1.lab.example.com ollama[512822]: time=2024-12-15T03:48:18.552+08:00 level=INFO source=routes.go:1246 msg="Listening on [::]:11>
Dec 15 03:48:18 station1.lab.example.com ollama[512822]: time=2024-12-15T03:48:18.552+08:00 level=INFO source=common.go:135 msg="extracting embedded f>
Dec 15 03:48:18 station1.lab.example.com ollama[512822]: time=2024-12-15T03:48:18.717+08:00 level=INFO source=common.go:49 msg="Dynamic LLM libraries">
Dec 15 03:48:18 station1.lab.example.com ollama[512822]: time=2024-12-15T03:48:18.717+08:00 level=INFO source=gpu.go:221 msg="looking for compatible G>
Dec 15 03:48:18 station1.lab.example.com ollama[512822]: time=2024-12-15T03:48:18.721+08:00 level=INFO source=gpu.go:386 msg="no compatible GPUs were >
Dec 15 03:48:18 station1.lab.example.com ollama[512822]: time=2024-12-15T03:48:18.721+08:00 level=INFO source=types.go:123 msg="inference compute" id=>
Check the port Ollama is listening on:
[root@station1 ~]# netstat -lntp | grep :11434
tcp6       0      0 :::11434                :::*                    LISTEN      512822/ollama
Check the Ollama version:
[root@station1 ~]# ollama --version
ollama version is 0.5.4
2) Pull the embedding model (the host needs an Internet connection). Run the pull command to fetch the all-minilm model:
[root@station1 ~]# ollama pull all-minilm
pulling manifest
pulling 797b70c4edf8... 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏  45 MB
pulling c71d239df917... 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏  11 KB
pulling 85011998c600... 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏   16 B
pulling 548455b72658... 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏  407 B
verifying sha256 digest
writing manifest
success
Test whether Ollama is running locally; if an embedding vector is returned, the service works:
[root@station1 ~]# curl http://localhost:11434/api/embeddings -d '{
> "model" : "all-minilm",
> "prompt": "I live in China"}'
{"embedding":[0.0065165190026164055,-0.15042546391487122,0.35......
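The same /api/embeddings call can be issued from a small program. The sketch below uses only the Python standard library and assumes the Ollama service configured in this section is listening on localhost:11434 with the all-minilm model already pulled; the helper names (build_payload, embed) are our own.

```python
import json
from urllib import request

# Endpoint from the Ollama install transcript above.
OLLAMA_URL = "http://localhost:11434/api/embeddings"

def build_payload(model, prompt):
    """Serialize the JSON body the embeddings endpoint expects."""
    return json.dumps({"model": model, "prompt": prompt}).encode("utf-8")

def embed(prompt, model="all-minilm"):
    """POST the prompt to a running Ollama server and return the vector.
    Requires the service started earlier in this section to be up."""
    req = request.Request(OLLAMA_URL, data=build_payload(model, prompt),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["embedding"]

payload = build_payload("all-minilm", "I live in China")
print(payload)
# With the server up, embed("I live in China") returns the embedding list.
```

This is the same request the curl test sends, so it is a convenient starting point for scripting the later labs.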