Spaces: HH-AI-Org / HH-azure-openai-poc

Change Liao committed on Apr 9, 2024

Commit

0952648

1 Parent(s): 94b4e2b

update tutorial Jupyter notebook

Files changed (1)

Langchain_demo.ipynb +126 -4

Langchain_demo.ipynb CHANGED

@@ -11,8 +11,27 @@
   {
    "cell_type": "markdown",
    "id": "0e42d7d7-8815-4c76-ad6c-f5d09719e17b",
-   "metadata": {},
    "source": [
     "### pip install everything\n",
     "I will provide my requirements.txt so that everyone can set up the same virtualenv; you can of course use a different virtual environment."
    ]
@@ -1385,7 +1404,70 @@
    "id": "225fbc10-a54f-4a01-b580-592437b55234",
    "metadata": {},
    "outputs": [],
-   "source": []
   },
   {
    "cell_type": "markdown",
@@ -1405,7 +1487,33 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "#TBD"
    ]
   },
   {
@@ -1426,7 +1534,21 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "#TBD"
    ]
   }
  ],

   {
    "cell_type": "markdown",
    "id": "0e42d7d7-8815-4c76-ad6c-f5d09719e17b",
+   "metadata": {
+    "jp-MarkdownHeadingCollapsed": true
+   },
    "source": [
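+    "### Before we start\n",
+    "Gen-AI is a tool that saves you effort but that you cannot rely on completely. The three eras of AI:\n",
+    "* ANI: artificial narrow intelligence (weak AI); it can help humans but is less capable than they are\n",
+    "* AGI: performs as well as humans\n",
+    "* ASI: performs better than humans\n",
+    "\n",
+    "We are currently in the ANI era, so your customers may have the wrong expectations of Gen-AI. They may assume that it can replace humans perfectly or that its results are 100% correct; neither is true.\n",
+    "\n",
+    "The right attitude:\n",
+    "Treat it as a `tireless and very capable fresh graduate`. It can do many things, but you must always review and verify what it produces. You will need to train and educate your customers on this mindset.\n",
+    "\n",
+    "### Why use Gen-AI from code?\n",
+    "Plenty of apps already integrate Gen-AI, and everyone already uses ChatGPT on their own, so why use it at the code level?\n",
+    "Answer:\n",
+    "* To integrate it into the systems you already develop\n",
+    "* To use a local LLM (Gen-AI).\n",
+    "  \n",
     "### pip install everything\n",
     "I will provide my requirements.txt so that everyone can set up the same virtualenv; you can of course use a different virtual environment."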
    ]
    "id": "225fbc10-a54f-4a01-b580-592437b55234",
    "metadata": {},
    "outputs": [],
+   "source": [
+    "import os\n",
+    "\n",
+    "## Store the documents in the vector database\n",
+    "def initial_croma_db(db_name, files_path, file_ext, collection_name):\n",
+    "    _db_name = db_name\n",
+    "\n",
+    "    documents = multidocs_loader(files_path, file_ext)\n",
+    "    ## The embedding model converts text into vectors; Azure provides one\n",
+    "    embeddings = OpenAIEmbeddings(\n",
+    "        deployment=\"CivetGPT_embedding\",\n",
+    "        model=\"text-embedding-ada-002\",\n",
+    "        openai_api_base=\"https://civet-project-001.openai.azure.com/\",\n",
+    "        openai_api_type=\"azure\",\n",
+    "        ## Assumption: the key is read from an AZURE_OPENAI_API_KEY environment variable instead of being hard-coded\n",
+    "        openai_api_key=os.environ[\"AZURE_OPENAI_API_KEY\"],\n",
+    "        chunk_size=1\n",
+    "    )\n",
+    "\n",
+    "    chroma_db = Chroma.from_documents(\n",
+    "        documents,\n",
+    "        embeddings,\n",
+    "        collection_name = collection_name,\n",
+    "        persist_directory= root_file_path+ persist_db,\n",
+    "        chroma_db_impl=chroma_db_impl\n",
+    "    )\n",
+    "\n",
+    "    chroma_db.persist()\n",
+    "    print('vectorstore done!')\n",
+    "\n",
+    "# Ask a question\n",
+    "def local_vector_search(question_str,\n",
+    "                        chat_history,\n",
+    "                        collection_name = hr_collection_name):\n",
+    "    embedding = get_openaiembeddings()\n",
+    "    vectorstore = Chroma( embedding_function=embedding,\n",
+    "                          collection_name=collection_name,\n",
+    "                          persist_directory=root_file_path+persist_db,\n",
+    "                          )\n",
+    "\n",
+    "    memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True, ai_prefix = \"AI Super Assistant\")\n",
+    "\n",
+    "    llm = AzureOpenAI(\n",
+    "        deployment_name = global_deployment_id,\n",
+    "        model_name= global_model_name,\n",
+    "        temperature = 0.0)\n",
+    "\n",
+    "    chat_llm = AzureChatOpenAI(\n",
+    "        deployment_name = global_deployment_id,\n",
+    "        model_name= global_model_name,\n",
+    "        temperature = 0.0)\n",
+    "\n",
+    "    prompt = PromptTemplate(\n",
+    "        template=get_prompt_template_string(),\n",
+    "        input_variables=[\"question\",\"chat_history\"]\n",
+    "    )\n",
+    "    prompt.format(question=question_str,chat_history=chat_history)\n",
+    "    km_chain = ConversationalRetrievalChain.from_llm(\n",
+    "        llm=chat_llm,\n",
+    "        retriever=vectorstore.as_retriever(),\n",
+    "        memory=memory,\n",
+    "        condense_question_prompt=prompt,\n",
+    "    )\n",
+    "    \n",
+    "    result=km_chain(question_str)\n",
+    "    print(result)"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "outputs": [],
    "source": [
+    "def agent_demo():\n",
+    "    # You need to write the declarations for chat_llm and the other objects yourself\n",
+    "    \n",
+    "    km_tool = Tool(\n",
+    "        name='Knowledge Base',\n",
+    "        func=km_chain.run,\n",
+    "        description='A very useful tool. Use it to look up any company policy or any information related to Hon Hai.'\n",
+    "    )\n",
+    "\n",
+    "    math_math = LLMMathChain.from_llm(llm=llm, verbose=True)\n",
+    "    math_tool = Tool(\n",
+    "        name='Calculator',\n",
+    "        func=math_math.run,\n",
+    "        description='Useful for when you need to answer questions about math.'\n",
+    "    )\n",
+    "\n",
+    "    tools=[math_tool,km_tool]\n",
+    "    agent=initialize_agent(\n",
+    "        agent=AgentType.OPENAI_FUNCTIONS,\n",
+    "        tools=tools,\n",
+    "        llm=chat_llm,\n",
+    "        verbose=True,\n",
+    "        memory=memory,\n",
+    "        max_iterations=30,\n",
+    "    )\n",
+    "    result=agent.run(question_str)\n",
+    "    print(result)\n"
    ]
   },
   {
    "metadata": {},
    "outputs": [],
    "source": [
+    "chain = LLMChain(llm=llm, prompt=prompt, callbacks=[handler])\n",
+    "chain.invoke({\"number\":2})\n",
+    "chain.invoke({\"number\":2}, {\"callbacks\":[handler]})"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "bb0ef293-e9f1-4fb0-bd05-5daecbecc982",
+   "metadata": {},
+   "source": [
+    "# Local LLM\n",
+    "A GPU inference server is currently available on 虎躍雲\n",
+    "ip:\n",
+    "\n",
+    "### Available open-source LLMs\n"
    ]
   }
  ],
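
Note on the vector search added in this commit: the notebook describes the embedding as a model that converts text into vectors, and `local_vector_search` then retrieves stored documents for a question. The idea can be sketched without any external service. The block below is only an illustration under stated assumptions: `toy_embed`, `cosine_similarity`, and `toy_vector_search` are hypothetical names, and the bag-of-words vector is a stand-in for a real embedding model. It is not how `OpenAIEmbeddings` or `Chroma` work internally.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def toy_vector_search(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents whose vectors are closest to the question's vector."""
    query_vec = toy_embed(question)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, toy_embed(doc)),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    docs = [
        "annual leave policy for full-time employees",
        "expense reimbursement process for business travel",
        "office opening hours and building access",
    ]
    print(toy_vector_search("how does the annual leave policy work", docs))
```

In the notebook, the retriever returned by `vectorstore.as_retriever()` plays the role of `toy_vector_search`, and the retrieved documents are passed to the chat model as context. A real embedding model also captures meaning beyond exact word overlap, which this toy version does not.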