Wasm Chat
Wasm-chat allows you to chat with LLMs in GGUF format, both locally and via a chat service.

WasmChatService provides developers with an OpenAI-API-compatible service for chatting with LLMs via HTTP requests.

WasmChatLocal enables developers to chat with LLMs locally (coming soon).
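Because the service is OpenAI-API compatible, its request body follows the standard chat-completion shape. A minimal sketch of such a payload, built with only the standard library (the model name here is illustrative, not something the source specifies):

```python
import json

# Sketch of an OpenAI-style chat-completion request body, as accepted by an
# OpenAI-API-compatible endpoint. The model name is illustrative.
payload = {
    "model": "llama-2-7b-chat",
    "messages": [
        {"role": "system", "content": "You are an AI assistant"},
        {"role": "user", "content": "What is the capital of France?"},
    ],
}

# Serialize to the JSON body that would be POSTed to the service
body = json.dumps(payload)
print(body)
```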
Both WasmChatService and WasmChatLocal run on infrastructure driven by WasmEdge Runtime, which provides a lightweight and portable WebAssembly container environment for LLM inference tasks.
Chat via API Service
WasmChatService provides chat services via the llama-api-server. Following the llama-api-server quick-start guide, you can host your own API service and chat with any model you like, on any device, anywhere, as long as an internet connection is available.
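As a rough sketch of what that quick-start involves (the model file and prompt template name below are illustrative; consult the llama-api-server documentation for the exact flags for your model):

```shell
# Download the llama-api-server WebAssembly app from the LlamaEdge releases page
curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-api-server.wasm

# Start the API server with WasmEdge, preloading a GGUF model
# (the model file and prompt template here are illustrative)
wasmedge --dir .:. --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  llama-api-server.wasm -p llama-2-chat
```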
from langchain_community.chat_models.wasm_chat import WasmChatService
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
# service url
service_url = "https://b008-54-186-154-209.ngrok-free.app"
# create wasm-chat service instance
chat = WasmChatService(service_url=service_url)
# create message sequence
system_message = SystemMessage(content="You are an AI assistant")
user_message = HumanMessage(content="What is the capital of France?")
messages = [system_message, user_message]
# chat with wasm-chat service
response = chat.invoke(messages)
print(f"[Bot] {response.content}")
[Bot] Paris
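Under the hood, the reply above arrives as an OpenAI-style JSON body from the service, which WasmChatService parses into an AIMessage. Extracting the assistant message can be sketched with the standard library (the response body below is illustrative):

```python
import json

# Illustrative OpenAI-style response body, as returned by an
# OpenAI-API-compatible /v1/chat/completions endpoint
raw = json.dumps({
    "choices": [
        {"message": {"role": "assistant", "content": "Paris"}}
    ]
})

# The assistant reply lives at choices[0].message.content
reply = json.loads(raw)["choices"][0]["message"]["content"]
print(f"[Bot] {reply}")  # → [Bot] Paris
```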