This is a stub. The full documentation is at https://pages.jlab.org/physdiv/ai-ml/llm-deployment-docs/
This document provides an introduction to the open-weight large language models (LLMs) we have deployed at Jefferson Lab. These services are designed to expose high-performance, GPU-accelerated LLMs to internal users through authenticated and auditable interfaces. The services will also include access to paid commercial LLM vendors such as Google, OpenAI, and others.
The local LLM system consists of four major components, which are described in the full documentation linked at the top of this page.