I have built a document question-answering system using the Llama 2 large language model, and now I have to deploy it across the organization.
Here are my questions:
- Will a system with an Intel(R) Xeon(R) Gold 6238R CPU @ 2.20GHz (2195 MHz, 2 cores, 2 logical processors), 24 GB of RAM, and an NVIDIA A40-12Q GPU with 12 GB of memory be able to handle the workload of 1000 users logging in?
- I have used Streamlit for my document QA app; is it better to use Streamlit or Chainlit?
- Which OS is better for production-level deployment: Windows or Linux?
This is pretty new to me, so I would welcome all input and suggestions.