We are thrilled to announce UpTrain v0.6, a release focused on improving user experience, ease of use, and overall functionality!
New Features:
1. **Local Evaluation Capability ✨**
- Users can now run evaluations locally on their systems, providing more flexibility and control over the evaluation process.
2. **Custom Prompt Evaluation 🎛️**
- Users can now create custom evaluations tailored to their specific needs.
3. **Scenario Description Parameter for Operators 📝**
- Operators can now pass additional context to the large language model (LLM) through the scenario description parameter, improving the quality of evaluations.
4. **Modular Prompt Templates 🧩**
- Prompt templates are now modular: instructions, few-shot examples, scenario descriptions, and output formats can each be customized independently, giving users versatile tools for prompt creation.
5. **New Integrations 🚀**
- **Vector DBs Integration 🔍**
  - Integration with vector databases such as Qdrant, ChromaDB, and FAISS for RAG operations, query responses, and evaluation using UpTrain.
- **Framework Integration 🛠️**
  - Integration with the LlamaIndex framework for streamlined operations.
- **LLM Providers Integration 💡**
  - Integration with LLMs like Mistral and Llama from platforms such as Anyscale and Together AI for evaluation purposes.
- **LLM Embeddings Integration 🧠**
  - Integration with Jina for generating embeddings to enhance RAG operations and evaluations.
6. **Research Integration 📚**
- UpTrain now incorporates the state-of-the-art SPADE framework for auto-generating assertions that identify poor LLM outputs, making it easier to evaluate user datasets.
7. **Root Cause Analysis 🕵️**
- UpTrain now supports root cause analysis of failures in RAG pipelines, helping users identify and resolve problems.
8. **Vector Search Integration 🔍**
- Enhanced vector search capability allows for comparing different embedding models, enabling users to derive more relevant context from vector databases.
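
To make the modular prompt templates from item 4 more concrete, here is a minimal sketch in plain Python of how the four parts (instructions, few-shot examples, scenario description, output format) might be combined into one prompt. It does not use UpTrain's actual template classes: `PromptTemplate`, its field names, and `render` are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PromptTemplate:
    """Hypothetical container for the four modular parts (not UpTrain's API)."""

    instructions: str
    output_format: str
    scenario_description: str = ""
    few_shot_examples: list[tuple[str, str]] = field(default_factory=list)

    def render(self, question: str, response: str) -> str:
        # Assemble only the sections that are actually filled in.
        sections = []
        if self.scenario_description:
            sections.append(f"Scenario: {self.scenario_description}")
        sections.append(f"Instructions: {self.instructions}")
        for example_input, example_output in self.few_shot_examples:
            sections.append(
                f"Example input: {example_input}\nExample output: {example_output}"
            )
        sections.append(f"Question: {question}\nResponse: {response}")
        sections.append(f"Output format: {self.output_format}")
        return "\n\n".join(sections)


template = PromptTemplate(
    instructions="Rate how well the response answers the question.",
    output_format='JSON with keys "score" (0 to 1) and "reason"',
    scenario_description="Customer support bot for a telecom provider.",
    few_shot_examples=[
        (
            "Q: How do I reset my router? A: Unplug it for 30 seconds.",
            '{"score": 1.0, "reason": "Direct answer"}',
        )
    ],
)
prompt = template.render("How do I check my balance?", "Dial *123#.")
print(prompt)
```

Because each part is a separate field, one part (for example, the scenario description) can be swapped without touching the others.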
New Evaluations:
1. **Jailbreak Detection 🚨**
- Identify attempts to use the LLM for illegal activities or to otherwise misuse it. Users can specify a model purpose to ensure adherence to intended usage.
2. **Code Hallucination 💻**
- Determine whether code generated by the LLM is grounded in the provided documents or context, rather than invented.
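
As a rough illustration of what "grounded in the provided context" means for generated code, the sketch below flags function or method names that the generated code calls but that never appear in the context. This is a deliberately naive string-matching heuristic, not how UpTrain's LLM-based Code Hallucination check works; `ungrounded_calls` is a hypothetical helper, and it assumes the generated code is valid Python.

```python
import ast


def ungrounded_calls(generated_code: str, context: str) -> list[str]:
    """Return names of called functions/methods that never appear in the context."""
    tree = ast.parse(generated_code)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            # A plain call `foo()` is an ast.Name; a method call `obj.foo()` is an ast.Attribute.
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            # Naive grounding test: substring match against the context text.
            if name and name not in context and name not in missing:
                missing.append(name)
    return missing


context = "Use client.fetch_orders(user_id) to list a user's orders."
code = "orders = client.fetch_orders(42)\nclient.cancel_all(orders)"
print(ungrounded_calls(code, context))  # prints ['cancel_all']
```

Here `fetch_orders` is documented in the context, while `cancel_all` is not, so only the latter is reported as a possible hallucination.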