Best Practices
Learn the best practices for training your chatbot
Best Practices for Training
| Practice | Description |
|---|---|
| Consistency | Use a consistent format for all training data. |
| Clarity | Ensure questions and answers are clear and to the point. |
| Comprehensiveness | Cover as many potential user queries as possible in your training data. The more queries the training data covers, the better the chatbot handles a wide range of user questions. |
| Accuracy | Ensure that the training data is accurate and up to date. |
| Periodic Updates | Periodically update the training data to reflect new information. |
| Regular Testing | Routinely test the chatbot to verify it's providing accurate responses. |
Chunking
Large documents are split into smaller chunks. The default chunk size is 1024 characters, with an overlap of 200 characters between consecutive chunks. To control where the splits occur, add the separator "\n~~~\n" roughly every 1000 characters when creating documents. If this separator is absent, other common separators such as "\n\n\n", "\n\n", and "\n" are used instead.
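As a rough illustration of preparing a document this way, the sketch below groups paragraphs into segments of at most 1000 characters and joins them with the "\n~~~\n" separator. The helper name `add_separators` and its `max_len` parameter are hypothetical and not part of the platform; the choice to break at paragraph boundaries is an assumption made for readability.

```python
SEPARATOR = "\n~~~\n"  # explicit chunk separator named in the text above


def add_separators(text: str, max_len: int = 1000) -> str:
    """Group paragraphs into segments of at most max_len characters.

    Hypothetical helper for illustration only, not a platform API.
    Segments are joined with SEPARATOR so splits land at known points.
    """
    segments = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_len:
            # The paragraph still fits in the current segment.
            current = candidate
        else:
            if current:
                segments.append(current)
            # A single paragraph longer than max_len is hard-split.
            while len(paragraph) > max_len:
                segments.append(paragraph[:max_len])
                paragraph = paragraph[max_len:]
            current = paragraph
    if current:
        segments.append(current)
    return SEPARATOR.join(segments)


if __name__ == "__main__":
    doc = "\n\n".join(["First topic. " * 40, "Second topic. " * 40, "Third topic. " * 40])
    prepared = add_separators(doc)
    for i, segment in enumerate(prepared.split(SEPARATOR), start=1):
        print(f"segment {i}: {len(segment)} characters")
```

Breaking at paragraph boundaries keeps related sentences in the same segment, which is usually preferable to cutting at an exact character count.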
What is Match Score?
The match score measures how closely the training data aligns with the user's query. It helps identify the most relevant knowledge base node: a higher match score indicates that a node is more relevant to the query.
The score is calculated by comparing the user's query with the training data, considering factors such as:
- Semantic understanding: The score considers the meaning and context of the query, not just exact keyword matches
- Vector proximity: Technically, the score often represents the cosine similarity between query and document vectors in the embedding space
- Contextual relevance: How well the entire query aligns with the document's overall topic and focus
- Relative ranking: The absolute score matters less than how documents rank compared to each other
The match score ranges from 0 to 1 and is generated by embedding models. Different models use different scoring mechanisms, so some may produce consistently higher scores, while others may yield lower scores.
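To make the idea of vector proximity concrete, here is a minimal sketch of cosine similarity and of ranking nodes by it. The vectors are made-up two-dimensional toy values, not real embeddings, and the names `cosine_similarity` and `rank_nodes` are hypothetical; the platform's actual scoring depends on the embedding model in use and may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_nodes(query_vec, nodes):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in nodes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy vectors standing in for embeddings (assumed values).
    query = [1.0, 0.0]
    nodes = {
        "refund-policy": [1.0, 0.0],
        "shipping-times": [1.0, 1.0],
        "careers-page": [0.0, 1.0],
    }
    for name, score in rank_nodes(query, nodes):
        print(f"{name}: {score:.3f}")
```

The ordering of the results is what matters most here: the node whose vector points in the same direction as the query ranks first, regardless of how high the absolute scores happen to be.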
Training View Source
To review your AI's training data, see the Training View Source docs. The view lets you inspect the match score for the knowledge base nodes used to train your AI.
Additional Training Tips
Be Specific
Focus training on common scenarios.
Include Variations
Vary phrasing for similar questions.
Keep It Clear
Craft responses precisely as desired.
Stay Consistent
Ensure consistent tone and policy.
Review Regularly
Analyze chat logs to refine training.