Why Train Your Bot After NLP Has Been Implemented?
NLP training is a process that never ends. You need to train your bot when you set up your intent system, and then, as you gather more data, you keep training it to become more accurate. Also, as language evolves and changes, your bot needs to stay up to date.
A couple of months ago, Google added a training feature to Dialogflow. (This feature was already available in Wit.ai.) The training module is designed so that your agent (bot) matches user responses to the correct intent as often as possible. Dialogflow purports that the more natural language examples you provide and the more training you perform, the better the bot’s classification accuracy. In our experience, though, and in the experience of other bot builders in the community we spoke with, the bot started making many more errors than it had in the past as soon as the training feature was introduced. To make the performance issues subside, you had to go in and “train” your bot. (Well played, Google!)
Why Does Google Offer this for Free?
Google has every reason to make this process easy and effective because, essentially, Google has created a genius system (really, hats off!) where, in exchange for using their NLP, you are training their platform and organizing their data for free. With all those trainers and an immense amount of data, this will create a supreme competitive advantage, provided the data is managed effectively and context can be assigned over time.
How It Works:
Training is simple. While there are some nuances, I will go over the basic steps.
1.) First, from the left-hand panel of Google Dialogflow, choose ‘Training’. You will then be provided with a list of responses to your bot.
Then tap into one of the responses:
2.) If the suggested intent is correct, simply tap the checkbox on the right-hand side to validate this response for the intent. It will turn green, and then you have to click “Approve”.
3.) If Dialogflow does not automatically know the appropriate intent for a specific user response, tap on the words “Click to Assign” in the bottom left-hand corner.
After you tap to assign, you can either choose one of your existing intents or create a new intent. To create a new intent, see directions here. And voila! You are done training!
Seems easy enough, right? Well, um, not exactly. There are a couple of challenges that make this process incredibly painful and somewhat unrealistic to manage in a timely way if you don’t have a dedicated resource.
Pitfalls & Challenges in Training Your Bot
Here are some challenges we encountered:
1.) Organization by User: Responses are organized by user. This means that you have to click into every user and “train based off of user data”: to see all the responses from a user, you have to tap in and validate/train the bot for each response. It would be great to see which users had more than one response, and to have responses grouped by user over time instead of by time increments. (This could be useful for correlating language in the future.)
2.) Order & Scroll: For a platform that is all about remembering and learning, it doesn’t remember the last response you were training. Both trained and untrained data stay in the same order, even after training. Once you hit approve, it immediately moves you to the top of the page (where the data is already trained), so you have to scroll down all over again to the untrained data and find the last response you were training. For immense amounts of data, this becomes incredibly time-consuming.
3.) You Have to Train the Bot on Data You Already Provided: This might be my biggest pet peeve of the whole platform. While setting up my intent, I meticulously provided a list of example data for it. Dialogflow then asks you to “train” and validate the very data that you originally provided for your intent. This is beyond absurd. It does not cross-check training data with intent data. https://giphy.com/gifs/funny-gif-reaction-thank-you-b1kgk5tYEwS9W
4.) You Must Validate Duplicate Training Data Over and Over Again: Each response requires a training validation even if the same response has already been validated. It’s déjà vu all over again.
Moreover, this issue does not correct over time. You have to keep validating the same training data.
5.) UX That Needs Some Work: To validate a response, you need to first select the response (Click #1), tap the validation checkbox (Click #2), then tap “Approve” (Click #3). A much easier system would be to have the information in a table that allows approval and reversal of that approval. The three-tap approval/training process seems unnecessary.
6.) No Sorting Functionality: As previously mentioned, it keeps all the training data together; it doesn’t separate out what’s been validated and trained, versus what hasn’t. The ability to keep this data separated and cross-check against it would be extremely useful.
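To make the wish list in points 1 and 6 concrete, here is a minimal Python sketch of the kind of flat, sortable view we have in mind. It does not use any Dialogflow API; the `Response` record, its fields, and the sample data are all hypothetical and exist only for illustration. It assumes each response carries a user identifier and a flag for whether it has already been trained.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Response:
    # All fields are hypothetical; they are not Dialogflow's data model.
    user_id: str   # who sent the message
    text: str      # what the user typed
    trained: bool  # True once a human has approved or assigned an intent


def untrained_first(responses):
    """Put untrained responses first, keeping the original order within each group."""
    # sorted() is stable and False sorts before True.
    return sorted(responses, key=lambda r: r.trained)


def responses_per_user(responses):
    """Count how many responses each user sent."""
    return Counter(r.user_id for r in responses)


log = [
    Response("u1", "hi there", True),
    Response("u2", "what are your hours", False),
    Response("u1", "book a table", False),
]

queue = untrained_first(log)
counts = responses_per_user(log)

print([r.text for r in queue])  # ['what are your hours', 'book a table', 'hi there']
print(counts.most_common(1))    # [('u1', 2)]
```

With a view like this, approving a response would simply move it to the bottom of the queue, so there would be no need to scroll back past already-trained data.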
Since this learning platform never officially “learns”, keeps asking you to train the same data over and over again, and gives you no way to sort through it, training becomes a tedious process. A huge improvement would be for the platform to recognize that a bot has already been trained on specific data, either from intent data or from prior training data.
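As a rough illustration of that cross-check, the sketch below filters incoming responses against phrases already listed on an intent and against responses already approved in training, so that only genuinely new text is left for a human to review. This is our own sketch in plain Python, not how Dialogflow works internally; the function names, the exact-match-after-normalization rule, and the sample phrases are all assumptions.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(text.lower().split())


def needs_review(responses, intent_phrases, approved):
    """Return only the responses not already covered by intent or training data.

    Hypothetical helper: matching is exact after normalization, which is a
    deliberate simplification of what a real NLP platform would do.
    """
    known = {normalize(p) for p in intent_phrases}
    known |= {normalize(a) for a in approved}
    pending = []
    for text in responses:
        key = normalize(text)
        if key not in known:
            pending.append(text)
            known.add(key)  # later duplicates of this response are skipped too
    return pending


intent_phrases = ["What are your hours?", "Book a table"]
approved = ["are you open today"]
incoming = [
    "book a table",
    "Are you open  today",
    "do you deliver",
    "Do you deliver",
    "what are your hours?",
]

pending = needs_review(incoming, intent_phrases, approved)
print(pending)  # ['do you deliver']
```

Of the five incoming responses here, four are already known from the intent phrases, the approved list, or an earlier duplicate, so only one would need a human decision.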
Secondly, it would be great if there were a way to classify responses more accurately before training. Maybe this data could be pulled from their Chatbase product, which is touted as having this ability; that would also be helpful.
As always, we would love to have your opinion! Maybe there is something we missed? Let us know at [email protected]