It has passed some time since I published a Naive_Bayes for text-classifier written in Ruby. At that time the API was designed very much influenced in way I have written a similar classifier in C++. However it was after a looked at the gem Decider that I realized how it was not taking advantage over the dynamic characteristics of Ruby to make Domain-Specific-Language APIs.
Static characteristics of my original Ruby classifier interfaceThe following are the methods that called my attention:
Category declarationUse of add_category method to declare new categories expects a category name and an array of strings as an example.
classifier.add_category(name: :not_spam, training_set: [ 'Hi', 'Joe', 'how', 'are', 'you' ])
Input classificationThe classify method makes use of an array of string as input for classification.
category = classifier.classify(['Hi','James','are','you', 'going']) category.to_s # not_spam
Refactoring to new style
Category declaration through method missing and stringInstead of adding categories with a training set the user should only be required to provide examples to its desired with a dynamically created method which corresponds to a category name.
classifier.not_spam << "Hi Joe, how are you"
Input classification from stringInstead of requiring the user to break the string in an array the entire string can be classified. The classifier should also take the responsibility of breaking the string in words.
category = classifier.classify "Hi James,are you going ?" category.to_s # not_spam