Extracts a pure list of lemmatized words from a text, filtered by stop words. It removes non-word tokens: those shorter than 3 characters or containing non-alphabetic characters.
Name | Type | Description
---|---|---
text | String | input text
filter | Array.<String> | list of custom stop words; if passed, it replaces the default stop words
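
The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the function name `extractWords`, the `DEFAULT_STOPWORDS` list, and the `lemmatize` helper are all hypothetical, and the placeholder lemmatizer only lowercases tokens. The sketch returns plain strings, which may differ from the shape the real function returns.

```javascript
// Hypothetical default stop-word list; the real defaults are not shown in this section.
const DEFAULT_STOPWORDS = ["the", "and", "for", "with", "are", "was"];

// Placeholder lemmatizer (assumption): only lowercases the token.
// A real lemmatizer would map inflected forms to their lemmas.
function lemmatize(token) {
  return token.toLowerCase();
}

// Illustrative name; mirrors the documented `text` and `filter` parameters.
function extractWords(text, filter) {
  // A custom list, when passed, replaces the defaults instead of extending them.
  const stopwords = new Set((filter || DEFAULT_STOPWORDS).map((w) => w.toLowerCase()));
  return text
    .split(/\s+/)
    // Trim punctuation that surrounds a token.
    .map((t) => t.replace(/^[.,;:!?"'()]+|[.,;:!?"'()]+$/g, ""))
    // Drop tokens shorter than 3 characters.
    .filter((t) => t.length >= 3)
    // Drop tokens containing non-alphabetic characters.
    .filter((t) => /^[A-Za-z]+$/.test(t))
    .map(lemmatize)
    // Drop stop words.
    .filter((t) => !stopwords.has(t));
}
```

Under these assumptions, `extractWords("The cats are running with 2 dogs, ok?")` keeps `cats`, `running`, and `dogs`, while passing `["cats"]` as `filter` removes only `cats` and lets the former default stop words through.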
Array.<Object>