I need to extract domain-specific terms, such as political terms, from a large training corpus. How can I use Weka and its filters to achieve this?
Can I use the feature vector produced by Weka's StringToWordVector filter to do this?
How can I use Weka for terminology extraction?
289 views. Asked by MSepehr.
1 answer
You can, at least partly, as long as you have an appropriate dataset. For instance, let us assume you have a labeled dataset with a text attribute and a class attribute (politics or religion in my example).
To get terms about politics, you can:
1. Apply the StringToWordVector filter to the text attribute to get terms.
2. Apply the AttributeSelection filter with Ranker and InfoGainAttributeEval to get the top-ranked terms.
This latter step will give you a list of the terms that are most predictive for the politics category. Most of them will be terms in the politics domain, although some terms may be predictive precisely because they are not in the politics domain (that is, they provide negative evidence).
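To make the idea behind these two steps concrete, here is a minimal sketch in plain Java of the underlying computation: each document becomes a set of word-presence features, and each term is scored by its information gain with respect to the class. This is not Weka code and not Weka's implementation (its tokenization, word limits and handling of numeric attributes differ). The class name InfoGainTermRanker, the method names, and the tiny six-document corpus are all invented for illustration.

```java
import java.util.*;

class InfoGainTermRanker {

    // Shannon entropy (in bits) of a class distribution given per-class counts.
    static double entropy(Collection<Integer> counts, int total) {
        double h = 0.0;
        for (int c : counts) {
            if (c > 0) {
                double p = (double) c / total;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    // Distinct lowercased word tokens of one document (binary word presence).
    static Set<String> tokens(String text) {
        Set<String> out = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    // Returns (term, information gain) pairs, highest gain first.
    static List<Map.Entry<String, Double>> rank(List<String> texts, List<String> labels) {
        int n = texts.size();
        Map<String, Integer> classCounts = new TreeMap<>();
        // term -> (class -> number of documents of that class containing the term)
        Map<String, Map<String, Integer>> docsWithTerm = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            String label = labels.get(i);
            classCounts.merge(label, 1, Integer::sum);
            for (String t : tokens(texts.get(i))) {
                docsWithTerm.computeIfAbsent(t, k -> new TreeMap<>()).merge(label, 1, Integer::sum);
            }
        }
        double classEntropy = entropy(classCounts.values(), n);

        List<Map.Entry<String, Double>> ranked = new ArrayList<>();
        for (Map.Entry<String, Map<String, Integer>> e : docsWithTerm.entrySet()) {
            Map<String, Integer> present = e.getValue();
            int nPresent = 0;
            List<Integer> absent = new ArrayList<>();
            for (Map.Entry<String, Integer> c : classCounts.entrySet()) {
                int withTerm = present.getOrDefault(c.getKey(), 0);
                nPresent += withTerm;
                absent.add(c.getValue() - withTerm);
            }
            int nAbsent = n - nPresent;
            // Class entropy after splitting the documents on presence/absence of the term.
            double conditional = 0.0;
            if (nPresent > 0) conditional += (double) nPresent / n * entropy(present.values(), nPresent);
            if (nAbsent > 0) conditional += (double) nAbsent / n * entropy(absent, nAbsent);
            double gain = classEntropy - conditional;
            ranked.add(new AbstractMap.SimpleEntry<String, Double>(e.getKey(), gain));
        }
        // Highest gain first; ties broken alphabetically.
        ranked.sort((a, b) -> {
            int byGain = Double.compare(b.getValue(), a.getValue());
            return byGain != 0 ? byGain : a.getKey().compareTo(b.getKey());
        });
        return ranked;
    }

    public static void main(String[] args) {
        // Hypothetical toy corpus, made up for this example only.
        List<String> texts = Arrays.asList(
            "The election divided the parliament",
            "The minister won the election",
            "The election law passed in parliament",
            "The church held a prayer",
            "The priest led the prayer",
            "The church welcomed the priest");
        List<String> labels = Arrays.asList(
            "politics", "politics", "politics", "religion", "religion", "religion");
        for (Map.Entry<String, Double> e : rank(texts, labels)) {
            System.out.printf("%-12s %.3f%n", e.getKey(), e.getValue());
        }
    }
}
```

On this toy corpus, "election" ranks first because it appears in every politics document and in no religion document, while "the" scores zero because it appears everywhere. Terms such as "church" and "prayer" also score well: information gain measures how predictive a term is of the class in either direction, which is the negative-evidence effect mentioned above.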
The quality of the terms you get depends on the dataset. The more topics it covers, the better your results; so instead of having only two classes (politics and religion, as in my dataset), it is much better to have many classes and many examples for each category.