Entity Extraction is a process in which an algorithm takes a string of text (a sentence or paragraph) as input and identifies the relevant nouns (people, places, and organizations) mentioned in that string. In our previous blog post, we gave you a glimpse of how our Entity Extraction API works under the hood. In this post, we list some scenarios where Entity Extraction technology can be immensely useful.
Classifying content for news providers
News and publishing houses generate large amounts of online content every day, and managing it well is essential to getting the most out of each article. Entity Extraction can automatically scan entire articles and reveal the major people, organizations, and places discussed in them. Knowing the relevant tags for each article helps in automatically categorizing articles into defined hierarchies and enables smooth content discovery. An example of how this works can be seen below.
The Entity Extraction API has successfully identified all the relevant tags for the article, and these can be used for categorization.
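To make the tagging step concrete, here is a minimal Python sketch of how extracted mentions might be reduced to a handful of major tags per entity type. The input format (a list of name and type pairs), the function name major_tags, and the sample mentions are all assumptions made for illustration; they are not the actual response format of the Entity Extraction API.

```python
from collections import Counter, defaultdict


def major_tags(mentions, per_type=2):
    """Group mentions by entity type and keep the most frequent names of each type."""
    counts = defaultdict(Counter)
    for name, entity_type in mentions:
        counts[entity_type][name] += 1
    return {
        entity_type: [name for name, _ in counter.most_common(per_type)]
        for entity_type, counter in counts.items()
    }


# Hypothetical mentions extracted from a single article (assumed format).
mentions = [
    ("BBC", "organization"),
    ("London", "place"),
    ("BBC", "organization"),
    ("Netflix", "organization"),
    ("Paris", "place"),
    ("London", "place"),
    ("Ofcom", "organization"),
]

tags = major_tags(mentions)
print(tags)
```

Counting mentions is only one possible way to decide which entities are "major"; a production system might also weigh where in the article an entity appears.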
Efficient Search Algorithms
Let’s suppose you are designing an internal search algorithm for an online publisher that has millions of articles. If the algorithm searches every word of millions of articles for each query, the process will take a lot of time. Instead, if entity extraction is run once on all the articles and the relevant entities (tags) for each article are stored separately, the search process can be sped up considerably. With this approach, a search term is matched against only the small list of entities discussed in each article, leading to faster search execution.
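One way to store the entities separately is an inverted index that maps each entity to the articles mentioning it. The sketch below is a minimal illustration under assumptions: the article ids, the entity lists, and the function names build_entity_index and search are hypothetical, and the entities are assumed to have already been extracted in a one-time pass.

```python
from collections import defaultdict


def build_entity_index(article_entities):
    """Map each lower-cased entity to the set of article ids that mention it."""
    index = defaultdict(set)
    for article_id, entities in article_entities.items():
        for entity in entities:
            index[entity.lower()].add(article_id)
    return index


def search(index, query):
    """Return the sorted ids of articles tagged with the query entity."""
    return sorted(index.get(query.strip().lower(), set()))


# Hypothetical output of a one-time entity extraction pass over three articles.
article_entities = {
    "a1": ["Netflix", "Reed Hastings"],
    "a2": ["BBC", "London"],
    "a3": ["Netflix", "London"],
}

index = build_entity_index(article_entities)
print(search(index, "netflix"))  # ['a1', 'a3']
```

Each query becomes a single dictionary lookup instead of a scan over the full text of every article, which is where the speed-up comes from.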
Powering Content Recommendations
One of the major use cases of entity extraction is automating the recommendation process. Recommendation systems dominate how we discover new content and ideas in today’s world. The example of Netflix shows that an effective recommendation system can work wonders for the fortunes of a media company by making its platform more engaging and even addictive. For news publishers, using entity extraction to recommend similar articles is a proven approach. The example below from BBC News shows how recommendations for similar articles are implemented in real life. This can be done by extracting entities from a particular article and recommending the other articles that mention the most similar entities. This is an approach that we have used effectively to develop content recommendations for a media industry client.
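The idea of recommending articles with the most similar entities can be sketched as follows. This is an illustration only: Jaccard similarity is one reasonable choice of overlap measure and is an assumption here, not necessarily the measure used in the client project, and the article ids and entity sets are made up.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target_id, article_entities, top_n=2):
    """Return up to top_n article ids whose entity sets overlap most with the target."""
    target = article_entities[target_id]
    scored = [
        (jaccard(target, entities), other_id)
        for other_id, entities in article_entities.items()
        if other_id != target_id
    ]
    # Highest score first; ties are broken by article id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other_id for score, other_id in scored[:top_n] if score > 0]


# Hypothetical entity sets, one per article.
articles = {
    "brexit-vote": {"UK", "EU", "London"},
    "eu-summit": {"EU", "London", "Brussels"},
    "uk-election": {"UK", "London"},
    "netflix-earnings": {"Netflix"},
}

print(recommend("brexit-vote", articles))  # ['uk-election', 'eu-summit']
```

Articles that share no entities with the target are dropped rather than padded into the list, so an article on an isolated topic simply gets no recommendations.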
There are a number of ways to make customer feedback handling smoother, and entity extraction could be one of them. Let’s take an example to understand the process. If you run the customer support department of an electronics store with multiple branches worldwide, you go through a large number of mentions in your customers’ feedback. Take this one, for instance:
@cromaretail please train your staff in croma bandra to provide correct details of customer support for Fitbit. The number given doesnt work
— Sandhya Advani (@sandyaadvani) April 16, 2017
Now, if you pass it through the entity extraction API, it pulls out the entities Bandra (location) and Fitbit (product). These can then be used to categorize the complaint and assign it to the relevant department within the organization.
Similarly, there can be other feedback tweets, and you can categorize them all on the basis of the locations and products they mention. You can create a database of the feedback categorized by department and run analytics to assess the performance of each department.
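The routing and analytics steps described above might look like the following sketch. Everything specific in it is hypothetical: the PRODUCT_DEPARTMENTS routing table, the department names, the route_feedback function, and the representation of entities as text and label pairs are assumptions for illustration, not the API's actual output.

```python
from collections import Counter

# Hypothetical routing table: product name (lower-cased) -> department.
PRODUCT_DEPARTMENTS = {
    "fitbit": "wearables-support",
    "iphone": "mobile-support",
}


def route_feedback(entities, default="general-support"):
    """Return (department, branch) for one piece of feedback."""
    department, branch = default, None
    for text, label in entities:
        if label == "product" and text.lower() in PRODUCT_DEPARTMENTS:
            department = PRODUCT_DEPARTMENTS[text.lower()]
        elif label == "location" and branch is None:
            branch = text  # keep the first location mentioned
    return department, branch


# Entities assumed to have been extracted from three feedback tweets.
feedback = [
    [("Bandra", "location"), ("Fitbit", "product")],
    [("Fitbit", "product")],
    [("Andheri", "location")],
]

routed = [route_feedback(entities) for entities in feedback]
per_department = Counter(department for department, _ in routed)
print(routed)
print(per_department)
```

The per-department counts are the simplest kind of analytics; the same routed records could be stored in a database and broken down by branch or over time.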
An online journal or publication site holds millions of research papers and scholarly articles. There can be hundreds of papers on a single topic with slight variations. Organizing all this data in a well-structured manner can get fiddly. “Skimming” through that much data online, looking for a particular piece of information, is probably not the best option. Segregating the papers on the basis of the relevant entities they contain can save the trouble of going through the plethora of information on the subject matter. For instance, there could be around 200,000 (2 lakh) papers on Machine Learning. If you tag them based on the entities extracted, you can quickly find the articles that discuss the use of convolutional neural networks for face detection.
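As a rough sketch of the tag-based lookup, the snippet below returns the papers that carry every requested tag. The paper ids, the tags, and the find_papers function are hypothetical, and the tags are assumed to come from an earlier entity extraction pass.

```python
def find_papers(paper_tags, required):
    """Return the sorted ids of papers that carry every required tag (case-insensitive)."""
    wanted = {tag.lower() for tag in required}
    return sorted(
        paper_id
        for paper_id, tags in paper_tags.items()
        if wanted <= {tag.lower() for tag in tags}
    )


# Hypothetical tags assigned to four machine learning papers.
paper_tags = {
    "p1": ["Convolutional Neural Networks", "Face Detection"],
    "p2": ["Convolutional Neural Networks", "Speech Recognition"],
    "p3": ["Support Vector Machines", "Face Detection"],
    "p4": ["convolutional neural networks", "face detection", "ImageNet"],
}

matches = find_papers(paper_tags, ["convolutional neural networks", "face detection"])
print(matches)  # ['p1', 'p4']
```

Requiring all tags to match narrows a large collection quickly; relaxing the check to any shared tag would broaden the results instead.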
Unstructured textual content is rich with information, but finding what’s relevant is always a challenging task. With the vast amount of data that comes from social media, email, blogs, news, and academic articles, it becomes increasingly hard, and increasingly important, to extract, categorize, and learn from that information. There are other NLP techniques for this kind of discovery, but when you want your data categorized and well structured, an entity extraction API is a strong choice. Try our entity extraction API and check for yourself. If you have other ideas for use cases of entity extraction, we would love to hear about them.