The client is a company that processes data collected from the media and finds insights within it. They provide services in the form of various products - web and mobile applications with dashboards and reports - which are used by the client's customers. Behind all of that lie many complex tasks implemented by aging software and outdated services. The primary purpose of our work is to improve these existing products by replacing old microservices and to help other teams by automating their workflows with modern solutions. This is accomplished using multiple Natural Language Processing (NLP) models and AI techniques.
Our goal was to build NLP backend services that improve the performance and accuracy of these solutions. Among others, we identified areas such as:
By making each NLP model part of an independent microservice, we provide an API for every model, which can then be exposed to any of the aforementioned client products. With these APIs integrated into the products, the time needed to generate final results is significantly reduced.
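To make this concrete, here is a minimal sketch of the kind of request/response contract such a microservice might expose. The function name, JSON fields, and placeholder result are all hypothetical illustrations, not the client's actual interface; a real service would invoke an NLP model where the comment indicates.

```python
import json

def handle_request(body: str) -> str:
    """Hypothetical handler: accept a JSON payload with a 'text'
    field and return a JSON result, mimicking the shape of an
    NLP microservice endpoint."""
    payload = json.loads(body)
    text = payload.get("text", "")
    # A real service would run the NLP model here; as a stand-in we
    # return a placeholder label and a simple whitespace token count.
    result = {"label": "NEUTRAL", "tokens": len(text.split())}
    return json.dumps(result)
```

Keeping each model behind its own small interface like this is what lets the services be deployed, scaled, and replaced independently.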
For data wrangling, we used standard Python libraries such as NumPy, Pandas, PyTorch, and NLTK. For NLP models, we used Hugging Face, an open-source provider of NLP technologies. Most of the models were fine-tuned on the client's data using JupyterHub hosted as an on-premises solution. Depending on the task at hand, we used models with complex architectures such as BERT, RoBERTa, and GPT-2. These models needed to be language-agnostic, as they would be used in multiple countries, which was one of the main reasons we implemented the solution manually.
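The text typically goes through a normalization step before it reaches a tokenizer. The sketch below shows one plausible, minimal version of such a cleaning step using only the standard library; the exact rules (e.g. whether to lowercase, which matters for cased models like BERT) are an assumption here, not the client's actual pipeline.

```python
import re
import unicodedata

def clean_text(text: str) -> str:
    """Illustrative text normalization of the kind often applied
    before tokenization: Unicode NFKC normalization, lowercasing,
    and whitespace collapsing."""
    text = unicodedata.normalize("NFKC", text)  # unify e.g. non-breaking spaces
    text = text.lower()                          # skip this for cased models
    text = re.sub(r"\s+", " ", text).strip()     # collapse runs of whitespace
    return text
```

In practice this step would sit in front of a Hugging Face tokenizer, which then handles subword splitting for the chosen model.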
For a Data Scientist to contribute fully to a project, knowledge of Machine Learning alone is not enough. This is why we expanded our knowledge of the following technologies:
At the beginning of our journey on this project, we were already familiar with NLP. However, since most of the materials and documentation were written in German, we started taking German classes. This step proved helpful, as it allowed us to analyze the textual results and the decisions made by the NLP models much more quickly. We actively participated in discussions about the best model to use for each microservice and contributed suggestions of our own - we stay up to date by reading the latest scientific papers and following the newest libraries in the NLP field.
For each NLP model we deployed, we asked other teams in the company for feedback on its performance so we could improve our solutions and meet the client's new requirements. Based on that feedback, we proposed multiple options for improvement, and through iterative testing we converged on the most suitable solutions.
To help other teams understand what happens behind the scenes in the NLP microservices, we built a demo application for all employees of the client's company so they could try out the models and see what the results look like. Furthermore, we stored some of the results in an Elasticsearch cluster to optimize the demo app's execution time. The advantage of an Elasticsearch cluster lies in distributing indexing and search tasks across all nodes in the cluster.
This made it easier to reuse results and avoid re-running the algorithms, which greatly improved response time and user experience.
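The reuse pattern described above is essentially a cache-aside lookup: check the store first, and only run the model when no stored result exists. The sketch below illustrates the idea with an in-memory dict standing in for the Elasticsearch index; a real implementation would use the Elasticsearch client's document get/index operations instead, and the function and ID names are hypothetical.

```python
from typing import Callable, Dict

# In-memory stand-in for the Elasticsearch index used in the demo app.
_cache: Dict[str, str] = {}

def get_or_compute(doc_id: str, compute: Callable[[], str]) -> str:
    """Cache-aside lookup: return a stored NLP result if present,
    otherwise run the (expensive) model once and store its output."""
    if doc_id in _cache:
        return _cache[doc_id]
    result = compute()      # e.g. run the NLP model on the document
    _cache[doc_id] = result # persist so later requests skip the model
    return result
```

Because repeated requests for the same document hit the store instead of the model, the demo app's response time stays low even for expensive analyses.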
We took responsibility for our work, actively participating in the monthly meetings and sprint reviews where we presented our progress and completed tasks. This strengthened our relationship with the client and grew into a successful partnership.
Kristina is a well-organized team player, always interested in making sense of data and ready to contribute to problem-solving.