The NLP Lab is Heka.ai's center of expertise on Natural Language Processing (NLP): a team passionate about NLP challenges and technologies. Its missions include:
- Technology monitoring
- Centralization and diffusion of NLP expertise within our Heka.ai team
- Benchmark and development of new approaches to major NLP use cases
Our research topics
Within our solutions (e.g., Deep Review and CRM4.0), we need to detect themes in a corpus of documents in an unsupervised way, that is, without labeled data. This NLP use case is known as Topic Modeling.
We benchmark methods from the literature as well as combinations of approaches devised by our teams. This benchmark is based on a dataset of Google reviews in French. It lets us rank methods against several criteria: the time required to detect themes, and the coherence of and separation between the proposed topics.
Once we had identified the best approaches, we integrated them into a pre-existing theme-analysis package.
To complement the benchmark carried out in the NLP Lab, we supervised two groups of students (from CentraleSupélec and Mines Saint-Etienne) working on Topic Modeling subjects as part of their final-year theses.
A similar research-and-capitalization approach was carried out on related use cases:
- Supervised theme detection (multi-label and multi-class) in a document corpus.
- Linking named entities to the themes detected in a document.
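As a rough sketch of the first of these use cases, supervised multi-label theme detection can be framed as a one-vs-rest classifier over text features. The texts, theme labels, and model choice below are hypothetical placeholders, not the approaches from our internal work.

```python
# Multi-label theme detection sketch: one-vs-rest logistic regression
# over TF-IDF features. Data and labels are made up for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.pipeline import make_pipeline

texts = [
    "the staff was helpful and the prices were low",
    "prices are too high for such small portions",
    "friendly staff, quick answers to questions",
    "cheap menu and generous portions",
]
# Each verbatim can carry several themes at once (multi-label).
labels = [{"staff", "price"}, {"price"}, {"staff"}, {"price"}]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)  # binary indicator matrix, one column per theme

model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(texts, Y)

pred = model.predict(["the staff was very friendly"])
print(mlb.inverse_transform(pred))
```

The one-vs-rest decomposition is what makes the multi-label setting tractable with ordinary binary classifiers: each theme gets its own independent detector.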
Textual data annotation tool
As part of our work for our Heka.ai solutions and client engagements, we sometimes need to annotate textual data: named entities, themes, sentiments, etc.
This is why we are developing a textual data annotation tool. The challenge is identifying helpful annotation types and building an architecture that can interface with any kind of database and environment.
Thanks to a graphical interface accessible to each user profile, the tool allows functional experts to annotate textual data in an advanced way: multi-label management, labeling of a subset of the text, automated multi-annotator reconciliation, and administration of annotation campaigns with per-user access management.
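The span-level labeling and multi-annotator reconciliation features can be sketched with simple data structures. The field names and the majority-vote rule below are assumptions made for illustration, not the tool's actual schema or reconciliation strategy.

```python
# Hypothetical sketch of span-level annotations and a majority-vote
# reconciliation across annotators. Not the tool's real data model.
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    doc_id: str
    start: int        # character offset where the labeled span begins
    end: int          # character offset where it ends
    label: str        # e.g. a theme, entity type, or sentiment
    annotator: str

def reconcile(annotations, min_votes=2):
    """Keep spans on which at least `min_votes` annotators agree."""
    votes = Counter((a.doc_id, a.start, a.end, a.label) for a in annotations)
    return [span for span, n in votes.items() if n >= min_votes]

anns = [
    Annotation("doc1", 0, 5, "PERSON", "alice"),
    Annotation("doc1", 0, 5, "PERSON", "bob"),
    Annotation("doc1", 10, 14, "ORG", "alice"),  # only one vote: dropped
]
print(reconcile(anns))  # → [('doc1', 0, 5, 'PERSON')]
```

Keying votes on the (document, span, label) tuple is one simple design choice; real reconciliation may also need to merge overlapping but non-identical spans.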
The first step is to detect themes in an unsupervised way with Topic Modeling and to annotate a number of verbatims with our annotation tool. The next logical step is to move from this unsupervised setting to a supervised one with Topic Detection.
For this purpose, we have carried out a benchmark of the existing approaches in the literature and new methods imagined by the team to solve supervised topic detection problems.
This benchmark takes into account detection performance (measured with an F1-score), the associated training and inference times, and an estimate of the minimum number of annotated verbatims required. We can thus choose the most appropriate method for each use case and its constraints.
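The measurement pattern described above can be sketched as follows. The placeholder model and toy data stand in for the methods actually compared; only the shape of the evaluation (F1-score plus timed training and inference) mirrors the text.

```python
# Benchmark-loop sketch: score one candidate method on F1 and on
# wall-clock training/inference time. Model and data are placeholders.
import time
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

train_texts = ["good food", "bad service", "tasty meal", "rude staff"]
train_y = [1, 0, 1, 0]          # 1 = "food" theme, 0 = "service" theme
test_texts = ["delicious food", "unhelpful staff"]
test_y = [1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())

t0 = time.perf_counter()
model.fit(train_texts, train_y)
train_time = time.perf_counter() - t0

t0 = time.perf_counter()
pred = model.predict(test_texts)
infer_time = time.perf_counter() - t0

print(f"F1: {f1_score(test_y, pred):.2f}")
print(f"train: {train_time:.4f}s  inference: {infer_time:.4f}s")
```

Running the same loop over a grid of methods and training-set sizes is also how one would estimate the minimum number of annotated verbatims a method needs.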