How to perform an automatic semantic analysis?

Published on January 31, 2022  - Updated on February 01, 2022

Semantic analysis is often seen as a tedious process that is very costly in time and resources. Thanks to artificial intelligence, this is no longer really the case. In this article, we will see how to carry out a semantic analysis and why an automatic tool lets you automate all of these tasks and analyze all your verbatims in record time!

A – The preparation

Preparing a good automatic semantic analysis can be broken down into 7 distinct steps.

Before starting, it is important to note that most algorithms depend on the language of the verbatims. Grammatical and syntactic rules differ from one language to another, so you should either define the language before embarking on a semantic analysis or make sure that the tool can automatically detect the language of each comment.
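To give an idea of what automatic language detection can look like, here is a minimal sketch in Python. It simply counts how many very common words of each language appear in the comment. The word lists and the guess_language function are assumptions made for this illustration; production tools rely on much richer models.

```python
from typing import Optional

# Tiny, hypothetical word lists used only for this illustration.
COMMON_WORDS_BY_LANGUAGE = {
    "en": {"the", "and", "at", "is", "of"},
    "fr": {"le", "la", "et", "est", "de"},
}

def guess_language(text: str) -> Optional[str]:
    """Return the language whose common words appear most often, or None."""
    words = set(text.lower().split())
    scores = {
        language: len(words & common_words)
        for language, common_words in COMMON_WORDS_BY_LANGUAGE.items()
    }
    best = max(scores, key=scores.get)
    # If no common word was found at all, we prefer not to guess.
    return best if scores[best] > 0 else None

language = guess_language("The meal was cold and the steward is rude")
```

Here language is "en", and a comment with no known word returns None so that it can be handled separately.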

Once this is done, we can move on to the first step: sentence segmentation.

1)   Sentence segmentation

The verbatims left by your customers will, in the vast majority of cases, be made up of several sentences. The first step is therefore to split the data: in this segmentation phase, each verbatim is broken down sentence by sentence. The objective is to place the words of each sentence in their context in order to establish the meaning of the sentence itself.
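As an illustration, sentence segmentation can be sketched in a few lines of Python. This naive version cuts after a period, exclamation mark or question mark followed by a space; it is an assumption made for the example and will be fooled by abbreviations such as "Mr. Smith", which real tools handle with dedicated rules or models.

```python
import re

def split_sentences(verbatim: str) -> list[str]:
    """Naive splitter: cut after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", verbatim.strip())
    return [part for part in parts if part]

sentences = split_sentences(
    "The steward served me my meal. He didn't even look at me."
)
# sentences == ["The steward served me my meal.", "He didn't even look at me."]
```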

2)   Tokenization

This step is the logical continuation of the previous one.

Whereas segmentation separates the sentences within a verbatim, tokenization finds the words in each of these sentences and assigns each one a "token". It is this token that allows the algorithm to identify the words correctly, analyze them semantically, and move on to the next step: grammatical interpretation.
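Here is a minimal sketch of tokenization in Python, assuming English text. The regular expression is our own simplification: it keeps contractions such as "didn't" as a single token and treats each punctuation mark as a separate token.

```python
import re

# Words (with an optional apostrophe part), numbers, or single punctuation marks.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+|[^\w\s]")

def tokenize(sentence: str) -> list[str]:
    """Split a sentence into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(sentence)

tokens = tokenize("He didn't even look at me.")
# tokens == ["He", "didn't", "even", "look", "at", "me", "."]
```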

3)   Grammatical interpretation

The third step in carrying out our semantic analysis is grammatical interpretation (or part-of-speech tagging). The idea at this stage is to find out which words are adjectives, subjects, verbs, etc. This step is crucial, since it is what allows the semantic analysis algorithm to understand the sentence and to create links between the different words.
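To make the idea concrete, here is a deliberately simple sketch in Python that tags tokens by looking them up in a small dictionary. The dictionary and the tag names are assumptions made for this example; real taggers use statistical models that take the surrounding words into account.

```python
# Toy lexicon, hand-written for this illustration only.
TOY_LEXICON = {
    "the": "determiner",
    "a": "determiner",
    "steward": "noun",
    "meal": "noun",
    "served": "verb",
    "cold": "adjective",
}

def tag_tokens(tokens: list[str]) -> list[tuple[str, str]]:
    """Attach a grammatical tag to each token, or 'unknown' if not in the lexicon."""
    return [(token, TOY_LEXICON.get(token.lower(), "unknown")) for token in tokens]

tagged = tag_tokens(["The", "steward", "served", "a", "cold", "meal"])
# tagged[1] == ("steward", "noun") and tagged[4] == ("cold", "adjective")
```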

4)   Lemmatization

Once the grammatical interpretation has been made, the different words must be grouped by word family; the form that represents each family is called a lemma. To put it simply, if a sentence contains the word "eating", the algorithm will automatically recognize that this word belongs to the "eat" family, and the associated lemma will therefore be "eat". The idea is to keep only the meaning of the word.

This greatly simplifies the semantic analysis and increases its reliability. If the algorithm is able to understand the meaning of the words within the sentences, it can clearly identify what the customer is talking about in their comment and classify it in the appropriate theme.
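The "eating" example can be sketched in Python with a small lookup table. The table is hypothetical and only covers a handful of forms; real lemmatizers combine large dictionaries with the grammatical tags from the previous step.

```python
# Hypothetical table mapping inflected forms to their lemma.
TOY_LEMMAS = {
    "eating": "eat",
    "eats": "eat",
    "ate": "eat",
    "eaten": "eat",
    "served": "serve",
    "meals": "meal",
}

def lemmatize(token: str) -> str:
    """Return the lemma of a token, or the lowercased token if it is not in the table."""
    lowered = token.lower()
    return TOY_LEMMAS.get(lowered, lowered)

lemmas = [lemmatize(word) for word in ["Eating", "ate", "meals"]]
# lemmas == ["eat", "eat", "meal"]
```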

5)   Stop word cleaning

During lemmatization, it is also necessary to remove the words that are useless to the analysis: these are called stop words. They bring no value to the general analysis of the verbatim, and they differ from one language to another, hence the interest of knowing the language beforehand.

For example, in English, stop words include "and", "at", "the", etc. They appear very often in verbatims and slow down the work without adding value to the understanding of the text. This is why the cleaning of stop words should not be neglected.
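In Python, this cleaning step can be sketched as a simple filter. The stop word list below is a short, hypothetical sample built from the examples above; real lists contain a few hundred words per language.

```python
# Short sample list for English, for illustration only.
ENGLISH_STOP_WORDS = {"and", "at", "the", "a", "of", "to"}

def remove_stop_words(tokens: list[str]) -> list[str]:
    """Keep only the tokens that are not stop words (case-insensitive)."""
    return [token for token in tokens if token.lower() not in ENGLISH_STOP_WORDS]

cleaned = remove_stop_words(["The", "steward", "looked", "at", "the", "meal"])
# cleaned == ["steward", "looked", "meal"]
```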

6)   Dependency analysis

We can then move on to dependency analysis. This consists of establishing links between the different words found in the previous steps. What are the topics? What are the adjectives? What are the action verbs? The objective is to relate the words to each other, regardless of their position in the sentence, in order to identify the different subjects mentioned in the verbatim.
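Computing these links requires a trained parser, which is beyond a short example. What we can sketch in Python is how the result is often represented: a list of (head, relation, dependent) links that can then be queried. The links below were written by hand for the sentence "The steward served a cold meal", and the relation names are assumptions made for this illustration.

```python
from typing import NamedTuple

class Dependency(NamedTuple):
    head: str        # the word that governs the link
    relation: str    # the kind of link
    dependent: str   # the word attached to the head

# Hand-written links, not the output of a real parser.
PARSE = [
    Dependency("served", "subject", "steward"),
    Dependency("served", "object", "meal"),
    Dependency("meal", "modifier", "cold"),
]

def dependents(parse: list[Dependency], head: str, relation: str) -> list[str]:
    """Return the words attached to `head` through `relation`."""
    return [d.dependent for d in parse if d.head == head and d.relation == relation]

who_served = dependents(PARSE, "served", "subject")   # ["steward"]
meal_qualities = dependents(PARSE, "meal", "modifier")  # ["cold"]
```

With this representation, "cold" is attached to "meal" whatever its position in the sentence.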

7)   Identification of coreferences

The final step is to look for the relationships between these different subjects by identifying coreferences. This simply means finding all the terms that refer to the same subject. For example, in the sentences “The steward served me my meal. He didn't even look at me.”, the word "he" is a coreference of "steward". Thanks to this work, the tool will be able to correctly associate the negative emotion of the second sentence with the word "steward".
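The steward example can be reproduced with a very rough heuristic in Python: when a sentence starts with a subject pronoun, we link it to the first noun of the previous sentence. This rule, the pronoun list and the tag names are assumptions chosen for the illustration; real coreference resolution uses trained models and fails less easily.

```python
SUBJECT_PRONOUNS = {"he", "she", "it", "they"}

def link_subject_pronouns(
    tagged_sentences: list[list[tuple[str, str]]],
) -> list[tuple[str, str]]:
    """Link a sentence-initial pronoun to the first noun of the previous sentence."""
    links = []
    previous_subject = None
    for tagged in tagged_sentences:
        if not tagged:
            continue
        first_word = tagged[0][0]
        if first_word.lower() in SUBJECT_PRONOUNS and previous_subject is not None:
            links.append((first_word, previous_subject))
        else:
            nouns = [word for word, tag in tagged if tag == "noun"]
            if nouns:
                previous_subject = nouns[0]
    return links

# Tags written by hand for the two sentences of the example.
example = [
    [("The", "determiner"), ("steward", "noun"), ("served", "verb"),
     ("me", "pronoun"), ("my", "determiner"), ("meal", "noun")],
    [("He", "pronoun"), ("did", "verb"), ("not", "adverb"),
     ("look", "verb"), ("at", "preposition"), ("me", "pronoun")],
]
links = link_subject_pronouns(example)
# links == [("He", "steward")]
```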

As we have just seen, the preparation work for a semantic analysis is very substantial, but once it is carried out, you can expect a much more reliable analysis. We will come back to this later, but if you decide to use an automatic semantic analysis tool, all of these tasks will be carried out by the tool, which will save you considerable time.

B – Which approach should be used to perform an automatic semantic analysis?

Now that we've seen the prep work, it's time to look at the analysis itself. There are two different approaches to carrying it out:

- Classification

- Text-clustering

1) Classification

This approach first requires defining a “model” upstream, which consists of establishing all the themes (or classes) that we wish to find when analyzing the verbatims. This allows the algorithm to automatically classify the verbatims into the themes that have been defined beforehand.

By coupling this classification with emotional analysis, it is possible to identify at a glance which themes are experienced most negatively by your customers (irritants) and, conversely, which are the points of enchantment.

This approach can also be applied to entities such as places, stages of the customer journey, etc. For example, if one of your customers mentions an in-store experience in their comment, it is possible to search the data to find out whether the customer was at the checkout, in the fitting room, etc. It is this set of methods that allows you to better qualify customer data and increase the value and ROI of semantic analysis.
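To illustrate classification into predefined themes, here is a minimal keyword-based sketch in Python. The themes and keywords are hypothetical, and matching keywords is only one simple way to build such a "model"; automatic tools generally rely on trained classifiers instead.

```python
# Hypothetical "model": each theme is defined upstream by a few keywords (lemmas).
THEME_KEYWORDS = {
    "checkout": {"checkout", "queue", "cashier"},
    "fitting room": {"fitting", "mirror"},
    "staff": {"steward", "staff", "seller"},
}

def classify(lemmas: list[str]) -> list[str]:
    """Return every predefined theme that shares at least one keyword with the verbatim."""
    found = set(lemmas)
    return sorted(
        theme for theme, keywords in THEME_KEYWORDS.items() if found & keywords
    )

themes = classify(["queue", "checkout", "long"])
# themes == ["checkout"]
```

A verbatim that matches no keyword returns an empty list, which is a useful signal that the model may be missing a theme.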

2) Text-clustering

Unlike classification, text clustering does not require themes to be defined upstream. The algorithm groups together verbatims that resemble each other, and the themes emerge from the groups (or clusters) found in the data. This makes it possible to surface subjects that were not anticipated when the classification model was built.

As with classification, these groups can be coupled with emotional analysis in order to see which subjects are associated with negative emotions (irritants) and which are points of enchantment.
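By way of contrast with predefined themes, a clustering approach groups similar verbatims without any list of themes. Here is a minimal sketch in Python, assuming each verbatim has already been reduced to a set of lemmas. The similarity measure (Jaccard), the threshold of 0.3 and the greedy grouping are choices made for this illustration, not the method of any particular tool.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Share of words two verbatims have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(verbatims: list[set[str]], threshold: float = 0.3) -> list[list[int]]:
    """Greedy grouping: join the first cluster whose first member is similar enough."""
    clusters: list[list[int]] = []
    for index, words in enumerate(verbatims):
        for members in clusters:
            if jaccard(words, verbatims[members[0]]) >= threshold:
                members.append(index)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append([index])
    return clusters

verbatims = [
    {"queue", "checkout", "long"},
    {"checkout", "queue", "slow"},
    {"fitting", "room", "dirty"},
    {"fitting", "room", "small"},
]
groups = cluster(verbatims)
# groups == [[0, 1], [2, 3]]
```

No theme was given in advance: the two groups (checkout and fitting room) emerge from the words the verbatims share.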

C – Why combine classification and text-clustering?

Very often, it is worthwhile to set up both methods. Let's take an example to understand why.

Imagine that you are looking for your priority irritants. In this case, classification and emotion analysis allow you to directly target the negative points that bring out the most sadness, anger or disgust. By adding a clustering approach, you can group together the subjects that have caused these negative emotions and therefore bring out your irritants. It is by coupling these two approaches that you will obtain the best results in your search!
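This combination can be sketched in Python as follows. We assume that each verbatim has already received an emotion label and a cluster label from the previous steps; the field names and the sample data are hypothetical and only serve the illustration.

```python
from collections import Counter

NEGATIVE_EMOTIONS = {"sadness", "anger", "disgust"}

def priority_irritants(analysed: list[dict[str, str]]) -> list[tuple[str, int]]:
    """Count negative verbatims per cluster, most frequent first."""
    counts = Counter(
        item["cluster"] for item in analysed if item["emotion"] in NEGATIVE_EMOTIONS
    )
    return counts.most_common()

# Made-up sample of already analysed verbatims.
analysed = [
    {"emotion": "anger", "cluster": "checkout queue"},
    {"emotion": "joy", "cluster": "staff"},
    {"emotion": "disgust", "cluster": "fitting room"},
    {"emotion": "sadness", "cluster": "checkout queue"},
]
irritants = priority_irritants(analysed)
# irritants == [("checkout queue", 2), ("fitting room", 1)]
```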

To conclude, as we have seen, performing a semantic analysis by hand is a very complex and time-consuming task. The good news is that this whole process can now be fully automated! If you are interested in this topic and want to know more, request your free demo of our solution.
