Text mining with Scilab

Words are a fundamental aspect of our life. They allow us to communicate our feelings, thoughts and emotions to others.

Why not apply advanced mathematics to documents and try to extract useful information automatically? The answer to this question is called “text mining”, a mature discipline whose main concern is to develop automatic procedures that allow an “intelligent” reading of a large quantity of documents. The aim is to extract useful information hidden in texts, to highlight links and relations that could be invisible to a human reader, to propose effective classifications of the documents, and to facilitate a fast search for relevant ones.

These techniques have gained the attention of the scientific community with the advent of the Internet, the largest library we have today.

In this paper we show how easily a text classifier can be implemented with Scilab. The classifier is based on self-organizing maps (SOMs). We explore a corpus made of all the English papers that appeared in the EnginSoft newsletter in recent years in order to obtain a mathematical description of our community. The result is surprising: we get a sort of painting that summarizes the interests and relationships of the EnginSoft world.
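
To give a rough idea of how such a classifier works, here is a minimal sketch in Python rather than Scilab. It is not the code from the paper. It assumes that each document is first turned into a tf-idf vector (the figures of this page include a tf-idf histogram, but the exact weighting used by the authors is not stated here) and that a small rectangular SOM is then trained on those vectors. The function names, the toy documents, the 3x3 grid and the learning parameters are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, tf-idf vectors), one vector per non-empty document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for words in tokenized for word in words})
    n_docs = len(docs)
    # Document frequency: in how many documents each word occurs.
    df = Counter(word for words in tokenized for word in set(words))
    vectors = []
    for words in tokenized:
        counts = Counter(words)
        # Term frequency times log inverse document frequency.
        vectors.append([(counts[w] / len(words)) * math.log(n_docs / df[w])
                        for w in vocab])
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(weights, vec):
    """Grid coordinates of the node whose weight vector is closest to vec."""
    return min(weights, key=lambda node: sq_dist(weights[node], vec))

def train_som(vectors, rows=3, cols=3, epochs=200, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM; all parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.random() * 0.1 for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.01    # neighbourhood radius shrinks
        for vec in rng.sample(vectors, len(vectors)):  # shuffled pass
            bmu = best_matching_unit(weights, vec)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (vec[i] - w[i])  # pull node towards the sample
    return weights

# Toy corpus standing in for the newsletter papers (hypothetical data).
docs = [
    "mesh optimization of a turbine blade",
    "optimization of a turbine mesh",
    "text mining of a newsletter corpus",
    "a corpus of newsletter text",
]
vocab, vectors = tfidf_vectors(docs)
som = train_som(vectors)
placement = [best_matching_unit(som, v) for v in vectors]
for doc, node in zip(docs, placement):
    print(node, doc)
```

Each document ends up attached to the map node whose weight vector is closest to it. Words that occur in every document, such as "of" in the toy corpus, receive a tf-idf weight of zero and therefore do not influence the placement.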

Do not hesitate to contact us for more details.

Attachment	Size
Text_mining_Scilab.pdf	1.62 MB

Figure: Self Organizing Map (SOM) representing the corpus
Figures: pictorial representation of keywords (two images)
Figure: tf-idf histogram

Latest news

Scilab 6 beta

February 2016: The Scilab team is happy to announce the release of the first beta version of Scilab 6!

Scilab 6 is a major new release of Scilab, the open source modeling & simulation platform.

What's new:
--> New computation core enabling bigger data sets
--> Improved Xcos allowing larger models
--> Utilities for development productivity (debugger, profiler and coverage)
--> Newsfeed (news, tips and communication from the community and the Scilab team)

Scilab for Uncertainty Quantification

July 2015: Gregorio Pellegrini, M.Sc., defended his master's thesis on "Polynomial Chaos Expansion with applications to PDEs" after completing an internship with the Openeering team. Congratulations, Gregorio!

The entire thesis is available for download in the Openeering "Made with Scilab" section:
Polynomial Chaos Expansion with applications to PDEs

New Scilab 5.5.2

April 2015: Scilab 5.5.2 is released!
Please download the new release from the website: Scilab 5.5.2

Scilab webinar (ITA)

Watch our webinar again!
Discover the advantages of using Scilab/Xcos, the open source alternative to MATLAB®/Simulink®.

The online seminar showed examples of industrial applications, including the modeling and simulation of hybrid dynamical systems, with a completely free tool.
Scilab/Xcos were presented from the technical, licensing and ROI points of view, also in comparison with MATLAB®/Simulink®.

Numerical Analysis using Scilab

January 2015:
A new tutorial on numerical analysis is ready!
This third tutorial provides a collection of numerical methods for solving nonlinear equations using Scilab.
You can also download all the examples we prepared for you!
