
Python installation guide

I will briefly describe the first steps needed to get a working installation of Python on your PC. The version you need depends on what you want to do in Python. Some older projects are written in Python 2.xx, while version 3.xx is now the most common. Since 3.xx is not backward compatible with 2.xx, you will probably need 2.xx if you want to work with older projects.
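As a quick check before choosing what to install, the short sketch below reports which version line an existing interpreter belongs to. The function name `describe_interpreter` is only illustrative, and the code is written so that it should run under both 2.xx and 3.xx.

```python
import sys


def describe_interpreter(version_info=sys.version_info):
    """Return a short description of the interpreter version.

    `version_info` defaults to the running interpreter; a plain tuple
    such as (2, 7, 18) can be passed instead for illustration.
    """
    major, minor = version_info[0], version_info[1]
    # Anything with major version 3 or higher is reported as the 3.xx line.
    line = "3.xx" if major >= 3 else "2.xx"
    return "Python {}.{} ({} line)".format(major, minor, line)


print(describe_interpreter())
```

Running the script with each interpreter installed on the machine shows which one a given command actually launches.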


Text Mining with R: cleaning and preparing data

This is the first post of a series intended to present an overview of the steps involved in text and data mining, from data preparation to sentiment analysis. It is an excerpt of the online seminar held as part of the Summer School for the Plan for Science Degrees (Piano di Lauree Scientifiche, PLS) promoted by the University of Naples Federico II. Participants were secondary school students and early university students.


Help

This is not a help service for all your R, Python and statistical questions, so please avoid posting general questions in the comments or sending them to me by email. Please limit your comments to the published content. If you have questions about data analysis, ask for help on crossvalidated.com. If you have questions about R or Python, ask for help on stackoverflow.com. If you have questions about the website, check my github and the original source code.


DESPOTA

DESPOTA (DEndrogram Slicing through a PermutatiOn Test Approach) is a novel approach that uses permutation tests to automatically detect a partition among those embedded in a dendrogram. Unlike the traditional approach, DESPOTA also includes in its search space partitions that do not correspond to horizontal cuts of the dendrogram. The output of hierarchical clustering methods is typically displayed as a dendrogram describing a family of nested partitions. However, the partitions considered in practice are usually restricted to those obtained by horizontal cuts of the tree, which misses the chance to explore the whole set of partitions housed in the dendrogram.
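To make the difference between horizontal cuts and the full set of embedded partitions concrete, here is a minimal sketch on a toy dendrogram. It is not the DESPOTA algorithm and contains no permutation test; it only enumerates partitions. The tree encoding (nested tuples of the form `(height, left, right)` with string leaves), the merge heights and all function names are assumptions made for illustration.

```python
def leaves(node):
    """Return the leaf labels under a node, left to right."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)


def heights(node):
    """Collect the merge heights of all internal nodes."""
    if isinstance(node, str):
        return set()
    h, left, right = node
    return {h} | heights(left) | heights(right)


def all_partitions(node):
    """Every partition embedded in the tree.

    A node is either kept whole as one cluster, or split into its two
    children, each of which is partitioned independently.
    """
    whole = frozenset([frozenset(leaves(node))])
    if isinstance(node, str):
        return [whole]
    _, left, right = node
    result = [whole]
    for p in all_partitions(left):
        for q in all_partitions(right):
            result.append(p | q)
    return result


def horizontal_cut(node, threshold):
    """Partition from a horizontal cut: nodes merged at or below
    `threshold` stay whole, higher nodes are split."""
    if isinstance(node, str) or node[0] <= threshold:
        return frozenset([frozenset(leaves(node))])
    _, left, right = node
    return horizontal_cut(left, threshold) | horizontal_cut(right, threshold)


# Toy dendrogram (hypothetical heights, assumed positive):
# a and b merge at 1.0, c and d at 2.0, and the two pairs at 5.0.
tree = (5.0, (1.0, "a", "b"), (2.0, "c", "d"))

embedded = set(all_partitions(tree))
horizontal = {horizontal_cut(tree, t) for t in [0.0] + sorted(heights(tree))}
non_horizontal = embedded - horizontal

print(len(embedded), len(horizontal))
for partition in non_horizontal:
    print(sorted(sorted(block) for block in partition))
```

In this toy tree, the partition that splits a from b while keeping c and d together cannot be produced by any single cut height, because c and d merge at a greater height than a and b. It is an example of the kind of partition that a search restricted to horizontal cuts leaves out.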


About NeaS

The Nea-Statistic (NeaS) blog is intended to be a virtual meeting place where people can read about statistics and share new findings in data science. We plan to introduce students (and a broader audience) to some current topics in Data Science, while also making some of our research interests visible to the general public. Our hope is to create a cozy corner where we can informally share observations and ideas with an international community, without the pressure of the peer review process.
