04 – Just a brief introduction to R

Hello crew, how is it going? “R” you ready for a new post?

Today I’ll introduce the software that I’m using for the data analysis in my project.
The name of the software is “R”.


Have you ever heard of it? When I arrived at NTEC, I knew about it only because my brother works with it. My brother is a mathematician (strange people, but I love my brother anyway). In any case, I already knew how to handle programming languages such as MATLAB®, C++, Python and Microsoft® Visual Basic for Applications (VBA), so learning a new one was not a problem. Furthermore, R is quite easy to learn and its syntax feels quite similar to Python’s. Yes, it can be confusing at times, because it is easy to type a command from another language instead of the one you are actually using, but once you know how to handle one of them, you know (almost) all of them. You just change the syntax of the command or the keyword that calls the specific function you need. Learning by practice is the best (and probably the only) way in these cases.

R is a powerful but very light piece of free, open-source software, generally used for statistical computing. It offers a wide range of facilities for data manipulation, calculation and graphical display. Being free and open is its major strength, as new libraries and functions are added daily by users from all over the world, covering all kinds of issues and topics.

The reason I chose R over other software for the data analysis, apart from it being free, is its versatility. R can handle different types of data quickly, without any difficulty, and using just a few lines of code. The latter is another major advantage of the software, thanks to the large variety of libraries and functions that it offers.
At the same time, using R, I can handle any kind of file by treating it as a matrix or a data frame (a sort of database table). Microsoft® Excel files, as well as “.csv”, “.txt” and many other file formats, can easily be read and written. SQL databases can also be accessed and queried. Furthermore, georeferenced data can be handled and quickly mapped too. It is perfect for the aim of my project.
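To give an idea of what “just a few lines of code” looks like, here is a minimal sketch using only functions that come with R. The data and the names (readings, site, temperature) are invented for illustration and have nothing to do with my project data; the file is written to a temporary location so that the example runs on its own.

```r
# Hypothetical sample data, invented purely for illustration.
readings <- data.frame(
  site = c("A", "B", "C"),
  temperature = c(12.5, 14.1, 9.8)
)

# Write the data frame to a temporary ".csv" file...
csv_path <- tempfile(fileext = ".csv")
write.csv(readings, csv_path, row.names = FALSE)

# ...and read it back in as a data frame.
loaded <- read.csv(csv_path)
str(loaded)

# Columns are accessed by name with the $ operator.
mean_temperature <- mean(loaded$temperature)

# The numeric part can also be treated as a matrix.
numeric_part <- as.matrix(loaded["temperature"])
```

Reading a “.txt” file works in much the same way with read.table(); Excel files and SQL databases need an additional package.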

Essentially, R comes with a bunch of pre-installed basic functionalities. Once R has been installed, it can readily be used for some data analysis and graphs. However, as mentioned before, the software can be expanded just by downloading packages that add specific capabilities (functions) to R. A package is basically a set of functions (scripts or parts of them) that helps to perform specific tasks. In this sense, a function can be seen as a procedure which can be used (called) whenever it is needed and which always performs the same fixed step-by-step sequence of activities, like a routine.

Anybody can create their own package, ready for other people to use. Most packages are stored in a central repository known as CRAN (the Comprehensive R Archive Network). CRAN is central to using R, because from it every user can download packages (or submit new ones) in order to customize their own R. The idea of needing to add packages to the software might seem odd, but it lets users download just the packages they really need, and it allows the software to remain quick and light as well as powerful. For this reason, IT people usually call software like R “modular software”. Being modular gives R enormous flexibility: every time a new issue needs to be solved or a new statistical technique is developed, contributors can react quickly by producing a new R package and submitting it to CRAN.
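As a rough illustration of functions and packages, here is a small sketch. The function name rescale is made up for this example, and readxl is just one example of a CRAN package (it reads Excel files); the installation lines are commented out because they need an internet connection.

```r
# A function: a named procedure that always performs the same steps.
# "rescale" is a hypothetical name chosen for this example.
rescale <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Call it whenever it is needed.
scaled <- rescale(c(2, 4, 6))
print(scaled)

# Installing a package from CRAN is done once (needs internet):
# install.packages("readxl")

# Loading it is done in every session where it is needed:
# library(readxl)

# Packages that ship with R, such as "stats", load the same way.
library(stats)
```

Once a package has been loaded with library(), its functions can be called just like the ones you write yourself.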

The main disadvantage of the software is that in R you work mainly through the command line, with little in the way of a GUI (Graphical User Interface) to help users. Moreover, the debugging tools are not as advanced as those in MATLAB®, in Spyder (Python) or in other similar software, and for this reason it can be difficult to find out where and why an error occurs.

You can install R on your PC just by visiting the website of the R project (https://www.r-project.org/) and following these steps:
1)  On the left-hand side of the website, click on the link labelled CRAN;
2)  Choose the mirror closest to your country by clicking on it;
3)  Once you have chosen the mirror you prefer, you will be asked to choose the platform of your PC (Windows in most cases, or alternatively Linux or macOS);
4)  Then, click on the “base” link if you are downloading R for the first time;
5)  Finally, download the installer for the latest version of R and just follow the instructions of the .exe file.
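Once the installation has finished, a quick check from the R console could look like the sketch below. It only uses functions that come with R; the variable names are my own choice.

```r
# Show which version of R has been installed.
version_text <- R.version.string
print(version_text)

# List the names of the packages that are already available.
installed <- rownames(installed.packages())
print(head(installed))
```

If both commands print something sensible, R is ready to use.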

What else? Just practice!
R is easy to learn and very quick at data analysis, so give it a try and let me know what you think about it!

You can find nice tutorials about R on the internet just by googling keywords like “R”, “R data analysis”, “R tutorial”, etc.
Or you can visit some of these websites which I found very useful:

Videos on YouTube® and many other websites about programming in R are also available on the Internet. Just search for them if you are interested and want to try this software.

That’s all for this post, folks! While you wait for the next one, enjoy the Easter break!

In the next post, which will probably be released immediately after Easter, I will show you how I intend to process the data we collected. So, one more time: stay tuned!

Cheers,


FP13
