Import SPSS, text files or tables as a data source: reading data into RStudio
Anyone who processes data with the R programming language can create data sets directly. In most cases, however, the data is imported from other sources. We show the procedure and the available options.
Successfully imported data to RStudio.
(Image: RStudio / Joos)
There are several ways to import data into R. In many cases, CSV or XLS files are used for the import. In general, it is advisable to save the source files in a directory on the computer that belongs to the corresponding R project. This makes it possible to check or re-read the source data at any time if necessary.
Prepare packages for function calls
Data can also be imported in R with a function call. This requires the package “readr”, which provides various import functions such as read_csv(), read_tsv() and read_delim().
Before the functions from this package can be used, the package must be installed and loaded. readr can be installed explicitly with the following command:
install.packages("readr")
A better starting point, however, is to install the complete Tidyverse collection of data science packages. This works analogously with the command:
install.packages("tidyverse")
The functions are then loaded once per R session with “library(tidyverse)” or “library(readr)”. If data is imported via the graphical interface, RStudio detects missing packages and offers to install them via the GUI. This also applies to the two packages mentioned above.
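As a minimal sketch of a function-based import, the following example writes a small, made-up CSV file to a temporary location and reads it back with read_csv() from readr. The file contents and the column names (month, income, expenses) are invented for illustration and are not taken from the article's data.

```r
library(readr)

# Create a tiny CSV file in a temporary location (hypothetical sample data).
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("month,income,expenses",
    "Jan,2500,1800",
    "Feb,2600,1750"),
  csv_path
)

# Read the file; declaring the column types avoids type guessing.
budget <- read_csv(
  csv_path,
  col_types = cols(
    month    = col_character(),
    income   = col_double(),
    expenses = col_double()
  )
)

print(budget)
```

Declaring the column types explicitly is optional, but it makes repeated imports behave the same way even if the file contents change.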
RStudio can open and display Excel spreadsheets (*.xls, *.xlsx) even if Excel is not installed. To import Excel tables, the package “readxl” must be loaded. For example, the code for importing data from an Excel spreadsheet looks like this:
library(readxl)
Einnahmen_Ausgaben_Tabelle <- read_excel("C:/Users/User/Desktop/Einnahmen-Ausgaben Tabelle.xlsx")
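If a workbook contains several worksheets, the sheet to import can be selected by name. The following sketch assumes the example workbook "datasets.xlsx" that ships with the readxl package; it stands in for the article's spreadsheet, which is not available here.

```r
library(readxl)

# Example workbook bundled with readxl (not the article's spreadsheet).
xlsx_path <- readxl_example("datasets.xlsx")

# List the worksheets contained in the workbook.
sheet_names <- excel_sheets(xlsx_path)
print(sheet_names)

# Import one worksheet by name; the first row supplies the column names.
mtcars_sheet <- read_excel(xlsx_path, sheet = "mtcars", col_names = TRUE)
print(head(mtcars_sheet))
```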
Import data into RStudio with the graphical interface
For a one-off import, it is usually easier and more efficient to use RStudio’s graphical interface. The advantage is that the data is not only imported but also shown in a preview beforehand. If you need to import data more often, you are of course better served with a script, because the process can then be automated.
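What such an automated import might look like is sketched below: every CSV file in a directory is read with the same column specification and the results are stacked into one table. The directory, the file names and the columns (id, value) are hypothetical and created on the fly so that the example is self-contained.

```r
library(readr)

# Set up a scratch directory with two small CSV files (made-up data).
data_dir <- tempfile("import_demo_")
dir.create(data_dir)
writeLines(c("id,value", "1,10", "2,20"), file.path(data_dir, "a.csv"))
writeLines(c("id,value", "3,30"), file.path(data_dir, "b.csv"))

# Find every CSV file in the directory.
csv_files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file with the same column specification.
tables <- lapply(
  csv_files,
  read_csv,
  col_types = cols(id = col_integer(), value = col_double())
)

# Stack the individual tables into one data frame.
combined <- do.call(rbind, tables)
print(combined)
```

In a real project, data_dir would point to the data directory of the R project instead of a temporary folder.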
The import options in RStudio can be found via the menu item “File > Import Dataset”. It contains separate menu items for importing Excel tables, text files or other source data. Data can also be imported via the “Environment” tab. Its “Import Dataset” button offers the same options as “File > Import Dataset”.
In some cases, importing data requires current versions of certain packages, for example “haven” and “Rcpp”. If these are not installed, RStudio points this out and offers to install them.
Control and execute import process
When the data is selected for import, RStudio displays the read data in “Data Preview”. Settings for the import process can still be selected in the lower area. On the right you can see the source code used to import the data.
If an option is changed, the effect is visible in “Data Preview”. In addition, RStudio displays the corresponding code in “Code Preview”. For example, if you use the “First Row as Names” option, R treats the first row as column names rather than as data.
Which options RStudio offers for the import depends on the type of data being imported. For example, when importing SPSS data sets (Statistical Package for the Social Sciences), only the name can be selected. Depending on the imported data set, the data also receives different attributes.
The data is read in via the “Import” button. On the “Environment” tab, the name of the data frame then appears under “Data”. This is the data that has just been imported. The name is set as an option when importing the data. The import uses the code that was previously displayed in “Code Preview”.
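For text files, the “First Row as Names” option presumably corresponds to the col_names argument of read_csv() in the generated code. The following sketch, based on a made-up two-column file, shows the difference between the two settings; the exact code RStudio generates may look different.

```r
library(readr)

# A small made-up file whose first row holds the column names.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "Anna,12", "Ben,15"), csv_path)

# Read every column as text so both variants are directly comparable.
all_text <- cols(.default = col_character())

# First row used as column names: two data rows remain.
with_header <- read_csv(csv_path, col_names = TRUE, col_types = all_text)

# First row treated as data: readr assigns the names X1 and X2.
without_header <- read_csv(csv_path, col_names = FALSE, col_types = all_text)

print(names(with_header))
print(names(without_header))
```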
Import SPSS data
If you want to work with open source software in the SPSS environment, you can also use GNU PSPP to create the import file. GNU PSPP is an open source program for the statistical analysis of data. The tool is a free replacement for the proprietary program IBM SPSS. GNU PSPP is available for Windows, Linux and macOS.
The tool can be used to create SAV files, which R can import via RStudio. This allows developers to adjust data before importing it into RStudio. The “haven” package is required for importing SPSS files. If it is not available on the computer, it can be installed and loaded in RStudio with the following commands:
install.packages("haven")
library(haven)
For example, SPSS files can be imported with the following code:
DatenAusSPSS_spss <- read_sav("data/spss-daten.sav")
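SPSS files usually carry value labels, which haven keeps as attributes of the imported columns. The following sketch illustrates this with a made-up table that is first written to a temporary SAV file with write_sav() and then read back; the column names and labels are hypothetical, and the file "data/spss-daten.sav" from the example above is not required.

```r
library(haven)

# Build a small table with a value-labelled column (made-up data).
survey <- tibble::tibble(
  id     = c(1, 2, 3),
  gender = labelled(c(1, 2, 1), labels = c(male = 1, female = 2))
)

# Write the table to a temporary SAV file and read it back.
sav_path <- tempfile(fileext = ".sav")
write_sav(survey, sav_path)
survey_back <- read_sav(sav_path)

# The value labels are stored as an attribute of the column.
print(attr(survey_back$gender, "labels"))

# as_factor() converts the labelled codes into an ordinary R factor.
gender_factor <- as_factor(survey_back$gender)
print(gender_factor)
```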
Save and import data
RStudio also makes it possible to edit the data of a data frame and then save it as an RData file. These files can be loaded again in RStudio. The data of a frame is stored with the function “save()”. The syntax is:
save(<name of the data frame>, file = "<path to the RDA file>")
save(Einnahmen_Ausgaben_Tabelle, file = "daten.rda")
The files can be loaded again with the function “load()”. The syntax in this example is:
load(file = "daten.rda")
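The complete round trip can be sketched with a small, made-up data frame that is saved to a temporary file, removed from the session and then restored. The object name demo_frame and the file name are chosen for illustration only.

```r
# A small data frame with made-up values.
demo_frame <- data.frame(item = c("rent", "salary"), amount = c(-800, 2500))

# Save the data frame to an RData file in the temporary directory.
rda_path <- file.path(tempdir(), "demo_frame.rda")
save(demo_frame, file = rda_path)

# Keep a copy for comparison, then remove the object from the session.
original <- demo_frame
rm(demo_frame)

# load() recreates the object under its original name and
# invisibly returns the names of the restored objects.
restored_names <- load(file = rda_path)
print(restored_names)
```

Because load() restores objects under their original names, an existing object with the same name in the session is overwritten without warning.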