In practical data analysis, where we do not perform calculations manually (as in a statistics lecture), we use the computing power of a computer to process data, perform simple calculations, or even calculate more complex statistical models. We use data that we have stored in files. But what are files again and how can they be found?
Data is usually stored as files in a local location on the computer (desktop or laptop) or tablet. Local means that the files are physically stored on your machine. In contrast, files that are stored in a cloud (like the Hessenbox) and are only temporarily stored locally when editing the files.
Files always have an extension that determines the type of the file. For example, Word documents typically have the extension .docx and PowerPoint files have the extension .pptx. Image files also have specific extensions such as .jpg or .png. The extension of a file indicates how to read the file or with which program to open it. This is usually set in the automatic settings (in Windows or Mac).
A second piece of information we need about files when working with them is the path. The path is like the home address of the file. Each file is stored at a specific local location, and there is always a unique address that makes the file accessible or locatable. This may be difficult to understand on a tablet, but it becomes clear quickly on a desktop/laptop.
For example, I have the file syllabus.pdf (course schedule for this course) saved locally on my computer. But how do we find out where the file is saved?
The first step is to search for it in Finder (Mac) or Explorer (Windows). Once we have found the file, we can also display the unique path.
On a Mac computer, it looks like this:
Similarly, on a Windows computer, it looks like this:
The file’s address is not necessarily directly visible. On a Mac, it is at the bottom of the window, and on Windows, it is at the top of the window. By right-clicking, you can copy the path from Finder or Explorer, obtaining the unique address of a file.
On a Mac computer, the address (path) in the example is:
/Users/jlug/Documents/Teaching/syllabus.pdf
On a Windows computer, the address (path) in the example is:
D:\g31442\Teaching\syllabus.pdf
We can already see an important difference here. Windows computers with German language settings use a backslash \
to differentiate folders, while Mac computers use a simple slash /
. This is due to the language settings of my Windows computer, which is set to German system language. If you are working with a Windows computer, it is important to convert these backslashes to normal slashes when using them in R. This is because R only works with the English language setting, where paths are always specified with slashes /
.
We need to change the upper path as follows:
D:/g31442/Teaching/syllabus.pdf
It is important that you have understood this first step. It’s best to try copying the file path of any file into a Word document yourself.
If you have any questions, please contact us during office hours. We will use paths in later learning blocks to manipulate data in R and also to save results from R on the computer.