Sunday, June 22, 2014

R vs. Excel and Access

In the Corporate Finance world, accountants and analyst often perform calculations over large data sets using Excel or Access.  

Typically a finance task will consist the below steps:
  1. Pull data from an ERP system
  2. Perform calculations and transformations on that data
  3. Prepare a presentation quality report 

Generally speaking, Excel is great for step 3- preparing presentation quality reports. Access, on the other hand, performs poorly on all three steps. I recommend using R to pull data and perform calculations, and then use Excel to prepare the final report.

The below table compares the performance of each software on the steps of a typically finance task.



Pulling Data
Performing Calculations
Preparing Final Reports
R
Fast. Highly scalable. Easy. See RODBC package.
Fast. Highly scalable. Easy. See SQLDF package.
Not implemented as far as I am aware.
Excel
Not scalable. Difficult to use.
Not scalable. Easy to use if approximately one to three tables.
Easy to use. Professional quality.
Access
Slow. Not scalable. Difficult to use.
Slow. Not scalable. Difficult to use.
Hard to use. Poor quality output.

Saturday, May 31, 2014

Introduction to SQL

SQL is a high level programming language for pulling and transforming data, which are two of the most time consuming tasks in corporate finance. SQL is high level, in that the user only has to write a few lines of code to perform a massive number of calculations. This means that SQL is easy to learn, and has high utility. Finance folks can use SQL to automate much of boring parts of their jobs, and spend more time providing analysis and recommendations to management.

The main clauses of a SQL select statement are select, from, where, and group by.

Select

The “select” clause is where you list the columns of the data you want to pull, separated by commas. You can rename columns by typing “as [insert name].” You can perform calculations on data using ordinary mathematical operations.

From

The “from” clause is where you list the tables which store the data you want, separated by commas.

Where

This is where you list the table filters and join conditions. Join conditions are cross table filters used to relate tables.  

Here is a link to learn more about joins: SQL Joins

Group by

The “group by” clause is a list of columns for which you are summarizing data. Here is a link to learn more about aggregating data. Aggregating data

Simple Example

Assume there is a table named departments that has two columns- department and number_of_employees. To return the number of employees in finance you would write

"Select 
number_of_employees 

from 
departments 

where 
department = 'finance' "









Saturday, May 24, 2014

Key R Packages

The three R packages that are the most foundational for corporate finance are RStudio, RODBC, and SQLDF. These packages allow you to easily pull and analyze data.

RStudio

The RStudio package provides an outstanding user interface to base R. Think of base R as the raw mechanics of a Porsche (engine, frame, wheels etc.), and RStudio as the plush interior (steering wheel, airbags, radio).


The user interface is basically divided into quadrants, which have the below main uses.

Edit scripts, view tables
Open files, view data summaries
Run code- command prompt syle
View graphs, load and install packages

RODBC

The RODBC package allows users to easily query databases by ODBC.


ODBC is a protocol used to translate database languages into a universal language. Finance folks will likely be able use RODBC to query their ERP and GL systems, assuming that their systems support ODBC.

SQLDF

The SQLDF package lets you query tables in stored in memory using SQL code.

SQL is a high level programming language designed specifically to query databases. From the finance perspective, SQL is great because using only a few lines of code, the computer will quickly perform complex data transformations on large data sets. 

In a later blog I will provide an introduction to SQL programming. For those interested in the R user community I recommend http://www.R-bloggers.com

Sunday, May 18, 2014

What is special about R for corporate finance?

I believe that in the next 5 years, R will prove to be a break through technology for corporate finance.

R is an open source software environment widely used by statisticians, data miners, market makers, and academics. The corporate finance world is lagging behind these groups in adopting R for analysis and reporting.

R Wikipedia

What is special about R? R is

1. Open source

The only implementation cost is training. There is a large community of smart people making useful packages available to the public for free and easy download. These network effects will continue to quickly increase the utility of R.

2. Well designed

R uses a high level programming language (S) that makes the retrieval, analysis, and visualization of data extremely efficient. This empowers users to develop programs to perform complex reports faster than with any other software environment.

In the coming posts, I will go into greater detail on how to implement R, and the implications of the software for the profession.