R for Software Developers and Data Analysts 


When:  June 28, 2014 

Where:  Microsoft NERD, Cambridge, MA 

Cost:  $179 through May 20 
Big Data Analytics
If you are looking at this workshop, you probably have some data that you need to collect, summarize, transform, explore, model, visualize, or present. If so, then R is for you! R has become the worldwide language for statistics, predictive analytics, and data visualization. It offers the widest range of methodologies for understanding data, from the most basic to the most complex and bleeding edge.
One of the hottest topics today is Big Data. Much of the publicity around Big Data focuses on interactive query operations, but the greatest value comes from Big Data Analytics – statistical analysis and visualization of the data.
The R language is widely used for Big Data Analytics, and has become one of the most popular languages for data analysis and visualization in general. Like many popular Big Data tools, R is free software – it is available at no charge under an open source license. This makes R a very attractive tool to learn and use.
R is a complete system. The first challenge with any analysis project is getting the data. R allows you to import data from a variety of sources and then clean, recode and restructure it. Note that in the real world the biggest challenge is making data usable – there are always issues with the data you have to work with!
Once the data are imported, R offers many functions for summarizing, modeling, analyzing, and graphing them.
Statistical analysis tools include linear and nonlinear modeling, classical statistical tests, time series analysis, classification, and clustering, among other capabilities. Further, R can readily be extended through user-written functions and add-on packages; the R community is well known for actively contributing packages.
Finally, R has powerful visualization tools. These range from simple charts to publication-quality graphs, through dynamic visualization, to interactive graphics. Visualization is key to successful data analytics – it helps the person doing the analysis to better understand the results and is an invaluable tool for explaining the results to others. Communicating results may well be the most important aspect of data analytics – providing information that can be used to make decisions.
Join us in this full-day seminar and learn from one of the leading authorities on R.
In This Workshop
This workshop will provide a practical introduction to this comprehensive platform. Participants will learn to import data into R from a variety of sources; clean, recode, and restructure data; and apply R’s many functions for summarizing, modeling, and graphing data. Both basic and more advanced forms of data analysis and graphics will be covered. Additional topics include navigating R’s comprehensive help systems, practical advice for processing data, common programming mistakes to avoid, and useful functions for data mining.
Course Outline
I. Introduction – An introduction to R: R syntax and data structures; working interactively and in batch; alternative IDEs and GUIs; adding functionality through packages; common programming mistakes; getting unstuck – where to find answers to your questions.
II. Data Management – Importing, cleaning, and reformatting data: transforming and recoding variables; subsetting, merging, and aggregating data; control structures; user-written functions.
III. Graphics – Taking advantage of R’s powerful graphics: creating basic and advanced graphs; customizing and combining graphs; innovative methods for visualizing complex data.
IV. Statistical Analysis and Data Mining – Using R for description, prediction, and classification: descriptive statistics and multiway tables; ANOVA variants; regression (e.g., linear, logistic, Poisson), classification trees, cluster analysis, and other multivariate methods; dealing effectively with missing data; going further.
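
For a rough flavor of the workflow the outline describes, here is a minimal sketch in base R. It is not taken from the workshop materials; it assumes only the mtcars data set that ships with R, and the variable names (cars, light, mpg_by_am, fit) are chosen purely for illustration.

```r
# Illustrative only: uses the mtcars data set bundled with base R.
data(mtcars)

# Data management: recode a numeric code into a labeled factor, then subset.
cars <- mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("automatic", "manual"))
light <- subset(cars, wt < 3.5)   # keep only cars under 3,500 lbs

# Descriptive statistics: mean miles per gallon by transmission type.
mpg_by_am <- aggregate(mpg ~ am, data = cars, FUN = mean)
print(mpg_by_am)

# Statistical analysis: linear regression of fuel economy on weight.
fit <- lm(mpg ~ wt, data = cars)
print(coef(fit))

# Graphics: scatterplot with the fitted regression line (base graphics).
plot(cars$wt, cars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(fit)
```

Each step maps loosely onto one part of the outline: recoding and subsetting (Data Management), aggregate() and lm() (Statistical Analysis), and plot() (Graphics).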