R for Software Developers and Data Analysts

When: 
Saturday, June 28, 2014 - 8:00am
Room: 
TBD
Lecturer(s): 
Robert Kabacoff, Ph.D.

Big Data Analytics

If you are looking at this workshop, you probably have some data that you need to collect, summarize, transform, explore, model, visualize, or present. If so, then R is for you! R has become the worldwide language for statistics, predictive analytics, and data visualization. It offers the widest range of methodologies for understanding data, from the most basic to the most complex and bleeding edge.

One of the hottest topics today is Big Data. Much of the publicity around Big Data focuses on interactive query operations, but the greatest value comes from Big Data Analytics – statistical analysis and visualization of the data.

The R language is widely used for Big Data Analytics, and has become one of the most popular languages for data analysis and visualization in general. Like many popular Big Data tools, R is free software – it is available at no charge under an open source license. This makes R a very attractive tool to learn and use.

R is a complete system. The first challenge with any analysis project is getting the data. R allows you to import data from a variety of sources and then clean, recode, and restructure it. Note that in the real world the biggest challenge is making data usable – there are always issues with the data you have to work with! Once the data is imported, R offers many functions for summarizing, modeling, analyzing, and graphing it.

Statistical analysis tools include linear and nonlinear modeling, classical statistical tests, time series analysis, classification, and clustering, as well as other capabilities. Further, R can readily be extended through user-written functions and packages; the R community is well known for actively contributing many packages.

Finally, R has powerful visualization tools. These range from simple charts to publication quality graphs, through dynamic visualization, to interactive graphics. Visualization is key to successful data analytics – it helps the person doing the analysis to better understand the results and is an invaluable tool for explaining the results to others. This may well be the most important aspect of data analytics – providing information that can be used to make decisions.

Seminar in Detail: 

In This Workshop

This workshop will provide a practical introduction to this comprehensive platform. Participants will learn to import data into R from a variety of sources; clean, recode, and restructure data; and apply R’s many functions for summarizing, modeling, and graphing data. Both basic and more advanced forms of data analysis and graphics will be covered. Additional topics include navigating R’s comprehensive help systems, practical advice for processing data, common programming mistakes to avoid, and useful functions for data mining.

Course Outline

I. Introduction – An introduction to R: R syntax and data structures; working interactively and in batch; alternative IDEs and GUIs; adding functionality through packages; common programming mistakes; getting unstuck – where to find answers to your questions.

II. Data Management – Importing, cleaning, and reformatting data: transforming and recoding variables; subsetting, merging, and aggregating data; control structures; user-written functions.

III. Graphics – Taking advantage of R’s powerful graphics: creating basic and advanced graphs; customizing and combining graphs; innovative methods for visualizing complex data.

IV. Statistical Analysis and Data Mining – Using R for description, prediction, and classification: descriptive statistics and multi-way tables; ANOVA variants; regression (e.g., linear, logistic, Poisson), classification trees, cluster analysis, and other multivariate methods; dealing effectively with missing data; going further.

Pricing: 

$179 through May 13
$239 May 14 - June 3
$309 June 4 - June 24
$339 after June 24

Dr. Kabacoff is a seasoned researcher, with 30 years of experience in data analysis and data visualization.

As Vice President of Research for Management Research Group (1997-present), he consults widely with academic, government, and corporate organizations throughout North America, Western Europe, and the Pacific Rim.

As a Professor in the Center for Psychological Studies at Nova Southeastern University (1987-1997), he taught numerous graduate courses on multivariate statistics, statistical consulting, and research computing.

Dr. Kabacoff created and maintains the popular tutorial website Quick-R. The second edition of his book, R in Action: Data Analysis and Graphics with R, is due out this year.