Course 596:
Scripting for Data Manipulation

(4 days)


Course Description

Data arrives from many sources and in many formats. The first objective of this course is to present the best available tools and techniques for selecting and reformatting data. The course also shows how to use Perl’s database-independent methods for extracting, modifying, and inserting data.

Who Should Attend

  • Systems administrators
  • Database administrators
  • Power users of UNIX systems


To get the maximum benefit from this course, students must have a working knowledge of UNIX and some exposure to common data formats.

Course Outline

UNIX Tools

  • Simple Extraction with cut
  • Simple Joining with join
  • Translations and Character Replacements with tr
  • Specialized Tools for Data Manipulation
  • xargs, tee, and Mastering the Pipeline

Mastering Regular Expressions

  • Basic Regular Expressions
  • How to Read Regular Expressions
  • How to Write Regular Expressions
  • Extended Regular Expressions
  • The grep Family of Tools

Using sed

  • How sed Processes a File
  • sed’s Command Structure
  • Regular Expressions and sed
  • Simple sed Commands
  • sed Commands for Substitution and Replacement
  • Moving Lines with sed
  • Writing sed Scripts

Programming in awk

  • Language Design
  • Predefined Variables and Line Processing
  • Data Manipulation with Function Calls
  • Regular Expressions and Data Selection
  • Looping and Field Processing
  • Writing awk Scripts

Shell Scripting

  • Flow of Control
  • Shell Scripts Are Coordinators
  • I/O Processing
  • Field Processing

Basic Perl

  • Scalars, Lists, and Arrays
  • Program Structure
  • Line-Oriented I/O

Regular Expressions in Perl

  • The Basic Expression Set
  • Anchors and Assertions
  • Minimal Match
  • Command to Make It Fast

Perl Subroutines and Modules

  • How to Write Subroutines
  • How to Use Predefined Subroutines
  • How to Use OO Modules

Database Access with Perl

  • Setup for Implementation-Independent Access
  • Processing Data from the Database
  • Methods of Inserting into the Database
  • How to Handle Implementation Uniqueness

Perl Data Manipulation Methods

  • Core Techniques
  • Some Must-Have Modules

