Thursday, 14 August 2014

SQL Server Integration Services - SSIS DotNet Brother

Advertisement

SQL Server Integration Services
In this blog describe the concept of SQL Server Integrated Services (SSIS). SQL Server Integration Services (SSIS) is a tool that we use to perform ETL (extract, transform, and load) operations of data. Here, we explain the features of SQL server integration services, advantages and disadvantages of SSIS.
Introduction
SSIS stands for SQL Server Integration Services (SSIS). SQL Server Integration Services (SSIS) is a tool that we use to perform ETL (extract, transform, and load) operations of data.  Also SSIS is used to perform operations like loading the data based on the need, performing different transformations on the data like doing calculations (Sum, Average, etc.) and to define a workflow of how the process should flow and perform some tasks on the day to day activity.
Microsoft Integration Services is a platform for building enterprise-level data integration and data transformations solutions. You use Integration Services to solve complex business problems by copying or downloading files, updating data warehouses, mining data, and so on.
NoteSQL server integration services is not for SQL Server express editions. It required enterprise edition.
Features of SSIS
The SSIS Import/Export Wizard lets the user create packages that move data from a single data source to a destination with no transformations. The Wizard can quickly move data from a variety of source types to a variety of destination types, including text files and other SQL Server instances.
Developers tasked with creating or maintaining SSIS packages use a visual development tool based on Microsoft Visual Studio called the SQL Server Business Intelligence Development Studio (BIDS). It allows users to edit SSIS packages using a drag-and-drop user interface. A scripting environment in which to write programming code is also available in the tool. A package holds a variety of elements that define a workflow. Upon package execution, the tool provides color-coded, real-time monitoring.

Connections

A connection includes the information necessary to connect to a particular data source. Tasks can reference the connection by its name, allowing the details of the connection to be changed or configured at run time.

Tasks

A task is an atomic work unit that performs some action. There are a couple of dozen tasks that ship in the box, ranging from the file system task (which can copy or move files) to the data transformation task. The data transformation task actually copies data; it implements the ETL features of the product

Precedence constraints

Tasks are linked by precedence constraints. The precedence constraint preceding a particular task must be met before that task executes. The run time supports executing tasks in parallel if their precedence constraints so allow. Constraints may otherwise allow different paths of execution depending on the success or failure of other tasks. Together with the tasks, precedence constraints comprise the workflow of the package.

Event handlers

A workflow can be designed for a number of events in the different scopes where they might occur. In this way, tasks may be executed in response to happenings within the package —such as cleaning up after errors.

Variables

Tasks may reference variables to store results, make decisions, or affect their configuration.

Parameters (SQL Server 2012 Integration Services)

Parameters allow you to assign values to properties within packages at the time of package execution. You can project parameters and package parameters. In general, if you are deploying a package using the package deployment model, you should use configurations instead of parameters.
DTS
Data Transformation Services (also known as DTS) was the predecessor to the SSIS system. It is a set of objects using an ETS tool (which means a tool that extracts, transforms, and loads information into a database for warehousing) in order to extract, transform, and load this information to and/or from a database.
DTS was an original component of the Microsoft SQL Server 2000, and at its inception, was always used with the SQL Server databases. Even though it was an integral part of the server, DTS was also easily used independently from the Microsoft server, in conjunction with other databases. It is capable of transforming and loading data from heterogeneous sources, using OLE DB, ODBC, or files specified as text only, into any database that supports them.

Why use SSIS instead of DTS?

It is likely that Microsoft will discontinue the support for DTS/ SQL SERVER 
2000 later or sooner, so it better to migrate your DTS.

 Separation of Data Flow Engine from the Control Flow Engine or SSIS
 Runtime Engine and hence improves the performance by a significant
 amount.

The DTS uses ELTL strategy (i.e. they have first to load data from files to
 data base before making transformations) while SSIS uses ETL strategy.

Package Configuration. Using expressions and configurations it is much
 easier to make packages dynamic.

Advantages of SSIS

Easier to maintain


Highly integrated with SQL Server and visual studio


Package Configuration. Much better functions for dynamically settings or retrieving configuration settings for packages as per other ETL tools.


Service Oriented Architecture. Obviously being in Visual Studio and having access to .Net code makes it compatible with web services.


You can use the SQL Server Destination instead of OLE DB which allows you to load data into SQL faster.


You remove network as a bottleneck for insertion of data by SSIS into SQL

Disadvantages of SSIS

If you have a number of SSIS packages that need to run in parallel and/or need to use a lot of memory (especially ones that have a lot of complex transforms), you’d have memory allocation conflicts between SQL and SSIS.  SQL will typically win this battle which means SSIS pages to file.  Once SSIS pages to file, you will have suboptimal processing by SSIS – i.e. don’t do it!


The same can be said for CPU – you need to ensure there are enough processors allocated for SSIS and SQL separately.  If the two compete, SQL will typically win which will mean SSIS will run much more slowly.



EmoticonEmoticon