How to document SSIS package data flow

The data flow element is a part of a SSIS package that contains different type of data flow components:

  • Source components – make data from different types of data sources available to a data flow
  • Transformational components – aggregate, merge, distribute, and modify the data inside the data stream, and
  • Destination components – load the data from the data flow into different types of data sources or create an in-memory dataset

Besides this, the Path component (connection) between input and output components defines the sequence of components, and allows adding annotations to the data flow. A path connects two components in a data flow by connecting the output of one data flow component to the input of another component:

In development, documenting a data flow task means to specify a descriptive task name, add an annotation, or to explain the task additionally in the Description field. The problem is that, each time these information needs to be used or edited, the package has to be opened in development. This becomes complicated if the package is complex in a way that it contains multiple control flow tasks with multiple data flow diagrams.

Also, there is no option in development to export the data flow diagram, not as an image or any other way. A simple screen capture can be useful, but is limited to tasks visibility in the Data flow tab. When there is a need for export, in some cases the data flow diagram is not fully visible as it could contain numerous tasks. To overcome this, the diagram needs to be zoomed out in order to show all tasks. However, decreasing the diagram size can make tasks not showing properly and make diagram hard to explore.

Another problem is that the dataflow itself does not contain all necessary information and documentation requires additional data to be collected. This is also a problem, as there is no such way to save all the information in a user-friendly document.

ApexSQL Doc is a SQL server database, SSAS cube, SSRS item and SSIS packages, Tablau sites and SharePoint farm documenting tool that generates comprehensive and user-friendly documentation in various formats: CHM, HTML, DOC, DOCX, and PDF.

The SSIS package documentation includes both, data flow and control flow diagrams, and all the properties for the specified package components. ApexSQL Doc can document packages from the file system, SQL Server, and SSIS Package Store. In addition to this, the documenting process can be scheduled through the Command Line Interface, and run unattended. To document an SSIS package:

  1. Start ApexSQL Doc as an administrator

  2. Under the Home tab in the Projects group click the New button to start a new project

  3. Open the SSIS package connection dialog, by selecting the Integration services tab, and click the Add button:

  4. In the Add integration services packages dialog, select an appropriate package source, navigate to the package you want to document, and click the OK button:

  5. Once connected, the package is listed in the Package selection section:

  6. In the Package details section, select package attributes to include in documenting:

  7. When this is set, from the Home tab, in the Actions group, click the Generate button:

  8. When the documentation is finished, navigate to an appropriate node in the documentation to find data flow diagram:

 

January 22, 2015