Importing Empty Fields from Active Directory

Further to the series of posts on importing data from Active Directory, I’ve run into a new issue. For this client I built the exact same solution as described in Getting Around Active Directory Paging on SSIS Import, but got this lovely error message: “Index was out of range. Must be non-negative and less than the size of the collection.” It turns out there were empty values in some of the single-value fields, so reading the first element of an empty property collection threw the error. I hadn’t run into this previously, but I found a neat solution.

In the original solution I outlined how to create a simple SSIS script task in C# to import single-value fields from Active Directory. I’ve extended that code so it can also import single-value fields that are empty.

A foreach statement has been added to the script for each single-value field, so that the variable is only set when the field actually contains a value. Even though a single-value field can hold at most one value, the foreach statement works nicely as an emptiness check: if the collection is empty, the loop body never runs and the variable keeps its blank default. Here is the code snippet of the foreach statement:

//If the property is null, set the variable to blank, else set it to the value in the property
string Mail = "";
ResultPropertyValueCollection valueCollectionMail = results.Properties["Mail"];
foreach (String sField in valueCollectionMail)
{
    //Replace any single quotes with two single quotes for SQL Statement
    Mail = sField.Replace("'", "''");
}
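The same idea can be pulled into a small helper so each field does not need its own loop. This is only a sketch: AdFieldHelper and FirstValueOrEmpty are hypothetical names, not part of the original script. The helper accepts any IEnumerable so it can be tried without an Active Directory connection; ResultPropertyValueCollection is enumerable, so results.Properties["Mail"] could be passed in directly.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class AdFieldHelper
{
    // Returns the first value in the collection with single quotes doubled,
    // or an empty string when the collection holds no values.
    public static string FirstValueOrEmpty(IEnumerable values)
    {
        foreach (object value in values)
        {
            // Double up single quotes for the SQL statement, as in the snippet above
            return value.ToString().Replace("'", "''");
        }

        // The loop body never ran, so the field was empty
        return "";
    }

    public static void Main()
    {
        // An empty collection stands in for an empty AD field
        Console.WriteLine("[" + FirstValueOrEmpty(new List<string>()) + "]");                 // prints []
        Console.WriteLine(FirstValueOrEmpty(new List<string> { "o'brien@example.com" }));     // prints o''brien@example.com
    }
}
```

Note that the helper returns the first value, whereas the loop in the snippet keeps the last one; for a single-value field the two are the same.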

Here is the complete code. For more details on how to create the SSIS package and set up the references for the script task, please see Getting Around Active Directory Paging on SSIS Import.

//Requires: using System.Data; using System.Data.SqlClient; using System.DirectoryServices;
public void Main()
{
    //Set up the AD connection
    using (DirectorySearcher ds = new DirectorySearcher())
    {
        //Edit the filter for your purposes
        ds.Filter = "(&(objectClass=user))";
        ds.SearchScope = SearchScope.Subtree;

        //This will page through the records 1000 at a time
        ds.PageSize = 1000;

        //Set up SQL Connection
        string sSqlConn = Dts.Variables["SqlConn"].Value.ToString();
        using (SqlConnection sqlConnection1 = new SqlConnection(sSqlConn))
        using (SqlCommand cmd = new SqlCommand())
        {
            cmd.CommandType = CommandType.Text;
            cmd.Connection = sqlConnection1;

            //Open the connection once for the whole import
            sqlConnection1.Open();

            //Read all records in AD that meet the search criteria into a Collection
            using (SearchResultCollection src = ds.FindAll())
            {
                //For each record object in the Collection, insert a record into the SQL table
                foreach (SearchResult results in src)
                {
                    //These two attributes are mandatory on user objects, so reading element [0] is safe
                    string sAMAccountName = results.Properties["sAMAccountName"][0].ToString();
                    string objectClass = results.Properties["objectClass"][0].ToString();

                    //If the property is empty, leave the variable blank, otherwise set it to the value in the property
                    string Mail = "";
                    ResultPropertyValueCollection valueCollectionMail = results.Properties["Mail"];
                    foreach (String sField in valueCollectionMail)
                    {
                        Mail = sField.Replace("'", "''"); //Replace any single quotes with two single quotes for SQL Statement
                    }

                    //If the property is empty, leave the variable blank, otherwise set it to the value in the property
                    string displayName = "";
                    ResultPropertyValueCollection valueCollectiondisplayName = results.Properties["displayName"];
                    foreach (String sField in valueCollectiondisplayName)
                    {
                        displayName = sField.Replace("'", "''"); //Replace any single quotes with two single quotes for SQL Statement
                    }

                    cmd.CommandText = "INSERT INTO AD_Users (sAMAccountName, objectClass, Mail, displayName) VALUES ('" + sAMAccountName + "','" + objectClass + "','" + Mail + "','" + displayName + "')";

                    //An INSERT returns no rows, so use ExecuteNonQuery
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }
}
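Doubling single quotes works, but the INSERT can also be written with parameters so that no escaping is needed at all. The following is a minimal sketch under assumptions: the class and method names are hypothetical, and the NVARCHAR lengths are guesses at the AD_Users column sizes, which this post does not state.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public static class AdUserInsert
{
    // Builds a parameterized INSERT for one AD record. The caller assigns
    // cmd.Connection and calls ExecuteNonQuery on an open connection.
    public static SqlCommand BuildCommand(string sAMAccountName, string objectClass,
                                          string mail, string displayName)
    {
        SqlCommand cmd = new SqlCommand(
            "INSERT INTO AD_Users (sAMAccountName, objectClass, Mail, displayName) " +
            "VALUES (@sAMAccountName, @objectClass, @Mail, @displayName)");
        cmd.CommandType = CommandType.Text;

        // Column lengths below are assumptions; match them to your own table
        cmd.Parameters.Add("@sAMAccountName", SqlDbType.NVarChar, 256).Value = sAMAccountName;
        cmd.Parameters.Add("@objectClass", SqlDbType.NVarChar, 256).Value = objectClass;
        cmd.Parameters.Add("@Mail", SqlDbType.NVarChar, 256).Value = mail;
        cmd.Parameters.Add("@displayName", SqlDbType.NVarChar, 256).Value = displayName;

        return cmd;
    }
}
```

With this approach the values are passed through unchanged, so the Replace("'", "''") step inside each foreach loop would no longer be needed.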

 

Here are links to the other posts in the Active Directory series:

Importing Data from Active Directory using SSIS Data Flows

How to Query Multi-Value Fields in Active Directory using SSIS

Automate Import of SSIS Packages to File System

A client with SQL Server 2005 asked if there was a good way to import and deploy packages automatically, rather than having to do each package manually every time. This particular client has packages deployed to SQL Server Integration Services. I have written a separate blog post on Automating Deployment of SSIS Packages in SQL 2005. Here is the solution for importing packages.

BIDS Solution:

Open (or create) the BIDS solution for the related packages which you want to import and deploy. The related packages should all reside in the same folder on Integration Services, otherwise the deployment piece will not work as it can’t deploy to multiple folders. You will need a separate BIDS solution for each SSIS folder, and you will need to create a separate _ImportPackages.dtsx package for each solution.

In order to make this somewhat scalable, the solution folder structures will need to incorporate the exact same folder name as the folder on SSIS, since the SSIS folder name will be used in the import query. For example, if your group of packages is deployed to a folder called “Maintenance” on SSIS, you will want your solution to be called something like “Maintenance ETL”. In this way, once you create your _ImportPackages.dtsx you will be able to copy it to the next solution and simply edit the folder name in the query and the Batch File connection manager. You’ll see what I mean.

Create SSIS Package to Import Deployed Packages:

Create a new SSIS package called _ImportPackages.dtsx. I put the underscore in the name so it will float to the top of the SSIS solution when you open it, thereby reminding you to run the package and get the latest copies of the packages deployed to SSIS. The package has three tasks, built in the steps below: Delete Batch File, Populate Batch File and Run Batch File.

CREATE CONNECTIONS

DATABASE:

Create a connection to the msdb database on the server where the SSIS packages are stored.

Although the packages are deployed to Integration Services, they are actually stored in the msdb database, which is what you will write your source query against.

 

BATCH FILE:

Create a Flat File Connection Manager called “Batch File” that points to a flat file called “_Import Packages.bat”. This .bat file will house the dtutil commands that import all the dtsx packages from msdb. Choose a location within your BIDS SSIS solution folder, since you will have one of these batch files per BIDS SSIS solution.

The file has one column called dtutil_command, with a column width of 4000 to accommodate the length of the dtutil command lines.

STEP 1 in the Package: Delete Batch File

Drag a File System task from the Toolbox into your Control Flow. Rename it “Delete Batch File”, and configure it with the operation “Delete file” and a Source Connection set to the “Batch File” connection you created above.

STEP 2 in the Package: Populate Batch File

Drag a Data Flow task from the Toolbox onto the Control Flow surface. Rename it “Populate Batch File”. Connect the “Delete Batch File” task to the “Populate Batch File” task on success. Move to the Data Flow tab and drag an OLE DB Source onto the Data Flow surface. Configure it to use the msdb connection, with the following query as its SQL command (shown as text so you can copy and paste it):

select
    CAST(
        'dtutil /SQL "'
        + case
            when len(FolderName) = 0 then ''
            else FolderName + '\'
          end
        + [name]
        + '" /ENCRYPT FILE;"C:\BI Projects\ETL\'
        + case
            when len(FolderName) = 0 then ''
            else FolderName + ' ETL\' + FolderName + ' ETL\'
          end
        + [name]
        + '.dtsx";1'
        + ' /QUIET' --suppresses prompt to delete if an older file with same name exists
    as nvarchar(4000))
    as dtutil_command
from msdb.dbo.sysdtspackages90 pkg
join msdb.dbo.sysdtspackagefolders90 fld
    on pkg.folderid = fld.folderid
WHERE foldername = 'Maintenance' --CHANGE VARIABLE
order by FolderName, [name]

 

The results of this query will look like this:

dtutil /SQL "Maintenance\Package1" /ENCRYPT FILE;"C:\BI Projects\ETL\Maintenance ETL\Maintenance ETL\Package1.dtsx";1 /QUIET
dtutil /SQL "Maintenance\Package2" /ENCRYPT FILE;"C:\BI Projects\ETL\Maintenance ETL\Maintenance ETL\Package2.dtsx";1 /QUIET

Notice that the folder name from SSIS is also used in the folder structure on the file system. You will need to adjust the query to suit your naming convention for your BIDS solutions.
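To make the string-building in the query easier to follow, here is a rough C# sketch of the same logic for a single package. The class and method names are hypothetical, and it assumes backslash separators between the SSIS folder and package name and in the file system path. It is an illustration of how each command line is assembled, not a replacement for the query.

```csharp
using System;

public static class DtutilCommand
{
    // Mirrors the CASE logic in the query: packages in the root folder get no
    // folder prefix, otherwise the SSIS folder name is used in both paths.
    public static string Build(string rootPath, string folderName, string packageName)
    {
        bool hasFolder = !string.IsNullOrEmpty(folderName);

        // Path of the package inside msdb, e.g. Maintenance\Package1
        string sqlPath = (hasFolder ? folderName + "\\" : "") + packageName;

        // Destination file, using the "<folder> ETL\<folder> ETL\" naming convention
        string filePath = rootPath
            + (hasFolder ? folderName + " ETL\\" + folderName + " ETL\\" : "")
            + packageName + ".dtsx";

        return "dtutil /SQL \"" + sqlPath + "\" /ENCRYPT FILE;\"" + filePath + "\";1 /QUIET";
    }

    public static void Main()
    {
        Console.WriteLine(Build("C:\\BI Projects\\ETL\\", "Maintenance", "Package1"));
    }
}
```

Changing the rootPath argument or the " ETL" suffix here corresponds to editing the literal strings in the query to suit your own naming convention.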

Drag a Flat File Destination onto your Data Flow tab and connect the OLE DB source to the Flat File destination. Configure the Flat File destination to the Batch File connection.


STEP 3 in the Package: Run Batch File

Drag an Execute Process task from the Toolbox onto the Control Flow tab and rename it “Run Batch File”. Connect the “Populate Batch File” task to the “Run Batch File” task on success. Configure the Executable on the Process tab by navigating to the _Import Packages.bat you created earlier.

Now you can run the SSIS package you just created. It will import all the packages in the specified SSIS folder into the solution folder you specified in the SQL Query.

***Please note that this will not add any new packages to your BIDS solution. It will only refresh existing packages with any deployed changes. You will still need to add new packages to the SSIS Packages folder within the BIDS solution, even though they are already sitting in the file system. You don’t want to simply ‘Add Existing Package’ either, as you will end up with a duplicate file in your BIDS directory. I’ve blogged a simple way to Add Multiple Packages to a BIDS solution.

Importing data from Active Directory Using SSIS

This is a neat little trick.  My client wanted to filter a report by employee manager, but didn’t have a good way to maintain the list of employees and their managers.  There were a few possibilities, but we wanted to use a system that already maintained the relationship between employee and manager.  We decided to pull the information out of Active Directory, using an SSIS package.

Addendum Sept. 12, 2012: The approach outlined below works fine for a simple data set where the query will not return more records than the AD page size limit allows (typically 1000 records), and you are only querying single-value attributes. However, I recommend you use the Script Task approach instead of a Data Flow. It is a more powerful and flexible, yet still simple, method of importing a data set of any size from Active Directory, regardless of the page size limit. Read the step-by-step here: Get around Active Directory Paging on SSIS import

Find the Active Directory Server

The first thing I needed to know was which server Active Directory resided on. I didn’t want to have to wait for I.T. to get back to me with that answer. There is a very easy way to find this out if you are logged into the same domain. Just open a command prompt and run the command

ping %USERDNSDOMAIN%

This resolves your domain name to the IP address of an Active Directory server (a domain controller).
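The same lookup can be done from code. The sketch below is an assumption-laden illustration, with hypothetical class and method names: it reads the same USERDNSDOMAIN environment variable and asks DNS for the addresses behind it, which on a domain-joined machine are normally domain controllers.

```csharp
using System;
using System.Net;

public static class DomainLookup
{
    // Returns the domain name followed by the IP addresses DNS reports for it.
    public static string Describe(string domainName)
    {
        if (string.IsNullOrEmpty(domainName))
        {
            return "USERDNSDOMAIN is not set; this machine may not be logged into a domain.";
        }

        // Throws a SocketException if the name cannot be resolved
        IPAddress[] addresses = Dns.GetHostAddresses(domainName);

        string[] parts = new string[addresses.Length];
        for (int i = 0; i < addresses.Length; i++)
        {
            parts[i] = addresses[i].ToString();
        }

        return domainName + " -> " + string.Join(", ", parts);
    }

    public static void Main()
    {
        Console.WriteLine(Describe(Environment.GetEnvironmentVariable("USERDNSDOMAIN")));
    }
}
```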

Create the ADO.NET Connection in SSIS

Create your SSIS solution and your package.  Now create the connection to the Active Directory server.  There is a little trick to this.  In order to use the connection in a data flow you need to create an ADO.NET connection.  During the create process, in the Provider drop down, change the Provider to OLE DB Provider for Microsoft Directory Services.  Then enter your server name and test as normal.  If you were to create an OLE DB connection instead of ADO.NET and change the provider, it would not work for Data Flow connections.  It must be an ADO.NET connection type. 


Use an ADO.NET Source in your Data Flow

On the data flow tab, drag an ADO.NET Source onto the design surface.  Open up the ADO.NET Source Editor and choose the connection you created in the previous step.  Choose “SQL Command” for Data Access mode.  Now you will write a query to pull the desired information from Active Directory.  Your query will look something like this:

SELECT displayName, Mail, Title, physicalDeliveryOfficeName, telephoneNumber, Manager, name, sAMAccountName
FROM 'LDAP://yourservername'
WHERE objectClass='user'

or this, depending on how you like to structure your LDAP queries.

<LDAP://yourservername>; (objectClass=user); displayName, Mail, Title, physicalDeliveryOfficeName, telephoneNumber, Manager, name, sAMAccountName

In the SQL-style query you must put single quotes around the LDAP path. For more information on which attributes are available to be imported from Active Directory, I found this to be helpful: http://www.kouti.com/tables/userattributes.htm
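If you need to generate these queries for different servers or attribute lists, the two forms above can be built from the same inputs. This is just a sketch with hypothetical names (LdapQueryText, SqlDialect, LdapDialect); it only assembles the query text shown above and does not connect to anything.

```csharp
using System;

public static class LdapQueryText
{
    // SQL-style form: the LDAP path must be wrapped in single quotes
    public static string SqlDialect(string server, string[] attributes)
    {
        return "SELECT " + string.Join(", ", attributes)
            + " FROM 'LDAP://" + server + "' WHERE objectClass='user'";
    }

    // LDAP-style form: path in angle brackets, then the filter, then the attributes
    public static string LdapDialect(string server, string[] attributes)
    {
        return "<LDAP://" + server + ">; (objectClass=user); " + string.Join(", ", attributes);
    }

    public static void Main()
    {
        string[] attributes = new[] { "displayName", "Mail", "sAMAccountName" };
        Console.WriteLine(SqlDialect("yourservername", attributes));
        Console.WriteLine(LdapDialect("yourservername", attributes));
    }
}
```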

You can test your query in SSMS like this:

SELECT displayName, Mail, Title, physicalDeliveryOfficeName, telephoneNumber, Manager, name, sAMAccountName
FROM OPENQUERY(adsi, 'SELECT displayName, Mail, Title, physicalDeliveryOfficeName, telephoneNumber, Manager, name, sAMAccountName
FROM ''LDAP://yourservername'' WHERE objectClass = ''user'' ')

For this SQL query to work you will need to add your AD server as a linked server on your SQL Server and name it ADSI. In SSMS, open up Server Objects and right-click on Linked Servers. Select New Linked Server. On the General tab select “OLE DB Provider for Microsoft Directory Services” as the Provider. Type your AD server name in both the Product name and the Data source. Leave the other fields blank and click OK.

When running your query, if you get an error like this: “Cannot fetch a row from OLE DB provider "ADsDSOObject" for linked server "adsi"”, there is a good chance your query is returning more rows than the AD page size limit allows. In that case you can do one of two things: 1) filter your query further to return fewer rows (try filtering on a specific sAMAccountName to be sure that the query syntax isn’t the problem), or 2) use a C# script, which is not affected by the page size limit, as outlined in Get around Active Directory Paging on SSIS import.

Data Formatting

Each field will be of NTEXT type.  You can leave them as NTEXT if this is an acceptable format for your destination, or you can add Derived Columns which convert the values in an expression, like this:

(DT_WSTR,50)sAMAccountName


 

 

Add your Destination

Add your data flow destination and map the fields as you wish.  In this case I created a table in the data warehouse to receive the Active Directory data.

You can now finish your package and deploy it.  This is a simple but very powerful method.  I found it very useful and I know I’m going to be using it again.


 Credits

In preparing this post, I found the following articles to be useful:

http://social.technet.microsoft.com/wiki/contents/articles/processing-active-directory-information-in-ssis.aspx

http://www.kouti.com/tables/userattributes.htm