Importing data from Active Directory Using SSIS

This is a neat little trick.  My client wanted to filter a report by employee manager, but didn’t have a good way to maintain the list of employees and their managers.  There were a few possibilities, but we wanted to use a system that already maintained the relationship between employee and manager.  We decided to pull the information out of Active Directory, using an SSIS package.

Addendum Sept. 12, 2012:  The approach outlined below works fine for a simple data set where the query will not return more records than the AD paging file size will allow (typically 1000 records), and you are only querying single value attributes.  However, I recommend you use this Script Task approach instead of using a Data Flow.  It is more powerful and flexible yet still simple method of importing any size data set from Active Directory, regardless of paging file size.  Read the step-by-step here:  Get around Active Directory Paging on SSIS import

Find the Active Directory Server

The first thing I needed to know was which server Active Directory resided on.  I didn’t want to have wait for I.T. to get back to me with that answer.  There is a very easy way to find this out if you are logged into the same domain.  Just run a command prompt and use the command

ping %USERDNSDOMAIN%

This will return the active directory server name and IP address.

Create the ADO.NET Connection in SSIS

Create your SSIS solution and your package.  Now create the connection to the Active Directory server.  There is a little trick to this.  In order to use the connection in a data flow you need to create an ADO.NET connection.  During the create process, in the Provider drop down, change the Provider to OLE DB Provider for Microsoft Directory Services.  Then enter your server name and test as normal.  If you were to create an OLE DB connection instead of ADO.NET and change the provider, it would not work for Data Flow connections.  It must be an ADO.NET connection type. 

image

Use ADO NET Source in your Data Flow

On the data flow tab, drag an ADO.NET Source onto the design surface.  Open up the ADO.NET Source Editor and choose the connection you created in the previous step.  Choose “SQL Command” for Data Access mode.  Now you will write a query to pull the desired information from Active Directory.  Your query will look something like this:

SELECT displayName, Mail, Title, physicalDeliveryOfficeName,  telephoneNumber, Manager, name, sAMAccountName
  FROM ‘LDAP://yourservername’
WHERE objectClass=’user’

or this, depending on how you like to structure your LDAP queries.

<LDAP://yourservername>; (objectClass=user); displayName, Mail, Title, physicalDeliveryOfficeName, telephoneNumber, Manager, name, sAMAccountName

You must put single quotes around the server.  For more information on what attributes are available to be imported from Active Directory I found this to be helpful:  http://www.kouti.com/tables/userattributes.htm

You can test your query in SSMS like this:

SELECT displayName, Mail, Title, physicalDeliveryOfficeName, telephoneNumber, Manager, name, sAMAccountName

FROM Openquery (adsi, ‘SELECT displayName, Mail, Title, physicalDeliveryOfficeName, telephoneNumber, Manager, name, sAMAccountName

FROM “LDAP://yourservername” WHERE objectClass = ”user” ‘)

For this SQL Query to work you will need to add your AD server as a linked servier on your SQL server and name it ADSI.   In SSMS open up Server Objects and right click on Linked Servers.  Select New Linked Server.   On the General tab select “OLE DB Provider for Microsoft Directory Services” as the Provider.  Type in your AD server name in the Product Name and the Data source.  Leave the other fields blank and click OK. 

When running your query, if you get an error like this:  “Cannot fetch a row from OLE DB provider “ADsDSOObject” for linked server “adsi”,  there is a good chance your query is returning more rows than the AD paging file size allows.  In that case you can do one of two things: 1) filter your query further to return fewer rows (try filtering on a specific sAMAccountName to be sure that the query syntax isn’t the problem) or 2) use a C# script which is not affected by paging file size as outlined here Get around Active Directory Paging on SSIS import .

Data Formatting

Each field will be of NTEXT type.  You can leave them as NTEXT if this is an acceptable format for your destination, or you can add Derived Columns which convert the values in an expression, like this:

(DT_WSTR,50)sAMAccountName

image

 

 

Add your Destination

Add your data flow destination and map the fields as you wish.  In this case I created a table in the data warehouse to receive the Active Directory data.

You can now finish your package and deploy it.  This is a simple but very powerful method.  I found it very useful and I know I’m going to be using it again.

Addendum Sept. 12, 2012: This approach works fine for a simple data set where the query will not return more records than the AD paging file size will allow (typically 1000 records), and you are only querying single value attributes. However, I recommend you use this Script Task approach instead of using a Data Flow. It is more powerful and flexible yet still simple method of importing any size data set from Active Directory, regardless of paging file size. Read the step-by-step here: Get around Active Directory Paging on SSIS import

 Credits

In preparing this post, I found the following articles to be useful:

http://social.technet.microsoft.com/wiki/contents/articles/processing-active-directory-information-in-ssis.aspx

http://www.kouti.com/tables/userattributes.htm

6 thoughts on “Importing data from Active Directory Using SSIS”

  1. Hi, I would also like the GUID/SID from Active Directory, I believe the column name is objectSID/objectGUID but it seems to be returning object.byte[], how do I convert this to an actual GUID/SID in SSIS?

    Thanks,

    Michael

  2. I tried this and have the entrie flow built, but I still have these issues come up when I am executing the ado.net source task:
    [ADO NET Source [45]] Warning: The data type “System.Object” found on column “title” is not supported for the ADO NET Source. This column will be converted to DT_NTEXT.
    [ADO NET Source [45]] Error: The ADO NET Source was unable to process the data. ‘ADsDSOObject’ failed with no error message available, result code: -2147016669(0x80072023).
    [SSIS.Pipeline] Error: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on ADO NET Source returned error code 0xC02090F5. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.

    Do you have any idea of how to change this? it is using SSIS in SQL2012.

    1. Hi Ken, Did you try running the query directly in SQL first (examples in the post above)? Does it return Title without errors? Did you change the connection provider to .NET providers for OLEDB as described? Did you try formatting the title in a derived column like this (DT_WSTR,100)title?
      Hope this gives you a clue where to look.
      Cheers,
      Martina

  3. Question, I’m new this, but lets say I want to run this on SQL Server Agent, would I need to use UID/PW from IT? I don’t think I can use my standard Exchange/uid password.

  4. Anyone else having trouble with the query copied form this page, try replacing the slanted single quotes with regular single quotes.

Leave a Reply to Ed Cancel reply

Your email address will not be published. Required fields are marked *