Tuesday, March 4, 2014

Trying out AEM SAML SSO for the first time

Background / Overview

I spent the weekend doing some research on the use of SAML in Adobe AEM (formerly Adobe CQ) and found the documentation out there lacking, to say the least.  There are basically a grand total of two official docs that attempt to explain anything about how it works.
  • http://helpx.adobe.com/experience-manager/kb/saml-demo.html - an article that gives you a step-by-step example of how to set it up.  It actually does a pretty good job overall.  The problem is that some critical points are glossed over or omitted, and they will cause you problems until you "figure it out".  I will attempt to provide my experience and findings here in this article and hope that it helps fill the gaps.
  • http://dev.day.com/docs/en/cq/current/core/administering/saml-2-0-authenticationhandler.html - the configuration reference for the SAML 2.0 Authentication Handler, which I refer to as the config doc later in this post.

That's not very much in the way of official documentation.  It is better than nothing, but for something as complex as a single sign on system more is needed.  

My goal with this article is to build on Adobe's demo article somewhat while sharing what I encountered while going through the steps.  I ran into a number of snags along the way that I'll go over here.  Hopefully between the two it will give you a more complete picture and save you some time.  I'll also try to provide a bit of context as I go so you might gain a little better understanding.


DISCLAIMERS

Besides the usual typos and misspellings (all free BTW) I am not an expert. Nor am I trying to provide a complete step-by-step HOWTO here.

I'm a noobie who missed out on the Elder Scrolls Online open beta last weekend because I had an itch to scratch with this and started it while the game was downloading.  By the time it finished downloading I'd already lost a full night of my life on this adventure and wasn't about to let it go hehe.

You know what I mean.

The Adobe article, which I'm basically playing off of here and refer back to heavily, does do a good job getting you going, but the instructions are quite terse, leave out a lot of detail and leave too much to blurry screen shots that are difficult to read.  

These types of installations are hard enough to pull off to begin with, much less demo, and I do appreciate Adobe at least doing that.  I will say that it is my sincere hope they expand on their official documentation (not to mention feature set!) in the immediate future.

A little more background for a novice like me is extremely helpful and I assume this to be the case for others too.  But it is just that, background from a novice that isn't going to be completely accurate, correct even, and will leave things out as well.  Such as some diagrams to help tie things together that I just don't have time to draw up right now.  : )


The Goal

So the goal overall here is to get AEM using a SAML based single sign on (or SSO) provider.  What that means is instead of authenticating an AEM user session against an AEM LoginModule we'll be using a single sign on server instead.  Additionally, an AEM user will be created in CQ and assigned to a single pre-existing CQ group during the login step.

The effect will be that when a user browses to a piece of content protected by the SSO authentication handler, they will be redirected to an external (to AEM) site to log in.  Once they log in, they will be sent back to the page they requested in an authenticated state from the perspective of AEM.  At this point they'll have an active AEM user session that was authenticated externally through a single sign on mechanism!

The user database (or user directory) they'll be logging in against will be exposed via the LDAP protocol.  The identity provider (or IdP) will talk to the LDAP server and relay the user information back to AEM via the SAML protocols and standards.  All this is actually driven by the client (browser) so AEM and the IdP never actually talk to one another.  Instead they have the client pass cryptographically signed SAML documents back and forth.

Assumptions and Pre-requisites
  • You have that adventurous spirit and you enjoy hacking away at a problem with incomplete information in order to figure it out.  Turn back now if this is not something you are comfortable with! 
  • You'll be working with local development servers only, not production, that exist on your local computer.
  • You are using OS X, or possibly Linux.  While this can be done on a Windows computer, I happened to be using a Macbook when I did this research.  It shouldn't be much of a translation to Windows or any *nix but that's an exercise for the reader to undertake.
  • You have moderate to advanced experience and/or understanding of the CQ/AEM environment, setting up local development servers and programming in general.  Knowing why two names are used there would be a reasonable start ; )
  • You already have a working LDAP server and user database to work with.
The overall steps involved are
  1. Set up an LDAP user directory. This will be out of scope for this article and the assumption is this step is complete.
  2. Set up an identity provider, referred to as an IdP.
  3. Set up CQ's SAML authentication handler, referred to as a service provider or SP.

The software, certificates, hosts/DNS configs, etc. that you will need
  • An LDAP server like OpenLDAP or ApacheDS
  • An identity provider server.  In this case we'll use the Shibboleth IdP which is a Java web application.  So we will need...
  • A servlet container like tomcat 6.x (recommended for the Shibboleth IdP used in this article).
  • SSL certs and public/private key pairs
    • Tomcat - to support SSL requests
    • IdP - the IdP's own public/private key pair; the IdP signs what it sends to the SP with the private key
    • SP - and another public/private key pair for the SP, whose public half goes into the SP metadata you give the IdP
  • Adobe AEM 5.6.1 which will be our SP
  • A hostname to use for your local IdP.  This is a necessary step so pick one now and write it down for later.  I used sso.localdev.org
  • /etc/hosts config to use for the IdP host name you are using.  So for example:
    • 127.0.0.1 sso.localdev.org
  • A copy of the Adobe SAML demo article's zip file, found at the bottom of the page, that has the credentials, configs, metadata used by that article downloaded and handy for reference purposes.
  • A target directory to install things like tomcat and Shibboleth into. (I highly recommend you do this!)
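Before going any further it can help to confirm that the hosts entry from the list above is really in place. Here is a minimal sketch, assuming the sso.localdev.org name used in this post. The has_hosts_entry helper is a name I made up for illustration, not part of any tool.

```shell
# Succeeds (exit 0) if the hosts file maps 127.0.0.1 to the given hostname.
# $1 = hostname, $2 = hosts file (defaults to /etc/hosts)
has_hosts_entry() {
  awk -v host="$1" '
    /^[ \t]*#/ { next }                  # skip whole-line comments
    $1 == "127.0.0.1" {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break             # ignore a trailing comment
        if ($i == host) found = 1
      }
    }
    END { exit (found ? 0 : 1) }
  ' "${2:-/etc/hosts}"
}
```

Something like `has_hosts_entry sso.localdev.org && echo "hosts entry found"` then tells you whether the name is mapped.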
Generating public/private signing keys (cliff notes)

When it comes to generating certificates, keys and the like I suck.  When I finally realized I had to generate a public/private key pair and upload them into the right spot in the AEM JCR this StackOverflow post came in handy:

http://stackoverflow.com/questions/14464441/how-to-create-a-self-signed-x509-certificate-with-both-private-and-public-keys
These were the openssl commands I used to create the SP pub/private certs that worked for me:
openssl genrsa -out SP-server.pem 1024
openssl req -new -key SP-server.pem -out SP-server.csr
openssl x509 -req -days 365 -in SP-server.csr -signkey SP-server.pem -out SP-server.crt
openssl pkcs8 -topk8 -inform PEM -outform DER -in SP-server.pem -nocrypt > SP-server.PKCS8.key
Fortunately the Shibboleth setup script generated the keys for the IdP.
Unfortunately, I failed to keep notes on how I generated the SSL certs for tomcat and setting that up so I cannot give you cliff notes here.
YMMV but HTH.
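Since a mismatched key and cert is an easy mistake to make, here is a small sanity check for the SP files generated by the openssl commands above. It is only a sketch: cert_matches_key is my own helper name, and it assumes an RSA key like the one genrsa produces. It compares the modulus openssl prints for each file.

```shell
# Succeeds if the certificate and the RSA private key share the same modulus,
# i.e. they belong to the same key pair.
# $1 = certificate (PEM), $2 = private key (PEM)
cert_matches_key() {
  cert_mod=$(openssl x509 -noout -modulus -in "$1") || return 1
  key_mod=$(openssl rsa -noout -modulus -in "$2") || return 1
  [ -n "$cert_mod" ] && [ "$cert_mod" = "$key_mod" ]
}
```

For the files above that would be `cert_matches_key SP-server.crt SP-server.pem && echo "key pair OK"`. Running `openssl x509 -noout -subject -dates -in SP-server.crt` shows what ended up in the cert. Also note that 1024 bits is small by current standards; passing 2048 to genrsa instead should be a drop-in change, though I only ran the commands as shown.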

Setting up an LDAP user directory (off topic)

As I've already stated, we'll assume you already have an LDAP server with a populated user database set up and running, and that you feel comfortable performing the general tasks required to do things like searching, CRUD operations, debugging, troubleshooting, etc.  Setting up an LDAP server is an adventure by itself.  Getting AEM to use it via its LDAPLoginModule is yet another adventure.  Fortunately, for me at least, I'd just gone through this prior to digging into setting up SAML based SSO with AEM.

At some point I'll probably publish my notes about AEM using LDAP but fortunately for you there is already plenty of information out there.  At least enough for you to be able to figure it out on your own.


Setting up an IdP

Set up tomcat 6.x, with SSL support

So one of the earlier steps the Adobe SAML demo article has you do is set up tomcat support for SSL.  This should be no problem for anyone who is still reading and is still awake at this point.  Unzip a fresh copy of tomcat 6.x into your work directory.  Why tomcat 6?  Well, the Shibboleth web app needs it.  They actually have a good write up in their documentation on why it doesn't play well with tomcat 7 if you go looking for it.

GOTCHA (for me at least) The first problem I ran into following the Adobe article was that its instructions pretty much assumed you were using a Windows computer and I, on my Macbook, did not have immediate success.  Actually, I think the difference was that the Windows version of tomcat generally has the Apache Tomcat Native library (APR) built in to its binary download and I had the vanilla version.  The result was a slightly different setup approach to getting the SSL certs loaded.

So instead of configuring SSLCertificateFile & SSLCertificateKeyFile attributes in the config in the conf/server.xml I had to use the keystoreFile approach.  Had to create the keystore and the certs too.  Google around for it and you should be fine.  Apologies for not having those notes here.  If I get to it I'll try to update this article since it was a bit of a pain at first.

At the end of it you should have a stock tomcat installation that will serve https with the usual self-signed certificate warnings.
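I didn't keep my notes for the keystore step, so what follows is a reconstruction rather than what I actually typed. It is a minimal sketch using the JDK's keytool to create a self-signed cert in a new keystore. The make_tomcat_keystore name, the path and the password are all placeholders.

```shell
# Create a self-signed certificate in a new Java keystore for Tomcat.
# $1 = keystore path, $2 = keystore password, $3 = hostname for the cert CN
make_tomcat_keystore() {
  keytool -genkeypair -alias tomcat -keyalg RSA -keysize 2048 \
    -validity 365 -dname "CN=$3" \
    -keystore "$1" -storepass "$2" -keypass "$2"
}

# The HTTPS <Connector> in conf/server.xml then points at the keystore
# through its keystoreFile and keystorePass attributes, for example:
#   keystoreFile="/path/to/tomcat.keystore" keystorePass="changeit"
```

For example: `make_tomcat_keystore /path/to/tomcat.keystore changeit sso.localdev.org`. keytool wants a password of at least six characters.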

Install Shibboleth
This was probably the easiest step in the whole ordeal!  My hat's off to the folks who develop this application.  Overall they did a fantastic job. All I had to do was run the install script and provide the answers it needed, which were obvious and made sense, and it did the rest.  All it needed was a directory to install to (copy files into is all), a hostname and a password for the certs.  It took me through the usual cert creation prompts of course, which should be expected.

I do think it's important to know what is going on here though as it should help you figure things out.  This script, based off your answers, creates the baseline configuration files, certs, etc. with the hostname and paths you provided for what has to be a very complex application.  It also repackages a .war whose Spring configurations reference XML config files in the Shibboleth directory.  

You have to copy the war to the tomcat web apps directory. Additionally, copy the endorsed directory from the source tarball to the base tomcat directory.  Apparently Shibboleth needs a few extra jars available in the classpath.  No biggie really.  Only trouble I had was needing to stop/start a time or two to pick up the jars.  Shouldn't need any modifications to the tomcat startup scripts since it already looks for that directory out of the box.

So after you copy this war and the endorsed dir into your tomcat directory you have the usual web app installation plus a whole set of external configurations to contend with.

It's actually a pretty slick way to go about providing a baseline install for a complex web application that requires a large amount of environment specific configuration.  Note to self ...

Configure Shibboleth
Now we get to the real meaty part, configuring the identity provider server to talk to both your LDAP server and to AEM.  This by far was where I had the most trouble.  I will say that after I downloaded the example zip files life got much, much easier but even then there were some things that were not apparent.

I will go through each step of the article here and attempt to fill in the blanks some.  A couple of things to note up front:

I found I had to restart the web app (i.e. restart tomcat) after each config file change.  They were not picked up automatically.

Problems with the web app itself were logged in the usual tomcat logging files (i.e. catalina.out and localhost.YYYY-MM-DD.log), so don't forget to keep a tail going on those.
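To save some typing, a tiny helper can print today's log file names so they can be handed to tail. This is just a sketch: tomcat_logs is a name I made up and the tomcat directory is whatever you chose earlier.

```shell
# Print the Tomcat log files worth watching, one per line.
# $1 = Tomcat base directory
tomcat_logs() {
  printf '%s\n' "$1/logs/catalina.out" "$1/logs/localhost.$(date +%Y-%m-%d).log"
}
```

Then `tail -F $(tomcat_logs /path/to/tomcat)` follows both files. The unquoted expansion breaks if the path has spaces in it, so keep the install directory simple.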

  • Fix/validate the idp-metadata.xml references to the local tomcat
First up is fixing, or validating, that the configuration the Shibboleth script generated points to your local development tomcat server correctly.  There were a couple of spots missing the port number, which I added, but otherwise it looked good.  So what worked for me was ensuring that all references to my https://sso.localdev.org:8443 tomcat server looked the same.

I think the key configurations are highlighted in the article's screenshot but it didn't hurt me to add the ports where I found them missing.
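Hunting for the spots with a missing port by eye gets old fast, so here is a rough way to list them. It is a sketch that assumes the host and port used in this post. find_portless_refs is my own name, and a line holding both a good and a bad reference would slip through.

```shell
# List lines that mention the IdP host over https without the expected port.
# $1 = metadata file, $2 = hostname, $3 = port
find_portless_refs() {
  grep -nF "https://$2" "$1" | grep -vF "https://$2:$3"
}
```

Run it as `find_portless_refs idp-metadata.xml sso.localdev.org 8443`. No output means every reference already carries the port.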
  • Update the attribute-resolver.xml to set (the first) LDAP connection as well as the attribute mappings
This file appears to be a big old mapping of LDAP object attributes to (presumably) SAML attributes or at least Shibboleth's internal representation of those attributes.  It is also the first place you configure a connection to your local LDAP server.  So it looks like this is where you control how the IdP asks LDAP for user information.

GOTCHA What was not really apparent in the article was that I had to add uid and group AttributeDefinitions.  Mostly operator error on my part but I'll reinforce that you must add those as illustrated in the article (or its zip).

The LDAP config should be pretty straightforward for anyone who has done the very least amount of troubleshooting of an LDAP server.  If you haven't before, you are almost certainly going to now : )  I would suggest figuring out how to crank up the debug logging of your LDAP server and observing both the IdP and the LDAP logs to help you tune this.

I'll also point out that if you mess the connection/bind info up the Shibboleth web app does not start. Considering that it is a DataConnector configuration you are updating, that adds up that it would fail on start. Check the tomcat logging for clues there.
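One way to take Shibboleth out of the picture while tuning the connection/bind info is to try the same bind from the command line first. Here is a minimal sketch using the OpenLDAP ldapsearch client. check_ldap_bind is a made-up name, and every value in the usage line is a placeholder for your own directory.

```shell
# Try a simple bind and look up one user, prompting for the bind password.
# $1 = LDAP URL, $2 = bind DN, $3 = search base, $4 = uid to look up
check_ldap_bind() {
  ldapsearch -x -H "$1" -D "$2" -W -b "$3" "(uid=$4)" uid
}
```

For example: `check_ldap_bind ldap://localhost:389 "cn=admin,dc=example,dc=org" "ou=people,dc=example,dc=org" jdoe`. If this fails, the DataConnector with the same settings is not going to fare any better.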
  • Update handler.xml to disable login handlers you won't need
On this step the article worked as advertised.  The only thing that wasn't super clear is that there were just a couple of LoginHandler configs near the bottom of the file I had to disable as shown in the article's screen shot.  But you really should only have to comment out one or two of those.
  • Update the logging
Worked as advertised, no issues here.  Crank up the logs.  Should only need to modify the first logger definition.  I left the rest alone.  Now would be a good time to go take a peek at the Shibboleth log files.  They'll come in handy.  You probably want the logs/idp-process.log in the Shibboleth directory.

  • Update the login.config
This is Shibboleth's JAAS config file that is used when logging in.  If you've done JAAS work before, such as setting up LDAP with AEM or CQ, this will be familiar.

One thing to note here.  This seems to be the right place to go to set up the LDAP base query string for the authentication step.  In other words the actual login step.  I *think* the attribute-resolver.xml is used later, after you authenticate to query the profile.  Seems strange at first but after thinking on it, makes sense.  The user is actually logging in to the IdP application via the JAAS spec.  Later on it runs some lookup queries but first you gotta get logged in.  So, yea, with that in mind, it makes sense.

It does add to the volume of things to deal with but not the end of the world by any stretch.
  • Update the relying-party.xml (and create metadata for your SP!)
GOTCHA this step gave me the most trouble. While the article does outline the spots in the XML file you'd need to modify, it's hard to read the screen shot and is easy to miss what is going on in there:
    • As the screen shot indicates, update the ProfileConfiguration for the saml:SAML2SSOProfile type by setting encryptAssertions to "never".  I believe this will cause the data sent back to AEM to be in the clear versus encrypted.  Toggle it and see : )
    • create the RelyingParty node as indicated in the demo article (use their .zip file here).  
Something important to note:
    • the IDs in that node do not really appear to be tied to anything important, such as the value that identifies the particular SP (the CQ server you're working with).  They just appear to need to be unique like XML ids often are.
    • you have to create a second MetadataProvider config.  The id just needs to be unique as near as I can tell.  It will reference another metadata file that you will create in a step or two that is really important as it basically gives the IdP the specifics on your SP.
Reference the article's zip file for text you can pretty much copy and paste in.

Another way to say all this is that this is where you tell Shibboleth (the IdP) about AEM (the SP) and how to talk to it.  The file referenced in the MetadataProvider that you add has the SP's public key and the end point to send responses to.  (The IdP signs its responses with its own private key, not with this one.)  Not sure how much of that is specific to Shibboleth or not but it's good stuff to know.
  • Update the attribute-filter.xml
This worked as advertised in the article.  Appears to be defining the response document sent back to AEM.
  • Finally, create a metadata file for your SP
So here we're giving the IdP some key details about our SP.  In other words, this is where we tell Shibboleth what it needs to know about AEM.  The glue between the two.  It took two files for the LDAP side, so two seems fair for the SP side ; )

Here are the key bits
    • First, the entityID attribute of the EntityDescriptor node at the root of the document is important.  This has to match a configuration in AEM exactly.  It is how the IdP and the SP identify the SP so if they don't match it all breaks down.
    • Next is the X509Certificate.  I seriously doubt you'll ever be able to use the one from the article. This is the SP's public key in X509 format, base64 encoded.  See the openssl commands near the beginning of this post for examples of how to make this.  Look in the .crt file.
For AEM, there are some things you have to do in that Location attribute URL so it will fire the SAML auth handler.  First it has to end in /saml_login or at least that certainly appears to be the case.  Second, it only works when the path goes to a content page as near as I can tell.  So for example: http://localhost:4502/content/geometrixx/saml_login
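To tie the entityID, the cert and the Location URL together, here is a sketch that prints a bare-bones SP metadata document from those three inputs. It is not a copy of the file in Adobe's zip. The element names are the standard SAML 2.0 metadata ones, the two function names are mine, and the entityID in the usage line is a made-up value that you would replace with whatever you configure in AEM.

```shell
# Print the base64 body of a PEM certificate as one line:
# drop the BEGIN/END marker lines, then join what is left.
cert_body() {
  grep -v -- '-----' "$1" | tr -d '\r\n'
}

# Print a minimal SP metadata document.
# $1 = entityID, $2 = SP certificate (PEM), $3 = assertion consumer URL
write_sp_metadata() {
  cat <<EOF
<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata" entityID="$1">
  <SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <KeyDescriptor>
      <ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
        <ds:X509Data>
          <ds:X509Certificate>$(cert_body "$2")</ds:X509Certificate>
        </ds:X509Data>
      </ds:KeyInfo>
    </KeyDescriptor>
    <AssertionConsumerService index="1"
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="$3"/>
  </SPSSODescriptor>
</EntityDescriptor>
EOF
}
```

For example: `write_sp_metadata my-aem-sp SP-server.crt http://localhost:4502/content/geometrixx/saml_login > sp-metadata.xml`. Compare the result against the file in the article's zip before trusting it.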

More on that later.

One point about the cert here.  On the AEM side the config docs for the AEM SAML authentication handler say a public/private key pair is optional but I found it to be required.  While it may be possible to forgo that with the right mix of configurations, that doesn't seem right.  Anyway, I ended up needing to do it to make things work.

BTW, it is beer-thirty.


Setting up AEM's SAML authentication handler

Ok, the home stretch.  All the pieces are in place, now it is time to get AEM talking SAML!
  • First up, let's add some certs.
The article has good instructions on how to get the IdP's public key set up in CQ.  In addition to that, I found I needed to set up a public/private key pair for AEM to use as well.  The config docs say they're optional but I found them to be required.  Use the openssl commands near the top of this post to generate the files and then upload them where the config doc (http://dev.day.com/docs/en/cq/current/core/administering/saml-2-0-authenticationhandler.html) tells you to.

When you're done you'll have an /etc/key/saml node in the jcr with idp_cert, public and private attributes and the right files in the right places.
  • Configure the Apache Sling Referrer Filter
The screenshot in the article has it spot on as to what you need to do.  I missed the check box to Allow Empty headers.  I'm here to tell you it is required.  Without both your IdP hostname listed and that Allow Empty enabled the responses from the IdP will be rejected.
  • Configure the SAML Authentication Handler
The screen shot is pretty fuzzy but it does call out the things you have to update.  I'll try to expand on that here.
    • Set the UserID attribute to uid
    • Set the Group Membership to group
    • Set the Service Provider Entity ID to match its counterpart in that extra Shibboleth metadata XML file you had to create.  This is where it has to match verbatim since it is how the SP is identified.
    • GOTCHA I recommend updating the path value as well to something other than the root / path.  This is basically a matching filter that will cause this SAML auth handler to fire on any request whose path matches.  Do you REALLY want to prevent yourself from being able to login to AEM if you got something wrong?  =)  Try restricting it to just one website like /content/geometrixx at first until you are more comfortable with everything.
User and group creation
If you've set up the LDAPLoginModule you know you can do some nice mappings from LDAP to the JCR.  So far I haven't been able to do the same with the SAML approach.  User records are created the same way regular CQ users are in that it follows the repository.xml defaultDepth setting.  (Crank that up if you need a wider distribution of user nodes BTW).

So far I haven't observed the SAML approach create groups, map extra attributes into the CQ profile and so on.  Mainly because I haven't tried yet.  It is probably possible but at this point those are items that can be solved for in other ways that likely make more sense in the long term.  Groups can be mapped.  Profile data should probably come from an outside service anyway. And, hey, that service can be protected by the same SSO that is protecting content pages : )

/saml_login
GOTCHA The URL that the IdP sends the request back to is supposed to end in /saml_login

Not only that but from what I observed, it has to be hung off a page content node to work as well.  In other words if the immediate parent of the /saml_login is not an instance of a content page it doesn't seem to work.  No idea if that is actually the rule but that was my observation.  And I could not find any documentation on this beyond a comment on the SAML Authentication Handler's documentation page.  Maybe it was different in 5.6 but in 5.6.1 it just did not work until the parent part of the URL path was referencing a content page.  I even saw a post in the support forums about turning saml_login on but never got that post to come up actually. Always got Bad Gateway errors when I tried =(
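If you want to check whether the parent of your /saml_login URL is a content page, asking AEM for the node as JSON is a quick way to find out. This is a sketch that assumes the default Sling JSON rendering is enabled and that you are on a local instance with the stock admin:admin login. is_cq_page is my own helper name.

```shell
# Succeeds if the node at the given path reports itself as a cq:Page.
# $1 = AEM base URL, $2 = node path, $3 = user:password
is_cq_page() {
  curl -s -u "$3" "$1$2.json" | grep -q '"jcr:primaryType":"cq:Page"'
}
```

For example: `is_cq_page http://localhost:4502 /content/geometrixx admin:admin && echo "looks like a page"`.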

Since SAML is supported through an authentication handler, not a JAAS LoginModule like LDAP, it seems to make sense that it would only apply to content and not other areas.  The article has it off of the root node but I didn't actually try that since I didn't want to risk locking myself out of my instance and have to start over from a fresh install.

But authentication handlers are not too much of a mystery at this point to the CQ/AEM world and I expect the Adobe provided SAML authentication handler to follow the same rules.


Final Thoughts

Hey! Wake Up! It's time to go home, they're kicking us out of here!

If you made it this far, you must have found something useful in this article.  That or you're just bored.  Either way, if you do try to tackle setting SAML up in AEM, I hope this saves you some time.

The SSO trick is pretty cool to see in action.

UPDATE 6/15/14: fixed IDP URL example

21 comments:

  6. This comment has been removed by the author.

    Replies
    1. Hi Irene,
      Did you get a resolution for this error?

  8. Hi Irene,

    I can't really tell without looking at your installation what is going wrong for you specifically. You may be stumbling on the GOTCHA I ran into with the /saml_login I was talking about at the end of this blog post.

    For some reason it doesn't seem to work when the parent node of the /saml_login URL your login form is posting to is not a cq:page node. You might try putting a URL in your form that points to your content tree before adding the /saml_login at the end. For example: http://localhost:4502/content/geometrixx-outdoors/en/saml_login

    Also, please refrain from posting very long logs in the comments. They add a lot of noise and are impossible to read. If need be paste them in a text file and put them in an email attachment. Thanks!

  9. Hi Irene,

    Any luck integrating SAML with AEM? If yes, please share where it went wrong so that I can take the necessary steps before implementing it. I tried too but was not able to get it working.

    Your inputs are helpful for me.

    Thanks in Advance.
    Surender

  10. Thanks Tedd! your tip on the GOTCHA saved me hours!

  11. Hi Irene,

    what exact value we should give in the "entityID" of the metadata file?
    you said it should be same as the configuration of AEM.

    whether we have to give any URL here like dispatcher? or Publish URL?


    Regards
    Raju

  12. Hi,

    I have a few questions related to the /saml_login url that the IdP needs to send the response back to -

    1. Is this an out of the box servlet or something provided by AEM that needs to be configured ?
    2. How can I have the IdP send the response back to a custom servlet instead of the /saml_login
    3. Can the URL of the servlet be anything other that /saml_login?

    Thanks,
    Kunal

  13. Hi Kunal, /saml_login is the path pattern that will be intercepted by the out-of-box SAML handler, which knows how to handle the assertion (basically the SSO magic). Of course this works only if you configure the SAML handler per the instructions. If you want a custom SAML handler, you can write your own authentication handler (advanced stuff). In the custom handler, yes, you can configure whatever path to intercept the SAML post from your IdP.

    Thanks,
    Raymond

    Replies
    1. Thank you for your reply Raymond. Is there any way where I can find out what this out-of-box SAML handler does after it receives control from the IdP.

      I have a requirement where I want to let the default post login operations happen the way they normally do and then add a few operations of my own each time a user logs in.

      I know I could achieve this by writing a custom handler and have the IdP to redirect to this custom handler instead of /saml_login. I want to know which interface(s) should this custom handler implement?

      Regards,
      Kunal

    2. You can consider extending the built in SAML handler by adding custom post-authentication logic in the authenticationSucceeded method. Based on the API (5.6.1, can't find for 6.0) https://docs.adobe.com/docs/en/cq/5-6-1/javadoc/com/adobe/granite/auth/saml/SamlAuthenticationHandler.html, the class is not sealed, so it should be doable.

    3. Thanks. I am going to try writing a custom Authentication Handler for handling the response. Didn't find any documentation on extending the SAML Handler for AEM 6 though.

    4. Hi Kunal Mehta : did you get any success in this i.e. extending the SAML Handler in AEM 6 ?

    5. We shelved the requirement to extend the SAML auth handler up until now. And now I see that with AEM 6.2 Adobe has disallowed extending the SAML Handler.

      Have you or somebody else found a way to extend the default SAML Handler in AEM 6.2?

  14. Hi,

    I tried to deploy my project from Eclipse to my local AEM. It deployed successfully but the bundle is not active, it just shows as installed. When I looked into the bundle it showed this:



    org.opensaml -- Cannot be resolved
    org.opensaml.common.impl -- Cannot be resolved
    org.opensaml.saml2.core -- Cannot be resolved
    org.opensaml.saml2.encryption -- Cannot be resolved
    org.opensaml.xml -- Cannot be resolved
    org.opensaml.xml.encryption -- Cannot be resolved
    org.opensaml.xml.io -- Cannot be resolved
    org.opensaml.xml.schema -- Cannot be resolved
    org.opensaml.xml.security -- Cannot be resolved
    org.opensaml.xml.security.credential -- Cannot be resolved
    org.opensaml.xml.security.keyinfo -- Cannot be resolved
    org.opensaml.xml.security.x509 -- Cannot be resolved


    Can you please help me to resolve this issue ??

    Thanks,
    Mahesh.
