Recently about Security

Ever since hearing about Firesheep I've wanted a Safari extension similar to the HTTPS Everywhere extension for Firefox. Frankly, I was puzzled that one didn't already exist, so I set out to write it. The result is SSL Everywhere. The journey is this blog post.

I started the project as an effort to protect myself and others from Firesheep when using Safari on an open public wireless network, much like those found in coffee shops, hotels, and airports everywhere. Firesheep works by hijacking your session, which is basically a way of stealing your active login without needing your username or password.

My goal with SSL Everywhere was to protect someone from a session hijacking attack by ensuring all requests to the originating website were tunneled through SSL. This seemed trivial until I thought about the various kinds of requests that would need to be secured.

  • the original HTML page
  • images
  • external JavaScript files
  • external CSS files
  • inline frames (iframes)
  • object or embed tags for things like videos or applets
  • Ajax requests
  • requests for the favicon

It wasn't until I started to tackle each of the list items that I realized just how limited Safari's extension API is, and ultimately that one could never build a foolproof extension to protect Safari users from session hijacking attacks.

Lessons Learned

If you've spent any significant time writing Safari extensions, you may already be aware of the many restrictions and challenges I wrestled with for hours. Much of what I learned came from trial and error, despite well-written documentation from Apple. While the documentation does a fair job of describing many of the things you can do with an extension, it doesn't provide as much detail on what you can't do. That's what you'll find below.

No Access to Raw Cookies

The key to avoiding a session hijacking attempt is making sure the attacker can't access the session cookie, or cookies, your browser sends on each request. This can obviously be done by making sure all requests take place over an SSL connection. Unfortunately, you can't guarantee that they will, since Safari extensions have no opportunity to intercept webpage requests before they occur. A user could easily type http://twitter.com into their address bar and SSL Everywhere couldn't stop the request.

The other option I pursued was having SSL Everywhere attempt to mark session cookies as secure, since the browser will not send cookies marked as secure over a non-SSL connection. However, a Safari extension has no more ability to manipulate cookies than normal JavaScript running on a page, meaning there is no way to read a cookie's path or expiration date, for example. So there's no way for the SSL Everywhere extension to simply mark a cookie as secure without having to guess at, or omit, other cookie information that may be important to the operation of the website.

No beforeload for favicons, Ajax Calls, or Stylesheet Image References

Apple was responsible for adding the beforeload event to WebKit. This gives scripts (and extensions) the ability to decide whether an external resource should be loaded or not. As stated in the Safari Extensions Development Guide,

Safari 5.0 and later (and other Webkit-based browsers) generates a "beforeload" event before loading each sub-resource belonging to a webpage. The "beforeload" event is generated before loading every script, iframe, image, or style sheet specified in the webpage, for example.

I was thrilled after reading those two sentences because it was exactly what I needed. That is, until I discovered there's no beforeload event fired when the browser requests a website's favicon, makes an Ajax request, or loads an image reference defined in a stylesheet. This leaves a loophole where I can't stop a non-secure favicon reference, Ajax call, or CSS image reference from exposing the session cookies I can't properly mark as secure.

src Attribute Cannot Be Changed in beforeload Event Handlers

I ultimately wanted to have a beforeload event listener that would

  1. inspect the source URL being requested
  2. rewrite the source URL to a secure https: version if necessary
  3. replace the resource's insecure reference with the secure https: URL
  4. allow the loading of the resource to proceed with the new, secure URL

Following the above procedure results in the resource being requested with the SSL-secured URL as desired, but it does not stop the original load with the original, insecure URL. You just end up making two requests for the same resource: one over SSL, the other not. Again, another point of exposure for the cookies.

beforeload != before request

It turns out that all the effort described above to leverage the beforeload event was futile. Preventing a resource from loading, as described in Apple's example of blocking unwanted content, stops it from being inserted into the DOM, but does not prevent it from being requested by the browser anyway.

Apple's example documentation states

If your script responds to a "beforeload" event by calling event.preventDefault(), the pending sub-resource is not loaded. This is a useful technique for blocking ads...

Should I have really expected that "the pending sub-resource is not loaded" doesn't imply that it's not requested? Perhaps I wasn't sharp enough to catch that nuance, but once again the cookies are exposed.

Host Page Prototypes Cannot Be Changed

When I discovered the beforeload event doesn't fire on Ajax requests, I decided to try to override the XMLHttpRequest open method and rewrite any insecure URLs to a secure version before yielding to the original open implementation. What I found was that you can't modify the prototypes of the page your start or end scripts are being injected into.

I suspect this has something to do with start and end scripts being secluded in their own separate, randomly named namespace. Changing object prototypes never resulted in any errors; it just didn't change the prototypes of the host page and therefore couldn't change the XMLHttpRequest object. Again, another point of exposure for the precious cookies.

Status of SSL Everywhere

If you've read this far it should be obvious that SSL Everywhere cannot guarantee protection against session hijacking attacks, including those from Firesheep. However, it does enhance security by automatically redirecting you to secure versions of many websites and rewriting insecure links to their SSL-encrypted equivalents. If you must use Safari to access popular websites when connected to an open WiFi network, you're probably better off doing it with SSL Everywhere.

We don't offer a pre-built version of the extension for easy installation into Safari because we don't want people to casually install it and forget that it's not completely secure for all sites. If people find it valuable nonetheless, then we may consider creating the extension bundle in the future.

Try It!

If you'd like to try SSL Everywhere, you can find the source code on GitHub. It's open source software licensed under the GPL version 2 license, primarily because it borrows code from HTTPS Everywhere. If you can come up with solutions to any of the problems I encountered, please let me know!

I'm writing today from the DSSS (Defense Special Security Systems) Conference in San Antonio. We're here showing Clearance Track, our product for security officers that helps organize and automate the clearance processes associated with getting and keeping your employees' clearances.

I was actually a little bit intimidated when we arrived and started setting up our booth yesterday -- our co-exhibitors rolled into the conference with crates taller than me full of degaussers (I had to google degausser), security doors, liquid scanners, extra large safes, alarm systems, and miscellaneous oversized, super-heavy equipment looking like it was taken from the set of a 1970s spy thriller.

The truth is, we don't have 2,000 pounds of awesomeness at our booth. Not even if you throw in our pop-up display and our booth babes. Our product is software, so our coolest item is a 30" Apple monitor that we borrowed from our CEO's desk on which we display the software. But we're holding our own. We have a product that people want. One of our first visitors took one look at Clearance Track and said to my co-worker, "you're my new best friend."

We thought we would come here and meet lots of security officers from small consulting companies like Near Infinity whose security officers were overwhelmed with processing clearances. But in talking to the attendees, we're finding more government folks charged with managing the clearance data for thousands of military or other government personnel. They use JPAS, but find it insufficient to report on and manage the details of their employees' clearances. They're getting by now with Excel spreadsheets, outdated Access databases, filing cabinets, and whatever info they can remember in their heads.

Clearance Track is a great solution for them. It provides a single, organized place to store all of their clearance data. It makes it easy to report on their secure personnel, easy to find documents related to that person's clearance, easy to pass off responsibilities from one security officer to another, easy to remember to renew a badge, easy to run tedious government reports, easy to auto-generate documents... Who doesn't need easy?

So far, the reaction has been great. Lots of people are raving about how helpful Clearance Track would be in their jobs, how user friendly it is, and how much time it would save them. With a day and a half left to go, we're not missing the heavy equipment too much.

And now, if you'll excuse me, I'd better get back to our booth visitors!

Clearance Track is a fast, secure way to manage your clearances. By providing a single place to track your investigations, crossovers, badge renewals, briefings, and certs, Clearance Track significantly streamlines the security officer's job and provides a collaborative environment to organize and store employee data. Use it for 4311s, 4414s, 312s, SF-86s, visit certs, annual security refresher briefings, and key government reports.

Near Infinity recently announced the release of Grant, a Ruby on Rails plugin for securing and auditing access to your Rails model objects, and I'm here to tell you a little bit about it. Grant has two primary pieces: model security and model audit. I'll be focusing on model security for this post and will address model audit in a later entry.

Grant's model security is deliberately designed to force the developer to make conscious security decisions about what CRUD operations a user should be allowed to perform on your model objects. It doesn't care how you choose to authenticate and authorize your users to perform a CRUD operation; it only cares that you actually do it.

Rather than specify which operations are restricted, Grant restricts all CRUD operations unless they're explicitly granted to the user. It also restricts adding or removing items from has_many and has_and_belongs_to_many associations. Only allowing operations explicitly granted forces you to make conscious security decisions. While it obviously can't ensure you make the correct decisions, it should help ease the latent fear that you've inadvertently forgotten to secure something.

Enough talk; let me show you an example of how you might use it. To enable model security you simply include the Grant::ModelSecurity module in your model class. In this example you see three grant statements. The first grants find (aka read) permission to everyone. The second grants create, update, and destroy permission when the passed block evaluates to true, which in this case happens when the model is editable by the current user. You can put any code you want in that block as long as it returns a boolean value. Similarly, the third grant statement permits additions and removals from the tags association when its block evaluates to true. A Grant::ModelSecurityError is raised if any grant block evaluates to false or nil.

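class EditablePage < ActiveRecord::Base
  include Grant::ModelSecurity
  has_many :tags

  # Anyone may read (find) pages.
  grant(:find) { true }

  # Create, update, and destroy are allowed only when the block returns true.
  grant(:create, :update, :destroy) do |user, model|
    model.editable_by_user?(user)
  end

  # Adding to or removing from the tags association uses the same check.
  grant(:add => :tags, :remove => :tags) do |user, model, associated_model|
    model.editable_by_user?(user)
  end

  def editable_by_user?(user)
    user.administrator?
  end
end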

There's a lot more to the grant statement than shown in the above example. For instance, you can have multiple grant statements for the same action. Ultimate permission to perform the action will not be granted unless all grant blocks evaluate to true.

As you can see, Grant is pretty simple to use, but it's not going to do the dirty work for you. It's up to you to make the proper security decisions. Grant's just there to make sure you don't forget.

Below I have written some fully functional code that shows how you could implement row-level access control in Lucene (2.3.2). Basically, you have to index enough information to be able to find, in a single query, all documents that a given user has access to read.

In the example below there are two fields:

DATA: Contains any data that you want your users to be able to search. Note: You can have as many data fields as you like.

ACL_FIELD: The field used to determine which users have access to this document. Note: You can have as many access control fields as you like.

All you have to do is build the access control query for each user and submit your user's query unchanged.

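import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryWrapperFilter;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class TestIndexerSearcher {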

   public static void main(String[] args) throws Exception {
      Directory directory = new RAMDirectory();
      IndexWriter indexWriter = new IndexWriter(directory, new StandardAnalyzer());
      indexWriter.addDocument(buildDocument("DATA:sametoken","ACL_FIELD:access"));
      indexWriter.addDocument(buildDocument("DATA:sametoken","ACL_FIELD:noaccess"));
      indexWriter.optimize();
      indexWriter.close();

      IndexSearcher indexSearcher = new IndexSearcher(directory);

      QueryParser parser = new QueryParser("DATA", new StandardAnalyzer());
      Query query = parser.parse("sametoken");
		
      //This is all you have to add to your existing code.
      Filter aclFilter = applyAccessControl(new TermQuery(
         new Term("ACL_FIELD","access")));

      Hits hits = indexSearcher.search(query, aclFilter);
      System.out.println("Hits[" + hits.length() + "]");
      for (int i = 0; i < hits.length(); i++) {
         Document doc = hits.doc(i);
         System.out.println("DATA [" + doc.get("DATA") + 
            "] ACL_FIELD [" + doc.get("ACL_FIELD") + "]");
      }
      indexSearcher.close();	
   }

   private static Filter applyAccessControl(Query aclQuery) {
      return new CachedQueryFilter(aclQuery.toString(), 
         new QueryWrapperFilter(aclQuery));
   }

   private static Document buildDocument(String... fieldInfo) {
      Document document = new Document();
      for (int i = 0; i < fieldInfo.length; i++) {
         String[] split = fieldInfo[i].split(":");
         String fieldName = split[0];
         String fieldValue = split[1];
         document.add(new Field(fieldName,fieldValue,
            Field.Store.YES,Field.Index.TOKENIZED));
      }
      return document;
   }	
}


After you run this code, you will get a single hit, not the two that you would normally get if the access control filter wasn't in place.

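import java.io.IOException;
import java.lang.ref.SoftReference;
import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Filter;

public class CachedQueryFilter extends Filter {
   private static final long serialVersionUID = 6797293376134753695L;

   // Shared across all instances, keyed by the access control query string.
   private static final Map<String, BitSetCache> filterCache =
      new ConcurrentHashMap<String, BitSetCache>();

   private Filter filter;
   private String key;

   public CachedQueryFilter(String key, Filter filter) {
      this.filter = filter;
      this.key = key;
   }

   public BitSet bits(IndexReader reader) throws IOException {
      BitSetCache cachedBitSet = filterCache.get(key);
      if (cachedBitSet != null) {
         BitSet bitSet = cachedBitSet.bitSet.get();
         // Reuse the cached bits only if the index has not changed since they were built.
         if (bitSet != null && cachedBitSet.indexReaderVersion == reader.getVersion()) {
            return bitSet;
         }
      }
      // Cache miss, stale entry, or collected SoftReference: rebuild and store.
      BitSet bits = filter.bits(reader);
      BitSetCache bitSetCache = new BitSetCache();
      bitSetCache.indexReaderVersion = reader.getVersion();
      bitSetCache.bitSet = new SoftReference<BitSet>(bits);
      filterCache.put(key, bitSetCache);
      return bits;
   }

   // Static so cache entries do not hold a reference to the filter that created them.
   private static class BitSetCache {
      long indexReaderVersion;
      SoftReference<BitSet> bitSet;
   }
}

There are two additional features that this query filter doesn't implement that you may want to consider.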

1st - Provide per query locking around the bitset creation code. This would allow multiple bitset creation calls to occur at once, but the same access control query would block. Therefore we would only have to build it once, even if multiple user queries with the same access control hit the query filter at once.

2nd - Persist the bitsets. In the past I have used the same directory as the index, but you may want to use a database, or something else.
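The first suggestion, per-query locking, could be sketched roughly as follows. This is a minimal standalone illustration, not part of the filter above: the class name PerKeyBitSetCache and the Supplier-based builder are hypothetical, it assumes Java 8 or later (newer than the code in this post), and it leaves out the SoftReference and index-version checks that CachedQueryFilter performs.

```java
import java.util.BitSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical helper: builds each key's BitSet at most once, while
// allowing BitSets for different keys to be built concurrently.
public class PerKeyBitSetCache {
    // One lock object per access control query string.
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, BitSet> cache = new ConcurrentHashMap<>();

    // The builder must not return null (ConcurrentHashMap rejects null values).
    public BitSet get(String key, Supplier<BitSet> builder) {
        BitSet cached = cache.get(key);
        if (cached != null) {
            return cached; // fast path, no locking
        }
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            // Re-check: another thread may have built it while we waited.
            cached = cache.get(key);
            if (cached == null) {
                cached = builder.get();
                cache.put(key, cached);
            }
            return cached;
        }
    }
}
```

Threads asking for the same key wait on that key's lock and then reuse the result; threads asking for different keys never block each other. In the filter above, the builder would presumably wrap the call to filter.bits(reader).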

My current project has some unique searching requirements.

Requirements

  • Fuzzy searching is a must (Soundex, Levenshtein, etc.)

  • Has to be fast, a must with any searching solution

  • Has to provide access control

  • Full data load indexing needs to be completed in a reasonable amount of time

  • Scoring needs to be a custom implementation

  • Needs to run on a predetermined environment, meaning that new hardware purchases are not going to happen any time soon

  • And last but not least is the ability to do all these things on a dataset that exceeds a billion records

So we have had a lot of constraints to deal with. The hardest one by far is the last one.

The Data

  • 1 billion plus records

  • Over 30 million unique terms

Indexing and Searching Server Specs

  • 20 CPUs

  • 32 Gig of RAM

  • Dedicated SAN storage

First Searching Experiences

After getting the index built in multiple partitions, I fired up a simple Lucene console to do some simple searches with a Lucene multi searcher. It ran out of memory with a 2 Gig heap. I then tried the maximum heap size for the 32-bit JVM we were using, 3.3 Gig, and that ran out of memory as well. So our initial attempts to run even one search were unsuccessful.

Then we installed a 64-bit JVM and tried an 8 Gig heap, and it worked! I could run searches, and after the first couple of warm-up searches it was getting 20-80 ms responses on single-term searches. Great, but then we tried a fuzzy search, which uses a Levenshtein algorithm to calculate matches. It took 2 minutes 45 seconds, which was unacceptable.

Next we wrote our own Levenshtein Lucene query and got the 2-minute-plus search down to about one second. We found that the built-in Lucene fuzzy query was spending 85-95% of its time finding the terms to search. Once those terms were found, the actual search with the expanded terms took only a second or two, depending on how many terms were found. So we replaced the built-in fuzzy query with a custom one that gets near-instantaneous results on Levenshtein fuzzy matches. Problem solved.
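For readers unfamiliar with Levenshtein matching, here is a minimal sketch of an edit-distance check with a cutoff, which is one common way to reject non-matching terms cheaply. It is not the custom query described above, whose internals this post doesn't cover; the class name BoundedLevenshtein and the maxDistance parameter are made up for illustration.

```java
// Hypothetical helper: Levenshtein distance that gives up early once the
// distance is certain to exceed a caller-supplied limit.
public final class BoundedLevenshtein {
    private BoundedLevenshtein() {}

    // Returns the edit distance between a and b, or -1 if it exceeds maxDistance.
    public static int distance(String a, String b, int maxDistance) {
        // The length difference is a lower bound on the distance.
        if (Math.abs(a.length() - b.length()) > maxDistance) {
            return -1;
        }
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            int rowMin = curr[0];
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                    Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                    prev[j - 1] + cost);                    // substitute or match
                rowMin = Math.min(rowMin, curr[j]);
            }
            // Row minimums never decrease, so no later row can get back under the limit.
            if (rowMin > maxDistance) {
                return -1;
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        int result = prev[b.length()];
        return result <= maxDistance ? result : -1;
    }
}
```

For example, BoundedLevenshtein.distance("kitten", "sitting", 3) returns 3, while the same call with a limit of 2 returns -1. With over 30 million unique terms, most candidates fail the length check or exit after a few rows.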

Indexing Time

After our initial proof of concept was complete, we needed to bring the indexing time down to something more reasonable. Building the index from scratch was taking 36-48 hours with 20 CPUs running at 100% utilization, which means that the machine was indexing about 9,000 records a second. Not bad for Lucene 2.2, but not that great.

First, we stopped merging the indexes after we created them; that by itself was taking about 12 hours. At this point we also started searching these multiple indexes in parallel, and we are seeing modest increases in query performance.
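As a rough illustration of searching several partitions in parallel and merging the results, here is a sketch in plain Java. It deliberately avoids the Lucene API: each partition is modeled as a hypothetical function from a query string to scored hits, and Hit, ParallelPartitionSearch, and the fixed thread pool are assumptions made for the example, not the project's code. It assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelPartitionSearch {

    // Hypothetical scored result; a real system would carry a document reference.
    public static final class Hit {
        public final String id;
        public final double score;

        public Hit(String id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    // Runs the query against every partition concurrently, then returns the
    // topN hits across all partitions, highest score first.
    public static List<Hit> search(List<Function<String, List<Hit>>> partitions,
                                   String query, int topN) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            List<Future<List<Hit>>> futures = new ArrayList<>();
            for (Function<String, List<Hit>> partition : partitions) {
                Callable<List<Hit>> task = () -> partition.apply(query);
                futures.add(pool.submit(task));
            }
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> future : futures) {
                merged.addAll(future.get()); // waits for that partition to finish
            }
            merged.sort((a, b) -> Double.compare(b.score, a.score));
            int limit = Math.max(0, Math.min(topN, merged.size()));
            return new ArrayList<>(merged.subList(0, limit));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Search interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Partition search failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Creating a pool per search keeps the sketch short; a long-running server would more likely share one pool across requests. Merging by raw score also assumes scores are comparable across partitions, which is something to verify for your own scoring.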

Second, we upgraded to Lucene 2.3, which provided a huge increase in indexing speed. Our index creation time went from 36-48 hours (depending on whether we merged indexes) down to 3-4 hours. The indexing process now handles around 125,000+ records a second. That's a huge improvement. If you haven't upgraded to 2.3, you should!

Current Development

We are in the process of adding access control to Lucene as well as adding new custom queries and scoring. So far Lucene has performed better than any of the competition that it has come up against, and with its price point it seems to have won acceptance on our project.

In upcoming parts I will go into more detail about the technical solutions that we have developed to solve these problems, as well as others that I haven't mentioned yet.