boulder dan

Thread safety with no threads in sight

When people discuss thread safety in code, it is almost entirely within the context of threading in a single process. The problems that come up are race conditions, data races, and unseen updates, with solutions like locking, synchronized updates, and immutability.

This definition of thread safety is too narrow. Many of the problems normally attributed to thread safety can be seen in programs that have no access to OS threads or are running on multiple servers.

My primary work is in Ruby, where we normally labor under the illusion of thread safety by deploying mostly to a single-threaded VM. The problem is that this single thread often hinders scaling. The standard solution is to take long-running tasks and let them finish asynchronously in a queued process.

The main web process and the backgrounded process don't share any data, so thread safety in the traditional sense isn't an issue, but if you step back to look at the application as a whole they *do* share a database. This database is often mutable and is therefore a big source of thread safety bugs. Data races are the most obvious, but lost updates and stale reads also need to be considered.

How do you fix these problems? With the same concepts you would use at the application layer, translated to whatever database you have.

If you're using an ACID compliant RDBMS then you're probably used to transactions which correspond to locking and mutex constructs in most programming languages. You should also consider improving your schema so that the database can enforce as much of your business logic as possible. This way your database can reject data that would create an inconsistent state even if your application deems it okay (remember your application won't know about the other processes running simultaneously).

To pick on Rails for a moment: most applications built on Rails keep their database constraints in the application. This creates tough scenarios like the following.

1. user = User.find_by_email("[email protected]")
2. if !user
3. User.create!(:email => "[email protected]")
4. end

An execution might look like this:
Process #1         | Process #2
                   |
ln 1 user = nil    |
ln 2 true          | ln 1 user = nil
ln 3 create!       | ln 2 true
ln 4 done.         | ln 3 create!
                   | ln 4 done.

In this worst-case interleaving, both processes create the same user. Even with no threads involved we have a clear threading bug.

Of course, there are ways to prevent this from happening in Rails; the point here is to illustrate how threading issues can manifest in a single-threaded application. The same thing would apply to Java or any of the functional languages that pride themselves on their threading capabilities.
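
One such fix is to let the database enforce uniqueness, so the second insert is rejected no matter what the application believed when it checked. The sketch below replays the interleaving above without any threads. It is only an illustration: UserTable is a hypothetical in-memory stand-in for a users table, the email address is a placeholder, and a real application would rely on a unique index in its RDBMS instead.

```java
import java.util.ArrayList;
import java.util.List;

class UniqueEmailSketch {
    /** Hypothetical in-memory stand-in for a users table; not a real database. */
    static class UserTable {
        private final List<String> emails = new ArrayList<>();
        private final boolean uniqueEmail;

        UserTable(boolean uniqueEmail) {
            this.uniqueEmail = uniqueEmail;
        }

        boolean exists(String email) {
            return emails.contains(email);
        }

        void insert(String email) {
            // With the constraint on, the table itself refuses duplicates.
            if (uniqueEmail && emails.contains(email)) {
                throw new IllegalStateException("duplicate email: " + email);
            }
            emails.add(email);
        }

        int rowCount() {
            return emails.size();
        }
    }

    /** Replays the two-process interleaving and returns the resulting row count. */
    static int replay(boolean uniqueEmail) {
        UserTable table = new UserTable(uniqueEmail);
        String email = "user@example.com"; // placeholder address

        boolean missingForP1 = !table.exists(email); // process 1, ln 1-2
        boolean missingForP2 = !table.exists(email); // process 2, ln 1-2
        if (missingForP1) {
            table.insert(email);                     // process 1, ln 3
        }
        if (missingForP2) {
            try {
                table.insert(email);                 // process 2, ln 3
            } catch (IllegalStateException rejected) {
                // Process 2 lost the race; it can re-read the existing user.
            }
        }
        return table.rowCount();
    }
}
```

Calling replay(false) leaves two rows for the same email, while replay(true) leaves one, because the table rejects the second insert even though both processes saw "no user" when they checked.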

Remember that thread safety doesn't always involve threads.

It's a broader application problem that requires a diligent understanding of how your application interacts with shared state. None of us is immune, especially in the world of internet applications.

Epic google error message fail

Dan 7:22 PM

I just spent a day without being able to send email because I enabled two-factor authentication on my work account. That and a complicated Gmail setup that apparently appeals to no one else ;)

I forward all my mail from my work Google Apps account into my personal Gmail account so that all my email is consolidated for reading, filtering, etc. I then use the “Send mail as” functionality to email from my work account. Gmail even allows you to configure alternate SMTP information so that when I send from my work account people don't get a "Sent on behalf of..." in their Outlook.

Well, when I turned on the new two-factor authentication for my work email account, the SMTP password stopped working and this all broke...silently.

No errors when I sent the email, no errors on the SMTP configuration page, and worst of all: all my emails made it successfully into the "sent" folder!

I finally just guessed what must be happening and fixed it.

Nice.

Connecting to an untrusted certificate in java

Dan 3:38 PM

I was getting errors (unable to find valid certification path to requested target) in Alfresco while trying to connect to LDAP over SSL using an untrusted certificate, and found a great hint on the interwebs that I thought I'd share.

Most of the hits on google mention this post: http://blogs.sun.com/gc/entry/unable_to_find_valid_certification

My only criticism of that article is that it only minimally addresses how to actually use the fixed keystore. I downloaded their source and hacked it up a little to change the name of the certificate store from "jssecacerts" to "cacerts", which is the default certificate store for all Java programs. My goal was to fix the certificate store for the entire machine.

To install an untrusted certificate into your keystore the process is like this:

Help improve the Kindle reading ecosystem.

Dan 9:12 AM

I sent this as an email to kindle-feedback but thought I would share it with the interwebs as well. Let me be clear that I am absolutely in love with my Kindle. I used to not particularly enjoy reading, but the Kindle has changed my opinion in two short weeks. With that said, my major sadness about it was access to personal content. Hence this letter:

Dear Amazon,

Please improve the reading ecosystem represented by the kindle device, and the plethora of applications that also represent "kindle devices" e.g. the iphone, ipad, web, etc. Right now these other devices are treated like second class citizens because there is no way to get personal content onto them and to synchronize the reading of the content.

This hurts users who want to use their kindles as "reading devices" vs. "buying from amazon's bookstores" device.

One idea on how to improve this is to give access to a certain amount of disk space at Amazon, maybe 1 GB (per Google Docs) for free users and more for people who have bought Kindles or purchased books (maybe there is even a reward program: the more money you spend with Amazon, the more personal disk space you get). This space could be used to store personal documents that then appear in the "archived" section of people's Kindles. Then you could add synchronization for latest location, bookmarks, and annotations. Now I can read documents I'm currently working on, review design specifications, etc. All on my Kindle.

This changes my kindle from a device I use for an hour a day at home before bed into my preferred reading device at work, at home and on the road.

Thanks for a fantastic device and feel free to contact me,

--dan

Remap Caps lock on a Macbook Pro

Dan 6:29 PM

I was fortunate enough to receive a CR-48 from Google last week. My impressions of it are more or less in line with what MG Siegler over at TechCrunch has reported: good battery, nice screen, lightweight laptop. The only problem is that I have needed to adjust my workflow when I'm working on my regular computer, a MacBook Pro, to facilitate sharing documents between the two machines. For example, I have migrated away from OmniOutliner to a comparable online version, Workflowy.

The other major change is that my keyboard habits are changing. After only a weekend of using ChromeOS, I already prefer having the caps lock key replaced with a button that opens a new tab in my browser.

When I got into work this morning I immediately missed my new favorite key. I decided to try and figure out a way to replace this missing behavior.

A recent article outlines how to turn the caps lock key into a control, option or splat. One benefit of an Apple approved method is that the keyboard caps lock light disables itself when you change the functionality. A never ending flickering green light would have been a constant annoyance for me.

That approach wasn't enough for me; I wanted the key to open a new tab in Chrome. I found the PCKeyboardHack preference pane/kext. Using the instructions provided, I remapped the caps lock key to F14, or keycode 107 on the slim Apple keyboard.

Now that it was remapped to a real keyboard code it wasn't difficult to change the new tab shortcut in chrome to respond to F14.

These changes work on the laptop's keyboard in addition to the external keyboard and I'm very pleased with the results.

Notes about Alfresco's devcon

Dan 11:22 AM

Nate McMinn has a brief wrap-up of the Alfresco developers' convention. I was unable to attend, and it sounds like I missed a lot of interesting news. I was excited to hear:

One upcoming project that was discussed at DevCon is putting together a third-party components catalog for Alfresco. Right now there is nothing like this available. Alfresco community projects are scattered all over the place. Some are in Alfresco Forge, some are on Google Code, still others are on developers' blogs (mine included). I'm sure I'm forgetting a few locations, but you get the idea. Rolling all of this up in one queryable repository would be a fantastic addition to the Alfresco community.

I wonder how this will be deployed and who will maintain it. The wiki idea for documentation seems to be barely moving along.

Creating custom JMX MBeans for reporting in Alfresco

Dan 8:10 AM

JMX rocks. When configuring a server it is a boon to developers, especially when combined with the Alfresco subsystem architecture. You can iterate on changes to the LDAP sync without having to restart the server. JMX also gives savvy system administrators a way to manage and monitor what’s going on within the repository.

If you’re still unfamiliar with the basics of JMX, especially within the context of Alfresco, Jarred Ottley over at Alfresco has written a number of excellent tutorials. I’ve added some additional articles and come up with the list below.

Some links:

JMX basics in Alfresco
Some JMX console implementations - I prefer Java VisualVM, which comes with the JDK
JMX from the command line
Tunnelling JMX - for use with Amazon EC2 or any firewalled computer. 3.2 sp2 and newer should review this update.

With the basics out of the way, it is often interesting to create your own MBean that can report custom statistics or expose custom methods. This tutorial creates a new MBean that shows the number of asynchronous jobs still waiting to run. Alfresco exports beans using standard Spring practices, which keeps everything well documented. The list of things to create is small:

Context file to register new MBean
Annotated Java class

Context File

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
<beans>
  <bean id="whySlow" class="com.zia.jmx.WhySlow"/>
  <bean id="ziaExporter"
    class="org.springframework.jmx.export.MBeanExporter">
    <property name="assembler" ref="assembler"/>
    <property name="beans">
      <map>
        <entry key="Zia:name=WhySlow" value-ref="whySlow"/>
      </map>
    </property>
  </bean>
  <bean id="jmxAttributeSource"
    class="org.springframework.jmx.export.annotation.AnnotationJmxAttributeSource"/>
  <!-- will create management interface using annotation metadata -->
  <bean id="assembler"
    class="org.springframework.jmx.export.assembler.MetadataMBeanInfoAssembler">
    <property name="attributeSource" ref="jmxAttributeSource"/>
  </bean>
</beans>

Java Class

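  package com.zia.jmx;

  import java.lang.reflect.Field;
  import java.util.concurrent.ThreadPoolExecutor;
  import org.springframework.jmx.export.annotation.ManagedAttribute;
  import org.springframework.jmx.export.annotation.ManagedResource;
  // Also import AsynchronousActionExecutionQueueImpl from your Alfresco version
  // and AlfUtil (a helper not shown here).

  @ManagedResource
  public class WhySlow {
    @ManagedAttribute( description = "Asynchronous actions left to run" )
    public long getAsyncActions() {
      AsynchronousActionExecutionQueueImpl aaeq = ( AsynchronousActionExecutionQueueImpl ) AlfUtil.getSpringBean( "defaultAsynchronousActionExecutionQueue" );
      long ret = -1; // reported when the executor cannot be read
      try {
        // The executor is a private field, so reach it through reflection.
        Class<?> c = aaeq.getClass();
        Field tpeField = c.getDeclaredField( "threadPoolExecutor" );
        tpeField.setAccessible( true );
        ThreadPoolExecutor tpe = ( ThreadPoolExecutor ) tpeField.get( aaeq );
        // Tasks submitted so far minus tasks finished = tasks left to run.
        ret = tpe.getTaskCount() - tpe.getCompletedTaskCount();
      } catch( NoSuchFieldException nsfe ) {
        nsfe.printStackTrace();
      } catch( IllegalArgumentException e ) {
        e.printStackTrace();
      } catch( IllegalAccessException e ) {
        e.printStackTrace();
      }
      return ret;
    }
  }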

The annotations are important for documentation in the console. There are some reflection shenanigans that allow access to private fields. Your implementation will not need much of this code, except for the annotations.
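
For readers who have not seen the reflection trick before, here is a minimal, self-contained sketch of the same pattern. HiddenQueue and its backlog field are hypothetical stand-ins for the Alfresco queue and its private executor; only the java.lang.reflect calls are real.

```java
import java.lang.reflect.Field;

class PrivateFieldSketch {
    /** Hypothetical class that hides its state, standing in for the Alfresco queue. */
    static class HiddenQueue {
        private final int backlog;

        HiddenQueue(int backlog) {
            this.backlog = backlog;
        }
    }

    /** Reads the private field by name, as the WhySlow bean does for its executor. */
    static int readBacklog(HiddenQueue queue) {
        try {
            Field field = HiddenQueue.class.getDeclaredField("backlog");
            field.setAccessible(true); // bypass the private modifier
            return field.getInt(queue);
        } catch (ReflectiveOperationException e) {
            // The field was renamed or removed; reflection gives no compile-time safety.
            throw new IllegalStateException("field layout changed", e);
        }
    }
}
```

The catch block is the price of this approach: if a later release renames the field, the code still compiles and only fails at runtime.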

When this code is deployed, the WhySlow MBean will appear at the top level, next to the Alfresco node. This is controlled by the key of the map passed into the “beans” property (Zia:name=WhySlow) and is explained in the Spring docs.
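
To see how the object name drives placement without deploying to Alfresco, the same statistic can be exposed with nothing but the JDK. This is a sketch under assumptions, not the Spring setup above: it uses a standard MBean interface instead of annotations, registers directly with the platform MBeanServer, and the Backlog bean and the "Example" domain are made-up names.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class PendingTasksSketch {
    /** Standard MBean contract: the interface name is the class name plus "MBean". */
    public interface BacklogMBean {
        long getPendingTasks();
    }

    /** Hypothetical bean reporting the same statistic as WhySlow, without reflection. */
    public static class Backlog implements BacklogMBean {
        private final ThreadPoolExecutor executor;

        public Backlog(ThreadPoolExecutor executor) {
            this.executor = executor;
        }

        @Override
        public long getPendingTasks() {
            return executor.getTaskCount() - executor.getCompletedTaskCount();
        }
    }

    /** Registers the bean, reads it back through JMX, and cleans up. */
    static long readThroughJmx() {
        ThreadPoolExecutor executor =
            new ThreadPoolExecutor(1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocked = () -> {
            started.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            for (int i = 0; i < 3; i++) {
                executor.execute(blocked); // one runs and blocks, two wait in the queue
            }
            started.await(); // make sure the first task is actually running
            // "Example" is the domain, which becomes the top-level node in a console.
            ObjectName name = new ObjectName("Example:name=Backlog");
            server.registerMBean(new Backlog(executor), name);
            try {
                return (Long) server.getAttribute(name, "PendingTasks");
            } finally {
                server.unregisterMBean(name);
            }
        } catch (Exception e) {
            throw new IllegalStateException("JMX sketch failed", e);
        } finally {
            release.countDown();
            executor.shutdown();
        }
    }
}
```

The part before the colon in the object name plays the same role as "Zia" in the context file: a console such as VisualVM groups beans by that domain, so picking your own domain is what puts the bean beside, rather than inside, the Alfresco tree.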

Wrap up

A bean created under JMX tends to keep the separation of concerns better than many of the alternatives. We have created “consoles” in webscripts, but it seems to be difficult to train the sys admins to go to multiple places for configuration. Once the repository is started it is consistent to point users at JMX for all administration and reporting.