Monday, June 22, 2009

[GSoC09] Hibernate + Devcathlon

Premise:
This past weekend I had the chance to dive into implementing Devcathlon's database back-end (Derby) using Hibernate. Unfortunately, I ran into some issues while applying these changes. They seemed easy to pinpoint, since the only recent changes were the Hibernate ones, but other confounding factors got in the way of a smooth integration. After last week's implementation, I was sure the issue was small and easily fixed. However, the common 1001 timeout error (which occurs during Devcathlon's calls out to Hackystat's SensorBaseClient) proved to be more of a challenge than I thought. In this entry I'll describe the issues I ran into and the attempts that led up to resolving this problem. I'll also cover next week's goals and a timeline for Devcathlon's first milestone deliverable.

What was the issue?
SensorBaseClient 1001 timeout error - This happens due to the increased complexity of the application and the heavy load of calls going out to Hackystat's SensorBase through SensorBaseClient. It occurred while running the JUnit test cases against my recent Hibernate modifications. Another possible cause is the increased size of the application, which may exhaust the heap of the JVM it runs in.
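When heap exhaustion is a suspect, one quick check is to print the JVM's heap limit and current usage at the start of a test run. This is only a minimal sketch; the class name HeapReport is hypothetical and not part of Devcathlon.

```java
class HeapReport {
  /** Converts a byte count to whole mebibytes. */
  static long toMiB(long bytes) {
    return bytes / (1024L * 1024L);
  }

  /** Describes the heap limit and the current heap usage of the running JVM. */
  static String describe() {
    Runtime runtime = Runtime.getRuntime();
    long used = runtime.totalMemory() - runtime.freeMemory();
    return "max heap " + toMiB(runtime.maxMemory()) + " MiB, in use " + toMiB(used) + " MiB";
  }

  public static void main(String[] args) {
    System.out.println(describe());
  }
}
```

If the reported maximum looks small for the test suite, it can be raised with the JVM's -Xmx option.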

Resolving the issue:

I went through and dismantled many of the previous code implementations. It wasn't awful, though I felt that changing the current state of the application was too intrusive. I preferred to keep my changes confined to my own additions; any modification to the existing code base would need to be carefully thought out. So the (temporary) remedy turned out to be a modification to the DevcathlonTestHelper class. The error mostly occurs when making multiple calls out to the SensorBase alongside calls to Hibernate. The calls to Hibernate stall (at exactly the same spots each time) and the connection to the SensorBaseClient apparently times out. I tried various ways to remedy the problem, a few of which were suggested by Johnson in this videocast. If you ever run into this error, I would suggest first watching that video and trying out each solution; if all else fails, then it's safe to assume that the problem must be you! The workaround was applied yesterday, and I was thrilled that the build finally passed verify. The code has been uploaded to SVN and I would like some kind of code review from my mentors.
Changes for this past week.
r903 & r904

Next week's work load:
Just to reiterate, the primary goal at this point is a clear implementation of a DB for Devcathlon. Considering how much I've endured this past week, development should hopefully go faster now that those issues have been identified. Along with finishing the database, I will get the ball rolling next week on the user interface.
The first delivery date will be within two weeks from now, which will include a solid database implementation using Hibernate (with all/any kinks worked out), UI improvements mentioned in earlier blog posts and suggestions made by peers, and finally a mailing system using JavaMail. Look out for the official beta release of Devcathlon v2.0 on July 6, 2009.

Friday, June 12, 2009

[GSoC] Derby race to the finish...

Purpose
In my last posting, I mentioned ditching BDB for Apache Derby as Devcathlon's database back-end. I'm glad to say that it was a reasonable choice and most likely a better one in the long run. In this entry, I'll cover my progress with the application and what I have in store for the next few weeks.

Starting out...
At first I decided to create the database using traditional SQL to build the tables necessary for Devcathlon. Below are the necessary Derby configuration steps and the resulting db schema.

Derby configuration:
Installed and included an environment variable for Derby.
Created the DevcathlonDerbyClient.java class (in the same directory as Devcathlon's Jetty-server Start class). The class includes the connection details and initialization of Devcathlon's database. It was implemented using Derby's included JDBC client driver, the ClientDriver class. Next, I included the client driver in the startup of Devcathlon's application in Start.java. Then I started Derby's Network Server (ran startNetworkServer[sh|bat]) on port 1527. The application fired up fine, and Devcathlon was able to communicate successfully with its database.
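As a rough illustration of the connection step, here is a minimal sketch of a JDBC client for Derby's Network Server. The class and method names are hypothetical (this is not the actual DevcathlonDerbyClient); the driver class, port and database name are the ones used elsewhere in this post. Opening a real connection assumes derbyclient.jar is on the classpath and the Network Server is running.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class DerbyConnectionSketch {
  /** Builds a JDBC URL for Derby's network client driver. */
  static String networkUrl(String host, int port, String database, boolean create) {
    String url = "jdbc:derby://" + host + ":" + port + "/" + database;
    // ";create=true" asks Derby to create the database if it does not exist yet.
    return create ? url + ";create=true" : url;
  }

  /** Loads the client driver and opens a connection to the given URL. */
  static Connection open(String url) throws SQLException, ClassNotFoundException {
    Class.forName("org.apache.derby.jdbc.ClientDriver");
    return DriverManager.getConnection(url);
  }

  public static void main(String[] args) {
    String url = networkUrl("localhost", 1527, "devcathlonDB", true);
    System.out.println("Connecting to " + url);
    try (Connection connection = open(url)) {
      System.out.println("Connected: " + !connection.isClosed());
    } catch (SQLException | ClassNotFoundException e) {
      // Expected when the driver jar is missing or the server is not running.
      System.out.println("Could not connect: " + e);
    }
  }
}
```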

SQL DB Schema 'Devcathlon':
The full schema file can be found here. It includes my quick implementation of Devcathlon's database schema for the following tables: Profile, Team, Project, Match and Event.
For convenience, the file is read in by the DevcathlonDerbyClient driver, then parsed and finally executed to initialize the database.
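The read, parse and execute steps could look roughly like the sketch below. It only covers the parsing part, and the splitter is a hypothetical helper rather than the code in DevcathlonDerbyClient. It assumes every statement ends with a semicolon at the end of a line and that comments are whole lines starting with "--", so it would need more care for scripts with semicolons inside string literals.

```java
import java.util.ArrayList;
import java.util.List;

class SqlScriptSplitter {
  /** Splits a schema script into individual statements, skipping blank and comment lines. */
  static List<String> split(String script) {
    List<String> statements = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    for (String line : script.split("\\R")) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("--")) {
        continue;
      }
      if (current.length() > 0) {
        current.append(' ');
      }
      current.append(trimmed);
      if (trimmed.endsWith(";")) {
        // Drop the terminator; it belongs to the script, not to the statement itself.
        statements.add(current.substring(0, current.length() - 1).trim());
        current.setLength(0);
      }
    }
    if (current.length() > 0) {
      // Keep a final statement that has no terminator.
      statements.add(current.toString());
    }
    return statements;
  }
}
```

Each returned string could then be passed to java.sql.Statement.execute(String) on an open connection.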

After spending some time reading up on ORMs, I decided to give one a try. I mentioned ORMs in my last few postings; this week was the real test and eventual implementation of an ORM called Hibernate. I had a few other choices built on the JPA + EJB standards (such as OpenJPA), but I think that Hibernate makes for an advanced extension to these APIs. Thanks to the immense support (online tutorials/manual) and popularity (in forums), I was able to get up and running with Hibernate in less than an hour.

Hibernate-Devcathlon configuration:
I downloaded the distribution (v3.3.1) and extracted it into my local workspace. I also added an environment variable, HIBERNATE_HOME, that points to the root directory of Hibernate. Next, I modified the Ant build.xml to include Hibernate's required jars for compilation. Then I created a Hibernate configuration file called hibernate.cfg.xml. The configuration file defines the SessionFactory settings, which include the database connection(s), mapping files, connection pooling, and debug options. Below is one possible rendering of a hibernate.cfg.xml file for Devcathlon. *Note that Devcathlon will default to looking for this configuration file at ~/.hackystat/devcathlon/db/hibernate/hibernate.cfg.xml. This is intended so that system administrators can define their own configurations, which may require a different database or whatever best suits their setup.

<hibernate-configuration>
  <session-factory>
    <property name="hibernate.connection.driver_class">org.apache.derby.jdbc.ClientDriver</property>
    <property name="hibernate.connection.url">jdbc:derby://localhost:1527/devcathlonDB;create=true</property>
    <property name="hibernate.default_schema">Devcathlon</property>

    <property name="connection.pool_size">1</property>
    <!-- SQL dialect -->
    <property name="hibernate.dialect">org.hibernate.dialect.DerbyDialect</property>

    <property name="current_session_context_class">thread</property>

    <property name="cache.provider_class">org.hibernate.cache.NoCacheProvider</property>

    <property name="show_sql">true</property>

    <property name="hbm2ddl.auto">create-drop</property>
  </session-factory>
</hibernate-configuration>
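Looking up the configuration file at the default location mentioned above could be sketched as follows. The class and method names are hypothetical and this is not the actual Devcathlon lookup code; it only resolves the path under the user's home directory and reports whether a file is there.

```java
import java.io.File;

class HibernateConfigLocator {
  /** Builds the default per-user location of hibernate.cfg.xml. */
  static File defaultConfigFile(String userHome) {
    return new File(userHome, ".hackystat/devcathlon/db/hibernate/hibernate.cfg.xml");
  }

  /** Returns the per-user file if it exists, or null so the caller can fall back. */
  static File locate(String userHome) {
    File candidate = defaultConfigFile(userHome);
    return candidate.isFile() ? candidate : null;
  }

  public static void main(String[] args) {
    File found = locate(System.getProperty("user.home"));
    System.out.println(found == null
        ? "No per-user configuration found; a bundled default would be used instead."
        : "Using " + found.getPath());
  }
}
```

In Hibernate 3, a file found this way can be handed to Configuration.configure(File) before building the SessionFactory.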

Loading this configuration file required a basic HibernateUtil.java class to handle creating a thread-safe SessionFactory that serves session-per-request sessions. This class is intended to handle initializing one global SessionFactory, providing pooled connections for less overhead on the database, and closing sessions. Most importantly, this will be the main channel of communication between our data objects and Devcathlon's database. *Note: the "hbm2ddl.auto" property is set to "create-drop" for testing purposes only. On an actual production server it should be changed, since "create-drop" drops the schema when the SessionFactory closes and "create" rebuilds it on every startup; a value such as "update" or "validate" (without the quotes) is what keeps existing data in place.

Since I started out by building the db schema manually, it really helped in creating the relational mappings for Hibernate. I could have gone with Hibernate's bottom-up approach, reverse engineering the schema to get generated (Java-annotated) mappings, but I wanted to see how close my schema was to Hibernate's interpretation of Devcathlon's related classes, and all the Java persistent classes (ordinary POJOs) had already been built anyway. Thus this is more of a top-down strategy, which requires developers to build the relational mappings first and then have Hibernate auto-generate and apply its interpreted database schema.

Structural pattern to defining and applying relations:
Define the relational mapping of a persistent class entity, ex: User, Profile, Team, etc.
Add these relational mappings (*.hbm.xml) to the session configuration, by simply adding them programmatically in HibernateUtil.java.
All *.hbm.xml relations-per-entity will reside next to their declared entities and within the same folder.

Creating first relational mappings:
For initial testing purposes, I applied my first relational mappings to the User-Profile class models. The two class models have a one-to-one relation, or simply put: a User has one Profile and a Profile belongs to a User. Technically, the Profile table's primary key carries a foreign key constraint that references the User table. The relationship is represented as bi-directional, so each model can reference the other.
Below is one way of creating the relational mappings through Hibernate. I could have also followed the JPA standard and used annotations, but that would eventually become too obtrusive (adding annotations) and mean more work (removing annotations) if we ever decide to use another ORM. Plus, these XML configuration files are simple and easy to interpret.

Defining the user.hbm.xml:

<hibernate-mapping package="org.hackystat.devcathlon.engine.user">
  <class name="User" table="ACCOUNT">
    <id name="id" column="USER_ID">
      <generator class="native"/>
    </id>
    <property name="username" type="string">
      <column name="USERNAME"
              length="16"
              not-null="true"
              unique="true"/>
    </property>
    <property name="email" type="string">
      <column name="EMAIL"
              length="20"
              not-null="true"
              unique="true"/>
    </property>
    <one-to-one name="profile"/>
    <property name="password" type="string"/>
    <property name="firstName" type="string"/>
    <property name="lastName" type="string"/>
  </class>
</hibernate-mapping>

Defining the profile.hbm.xml:

<hibernate-mapping package="org.hackystat.devcathlon.engine.profile">
  <class name="Profile" table="PROFILE">
    <id name="id" column="USER_ID">
      <generator class="foreign">
        <param name="property">user</param>
      </generator>
    </id>
    <one-to-one name="user" constrained="true"/>
    <set name="emailAddresses" table="PROFILE_EMAIL_ADDR">
      <key column="PROFILE_ID"/>
      <element type="string" column="EMAIL_ADDR"/>
    </set>
    <property name="screenName" type="string"/>
    <property name="firstName" type="string"/>
    <property name="lastName" type="string"/>
    <property name="bioInfo" type="text"/>
    <property name="gender" type="string"/>
    <property name="avatar" type="string"/>
  </class>
</hibernate-mapping>
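For reference, the persistent classes that these mappings describe are plain POJOs. The sketch below is a trimmed-down illustration, not the actual Devcathlon classes: it keeps only a few of the mapped properties, nests both classes in one hypothetical holder class so it compiles as a single file, and shows one way to keep both ends of the bi-directional one-to-one link in sync.

```java
import java.util.HashSet;
import java.util.Set;

class UserProfileSketch {
  static class User {
    private Long id;          // mapped to the USER_ID column
    private String username;
    private Profile profile;

    User() { }                // Hibernate instantiates entities through a no-argument constructor

    User(String username) {
      this.username = username;
    }

    Long getId() { return id; }
    String getUsername() { return username; }
    Profile getProfile() { return profile; }

    /** Sets the profile and makes sure the profile points back at this user. */
    void setProfile(Profile profile) {
      this.profile = profile;
      if (profile != null && profile.getUser() != this) {
        profile.setUser(this);
      }
    }
  }

  static class Profile {
    private Long id;          // shares its value with the owning user's id
    private User user;
    private Set<String> emailAddresses = new HashSet<>();

    Long getId() { return id; }
    User getUser() { return user; }
    Set<String> getEmailAddresses() { return emailAddresses; }

    /** Sets the owning user and makes sure the user points back at this profile. */
    void setUser(User user) {
      this.user = user;
      if (user != null && user.getProfile() != this) {
        user.setProfile(this);
      }
    }
  }
}
```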


Applying the db store:
Devcathlon's folder architecture is quite organized and was well thought out with a future database implementation in mind. That made it almost an effortless job on my part, considering that all I needed to do was send a create, read/load, update or delete to the Derby database. Each CRUD function was beautifully built and implemented within each entity's engine manager. So rather than storing everything in the Devcathlon session store, everything now goes through sessions obtained from Hibernate's SessionFactory. Below is a usage template for applying this SessionFactory:
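[User | Profile | Team | Match]Manager class:
public void foo() {
  ...
  // getCurrentSession() returns the session bound to the current thread
  // (current_session_context_class is "thread" in hibernate.cfg.xml).
  Session session = HibernateUtil.getSessionFactory().getCurrentSession();
  session.beginTransaction();
  // regular Java program logic here
  // create and run the query that communicates with our database
  session.createQuery("...").executeUpdate(); // DML statements
  session.getTransaction().commit();
  ...
}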
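The template above covers the happy path only; a common refinement is to roll the transaction back when the work in the middle throws. The sketch below shows that control flow with a stand-in Tx interface so it runs without Hibernate on the classpath. Tx and inTransaction are hypothetical names; in the real managers the three calls would go to the Hibernate session's transaction instead.

```java
import java.util.function.Supplier;

class TransactionTemplateSketch {
  /** Stand-in for the begin/commit/rollback calls of a real transaction API. */
  interface Tx {
    void begin();
    void commit();
    void rollback();
  }

  /** Runs the work inside a transaction: commit on success, rollback and rethrow on failure. */
  static <T> T inTransaction(Tx tx, Supplier<T> work) {
    tx.begin();
    try {
      T result = work.get();
      tx.commit();
      return result;
    } catch (RuntimeException e) {
      // Undo partial changes before passing the failure on to the caller.
      tx.rollback();
      throw e;
    }
  }
}
```

Catching RuntimeException is enough for this sketch because the work is a Supplier, which cannot throw checked exceptions.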

Running JUnit tests.
I recently ran into an issue when running the JUnit tests against my implementation: a SensorBaseClient 1001 timeout error. The error only occurs while running 'testNoData' for the very first Event test (TestBestCoverage). It's not an intermittent issue; I know exactly where it's bombing out. However, I couldn't understand why the SensorBaseClient would time out after my committed SQL transactions. The transaction that seems to be causing the problem is the one that deletes the Hackystat-Devcathlon test users defined in the unit test. It might have something to do with deleting a non-existent row and an unhandled warning coming back from Derby. Anyhow, I'll try to dig something up from the db log files and hopefully resolve this situation soon.
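One way to test the warning hypothesis at the JDBC level is to walk the warning chain that java.sql.Statement.getWarnings() returns after the delete. The helper below is a hypothetical sketch, not part of Devcathlon. It relies on the SQL standard's SQLSTATE class 02 meaning "no data"; whether Derby reports exactly that for this delete is an assumption to verify against the logs.

```java
import java.sql.SQLWarning;
import java.util.ArrayList;
import java.util.List;

class WarningInspector {
  /** Walks a JDBC warning chain and returns "SQLSTATE: message" for each entry. */
  static List<String> describe(SQLWarning first) {
    List<String> lines = new ArrayList<>();
    for (SQLWarning w = first; w != null; w = w.getNextWarning()) {
      lines.add(w.getSQLState() + ": " + w.getMessage());
    }
    return lines;
  }

  /** Returns true if any warning in the chain belongs to SQLSTATE class 02 ("no data"). */
  static boolean hasNoData(SQLWarning first) {
    for (SQLWarning w = first; w != null; w = w.getNextWarning()) {
      String state = w.getSQLState();
      if (state != null && state.startsWith("02")) {
        return true;
      }
    }
    return false;
  }
}
```

Logging describe(statement.getWarnings()) right after the delete would show whether a warning is produced at all.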

Code:
All source code changes for this milestone can be found here.

The code for this implementation can be retrieved with SVN. It lives on a branch of the current trunk, which will be merged back into the trunk later.
svn checkout \
https://hackystat-ui-devcathlon.googlecode.com/svn/branches/devcathlon-derbydb \
hackystat-ui-devcathlon-derbydb --username <your-user-name>

or
svn checkout \
http://hackystat-ui-devcathlon.googlecode.com/svn/branches/devcathlon-derbydb \
hackystat-ui-devcathlon-derbydb

Plans...
Finish building database schema and get started on UI improvements. Here are some thoughts on UI improvements:
- Search query for User-profiles, teams and matches (browse pages)
- Add a 'challenge' button for each opposing team profile page. This button will trigger an appendTeam method for appending teams for a new match. The user then navigates over to the Match page to see 'pending' match creations with all teams selected from visiting each team profile page. A user is limited to one pending list per session, and will be able to either accept the new pending match creation, or discard it for an entirely new form.
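The pending-match idea could be modeled with a small per-session object like the one sketched below. This is a hypothetical design sketch for a feature that does not exist yet: only the appendTeam name comes from the plan above, while the class name, the use of team names as strings, and the rule that a match needs at least two teams are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class PendingMatchSketch {
  // Insertion-ordered and duplicate-free, so challenging the same team twice has no effect.
  private final Set<String> teams = new LinkedHashSet<>();

  /** Adds a challenged team; returns false if the team was already pending. */
  boolean appendTeam(String teamName) {
    return teams.add(teamName);
  }

  /** Returns a snapshot of the teams selected so far. */
  List<String> pendingTeams() {
    return new ArrayList<>(teams);
  }

  /** Accepts the pending match: returns the selected teams and clears the list. */
  List<String> accept() {
    if (teams.size() < 2) {
      throw new IllegalStateException("a match needs at least two teams");
    }
    List<String> selected = new ArrayList<>(teams);
    teams.clear();
    return selected;
  }

  /** Discards the pending match so the user can start over. */
  void discard() {
    teams.clear();
  }
}
```

Keeping one such object in the user's web session would give the one-pending-list-per-session behavior.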

Sunday, June 7, 2009

[GSoC09] Further DB Investigations and Vision updates

Purpose
This entry will be a brief progress report about what I've done thus far. Currently, I have no new discoveries except for coming to terms with using some kind of database. I've also put out a wiki document for Devcathlon's vision, here (in progress).
Thinking about what's best...
During the past week I did some further investigation and decided to revisit the Derby RDBMS. A comment on my last posting by Austin Ito made me realize the potential flaw in using a non-SQL embedded database. In the long term, a non-SQL DB would likely mean less flexibility and portability for the DB system. In particular, if future developers of Devcathlon decide to port its DB to another enterprise DB, this could be a very painstaking and undocumented process. Austin also mentioned using an ORM, and I might consider implementing Hibernate. I've been doing a lot of investigative work (mostly reading) on the topic of ORMs and so far I've noted some benefits. By definition, object/relational mapping supports full object modeling, including composition, inheritance, polymorphism and persistence. ORMs like Hibernate are transparent as a persistence layer, since ordinary classes do not require any special base class or interface implementations. Hibernate is also vendor independent, because it abstracts the underlying SQL database and its SQL dialect out of the application. Although database capabilities may differ between systems, you can expect to build a more portable application using an ORM.
Plans for this week...
Finish Vision document to include technical review and goals for Devcathlon v2.0.
Continue implementing Derby DB and read up on Hibernate. Hibernate, and ORMs in general, are still a new topic for me, and I'd be interested to hear any other comments on using this persistence framework with Derby DB.