Category: Databases

Hibernate Criteria API: paging with ordering

Paging with ordering

Ordering with examples in the previous post won’t work as expected.
#1. If we include .addOrder(Order.desc(“createDate”)) to the first query, it will be illegal. Because order by clause should also be present in the results.
#2. If we include .addOrder(Order.desc(“createDate”)) to the second query, it will only sort within one page.

Solution:

List<Object[]> results = (List<Object[]>) session.createCriteria(LessonPlan.class)
.add(Restrictions.eq("individualId", individualId))
.setFirstResult(firstResult).setMaxResults(pageSize)
.addOrder(Order.desc("createDate"))
.setProjection(Projections.distinct(Projections.projectionList()
.add(Projections.id())
.add(Projections.property("createDate"))))
.list();
List<Long> resultIds = new ArrayList<Long>();
for(Object[] objArray : results){
 resultIds.add((Long)objArray[0]);
}
List<LessonPlan> result = new ArrayList<LessonPlan>();
if(resultIds.size() > 0){
result = (List<LessonPlan>) session.createCriteria(LessonPlan.class)
.add(Restrictions.in("id", resultIds))
.addOrder(Order.desc("createDate"))
.setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY)
.list();
}
return result;

Notes:
– all items in bold above are important
– so now we got list of object arrays from the first query
– we need to get only IDs from that list
second list also need to have .addOrder specified, otherwise items within a single page won’t be sorted.

More here: http://floledermann.blogspot.com/2007/10/solving-hibernate-criterias-distinct.html

Hibernate Criteria API: paging

Case: fetching paged lists from a single Entity (that has related mapped entities, ex: one-to-many)

Problem:
This query doesn’t work as needed

session.createCriteria(SavedContentArticle.class)
.add(Restrictions.eq("individualId", individualId))
.setFirstResult(firstResult).setMaxResults(pageSize)
.list();

The query above results in the following:
NOTE: saved_content_label is eagerly loaded (specified in entity definition class)

select
blah blah
from
saved_content_article this_
left outer join article articlebas2_ on this_.article_id=articlebas2_.article_id
left outer join saved_content_article_saved_content_label labels3_ on this_.saved_content_id=labels3_.saved_content_id
left outer join saved_content_label contentlab4_ on labels3_.label_id=contentlab4_.label_id
where
this_.individual_id=? limit ? offset ?

Which obviously is wrong. It is wrong because Hibernate groups distinct entities after the query (which will return results smaller than the “pageSize” for each requested page)
Good discussion here: http://stackoverflow.com/questions/2183617/criteria-api-returns-a-too-small-resultset

Solution (the only solution as of right now):
Note: I don’t consider selecting an entire table and then paging it in the code a solution.

Create 2 separate queries:

List<Long> resultIds = (List<Long>) session.createCriteria(SavedContentArticle.class)
.add(Restrictions.eq("individualId", individualId))
.setFirstResult(firstResult).setMaxResults(pageSize)
.setProjection(Projections.distinct(Projections.id()))
.list();
return (List<SavedContentArticle>) session.createCriteria(SavedContentArticle.class)
.add(Restrictions.in("savedContentId", resultIds))
.setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY)
.list();

NOTE: line: setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY) is required!

Hibernate Criteria API: adding criteria for a property of a related (mapped) entity.

Let’s say Content entity contains Label entities (as “labels”), the relationship might be one-to-many or many-to-many.
In this case the code below won’t work:

session.createCriteria(Content.class)
.add(Restrictions.eq("labels.labelId", labelId))
.list();

It should be as follows:

session.createCriteria(Content.class)
.createAlias("labels", "x")
.add(Restrictions.eq("x.labelId", labelId))
.list();

OR

session.createCriteria(Content.class)
.createCriteria("labels")
.add(Restrictions.eq("labelId", labelId))
.list();

NOTE:
In the second statement above, if we try to add one more clause that belong to Content class, we should be careful. It should be added after createCriteria(Content.class) statment and before createCriteria(“labels”). This is because if we add if after, it will belong to “labels” object.

HIBERNATE: loading entity’s lazy properties using HQL (JOIN vs JOIN FETCH)

JOIN vs JOIN FETCH

Consider the query below:

List<Individual> individuals = session.createQuery("select i from Individual i left outer join fetch i.roles " +
" left outer join fetch i.credentialUPs " +
" join i.properties p " +
" where p.individualIdPropertyName.name = :emailType and p.value = :emailAddress")
.setParameter("emailType", "EMAIL_ADDRESS")
.setParameter("emailAddress", emailAddress)
.list();

This will select “credentialUP” and “roles” during the query. Interesting thing happens with “properties” association. The query will select an entity according to the where statement provided, however “properties” won’t be available by default.
To make Hibernate load all “properties” of an object we’ll need to call those properties explicitly within the current session.

Ex:

int size = individuals.size();
if (size == 1) {
 Individual result = individuals.get(0);
 result.getProperties().isEmpty();
 return result;
}

This will result in Hibernate making additional “select” calls to fetch a required associations.

Now consider doing call like this:

List<Individual> individuals = session.createQuery("select i from Individual i left outer join fetch i.roles " +
" left outer join fetch i.credentialUPs " +
" join fetch i.properties p " +
" where p.individualIdPropertyName.name = :emailType and p.value = :emailAddress")
.setParameter("emailType", "EMAIL_ADDRESS")
.setParameter("emailAddress", emailAddress)
.list();

Here, as we’ve explicitly specified that we want to “join fetch” this association, it will not do additional selects. We will however get only one “property” entity, the one that matches our “where” condition (even though there might be many more associated “properties”). Also, as expected, it doesn’t matter if we do or don’t call the “isEmpty” within the session. It also doesn’t matter if we do “left outer join” or a regular “join”

Overview of Apache Cassandra, part 1

Overview

Very nice overview video on : http://cassandra.apache.org. A few points were insightful, so I decided to outline them here.

Prehistory

– General purpose DB systems (Oracle, MS SQL, etc… ) solve most of the problems, but point is that they don’t solve ’em all.

– Most obvious problem is scalability, which is solved using vertical scaling in Relational Databases (faster CPUs, faster disks, etc…). To make things worse more serious RDBMS systems cost a lot of money.

– Papers on BigTable and Dynamo present a new approach to solve those problems. Basically the idea is a “system that can handle huge amounts of data horizontally” (over several networked PCs)

– Google’s BigTable entirely depends on distributed file system they already had. In addition, they developed very sophisticated “Sparse” table mechanisms.

– Amazon’s Dynamo is strictly “Distributed Hash Table”. Due to their “high-availability” requirement, it is also one-hop DHT, meaning they should have advanced hashing algorithms. I just guess that they don’t have hierarchical hasing, because if they had, it wouldn’t be called “one-hop” anymore (unless they call O(1) = “one hop”).

– Amazon also uses BASE ( = eventual consistency) because their primary concerns are availability and consistency.

Where does Cassandra fit today?

So called NoSQL projects and those that were specifically designed to scale and deal with huge amounts of data:

Name 				Big Data
- HBase 			[V]	- more like BigTable
- MongoDB			[x]
- Riak				[V]	- more like Dynamo
- Voldemort			[V]	- more like Dynamo
- Neo4J				[x]
- Cassandra			[V]     - combines BOTH
- Hypertable		        [V]	- more like BigTable
- HyperGraphDB		        [x]
- Memcached			[x]
- Tokyo Cabinet		        [x]
- Redis 			[x]
- Couch DB			[x]

CAP Theorem – states that databases can be either CP or AP. Term was coined by Eric Brewer (UC Berkeley).

BigTable approach focuses more on CP(Consistency-PartitionTolerance), while Dynamo provides more AP(Availability-PartitionTolerance).

Cassandra is considered AP. However, I fully agree with the author on his objection of “pick-two” approach that implies that a system can be viewed as  “fully consistent and totally not available” or the other way around. There is, rather, a gradual trade-off between those two concepts. “Eventual Consistency”, in fact, is an example of that tradeoff.

Now, more about Cassandra itself:

– It is symmetric, which means all nodes are exactly the same and there is no centralization of control, hence there is no single point of failure.

– It is linear, meaning increasing size of a cluster will increase storage capacity and querying capacity as well.

– Partitioning is flexible

– Easy to grow

– Highly available (due to ideas borrowed from Dynamo)

As I was guessing, P2P routing is used to communicate between clusters, which means no centralized bottlenecks. It is interesting to note that P2P communication is structured in a Ring model, which means each cluster can pass messages to one cluster only. I guess too many clusters might introduce some delays in message propagations and, hence, some inconsistency. It is interesting how they solved those consistency issues.

For instance, if node receives a request it might serve it while update operation is being propagated over the network, causing some inconsistency. However, due to the fact that most of the related requests will be sent from the same location (and hence to the same node) we might think that this is not a big issue. But it is an issue after all…

PHP does not connect to MSSQL 2005 (but mssql_connect works)

Error:
Warning: mssql_connect() [function.mssql-connect]: Unable to connect to server: NOMADS_TOSHIBA\MSSQL2005 in C:\WebServers\home\localhost\www\salarymanager\includes\system\db.inc.php on line 20

Solution:

Download file: how_to_make_PHP_work_with_MSSQL.rar (couldn’t upload file here, please email me, I will send them)

Microsoft SQL Server 2005 and Vista.

The following steps enabled MSSQL 2005 support on Vista Home Permium.
(native drivers, not odbc which works out-of-the-box)

1) Place ntwdblib.dll in C:\Windows\System32
2) Place php_mssql.dll in C:\Windows\System32
(I gave IUSR full access to these files, maybe read is enough)

3) uncomment php_mssql.dll in php.ini

4) Changed my default SQLSERVER 2005 install so it accepts named pipes.
start -> Sql server surface area configuration ->
surface area configuration for services and connections->
MSSQLXXX-> database engine -> remote connections -> using both TCP/IP and named pipes

5) reboot.

After that I could use the mssql_* functions.

Hope that helps. I guess same applies to Windows 7