12.07.2012

MongoDB Aggregation Queries Basics on MongoLab

What is MongoLab?

MongoLab is a hosted MongoDB service that offers one free database to developers wanting to learn the technology. It's a very elegant way to start learning or developing a small application. I started using it earlier in the year and created a couple of collections on it.

What is the MongoDB Aggregation Framework?

Traditionally, SQL databases give you GROUP BY together with an aggregation function such as average or sum. In MongoDB, aggregation was never that easy: to do any aggregation you had to write your own complicated MapReduce jobs. Along comes MongoDB version 2.2, which includes the aggregation framework.

So consider this SQL statement:
SELECT sum(duration) FROM calls GROUP BY disposition

And here it is translated to MongoDB (I've attempted to color code the translations):
   
{
  runCommand: {
    aggregate : 'calls',
    pipeline : [
      {
        $group : {
          _id : '$disposition',
          durations : {
            $sum : '$duration'
          }
        }
      }
    ]
  }
}

Easy enough! There's much more to this of course. Here's a great reference: http://docs.mongodb.org/manual/reference/aggregation/
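
If the pipeline syntax feels opaque, it can help to see the same grouping done by hand. The sketch below is plain Java, not MongoDB driver code: the Call class and the sample values are made up for illustration, and it only mimics what the $group stage with $sum computes for each disposition.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupSumSketch {

    // Hypothetical stand-in for one document in the 'calls' collection.
    static final class Call {
        private final String disposition;
        private final int duration;

        Call(String disposition, int duration) {
            this.disposition = disposition;
            this.duration = duration;
        }

        String getDisposition() { return disposition; }
        int getDuration() { return duration; }
    }

    // Mimics { $group : { _id : '$disposition', durations : { $sum : '$duration' } } }:
    // one entry per distinct disposition, holding the summed duration.
    static Map<String, Integer> sumDurationByDisposition(List<Call> calls) {
        return calls.stream().collect(Collectors.groupingBy(
                Call::getDisposition,
                TreeMap::new,
                Collectors.summingInt(Call::getDuration)));
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from a real collection.
        List<Call> calls = Arrays.asList(
                new Call("ANSWERED", 120),
                new Call("NO ANSWER", 0),
                new Call("ANSWERED", 45),
                new Call("BUSY", 5));
        System.out.println(sumDurationByDisposition(calls));
    }
}
```

This is the work the aggregation framework does inside the database, so the client only receives one row per disposition instead of every call record.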

MongoLab and the Aggregation Framework

To use this on MongoLab you'll have to start a new database, because old ones are on an older version, according to this helpful article:

All free/starter databases created after Nov 30, 2012 will be running 2.2.x and will have Aggregation Framework support.

Why do I care?

Well, I had some MongoDB Jaspersoft samples that would bring an entire un-aggregated dataset into memory and then use the JasperReports library to do the aggregations in-memory. This is fine and dandy if you're dealing with tens of thousands of records. When you move into millions of records this becomes a bit hairy, and I'd rather my database do the work!

So using my example query from above, I was able to create this very cool looking report in iReport:


Where can I learn more about Reporting and Analytics on MongoDB?

First place to look is the Jaspersoft community wiki: http://community.jaspersoft.com/wiki/mongodb
My colleague Matt Dahlman has some excellent examples on his blog: http://mdahlman.wordpress.com/tag/mongodb/

6.22.2012

Jaspersoft Report Writing Best Practices


The following is a set of guidelines for developing reports with iReport and JasperReports Server. It was compiled with the help of Frau Klein.

Best Practice - report resources:
Report units should reference the resources they use (images, styles, sub-reports, queries, input controls) rather than keep them local - example
Why?
Promotes reusability (nothing is local to the report unit). Makes maintenance easier when moving report units or replacing resources on a mass scale.

Best Practice - references:
Never hard code an entire repository path to a resource. The image repo:/reports/images/image.png should come in as a reference like repo:image.png - example
Why?
When you use js-export with the --uris option, it only resolves repository dependencies; it doesn't look for expressions that may use an absolute repository path.

Best Practice - styles:
Do not hard code any style information in the JRXML. Develop central style templates (jrtx, jrctx) for all your reports and add them to the reports as references - example
Why?
Fast change management: when the company style changes you will edit a couple of resources instead of hundreds. Note that conditional styles are not in the jrtx.

Best Practice - logged in user:
Don't log in as superuser; use an organization-specific user like jasperadmin - example
Why?
Logging in as superuser complicates repo: paths. As superuser you must pass in the full URI, e.g. repo:/organizations/organization_1/reports/drill_down_report, whereas as a member of an organization you would just use repo:/reports/drill_down_report. Also, being a "real" user is good for testing multi-tenancy.

Best Practice - input controls:
Create shared input controls if they apply to more than one report, and bring them in as a link - example
Why?
Reduces development time and is useful for dashboards. Note: datasources won't switch if your sub-report switches.

Best Practice - Use JNDI Datasources (contributed by Guillaume AUTIER)

Have a single entry point for changing datasource details (ip:port:credentials).
Why?
When deploying the server in a different location, you only have to change the datasource definition in one file: context.xml (this is valid in Tomcat; see your app server settings for JNDI).

When using export scripts the datasource remains untouched, so you can use the same export on multiple servers without having to change the datasource details after the import.

You can take advantage of Application Server pooling instead of creating many connections to the database (performance).

Best Practice – parametrized references with sub-reports (contributed by Guillaume AUTIER)

Using the previous advice you can also access a report (with subreports) both locally and on the server without touching your expressions.
Why?
This is useful when you want to quickly preview in iReport a report taken from the server. We assume that all report and subreport .jasper files reside in the same local folder.


  • Create a parameter $P{SUB_REPORTDIR} (string type, use as prompt, default value “repo:”)
  • Then have the following formula in your subreport expressions


$P{SUB_REPORTDIR}.equals("") ? "mySubReport.jasper" : $P{SUB_REPORTDIR} + "mySubReport.jrxml"


  • Publish your report on the server as usual: do not create (or link) an input control for the SUB_REPORTDIR parameter

Run your report:

  • In iReport you will be prompted for the SUB_REPORTDIR parameter; leave it blank.
  • On the server, with no input control for the SUB_REPORTDIR parameter, the server will take the default value.
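
The two cases above can be traced with a small plain-Java mirror of the subreport expression. This is only a sketch: the resolve method and the SubreportPathSketch class are hypothetical names, and the String argument stands in for the value JasperReports would substitute for $P{SUB_REPORTDIR}.

```java
class SubreportPathSketch {

    // Same conditional as the subreport expression, with subReportDir
    // playing the role of $P{SUB_REPORTDIR}.
    static String resolve(String subReportDir) {
        return subReportDir.equals("")
                ? "mySubReport.jasper"                // local preview: compiled file in the same folder
                : subReportDir + "mySubReport.jrxml"; // server: prefix + repository resource name
    }

    public static void main(String[] args) {
        System.out.println(resolve(""));      // iReport, parameter left blank
        System.out.println(resolve("repo:")); // server, default value applied
    }
}
```

With a blank parameter the expression resolves to the local mySubReport.jasper, and with the default "repo:" it resolves to repo:mySubReport.jrxml.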


If you have any other best practice tips, post them in the comments and I'll include them in the post. Happy reporting!

4.12.2012

TalendDate

There aren't a lot of examples around for TalendDate, so I thought I'd post a couple of quick ones:

Today's date:
TalendDate.getDate("yyyy-MM-dd")
Returns:
2012-04-12

Yesterday's date:
TalendDate.addDate(TalendDate.getDate("yyyy-MM-dd"),"yyyy-MM-dd",-1,"dd")
Returns:
2012-04-11

Tomorrow's date:
TalendDate.addDate(TalendDate.getDate("yyyy-MM-dd"),"yyyy-MM-dd",1,"dd")
Returns:
2012-04-13
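
To sanity-check the yesterday/tomorrow arithmetic outside of a Talend job, here is a rough plain-Java stand-in. It uses java.time rather than TalendDate itself, so the addDays helper is a hypothetical name and not part of Talend's API; it only illustrates shifting a formatted date string by a number of days.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class DateShiftSketch {

    // Hypothetical helper: parse a date string with the given pattern,
    // shift it by `days` (negative goes back), and format it the same way.
    static String addDays(String date, String pattern, int days) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
        return LocalDate.parse(date, formatter).plusDays(days).format(formatter);
    }

    public static void main(String[] args) {
        String today = LocalDate.now().format(DateTimeFormatter.ofPattern("yyyy-MM-dd"));
        System.out.println("Yesterday: " + addDays(today, "yyyy-MM-dd", -1));
        System.out.println("Tomorrow:  " + addDays(today, "yyyy-MM-dd", 1));
    }
}
```

For the date used in the examples above, addDays("2012-04-12", "yyyy-MM-dd", -1) gives 2012-04-11 and a shift of 1 gives 2012-04-13.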

Date format:
CC - century
YY - year
MM - month
DD - day of month
hh - hour
mm - minutes
ss - seconds

Return types:
String! (the calls above return formatted strings, not Date objects)


Other methods:


TalendDate.compareDate(myDate,myDate2,"yyyy-MM-dd")
TalendDate.diffDate(myDate,myDate2,"MM")
TalendDate.formatDate("yyyy-MM-dd HH:mm:ss",myDate)
TalendDate.setDate(myDate,newValue,"MM")
TalendDate.getFirstDayOfMonth(myDate)
TalendDate.getLastDayOfMonth(myDate)
TalendDate.getRandomDate("2007-01-01","2008-12-31")
TalendDate.parseDate("yyyy-MM-dd HH:mm:ss","")