Friday, February 24, 2012

Big Data : A Quick introduction

What is Big data?

This is one of the buzzwords of today's world and has gained the attention of many industries. From the name, one might think it simply means data in a "BIG" format, but there is much more to it than that. Data is now generated at an enormous rate, and it evolves along several dimensions.

Big data spans three dimensions: Volume, Velocity and Variety.

"Volume – Big data comes in one size: large. Enterprises are awash with data, easily amassing terabytes and even petabytes of information.

Velocity – Often time-sensitive, big data must be used as it is streaming in to the enterprise in order to maximize its value to the business.

Variety – Big data extends beyond structured data, including unstructured data of all varieties: text, audio, video, click streams, log files and more."
[1]

The most interesting challenge, as well as the opportunity, comes to companies that deal with big data. You cannot rely on conventional methods to store, load and analyse it. If companies do not carefully model their big data problems along all three dimensions, they will face significant friction as they grow, and will no doubt need a large reinvestment to redesign the whole system to meet the requirements.

Here are some articles that discuss these issues in big data: [2] [3]

Sunday, February 19, 2012

How to kill a process running on a known port


If you know that a process is running on a specific port and need to kill it, here are the steps.
1) Find the process ID (here, of the process running on port 9443).
fuser -n tcp 9443
or
lsof -w -n -i tcp:9443
2) Now that you have the process ID, kill the process (here the ID is 6147).
kill -9 6147
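
The two steps above can be combined into one small shell function. This is only a sketch under a few assumptions: `kill_port` is a hypothetical name, `lsof` is assumed to be installed, and the 2 second grace period is arbitrary. It sends the default SIGTERM first and falls back to `kill -9` only for processes that are still alive, since SIGKILL gives a process no chance to clean up.

```shell
#!/bin/sh
# kill_port is a hypothetical helper name, not a standard command.
# It terminates whatever is listening on the given TCP port.
kill_port() {
    port="$1"
    # Reject anything that is not a plain port number.
    case "$port" in
        ''|*[!0-9]*) echo "usage: kill_port <port>" >&2; return 2 ;;
    esac
    # lsof -t prints bare PIDs; -sTCP:LISTEN keeps only listening sockets.
    pids=$(lsof -t -n -i "tcp:$port" -sTCP:LISTEN 2>/dev/null)
    if [ -z "$pids" ]; then
        echo "nothing is listening on port $port" >&2
        return 1
    fi
    # Ask politely first (SIGTERM). $pids is unquoted on purpose so that
    # several PIDs are passed as separate arguments.
    kill $pids 2>/dev/null
    sleep 2   # assumed grace period, adjust as needed
    # Force-kill only the processes that ignored SIGTERM.
    for pid in $pids; do
        if kill -0 "$pid" 2>/dev/null; then
            kill -9 "$pid"
        fi
    done
}
```

With the function sourced into your shell, `kill_port 9443` would replace both manual steps.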

Saturday, January 14, 2012

How to add Findbugs and Checkstyle plugins to your maven3 build


Findbugs and Checkstyle are two awesome tools that help maintain the quality of your code by discovering anti-patterns and bad smells. Normally we run them from the IDE, but developers may not know about them or may simply not run them. So how about adding them to the normal Maven build, so that developers are forced to correct their mistakes? There are different levels of leniency :) that you can choose for this integration. You can even break the build on any Findbugs violation and use SVN blame to find who broke it :D. I am leaving that part to you and will only introduce the basic integration of these essential power tools into the Maven build. I am using Maven 3, so all assumptions are based on that.

Adding Findbugs

You can find the Maven plugin page here [1]. Under Project Reports -> Plugin Documentation you will find all the configuration options, so you can customize the plugin to your own requirements.
<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>findbugs-maven-plugin</artifactId>
    <version>2.3.3</version>
    <configuration>
        <trace>false</trace>
        <effort>Max</effort>
        <threshold>Low</threshold>
        <xmlOutput>true</xmlOutput>
        <failOnError>false</failOnError>
    </configuration>
    <executions>
        <execution>
            <phase>verify</phase>
            <goals>
                <goal>findbugs</goal>
            </goals>
        </execution>
    </executions>
</plugin>
This executes Findbugs in the verify phase, which runs after package and before install. Findbugs analyzes compiled bytecode rather than source files, so it has to run after compilation, and verify is a suitable phase for this kind of quality check. If you search around you will find many more resources on how to adjust this, for example to integrate it into report generation and add it to the Maven site.
There is another cool hack that I found [2]. This Groovy script prints a summary and every reported bug to your build log, with the message, class name and source line.
<plugin>
      <groupId>org.codehaus.groovy.maven</groupId>
      <artifactId>gmaven-plugin</artifactId>
      <version>1.0</version>
      <executions>
          <execution>
              <phase>install</phase>
              <goals>
                  <goal>execute</goal>
              </goals>
              <configuration>
                  <source>
                      // Findbugs writes its XML report here when xmlOutput is true
                      def file = new File("${project.build.directory}/findbugsXml.xml")
                      if (!file.exists()) {
                          // Nothing to parse: warn and skip instead of failing on a missing file
                          log.warn("Findbugs XML report is absent: " + file.getPath())
                      } else {
                          def bugs = new XmlParser().parse(file).BugInstance
                          def total = bugs.size()
                          if (total &gt; 0) {
                              log.info("Total bugs: " + total)
                              // Print message, class name and source line of every reported bug
                              for (bug in bugs) {
                                  log.info("\n" +
                                      bug.LongMessage.text() +
                                      " " + bug.Class.'@classname' +
                                      " " + bug.Class.SourceLine.Message.text() +
                                      "\n")
                              }
                          }
                      }
                  </source>
              </configuration>
          </execution>
      </executions>
  </plugin>
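
If you only need the number of reported bugs outside of Maven, for instance in a CI script, the same report can be inspected from the shell. The sketch below is an assumption-laden shortcut rather than part of either plugin: `count_findbugs` is a hypothetical helper, the default path assumes you run it from the module directory, and it counts `<BugInstance>` elements with a plain text match instead of a real XML parse.

```shell
#!/bin/sh
# count_findbugs is a hypothetical helper name.
# It prints the number of <BugInstance> elements in a Findbugs XML report.
count_findbugs() {
    # Default path assumes the current directory is the Maven module.
    report="${1:-target/findbugsXml.xml}"
    if [ ! -f "$report" ]; then
        echo "Findbugs XML report is absent: $report" >&2
        return 2
    fi
    # grep -o prints every match on its own line, so wc -l counts
    # occurrences rather than matching lines. This is a crude text
    # match, not an XML parse.
    grep -o '<BugInstance[ >]' "$report" | wc -l | tr -d ' '
}
```

A CI step could then run `mvn verify` followed by `count_findbugs` and decide for itself whether a non-zero count should fail the job.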

Adding Checkstyle

This assumes that you have added a custom Checkstyle configuration file (checkstyle.xml) directly in your project's base directory. If you do not specify a configuration file, the configuration based on the Sun Microsystems coding conventions is selected by default.
You can read all the available configuration details on the plugin page [3].
<plugin>
     <groupId>org.apache.maven.plugins</groupId>
     <artifactId>maven-checkstyle-plugin</artifactId>
     <version>2.8</version>
     <configuration>
         <consoleOutput>true</consoleOutput>
         <configLocation>checkstyle.xml</configLocation>
         <propertyExpansion>basedir=${project.basedir}</propertyExpansion>
     </configuration>
     <executions>
         <execution>
             <phase>verify</phase>
             <goals>
                 <goal>checkstyle</goal>
             </goals>
         </execution>
     </executions>
 </plugin>
If you have a multi-module project, insert this under pluginManagement in the parent POM and add the plugin entry to the child POMs.
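
In a multi-module setup it is easy to forget the plugin entry in one of the child POMs. As a rough check, the sketch below lists every pom.xml under a directory that never mentions a given plugin. `poms_missing_plugin` is a hypothetical helper, and it relies on a plain text search for the artifactId, so it cannot tell a real plugin entry from one inside a comment.

```shell
#!/bin/sh
# poms_missing_plugin is a hypothetical helper name.
# Usage: poms_missing_plugin <root-dir> <plugin-artifactId>
# Prints every pom.xml under <root-dir> that does not mention the plugin.
poms_missing_plugin() {
    root="$1"
    artifact="$2"
    # grep -L lists the files that contain no match; -F treats the
    # pattern as a fixed string. Build output under target/ is skipped.
    find "$root" -name pom.xml ! -path '*/target/*' \
        -exec grep -L -F "<artifactId>$artifact</artifactId>" {} +
}
```

For example, `poms_missing_plugin . maven-checkstyle-plugin` run from the parent project would print the child POMs that still need the entry.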