
In this post, we will learn how to use Elasticsearch, Logstash and Kibana for running analytics on application events and logs. Firstly, I will install all these applications on my local machine.

Installations

You can read my previous posts on how to install Elasticsearch, Logstash, Kibana and Filebeat on your local machine.

Basic configuration

I hope by now you have installed Elasticsearch, Logstash, Kibana and Filebeat on your system. Now, let's do a few basic configurations required to run analytics on application events and logs.

Elasticsearch

Open the elasticsearch.yml file in the [ELASTICSEARCH_INSTALLATION_DIR]/config folder and add the following properties to it.

cluster.name: gauravbytes-event-analyzer
node.name: node-1

The cluster name is used by Elasticsearch nodes to form a cluster. Node names within a cluster need to be unique. We are running only a single instance of Elasticsearch on our local machine, but in a production-grade setup there will be master nodes, data nodes and client nodes that you will configure as per your requirements.

Logstash

Open the logstash.yml file in the [LOGSTASH_INSTALLATION_DIR]/config folder and add the following properties to it.

node.name: gauravbytes-logstash
path.data: [MOUNTED_HDD_LOCATION]
config.reload.automatic: true
config.reload.interval: 30s

Creating logstash pipeline for parsing application events and logs

There are three parts in a pipeline, i.e. input, filter and output. Below is the pipeline configuration for parsing application events and logs.

input {
    beats {
        port => "5044"
    }
}

filter {
   
    grok {
        match => {"message" => "\[%{TIMESTAMP_ISO8601:loggerTime}\] *%{LOGLEVEL:level} *%{DATA:loggerName} *- (?(.|\r|\n)*)"}
    }
 
    if ([fields][type] == "appevents") {
        json {
            source => "event"
            target => "appEvent"
        }
  
        mutate { 
            remove_field => "event"
        }

        date {
            match => [ "[appEvent][eventTime]" , "ISO8601" ]
            target => "@timestamp"
        }
  
        mutate {
            replace => { "[type]" => "app-events" }
        }
    }
    else if ([fields][type] == "businesslogs") {  
        mutate {
            replace => { "[type]" => "app-logs" }
        }
    }
 
    mutate { 
        remove_field => "message"
    }
}
output {
    elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "%{type}-%{+YYYY.MM.dd}"
    }
}

In the input section, we are listening on port 5044 for Beats (Filebeat will send data on this port).

In the output section, we are persisting data in Elasticsearch on an index based on type and date combination.

Let's discuss the filter section in detail.

  • 1) We are using grok filter plugin to parse plain lines of text to structured data.
    grok {
        match => {"message" => "\[%{TIMESTAMP_ISO8601:loggerTime}\] *%{LOGLEVEL:level} *%{DATA:loggerName} *- (?(.|\r|\n)*)"}
    }
    
  • 2) We are using the json filter plugin to convert the event field to a JSON object and store it in the appEvent field.
    json {
        source => "event"
        target => "appEvent"
    }
    
  • 3) We are using the mutate filter plugin to remove the data we don't require.
    mutate { 
        remove_field => "event"
    }
    
    mutate { 
        remove_field => "message"
    }
    
  • 4) We are using the date filter plugin to parse eventTime from the appEvent field as an ISO8601 date and set it as the @timestamp field.
    date {
        match => [ "[appEvent][eventTime]" , "ISO8601" ]
        target => "@timestamp"
    }
    

Filebeat

Open the filebeat.yml file in [FILEBEAT_INSTALLATION_DIR] and add the below configurations.

filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - E:\gauravbytes-log-analyzer\logs\AppEvents.log
  fields:
    type: appevents
  
- type: log
  enabled: true
  paths:
    - E:\gauravbytes-log-analyzer\logs\GauravBytesLogs.log
  fields:
    type: businesslogs
  multiline.pattern: ^\[
  multiline.negate: true
  multiline.match: after

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
setup.template.settings:
  index.number_of_shards: 3

output.logstash:
  hosts: ["localhost:5044"]

In the configurations above, we are defining two different types of Filebeat prospectors; one for application events and the other for application logs. We have also defined that the output should be sent to Logstash. There are many other configurations that you can set by referencing the filebeat.reference.yml file in the Filebeat installation directory.

Kibana

Open kibana.yml in the [KIBANA_INSTALLATION_DIR]/config folder and add the below configuration to it.

elasticsearch.url: "http://localhost:9200"

We have only configured the Elasticsearch URL, but you can change the Kibana host, port, name and other SSL-related configurations.

Running ELK stack and Filebeat

//running elasticsearch on windows
\bin\elasticsearch.exe

// running logstash
bin\logstash.bat -f config\gauravbytes-config.conf --config.reload.automatic

//running kibana
bin\kibana.bat

//running filebeat
filebeat.exe -e -c filebeat-test.yml -d "publish"

Creating Application Event and Log structure

I have created two classes AppEvent.java and AppLog.java which will capture information related to application events and logs. Below is the structure for both the classes.

//AppEvent.java
public class AppEvent implements BaseEvent {
    public enum AppEventType {
        LOGIN_SUCCESS, LOGIN_FAILURE, DATA_READ, DATA_WRITE, ERROR;
    }

    private String identifier;
    private String hostAddress;
    private String requestIP;
    private ZonedDateTime eventTime;
    private AppEventType eventType;
    private String apiName;
    private String message;
    private Throwable throwable;
}

//AppLog.java
public class AppLog implements BaseEvent {
    private String apiName;
    private String message;
    private Throwable throwable;
}

Let's generate events and logs

I have created a sample application to generate dummy events and logs. You can check out the full project on Github. There is an AppEventGenerator Java file. Run this class with the VM argument -DLOG_PATH=[YOUR_LOG_DIR] to generate dummy events. If your log path is not the same as the one defined in filebeat-test.yml, then copy the log files generated by this project to the location defined in filebeat-test.yml. You will soon see the events and logs persisted in Elasticsearch.

Running analytics on application events and logs in Kibana dashboard

Firstly, we need to define Index pattern in Kibana to view the application events and logs. Follow step by step guide below to create Index pattern.

  • Open Kibana dashboard by opening the url (http://localhost:5601/).
  • Go to Management tab. (Left pane, last option)
  • Click on Index Patterns link.
  • You will see the already created indices, if any. On the left side, you will see the option to Create Index Pattern. Click on it.
  • Now, define the index pattern and click Next. Choose the time filter field name; I chose the @timestamp field for this. You can select any other timestamp field present in this index and finally click on the Create index pattern button.

Let's view Kibana dashboard

Once Index pattern is created, click on Discover tab on the left pane and select index pattern created by you in the previous steps.

You will see a beautiful GUI with a lot of options to mine the data. On the topmost pane, you will see the auto-refresh option and the time range of data you want to fetch (last 15 minutes, 30 minutes, 1 hour, 1 day and so on); the dashboard will refresh automatically.

The next pane has a search box. You can write queries there to get a more granular view of the data. It uses Apache Lucene's query syntax.

You can also define filters to have a more granular view of data.

This is how you can run the analytics using ELK on your application events and logs. You can also define complex custom filters, queries and create visualization dashboard. Feel free to explore Kibana's official documentation to use it to its full potential.

Java 8 introduced default and static methods in interfaces. These features allow us to add new functionality in the interfaces without breaking the existing contract for implementing classes.

How do we define default and static methods?

A default method has the default keyword and a static method has the static keyword in the method signature.

public interface InterfaceA {
  double someMethodA();

  default double someDefaultMethodB() {
    // some default implementation
    return 0;
  }

  static void someStaticMethodC() {
    // helper method implementation
  }
}

Few important points for default method

  • You can inherit the default method.
  • You can redeclare the default method, essentially making it abstract again.
  • You can redefine the default method (equivalent to overriding); see the sketch below.
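
A minimal sketch of these three options (the Vehicle, Car, Truck and Bike interfaces are made up for illustration):

interface Vehicle {
  default String horn() {
    return "beep";
  }
}

// 1) Inherit the default method as-is
interface Car extends Vehicle {
}

// 2) Redeclare it, making it abstract again; implementers of Truck must provide horn()
interface Truck extends Vehicle {
  String horn();
}

// 3) Redefine (override) it with a new default implementation
interface Bike extends Vehicle {
  default String horn() {
    return "ring ring";
  }
}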

Why do we need default and static methods?

Consider an existing Expression interface with existing implementations like ConstantExpression, BinaryExpression, DivisionExpression and so on. Now, you want to add new functionality for returning the signum of the evaluated result. This can be done with default and static methods without breaking any existing functionality, as follows.

public interface Expression {
  double evaluate();

  default double signum() {
    return signum(evaluate());
  }

  static double signum(double value) {
    return Math.signum(value);
  }
}

You can find the full code on Github.

Default methods and multiple inheritance ambiguity problem

Java supports multiple inheritance of interfaces. Consider that you have two interfaces InterfaceA and InterfaceB with the same default method and your class implements both interfaces.

interface InterfaceA {
  void performA();
  default void doSomeWork() {
  
  }
}

interface InterfaceB {
  void performB();

  default void doSomeWork() {
 
  }
}

class ConcreteC implements InterfaceA, InterfaceB {

}

The above code will fail to compile with an error: class ConcreteC inherits unrelated defaults for doSomeWork() from types InterfaceA and InterfaceB.

To overcome this problem, you need to override the default method.

class ConcreteC implements InterfaceA, InterfaceB {
  @Override
  public void doSomeWork() {

  }
}

If you don't want to provide your own implementation of the overridden default method but want to reuse one of the inherited ones, that is also possible with the following syntax.

class ConcreteC implements InterfaceA, InterfaceB {
  @Override
  public void doSomeWork() {
    InterfaceB.super.doSomeWork();
  }
}

I hope you find this post informative and useful. Comments are welcome!!!.

Logstash

Logstash is a data processing pipeline which ingests data simultaneously from multiple data sources, transforms it and sends it to different `stashes`, i.e. Elasticsearch, Redis, databases, REST endpoints etc. For example: ingesting log files, then cleaning and transforming them into machine- and human-readable formats.

There are three components in Logstash, i.e. Inputs, Filters and Outputs.

Inputs

It ingests data of any kind, shape and size. For example: logs, AWS metrics, instance health metrics etc.

Filters

Logstash filters parse each event, build a structure, enrich the data in the event and transform it to the desired form. For example: enriching geo-location from an IP using the GeoIP filter, anonymizing PII information in events, transforming unstructured data to structured data using grok filters etc.

Outputs

This is the sink layer. There are many output plugins i.e. Elasticsearch, Email, Slack, Datadog, Database persistence etc.

Installing Logstash

As of writing, Logstash (6.2.3) requires Java 8 to run. To check the Java version, run the following command.

java -version

The output on my system is

java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)

If Java 8 is not installed, then please download it from the Oracle website and follow the instructions for installation. Also, set the JAVA_HOME variable.

Installing from binaries

You can directly download the binaries from here.

Installing from package repositories

Installation with APT

//ADD PUBLIC SIGNING KEY
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

//add https-transports
sudo apt-get install apt-transport-https

//save the repository definition
echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list

//installation command
sudo apt-get update && sudo apt-get install logstash

Installation with YUM

// Download and install the public signing key
rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Add the following in a new .repo file in your /etc/yum.repos.d/ directory

[logstash-6.x]
name=Elastic repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
// Installation command
sudo yum install logstash

Docker installation

You can follow the link for docker installation.

What is Elasticsearch?

Elasticsearch is a highly scalable, broadly distributed open-source full-text search and analytics engine. You can search, store and index big volumes of data in near real-time. It internally uses Apache Lucene for indexing and storing data. Below are a few use cases for it.

  • Product search for e-commerce website
  • Collecting application logs and transaction data for analyzing it for trends and anomalies.
  • Indexing instance metrics (health, stats), running analytics on them and creating alerts for instance health at regular intervals.
  • For analytics/ business-intelligence applications

Elasticsearch basic concepts

We will be using a few terms while talking about Elasticsearch. Let's see the basic building blocks of Elasticsearch.

Near real-time

Elasticsearch is near real-time, which means there is a slight latency between the indexing of a document and its availability for searching.

Cluster

It is a collection of one or more nodes (servers) that together holds the entire data and provides you the ability to index and search the cluster for data.

Node

It is a single server that is part of your cluster. It can store data, and it participates in indexing, searching and overall cluster management. A node can have four different flavours, i.e. master, http, data and coordinating/client nodes.

Index

An index is a collection of documents with similar kinds/characteristics. It is identified by a name (all lowercase) and is referred to by that name to perform indexing, search, update and delete operations against its documents.

Document

It is a single unit of information that can be indexed.

Shards and Replicas

A single index can store billions of documents, which can lead to storage taking up TBs of space. A single server could exceed its limits when storing such massive information or performing search operations on that data. To solve this problem, Elasticsearch sub-divides your index into multiple units called shards.

Replication is important primarily to have high availability in case of node/shard failure and to scale out your search throughput. By default, Elasticsearch creates 5 shards and 1 replica, which can be configured at the time of creating an index.

Installing Elasticsearch

Elasticsearch requires Java to run. As of writing this article, Elasticsearch 6.2.x+ requires at least Java 8.

Installing Java 8

// Installing Open JDK
sudo apt-get install openjdk-8-jdk
 
// Installing Oracle JDK
sudo add-apt-repository -y ppa:webupd8team/java
sudo apt-get update
sudo apt-get -y install oracle-java8-installer

Installing Elasticsearch with tar file

curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.2.4.tar.gz

tar -xvf elasticsearch-6.2.4.tar.gz

Installing Elasticsearch with package manager

// import the Elasticsearch public GPG key into apt:
wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

//Create the Elasticsearch source list
echo "deb http://packages.elastic.co/elasticsearch/6.x/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch-6.x.list
  
sudo apt-get update
  
sudo apt-get -y install elasticsearch

Configuring Elasticsearch cluster

Configuration file location if you have downloaded the tar file

vi /[YOUR_TAR_LOCATION]/config/elasticsearch.yml

Configuration file location if you used package manager to install Elasticsearch

vi /etc/elasticsearch/elasticsearch.yml

Cluster Name

Use a descriptive name for the cluster. Elasticsearch nodes will use this name to form and join the cluster.

cluster.name: lineofcode-prod

Node name

To uniquely identify node in the cluster

node.name: ${HOSTNAME}

Custom attributes to node

Adding a rack attribute to a node to logically group nodes placed in the same data center/physical machine.

node.attr.rack: us-east-1

Network host

Node will bind to this hostname or IP address and advertise this host to other nodes in the cluster.

network.host: [_VPN_HOST_, _local_]

Elasticsearch does not come with authentication and authorization, so it is suggested to never bind the network host property to a public IP address.

Cluster finding settings

To find and join a cluster, you need to know at least a few other hostnames or IP addresses. These can easily be set via the discovery.zen.ping.unicast.hosts property.

Changing the http port

You can configure the port number on which Elasticsearch is accessible over HTTP with http.port property.

Configuring JVM options (Optional for local/test)

You need to tweak JVM options as per your hardware configuration. It is advisable to allocate at most half of the total available server memory to the Elasticsearch heap; the rest will be used by Lucene and Elasticsearch threads.

// For example, if your server has eight GB of RAM then set the following properties
-Xms4g
-Xmx4g

Also, to avoid a performance hit, let Elasticsearch lock the memory with the bootstrap.memory_lock: true property.

Elasticsearch uses the concurrent mark and sweep (CMS) GC by default, and you can change it to G1GC with the following configurations.

-XX:-UseParNewGC
-XX:-UseConcMarkSweepGC
-XX:+UseCondCardMark
-XX:MaxGCPauseMillis=200
-XX:+UseG1GC
-XX:GCPauseIntervalMillis=1000
-XX:InitiatingHeapOccupancyPercent=35

Starting Elasticsearch

sudo service elasticsearch restart

TADA! Elasticsearch is up and running on your local machine.

To have a production-grade setup, I would recommend visiting the following articles.

Digitalocean guide to setup production elasticsearch

Elasticsearch - Fred Thoughts

We have learnt about what Apache Ignite is, how to set it up, and a few quick examples in the last few posts. In this post, we will deep dive into the core Apache Ignite classes and discuss the following internals.

  • Core classes
  • Lifecycle events
  • Client and Server mode
  • Thread pools configurations
  • Asynchronous support in Ignite
  • Resource injection

Core classes

Whenever you interact with Apache Ignite in an application, you will always encounter the Ignite interface and the Ignition class. Ignition is the main entry point to create an Ignite node. This class provides various methods to start a grid node in the network topology.

// Starting with default configuration
Ignite igniteWithDefaultConfig = Ignition.start();

// Ignite with Spring configuration xml file
Ignite igniteWithSpringCfgXMLFile = Ignition.start("/path_to_spring_configuration_xml.xml");

// ignite with java based configuration
IgniteConfiguration icfg = ...;
Ignite igniteWithJavaConfiguration = Ignition.start(icfg);

There are also other useful methods in the Ignition class which we will discuss below. The Ignite interface provides control over a node. It has various methods to interact with it as a data grid, service grid, compute grid, scheduler and many more; the small sketch below shows a couple of them.
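
A minimal sketch using a started node as a data grid and a compute grid (the cache name myCache is made up for illustration):

try (Ignite ignite = Ignition.start()) {
    // data grid: get or create a distributed cache and use it like a map
    IgniteCache<Integer, String> cache = ignite.getOrCreateCache("myCache");
    cache.put(1, "Hello Ignite");
    System.out.println(cache.get(1));

    // compute grid: run a closure somewhere in the cluster
    ignite.compute().run(() -> System.out.println("Hello from the compute grid"));
}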

Lifecycle events

Apache Ignite provides four lifecycle events, i.e. BEFORE_NODE_START, AFTER_NODE_START, BEFORE_NODE_STOP and AFTER_NODE_STOP, and it provides a hook to tap into these events. You need to implement LifecycleBean and set the implementation in the Ignite configuration, as sketched after the listener below.

class IgniteLifecycleEventListener implements LifecycleBean {

    @Override
    public void onLifecycleEvent(LifecycleEventType evt) throws IgniteException {
        String message;
        switch (evt) {
            case BEFORE_NODE_START:
                message = "before_node_start event is called!";
                break;
            case AFTER_NODE_START:
                message = "after_node_start event is called!";
                break;
            case BEFORE_NODE_STOP:
                message = "before_node_stop event is called!";
                break;
            case AFTER_NODE_STOP:
                message = "after_node_stop event is called!";
                break;
            default:
                message = "Unknown event";
                break;
        }
        System.out.println(message);
    }
}
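
The listener then has to be registered on the node configuration before the node is started; a minimal sketch:

IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setLifecycleBeans(new IgniteLifecycleEventListener());

try (Ignite ignite = Ignition.start(cfg)) {
    // BEFORE_NODE_START and AFTER_NODE_START are printed during start;
    // BEFORE_NODE_STOP and AFTER_NODE_STOP are printed when the node is closed
}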

Client and Server mode

An Apache Ignite node can be run in client or server mode. Server nodes participate in computing, caching, data grid, service grid etc., while client nodes are a way to interact with the server nodes to have near-real-time caching, transactions, computing and service grid functionality. You need to explicitly define the client mode (a small sketch follows the setters below).

IgniteConfiguration.setClientMode(...);

Ignition.setClientMode(...);
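
As an illustration, starting one server node and one client node in the same JVM could look roughly like this (the instance names are made up; a real client also needs discovery settings to find the servers):

// server node (client mode is false by default)
Ignite server = Ignition.start(new IgniteConfiguration().setIgniteInstanceName("server-node"));

// client node
Ignite client = Ignition.start(new IgniteConfiguration()
        .setIgniteInstanceName("client-node")
        .setClientMode(true));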

Thread pool configurations

System thread pool

It processes all cache-related operations except SQL and some other queries, and also handles compute task cancellations.

IgniteConfiguration.setSystemThreadPoolSize(...);
//By default it has size equals to max(8, total_no_of_cores)

Public thread pool

All computations are received and processed in this thread pool.

IgniteConfiguration.setPublicThreadPoolSize(...);
//By default it has size equals to max(8, total_no_of_cores)

Queries pool

Handles the SQL queries and SCAN operation executed across the cluster.

IgniteConfiguration.setQueryThreadPoolSize(...);
//By default it has size equals to max(8, total_no_of_cores)

Services Pool

Handles service-grid calls.

IgniteConfiguration.setServiceThreadPoolSize(...);
//By default it has size equals to max(8, total_no_of_cores)

Striped Pool

Accelerates basic caching operations and transactions by spreading execution over multiple stripes that don't contend with each other.

IgniteConfiguration.setStripedPoolSize(...);
//By default it has size equals to max(8, total_no_of_cores)

Data stream pool

Used in data streaming.

IgniteConfiguration.setDataStreamerThreadPoolSize(...);
//By default it has size equals to max(8, total_no_of_cores)

Custom thread pool

You can define your own custom thread pools. These are used in the compute grid. For example, you may want to run another task from a compute grid task while also avoiding deadlocks; this can be done synchronously with custom thread pools.

IgniteConfiguration icfg = ...;
icfg.setExecutorConfiguration(new ExecutorConfiguration("myCustomThreadPool").setSize(16));
class InternalTask implements IgniteRunnable {
    private static final long serialVersionUID = 5169676352276118235L;
    
    @Override
    public void run() {
        System.out.println("Internal task executed!");
    }
}

class OuterTask implements IgniteRunnable {
    private static final long serialVersionUID = 602712410415356484L;

    @IgniteInstanceResource
    private Ignite ignite;
  
    @Override
    public void run() {
        System.out.println("Ignite Outer task!");
        ignite.compute().withExecutor("myCustomThreadPool").run(new InternalTask());
    }
  
}

// Ignite main example class
IgniteConfiguration icfg = defaultIgniteCfg("custom-thread-pool-grid");
icfg.setExecutorConfiguration(new ExecutorConfiguration("myCustomThreadPool").setSize(16));
  
try (Ignite ignite = Ignition.start(icfg)) {
    ignite.compute().run(new OuterTask());
}

Asynchronous support in Ignite

The Ignite API comes with synchronous and asynchronous support. Asynchronous calls return an IgniteFuture or one of its implementations. You can call the blocking get() method to get the value, or add a listener (IgniteInClosure) which will get executed as soon as the IgniteFuture has the result.

IgniteCompute compute = ignite.compute();
IgniteFuture<String> fut = compute.callAsync(() -> "Hello from Callable");
//blocking call
String result = fut.get();
//added listener to the future which will get executed as soon as the future has the result.
fut.listen(f -> System.out.println(f.get()));

If the IgniteFuture already has the result from the asynchronous operation by the time the IgniteInClosure is passed to the listen or chain method, then it will be executed synchronously in the caller thread. Otherwise, the closure will get executed when the asynchronous operation finishes. The closure will be called in the system thread pool for asynchronous cache-related operations, or in the public thread pool in the case of compute operations. So, it is recommended to avoid calling cache/compute-related operations from the closure, to prevent deadlocks due to thread starvation.

Resource Injection

Ignite supports dependency injection of pre-defined resources which can be used in a task, job, closure or SPI. It supports both field-based and method-based injection.

IgniteRunnable task = new IgniteRunnable() {
    private static final long serialVersionUID = 787726700536869271L;

    @IgniteInstanceResource
    private transient Ignite ignite;
    
    @Override
    public void run() {
        System.out.println("Hello Gaurav Bytes from: " + ignite.name());
     
    }
};

In the above example code, we have used the @IgniteInstanceResource annotation to inject the current Ignite instance into the IgniteRunnable object. There are other pre-defined resources that you can inject into jobs, tasks, closures and SPI.

  • @IgniteInstanceResource - Injects the current instance of the Ignite API
  • @CacheNameResource - Injects the grid-cache name provided by CacheConfiguration.getName()
  • @CacheStoreSessionResource - Injects the CacheStoreSession instance
  • @LoadBalancerResource - Injects the ComputeLoadBalancer instance for load-balancing
  • @SpringApplicationContextResource - Injects Spring's ApplicationContext

Apart from this, there are few other resources like TaskContinuousMapperResource, TaskSessionResource, SpringResource, ServiceResource and JobContextResource.

This is an introduction to the series on Apache Ignite. We will discuss Apache Ignite, its features, and its usage as an in-memory data grid, compute grid, distributed cache, near real-time cache and persistent distributed database.

What is Ignite?

  • It is an in-memory compute platform.
  • It is an in-memory data grid.
  • It is durable, strongly consistent and highly available.
  • It provides the option to run SQL-like queries on the cache (with a JDBC API to support this).

Durable memory

Apache Ignite is a memory-centric platform based on a durable memory architecture. It allows you to store and process data in memory (RAM) and on disk (if Ignite native persistence is enabled). When Ignite native persistence is enabled, it treats disk as the superset of data, which is capable of surviving crashes and restarts.

In-memory features

RAM is always treated as the first memory tier, and all the processing happens there. It has the following characteristics.

  • Off-heap based: All the data and indexes are stored outside of the Java heap, which helps in processing petabytes of data.
  • Since all data and indexes are off-heap, there are no noticeable GC pauses; application code is the only possible source of stop-the-world events.
  • It has predictable memory usage. You can configure memory usage with MemoryConfiguration.
  • It uses memory as efficiently as possible and runs defragmentation routines in the background.
  • Data and indexes on disk and in memory are stored in the same page format, which improves performance and avoids unnecessary data format conversion.

Persistence features

Here are few high-level persistence features.

  • Persistence to disk is optional; you can enable or disable it (a small sketch of enabling it follows this list).
  • It provides data resiliency. If persistence is enabled, the full dataset is stored on physical disk and survives cluster restarts and crashes.
  • It can execute SQL queries on the full dataset.
  • Cluster restarts are instantaneous. In-memory data will be cached automatically.
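
A sketch of enabling native persistence on an Ignite 2.x node (the data storage classes are from the 2.x API and may differ slightly between versions; a persistent cluster also has to be activated after start):

IgniteConfiguration cfg = new IgniteConfiguration();

DataStorageConfiguration storageCfg = new DataStorageConfiguration();
storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
cfg.setDataStorageConfiguration(storageCfg);

Ignite ignite = Ignition.start(cfg);
// with persistence enabled, the cluster starts inactive and must be activated
ignite.cluster().active(true);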

In this post, we will externalize the properties used in the application into a property file and will use PropertySourcesPlaceholderConfigurer to resolve the placeholders at application startup time.

Java Configuration for PropertySourcesPlaceholderConfigurer

@Configuration
public class AppConfig {

  @Bean
  public PropertySourcesPlaceholderConfigurer propertySourcesPlaceholderConfigurer() {
    PropertySourcesPlaceholderConfigurer propertySourcesPlaceholderConfigurer = new PropertySourcesPlaceholderConfigurer();
    propertySourcesPlaceholderConfigurer.setLocations(new ClassPathResource("application-db.properties"));
    //propertySourcesPlaceholderConfigurer.setIgnoreUnresolvablePlaceholders(true);
    //propertySourcesPlaceholderConfigurer.setIgnoreResourceNotFound(true);
    return propertySourcesPlaceholderConfigurer;
  }
}

We created an object of PropertySourcesPlaceholderConfigurer and set the locations of the property files to search. In this example we used ClassPathResource to resolve the properties file from the classpath. You can also use a file-based Resource, which needs the absolute path of the file; a sketch follows below.
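
For example, a file-based setup might look like this (the bean method name and the path are made up for illustration):

@Bean
public PropertySourcesPlaceholderConfigurer fileBasedPlaceholderConfigurer() {
  PropertySourcesPlaceholderConfigurer configurer = new PropertySourcesPlaceholderConfigurer();
  // FileSystemResource resolves the file from an absolute path on disk
  configurer.setLocation(new FileSystemResource("/opt/config/application-db.properties"));
  return configurer;
}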

DBProperties file

@Configuration
public class DBProperties {
 
  @Value("${db.username}")
  private String userName;
 
  @Value("${db.password}")
  private String password;
 
  @Value("${db.url}")
  private String url;

  //getters for instance fields
}

We used @Value annotation to resolve the placeholders.

Testing the configuration

public class Main {
  private static final Logger logger = Logger.getLogger(Main.class.getName());
 
  public static void main(String[] args) {
    try (ConfigurableApplicationContext context = new AnnotationConfigApplicationContext(AppConfig.class, DBProperties.class);) {
      DBProperties dbProperties = context.getBean(DBProperties.class);
      logger.info("This is dbProperties: " + dbProperties.toString());
    }
  }
}

For testing, we created an object of AnnotationConfigApplicationContext, got the DBProperties bean from it and logged it using the Logger. This is a simple way to externalize configuration properties from the framework configuration. You can also get the full example code from Github.

In this post, we will discuss about Digest Authentication with Spring Security. You can also read my previous post on Basic Authentication with Spring Security.

What is Digest Authentication?

  • This authentication method makes use of a hashing algorithm to hash the password (called a password hash) entered by the user before sending it to the server (a small illustration of computing such a hash follows this list). This, obviously, makes it much safer than the basic authentication method, in which the user's password travels in plain text (or base64 encoded) and can be easily read by whoever intercepts it.
  • There are many such hashing algorithms in Java which can prove really effective for password security, such as MD5, SHA, BCrypt, SCrypt and PBKDF2WithHmacSHA1.
  • Please remember that once this password hash is generated and stored in the database, you can not convert it back to the original password. Each time the user logs into the application, you have to regenerate the password hash and match it with the hash stored in the database. So, if the user forgets his/her password, you will have to send a temporary password and ask him/her to change it to a new password. Well, it's a common trend nowadays.
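
As a small illustration (not part of the project code), the digest scheme's stored hash is typically MD5 of username:realm:password, which can be computed with the JDK's MessageDigest. The username, realm and password below are the sample values used later in this post:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class DigestHashExample {
  public static void main(String[] args) throws Exception {
    String username = "gaurav";
    String realm = "GAURAVBYTES.COM";
    String password = "pwd";

    // HA1 = MD5(username:realm:password), the value a digest-auth server stores/compares
    byte[] digest = MessageDigest.getInstance("MD5")
        .digest((username + ":" + realm + ":" + password).getBytes(StandardCharsets.UTF_8));

    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
      hex.append(String.format("%02x", b & 0xff));
    }
    System.out.println(hex);
  }
}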

Let's start building simple Spring Boot application with Digest Authentication using Spring Security.

Adding dependencies in pom.xml

We will use spring-boot-starter-security as maven dependency for Spring Security.

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-security</artifactId>
</dependency>

Digest related Java Configuration

@Bean
DigestAuthenticationFilter digestFilter(DigestAuthenticationEntryPoint digestAuthenticationEntryPoint, UserCache digestUserCache, UserDetailsService userDetailsService) {
  DigestAuthenticationFilter filter = new DigestAuthenticationFilter();
  filter.setAuthenticationEntryPoint(digestAuthenticationEntryPoint);
  filter.setUserDetailsService(userDetailsService);
  filter.setUserCache(digestUserCache);
  return filter;
}
 
@Bean
UserCache digestUserCache() throws Exception {
  return new SpringCacheBasedUserCache(new ConcurrentMapCache("digestUserCache"));
}
 
@Bean
DigestAuthenticationEntryPoint digestAuthenticationEntry() {
  DigestAuthenticationEntryPoint digestAuthenticationEntry = new DigestAuthenticationEntryPoint();
  digestAuthenticationEntry.setRealmName("GAURAVBYTES.COM");
  digestAuthenticationEntry.setKey("GRM");
  digestAuthenticationEntry.setNonceValiditySeconds(60);
  return digestAuthenticationEntry;
}

You need to register DigestAuthenticationFilter in your spring context. DigestAuthenticationFilter requires DigestAuthenticationEntryPoint and UserDetailsService to authenticate user.

The purpose of the DigestAuthenticationEntryPoint is to send the valid nonce back to the user if authentication fails or to enforce the authentication.

The purpose of UserDetailsService is to provide UserDetails like the password and the list of roles for a user. UserDetailsService is an interface. I have implemented it with DummyUserDetailsService, which loads details for every passed userName, but you can restrict it to a few users or make it database backed. One thing to remember is that the password needs to be in plain text format here. You can also use InMemoryUserDetailsManager for storing a handful of users, configured either through Java configuration or with XML based configuration, that could access your application. A minimal sketch of such a UserDetailsService follows below.
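
This is a sketch only (the real DummyUserDetailsService is in the linked repository); User and SimpleGrantedAuthority come from Spring Security, and the password and role here are hard-coded for illustration:

public class DummyUserDetailsService implements UserDetailsService {

  @Override
  public UserDetails loadUserByUsername(String username) throws UsernameNotFoundException {
    // plain-text password, as required for digest authentication in this setup
    return new User(username, "pwd",
        Collections.singletonList(new SimpleGrantedAuthority("ROLE_USER")));
  }
}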

In the example, I have also used caching for UserDetails. I have used SpringCacheBasedUserCache and the underlying cache is ConcurrentMapCache. You can use any other caching solution.

Running the example

You can download the example code from Github. I will be using Postman to run the example. Here are the few steps you need to follow.

1. Open postman and enter url (localhost:8082).

2. Click on Authorization tab below the url and select Digest Auth from Type dropdown.

3. Enter username(gaurav), realm(GAURAVBYTES.COM), password(pwd), algorithm(MD5) and leave nonce as empty. Click Send button.

4. You will get 401 unauthorized as response like below.

5. If you see the Headers from the response, you will see "WWW-Authenticate" header. Copy the value of nonce field and enter in the nonce textfield.

6. Click on Send Button. Voila!!! You got the valid response.

This is how we implement Digest Authentication with Spring Security. I hope you find this post informative and helpful.

In the previous posts, we have created a Spring Boot QuickStart, customized the embedded server and properties, and run specific code after the Spring Boot application starts.

Now in this post, we will create Restful webservices with Jersey deployed on Undertow as a Spring Boot Application.

Adding dependencies in pom.xml

We will add spring-boot-starter-parent as the parent of our Maven-based project. The added benefit of this is version management for Spring dependencies.

<parent>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-parent</artifactId>
  <version>1.5.0.RELEASE</version>
</parent>

Adding spring-boot-starter-jersey dependency

This will add/ configure the jersey related dependencies.

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-jersey</artifactId>
</dependency>

Adding spring-boot-starter-undertow dependency

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-undertow</artifactId>
</dependency>

These are all the necessary spring-boot-starters we require to create Restful webservices with Jersey.

Creating a Root resource/ Controller class

What are Root resource classes?

Root resource classes are POJOs that are either annotated with @Path or have at least one method annotated with @Path or a request method designator, such as @GET, @PUT, @POST, or @DELETE.

@Component
@Path("/books")
public class BookController {
  private BookService bookService;

  public BookController(BookService bookService) {
    this.bookService = bookService;
  }

  @GET
  @Produces("application/json")
  public Collection<Book> getAllBooks() {
    return bookService.getAllBooks();
  }

  @GET
  @Produces("application/json")
  @Path("/{oid}")
  public Book getBook(@PathParam("oid") String oid) {
    return bookService.getBook(oid);
  }

  @POST
  @Produces("application/json")
  @Consumes("application/json")
  public Response addBook(Book book) {
    bookService.addBook(book);
    return Response.created(URI.create("/" + book.getOid())).build();
  }

  @PUT
  @Consumes("application/json")
  @Path("/{oid}")
  public Response updateBook(@PathParam("oid") String oid, Book book) {
    bookService.updateBook(oid, book);
    return Response.noContent().build();
  }

  @DELETE
  @Path("/{oid}")
  public Response deleteBook(@PathParam("oid") String oid) {
    bookService.deleteBook(oid);
    return Response.ok().build();
  }
}

We have created a BookController class and used JAX-RS annotations.

  • @Path is used to identify the URI path (relative) that a resource class or class method will serve requests for.
  • @PathParam is used to bind the value of a URI template parameter or a path segment containing the template parameter to a resource method parameter, resource class field, or resource class bean property. The value is URL decoded unless this is disabled using the @Encoded annotation.
  • @GET indicates that annotated method handles HTTP GET requests.
  • @POST indicates that annotated method handles HTTP POST requests.
  • @PUT indicates that annotated method handles HTTP PUT requests.
  • @DELETE indicates that annotated method handles HTTP DELETE requests.
  • @Produces defines a media-type that the resource method can produce.
  • @Consumes defines a media-type that the resource method can accept.

You might have noticed that we have annotated BookController with @Component, which is Spring's annotation for registering it as a bean. We have done so to benefit from Spring's DI for injecting the BookService service class; a minimal sketch of BookService follows below.
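
An in-memory sketch of what BookService could look like, backed by a ConcurrentHashMap (the actual implementation is in the linked repository):

@Service
public class BookService {
  private final Map<String, Book> books = new ConcurrentHashMap<>();

  public Collection<Book> getAllBooks() {
    return books.values();
  }

  public Book getBook(String oid) {
    return books.get(oid);
  }

  public void addBook(Book book) {
    books.put(book.getOid(), book);
  }

  public void updateBook(String oid, Book book) {
    books.put(oid, book);
  }

  public void deleteBook(String oid) {
    books.remove(oid);
  }
}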

Creating a JerseyConfiguration class

@Configuration
@ApplicationPath("rest")
public class JerseyConfiguration extends ResourceConfig {
  public JerseyConfiguration() {
  
  }
 
  @PostConstruct
  public void setUp() {
    register(BookController.class);
    register(GenericExceptionMapper.class);
  }
}

We created a JerseyConfiguration class which extends the ResourceConfig from package org.glassfish.jersey.server which configures the web application. In the setUp(), we registered BookController and GenericExceptionMapper.

@ApplicationPath identifies the application path that serves as the base URI for all the resources.

Registering exception mappers

There could be cases where some exceptions occur in the resource methods (runtime/checked). You can write your own custom exception mappers to map Java exceptions to a javax.ws.rs.core.Response.

@Provider
public class GenericExceptionMapper implements ExceptionMapper<Throwable> {

  @Override
  public Response toResponse(Throwable exception) {
    return Response.serverError().entity(exception.getMessage()).build();
  }
}

We have created a generic exception handler by catching Throwable. Ideally, you should write finer-grained exception mapper.

What is @Provider annotation?

It marks an implementation of an extension interface that should be discoverable by JAX-RS runtime during a provider scanning phase.

We have also created the service BookService and the model Book. You can grab the full code from Github.

Running the application

You can use Maven to run it directly with the mvn spring-boot:run command, or you can create a jar and run it.

Testing the rest endpoints

I have used the Postman extension available in the Chrome browser to test the REST services. You can use any package/API/software to test it.

This is how we create RESTful web services with Jersey in conjunction with Spring Boot. I hope you find this post informative and helpful in creating your first, but not last, RESTful web service.

In this post, we will create a simple Spring Boot application which will run on embedded Apache Tomcat.

What is Spring Boot?

Spring Boot helps in creating stand-alone, production-grade applications easily with minimum fuss. It is an opinionated view of the Spring framework and other third-party libraries which believes in convenient configuration-based setup.

Let's start building Spring Boot Application.

Adding dependencies in pom.xml

We will first add spring-boot-starter-parent as the parent of our Maven-based project.

<parent>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-parent</artifactId>
  <version>1.5.1.RELEASE</version>
</parent>

The benefit of adding spring-boot-starter-parent is that version management of dependencies becomes easy. You can omit the required version on a dependency; it will pick the one configured in the parent pom or in the starter poms. It also conveniently sets up the build-related configuration.

Adding spring-boot-starter-web dependency

This will configure/ add all the required dependencies for spring-web module.

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-web</artifactId>
</dependency>

Writing App class

@SpringBootApplication
public class App {
  public static void main(String[] args) {
    SpringApplication.run(App.class, args);
  }
}

@SpringBootApplication indicates that the class is a configuration class and also triggers auto-configuration (through @EnableAutoConfiguration) and component scanning (through the @ComponentScan annotation contained in it).

@EnableAutoConfiguration

It enables the auto-configuration of the Spring application context. It attempts to configure your application as per the classpath dependencies that you have added.

In the main() of the App class, we have delegated the call to the run() method of SpringApplication. SpringApplication will bootstrap and auto-configure our application and, in our case, will start the embedded Tomcat server. In the run method, we have passed App.class as an argument, which tells Spring that this is our primary Spring component (it helps in bootstrapping).

Writing HelloGbController

@RestController
public class HelloGbController {
  @GetMapping
  public String helloGb() {
    return "Gaurav Bytes says, \"Hello There!!!\"";
  }
}

I have used two annotations @RestController and @GetMapping. You can read more on new annotation introduced by Spring here.

@RestController signifies that this class is a web @Controller and Spring will consider it when handling incoming web requests.

Running the application

You can use the Maven command mvn spring-boot:run to run it as a Spring Boot application, and when you hit localhost:8080 in your web browser, you will see the below web page.

Creating a jar for spring boot application

You need to add the spring-boot-maven-plugin to your build configuration in pom.xml. Then you can create a jar with the Maven command mvn package and simply run it as a jar with the command java -jar spring-boot-quickstart-0.0.1-SNAPSHOT.jar.

<plugin>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-maven-plugin</artifactId>
</plugin>

This is how you can build a simple spring boot application. I hope you find this post helpful. You can download the example code from Github.

In this post, we will learn about @Import annotation and its usage. You can see my previous post on how to create a simple spring core project.

What is @Import annotation and usage?

The @Import annotation is equivalent to the <import/> element in Spring XML configuration. It helps in splitting a single Java-based configuration file into small, modular, maintainable and component-based configurations. Let's see it with an example.

@Configuration
@Import(value = { DBConfig.class, WelcomeGbConfig.class })
public class HelloGbAppConfig {

}

In the above code snippet, we are importing two different configuration files, viz. DBConfig and WelcomeGbConfig, into the application-level configuration file HelloGbAppConfig.

The above code is equivalent to Spring XML based configuration below.

<beans xmlns="http://www.springframework.org/schema/beans"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.springframework.org/schema/beans
 http://www.springframework.org/schema/beans/spring-beans-2.5.xsd">

  <import resource="config/welcomegbconfig.xml"/>
  <import resource="config/dbconfig.xml"/>

</beans>

You can see the full example code for Java based configuration on Github.

In this post, we will create a spring context and will register bean via Java configuration file. You can see my previous post on how to create a simple spring core project.

What is @Configuration annotation?

The @Configuration annotation indicates that a class declares one or more bean methods which the Spring container can process to generate bean definitions at runtime. The @Bean annotation is used at the method level to signify that the returned object will be registered as a bean in the Spring context. Let's create a quick configuration class.

@Configuration
public class WelcomeGbConfig {

  @Bean
  GreetingService greetingService() {
    return new GreetingService();
  }
}

Now, we will create spring context as follows.

// using try with resources so that this context closes automatically
try (ConfigurableApplicationContext context = new AnnotationConfigApplicationContext(
      WelcomeGbConfig.class);) {
  GreetingService greetingService = context.getBean(GreetingService.class);
  greetingService.greet();
}

1. We created the Spring context.
2. We got the bean from the context.
3. We called greet() on the bean object.

This is how you can use a Java-based configuration file to define beans that are processed by the Spring context. You can also find the full example code on Github.

In this post, we will create a Spring context and will get a bean object from it.

What is Spring context?

Spring context is also termed the Spring IoC container, which is responsible for instantiating, configuring and assembling the beans by reading configuration metadata from XML, Java annotations and/or Java code in configuration files.

Technologies used

Spring 4.3.6.RELEASE, Maven Compiler 3.6.0 and Java 1.8

We will first create a simple Maven project. You can select maven-archetype-quickstart as the archetype.

Adding dependencies in pom.xml

We will add spring-framework-bom in the dependency management.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.springframework</groupId>
      <artifactId>spring-framework-bom</artifactId>
      <version>4.3.6.RELEASE</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

The benefit of adding this is to manage the versions of the added Spring dependencies from one place. By this, you can omit mentioning version numbers for the Spring dependencies.

<dependency>
  <groupId>org.springframework</groupId>
  <artifactId>spring-context</artifactId>
  <scope>runtime</scope>
</dependency>

Now, we will create a class GreetingService which is eligible to get registered as bean in Spring context.

@Service
public class GreetingService {
  private static final Logger logger = Logger.getLogger(GreetingService.class.getName());

  public GreetingService() {

  }

  public void greet() {
    logger.info("Gaurav Bytes welcomes you for your first tutorial on Spring!!!");
  }
}

The @Service annotation at class level means that this is a service and is eligible to be registered as a bean in the Spring context.

Instantiating a container

Now, we will create an object of the Spring context. We are using AnnotationConfigApplicationContext as the Spring container. There also exist other Spring containers like ClassPathXmlApplicationContext, GenericGroovyApplicationContext etc., which we will discuss in future posts.

ConfigurableApplicationContext context = new AnnotationConfigApplicationContext(
      "com.gauravbytes.hellogb.service");

As you can see, at the time of object construction of AnnotationConfigApplicationContext, I am passing one String parameter. This parameter (of varargs type) is the basePackages which the Spring context will scan for bean registration.

Now, we will get the bean object by calling getBean() on the Spring context.

GreetingService greetingService = context.getBean(GreetingService.class);
greetingService.greet();

At last, we are closing the spring container by calling close().

context.close();

It is important to close the Spring context (container) after use. By closing it, we ensure that it releases all the resources and locks that its implementation might hold and also destroys all the cached singleton beans.
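
Alternatively, instead of calling close() explicitly, you can register a JVM shutdown hook on the context, which closes it automatically when the application exits:

// closes the context (and destroys singletons) when the JVM shuts down
context.registerShutdownHook();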

We have also included maven-compiler-plugin in pom.xml to compile the java sources with the configured java version (in our case it is Java 1.8).

You can also find the example code on Github.

This article is in continuation to my other posts on Functional Interfaces, static and default methods and Lambda expressions.

Method references are a special form of lambda expression. When your lambda expression does nothing other than invoke existing behaviour (a method), you can achieve the same by referring to that method by name.

  • :: is used to refer to a method.
  • The method's type arguments are inferred by the compiler from the context in which the reference is defined.

Types of method references

  • Static method reference
  • Instance method reference of particular object
  • Instance method reference of an arbitrary object of particular type
  • Constructor reference

Static method reference

When you refer to a static method of the containing class, e.g. ClassName::someStaticMethodName

class MethodReferenceExample {
  public static int compareByAge(Employee first, Employee second) {
    return Integer.compare(first.age, second.age);
  }
}

Comparator<Employee> compareByAge = MethodReferenceExample::compareByAge;

Instance method reference of particular object

When you refer to an instance method of a particular object, e.g. containingObjectReference::someInstanceMethodName

static class MyComparator {
  public int compareByFirstName(User first, User second) {
    return first.getFirstName().compareTo(second.getFirstName());
  }
  
  public int compareByLastName(User first, User second) {
    return first.getLastName().compareTo(second.getLastName());
  }
}

private static void instanceMethodReference() {
  System.err.println("Instance method reference");
  List<User> users = Arrays.asList(new User("Gaurav", "Mazra"),
      new User("Arnav", "Singh"), new User("Daniel", "Verma"));
  MyComparator comparator = new MyComparator();
  System.out.println(users);
  Collections.sort(users, comparator::compareByFirstName);
  System.out.println(users);
}

Instance method reference of an arbitrary object of particular type

When you refer to an instance method through the class name, e.g. ClassName::someInstanceMethod

Comparator<String> stringIgnoreCase = String::compareToIgnoreCase;
//this is equivalent to
Comparator<String> stringComparator = (first, second) -> first.compareToIgnoreCase(second);

Constructor reference

When you refer to the constructor of some class in a lambda, e.g. ClassName::new

Function<String, Job> jobCreator = Job::new;
//the above function is equivalent to
Function<String, Job> jobCreator2 = (jobName) -> new Job(jobName);

You can find the full example on github.

You can also view my other article on Java 8

In this post, we will cover following topics.

  • What are Streams?
  • What is a pipeline?
  • Key points to remember for Streams.
  • How to create Streams?

What are Streams?

Java 8 introduced a new package, java.util.stream, which contains classes to perform SQL-like operations on elements. A Stream is a sequence of elements on which you can perform aggregate operations (reduction, filtering, mapping, average, min, max etc.). It is not a data structure that stores elements like a collection, but carries values, often lazily computed, from a source through a pipeline.

What is a pipeline?

A pipeline is a sequence of operations (intermediate and terminal) on a source. It has the following components.

  • A source: Collections, Generator Function, array, I/O channel etc.
  • zero or more intermediate operations: filter, map, sequential, sorted, distinct, limit, flatMap, parallel etc. Intermediate operations return/produce a new stream.
  • a terminal operation: forEach, reduce, noneMatch, allMatch, count, findFirst, findAny, min, max etc.

Key points to remember for Streams

  • No storage.
  • Functional in nature.
  • Laziness-seeking.
  • Possibly unbounded. Operations, for example, limit(n) or findFirst() can permit calculations on infinite streams to finish in finite time.
  • Consumable. The elements can be visited only once. To revisit them, you need to create a new stream (see the sketch after this list).
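
For example, the short sketch below shows that a stream cannot be traversed twice; the second terminal operation fails:

Stream<String> names = Stream.of("Gaurav", "Arnav", "Daniel");
names.forEach(System.out::println); // first traversal works
names.forEach(System.out::println); // throws IllegalStateException: stream has already been operated upon or closed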

How to create Streams?

1. From a Collection, you can create streams by calling stream() or parallelStream().

Collection<Person> persons = StreamSamples.getPersons();
persons.stream().forEach(System.out::println);

// parallel stream
persons.parallelStream().forEach(System.out::println);

2. From the Stream interface, by calling the static factory method of(), which takes a varargs of type T.

Stream.of("This", "is", "how", "you", "create", "stream", "from", "static", "factory",
      "method").map(s -> s.concat(" ")).forEach(System.out::print);

3. From Arrays class, by calling stream() static method.

Arrays.stream(new String[] { "This", "is", "how", "you", "create", "stream", ".",
      "Above", "function", "use", "this" }).map(s -> s.concat(" "))
      .forEach(System.out::print);

4. From Stream, by calling iterate(). It produces an infinite stream.

// iterate returns an infinite stream... beware of infinite streams
Stream.iterate(1, i -> i + 1).limit(10).forEach(System.out::print);

5. From IntStream by calling range.

int sumOfFirst10PositiveNumbers = IntStream.rangeClosed(1, 10).reduce(0, Integer::sum);
System.out.println(sumOfFirst10PositiveNumbers);

6. From Random, by calling ints(). It produces an infinite stream.

// random.ints for random number
new Random().ints().limit(20).forEach(System.out::println);

7. From BufferedReader, by calling lines(). Streams of file paths can also be obtained from the Files class, and from some other classes like JarFile.stream(), BitSet.stream() etc.

try (BufferedReader br = new BufferedReader(new StringReader(myValue))) {
  br.lines().forEach(System.out::print);
  System.out.println();
}
catch (IOException io) {
  System.err.println("Got this:>>>> " + io);
}

I hope the post is informative and helpful in understanding Streams. You can find the full example code on Github.

You can also read about Aggregate operations on Streams.

This post is in continuation of my earlier posts on Streams. In this post, we will discuss aggregate operations on Streams.

Aggregate operations on Streams

You can perform intermediate and terminal operations on Streams. Intermediate operations result in a new stream and are lazily evaluated; they will only run when a terminal operation is called.

persons.stream().filter(p -> p.getGender() == Gender.MALE).forEach(System.out::println);

In the snippet above, filter() doesn't start filtering immediately but creates a new stream. It only starts when a terminal operation is called, in the above case forEach(). A small sketch below demonstrates this laziness.
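
The sketch uses the same persons collection as above; nothing is printed by filter() until the terminal forEach() runs:

Stream<Person> males = persons.stream()
    .filter(p -> {
        System.out.println("filtering " + p.getFirstName());
        return p.getGender() == Gender.MALE;
    });

// no "filtering ..." lines have been printed yet; they appear only now:
males.forEach(System.out::println);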

Intermediate operations

There are many intermediate operations that you can perform on Streams. Some of them are filter(), distinct(), sorted(), limit(), parallel(), sequential(), map(), flatMap().

filter() operation

This takes the Predicate functional interface as an argument, and the output stream of this operation will contain only those elements which pass the conditional check of the Predicate. You can find a nice explanation of Predicates here.

// all the males
List<Person> allMales = persons.stream().filter(p -> p.getGender() == Gender.MALE).collect(Collectors.toList());
System.out.println(allMales);

map() operation

It is a mapping operation. It expects the Function functional interface as an argument. The purpose of Function is to transform from one type to another (the other type could be the same).

// first names of all the persons
List<String> firstNames = persons.stream().map(Person::getFirstName).collect(Collectors.toList());
System.out.println(firstNames);

distinct()

It returns the unique elements and uses equals() under the hood to remove duplicates.

List<String> uniqueFirstNames = persons.stream().map(Person::getFirstName).distinct().collect(Collectors.toList());

System.out.println(uniqueFirstNames);

sorted()

It sorts the stream elements, either in their natural order or using the supplied Comparator. It is a stateful operation.

List<Person> sortedByAge = persons.stream().sorted(Comparator.comparingInt(Person::getAge)).collect(Collectors.toList());
System.out.println(sortedByAge);

limit() truncates the stream to at most the given number of elements. It is helpful for turning infinite streams into finite ones.
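
As a small sketch, limit() can cap an otherwise infinite stream produced by Stream.generate():

// Stream.generate() is infinite; limit(5) keeps only the first five elements
Stream.generate(() -> "ping").limit(5).forEach(System.out::println);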

Intermediate operations can be divided into two categories: stateless and stateful. Most of the intermediate operations are stateless, e.g. map, filter, limit etc., but some of them are stateful, e.g. distinct and sorted, because they have to maintain the state of previously seen elements.

Terminal/ Reduction operations

There are many terminal operations such as forEach(), reduce(), max(), min(), average(), collect(), findAny(), findFirst(), allMatch() and noneMatch().

forEach()

It takes the Consumer functional interface as a parameter and passes each element to it for consumption.

persons.stream().forEach(System.out::println);

max(), min(), average() operations

On an IntStream, average() returns an OptionalDouble whereas max() and min() return an OptionalInt.

//average age of all persons
persons.stream().mapToInt(Person::getAge).average().ifPresent(System.out::println);

// max age from all persons
persons.stream().mapToInt(Person::getAge).max().ifPresent(System.out::println);

// min age from all persons
persons.stream().mapToInt(Person::getAge).min().ifPresent(System.out::println);

noneMatch(), allMatch(), anyMatch()

These return true if the given condition is satisfied by none, all or any of the stream's elements, respectively.

//age of all females in the group is less than 22
persons.stream().filter(p -> p.getGender() == Gender.FEMALE).allMatch(p -> p.getAge() < 22);
//not a single male's age is greater than 30
persons.stream().filter(p -> p.getGender() == Gender.MALE).noneMatch(p -> p.getAge() > 30);

persons.stream().anyMatch(p -> p.getAge() > 45);

Reduction operations

Reduction operations are those which produce a single value as their result. We have already seen some of them in the previous snippets, e.g. max(), min(), average(), sum() etc. Apart from these, Java 8 provides two more general-purpose operations: reduce() and collect().

reduce()

// sum of the numbers 1 to 10, with 0 as the identity value
int sumOfFirst10 = IntStream.rangeClosed(1, 10).reduce(0, Integer::sum);
System.out.println(sumOfFirst10);
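
reduce() also has an overload that takes only a BinaryOperator and returns an Optional. As a small sketch (reusing the persons list from the earlier snippets), it can pick the oldest person:

// without an identity value, reduce() returns Optional<Person>
Optional<Person> oldest = persons.stream()
      .reduce((p1, p2) -> p1.getAge() >= p2.getAge() ? p1 : p2);
oldest.ifPresent(System.out::println);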

collect()

It is a mutable reduction. The Collectors class has many useful collector factory methods like toList(), groupingBy(), averagingInt(), counting(), summarizingInt(), joining() and mapping(), several of which are used below.

Collection<Person> persons = StreamSamples.getPersons();
List<String> firstNameOfPersons = persons.stream().map(Person::getFirstName).collect(Collectors.toList());
System.out.println(firstNameOfPersons);

Map<Integer, List<Person>> personByAge = persons.stream().collect(Collectors.groupingBy(Person::getAge));
System.out.println(personByAge);

Double averageAge = persons.stream().collect(Collectors.averagingInt(Person::getAge));
System.out.println(averageAge);

Long totalPersons = persons.stream().collect(Collectors.counting());
System.out.println(totalPersons);

IntSummaryStatistics personsAgeSummary = persons.stream().collect(Collectors.summarizingInt(Person::getAge));

System.out.println(personsAgeSummary);

String allPersonsFirstName = persons.stream().collect(Collectors.mapping(Person::getFirstName, Collectors.joining("#")));
System.out.println(allPersonsFirstName);

The result would look like this.

[Gaurav, Gaurav, Sandeep, Rami, Jiya, Rajesh, Rampal, Nisha, Neha, Ramesh, Parul, Sunil, Prekha, Neeraj]
{32=[Person [firstName=Rami, lastName=Aggarwal, gender=FEMALE, age=32, salary=12000]], 35=[Person [firstName=Rampal, lastName=Yadav, gender=MALE, age=35, salary=12000]], 20=[Person [firstName=Prekha, lastName=Verma, gender=FEMALE, age=20, salary=3600]], 21=[Person [firstName=Neha, lastName=Kapoor, gender=FEMALE, age=21, salary=5500]], 22=[Person [firstName=Jiya, lastName=Khan, gender=FEMALE, age=22, salary=4500], Person [firstName=Ramesh, lastName=Chander, gender=MALE, age=22, salary=2500]], 24=[Person [firstName=Sandeep, lastName=Shukla, gender=MALE, age=24, salary=5000]], 25=[Person [firstName=Parul, lastName=Mehta, gender=FEMALE, age=25, salary=8500], Person [firstName=Neeraj, lastName=Shah, gender=MALE, age=25, salary=33000]], 26=[Person [firstName=Nisha, lastName=Sharma, gender=FEMALE, age=26, salary=10000]], 27=[Person [firstName=Sunil, lastName=Kumar, gender=MALE, age=27, salary=6875]], 28=[Person [firstName=Gaurav, lastName=Mazra, gender=MALE, age=28, salary=10000], Person [firstName=Gaurav, lastName=Mazra, gender=MALE, age=28, salary=10000]], 45=[Person [firstName=Rajesh, lastName=Kumar, gender=MALE, age=45, salary=55000]]}
27.142857142857142
14
IntSummaryStatistics{count=14, sum=380, min=20, average=27.142857, max=45}
Gaurav#Gaurav#Sandeep#Rami#Jiya#Rajesh#Rampal#Nisha#Neha#Ramesh#Parul#Sunil#Prekha#Neeraj

You can't consume the same Stream twice

Once a terminal operation has completed, the stream is considered consumed and you can't use it again. You will get an exception if you try to start new operations on an already consumed stream.

List<String> lines = Arrays.asList("to", "be", "or", "not", "to", "be"); // sample data
Stream<String> stream = lines.stream();
stream.reduce((a, b) -> a.length() > b.length() ? a : b).ifPresent(System.out::println);

// below line will throw the exception
stream.forEach(System.out::println);
Exception in thread "main" java.lang.IllegalStateException: stream has already been operated upon or closed
 at java.util.stream.AbstractPipeline.sourceStageSpliterator(AbstractPipeline.java:279)
 at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
 at com.gauravbytes.java8.stream.StreamExceptionExample.main(StreamExceptionExample.java:18)

Parallelism

Streams provide a convenient way to execute operations in parallel. Under the hood, the common ForkJoinPool is used to run stream operations in parallel. You can use parallelStream(), or call parallel() on an already created stream, to perform the task in parallel. One thing to note: parallel execution is not automatically faster than sequential execution unless you have enough data and processor cores.

persons.parallelStream().filter(p -> p.getAge() > 30).collect(Collectors.toList());
Pass the java.util.concurrent.ForkJoinPool.common.parallelism system property at JVM startup to increase the parallelism of the common fork-join pool.
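
As a quick check (a minimal sketch), you can read the common pool's parallelism at runtime and verify the effect of the flag:

// e.g. start the JVM with -Djava.util.concurrent.ForkJoinPool.common.parallelism=8
// and then inspect the common pool
System.out.println(ForkJoinPool.commonPool().getParallelism());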

Concurrent reductions

// groupingByConcurrent returns a ConcurrentMap; it pays off mainly on parallel streams
ConcurrentMap<Integer, List<Person>> personByAgeConcurrent = persons.parallelStream().collect(Collectors.groupingByConcurrent(Person::getAge));
System.out.println(personByAgeConcurrent);
When writing stream pipelines, prevent interference, side-effects and stateful lambdas/functions.

Side effects

A function that does more than consume and/or return a value, for example one that modifies external state, is said to have side-effects. Common places where side-effects appear are forEach() and mutable reduction using collect(). Java handles the side-effects in collect() in a thread-safe manner.

Interference

You should avoid interference in your lambdas/ functions. It occurs when you modify the underlying collection while running pipeline operations.
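
For example (an illustrative sketch), adding to the source list from inside the pipeline interferes with the stream and typically fails with a ConcurrentModificationException:

List<String> words = new ArrayList<>(Arrays.asList("one", "two", "three"));
// modifying the stream's source while the pipeline runs is interference
words.stream().forEach(w -> words.add("oops")); // throws ConcurrentModificationException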

Stateful Lambda expressions

A lambda expression is stateful if its result depends on any state which can change during execution of the pipeline. Avoid using stateful lambda expressions. You can read more here.
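
A small sketch of a stateful lambda: the lambda below mutates shared state, so the order of elements in the target list is non-deterministic when the stream runs in parallel:

List<Integer> source = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
List<Integer> target = Collections.synchronizedList(new ArrayList<>());
// stateful lambda: it mutates 'target', so the resulting order differs between runs
source.parallelStream().map(i -> { target.add(i); return i; }).forEach(i -> { });
System.out.println(target);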

I hope you find this post informative and helpful. You can find the example code for reduction, aggregate operation and stream creation on Github.

Java 8 reincarnated SAM interfaces and termed them functional interfaces. A functional interface has a single abstract method and is eligible to be represented with a lambda expression. The @FunctionalInterface annotation was introduced in Java 8 to mark an interface as functional. It ensures at compile time that the interface has only a single abstract method; otherwise, the compiler reports an error.

Let's define a functional interface.

@FunctionalInterface
public interface Spec<T> {
  boolean isSatisfiedBy(T t);
}

Functional interfaces can have default and static methods in them and still remain functional interfaces.

@FunctionalInterface
public interface Spec<T> {
  boolean isSatisfiedBy(T t);
 
  default Spec<T> not() {
    return (t) -> !isSatisfiedBy(t);
  }
 
  default Spec<T> and(Spec<T> other) {
    return (t) -> isSatisfiedBy(t) && other.isSatisfiedBy(t);
  }
 
  default Spec<T> or(Spec<T> other) {
    return (t) -> isSatisfiedBy(t) || other.isSatisfiedBy(t);
  }
}
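
As a usage sketch (assuming a Person class, a Gender enum and a persons list like the ones used in the Streams examples), the default methods let you compose specifications from simple lambdas:

Spec<Person> isAdult = p -> p.getAge() >= 18;
Spec<Person> isMale = p -> p.getGender() == Gender.MALE;

// composed specification: adult females
Spec<Person> adultFemale = isAdult.and(isMale.not());
persons.stream().filter(adultFemale::isSatisfiedBy).forEach(System.out::println);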
If an interface declares an abstract method overriding one of the public methods of java.lang.Object, that also does not count toward the interface's abstract method count since any implementation of the interface will have an implementation from java.lang.Object or elsewhere.
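
For example (a minimal sketch; the Printer interface is hypothetical), the following interface is still a valid functional interface even though it also declares toString() and equals():

@FunctionalInterface
public interface Printer<T> {
  void print(T t);

  // these declarations override public methods of java.lang.Object,
  // so they do not count toward the single-abstract-method rule
  String toString();

  boolean equals(Object obj);
}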

In this post, we will cover the following topics.

  • What are Lambda expressions?
  • Syntax for Lambda expression.
  • How to define no parameter Lambda expression?
  • How to define single/ multi parameter Lambda expression?
  • How to return value from Lambda expression?
  • Accessing local variables in Lambda expression.
  • Target typing in Lambda expression.

What are Lambda expressions?

Lambda expressions are Java's first step towards functional programming. Lambda expressions enable us to treat functionality as a method argument and to express instances of single-method classes more compactly.

Syntax for Lambda expression

A lambda expression has three parts:

  • a comma-separated list of formal parameters enclosed in parentheses,
  • the arrow token ->,
  • and the body of the expression (which may or may not return a value).

(param) -> { System.out.println(param); }
Lambda expressions can only be used where the target type they are matched against is a functional interface.

How to define no parameter Lambda expression?

If the lambda expression matches a method with no parameters, it can be written as:

() -> System.out.println("No parameter expression");

How to define single/ multi parameter Lambda expression?

If the lambda expression matches a method which takes one or more parameters, it can be written as:

(param) -> System.out.println("Single param expression: " + param);

(paramX, paramY) -> System.out.println("Two param expression: " + paramX + ", " + paramY);

You can also define the type of parameter in Lambda expression.

(Employee e) -> System.out.println(e);

How to return value from Lambda expression?

You can return a value from a lambda just like a method does.

(param) -> {
  // perform some steps
  return "some value";
};

If the lambda performs a single step and returns a value, then you can write it as:

// either with an explicit return inside a block body
(int a, int b) -> { return Integer.compare(a, b); };

// or simply as an expression body; the lambda automatically returns this value
(int a, int b) -> Integer.compare(a, b);

Accessing local variables in Lambda expression

Lambdas can access final or effectively final variables of the method in which they are defined. They can also access the instance variables of the enclosing class.
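
A short sketch of capturing an effectively final local variable (using the Consumer interface):

String greeting = "Hello";   // effectively final: assigned exactly once
Consumer<String> greeter = name -> System.out.println(greeting + ", " + name);
greeter.accept("Gaurav");

// greeting = "Hi";  // un-commenting this would make 'greeting' no longer effectively
//                   // final, and the lambda above would fail to compile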

Target typing in Lambda expression

You might have seen in the earlier code snippets that we omitted the type of the parameters, the return value and the type of the lambda itself. The Java compiler determines the target type from the context in which the lambda is defined.

The compiler checks three things:

  • Is the target type a functional interface?
  • Do the parameter list and parameter types match the single abstract method?
  • Does the return type match the single abstract method's return type?

Now, Let's jump to an example to verify it.

@FunctionalInterface
interface InterfaceA {
  void doWork();
}

@FunctionalInterface
interface InterfaceB<T> {
  T doWork();
}

class LambdaTypeCheck {
  
  public static void main (String[] args) {
    LambdaTypeCheck typeCheck = new LambdaTypeCheck();
    typeCheck.invoke(() -> "I am done with you");
  }
  
  public <T> T invoke (InterfaceB<T> task) {
    return task.doWork();
  }

  public void invoke (InterfaceA task) {
    task.doWork();
  }
}
When you call typeCheck.invoke(() -> "I am done with you");, the overload invoke(InterfaceB<T> task) is called, because the lambda returns a value and therefore matches InterfaceB<T>.

I did a comparison of Java's default serialization and Apache Avro serialization of data, and the results were quite astonishing.

You can read my older posts for Java serialization process and Apache Avro Serialization.

Apache Avro consumed 15-20 times less memory to store the serialized data. I created a class with three fields (two String and one enum) and serialized it with both Avro and Java.

The memory used by Avro was 14 bytes, whereas Java used 231 bytes (the length of the resulting byte[]).
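
For reference, a sketch of how the Java figure can be measured (the serialized object here is just a placeholder for the three-field class described above):

// serialize the object into a byte[] and inspect its length
ByteArrayOutputStream bos = new ByteArrayOutputStream();
try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
  oos.writeObject(obj); // 'obj' stands for an instance of the three-field class
}
catch (IOException io) {
  System.err.println("Got this:>>>> " + io);
}
System.out.println("Java serialized size: " + bos.toByteArray().length + " bytes");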

Why Avro generates fewer bytes

Java Serialization

The default serialization mechanism for an object writes the class of the object, the class signature, and the values of all non-transient and non-static fields. References to other objects (except in transient or static fields) cause those objects to be written also. Multiple references to a single object are encoded using a reference sharing mechanism so that graphs of objects can be restored to the same shape as when the original was written.

Apache Avro

Avro writes only the schema (as a String) and the data of the class being serialized. There is no per-object overhead of writing the class of the object and the class signature, as there is with Java serialization. Also, the fields are serialized in a pre-determined order.

You can find the full Java example on github.

Avro can't handle circular references and throws java.lang.StackOverflowError, whereas Java's default serialization can handle them (example code for Avro and example code for Java serialization). Another observation is that Avro has no direct way of defining inheritance in the schema (classes), while Java's default serialization supports inheritance with its own constraints: the superclass either needs to implement the Serializable interface or must have a default no-args constructor accessible all the way up the hierarchy (otherwise it will throw java.io.NotSerializableException).

You can also view my other posts on Avro.

Iterating Collections API

Java 8 introduced a new way of iterating over the Collections API. Collections were retrofitted with the #forEach method, which accepts a Consumer in the case of Collection and a BiConsumer in the case of Map.

Consumer

Java 8 introduced a new package, java.util.function, which includes the Consumer interface. It represents an operation which accepts one argument and returns no result.

Before Java 8, you would have used a for loop, the enhanced for loop and/or an Iterator to iterate over collections.

List<Employee> employees = EmployeeStub.getEmployees();
Iterator<Employee> employeeItr = employees.iterator();
Employee employee;
while (employeeItr.hasNext()) {
  employee = employeeItr.next();
  System.out.println(employee);
}

In Java 8, you can write a Consumer and pass a reference to it to the #forEach method to perform an operation on every item of the Collection.

// fetch employees from Stub
List<Employee> employees = EmployeeStub.getEmployees();
// create a consumer on employee
Consumer<Employee> consolePrinter = System.out::println;
// use List's retrofitted method for iteration on employees and consume it
employees.forEach(consolePrinter);

Or just as a one-liner:

employees.forEach(System.out::println);

Before Java 8, you would have iterated a Map like this:

Map<Long, Employee> idToEmployeeMap = EmployeeStub.getEmployeeAsMap();
for (Map.Entry<Long, Employee> entry : idToEmployeeMap.entrySet()) {
  System.out.println(entry.getKey() + " : " + entry.getValue());
}

In Java 8, you can write a BiConsumer and pass a reference to it to the #forEach method to perform an operation on every entry of the Map.

BiConsumer<Long, Employee> employeeBiConsumer = (id, employee) -> System.out.println(id + " : " + employee);
Map<Long, Employee> idToEmployeeMap = EmployeeStub.getEmployeeAsMap();
idToEmployeeMap.forEach(employeeBiConsumer);

Or just as a one-liner:

idToEmployeeMap.forEach((id, employee) -> System.out.println(id + " : " + employee));

This is how we can benefit from the newly introduced methods for iteration. I hope you found this post informative. You can get the full example on Github.