Friday, October 28, 2016

How to query WCS DB from Search application

Introduction
There are several scenarios where we have to interact with the WCS DB from the search application. We can use JDBCQueryService for this. The class handles complex SQL and supports select statements, aggregate functions, updates, deletes etc.

Example scenario: In a specific search profile, boost a set of products that have the column CATENTRY.FIELD1 set to 1.

Steps

  • Write a custom expression provider CustomProductBoostExpressionProvider.java which extends AbstractSolrSearchExpressionProvider and overrides the invoke() method.
  • Add the query to a custom .tpl file in the corresponding component configuration, for example Search/xml/config/com.ibm.commerce.catalog-ext/wc-query-utilities.tpl


BEGIN_SQL_STATEMENT
name=getCustomBoostedProducts
base_table=CATENTRY
sql=
  SELECT CATENTRY_ID AS CATENTRY_ID_BOOSTED
  FROM CATENTRY
  WHERE CATENTRY.FIELD1 = ?param1?
  WITH UR
END_SQL_STATEMENT


  • In the expression provider, use the following code to run the query

JDBCQueryService service = new JDBCQueryService("com.ibm.commerce.catalog");
List<String> paramList = new ArrayList<String>(1);
paramList.add("1");
Map<String, List<String>> queryParameters = new HashMap<String, List<String>>(1);
queryParameters.put("param1", paramList);
List<HashMap> results = service.executeQuery("getCustomBoostedProducts", queryParameters);

Iterate over the list to get the catentry IDs.
Iterator<HashMap> recordIterator = results.iterator();
List<String> ids = new ArrayList<String>();
while (recordIterator.hasNext()) {
    HashMap<String, Object> record = (HashMap<String, Object>) recordIterator.next();
    String catentryId = record.get("CATENTRY_ID_BOOSTED").toString();
    ids.add(catentryId);
}


  • Now form the Solr boost query and add it to the control parameter

for (String catentryId : ids) {
    StringBuilder s = new StringBuilder();
    s.append("childCatentry_id");
    s.append(":\"");
    s.append(catentryId);
    s.append("\"^");
    s.append(25);

    // adds the boost query
    addControlParameterValue(SearchServiceConstants.CTRL_PARAM_SEARCH_INTERNAL_BOOST_QUERY, s.toString());
}
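
Putting the steps together, the whole provider could look roughly like the sketch below. This is a minimal outline, not verified code: the exact invoke() signature and the exception it throws depend on the AbstractSolrSearchExpressionProvider version in your toolkit, so check before reusing.

public class CustomProductBoostExpressionProvider extends AbstractSolrSearchExpressionProvider {

    @Override
    public void invoke(SelectionCriteria selectionCriteria) throws Exception {
        super.invoke(selectionCriteria);

        // Run the named query registered in wc-query-utilities.tpl
        JDBCQueryService service = new JDBCQueryService("com.ibm.commerce.catalog");
        Map<String, List<String>> queryParameters = new HashMap<String, List<String>>(1);
        queryParameters.put("param1", Arrays.asList("1"));
        List<HashMap> results = service.executeQuery("getCustomBoostedProducts", queryParameters);

        // Turn each boosted catentry into an internal boost query
        for (HashMap record : results) {
            String catentryId = record.get("CATENTRY_ID_BOOSTED").toString();
            addControlParameterValue(SearchServiceConstants.CTRL_PARAM_SEARCH_INTERNAL_BOOST_QUERY,
                    "childCatentry_id:\"" + catentryId + "\"^25");
        }
    }
}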

The above shows an interaction of the search app with the WCS DB. We can use this approach with any component configuration, such as marketing, foundation, promotion etc.








Monday, October 3, 2016

Heap dump analysis using memory analyzer for WCS

Introduction
Memory is one of the important areas in any application. Java has its own memory management, and it is really important to maintain optimum memory consumption and total run time. Failing to do so causes performance and other issues. Java handles its memory in two areas: heap and stack.

Heap memory
All the objects created in a Java application are stored in heap memory. We create an object using the new operator. The garbage collector can logically separate the heap into different areas so that GC is faster.

Stack memory
Stack is where the method invocations and the local variables are stored. If a method is called then its stack frame is put onto the top of the call stack. The stack frame holds the state of the method including which line of code is executing and the values of all local variables. The method at the top of the stack is always the current running method for that stack.

The maximum heap size and permgen size can be set during start-up of a Java application using the JVM parameters -Xmx and -XX:MaxPermSize.
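
As a quick sanity check, the effective heap limit can be read at runtime; the self-contained snippet below simply prints it.

public class HeapInfo {
    public static void main(String[] args) {
        // maxMemory() reflects the configured -Xmx limit, in bytes
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + (maxHeap / (1024 * 1024)) + " MB");
    }
}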

Memory leak
It is a type of resource leak that occurs when a program incorrectly manages memory allocations, i.e. a failure to release discarded memory, which causes impaired performance or outright failure.

OutOfMemory Errors
The java.lang.OutOfMemoryError: Java heap space error is triggered when the application attempts to add more data into the heap area but there is not enough room for it.
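
To see the error in action (and to generate a dump to practice the analysis below on), a deliberately leaky toy program like this will exhaust any -Xmx setting:

import java.util.ArrayList;
import java.util.List;

public class HeapFiller {
    public static void main(String[] args) {
        List<long[]> hoard = new ArrayList<long[]>();
        while (true) {
            // Each iteration keeps another ~8 MB reachable; once the heap
            // cannot grow past -Xmx, the JVM throws
            // java.lang.OutOfMemoryError: Java heap space
            hoard.add(new long[1024 * 1024]);
        }
    }
}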

Heap dump
A heap dump is a snapshot of the memory of a Java™ process. The snapshot contains information about the Java objects and classes in the heap at the moment it is triggered.

Eclipse Memory Analyzer(MAT)

It is one of the most feature-rich heap analyzers and helps us analyse heap memory. If we want to look at a memory-related issue, we need to generate heap dumps while the issue is occurring and then analyse them. Most heap issues will be resolved by a restart as it freshens up the heap, but that is just a temporary fix to keep the services up. For a permanent solution we need to perform a detailed analysis and fix the root cause.

1. Download the memory analyzer and extract the same. The source file can be obtained from below link 
Memory analyzer download

2. Generate the heap dump file from the server.
IBM JVMs generate the files in .phd format, which MAT does not recognize out of the box. In order to analyse these files we need to add the IBM Diagnostic Tool Framework for Java (DTFJ) plugin to it.
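
If you need to trigger a dump on demand rather than waiting for a crash, IBM JDKs expose a dump API; a minimal sketch, assuming an IBM JDK (com.ibm.jvm.Dump is not available on Oracle/OpenJDK):

import com.ibm.jvm.Dump;

public class DumpTrigger {
    public static void main(String[] args) {
        // Writes a .phd heap dump to the JVM's configured dump location
        Dump.HeapDump();
    }
}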

3. Adding DTFJ plugin to MAT
  • Download the DTFJ assets from the below link DTFJ Source file
  • Add DTFJ to MAT. Open MAT and choose Help->Install New Software
  • In the dialog box click "Add" and provide the link. Follow the steps of the wizard. 
  • In some cases the link does not work. Then download the files manually (one by one from the above link) to a folder, select "Local" in the Add dialog box and locate the folder.
  • Restart MAT and you are good to go.
4. Choose File -> Open heap dump and locate your file. This will open the heap dump and it will automatically display the overview of the analysis.


Look at the overview: it will display the main problems if any, and there are multiple options available to analyze further.

Below is an example of a heap file where one object takes more than 75% of the memory. This shows that the object is not properly handled, so we need to look at the cause behind such immense object creation.

[Screenshot: MAT overview pie chart with a single object consuming over 75% of the heap]
The picture below pinpoints the problem: the class loader has loaded an immense LRUCache object. We can now look at the application code and check when these objects are created and why their size is this huge.

[Screenshot: MAT leak suspect report pointing to the large LRUCache instance]
This is an example of how to nail down root causes of memory issues using MAT.

There are options to look at Top Components, Leak Suspects etc. We can also check the path to GC roots, thread details, hash entries, duplicate classes and much more, which can be explored on our own.



Friday, September 23, 2016

Extracting response cookies of a rest call in WCS

Introduction
We all work with cookies, and they are one of the important aspects of session management.

There are many situations where we would need to read the cookies on the server side. Usually we follow the responseWrapper approach to read the response cookies, which is described in the link below.

ResponseWrapper approach to read cookies

There are times when the above approach doesn't work but we still need to read the cookies. Some changes to the commerce OOB code will get us there.

1. Extend CommerceTokenRequestHandler.java and override the handleRequest method. Add the below code

{
    HttpServletResponseWrapper resp = (HttpServletResponseWrapper) messageContext.getAttribute(HttpServletResponse.class);
    SRTServletResponse srtResponse = getSRTResponseFromResponseWrapper(resp);
    // getCookies() returns the cookies already written to the response
    Cookie[] responseCookies = srtResponse.getCookies();
}

private SRTServletResponse getSRTResponseFromResponseWrapper(ServletResponseWrapper respWrapper) {
    ServletResponse response = respWrapper.getResponse();
    if (response == null) {
        return null;
    } else if (response instanceof SRTServletResponse) {
        return (SRTServletResponse) response;
    } else if (response instanceof ServletResponseWrapper) {
        // unwrap nested wrappers recursively
        return getSRTResponseFromResponseWrapper((ServletResponseWrapper) response);
    } else {
        return null;
    }
}
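
For context, the fragment and helper above would sit inside the extended handler, roughly as sketched below. The handleRequest signature is an assumption based on the Apache Wink RequestHandler interface that the WCS REST runtime builds on; verify it against your version of CommerceTokenRequestHandler.

public class ExtendedCommerceTokenRequestHandler extends CommerceTokenRequestHandler {

    @Override
    public void handleRequest(MessageContext messageContext, HandlersChain chain) throws Throwable {
        super.handleRequest(messageContext, chain);

        HttpServletResponseWrapper resp =
                (HttpServletResponseWrapper) messageContext.getAttribute(HttpServletResponse.class);
        SRTServletResponse srtResponse = getSRTResponseFromResponseWrapper(resp);
        if (srtResponse != null) {
            Cookie[] responseCookies = srtResponse.getCookies();
            // read or log the response cookies here
        }
    }

    // getSRTResponseFromResponseWrapper(...) as defined above
}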


2. Extend CommerceDeploymentConfiguration.java and override the initRequestUserHandlers() method


final List<RequestHandler> handlerList = super.initRequestUserHandlers();
if (handlerList != null) {
    for (int i = 0; i < handlerList.size(); i++) {
        final RequestHandler handler = handlerList.get(i);
        if (handler instanceof CommerceTokenRequestHandler) {
            // replace the OOB handler with our extended version
            handlerList.set(i, new ExtendedCommerceTokenRequestHandler());
        }
    }
}
return handlerList;

The above customisations suffice to read the response cookies for REST calls.

Saturday, May 21, 2016

Populating and retrieving data from Solr

Solr is one of the integral components of WCS, and updating and reading data in Solr is a very frequent requirement. In this write-up I will explain it with some examples.

Populating data into Solr
There are multiple ways we can populate Solr indexes. The source can be a CSV file, a JSON feed, or a URL (like a web service) that returns the data in the way we need. Here we will use a CSV file as the input.
The headings of the input file must be the same as the Solr fields. Referring to my earlier post, we will use the same data, so the headings will be searchterm and recipes. The file must have data as specified in this post http://exploringwebspherecommerce.blogspot.com.au/2016/02/creating-and-populating-new-core-in.html

To populate the indexes, we can write a Java command and use the SolrJ libraries. A base example would be something like below. It assumes the data to be populated is properly present in the file c:/Ranjith/searchRecepies.csv

SolrServer solrServer = new HttpSolrServer("url"); // Solr URL needs to be passed here
NamedList<String> updateParams = new NamedList<String>();
// Sets the file from which data is streamed
updateParams.add("stream.file", "c:/Ranjith/searchRecepies.csv");
updateParams.add("stream.contentType", "text/csv;charset=utf-8");
// commit is set to false so we can handle transaction rollbacks;
// a separate commit request must be issued at the end.
updateParams.add("commit", "false");
SolrQuery updateQuery = new SolrQuery();
updateQuery.add(SolrParams.toSolrParams(updateParams));
QueryRequest solrRequest = new QueryRequest(updateQuery);
solrRequest.setPath("/update");
QueryResponse solrResponse = solrRequest.process(solrServer);

We must have a separate method which fires a request with just the parameter "commit" set to true, and call it at the end so the changes are committed. Similarly we can have a rollback method so that we can roll back the changes in case of an exception.
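
A minimal sketch of such helpers, using the commit() and rollback() convenience methods that SolrJ exposes on the server object (the method name finishUpdate is just illustrative):

// Call with success=true once all update requests went through,
// or success=false to discard every uncommitted change.
private void finishUpdate(SolrServer solrServer, boolean success)
        throws SolrServerException, IOException {
    if (success) {
        solrServer.commit();
    } else {
        solrServer.rollback();
    }
}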

Retrieving data from Solr

To retrieve data from Solr we can use the below sample code.

SolrServer solrServer = new HttpSolrServer("url"); // Solr URL
SolrQuery query = new SolrQuery();
query.setQuery("chicken"); // Set the search term here
// You can set all the other Solr query parameters as needed: sort, filter query, fields, facets etc.
QueryResponse response = solrServer.query(query);
List<SolrDocument> data = response.getResults();

Now the data object will have the response from Solr. You can manipulate it as required and pass it back to the UI.
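
For example, iterating over the documents and reading the stored fields from the recipes core described in the earlier post:

for (SolrDocument doc : data) {
    // each stored field is available by name
    System.out.println(doc.getFieldValue("searchterm") + " -> " + doc.getFieldValue("recipes"));
}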

If we want all the GET requests to go through a common set of logic, it is possible to add a handler in solrconfig.xml and set that requestHandler in the query as well.

Saturday, February 20, 2016

Creating a new core in SOLR for Websphere Commerce

Introduction

A Solr core is basically an index of the text and fields found in documents. A single Solr instance can contain multiple "cores", which are separate from each other based on local criteria. Having multiple cores helps us to segregate the data. Commerce provides a set of cores by default. They are 

  • MC_10001_CatalogEntry_en_US
  • MC_10001_CatalogEntry_Unstructured_en_US
  • MC_10001_CatalogGroup_en_US
Say we want to index a different kind of data. An example would be as below.

In order to support recipes we want a search-term-to-recipes relationship. When a customer searches for anything, we show a list of recipe names containing that search term. When they click on one of the names, we can call an external system where the recipe information is stored, retrieve the specific recipe and show it to the user. So we need the data in SOLR in the below fashion:

SearchTerm   Recipes
milk         Mascarpone mango lassi, Ultimate breakfast smoothie
cheese       Canadian cheddar melt, Night mac and cheese
chicken      Chicken paprica, Spanish chicken
beef         Beef Stroganoff, Swiss Sizzler

We will keep the external integration aside for now. Our requirement is to store the recipe to search term mapping in SOLR. As this is in no way related to catentry or catgroup, we don't want to keep it in the above listed cores; hence we will create a new one.

Steps
1. Change the solr.xml
  • This xml defines the configuration of the Solr cores. Navigate to WCS_InstallDir/Search/solr/home and open solr.xml
  • To define a new core add the below line to it. "instanceDir" is the name of the folder where the configuration files are present and "name" is the name of new core.
        <core instanceDir="Recipes" name="recipes"/>     

2. Create the instance directory
     Every Solr core must have a set of configuration files, which live in the instance directory. Solr provides a default core; make a copy of it and name it according to solr.xml. So I would make a copy of the Default folder and rename it "Recipes".

3. Alter schema.xml
    Schema.xml defines the Solr schema for the specific core. In order to index a field we need to use either an OOB Solr field type or create a custom field type of our own. Below is an example

<fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory" 
   generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1"
   catenateAll="1" splitOnCaseChange="1" splitOnNumerics="1" preserveOriginal="1" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory" 
   generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0"
   catenateAll="0" splitOnCaseChange="0" splitOnNumerics="1" preserveOriginal="0" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>

Also add the fields that we need to the schema.xml


<field name="searchterm" type="text_suggest" indexed="true" stored="true" omitNorms="true"/>
<field name="recipes" type="text_suggest" indexed="true" stored="true" omitNorms="true" />

4. Add the new core in  extended wc-search.xml

Add the below line in the <_config:cores> tag

<_config:core catalog="0" indexName="Recipes" language="en_US" name="recipes" path="Recipes" serverName="AdvancedConfiguration_1" />

This has to be done in the wc-search.xml of both the Search and WC projects.

5. Clean, build, restart and publish. Hit the below URL and you should see the new core up and running.
http://localhost/solr/recipes/select?q=*:*
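
The same check can be done from SolrJ as well; a small sketch, assuming the core URL above:

SolrServer recipesCore = new HttpSolrServer("http://localhost/solr/recipes");
SolrPingResponse ping = recipesCore.ping(); // a status of 0 means the core is up
System.out.println("recipes core status: " + ping.getStatus());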

Population of the indexes in this custom SOLR core will be covered in the next blog post.




Saturday, January 23, 2016

Websphere Commerce SOLR tips


Introduction 

Websphere Commerce uses Apache Solr, a fast open-source Java search server, for its searches. We know that product and other data are stored in the database. DB queries are heavy and performance intensive, hence we can't keep querying the DB to get the results for every search.


As a solution, and to make our searches fast, commerce introduced Solr. Solr achieves fast search responses because instead of searching the text directly, it searches an index. Solr stores this index in a directory called index in the data directory. Apart from being fast, Solr provides the ability to change the relevancy of search results: by changing certain parameters we can improve the order in which the search results appear.

For example, if somebody searches for "pepsi", ideally the Pepsi drink must come at the top. 7up is also a product of Pepsi and will have the word "pepsi" in the manufacturer name. In this scenario we can say that a match for the search term in the product name is more relevant than a match in the manufacturer name. This way we can make sure that the correct product is displayed at the top; 7up will still be in the search result, but with a lower relevancy. We can also boost the products we want by changing the boost factor, change the sorting options, filter specific products/categories from the results, and much more.

Commerce reindexing
Commerce reindexing is the process of indexing the data from the DB into Solr. It consists of two processes:

Pre-process: Preprocesses the data to be stored/indexed in Solr and saves it into temporary tables.
Build-index: Indexes the data in the temporary tables into Solr.

Important files for reindexing 
Solrconfig.xml : This file has the Solr configurations like connection timeout, replication handling, max connections etc.
Schema.xml : This file has the definitions of the Solr fields.
pre-process xmls : These xmls have the SQLs to create the temporary tables and populate them as we need.
wc-data-config.xml : This has the query that fetches the data used to populate the Solr indexes, and also the mapping between the temporary table columns and the Solr field names.

SOLR Queries
We save fields in SOLR in a specific core. For product searches we mostly use MC_10001_CatalogEntry_en_US (where 10001 is the master catalog of the store). For an easy analogy, I am comparing SQL queries to SOLR queries.

If we want to get all the products from DB in a query we will write it as 
SELECT * FROM CATENTRY;

The corresponding solr implementation would be as below

http://localhost/solr/MC_10001_CatalogEntry_en_US/select?q=*:*

The q parameter is the main query of the request, equivalent to the search term. Here we pass q as *:*, which means the query will fetch every doc in Solr.

Now if we want to get all the product data that has "milk" in the product name, we will use the DB query

SELECT CT.* FROM CATENTRY CT, CATENTDESC CD WHERE CD.CATENTRY_ID = CT.CATENTRY_ID AND LOWER(CD.NAME ) LIKE '%milk%'
In Solr we can write the same query as below

http://localhost/solr/MC_10001_CatalogEntry_en_US/select?q=milk&fq=name:milk
Here "fq" stands for filter query. Filter queries are used to add conditions to Solr queries. CD.NAME in the above DB query is indexed as the field "name" in Solr. So the Solr query means: get all the docs from Solr which have the term milk in any field, and filter the results so that they have "milk" in the field "name". All the other results which might have "milk" in some other field are omitted.

Say now we want to select all the products which have milk in the product name and are in the Dairy category. The DB query will look like below. You can see that the query is becoming bigger and messier:
SELECT CT.* FROM CATENTRY CT, CATENTDESC CD , CATGPENREL CG, CATGRPDESC CGD WHERE CD.CATENTRY_ID = CT.CATENTRY_ID 
AND CG.CATENTRY_ID = CT.CATENTRY_ID
AND CGD.CATGROUP_ID = CG.CATGROUP_ID
AND LOWER(CD.NAME ) LIKE '%milk%'  
AND CGD.NAME LIKE '%Dairy%'

The solr query for the same would look like
http://localhost/solr/MC_10001_CatalogEntry_en_US/select?q=*&fq=name:milk&fq=categoryname:Dairy
Note : categoryname is the Solr field corresponding to the DB category name column.
It is just one additional filter query, and you get the results!

Let us make a slight modification to fetch the products in the Dairy as well as the Fresh category. The SQL will be
SELECT CT.* FROM CATENTRY CT, CATENTDESC CD , CATGPENREL CG, CATGRPDESC CGD WHERE CD.CATENTRY_ID = CT.CATENTRY_ID 
AND CG.CATENTRY_ID = CT.CATENTRY_ID
AND CGD.CATGROUP_ID = CG.CATGROUP_ID
AND LOWER(CD.NAME ) LIKE '%milk%'  
AND (CGD.NAME LIKE '%Dairy%' OR CGD.NAME LIKE '%Fresh%')

With Solr, the query is almost the same as above, with a slight change:

http://localhost/solr/MC_10001_CatalogEntry_en_US/select?q=*&fq=name:milk&fq=categoryname:(Dairy+Fresh)

Say now we want all the above results but without products from a specific manufacturer (say RanjithsMilk). The DB query would look like
SELECT CT.* FROM CATENTRY CT, CATENTDESC CD , CATGPENREL CG, CATGRPDESC CGD WHERE CD.CATENTRY_ID = CT.CATENTRY_ID 
AND CG.CATENTRY_ID = CT.CATENTRY_ID
AND CGD.CATGROUP_ID = CG.CATGROUP_ID
AND LOWER(CD.NAME ) LIKE '%milk%'  
AND (CGD.NAME LIKE '%Dairy%' OR CGD.NAME LIKE '%Fresh%')
AND CT.MFNAME NOT LIKE '%RanjithsMilk%'
This is really messy, as we now have a couple of LIKEs and one NOT LIKE in the same query. If we make a query for the above requirement in Solr, it would look like below

http://localhost/solr/MC_10001_CatalogEntry_en_US/select?q=*&fq=name:milk&fq=categoryname:(Dairy+Fresh)&fq=-mfName:RanjithsMilk

Note : mfName corresponds to the DB column CT.MFNAME, and the "-" prefix excludes matching results from the query.

Yes, it is just another filter query!
Looking at the SQL and the Solr query, it is pretty evident that the Solr queries are easy to write and fast to execute. This is a very simple example; a real search has much more going on, and that is where SOLR is really handy.

Paginated results : we can use the start (start index of the results) and rows (number of rows to fetch) parameters to get paginated results. The below query will fetch the first five results.

http://localhost/solr/MC_10001_CatalogEntry_en_US/select?q=*&fq=name:milk&fq=categoryname:(Dairy+Fresh)&fq=-mfName:Coles&start=0&rows=5

Field lists : By using the fl parameter we can specify the fields that we need in the response, which gets rid of unused data. The below query will return only name and categoryname in the results.

http://localhost/solr/MC_10001_CatalogEntry_en_US/select?q=*&fq=name:milk&fq=categoryname:(Dairy+Fresh)&fq=-mfName:Coles&start=0&rows=5&fl=name,categoryname
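
The same URL parameters map one-to-one onto SolrJ if you prefer to build the query in code; a small sketch mirroring the last query:

SolrServer solrServer = new HttpSolrServer("http://localhost/solr/MC_10001_CatalogEntry_en_US");
SolrQuery query = new SolrQuery("*:*");
query.addFilterQuery("name:milk");
query.addFilterQuery("categoryname:(Dairy Fresh)");
query.addFilterQuery("-mfName:Coles");    // "-" excludes the manufacturer
query.setStart(0);                        // pagination offset
query.setRows(5);                         // page size
query.setFields("name", "categoryname");  // fl parameter
QueryResponse response = solrServer.query(query);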


These are some basic tips for querying Solr. There is much more we can do, which can be covered in a different post.






Introduction Cache invalidation is one of the key and tedious part of software development. It is really important that we purge the inval...