Driven by Big Data – Beyond Enterprise Search
Enterprise Search is yet another rapidly growing ecosystem. Combining state-of-the-art technologies has redefined what “search” means to an organization and its global customers. As one of the prominent themes in the services business, a mature knowledge management architecture has always been in heavy demand. We are now looking beyond the traditional criteria of scalability, low latency, cost optimization, and variety. The ask is for seamless migration, agile custom API integrations, and, most importantly, the ability to produce results that are not just a list of matching keywords and URLs but “Valuable Insights” into the existing information. It is also about delivering these results through next-generation user interfaces that are intuitive in design.
This blog is my perspective on where this platform is heading and how open source technologies are legitimate challengers in this space.
Let’s start with the term I quoted earlier – “Valuable Insights.” What does this mean to us? What makes it different?
Russell Ackoff defines knowledge as a collection of information; adding context around that information brings in understanding and wisdom. This progression in the maturity of information has helped me see where modern knowledge management systems may be heading. For example, a system could suggest subject-matter experts (SMEs) when a document is searched for on the intranet, or display sales trends for similar previous years (PYs) when someone searches for “Q4 2016”. This also includes capabilities like natural language processing and understanding (NLP/NLU) models, auto-classification of information (entity extraction), deep learning over existing data to identify authors and experts, and appropriate data tagging to support search results. It further includes the ability to combine internal organizational data with external data points like social media streams and websites.
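To make the SME suggestion idea above a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the documents, the `authors` field, and the helper names `search` and `suggest_experts` are all hypothetical, and counting authors across matching documents is a deliberately naive stand-in for the NLP and learning-based approaches mentioned above.

```python
from collections import Counter

# Hypothetical intranet documents; the field names are assumptions for illustration.
DOCUMENTS = [
    {"text": "Q4 2016 sales forecast", "authors": ["asha", "ben"]},
    {"text": "Q4 2016 sales retrospective", "authors": ["asha"]},
    {"text": "Hiring plan 2017", "authors": ["chen"]},
]

def search(documents, query):
    """Naive keyword search: keep documents whose text contains every query term."""
    terms = query.lower().split()
    return [d for d in documents if all(t in d["text"].lower() for t in terms)]

def suggest_experts(hits, top_n=2):
    """Rank authors by how many of the matching documents they wrote."""
    counts = Counter(author for d in hits for author in d["authors"])
    return [author for author, _ in counts.most_common(top_n)]

hits = search(DOCUMENTS, "Q4 2016")
experts = suggest_experts(hits, top_n=1)
print(len(hits), experts)
```

The point of the sketch is that the “insight” (who to talk to) is derived from the result set rather than returned as another matching URL.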
I am sure the immediate need and its realized benefits are easy to justify. The question is, how do we get there?
You will see a three-step process alongside this blog that helps define a go-forward path and may lead to a shortlist of a few of the leading competitors. It is based primarily on the types of customer engagements that are trending, newer solution partners and providers, and customer-specific requirements. Although most of the features are available out of the box, there will be a few nuances around version compatibility, the choice of underlying storage, ease of monitoring, and security features such as Active Directory (AD) integration and encryption. These are the areas where you will want to perform detailed evaluations and proofs of concept.
Referring to the Magic Quadrant for Enterprise Search, most vendor strengths are in cloud compatibility with platforms such as Amazon, Azure, SharePoint Online, and Salesforce, whereas the gaps are in user experience and in support for a few features. However, there are solutions outside this group as well, and one of them stands out for its ability to become an end-to-end platform that can cater to a wide variety of use cases: Elasticsearch.
Built on Apache Lucene, Elasticsearch is a complete indexing, storage, and retrieval engine that offers a JSON document store with schema-less indexing and REST APIs. The solution has seen tremendous acceptance across industry verticals. Typical usage includes log analytics using Logstash, connectors for real-time streaming systems like Fluentd and Confluent, and connectors for NoSQL datastores such as MongoDB, HBase, and many others. We also see rapid growth in Node.js-based UI layers for data federation, and in services like Raigad to monitor and auto-provision cloud-based deployments. Some notable solution providers, such as SearchBlox and BA Insight, integrate well with Elasticsearch, making it easier to tap into the potential of this platform. One of the interesting use cases I have seen combines Elasticsearch with the MapR Converged Platform, which provides continuously computed materialized views on GraphDB for use by Elasticsearch. The system was hosted in two different regions because of regulatory constraints on customer data, with shareable data available to both data centers.
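As a rough illustration of the JSON store and REST API point, the Python sketch below builds the HTTP method, path, and JSON body for indexing a document and for running a simple `match` query. The index name, document fields, and helper function names are assumptions made up for this example. The `_doc` and `_search` paths follow recent Elasticsearch versions (older versions used a type name in the path), so check them against the version you deploy. Nothing is sent over the network here.

```python
import json

def build_index_request(index, doc_id, document):
    """Return (method, path, body) for indexing one JSON document."""
    return "PUT", f"/{index}/_doc/{doc_id}", json.dumps(document)

def build_search_request(index, field, text, size=10):
    """Return (method, path, body) for a full-text match query on one field."""
    body = {"size": size, "query": {"match": {field: text}}}
    return "POST", f"/{index}/_search", json.dumps(body)

# Hypothetical document: no schema is declared up front, the fields travel as JSON.
doc = {"title": "Q4 2016 sales review", "author": "jdoe", "tags": ["sales", "quarterly"]}

index_method, index_path, index_body = build_index_request("intranet-docs", "1", doc)
search_method, search_path, search_body = build_search_request("intranet-docs", "title", "Q4 2016")

print(index_method, index_path)
print(search_method, search_path, search_body)
```

Any HTTP client can send these requests to a cluster. Keeping the request construction separate from the transport makes it easy to review and test before pointing it at a real deployment.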
Although there are multiple solutions on the market, the need to build a custom, self-learning search engine that comes close to being cognitive is now real. It is achievable with a robust underlying platform that allows enough customization to turn a search portal into an invaluable tool for business decisions and strategy forecasting.