
Apache Solr Overview for Beginners

Published on 24 October 2017

Leviya

Solr is an open-source search platform used to build search applications. It is built on top of Lucene, a full-text search engine library. Solr is enterprise-ready, fast and highly scalable. The applications built using Solr are sophisticated and deliver high performance.

Yonik Seeley created Solr in 2004 to add search capabilities to the company website of CNET Networks. In January 2006, it became an open-source project under the Apache Software Foundation. Solr 6.0, released in 2016, added support for executing parallel SQL queries.

Why Solr?

It isn’t really feasible to execute blazing-fast search queries on very large SQL databases, for two reasons. The first is that SQL databases favor lack of redundancy over performance: because the data is normalized across several tables, you need JOINs in your SELECT. The second is the nature of the data in documents: it is essentially unstructured plain text, so the SELECT would need LIKE. Both JOINs and LIKEs are performance killers, so this approach is a no-go for real-life search engines.
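To make the problem concrete, here is a minimal sketch of the kind of query described above, using Python's built-in sqlite3 module. The tables, columns and sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical normalized schema: authors and books live in separate tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES authors(id),
        title TEXT,
        content TEXT
    );
    INSERT INTO authors VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO books VALUES
        (1, 1, 'Search Basics', 'An inverted index maps terms to documents'),
        (2, 2, 'Cooking', 'Boil the pasta for ten minutes');
""")


def search_books(conn, word):
    """Find books whose content contains `word`, along with the author's name."""
    # The JOIN is needed because the data is normalized. The leading wildcard
    # in the LIKE pattern typically forces the database to scan every row.
    rows = conn.execute(
        """
        SELECT books.title, authors.name
        FROM books
        JOIN authors ON authors.id = books.author_id
        WHERE books.content LIKE ?
        """,
        (f"%{word}%",),
    )
    return rows.fetchall()


results = search_books(conn, "index")
print(results)
```

With two rows this is instant, but the same query shape has to read the full text of every book as the table grows.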

Therefore, most search engines propose a way to look at data that is very different from SQL: the inverted index. This kind of data structure is a glorified dictionary where:

keys are individual terms

values are lists of the documents that contain the term

Nothing fancy, but this view of data makes for very fast searches in very high-volume databases. Note that the term ‘document’ is used very loosely: it should be understood as a field-structured view of the original document (see below).
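The dictionary described above can be sketched in a few lines of Python. This is a simplified illustration, not how Lucene stores its index: the tokenization is a naive lowercase-and-split, and the function names and sample documents are made up for the example.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each term to the set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        # Naive tokenization: lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, term):
    """Return the sorted ids of the documents that contain the term."""
    return sorted(index.get(term.lower(), set()))


documents = {
    1: "Solr is built on top of Lucene",
    2: "Lucene is a full text search library",
    3: "SQL databases use joins",
}
index = build_inverted_index(documents)
print(search(index, "lucene"))
print(search(index, "sql"))
```

A lookup is a single dictionary access on the term, so its cost does not depend on reading the text of every document, unlike a LIKE scan.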

Index structure

Though Solr belongs to the NoSQL database family, it is not schemaless. Schema configuration takes place in a dedicated schema.xml file: each field must be defined, along with its type. Different document types may differ in structure and have few, if any, fields in common. In that case, each document type may be given its own index with its own schema.

Predefined types such as strings, integers and dates are available out of the box. Fields can be declared searchable (called indexed) and/or stored (returned in query results). For example, a book document could include not only its content, but also its author(s), publisher(s), date of publication, etc.
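As a rough illustration of the book example, the sketch below embeds a hypothetical schema.xml fragment and reads it with Python's standard library to list which fields are searchable and which are returned. The field names are invented, and the type names (string, text_general, date) are assumed to be declared elsewhere in the schema as fieldType entries, which are omitted here.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema.xml fragment for a "books" index (fieldType entries omitted).
SCHEMA_FRAGMENT = """
<schema name="books" version="1.6">
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <field name="content" type="text_general" indexed="true" stored="false"/>
  <field name="author" type="string" indexed="true" stored="true" multiValued="true"/>
  <field name="publisher" type="string" indexed="false" stored="true"/>
  <field name="published_on" type="date" indexed="true" stored="true"/>
  <uniqueKey>id</uniqueKey>
</schema>
"""


def field_flags(schema_xml):
    """Return {field name: type and indexed/stored flags} for each <field>."""
    root = ET.fromstring(schema_xml.strip())
    flags = {}
    for field in root.findall("field"):
        flags[field.get("name")] = {
            "type": field.get("type"),
            "indexed": field.get("indexed") == "true",
            "stored": field.get("stored") == "true",
        }
    return flags


flags = field_flags(SCHEMA_FRAGMENT)
searchable = [name for name, f in flags.items() if f["indexed"]]
returned = [name for name, f in flags.items() if f["stored"]]
print("searchable:", searchable)
print("returned:", returned)
```

In this sketch the book's content can be searched but is not returned in results, while the publisher is returned but cannot be searched.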

