mysql,ruby-on-rails,solr,sunspot
How are you accessing the result? If you are calling the .results method, it will fire a DB query. You should iterate over the hits and read the required field to avoid the DB query.
The "version 51" issue comes from not having Java 7. It might be that you installed/upgraded Java but did not set JAVA_HOME and PATH. Install Java 7 and point JAVA_HOME at it; that should resolve your issue. Please make sure that both...
Try with the below data-config. <dataConfig> <dataSource name="ds-db" type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/EDMS_Metadata" user="root" password="**************" /> <dataSource name="ds-file" type="BinFileDataSource"/> <document name="doc1"> <entity name="db-data" dataSource="ds-db" query="select TextContentURL,ID,Title,AuthorCreator from MasterIndex" > <field column="TextContentURL" name="TextContentURL" /> <field column="Title" name="title"...
I encountered this error in the past. Looking at my oai.cfg file, I used localhost for some settings and my public URL for others. solr.url=http://localhost/solr/oai # OAI persistent identifier prefix. # Format - oai:PREFIX:HANDLE identifier.prefix = repository.library.georgetown.edu # Base url for bitstreams bitstream.baseUrl = https://repository.library.georgetown.edu If you need to make...
When working with Solr directly, this is supported. I am not familiar with Solr.NET though, can anyone comment on whether this feature is supported by that client?
This number is the internal id of the document and doesn't affect the score; it's only debugging info. The Lucene mailing list gives this information.
I just rebooted the Solr core I'm working with. Thanks.
No, it's not possible as of now. It's an open feature request: https://issues.apache.org/jira/browse/SOLR-7242
I did not read the documentation well enough, and as a result I messed up the QuerySet part. foo.update_object(some) The above does add the object to the index; it's just that I was not searching for it properly. I was searching for the object after removing it in the following...
Response from the Solr mailing list: Once the SPLITSHARD call completes, it just marks the original shard as Inactive i.e. it no longer accepts requests. So yes, you would have to use DELETESHARD ( https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api7) to clean it up. As far as what you see on the admin UI, that...
Riak search (Solr) may seem a natural choice for at least some of your use cases. It supports sorting (by highest score) and pagination. However, according to Basho, Riak search should not be used when deep pagination is needed. It also does not scale well beyond 8-10 nodes. Also from...
solr,cluster-analysis,k-means,workbench,carrot
Your suspicion is correct, it is a heap size problem, or more precisely, a scalability constraint. Straight from the carrot2 FAQs: http://project.carrot2.org/faq.html#scalability How does Carrot2 clustering scale with respect to the number and length of documents? The most important characteristic of Carrot2 algorithms to keep in mind is that they...
I don't think there is any magic within Solr to figure out which specific fields you are missing. Certainly no such magic is described in the UpdateCSV API. From Solr's perspective, all it can deduce is that some fields are not there and throw an error about the length...
There is a Solr wiki entry on HighlightingParameters that you should read to get familiar with Solr and Highlighting at: https://wiki.apache.org/solr/HighlightingParameters Specifically, what you should consider is hl.snippets and hl.fragsize. To quote the important part from the wiki: hl.snippets The maximum number of highlighted snippets to generate per field. -...
Try changing to stored="true": <field name="text_qs" type="text" indexed="true" stored="true" multiValued="true"/> True if the value of the field should be retrievable during a search....
Looking at the error, it seems the apache-solr-cell jar and its dependencies are missing from the extraction lib in the Solr library. <lib dir="../../dist/" regex="apache-solr-cell-\d.*\.jar" /> <lib dir="../../contrib/extraction/lib" /> Add these files......
The basic requirement when copying is that the types should be compatible. Your event_id is of the custom_string type, which I assume is a normal string. However, text is of type text_fr, which has tokenizers and filters. You could make both fields custom_string, unless you have a specific requirement...
java,php,solr,lucene,solrcloud
When you try SolrCloud for the first time using the bin/solr -e cloud, the related configset gets uploaded to zookeeper automatically and is linked with the newly created collection. The below command would start SolrCloud with the default collection name (gettingstarted) and default configset (data_driven_schema_configs) uploaded and linked to it....
First of all, you have an fq (filter query) clause inside your query clause (check the parentheses), which is wrong. fq={!geofilt d=40.2335}&pt=9.9312328,76.26730409999999&sfield=latlng You can try things like putting the geofilt filter query OUTSIDE your main query, with tests, so it will be skipped if... http://www.example.com:8983/solr/collection1/select?rows=10&start=0&wt=json&indent=true&sort=event_start_date asc&q=status:1 AND event_start_date:[2015-04-23T21:38:00Z TO *] AND (tags:5539d77455061a650f96c67e...
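The corrected request can be assembled programmatically. The sketch below builds the query string with the geofilt clause as a separate fq parameter, outside the main q. The host and parameter values are taken from the answer above; the exact field names are this question's, not generic Solr defaults.

```python
from urllib.parse import urlencode

# Sketch: the geofilt clause lives in fq, OUTSIDE the main q parameter.
params = {
    "q": "status:1 AND event_start_date:[2015-04-23T21:38:00Z TO *]",
    "fq": "{!geofilt d=40.2335}",
    "pt": "9.9312328,76.26730409999999",
    "sfield": "latlng",
    "sort": "event_start_date asc",
    "wt": "json",
    "rows": "10",
}
url = "http://www.example.com:8983/solr/collection1/select?" + urlencode(params)
print(url)
```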
sql,database,web-applications,solr,nosql
I would spend a bit of time thinking about these tags. For example, are these tags going to be user generated or will you provide a few tags and let users select which ones they want? Will you need to search on tags based on text matches? For example if...
In case of a BooleanQuery, you can set the 'minimumShouldMatch' property. Here is the API link for more details: http://lucene.apache.org/core/5_1_0/core/org/apache/lucene/search/BooleanQuery.html#setMinimumNumberShouldMatch(int)
solr,faceted-search,hierarchical
You can get all of your aggregations to be populated if you push them into the index in stages. If Bob is from Norway, you might populate up to three values in your facet field: location location/Europe location/Europe/Norway (As an alternate design, you might have a hair color field separate...
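Populating the cumulative path values at index time can be sketched like this; the function name is hypothetical, but the output matches the three values listed above.

```python
def facet_paths(*parts):
    """Expand a hierarchy into the cumulative path values to index.

    ("location", "Europe", "Norway") yields all three levels, so every
    ancestor is populated in the facet counts.
    """
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]

print(facet_paths("location", "Europe", "Norway"))
```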
You're using a lot of clauses, so it is very hard to determine what could be the cause here. I can give you the following hints: a) Debug mode. Copy the whole query string, go to the Solr Admin console and execute that query with an additional debug=true or debugQuery=true....
If you just need to be able to open a connection to your Solr server for the indexing (and don't need to have your configuration files actually integrated with the SolrJ project), this is fairly simple to do. First, you'll need to open a connection SolrJ, which is done as...
Sorting or scoring is per document, not in a global context, so I don't think you can achieve this that way. However, you can use top_hits to achieve something similar, and it does work fine. { "query": {}, "aggs": { "perSupplier": { "terms": { "field":...
This seems to be a pretty old bug. According to Basho's Ryan Zezeski: At one time I fixed it but it had to be reverted because it broke rolling upgrade [1]. It has languished ever since. To work around it, explicitly put AND in the query, e.g. q=nickname:Ring%20AND%20breed:Shepherd And as he says,...
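The workaround above is just string assembly; a minimal helper (hypothetical name) that joins clauses with an explicit AND and URL-encodes them could look like:

```python
from urllib.parse import quote

def with_explicit_and(*clauses):
    # Join field:value clauses with an explicit AND, then URL-encode,
    # matching the q=nickname:Ring%20AND%20breed:Shepherd workaround above.
    return quote(" AND ".join(clauses), safe=":")

q = with_explicit_and("nickname:Ring", "breed:Shepherd")
print(q)
```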
solr,cassandra,datastax-enterprise
There is no way to index just a few rows. I agree that a parallel table (probably with TTL) is likely your best bet. Here are some (pretty effective) tactics to minimize the size of your DSE Search index. You can probably shrink it by ~50% if you're not using...
DisMax, by design, does not support all Lucene query syntax in its query parameter. From the documentation: This query parser supports an extremely simplified subset of the Lucene QueryParser syntax. Quotes can be used to group phrases, and +/- can be used to denote mandatory and optional clauses ... but...
indexing,solr,sitecore,sitecore7.2
I'm looking in the decompiled source of the Sitecore.ContentSearch assembly and it looks like the Refresh method eventually calls the RefreshTree method on the IndexCustodian class. The RefreshTree will create a new object array for the item to be indexed and will then iterate through all the available indices (even...
The Solr heap size must be set according to your workload. Setting -Xms=2G and -Xmx=12G is just a recommendation that fits lots of popular Solr applications, but it's not mandatory. You need to assess your requirements and size the heap to work well for you. I really recommend that you use at least...
Turns out I was overcomplicating the debugging. When Boot2Docker first starts up it prints the message: IP address of docker VM: 192.168.59.103 This IP allows you to access the SOLR instance. The command boot2docker ip also tells you what this IP is....
apache,solr,tomcat7,bigdata,solr-schema
I think in your earlier version of schema.xml you had a field type of pint, and in the current version it is not supported, as I don't see that fieldType in schema.xml (in the default one when I download Solr 5.1.0). Replace/remove it and the error will be corrected. This fieldType was...
IMHO, the best way to implement such functionality is as a SearchHandler that returns Banana "compatible" response. You should index the fields that you need to be searchable without storing them in Solr. The search handler should retrieve corresponding rows from HBase according to search results which would enable labeled...
Maybe you can use this highlighter: https://issues.apache.org/jira/browse/LUCENE-1522 The problem that you are pointing out is known and some patches are available: https://issues.apache.org/jira/browse/LUCENE-1489 Edit: The second link is the same that Bereng sent....
ruby-on-rails-4,solr,geospatial,sunspot-solr
I am really sorry to have opened this issue. I had created searches_controller_backup.rb, which had old code that was not configured with the geospatial filter functionality. Everything was working, but because Rails was loading the searches_controller code from the backup file, it looked like it...
OK, so I see that you are using solrphpclient. You need to make changes in the service.php file so that these special characters get replaced with a blank or whatever you want. This will take care of the problem you are facing: $params=str_replace("%", "", $params); $params=str_replace("*", "", $params); $params=str_replace("&", "",...
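For reference, the same sanitization the str_replace calls perform can be sketched language-neutrally; this Python version (hypothetical helper name) drops each special character before the query is sent to Solr:

```python
def strip_special_chars(params, chars=("%", "*", "&")):
    # Equivalent of the str_replace calls above: remove each special
    # character from the raw query string before sending it to Solr.
    for ch in chars:
        params = params.replace(ch, "")
    return params

print(strip_special_chars("foo%bar*baz&"))
```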
You already have a query that returns size per product? If you don't want your size query to affect the relevancy scores of your query, use a filter query (fq). This parameter can be used to specify a query that restricts the superset of...
Every facet's order is independent of the others, so if you store the id and the name in separate fields, you can't assume that both are going to be in the same row. Ordering by count will only line up if the result number is different for each one. (so...
oracle,apache,search,solr,lucene
Does Solr 100% require Lucene as a backend? Yes. Solr can't function without Lucene. It might be a standalone application, but it uses Lucene at its core. As to whether or not you can store the index in a database, this seems to suggest you can: http://stackoverflow.com/a/17371651/2039359 (which in...
solr,django-haystack,django-cms
Found the solution. Turns out the information I was looking for is in the Title model, NOT the Page model. The Title model allows you to access all the page titles, meta tags, menu titles, etc. So I just created a search index based off of that model and it...
java,json,solr,jersey,jersey-client
The wt param should take care of the JSON response format, as per this. However, things can go wrong sometimes; as mentioned, JSON responses can be returned as plain text after a change in solrconfig.xml. Please check that option as well. Hope this helps you identify the issue.
This is a known bug in Solr; even I have come across it! I posted this as an answer because this is a bug and there's no solution from the author. We actually downgraded the version in order to get rid of this bug. I am not sure if this...
The problem is that you are referencing a field type booleans that is not defined in your schema.xml file. When you create a core a file managed-schema is created in server\solr\my_collection\conf\. Rename this file to schema.xml and restart solr with ClassicIndexSchemaFactory and it will work fine.
solr,lucene,multicore,sharding,solrcloud
In SolrCloud each of your cores will become a Collection. Each Collection will have its own set of config files and data. You might find this helpful: Moving multi-core SOLR instance to cloud. Solr 5.0 (onwards) has made some changes on how to create a SolrCloud setup with shards, and...
java,solr,lucene,config,solrcloud
The issue is a classloader issue. I had added my custom jar into the server/lib/ folder. When I added the collection, it would instantiate my custom class, which needs the UpdateRequestProcessorFactory class, but that is not available in that classloader. I solved this by removing my jar from server/lib/ and adding...
The JSON module supports using the regular JSON array notation (from 3.2 onwards). If you're adding documents, there is no need for the "add" key either: [ { "id" : "MyTestDocument", "title" : "This is just a test" }, { "id" : "MyTestDocument2", "title" : "This is another test"...
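Building that payload is plain JSON serialization; a minimal sketch of producing the array-notation body shown above (no "add" wrapper) is:

```python
import json

# Documents go straight into a JSON array; no "add" key needed.
docs = [
    {"id": "MyTestDocument", "title": "This is just a test"},
    {"id": "MyTestDocument2", "title": "This is another test"},
]
payload = json.dumps(docs)
print(payload)
```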
It all depends on your requirements. You can do it both ways: by creating a core for each individual table, or by joining a couple of tables (consider 3-4 tables that are related to each other) and indexing the data into one core. I would suggest going with the latter, where...
From https://docs.lucidworks.com/display/lweug/Wildcard+Queries: The Lucid query parser will detect when leading wildcards are used and invoke the reversal filter, if present in the index analyzer, to reverse the wildcard term so that it will generate the proper query term that will match the reversed terms that are stored in the index...
It could be the result of the space in *THÔN 22*, and how it changes the query. If it's a single search term instead of two, try quoting it and see if you get the same results then.
Looks like your use case is a good fit for the DisMax/eDisMax query parser in Solr: https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser In particular: qf (Query Fields) specifies the fields in the index on which to perform the query. Comment: the problem with fq is that it doesn't undergo the same query pre-processing as the q parameter. However,...
Make sure the fields have stored="true": <field name="field_name" type="text_general" indexed="true" stored="true"/> True if the value of the field should be retrievable during a search. Use the default search field: the <defaultSearchField> is used by Solr when parsing queries to identify which field name should be searched in queries where an...
java,groovy,solr,spring-boot,criteria
One possible solution is to make the field in Solr multi-valued.
A forward index is what you are asking about: here in general, here in Solr.
You can't really guarantee a clear priority, as the fuzzy search will naturally match on more terms (Apple, Appl, App, Appla and so on). Just give it a high enough boost value that it will outscore the fuzzy search in everything but edge cases. The fuzzy search will also help...
I assume you're using Spring Data Solr, from your reference to SimpleFacetQuery. Based on your sample query, the code would look something like this: // creates query with facet SimpleFacetQuery query = new SimpleFacetQuery( new Criteria("lastName").startsWith("Harris")) .setFacetOptions(new FacetOptions() .addFacetOnField("state")); // filter parameters on query query.addFilterQuery(new SimpleFilterQuery( Criteria.where("filterQueryField").is("red"))); // using query...
The key to this is function queries. Assuming you use the eDisMax query parser, you can specify a boost function (bf) with this value: sub(product(answer_count,100),product(div(ms(NOW,created_at),3600000),5)) However, a function query influences scoring, and it rather looks like you don't want any scoring at all but instead a fixed sort order, so I guess...
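To see what that bf expression computes, here is a direct Python rendering (the function name is mine, not Solr's): 100 points per answer, minus 5 points per hour of document age.

```python
def boost(answer_count, age_ms):
    # sub(product(answer_count,100), product(div(ms(NOW,created_at),3600000),5))
    # ms(NOW,created_at) is the document age in milliseconds;
    # 3600000 ms = 1 hour.
    return answer_count * 100 - (age_ms / 3600000) * 5

# A two-hour-old question with 3 answers:
print(boost(3, 2 * 3600000))  # 290.0
```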
The ideal solution of course (from Solr perspective) would be to store the data in Solr in the denormalized form. However, if that is not a viable option, you could take a look at the Join query Parser. https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser You would be performing a query along the lines of (not...
solr,solrcloud,synonym,stop-words
In SolrCloud, configuration and other files, like stopwords, are stored and maintained by Zookeeper. That means you do not need to individually send updates to each server. Once you have SolrCloud, before putting in any data, you will create a collection. Each collection has its own set of resources/config folders....
The issue is with Solr 5. From version 5 on, Solr by default manages the schema and doesn't read it from schema.xml. As indexing starts, and the "path" field in my documents is an int, Solr analyzes it as an int, but when it comes to a document where "path"="0/6000" it throws a NumberFormatException and...
The valid characters for a core name appear to be undocumented. According to the source of org.apache.solr.core.CoreContainer#registerCore(String, SolrCore, boolean) in Solr 4.10.4, the only invalid characters are: forward slash (/) and backslash (\). The following characters are problematic, causing issues in the admin interface and when performing general queries: colon (:)...
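A small validator (hypothetical, encoding only the characters named above) makes the distinction concrete: slashes are rejected outright, while a colon is accepted but flagged as problematic.

```python
INVALID = {"/", "\\"}      # rejected by CoreContainer.registerCore
PROBLEMATIC = {":"}        # accepted, but cause admin-UI/query issues

def check_core_name(name):
    """Return (ok, warnings) for a candidate core name (sketch)."""
    ok = not any(ch in INVALID for ch in name)
    warnings = sorted(ch for ch in PROBLEMATIC if ch in name)
    return ok, warnings

print(check_core_name("my/core"))
print(check_core_name("shop:books"))
```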
solr,dataimporthandler,data-import
Finally got a solution. We need to modify the data-config.xml file, in which there are 3 different entities: 2 from MySQL and 1 from the Solr core itself. For MySQL, the entity should be something like this <entity name="entity name" dataSource="data source created in the file" query="SQL query to retrieve the data"...
I'm not saying that Drew's answer is incorrect, but I've found there is a more direct way to solve this problem. After a couple of days of searching and posting on the Lucene forums, I was able to come up with a pretty comprehensive answer to this question. If...
search,indexing,solr,levenshtein-distance
I wonder what keeps you from trying it with Solr, as Solr provides much of what you need. You can declare the field as type="string" multiValued="true" and save each list item as a value. Then, when querying, you specify each of the items in the list to look for as...
I eventually had to change my.cnf for the new MySQL server. For some reason the new one (also 5.5.43) closed the connection after a timeout. I changed the timeout settings in MySQL and now it indexes correctly, in about 21 minutes. I wish tomcat7 and Solr made that more clear...
Your goal is to achieve syntactically correct JSON data: no quotes for numbers and the strings "true" and "false". That's what your attempt at handling these data types tells me. In this case the following xsl:choose part should work, at least I hope so, since we don't know your input...
xml,tomcat,solr,lucene,xinclude
This can be triggered by other configuration options than solr.xml and solrconfig.xml - the exact error message seems to be produced by the Currency field, which require XInclude to load its list of currencies. While I'm not sure about the exact reason for this, my guess is that Tomcat bundles...
solr,typo3,typoscript,typo3-6.2.x
I assume that your type field in the Solr index only has 4 values: one for pages, one each for the two custom tables, and one for news. In order to get 6 facets, you need to have 6 different values in the field the faceting is done on. I'm not...
You can use the data import handler and set the "query" to point to your view. See example below: <dataConfig> <dataSource driver="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:/temp/example/ex" user="sa" /> <document name="products"> <entity name="feature" query="SELECT * FROM MyView"> <field column="attr_1" name="Attr1" /> <field column="attr_2" name="Attr2" /> <field column="attr_3" name="Attr3" /> </entity> ...
Try bin/post -h to see additional information, including the -params option. Then, you need to actually figure out the parameter name. Usually, it is something like literal.id=555, as, for example, in the ExtractingRequestHandler documentation....
So you are using a single core/collection, and the multitenancy is enforced by an fq=customer_id:A, right? Well, what about enforcing the multitenancy via one collection per customer? That way each one can have its own conf (including the elevate stuff). About your second question: I did not check, but probably...
java,apache,solr,lucene,autosuggest
Thanks for your answer @Dhanesh S Radhakrishnan. You are right, but then we are actually changing the AnalyzingInfixLookupFactory to FuzzyLookupFactory, which works, but our purpose is lost. Anyhow, I have found the solution: we need to add the indexPath of the analyzer in the implementation...
search,solr,lucene,full-text-search,hibernate-search
Score calculation is something really complex. Here, you have to begin with the primal equation: score(q,d) = coord(q,d) · queryNorm(q) · ∑ ( tf(t in d) · idf(t)² · t.getBoost() · norm(t,d) ) As you said, you have tf, which means term frequency, and its value is the square root of...
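The per-term pieces of that equation can be sketched numerically. This assumes the classic Lucene TF-IDF formulas (tf = sqrt of raw frequency, idf = 1 + ln(numDocs / (docFreq + 1))); boost and norm are left at 1.0 for simplicity.

```python
import math

def tf(freq):
    # term frequency component: square root of the raw frequency
    return math.sqrt(freq)

def idf(doc_freq, num_docs):
    # classic Lucene inverse document frequency
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def clause_score(freq, doc_freq, num_docs, boost=1.0, norm=1.0):
    # one clause of the sum above: tf(t in d) * idf(t)^2 * boost * norm
    return tf(freq) * idf(doc_freq, num_docs) ** 2 * boost * norm

print(clause_score(freq=4, doc_freq=0, num_docs=1))
```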
You just need to add this information in a field, documentPath for example, and fill it in at indexing time. If you're using DataImportHandler, just return the information in the query and index it in the new field. Create documentPath with the string type, for example.
django,apache,solr,django-haystack,solr-multy-valued-fields
Got to the root of the problem: solrconfig.xml wasn't configured correctly. By default the schemaFactory class is set to ManagedIndexSchemaFactory, which overrides the use of schema.xml. Changing the schemaFactory class to ClassicIndexSchemaFactory forces the use of schema.xml and makes the schema immutable via the API...
When Solr starts and the index folder doesn't exist, Solr creates it by itself. For the folder to be created by the tomcat user, this user needs permission to create it. If the tomcat user doesn't have permission to create the index folder, an exception is...
For both queries and updates, it's better if you route them to all 6 servers. For updates you might think it's better to route them to the leaders, but SolrCloud dynamically selects the leaders for each shard. So depending on the number of requests and other operations, leaders will be switched every...
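"Route to all 6 servers" can be as simple as client-side round robin; the hostnames below are placeholders. SolrCloud forwards each request internally to whichever leader or replica should handle it.

```python
import itertools

# Placeholder node addresses for a 6-node cluster.
servers = [f"http://solr{i}:8983/solr" for i in range(1, 7)]

# Cycle through all nodes instead of pinning to one.
rotation = itertools.cycle(servers)

def next_server():
    return next(rotation)

print(next_server())
```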
java,indexing,solr,lucene,full-text-search
Solr is a general-purpose highly-configurable search server. The Lucene code in Solr is tuned for general use, not specific use cases. Some tuning is possible in the configuration and the request syntax. Well-tuned Lucene code written for a specific use-case will always outperform Solr. The disadvantage is that you must...
apache,tomcat,amazon-web-services,solr,amazon-ec2
Add a rule in the appropriate security group to enable inbound traffic on port 8080. Type = Custom TCP Rule Protocol = TCP Port = 8080 Source Anywhere / 0.0.0.0/0 ...
I tried it... here it goes... the only difference is I tried it on Ubuntu. [email protected]:~/Downloads/solr-5.0.0$ bin/solr start -e cloud Welcome to the SolrCloud example! This interactive session will help you launch a SolrCloud cluster on your local workstation. To begin, how many Solr nodes would you like to run in your local...
solr,full-text-search,sunspot-solr
This may or may not be the issue in your case, but I think I know what's happening here. You didn't specify your mm (Minimum Should Match) value, which I suspect is set to at least "3" or "70%". (As an aside, in the future if you add the argument...
After some analysis, we went ahead and implemented the design with the NGHbas indexer. One argument is that we cannot guarantee the same data in HBase and Solr, as we cannot handle transactions at large scale. We also have a similar design for streaming data, so we made use of the setup.
solr,lucene,solr-multy-valued-fields
Create a copyField to copy the content of the multivalued data into a sorted, concatenated single value without commas, and use it for sorting. For example: Doc 1: multiValuedData: 11, 78, 45, 22 sortedConcatenatedSingleValue: 11224578 Doc 2: multiValuedData: 56, 74, 62, 10 sortedConcatenatedSingleValue: 10566274...
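The sort-key construction itself is trivial to sketch; this reproduces the sortedConcatenatedSingleValue examples above (the field name and helper are illustrative, and the concatenation would normally happen in a custom update processor or at ETL time).

```python
def sort_key(values):
    # Concatenate the numerically sorted values with no separators,
    # mirroring the sortedConcatenatedSingleValue examples above.
    return "".join(str(v) for v in sorted(values))

print(sort_key([11, 78, 45, 22]))   # 11224578
print(sort_key([56, 74, 62, 10]))   # 10566274
```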
Well, it seems you don't have write permissions on the disk. You should check if the OS user running your Solr instance is allowed to write on disk. Notice I don't know anything about GCE, just check if you have options for managing permissions on the file system in an...
The version string is specified in Lucene's solr-4.10.4/lucene/common-build.xml. In it you will find four version strings: <!-- The base version of the next release (including bugfix number, e.g., x.y.z+): --> <property name="version.base" value="4.10.4"/> ... <!--TODO: remove once Jenkins jobs are updated:--><property name="dev.version.suffix" value="SNAPSHOT"/> <!-- Suffix of the version, by default...
/DAY simply means: use 00:00:00 of that day. Without /DAY, it would be the current time minus 1 year. For the upper boundary, the NOW/DAY+1DAY means: use today, 00:00:00 and add 1 day, which results in tomorrow, 00:00:00. With /YEAR, it is basically the same: it goes back to January,...
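The rounding described above can be mimicked in plain Python to check one's intuition; this is an illustration of the semantics, not Solr's own date-math implementation.

```python
from datetime import datetime, timedelta

def floor_day(dt):
    # /DAY: truncate to 00:00:00 of the same day
    return dt.replace(hour=0, minute=0, second=0, microsecond=0)

now = datetime(2015, 6, 15, 14, 30, 5)

# NOW-1YEAR/DAY: go back one year, then round down to midnight.
lower = floor_day(now.replace(year=now.year - 1))

# NOW/DAY+1DAY: today at 00:00:00 plus one day, i.e. tomorrow 00:00:00.
upper = floor_day(now) + timedelta(days=1)

print(lower, upper)
```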
mysql,oracle,solr,dataimporthandler
I checked out the source code for Solr and tried to solve my issue. I have a fix for it and it's working for me. The variable resolution in the case of a date somehow produces an array, so it appends '[?, '28/05/2015 11:13:50']'. In TemplateString.java, in the method...
You need to change text_general on the publisher field, which uses WhitespaceTokenizerFactory, meaning it splits phrases/strings into chunks whenever it encounters whitespace. <field name="publisher" type="text_general" indexed="true" stored="true"/> So Cambridge University Press is divided into Cambridge, University, Press. Either remove that tokenizer or use another fieldType that doesn't use WhitespaceTokenizerFactory. You can...
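A rough Python analogue of what WhitespaceTokenizerFactory does makes the splitting visible (it splits on whitespace only, with no lowercasing or other filtering):

```python
def whitespace_tokenize(text):
    # Rough analogue of WhitespaceTokenizerFactory: split on whitespace
    # only; no lowercasing, stemming, or other filtering.
    return text.split()

print(whitespace_tokenize("Cambridge University Press"))
```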
solr,typo3,extbase,typo3-6.2.x,fal
Solution found:
index {
  queue {
    tx_myextension = 1
    tx_myextension {
      fields {
        ...
        bild_stringS = FILES
        bild_stringS {
          references {
            table = tx_myextension_model_name
            uid.data = field:uid
            fieldName = artikelbild
          }
          renderObj = TEXT
          renderObj {
            stdWrap.data = file:current:publicUrl
            stdWrap.wrap = |
          }
        }
      }
    }
  }
}
This way I get the URL,...
Well, there was something stupid I ended up doing because of which things didn't work. The above steps surely worked for me, except that when I reloaded the core, I did it using the LB VIP and not each individual machine (!). Doing that solved my problem....
It's a bug that occurs only in Solr 5.1. There is a resolved issue that fixed this bug in Solr 5.2: https://issues.apache.org/jira/browse/SOLR-7454 The problem is that Solr 5.1 doesn't use SOLR_JAVA_MEM to set the heap size. It uses SOLR_HEAP to set the min and max heap sizes to the same value...
java,indexing,solr,lucene,solrj
Here in your code it still points to core1: HttpSolrClient solrClient = new HttpSolrClient("http://localhost:8983/solr/core1" If you want to have the indexes for core2, you need to change it here: HttpSolrClient solrClient = new HttpSolrClient("http://localhost:8983/solr/core2" After this change, rerun the job and it will index into core2....
ruby-on-rails,solr,sunspot-rails
I figured out how to index multilingual documents in a single Solr instance and search the indexed documents by a specified language from sunspot/rails. This method uses different fields instead of cores for the different languages, so it is not a direct answer to my question, but a working example to...