Full-text indexing in Mongodb is slower than regular

  mongodb, question

Version:

version:
3.4.3

Total Data Volume Million Level
To create a full-text index:

Db.tests.createindex ({a’:’ text’}) # a field has a value of more than 1024 characters

Query

db.tests.find({‘$text’:{‘$search’:’ere’}).explain(‘executionStats’)

"executionStats" : {
 "executionSuccess" : true,
 "nReturned" : 80018,
 "executionTimeMillis" : 306877,
 "totalKeysExamined" : 83546,
 "totalDocsExamined" : 83546,
 "executionStages" : {
 "stage" : "TEXT",
 "nReturned" : 80018,
 "executionTimeMillisEstimate" : 306699,
 "works" : 167095,
 "advanced" : 80018,
 "needTime" : 87076,
 "needYield" : 0,
 "saveState" : 16525,
 "restoreState" : 16525,
 "isEOF" : 1,
 "invalidates" : 0,
 "indexPrefix" : {
 
 },
 "indexName" : "a_text",
 "parsedTextQuery" : {
 "terms" : [
 "ii"
 ],
 "negatedTerms" : [ ],
 "phrases" : [
 "ere"
 ],
 "negatedPhrases" : [ ]
 },
 "textIndexVersion" : 3,
 "inputStage" : {
 "stage" : "TEXT_MATCH",
 "nReturned" : 80018,
 "executionTimeMillisEstimate" : 306649,
 "works" : 167095,
 "advanced" : 80018,
 "needTime" : 87076,
 "needYield" : 0,
 "saveState" : 16525,
 "restoreState" : 16525,
 "isEOF" : 1,
 "invalidates" : 0,
 "docsRejected" : 3528,
 "inputStage" : {
 "stage" : "TEXT_OR",
 "nReturned" : 83546,
 "executionTimeMillisEstimate" : 305932,
 "works" : 167095,
 "advanced" : 83546,
 "needTime" : 83548,
 "needYield" : 0,
 "saveState" : 16525,
 "restoreState" : 16525,
 "isEOF" : 1,
 "invalidates" : 0,
 "docsExamined" : 83546,
 "inputStage" : {
 "stage" : "IXSCAN",
 "nReturned" : 83546,
 "executionTimeMillisEstimate" : 1103,
 "works" : 83547,
 "advanced" : 83546,
 "needTime" : 0,
 "needYield" : 0,
 "saveState" : 16525,
 "restoreState" : 16525,
 "isEOF" : 1,
 "invalidates" : 0,
 "keyPattern" : {
 "_fts" : "text",
 "_ftsx" : 1
 },
 "indexName" : "a_text",
 "isMultiKey" : true,
 "isUnique" : false,
 "isSparse" : false,
 "isPartial" : false,
 "indexVersion" : 2,
 "direction" : "backward",
 "indexBounds" : {
 
 },
 "keysExamined" : 83546,
 "seeks" : 1,
 "dupsTested" : 83546,
 "dupsDropped" : 0,
 "seenInvalidated" : 0
 }

I don’t quite understand why we need to do the operations of TEXT_OR and TEXT_MATCH.

Ideally, full-text indexing should be faster than regular. Otherwise, what else would full-text indexing do, or the above query criteria using regular (whether or not there is a b_text index, to avoid hot data restarting mongo) are the same results

db.tests.find({‘a’:{‘$regex’:’ere’,’$options’:’i’}}).explain(‘executionStats’)

"executionStats" : {
 "executionSuccess" : true,
 "nReturned" : 81319,
 "executionTimeMillis" : 101701,
 "totalKeysExamined" : 0,
 "totalDocsExamined" : 4256954123,
 "executionStages" : {
 "stage" : "COLLSCAN",
 "filter" : {
 "a" : {
 "$regex" : "ere",
 "$options" : "i"
 }
 },
 "nReturned" : 81319,
 "executionTimeMillisEstimate" : 101391,
 "works" : 4256956,
 "advanced" : 81319,
 "needTime" : 4175636,
 "needYield" : 0,
 "saveState" : 33964,
 "restoreState" : 33964,
 "isEOF" : 1,
 "invalidates" : 0,
 "direction" : "forward",
 "docsExamined" : 4256954
 }

Did I open it the wrong way? Please give me some advice. Thank you

1. text _ or appears in text search, and text _ match is normal.

2. Because your regex query is not sensitive to case, it does not use indexes, instead it is a full table scan.

Therefore, the results of your comparison need to be further looked at:

1. Size of text Index;

2. the size of the collection;

If the size of the text index is larger than that of the collection, it may not be suitable to use text index here.

In addition, the document mentions that

If you specify a language value of “none”, then the text search uses
simple tokenization with no list of stop words and no stemming.

The’ ere’ you are inquiring about is meaningless, try setting $language: none, turning off stemming, etc. to see if the performance is better.

For reference.

Love MongoDB! Have fun!