最近Mongodb会经常突然挂掉,检查日志发现如下的错误:
tcmalloc: large alloc 2061584302080 bytes == (nil) @
Tue Nov 26 17:45:04.539 out of memory, printing stack and exiting:
0xdddd81 0x6cfb4e 0x121021d 0xafcc1f 0xaf815f 0xaf8d1d 0xaf8e0f 0xaf52ae 0xaf53c9 0xb1eb11 0x8ab6a2 0x8d78ca 0x8d951d 0x8daa72 0xa80970 0xa8523c 0x9f9079 0x9fa5a3 0x6e8b88 0xdca34e
./mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdddd81]
./mongod(_ZN5mongo14my_new_handlerEv+0x3e) [0x6cfb4e]
./mongod(_Znwm+0x6d) [0x121021d]
./mongod(_ZNSt6vectorIN5mongo18DocumentSourceSort9KeyAndDocESaIS2_EE7reserveEm+0x6f) [0xafcc1f]
./mongod(_ZN5mongo18DocumentSourceSort12populateTopKEv+0x6f) [0xaf815f]
./mongod(_ZN5mongo18DocumentSourceSort8populateEv+0x2d) [0xaf8d1d]
./mongod(_ZN5mongo18DocumentSourceSort3eofEv+0xf) [0xaf8e0f]
./mongod(_ZN5mongo18DocumentSourceSkip7skipperEv+0x6e) [0xaf52ae]
./mongod(_ZN5mongo18DocumentSourceSkip3eofEv+0x9) [0xaf53c9]
./mongod(_ZN5mongo8Pipeline3runERNS_14BSONObjBuilderERSs+0x1b1) [0xb1eb11]
./mongod(_ZN5mongo15PipelineCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x132) [0x8ab6a2]
./mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0x8d78ca]
./mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x71d) [0x8d951d]
./mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x5f2) [0x8daa72]
./mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x40) [0xa80970]
./mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0xd7c) [0xa8523c]
./mongod() [0x9f9079]
./mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x383) [0x9fa5a3]
./mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x98) [0x6e8b88]
./mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x42e) [0xdca34e]
内存溢出了,一开始我以为是有些排序没有加索引,后来一些地方加了索引后还是会出现,细想了下,如果是没加索引的话,是不会让整个Mongodb宕机的。
后来在Mongodb的issue上查到了这样一条提交的bug清单SERVER-10136(https://jira.mongodb.org/browse/SERVER-10136)。
原来aggregation如果传递一个$skip特别大的值的时候,就会内存溢出。我看到这个bug已经被修复了,不过是在2.5.2版本,最新的稳定版是2.4.8。所以我们需要在自己的应用程序里面控制,让$skip的值不要超出总长。