I am using the following Boost multi_index_container:
    typedef multi_index_container<PositionSummary*,
            indexed_by<
                    ordered_unique<
                            composite_key<PositionSummary, const_mem_fun<PositionSummary, int, &PositionSummary::positiondate>,
                                    const_mem_fun<PositionSummary, const std::string&, &PositionSummary::accountid>,
                                    const_mem_fun<PositionSummary, const std::string&, &PositionSummary::instid> > >,
                    ordered_unique<
                            composite_key<PositionSummary, const_mem_fun<PositionSummary, int, &PositionSummary::positiondate>,
                                    const_mem_fun<PositionSummary, const std::string&, &PositionSummary::instid>,
                                    const_mem_fun<PositionSummary, const std::string&, &PositionSummary::accountid> > >
                        > > PositionSummaryContainer;
I then insert 10,000 instruments * 36 accounts * 100 days = 36 million records:
 
    // Begin testing of the multi_index_container
    std::cout << "Begin inserting data from array into the multi_index_container" << std::endl;
    timer.reset();
    timer.begin();
    for (int i = 0; i < numOfDays_; i++)
    {
        for (int j = 0; j < accountSize_; j++)
        {
            for (int k = 0; k < instSize_; k++)
            {
                // psArray_ is a flat array laid out as [day][account][inst]
                PositionSummary* ps = psArray_[(i * accountSize_ + j) * instSize_ + k];
                uniqueIndex.insert(ps);
            }
        }
    }
    printMemoryUsage();
    timer.end();
    std::cout << "Time taken is " << timer.getInterval() << std::endl;
I found the insertion speed to be rather slow, about 20K+ records per second. Is there any way to speed up insertion?
My data comes from Oracle, where it is properly indexed, so there should be no danger of corrupting the data structure. I know that in Oracle you can load the data first and build the index afterwards to save time. Is there a way to do the same with multi_index_container?
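For illustration, here is a minimal sketch of one idea that might apply: when records arrive already sorted by the key, a hinted insert at `end()` lets an ordered container skip the tree search. The sketch uses `std::set` as a stand-in, not the container from the question, and `Key`, `makeSortedKeys` and `loadWithHint` are hypothetical names made up for this example. Boost.MultiIndex ordered indices also have a hinted `insert(position, x)` overload, but the hint only applies to the index it is called on, so the second index would still do a normal lookup and any gain is untested here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the (positiondate, accountid, instid) composite key.
using Key = std::tuple<int, std::string, std::string>;

// Builds keys in ascending (date, account, inst) order, mirroring the
// nesting of the insertion loop. The IDs are made up and have equal
// length, so their lexicographic order matches their numeric order.
std::vector<Key> makeSortedKeys(int days, int accounts, int insts)
{
    std::vector<Key> keys;
    keys.reserve(static_cast<std::size_t>(days) * accounts * insts);
    for (int d = 0; d < days; ++d)
        for (int a = 0; a < accounts; ++a)
            for (int k = 0; k < insts; ++k)
                keys.emplace_back(d,
                                  "A" + std::to_string(100 + a),
                                  "I" + std::to_string(10000 + k));
    return keys;
}

// Inserts pre-sorted keys, passing end() as the hint each time. When the
// new element really belongs at the end, std::set can place it in
// amortized constant time instead of searching the tree.
std::set<Key> loadWithHint(const std::vector<Key>& sortedKeys)
{
    // The speed-up relies on sorted input; the result is correct either way.
    assert(std::is_sorted(sortedKeys.begin(), sortedKeys.end()));

    std::set<Key> index;
    for (const Key& key : sortedKeys)
        index.insert(index.end(), key);
    return index;
}
```

A call such as `loadWithHint(makeSortedKeys(100, 36, 10000))` would reproduce the 36 million key layout described above. Whether the same pattern helps with two composite-key indices would need to be measured.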
By the way, the parallel query speed is quite satisfactory: querying all 36 million records on a 4-CPU (8-core) machine takes only 2.8 seconds. The code is below:
    #pragma omp parallel for collapse(2)
     for (int i = 0; i < numOfDays_; i++)
     {
   
      for (int j = 0; j < accountSize_; j++)
      {
       const int& date = dates_[i];
       const std::string& accountID = accountIDs_[j];
       for (int k = 0; k < instSize_; k++)
       {
        const std::string& instID = instIDs_[k];  // index by k (instrument), not i (day)
        PositionSummaryContainer::iterator it = uniqueIndex.find(boost::make_tuple(date, accountID, instID));
        if (it != uniqueIndex.end())
        {
    #pragma omp atomic
         sum2 += (*it)->marketvalue();
        }
       }
       //std::cout << "accountID: " << accountID << std::endl;
      }
   
     }
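As a side note on the accumulation in the loop above, an OpenMP `reduction` clause is a common alternative to `#pragma omp atomic`: each thread keeps a private partial sum and the partial sums are combined once at the end. The following is a minimal sketch of that pattern only. It uses a `std::map<int, double>` as a stand-in for the real container, and `sumFound` is a hypothetical name.

```cpp
#include <map>
#include <vector>

// Sums the values of all keys that are present in `positions`.
// Concurrent read-only find() calls on a std::map are safe as long as
// no thread modifies the map at the same time.
double sumFound(const std::map<int, double>& positions,
                const std::vector<int>& keys)
{
    double sum = 0.0;
    const long n = static_cast<long>(keys.size());

    // Each thread accumulates into its own private copy of `sum`;
    // OpenMP adds the copies together when the loop finishes.
    // Without OpenMP enabled the pragma is ignored and the loop runs serially.
#pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i)
    {
        std::map<int, double>::const_iterator it = positions.find(keys[i]);
        if (it != positions.end())
            sum += it->second;
    }
    return sum;
}
```

With floating-point values the result can differ slightly from a serial sum because the order of additions changes. The same is true of the atomic version.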