Chunk Splitting in MongoDB

This post walks through how chunks are split in MongoDB, both automatically and by hand. Go through it step by step and the topic should be well under your control by the end.

1. Chunks are not physical data

  • Chunks are not physical data.
  • A chunk is a logical grouping (partition) of documents by shard key range.
  • Chunks are described by the cluster metadata.
  • When you split a chunk, no change is made to the actual data. You change only the metadata that represents the real data.
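To make the "metadata only" point concrete, here is a minimal sketch in plain JavaScript. It models chunks as range descriptors and splits one of them. The `min`/`max`/`shard` shape and the `splitChunk` helper are assumptions for illustration only, not MongoDB's real config schema or API.

```javascript
// Hypothetical in-memory model of chunk metadata (not the real config schema).
const chunks = [{ min: 0, max: 1000, shard: "shard0000" }];

// Split the chunk whose range strictly contains `point` into two ranges.
// Only the range descriptors change; no documents are touched or moved.
function splitChunk(chunkList, point) {
  const i = chunkList.findIndex((c) => c.min < point && point < c.max);
  if (i === -1) return chunkList; // point is on a boundary or out of range
  const c = chunkList[i];
  const left = { min: c.min, max: point, shard: c.shard };
  const right = { min: point, max: c.max, shard: c.shard };
  return [...chunkList.slice(0, i), left, right, ...chunkList.slice(i + 1)];
}

const afterSplit = splitChunk(chunks, 500);
console.log(afterSplit);
```

Both halves stay on the same shard after the split, which is why a split on its own does not rebalance anything.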

2. Chunk splitting algorithm

  • Automatic splitting is heuristic: mongos tracks the writes made to each chunk.
  • When the writes to a chunk reach roughly 20% of the maximum chunk size (about 12-13 MB with a 64 MB maximum), mongos checks whether the chunk should be split.
  • It runs splitVector on the shard primary to ask for possible split points on the shard key.
  • The primary returns a list of split points.
  • The metadata is updated to reflect the split.
  • No data has moved and no data has changed.
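The threshold arithmetic and the write-tracking idea can be sketched as follows. This is a toy model under stated assumptions: `makeWriteTracker` and `recordWrite` are hypothetical names, and the real mongos bookkeeping is more involved than a single counter per chunk.

```javascript
const MB = 1024 * 1024;
const maxChunkSizeBytes = 64 * MB; // 64 MB maximum chunk size, as used in this post
const splitCheckThreshold = 0.2 * maxChunkSizeBytes; // roughly 12.8 MB

// Hypothetical tracker: accumulates bytes written per chunk and reports
// when the total crosses the threshold, then resets that chunk's counter.
function makeWriteTracker(thresholdBytes) {
  const written = new Map();
  return function recordWrite(chunkId, bytes) {
    const total = (written.get(chunkId) || 0) + bytes;
    if (total >= thresholdBytes) {
      written.set(chunkId, 0);
      return true; // time to ask the shard primary for split points
    }
    written.set(chunkId, total);
    return false;
  };
}

const recordWrite = makeWriteTracker(splitCheckThreshold);
console.log((splitCheckThreshold / MB).toFixed(1) + " MB");
```

A `true` return here stands in for the step where the shard primary is asked for split points. Crossing the threshold only triggers a check, it does not by itself guarantee a split.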

3. Manual chunk splitting

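  • sh.splitFind("dbname.collectionname", {key : ….}) splits the chunk that contains the matching document at its median point. sh.splitAt() takes the same arguments and splits at exactly the shard key value you give it.
  • To turn autosplit off, start mongos with --noAutoSplit.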
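As far as I understand the shell helpers, both wrap the `split` admin command: `sh.splitFind()` sends a `find` document and `sh.splitAt()` sends a `middle` document. The sketch below only builds those command documents in plain JavaScript so the difference is visible. The namespace `mydb.users` and the field `userId` are made-up placeholders.

```javascript
// Build the command document for "split the chunk containing this document
// at its median" (the sh.splitFind style of split).
function splitFindCommand(ns, query) {
  return { split: ns, find: query };
}

// Build the command document for "split exactly at this shard key value"
// (the sh.splitAt style of split).
function splitAtCommand(ns, point) {
  return { split: ns, middle: point };
}

// Hypothetical namespace and shard key field, for illustration only.
const ns = "mydb.users";
const findCmd = splitFindCommand(ns, { userId: 42 });
const middleCmd = splitAtCommand(ns, { userId: 5000 });
console.log(JSON.stringify(findCmd));
console.log(JSON.stringify(middleCmd));
```

In a mongo shell connected to a mongos, each of these documents would be passed to `db.adminCommand(...)`.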

4. Jumbo chunks

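  • A jumbo chunk is one that exceeds the maximum chunk size.
  • It appears, for example, when you pick a very bad shard key: if many documents share the same shard key value, there is no split point inside the chunk.
  • The balancer cannot move a jumbo chunk.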

5. Manual Example for Chunk Splitting in MongoDB

  • Pre-splitting overview:
  • Reference: "How to Programatically Pre-Split a GUID Based Shard Key with MongoDB" (Stack Overflow).
  • Caveat: do this before you insert data.
  • When would you want to do this?
  • You know what your shard key is going to be, that is, you have a known domain of data.
  • You already have multiple shards up and running.
  • You are about to do a bulk initial data load and will put a lot of data into your database.
  • You want to avoid bottlenecks.
  • You have a large amount of data in your cluster and very few chunks, as is the case after deploying a cluster using existing data.
  • You expect to add a large amount of data that would initially reside in a single chunk or shard.
  • Example: you plan to insert a large amount of data with shard key values between 300 and 400, but all shard key values between 250 and 500 are in a single chunk.
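For the 300 to 400 example, one way to spread the incoming data is to carve that range into several smaller chunks before loading. The sketch below generates evenly spaced split points. The step of 25, the helper name `evenSplitPoints`, the field name `x`, and the namespace `mydb.items` are all assumptions for illustration.

```javascript
// Generate split points from `low` to `high` inclusive, `step` apart.
function evenSplitPoints(low, high, step) {
  const points = [];
  for (let v = low; v <= high; v += step) points.push(v);
  return points;
}

// Hypothetical shard key field `x` on a hypothetical collection.
const rangePoints = evenSplitPoints(300, 400, 25);
const rangeCommands = rangePoints.map((v) => ({
  split: "mydb.items",
  middle: { x: v },
}));
console.log(rangePoints);
```

Splitting at 300 and 400 isolates the hot range from the rest of the 250 to 500 chunk, and the interior points divide it further. Each command document would then be run through `db.adminCommand(...)` on a mongos.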

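6. Example data description

  • Average document size: 10 KB
  • 6 million documents
  • GUID shard key: a 32-character hex string
  • Approximately 60 GB of data
  • Maximum chunk size: 64 MB by default, but we are using 32 MB
  • 60 GB / 32 MB = 1920 chunks
  • Splitting factor: 16 x 16 x 16 = 4096 possible three-character hex prefixes. Taking every second one gives 4096 / 2 = 2048 split points, just over the 1920 chunks needed.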
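The figures above can be checked, and the 2048 split points generated, with a short sketch in plain JavaScript. It assumes lowercase hex GUIDs and pads each three-character prefix with zeros to 32 characters. The function name `guidSplitPoints` is made up for this example, and the chunk count uses the post's round figure of 60 GB.

```javascript
// Chunk-count arithmetic from the data description.
const approxDataGB = (6e6 * 10) / 1e6; // 6 million docs x 10 KB, about 60 GB
const chunkSizeMB = 32;
const chunksNeeded = (approxDataGB * 1024) / chunkSizeMB; // 1920

// Walk the first three hex digits of the GUID. Stepping the third digit
// by `step` controls how many split points are produced.
function guidSplitPoints(step) {
  const points = [];
  for (let x = 0; x < 16; x++) {
    for (let y = 0; y < 16; y++) {
      for (let z = 0; z < 16; z += step) {
        const prefix = x.toString(16) + y.toString(16) + z.toString(16);
        points.push(prefix.padEnd(32, "0")); // pad to a 32-character key
      }
    }
  }
  return points;
}

const guidPoints = guidSplitPoints(2);
console.log(chunksNeeded, guidPoints.length);
```

With a step of 2 this yields 16 x 16 x 8 = 2048 points, and a step of 1 would give all 4096. Each point could then be used as the `middle` value of a `split` command, or passed to `sh.splitAt()`, before any data is loaded.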

Suggestions to improve this post are welcome 🙂, and happy learning with chunk splitting in MongoDB.

